The Omnizient Knowledge Engine (a work in progress) is a set of models, methodologies, and tools to help achieve a “better” (deeper, newer, more accurate) comprehension of any set of knowledge domains.
The first tool, the “Meta Sanity Test”, is based on a set of questions.
META SANITY TEST
- What are we assuming?
- Have we made a complete list of “What-We-Know-That-We-Do-Not-Know” (WWKTWDNK)?
- How do we know (whatever we know) is right?
- How do we know (whatever we know) is accurate?
- Are we going beyond simply labeling pieces of information?
- Are we attempting to comprehend and get a real understanding of the subject in discussion?
- What have we totally missed so far? What have we failed to consider so far?
RELATED RESOURCES
In their paper “Knowledge and Assumptions” (2010), Brett Sherman and Gilbert Harman propose that “epistemologists should include an epistemic notion into the mix, namely the notion of assuming or taking for granted”.
From the abstract of the paper “Everyday Assumptions about Knowledge” by J. L. Evans:
- We do not, therefore, approach the study of philosophy with the expectation of learning about matters which were previously quite unfamiliar to us.
- Further, whereas we may approach the subject with the hope that our understanding of these familiar concepts may be broadened, we would, I think, be surprised if we were told at the start of our inquiry that we were to expect to find that our everyday understanding was radically mistaken.
- On the other hand, if we were approaching the study of physics, we would not be dismayed if we were told that we should be prepared for a radical alteration in our understanding of everyday concepts such as ‘force’ or ‘energy’; nor would it surprise us to be warned that we were likely to be confronted with concepts which were quite unfamiliar in everyday life, such as ‘entropy’.
From “How Prior Knowledge and Assumptions Impact New Learning”, UNT Teaching Commons, Center for Learning, Experimentation, Application, and Research (CLEAR), Division of Digital Strategy and Innovation, University of North Texas:
- According to Campbell and Campbell (2009), students use their prior understandings, their knowledge and assumptions, as “mental hooks that serve to anchor instructional concepts” (p. 7). Life experiences as well as prior learning influence prior knowledge.
- If prior knowledge is insufficient, inappropriate, or inaccurate, students are likely to struggle to acquire the new knowledge in their classes (Ambrose et al., 2010).
- Sometimes, students can even perform tasks, such as using formulas, but still have insufficient understanding of the concept to build on their existing knowledge.
- Inappropriate prior learning can distort new learning. For example, a student may have a conventional or colloquial understanding of a word like “random” that gets in the way of understanding its statistical meaning.
- Prior assumptions can block learning new concepts and ideas. We all have assumptions and often do not realize what they are. Our intellectual and socio-emotional development forms a base for many of our assumptions.
- Brookfield (2012, p. 12) states: “Key to this process [identifying assumptions] is identifying and assessing what both we as instructors and our students regard as convincing evidence for our assumptions. Sometimes this evidence is experiential (the things that have happened to us), sometimes it’s authoritative (what people we trust have told us is the truth), and sometimes it’s derived from disciplined research and inquiry we’ve conducted”.
This UNT Teaching Commons article also shares ideas and resources on:
- Assessing the prior knowledge of students with varied backgrounds
- Intervening with students’ prior understandings
- Working with students’ inappropriate prior understandings
SANITY TESTING IN THE CONTEXT OF MACHINE LEARNING
From “Strategies for Sanity-Checking Machine Learning Models” (John Brock, 02.08.19):
- When you train a machine learning classifier, you want to end up with a model that generalizes well. In other words, you want a classifier that can accurately classify data that wasn’t seen during training. Or, to say it in machine learning lingo, you want a classifier that avoids overfitting.
- In practice, a simple validation technique that often works well is to run your classifier in “silent mode” against real-time customer data, i.e., recording classifier decisions, but not actually taking any action based on those decisions. If this model performs more poorly with real-time customer data than you saw in your internal testing, then you may need to go back to the drawing board.
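The “silent mode” validation described above can be sketched in a few lines: the model scores live traffic, but its decisions are only logged, never acted on, so they can later be compared against ground truth and against the metrics seen during internal testing. The sketch below is a generic illustration, not code from the article; the classifier interface (a scikit-learn-style predict_proba), the threshold, and the log format are all assumptions.

```python
import json
import logging
from datetime import datetime, timezone

logger = logging.getLogger("silent_mode")

def score_in_silent_mode(classifier, event, threshold=0.5):
    """Score one live event but take no action; only record the decision.

    `classifier` is assumed to expose a scikit-learn-style predict_proba;
    the threshold and log format are illustrative choices, not prescriptions.
    """
    prob = float(classifier.predict_proba([event["features"]])[0][1])
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event_id": event["id"],
        "score": prob,
        "would_act": prob >= threshold,  # the decision is logged, not enforced
    }
    logger.info(json.dumps(record))
    return record
```

The logged would_act decisions can later be joined with whatever ground truth becomes available, so real-world precision and recall can be compared with the figures from internal testing before the model is allowed to act.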
From the abstract of the paper proposing SaneDL:
- In this paper, we propose SaneDL, a tool that provides systematic data sanity checks for Deep Learning-based systems. SaneDL serves as a lightweight, flexible, and adaptive plugin for DL models, which can achieve effective detection of invalid inputs during system runtime.
- We approach reliability enhancement of DL systems via data sanity check. We proposed a tool, namely SaneDL, to perform data sanity check for DL-based systems. SaneDL detects behavior deviations of the DL model to identify invalid input cases. SaneDL is an assertion-based approach, where the assertions are automatically learned and inserted. To our knowledge, SaneDL is the first assertion-based tool that can automatically detect invalid input cases for DL systems. Our work can shed light on other practices in improving DL reliability.
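SaneDL itself is not reproduced here, but the general flavor of assertion-based data sanity checking can be sketched: learn simple per-unit range assertions over a hidden layer’s activations on trusted data, then flag runtime inputs whose activations violate them. This is a simplified illustration of the idea, not the authors’ implementation; the choice of layer and the z-score tolerance are assumptions.

```python
import numpy as np

class ActivationRangeChecker:
    """Learn mean +/- k*std assertions for one hidden layer, then flag
    runtime inputs whose activations fall outside the learned ranges."""

    def __init__(self, k=4.0):
        self.k = k  # tolerance in standard deviations (an illustrative choice)
        self.mean = None
        self.std = None

    def fit(self, activations):
        # activations: (n_samples, n_units) hidden-layer outputs on trusted data
        self.mean = activations.mean(axis=0)
        self.std = activations.std(axis=0) + 1e-8
        return self

    def is_sane(self, activation):
        # activation: (n_units,) hidden-layer output for one runtime input
        z = np.abs(activation - self.mean) / self.std
        return bool(np.all(z <= self.k))


# Toy usage: random vectors stand in for a real model's hidden activations.
rng = np.random.default_rng(0)
checker = ActivationRangeChecker().fit(rng.normal(size=(1000, 32)))
print(checker.is_sane(rng.normal(size=32)))         # True: in-distribution
print(checker.is_sane(rng.normal(size=32) + 10.0))  # False: invalid input
```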
From “Down with Pipeline debt / Introducing Great Expectations [Feb 21, 2018]”:
- Pipeline debt is a species of technical debt that infests backend data systems. It drags down productivity and puts analytic integrity at risk. The best way to beat pipeline debt is a new twist on automated testing: pipeline tests, which are applied to data (instead of code) and at batch time (instead of compile or deploy time). We’re releasing Great Expectations, an open-source tool that makes it easy to test data pipelines.
- Pipeline debt is a kind of technical debt. It’s closely related to Michael Feathers’ concept of legacy code. Like other forms of technical debt, pipeline debt accumulates when unclear or forgotten assumptions are buried inside a complex, interconnected codebase.
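The core idea of a pipeline test, an expectation checked against each batch of data rather than against code, can be illustrated without the library itself. The sketch below uses plain pandas; the column names and thresholds are invented for the example, and Great Expectations packages the same idea as declarative, reusable expectation suites with profiling and reporting on top.

```python
import pandas as pd

def check_batch(df: pd.DataFrame) -> list:
    """Run a handful of pipeline tests on one batch and return the failures.

    The expectations here (non-null ids, bounded amounts, known status
    values) are illustrative; a real suite is tailored to the pipeline."""
    failures = []
    if df["order_id"].isna().any():
        failures.append("order_id contains nulls")
    if not df["amount"].between(0, 10_000).all():
        failures.append("amount outside expected range [0, 10000]")
    if not set(df["status"]).issubset({"new", "paid", "shipped"}):
        failures.append("status has unexpected categories")
    return failures


batch = pd.DataFrame({
    "order_id": [1, 2, 3],
    "amount": [19.99, 250.0, 12_500.0],  # last row violates the range test
    "status": ["new", "paid", "shipped"],
})
print(check_batch(batch))  # ['amount outside expected range [0, 10000]']
```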
Related:
- Checking the sanity of your data using automated testing
- ML Work-Flow (Part 4) – Sanity Checks and Data Splitting [NOV 3, 2014]
- 4 Key Steps for a Data Sanity Check [Connor Carreras, November 17, 2020]
META LEARNING RESOURCES
META LEARNING IN THE CONTEXT OF MACHINE LEARNING
From the abstract of the paper “Meta-Learning without Memorization” (2019) by Mingzhang Yin, George Tucker, Mingyuan Zhou, Sergey Levine, Chelsea Finn:
- The ability to learn new concepts with small amounts of data is a critical aspect of intelligence that has proven challenging for deep learning methods.
- Meta-learning has emerged as a promising technique for leveraging data from previous tasks to enable efficient learning of new tasks.
- However, most meta-learning algorithms implicitly require that the meta-training tasks be mutually-exclusive, such that no single model can solve all of the tasks at once. For example, when creating tasks for few-shot image classification, prior work uses a per-task random assignment of image classes to N-way classification labels. If this is not done, the meta-learner can ignore the task training data and learn a single model that performs all of the meta-training tasks zero-shot, but does not adapt effectively to new image classes.
- This requirement means that the user must take great care in designing the tasks, for example by shuffling labels or removing task identifying information from the inputs. In some domains, this makes meta-learning entirely inapplicable.
- In this paper, we address this challenge by designing a meta-regularization objective using information theory that places precedence on data-driven adaptation. This causes the meta-learner to decide what must be learned from the task training data and what should be inferred from the task testing input. By doing so, our algorithm can successfully use data from non-mutually-exclusive tasks to efficiently adapt to novel tasks.
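The per-task random assignment of classes to N-way labels mentioned above is what makes few-shot tasks mutually exclusive: the same class receives a different label index in different episodes, so no single model can memorize a fixed class-to-label mapping and it must actually use the support set. A minimal sketch of such episode construction follows; the dataset layout and the episode sizes are assumptions for illustration.

```python
import numpy as np

def sample_episode(class_to_examples, n_way=5, k_shot=1, k_query=5, rng=None):
    """Build one N-way, K-shot episode with per-episode label shuffling.

    `class_to_examples` maps a class name to an array of feature vectors.
    Classes are mapped to label indices 0..N-1 freshly for every episode,
    which is what makes meta-training tasks mutually exclusive."""
    rng = rng or np.random.default_rng()
    classes = rng.choice(list(class_to_examples), size=n_way, replace=False)
    support_x, support_y, query_x, query_y = [], [], [], []
    for label, cls in enumerate(classes):  # label index is episode-local
        examples = class_to_examples[cls]
        idx = rng.choice(len(examples), size=k_shot + k_query, replace=False)
        support_x.append(examples[idx[:k_shot]])
        support_y += [label] * k_shot
        query_x.append(examples[idx[k_shot:]])
        query_y += [label] * k_query
    return (np.concatenate(support_x), np.array(support_y),
            np.concatenate(query_x), np.array(query_y))


# Toy data: 20 classes with 10 feature vectors each.
rng = np.random.default_rng(0)
data = {f"class_{i}": rng.normal(size=(10, 64)) for i in range(20)}
sx, sy, qx, qy = sample_episode(data, rng=rng)
print(sx.shape, sy, qx.shape)  # (5, 64) [0 1 2 3 4] (25, 64)
```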
From the abstract of the paper “MetaMix: Improved Meta-Learning with Interpolation-based Consistency Regularization” (2020) by Yangbin Chen, Yun Ma, Tom Ko, Jianping Wang, Qing Li:
- Model-Agnostic Meta-Learning (MAML) and its variants are popular few-shot classification methods.
- They train an initializer across a variety of sampled learning tasks (also known as episodes) such that the initialized model can adapt quickly to new tasks.
- However, current MAML-based algorithms have limitations in forming generalizable decision boundaries.
- In this paper, we propose an approach called MetaMix. It generates virtual feature-target pairs within each episode to regularize the backbone models.
- MetaMix can be integrated with any of the MAML-based algorithms and learn the decision boundaries generalizing better to new tasks.
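The “virtual feature-target pairs” in MetaMix follow the mixup recipe: within an episode, pairs of examples and their one-hot targets are linearly interpolated with a Beta-distributed coefficient. The sketch below shows only that interpolation step, not the full MetaMix algorithm, and the Beta parameter is an illustrative assumption.

```python
import numpy as np

def mix_episode(features, labels, num_classes, alpha=0.5, rng=None):
    """Create virtual feature-target pairs by mixup within one episode.

    features: (n, d) array; labels: (n,) integer array. Returns interpolated
    features and soft targets; alpha parameterizes the Beta(alpha, alpha)
    mixing coefficient (the value here is an illustrative choice)."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    perm = rng.permutation(len(features))
    one_hot = np.eye(num_classes)[labels]
    mixed_x = lam * features + (1 - lam) * features[perm]
    mixed_y = lam * one_hot + (1 - lam) * one_hot[perm]
    return mixed_x, mixed_y


rng = np.random.default_rng(0)
x = rng.normal(size=(10, 16))        # 10 episode examples, 16 features each
y = rng.integers(0, 5, size=10)      # 5-way labels
vx, vy = mix_episode(x, y, num_classes=5, rng=rng)
print(vx.shape, vy.shape, vy[0])     # (10, 16) (10, 5) soft target for item 0
```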
From the abstract of the paper “Meta Learning for End-to-End Low-Resource Speech Recognition” (2019) by Jui-Yang Hsu, Yuan-Jui Chen, Hung-yi Lee:
- In this paper, we proposed to apply a meta-learning approach for low-resource automatic speech recognition (ASR).
- We formulated ASR for different languages as different tasks, and meta-learned the initialization parameters from many pretraining languages to achieve fast adaptation on an unseen target language, via the recently proposed model-agnostic meta-learning algorithm (MAML).
- We evaluated the proposed approach using six languages as pretraining tasks and four languages as target tasks.
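Several of the abstracts above rest on MAML’s two-level loop: an inner loop adapts a copy of the shared initialization to each task with a few gradient steps, and an outer loop updates the initialization so that such adaptation works well on held-out task data. The sketch below runs this on a toy family of linear-regression tasks and uses first-order updates (no second-order gradients), so it is closer to first-order MAML than to the full algorithm; all sizes and learning rates are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_task():
    """A 'task' is a 1-D linear regression y = a*x + b with random (a, b)."""
    a, b = rng.uniform(-2, 2, size=2)
    x = rng.uniform(-5, 5, size=(20, 1))
    return x, a * x + b

def predict(w, x):
    return w[0] * x + w[1]

def grad(w, x, y):
    err = predict(w, x) - y
    return np.array([np.mean(2 * err * x), np.mean(2 * err)])

meta_w = np.zeros(2)                  # shared initialization (meta-parameters)
inner_lr, outer_lr, inner_steps = 0.05, 0.01, 3

for _ in range(2000):
    x, y = sample_task()
    w = meta_w.copy()
    for _ in range(inner_steps):      # inner loop: adapt to this task
        w -= inner_lr * grad(w, x[:10], y[:10])    # support half of the task
    meta_w -= outer_lr * grad(w, x[10:], y[10:])   # outer loop: query half

# Fast adaptation on a new task from the learned initialization.
x, y = sample_task()
w = meta_w.copy()
for _ in range(inner_steps):
    w -= inner_lr * grad(w, x[:10], y[:10])
print("query MSE after adaptation:", np.mean((predict(w, x[10:]) - y[10:]) ** 2))
```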
From the abstract of the paper “Model-Agnostic Meta-Learning for Relation Classification with Limited Supervision” (2019) by Abiola Obamuyide, Andreas Vlachos:
- In this paper we frame the task of supervised relation classification as an instance of meta-learning.
- We propose a model-agnostic meta-learning protocol for training relation classifiers to achieve enhanced predictive performance in limited supervision settings.
- During training, we aim to not only learn good parameters for classifying relations with sufficient supervision, but also learn model parameters that can be fine-tuned to enhance predictive performance for relations with limited supervision.
From the abstract of the paper “Investigating Meta-Learning Algorithms for Low-Resource Natural Language Understanding Tasks” (2019) by Zi-Yi Dou, Keyi Yu, Antonios Anastasopoulos:
- Learning general representations of text is a fundamental problem for many natural language understanding (NLU) tasks.
- Previously, researchers have proposed to use language model pre-training and multi-task learning to learn robust representations. However, these methods can achieve sub-optimal performance in low-resource scenarios.
- Inspired by the recent success of optimization-based meta-learning algorithms, in this paper, we explore the model-agnostic meta-learning algorithm (MAML) and its variants for low-resource NLU tasks.
From the abstract of the paper “Meta Learning-based MIMO Detectors: Design, Simulation, and Experimental Test” (2020) by Jing Zhang, Yunfeng He, Yu-Wen Li, Chao-Kai Wen, Shi Jin:
- Deep neural networks (NNs) have exhibited considerable potential for efficiently balancing the performance and complexity of multiple-input and multiple-output (MIMO) detectors.
- However, existing NN-based MIMO detectors are difficult to deploy in practical systems because of their slow convergence speed and low robustness in new environments.
- To address these issues systematically, we propose a receiver framework that enables efficient online training by leveraging the following simple observation: although NN parameters should adapt to channels, not all of them are channel-sensitive. In particular, we use a deep unfolded NN structure that represents iterative algorithms in the signal detection and channel decoding modules as multi-layer deep feedforward networks. An expectation propagation (EP) module, called EPNet, is established for signal detection by unfolding the EP algorithm and rendering the damping factors trainable. An unfolded turbo decoding module, called TurboNet, is used for channel decoding. This component decodes the turbo code, where trainable NN units are integrated into the traditional max-log-maximum a posteriori decoding procedure.
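“Unfolding” an iterative algorithm, as EPNet and TurboNet do, means writing each iteration as one layer of a feed-forward network and making selected constants, here the damping factors, trainable. The PyTorch sketch below unfolds a damped gradient step for a small least-squares problem; it only illustrates the unfolding-plus-trainable-damping idea and is not the EP or turbo-decoding algorithm from the paper.

```python
import torch
import torch.nn as nn

class UnfoldedSolver(nn.Module):
    """Each 'layer' is one damped iteration x <- x - d_t * A^T (A x - y),
    with a separate trainable damping factor d_t per unfolded step."""

    def __init__(self, num_layers=8):
        super().__init__()
        # One damping factor per iteration, learned end-to-end; this mirrors
        # (in a much simpler setting) how EPNet makes EP's damping trainable.
        self.damping = nn.Parameter(torch.full((num_layers,), 0.1))

    def forward(self, A, y):
        x = torch.zeros(A.shape[1])
        for d in self.damping:
            x = x - d * (A.T @ (A @ x - y))
        return x


torch.manual_seed(0)
A = torch.randn(16, 8) / 4
x_true = torch.randn(8)
y = A @ x_true

model = UnfoldedSolver()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(200):                 # train the damping factors end-to-end
    loss = torch.mean((model(A, y) - x_true) ** 2)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
print("recovery error:", torch.norm(model(A, y) - x_true).item())
```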