I Don’t Know – Prediction Cascades Framework |
Advances in deep learning have led to substantial increases in prediction accuracy as well as in the cost of rendering predictions. We conjecture that for a majority of real-world inputs, the recent advances in deep learning have created models that effectively ‘over-think’ on simple inputs. In this paper we revisit the classic idea of prediction cascades to reduce prediction costs. We introduce the ‘I Don’t Know’ (IDK) prediction cascades framework, a general framework for constructing prediction cascades for arbitrary multi-class prediction tasks. We propose two baseline methods for constructing cascades as well as a new objective within this framework, and evaluate these techniques on a range of benchmark and real-world datasets to demonstrate that prediction cascades can achieve 1.7-10.5x speedups in image classification tasks while maintaining comparable accuracy to state-of-the-art models. When combined with human experts, prediction cascades can achieve nearly perfect accuracy (within 5%) while requiring human intervention on less than 30% of the queries. |
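The cascade idea can be illustrated with a minimal sketch. It assumes each model returns a label together with a confidence score, and that a model answers ‘I don’t know’ whenever its confidence falls below a fixed threshold. The toy models, the threshold value, and the names `cascade_predict`, `fast_model` and `slow_model` are hypothetical; this is not the objective or the construction method proposed in the paper.

```python
from typing import Callable, Sequence, Tuple

# A model maps an input to (predicted_label, confidence in [0, 1]).
Model = Callable[[float], Tuple[str, float]]


def cascade_predict(x: float, models: Sequence[Model],
                    thresholds: Sequence[float]) -> Tuple[str, int]:
    """Return (label, index of the model that answered).

    Every model except the last may abstain ("I don't know") when its
    confidence is below the matching threshold; the last model always answers.
    """
    for i, (model, threshold) in enumerate(zip(models[:-1], thresholds)):
        label, confidence = model(x)
        if confidence >= threshold:
            return label, i
        # Confidence too low: treat as IDK and fall through to the next model.
    label, _ = models[-1](x)
    return label, len(models) - 1


def fast_model(x: float) -> Tuple[str, float]:
    # Hypothetical cheap model: confident only far from the decision boundary.
    return ("pos" if x >= 0 else "neg"), min(1.0, abs(x))


def slow_model(x: float) -> Tuple[str, float]:
    # Hypothetical expensive model: always confident.
    return ("pos" if x >= 0 else "neg"), 1.0


easy = cascade_predict(2.0, [fast_model, slow_model], [0.8])   # answered by model 0
hard = cascade_predict(-0.1, [fast_model, slow_model], [0.8])  # deferred to model 1
```

In this sketch the expensive model is only evaluated for inputs on which the cheap model abstains, which is where the cost savings of a cascade would come from.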

Ibis |
Ibis is a new Python data analysis framework with the goal of enabling data scientists and data engineers to be as productive working with big data as they are working with small and medium data today. In doing so, we will enable Python to become a true first-class language for Apache Hadoop, without compromises in functionality, usability, or performance. Having spent much of the last decade improving the usability of the single-node Python experience (with pandas and other projects), we are looking to achieve: • 100% Python end-to-end user workflows • Native hardware speeds for a broad set of use cases • Full-fidelity data analysis without extractions or sampling • Scalability for big data • Integration with the existing Python data ecosystem (pandas, scikit-learn, NumPy, and so on) |

IllinoisSL |
IllinoisSL is a Java library for learning structured prediction models. It supports structured Support Vector Machines and structured Perceptron. The library consists of a core learning module and several applications, which can be executed from the command line. Documentation is provided to guide users. In comparison to other structured learning libraries, IllinoisSL is efficient, general, and easy to use. |

Image Processing Language for Performance Portability on Heterogeneous Systems( ImageCL) |
Modern computer systems typically combine multicore CPUs with accelerators like GPUs for improved performance and energy efficiency. However, these systems suffer from poor performance portability: code tuned for one device must be retuned to achieve high performance on another. Image processing is increasing in importance, with applications ranging from seismology and medicine to Photoshop. Based on our experience with medical image processing, we propose ImageCL, a high-level domain-specific language and source-to-source compiler, targeting heterogeneous hardware. ImageCL resembles OpenCL, but abstracts away performance optimization details, allowing the programmer to focus on algorithm development, rather than performance tuning. The latter is left to our source-to-source compiler and auto-tuner. From high-level ImageCL kernels, our source-to-source compiler can generate multiple OpenCL implementations with different optimizations applied. We rely on auto-tuning rather than machine models or expert programmer knowledge to determine which optimizations to apply, making our tuning procedure highly robust. Furthermore, we can generate high performing implementations for different devices from a single source code, thereby improving performance portability. We evaluate our approach on three image processing benchmarks, on different GPU and CPU devices, and are able to outperform other state of the art solutions in several cases, achieving speedups of up to 4.57x. |

Image-Text-Image( I2T2I) |
Translating information between text and image is a fundamental problem in artificial intelligence that connects natural language processing and computer vision. In the past few years, performance in image caption generation has seen significant improvement through the adoption of recurrent neural networks (RNN). Meanwhile, text-to-image generation has begun to generate plausible images using datasets of specific categories like birds and flowers. We’ve even seen image generation from multi-category datasets such as the Microsoft Common Objects in Context (MSCOCO) through the use of generative adversarial networks (GANs). Synthesizing objects with a complex shape, however, is still challenging. For example, animals and humans have many degrees of freedom, which means that they can take on many complex shapes. We propose a new training method called Image-Text-Image (I2T2I) which integrates text-to-image and image-to-text (image captioning) synthesis to improve the performance of text-to-image synthesis. We demonstrate that I2T2I can generate better multi-category images using MSCOCO than the state-of-the-art. We also demonstrate that I2T2I can achieve transfer learning by using a pre-trained image captioning module to generate human images on the MPII Human Pose

Imagination-Augmented Agents( I2A) |
We introduce Imagination-Augmented Agents (I2As), a novel architecture for deep reinforcement learning combining model-free and model-based aspects. In contrast to most existing model-based reinforcement learning and planning methods, which prescribe how a model should be used to arrive at a policy, I2As learn to interpret predictions from a learned environment model to construct implicit plans in arbitrary ways, by using the predictions as additional context in deep policy networks. I2As show improved data efficiency, performance, and robustness to model misspecification compared to several baselines. |

Imitation Learning |
‘Learning from Demonstration’: Imitation learning, a.k.a. behavioral cloning, is learning from demonstration. In other words, in imitation learning, a machine learns how to behave by looking at what a teacher (or expert) does and then mimics that behavior. An example is collecting driving data from humans and then using that data for a self-driving car. Imitation Learning in Tensorflow |

Imperialist Competitive Algorithm( ICA) |
In computer science, Imperialist Competitive Algorithm (ICA) is a computational method that is used to solve optimization problems of different types. Like most of the methods in the area of evolutionary computation, ICA does not need the gradient of the function in its optimization process. From a specific point of view, ICA can be thought of as the social counterpart of genetic algorithms (GAs). ICA is the mathematical model and the computer simulation of human social evolution, while GAs are based on the biological evolution of species. ICAFF,ICAOD |

Implicit Association Test( IAT) |
The implicit-association test (IAT) is a measure within social psychology designed to detect the strength of a person’s automatic association between mental representations of objects (concepts) in memory. The IAT was introduced in the scientific literature in 1998 by Anthony Greenwald, Debbie McGhee, and Jordan Schwartz. The IAT is now widely used in social psychology research and is used to some extent in clinical, cognitive, and developmental psychology research. Although some controversy still exists regarding the IAT and what it measures, much research into its validity and psychometric properties has been conducted since its introduction into the literature. IATscores |

Implicit Regression |
In 2011, Wooten introduced Non-Response Analysis, the founding theory of Implicit Regression. Implicit Regression treats the variables implicitly as codependent variables, not as an explicit function with dependent and independent variables as in standard regression. The motivation of this paper is to introduce methods of implicit regression to determine the constant nature of a variable or the interactive term, and to address inverse relationships among measured variables with random error present in both directions. |

Import Vector Machines |
The Import Vector Machine (Zhu and Hastie 2005) is a sparse, discriminative and probabilistic classifier. The algorithm is based on the Kernel Logistic Regression model, but uses only a few data points to define the decision hyperplane in the feature space. These data points are called import vectors. The Import Vector Machine shows similar results to the widely used Support Vector Machine, but has a probabilistic output. |

Importance Sampling |
In statistics, importance sampling is a general technique for estimating properties of a particular distribution, while only having samples generated from a different distribution than the distribution of interest. It is related to umbrella sampling in computational physics. Depending on the application, the term may refer to the process of sampling from this alternative distribution, the process of inference, or both. |
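As a small illustration, the sketch below estimates an expectation under a target distribution using only samples drawn from a different proposal distribution, reweighting each sample by the ratio of the two densities. The choice of target N(0, 1), proposal N(0, 2^2), test function x^2, and the names `normal_pdf` and `importance_sampling_mean` are assumptions made for this example.

```python
import math
import random


def normal_pdf(x: float, mean: float, std: float) -> float:
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def importance_sampling_mean(f, n: int, rng: random.Random) -> float:
    """Estimate E_p[f(X)] for p = N(0, 1) using samples from q = N(0, 2^2)."""
    total = 0.0
    for _ in range(n):
        x = rng.gauss(0.0, 2.0)  # draw from the proposal q, not from the target p
        weight = normal_pdf(x, 0.0, 1.0) / normal_pdf(x, 0.0, 2.0)  # p(x) / q(x)
        total += weight * f(x)
    return total / n


rng = random.Random(0)
# The true value of E[X^2] under N(0, 1) is 1, so the estimate should be close to 1.
estimate = importance_sampling_mean(lambda x: x * x, 100_000, rng)
```

A proposal with heavier tails than the target, as here, keeps the weights bounded; a proposal with lighter tails can make the estimator's variance very large.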

Importance Weighted Autoencoder( IWAE) |
The variational autoencoder (VAE; Kingma, Welling (2014)) is a recently proposed generative model pairing a top-down generative network with a bottom-up recognition network which approximates posterior inference. It makes two strong assumptions about posterior inference: that the posterior distribution is approximately factorial, and that its parameters can be approximated with nonlinear regression from the observations. As we show empirically, the VAE objective can lead to overly simplified representations which fail to use the network’s entire modeling capacity. We present the importance weighted autoencoder (IWAE), a generative model with the same architecture as the VAE, but which uses a strictly tighter log-likelihood lower bound derived from importance weighting. In the IWAE, the recognition network uses multiple samples to approximate the posterior, giving it increased flexibility to model complex posteriors which do not fit the VAE modeling assumptions. We show empirically that IWAEs learn richer latent space representations than VAEs, leading to improved test log-likelihood on density estimation benchmarks. GitXiv |
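For reference, the importance-weighted bound mentioned above can be written out as follows. The notation is assumed here rather than taken from the text: $q(z \mid x)$ denotes the recognition network, $p(x, z)$ the generative model, and $k$ the number of samples drawn from the recognition network.

```latex
\mathcal{L}_k(x)
  = \mathbb{E}_{z_1,\dots,z_k \sim q(z \mid x)}
    \left[ \log \frac{1}{k} \sum_{i=1}^{k} \frac{p(x, z_i)}{q(z_i \mid x)} \right],
\qquad
\log p(x) \;\ge\; \mathcal{L}_{k+1}(x) \;\ge\; \mathcal{L}_k(x).
```

With $k = 1$ this reduces to the usual VAE lower bound, and the bound does not get looser as $k$ grows.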

Imputation |
In statistics, imputation is the process of replacing missing data with substituted values. When substituting for a data point, it is known as “unit imputation”; when substituting for a component of a data point, it is known as “item imputation”. Because missing data can create problems for analyzing data, imputation is seen as a way to avoid pitfalls involved with listwise deletion of cases that have missing values. That is to say, when one or more values are missing for a case, most statistical packages default to discarding any case that has a missing value, which may introduce bias or affect the representativeness of the results. Imputation preserves all cases by replacing missing data with a probable value based on other available information. Once all missing values have been imputed, the data set can then be analysed using standard techniques for complete data. |
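A minimal sketch of item imputation follows, replacing each missing component with the mean of the observed values in the same column. Mean imputation is only one simple choice of ‘probable value’. The helper name `mean_impute`, the use of `None` to mark missing entries, and the toy data are assumptions, and the sketch assumes every column has at least one observed value.

```python
def mean_impute(rows):
    """Replace None entries with the mean of the observed values in that column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        means.append(sum(observed) / len(observed))
    # Keep every case (row); only the missing components are substituted.
    return [[means[j] if row[j] is None else row[j] for j in range(n_cols)]
            for row in rows]


data = [[1.0, 10.0], [None, 20.0], [3.0, None]]
completed = mean_impute(data)
# completed == [[1.0, 10.0], [2.0, 20.0], [3.0, 15.0]]
```

Listwise deletion would keep only the first row of `data`; the imputed data set retains all three cases.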

Incident Analytics |
http://…through-big-data-predictive-analytics.pdf |

Incremental Classifier and Representation Learning( iCaRL) |
A major open problem on the road to artificial intelligence is the development of incrementally learning systems that learn about more and more concepts over time from a stream of data. In this work, we introduce a new training strategy, iCaRL, that allows learning in such a class-incremental way: only the training data for a small number of classes has to be present at the same time and new classes can be added progressively. iCaRL learns strong classifiers and a data representation simultaneously. This distinguishes it from earlier works that were fundamentally limited to fixed data representations and therefore incompatible with deep learning architectures. We show by experiments on the CIFAR-100 and ImageNet ILSVRC 2012 datasets that iCaRL can learn many classes incrementally over a long period of time where other strategies quickly fail. |

Incremental Decision Tree |
An incremental decision tree algorithm is an online machine learning algorithm that outputs a decision tree. Many decision tree methods, such as C4.5, construct a tree using a complete dataset. Incremental decision tree methods allow an existing tree to be updated using only new individual data instances, without having to re-process past instances. This may be useful in situations where the entire dataset is not available when the tree is updated (i.e. the data was not stored), the original data set is too large to process or the characteristics of the data change over time. |

Incremental Sequence Learning |
Deep learning research over the past years has shown that by increasing the scope or difficulty of the learning problem over time, increasingly complex learning problems can be addressed. We study incremental learning in the context of sequence learning, using generative RNNs in the form of multi-layer recurrent Mixture Density Networks. We introduce Incremental Sequence Learning, a simple incremental approach to sequence learning. Incremental Sequence Learning starts out by using only the first few steps of each sequence as training data. Each time a performance criterion has been reached, the length of the parts of the sequences used for training is increased. To evaluate Incremental Sequence Learning and comparison methods, we introduce and make available a novel sequence learning task and data set: predicting and classifying MNIST pen stroke sequences, where the familiar handwritten digit images have been transformed to pen stroke sequences representing the skeletons of the digits. We find that Incremental Sequence Learning greatly speeds up sequence learning and reaches the best test performance level of regular sequence learning 20 times faster, reduces the test error by 74%, and in general performs more robustly; it displays lower variance and achieves sustained progress after all three comparison methods have stopped improving. A trained sequence prediction model is also used in transfer learning to the task of sequence classification, where it is found that transfer learning realizes improved classification performance compared to methods that learn to classify from scratch. |

Independent and identically distributed( iid, i.i.d.) |
In probability theory and statistics, a sequence or other collection of random variables is independent and identically distributed (i.i.d.) if each random variable has the same probability distribution as the others and all are mutually independent. The abbreviation i.i.d. is particularly common in statistics (often as iid, sometimes written IID), where observations in a sample are often assumed to be effectively i.i.d. for the purposes of statistical inference. The assumption (or requirement) that observations be i.i.d. tends to simplify the underlying mathematics of many statistical methods. However, in practical applications of statistical modeling the assumption may or may not be realistic. To test how realistic the assumption is on a given data set, the autocorrelation can be computed, lag plots drawn, or a turning point test performed. The generalization to exchangeable random variables is often sufficient and more easily met. |
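One of the checks mentioned above, the sample autocorrelation, can be sketched as follows. The example compares a simulated i.i.d. Gaussian sample with a random walk built from the same draws. The function name `autocorrelation`, the sample size, and the seed are assumptions chosen for illustration.

```python
import random


def autocorrelation(xs, lag=1):
    """Sample autocorrelation of the sequence xs at the given lag."""
    n = len(xs)
    mean = sum(xs) / n
    variance_sum = sum((x - mean) ** 2 for x in xs)
    covariance_sum = sum((xs[t] - mean) * (xs[t + lag] - mean)
                         for t in range(n - lag))
    return covariance_sum / variance_sum


rng = random.Random(0)
iid_sample = [rng.gauss(0.0, 1.0) for _ in range(5000)]

# A random walk accumulates the same draws, so successive values are strongly dependent.
random_walk = []
level = 0.0
for step in iid_sample:
    level += step
    random_walk.append(level)

iid_r1 = autocorrelation(iid_sample)    # expected to be near 0
walk_r1 = autocorrelation(random_walk)  # expected to be near 1
```

A lag-1 autocorrelation near zero is consistent with independence but does not prove it; a value far from zero is evidence against the i.i.d. assumption.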

Independent Component Analysis( ICA) |
In signal processing, independent component analysis (ICA) is a computational method for separating a multivariate signal into additive subcomponents. This is done by assuming that the subcomponents are non-Gaussian signals and that they are statistically independent from each other. ICA is a special case of blind source separation. A common example application is the ‘cocktail party problem’ of listening in on one person’s speech in a noisy room. |

Independently Interpretable Lasso( IILasso) |
Sparse regularization such as $\ell_1$ regularization is a powerful and widely used strategy for high dimensional learning problems. The effectiveness of sparse regularization has been supported practically and theoretically by several studies. However, one of the biggest issues in sparse regularization is that its performance is quite sensitive to correlations between features. Ordinary $\ell_1$ regularization often selects variables correlated with each other, which results in deterioration of not only its generalization error but also its interpretability. In this paper, we propose a new regularization method, ‘Independently Interpretable Lasso’ (IILasso for short). Our proposed regularizer suppresses the selection of correlated variables, and thus each active variable independently affects the objective variable in the model. Hence, we can interpret regression coefficients intuitively and also improve the performance by avoiding overfitting. We analyze the theoretical properties of IILasso and show that the proposed method is advantageous for sign recovery and achieves an almost minimax optimal convergence rate. Synthetic and real data analyses also indicate the effectiveness of IILasso. |

Indexation |
Indexation is a technique to adjust income payments by means of a price index, in order to maintain the purchasing power of the public after inflation, while deindexation refers to the unwinding of indexation. From a macroeconomics standpoint there are four main categories of indexation: wage indexation, financial instruments rate indexation, tax rate indexation, and exchange rate indexation. The first three are indexed to inflation. The last one is typically indexed to a foreign currency, mainly the US dollar. Any of these different types of indexation can be reversed (deindexation). |

Indirect Inference |
Indirect inference is a simulation-based method for estimating the parameters of economic models. Its hallmark is the use of an auxiliary model to capture aspects of the data upon which to base the estimation. The parameters of the auxiliary model can be estimated using either the observed data or data simulated from the economic model. Indirect inference chooses the parameters of the economic model so that these two estimates of the parameters of the auxiliary model are as close as possible. The auxiliary model need not be correctly specified; when it is, indirect inference is equivalent to maximum likelihood. |

Inductive Logic Programming( ILP) |
Inductive logic programming (ILP) is a subfield of machine learning which uses logic programming as a uniform representation for examples, background knowledge and hypotheses. Given an encoding of the known background knowledge and a set of examples represented as a logical database of facts, an ILP system will derive a hypothesised logic program which entails all the positive and none of the negative examples. Schema: positive examples + negative examples + background knowledge => hypothesis. Inductive logic programming is particularly useful in bioinformatics and natural language processing. Ehud Shapiro laid the theoretical foundation for inductive logic programming and built its first implementation (Model Inference System) in 1981: a Prolog program that inductively inferred logic programs from positive and negative examples. The term Inductive Logic Programming was first introduced in a paper by Stephen Muggleton in 1991. The term ‘inductive’ here refers to philosophical (i.e. suggesting a theory to explain observed facts) rather than mathematical (i.e. proving a property for all members of a well-ordered set) induction. |

Industry 4.0 |
Industry 4.0 is a project in the high-tech strategy of the German government, which promotes the computerization of the manufacturing industry. The goal is the intelligent factory (Smart Factory), which is characterized by adaptability, resource efficiency and ergonomics as well as the integration of customers and business partners in business and value processes. The technological basis is cyber-physical systems and the Internet of Things. Experts believe that Industry 4.0 or the fourth industrial revolution could be a reality in about 10 to 20 years. |

Inertial Regularization and Selection( IRS) |
In this paper, we develop a new sequential regression modeling approach for data streams. Data streams are commonly found around us, e.g., in a retail enterprise, sales data is continuously collected every day. A demand forecasting model is an important outcome from the data that needs to be continuously updated with the new incoming data. The main challenge in such modeling arises when there is a) high dimensionality and sparsity, b) a need for an adaptive use of prior knowledge, and/or c) structural changes in the system. The proposed approach addresses these challenges by incorporating an adaptive L1-penalty and inertia terms in the loss function, and is thus called Inertial Regularization and Selection (IRS). The former term performs model selection to handle the first challenge while the latter is shown to address the last two challenges. A recursive estimation algorithm is developed, and shown to outperform the commonly used state-space models, such as Kalman Filters, in experimental studies and real data. |

Inferactive Data Analysis |
We describe inferactive data analysis, so-named to denote an interactive approach to data analysis with an emphasis on inference after data analysis. Our approach is a compromise between Tukey’s exploratory (roughly speaking ‘model free’) and confirmatory data analysis (roughly speaking classical and ‘model based’), also allowing for Bayesian data analysis. We view this approach as close in spirit to current practice of applied statisticians and data scientists while allowing frequentist guarantees for results to be reported in the scientific literature, or Bayesian results where the data scientist may choose the statistical model (and hence the prior) after some initial exploratory analysis. While this approach to data analysis does not cover every scenario, and every possible algorithm data scientists may use, we see this as a useful step in providing concrete tools (with frequentist statistical guarantees) for current data scientists. The basis of inference we use is selective inference [Lee et al., 2016, Fithian et al., 2014], in particular its randomized form [Tian and Taylor, 2015a]. The randomized framework, besides providing additional power and shorter confidence intervals, also provides explicit forms for relevant reference distributions (up to normalization) through the selective sampler of Tian et al. [2016]. The reference distributions are constructed from a particular conditional distribution formed from what we call a DAG-DAG, a Data Analysis Generative DAG. As sampling conditional distributions in DAGs is generally complex, the selective sampler is crucial to any practical implementation of inferactive data analysis. Our principal goal is in reviewing the recent developments in selective inference as well as describing the general philosophy of selective inference. |

Inferential Model( IM) |
Probability is a useful tool for describing uncertainty, so it is natural to strive for a system of statistical inference based on probabilities for or against various hypotheses. But existing probabilistic inference methods struggle to provide a meaningful interpretation of the probabilities across experiments in sufficient generality. In this paper we further develop a promising new approach based on what are called inferential models (IMs). The fundamental idea behind IMs is that there is an unobservable auxiliary variable that itself describes the inherent uncertainty about the parameter of interest, and that posterior probabilistic inference can be accomplished by predicting this unobserved quantity. We describe a simple and intuitive three-step construction of a random set of candidate parameter values, each being consistent with the model, the observed data, and an auxiliary variable prediction. Then prior-free posterior summaries of the available statistical evidence for and against a hypothesis of interest are obtained by calculating the probability that this random set falls completely in and completely out of the hypothesis, respectively. We prove that these IM-based measures of evidence are calibrated in a frequentist sense, showing that IMs give easily-interpretable results both within and across experiments. |

Inferential Statistics |
In statistics, statistical inference is the process of drawing conclusions from data that are subject to random variation, for example, observational errors or sampling variation. More substantially, the terms statistical inference, statistical induction and inferential statistics are used to describe systems of procedures that can be used to draw conclusions from datasets arising from systems affected by random variation, such as observational errors, random sampling, or random experimentation. Initial requirements of such a system of procedures for inference and induction are that the system should produce reasonable answers when applied to well-defined situations and that it should be general enough to be applied across a range of situations. Inferential statistics are used to test hypotheses and make estimations using sample data. |

Infinite Feature Selection( IFS) |
Supervised Infinite Feature Selection |

Infinite Latent Feature Selection |
Feature selection is playing an increasingly significant role with respect to many computer vision applications spanning from object recognition to visual object tracking. However, most of the recent solutions in feature selection are not robust across different and heterogeneous sets of data. In this paper, we address this issue by proposing a robust probabilistic latent graph-based feature selection algorithm that performs the ranking step while considering all the possible subsets of features, as paths on a graph, bypassing the combinatorial problem analytically. An appealing characteristic of the approach is that it aims to discover an abstraction behind low-level sensory data, that is, relevancy. Relevancy is modelled as a latent variable in a PLSA-inspired generative process that allows the investigation of the importance of a feature when injected into an arbitrary set of cues. The proposed method has been tested on ten diverse benchmarks, and compared against eleven state of the art feature selection methods. Results show that the proposed approach attains the highest performance levels across many different scenarios and difficulties, thereby confirming its strong robustness while setting a new state of the art in the feature selection domain. |

Infinite Layer Networks( ILN) |
Infinite Layer Networks (ILN) have recently been proposed as an architecture that mimics neural networks while enjoying some of the advantages of kernel methods. ILN are networks that integrate over infinitely many nodes within a single hidden layer. It has been demonstrated by several authors that the problem of learning ILN can be reduced to the kernel trick, implying that whenever a certain integral can be computed analytically they are efficiently learnable. In this work we give an online algorithm for ILN, which avoids the kernel trick assumption. More generally and of independent interest, we show that kernel methods in general can be exploited even when the kernel cannot be efficiently computed but can only be estimated via sampling. We provide a regret analysis for our algorithm, showing that it matches the sample complexity of methods which have access to kernel values. Thus, our method is the first to demonstrate that the kernel trick is not necessary as such, and random features suffice to obtain comparable performance. |

Infinite Variational Autoencoder( VAE) |
This paper presents an infinite variational autoencoder (VAE) whose capacity adapts to suit the input data. This is achieved using a mixture model where the mixing coefficients are modeled by a Dirichlet process, allowing us to integrate over the coefficients when performing inference. Critically, this then allows us to automatically vary the number of autoencoders in the mixture based on the data. Experiments show the flexibility of our method, particularly for semi-supervised learning, where only a small number of training samples are available. |

InfiniteBoost |
In machine learning, ensemble methods have demonstrated high accuracy for a variety of problems in different areas. The best-known algorithms intensively used in practice are random forests and gradient boosting. In this paper we present InfiniteBoost, a novel algorithm that combines the best properties of these two approaches. The algorithm constructs an ensemble of trees for which two properties hold: trees of the ensemble incorporate the mistakes made by others; at the same time, the ensemble could contain an infinite number of trees without the over-fitting effect. The proposed algorithm is evaluated on regression, classification, and ranking tasks using large scale, publicly available datasets. |

InfiniteInsight Function Library( IFL) |
The InfiniteInsight function library (“IFL”) for SAP HANA allows in-memory execution of InfiniteInsight-classic workflows. |

Influence Diagram( ID) |
An influence diagram (ID) (also called a relevance diagram, decision diagram or a decision network) is a compact graphical and mathematical representation of a decision situation. It is a generalization of a Bayesian network, in which not only probabilistic inference problems but also decision making problems (following maximum expected utility criterion) can be modeled and solved. |
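To make the maximum expected utility criterion concrete, the sketch below evaluates a tiny decision situation with one chance node, one decision node, and one utility table. The scenario (carrying an umbrella under uncertain weather), all probabilities and utilities, and the names `expected_utility` and `best_decision` are hypothetical values chosen for illustration.

```python
# Chance node: probability of each weather outcome (hypothetical numbers).
p_weather = {"rain": 0.3, "sun": 0.7}

# Utility node: payoff for each (decision, weather) pair (hypothetical numbers).
utility = {
    ("umbrella", "rain"): 70,
    ("umbrella", "sun"): 80,
    ("none", "rain"): 0,
    ("none", "sun"): 100,
}


def expected_utility(decision: str) -> float:
    """Average the utility of a decision over the chance node's distribution."""
    return sum(p * utility[(decision, weather)] for weather, p in p_weather.items())


def best_decision(decisions):
    """Pick the decision with maximum expected utility."""
    return max(decisions, key=expected_utility)


choice = best_decision(["umbrella", "none"])
# Expected utilities: umbrella = 0.3*70 + 0.7*80 = 77, none = 0.3*0 + 0.7*100 = 70.
```

Here the decision does not influence the weather and the weather is not observed before deciding; richer influence diagrams add such informational and causal arcs.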

Infobesity |
Information overload (also known as infobesity or infoxication) refers to the difficulty a person can have understanding an issue and making decisions that can be caused by the presence of too much information. The term was popularized by Alvin Toffler in his bestselling 1970 book Future Shock, but was mentioned earlier in a 1964 book by Bertram Gross, The Managing of Organizations. Speier et al. (1999) stated: “Information overload occurs when the amount of input to a system exceeds its processing capacity. Decision makers have fairly limited cognitive processing capacity. Consequently, when information overload occurs, it is likely that a reduction in decision quality will occur.” In recent years, the term ‘information overload’ has evolved into phrases such as ‘information glut’ and ‘data smog’ (Shenk, 1997). What was once a term grounded in cognitive psychology has evolved into a rich metaphor used outside the world of academia. In many ways, the advent of information technology has increased the focus on information overload: information technology may be a primary reason for information overload due to its ability to produce more information more quickly and to disseminate this information to a wider audience than ever before (Evaristo, Adams, & Curley, 1995; Hiltz & Turoff, 1985). |

Information Coefficient( IC) |
The information coefficient (IC) is a measure of the merit of a predicted value. In finance, the information coefficient is used as a performance metric for the predictive skill of a financial analyst. The information coefficient is similar to correlation in that it can be seen to measure the linear relationship between two random variables, e.g. predicted stock returns and the actualized returns. As a correlation, the information coefficient ranges from -1 to 1, with 0 denoting no linear relationship between predictions and actual values (poor forecasting skills), 1 denoting a perfect linear relationship (good forecasting skills), and -1 denoting predictions that are perfectly inversely related to actual values. |
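A minimal sketch of the computation, assuming the IC is taken to be the Pearson correlation between predicted and realized returns (a rank correlation is another common choice). The function name `information_coefficient` and the return figures are hypothetical.

```python
import math


def information_coefficient(predicted, realized):
    """Pearson correlation between predicted and realized values."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_r = sum(realized) / n
    covariance = sum((p - mean_p) * (r - mean_r)
                     for p, r in zip(predicted, realized))
    spread_p = math.sqrt(sum((p - mean_p) ** 2 for p in predicted))
    spread_r = math.sqrt(sum((r - mean_r) ** 2 for r in realized))
    return covariance / (spread_p * spread_r)


predicted = [0.02, -0.01, 0.03, 0.00]
realized = [0.04, -0.02, 0.06, 0.00]  # exactly twice the predictions
ic = information_coefficient(predicted, realized)
# The realized returns are a positive linear function of the predictions, so ic is 1
# up to floating-point rounding.
```

Note that the forecasts above are off by a factor of two in magnitude yet still score a perfect IC: the measure rewards getting the linear relationship right, not the scale.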

Information Extraction( IE) |
Information extraction (IE) is the task of automatically extracting structured information from unstructured and/or semi-structured machine-readable documents. In most of the cases this activity concerns processing human language texts by means of natural language processing (NLP). Recent activities in multimedia document processing like automatic annotation and content extraction out of images/audio/video could be seen as information extraction. |

Information Fusion |
Information integration (II) (also called deduplication and referential integrity) is the merging of information from heterogeneous sources with differing conceptual, contextual and typographical representations. It is used in data mining and consolidation of data from unstructured or semi-structured resources. Typically, information integration refers to textual representations of knowledge but is sometimes applied to rich-media content. Information fusion, a related term, involves the combination of information into a new set of information with the aim of reducing uncertainty. |

Information Fuzzy Networks( IFN) |
Info Fuzzy Networks (IFN) is a greedy machine learning algorithm for supervised learning. The data structure produced by the learning algorithm is also called an Info Fuzzy Network. IFN construction is quite similar to decision tree construction. However, IFN constructs a directed graph and not a tree. IFN also uses the conditional mutual information metric to choose features during the construction stage, while decision trees usually use other metrics such as entropy or Gini. |

Information Gain |
In information theory and machine learning, information gain is a synonym for Kullback-Leibler divergence. However, in the context of decision trees, the term is sometimes used synonymously with mutual information, which is the expectation value of the Kullback-Leibler divergence of a conditional probability distribution. |
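In the decision-tree sense, information gain can be sketched as the entropy of the labels minus the expected entropy after splitting on a feature, which equals the mutual information between feature and label. The function names and toy data below are illustrative only.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(labels, feature_values):
    """Entropy of the labels minus the expected entropy after the split."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the label subsets, one per feature value.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

labels = ["yes", "yes", "no", "no"]
perfect_feature = ["a", "a", "b", "b"]   # separates the classes completely
useless_feature = ["a", "b", "a", "b"]   # tells us nothing about the class

gain_perfect = information_gain(labels, perfect_feature)
gain_useless = information_gain(labels, useless_feature)
```

A tree learner would pick the feature with the larger gain at each split.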

Information Harvesting |
Information Harvesting (IH) was an early data mining product from the 1990s. It was invented by Ralphe Wiggins and produced by the Ryan Corp, later Information Harvesting Inc., of Cambridge, Massachusetts. IH sought to infer rules from sets of data. It did this first by classifying various input variables into one of a number of bins, thereby putting some structure on the continuous variables in the input. IH then proceeded to generate rules, trading off generalization against memorization, that would infer the value of the prediction variable, possibly creating many levels of rules in the process. It included strategies for checking whether overfitting took place and, if so, correcting for it. Because of its strategies for correcting for overfitting by considering more data, and refining the rules based on that data, IH might also be considered a form of machine learning. |

Information Integration |
Information integration (II) (also called deduplication and referential integrity) is the merging of information from heterogeneous sources with differing conceptual, contextual and typographical representations. It is used in data mining and consolidation of data from unstructured or semi-structured resources. Typically, information integration refers to textual representations of knowledge but is sometimes applied to rich-media content. Information fusion, a related term, involves the combination of information into a new set of information with the aim of reducing uncertainty. |

Information Maximization( Infomax) |
Infomax is an optimization principle for artificial neural networks and other information processing systems. It prescribes that a function that maps a set of input values I to a set of output values O should be chosen or learned so as to maximize the average Shannon mutual information between I and O, subject to a set of specified constraints and/or noise processes. Infomax algorithms are learning algorithms that perform this optimization process. The principle was described by Linsker in 1987. Infomax, in its zero-noise limit, is related to the principle of redundancy reduction proposed for biological sensory processing by Horace Barlow in 1961, and applied quantitatively to retinal processing by Atick and Redlich. One of the applications of infomax has been to an independent component analysis algorithm that finds independent signals by maximising entropy. Infomax-based ICA was described by Bell and Sejnowski in 1995. |

Information Potential Auto-Encoders |
In this paper, we suggest a framework to make use of mutual information as a regularization criterion to train Auto-Encoders (AEs). In the proposed framework, AEs are regularized by minimization of the mutual information between input and encoding variables of AEs during the training phase. In order to estimate the entropy of the encoding variables and the mutual information, we propose a non-parametric method. We also give an information theoretic view of Variational AEs (VAEs), which suggests that VAEs can be considered as parametric methods that estimate entropy. Experimental results show that the proposed non-parametric models have more degrees of freedom in terms of representation learning of features drawn from complex distributions such as Mixture of Gaussians, compared to methods which estimate entropy using parametric approaches, such as Variational AEs. |

Information Retrieval( IR) |
Information retrieval is the activity of obtaining information resources relevant to an information need from a collection of information resources. Searches can be based on metadata or on full-text (or other content-based) indexing. Automated information retrieval systems are used to reduce what has been called “information overload”. Many universities and public libraries use IR systems to provide access to books, journals and other documents. Web search engines are the most visible IR applications. An information retrieval process begins when a user enters a query into the system. Queries are formal statements of information needs, for example search strings in web search engines. In information retrieval a query does not uniquely identify a single object in the collection. Instead, several objects may match the query, perhaps with different degrees of relevancy. |

Information Value( IV) |
In statistical data mining, we sometimes need to determine which of a set of variables are best at capturing a desired behavior. For example, let’s say you have a pool of customers for your credit card company, and you want to determine which of them are about to default (i.e. refuse to pay up after possibly making a huge expense). You then need to identify which of the attributes you have on the customer can potentially identify and alert you to such behavior. One of the popular ways in which analysts do this is by looking at something called ‘Information Value’. In the context of data mining, it is also sometimes referred to by the short form InfoVal. |
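The entry above does not spell out the formula. A common definition sums, over the bins of an attribute, (share of goods - share of bads) × ln(share of goods / share of bads), where the logarithm is the bin's weight of evidence. The sketch below uses that definition; the function name, the bin counts, and the requirement that every bin contains at least one good and one bad case are assumptions.

```python
import math

def information_value(bins):
    """Information Value of one attribute.

    bins: list of (good_count, bad_count) tuples, one per attribute bin,
    e.g. non-defaulters and defaulters per income band.
    """
    total_good = sum(good for good, _ in bins)
    total_bad = sum(bad for _, bad in bins)
    iv = 0.0
    for good, bad in bins:
        if good == 0 or bad == 0:
            # Assumption: empty cells are rejected rather than smoothed.
            raise ValueError("each bin needs at least one good and one bad case")
        pct_good = good / total_good
        pct_bad = bad / total_bad
        weight_of_evidence = math.log(pct_good / pct_bad)
        iv += (pct_good - pct_bad) * weight_of_evidence
    return iv

# Hypothetical attribute with two bins that separate the classes well ...
iv_strong = information_value([(80, 20), (20, 80)])
# ... and one whose bins carry no information about default.
iv_none = information_value([(50, 50), (50, 50)])
```

Attributes are then ranked by IV, with higher values indicating stronger separation between defaulters and non-defaulters.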

Information Visualization |
Information visualization or information visualisation is the study of (interactive) visual representations of abstract data to reinforce human cognition. The abstract data include both numerical and non-numerical data, such as text and geographic information. However, information visualization differs from scientific visualization: “it’s infovis (information visualization) when the spatial representation is chosen, and it’s scivis (scientific visualization) when the spatial representation is given”. |

Information-Based Optimal Subdata Selection( IBOSS) |
Extraordinary amounts of data are being produced in many branches of science. Proven statistical methods are no longer applicable with extraordinarily large data sets due to computational limitations. A critical step in big data analysis is data reduction. Existing investigations in the context of linear regression focus on subsampling-based methods. However, not only is this approach prone to sampling errors, it also leads to a covariance matrix of the estimators that is typically bounded from below by a term that is of the order of the inverse of the subdata size. We propose a novel approach, termed information-based optimal subdata selection (IBOSS). Compared to leading existing subdata methods, the IBOSS approach has the following advantages: (i) it is significantly faster; (ii) it is suitable for distributed parallel computing; (iii) the variances of the slope parameter estimators converge to 0 as the full data size increases even if the subdata size is fixed, i.e., the convergence rate depends on the full data size; (iv) data analysis for IBOSS subdata is straightforward and the sampling distribution of an IBOSS estimator is easy to assess. Theoretical results and extensive simulations demonstrate that the IBOSS approach is superior to subsampling-based methods, sometimes by orders of magnitude. The advantages of the new approach are also illustrated through analysis of real data. |

Inhomogeneous Self-Exciting Process( IHSEP) |
IHSEP |

Initial Data Analysis( IDA) |
The most important distinction between the initial data analysis phase and the main analysis phase is that during initial data analysis one refrains from any analysis aimed at answering the original research question. The initial data analysis phase is guided by the following four questions: • Quality of data • Quality of measurements • Initial transformations • Did the implementation of the study fulfill the intentions of the research design? |

Innovation Management |
Innovation management is the management of innovation processes. It refers both to product and organizational innovation. Innovation management includes a set of tools that allow managers and engineers to cooperate with a common understanding of processes and goals. Innovation management allows the organization to respond to external or internal opportunities, and use its creativity to introduce new ideas, processes or products. It is not relegated to R&D; it involves workers at every level in contributing creatively to a company’s product development, manufacturing and marketing. |

Innovation Pursuit( iPursuit) |
In subspace clustering, a group of data points belonging to a union of subspaces are assigned membership to their respective subspaces. This paper presents a new approach dubbed Innovation Pursuit (iPursuit) to the problem of subspace clustering using a new geometrical idea whereby each subspace is identified based on its novelty with respect to the other subspaces. The proposed approach finds the subspaces consecutively by solving a series of simple linear optimization problems, each searching for some direction in the span of the data that is potentially orthogonal to all subspaces except for the one to be identified in one step of the algorithm. A detailed mathematical analysis is provided establishing sufficient conditions for the proposed approach to correctly cluster the data points. Remarkably, the proposed approach can provably yield exact clustering even when the subspaces have significant intersections under mild conditions on the distribution of the data points in the subspaces. Moreover, it is shown that the complexity of iPursuit is almost independent of the dimension of the data. The numerical simulations demonstrate that iPursuit can often outperform the state-of-the-art subspace clustering algorithms, more so for subspaces with significant intersections. |

Input Fast-Forwarding |
This paper introduces a new architectural framework, known as input fast-forwarding, that can enhance the performance of deep networks. The main idea is to incorporate a parallel path that sends representations of input values forward to deeper network layers. This scheme is substantially different from ‘deep supervision’ in which the loss layer is re-introduced to earlier layers. The parallel path provided by fast-forwarding enhances the training process in two ways. First, it enables the individual layers to combine higher-level information (from the standard processing path) with lower-level information (from the fast-forward path). Second, this new architecture reduces the problem of vanishing gradients substantially because the fast-forwarding path provides a shorter route for gradient backpropagation. In order to evaluate the utility of the proposed technique, a Fast-Forward Network (FFNet), with 20 convolutional layers along with parallel fast-forward paths, has been created and tested. The paper presents empirical results that demonstrate improved learning capacity of FFNet due to fast-forwarding, as compared to GoogLeNet (with deep supervision) and CaffeNet, which are 4x and 18x larger in size, respectively. All of the source code and deep learning models described in this paper will be made available to the entire research community. |

Instance Segmentation |
Instance segmentation is the problem of detecting and delineating each object of interest appearing in an image. Current instance segmentation approaches consist of ensembles of modules that are trained independently of each other, thus missing learning opportunities. |

Instance Selection( IS) |
In supervised learning, a training set providing previously known information is used to classify new instances. Commonly, many instances are stored in the training set, but some of them are not useful for classification; it is therefore possible to achieve acceptable classification rates while ignoring the non-useful cases. This process is known as instance selection. Instance selection reduces the training set, which in turn reduces runtimes in the classification and/or training stages of classifiers. |

Instance-Based Learning( IBL) |
In machine learning, instance-based learning or memory-based learning is a family of learning algorithms that, instead of performing explicit generalization, compares new problem instances with instances seen in training, which have been stored in memory. Instance-based learning is a kind of lazy learning. |
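A k-nearest-neighbour classifier is perhaps the simplest illustration of the idea: nothing is learned up front, and each query is compared against the stored instances. The sketch below is one such instance-based learner, not a definition of the whole family; the function name and toy data are hypothetical.

```python
import math
from collections import Counter

def knn_predict(training, query, k=3):
    """Classify `query` by majority vote among its k nearest stored instances.

    training: list of (feature_vector, label) pairs kept in memory as-is.
    """
    # Lazy learning: all work happens at query time, by distance comparison.
    neighbours = sorted(training, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical stored instances forming two well-separated groups.
training = [
    ((0.0, 0.0), "a"), ((0.0, 1.0), "a"), ((1.0, 0.0), "a"),
    ((5.0, 5.0), "b"), ((5.0, 6.0), "b"), ((6.0, 5.0), "b"),
]
near_origin = knn_predict(training, (0.5, 0.5))
far_corner = knn_predict(training, (5.5, 5.5))
```

The trade-off is visible in the code: training is free, but every prediction scans the whole stored set.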

Instantaneous Rates( IRATE) |
The Instantaneous Rates (IRATE) model is used to analyze tagging data. It is based on the Hoenig et al. (1998) alternate formulation of the Brownie et al. (1985) band recovery models that allow fishing and natural mortality to be derived from the exploitation rate and survival rate estimates of a Type II (continuous) fishery. IRATE allows both age-independent and age-dependent instantaneous rates models (Hoenig et al., 1998; Jiang et al., 2007) to be fitted to multi-year fish tag return data. IRATE allows model development with either age-dependent harvest-only or harvest and catch-release tag returns or similar age independent models. The software, developed by Dr. Gary Nelson of the Massachusetts Division of Marine Fisheries, also allows estimation of harvest reporting rates, catch and release reporting rates, and tag retention of harvested and/or released fish. However, not all parameters in the model can be estimated simultaneously with tag data alone. Some parameters must be fixed and assumed known (usually reporting rate and tag loss) to obtain good estimates of remaining parameters. Additionally, the model can account for non-mixing of the tagged fish in the first release year and adjust for harvest and M selectivity in the age-based models. The negative log likelihood is used as the objective function to obtain maximum likelihood estimates of parameters. Several model fit statistics are provided that can be used to select the best model formulation; these include the Akaike Information Criterion (AIC), c-hat (a measure of overdispersion) and standard residuals. The calculation engine is written in AD Model Builder. IRATER |

Instrumental Panel Data Models |
ivpanel |

Instrumental Variable( IV) |
In statistics, econometrics, epidemiology and related disciplines, the method of instrumental variables (IV) is used to estimate causal relationships when controlled experiments are not feasible or when a treatment is not successfully delivered to every unit in a randomized experiment. Instrumental variable methods allow consistent estimation when the explanatory variables (covariates) are correlated with the error terms of a regression relationship. Such correlation may occur when the dependent variable causes at least one of the covariates (‘reverse’ causation), when there are relevant explanatory variables which are omitted from the model, or when the covariates are subject to measurement error. In this situation, ordinary linear regression generally produces biased and inconsistent estimates. However, if an instrument is available, consistent estimates may still be obtained. An instrument is a variable that does not itself belong in the explanatory equation and is correlated with the endogenous explanatory variables, conditional on the other covariates. In linear models, there are two main requirements for using an IV: • The instrument must be correlated with the endogenous explanatory variables, conditional on the other covariates. • The instrument cannot be correlated with the error term in the explanatory equation (conditional on the other covariates), that is, the instrument cannot suffer from the same problem as the original predicting variable. ivmodel |
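For one endogenous regressor x and one instrument z, the IV estimate of the slope reduces to cov(z, y) / cov(z, x). The sketch below contrasts it with ordinary least squares on a small constructed data set in which the true slope is 2; the data and function names are hypothetical, and the instrument is built to be exactly uncorrelated with the error term.

```python
def covariance(a, b):
    """Population covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / n

def ols_slope(x, y):
    """Ordinary least squares slope of y on x."""
    return covariance(x, y) / covariance(x, x)

def iv_slope(z, x, y):
    """Instrumental variable slope using the single instrument z."""
    return covariance(z, y) / covariance(z, x)

# Constructed example: u is an unobserved confounder.
z = [-1.0, -1.0, 1.0, 1.0]        # instrument, uncorrelated with u
u = [1.0, -1.0, 1.0, -1.0]        # drives both x and the error term
x = [zi + ui for zi, ui in zip(z, u)]             # endogenous regressor
y = [2.0 * xi + 3.0 * ui for xi, ui in zip(x, u)]  # true slope is 2

beta_ols = ols_slope(x, y)   # pulled away from 2 by the confounder
beta_iv = iv_slope(z, x, y)  # recovers the true slope here
```

With more regressors or instruments, the same idea is usually implemented as two-stage least squares.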

Integer Echo State Network( intESN) |
We propose an integer approximation of Echo State Networks (ESN) based on the mathematics of hyperdimensional computing. The reservoir of the proposed Integer Echo State Network (intESN) contains only n-bit integers and replaces the recurrent matrix multiply with an efficient cyclic shift operation. Such an architecture results in dramatic improvements in memory footprint and computational efficiency, with minimal performance loss. Our architecture naturally supports the usage of the trained reservoir in symbolic processing tasks of analogy making and logical inference. |

Integer Linear Programming( ILP) |
An integer programming problem is a mathematical optimization or feasibility program in which some or all of the variables are restricted to be integers. In many settings the term refers to integer linear programming (ILP), in which the objective function and the constraints (other than the integer constraints) are linear. Integer programming is NP-hard. A special case, 0-1 integer linear programming, in which unknowns are binary, and only the restrictions must be satisfied, is one of Karp’s 21 NP-complete problems. Book: Compact Extended Linear Programming Models |
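To make the 0-1 case concrete, here is a brute-force sketch that enumerates every binary assignment, keeps the feasible ones, and returns the best. It is exponential in the number of variables and only meant for tiny instances; the function name and the knapsack-style data are hypothetical.

```python
from itertools import product

def solve_binary_ilp(c, A, b):
    """Maximize c.x subject to A x <= b with every x_j in {0, 1}.

    Plain enumeration of all 2^n assignments; returns (best_x, best_value),
    or (None, None) if no assignment is feasible.
    """
    best_x, best_value = None, None
    for x in product((0, 1), repeat=len(c)):
        feasible = all(
            sum(a_ij * x_j for a_ij, x_j in zip(row, x)) <= b_i
            for row, b_i in zip(A, b)
        )
        if not feasible:
            continue
        value = sum(c_j * x_j for c_j, x_j in zip(c, x))
        if best_value is None or value > best_value:
            best_x, best_value = x, value
    return best_x, best_value

# Tiny knapsack: item values 6, 10, 12; weights 1, 2, 3; capacity 5.
values = [6, 10, 12]
constraints = [[1, 2, 3]]
limits = [5]
best_x, best_value = solve_binary_ilp(values, constraints, limits)
```

Practical solvers replace the enumeration with branch-and-bound or cutting planes, but the feasible set and objective are the same.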

Integrated Discrimination Improvement( IDI) |
Integrated Discrimination Improvement (IDI) is described in the paper: Jialiang Li (2013) <doi:10.1093/biostatistics/kxs047>. mcca |

Integrated Nested Laplace Approximation( INLA) |
A fully automatic approach for approximate inference in latent Gaussian models. INLA,meta4diag |

Integrative Connectionist Learning Systems( ICOS) |
The connectionist systems (artificial neural networks) developed and widely used so far are mainly based on a single brain-like connectionist principle of information processing, where learning and information exchange occur in the connections. This paper extends this paradigm of connectionist systems to a new trend, integrative connectionist learning systems (ICOS), which integrate in their structure and learning algorithms principles from different hierarchical levels of information processing in the brain, including the neuronal, genetic, and quantum levels. Spiking neural networks (SNN) are used as a basic connectionist learning model which is further extended with other information learning principles to create different ICOS. For example, evolving SNN for multitask learning are presented and illustrated on a case study of person authentication based on multimodal auditory and visual information. Integrative gene-SNN are presented, where gene interactions are included in the functioning of a spiking neuron. They are applied on a case study of computational neurogenetic modeling. Integrative quantum-SNN are introduced with a quantum Hebbian learning, where input features as well as information spikes are represented by quantum bits that result in exponentially faster feature selection and model learning. ICOS can be used to solve challenging biological and engineering problems more efficiently when fast adaptive learning systems are needed to incrementally learn in a large dimensional space. They can also help to better understand complex information processes in the brain, especially how information processes at different information levels interact. Open questions, challenges and directions for further research are presented. |

Intelligence Amplification |
Intelligence amplification (IA) (also referred to as cognitive augmentation and machine augmented intelligence) refers to the effective use of information technology in augmenting human intelligence. The idea was first proposed in the 1950s and 1960s by cybernetics and early computer pioneers. IA is sometimes contrasted with AI (Artificial Intelligence), that is, the project of building a human-like intelligence in the form of an autonomous technological system such as a computer or robot. AI has encountered many fundamental obstacles, practical as well as theoretical, which for IA seem moot, as it needs technology merely as an extra support for an autonomous intelligence that has already proven to function. Moreover, IA has a long history of success, since all forms of information technology, from the abacus to writing to the Internet, have been developed basically to extend the information processing capabilities of the human mind (see extended mind and distributed cognition). |

Intelligence Graph |
In fact, there exist three genres of intelligence architectures: logics (e.g. Random Forest, A* Search), neurons (e.g. CNN, LSTM) and probabilities (e.g. Naive Bayes, HMM), all of which are incompatible with each other. However, to construct powerful intelligence systems with various methods, we propose the intelligence graph (iGraph for short), which is composed of both neural and probabilistic graphs, under the framework of forward-backward propagation. Following the iGraph paradigm, we design a recommendation model with a semantic principle. First, the probabilistic distributions of categories are generated from the embedding representations of users/items, in the manner of neurons. Second, the probabilistic graph infers the distributions of features, in the manner of probabilities. Last, for recommendation diversity, we perform an expectation computation and then conduct a logic judgment, in the manner of logics. Experimentally, we beat the state-of-the-art baselines and verify our conclusions. |

Intelligent Data Analytics( IDA) |
The art of Conquering Data with Intelligent Systems includes all areas of Research and Development in Intelligent Data Analytics, the area including Data Analytics and Intelligent Systems, that focus on computational, mathematical, statistical, cognitive, and algorithmic techniques for modeling high dimensional data with the ultimate goal of extracting meaning from (raw) data. This requires methods ranging from learning, inference, prediction and knowledge discovery to visualisation that are applicable to both small and large volumes of mostly dynamic data sets collected and integrated from multiple sources, across multiple modalities. These methods and techniques trigger the need for assessment and evaluation: automated and by humans. Intelligent Data Analytics enables automated hypothesis generation, event correlation, and anomaly detection and helps in explaining phenomena and inferring results that would otherwise remain hidden. Intelligent Data Analytics is a cornerstone in modern Big Data, amplifying perhaps its most important aspect: Value. |

Intelligent K-Means( ik-Means) |
Intelligent K-Means (iK-Means) is a K-Means initialization algorithm. It is a simple algorithm based on the concept of anomalous patterns, it is easy to implement, and it may even help you to find how many clusters there are in a dataset (remember, you need to know this in order to run K-Means!). Intelligent Choice of the Number of Clusters in K-Means Clustering: An Experimental Study with Different Cluster Spreads |

Intelligent Personal Agent( IPA) |
An Intelligent Personal Agent (IPA) is an agent whose purpose is to help the user gain information from reliable resources with the help of knowledge navigation techniques, and to save the time spent searching for the best content. The agent is also responsible for responding to chat-based queries with the help of a Conversation Corpus. |

Intelligent Software |
Christopher Bishop: “Software that can adapt, learn and reason” |

Intention Analysis |
Intention Analysis is the identification of intentions from text, be it the intention to purchase or the intention to sell or to complain, accuse, inquire, opine, advocate or to quit, in incoming customer messages or in call center transcripts. Intention analysis using topic models |

Inter Rater Reliability( IRR) |
In statistics, inter-rater reliability, inter-rater agreement, or concordance is the degree of agreement among raters. It gives a score of how much homogeneity, or consensus, there is in the ratings given by judges. It is useful in refining the tools given to human judges, for example by determining if a particular scale is appropriate for measuring a particular variable. If various raters do not agree, either the scale is defective or the raters need to be re-trained. There are a number of statistics which can be used to determine inter-rater reliability. Different statistics are appropriate for different types of measurement. Some options are: joint-probability of agreement, Cohen’s kappa and the related Fleiss’ kappa, inter-rater correlation, concordance correlation coefficient and intra-class correlation. rhoR |
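Of the statistics listed, Cohen's kappa is easy to sketch for two raters: it compares the observed agreement with the agreement expected by chance from each rater's label frequencies. The function name and the ratings below are hypothetical.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("need two non-empty rating lists of equal length")
    # Observed proportion of items on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected if each rater labelled independently at random
    # with their own label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)

# Hypothetical ratings: the raters agree on 8 of 10 items.
rater_a = ["yes"] * 5 + ["no"] * 5
rater_b = ["yes", "yes", "yes", "yes", "no", "no", "no", "no", "no", "yes"]
kappa = cohens_kappa(rater_a, rater_b)
```

Here raw agreement is 0.8, but half of that would be expected by chance, so kappa is noticeably lower than the raw figure.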

Interactive Report |
An “Interactive Report” provides a new paradigm to fill the gap between Static Report and BI Tool. It has the following characteristics … 1. Like a static report, “Interactive Report” is still based on “static data”, which is a fixed set of data generated in a periodic batch fashion. 2. Unlike static report, this pre-generated “static data” is much larger and wider that covers a broader scope of questions that the execs may ask. 3. Because the “static data” is large and wide, it is impossible to visualize all aspects in the report. Therefore, only one perspective of the static data (based on the exec’s pre-specified requirement) is shown in the report. 4. However, if the exec wants to ask a different question, he/she can switch to a different perspective of the same “static data”. |

Interior Point( IP) |
Interior point methods (also referred to as barrier methods) are a certain class of algorithms to solve linear and nonlinear convex optimization problems. John von Neumann suggested an interior point method of linear programming which was neither a polynomial time method nor an efficient method in practice. In fact, it turned out to be slower in practice than the simplex method, which is not a polynomial time method. In 1984, Narendra Karmarkar developed a method for linear programming called Karmarkar’s algorithm which runs in provably polynomial time and is also very efficient in practice. It enabled solutions of linear programming problems which were beyond the capabilities of the simplex method. Contrary to the simplex method, it reaches a best solution by traversing the interior of the feasible region. The method can be generalized to convex programming based on a self-concordant barrier function used to encode the convex set. Any convex optimization problem can be transformed into minimizing (or maximizing) a linear function over a convex set by converting to the epigraph form. The idea of encoding the feasible set using a barrier and designing barrier methods was studied by Anthony V. Fiacco, Garth P. McCormick, and others in the early 1960s. These ideas were mainly developed for general nonlinear programming, but they were later abandoned due to the presence of more competitive methods for this class of problems (e.g. sequential quadratic programming). Yurii Nesterov and Arkadi Nemirovski came up with a special class of such barriers that can be used to encode any convex set. They guarantee that the number of iterations of the algorithm is bounded by a polynomial in the dimension and accuracy of the solution. 
Karmarkar’s breakthrough revitalized the study of interior point methods and barrier problems, showing that it was possible to create an algorithm for linear programming characterized by polynomial complexity and, moreover, that was competitive with the simplex method. Already Khachiyan’s ellipsoid method was a polynomial time algorithm; however, it was too slow to be of practical interest. The class of primal-dual path-following interior point methods is considered the most successful. Mehrotra’s predictor-corrector algorithm provides the basis for most implementations of this class of methods. |
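The barrier idea can be sketched on a two-variable linear program: the constraints A x <= b are folded into the objective through a logarithmic barrier, each barrier problem is solved with damped Newton steps, and the barrier weight is relaxed step by step so the iterates approach the optimum from the interior. This is a basic textbook-style barrier method written for illustration, not Karmarkar's algorithm or a primal-dual method; the function name, parameters, and problem data are all hypothetical.

```python
import math

def barrier_lp_2d(c, A, b, x0, t=1.0, mu=10.0, gap_tol=1e-6):
    """Minimize c.x subject to A x <= b in two variables with a log barrier.

    x0 must be strictly feasible. For each t, t*c.x - sum(log(slack)) is
    minimized; with m constraints the centered point is within m/t of the
    optimal value, so the loop stops once m/t <= gap_tol.
    """
    x = list(x0)
    m = len(A)

    def slacks(p):
        return [bi - (ai[0] * p[0] + ai[1] * p[1]) for ai, bi in zip(A, b)]

    def barrier_objective(p):
        s = slacks(p)
        if min(s) <= 0:
            return math.inf  # outside the interior of the feasible region
        return t * (c[0] * p[0] + c[1] * p[1]) - sum(math.log(si) for si in s)

    while m / t > gap_tol:
        for _ in range(50):  # Newton iterations for the current t
            s = slacks(x)
            g0 = t * c[0] + sum(ai[0] / si for ai, si in zip(A, s))
            g1 = t * c[1] + sum(ai[1] / si for ai, si in zip(A, s))
            h00 = sum(ai[0] * ai[0] / si ** 2 for ai, si in zip(A, s))
            h01 = sum(ai[0] * ai[1] / si ** 2 for ai, si in zip(A, s))
            h11 = sum(ai[1] * ai[1] / si ** 2 for ai, si in zip(A, s))
            det = h00 * h11 - h01 * h01
            # Newton direction: solve the 2x2 system H d = -g explicitly.
            dx0 = -(h11 * g0 - h01 * g1) / det
            dx1 = -(h00 * g1 - h01 * g0) / det
            decrement = -(g0 * dx0 + g1 * dx1)  # squared Newton decrement
            if decrement <= 1e-9:
                break
            # Backtracking line search; infeasible trial points cost infinity.
            f0 = barrier_objective(x)
            step = 1.0
            while (barrier_objective([x[0] + step * dx0, x[1] + step * dx1])
                   > f0 - 0.25 * step * decrement):
                step *= 0.5
            x = [x[0] + step * dx0, x[1] + step * dx1]
        t *= mu  # weaken the barrier relative to the objective
    return x

# Maximize x + y subject to x + 2y <= 4, 3x + y <= 6, x >= 0, y >= 0,
# written as minimizing -x - y. The LP optimum is the vertex (1.6, 1.2).
c = [-1.0, -1.0]
A = [[1.0, 2.0], [3.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
b = [4.0, 6.0, 0.0, 0.0]
solution = barrier_lp_2d(c, A, b, x0=[0.5, 0.5])
```

Unlike the simplex method, the iterates never touch the boundary: every intermediate point keeps all slacks strictly positive.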

Interior Point Optimizer( Ipopt) |
Ipopt (Interior Point OPTimizer, pronounced eye-pea-Opt) is a software package for large-scale nonlinear optimization. It is designed to find (local) solutions of mathematical optimization problems of the form: min f(x) for x in R^n, so that gL <= g(x) <= gU; xL <= x <= xU. Ipopt is written in C++ and is released as open source code under the Eclipse Public License (EPL). It is available from the COIN-OR initiative. The code has been written by Andreas Wächter and Carl Laird. The COIN-OR project managers for Ipopt are Andreas Wächter and Stefan Vigerske. |

International Conference on Data Mining( ICDM) |
The IEEE International Conference on Data Mining series (ICDM) has established itself as the world’s premier research conference in data mining. It provides an international forum for presentation of original research results, as well as exchange and dissemination of innovative, practical development experiences. The conference covers all aspects of data mining, including algorithms, software and systems, and applications. ICDM draws researchers and application developers from a wide range of data mining related areas such as statistics, machine learning, pattern recognition, databases and data warehousing, data visualization, knowledge-based systems, and high performance computing. By promoting novel, high quality research findings, and innovative solutions to challenging data mining problems, the conference seeks to continuously advance the state-of-the-art in data mining. Besides the technical program, the conference features workshops, tutorials, panels and, since 2007, the ICDM data mining contest. |

International Institute for Analytics( IIA) |
Founded in 2010 by CEO Jack Phillips and Research Director Thomas H. Davenport, the International Institute for Analytics is an independent research firm that works with organizations to build strong and competitive analytics programs. IIA offers unbiased advice in an industry dominated by hardware and software vendors, consultants and system integrators. With a vast network of analytics experts, academics and leaders at successful companies, we guide our clients as they build and grow successful analytics programs. |

International Mathematics and Statistics Library( IMSL) |
IMSL (International Mathematics and Statistics Library) is a commercial collection of software libraries of numerical analysis functionality that are implemented in the computer programming languages of C, Java, C#.NET, and Fortran. A Python interface is also available. The IMSL Libraries are provided by Rogue Wave Software. |

International Phonetic Alphabet( IPA) |
The International Phonetic Alphabet (unofficially, though commonly, abbreviated IPA) is an alphabetic system of phonetic notation based primarily on the Latin alphabet. It was devised by the International Phonetic Association as a standardized representation of the sounds of oral language. The IPA is used by lexicographers, foreign language students and teachers, linguists, speech-language pathologists, singers, actors, constructed language creators, and translators. The IPA is designed to represent only those qualities of speech that are part of oral language: phones, phonemes, intonation, and the separation of words and syllables. To represent additional qualities of speech, such as tooth gnashing, lisping, and sounds made with a cleft palate, an extended set of symbols called the Extensions to the IPA may be used. IPA symbols are composed of one or more elements of two basic types, letters and diacritics. For example, the sound of the English letter ⟨t⟩ may be transcribed in IPA with a single letter, [t], or with a letter plus diacritics, [t̺ʰ], depending on how precise one wishes to be. Often, slashes are used to signal broad or phonemic transcription; thus, /t/ is less specific than, and could refer to, either [t̺ʰ] or [t], depending on the context and language. Occasionally letters or diacritics are added, removed, or modified by the International Phonetic Association. As of the most recent change in 2005, there are 107 letters, 52 diacritics, and four prosodic marks in the IPA. These are shown in the current IPA chart, available at the website of the IPA. International Phonetic Association |

Internet of Everything( IoE) |
The Internet of Everything describes the networked connections between devices, people, processes and data. The Digitally Connected World. |

Internet of Things( IoT) |
The Internet of Things (IoT) is the interconnection of uniquely identifiable embedded computing devices within the existing Internet infrastructure. Typically, IoT is expected to offer advanced connectivity of devices, systems, and services that goes beyond machine-to-machine communications (M2M) and covers a variety of protocols, domains, and applications. The interconnection of these embedded devices (including smart objects) is expected to usher in automation in nearly all fields, while also enabling advanced applications like a Smart Grid. Things, in the IoT, can refer to a wide variety of devices such as heart monitoring implants, biochip transponders on farm animals, automobiles with built-in sensors, or field operation devices that assist fire-fighters in search and rescue. Current market examples include smart thermostat systems and washer/dryers that use Wi-Fi for remote monitoring. |

Internet of Us( IoU) |
Call it the internet of bodies, call it emotionally intelligent wearable tech. Designers, engineers and artists want to wake the mainstream tech giants up to the realities of asking people to wear technology. https://…/111811056605813020209 |

InterpNET |
Humans are able to explain their reasoning. On the contrary, deep neural networks are not. This paper attempts to bridge this gap by introducing a new way to design interpretable neural networks for classification, inspired by physiological evidence of the human visual system’s inner-workings. This paper proposes a neural network design paradigm, termed InterpNET, which can be combined with any existing classification architecture to generate natural language explanations of the classifications. The success of the module relies on the assumption that the network’s computation and reasoning is represented in its internal layer activations. While in principle InterpNET could be applied to any existing classification architecture, it is evaluated via an image classification and explanation task. Experiments on a CUB bird classification and explanation dataset show qualitatively and quantitatively that the model is able to generate high-quality explanations. While the current state-of-the-art METEOR score on this dataset is 29.2, InterpNET achieves a much higher METEOR score of 37.9. |

Interpretable Reasoning Network |
Multi-relation Question Answering is a challenging task, due to the requirement of elaborated analysis on questions and reasoning over multiple fact triples in knowledge base. In this paper, we present a novel model called Interpretable Reasoning Network that employs an interpretable, hop-by-hop reasoning process for question answering. The model dynamically decides which part of an input question should be analyzed at each hop; predicts a relation that corresponds to the current parsed results; utilizes the predicted relation to update the question representation and the state of the reasoning process; and then drives the next-hop reasoning. Experiments show that our model yields state-of-the-art results on two datasets. More interestingly, the model can offer traceable and observable intermediate predictions for reasoning analysis and failure diagnosis. |

Interpretive Structural Modelling( ISM) |
ISM was developed by Warfield in 1974. ISM is the process of assembling distinct or related elements into a simplified and organized format. Hence, ISM is a methodology that identifies the interrelationships among the elements considered and arranges them in a hierarchical, multilevel structure. ISM |

Inter-rater Reliability (Concordance) |
In statistics, inter-rater reliability, inter-rater agreement, or concordance is the degree of agreement among raters. It gives a score of how much homogeneity, or consensus, there is in the ratings given by judges. It is useful in refining the tools given to human judges, for example by determining if a particular scale is appropriate for measuring a particular variable. If various raters do not agree, either the scale is defective or the raters need to be re-trained. |
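As an illustration of how such an agreement score can be computed, the sketch below implements Cohen's kappa for two raters. Kappa is one common agreement statistic, chosen here as an assumption; the entry above does not name a specific measure. The function name `cohen_kappa` and the sample ratings are hypothetical.

```python
from collections import Counter

def cohen_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two raters (Cohen's kappa)."""
    n = len(ratings_a)
    # Observed agreement: share of items on which both raters gave the same label.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Expected agreement if the raters labelled independently at their own base rates.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    return (observed - expected) / (1.0 - expected)

# Hypothetical ratings from two judges on six items.
rater_1 = ["yes", "yes", "no", "no", "yes", "no"]
rater_2 = ["yes", "no", "no", "no", "yes", "yes"]
kappa = cohen_kappa(rater_1, rater_2)
```

A kappa of 1 means perfect agreement, while values near 0 mean the raters agree no more often than chance would predict.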

Intervention Analysis( IA) |
Intervention analysis is the application of modeling procedures for incorporating the effects of exogenous forces or interventions in time series analysis. These interventions, like policy changes, strikes, floods, and price changes, cause unusual changes in time series, resulting in unexpected, extraordinary observations known as outliers. Specifically, four types of outliers resulting from interventions, additive outliers (AO), innovational outliers (IO), temporary changes (TC), and level shifts (LS), have generated a lot of interest in literature. They pose nonstationarity challenges, which cannot be represented by the usual Box and Jenkins (1976) autoregressive integrated moving average (ARIMA) models alone. The most popular modeling procedures are those where “intervention” detection and estimation is paramount. Box and Tiao (1975) pioneered this type of analysis in their quest to solve the Los Angeles pollution problem. Important extensions and contributions have been made by Chan … |

Intervention in Prediction Measure( IPM) |
Random forests are a popular method in many fields since they can be successfully applied to complex data, with a small sample size, complex interactions and correlations, mixed type predictors, etc. Furthermore, they provide variable importance measures that aid qualitative interpretation and also the selection of relevant predictors. However, most of these measures rely on the choice of a performance measure. But measures of prediction performance are not unique or there is not even a clear definition, as in the case of multivariate response random forests. A new alternative importance measure, called Intervention in Prediction Measure, is investigated. It depends on the structure of the trees, without depending on performance measures. It is compared with other well-known variable importance measures in different contexts, such as a classification problem with variables of different types, another classification problem with correlated predictor variables, and problems with multivariate responses and predictors of different types. IPMRF |

Intervention Time Series Analysis( ITSA) |
Intervention time series analysis (ITSA) is an important method for analysing the effect of sudden events on time series data. ITSA methods are quasi-experimental in nature and the validity of modelling with these methods depends upon assumptions about the timing of the intervention and the response of the process to it. |

Intrablocks Correspondence Analysis( IBCA) |
We propose a new method to describe contingency tables with double partition structures in columns and rows. Furthermore, we propose new superimposed representations, based on the introduction of variable dilations for the partial clouds associated with the partitions of the columns and the rows. pamctdp |

Intra-Class Correlation( ICC) |
In statistics, the intraclass correlation (or the intraclass correlation coefficient, abbreviated ICC) is a descriptive statistic that can be used when quantitative measurements are made on units that are organized into groups. It describes how strongly units in the same group resemble each other. While it is viewed as a type of correlation, unlike most other correlation measures it operates on data structured as groups, rather than data structured as paired observations. The intraclass correlation is commonly used to quantify the degree to which individuals with a fixed degree of relatedness (e.g. full siblings) resemble each other in terms of a quantitative trait. Another prominent application is the assessment of consistency or reproducibility of quantitative measurements made by different observers measuring the same quantity. ICC.Sample.Size |
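A minimal sketch of one common estimator, the one-way ANOVA form of the ICC, may help make the definition concrete. It assumes every group has the same number of measurements; the function name `icc_oneway` and the data are illustrative and are not taken from the ICC.Sample.Size package.

```python
def icc_oneway(groups):
    """One-way ANOVA estimate of the ICC for equally sized groups."""
    n = len(groups)          # number of groups
    k = len(groups[0])       # measurements per group (assumed equal)
    grand_mean = sum(sum(g) for g in groups) / (n * k)
    group_means = [sum(g) / k for g in groups]
    # Mean square between groups and mean square within groups.
    ms_between = k * sum((m - grand_mean) ** 2 for m in group_means) / (n - 1)
    ms_within = sum(
        (value - m) ** 2 for g, m in zip(groups, group_means) for value in g
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Hypothetical data: three groups (e.g. sibling pairs) with two measurements each.
icc = icc_oneway([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
```

Values close to 1 indicate that units in the same group resemble each other strongly relative to the variation between groups.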

Intrinsic Credible Regions |
This paper defines intrinsic credible regions, a method to produce objective Bayesian credible regions which only depends on the assumed model and the available data. Lowest posterior loss (LPL) regions are defined as Bayesian credible regions which contain values of minimum posterior expected loss: they depend both on the loss function and on the prior specification. An invariant, information-theory based loss function, the intrinsic discrepancy is argued to be appropriate for scientific communication. Intrinsic credible regions are the lowest posterior loss regions with respect to the intrinsic discrepancy loss and the appropriate reference prior. The proposed procedure is completely general, and it is invariant under both reparametrization and marginalization. The exact derivation of intrinsic credible regions often requires numerical integration, but good analytical approximations are provided. Special attention is given to one-dimensional intrinsic credible intervals; their coverage properties show that they are always approximate (and sometimes exact) frequentist confidence intervals. |

Intrinsic Dimension( ID) |
In signal processing of multidimensional signals, for example in computer vision, the intrinsic dimension of the signal describes how many variables are needed to represent the signal. For a signal of N variables, its intrinsic dimension M satisfies 0 ≤ M ≤ N. Usually the intrinsic dimension of a signal relates to variables defined in a Cartesian coordinate system. In general, however, it is also possible to describe the concept for non-Cartesian coordinates, for example, using polar coordinates. IDmining |

Invariant Causal Prediction( ICP) |
InvariantCausalPrediction |

Invariant Coordinate Selection( ICS) |
A general method for exploring multivariate data by comparing different estimates of multivariate scatter is presented. The method is based upon the eigenvalue-eigenvector decomposition of one scatter matrix relative to another. In particular, it is shown that the eigenvectors can be used to generate an affine invariant coordinate system for the multivariate data. Consequently, we view this method as a method for invariant coordinate selection (ICS). By plotting the data with respect to this new invariant coordinate system, various data structures can be revealed. For example, under certain independent components models, it is shown that the invariant coordinates correspond to the independent components. Another example pertains to mixtures of elliptical distributions. In this case, it is shown that a subset of the invariant coordinates corresponds to Fisher’s linear discriminant subspace, even though the class identifications of the data points are unknown. Invariant Co-Ordinate Selection Multivariate Outlier Detection With ICS ICS |

Invariant Encoding Generative Adversarial Network( IVE-GAN) |
Generative adversarial networks (GANs) are a powerful framework for generative tasks. However, they are difficult to train and tend to miss modes of the true data generation process. Although GANs can learn a rich representation of the covered modes of the data in their latent space, the framework misses an inverse mapping from data to this latent space. We propose Invariant Encoding Generative Adversarial Networks (IVE-GANs), a novel GAN framework that introduces such a mapping for individual samples from the data by utilizing features in the data which are invariant to certain transformations. Since the model maps individual samples to the latent space, it naturally encourages the generator to cover all modes. We demonstrate the effectiveness of our approach in terms of generative performance and learning rich representations on several datasets including common benchmark image generation tasks. |

Inverse Classification |
Inverse classification is the process of perturbing an instance in a meaningful way such that it is more likely to conform to a specific class. Historical methods that address such a problem are often framed to leverage only a single classifier, or specific set of classifiers. These works are often accompanied by naive assumptions. In this work we propose generalized inverse classification (GIC), which avoids restricting the classification model that can be used. We incorporate this formulation into a refined framework in which GIC takes place. Under this framework, GIC operates on features that are immediately actionable. Each change incurs an individual cost, either linear or non-linear. Such changes are constrained to occur within a specified level of cumulative change (budget). Furthermore, our framework incorporates the estimation of features that change as a consequence of direct actions taken (indirectly changeable features). To solve such a problem, we propose three real-valued heuristic-based methods and two sensitivity analysis-based comparison methods, each of which is evaluated on two freely available real-world datasets. Our results demonstrate the validity and benefits of our formulation, framework, and methods. |

Inverse Distance Weighting( IDW) |
Inverse Distance Weighting (IDW) is a deterministic method for multivariate interpolation from a known scattered set of points. The values assigned to unknown points are calculated as a weighted average of the values available at the known points. The name of this type of method comes from the weighted average applied, since it uses the inverse of the distance to each known point (‘amount of proximity’) when assigning weights. geosptdb |
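The weighted average described above can be sketched in a few lines. This is a minimal two-dimensional illustration, assuming the common weight 1 / distance^power with power = 2; the function name `idw_interpolate` and the sample points are hypothetical and are not taken from the geosptdb package.

```python
import math

def idw_interpolate(known_points, query, power=2.0):
    """Estimate the value at `query` from (x, y, value) samples."""
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in known_points:
        distance = math.hypot(query[0] - x, query[1] - y)
        if distance == 0.0:
            return value  # the query coincides with a known point
        weight = 1.0 / distance ** power  # closer points get larger weights
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical samples: two known points on the x-axis.
samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
midpoint_estimate = idw_interpolate(samples, (1.0, 0.0))
near_first_estimate = idw_interpolate(samples, (0.5, 0.0))
```

Raising `power` makes the estimate follow the nearest samples more closely, while a small `power` pulls it towards the plain average of all samples.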

Inverse Reinforcement Learning( IRL) |
Inverse Reinforcement Learning (IRL) in Markov decision processes is the problem of extracting a reward function given observed, optimal behavior. |

Inverse Reward Design( IRD) |
Autonomous agents optimize the reward function we give them. What they don’t know is how hard it is for us to design a reward function that actually captures what we want. When designing the reward, we might think of some specific training scenarios, and make sure that the reward will lead to the right behavior in those scenarios. Inevitably, agents encounter new scenarios (e.g., new types of terrain) where optimizing that same reward may lead to undesired behavior. Our insight is that reward functions are merely observations about what the designer actually wants, and that they should be interpreted in the context in which they were designed. We introduce inverse reward design (IRD) as the problem of inferring the true objective based on the designed reward and the training MDP. We introduce approximate methods for solving IRD problems, and use their solution to plan risk-averse behavior in test MDPs. Empirical results suggest that this approach can help alleviate negative side effects of misspecified reward functions and mitigate reward hacking. |

Inverse Visual Question Answering( iVQA) |
In recent years, visual question answering (VQA) has become topical as a long-term goal to drive computer vision and multi-disciplinary AI research. The premise of VQA’s significance is that both the image and textual question need to be well understood and mutually grounded in order to infer the correct answer. However, current VQA models perhaps ‘understand’ less than initially hoped, and instead master the easier task of exploiting cues given away in the question and biases in the answer distribution. In this paper we propose the inverse problem of VQA (iVQA), and explore its suitability as a benchmark for visuo-linguistic understanding. The iVQA task is to generate a question that corresponds to a given image and answer pair. Since the answers are less informative than the questions, and the questions have less learnable bias, an iVQA model needs to better understand the image to be successful. We pose question generation as a multi-modal dynamic inference process and propose an iVQA model that can gradually adjust its focus of attention guided by both a partially generated question and the answer. For evaluation, apart from existing linguistic metrics, we propose a new ranking metric. This metric compares the ground truth question’s rank among a list of distractors, which allows the drawbacks of different algorithms and sources of error to be studied. Experimental results show that our model can generate diverse, grammatically correct and content correlated questions that match the given answer. |

Iris |
Today’s conversational agents are restricted to simple standalone commands. In this paper, we present Iris, an agent that draws on human conversational strategies to combine commands, allowing it to perform more complex tasks that it has not been explicitly designed to support: for example, composing one command to ‘plot a histogram’ with another to first ‘log-transform the data’. To enable this complexity, we introduce a domain specific language that transforms commands into automata that Iris can compose, sequence, and execute dynamically by interacting with a user through natural language, as well as a conversational type system that manages what kinds of commands can be combined. We have designed Iris to help users with data science tasks, a domain that requires support for command combination. In evaluation, we find that data scientists complete a predictive modeling task significantly faster (2.6 times speedup) with Iris than a modern non-conversational programming environment. Iris supports the same kinds of commands as today’s agents, but empowers users to weave together these commands to accomplish complex goals. |

Irregular Convolutional Neural Network( ICNN) |
Convolutional kernels are basic and vital components of deep Convolutional Neural Networks (CNN). In this paper, we equip convolutional kernels with shape attributes to generate the deep Irregular Convolutional Neural Networks (ICNN). Compared to traditional CNN applying regular convolutional kernels like ${3\times3}$, our approach trains irregular kernel shapes to better fit the geometric variations of input features. In other words, shapes are learnable parameters in addition to weights. The kernel shapes and weights are learned simultaneously during end-to-end training with the standard back-propagation algorithm. Experiments for semantic segmentation are implemented to validate the effectiveness of our proposed ICNN. |

Irrelevant Variability |
We say that data variability is correlated with a specific task “if the removal of this variability from the data deteriorates (on average) the results of clustering or retrieval”. Variability is irrelevant if it is “maintained in the data” but “not correlated with the specific task” |

Isometry Blind Dynamic Time Warping( IBDTW) |
In this work, we explore the problem of aligning two time-ordered point clouds which are spatially transformed and re-parameterized versions of each other. This has a diverse array of applications such as cross modal time series synchronization (e.g. MOCAP to video) and alignment of discretized curves in images. Most other works that address this problem attempt to jointly uncover a spatial alignment and correspondences between the two point clouds, or to derive local invariants to spatial transformations such as curvature before computing correspondences. By contrast, we sidestep spatial alignment completely by using self-similarity matrices (SSMs) as a proxy to the time-ordered point clouds, since self-similarity matrices are blind to isometries and respect global geometry. Our algorithm, dubbed ‘Isometry Blind Dynamic Time Warping’ (IBDTW), is simple and general, and we show that its associated dissimilarity measure lower bounds the L1 Gromov-Hausdorff distance between the two point sets when restricted to warping paths. We also present a local, partial alignment extension of IBDTW based on the Smith Waterman algorithm. This eliminates the need for tedious manual cropping of time series, which is ordinarily necessary for global alignment algorithms to function properly. |

Isotonic Proportional Hazards Model |
isoph |

Isotonic Regression( IR) |
General isotonic regression approximates a given series of values with values that satisfy a given partial ordering. The idea is to fit a piecewise-constant non-decreasing function to the data. http://…/Isotonic_regression |
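For the simplest case, a total order with equal weights, the least-squares non-decreasing fit can be computed with the pool adjacent violators algorithm. The sketch below is one such implementation under those assumptions; the function name `isotonic_fit` and the input series are illustrative.

```python
def isotonic_fit(values):
    """Least-squares non-decreasing fit via pool adjacent violators."""
    blocks = []  # each block stores [sum of values, number of values]
    for value in values:
        blocks.append([float(value), 1])
        # Merge backwards while a block's mean falls below the mean before it.
        while (
            len(blocks) > 1
            and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)  # every member gets the block mean
    return fitted

fitted = isotonic_fit([1, 3, 2, 4, 3, 5])
# fitted == [1.0, 2.5, 2.5, 3.5, 3.5, 5.0]
```

Each merged block becomes one constant step of the fitted function, which is why the result is piecewise constant.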

ISOTYPE |
Isotype (International System of TYpographic Picture Education) is a method of showing social, technological, biological and historical connections in pictorial form. It was first known as the Vienna Method of Pictorial Statistics (Wiener Methode der Bildstatistik), due to its having been developed at the Gesellschafts- und Wirtschaftsmuseum in Wien (Social and economic museum of Vienna) between 1925 and 1934. The founding director of this museum, Otto Neurath, was the initiator and chief theorist of the Vienna Method. The term Isotype was applied to the method around 1935, after its key practitioners were forced to leave Vienna by the rise of Austrian fascism. http://…/Haroz_CHI_2015.pdf |

IT Operations Analytics( ITOA) |
In the fields of information technology and systems management, IT Operations Analytics (ITOA) is an approach or method applied to application software designed to retrieve, analyze and report data for IT operations. ITOA has been described as applying big data analytics to the IT realm. In its Hype Cycle Report, Gartner rated the business impact of ITOA as being ‘high’, meaning that its use will see businesses enjoy significantly increased revenue or cost saving opportunities. IT Operations Analytics (ITOA) (also known as Advanced Operational Analytics, or IT Data Analytics) technologies are primarily used to discover complex patterns in high volumes of often ‘noisy’ IT system availability and performance data. Forrester Research defines IT analytics as ‘The use of mathematical algorithms and other innovations to extract meaningful information from the sea of raw data collected by management and monitoring technologies.’ Taking a Horizontal Approach to Big Data for Better IT and Business Outcomes |

Item Explorer |
Item explorer is an approach to provide insights into a ubiquitous class of business questions like: • what kind of products do customers typically buy together? • what kind of web pages (on a web site) do users visit? • what combination of symptoms do patients have? • … For this class of business questions, the exponential number of combinations poses a severe practical challenge. Due to the explorative nature, visualization is well-suited for such business questions. More specifically, a visualization can provide a unique representation for both revealing insights and for intuitive user interaction based on business knowledge or own hypotheses. |

Item Factor Analysis |
➘ “Item Response Theory” ifaTools |

Item Response Theory( IRT) |
Item response theory (IRT) models are a class of statistical models used to describe the response behaviors of individuals to a set of items having a certain number of options. They are adopted by researchers in social science, particularly in the analysis of performance or attitudinal data, in psychology, education, medicine, marketing and other fields where the aim is to measure latent constructs. Most IRT analyses use parametric models that rely on assumptions that often are not satisfied. In such cases, a nonparametric approach might be preferable; nevertheless, few software implementations support it. MLCIRTwithin |

Iterative Classification Algorithm( ICA) |
see also ➘ “Recurrent Collective Classification” |

Iterative Dichotomiser 3( ID3) |
In decision tree learning, ID3 (Iterative Dichotomiser 3) is an algorithm invented by Ross Quinlan used to generate a decision tree from a dataset. ID3 is the precursor to the C4.5 algorithm, and is typically used in the machine learning and natural language processing domains. |
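ID3 is usually described as growing the tree greedily, splitting at each node on the attribute with the highest information gain. The sketch below shows only that selection step, not a full tree builder; the helper names and the toy dataset are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )

def information_gain(rows, attribute, target):
    """Entropy reduction obtained by splitting `rows` on `attribute`."""
    base = entropy([row[target] for row in rows])
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

# Toy data: "outlook" separates the classes perfectly, "windy" not at all.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rain", "windy": "no", "play": "yes"},
    {"outlook": "rain", "windy": "yes", "play": "yes"},
]
gains = {name: information_gain(rows, name, "play") for name in ("outlook", "windy")}
best_attribute = max(gains, key=gains.get)
```

A full ID3 implementation would apply this selection recursively to each subset until the labels in a node are pure or no attributes remain.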

Iterative Method |
In computational mathematics, an iterative method is a mathematical procedure that generates a sequence of improving approximate solutions for a class of problems. A specific implementation of an iterative method, including the termination criteria, is an algorithm of the iterative method. An iterative method is called convergent if the corresponding sequence converges for given initial approximations. A mathematically rigorous convergence analysis of an iterative method is usually performed; however, heuristic-based iterative methods are also common. In the problems of finding the root of an equation (or a solution of a system of equations), an iterative method uses an initial guess to generate successive approximations to a solution. In contrast, direct methods attempt to solve the problem by a finite sequence of operations. In the absence of rounding errors, direct methods would deliver an exact solution (like solving a linear system of equations Ax=b by Gaussian elimination). Iterative methods are often the only choice for nonlinear equations. However, iterative methods are often useful even for linear problems involving a large number of variables (sometimes of the order of millions), where direct methods would be prohibitively expensive (and in some cases impossible) even with the best available computing power. |
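As a concrete instance of root finding from an initial guess, the sketch below uses Newton's method, one well-known iterative method, with an explicit termination criterion. The function name `newton`, the tolerance and the example equation x^2 - 2 = 0 are illustrative choices.

```python
def newton(f, df, x0, tol=1e-12, max_iter=50):
    """Find a root of f by Newton's method, starting from the guess x0."""
    x = x0
    for iteration in range(1, max_iter + 1):
        step = f(x) / df(x)
        x -= step  # next approximation in the sequence
        if abs(step) < tol:  # termination criterion: the update is negligible
            return x, iteration
    raise RuntimeError("Newton's method did not converge")

# Solve x^2 - 2 = 0, i.e. approximate the square root of 2.
root, iterations = newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, x0=1.0)
```

The pair of an update rule and a stopping rule is what turns the method into an algorithm in the sense described above.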

Iterative Proportional Fitting Procedure( IPFP) |
The iterative proportional fitting procedure (IPFP, also known as biproportional fitting in statistics, RAS algorithm in economics and matrix raking or matrix scaling in computer science) is an iterative algorithm for estimating cell values of a contingency table such that the marginal totals remain fixed and the estimated table decomposes into an outer product. mipfp |
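The procedure can be sketched for a two-way table as alternating row and column rescaling. This is a minimal illustration, assuming a strictly positive seed table and target margins with equal totals; the function name `ipf` and the numbers are hypothetical and are not taken from the mipfp package.

```python
def ipf(seed, row_targets, col_targets, tol=1e-10, max_iter=1000):
    """Rescale `seed` until its row and column sums match the targets."""
    table = [row[:] for row in seed]  # work on a copy of the seed table
    for _ in range(max_iter):
        # Step 1: scale each row to its target total.
        for i, target in enumerate(row_targets):
            row_sum = sum(table[i])
            table[i] = [cell * target / row_sum for cell in table[i]]
        # Step 2: scale each column to its target total.
        for j, target in enumerate(col_targets):
            col_sum = sum(row[j] for row in table)
            for row in table:
                row[j] *= target / col_sum
        # Columns now match exactly, so convergence is judged on the rows.
        row_error = max(abs(sum(row) - t) for row, t in zip(table, row_targets))
        if row_error < tol:
            return table
    raise RuntimeError("IPF did not converge")

seed = [[1.0, 2.0], [3.0, 4.0]]
fitted = ipf(seed, row_targets=[40.0, 60.0], col_targets=[35.0, 65.0])
```

Because every step only multiplies whole rows or whole columns by a constant, the fitted table keeps the cross-product ratios (the association structure) of the seed while taking on the new margins.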

Iterative Self-Organizing Data Analysis Technique( ISODATA) |
This is a more sophisticated algorithm which allows the number of clusters to be automatically adjusted during the iteration by merging similar clusters and splitting clusters with large standard deviations. |

Iterative Sequential Regression( ISR) |
Imputation of missing values is one of the major tasks for data pre-processing in many areas. Whenever imputation of data from official statistics comes into mind, several (additional) challenges almost always arise, like large data sets, data sets consisting of a mixture of different variable types, or data outliers. The aim is to propose an automatic algorithm called IRMI for iterative model-based imputation using robust methods, accounting for the mentioned challenges, and to provide a software tool in R. This algorithm is compared to the algorithm IVEWARE, which is the ‘recommended software’ for imputations in international and national statistical institutions. Using artificial data and real data sets from official statistics and other fields, the advantages of IRMI over IVEWARE, especially with respect to robustness, are demonstrated. ISR3 |

Iterative Supervised Principal Components( ISPC) |
In high-dimensional prediction problems, where the number of features may greatly exceed the number of training instances, a fully Bayesian approach with a sparsifying prior is known to produce good results but is computationally challenging. To alleviate this computational burden, we propose to use a preprocessing step where we first apply a dimension reduction to the original data to reduce the number of features to something that is computationally conveniently handled by Bayesian methods. To do this, we propose a new dimension reduction technique, called iterative supervised principal components (ISPC), which combines variable screening and dimension reduction and can be considered as an extension to the existing technique of supervised principal components (SPCs). Our empirical evaluations confirm that, although not foolproof, the proposed approach provides very good results on several microarray benchmark datasets with very affordable computation time, and can also be very useful for visualizing high-dimensional data. |

Iterative Weighted Least Squares( IWLS) |
The Iterative Weighted Least Squares (IWLS) method is one of the estimation procedures in logistic regression modeling. |
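A minimal sketch of the idea for logistic regression with one predictor and an intercept: each iteration forms a working response and weights from the current fit, then solves a weighted least squares problem. The function name `fit_logistic_iwls` and the data are hypothetical, and the sketch assumes the classes are not perfectly separable.

```python
import math

def fit_logistic_iwls(xs, ys, iterations=25):
    """Fit P(y=1) = 1 / (1 + exp(-(intercept + slope * x))) by IWLS."""
    intercept, slope = 0.0, 0.0
    for _ in range(iterations):
        etas = [intercept + slope * x for x in xs]
        probs = [1.0 / (1.0 + math.exp(-eta)) for eta in etas]
        weights = [p * (1.0 - p) for p in probs]
        # Working response: linear predictor plus a rescaled residual.
        working = [
            eta + (y - p) / w for eta, y, p, w in zip(etas, ys, probs, weights)
        ]
        # Weighted least squares of the working response on x (closed form).
        total = sum(weights)
        mean_x = sum(w * x for w, x in zip(weights, xs)) / total
        mean_z = sum(w * z for w, z in zip(weights, working)) / total
        sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
        sxz = sum(
            w * (x - mean_x) * (z - mean_z)
            for w, x, z in zip(weights, xs, working)
        )
        slope = sxz / sxx
        intercept = mean_z - slope * mean_x
    return intercept, slope

# Hypothetical, non-separable data.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0, 0, 1, 0, 1, 1]
intercept, slope = fit_logistic_iwls(xs, ys)
```

At convergence the residuals `y - p` sum to zero, both unweighted and weighted by `x`, which is the maximum likelihood condition for this model.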

Iteratively Reweighted Least Squares( IRLS) |
IRLS is used to find the maximum likelihood estimates of a generalized linear model, and in robust regression to find an M-estimator, as a way of mitigating the influence of outliers in an otherwise normally-distributed data set, for example by minimizing the least absolute error rather than the least square error. Although not a linear regression problem, Weiszfeld’s algorithm for approximating the geometric median can also be viewed as a special case of iteratively reweighted least squares, in which the objective function is the sum of distances of the estimator from the samples. One of the advantages of IRLS over linear programming and convex programming is that it can be used with Gauss-Newton and Levenberg-Marquardt numerical algorithms. |
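The robust regression use case can be sketched for a straight-line fit under the least absolute error criterion. Each pass solves a weighted least squares problem with weights 1 / |residual|, so points with large residuals count for less. The function names, the `eps` floor on the residuals and the data are illustrative assumptions.

```python
def weighted_line_fit(xs, ys, weights):
    """Closed-form weighted least squares fit of y = slope * x + intercept."""
    total = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def irls_lad_line(xs, ys, iterations=100, eps=1e-8):
    """Approximate the least-absolute-error line by reweighted least squares."""
    weights = [1.0] * len(xs)  # the first pass is ordinary least squares
    for _ in range(iterations):
        slope, intercept = weighted_line_fit(xs, ys, weights)
        # Reweight by 1 / |residual|; eps keeps the weights finite.
        weights = [
            1.0 / max(abs(y - (slope * x + intercept)), eps)
            for x, y in zip(xs, ys)
        ]
    return slope, intercept

# Points on y = 2x + 1, except for one gross outlier at x = 5.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0, 30.0]
ols_slope, ols_intercept = weighted_line_fit(xs, ys, [1.0] * len(xs))
lad_slope, lad_intercept = irls_lad_line(xs, ys)
```

On this data the ordinary least squares line is pulled strongly towards the outlier, while the reweighted fit settles close to the line through the five consistent points.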
