Takeuchi’s Information Criteria (TIC)
Takeuchi’s Information Criteria (TIC) is a linearization of maximum likelihood estimator bias which shrinks the model parameters towards the maximum entropy distribution, even when the model is mis-specified. In statistical machine learning, $L_2$ regularization (a.k.a. ridge regression) also introduces a parameterized bias term with the goal of minimizing out-of-sample entropy, but generally requires a numerical solver to find the regularization parameter. …
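For orientation, the classical model-selection form of TIC replaces the parameter count in AIC with a trace term, tr(H^{-1} K), where K is the average outer product of the per-observation scores and H is the negative average Hessian of the log-likelihood. The sketch below computes that classical quantity for a univariate Gaussian fitted by maximum likelihood. It is not the linearization or shrinkage estimator described in the entry above. The function name `gaussian_tic` and the choice of a Gaussian model are assumptions made only for illustration.

```python
import math

def gaussian_tic(xs):
    """Classical TIC for a Gaussian with parameters (mu, s2) fitted by MLE.

    Returns (tic, penalty), where penalty = tr(H^{-1} K).
    Illustrative sketch only; assumes the sample variance is non-zero.
    """
    n = len(xs)
    mu = sum(xs) / n
    s2 = sum((x - mu) ** 2 for x in xs) / n  # MLE of the variance

    # Log-likelihood evaluated at the MLE.
    loglik = -0.5 * n * (math.log(2.0 * math.pi * s2) + 1.0)

    # Per-observation score vectors (d/dmu, d/ds2) at the MLE.
    scores = [
        ((x - mu) / s2, -0.5 / s2 + (x - mu) ** 2 / (2.0 * s2 ** 2))
        for x in xs
    ]

    # K: average outer product of the scores (symmetric 2x2).
    k11 = sum(a * a for a, _ in scores) / n
    k12 = sum(a * b for a, b in scores) / n
    k22 = sum(b * b for _, b in scores) / n

    # H: negative average Hessian of the per-observation log-likelihood.
    h11 = 1.0 / s2
    h12 = sum(x - mu for x in xs) / (n * s2 ** 2)  # zero at the MLE up to rounding
    h22 = -0.5 / s2 ** 2 + sum((x - mu) ** 2 for x in xs) / (n * s2 ** 3)

    # tr(H^{-1} K) for symmetric 2x2 matrices, written out by hand.
    det = h11 * h22 - h12 * h12
    penalty = (h22 * k11 - 2.0 * h12 * k12 + h11 * k22) / det

    tic = -2.0 * loglik + 2.0 * penalty
    return tic, penalty

tic, penalty = gaussian_tic([1.0, 2.0, 4.0, 7.0, 11.0])
```

When the model is correctly specified, H and K agree in expectation, the penalty approaches the number of parameters (2 here), and TIC reduces to AIC. Under mis-specification the penalty differs. For this Gaussian case it works out to 1 + (kurtosis - 1) / 2, using the sample kurtosis.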

Latent Sequence Decompositions (LSD)
We present the Latent Sequence Decompositions (LSD) framework. LSD decomposes sequences into variable-length output units as a function of both the input sequence and the output sequence. We present a training algorithm which samples valid extensions and an approximate decoding algorithm. We experiment with the Wall Street Journal speech recognition task. Our LSD model achieves 12.9% WER compared to a character baseline of 14.8% WER. When combined with a convolutional network on the encoder, we achieve 9.2% WER. …
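To make the idea of variable-length output units concrete, the following sketch enumerates every way a target string can be split into tokens from a small vocabulary, and samples one such split by repeatedly choosing a valid extension. The vocabulary, the function names, and the uniform sampling are assumptions for illustration only. In the LSD model itself, the distribution over extensions comes from a learned network conditioned on the input, which is not shown here.

```python
import random

def valid_extensions(target, pos, vocab):
    """Non-empty tokens from vocab that match target starting at position pos."""
    return [tok for tok in sorted(vocab) if tok and target.startswith(tok, pos)]

def decompositions(target, vocab, pos=0):
    """Enumerate every split of target[pos:] into vocabulary tokens."""
    if pos == len(target):
        return [[]]
    results = []
    for tok in valid_extensions(target, pos, vocab):
        for rest in decompositions(target, vocab, pos + len(tok)):
            results.append([tok] + rest)
    return results

def sample_decomposition(target, vocab, rng):
    """Sample one split by picking a valid extension uniformly at each step.

    Assumes vocab contains every single character of target, so a valid
    extension always exists. Uniform sampling is a stand-in for a model.
    """
    pos, tokens = 0, []
    while pos < len(target):
        tok = rng.choice(valid_extensions(target, pos, vocab))
        tokens.append(tok)
        pos += len(tok)
    return tokens

# Hypothetical vocabulary mixing characters and longer units.
vocab = {"c", "a", "t", "ca", "at", "cat"}
all_splits = decompositions("cat", vocab)
one_split = sample_decomposition("cat", vocab, random.Random(0))
```

With this vocabulary, "cat" has four decompositions: c-a-t, c-at, ca-t, and cat. Every sampled split concatenates back to the original target, which is what makes an extension "valid" in this sketch.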

Domain Adaptation (DA)
Domain Adaptation is a field associated with machine learning and transfer learning. The scenario arises when we aim to learn, from a source data distribution, a model that performs well on a different (but related) target data distribution. For instance, one task in the common spam-filtering problem consists of adapting a model from one user (the source distribution) to a new user who receives significantly different emails (the target distribution). Note that when more than one source distribution is available, the problem is referred to as multi-source domain adaptation.
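One common way to account for the gap between source and target is importance weighting under covariate shift: each source example is reweighted by the ratio of target to source probability of its features. The sketch below is a minimal illustration of that one technique, assuming a single discrete feature (a hypothetical email category) and unlabeled emails from the target user. The names and data are invented for the example, and this is only one of many domain adaptation approaches.

```python
from collections import Counter

def importance_weights(source_x, target_x):
    """Estimate w(x) = p_target(x) / p_source(x) for a discrete feature by counting."""
    src = Counter(source_x)
    tgt = Counter(target_x)
    n_src, n_tgt = len(source_x), len(target_x)
    # Only feature values seen in the source can be reweighted.
    return {x: (tgt[x] / n_tgt) / (src[x] / n_src) for x in src}

def weighted_error(source_x, source_errors, weights):
    """Importance-weighted average of per-example errors measured on source data."""
    total = sum(weights[x] * err for x, err in zip(source_x, source_errors))
    return total / len(source_x)

# Hypothetical data: the source user mostly receives promotional email,
# while the target user mostly receives work email.
source_x = ["promo", "promo", "promo", "work"]
target_x = ["promo", "work", "work", "work"]

# 1 means the spam filter misclassified that source email, 0 means it was correct.
source_errors = [0, 0, 0, 1]

weights = importance_weights(source_x, target_x)
plain = sum(source_errors) / len(source_errors)
adapted = weighted_error(source_x, source_errors, weights)
```

Here the unweighted source error is 0.25, but the filter's only mistake is on a work email, which is rare for the source user and common for the target user. The weighted estimate rises to 0.75, reflecting how the same filter would be expected to fare on the target user's mail under these assumptions.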
Domain Adaptation with Randomized Expectation Maximization