Denoising Random Forest google
This paper proposes a novel type of random forests called a denoising random forests that are robust against noises contained in test samples. Such noise-corrupted samples cause serious damage to the estimation performances of random forests, since unexpected child nodes are often selected and the leaf nodes that the input sample reaches are sometimes far from those for a clean sample. Our main idea for tackling this problem originates from a binary indicator vector that encodes a traversal path of a sample in the forest. Our proposed method effectively employs this vector by introducing denoising autoencoders into random forests. A denoising autoencoder can be trained with indicator vectors produced from clean and noisy input samples, and non-leaf nodes where incorrect decisions are made can be identified by comparing the input and output of the trained denoising autoencoder. Multiple traversal paths with respect to the nodes with incorrect decisions caused by the noises can then be considered for the estimation. …

Median Probability Model (MPM) google
Often the goal of model selection is to choose a model for future prediction, and it is natural to measure the accuracy of a future prediction by squared error loss. Under the Bayesian approach, it is commonly perceived that the optimal predictive model is the model with highest posterior probability, but this is not necessarily the case. In this paper we show that, for selection among normal linear models, the optimal predictive model is often the median probability model, which is defined as the model consisting of those variables which have overall posterior probability greater than or equal to 1/2 of being in a model.
The median probability model often differs from the highest probability model. The median probability model (MPM) Barbieri and Berger (2004) is defined as the model consisting of those variables whose marginal posterior probability of inclusion is at least 0.5. The MPM rule yields the best single model for prediction in orthogonal and nested correlated designs. This result was originally conceived under a specific class of priors, such as the point mass mixtures of non-informative and g-type priors. The MPM rule, however, has become so very popular that it is now being deployed for a wider variety of priors and under correlated designs, where the properties of MPM are not yet completely understood.
The Median Probability Model and Correlated Variables

Residual Memory Neural Network (RMN) google
Training deep recurrent neural network (RNN) architectures is complicated due to the increased network complexity. This disrupts the learning of higher order abstracts using deep RNN. In case of feed-forward networks training deep structures is simple and faster while learning long-term temporal information is not possible. In this paper we propose a residual memory neural network (RMN) architecture to model short-time dependencies using deep feed-forward layers having residual and time delayed connections. The residual connection paves way to construct deeper networks by enabling unhindered flow of gradients and the time delay units capture temporal information with shared weights. The number of layers in RMN signifies both the hierarchical processing depth and temporal depth. The computational complexity in training RMN is significantly less when compared to deep recurrent networks. RMN is further extended as bi-directional RMN (BRMN) to capture both past and future information. Experimental analysis is done on AMI corpus to substantiate the capability of RMN in learning long-term information and hierarchical information. Recognition performance of RMN trained with 300 hours of Switchboard corpus is compared with various state-of-the-art LVCSR systems. The results indicate that RMN and BRMN gains 6 % and 3.8 % relative improvement over LSTM and BLSTM networks. …