Posts Tagged ‘random-forest’

Machine Learning: Definition of %Var(y) in R’s randomForest package’s regression method

The second column is simply the first column divided by the variance of the response that have been OOB up to that point (20 trees), times 100. 

Stats: Gini Importance in Random Forest Models

Every time a split of a node is made on variable m the gini impurity criterion for the two descendent nodes is less than the parent node. Adding up the gini decreases for each individual variable over all trees in the forest gives a fast variable importance that is often very consistent with the permutation importance measure.