In this short exercise, I calculate three impurity measures (entropy, Gini index, and classification error) for a distribution with two possible values.
Entropy is defined as:
\[H(X) = -\sum_{i=1}^{c} p_i \log_2(p_i)\]
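The code below also computes the Gini index and the classification error. No formulas are given for them here, so the following are read off the loop body (the gini[i] and error[i] assignments) and written with the same \(p_i\) and \(c\) as the entropy; the symbols \(G\) and \(E\) are labels assumed for this note:
\[G(X) = 1 - \sum_{i=1}^{c} p_i^2\]
\[E(X) = 1 - \max_{i} p_i\]
For two values with probabilities \(p\) and \(1-p\), these reduce to \(2p(1-p)\) and \(1 - \max(p, 1-p)\).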
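# Grid of probabilities for outcome A; outcome B takes the remainder.
tmp.seq <- seq(0, 1, by = 0.01)
probA <- tmp.seq
probB <- 1 - probA
# One-column matrices holding each impurity measure per grid point.
entrop <- matrix(0, nrow = length(probA), ncol = 1)
gini <- matrix(0, nrow = length(probA), ncol = 1)
error <- matrix(0, nrow = length(probA), ncol = 1)
for (i in seq_along(probA))
{
  entrop[i] <- -1 * (probA[i] * log2(probA[i]) + probB[i] * log2(probB[i]))
  gini[i] <- 1 - (probA[i]^2 + probB[i]^2)
  error[i] <- 1 - max(probA[i], probB[i])
}
# At the endpoints 0 * log2(0) evaluates to NaN in R; the limit is 0, so set it by hand.
entrop[1] <- 0
entrop[length(entrop)] <- 0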
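The same three curves can also be computed without a loop. The following is a minimal vectorized sketch, not the code used above: the names p, q, plogp, entropy_v, gini_v, and error_v are hypothetical and chosen only for this example. It assumes the convention that 0 * log2(0) counts as 0, which makes the manual endpoint fix unnecessary.

```r
# Vectorized sketch; all names here are illustrative, not from the code above.
p <- seq(0, 1, by = 0.01)
q <- 1 - p

# x * log2(x), with the value at x = 0 defined as 0 (its limit).
# ifelse() evaluates both branches, but the NaN at x = 0 is never selected.
plogp <- function(x) ifelse(x > 0, x * log2(x), 0)

entropy_v <- -(plogp(p) + plogp(q))
gini_v    <- 1 - (p^2 + q^2)
error_v   <- 1 - pmax(p, q)   # pmax() is the element-wise maximum
```

All three measures are zero at the endpoints and peak at p = 0.5, where the entropy is 1 and the Gini index and classification error are both 0.5.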