Multiclass logistic regression
Instead of y = 0, 1 we will expand our definition so that y = 0, 1, ..., n. In essence, we re-run binary classification multiple times, once for each class.
Procedure

1. Divide the problem into n+1 binary classification problems (n+1 because the class indices start at 0).
2. For each class, predict the probability that the observations are in that single class.
3. prediction = max(probability of the classes)

For each sub-problem, we select one class (YES) and lump all the others into a second class (NO). Then we take the class with the highest predicted value, as in the sketch below.
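Here is a minimal one-vs-rest sketch of the procedure above. The sigmoid-based predict function and the per-class weight vectors are illustrative assumptions; they stand in for whichever binary classifier is trained on each sub-problem:

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def predict(features, weights):
    # Probability that each observation belongs to the "YES" class
    # of one binary sub-problem.
    return sigmoid(np.dot(features, weights))

def one_vs_rest_predict(features, weights_per_class):
    # weights_per_class: one weight vector per class, i.e. the result of
    # training n+1 binary classifiers (hypothetical here).
    # Rows = classes, columns = observations.
    probabilities = np.array([predict(features, w) for w in weights_per_class])
    # For each observation, take the class with the highest probability.
    return np.argmax(probabilities, axis=0)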
Softmax activation
The softmax function (also called softargmax or the normalized exponential function) takes as input a vector of K real numbers and normalizes it into a probability distribution of K probabilities proportional to the exponentials of the input numbers. That is, prior to applying softmax, some vector components could be negative or greater than one, and might not sum to 1; but after applying softmax, each component will be in the interval [0, 1] and the components will add up to 1, so they can be interpreted as probabilities. The standard (unit) softmax function is defined by the formula
\[ \sigma(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}} \quad \text{for } i = 1, \dots, K \text{ and } z = (z_1, \dots, z_K) \]
In words: we apply the standard exponential function to each element z_i of the input vector z, then normalize these values by dividing by the sum of all the exponentials; this normalization ensures that the components of the output vector σ(z) sum to 1. A direct implementation is sketched below.
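Here is a minimal NumPy sketch of the formula above. Subtracting the max is a standard numerical-stability trick, not part of the definition; it leaves the output unchanged because softmax is invariant to adding a constant to every component:

import numpy as np

def softmax(z):
    # Shift by the max for numerical stability; the result is unchanged
    # because softmax is invariant to adding a constant to every component.
    exps = np.exp(z - np.max(z))
    return exps / np.sum(exps)

softmax(np.array([1.0, 2.0, 3.0]))
# => array([0.09003057, 0.24472847, 0.66524096])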
Scikit-Learn example
Let’s compare our performance to the LogisticRegression model provided by
scikit-learn.
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Normalize features to values between -1 and 1 for more efficient computation
normalized_range = MinMaxScaler(feature_range=(-1, 1))

# Extract Features + Labels
labels.shape = (100,)  # scikit-learn expects a 1-d label array
features = normalized_range.fit_transform(features)

# Create Test/Train
features_train, features_test, labels_train, labels_test = train_test_split(features, labels, test_size=0.4)

# Scikit Logistic Regression
scikit_log_reg = LogisticRegression()
scikit_log_reg.fit(features_train, labels_train)

# Score is Mean Accuracy
scikit_score = scikit_log_reg.score(features_test, labels_test)
print('Scikit score:', scikit_score)

# Our Mean Accuracy
observations, features, labels, weights = run()
probabilities = predict(features, weights).flatten()
classifications = classifier(probabilities)
our_acc = accuracy(classifications, labels.flatten())
print('Our score:', our_acc)
Scikit score: 0.88
Our score: 0.89