Multiclass logistic regression

 

Instead of y ∈ {0, 1}, we expand the definition so that y ∈ {0, 1, ..., n}. In essence, we rerun binary classification multiple times, once for each class.

Procedure

1.    Divide the problem into n+1 binary classification problems (n+1 because the class labels run from 0 to n).

2.     For each class…

3.     Predict the probability that the observations belong to that single class.

4.     prediction = the class with the maximum predicted probability

For each sub-problem, we select one class (YES) and lump all the others into a second class (NO). Then we take the class with the highest predicted value.
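The procedure above can be sketched in plain Python as follows. This is a minimal illustration, not the implementation used elsewhere in this post: the helper names (`train_binary`, `train_one_vs_rest`, `predict_one_vs_rest`), the toy data, the learning rate, and the iteration count are all assumptions chosen for the example.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real score to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train_binary(features, targets, learning_rate=0.1, iterations=5000):
    """Fit one binary logistic regression with batch gradient descent."""
    n_samples = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(iterations):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, targets):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            error = p - y  # derivative of the log loss w.r.t. the score
            for j in range(n_features):
                grad_w[j] += error * x[j]
            grad_b += error
        # Step along the negative mean gradient
        weights = [w - learning_rate * g / n_samples
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias


def train_one_vs_rest(features, labels):
    """Train one YES/NO classifier per class (steps 1 to 3)."""
    models = {}
    for cls in sorted(set(labels)):
        # Selected class becomes YES (1); all other classes become NO (0)
        targets = [1 if y == cls else 0 for y in labels]
        models[cls] = train_binary(features, targets)
    return models


def predict_one_vs_rest(models, x):
    """Return the class whose classifier gives the highest probability (step 4)."""
    probabilities = {
        cls: sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
        for cls, (weights, bias) in models.items()
    }
    return max(probabilities, key=probabilities.get)


# Hypothetical toy data: three well-separated clusters in two dimensions
features = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
            [5.0, 0.0], [6.0, 0.0], [5.0, 1.0],
            [0.0, 5.0], [1.0, 5.0], [0.0, 6.0]]
labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]

models = train_one_vs_rest(features, labels)
predictions = [predict_one_vs_rest(models, x) for x in features]
print(predictions)
```

Note that the per-class probabilities come from independent classifiers, so they do not generally sum to 1; only their ranking is used for the final prediction.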

Softmax activation

The softmax function (also called softargmax or the normalized exponential function) takes as input a vector of K real numbers and normalizes it into a probability distribution consisting of K probabilities proportional to the exponentials of the input numbers. Before softmax is applied, some vector components may be negative or greater than one, and they need not sum to 1. After softmax is applied, each component lies in the interval (0, 1) and the components add up to 1, so they can be interpreted as probabilities. The standard (unit) softmax function is defined by the formula

$$\sigma(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}} \quad \text{for } i = 1, \dots, K \text{ and } z = (z_1, \dots, z_K)$$

In words: we apply the standard exponential function to each element z_i of the input vector z and normalize these values by dividing by the sum of all these exponentials. This normalization ensures that the components of the output vector σ(z) sum to 1.
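As a rough illustration of the definition, softmax can be written in a few lines of Python. The function name `softmax` and the input values are assumptions for this sketch. Subtracting the maximum before exponentiating is an added numerical-stability step that is not part of the formula; it leaves the result unchanged because the common factor cancels in the ratio.

```python
import math


def softmax(z):
    """Map a list of K real numbers to K probabilities that sum to 1."""
    # Assumption: shift by the maximum to avoid overflow in exp();
    # mathematically this does not change the output.
    shift = max(z)
    exponentials = [math.exp(value - shift) for value in z]
    total = sum(exponentials)
    return [e / total for e in exponentials]


scores = [2.0, 1.0, -1.0]  # hypothetical input vector; may contain negatives
probabilities = softmax(scores)
print(probabilities)
print(sum(probabilities))
```

Larger inputs receive larger probabilities, and the ordering of the inputs is preserved in the output.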

Scikit-Learn example

Let’s compare our performance to the LogisticRegression model provided by scikit-learn.

from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
 
# Normalize grades to values between -1 and 1 for more efficient computation
normalized_range = MinMaxScaler(feature_range=(-1, 1))
 
# Extract Features + Labels
labels = labels.reshape(-1)  # scikit-learn expects a 1-D label array
features = normalized_range.fit_transform(features)
 
# Create Test/Train
features_train,features_test,labels_train,labels_test = train_test_split(features,labels,test_size=0.4)
 
# Scikit Logistic Regression
scikit_log_reg = LogisticRegression()
scikit_log_reg.fit(features_train,labels_train)
 
# Score is mean accuracy
scikit_score = scikit_log_reg.score(features_test, labels_test)
print('Scikit score: ', scikit_score)
 
#Our Mean Accuracy
observations, features, labels, weights = run()
probabilities = predict(features, weights).flatten()
classifications = classifier(probabilities)
our_acc = accuracy(classifications,labels.flatten())
print('Our score: ', our_acc)

Scikit score: 0.88. Our score: 0.89

 
