Logistic Regression with SGD in 2 dimensions

Here are some training data points with labels, and a weight vector w initialized to all-zeros.

In [1]:
import numpy as np

# Xy: array of labeled points, one row per example, with the label
# in the last column (assumed to have been loaded in an earlier cell)
X = Xy[:, :-1]
y = Xy[:, -1]
# split into train and test
ttsplit = int(y.size*.8)
trainX = X[:ttsplit, :]
trainy = y[:ttsplit]
testX = X[ttsplit:, :]
testy = y[ttsplit:]

import matplotlib.pyplot as plt
%matplotlib inline
# demofuncs provides plot_labeled_data, score, and logreg_train
from demofuncs import *

plot_labeled_data(trainX, trainy)


Run a perceptron learner (the basic one without bias, etc) on this training data, and measure the accuracy on the test set.

Accuracy is measured in two ways: raw, and mean squared, which is

$$\dfrac{1}{p}\sum_{i=1}^p \left(y^{(i)}-h_\theta(x^{(i)})\right)^2$$
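The `score` helper comes from `demofuncs` and is not shown here. Since the reported mean square values are close to 1 and improve as the fit improves, the sketch below assumes the score is one minus the mean squared error between the sigmoid output and the labels mapped to {0, 1}; the function and label convention are assumptions, not the actual `demofuncs` implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def score(X, y, w):
    """Hypothetical sketch of demofuncs.score.

    Assumes labels y are in {-1, +1}; maps them to {0, 1} so the
    squared difference against the sigmoid output stays in [0, 1].
    """
    w = np.asarray(w, dtype=float)
    h = sigmoid(X.dot(w))               # h_theta(x) in (0, 1)
    y01 = (y + 1) / 2.0                 # map {-1, +1} -> {0, 1}
    meansq_acc = 1.0 - np.mean((y01 - h) ** 2)
    raw_acc = np.mean(np.sign(X.dot(w)) == y)
    return meansq_acc, raw_acc
```

Under this convention, raw accuracy only checks the sign of the prediction, while the mean square score also rewards confident (large-margin) predictions.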
In [4]:
def perceptron_train(X, y, maxiter):
    num_points, num_dims = X.shape
    w = np.zeros(num_dims)
    for epoch in range(maxiter):
        num_errors = 0
        for i in range(num_points):
            # a point is misclassified when y * w.x <= 0
            if y[i] * w.dot(X[i]) <= 0:
                w += y[i] * X[i]
                num_errors += 1
        print('Epoch', epoch, ':', num_errors, 'errors', w)

        # plot the current hyperplane after each epoch
        plot_labeled_data(X, y, w)

        if num_errors == 0:
            break
    return w

w = perceptron_train(trainX, trainy, 20)
meansq_acc, acc = score(testX, testy, w)
print('test accuracy: mean square=', meansq_acc, 'raw=', acc)

Epoch 0 : 8 errors [ 1.41675788 -0.65539514]

Epoch 1 : 1 errors [ 1.3890965  -0.71399474]

Epoch 2 : 0 errors [ 1.3890965  -0.71399474]

test accuracy: mean square= 0.869028557654 raw= 1.0


Now learn the hyperplane with logistic regression (stochastic gradient descent, eta=0.8):

In [5]:
w = logreg_train(trainX, trainy, 20)
meansq_acc, acc = score(testX, testy, w)
print('test accuracy: mean square=', meansq_acc, 'raw=', acc)

Epoch 0 : 4 errors [ 3.20080905 -1.43296741]

Epoch 1 : 2 errors [ 4.52455114 -2.06947624]

Epoch 2 : 2 errors [ 5.44081693 -2.51873752]

Epoch 3 : 2 errors [ 6.16232983 -2.87569646]

Epoch 4 : 2 errors [ 6.7650757  -3.17627077]

Epoch 5 : 2 errors [ 7.2862502  -3.43811726]

Epoch 6 : 2 errors [ 7.74727984 -3.67137002]

Epoch 7 : 2 errors [ 8.16181289 -3.88246521]

Epoch 8 : 2 errors [ 8.53916045 -4.07578737]

Epoch 9 : 2 errors [ 8.88599954 -4.25447964]

Epoch 10 : 2 errors [ 9.20730335 -4.42088398]

Epoch 11 : 2 errors [ 9.50688805 -4.57679907]

Epoch 12 : 2 errors [ 9.78775333 -4.72364042]

Epoch 13 : 2 errors [ 10.05230431  -4.86254453]

Epoch 14 : 2 errors [ 10.30250181  -4.99443943]

Epoch 15 : 2 errors [ 10.53996725  -5.12009383]

Epoch 16 : 2 errors [ 10.76605788  -5.24015235]

Epoch 17 : 2 errors [ 10.98192206  -5.35516147]

Epoch 18 : 2 errors [ 11.18854051  -5.46558884]

Epoch 19 : 2 errors [ 11.38675787  -5.57183803]

test accuracy: mean square= 0.98553495913 raw= 1.0
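The `logreg_train` helper is also from `demofuncs` and not shown. A minimal sketch, assuming it mirrors `perceptron_train` but takes an SGD step on the logistic loss for every point (with labels in {-1, +1} and the stated eta=0.8):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logreg_train(X, y, maxiter, eta=0.8):
    """Hypothetical sketch of demofuncs.logreg_train:
    SGD on the logistic loss with labels in {-1, +1}."""
    num_points, num_dims = X.shape
    w = np.zeros(num_dims)
    for epoch in range(maxiter):
        num_errors = 0
        for i in range(num_points):
            margin = y[i] * w.dot(X[i])
            if margin <= 0:
                num_errors += 1
            # gradient of log(1 + exp(-y w.x)) w.r.t. w is
            # -y * x * sigmoid(-y w.x); step in the opposite direction
            w += eta * y[i] * X[i] * sigmoid(-margin)
        print('Epoch', epoch, ':', num_errors, 'errors', w)
    return w
```

Unlike the perceptron, every point contributes an update, weighted by how wrong the current prediction is, so the weights keep growing slowly even after the data is separated, consistent with the epoch-by-epoch output above.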


In the above experiment both learners classify every test point correctly (raw accuracy 1.0), but logistic regression achieves a noticeably higher mean square accuracy: it finds a hyperplane whose predictions on the test points are more confident, not merely correct.

What happens when the data is not linearly separable?

In [6]:
nonlinX = np.vstack((trainX, [1, -0.5]))
nonliny = np.append(trainy, -1)

plot_labeled_data(nonlinX, nonliny, [0, 0])

In [7]:
w = perceptron_train(nonlinX, nonliny, 20)
meansq_acc, acc = score(testX, testy, w)
print('test accuracy: mean square=', meansq_acc, 'raw=', acc)

Epoch 0 : 9 errors [ 0.41675788 -0.15539514]

Epoch 1 : 2 errors [-0.6109035   0.28600526]

Epoch 2 : 5 errors [-0.1209306   0.04633054]

Epoch 3 : 9 errors [ 0.68191618 -0.63870174]

Epoch 4 : 8 errors [ 1.00171985  0.22253424]

Epoch 5 : 7 errors [ 0.66498973 -0.06809072]

Epoch 6 : 8 errors [ 0.28041934 -0.1464825 ]

Epoch 7 : 1 errors [-0.71958066  0.3535175 ]

Epoch 8 : 5 errors [-0.22960775  0.11384278]

Epoch 9 : 9 errors [ 0.57323903 -0.5711895 ]

Epoch 10 : 8 errors [ 0.89304269  0.29004648]

Epoch 11 : 8 errors [ 0.48402368 -0.47056766]

Epoch 12 : 9 errors [ 0.75172424 -0.2582918 ]

Epoch 13 : 6 errors [ 0.17749152 -0.10189397]

Epoch 14 : 1 errors [-0.82250848  0.39810603]

Epoch 15 : 5 errors [-0.33253557  0.15843131]

Epoch 16 : 9 errors [ 0.4703112  -0.52660097]

Epoch 17 : 8 errors [ 0.5764883 -0.0275209]

Epoch 18 : 8 errors [ 0.19191792 -0.10591268]

Epoch 19 : 1 errors [-0.80808208  0.39408732]

test accuracy: mean square= 0.646383395811 raw= 0.0

In [8]:
w = logreg_train(nonlinX, nonliny, 20)
meansq_acc, acc = score(testX, testy, w)
print('test accuracy: mean square=', meansq_acc, 'raw=', acc)

Epoch 0 : 5 errors [ 2.41641448 -1.04077012]

Epoch 1 : 3 errors [ 3.27860938 -1.45031627]

Epoch 2 : 3 errors [ 3.77223826 -1.69079293]

Epoch 3 : 3 errors [ 4.09692967 -1.84988036]

Epoch 4 : 3 errors [ 4.32470561 -1.96176911]

Epoch 5 : 3 errors [ 4.49047721 -2.04334696]

Epoch 6 : 3 errors [ 4.61397604 -2.10420861]

Epoch 7 : 3 errors [ 4.70745113 -2.15032678]

Epoch 8 : 3 errors [ 4.77899812 -2.18565835]

Epoch 9 : 3 errors [ 4.83420882 -2.21294241]

Epoch 10 : 3 errors [ 4.87707165 -2.23413657]

Epoch 11 : 3 errors [ 4.91050022 -2.25067335]

Epoch 12 : 3 errors [ 4.93666177 -2.26361987]

Epoch 13 : 3 errors [ 4.95719084 -2.27378196]

Epoch 14 : 3 errors [ 4.97333342 -2.2817745 ]

Epoch 15 : 3 errors [ 4.98604723 -2.2880705 ]