Logistic Regression (Loss Method) in Machine Learning using Python
Logistic Regression
Logistic regression is similar to linear regression, but it is used when the output is binary, i.e. when the outcome can take only two possible values. Instead of returning the linear combination of the features directly, the model passes it through a non-linear S-shaped curve called the logistic function, g(). This function maps any intermediate value into the range 0 to 1, and the result can be interpreted as the probability that the outcome Y occurs. These properties of the S-shaped logistic function are what make logistic regression well suited to classification tasks.
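To see the S-shape numerically, here is a minimal sketch of the logistic function g(z) = 1 / (1 + e^(-z)); the name sigmoid below is just an illustrative identifier:

import numpy as np

def sigmoid(z):
    # Squashes any real number into the open interval (0, 1)
    return 1 / (1 + np.exp(-z))

print(sigmoid(-4), sigmoid(0), sigmoid(4))
# ~0.018, 0.5, ~0.982 -- large negative inputs go towards 0, large positive towards 1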
Logistic Model
Consider a model with features x1, x2, x3 … xn. Let the binary output be denoted by Y, which can take the values 0 or 1, and let p be the probability of Y = 1, i.e. p = P(Y=1). The mathematical relationship between these variables can be denoted as:

p = g(b0 + b1·x1 + b2·x2 + … + bn·xn) = 1 / (1 + e^-(b0 + b1·x1 + … + bn·xn))

where g() is the logistic function and b0, b1, …, bn are the coefficients the model estimates.
Loss Function
The loss is basically the error in our predicted value; in other words, it is the difference between our predicted value and the actual value. We will be using the L2 loss function (the sum of squared errors) to calculate this error, although in principle you can use any suitable function. It can be broken down as:
- Let the actual value be yᵢ and the value predicted by our model be ȳᵢ. Find the difference between the actual and predicted value.
- Square this difference.
- Sum these squared differences across all the values in the training data, giving L = Σᵢ (yᵢ − ȳᵢ)², as sketched below.
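As a quick numeric illustration (the labels and predictions below are made up purely for the example):

import numpy as np

# Hypothetical actual labels and model outputs
y_actual = np.array([0, 1, 1, 0])
y_predicted = np.array([0.1, 0.8, 0.6, 0.3])

# L2 loss: sum of squared differences
l2_loss = np.sum((y_actual - y_predicted) ** 2)
print(l2_loss)  # 0.01 + 0.04 + 0.16 + 0.09 = 0.30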
Now that we have the error, we need to update the values of our parameters to minimize it. This is where the “learning” actually happens: the model updates itself based on its previous output to produce a more accurate output in the next step, so with each iteration it becomes more and more accurate. We will be using the Gradient Descent algorithm to estimate our parameters. Another commonly used approach is Maximum Likelihood Estimation.
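Before applying it to our model, here is a minimal, self-contained sketch of the idea on a toy one-dimensional function (all names and values are purely illustrative):

# Minimize f(b) = (b - 3)^2, whose gradient is 2 * (b - 3)
b = 0.0       # initial guess
L = 0.1       # learning rate
for _ in range(100):
    b = b - L * 2 * (b - 3)   # step against the gradient
print(b)      # converges very close to 3, the minimizer

The same pattern, "compute the gradient of the loss, then step the parameters against it", is exactly what we use below to estimate b0 and b1.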
(Figure: the loss/error on the y-axis against the number of iterations on the x-axis.)
Implementing the Model
The data was taken from Kaggle and describes whether a product was purchased through an advertisement on social media. We will predict the value of Purchased using a single feature, Age; you can use multiple features as well.
# Product being Purchased through an advertisement on social media.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from math import exp
plt.rcParams["figure.figsize"] = (10, 6)
## Load the Input Data
data = pd.read_csv("Social_Network_Ads.csv")
data.head()
   User ID  Gender  Age  EstimatedSalary  Purchased
0  15624510    Male   19            19000          0
1  15810944    Male   35            20000          0
2  15668575  Female   26            43000          0
3  15603246  Female   27            57000          0
4  15804002    Male   19            76000          0
# Let's visualize the given data.
plt.scatter(data['Age'], data['Purchased'])
plt.show()
# We need to normalize our training data, shifting the feature mean to the origin (i.e. 0).
# Creating the logistic regression model
# Helper function to normalize data
def normalize(X):
    # Centre the feature around zero by subtracting its mean
    return X - X.mean()

# Method to make predictions
def predict(X, b0, b1):
    # Apply the logistic function to the linear combination b0 + b1*x
    return np.array([1 / (1 + exp(-(b0 + b1 * x))) for x in X])
# Method to train the model
def logistic_regression(X, Y):
    X = normalize(X)
    # Initializing variables
    b0 = 0
    b1 = 0
    L = 0.001      # learning rate
    epochs = 300
    for epoch in range(epochs):
        y_pred = predict(X, b0, b1)
        D_b0 = -2 * sum((Y - y_pred) * y_pred * (1 - y_pred))      # Derivative of loss w.r.t. b0
        D_b1 = -2 * sum(X * (Y - y_pred) * y_pred * (1 - y_pred))  # Derivative of loss w.r.t. b1
        # Update b0 and b1 by stepping against the gradient
        b0 = b0 - L * D_b0
        b1 = b1 - L * D_b1
    return b0, b1
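For reference, the two gradients follow from the chain rule: with ȳᵢ = g(b0 + b1·xᵢ) and the logistic function satisfying g′(z) = g(z)(1 − g(z)), differentiating the L2 loss Σᵢ (yᵢ − ȳᵢ)² with respect to b0 gives −2 Σᵢ (yᵢ − ȳᵢ)·ȳᵢ(1 − ȳᵢ), and the b1 gradient picks up an extra factor of xᵢ.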
# Splitting the data into train and test sets
# (test_size here is an assumed value; the original split was not shown)
X_train, X_test, y_train, y_test = train_test_split(data['Age'], data['Purchased'], test_size=0.20)
# Training the model
b0, b1 = logistic_regression(X_train, y_train)
# Making predictions
X_test_norm = normalize(X_test)
y_pred = predict(X_test_norm, b0, b1)
y_pred = [1 if p >= 0.5 else 0 for p in y_pred]
plt.clf()
plt.scatter(X_test, y_test)
plt.scatter(X_test, y_pred, c="red")
plt.show()
# The accuracy
accuracy = 0
for i in range(len(y_pred)):
    if y_pred[i] == y_test.iloc[i]:
        accuracy += 1
print(f"Accuracy = {accuracy / len(y_pred)}")
Accuracy = 0.85
# Making predictions using scikit-learn
from sklearn.linear_model import LogisticRegression

# Create an instance and fit the model (y is passed as a 1-D array,
# since a column vector would trigger a DataConversionWarning)
lr_model = LogisticRegression()
lr_model.fit(X_train.values.reshape(-1, 1), y_train.values)

# Making predictions
y_pred_sk = lr_model.predict(X_test.values.reshape(-1, 1))
plt.clf()
plt.scatter(X_test, y_test)
plt.scatter(X_test, y_pred_sk, c="red")
plt.show()

# Accuracy
print(f"Accuracy = {lr_model.score(X_test.values.reshape(-1, 1), y_test.values)}")
Accuracy = 0.8625
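The fitted sklearn model also exposes class probabilities. A quick sketch (the age value 45 is an arbitrary, made-up input):

# Probabilities of [no purchase, purchase] for a hypothetical 45-year-old
print(lr_model.predict_proba(np.array([[45]])))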
Thus we have implemented a seemingly complicated algorithm easily in Python from scratch, and compared it with a standard model in sklearn that does the same. I think the most crucial part here is the gradient descent algorithm and learning how the weights are updated at each step. Once you have learned this basic concept, you will be able to estimate parameters for any function.
Now, to predict whether a user will purchase the product or not, we need to model the relationship between Purchased and the two features Age and EstimatedSalary. User ID and Gender are not useful factors here.
# input
x = data.iloc[:, [2, 3]].values
# output
y = data.iloc[:, 4].values

from sklearn.preprocessing import StandardScaler
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.25, random_state=0)
sc_x = StandardScaler()
xtrain = sc_x.fit_transform(xtrain)
xtest = sc_x.transform(xtest)
print(xtrain[0:10, :])
[[ 0.58164944 -0.88670699]
 [-0.60673761  1.46173768]
 [-0.01254409 -0.5677824 ]
 [-0.60673761  1.89663484]
 [ 1.37390747 -1.40858358]
 [ 1.47293972  0.99784738]
 [ 0.08648817 -0.79972756]
 [-0.01254409 -0.24885782]
 [-0.21060859 -0.5677824 ]
 [-0.21060859 -0.19087153]]
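StandardScaler rescales each feature to zero mean and unit variance, which is why the printed values hover around zero; it puts Age and EstimatedSalary, which live on very different scales, on a comparable footing.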
from sklearn.linear_model import LogisticRegression
classifier = LogisticRegression(random_state=0)
classifier.fit(xtrain, ytrain)

LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
                   intercept_scaling=1, l1_ratio=None, max_iter=100,
                   multi_class='warn', n_jobs=None, penalty='l2',
                   random_state=0, solver='warn', tol=0.0001, verbose=0,
                   warm_start=False)
y_pred = classifier.predict(xtest)

from sklearn.metrics import confusion_matrix
cm = confusion_matrix(ytest, y_pred)
print("Confusion Matrix : \n", cm)

Confusion Matrix :
 [[65  3]
 [ 8 24]]
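In scikit-learn's convention the rows of the confusion matrix are the actual classes and the columns the predicted ones, so 65 non-purchases and 24 purchases were classified correctly while 3 + 8 = 11 points were misclassified; that matches the accuracy of (65 + 24) / 100 = 0.89 computed below.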
from sklearn.metrics import accuracy_score
print("Accuracy : ", accuracy_score(ytest, y_pred))

Accuracy :  0.89
from matplotlib.colors import ListedColormap
X_set, y_set = xtest, ytest
X1, X2 = np.meshgrid(np.arange(start=X_set[:, 0].min() - 1, stop=X_set[:, 0].max() + 1, step=0.01),
                     np.arange(start=X_set[:, 1].min() - 1, stop=X_set[:, 1].max() + 1, step=0.01))
# Colour every point of the grid with the class the classifier predicts for it
plt.contourf(X1, X2,
             classifier.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
             alpha=0.75, cmap=ListedColormap(('red', 'green')))
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())
# Overlay the actual test points, coloured by their true class
for i, j in enumerate(np.unique(y_set)):
    plt.scatter(X_set[y_set == j, 0], X_set[y_set == j, 1],
                c=ListedColormap(('red', 'green'))(i), label=j)
plt.title('Classifier (Test set)')
plt.xlabel('Age')
plt.ylabel('Estimated Salary')
plt.legend()
plt.show()
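Because logistic regression is linear in its inputs, the boundary between the red and green regions in this plot is a straight line in the standardized Age/EstimatedSalary plane.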
Please post your comments.