
NOTES ON STATISTICS, PROBABILITY and MATHEMATICS


Perceptron:


In machine learning, the perceptron is an algorithm for supervised learning of binary classifiers: functions that can decide whether an input (represented by a vector of numbers) belongs to one class or another (Wikipedia). The idea is similar to logistic regression, although the optimization is different:

In vector form, the $d$ features or attributes of an example are $x$, and the idea is to “pass” the example if:

$\sum_{i=1}^{d} \theta_i x_i > \text{threshold}$ or…

$h(x) = \text{sign}\left(\left(\sum_{i=1}^{d} \theta_i x_i\right) - \text{threshold}\right)$. The sign function results in $-1$ or $1$, as opposed to $0$ and $1$ in logistic regression.

The threshold will be absorbed into the bias coefficient, $+\theta_0$ (with $x_0 = 1$). The formula is now:

$h(x) = \text{sign}\left(\sum_{i=0}^{d} \theta_i x_i\right)$, or vectorized:

$h(x) = \text{sign}(\theta^T x)$.
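As a minimal sketch of this hypothesis in R (the weights and the inputs below are made-up values for illustration, and the helper name `h` is only borrowed from the formula):

```r
# Hypothetical weights: theta[1] is the bias theta_0, the rest are theta_1..theta_d
theta <- c(-1, 0.5, 0.25)

# h(x) = sign(theta^T x); a leading 1 is prepended to x so that theta_0
# plays the role of the absorbed threshold
h <- function(x, theta) sign(sum(theta * c(1, x)))

h(c(4, 2), theta)   # -1 + 2.0 + 0.50 =  1.50, so the result is  1
h(c(1, 1), theta)   # -1 + 0.5 + 0.25 = -0.25, so the result is -1
```

Note that R's `sign()` returns 0 when the dot product is exactly zero, a case the formula above does not address.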

Misclassified points will have:

$\text{sign}(\theta^T x_n) \neq y_n$, meaning that the dot product of $\theta$ and $x_n$ will be positive (vectors in the same direction) when $y_n$ is negative, or the dot product will be negative (vectors in opposite directions) while $y_n$ is positive.

The process starts with random weights or coefficients, and applies the following update for every misclassified point or example $n$ in the training sample:

$\theta := \theta + y_n \times x_n$
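The update rule can be sketched in R on simulated data. Everything in this block is an assumption made for illustration: the data are random, `true_theta` is an arbitrary separating line, points very close to that line are dropped so that the classes are clearly separable, and one misclassified point is corrected per pass.

```r
set.seed(1)
n <- 200
X <- cbind(1, matrix(runif(2 * n, -1, 1), ncol = 2))  # column of 1s for theta_0
true_theta <- c(0.1, 1, -1)                           # hypothetical separating line
score <- as.vector(X %*% true_theta)
keep <- abs(score) > 0.1                              # drop points too close to the line
X <- X[keep, ]
y <- sign(score[keep])                                # labels are -1 or 1

theta <- rnorm(3)                                     # random starting weights
updates <- 0
max_updates <- 10000                                  # safety cap on the loop
repeat {
  # indices of points where sign(theta^T x_n) differs from y_n
  miss <- which(sign(as.vector(X %*% theta)) != y)
  if (length(miss) == 0 || updates >= max_updates) break
  i <- miss[1]                                        # first misclassified point
  theta <- theta + y[i] * X[i, ]                      # theta := theta + y_n * x_n
  updates <- updates + 1
}
```

When the loop stops with no misclassified points left, `theta` defines a line that separates the two classes in this sample, although it need not coincide with `true_theta`.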

In this example I use logistic regression to get a decision boundary. We are looking at two test results, and the ultimate outcome of whether the student gets into college or not.

The code is as follows:

dat = read.csv("perceptron.txt", header = FALSE)
colnames(dat) = c("test1","test2","y")
dat[1:5,]
##      test1    test2 y
## 1 34.62366 78.02469 0
## 2 30.28671 43.89500 0
## 3 35.84741 72.90220 0
## 4 60.18260 86.30855 1
## 5 79.03274 75.34438 1
plot(test2 ~ test1, col = as.factor(y), pch = 20, data=dat,
     main = "Decision Boundary - College Admission")

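# Fit a logistic regression of admission on the two test scores
fit = glm(y ~ test1 + test2, family = "binomial", data = dat)
coefs = coef(fit)   # order: (Intercept), test1, test2
# Two test1 values spanning the data, and the matching test2 values on the boundary
x = c(min(dat[,1])-2,  max(dat[,1])+2)
y = (-1/coefs[3]) * (coefs[2] * x + coefs[1])
lines(x, y, lwd = 3, col = rgb(0,.9,.1,.4))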

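The decision boundary line corresponds to:

$0 = \theta_0 + \theta_1 \times \text{test1} + \theta_2 \times \text{test2}$. Hence, $\text{test2} = \left(\frac{-1}{\theta_2}\right) \times (\theta_0 + \theta_1 \times \text{test1})$.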
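A quick numerical check of this rearrangement, using made-up coefficients rather than the fitted ones (the values in `theta` and the helper `boundary` are hypothetical): points on the line should have a linear predictor of zero, which in logistic regression corresponds to a predicted probability of 0.5.

```r
# Hypothetical coefficients (theta_0, theta_1, theta_2), not the fitted values
theta <- c(-25, 0.2, 0.2)

# test2 on the boundary as a function of test1
boundary <- function(test1, theta) (-1 / theta[3]) * (theta[1] + theta[2] * test1)

test1 <- c(30, 60, 90)
test2 <- boundary(test1, theta)

# Linear predictor evaluated on the boundary, and the matching probability
eta <- theta[1] + theta[2] * test1 + theta[3] * test2
plogis(eta)   # inverse logit of eta
```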



NOTE: These are tentative notes on different topics for personal use - expect mistakes and misunderstandings.