K means Clustering - 10601
Automatically partition unlabeled data points into groups of data points
Understanding Hidden Structure
Cluster NewsArticles, Web Pages or search results by topic.
Cluster Protein sequences by func or genes according to expression profile.
Cluster user of Social network based on intrest(Commuity detection) - Ad target
Goal - Minimize a function J(theta)
eg. theta = argmin(J(theta))
Idea: Pick one dimension, and minimize along that dimension.
1. Choosing Initial Point heta
2. Repeat until stopping criteria is reached
a. Theta1 = argmin J(theta1,theta2,….thetan)
b. Theta2 = argmin J(theta1,theta2,….thetan)
c. Theta1 = argmin J(theta1,theta2,….thetan)
d. Thetan = argmin J(theta1,theta2,….thetan)
Note: Steps abc..,d are exact line search alo ng same axis.
In some cases this algo can get stuck and start oscillating along two points. In this scenario we can use something called as Block Coordinate Descent.
Block Coordinate Descent:
Here: An eg. With two block alpha and beta where theta = [alpha, beta]
Goal: alpha,beta = argmin J(alpha,beta) where alpha in Ra and beta in Rb
Idea: Minimize over an entire group of variables at a time.
1.) Choose alpha and beta
2.) Repeat until stopping criteria is reached
a. Alpha – argmin (J(alpha,beta))
b. Beta – argmin (J(alpha,beta))
Left figure - Distance between two points in same cluster is less than different cluster
Right Figure - Distance between two points in same cluster is more than different cluster
Now we define a Object function to minimize.
What is clustering? Goal is to partition unlabeled instances into group of similar points.
Input: Unlabeld data D = {X1,X2,X3….Xn}, X(i) belongs Rm
*We do not know the labels!
View#1: Labeled Instances {(X(1),Z(1)), (X(2),Z(2)), (X(3),Z(3))….. (X(n),Z(n))}
Where Z(1) E {1,2,3,….,K} k is # of clusters
View#2: Clusterings: C1,C2,…..Ck where k is number of clusters Ci = {X(i): Z(i)=j} pints in jth partition
1.) How many clusters are there?
2.) How do we define similarity between points? Eg. Eucledian distance
Cluster Centers: {c1,c2,…,ck} = c
Decision Rule: Assign point X(i) to its nearest cluster center cj
C= argmin sumi=1N minj in {1,..,k}||X(i)-cj||2