Commit 47ea799

created k-means
first draft of k-means post, including intro, link to video and initial description of k-means algorithm.
1 parent 126d346 commit 47ea799

File tree

2 files changed: +39 −1 lines changed

_apple-2-blog/emulator.md

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ I can't remember exactly when my parents ponied up and bought me the Apple ]\[+ bu
  My obsession with the Apple computer began during the summer after the 6<sup>th</sup> grade. My mother signed me up for computer camp where we learned how to write code on the Apple ]\[. These machines didn't even have floppies, so we used cassette tape recorders. You'd press 'record' on the tape recorder before entering `SAVE` to store the program. To get it back you'd type `LOAD` and then press 'play' on the tape recorder. It wasn't punch cards, but that's wild. My program was a lowres picture of a Star Wars battle (another childhood obsession).

- From that moment, I had to have one. Every single birthday or holiday from that point forward, when anyone asked me what I wanted, to everyone's dismay, I'd say, "Money." My plan was to save up enough to get one. In the meantime, since these computers were not readily available, I would write BASIC programs on notebook paper (to do what, I can't remember), ride my bike to the mall and then try them out on the [TRS-80 Color Computer](https://en.wikipedia.org/wiki/TRS-80_Color_Computer){:target="_blank"}s at Radio Shack. As I recognized at the time, the sales people there were extremely cool to let me do that.
+ From that moment, I had to have one. Every single birthday or holiday from that point forward, when anyone asked me what I wanted, to everyone's dismay, I'd say, "Money." My plan was to save up enough to get one. In the meantime, since these computers were not readily available, I would write BASIC programs on notebook paper (to do what, I can't remember), ride my bike to the mall and then try them out on the [TRS-80 Color Computers](https://en.wikipedia.org/wiki/TRS-80_Color_Computer){:target="_blank"} at Radio Shack. As I recognized at the time, the sales people there were extremely cool to let me do that.

  Saving up that kind of money took years, but sometime around the 9<sup>th</sup> grade I had enough to get a TRS-80, which was significantly less expensive at about half the cost. My parents, realizing that I was about to drop every penny I had on a computer inferior to the one I really wanted, stepped up and surprised me. I can still remember the boxes sitting in the living room when I got home.

_apple-2-blog/k-means.md

Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
---
title: "K-means by another means"
excerpt: "Success! There is machine learning happening on my Apple ]\[+."
tags:
- K-means
- machine learning
date: 2025-09-20
---

Wait. Does k-means count as machine learning? Yes. It does.

[CS229](https://cs229.stanford.edu/){:target="_blank"} is the graduate-level machine learning course I took at Stanford as part of the [Graduate Certificate in AI](https://www.linkedin.com/pulse/graduate-certificate-ai-achievement-unlocked-mark-cramer/){:target="_blank"} that I received back in 2021. K-means is my pick for the easiest-to-understand algorithm in machine learning, and it was taught there as the introductory clustering algorithm for unsupervised learning. As a TA (technically a 'course facilitator') for [XCS229](https://online.stanford.edu/courses/xcs229-machine-learning){:target="_blank"}, which I have been doing since 2022 and most recently did this spring, I know that it is still being taught as part of this seminal course in AI.
## We have liftoff!

Unlike [previously](/apple-2-blog/synthesizing-data/), where I saved the result for the end, let's start by taking a look at the algorithm in action!

[![Video of Apple \]\[+ running k-means](https://img.youtube.com/vi/xi876Gqt4jk/0.jpg)](https://youtube.com/shorts/Cy0wMMLObVA "Video of Apple ]\[+ running k-means"){:target="_blank"}

For debugging purposes, and to speed up execution, I reduced the number of samples in each class to 5. That's obviously pretty small, but you can see the algorithm iterating. At the end of each loop I draw a line between the latest estimates of the cluster centroids. The perpendicular bisectors of these segments are the decision boundaries between the classes, so I draw those, too. Some of the code was written to handle more than two classes, but here there are only two, which makes this relatively easy.
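As an aside, computing that boundary is just a little coordinate geometry: the bisector passes through the midpoint of the segment and is perpendicular to it. Here is a minimal sketch in Python (not the code running on the Apple ]\[+; the helper name and the line representation are my own) of how one might derive it from two centroids:

```python
def decision_boundary(c0, c1):
    """Perpendicular bisector of the segment between centroids c0 and c1.

    Returns (a, b, c) describing the line a*x + b*y = c. Points with
    a*x + b*y < c lie on c0's side; points with a*x + b*y > c on c1's.
    """
    # Midpoint of the segment: the bisector passes through it.
    mx, my = (c0[0] + c1[0]) / 2, (c0[1] + c1[1]) / 2
    # The segment's direction vector is the bisector's normal vector.
    a, b = c1[0] - c0[0], c1[1] - c0[1]
    return a, b, a * mx + b * my
```

For centroids (0, 0) and (2, 0), for example, this yields the vertical line x = 1, which is exactly the boundary you'd expect halfway between them.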
## K-means explained
[K-means clustering](https://en.wikipedia.org/wiki/K-means_clustering){:target="_blank"} aims to partition \\(n\\) observations into \\(k\\) clusters in which each observation belongs to the cluster with the nearest mean, called the cluster centroid. In pseudo-code it looks like this:

|Step|Description|
|--|--|
|Initialize|Produce an initial set of \\(k\\) cluster centroids. This can be done by randomly choosing \\(k\\) observations from the dataset.|
|Step 1 - Assignment|Using Euclidean distance to the centroids, assign each observation to the cluster with the nearest centroid.|
|Step 2 - Update|For each cluster, recompute the centroid using the newly assigned observations. If the centroids changed (outside of a certain tolerance), go back to Step 1 and repeat.|
Ezpz.

The math is also very easy. In Step 1 the distance between two points, \\(x\\) and \\(y\\), is simply \\(\sqrt{(x_0 - y_0)^2 + (x_1 - y_1)^2 + \cdots + (x_d - y_d)^2}\\). Since we're just comparing distances, it's not even necessary to take the square root. In Step 2, the centroid is simply the sum of all the points divided by the number of points.
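Putting the table and the math together, the whole loop fits in a few lines. This is just an illustrative sketch in modern Python, not the program from the video; the function and parameter names are mine, and it handles 2-D points only:

```python
import random

def kmeans(points, k, tol=1e-12, max_iter=100, seed=0):
    """Minimal k-means over 2-D points; returns (centroids, assignments)."""
    rng = random.Random(seed)
    # Initialize: randomly choose k observations as the starting centroids.
    centroids = [tuple(p) for p in rng.sample(points, k)]
    assign = [0] * len(points)

    def d2(p, c):
        # Squared Euclidean distance: no sqrt needed when only comparing.
        return (p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2

    for _ in range(max_iter):
        # Step 1 - Assignment: each observation joins its nearest centroid.
        assign = [min(range(k), key=lambda j: d2(p, centroids[j])) for p in points]
        # Step 2 - Update: each centroid becomes the mean of its members.
        new = []
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                new.append((sum(p[0] for p in members) / len(members),
                            sum(p[1] for p in members) / len(members)))
            else:
                new.append(centroids[j])  # keep an empty cluster's centroid
        moved = max(d2(c, n) for c, n in zip(centroids, new))
        centroids = new
        if moved <= tol:  # converged: centroids stopped moving
            break
    return centroids, assign
```

On two well-separated blobs this converges in a handful of iterations, which is what makes it plausible to run even on 1 MHz of 6502.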
## Implementation
{to be written...}
