K-means clustering lets us dissect a data set and find K “cluster” points (centroids) that best represent it as a whole. These K points are chosen to minimise the total squared distance from each data point to its nearest centroid, so a large data set can potentially be classified and summarised in just a few points.
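To make the idea concrete, here is a minimal sketch of the usual iterative procedure (Lloyd's algorithm) in plain JavaScript. It is not the code behind this post: the function names are made up for illustration, and the initial centroids are simply picked at even intervals from the input, which is an assumed and fairly naive choice.

```javascript
// Squared Euclidean distance between two points of equal dimension.
function squaredDistance(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}

// Index of the centroid closest to `point`.
function nearestCentroid(point, centroids) {
  let best = 0;
  let bestDist = Infinity;
  for (let c = 0; c < centroids.length; c++) {
    const dist = squaredDistance(point, centroids[c]);
    if (dist < bestDist) {
      bestDist = dist;
      best = c;
    }
  }
  return best;
}

// Lloyd's algorithm. `points` is an array of equal-length numeric arrays,
// e.g. [r, g, b] triples. Returns the K centroids and a label per point.
function kMeans(points, k, maxIterations = 20) {
  if (k < 1 || k > points.length) {
    throw new RangeError("k must be between 1 and points.length");
  }
  // Assumed initialisation: take K points at even intervals from the input.
  const step = Math.floor(points.length / k);
  let centroids = [];
  for (let c = 0; c < k; c++) centroids.push(points[c * step].slice());

  const labels = new Array(points.length).fill(-1);
  const dims = points[0].length;

  for (let iter = 0; iter < maxIterations; iter++) {
    // Assignment step: attach every point to its nearest centroid.
    let changed = false;
    for (let i = 0; i < points.length; i++) {
      const label = nearestCentroid(points[i], centroids);
      if (label !== labels[i]) {
        labels[i] = label;
        changed = true;
      }
    }
    if (!changed) break; // No point moved, so the clustering has converged.

    // Update step: move each centroid to the mean of its assigned points.
    const sums = centroids.map(() => new Array(dims).fill(0));
    const counts = new Array(k).fill(0);
    for (let i = 0; i < points.length; i++) {
      counts[labels[i]]++;
      for (let d = 0; d < dims; d++) sums[labels[i]][d] += points[i][d];
    }
    // An empty cluster keeps its previous centroid.
    centroids = centroids.map((old, c) =>
      counts[c] === 0 ? old : sums[c].map((s) => s / counts[c])
    );
  }
  return { centroids, labels };
}
```

Calling `kMeans(pixels, 3)` on an array of `[r, g, b]` triples would return three representative colours plus, for every pixel, the index of the colour it belongs to. Different starting centroids can give different results, so a real implementation would probably want a smarter or randomised initialisation.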

A simple application of K-means clustering is image compression. K-means can work out the K colours that best represent an image, so each pixel’s 24-bit (Red, Green, Blue) value can be replaced by an index of about log2(K) bits into a palette of K colours.
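As a rough back-of-the-envelope check on the savings, the sketch below compares the raw size of an image with its palette-indexed size. It assumes 24-bit RGB source pixels, indices rounded up to a whole number of bits, and a K-colour palette stored alongside the image; the function names are hypothetical, and no real file format or further entropy coding is modelled.

```javascript
// Number of whole bits needed to index K palette entries, i.e. ceil(log2(K)).
// Computed with integer shifts to avoid floating-point rounding in Math.log2.
function bitsPerIndex(k) {
  let bits = 0;
  while ((1 << bits) < k) bits++;
  return bits;
}

// Raw size of a width x height image at 24 bits per pixel.
function rawSizeInBits(width, height) {
  return width * height * 24;
}

// Size after replacing each pixel with a palette index,
// plus the palette itself (K colours at 24 bits each).
function indexedSizeInBits(width, height, k) {
  const indexBits = width * height * bitsPerIndex(k);
  const paletteBits = k * 24;
  return indexBits + paletteBits;
}

// Example: a 640 x 480 image reduced to K = 4 colours.
const raw = rawSizeInBits(640, 480);            // 7372800 bits
const indexed = indexedSizeInBits(640, 480, 4); // 614496 bits
console.log(raw, indexed, (raw / indexed).toFixed(1) + "x smaller");
```

With K = 4 every pixel shrinks from 24 bits to 2 bits, so the image ends up roughly twelve times smaller before any other compression is applied.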

In practice, though, it doesn’t work so well: the compressed image obviously doesn’t look realistic, because natural images need gradients and shading to look real. However, this disadvantage can lead to artistic pursuits!

I’ve hacked together a simple K-means image compressor using JavaScript canvas functions and raw pixel-by-pixel image processing to apply this concept, and then mapped each of the K cluster colours to another, entirely random colour, producing something like the effect of Andy Warhol’s famous Marilyn Diptych.
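The recolouring step could look roughly like the sketch below. This is an illustration rather than the actual code behind the images here: `randomPalette` and `warholRemap` are hypothetical names, the cluster colours are assumed to have been computed already, and the pixel buffer is assumed to be laid out as RGBA bytes, which is how the canvas `ImageData.data` array is organised.

```javascript
// One random replacement colour per cluster.
// `random` is injectable so that results can be made reproducible.
function randomPalette(k, random = Math.random) {
  const palette = [];
  for (let c = 0; c < k; c++) {
    palette.push([0, 1, 2].map(() => Math.floor(random() * 256)));
  }
  return palette;
}

// Recolour an RGBA pixel buffer in place: each pixel is matched to its
// nearest cluster colour, then painted with that cluster's palette colour.
function warholRemap(data, centroids, palette) {
  for (let i = 0; i < data.length; i += 4) {
    let best = 0;
    let bestDist = Infinity;
    for (let c = 0; c < centroids.length; c++) {
      const dr = data[i] - centroids[c][0];
      const dg = data[i + 1] - centroids[c][1];
      const db = data[i + 2] - centroids[c][2];
      const dist = dr * dr + dg * dg + db * db;
      if (dist < bestDist) {
        bestDist = dist;
        best = c;
      }
    }
    data[i] = palette[best][0];
    data[i + 1] = palette[best][1];
    data[i + 2] = palette[best][2];
    // data[i + 3] is the alpha channel and is left untouched.
  }
  return data;
}

// In a browser, with `canvas` and `centroids` already available, this
// might be wired up through the 2D canvas context like so:
//   const ctx = canvas.getContext("2d");
//   const imageData = ctx.getImageData(0, 0, canvas.width, canvas.height);
//   warholRemap(imageData.data, centroids, randomPalette(centroids.length));
//   ctx.putImageData(imageData, 0, 0);
```

Because the palette is drawn at random, every run gives a different colouring, so repeating it a few times and tiling the results side by side gets close to the Diptych layout.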

I found that for natural images it is best to limit K to just 2 or 3 clusters, while for computer-generated art pieces, more clusters can be more visually appealing.

A good example with a natural photo:

Bad example:

Examples of applying the Warhol effect to generative art: