Shaked-g/Kohonen-SOM-Algorithm

Kohonen-SOM-Algorithm

By Shaked Gofin

Self-Organizing Map (SOM) Overview

A Self-Organizing Map (SOM) is an unsupervised neural network model that performs both dimensionality reduction and data clustering. In this project, the SOM organizes high-dimensional data onto a lower-dimensional (2D) grid, preserving the original topological relationships within the data.

The SOM consists of a grid of neurons, each holding a weight vector that is typically initialized at random. During training, each sampled data point is compared to all neurons, and the closest neuron (the Best Matching Unit, or BMU) is moved toward that data point together with its neighbors on the grid. Repeated over many iterations, this makes similar data points map to nearby neurons on the SOM grid.

In this project, the SOM is used to visualize and cluster data points sampled from different distributions. The data are drawn from a square and from a “donut” (ring), and the plots show how the map adapts to each shape over the course of training. Visualizing the SOM grid helps in understanding the data’s underlying patterns and structure, which makes it useful for exploratory data analysis.

Implementation

Part A: Implement the Kohonen algorithm and use it to fit a line of neurons to data in the shape of a square. To obtain data points from the square {(x,y) | 0 <= x <= 1, 0 <= y <= 1}, I used np.random.rand(5000, 2), which returns 5000 pairs of random numbers in [0,1) and so gives data points that fill the square, as can be seen in the main function:

image
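Since the screenshot is not reproduced here, the sampling step can be sketched as follows. The variable name `square_data` and the fixed seed are assumptions made for illustration; only the call `np.random.rand(5000, 2)` comes from the text above.

```python
import numpy as np

# Fixed seed only so that this sketch is reproducible (an assumption,
# the text above does not mention seeding).
np.random.seed(0)

# 5000 points, each an (x, y) pair drawn uniformly from [0, 1).
square_data = np.random.rand(5000, 2)

print(square_data.shape)
print(square_data.min() >= 0.0, square_data.max() < 1.0)
```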

I created a class called SOM to encapsulate all the necessary functions. It holds the shape of the SOM, creates a line of neurons along the line y=0, and initializes the learning rate, time, and sigma parameters.

image
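A minimal sketch of what such a constructor might look like is given below. The class name `LineSOM`, the parameter names, and the default values are all hypothetical; the only details taken from the description are the stored shape, the line of neurons along y=0, and the learning rate, time, and sigma parameters.

```python
import numpy as np


class LineSOM:
    """Hypothetical stand-in for the SOM class described above."""

    def __init__(self, n_neurons=30, learning_rate=0.5, sigma=None):
        # Shape of the map: a single row of neurons.
        self.shape = (n_neurons, 1)
        # Neurons are spread evenly over x in [0, 1], all with y = 0.
        xs = np.linspace(0.0, 1.0, n_neurons)
        self.weights = np.column_stack([xs, np.zeros(n_neurons)])
        # Training parameters (default values are assumptions).
        self.learning_rate = learning_rate
        self.sigma = sigma if sigma is not None else n_neurons / 2.0
        self.time = 0


som = LineSOM(n_neurons=30)
print(som.weights[:3])
```

Plotting `som.weights` before any training step would show the flat initial line of neurons.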

The initial line of neurons can be seen here when I ran the program with zero iterations.

image

The Train function sets the parameters and chooses a data point according to the requested distribution (1 = Uniform, 0 = NonUniform). Uniform: a random number is used as the index into the given data (in our case the square or the donut). NonUniform: each index is given a probability of being chosen, drawn from a Dirichlet distribution.

image
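The two sampling modes could be sketched roughly as below. The function name `index_probabilities`, the use of `np.random.default_rng`, and the choice of a symmetric Dirichlet with all concentration parameters equal to 1 are assumptions; the text above does not state how the Dirichlet distribution is parameterized.

```python
import numpy as np


def index_probabilities(n_points, uniform, rng):
    """Return per-index selection probabilities (hypothetical helper).

    uniform == 1: None, which makes rng.choice sample uniformly.
    uniform == 0: one draw from a symmetric Dirichlet distribution.
    """
    if uniform == 1:
        return None
    return rng.dirichlet(np.ones(n_points))


rng = np.random.default_rng(0)
data = rng.random((5000, 2))

# Non-uniform mode: some data points are far more likely to be picked.
probs = index_probabilities(len(data), uniform=0, rng=rng)
idx = int(rng.choice(len(data), p=probs))
input_vector = data[idx]

# Uniform mode: every index is equally likely.
uniform_idx = int(rng.choice(len(data), p=index_probabilities(len(data), 1, rng)))
```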

The chosen data point is passed to the find_bmu function as input_vector <x,y>. The function iterates over all neurons, calculates the Euclidean distance between the input vector and each neuron's weight vector, and appends the result, along with the coordinates of that neuron, to a list. At the end, the list is sorted and the function returns the neuron whose weight vector is closest to the input vector, the best matching unit (BMU). The SOM can then be updated using the update_som function.

image image image
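The BMU search can be sketched in a few lines. This version assumes the neuron weights are stored as an (n, 2) array and uses `np.argmin` instead of building and sorting a list; it should select the same neuron, but it is a vectorized variant for illustration, not the code shown in the screenshots.

```python
import numpy as np


def find_bmu(weights, input_vector):
    """Return the index of the closest neuron and its distance."""
    # Euclidean distance from the input vector to every neuron.
    distances = np.linalg.norm(weights - input_vector, axis=1)
    bmu_index = int(np.argmin(distances))
    return bmu_index, float(distances[bmu_index])


# Three neurons on the line y = 0 and one input point.
weights = np.array([[0.0, 0.0], [0.5, 0.0], [1.0, 0.0]])
bmu_index, bmu_distance = find_bmu(weights, np.array([0.6, 0.1]))
print(bmu_index, bmu_distance)
```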

Here is a description of the Algorithm from the Wikipedia page:

image
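As a rough illustration of the update step, the sketch below moves every neuron toward the input vector, weighted by how far it sits from the BMU along the line of neurons. The Gaussian neighborhood function and the parameter names are assumptions; the project's update_som may use a different neighborhood function or decay schedule.

```python
import numpy as np


def update_som(weights, input_vector, bmu_index, learning_rate, sigma):
    """Return updated weights for a 1D line of neurons (sketch)."""
    # Distance of each neuron from the BMU, measured in grid steps.
    grid_distance = np.arange(len(weights)) - bmu_index
    # Gaussian neighborhood: 1 at the BMU, smaller further away.
    influence = np.exp(-(grid_distance ** 2) / (2.0 * sigma ** 2))
    # Move each neuron toward the input in proportion to its influence.
    return weights + learning_rate * influence[:, None] * (input_vector - weights)


weights = np.array([[0.0, 0.0], [0.5, 0.0], [1.0, 0.0]])
updated = update_som(weights, np.array([0.5, 1.0]), bmu_index=1,
                     learning_rate=0.5, sigma=1.0)
print(updated)
```

In a training loop, `learning_rate` and `sigma` would usually shrink over time so that the map settles; that schedule is left out here.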

The results:

image image

Non-Uniform:

image

Part A.2: Data from {<x,y> | 1 <= x^2 + y^2 <= 2}. For the donut shape, I wrote the following function, which returns the desired number of data points:

image
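One way to sample such a ring is sketched below. It draws the squared radius uniformly from [1, 2) and the angle uniformly from [0, 2π), which spreads points evenly over the area of the ring. The function name `sample_donut` and the polar-coordinate approach are assumptions; the function in the screenshot may work differently, for example by rejection sampling.

```python
import numpy as np


def sample_donut(n_points, rng):
    """Sample points with 1 <= x^2 + y^2 <= 2 (hypothetical helper)."""
    # A uniform squared radius gives uniform density over the ring's area.
    radius = np.sqrt(rng.uniform(1.0, 2.0, n_points))
    theta = rng.uniform(0.0, 2.0 * np.pi, n_points)
    return np.column_stack([radius * np.cos(theta), radius * np.sin(theta)])


rng = np.random.default_rng(0)
donut_data = sample_donut(5000, rng)
squared_radius = (donut_data ** 2).sum(axis=1)
print(donut_data.shape, squared_radius.min(), squared_radius.max())
```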

And in the main function:

image

This produces the following results using 30 neurons:

image

image

image

Now, using a 15x15 grid of neurons:

image

image

iterations-550-neurons-30-shape-donut-uniform-1
