Using Circular Self-Organizing Maps to solve the Symmetric TSP

Prerequisites:

1. Knowing what the Traveling Salesman Problem (TSP) is. The wiki article is a good place to start. The symmetrical form of the problem is where the distance from one city to another is the same in both directions. 2. A decent understanding of what Kohonen/Self-Organizing Maps are. This blog post will point you in the right direction to learn more.

After I wrote my former blog post on implementing SOMs with TensorFlow, someone on Reddit mentioned that there are ways to solve TSPs using them. When I did some research, it appeared that there have been a few attempts to do so. Heres one, and heres another. So in my spare time, I decided to try and have a crack at it myself.

I will describe my approach (along with the code) in a short while, but let me give a disclaimer first. This post in no way claims to have found the ‘best’ way to solve TSPs (ACOs are pretty much the state of the art). It just describes my attempt, and the results I obtained (which were interesting, with respect to the methodology implemented).

I obviously make a few assumptions, and here they are:

1. Every city has a corresponding vector available, denoting its coordinates/location in a 2D/3D space. This source presents many TSP benchmark data-sets in this format. I also tried assigning every city a vector where every dimension is its distance from all the cities(including itself). This gives pretty sub-par results, so I decided not to pursue that. Maybe you could use a dimensionality reduction technique like PCA on these ‘distance’ vectors and get good results.

2. Because of the above point, my method is pretty much restricted to solving the symmetric form of a TSP (atleast the way I have done it).

The Method

Suppose the total number of neurons is $N$. The total number of cities in question is $C$. After some trial and error, I set $N = 2C$. It can be higher/lower as well. The lower the number, the less optimal is the understanding of the solution space. And as you go higher, after a certain point, there seems to be no improvement in the output (but your execution time increases many-fold).

The individual neurons are provided locations around a circle, as follows: $l_i = ( cos(\frac{2i\pi}{N}), sin(\frac{2i\pi}{N}) )$

where $l_i$ is the location vector for the $i$ th neuron in a 2-D space. $i$ goes from $0$ to $N - 1$. It is important to not confuse this space with the space in which the cities lie (even though both are 2-dimensional). Lets call one the Neuron-Space, and the other, the City-Space.

Because of the circular nature of this SOM, we inherently have a closed loop-structure to our framework (as is needed by the TSP solution), where neurons in proximity of each other in the Neuron-Space lie near-by in the City-Space. Also, do note how every neuron essentially corresponds to an angle ( $\frac{2i\pi}{N}$) around the circle. This point would be useful soon.

The initialization of the neuron vectors in City-Space is done by using the mean and standard deviation of the dimensions of the city vectors in City-Space. This is because I observed that the scale of all the dimensions in the city-vectors is pretty much the same. If it isn’t, you might have to resort to building a mean vector and covariance matrix. For training, I use $\alpha = 0.3$ and $\sigma = 0.3$ as parameters for the SOM. Putting a constant value of $\sigma$ ensures that the same proportion of neurons around the Best-Matching Unit get affected everytime, irrespective of the total number of neurons being used.

Once the SOM trained, we can say we have the rough ‘skeleton’ for a good route in the City-Space. But how do you convert this skeleton to the appropriate route? By using those angles I just talked about. Consider a city $c$. Lets say the $k$ nearest neurons to it, are $n_1, n_2, ..., n_k$.

So, we can construct a vector for every city in Neuron-Space, using this simple formula: $l_c = \sum_k \frac{1}{d_{c, n_k}} l_c$

where $d_{c, n_k}$ is the distance between city $c$ and neuron $n_k$ in City-Space. Using this vector and some pretty simple (inverse) trigonometry, you can assign the city an angle around the circle on which the SOM-neurons themselves lie in Neuron-Space. Now all you need to do to come up with an approximately good solution, is just sort the cities with respect to the angles they are assigned. Voila!

Ofcourse, the solution that this algorithm gives you would most likely be in the neighbourhood of a slightly better one, so I use a standard TSP local search heuristic like 3-opt over the solution obtained from my algorithm. What 3-opt essentially does, is change around the solution obtained from some other algorithm over so slightly in some pre-defined ways, in the hope of hitting on a better solution. 3-opt is better than 2-opt for this algorithm. I dint try higher-order k-opt searches.

One interesting possibility about this algorithm, is the fact that it could provide great solutions to solving TSPs for different target cities, without retraining the SOM– as long as they lie in the same City-Space. As long as the set of cities you try to solve for follows (atleast approximately) the trends shown by the dataset used for training, all you would need to do is compute their respective angles (which does not involve re-training), and just apply 3-opt. In short, you could use the same SOM, once trained, to solve for multiple subsets of cities.

The Implementation

This code builds on my Tensorflow-based SOM code. Its pretty rough around the edges, especially the 3-opt search. It was meant just as a proof-of-concept, so feel free to hack around if you like.

import tensorflow as tf
import numpy as np

def _distance(tsp_solution, distance_matrix):
"""
Total distance for a solution.
"""
total_distance = 0
for i in range(len(tsp_solution)-1):
total_distance += (distance_matrix[tsp_solution[i]][
tsp_solution[i + 1]])
total_distance += (distance_matrix[tsp_solution][
tsp_solution[-1]])

def _3_opt(route, distance_matrix):
"""
Uses 3-opt to find a better solution in the neighbourhood.
"""

current_dist = _distance(route, distance_matrix)
#Find three connections to break
for i in range(len(route) - 1):
for j in range(i + 2, len(route) - 1):
for k in range(j + 2, len(route) - 1):
#Now reconnect the graph in every other way possible,
#and see if any gives a better solution
#Way 1
temp = route[:]
temp[j+1:k+1] = reversed(route[j+1:k+1])
dist = _distance(temp, distance_matrix)
if dist < current_dist:
route = temp
current_dist = dist
#Way 2
temp = route[:]
temp[i+1:j+1] = reversed(route[i+1:j+1])
dist = _distance(temp, distance_matrix)
if dist < current_dist:
route = temp
current_dist = dist
#Way 3
temp = route[:]
temp[i+1:j+1], temp[j+1:k+1] = (
reversed(route[i+1:j+1]), reversed(route[j+1:k+1]))
dist = _distance(temp, distance_matrix)
if dist < current_dist:
route = temp
current_dist = dist
#Way 4
temp = (route[:i+1] + route[j+1:k+1] +
route[i+1:j+1] + route[k+1:])
dist = _distance(temp, distance_matrix)
if dist < current_dist:
route = temp
current_dist = dist
#Way 5
temp = route[:i+1] + route[j+1:k+1]
temp += reversed(route[i+1:j+1])
temp += route[k+1:]
dist = _distance(temp, distance_matrix)
if dist < current_dist:
route = temp
current_dist = dist
#Way 6
temp = route[:i+1]
temp += reversed(route[j+1:k+1])
temp += route[i+1:j+1]
temp += route[k+1:]
dist = _distance(temp, distance_matrix)
if dist < current_dist:
route = temp
current_dist = dist
#Way 7
temp = route[:i+1]
temp += reversed(route[j+1:k+1])
temp += reversed(route[i+1:j+1])
temp += route[k+1:]
dist = _distance(temp, distance_matrix)
if dist < current_dist:
route = temp
current_dist = dist
return route

class TSPSolver(object):
"""
Uses a Circular Self-Organizing Map with a Gaussian Neighbourhood
function and linearly decreasing learning rate, to solve TSP.
"""

#To check if the SOM has been trained
_trained = False

def __init__(self, input_vects, n_iterations=200):
"""
Initializes all necessary components of the TensorFlow
Graph.

'input_vects' should be a list of 1-D NumPy vectors denoting
location vectors for the cities.
'n_iterations' is an integer denoting the number of iterations
that would be run while training the SOM.
"""

#Assign required variables first
self._n = len(input_vects)
dim = len(input_vects)
self._n_centroids = self._n*3
alpha = 0.3
sigma = 0.3
self._n_iterations = abs(int(n_iterations))

#For initialization of the centroid(neuron) vectors
init_values = []
for vect in input_vects:
init_values.extend(list(vect))
init_mean = np.mean(init_values)
init_dev = np.sqrt(np.var(init_values))

##INITIALIZE GRAPH
self._graph = tf.Graph()

##POPULATE GRAPH WITH NECESSARY COMPONENTS
with self._graph.as_default():

##VARIABLES AND CONSTANT OPS FOR DATA STORAGE

#Randomly initialized weightage vectors for all neurons,
#stored together as a matrix Variable of size [self._n, dim]
self._weightage_vects = tf.Variable(tf.random_normal(
[self._n_centroids, dim], init_mean, init_dev))

#Matrix of size [self._n, 2] for SOM grid locations
#of neurons
self._location_vects = tf.constant(np.array(
list(self._neuron_locations(self._n_centroids))))

##PLACEHOLDERS FOR TRAINING INPUTS
#We need to assign them as attributes to self, since they
#will be fed in during training

#The training vector
self._vect_input = tf.placeholder("float", [dim])
#Iteration number
self._iter_input = tf.placeholder("float")

##CONSTRUCT TRAINING OP PIECE BY PIECE
#Only the final, 'root' training op needs to be assigned as
#an attribute to self, since all the rest will be executed
#automatically during training

#To compute the Best Matching Unit given a vector
#Basically calculates the Euclidean distance between every
#neuron's weightage vector and the input, and returns the
#index of the neuron which gives the least value
bmu_index = tf.argmin(tf.sqrt(tf.reduce_sum(
tf.pow(tf.sub(self._weightage_vects, tf.pack(
[self._vect_input for i in range(
self._n_centroids)])), 2), 1)),
0)

#This will extract the location of the BMU based on the BMU's
#index
np.array([[0, 1]]))
bmu_loc = tf.reshape(tf.slice(self._location_vects, slice_input,
tf.constant(np.array([1, 2]))),
)

#To compute the alpha and sigma values based on iteration
#number
learning_rate_op = tf.sub(1.0, tf.div(self._iter_input,
self._n_iterations))
_alpha_op = tf.mul(alpha, learning_rate_op)
_sigma_op = tf.mul(sigma, learning_rate_op)

#Construct the op that will generate a vector with learning
#rates for all neurons, based on iteration number and location
#wrt BMU.
bmu_distance_squares = tf.reduce_sum(tf.pow(tf.sub(
self._location_vects, tf.pack(
[bmu_loc for i in range(self._n_centroids)])), 2), 1)
neighbourhood_func = tf.exp(tf.neg(tf.div(tf.cast(
bmu_distance_squares, "float32"), tf.pow(_sigma_op, 2))))
learning_rate_op = tf.mul(_alpha_op, neighbourhood_func)

#Finally, the op that will use learning_rate_op to update
#the weightage vectors of all neurons based on a particular
#input
learning_rate_multiplier = tf.pack([tf.tile(tf.slice(
learning_rate_op, np.array([i]), np.array()), [dim])
for i in range(
self._n_centroids)])
weightage_delta = tf.mul(
learning_rate_multiplier,
tf.sub(tf.pack([self._vect_input for i in range(
self._n_centroids)]),
self._weightage_vects))
weightage_delta)
self._training_op = tf.assign(self._weightage_vects,
new_weightages_op)

##INITIALIZE SESSION
self._sess = tf.Session()

##INITIALIZE VARIABLES
init_op = tf.initialize_all_variables()
self._sess.run(init_op)

#Train the Solver
self.train(input_vects)

@property
def solution(self):
"""
Returns the TSP solution order as a list, and the total
Euclidean distance with respect to the input vectors.
"""
if not self._trained:
raise ValueError("SOM not trained yet")
return self._sol, self._sol_dist

@classmethod
def _neuron_locations(cls, n_centroids):
"""
Yields one by one the 2-D locations of the individual neurons
in the SOM.
"""

for i in range(n_centroids):
yield np.array([np.cos(i*2*np.pi/float(n_centroids)),
np.sin(i*2*np.pi/float(n_centroids))])

def train(self, input_vects):
"""
Trains the SOM.
'input_vects' should be a list of 1-D NumPy arrays with
dimensionality as provided during initialization of this SOM.
The points in the solution are enumerated as they are ordered
in 'input_vects'.
Current weightage vectors for all neurons(initially random) are
taken as starting conditions for training.
"""

#Training iterations
for iter_no in range(self._n_iterations):
#Train with each vector one by one
for input_vect in input_vects:
self._sess.run(self._training_op,
feed_dict={self._vect_input: input_vect,
self._iter_input: iter_no})

self._input_vects = input_vects
self._compute_solution()
self._trained = True

def _compute_solution(self):
"""
Computes the solution to the TSP with respect to the input
vectors.
"""
#TODO: All nearest-neighbour searches could be speeded up
#significantly with a KD-Tree/Ball Tree.

#Get the centroid data
centroid_vects = list(self._sess.run(self._weightage_vects))
centroid_locations = list(self._sess.run(self._location_vects))

#Distance matrix mapping input point number to list of distances
#from centroids
distances = {}

for point in range(self._n):
distances[point] = []
for centroid in range(self._n_centroids):
distances[point].append(
np.linalg.norm(centroid_vects[centroid] -
self._input_vects[point]))

#Distance matrix mapping input point number to list of distances
#from other input points
point_distances = {}

for point in range(self._n):
point_distances[point] = []
for other_point in range(self._n):
point_distances[point].append(
np.linalg.norm(self._input_vects[other_point] -
self._input_vects[point]))

#Compute angle with respect to each city(point)
point_angles = {}
for point in range(self._n):
total_vect = 0
cents = [j for j in range(self._n_centroids)]
cents.sort(key=lambda x: distances[point][x])
for centroid in cents[:2]:
total_vect += (1.0/(distances[point][centroid]) *
centroid_locations[centroid])
total_vect = total_vect/np.linalg.norm(total_vect)
if total_vect > 0 and total_vect > 0:
angle = np.arcsin(total_vect)
elif total_vect < 0 and total_vect > 0:
angle = np.arccos(total_vect)
elif total_vect < 0 and total_vect < 0:
angle = np.pi - np.arcsin(total_vect)
else:
angle = 2*np.pi - np.arccos(total_vect)
point_angles[point] = angle

#Find the rough solution
tsp_solution = [i for i in range(self._n)]
tsp_solution.sort(key=lambda x: point_angles[x])

tsp_solution = _3_opt(tsp_solution, point_distances)

#Compute the total distance for the solution
total_distance = _distance(tsp_solution, point_distances)

self._sol = tsp_solution
self._sol_dist = total_distance

Results

Each of these problem datasets can be found at the source here. All the presented results are over 10 runs. The max runtime was experienced with krA100, around one minute.

Berlin-52: Best Error: 0.0%, Average Error: 4.1%.

krA100: Best Error: 1.6%, Average Error: 2.7%.

eli76: Best Error: 2%, Average Error: 4.4%.

These results aren’t ‘close’ to those provided by ACOs, but they are practically feasible, and show a good promise (Oh now I just sound like I am writing a research paper.) Plus, the algorithm isn’t a time-hogger either (judging by runtimes on my laptop).

Anyways. Thanks for reading! Do drop a comment if you think of an improvement over this work.