Combinatorial Solutions to the Traveling Salesman Problem
Faizan Khan
28 April 2016

1 Introduction

An informal statement of the traveling salesman problem asks: for n cities with arbitrary distances between them, find a tour (round trip) through all cities, visiting every city exactly once, such that the sum of all distances travelled is as small as possible [1]. A special case of the traveling salesman problem assumes that the distance from city a to city b is the same as the distance from b to a; it is called the symmetric traveling salesman problem [1]. We will focus on the arbitrary distance case, as it can easily be generalized to account for the special case, which is usually of little practical use.

The physicist and mathematician William Hamilton is the first known researcher to devote significant time to studying this problem, from which he created an optimization game. Formal solutions to the problem began to be published in academic journals in the 1950s and utilized properties of combinatorics and ergodic theory [3]. The traveling salesman problem's status as a research problem is distinct in that it is studied intensely in both strictly applied and strictly theoretical fields of mathematics and computer science [2]. One challenge that mathematicians engage in is to produce faster and more efficient algorithms to solve the problem. Because of the immense number of parameters that could be present in any variation of the problem, there is no known efficient general method that can be applied to the problem to return a solution.

In this paper we discuss several combinatorial solutions to the traveling salesman problem. In particular, four sections are devoted to solution methods from general combinatorics and graph theory. Each solution has merits and faults, but combined they not only provide formidable tools for solving the problem, but also provide greater insight into its structure.
The first section of this paper will include a primer on the definitions needed to rigorously define the traveling salesman problem. The next three sections are devoted to understanding different solutions to the traveling salesman problem. The first two of these three sections will include an application of the solution to a particular traveling salesman problem. We will then discuss contemporary developments on the problem by examining Christofides' algorithm. In conclusion we will discuss the reasons why new solutions of the traveling salesman problem continue to hold interest among researchers.

2 Essential Definitions and Properties of the Traveling Salesman Problem

The graph theory formulation of the traveling salesman problem is best suited to the solutions that this paper will discuss. Its formulation is the following:

Definition 2.1. Given a list of nodes (cities) and their pairwise distances, the task is to find the shortest possible tour that visits every node exactly once [1].

A graph G is an ordered pair (V, E), where V is the vertex set, whose elements are the vertices (or nodes) of the graph, and E is the edge set, whose elements are the edges (or connections between vertices) of the graph [1]. An example of a typical graph is presented in Figure 1. In Figure 1, V = {1, 2, 3, 4, 5, 6} and E = {{1, 2}, {1, 5}, {2, 3}, {2, 5}, {3, 4}, {4, 5}, {4, 6}}. The order of the graph is the number of elements of the set V, and for the graph in Figure 1 the order is |V| = 6. The size of the graph is the number of elements of the set E, and for the graph in Figure 1 the size is |E| = 7. An arc of a graph is an ordered pair of adjacent vertices.

An interesting property of the traveling salesman problem is that it is characterized as NP-hard. This means that no algorithm is known that solves every instance of the problem in polynomial time.
Problems in the class NP have the following property: there is not necessarily a polynomial-time way to find a solution, but once a candidate solution is given, it takes only polynomial time to verify that it is correct. One way to interpret this is to acknowledge how difficult the traveling salesman problem becomes once the number of nodes increases. Table 1 displays how just a slight increase in the number of nodes can make the traveling salesman problem unruly. For n nodes the entries equal n(n − 1)/2 arcs and (n − 1)!/2 tours.

Number of Nodes    Number of Arcs    Number of Tours
6                  15                60
7                  21                360
8                  28                2520
9                  36                20160

Table 1: Effects of increasing the number of nodes [6]

Figure 1: Diagram of a typical graph. The vertices are labeled 1-6. Image acquired from https://en.wikibooks.org/wiki/GraphTheory/Definitions/.

3 A Brute Force Solution to the Traveling Salesman Problem

The brute force solution to a traveling salesman problem is typically the simplest method to implement. A brute force method consists of systematically enumerating all possible candidates for the solution and checking whether each candidate satisfies the problem's statement [5]. The shortest tour is thus the optimal tour [2]. The following steps comprise the brute force method [2].

1. Calculate the total number of tours.
2. Draw and list all the possible tours through the vertices of the graph.
3. Calculate the distance of each tour.
4. Choose the shortest tour; this is the optimal solution.

Figure 2 presents a graph with vertices labeled A, B, C, D. The graph is a typical case where the above series of steps can be easily applied. The distances between the vertices in Figure 2 are given in arbitrary units as 1, 3, 4, 5, 7, 10. From step 1, the total number of tours of this graph is (4 − 1)!/2 = 3. Applying steps 2 and 3 to Figure 2 we get the following tours and their distances:

1. A → B → C → D → A. The total distance for this tour is 3 + 5 + 10 + 7 = 25.
2. A → B → D → C → A. The total distance for this tour is 3 + 1 + 10 + 4 = 18.
3. A → C → B → D → A. The total distance for this tour is 4 + 5 + 1 + 7 = 17.
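The four steps above can be sketched in a few lines of Python. This is only an illustrative sketch: the edge weights are read off the tour sums listed above (Figure 2 itself is not reproduced here), and the names DIST, count_tours, tour_length, and brute_force are hypothetical helpers introduced for this example.

```python
from itertools import permutations
from math import factorial

# Edge weights for the four-vertex example, inferred from the tour sums in the
# text (an assumption about Figure 2, which is not reproduced here).
DIST = {
    frozenset("AB"): 3, frozenset("AC"): 4, frozenset("AD"): 7,
    frozenset("BC"): 5, frozenset("BD"): 1, frozenset("CD"): 10,
}


def count_tours(n):
    """Step 1: number of distinct tours on n vertices, ignoring direction."""
    return factorial(n - 1) // 2


def tour_length(tour):
    """Step 3: sum the edge weights along a closed tour such as 'ABCDA'."""
    return sum(DIST[frozenset(pair)] for pair in zip(tour, tour[1:]))


def brute_force(start="A", others="BCD"):
    """Steps 2 and 4: enumerate every tour from `start`, keep the shortest."""
    best = None
    for perm in permutations(others):
        # Skip the mirror image of a tour already considered
        # (the same cycle walked in the opposite direction).
        if perm[0] > perm[-1]:
            continue
        tour = start + "".join(perm) + start
        length = tour_length(tour)
        if best is None or length < best[1]:
            best = (tour, length)
    return best


if __name__ == "__main__":
    print(count_tours(4))   # number of tours to examine
    print(brute_force())    # shortest tour and its length
```

Fixing the starting vertex and discarding mirror images is what reduces the 4! orderings of the vertices to the three tours listed above.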
Finally, according to step 4, we choose the shortest tour, which is tour number 3 with a distance of 17 units. Figure 3 traces tour 3 in green. The main advantage of the brute force method is that, given enough time, the optimal solution will always be generated [7]. Because this method requires accounting for every route, it is very inefficient. Recall from Table 1 that just a slight increase in vertices leads to a factorial increase in tours. The brute force method is best used when implemented as a computer program [1].

Figure 2: Diagram of the graph with vertices (cities) labeled A, B, C, D. Image acquired from https://www.cs.drexel.edu/jpopyack/Courses/AI/Wi14/assignments/HW5/index.html/.

Figure 3: Diagram of Tour 3 (A → C → B → D → A) traced in green. Image acquired from https://www.cs.drexel.edu/jpopyack/Courses/AI/Wi14/assignments/HW5/index.html/.

4 Nearest Neighbor Solution to the Traveling Salesman Problem

A method that involves slightly more decision-making than the brute force method, but is far quicker to carry out, is the nearest neighbor solution method. This method is a process of elimination: a vertex is chosen, and at each step we determine which vertices have not yet been visited [8]. The nearest neighbor method applies the following steps to a graph [8].

1. Choose any vertex. This vertex will be referred to as the starting vertex.
2. Move to the closest as-yet-unvisited vertex (if such a vertex exists).
3. Repeat Step 2 until there remain no more unvisited vertices.
4. Return to the starting vertex.

Performing the brute force method without the aid of a computer is often time consuming and prone to error, even with a small number of vertices. The nearest neighbor method instead focuses on the position of one vertex at a time, and is demonstrated using Figure 4. In Figure 4 we choose our starting vertex to be A. Applying Step 2 we next move to D.
By implementing Step 3, we move from D to E (because it is the closest unvisited vertex to D). From E we move to C, and from C we move to B. By executing Step 4 we arrive at the final destination, which is vertex A. The nearest neighbor route is outlined in red in Figure 5.

A compelling feature of the nearest neighbor solution is that it can be performed quickly on large graphs without a computer. One limitation that may discourage the use of this solution is that the final tour is usually sub-optimal [9]. The gain in calculation speed is offset by a likely increase in tour distance. For example, the nearest neighbor solution in Figure 5 produces the tour A → D → E → C → B → A, which totals a distance of 1457 units [9]. Applying the brute force method to this same graph gives the optimal route A → D → B → C → E → A, which totals a distance of 1220 units [9]. The difference in tour length is 237 units, so the nearest neighbor solution produces a tour that is 19.4% longer than the optimal tour generated by the brute force method [9].

Figure 4: Diagram of a graph with 5 vertices labeled A, B, C, D, E. Image acquired from https://web.tuke.sk/fei-cit/butka/hop/htsp.pdf/.

Figure 5: Diagram of Figure 4 with the nearest neighbor route traced in red. Image acquired from https://web.tuke.sk/fei-cit/butka/hop/htsp.pdf/.

5 Branch and Bound Solution to the Traveling Salesman Problem

While the brute force solution is optimal, it is quite often time consuming, and while the nearest neighbor solution is quick, it is typically sub-optimal [1]. The branch and bound solution method outputs an optimal route and is frequently quicker to carry out than the brute force method [10]. The branch and bound method divides a graph into a number of sub-graphs in order to enumerate the best route for the traveling salesman problem.
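A minimal Python sketch of this idea follows. It is an illustration under stated assumptions, not the procedure of [10] or [11]: the distance table reuses the four-city values implied by the tour sums in Section 3, the initial upper bound is taken from a nearest neighbor tour, and the names DIST, nearest_neighbor, and branch_and_bound are hypothetical. Pruning a partial tour once its length reaches the bound is valid here only because all distances are non-negative.

```python
# Symmetric distances for four cities A-D. The values are an assumption taken
# from the tour sums of the brute force example in Section 3.
DIST = {
    "A": {"B": 3, "C": 4, "D": 7},
    "B": {"A": 3, "C": 5, "D": 1},
    "C": {"A": 4, "B": 5, "D": 10},
    "D": {"A": 7, "B": 1, "C": 10},
}


def nearest_neighbor(start):
    """Greedy tour: always move to the closest unvisited city, then return."""
    tour, length = [start], 0
    unvisited = set(DIST) - {start}
    while unvisited:
        here = tour[-1]
        nxt = min(sorted(unvisited), key=lambda city: DIST[here][city])
        length += DIST[here][nxt]
        tour.append(nxt)
        unvisited.remove(nxt)
    length += DIST[tour[-1]][start]      # close the tour
    return tour + [start], length


def branch_and_bound(start):
    """Depth-first search over partial tours, pruned by the best length so far."""
    best_tour, best_len = nearest_neighbor(start)   # initial upper bound

    def extend(tour, length, unvisited):
        nonlocal best_tour, best_len
        if length >= best_len:           # bound: this branch cannot improve
            return
        if not unvisited:                # every city visited: close the tour
            total = length + DIST[tour[-1]][start]
            if total < best_len:
                best_tour, best_len = tour + [start], total
            return
        here = tour[-1]
        # branch: try the cheapest outgoing arc first
        for city in sorted(unvisited, key=lambda c: (DIST[here][c], c)):
            extend(tour + [city], length + DIST[here][city], unvisited - {city})

    extend([start], 0, frozenset(DIST) - {start})
    return best_tour, best_len


if __name__ == "__main__":
    print(nearest_neighbor("A"))
    print(branch_and_bound("A"))
```

Seeding the search with a nearest neighbor tour is a design choice made for this sketch: a tighter starting bound lets more partial tours be discarded early, but any complete tour would serve as the initial bound.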
Calculating the distance between the nodes of every sub-graph is tedious, so an arbitrary suboptimal solution is calculated and its distance is used as an upper bound [10]. If a new solution with a shorter route is found, then that route is used as the new upper bound. The following steps are used to implement the branch and bound solution method [11]:

1. Choose a starting node.
2. Consider a route with a large distance and set this distance as the bound.
3. Choose the cheapest arc between the current node and an unvisited node, add its distance to the current distance, and repeat while the current distance is less than the bound. (Initially, the current distance is just the length of the shortest arc leaving the starting node.)
4. If all nodes have been visited and the current distance is still less than the bound, a shorter route has been found.
5. Total up the distance of that route and set the bound equal to it.
6. Repeat steps 3 to 5 until all the arcs have been covered.

The branch and bound method outputs the optimal solution, just as the brute force method does in Section 3. In fact, it is typically more efficient than the brute force method, especially when the graph contains 30-60 nodes [10], such as the graph in Figure 6.

Figure 6: Diagram of a 36 node graph with a branching structure. Image acquired from http://artint.info/figures/ch03/sgraphbb.png.

6 Christofides' Algorithm for the Traveling Salesman Problem

The previous solution methods presented in this paper inherently sought optimal solutions to the traveling salesman problem (whether they achieved optimal solutions is a different matter). Christofides' algorithm is distinct in that it limits itself to generating approximate solutions [12]. By examining the shortest distances between pairs of nodes, it generates a route through a process of elimination and by retracing certain paths [12]. The algorithm is attributed to Nicos Christofides. In addition to discovering the algorithm, Christofides also proved the following bound:

Theorem 6.1.
The distance of the solution produced by Christofides' algorithm is at most 150% of the distance of the optimal tour (that is, at most 1.5 times the optimal distance), for the metric instances treated in [12], where distances satisfy the triangle inequality.

Despite the algorithm's relative inaccuracy, it has been used repeatedly to generate reasonable upper bounds for use in conjunction with the optimal branch and bound method [13]. The algorithm is so well suited to the traveling salesman problem that, since its discovery in 1976, the only algorithm with a bound closer to the optimal tour improved on Christofides' algorithm by only 4 × 10^(-50) % [14].

7 Conclusion

The tiny improvement over Christofides' algorithm achieved by that newer algorithm initiated a flurry of interest among researchers. At first glance, however, the pursuit of new solution methods to the traveling salesman problem may be perceived as unnecessary. One may ask, "If optimal solutions such as brute force and branch and bound exist, then why resort to producing newer algorithms?" The traveling salesman problem has been solved, but it has not been solved efficiently.

The traveling salesman problem highlights an important aspect of the culture that exists among mathematicians and scientists: the impulse to constantly improve a given result, no matter how prolonged the process may be. In many ways, this is how results are obtained in contemporary science and mathematics. Incremental improvements from a consortium of researchers accumulate and, hopefully, new insights into a given problem are gleaned. Recall from Section 6 the algorithm that improved on Christofides' algorithm by a trivial amount. Though the improvement was of no practical significance, it was considered a breakthrough by computer scientists and has been studied intensely by researchers since its discovery in 2011. Already, cynics claim that a more accurate bound will never be discovered, which mirrors a time-honored tradition of mathematical exchange and discourse. History has shown that these cynics will eventually be proven incorrect.
References

[1] The Travelling Salesman Problem and its Applications, http://co-at-work.zib.de/berlin2009/downloads/2009-09-21/Applications.pdf
[2] History, Analysis, and Implementation of Traveling Salesman Problem (TSP) and Related Problems, http://cms.uhd.edu/faculty/redlt/annemseniorproject.pdf
[3] Sales and Chips, http://www.ams.org/samplings/feature-column/fcarc-tsp
[4] Basic Definitions and Concepts in Graph Theory, http://stanford.edu/ rezab/discrete/Notes/2.pdf
[5] Chapter 3: Brute Force, http://faculty.simpson.edu/sinapova/www/cmsc250/LN250Levitin/L05BruteForce.html
[6] The Traveling Salesman Problem: A Guided Tour of Combinatorial Optimization, Lawler E.L., et al.
[7] The Traveling Salesman Problem and Its Variations, G. Gutin, A.P. Punnen
[8] Some non-optimal Algorithms for the TSP, http://www.people.vcu.edu/gasmerom/MAT131/hamilton.html
[9] Heuristics for the Traveling Salesman Problem, https://web.tuke.sk/fei-cit/butka/hop/htsp.pdf
[10] Branch and Bound Methods, https://stanford.edu/class/ee364b/lectures/bbslides.pdf
[11] Branch and Bound Algorithms-Principles and Examples, http://www.imada.sdu.dk/Employees/jbj/heuristikker/TSPtext.pdf
[12] Christofides's Approximation for Metric TSP, http://xlinux.nist.gov/dads//HTML/christofides.html
[13] A Probabilistic Analysis of Christofides Algorithm, http://www.cse.iitm.ac.in/ bvrr/conf/swat12.pdf
[14] Computer Scientists Find New Shortcuts for Infamous Traveling Salesman Problem, http://www.wired.com/2013/01/traveling-salesman-problem