Wednesday, November 26, 2014

Dijkstra's Algorithm, Uniform Cost Search, and A* Search (graph search series)

Let's say that you're developing a game which involves moving a hero around in order to avoid a monster that is chasing him or her. It's easy to program the hero to just move around wherever the arrow keys on the keyboard indicate, but how do you make the monster automatically follow a sensible path towards the hero?

If there are no obstacles in the way then the monster just has to move in the direction of the hero, but you don't need me to tell you that. In order to get from point A to point B with obstacles in the way you need to use some path finding algorithms, which ideally find the shortest path. Let's start from Dijkstra's Algorithm.

Dijkstra's Algorithm

Dijkstra's Algorithm finds the shortest path from a point to every other point. In other words it is not ideal for finding the shortest path between two points. This might not be what you need but it's a good basis to understand the more focused algorithms.

It is a dynamic programming algorithm which exploits the following observation about shortest paths. Consider the following diagram which shows the shortest path between two crosses with an obstacle in the middle:

If you take any point in the path, such as the circle, the shortest path from this point and one of the crosses is the same path as the shortest path between the two crosses. In other words, imagine that you are searching for the shortest path between the two crosses, and you meet a wizard who tells you that he knows what this shortest path is but that he will not tell you how to take it; however, he will tell you that the circle is on this path. With this information you can simplify your search by looking for the shortest path from one cross to the circle and from the circle to the other cross. These two paths can then be combined into a single shortest path between the two crosses.

Obviously if the circle is not on the shortest path between the two crosses then this strategy will not work, but it is useful for reducing the number of different paths to check. In general we probably will not have access to such a wizard, but we can still take advantage of this fact by looking for the shortest path between the first cross and every point along the way to the second cross. The shortest path to the second cross must begin with one of these paths. But how can we find the shortest path to these intermediate points? But using intermediate points that are along the way to these intermediate points of course. This strategy allows us to find the shortest path between two points bit by bit, by finding other shorter shortest paths along the way.

This is how Dijkstra's algorithm works. Here is the algorithm:
distance_from_start = { starting_point : 0 }
previous_point = { starting_point : null }
next_points = [ starting_point ]

while next_points is not empty do
next_point = point in next_points where distance_from_start[point] is minimum
remove next_point from next_points

for each neighbouring_point around next_point do
new_distance = distance_from_start[next_point] + distance from next_point to neighbouring_point
if neighbouring_point is not in distance_from_start or new_distance < distance_from_start[neighbouring_point] then
add neighbouring_point to next_points if not in distance_from_start

distance_from_start[neighbouring_point] = new_distance
previous_point[neighbouring_point] = next_point

The algorithm works by starting at the starting point and checking every neighbouring point around it. The path from the starting point to each one of these points is considered to be the shortest path to the points. Each of these points is checked for its neighbouring points also and if the point is a previously visited point, then the distance of the new path found is compared to the distance of the previous path that was used. If the new distance is shorter than the previous one then the new path is used instead of the old one. This is repeated until every point has been checked. At the end, every point is associated with some previous point which if followed to the starting point will form the shortest path to the starting point.

Uniform Cost Search
Uniform Cost Search is Dijkstra's Algorithm which is focused on finding a single shortest path to a single finishing point rather than a shortest path to every point. It does this by stopping as soon as the finishing point is found.

Here is the algorithm:
distance_from_start = { starting_point : 0 }
previous_point = { starting_point : null }
next_points = [ starting_point ]

while next_points is not empty do
next_point = point in next_points where distance_from_start[point] is minimum
if next_point = finishing_point then exit
remove next_point from next_points

for each neighbouring_point around next_point do
new_distance = distance_from_start[next_point] + distance from next_point to neighbouring_point
if neighbouring_point is not in distance_from_start or new_distance < distance_from_start[neighbouring_point] then
add neighbouring_point to next_points if not in distance_from_start

distance_from_start[neighbouring_point] = new_distance
previous_point[neighbouring_point] = next_point

It is not enough to check if a neighbouring point is the finishing point to exit as there might be other points which are the finishing point's neighbours that are better. In order to be sure that no other neighbouring point is better, you have to wait until it is the closest point available in the list of next points to explore. In this way you can be sure that any other point that can lead to the finishing point is further away from the starting point (and thus have a longer path) than the finishing node is.

The problem with this approach is that although it is guaranteed to give you the shortest path from the starting point to the finishing point, it does so because it checks every possible path blindly, which would be too long for practical use in a game. A faster algorithm exists.

A* Search
A* Search is the informed version of Uniform Cost Search. It differs in that you have to give it a way to estimate how close any point is to the finishing point which it will use to make informed decisions on which point it should follow next. This takes the "blindly" part out of the Uniform Cost Search. The estimate is used in order to estimate what the total distance of the path from the starting point to the finishing point is, provided that a particular point is included in the path. The way the path distance is estimated is by adding together the distance of the point from the starting point (which is done as usual) and the estimated distance from the point to the finishing point. In this way, the selection of the next point to explore is not made just on the point's distance from the starting point, but on the whole path length.

The important thing is that this estimate is never more than the actual distance to the finishing point, although it can be shorter. If it is longer then it will not give the shortest path. The reason for this is because as soon as an actual path is found (when the finishing point is discovered), if the actual distance of this path is shorter than the underestimated distances of the other paths, then it can be safely assumed that the others paths actual distances will be more than the actual path found. On the other hand, if the estimated distances are overestimated, then they cannot be compared to an actual path distance since their actual distance might be less.

Another thing to notice is that the closer the estimate is to zero, the more the search will behave like Uniform Cost Search and the more likely the path given is to be optimal. A good estimate would be the Euclidian distance (straight line distance) to the finishing point. This estimate is used to select the most promising point to process from all the ones discovered.

Here is the algorithm:

distance_from_start = { starting_point : 0 }
previous_point = { starting_point : null }
next_points = [ starting_point ]

while next_points is not empty do
>>> next_point = point in next_points where distance_from_start[point] + estimate to finishing point is minimum
if next_point = finishing_point then exit
remove next_point from next_points

for each neighbouring_point around next_point do
new_distance = distance_from_start[next_point] + distance from next_point to neighbouring_point
if neighbouring_point is not in distance_from_start or new_distance < distance_from_start[neighbouring_point] then
add neighbouring_point to next_points if not in distance_from_start

distance_from_start[neighbouring_point] = new_distance
previous_point[neighbouring_point] = next_point