Sunday, June 17, 2012

Rummy and the most probable sequence in a probability tree

Rummy
Lately I have been trying to program a game of Rummy, a card game in which you repeatedly replace one of your cards with the card on the deck until your cards are grouped into sequences, where each sequence is 3 or more cards. The main challenge in programming such a game turned out to be writing the code which decides which of your cards should be discarded in exchange for the card on the deck.

The way this was done was by giving each card in the deck a probability of being available. Cards in the player's hand are given a probability of 1, and cards which are lost in the graveyard or taken by the opponent are given a probability of 0. Every other card is given a probability depending on how likely it is that the opponent is holding it, given the cards they have taken.

Once each card has a probability of being obtainable, we can calculate the probability of every legal sequence of cards in a group and of each possible arrangement of groups. The probability of each group is calculated, and the arrangement with the maximum probability, containing both cards in the player's hand and cards which are likely to become available, is the one to pursue. Any card in the player's hand which does not appear in the most likely arrangement can be discarded and replaced.

Examples of legal sequences of cards which can be used as a group in Rummy are
  • ace of diamonds, ace of hearts, ace of spades
  • two of hearts, two of clubs, two of diamonds
  • two of hearts, two of clubs, two of diamonds, two of spades
  • ace of diamonds, two of diamonds, three of diamonds
  • two of spades, three of spades, four of spades
  • two of spades, three of spades, four of spades, five of spades, six of spades
  • etc.

A number of groups needs to be arranged such that the total number of cards equals 10 and no two groups share a card. The probability of a card group is the product of the probabilities of each card in the group. The aim is to find an arrangement of card groups that has the maximum probability of being obtained. Since only cards that are already in the player's hand have a probability of 1, the more cards used which are already in the player's hand, the better.
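As a small worked example of that product rule (the availability values here are made up for illustration):

```python
# Hypothetical availability probabilities for a run of spades:
# the two of spades is in hand (1.0), the others are estimates.
group = [1.0, 0.8, 0.5]  # two, three, four of spades

prob = 1.0
for card_prob in group:
    prob *= card_prob
print(prob)  # 1.0 * 0.8 * 0.5 = 0.4
```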

Since a group cannot contain any card that is in another group, we need to treat the choosing of groups as a probability tree, where each node in the tree is a new legal group. As we choose more groups, fewer groups remain to choose from for the next choice. Since there are so many combinations to consider, an algorithm was needed which quickly finds the most probable sequence in the probability tree.

The probability tree would look something like this:

(probability tree diagram)
Most probable sequence in a probability tree
I'll be expressing the algorithm using Python. I'll be using this Node class to represent a node in the probability tree:
class Node:
    def __init__(self, prob, children):
        self.prob = prob
        self.children = children
Where "prob" is the probability of choosing that node and "children" is a list of child nodes to follow next. Obviously in practice both of these properties can be dynamically generated with callback functions.
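For example, a small tree can be built from this class by constructing the leaves first (the probabilities below are made up):

```python
class Node:
    def __init__(self, prob, children):
        self.prob = prob
        self.children = children

# Leaves first, then their parents; the root is certain (1.0).
leaf_a = Node(0.9, [])
leaf_b = Node(0.4, [])
branch = Node(0.5, [leaf_a, leaf_b])
root = Node(1.0, [branch, Node(0.2, [])])
```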

Given such a tree, we need to represent a root path in the tree, which is a list of nodes from the root to one of the leaves. We also need to calculate the probability of a path, which is done with this function:
def path_prob(path):
    prob = 1.0
    for node in path:
        prob *= node.prob
    return prob
The probability of a sequence of nodes in a probability tree is the product of each node's probability.
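For instance, on a trivial two-node path (probabilities made up):

```python
class Node:
    def __init__(self, prob, children):
        self.prob = prob
        self.children = children

def path_prob(path):
    prob = 1.0
    for node in path:
        prob *= node.prob
    return prob

leaf = Node(0.5, [])
root = Node(0.8, [leaf])
print(path_prob([root, leaf]))  # 0.8 * 0.5 = 0.4
```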

We'll start from a simple algorithm, one which generates the paths from the root to every leaf node.
def all_root_paths_nonacc(node):
    if node.children == []:
        yield [node]
    else:
        for child in node.children:
            for subpath in all_root_paths_nonacc(child):
                yield [node] + subpath
It's a naturally recursive generator function which doesn't use an accumulator. It starts from the root and attaches the root node to the front of every path from each of its children to a leaf. When a leaf is reached, a list consisting of only that leaf is yielded.
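Running it on a small made-up tree shows one path per leaf:

```python
class Node:
    def __init__(self, prob, children):
        self.prob = prob
        self.children = children

def all_root_paths_nonacc(node):
    if node.children == []:
        yield [node]
    else:
        for child in node.children:
            for subpath in all_root_paths_nonacc(child):
                yield [node] + subpath

# A root with two children, one of which has two leaves of its own:
root = Node(1.0, [Node(0.5, [Node(0.9, []), Node(0.4, [])]),
                  Node(0.3, [])])
paths = list(all_root_paths_nonacc(root))
print([[n.prob for n in p] for p in paths])
# [[1.0, 0.5, 0.9], [1.0, 0.5, 0.4], [1.0, 0.3]]
```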

Once we have this function, we can generate every root path in our probability tree and then find which path has the greatest probability using some code such as:
best_path = max(all_root_paths_nonacc(root), key=path_prob)

However this would be inefficient. We need to be able to choose the most probable path during the recursion, in such a way that we can stop recursing part-way through a path in order to avoid wasting time. I'll explain how this works in a minute. First we need to know which path we are building during the recursion, and to do that we need an accumulator, like this:
def all_root_paths(node, prev_path=[]):
    curr_path = prev_path + [node]
    if node.children == []:
        yield curr_path
    else:
        for child in node.children:
            for path in all_root_paths(child, curr_path):
                yield path
This is a modified version of the previous algorithm. It keeps an extra accumulator parameter which starts off as an empty list; the current node is appended to it every time a new recursive call is made. That way, at any point in the recursion we know which path is being built from the root node to the current node. When a leaf is reached, the path is yielded.
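One thing worth noting is that the empty-list default argument, usually a Python pitfall, is safe here because "prev_path" is only ever read and copied with "prev_path + [node]", never mutated in place. A quick run on a made-up tree confirms it yields the same paths as the non-accumulator version would:

```python
class Node:
    def __init__(self, prob, children):
        self.prob = prob
        self.children = children

def all_root_paths(node, prev_path=[]):
    # prev_path is only read and copied (prev_path + [node]),
    # so the mutable default argument is safe here.
    curr_path = prev_path + [node]
    if node.children == []:
        yield curr_path
    else:
        for child in node.children:
            for path in all_root_paths(child, curr_path):
                yield path

root = Node(1.0, [Node(0.5, [Node(0.9, [])]), Node(0.3, [])])
paths = list(all_root_paths(root))
print([[n.prob for n in p] for p in paths])
# [[1.0, 0.5, 0.9], [1.0, 0.3]]
```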

This algorithm still returns all the root paths in the probability tree which we then have to choose from. Now we shall use an algorithm which returns the most probable path without returning any other path.
def naive_max_root_path(node, prev_path=[], curr_max_path=None):
    curr_path = prev_path + [node]
    if node.children == []:
        if curr_max_path == None or path_prob(curr_path) > path_prob(curr_max_path):
            return curr_path
        else:
            return curr_max_path
    else:
        for child in node.children:
            curr_max_path = naive_max_root_path(child, curr_path, curr_max_path)
        return curr_max_path
Again this is a modification of the previous algorithm. This time we keep two accumulators, one with the path that has been built up to now and one with the path which is the most probable up to now. Every time a full path is built (by reaching a leaf) a choice is made between returning the newly built path and the best current path. The returned path is then used as the current best path. Obviously if the current best path is "None" (null), then it means that we are still generating the first path and so this should be considered the best path.

Even if only the best path is returned, the algorithm still checks each and every possible path there is, so it will not be faster than the previous ones. We shall now instead use to our advantage the fact that when a probability is multiplied by another probability, the product will be smaller than or equal to both probabilities. What this means is that as we multiply the probabilities of the nodes in a path from first to last, the product will never grow: it will either remain the same (if the remaining probabilities are all 1) or become smaller. So if at any point the probability of a partially built path becomes smaller than the probability of the best path, we can immediately stop building that path and try some other path, since its probability can only shrink further.
def quick_max_root_path(node, prev_path=[], curr_max_path=None):
    curr_path = prev_path + [node]
    if node.children == []:
        if curr_max_path == None or path_prob(curr_path) > path_prob(curr_max_path):
            return curr_path
        else:
            return curr_max_path
    else:
        for child in node.children:
            if curr_max_path == None or path_prob(curr_path + [child]) > path_prob(curr_max_path):
                curr_max_path = quick_max_root_path(child, curr_path, curr_max_path)
        return curr_max_path
As you can see, the only difference between this and the previous algorithm is that a decision is also made in the recursive case, while the path is still being built. This algorithm could be made even faster by avoiding the calls to "path_prob" and passing the probabilities along as parameters too. Then, with every recursive call, the probability of the current path is simply the probability of the previous path times the probability of the new node.
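A sketch of that optimisation, assuming we are free to change the signature (the function name and extra parameters are my own, and it returns a (path, probability) pair rather than just the path):

```python
class Node:
    def __init__(self, prob, children):
        self.prob = prob
        self.children = children

def quick_max_root_path_probs(node, prev_path=[], prev_prob=1.0,
                              curr_max_path=None, curr_max_prob=0.0):
    # The path probability is maintained incrementally instead of
    # recomputing path_prob over the whole path at every comparison.
    curr_path = prev_path + [node]
    curr_prob = prev_prob * node.prob
    if node.children == []:
        if curr_max_path is None or curr_prob > curr_max_prob:
            return curr_path, curr_prob
        return curr_max_path, curr_max_prob
    for child in node.children:
        if curr_max_path is None or curr_prob * child.prob > curr_max_prob:
            curr_max_path, curr_max_prob = quick_max_root_path_probs(
                child, curr_path, curr_prob, curr_max_path, curr_max_prob)
    return curr_max_path, curr_max_prob

root = Node(1.0, [Node(0.5, [Node(0.9, [])]), Node(0.3, [])])
best, prob = quick_max_root_path_probs(root)
print([n.prob for n in best], prob)  # [1.0, 0.5, 0.9] 0.45
```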

Many times, especially in the Rummy case, there is more than one best sequence in a probability tree, all of which have an equal probability. In such cases we might want an algorithm which returns all such paths and not just the first one. Here is a naive version of the algorithm which checks every possible root path:
def naive_max_root_paths(node, prev_path=[], curr_max_paths=[]):
    curr_path = prev_path + [node]
    if node.children == []:
        if curr_max_paths == [] or path_prob(curr_path) > path_prob(curr_max_paths[0]):
            return [curr_path]
        elif path_prob(curr_path) == path_prob(curr_max_paths[0]):
            return curr_max_paths + [curr_path]
        else:
            return curr_max_paths
    else:
        for child in node.children:
            curr_max_paths = naive_max_root_paths(child, curr_path, curr_max_paths)
        return curr_max_paths
Obviously we now need to return a list, so the parameter which keeps the current best path needs to become a list of paths. As can be seen, there are now three cases to consider in the base case:
  • If the current path is better than the best paths (all paths in "curr_max_paths" have equal probability), return a list consisting of only the current path.
  • If the current path is equal in probability to the best paths, return a list of the best paths together with the current path.
  • If the best paths are better than the current path, return the best paths only.
So the only difference between this and "naive_max_root_path" is that a path of equal probability to the best is also kept, by appending it to the list.

We now use the previous trick to make the algorithm fast, by skipping mid-way through any path which is worse than the best. However, in the recursive case we now also need to follow paths which are of equal probability to the best.
def quick_max_root_paths(node, prev_path=[], curr_max_paths=[]):
    curr_path = prev_path + [node]
    if node.children == []:
        if curr_max_paths == [] or path_prob(curr_path) > path_prob(curr_max_paths[0]):
            return [curr_path]
        elif path_prob(curr_path) == path_prob(curr_max_paths[0]):
            return curr_max_paths + [curr_path]
        else:
            return curr_max_paths
    else:
        for child in node.children:
            if curr_max_paths == [] or path_prob(curr_path + [child]) >= path_prob(curr_max_paths[0]):
                curr_max_paths = quick_max_root_paths(child, curr_path, curr_max_paths)
        return curr_max_paths
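On a made-up tree with two branches deliberately tied at the same probability, the algorithm keeps both best paths and prunes the strictly worse branch:

```python
class Node:
    def __init__(self, prob, children):
        self.prob = prob
        self.children = children

def path_prob(path):
    prob = 1.0
    for node in path:
        prob *= node.prob
    return prob

def quick_max_root_paths(node, prev_path=[], curr_max_paths=[]):
    curr_path = prev_path + [node]
    if node.children == []:
        if curr_max_paths == [] or path_prob(curr_path) > path_prob(curr_max_paths[0]):
            return [curr_path]
        elif path_prob(curr_path) == path_prob(curr_max_paths[0]):
            return curr_max_paths + [curr_path]
        else:
            return curr_max_paths
    else:
        for child in node.children:
            if curr_max_paths == [] or path_prob(curr_path + [child]) >= path_prob(curr_max_paths[0]):
                curr_max_paths = quick_max_root_paths(child, curr_path, curr_max_paths)
        return curr_max_paths

# Two branches tied at probability 0.5 * 0.8 = 0.4, plus a worse one.
root = Node(1.0, [Node(0.5, [Node(0.8, [])]),
                  Node(0.8, [Node(0.5, [])]),
                  Node(0.3, [])])
best = quick_max_root_paths(root)
print([[n.prob for n in p] for p in best])
# [[1.0, 0.5, 0.8], [1.0, 0.8, 0.5]]
```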

Conclusion
As already explained, a path in the probability tree corresponds to a choice of card groups, and we need to find which arrangement of card groups is the most likely to be available and thus the one the player should aim for. Now that we can find the most probable arrangement of card groups, we know which cards we can discard; and since the last algorithm gives us all the arrangements which share the best probability, we can even narrow the discards down to the cards which are needed the least across all the alternative best arrangements.
