## Wednesday, January 27, 2021

### The secretary problem

Say you're dating several partners and want to get into a serious relationship eventually. After how many partners can you have a good idea of what a good partner is like? This is called the secretary problem, and it can be formalised as follows:

Let $n$ be the number of partners in total you can date. Let $A_i$ be the fitness of the $i$th partner, such that the order of the partners is random and we are interested in the partner with the maximum fitness. Let $r$ be the number of partners you date before you start looking for a serious relationship. Once a partner has been evaluated, you cannot go back to them if you try the next partner. What you're doing is going through all partners $i$ starting from 1 up to $r$, at which point you will take $\max_{i=1}^{r}(A_i)$ as a reference of what a good partner looks like and continue going through the rest of the partners $r$ to $n$ and stop as soon as $A_i$ is greater than the reference ($\max_{i=1}^{r}(A_i)$). If none of the partners are better than the best of the first $r-1$ then you settle for the last partner. The question is, what should $r$ be such that the probability of picking the partner with maximum $A_i$ is maximised.

The choice of $r$ actually makes a significant difference, as we can show in the following simulation in Python 3 (in Python everything needs to be 0-based such that the first item is at position 0 rather than 1):

import random

def select(A, r):
ref = max(A[:r]) # first r items
for i in range(r, len(A)): # rest of the items
if A[i] > ref:
return i
return len(A)-1 # return last item if search failed

n = 20
for r in range(1, n): # first item not included
successes = 0
for _ in range(10000):
A = [random.random() for _ in range(n)]
best = max(range(n), key=lambda i:A[i])
choice = select(A, r)
if choice == best:
successes += 1
print(r+1, successes/10000)

2 0.1746
3 0.2532
4 0.3099
5 0.3459
6 0.3581
7 0.3856
8 0.3927
9 0.3866
10 0.3717
11 0.366
12 0.3382
13 0.3227
14 0.2918
15 0.263
16 0.2228
17 0.1908
18 0.149
19 0.0969
20 0.0474


These are the fraction of times the best partner is selected out of 10000 simulations for each $r$ from 2 to 20. You will notice that the fraction steadily increases up to a peak at 8 and then steadily decreases until the end. It's interesting that the peak is not in the middle of the list but closer to the beginning. How can we find what the best $r$ would be for any $n$?

First, we need to find the probability of choosing the best partner for a given $r$ and $n$. This probability can be expressed as two events happening together: you stopping at element $i+1$ and the maximum element being element $i+1$ (we're using $i+1$ instead of just $i$ because it will be mathematically convenient later). These two events are independent and so we can find the probability of each one and then multiply them together.

The probability of the maximum element from all $n$ elements being the $i+1$th element is $\frac{1}{n}$.

The other probability is a little trickier. To stop at element $i+1$, all the elements prior to $i+1$ have to be less than the reference we got from the first $r$ elements. If a larger element is found sooner, we would have stopped there. Therefore, we are interested in the probability that the maximum element from the first $i$ elements is among the first $r$ elements, which is $\frac{r}{i}$. This is assuming that $i+1 \geq r+1$, since we cannot choose any partner before the $r+1$th partner.

Multiplying these two probabilities, we get $\frac{r}{n \times i}$. But this is the probability for a given $i+1$, whereas we want the probability for any $i+1$. Each $i+1$ is mutually exclusive so we can add up the probability of each $i+1$. Given that $i+1$ must be greater than or equal to $r+1$ by definition,

$$P(r, n) = \sum_{i=r}^{n}\frac{r}{n \times i} = \frac{r}{n}\sum_{i=r}^{n}\frac{1}{i}$$

Great, so now we have the probability of making a successful selection given $r$ and $n$. Now how do we find at what $r$ is this maximum? Normally we would find the derivative of this equation with respect to $r$ and see what $r$ has to be for the derivative to be 0 (the turning point), but we can't differentiate this expression because $r$ is discrete. We also cannot treat $r$ as continuous because it is used as the starting point in the summation. But if we focus on finding $\frac{r}{n}$ rather than $r$, then we can do so in a round about way.

Start by multiplying $\frac{1}{n}$ with the expression like this:

$$\frac{r}{n}\sum_{i=r}^{n}\frac{n}{i} \frac{1}{n}$$

We can do this since $n$ cannot be zero. Now what will this expression look like if we let $n$ go to infinity?

$$\lim_{n \to \infty} \frac{r}{n}\sum_{i=r}^{n}\frac{n}{i} \frac{1}{n}$$

Let $x = \lim_{n \to \infty} \frac{r}{n}$ and $t = \lim_{n \to \infty} \frac{i}{n}$. In this case, the summation looks like a Reinmann sum, that is, it is adding together the areas of equally wide rectangular strips in a fixed region under a graph. As the width of the strips goes to zero, the total area of all the strips equals the area under the graph (in the fixed region), making it equal to the definite integral of the graph. In this case, the graph is $f(t) = \frac{1}{t}$, the width of the strips is $\frac{1}{n}$, and the region is between $\frac{r}{n}$ and $\frac{n}{n}$. This means the following:

$$\lim_{n \to \infty} \frac{r}{n}\sum_{i=r}^{n}\frac{n}{i} \frac{1}{n} = x\int_x^1 \frac{1}{t} dt$$

Great. So now we need to find the turning point of $x\int_x^1 \frac{1}{t} dt$ with respect to $x$ in order to know what $\frac{r}{n}$ gives the most probable success. This comes as follows:

\begin{align} \frac{d}{dx} (x\int_x^1 \frac{1}{t} dt) &= 0 \\ \frac{d}{dx} x(\ln(1) - \ln(x)) &= 0 \\ \frac{d}{dx} -x\ln(x) &= 0 \\ -\ln(x) - 1 &= 0 \\ x &= \frac{1}{e} \\ \end{align}

Which says that the number of partners you should date before looking for a serious relationship is the total number of partners you will date divided by $\frac{1}{e}$, or about 37% of the number of people you'll go out with.