Wednesday, July 7, 2010

Biasing uniform random fraction generators towards larger or smaller fractions

The need for a function which allows you to bias a uniform probability distribution random number generator towards fractions closer to 0.0 or 1.0, and to control the degree of the bias, has been felt a few times whilst I was programming. C# only has uniform random number generators and in a recent project I had, I had to devise a function to enable me to bias the random numbers generated. In this post I shall describe this function.

We would like a function which skews the number line between 0.0 and 1.0 such that the range of numbers closer to 0.0 are denser than those closer to 1.0 or vice versa.

Consider the following statistics gathered from the Python function random.random(), where we shall count the number of random numbers generated that are greater than 0.5 and those that are less out of 10000:


The frequencies are very close to each other. This is because random.random() uses a uniform probability distribution, meaning that each number between 0.0 and 1.0 has an equal chance of being generated. But what if we wanted to bias the frequencies so that more numbers are less than 0.5?

An application for this could be to randomly select an element from an array which is sorted on priority such that you want to randomly select the first element more often than the last. To do this you bias the random number generator towards smaller indices.

We need a function which takes the set of inputs from 0.0 to 1.0 and returns numbers in the same range, however skewing the mapping such that a larger part of the number range gets mapped to a smaller part of the output range.

In the graph below, y = x, the mapping is unbiased, no part of the output range has been squashed.

However if we were to bend the line so that it still goes through (0,0) and (1,1) but is curved then we could skew the mapping as demonstrated below:

The graph on the left skews the output such that there is a bias for smaller numbers; all inputs which are less than 0.6 will be mapped to be closer to 0.0 whilst the rest of the inputs will be mapped to be closer to 1.0. On the other hand, the graph on the right skews the output such that there is a bias for larger numbers; only inputs smaller than 0.4 will be mapped to be closer to 0.0 whilst the rest of the inputs will be mapped to be closer to 1.0. Also, in the graph on the left, the smaller the number the greater the chance of it being returned whilst in the graph on the right, the larger the number the greater the chance of it being returned.

We would like to devise a function which will give us this curve, and to some how parameterise the curvature so that the bias can be made stronger or weaker. Such a curve is given by the exponent function
y = x^k           (1)
which, given that k is positive, goes through (0,0) and (1,1) and which different values of k will change the curvature such that when k is between 0.0 and 1.0 the curvature will be going flatter and when k is between 1.0 and infinity the curvature will be going steeper.

In order to control the curvature we shall allow the user to specify a point which the curve is to pass through. To find which value of k will go through the point (x',y'), we make k subject of the formula in (1) after substituting x with x' and y with y' and we get
k = ln y' / ln x'           (2)
and by substituting (2) into (1) we get
y = x^(ln y' / ln x')           (3)
which is useful for describing our curve, provided both y' and x' are between 0 and 1, both not included.

However we would like the way we control the curvature to be more meaningful to the user. So instead we let the user provide just one variable, called the bias, which will be the x value which will be mapped to 0.5, hence all smaller values will be mapped to be less than 0.5 and all greater values will be mapped to be more than 0.5. To do this, we set y' to be 0.5, so that x' will be the bias, that is, the x value which gives 0.5. After modifying equation (3), the new equation is
y = x^(ln 0.5 / ln b)           (4)
where b is the bias.

The following graphs are obtained for different values of b:

The following statistics demonstrate the validity of the function. Again 10000 random numbers are generated using Python's random.random() except that this time we shall pass the random numbers to the devised funtion as x using different values for b and all cases use a seed of 0:


So using this function we can easily give a bias to random number generators. The only problem with it is that a bias of zero or one will result in a log of zero and division by zero respectively and hence cannot be done.

To bias a random number generator which gives random fractions so that it is biased towards numbers closer to 1 or 0, use the equation y = x^(ln 0.5 / ln b) where x is the unbiased random number, b is the fraction of numbers to be less than 0.5 and y is the biased random number.


  1. Hi there,

    I tried to implement this in javascript and i couldnt get it to work for me. Could you please tell me how i've got it wrong.

    function bias(x,b)
    //Where x is the unbiased random number
    //Where b is the fraction of numbers to be less than 0.5
    return Math.pow(x, (Math.log(0.5) / Math.log(b)));

  2. It could be possible that i have not interpreted the formula correctly, or that I am using it wrong.

    When i use it like this i am hoping to get a number that is randomly biased towards 0.

    var biasedNumber = bias(550, 0.8);

  3. The random number must be between 0 and 1 for this to work. You can use Math.random() to get an unbiased random number.

    That said, if you want to use a particular range of random numbers, you first get a biased random number which is between 0 and 1 like so:

    var biasedNumberFraction = Math.pow(Math.random(), (Math.log(0.5) / Math.log(0.8)));

    And then you use this random fraction like so:

    var biasedNumber = (max - min) * biasedNumberFraction + min;

    where max and min are the maximum and minimum numbers in the range that you are interested in.

  4. What about making the bias hover about a point in the middle? A standard deviation / bell-curve sort of thing?

    1. I don't think it can be used for such an application. Bell curves are used to bias the numbers towards the middle, this is used to bias the numbers towards an extreme. There are different distributions for different applications.