Thursday, April 30, 2020

Rolling neighbourhood histograms: Computing histograms of sliding windows quickly

In an earlier blog post, I described local binary patterns, which are vectors that describe the texture of image regions. In segmentation tasks where you need to group pixels together by texture, you need to compute the local binary pattern of the neighbourhoods of pixels around each pixel. This means that if you want to find the texture descriptor for a particular pixel 'P', first you need to extract a square of pixels centered on 'P' called the neighbourhood of 'P', compute the local binary codes of all the pixels in the neighbourhood, compute the histogram of the local binary codes, and that histogram is the texture descriptor of pixel 'P'. This is illustrated in the figure below.

Now in order to compute the texture descriptor of every pixel in the image you would need to repeat this process of extracting neighbourhoods and computing histograms for every pixel. But this is inefficient because it ends up recomputing the same values multiple times. First of all, the local binary codes do not need to be computed for each neighbourhood but can instead be computed once for the whole image and then the neighbourhoods be extracted from the local binary codes image instead of from the original image. But we can also make the histogram computations faster as well. This is because adjacent pixels have a lot of common pixels in their neighbourhoods as shown in the figure below.

We can take advantage of this commonality by computing the histogram of the new neighbourhood based on the histogram of the previous neighbourhood. Assuming the the histograms are simple frequency histograms, the new histogram is the previous histogram plus the frequencies of the values in the blue vertical column (see last row in the above figure) minus the frequencies of the values in the green vertical column. This means that instead of having to compute the frequencies of the whole neighbourhood you would only need to compute the frequencies for two columns and then add or subtract from the previous histogram. This reduces the time complexity from quadratic time to linear time with respect to the side of the neighbourhood square. Assuming that a neighbourhood is described by the (x,y) coordinate of the central pixel and the (w,h) width and height of the neighbourhood, and that histograms can be added together by vector addition, here is the relationship of the new histogram to the previous histogram:

histogram(x,y,w,h) = histogram(x-1,y,w,h) + histogram(x-w/2,y,1,h) - histogram(x+w/2,y,1,h)

This lets us compute the histogram of each neighbourhood by iteratively using the previous histogram in the same row, requiring only the first pixel in the row to be computed fully. But we can also avoid computing the full histogram of the first pixel in a row after the first row by using the histogram of the above pixel in the previous row like this:

histogram(x,y,w,h) = histogram(x,y-1,w,h) + histogram(x,y-h/2,w,1) - histogram(x,y+h/2,w,1)

This is similar to the way rolling hashes are computed, which is where the name rolling histogram comes from.