An Implementation Of Array Binary Search In PHP

I have been doing some reading and watching lectures of programming theory recently and I was reminded of this algorithm I learned about in university. Binary searching an array is a divide and conquer algorithm that takes an array and searches for a value in that array by splitting the array into halves. The algorithm works like this.

  • Given a sorted array, find the midpoint.
  • If the value at the mid point is greater than the value being searched for then the value must be in the first half of the array.
  • If the value at the mid point is less than the value being searched for then the value must be in the first half of the array.
  • Take the half of the array in question and repeat the first step by re-pointing the mid point.
  • Repeat this until the length of the array is 1 or 0. If the array length is 1 then this is the value. If the length of the array is 0 then the value was not in the array.

There is some implementation details around ensuring that the middle value is correct and to avoid running off the end of the array.

Here is an implementation of the algorithm in PHP.

/**
 * Use binary search to find a key of a value in an array.
 *
 * @param array $array
 *   The array to search for the value.
 * @param int $value
 *   A value to be searched.
 *
 * @return int|null
 *   Returns the key of the value in the array, or null if the value is not found.
 */
function binarySearch($array, $value) {
    // Set the left pointer to 0.
    $left = 0;
    // Set the right pointer to the length of the array -1.
    $right = count($array) - 1;

    while ($left <= $right) {
      // Set the initial midpoint to the rounded down value of half the length of the array.
      $midpoint = (int) floor(($left + $right) / 2);

      if ($array[$midpoint] < $value) {
        // The midpoint value is less than the value.
        $left = $midpoint + 1;
      } elseif ($array[$midpoint] > $value) {
        // The midpoint value is greater than the value.
        $right = $midpoint - 1;
      } else {
        // This is the key we are looking for.
        return $midpoint;
      }
    }
    // The value was not found.
    return NULL;
}

To run some tests on this algorithm I ran the following code. All of which prints true.

// Generate an array.
$array = range(0, 10);

// Loop through the array, searching for each value.
foreach ($array as $key => $value) {
  echo var_export(binarySearch($array, $value) === $value, TRUE) . PHP_EOL;
}

// Search for values outside of the array.
echo var_export(binarySearch($array, -1) === NULL, TRUE) . PHP_EOL;
echo var_export(binarySearch($array, 11) === NULL, TRUE) . PHP_EOL;

This is a neat little algorithm that is great for searching over arrays of sorted data where the keys of the array are sequential. Much more performant than simply looping through the array to find the value.

Comments

Why do you count right as count($array) - 1. Why do you need this -1 ?

Permalink

This is because the count() function will return the length of the array, but if we want to get the last item in the array using this value it will be 1 item beyond the end of the array.

The array I'm creating here starts from 0 so although the length is 10, the index of the last item is 9.

Name
Philip Norton
Permalink

A good binary search implementation does not return NULL if element is not found. It should return the correct insertion index for the element instead. This way it can serve both purposes (retrieval and insertion), because it only costs O(1) to check if the element at the returned index is the needle.

Permalink

That's actually a really good point. I initially though "how would you know if the value was not found versus a found index", but you are right in that you would only need to go and check the index for the return value. I might have another go at this.

Thank you for the comment!

Name
Philip Norton
Permalink

Add new comment

The content of this field is kept private and will not be shown publicly.
CAPTCHA
9 + 11 =
Solve this simple math problem and enter the result. E.g. for 1+3, enter 4.
This question is for testing whether or not you are a human visitor and to prevent automated spam submissions.