I Still Haven't Found What I'm Looking For

| 0 Comments

Last time[1] we took a look at a selection of sorting operations that we can use to sort arrays, or ranges of elements within them. After defining some useful comparison functions satisfying JavaScript's requirement of returning a negative number when the first argument compares smaller than the second, zero when they compare equal and a positive number otherwise, and a function to map negative integers to indices read from the end of arrays in the same way that Array.slice does, we first implemented ak.partition which divides elements into two ranges; those elements that satisfy some given condition followed by those elements that don't. We saw how this could be used to implement the quicksort algorithm but instead defined ak.sort to sort a range of elements using Array.sort, slicing them out beforehand and splicing them back in again afterwards if they didn't represent whole arrays. We did use it, however, to implement ak.nthElement which puts a the correctly sorted element in a given position position within a range, putting before it elements that are no greater and after it elements that are no smaller. Finally, we implemented ak.partialSort which puts every element in a range up to, but not including, a given position into its correctly sorted place with all of the elements from that position onwards comparing no less than the last correctly sorted element.
This time we shall take a look at some of the ways that we can query data after we have manipulated it with these functions.

Partition? Partition!

In fact, the first thing that we shall do is implement functions to check whether ranges have been partitioned or sorted at all, starting with the ak.isPartitioned function given in listing 1.

Listing 1: ak.isPartitioned
ak.isPartitioned = function(a, pred, start, end) {
 if(ak.nativeType(pred)!==ak.FUNCTION_T) {
  throw new Error('invalid predicate in ak.isPartitioned');
 }

 start = ak.arrayIndex(a, start, 'ak.isPartitioned');
 end = ak.arrayIndex(a, end, 'ak.isPartitioned');

 if(ak.nativeType(start)===ak.UNDEFINED_T) start = 0;
 if(ak.nativeType(end)===ak.UNDEFINED_T) end = a.length;

 while(start<end && pred(a[start])) ++start;
 if(start<end) while(++start<end && !pred(a[start]));
 return start===end;
};

We begin by checking that the predicate for which we're checking whether or not the range has been partitioned is a valid function, before turning to our ak.arrayIndex to map the start and end positions to indices in according to the conventions of Array.slice, defaulting them to the start and end of the array , defined as the position immediately after its last element, respectively if they are undefined.
The test itself is implemented by the last few lines of the function which first searches for the first element, if any, for which the predicate is false and, if found, for the first subsequent element, if any, for which it is true. If the latter exists then the range is not partitioned, otherwise it is. Note that by comparing the final value of start to end for equality, we shall treat invalid ranges having the former after the latter as not being partitioned. The reason why we don't throw an exception for such ranges is because Array.slice doesn't and I'm a strong believer in following the conventions that a language has already adopted when programming in it.
Finally, this implementation can be found in IsPartitioned.js.

Next up is ak.isSorted, given in listing 2.

Listing 2: ak.isSorted
ak.isSorted = function(a, compare, start, end) {
 if(ak.nativeType(compare)===ak.UNDEFINED_T) {
  compare = ak.alphaCompare;
 }
 else if(ak.nativeType(compare)!==ak.FUNCTION_T) {
  throw new Error('invalid comparator in ak.isSorted');
 }

 start = ak.arrayIndex(a, start, 'ak.isSorted');
 end = ak.arrayIndex(a, end, 'ak.isSorted');

 if(ak.nativeType(start)===ak.UNDEFINED_T) start = 0;
 if(ak.nativeType(end)===ak.UNDEFINED_T) end = a.length;

 if(start<end) while(++start<end && !(compare(a[start-1], a[start])>0));
 return start===end;
};

This function, which can be found in IsSorted.js, first performs the usual initial housekeeping before searching for the first element, if any, that compares greater than the element that follows it. If such an elements exists then the range isn't sorted, otherwise it is, with invalid ranges again yielding false.

Where In The Range?

With a slight modification to ak.isSorted we can search for the end position of partially sorted elements within a range, as shown by ak.isSortedUntil in listing 3, which is defined in IsSortedUntil.js.

Listing 3: ak.isSortedUntil
ak.isSortedUntil = function(a, compare, start, end) {
 if(ak.nativeType(compare)===ak.UNDEFINED_T) {
  compare = ak.alphaCompare;
 }
 else if(ak.nativeType(compare)!==ak.FUNCTION_T) {
  throw new Error('invalid comparator in ak.isSortedUntil');
 }

 start = ak.arrayIndex(a, start, 'ak.isSortedUntil');
 end = ak.arrayIndex(a, end, 'ak.isSortedUntil');

 if(ak.nativeType(start)===ak.UNDEFINED_T) start = 0;
 if(ak.nativeType(end)===ak.UNDEFINED_T) end = a.length;

 if(start<end) while(++start<end && !(compare(a[start-1], a[start])>0));
 return start<=end ? start : -1;
};

The only difference is that, rather than checking that no element compares greater than the one following it, it returns the position after the first element that does so, or the end of the range if no such element exists. Once again we're treating invalid ranges as unsorted and so return minus one, in keeping with the behaviour of methods such as Array.indexOf which use it to mean not found.
Program 1 demonstrates that this correctly identifies the end of the sorted elements after calls to ak.partialSort with randomly populated arrays by separating them with a vertical bars instead of commas.

Program 1: Results Of ak.isSortedUntil

Note that the reason that the identified positions don't always align with the mid argument of ak.partialSort, which in this case is the randomly generated value j and is marked with a vertical bar in each of the original unsorted arrays, is because it only guarantees that those arrays will be correctly sorted up to those positions, not that they won't continue to compare in increasing order after them, nor that those additional elements will be in their correctly sorted positions.

Now, if a range of elements has been partitioned then we can search for the partition point by testing its middle element and setting either its start or its end to that position depending upon whether it satisfies the condition or not, repeating the process until the ends meet, as done by the ak.partitionPoint function given in listing 4 and defined in PartitionPoint.js.

Listing 4: ak.partitionPoint
ak.partitionPoint = function(a, pred, start, end) {
 var mid;

 if(ak.nativeType(pred)!==ak.FUNCTION_T) {
  throw new Error('invalid predicate in ak.partitionPoint');
 }

 start = ak.arrayIndex(a, start, 'ak.partitionPoint');
 end = ak.arrayIndex(a, end, 'ak.partitionPoint');

 if(ak.nativeType(start)===ak.UNDEFINED_T) start = 0;
 if(ak.nativeType(end)===ak.UNDEFINED_T) end = a.length;

 while(start<end) {
  mid = ak.floor(0.5*(start+end));
  if(!pred(a[mid])) end = mid;
  else              start = mid+1;
 }
 return start===end ? start : -1;
};

Note that, since the partition point is the first element that doesn't satisfy the condition, we set the start to the first position after the middle of the range if it does satisfy it both because we can definitely rule it out and, rather more importantly, we will end up in an infinite loop if the middle position was rounded down to the start.
Program 2 partitions a randomly shuffled array of the integers from one to sixteen on their comparing less than a randomly generated number between zero and seventeen or not, showing the result of a subsequent call to ak.partitionPoint with a vertical bar.

Program 2: Results Of ak.partitionPoint

We can easily adapt this to search for the first element that does not compare less than a given value in a sorted range, which is the first position at which we could splice it in whilst keeping the range sorted, as shown by ak.lowerBound in listing 5, defined in LowerBound.js.

Listing 5: ak.lowerBound
ak._unsafeLowerBound = function(a, value, compare, start, end) {
 var mid;

 while(start<end) {
  mid = ak.floor((start+end)/2);
  if(compare(a[mid], value)<0) start = mid+1;
  else                         end = mid;
 }
 return start<=end ? start : -1;
};

ak.lowerBound = function(a, value, compare, start, end) {
 if(ak.nativeType(compare)===ak.UNDEFINED_T) {
  compare = ak.alphaCompare;
 }
 else if(ak.nativeType(compare)!==ak.FUNCTION_T) {
  throw new Error('invalid comparator in ak.lowerBound');
 }

 start = ak.arrayIndex(a, start, 'ak.lowerBound');
 end = ak.arrayIndex(a, end, 'ak.lowerBound');

 if(ak.nativeType(start)===ak.UNDEFINED_T) start = 0;
 if(ak.nativeType(end)===ak.UNDEFINED_T) end = a.length;

 return ak._unsafeLowerBound(a, value, compare, start, end);
};

Note that we're again using the convention of a leading underscore to indicate that the actual implementation ak._unsafeLowerBound should not be called directly by the user whilst making it available for use by other functions in the library.

Similarly, the first element that a given value compares less than in a sorted range is the last position at which we could splice it in without leaving it unsorted, and can be found with the ak.upperBound function shown in listing 6 and which can be found in UpperBound.js.

Listing 6: ak.upperBound
ak._unsafeUpperBound = function(a, value, compare, start, end) {
 var mid;

 while(start<end) {
  mid = ak.floor((start+end)/2);
  if(!(compare(value, a[mid])<0)) start = mid+1;
  else                            end = mid;
 }
 return start<=end ? start : -1;
};

ak.upperBound = function(a, value, compare, start, end) {
 if(ak.nativeType(compare)===ak.UNDEFINED_T) {
  compare = ak.alphaCompare;
 }
 else if(ak.nativeType(compare)!==ak.FUNCTION_T) {
  throw new Error('invalid comparator in ak.upperBound');
 }

 start = ak.arrayIndex(a, start, 'ak.upperBound');
 end = ak.arrayIndex(a, end, 'ak.upperBound');

 if(ak.nativeType(start)===ak.UNDEFINED_T) start = 0;
 if(ak.nativeType(end)===ak.UNDEFINED_T) end = a.length;

 return ak._unsafeUpperBound(a, value, compare, start, end);
};

We can now use the unsafe lower and upper bound functions to find the range of elements within a sorted range that compare equal to a given value, as shown by ak.equalRange in listing 7.

Listing 7: ak.equalRange
ak.equalRange = function(a, value, compare, start, end) {
 var lb, ub;

 if(ak.nativeType(compare)===ak.UNDEFINED_T) {
  compare = ak.alphaCompare;
 }
 else if(ak.nativeType(compare)!==ak.FUNCTION_T) {
  throw new Error('invalid comparator in ak.equalRange');
 }

 start = ak.arrayIndex(a, start, 'ak.equalRange');
 end = ak.arrayIndex(a, end, 'ak.equalRange');

 if(ak.nativeType(start)===ak.UNDEFINED_T) start = 0;
 if(ak.nativeType(end)===ak.UNDEFINED_T) end = a.length;

 lb = ak._unsafeLowerBound(a, value, compare, start, end);
 ub = lb>=0 ? ak._unsafeUpperBound(a, value, compare, lb, end) : -1;

 return [lb, ub];
};

Note that if the lower bound is defined, or in other words is not negative, we reduce the amount of effort that we spend searching for the upper bound by using it as the start of the range of elements in which we search for it, exploiting the fact that the latter cannot come before the former.
Program 3 demonstrates this function, defined in EqualRange.js, by marking the lower and upper bounds of a randomly generated integer between zero and seven in an array of randomly generated integers between zero and seven with vertical bars.

Program 3: Results Of ak.equalRange

We can determine whether a value is equivalent to an element within a sorted range by testing whether the position returned by ak.lowerBound is within it and contains an element that compares equal to it, as done by ak.binarySearch in listing 8 which you can find in BinarySearch.js.

Listing 8: ak.binarySearch
ak.binarySearch = function(a, value, compare, start, end) {
 if(ak.nativeType(compare)===ak.UNDEFINED_T) {
  compare = ak.alphaCompare;
 }
 else if(ak.nativeType(compare)!==ak.FUNCTION_T) {
  throw new Error('invalid comparator in ak.binarySearch');
 }

 start = ak.arrayIndex(a, start, 'ak.binarySearch');
 end = ak.arrayIndex(a, end, 'ak.binarySearch');

 if(ak.nativeType(start)===ak.UNDEFINED_T) start = 0;
 if(ak.nativeType(end)===ak.UNDEFINED_T) end = a.length;

 start = ak._unsafeLowerBound(a, value, compare, start, end);
 return start>=0 && start<end && compare(a[start], value)===0;
};

Finally, it's also quite common to search for the elements that compare less and greater than every other in unsorted ranges, which we can do by simply iterating over every element within them. The ak.minElement returns the position of the element that compares least within a range, or minus one if the range is invalid, as shown by listing 9.

Listing 9: ak.minElement
ak.minElement = function(a, compare, start, end) {
 var min;

 if(ak.nativeType(compare)===ak.UNDEFINED_T) {
  compare = ak.alphaCompare;
 }
 else if(ak.nativeType(compare)!==ak.FUNCTION_T) {
  throw new Error('invalid comparator in ak.minElement');
 }

 start = ak.arrayIndex(a, start, 'ak.minElement');
 end = ak.arrayIndex(a, end, 'ak.minElement');

 if(ak.nativeType(start)===ak.UNDEFINED_T) start = 0;
 if(ak.nativeType(end)===ak.UNDEFINED_T) end = a.length;

 min = start;
 while(++start<end) if(compare(a[start],a[min])<0) min = start;
 return min<end ? min : -1;
};

Similarly, the ak.maxElement function given in listing 10 searches for the position of the element that compares greatest with a range.

Listing 10: ak.maxElement
ak.maxElement = function(a, compare, start, end) {
 var max;

 if(ak.nativeType(compare)===ak.UNDEFINED_T) {
  compare = ak.alphaCompare;
 }
 else if(ak.nativeType(compare)!==ak.FUNCTION_T) {
  throw new Error('invalid comparator in ak.maxElement');
 }

 start = ak.arrayIndex(a, start, 'ak.maxElement');
 end = ak.arrayIndex(a, end, 'ak.maxElement');

 if(ak.nativeType(start)===ak.UNDEFINED_T) start = 0;
 if(ak.nativeType(end)===ak.UNDEFINED_T) end = a.length;

 max = start;
 while(++start<end) if(compare(a[start],a[max])>0) max = start;
 return max<end ? max : -1;
};

Program 4 demonstrates these functions, which are implemented in MinElement.js and MaxElement.js, on arrays randomly generated using our ak.multiUniformRnd function.

Program 4: Results Of ak.minElement And ak.maxElement

These are the last query-like functions that I want to add to the ak library and in the next post we shall take a look at some ways that we can transform sorted ranges.

References

[1] We're All Sorted From A To Z, www.thusspakeak.com, 2017.

Leave a comment