K-sorted sequence#Algorithms
{{Orphan|date=October 2021}}
In computer science, a nearly-sorted sequence, also known as roughly-sorted sequence and as -sorted sequence is a sequence which is almost ordered. By almost ordered, it is meant that no element of the sequence is very far away from where it would be if the sequence were perfectly ordered. It is still possible that no element of the sequence is at the place where it should be if the sequence were perfectly ordered.
The nearly-sorted sequences are particularly useful when the exact order of element has little importance. For example Twitter{{cite web |last1=@rk |title=Announcing Snowflake |url=https://blog.twitter.com/engineering/en_us/a/2010/announcing-snowflake.html |website=Twitter blog |access-date=26 April 2021}} nearly sort tweets, up to the second, as there is no need for more precision. Actually, given the impossibility to exactly synchronize all computers, an exact sorting of all tweets according to the time at which they are posted is impossible. This idea led to the creation of Snowflake IDs.
-sorting is the operation of reordering the elements of a sequence so that it becomes -sorted. -sorting is generally more efficient than sorting. Similarly, sorting a sequence is easier if it is known that the sequence is -sorted. So if a program needs only to consider -sorted sequences as input or output, considering -sorted sequences may save time.
The radius of a sequence is a measure of presortedness, that is, its value indicate how much the elements in the list has to be moved to get a totally sorted value. In the above example of tweets which are sorted up to the second, the radius is bounded by the number of tweets in a second.
Definition
Given a positive number , a sequence is said to be -sorted if for each and for each , . That is, the sequence has to be ordered only for pairs of elements whose distance is at least .
The radius of the sequence , denoted {{cite journal |last1=Altman |first1=Tom |last2=Igarashi |first2=Yoshihide |title=Roughly Sorting: Sequential and Parallel Approach |journal=Journal of Information Processing |date=1989-08-25 |volume=12 |issue=2 |pages=154–158 }} or {{cite journal |last1=Estivill-Castro |first1=Vladimir |last2=Wood |first2=Derick |title=A new measure of presortdness |journal=Information and Computation |date=October 1989 |volume=83 |issue=1 |pages=111–119 |doi=10.1016/0890-5401(89)90050-3|doi-access=free }} is the smallest such that the sequence is -sorted. The radius is a measure of presortedness.
A sequence is said to be nearly-sorted or roughly-sorted if its radius is small compared to its length.
= Equivalent definition =
A sequence is -sorted if and only if each range of length , is -sorted.
Properties
All sequences of length are -sorted, that is,
= Relation with sorted sequences =
Given a sequence a
Algorithms
= Deciding whether a sequence is <math>k</math>-sorted =
Deciding whether a sequence
= Computing the radius of a sequence =
Computing the radius of a sequence can be computed in linear time and space. This follows from the fact that it can be defined as
= Halving the radius of a sequence =
Given a
First, given a sequence
The sequence
Using
= Merging two <math>k</math>-sorted sequences =
{{distinguish|k-way merge algorithm}}
Merging two
First, using the preceding algorithm, both sequences should be transformed into
Let's construct iteratively an output sequence
If both
Let us consider the case where neither
k/2) and
elements of each list. Let us assume that
A^1\} from
= Sorting a <math>k</math>-sorted sequence =
A
= <math>k</math>-sorting a sequence =
Since each sequence