square-free word
{{CS1 config|mode=cs1}}
In combinatorics, a square-free word is a word (a sequence of symbols) that does not contain any squares. A square is a word of the form {{mvar|XX}}, where {{mvar|X}} is not empty. Thus, a square-free word can also be defined as a word that avoids the pattern {{mvar|XX}}.
Finite square-free words
= Binary alphabet =
Over a binary alphabet , the only square-free words are the empty word , and .
= Ternary alphabet =
Over a ternary alphabet , there are infinitely many square-free words. It is possible to count the number of ternary square-free words of length {{mvar|n}}.
class="wikitable"
|+The number of ternary square-free words of length n{{Cite web|url=https://oeis.org/A006156|title=A006156 - OEIS|website=oeis.org|access-date=2019-03-28}} !{{mvar|n}} |0 |1 |2 |3 |4 |5 |6 |7 |8 |9 |10 |11 |12 |
{{tmath|c(n)}}
|1 |3 |6 |12 |18 |30 |42 |60 |78 |108 |144 |204 |264 |
---|
This number is bounded by , where .{{Cite journal|last=Shur|first=Arseny|date=2011|title=Growth properties of power-free languages|journal=Computer Science Review|volume=6|issue=5–6|pages=28–43|doi=10.1016/j.cosrev.2012.09.001}} The upper bound on can be found via Fekete's Lemma and approximation by automata. The lower bound can be found by finding a substitution that preserves square-freeness.
= Alphabet with more than three letters =
Since there are infinitely many square-free words over three-letter alphabets, this implies there are also infinitely many square-free words over an alphabet with more than three letters.
The following table shows the exact growth rate of the {{mvar|k}}-ary square-free words, rounded off to 7 digits after the decimal point, for {{mvar|k}} in the range from 4 to 15:
class="wikitable"
|+Growth rate of the {{mvar|k}}-ary square-free words !alphabet size ({{mvar|k}}) |4 |5 |6 |7 |8 |9 |
growth rate
|2.6215080 |3.7325386 |4.7914069 |5.8284661 |6.8541173 |7.8729902 |
---|
alphabet size ({{mvar|k}})
|10 |11 |12 |13 |14 |15 |
growth rate
|8.8874856 |9.8989813 |10.9083279 |11.9160804 |12.9226167 |13.9282035 |
== 2-dimensional words ==
Consider a map from to {{mvar|A}}, where {{mvar|A}} is an alphabet and is called a 2-dimensional word. Let be the entry . A word is a line of if there exists such that , and for .{{Citation|chapter=Preface|pages=xi–xviii|editor-last=Berthe|editor-first=Valerie|publisher=Cambridge University Press|isbn=9781139924733|editor2-last=Rigo|editor2-first=Michel|doi=10.1017/cbo9781139924733.001|title=Combinatorics, Words and Symbolic Dynamics|year=2016}}
Carpi{{Cite journal|last=Carpi|first=Arturo|date=1988|title=Multidimensional unrepetitive configurations|journal=Theoretical Computer Science|volume=56|issue=2|pages=233–241|doi=10.1016/0304-3975(88)90080-1|issn=0304-3975|doi-access=free}} proves that there exists a 2-dimensional word over a 16-letter alphabet such that every line of is square-free. A computer search shows that there are no 2-dimensional words over a 7-letter alphabet, such that every line of is square-free.
= Generating finite square-free words =
Shur{{Cite journal|last=Shur|first=Arseny|date=2015|title=Generating square-free words efficiently|journal=Theoretical Computer Science|volume=601|pages=67–72|doi=10.1016/j.tcs.2015.07.027|doi-access=free|hdl=10995/92700|hdl-access=free}} proposes an algorithm called R2F (random-t(w)o-free) that can generate a square-free word of length {{mvar|n}} over any alphabet with three or more letters. This algorithm is based on a modification of entropy compression: it randomly selects letters from a k-letter alphabet to generate a {{tmath|(k+1)}}-ary square-free word.
algorithm R2F is
input: alphabet size ,
word length
output: a {{tmath|(k+1)}}-ary square-free word {{mvar|w}} of length {{mvar|n}}.
{{gray|(Note that is the alphabet with letters .)
(For a word , is the permutation of such that {{mvar|a}} precedes {{mvar|b}} in if the
right most position of {{mvar|a}} in {{mvar|w}} is to the right of the rightmost position of {{mvar|b}} in {{mvar|w}}.
For example, has .)}}
choose in uniformly at random
set to followed by all other letters of in increasing order
set the number {{mvar|N}} of iterations to 0
while do
choose {{mvar|j}} in uniformly at random
append to the end of {{mvar|w}}
update shifting the first {{mvar|j}} elements to the right and setting
increment {{mvar|N}} by {{mvar|1}}
if {{mvar|w}} ends with a square of rank {{mvar|r}} then
delete the last {{mvar|r}} letters of {{mvar|w}}
return {{mvar|w}}
Every (k+1)-ary square-free word can be the output of Algorithm R2F, because on each iteration it can append any letter except for the last letter of {{mvar|w}}.
The expected number of random k-ary letters used by Algorithm R2F to construct a {{tmath|(k+1)}}-ary square-free word of length {{mvar|n}} isNote that there exists an algorithm that can verify the square-freeness of a word of length {{mvar|n}} in time. Apostolico and Preparata{{Cite journal|last1=Apostolico|first1=A.|last2=Preparata|first2=F.P.|date=Feb 1983|title=Optimal off-line detection of repetitions in a string|journal=Theoretical Computer Science|volume=22|issue=3|pages=297–315|doi=10.1016/0304-3975(83)90109-3|issn=0304-3975|doi-access=free}} give an algorithm using suffix trees. Crochemore{{Cite journal|last=Crochemore|first=Max|date=Oct 1981|title=An optimal algorithm for computing the repetitions in a word|journal=Information Processing Letters|volume=12|issue=5|pages=244–250|doi=10.1016/0020-0190(81)90024-7|issn=0020-0190}} uses partitioning in his algorithm. Main and Lorentz{{Cite journal|last1=Main|first1=Michael G|last2=Lorentz|first2=Richard J|date=Sep 1984|title=An O(n log n) algorithm for finding all repetitions in a string|journal=Journal of Algorithms|volume=5|issue=3|pages=422–432|doi=10.1016/0196-6774(84)90021-x|issn=0196-6774}} provide an algorithm based on the divide-and-conquer method. A naive implementation may require time to verify the square-freeness of a word of length {{mvar|n}}.
Infinite square-free words
There exist infinitely long square-free words in any alphabet with three or more letters, as proved by Axel Thue.{{Cite book|title=Axel Thue's papers on repetitions in words a translation|last=Berstel|first=Jean|publisher=Départements de mathématiques et d'informatique, Université du Québec à Montréal|year=1994|isbn=978-2892761405|oclc=494791187}}
=Examples=
== First difference of the [[Thue–Morse sequence]] ==
One example of an infinite square-free word over an alphabet of size 3 is the word over the alphabet obtained by taking the first difference of the Thue–Morse sequence. That is, from the Thue–Morse sequence
:
one forms a new sequence in which each term is the difference of two consecutive terms of the Thue–Morse sequence. The resulting square-free word is
: {{OEIS|id=A029883}}.
== [[John Leech (mathematician)|Leech]]'s [[morphism]] ==
Another example found by John Leech{{cite journal|last=Leech|first=J.|author-link=John Leech (mathematician)|year=1957|title=A problem on strings of beads|journal=Math. Gaz.|volume=41|pages=277–278|zbl=0079.01101|doi=10.1017/S0025557200236115|s2cid=126406225 }} is defined recursively over the alphabet . Let {{tmath|w_1}} be any square-free word starting with the letter {{val|0}}. Define the words recursively as follows: the word is obtained from {{tmath|w_i}} by replacing each {{val|0}} in {{tmath|w_i}} with {{val|fmt=none|0121021201210}}, each {{val|1}} with {{val|fmt=none|1202102012021}}, and each {{val|2}} with {{val|fmt=none|2010210120102}}. It is possible to prove that the sequence converges to the infinite square-free word
:{{math|0121021201210120210201202120102101201021202102012021...}}
= Generating infinite square-free words =
Infinite square-free words can be generated by square-free morphism. A morphism is called square-free if the image of every square-free word is square-free. A morphism is called k–square-free if the image of every square-free word of length k is square-free.
Crochemore{{Cite book|last=Berstel|first=Jean|date=April 1984|chapter=Some Recent Results on Squarefree Words|title=Annual Symposium on Theoretical Aspects of Computer Science|volume=166|pages=14–25|doi=10.1007/3-540-12920-0_2|series=Lecture Notes in Computer Science|isbn=978-3-540-12920-2|chapter-url=http://www.numdam.org/item/PDML_1985___2B_21_0/}} proves that a uniform morphism {{mvar|h}} is square-free if and only if it is 3-square-free. In other words, {{mvar|h}} is square-free if and only if is square-free for all square-free {{mvar|w}} of length 3. It is possible to find a square-free morphism by brute-force search.
algorithm square-free_morphism is
output: a square-free morphism with the lowest possible rank {{mvar|k}}.
set
while True do
set k_sf_words to the list of all square-free words of length {{mvar|k}} over a ternary alphabet
for each in k_sf_words do
for each in k_sf_words do
for each in k_sf_words do
if then
break from the current loop (advance to next )
if and then
if is square-free for all square-free {{mvar|w}} of length {{val|3}} then
return
increment {{mvar|k}} by {{val|1}}
Over a ternary alphabet, there are exactly 144 uniform square-free morphisms of rank 11 and no uniform square-free morphisms with a lower rank than 11.
To obtain an infinite square-free words, start with any square-free word such as {{val|0}}, and successively apply a square-free morphism {{mvar|h}} to it. The resulting words preserve the property of square-freeness. For example, let {{mvar|h}} be a square-free morphism, then as , is an infinite square-free word.
Note that, if a morphism over a ternary alphabet is not uniform, then this morphism is square-free if and only if it is 5-square-free.
Letter combinations in square-free words
= Avoid two-letter combinations =
Over a ternary alphabet, a square-free word of length more than 13 contains all the square-free two-letter combinations.{{Cite arXiv |eprint=1505.00019|last1=Zolotov|first1=Boris|title=Another Solution to the Thue Problem of Non-Repeating Words|class=math.CO|year=2015}}
This can be proved by constructing a square-free word without the two-letter combination {{mvar|ab}}. As a result, {{mvar|bcba}}{{mvar|cbca}}{{mvar|cbaca}} is the longest square-free word without the combination {{mvar|ab}} and its length is equal to 13.
Note that over a more than three-letter alphabet there are square-free words of any length without an arbitrary two-letter combination.
= Avoid three-letter combinations =
= Density of a letter =
The density of a letter {{mvar|a}} in a finite word {{mvar|w}} is defined as where
is the number of occurrences of {{mvar|a}} in
and
is the length of the word. The density of a letter {{mvar|a}} in an infinite word is where
is the prefix of the word {{mvar|w}} of length {{mvar|l}}.{{Cite journal|last=Khalyavin|first=Andrey|date=2007|title=The minimal density of a letter in an infinite ternary square-free word is 883/3215|url=http://emis.de/journals/JIS/VOL10/Khalyavin/khalyavin13.pdf|journal=Journal of Integer Sequences|volume=10|issue=2|pages=3|bibcode=2007JIntS..10...65K}}
The minimal density of a letter {{mvar|a}} in an infinite ternary square-free word is equal to .
The maximum density of a letter {{mvar|a}} in an infinite ternary square-free word is equal to .{{Cite journal|last=Ochem|first=Pascal|date=2007|title=Letter frequency in infinite repetition-free words|journal=Theoretical Computer Science|volume=380|issue=3|pages=388–392|doi=10.1016/j.tcs.2007.03.027|issn=0304-3975|doi-access=}}
Notes
References
{{Reflist}}
- {{cite book | last1=Berstel | first1=Jean | last2=Lauve | first2=Aaron | last3=Reutenauer | first3=Christophe | last4=Saliola | first4=Franco V. | author1-link = Jean Berstel | title=Combinatorics on words. Christoffel words and repetitions in words | series=CRM Monograph Series | volume=27 | location=Providence, RI | publisher=American Mathematical Society | year=2009 | isbn=978-0-8218-4480-9 | url=http://www.ams.org/bookpages/crmm-27 | zbl=1161.68043 }}
- {{cite book | last=Lothaire | first=M. | author-link=M. Lothaire | title=Combinatorics on words | publisher=Cambridge University Press | location=Cambridge | year= 1997 | isbn= 978-0-521-59924-5 }}.
- {{cite book | last=Lothaire | first=M. | author-link=M. Lothaire | title=Algebraic combinatorics on words | others=With preface by Jean Berstel and Dominique Perrin | edition=Reprint of the 2002 hardback | series=Encyclopedia of Mathematics and Its Applications | volume=90| publisher=Cambridge University Press | year=2011 | isbn=978-0-521-18071-9 | zbl=1221.68183 }}
- {{cite book | last=Pytheas Fogg | first=N. | editor1=Berthé, Valérie|editor1-link=Valérie Berthé|editor2=Ferenczi, Sébastien|editor3=Mauduit, Christian|editor4=Siegel, Anne | title=Substitutions in dynamics, arithmetics and combinatorics | series=Lecture Notes in Mathematics | volume=1794 | location=Berlin | publisher=Springer-Verlag | year=2002 | isbn=978-3-540-44141-0 | zbl=1014.11015 }}