monoid factorisation

In mathematics, a factorisation of a free monoid is a sequence of subsets of words with the property that every word in the free monoid can be written as a concatenation of elements drawn from the subsets. The Chen–FoxLyndon theorem states that the Lyndon words furnish a factorisation. The Schützenberger theorem relates the definition in terms of a multiplicative property to an additive property.{{clarify|reason=What, exactly, is the "additive property", here? What makes this additive? I have a vague glimmer of what it might be, but this statement is imprecise. |date=October 2020}}

Let {{math|A}} be the free monoid on an alphabet A. Let Xi be a sequence of subsets of {{math|A}} indexed by a totally ordered index set I. A factorisation of a word w in {{math|A}} is an expression

:w = x_{i_1} x_{i_2} \cdots x_{i_n} \

with x_{i_j} \in X_{i_j} and i_1 \ge i_2 \ge \ldots \ge i_n. Some authors reverse the order of the inequalities.

Chen–Fox–Lyndon theorem

A Lyndon word over a totally ordered alphabet A is a word that is lexicographically less than all its rotations.Lothaire (1997) p.64 The Chen–Fox–Lyndon theorem states that every string may be formed in a unique way by concatenating a lexicographically non-increasing sequence of Lyndon words. Hence taking {{math|Xl}} to be the singleton set {{math|{{mset|l}}}} for each Lyndon word {{mvar|l}}, with the index set L of Lyndon words ordered lexicographically, we obtain a factorisation of {{math|A}}.Lothaire (1997) p.67 Such a factorisation can be found in linear time and constant space by Duval's algorithm.{{cite journal | last = Duval | first = Jean-Pierre | doi = 10.1016/0196-6774(83)90017-2 | issue = 4

| journal = Journal of Algorithms | pages = 363–381 | title = Factorizing words over an ordered alphabet | volume = 4 | year = 1983}}. The algorithm{{Cite web |title=Lyndon factorization - Algorithms for Competitive Programming |url=https://cp-algorithms.com/string/lyndon_factorization.html |access-date=2024-01-30 |website=cp-algorithms.com}} in Python code is:

def chen_fox_lyndon_factorization(s: list[int]) -> list[int]:

"""Monoid factorisation using the Chen–Fox–Lyndon theorem.

Args:

s: a list of integers

Returns:

a list of integers

"""

n = len(s)

factorization = []

i = 0

while i < n:

j, k = i + 1, i

while j < n and s[k] <= s[j]:

if s[k] < s[j]:

k = i

else:

k += 1

j += 1

while i <= k:

factorization.append(s[i:i + j - k])

i += j - k

return factorization

Hall words

The Hall set provides a factorization.

Guy Melançon, (1992) "[https://core.ac.uk/download/pdf/82581798.pdf Combinatorics of Hall trees and Hall words]", Journal of Combinatoric Theory, 59A(2) pp. 285–308.

Indeed, Lyndon words are a special case of Hall words. The article on Hall words provides a sketch of all of the mechanisms needed to establish a proof of this factorization.

Bisection

A bisection of a free monoid is a factorisation with just two classes X0, X1.Lothaire (1997) p.68

Examples:

:{{math|1=A = {a,b}, X0 = {Ab}, X1 = {a}.}}

If X, Y are disjoint sets of non-empty words, then (X,Y) is a bisection of {{math|A}} if and only ifLothaire (1997) p.69

:YX \cup A = X \cup Y \ .

As a consequence, for any partition P, Q of A+ there is a unique bisection (X,Y) with X a subset of P and Y a subset of Q.Lothaire (1997) p.71

Schützenberger theorem

This theorem states that a sequence Xi of subsets of {{math|A}} forms a factorisation if and only if two of the following three statements hold:

  • Every element of {{math|A}} has at least one expression in the required form;{{clarify|reason=What does "required form" mean here? I guess I should assume the "required form" is the product in descending order, given previously, right? The wording is awkward.|date=October 2020}}
  • Every element of {{math|A}} has at most one expression in the required form;
  • Each conjugate class C meets just one of the monoids {{math|1=Mi = Xi}} and the elements of C in Mi are conjugate in Mi.Lothaire (1997) p.92{{clarify|reason=What, exactly is meant by "meet", here? I assume that it means that the intersection of C and M_i is non-empty for one and only one i? Right? I dislike having to guess; I sometimes guess incorrectly.|date=October 2020}}

See also

References

{{reflist}}

  • {{Citation | last=Lothaire | first=M. | authorlink=M. Lothaire | others=Perrin, D.; Reutenauer, C.; Berstel, J.; Pin, J.-É.; Pirillo, G.; Foata, D.; Sakarovitch, J.; Simon, I.; Schützenberger, M. P.; Choffrut, C.; Cori, R.; Lyndon, R.; Rota, G.-C. Foreword by Roger Lyndon | title=Combinatorics on words | edition=2nd | series=Encyclopedia of Mathematics and Its Applications | volume=17 | publisher=Cambridge University Press | year=1997 | isbn=0-521-59924-5 | zbl=0874.20040 }}

Category:Formal languages