weighted automaton

{{Short description|Finite-state machine where edges carry weights}}

File:Quantitative automata.svg of some classes of quantitative automata, ordered by expressiveness.{{Cite book|last1=Chatterjee|first1=Krishnendu|last2=Henzinger|first2=Thomas A.|last3=Otop|first3=Jan|title=Static Analysis |chapter=Quantitative Monitor Automata |date=2016|editor-last=Rival|editor-first=Xavier|chapter-url=https://link.springer.com/chapter/10.1007%2F978-3-662-53413-7_2|series=Lecture Notes in Computer Science|volume=9837 |language=en|location=Berlin, Heidelberg|publisher=Springer|pages=23–38|doi=10.1007/978-3-662-53413-7_2|isbn=978-3-662-53413-7}}{{rp|at=Fig.1}}]]

In theoretical computer science and formal language theory, a weighted automaton or weighted finite-state machine is a generalization of a finite-state machine in which the edges have weights, for example real numbers or integers. Finite-state machines are only capable of answering decision problems; they take as input a string and produce a Boolean output, i.e. either "accept" or "reject". In contrast, weighted automata produce a quantitative output, for example a count of how many answers are possible on a given input string, or a probability of how likely the input string is according to a probability distribution.{{Cite book|date=2009|editor-last=Droste|editor-first=Manfred|editor2-last=Kuich|editor2-first=Werner|editor3-last=Vogler|editor3-first=Heiko|title=Handbook of Weighted Automata|url=https://link.springer.com/book/10.1007/978-3-642-01492-5|series=Monographs in Theoretical Computer Science. An EATCS Series |language=en-gb|doi=10.1007/978-3-642-01492-5|bibcode=2009hwa..book.....D |isbn=978-3-642-01491-8 |issn=1431-2654}} chs.1-4, pp. 3–26, 69–71, 122–126. They are one of the simplest studied models of quantitative automata.{{rp|at=Fig.1}}

The definition of a weighted automaton is generally given over an arbitrary semiring R, an abstract set with an addition operation + and a multiplication operation \times. The automaton consists of a finite set of states, a finite input alphabet of characters \Sigma and edges which are labeled with both a character in \Sigma and a weight in R. The weight of any path in the automaton is defined to be the product of weights along the path, and the weight of a string is the sum of the weights of all paths which are labeled with that string. The weighted automaton thus defines a function from \Sigma^* to R.

Weighted automata generalize deterministic finite automata (DFAs) and nondeterministic finite automata (NFAs), which correspond to weighted automata over the Boolean semiring, where addition is logical disjunction and multiplication is logical conjunction. In the DFA case, there is only one accepting path for any input string, so disjunction is not applied. When the weights are real numbers and the outgoing weights for each state add to one, weighted automata can be considered a probabilistic model and are also known as probabilistic automata. These machines define a probability distribution over all strings, and are related to other probabilistic models such as Markov decision processes and Markov chains.

Weighted automata have applications in natural language processing where they are used to assign weights to words and sentences,{{Cite web|last=Chiang|first=David|title=Weighted Automata|url=https://www3.nd.edu/~dchiang/teaching/nlp/2019/notes/chapter4v3.pdf|access-date=2021-11-09|website=Natural Language Processing (CSE 40657/60657), course notes, Fall 2019}}{{rp|pages=571-596}} as well as in image compression.{{rp|pages=453-480}} They were first introduced by Marcel-Paul Schützenberger in his 1961 paper On the definition of a family of automata.{{Cite journal|last=Schützenberger|first=M. P.|date=1961-09-01|title=On the definition of a family of automata|journal=Information and Control|language=en|volume=4|issue=2|pages=245–270|doi=10.1016/S0019-9958(61)80020-X|issn=0019-9958|doi-access=}} Since their introduction, many extensions have been proposed, for example nested weighted automata,{{Cite journal|last1=Chatterjee|first1=Krishnendu|last2=Henzinger|first2=Thomas A.|last3=Otop|first3=Jan|date=2017-12-14|title=Nested Weighted Automata|url=https://doi.org/10.1145/3152769|journal=ACM Transactions on Computational Logic|volume=18|issue=4|pages=31:1–31:44|doi=10.1145/3152769|arxiv=1504.06117 |s2cid=15070803 |issn=1529-3785}} cost register automata,{{Cite book |last1=Alur |first1=Rajeev |last2=D'Antoni |first2=Loris |last3=Deshmukh |first3=Jyotirmoy |last4=Raghothaman |first4=Mukund |last5=Yuan |first5=Yifei |title=2013 28th Annual ACM/IEEE Symposium on Logic in Computer Science |chapter=Regular Functions and Cost Register Automata |date=2013 |chapter-url=https://ieeexplore.ieee.org/document/6571532 |pages=13–22 |doi=10.1109/LICS.2013.65|isbn=978-1-4799-0413-6 |s2cid=1286659 |url=https://repository.upenn.edu/cis_papers/568 }} and weighted finite-state transducers.{{Cite journal|last1=Mohri|first1=Mehryar|last2=Pereira|first2=Fernando|last3=Riley|first3=Michael|date=2002-01-01|title=Weighted finite-state transducers in speech recognition|url=https://repository.upenn.edu/bitstreams/0473dbb0-8787-4c32-97f6-0d3f6edaffed/download|journal=Computer Speech & Language|language=en|volume=16|issue=1|pages=69–88|doi=10.1006/csla.2001.0184|issn=0885-2308}} Researchers have studied weighted automata from the perspective of learning a machine from its input-output behavior{{Cite book|last1=Balle|first1=Borja|last2=Mohri|first2=Mehryar|title=Algebraic Informatics |chapter=Learning Weighted Automata |date=2015|editor-last=Maletti|editor-first=Andreas|chapter-url=https://link.springer.com/chapter/10.1007/978-3-319-23021-4_1|series=Lecture Notes in Computer Science|volume=9270 |language=en|location=Cham|publisher=Springer International Publishing|pages=1–21|doi=10.1007/978-3-319-23021-4_1|isbn=978-3-319-23021-4}} (see computational learning theory) and studying decidability questions.{{Cite book|last1=Almagor|first1=Shaull|last2=Boker|first2=Udi|last3=Kupferman|first3=Orna|title=Automated Technology for Verification and Analysis |chapter=What's Decidable about Weighted Automata? |date=2011|editor-last=Bultan|editor-first=Tevfik|editor2-last=Hsiung|editor2-first=Pao-Ann|chapter-url=https://link.springer.com/chapter/10.1007/978-3-642-24372-1_37|series=Lecture Notes in Computer Science|volume=6996 |language=en|location=Berlin, Heidelberg|publisher=Springer|pages=482–491|doi=10.1007/978-3-642-24372-1_37|isbn=978-3-642-24372-1|s2cid=10159261 }}

{{clear|right}}

Definition

A commutative semiring (or rig) is a set R equipped with two distinguished elements 0 \ne 1 and addition and multiplication operations \oplus and \otimes such that \oplus is commutative and associative with identity 0, \otimes is commutative and associative with identity 1, \otimes distributes over \oplus, and 0 is an absorbing element for \otimes.

A weighted automaton over R is a tuple \mathcal{A} = (Q, \Sigma, \Delta, I, F) where:

  • Q is a finite set of states.
  • \Sigma is a finite alphabet.
  • \Delta \subseteq Q \times \Sigma \times R \times Q is a finite set of transitions (q, \sigma, w, q'), where \sigma is called a character and w is called a weight.
  • I: Q \to R is an initial weight function.
  • F: Q \to R is a final weight function.

A path on input w \in \Sigma^* is a finite path in the graph, where the concatenation of the character labels equals w. The weight of the path q_0, q_1, \ldots, q_n is the product (\otimes) of the weights along the path, additionally multiplied by the initial and final weights I(q_0) \otimes F(q_n). The weight of the word w is the sum (\oplus) of the weights of all paths on input w (or 0 if there are no accepting paths). In this way the machine defines a function [\![ \mathcal{A} ]\!]: \Sigma^* \to R.

Ambiguity and determinism

Since \Delta is a set of transitions, weighted automata allow multiple transitions (or paths) on a single input string.

Therefore a weighted automaton can be considered analogous to a nondeterministic finite automaton (NFA).

As is the case with NFAs, restrictions of weighted automata are considered that correspond to the concepts of deterministic finite automaton and unambiguous finite automaton (deterministic weighted automata and unambiguous weighted automata, respectively).

First, a preliminary definition: the underlying NFA of \mathcal{A} is an NFA formed by removing all transitions with weight 0 and then erasing all of the weights on the transitions \Delta, so that the new transition set lies in Q \times \Sigma \times Q. The initial states and final states are the set of states q such that I(q) \ne 0 and F(q) \ne 0, respectively.

A weighted automaton is deterministic if the underlying NFA is deterministic and unambiguous if the underlying NFA is unambiguous.

Every deterministic weighted automaton is unambiguous.

In both the deterministic and unambiguous cases, there is always at most one accepting path, so the \oplus operation is never applied and can be omitted from the definition.

Variations

  • The requirement that there is a zero element for \oplus is sometimes omitted; in this case the machine defines a partial function from \Sigma^* to R rather than a total function.
  • It is possible to extend the definition to allow epsilon transitions (q, \epsilon, w, q'), where \epsilon is the empty string. In this case, one must then require that there are no cycles of epsilon transitions. This does not increase the expressiveness of weighted automata.{{Citation|last=Mohri|first=Mehryar|title=Weighted Automata Algorithms|date=2009|url=https://doi.org/10.1007/978-3-642-01492-5_6|work=Handbook of Weighted Automata|pages=213–254|editor-last=Droste|editor-first=Manfred|series=Monographs in Theoretical Computer Science. An EATCS Series|place=Berlin, Heidelberg|publisher=Springer|language=en|doi=10.1007/978-3-642-01492-5_6|bibcode=2009hwa..book..213M |isbn=978-3-642-01492-5|access-date=2021-11-11|editor2-last=Kuich|editor2-first=Werner|editor3-last=Vogler|editor3-first=Heiko}} If epsilon transitions are allowed, the initial weights and final weights can be replaced by initial and final sets of states without loss of expressiveness.
  • Some authors omit the initial and final weight functions I and F. Instead, I and F are replaced by a set of initial and final states. If epsilon transitions are not present, this technically decreases expressiveness as it forces [\![ \mathcal{A} ]\!](\varepsilon) to depend only on the number of states that are both initial and final.
  • The transition function can be given as a matrix \Delta_\sigma \in R^{Q \times Q} with entries in R for each \sigma, rather than a set of transitions. The entry of the matrix at (q, q') is the sum of all transitions labeled (q, \sigma, q').{{Cite journal|last1=Droste|first1=Manfred|last2=Gastin|first2=Paul|date=2007-06-21|title=Weighted automata and weighted logics|journal=Theoretical Computer Science|series=Automata, Languages and Programming|language=en|volume=380|issue=1|pages=69–86|doi=10.1016/j.tcs.2007.02.055|issn=0304-3975|doi-access=free}}
  • Some authors restrict to specific semirings, such as \mathbb{N} or \mathbb{Z}, particularly when studying decidability results.

See also

References