Context-sensitive language

{{Short description|Language defined by context-sensitive grammar}}

{{redirect|Context-dependent|the type of memory|Context-dependent memory}}

In formal language theory, a context-sensitive language is a language that can be defined by a context-sensitive grammar (and equivalently by a noncontracting grammar). Context-sensitive is known as type-1 in the Chomsky hierarchy of formal languages.

Computational properties

Computationally, a context-sensitive language is equivalent to a linear bounded nondeterministic Turing machine, also called a linear bounded automaton. That is a non-deterministic Turing machine with a tape of only kn cells, where n is the size of the input and k is a constant associated with the machine. This means that every formal language that can be decided by such a machine is a context-sensitive language, and every context-sensitive language can be decided by such a machine.

This set of languages is also known as NLINSPACE or NSPACE(O(n)), because they can be accepted using linear space on a non-deterministic Turing machine.{{citation

| last = Rothe | first = Jörg

| isbn = 978-3-540-22147-0

| location = Berlin

| mr = 2164257

| page = 77

| publisher = Springer-Verlag

| series = Texts in Theoretical Computer Science. An EATCS Series

| title = Complexity theory and cryptology

| year = 2005}}. The class LINSPACE (or DSPACE(O(n))) is defined the same, except using a deterministic Turing machine. Clearly LINSPACE is a subset of NLINSPACE, but it is not known whether LINSPACE = NLINSPACE.{{citation

| last = Odifreddi | first = P. G.

| isbn = 978-0-444-50205-6

| location = Amsterdam

| mr = 1718169

| page = 236

| publisher = North-Holland Publishing Co.

| series = Studies in Logic and the Foundations of Mathematics

| title = Classical recursion theory. Vol. II

| volume = 143

| year = 1999}}.

Examples

One of the simplest context-sensitive but not context-free languages is L = \{ a^nb^nc^n : n \ge 1 \}: the language of all strings consisting of {{mvar|n}} occurrences of the symbol "a", then {{mvar|n}} "b"s, then {{mvar|n}} "c"s (abc, {{not a typo|aabbcc}}, {{not a typo|aaabbbccc}}, etc.). A superset of this language, called the Bach language,{{cite conference |last=Pullum |first=Geoffrey K. |year=1983 |title=Context-freeness and the computer processing of human languages |conference=Proc. 21st Annual Meeting of the ACL |url=http://www.aclweb.org/anthology/P83-1001}} is defined as the set of all strings where "a", "b" and "c" (or any other set of three symbols) occurs equally often ({{not a typo|aabccb}}, {{not a typo|baabcaccb}}, etc.) and is also context-sensitive.Bach, E. (1981). [http://people.umass.edu/ebach/papers/nels11.htm "Discontinuous constituents in generalized categorial grammars"] {{Webarchive|url=https://web.archive.org/web/20140121022931/http://people.umass.edu/ebach/papers/nels11.htm |date=2014-01-21 }}. NELS, vol. 11, pp. 1–12.Joshi, A.; Vijay-Shanker, K.; and Weir, D. (1991). "The convergence of mildly context-sensitive grammar formalisms". In: Sells, P., Shieber, S.M. and Wasow, T. (Editors). Foundational Issues in Natural Language Processing. Cambridge MA: Bradford.

{{mvar|L}} can be shown to be a context-sensitive language by constructing a linear bounded automaton which accepts {{mvar|L}}. The language can easily be shown to be neither regular nor context-free by applying the respective pumping lemmas for each of the language classes to {{mvar|L}}.

Similarly:

L_\textit{Cross} = \{ a^mb^nc^{m}d^{n} : m \ge 1, n \ge 1 \} is another context-sensitive language; the corresponding context-sensitive grammar can be easily projected starting with two context-free grammars generating sentential forms in the formats

a^mC^m

and

B^nd^n

and then supplementing them with a permutation production like

CB\rightarrow BC, a new starting symbol and standard syntactic sugar.

L_{MUL3} = \{ a^mb^nc^{mn} : m \ge 1, n \ge 1 \} is another context-sensitive language (the "3" in the name of this language is intended to mean a ternary alphabet); that is, the "product" operation defines a context-sensitive language (but the "sum" defines only a context-free language as the grammar S\rightarrow aSc|R and R\rightarrow bRc|bc shows). Because of the commutative property of the product, the most intuitive grammar for L_\textit{MUL3} is ambiguous. This problem can be avoided considering a somehow more restrictive definition of the language, e.g. L_\textit{ORDMUL3} = \{ a^mb^nc^{mn} : 1 < m < n \}. This can be specialized to

L_\textit{MUL1} = \{ a^{mn} : m > 1, n > 1 \} and, from this, to L_{m^2} = \{ a^{m^2} : m > 1 \}, L_{m^3} = \{ a^{m^3} : m > 1 \}, etc.

L_{REP} = \{ w^

w
: w \in \Sigma^* \} is a context-sensitive language. The corresponding context-sensitive grammar can be obtained as a generalization of the context-sensitive grammars for L_\textit{Square} = \{ w^2 : w \in \Sigma^* \}, L_\textit{Cube} = \{ w^3 : w \in \Sigma^* \}, etc.

L_\textit{EXP} = \{ a^{2^n} : n \ge 1 \} is a context-sensitive language.Example 9.5 (p. 224) of Hopcroft, John E.; Ullman, Jeffrey D. (1979). Introduction to Automata Theory, Languages, and Computation. Addison-Wesley

L_\textit{PRIMES2} = \{ w : |w| \mbox { is prime } \} is a context-sensitive language (the "2" in the name of this language is intended to mean a binary alphabet). This was proved by Hartmanis using pumping lemmas for regular and context-free languages over a binary alphabet and, after that, sketching a linear bounded multitape automaton accepting L_{PRIMES2}.{{cite journal | doi=10.1145/321466.321470 | url=https://ecommons.cornell.edu/bitstream/1813/5864/1/68-1.pdf | author=J. Hartmanis and H. Shank | title=On the Recognition of Primes by Automata | journal=Journal of the ACM | volume=15 | number=3 | pages=382–389 | date=Jul 1968 | hdl=1813/5864 | s2cid=17998039 | hdl-access=free }}

L_\textit{PRIMES1} = \{ a^p : p \mbox { is prime } \} is a context-sensitive language (the "1" in the name of this language is intended to mean a unary alphabet). This was credited by A. Salomaa to Matti Soittola by means of a linear bounded automaton over a unary alphabetSalomaa, Arto (1969), Theory of Automata, {{ISBN|978-0-08-013376-8}}, Pergamon, 276 pages. {{doi|10.1016/C2013-0-02221-9}} (pages 213-214, exercise 6.8) and also to Marti Penttonen by means of a context-sensitive grammar also over a unary alphabet (See: Formal Languages by A. Salomaa, page 14, Example 2.5).

An example of recursive language that is not context-sensitive is any recursive language whose decision is an EXPSPACE-hard problem, say, the set of pairs of equivalent regular expressions with exponentiation.

Properties of context-sensitive languages

  • The union, intersection, concatenation of two context-sensitive languages is context-sensitive, also the Kleene plus of a context-sensitive language is context-sensitive.{{cite book|author=John E. Hopcroft|author2=Jeffrey D. Ullman|title=Introduction to Automata Theory, Languages, and Computation|url=https://archive.org/details/introductiontoau00hopc|url-access=registration|publisher=Addison-Wesley|year=1979|isbn=9780201029888}}; Exercise 9.10, p.230. In the 2000 edition, the chapter on context-sensitive languages has been omitted.
  • The complement of a context-sensitive language is itself context-sensitive{{cite journal | last = Immerman | first = Neil | year = 1988 | title =Nondeterministic space is closed under complementation | journal = SIAM J. Comput. | issue = 5 | pages = 935–938 | doi = 10.1137/0217058 | volume = 17 | url=http://www.cs.umass.edu/~immerman/pub/space.pdf |archive-url=https://web.archive.org/web/20040625094023/http://www.cs.umass.edu/~immerman/pub/space.pdf |archive-date=2004-06-25 |url-status=live| citeseerx = 10.1.1.54.5941 }} a result known as the Immerman–Szelepcsényi theorem.
  • Membership of a string in a language defined by an arbitrary context-sensitive grammar, or by an arbitrary deterministic context-sensitive grammar, is a PSPACE-complete problem.

See also

References

{{reflist}}

  • Sipser, M. (1996), Introduction to the Theory of Computation, PWS Publishing Co.

{{Formal languages and grammars}}

{{DEFAULTSORT:Context-Sensitive Language}}

Category:Formal languages