Quantitative linguistics

{{Short description|Subdiscipline of mathematical linguistics}}


Quantitative linguistics (QL) is a sub-discipline of general linguistics and, more specifically, of mathematical linguistics. It deals with language learning, language change, and the application as well as the structure of natural languages. QL investigates languages using statistical methods; its most demanding objective is the formulation of language laws and, ultimately, of a general theory of language in the sense of a set of interrelated language laws.<ref>Reinhard Köhler: Gegenstand und Arbeitsweise der Quantitativen Linguistik. In: Reinhard Köhler, Gabriel Altmann, Rajmund G. Piotrowski (eds.): Quantitative Linguistik - Quantitative Linguistics. Ein internationales Handbuch. de Gruyter, Berlin/New York 2005, pp. 1–16. {{ISBN|3-11-015578-8}}.</ref> Synergetic linguistics was specifically designed for this purpose from its very beginning.<ref>Reinhard Köhler: Synergetic linguistics. In: Reinhard Köhler, Gabriel Altmann, Rajmund G. Piotrowski (eds.): Quantitative Linguistik - Quantitative Linguistics. Ein internationales Handbuch. de Gruyter, Berlin/New York 2005, pp. 760–774. {{ISBN|3-11-015578-8}}.</ref>

QL is empirically based on the results of language statistics, a field which can be interpreted as statistics of languages or as statistics of any linguistic object. This field is not necessarily connected to substantial theoretical ambitions. Corpus linguistics and computational linguistics are other fields which contribute important empirical evidence.

History

The earliest QL approaches date back to the ancient Indian world. One of the historical sources consists of applications of combinatorics to linguistic matters;<ref>N.L. Biggs: [https://www.sciencedirect.com/science/article/pii/0315086079900740/pdf?md5=8c541e132c321062fd29031d3f5c9c72&pid=1-s2.0-0315086079900740-main.pdf The Roots of Combinatorics.] In: Historia Mathematica 6, 1979, pp. 109–136.</ref> another is based on elementary statistical studies, which can be found under the headings colometry and stichometry.<ref>Adam Pawłowski: [https://www.researchgate.net/profile/Adam-Pawlowski-3/publication/311816013_Prolegomena_to_the_History_of_Corpus_and_Quantitative_Linguistics_Greek_Antiquity/links/590f1e8fa6fdccad7b124573/Prolegomena-to-the-History-of-Corpus-and-Quantitative-Linguistics-Greek-Antiquity.pdf Prolegomena to the History of Corpus and Quantitative Linguistics. Greek Antiquity.] In: Glottotheory 1, 2008, pp. 48–54.</ref>

Quantitative laws

[[File:Frequency of demonstratives2.jpg|thumb|Frequency of demonstratives in Serbo-Croatian]]
In QL, the concept of law is understood as the class of law hypotheses which have been deduced from theoretical assumptions, are mathematically formulated, are interrelated with other laws in the field, and have been sufficiently and successfully tested on empirical data, i.e. which could not be refuted in spite of much effort to do so. Reinhard Köhler writes about QL laws:

{{blockquote|Moreover, it can be shown that these properties of linguistic elements and of the relations among them abide by universal laws which can be formulated strictly mathematically in the same way as common in the natural sciences. One has to bear in mind in this context that these laws are of stochastic nature; they are not observed in every single case (this would be neither necessary nor possible); they rather determine the probabilities of the events or proportions under study. It is easy to find counterexamples to each of the above-mentioned examples; nevertheless, these cases do not violate the corresponding laws as variations around the statistical mean are not only admissible but even essential; they are themselves quantitatively exactly determined by the corresponding laws. This situation does not differ from that in the natural sciences, which have since long abandoned the old deterministic and causal views of the world and replaced them by statistical/probabilistic models.<ref>Cf. note 1, pp. 1–2.</ref>}}

Linguistic laws

In quantitative linguistics, linguistic laws are statistical regularities emerging across different linguistic scales (i.e. phonemes, syllables, words or sentences) that can be formulated mathematically and that have been deduced from certain theoretical assumptions. They are also required to have been successfully tested against data, that is, not to have been refuted by empirical evidence. Among the main linguistic laws proposed by various authors, the following can be highlighted:<ref>Cf. references: Köhler, Altmann, Piotrowski (eds.) (2005).</ref>

  • Zipf's law: The frequency of words is inversely proportional to their rank in frequency lists. A similar relation between rank and frequency can be observed for sounds, phonemes, and letters.<ref>H. Guiter, M. V. Arapov (eds.): Studies on Zipf's Law. Bochum: Brockmeyer 1982. {{ISBN|3-88339-244-8}}.</ref>
  • Heaps' law: It describes the number of distinct words in a document (or set of documents) as a function of the document length.
  • Brevity law or Zipf's law of abbreviation: It qualitatively states that the more frequently a word is used, the 'shorter' that word tends to be.<ref>Zipf GK. 1935 The Psychobiology of language, an introduction to dynamic philology. Boston, MA: Houghton–Mifflin.</ref>
  • Menzerath's law (also, Menzerath-Altmann law): This law states that the sizes of the constituents of a construction decrease with increasing size of the construction under study. For example, the longer a sentence (measured in number of clauses), the shorter its clauses (measured in number of words); likewise, the longer a word (in syllables or morphs), the shorter its syllables or morphs (in sounds).
  • Law of diversification: If linguistic categories such as parts of speech or inflectional endings appear in various forms, it can be shown that the frequencies of their occurrences in texts are controlled by laws.
  • Martin's law: This law concerns lexical chains which are obtained by looking up the definition of a word in a dictionary, then looking up the definition of the definition just obtained, and so on. Together, these definitions form a hierarchy of more and more general meanings, whereby the number of definitions decreases with increasing generality. Among the levels of this kind of hierarchy, there exist a number of lawful relations.
  • Piotrowski's law of language change: Growth processes in language such as vocabulary growth, the dispersion of foreign or loan words, changes in the inflectional system etc. correspond to growth models in other scientific disciplines. Piotrowski's law is an application of the logistic function. It was shown that it also covers language acquisition processes (cf. language acquisition law).
  • Text block law: Linguistic units (e.g. words, letters, syntactic functions and constructions) show a specific frequency distribution in equally large text blocks.
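As a rough illustration of how some of the laws listed above can be examined on a text, the following Python sketch computes a rank-frequency table (the quantity Zipf's law describes) and a vocabulary growth curve (the quantity Heaps' law describes), and evaluates a logistic curve of the kind underlying Piotrowski's law. The function names, the whitespace tokenizer and the parameter names are assumptions made for this example; they are not taken from any QL toolkit, and fitting the laws to data is left out.

```python
from collections import Counter
import math


def tokenize(text):
    """Lower-case the text and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def rank_frequency(tokens):
    """Return (rank, word, frequency) triples, most frequent word first.

    Under Zipf's law, frequency is roughly proportional to 1 / rank.
    """
    counts = Counter(tokens)
    # Sort by descending frequency, then alphabetically so ties are deterministic.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(rank, word, freq) for rank, (word, freq) in enumerate(ordered, start=1)]


def vocabulary_growth(tokens):
    """Return the number of distinct words seen after each token.

    Heaps' law describes how this count grows with text length.
    """
    seen = set()
    growth = []
    for token in tokens:
        seen.add(token)
        growth.append(len(seen))
    return growth


def logistic(t, ceiling, a, b):
    """Logistic curve ceiling / (1 + a * exp(-b * t)).

    The parameter names are hypothetical labels chosen for this sketch.
    """
    return ceiling / (1.0 + a * math.exp(-b * t))


if __name__ == "__main__":
    sample_tokens = tokenize("the cat saw the dog and the dog saw the cat")
    for rank, word, freq in rank_frequency(sample_tokens):
        print(rank, word, freq)
    print(vocabulary_growth(sample_tokens))
```

On a real corpus one would typically plot frequency against rank on logarithmic axes, and vocabulary size against text length, to judge how well the respective law describes the data.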

Stylistics

The study of poetic and non-poetic styles can be based on statistical methods. Moreover, it is possible to conduct corresponding investigations on the basis of the specific forms (parameters) that language laws take in texts of different styles. In such cases, QL supports research into stylistics: one of the overall aims is to make evidence for stylistic phenomena as objective as possible by referring to language laws. One of the central assumptions of QL is that some laws (e.g. the distribution of word lengths) require different models, and hence different parameter values of the laws (distributions or functions), depending on the corpus that a text belongs to. If poetic texts are under study, QL methods form a sub-discipline of Quantitative Study of Literature (stylometrics).<ref>Alexander Mehler: Eigenschaften der textuellen Einheiten und Systeme. In: Reinhard Köhler, Gabriel Altmann, Rajmund G. Piotrowski (eds.): Quantitative Linguistik - Quantitative Linguistics. Ein internationales Handbuch. de Gruyter, Berlin/New York 2005, pp. 325–348, esp. Quantitative Stilistik, pp. 339–340. {{ISBN|3-11-015578-8}}; Vivien Altmann, Gabriel Altmann: Anleitung zu quantitativen Textanalysen. Methoden und Anwendungen. Lüdenscheid: RAM-Verlag 2008, {{ISBN|978-3-9802659-5-9}}.</ref>
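The idea that a distribution such as that of word lengths takes different parameter values in different kinds of text can be illustrated with a small sketch. The Python code below computes the relative frequency distribution of word lengths in a text and its mean. Word length is measured in letters here purely for simplicity, which is an assumption of this example (other units, such as syllables, could be used instead), and the function names are likewise hypothetical.

```python
from collections import Counter


def word_length_distribution(text):
    """Map each word length (in letters) to its relative frequency in the text."""
    # Keep only the alphabetic characters of each whitespace-separated token.
    words = ["".join(ch for ch in token if ch.isalpha()) for token in text.split()]
    lengths = [len(word) for word in words if word]
    counts = Counter(lengths)
    total = len(lengths)
    return {length: counts[length] / total for length in sorted(counts)}


def mean_word_length(distribution):
    """Mean of a word-length distribution given as {length: relative frequency}."""
    return sum(length * share for length, share in distribution.items())
```

Applying both functions to two texts of different styles and comparing the resulting distributions and means gives a first, purely descriptive impression of such differences; it does not replace fitting and testing a theoretical distribution.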

Important authors

  • Gabriel Altmann (1931–2020)<ref>Grzybek, Peter, & Köhler, Reinhard (eds.) (2007): [https://books.google.com/books?id=DghXCu3kLnAC Exact Methods in the Study of Language and Text. Dedicated to Gabriel Altmann on the Occasion of his 75th Birthday.] Berlin/New York: Mouton de Gruyter</ref>
  • Otto Behaghel (1854–1936); cf. Behaghel's laws
  • {{interlanguage link|Karl-Heinz Best|de}} (born 1943)<ref>[[:de:Benutzer:Dr._Karl-Heinz_Best]]; [http://wwwuser.gwdg.de/~kbest/ index]</ref>
  • {{interlanguage link|Sergej Grigor'evič Čebanov|de|Sergei Grigorjewitsch Tschebanow}} (1897–1966)<ref>[[:de:Sergei Grigorjewitsch Tschebanow]]</ref>
  • William Palin Elderton (1877–1962)<ref>Best, Karl-Heinz (2009): William Palin Elderton (1877–1962). Glottometrics 19, pp. 99–101 (PDF [https://www.ram-verlag.eu/wp-content/uploads/2018/08/g19zeit.pdf#page=102 ram-verlag.eu]).</ref>
  • {{interlanguage link|Gertraud Fenk-Oczlon|de}}<ref>[http://wwwu.uni-klu.ac.at/gfenk/ Homepage_Gertraud Fenk]</ref>
  • Ernst Förstemann (1822–1906)<ref>[[:de:Ernst Förstemann]]; Karl-Heinz Best: Ernst Wilhelm Förstemann (1822–1906). In: Glottometrics 12, 2006, pp. 77–86 (PDF [https://www.ram-verlag.eu/wp-content/uploads/2018/08/g12zeit-mit-Bild.pdf#page=82 ram-verlag.eu])</ref>
  • {{interlanguage link|Wilhelm Fucks|de}} (1902–1990)<ref>Dieter Aichele: Das Werk von W. Fucks. In: Reinhard Köhler, Gabriel Altmann, Rajmund G. Piotrowski (eds.): Quantitative Linguistik - Quantitative Linguistics. Ein internationales Handbuch. de Gruyter, Berlin/New York 2005, pp. 152–158. {{ISBN|3-11-015578-8}}</ref>
  • {{interlanguage link|Peter Grzybek|de}} (1957–2019)<ref>[http://www.uni-graz.at/peter.grzybek/site.php?show=1 Peter Grzybek :: Homepage : Home / Kontakt] {{webarchive |url=https://web.archive.org/web/20120929070246/http://www.uni-graz.at/peter.grzybek/site.php?show=1 |date=September 29, 2012 }}</ref>
  • {{interlanguage link|Gustav Herdan|de}} (1897–1968)<ref>[[:de:Gustav Herdan]]</ref><ref>{{Cite web |url=http://lql.uni-trier.de/index.php/Herdan_dimension |title=Herdan dimension - Laws in Quantitative Linguistics |access-date=2010-05-22 |archive-url=https://web.archive.org/web/20110719111208/http://lql.uni-trier.de/index.php/Herdan_dimension |archive-date=2011-07-19 |url-status=dead }}</ref>
  • {{interlanguage link|Luděk Hřebíček|cs}} (1934–2015)<ref>[[:de:Luděk Hřebíček]]</ref>
  • {{interlanguage link|Friedrich Wilhelm Kaeding|de}} (1843–1928)<ref>[[:de:Friedrich Wilhelm Kaeding]]</ref>
  • {{interlanguage link|Reinhard Köhler|de}} (born 1951)<ref>[http://www.uni-trier.de/index.php?id=11131 Universität Trier: Prof. Dr. Reinhard Köhler] {{webarchive|url=https://web.archive.org/web/20150407162235/http://www.uni-trier.de/index.php?id=11131 |date=2015-04-07 }}</ref>
  • Snježana Kordić (born 1964)<ref>{{cite book|last=Kordić|first=Snježana|author-link=Snježana Kordić|language=de|title=Wörter im Grenzbereich von Lexikon und Grammatik im Serbokroatischen|trans-title=Serbo-Croatian Words on the Border Between Lexicon and Grammar|series=Studies in Slavic Linguistics; 18|location=Munich|publisher=Lincom Europa|year=2001|page=280|isbn=3-89586-954-6|lccn=2005530314|oclc=47905097|ol=2863539W|id={{NYPL|b15245330}}. {{NCID|BA56769448}}}}</ref><ref>{{cite book|last=Kordić|first=Snježana|author-link=Snježana Kordić|language=de|title=Der Relativsatz im Serbokroatischen|trans-title=Relative Clauses in Serbo-Croatian|series=Studies in Slavic Linguistics; 10|location=Munich|publisher=Lincom Europa|year=2005|orig-date=1st pub. 1999; 2nd pub. 2002; 3rd pub. 2005|page=330|isbn=3-89586-573-7|oclc=42422661|s2cid=171902446|ol=2863535W|id={{NYPL|b14328353}}}} [http://d-nb.info/956417647/04 Contents]</ref>
  • Werner Lehfeldt (born 1943)<ref>[http://www.uni-goettingen.de/de/51122.html Georg-August-Universität Göttingen - Lehfeldt, Werner, Prof. em. Dr]</ref>
  • {{ill|Viktor Vasil'evič Levickij|uk|Левицький Віктор Васильович}} (1938–2012)<ref>Festschrift on the occasion of his 70th birthday: Problems of General, Germanic and Slavic Linguistics. Papers for 70th Anniversary of Professor V. Levickij. Edited by Gabriel Altmann, Iryna Zadoroshna, Yuliya Matskulyak. Books, Chernivtsi 2008. (No ISBN.) Dedicated to Levickij: Glottometrics, issue 16, 2008; Emmerich Kelih: Der Czernowitzer Beitrag zur Quantitativen Linguistik: Zum 70. Geburtstag von Prof. Dr. Habil. Viktor V. Levickij. In: Naukovyj Visnyk Černivec'koho Universytetu: Hermans'ka filolohija. Vypusk 407, 2008, pp. 3–10.</ref>
  • Haitao Liu<ref>[http://mypage.zju.edu.cn/en/lht Human-Language-Computer - staff Homepage, ZJU]</ref>
  • {{ill|Helmut Meier|de|Helmut Meier (Germanist)}} (1897–1973)
  • Paul Menzerath (1883–1954);<ref>Karl-Heinz Best: Paul Menzerath (1883–1954). In: Glottometrics 14, 2007, pp. 86–98 (PDF [https://www.ram-verlag.eu/wp-content/uploads/2018/08/g14zeit.pdf#page=89 ram-verlag.eu])</ref> cf. Menzerath's law
  • {{ill|Sizuo Mizutani|ja|水谷静夫}} (1926–2014)<ref>Shizuo Mizutani; portrait on the occasion of his 80th birthday in: Glottometrics 12, 2006 (PDF [https://www.ram-verlag.eu/wp-content/uploads/2018/08/g12zeit-mit-Bild.pdf#page=3 ram-verlag.eu]); about Mizutani: Naoko Maruyama: Sizuo Mizutani (1926). The Founder of Japanese Quantitative Linguistics. In: Glottometrics 10, 2005, pp. 99–107 (PDF [https://www.ram-verlag.eu/wp-content/uploads/2018/08/g10zeit.pdf#page=94 ram-verlag.eu]).</ref>
  • Augustus De Morgan (1806–1871)
  • {{ill|Charles Muller, Straßburg|de|Charles Muller (Romanist)}} (1909–2015)<ref>Charles Muller: Initiation à la statistique linguistique. Paris: Larousse 1968; German: Einführung in die Sprachstatistik. Hueber, München 1972.</ref>
  • {{interlanguage link|Rajmund G. Piotrowski|de|Piotrowski-Gesetz}}<ref>Rajmund G. Piotrowski, also R.G. Piotrovskij; cf. Piotrowski's law: http://lql.uni-trier.de/index.php/Change_in_language {{Webarchive|url=https://web.archive.org/web/20110719111534/http://lql.uni-trier.de/index.php/Change_in_language |date=2011-07-19 }}; [[:de:Piotrowski-Gesetz]]</ref>
  • L.A. Sherman
  • {{interlanguage link|Juhan Tuldava|et}} (1922–2003)<ref>Journal of Quantitative Linguistics 4, Nr. 1, 1997 (Festschrift in Honour of Juh. Tuldava)</ref>
  • Andrew Wilson, Lancaster<ref>[http://www.ling.lancs.ac.uk/profiles/Andrew-Wilson/ Dr Andrew Wilson - Linguistics and English Language at Lancaster University]</ref>
  • {{interlanguage link|Albert Thumb|de}} (1865–1915)<ref>[[:de:Albert Thumb]]</ref>
  • George Kingsley Zipf (1902–1950); cf. Zipf's law
  • {{interlanguage link|Eberhard Zwirner|de}} (1899–1984); phonometry<ref>[[:de:Eberhard Zwirner]]</ref>

Notes

{{reflist}}

References

  • Karl-Heinz Best: Quantitative Linguistik. Eine Annäherung. 3., stark überarbeitete und ergänzte Auflage. Peust & Gutschmidt, Göttingen 2006, {{ISBN|3-933043-17-4}}.
  • Karl-Heinz Best, Otto Rottmann: Quantitative Linguistics, an Invitation. RAM-Verlag, Lüdenscheid 2017. {{ISBN|978-3-942303-51-4}}.
  • Reinhard Köhler with the assistance of Christiane Hoffmann: Bibliography of Quantitative Linguistics. Benjamins, Amsterdam/ Philadelphia 1995, {{ISBN|90-272-3751-4}}.
  • Reinhard Köhler, Gabriel Altmann, Rajmund G. Piotrowski (eds.): Quantitative Linguistik - Quantitative Linguistics. Ein internationales Handbuch – An International Handbook. de Gruyter, Berlin/New York 2005, {{ISBN|3-11-015578-8}}.
  • Haitao Liu & Wei Huang. [http://www.journals.zju.edu.cn/soc/CN/abstract/abstract10497.shtml Quantitative Linguistics: State of the Art, Theories and Methods]. Journal of Zhejiang University (Humanities and Social Science). 2012, 43(2): 178–192 (in Chinese).