Semantic analysis (machine learning)

{{Short description|Machine learning method for concept approximation}}

{{other uses|Semantic analysis (disambiguation)}}

{{More citations needed|date=January 2021}}{{Semantics}}

In machine learning, semantic analysis of a text corpus is the task of building structures that approximate concepts from a large set of documents. It generally does not involve prior semantic understanding of the documents.

Semantic analysis strategies include:

  • Metalanguages based on first-order logic, which can analyze human speech.{{cite book|author1=Nitin Indurkhya|author2=Fred J. Damerau|title=Handbook of Natural Language Processing|url=https://books.google.com/books?id=nK-QYHZ0-_gC|date=22 February 2010|publisher=CRC Press|isbn=978-1-4200-8593-8}}{{rp|93-}}
  • Symbol grounding: if language is grounded, understanding the semantics of a text is equivalent to recovering its machine-readable meaning. For the restricted domain of spatial analysis, a computer-based language understanding system was demonstrated.{{cite book|author=Michael Spranger|title=The evolution of grounded spatial language|url=https://books.google.com/books?id=z0VFDAAAQBAJ&pg=PA123|date=15 June 2016|publisher=Language Science Press|isbn=978-3-946234-14-2}}{{rp|123}}
  • Latent semantic analysis (LSA), a class of techniques in which documents are represented as vectors in a term space; a prominent example is probabilistic latent semantic analysis (PLSA). A minimal LSA sketch follows this list.
  • Latent Dirichlet allocation (LDA), which attributes document terms to latent topics (sketched below).
  • n-grams and hidden Markov models, which represent the term stream as a Markov chain in which each term is derived from the preceding terms (a bigram sketch appears below).
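
The following is a minimal sketch of latent semantic analysis, assuming the scikit-learn library is available: documents are represented as TF-IDF vectors in term space and then projected onto a small number of latent components with a truncated SVD. The toy corpus and the choice of two components are illustrative only.

<syntaxhighlight lang="python">
# Minimal LSA sketch: represent documents as TF-IDF vectors in term space,
# then approximate latent "concepts" with a truncated SVD.
# Assumes scikit-learn is installed; the toy corpus is purely illustrative.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

documents = [
    "the cat sat on the mat",
    "dogs and cats are pets",
    "stock markets fell on monday",
    "investors sold shares as markets dropped",
]

tfidf = TfidfVectorizer(stop_words="english")
term_doc = tfidf.fit_transform(documents)    # documents as vectors in term space

lsa = TruncatedSVD(n_components=2, random_state=0)
concept_space = lsa.fit_transform(term_doc)  # low-rank approximation of concepts

print(concept_space)  # each row: a document's coordinates in the latent space
</syntaxhighlight>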
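
A comparable sketch of latent Dirichlet allocation, again assuming scikit-learn and an illustrative toy corpus: a bag-of-words count matrix is fit with an LDA model, yielding per-document topic proportions and per-topic term weights.

<syntaxhighlight lang="python">
# Minimal LDA sketch: attribute document terms to latent topics.
# Assumes scikit-learn is installed; corpus and topic count are illustrative.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

documents = [
    "the cat sat on the mat",
    "dogs and cats are pets",
    "stock markets fell on monday",
    "investors sold shares as markets dropped",
]

counts = CountVectorizer(stop_words="english")
term_counts = counts.fit_transform(documents)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(term_counts)  # per-document topic proportions

terms = counts.get_feature_names_out()
for k, topic in enumerate(lda.components_):  # per-topic term weights
    top_terms = [terms[i] for i in topic.argsort()[-3:]]
    print(f"topic {k}: {top_terms}")
</syntaxhighlight>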
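
A minimal bigram (n = 2) Markov-chain sketch using only the Python standard library, in which each term is sampled from the successors observed after the preceding term. The corpus, random seed, and generated length are arbitrary illustrative choices.

<syntaxhighlight lang="python">
# Minimal bigram Markov-chain sketch: each term's distribution is conditioned
# on the preceding term. Standard library only; toy corpus for illustration.
import random
from collections import defaultdict

corpus = "the cat sat on the mat the cat ate the fish".split()

# Count bigram transitions: preceding term -> observed next terms.
transitions = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    transitions[prev].append(nxt)

# Generate a short term stream by sampling each term from the successors
# of its predecessor (a maximum-likelihood bigram model).
random.seed(0)
term = "the"
stream = [term]
for _ in range(6):
    successors = transitions.get(term)
    if not successors:           # dead end: no observed successor
        break
    term = random.choice(successors)
    stream.append(term)

print(" ".join(stream))
</syntaxhighlight>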

References

{{reflist}}

{{Natural language processing}}

Category:Machine learning

{{Compsci-stub}}