Parsec (parser)

{{Infobox software

| name = Parsec

| logo =

| logo alt =

| logo caption =

| screenshot =

| screenshot size =

| screenshot alt =

| caption =

| author = Daan Leijen, Paolo Martini, Antoine Latter

| developer = Herbert Valerio Riedel, Derek Elkins, Antoine Latter, Roman Cheplyaka, Ryan Scott

| released = {{Start date and age|2006|11|02}}{{cite web |title=parsec 2.0 |url=https://hackage.haskell.org/package/parsec-2.0 |website=Hackage |access-date=3 September 2019}}

| latest release version = 3.1.17.0

| latest release date = {{Start date and age|2024|04|05}}{{cite web |title=Releases |url=https://github.com/haskell/parsec/releases |website=GitHub |access-date=22 September 2024}}

| latest preview version =

| latest preview date =

| repo = {{URL|https://github.com/haskell/parsec}}

| programming language = Haskell

| engine =

| operating system = Linux, macOS, Windows

| platform = Haskell Platform

| included with =

| size =

| language = English

| genre = Parser combinator, library

| license = BSD-2-clause

| website = {{URL|https://hackage.haskell.org/package/parsec}}

}}

Parsec is a Haskell library for writing parsers.{{cite web|title=Parsec on Haskell wiki|url=https://wiki.haskell.org/Parsec|website=Haskell Wiki|access-date=29 May 2017}} It is based on higher-order parser combinators, so a complicated parser can be built out of many smaller ones.{{cite web|url=http://research.microsoft.com/pubs/65201/parsec-paper-letter.pdf|website=Microsoft Research|title=Parsec: Direct Style Monadic Parser Combinators For The Real World|date=July 2001|access-date=22 November 2014|ref=parsec-paper|last1=Leijen|first1=Daan|last2=Meijer|first2=Erik}} It has been reimplemented in many other languages, including Erlang,{{cite web|title=Parsec Erlang|url=https://bitbucket.org/dmercer/parsec-erlang/|website=BitBucket|access-date=23 November 2014|ref=parsec-erlang}} Elixir,{{cite web|title=Nimble Parsec|url=https://github.com/plataformatec/nimble_parsec/|website=GitHub|access-date=18 December 2018|ref=parsec-elixir}} OCaml,{{cite web|title=Parsec OCaml|url=http://lprousnth.files.wordpress.com/2007/08/pcl.pdf|website=The OCaml Summer Project|access-date=23 November 2014|ref=parsec-ocaml}} Racket,{{cite web|title=Megaparsack: Practical Parser Combinators|url=https://docs.racket-lang.org/megaparsack/}} F#,{{cite web|title=XParsec by corsis|url=http://xparsec.corsis.tech/|website=XParsec|access-date=29 May 2017|ref=xparsec}}{{cite web|title=FParsec|url=http://www.quanttec.com/fparsec/|website=Quanttec|access-date=29 May 2017|ref=fparsec}} and the imperative programming languages C#{{cite web|title=CSharp monad|url=https://github.com/louthy/csharp-monad|website=GitHub|access-date=10 December 2014|ref=parsec-csharp}} and Java.{{cite web|title=JParsec|url=https://github.com/jparsec/jparsec|website=GitHub|access-date=14 October 2016|ref=jparsec}}

Because a parser combinator-based program is generally slower than a parser generator-based program,{{cn|date=April 2024}} Parsec is normally used for small domain-specific languages, while Happy is used for compilers such as the Glasgow Haskell Compiler (GHC).{{cite web|title=The Glasgow Haskell Compiler (AOSA Vol. 2)|website=The Architecture of Open Source Applications|url=http://www.aosabook.org/en/ghc.html|access-date=23 November 2014|ref=aosa-ghc}}

Other Haskell parser combinator libraries that have been derived from Parsec include Megaparsec{{Cite web|url=https://hackage.haskell.org/package/megaparsec-6.5.0|title=megaparsec: Monadic parser combinators|website=Hackage|access-date=2018-09-10}} and Attoparsec.{{Cite web|url=https://hackage.haskell.org/package/attoparsec|title=attoparsec: Fast combinator parsing for bytestrings and text|website=Hackage|access-date=2018-09-10}}

Parsec is free software released under the BSD-2-Clause license.{{Cite web|url=https://github.com/haskell/parsec/blob/master/LICENSE|title=Parsec|website=GitHub |date=25 October 2021}}

Example

Parsers written in Parsec start with simpler parsers, such as ones that recognize certain strings, and combine them to build a parser with more complicated behavior. For example, digit parses a digit, and string parses a specific string (like "hello").
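As a minimal sketch of this kind of composition, the following combines string and digit into a slightly larger parser. The parser name identifier and the "id42" input format are hypothetical, chosen only for illustration; the Parser type used here is the one exported by Text.Parsec.String, which fixes the input to String.

```haskell
import Text.Parsec (digit, many1, parse, string)
import Text.Parsec.String (Parser)

-- Hypothetical example: accepts the literal "id" followed by one or
-- more digits, and returns the digits as an Int.
identifier :: Parser Int
identifier = do
  _  <- string "id"   -- a parser for a specific string
  ds <- many1 digit   -- digit, repeated one or more times
  return (read ds)    -- safe: ds is a non-empty run of digits

main :: IO ()
main = do
  print (parse identifier "" "id42")  -- prints 'Right 42'
  print (parse identifier "" "42")    -- a Left value describing the error
```

Each building block stays a parser in its own right, so identifier could itself be reused inside a larger parser in the same way.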

Parser combinator libraries like Parsec provide utility functions to run the parsers on real values. A program that recognizes the string "hello" at the start of its input can be split into two functions: one that defines the parser, and a main function that calls one of these utility functions (parse in this case) to run it:

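{-# LANGUAGE FlexibleContexts, RankNTypes #-}

import Text.Parsec -- has general parsing utility functions
import Text.Parsec.Char -- contains specific basic combinators

-- A parser over any stream of Char that returns a String. The explicit
-- forall (enabled by RankNTypes) binds the type variables of the synonym.
type Parser = forall s u m. Stream s m Char => ParsecT s u m String

parser :: Parser
parser = string "hello"

main :: IO ()
main = print (parse parser "" "hello world")
-- prints 'Right "hello"'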

We define a Parser type to make the type signature of parser easier to read. If we wanted to alter this program, say to read either the string "hello" or the string "goodbye", we could use the choice operator <|>, which Text.Parsec exports and which behaves like the <|> of the Alternative typeclass, to combine two parsers into a single parser that tries the first and, if that fails, the second:

parser = string "hello" <|> string "goodbye"
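The alternatives above begin with different characters, so the choice is unambiguous. When alternatives share a prefix, Parsec's <|> does not try the second parser once the first has consumed input, and the try combinator is needed to backtrack. The sketch below illustrates this; the names greedy and backtracking and the strings "hello" and "help" are hypothetical, picked only to show the effect.

```haskell
import Text.Parsec (parse, string, try, (<|>))
import Text.Parsec.String (Parser)

-- Without try: on the input "help", string "hello" matches "hel" and
-- then fails after consuming input, so the second alternative is
-- never attempted.
greedy :: Parser String
greedy = string "hello" <|> string "help"

-- With try: a failed "hello" is treated as having consumed nothing,
-- so string "help" runs from the original position.
backtracking :: Parser String
backtracking = try (string "hello") <|> string "help"

main :: IO ()
main = do
  print (parse greedy "" "help")        -- a Left value (parse error)
  print (parse backtracking "" "help")  -- prints 'Right "help"'
```

Limiting backtracking to explicit uses of try is a deliberate trade-off in Parsec: it keeps the common case predictable, at the cost of requiring try where alternatives overlap.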

References

{{Reflist}}