Snowball (programming language)

{{Short description|String processing programming language}}

{{distinguish|SNOBOL}}

{{Primary sources|date=March 2020}}

{{Update|inaccurate=yes|updated=September 2014|date=April 2021}}

Snowball is a small string processing programming language designed for creating stemming algorithms for use in information retrieval.[http://snowball.tartarus.org/ "Snowball"], Martin Porter, web page. Retrieved 2 September 2014.

The name Snowball was chosen as a tribute to the SNOBOL programming language, "with which it shares the concept of string patterns delivering signals that are used to control the flow of the program." The creator of Snowball, Dr. Martin Porter, "toyed with the idea of calling it 'strippergram,'" because it "effectively provides a 'suffix STRIPPER GRAMmar.'"

The Snowball compiler translates a Snowball script (an .sbl file) into a program in thread-safe ANSI C, Java, Ada, C#, Go, JavaScript, Object Pascal, Python or Rust.{{Cite web |last=Porter |first=Martin |title=Snowball: Quick introduction |url=http://snowball.tartarus.org/texts/quickintro.html |access-date=May 4, 2025}}{{Cite web |date=March 27, 2025 |title=Snowball README |url=https://github.com/snowballstem/snowball# |access-date=May 4, 2025}} For ANSI C, each Snowball script produces a program file and a corresponding header file (with .c and .h extensions). The Snowball compiler checks each script for consistency, and this check was used to discover a typo in a seminal academic paper by Lovins that had remained undetected for 30 years.{{Cite web|url=http://snowball.tartarus.org/algorithms/lovins/festschrift.html|title=Lovins revisited|website=snowball.tartarus.org |author1=Martin Porter |access-date=6 August 2024 |date=December 2001}}

The basic datatypes handled by Snowball are strings of characters, signed integers, and boolean truth values, or more simply strings, integers and booleans. Snowball's characters are either 8 or 16 bits wide, depending on the mode of use. In particular, both ASCII and 16-bit Unicode are supported. As in the SNOBOL programming language, the flow of control in Snowball is arranged by the implicit use of signals (each statement returns a true or false value), rather than by the explicit use of constructs such as if, then, and break found in C and many other programming languages.[http://snowball.tartarus.org/compiler/snowman.html "Snowball Manual"], Martin Porter, web page. Retrieved 2 September 2014.
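
The idea of signal-driven control flow can be illustrated with a minimal sketch in Python. This is an analogy only, not Snowball code and not how the Snowball compiler works: the names <code>State</code>, <code>strip_suffix</code>, <code>either</code>, <code>attempt</code>, <code>sequence</code> and <code>stem</code> are hypothetical, and the suffix rules are invented for illustration rather than taken from any published stemmer. Each command returns a true or false signal, and the combinators decide what runs next based on those signals alone.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class State:
    """Hypothetical stand-in for the string being processed."""
    text: str


# A command takes the state and returns a signal: True (success) or False (failure).
Command = Callable[[State], bool]


def strip_suffix(suffix: str) -> Command:
    """Command that removes `suffix` if present; signals False and changes nothing otherwise."""
    def run(state: State) -> bool:
        if suffix and state.text.endswith(suffix):
            state.text = state.text[: -len(suffix)]
            return True
        return False
    return run


def either(*commands: Command) -> Command:
    """Try commands in order; stop at the first one that signals True."""
    def run(state: State) -> bool:
        return any(cmd(state) for cmd in commands)
    return run


def attempt(command: Command) -> Command:
    """Run a command but always signal True, so its failure does not propagate."""
    def run(state: State) -> bool:
        command(state)
        return True
    return run


def sequence(*commands: Command) -> Command:
    """Run commands in order; stop and signal False at the first failure."""
    def run(state: State) -> bool:
        return all(cmd(state) for cmd in commands)
    return run


def stem(word: str) -> str:
    """Toy suffix stripper built only from signal-returning commands."""
    state = State(word)
    program = sequence(
        attempt(either(strip_suffix("ing"), strip_suffix("ed"))),
        attempt(strip_suffix("s")),
    )
    program(state)
    return state.text
```

In this sketch, <code>stem</code> contains no explicit branching: which suffix rule applies is decided entirely by the signals passed between commands. Because a failing <code>strip_suffix</code> leaves the string untouched, no backtracking is needed here.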

Although the original [http://snowball.tartarus.org/ Snowball website] maintained by Dr. Martin Porter and colleague Richard Boulton has been closed since 2014 following Dr. Porter’s retirement,{{Cite web |last=Porter |first=Martin |title=Snowball - Credits |url=http://snowball.tartarus.org/credits.html |access-date=May 4, 2025}} the site remains accessible, and the language continues to be developed as [https://github.com/snowballstem a community project on GitHub]. Additionally, large projects such as the Natural Language Toolkit (NLTK) for Python employ Snowball along with stemming algorithms designed by Dr. Porter and other contributors to the Snowball language.{{Cite web |title=nltk.stem.SnowballStemmer Documentation |url=https://www.nltk.org/api/nltk.stem.SnowballStemmer.html |access-date=May 4, 2025 |website=Natural Language Toolkit}}{{Cite web |title=Source code for nltk.stem.snowball |url=https://www.nltk.org/_modules/nltk/stem/snowball.html |access-date=May 4, 2025 |website=Natural Language Toolkit}}

References

{{reflist}}