Interning (computer science)

{{About|the computer science usage|professional training programs|Internship}}

{{Short description|Data structure for reusing strings}}

In computer science, interning is re-using objects of equal value on-demand instead of creating new objects. This creational pattern{{Cite web|title=Design Patterns|url=https://courses.cs.washington.edu/courses/cse331/11wi/lectures/lect09-design-patterns-1.pdf|website=University of Washington}} is frequently used for numbers and strings in different programming languages. In many object-oriented languages such as Python, even primitive types such as integer numbers are objects. To avoid the overhead of constructing a large number of integer objects, these objects get reused through interning.{{Cite book|last=Jaworski|first=Michał|url=https://www.worldcat.org/oclc/1125343555|title=Expert Python programming : become a master in Python by learning coding best practices and advanced programming concepts in Python 3.7|date=2019|others=Tarek Ziadé|isbn=978-1-78980-677-9|edition=Third|location=Birmingham, U.K.|oclc=1125343555}}

For interning to work the interned objects must be immutable, since state is shared between multiple variables. String interning is a common application of interning, where many strings with identical values are needed in the same program.

History

Lisp introduced the notion of interned strings for its symbols. The LISP 1.5 Programmers Manual {{Cite book|last=Levin|first=Michael I.|url=https://www.worldcat.org/oclc/1841373|title=LISP 1.5 programmer's manual : the Computation Center and Research Laboratory of Electronics, Massachusetts Institute of Technology|date=1965|publisher=M.I.T. Press|others=John McCarthy, Massachusetts Institute of Technology. Computation Center, Massachusetts Institute of Technology. Research Laboratory of Electronics|isbn=0-262-13011-4|edition=2nd|location=Cambridge|oclc=1841373}} describes a function called intern which either evaluates to an existing symbol of the supplied name, or if none exists, creates a new symbol of that name. This idea of interned symbols persists in more recent dialects of Lisp, such as Clojure in special forms such a (def symbol) which perform symbol creation and interning.{{Cite web|title=Clojure - Special Forms|url=https://clojure.org/reference/special_forms#def|access-date=2021-11-04|website=clojure.org}}

In the object-oriented programming paradigm interning is an important mechanism in the flyweight pattern, where an interning method is called to store the intrinsic state of an object such that this can be shared among different objects which share different extrinsic state, avoiding needless duplication.{{Cite book|url=https://www.worldcat.org/oclc/31171684|title=Design patterns : elements of reusable object-oriented software|date=1995|publisher=Addison-Wesley|others=Erich Gamma, Richard Helm, Ralph E. Johnson, John Vlissides|isbn=0-201-63361-2|location=Reading, Mass.|oclc=31171684}}

Interning continues to be an important technique for managing memory use in programming language implementations; for example, the Java Language Specification requires that identical string literals (that is, literals that contain the same sequence of code points) must refer to the same instance of class String, because string literals are "interned" so as to share unique instances.{{Cite web|title=Java Language Specification. Chapter 3. Lexical Structure|url=https://docs.oracle.com/javase/specs/jls/se7/html/jls-3.html#jls-3.10.5|access-date=2021-11-04|website=docs.oracle.com}} In the Python programming language small integers are interned,{{Cite web|title=PEP 237 -- Unifying Long Integers and Integers|url=https://www.python.org/dev/peps/pep-0237/|access-date=2021-11-04|website=Python.org|language=en}} though the details of exactly which are dependent on language version.

Motivation

Interning saves memory and can thus improve performance and memory footprint of a program.{{Cite book|last=Oaks|first=Scott|url=https://www.worldcat.org/oclc/878059649|title=Java performance : the definitive guide|date=2014|publisher=O'Reilly Media|isbn=978-1-4493-6354-3|location=Sebastopol, CA|oclc=878059649}} The downside is time required to search for existing values of objects which are to be interned.

See also

References