Trimming (computer programming)
In computer programming, trimming (trim) or stripping (strip) is a string manipulation in which leading and trailing whitespace is removed from a string.
For example, the string (enclosed by apostrophes) '  this is a test  ' would be changed, after trimming, to 'this is a test'.
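As a minimal sketch, the operation can be reproduced in Python with the built-in str.strip method. The input string here is only an illustrative value.

```python
# Illustrative input: two leading and two trailing spaces.
original = "  this is a test  "

# str.strip() returns a new string; `original` itself is left unchanged.
trimmed = original.strip()

print(repr(original))
print(repr(trimmed))
```

Printing with repr makes the surrounding quotes visible, so the removed whitespace is easy to see.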
Variants
=Left or right trimming=
The most popular variants of the trim function strip only the beginning or only the end of the string. They are typically named ltrim and rtrim respectively, or, in the case of Python, lstrip and rstrip. C# uses TrimStart and TrimEnd, and Common Lisp uses string-left-trim and string-right-trim. Pascal does not have these variants built in, and Java lacked them until version 11 introduced stripLeading and stripTrailing; Object Pascal (Delphi) has TrimLeft and TrimRight functions.{{cite web|url=https://www.freepascal.org/docs-html/rtl/sysutils/trim.html |title=Trim |publisher=Freepascal.org |date=2013-02-02 |access-date=2013-08-24}}
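The one-sided variants can be illustrated with Python's lstrip and rstrip. This is a small sketch with a made-up input string.

```python
# Illustrative input with a leading tab and spaces and a trailing newline.
padded = "\t  left and right  \n"

left_trimmed = padded.lstrip()   # removes leading whitespace only
right_trimmed = padded.rstrip()  # removes trailing whitespace only

print(repr(left_trimmed))
print(repr(right_trimmed))
```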
=Whitespace character list parameterization=
Many trim functions have an optional parameter to specify a list of characters to trim, instead of the default whitespace characters. For example, PHP and Python allow this optional parameter, while Pascal and Java do not. With Common Lisp's string-trim function, the parameter (called character-bag) is required. The C++ Boost library defines space characters according to locale, as well as offering variants with a predicate parameter (a functor) to select which characters are trimmed.
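A short Python sketch of the optional character-list parameter follows. The argument is treated as a set of characters rather than as a prefix or suffix, and once it is supplied, whitespace is no longer trimmed by default. The sample strings are illustrative.

```python
# The argument is a set of characters to remove from both ends.
path = "//usr/local//".strip("/")

# Every leading and trailing character found in "w.com" is removed,
# regardless of order, until a character outside the set is reached.
host = "www.example.com".strip("w.com")

# With an explicit character list, surrounding whitespace is kept.
dashed = "  --x--  ".strip("-")

print(repr(path), repr(host), repr(dashed))
```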
=Special empty string return value=
An uncommon variant of trim returns a special result if no characters remain after the trim operation. For example, the StringUtils class of Apache Commons Lang (formerly Jakarta Commons Lang) has a function called stripToNull which returns null in place of an empty string.
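The same idea can be sketched in Python, with None playing the role of null. The function name strip_to_none is hypothetical, and the whitespace definition used is Python's own, which may differ from that of the Java library.

```python
from typing import Optional


def strip_to_none(text: Optional[str]) -> Optional[str]:
    """Hypothetical helper: trim `text`, returning None if nothing remains."""
    if text is None:
        return None
    stripped = text.strip()
    # An empty string is falsy, so `or` maps it to None.
    return stripped or None


print(strip_to_none("  abc  "))
print(strip_to_none("     "))
```

Returning a distinct "no value" marker lets callers treat blank input and missing input the same way.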
=Space normalization=
Space normalization is a related string manipulation in which, in addition to removing surrounding whitespace, any sequence of whitespace characters within the string is replaced with a single space. Space normalization is performed by the function named Trim() in spreadsheet applications (including Excel, Calc, Gnumeric, and Google Docs), and by the normalize-space() function in XSLT and XPath.
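Space normalization can be sketched in Python by splitting on whitespace and rejoining with single spaces. The function name normalize_space is chosen here for illustration, and the set of characters counted as whitespace is Python's, which may not match the spreadsheet or XPath functions exactly.

```python
def normalize_space(text: str) -> str:
    """Trim `text` and collapse each internal whitespace run to one space."""
    # str.split() with no arguments splits on runs of whitespace and
    # discards empty leading and trailing fields.
    return " ".join(text.split())


print(repr(normalize_space("  a \t b\n\nc  ")))
```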
=In-place trimming=
While most algorithms return a new (trimmed) string, some alter the original string in-place. Notably, the Boost library allows either in-place trimming or a trimmed copy to be returned.
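Python strings are immutable, so the in-place approach is sketched here on a mutable bytearray instead. The function name trim_in_place and the choice of ASCII whitespace bytes are assumptions made for this example.

```python
# Assumed whitespace set: space, tab, LF, CR, vertical tab, form feed.
ASCII_WHITESPACE = b" \t\n\r\x0b\x0c"


def trim_in_place(buf: bytearray) -> None:
    """Remove leading and trailing ASCII whitespace from `buf` itself."""
    # Trim the tail first so that fewer bytes need shifting afterwards.
    end = len(buf)
    while end > 0 and buf[end - 1] in ASCII_WHITESPACE:
        end -= 1
    del buf[end:]

    start = 0
    while start < len(buf) and buf[start] in ASCII_WHITESPACE:
        start += 1
    del buf[:start]


data = bytearray(b"  hi there \n")
alias = data            # a second reference to the same object
trim_in_place(data)
print(alias)            # the alias sees the trimmed contents
```

By contrast, bytearray.strip() returns a trimmed copy and leaves the original buffer untouched.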
Definition of whitespace
The characters that are considered whitespace vary between programming languages and implementations. For example, C traditionally only counts space, tab, line feed, and carriage return characters, while languages which support Unicode typically include all Unicode space characters. Some implementations also include ASCII control codes (non-printing characters) along with whitespace characters.
Java's trim method considers ASCII spaces and control codes as whitespace, contrasting with the Java isWhitespace() method,{{cite web|url=https://java.sun.com/j2se/1.5.0/docs/api/java/lang/Character.html#isWhitespace(char) |title=Character (Java 2 Platform SE 5.0) |publisher=Java.sun.com |access-date=2013-08-24}} which recognizes all Unicode space characters.
Delphi's Trim function considers characters U+0000 (NULL) through U+0020 (SPACE) to be whitespace.
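The effect of these differing definitions can be sketched in Python. The hypothetical function trim_through_space below treats every character from U+0000 through U+0020 as trimmable, in the manner described above, and is compared with Python's own str.strip, which uses a Unicode-based definition.

```python
def trim_through_space(text: str) -> str:
    """Hypothetical trim that removes characters with code point <= U+0020."""
    start, end = 0, len(text)
    while start < end and text[start] <= "\u0020":
        start += 1
    while end > start and text[end - 1] <= "\u0020":
        end -= 1
    return text[start:end]


# NUL, BEL and a space before the text; EM SPACE (U+2003) after it.
sample = "\x00\x07 data\u2003"

control_code_result = trim_through_space(sample)  # keeps the EM SPACE
unicode_result = sample.strip()                   # keeps NUL, BEL and the space

print(repr(control_code_result))
print(repr(unicode_result))
```

The same input therefore yields two different results, depending on which characters the implementation regards as whitespace.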
=Non-space blanks=
The Braille Patterns Unicode block contains {{unichar|2800|Braille pattern blank|html=}}, a Braille pattern with no dots raised.
The Unicode standard explicitly states that it does not act as a space.
The non-breaking space {{unichar|00A0|NO-BREAK SPACE|html=}} can also be treated as a non-space character for trimming purposes.
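As an illustration, Python's default str.strip leaves the Braille pattern blank alone but does remove the no-break space. Treating the no-break space as a non-space character requires passing an explicit character list, as in this sketch. The constant names are chosen for the example.

```python
BRAILLE_BLANK = "\u2800"
NO_BREAK_SPACE = "\u00a0"
# Assumed trim set that deliberately excludes the no-break space.
ASCII_WHITESPACE = " \t\n\r\x0b\x0c"

# The Braille pattern blank is not classified as whitespace, so it survives.
braille_kept = (BRAILLE_BLANK + "x" + BRAILLE_BLANK).strip()

# By default, Python counts the no-break space as whitespace and removes it.
nbsp_removed = (NO_BREAK_SPACE + "x" + NO_BREAK_SPACE).strip()

# With an explicit character list, the no-break space is preserved.
nbsp_kept = (NO_BREAK_SPACE + " x ").strip(ASCII_WHITESPACE)

print(repr(braille_kept), repr(nbsp_removed), repr(nbsp_kept))
```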
Usage
{{Main|Comparison of programming languages (string functions)#trim}}
References
{{Reflist}}
External links
- [https://www.tcl.tk/man/tcl8.4/TclCmd/string.htm#M46 Tcl: string trim]
- [https://blog.stevenlevithan.com/archives/faster-trim-javascript Faster JavaScript Trim] - compares various JavaScript trim implementations
- [http://webwidetutor.com/php/PHP-Change-String-value-behaviour-or-look-?id=8 PHP string cut and trimming]