Primitive data type

{{Short description|Extremely basic data type}}

In computer science, primitive data types are a set of basic data types from which all other data types are constructed.{{cite book |last1=Stone |first1=R. G. |url=https://books.google.com/books?id=0k_xz8O2SewC&pg=PA18 |title=Program Construction |last2=Cooke |first2=D. J. |date=5 February 1987 |publisher=Cambridge University Press |isbn=978-0-521-31883-9 |page=18 |language=en-US}} More specifically, the term often refers to the limited set of data representations in use by a particular processor, which all compiled programs must use. Most processors support a similar set of primitive data types, although the specific representations vary.{{cite book |last1=Wikander |first1=Jan |last2=Svensson |first2=Bertil |title=Real-Time Systems in Mechatronic Applications |date=31 May 1998 |publisher=Springer Science & Business Media |isbn=978-0-7923-8159-4 |page=101 |url=https://books.google.com/books?id=fDCNR7VwG-AC&pg=PA101 |language=en}} More generally, the term may refer to the standard data types built into a programming language (built-in types).{{cite book |last1=Khurana |first1=Rohit |title=Data and File Structure (For GTU), 2nd Edition |publisher=Vikas Publishing House |isbn=978-93-259-6005-3 |page=2 |url=https://books.google.com/books?id=s0JDDAAAQBAJ&pg=PA2 |language=en}}{{cite book |last1=Chun |first1=Wesley |title=Core Python Programming |date=2001 |publisher=Prentice Hall Professional |isbn=978-0-13-026036-9 |page=77 |url=https://books.google.com/books?id=mh0bU6NXrBgC&pg=PA77 |language=en}} Data types that are not primitive are referred to as derived or composite.

Primitive types are almost always value types, but composite types may also be value types.{{cite book |last1=Olsen |first1=Geir |last2=Allison |first2=Damon |last3=Speer |first3=James |title=Visual Basic .NET Class Design Handbook: Coding Effective Classes |date=1 January 2008 |publisher=Apress |isbn=978-1-4302-0780-1 |page=80 |url=https://books.google.com/books?id=DUQnCgAAQBAJ&pg=PA80 |language=en}}

Common primitive data types

The most common primitive types are those used and supported by computer hardware, such as integers of various sizes, floating-point numbers, and Boolean logical values. Operations on such types are usually quite efficient. Primitive data types which are native to the processor have a one-to-one correspondence with objects in the computer's memory, and operations on these types are often the fastest available.{{cite web |last1=Fog |first1=Agner |title=Optimizing software in C++ |url=https://www.agner.org/optimize/optimizing_cpp.pdf#page=29 |access-date=28 January 2022 |page=29 |quote=Integer operations are fast in most cases, [...]}} Integer addition, for example, can be performed as a single machine instruction, and some processors offer instructions that process a sequence of characters with a single instruction.{{fact|date=March 2025}} The choice of primitive data type may nevertheless affect performance: for example, operating on an array of floats is faster with SIMD operations and data types.{{r|Agner|p=113}}

=Integer numbers=

{{Main|Integer (computer science)}}

An integer data type represents some range of mathematical integers. Integers may be either signed (allowing negative values) or unsigned (non-negative integers only). Common ranges are:

{| class="wikitable"
|-
! Size (bytes)
! Size (bits)
! Names
! Signed range (two's complement representation)
! Unsigned range
|-
| 1 byte
| 8 bits
| Byte, octet, minimum size of char in C99 (see CHAR_BIT in limits.h)
| −128 to +127
| 0 to 255
|-
| 2 bytes
| 16 bits
| x86 word, minimum size of short and int in C
| −32,768 to +32,767
| 0 to 65,535
|-
| 4 bytes
| 32 bits
| x86 double word, minimum size of long in C, actual size of int for most modern C compilers,{{cite web|url=http://www.agner.org/optimize/calling_conventions.pdf|title=Calling conventions for different C++ compilers and operating systems: Chapter 3, Data Representation |date=2010-02-16 |access-date=2010-08-30 |last=Fog |first=Agner}} pointer for IA-32-compatible processors
| −2,147,483,648 to +2,147,483,647
| 0 to 4,294,967,295
|-
| 8 bytes
| 64 bits
| x86 quadruple word, minimum size of long long in C, actual size of long for most modern C compilers, pointer for x86-64-compatible processors
| −9,223,372,036,854,775,808 to +9,223,372,036,854,775,807
| 0 to 18,446,744,073,709,551,615
|}
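
The ranges in the table follow directly from the bit width. As a minimal sketch (Python is used here only for illustration, and the function names signed_range and unsigned_range are hypothetical), an n-bit two's complement integer spans −2^(n−1) to 2^(n−1) − 1, and an n-bit unsigned integer spans 0 to 2^n − 1:

```python
def signed_range(bits: int) -> tuple[int, int]:
    """Range of a two's complement signed integer with the given bit width."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


def unsigned_range(bits: int) -> tuple[int, int]:
    """Range of an unsigned integer with the given bit width."""
    return 0, (1 << bits) - 1


# Print the ranges for the four widths listed in the table.
for bits in (8, 16, 32, 64):
    print(bits, signed_range(bits), unsigned_range(bits))
```

Python's own integers are arbitrary-precision, which is why the 64-bit limits can be computed here without overflow.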

=Floating-point numbers=

{{Main|Floating-point arithmetic}}

A floating-point number represents a limited-precision rational number that may have a fractional part. These numbers are stored internally in a format equivalent to scientific notation, typically in binary but sometimes in decimal. Because floating-point numbers have limited precision, only a subset of real or rational numbers are exactly representable; other numbers can be represented only approximately. Many languages have both a single-precision (often called float) and a double-precision type (often called double).
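
The effect of limited precision can be observed directly. The following sketch assumes a Python implementation whose float is an IEEE 754 double, which is the usual case; the helper name to_single is hypothetical. It rounds a value to single precision by packing it into 4 bytes and compares the result with the double-precision original:

```python
import struct
from fractions import Fraction


def to_single(x: float) -> float:
    """Round a double-precision float to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# 0.5 is a power of two, so both formats store it exactly.
print(to_single(0.5) == 0.5)

# 0.1 has no finite binary expansion, so each format stores a
# different approximation and the comparison fails.
print(to_single(0.1) == 0.1)

# The exact rational value that the double nearest to 0.1 actually holds.
print(Fraction(0.1))
```

Because every finite binary floating-point value is a rational number with a power-of-two denominator, Fraction can recover the stored value exactly.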

=Booleans=

{{Main|Boolean data type}}

A Boolean type, typically denoted bool or boolean, is a logical type that can have either the value true or the value false. Although only one bit is necessary to represent the two values true and false, programming languages typically implement Boolean types as one or more bytes.

Many languages (e.g. Java, Pascal and Ada) implement Booleans as a distinct logical type. Some languages, though, implicitly convert Booleans to numeric types in some contexts, either to give extended semantics to Booleans and Boolean expressions or to achieve backwards compatibility with earlier versions of the language. For example, versions of the C programming language up to and including ANSI C did not have a dedicated Boolean type. Instead, numeric values of zero are interpreted as false, and any other value is interpreted as true.{{cite book |first1= Brian W |last1= Kernighan |author-link1= Brian Kernighan |first2= Dennis M |last2= Ritchie |author-link2= Dennis Ritchie |page= [https://archive.org/details/cprogramminglang00kern/page/41 41] |title= The C Programming Language |edition= 1st |publisher= Prentice Hall |year= 1978 |location= Englewood Cliffs, NJ |isbn= 0-13-110163-3}} The newer C99 added a distinct Boolean type _Bool (the more intuitive name bool as well as the macros true and false can be included with stdbool.h),{{cite web|url=https://devdocs.io/c/types/boolean|access-date=October 15, 2020|title=Boolean type support library|website=devdocs.io}} and C++ supports bool as a built-in type and true and false as reserved words.{{cite web|url=https://www.geeksforgeeks.org/bool-data-type-in-c/|access-date=October 15, 2020|title=Bool data type in C++|website=GeeksforGeeks|date=5 June 2017}}
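
Both behaviours can be sketched briefly. The example below uses Python purely for illustration, and the function name c_style_truth is hypothetical. It mimics the C convention that zero is false and any other value is true, and then shows Python's own implicit conversion, where bool is a subclass of int:

```python
def c_style_truth(value: int) -> bool:
    """Interpret an integer as a C condition does: zero is false, anything else is true."""
    return value != 0


print(c_style_truth(0), c_style_truth(-7))

# Python's bool is a subclass of int, so Booleans convert
# implicitly to 0 and 1 when used in arithmetic.
print(isinstance(True, int))
print(True + True)
```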

Specific languages

=Java=

The Java virtual machine's set of primitive data types consists of:{{cite book |last1=Lindholm |first1=Tim |last2=Yellin |first2=Frank |last3=Bracha |first3=Gilad |last4=Buckley |first4=Alex |title=The Java® Virtual Machine Specification |date=13 February 2015 |url=https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-2.html#jvms-2.3 |chapter=Chapter 2. The Structure of the Java Virtual Machine}}

  • byte, short, int, long, char (integer types with a variety of ranges)
  • float and double, floating-point numbers with single and double precisions
  • boolean, a Boolean type with logical values true and false
  • returnAddress, a value referring to an executable memory address. This is not accessible from the Java programming language and is usually left out.{{cite book |last1=Cowell |first1=John |title=Essential Java Fast: How to write object oriented software for the Internet |date=18 February 1997 |publisher=Springer Science & Business Media |isbn=978-3-540-76052-8 |page=27 |url=https://books.google.com/books?id=5M9_fBX4QicC&pg=PA27 |language=en}}{{cite book |last1=Rakshit |first1=Sandip |last2=Panigrahi |first2=Goutam |title=A Hand Book of Objected Oriented Programming With Java |date=December 1995 |publisher=S. Chand Publishing |isbn=978-81-219-3001-7 |page=11 |url=https://books.google.com/books?id=aAsbEAAAQBAJ&pg=PA11 |language=en}}

= C basic types =

{{Main|C data types#Basic types}}

The set of basic C data types is similar to Java's. Minimally, there are four types, char, int, float, and double, but the qualifiers short, long, signed, and unsigned mean that C contains numerous target-dependent integer and floating-point primitive types.{{cite book |last1=Kernighan |first1=Brian W.|last2=Ritchie|first2=Dennis M. |title=The C programming language|chapter=2.2 Data Types and Sizes |date=1988 |location=Englewood Cliffs, N.J. |isbn=0131103709 |page=36 |edition=Second}} C99 extended this set by adding the Boolean type _Bool and allowing the modifier long to be used twice in combination with int (e.g. long long int).{{cite book | url=https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf | title=ISO/IEC 9899:1999 specification, TC3 | at=p. 255, § 6.2.5 Types}}
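
Because these sizes are target-dependent, the only portable way to learn them is to ask the platform. As a rough sketch, Python's standard ctypes module exposes the C types of the platform's C compiler, so the sizes can be queried without writing C; the names C_TYPES and c_type_sizes are hypothetical, and the output will differ between platforms:

```python
import ctypes

# Map each C type name to the corresponding ctypes type.
C_TYPES = {
    "char": ctypes.c_char,
    "short": ctypes.c_short,
    "int": ctypes.c_int,
    "long": ctypes.c_long,
    "long long": ctypes.c_longlong,
    "float": ctypes.c_float,
    "double": ctypes.c_double,
}


def c_type_sizes() -> dict[str, int]:
    """Return the size in bytes of each basic C type on the current platform."""
    return {name: ctypes.sizeof(ctype) for name, ctype in C_TYPES.items()}


for name, size in c_type_sizes().items():
    print(f"{name}: {size} bytes")
```

On a typical 64-bit Linux system long is reported as 8 bytes, whereas 64-bit Windows reports 4, which illustrates why C code should not assume a particular width for these types.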

= XML Schema =

The XML Schema Definition language provides a set of 19 primitive data types:{{cite web |last1=Biron |first1=Paul V. |last2=Malhotra |first2=Ashok |title=XML Schema Part 2: Datatypes |url=https://www.w3.org/TR/xmlschema-2/#built-in-primitive-datatypes |website=www.w3.org |access-date=29 January 2022 |edition=Second}}

  • string: a string, a sequence of Unicode code points
  • boolean: a Boolean
  • decimal: a number represented with decimal notation
  • float and double: floating-point numbers
  • duration, dateTime, time, date, gYearMonth, gYear, gMonthDay, gDay, and gMonth: Calendar dates and times
  • hexBinary and base64Binary: binary data encoded as hexadecimal or Base64
  • anyURI: a URI
  • QName: a qualified name
  • NOTATION: a QName declared as a notation in the schema. Notations are used to embed non-XML data types.{{cite news |last1=Phillips |first1=Lee Anne |title=Declaring a NOTATION {{!}} Understanding XML Document Type Definitions |url=https://www.informit.com/articles/article.aspx?p=24992&seqNum=5 |access-date=29 January 2022 |work=www.informit.com |date=18 January 2002}} This type cannot be used directly; only derived types that enumerate a limited set of QNames may be used.

= JavaScript =

In JavaScript, there are 7 primitive data types: string, number, bigint, boolean, symbol, undefined, and null.{{cite web |title=Primitive - MDN Web Docs Glossary: Definitions of Web-related terms |url=https://developer.mozilla.org/en-US/docs/Glossary/Primitive |date=8 June 2023 |publisher=MDN}} Their values are considered immutable. These are not objects and have no methods or properties; however, all primitives except undefined and null have object wrappers.{{cite web |url=https://developer.mozilla.org/en-US/docs/Web/JavaScript/Data_structures#primitive_values |title=JavaScript data types and data structures |publisher=MDN |date=9 July 2024}}

= Visual Basic .NET =

In Visual Basic .NET, the primitive data types consist of 4 integral types, 2 floating-point types, a 16-byte decimal type, a Boolean type, a date/time type, a Unicode character type, and a Unicode string type.{{cite web |title=Types in Visual Basic |url=https://docs.microsoft.com/en-us/dotnet/visual-basic/reference/language-specification/types#primitive-types |website=Microsoft Docs |access-date=18 May 2022 |language=en-us |date=18 September 2021}}

= Rust =

Rust has primitive unsigned and signed fixed-width integers, named u or i respectively followed by a bit width that is a power of two between 8 and 128. This gives the types u8, u16, u32, u64, u128, i8, i16, i32, i64 and i128.{{Cite web |title=Data Types - The Rust Programming Language |url=https://doc.rust-lang.org/book/ch03-02-data-types.html |access-date=2023-10-17 |website=doc.rust-lang.org}} Also available are the types usize and isize, which are unsigned and signed integers with the same bit width as a pointer; usize is used for indices into arrays and indexable collection types.

Rust also has:

  • bool for the Boolean type.
  • f32 and f64 for 32- and 64-bit floating-point numbers.
  • char for a Unicode character. Internally, a char is stored as an unsigned 32-bit integer holding the character's code point, but only values that are valid Unicode scalar values are permitted.
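
The restriction on char values can be stated precisely: a Unicode scalar value is any code point from 0 to 0x10FFFF except the surrogate range 0xD800 to 0xDFFF. The sketch below expresses that rule in Python for illustration; it is not Rust's implementation, and the function name is_unicode_scalar_value is hypothetical:

```python
def is_unicode_scalar_value(n: int) -> bool:
    """True if n is a Unicode scalar value: a code point outside the surrogate range."""
    return 0 <= n <= 0x10FFFF and not (0xD800 <= n <= 0xDFFF)


print(is_unicode_scalar_value(ord("A")))   # an ordinary character
print(is_unicode_scalar_value(0xD800))     # a surrogate code point
print(is_unicode_scalar_value(0x110000))   # beyond the Unicode code space
```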

Built-in types

Built-in types are distinguished from others by having specific support in the compiler or runtime, to the extent that it would not be possible to simply define them in a header file or standard library module.{{cite web |title=Built-in types (C++) |url=https://learn.microsoft.com/en-us/cpp/cpp/fundamental-types-cpp?view=msvc-170 |website=learn.microsoft.com |language=en-us |date=17 August 2021}} Besides integers, floating-point numbers, and Booleans, other built-in types include:

=Characters and strings=

A character type is a type that can represent all Unicode characters, and hence must be at least 21 bits wide. Some languages such as Julia include a true 32-bit Unicode character type as primitive.{{cite web |title=Strings · The Julia Language |url=https://docs.julialang.org/en/v1/manual/strings/#man-characters |website=docs.julialang.org |access-date=29 January 2022}} Other languages such as JavaScript, Python, Ruby, and many dialects of BASIC do not have a primitive character type but instead provide strings as a primitive data type, typically using an encoding such as UTF-8 or UTF-16. Strings with a length of one are normally used to represent single characters.

Some languages have character types that are too small to represent all Unicode characters. These are more properly categorized as integer types that have been given a misleading name. For example, C includes a char type, but it is defined to be the smallest addressable unit of memory, which several standards (such as POSIX) require to be 8 bits. Recent versions of these standards refer to char as a numeric type. Java also uses the name char for a 16-bit integer type, but again this is not a Unicode character type.{{cite web |last1=Mansoor |first1=Umer |title=The char Type in Java is Broken |url=https://codeahoy.com/2016/05/08/the-char-type-in-java-is-broken/ |website=CodeAhoy |date=8 May 2016 |access-date=10 February 2020 |ref=3}}
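
The size limits discussed above can be checked with a short sketch. It is written in Python for illustration, and the helper name utf16_code_units is hypothetical. The largest Unicode code point, U+10FFFF, needs 21 bits, and a code point above U+FFFF does not fit in a single 16-bit unit and occupies two of them in UTF-16:

```python
def utf16_code_units(ch: str) -> int:
    """Number of 16-bit code units needed to encode one code point in UTF-16."""
    return len(ch.encode("utf-16-le")) // 2


MAX_CODE_POINT = 0x10FFFF

# Number of bits needed to hold the largest code point.
print(MAX_CODE_POINT.bit_length())

# A code point above U+FFFF: one character, several storage units.
emoji = "\U0001F600"
print(len(emoji), utf16_code_units(emoji), len(emoji.encode("utf-8")))
```

A Python string of length one therefore always holds one code point, whereas an 8-bit or 16-bit char may hold only part of one.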

The term string does not always refer to a sequence of Unicode characters either; it may instead refer to a sequence of bytes. For example, x86-64 has string instructions to move, set, search, or compare a sequence of items, where an item could be 1, 2, 4, or 8 bytes long.{{cite web|title=I/O and string instructions|access-date=29 January 2022|url=http://linasm.sourceforge.net/docs/instructions/cpu.php#bit}}

See also

  • {{annotated link|Language primitive}}
  • {{section link|List of data structures|Data types}}
  • {{annotated link|Object type}}
  • {{annotated link|Primitive wrapper class}}
  • {{annotated link|Variable (computer science)}}

References

{{reflist}}