SoftWare Hash IDentifier
{{Short description|Software identifier}}
{{Italic title}}
{{Infobox identifier
| name =
| image =
| image_size =
| image_caption =
| image_alt =
| image_border =
| image_class =
| image_style =
| full_name = SoftWare Hash IDentifier
| acronym = SWHID
| number =
| start_date =
| organisation =
| digits =
| check_digit =
| example = [https://archive.softwareheritage.org/swh:1:dir:df32c75242bf8d797ccd43af8ce8e294f35cd8fd swh:1:dir:df32c75242bf8d797ccd43af8ce8e294f35cd8fd]
| website = {{official website|name=swhid.org}}
}}
The SoftWare Hash IDentifier (SWHID) is a persistent identifier that uniquely identifies a particular piece of software source code and its version. The standard is similar to the DOI but is tailored specifically to software source code and is compatible with version control systems such as Git.
An SWHID can be used to point to different components or versions of the source code of a software package. The SWHID is an intrinsic identifier in the sense that it describes the software based only on the software's intrinsic properties, with no reliance on an external register.{{Cite web |language=en |title=Intrinsic and Extrinsic identifiers |url=https://www.softwareheritage.org/2020/07/09/intrinsic-vs-extrinsic-identifiers/ |website=Software Heritage |access-date=2025-05-24}}
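The intrinsic nature of the identifier can be illustrated with a minimal Python sketch. It assumes that a "content" object is hashed the same way Git hashes a blob (SHA-1 over a short header followed by the raw bytes), which is what makes the identifier compatible with Git; the function name <code>content_swhid</code> is hypothetical and chosen only for illustration.

```python
import hashlib


def content_swhid(data: bytes) -> str:
    """Sketch: derive a version-1 content SWHID from raw bytes alone.

    Assumption: content objects use Git-style blob hashing, i.e.
    SHA-1 over b"blob <length>\\0" followed by the data itself.
    """
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    digest = hashlib.sha1(header + data).hexdigest()
    return f"swh:1:cnt:{digest}"


if __name__ == "__main__":
    # The same bytes always yield the same identifier, with no registry lookup.
    print(content_swhid(b"print('hello')\n"))
```

Because the identifier is derived from the bytes themselves, anyone holding a copy of the file can recompute it and check that it matches.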
Format
The SWHID specification allows identifying different components of software source code. Object types relating to the software version are labelled as "snapshot", "release" or "revision"; a "directory" of files and possibly subdirectories can be identified; and a specific piece of a specific version of source code can be labelled as "content".{{cite Q|Q134581061|url-status=live|trans-title=Preserving and identifying research software with Software Heritage}} These are related to one another in a Merkle directed acyclic graph.{{cite Q|Q105094730|url-status=live}}
The identifier has the following syntax:
swh:<schema_version>:<object_type>:<object_id>
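As a rough illustration, the core identifier can be checked and split with a regular expression. The sketch below assumes the layout seen in the examples in this article: the prefix <code>swh</code>, a numeric schema version, a three-letter object type tag (<code>cnt</code>, <code>dir</code>, <code>rev</code>, <code>rel</code> or <code>snp</code>) and a 40-character lowercase hexadecimal hash, separated by colons. The names <code>CoreSWHID</code> and <code>parse_core_swhid</code> are hypothetical.

```python
import re
from typing import NamedTuple

# Assumed mapping from three-letter tags to the object types named above.
OBJECT_TYPES = {
    "cnt": "content",
    "dir": "directory",
    "rev": "revision",
    "rel": "release",
    "snp": "snapshot",
}

CORE_RE = re.compile(
    r"swh:(?P<version>[0-9]+):(?P<type>cnt|dir|rev|rel|snp):(?P<hash>[0-9a-f]{40})"
)


class CoreSWHID(NamedTuple):
    schema_version: int
    object_type: str
    object_id: str


def parse_core_swhid(text: str) -> CoreSWHID:
    """Split a core SWHID (no qualifiers) into its three fields."""
    match = CORE_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a core SWHID: {text!r}")
    return CoreSWHID(
        schema_version=int(match["version"]),
        object_type=OBJECT_TYPES[match["type"]],
        object_id=match["hash"],
    )


if __name__ == "__main__":
    print(parse_core_swhid("swh:1:dir:df32c75242bf8d797ccd43af8ce8e294f35cd8fd"))
```

This sketch only validates the shape of the identifier; it does not check that the hash refers to an object that actually exists in any archive.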
= Examples =
According to the French National Centre for Scientific Research (CNRS), the software archived with SWHIDs includes the source code of the Apollo 11 navigation software and of the NCSA Mosaic web browser.
Version 3.0 of the Linux kernel, released in July 2011, has the following SWHID:{{Cite web |language=en |title=Release v3.0 of torvalds/linux repository |url=https://archive.softwareheritage.org/browse/release/4204bcde7c0b93c5e127eb868e17b337a513cf34/?origin_url=https://github.com/torvalds/linux&release=v3.0&snapshot=130eecc6bd74794737bb078fe5c3fadd034eddcc |website=Software Heritage |access-date=2025-05-24}}
swh:1:dir:df32c75242bf8d797ccd43af8ce8e294f35cd8fd
The following example, drawn from the specification documentation,{{Cite web |language=en |title=Qualified identifiers |url=https://www.swhid.org/specification/v1.2/6.Qualified_identifiers/ |website=swhid.org |access-date=2025-05-27}} illustrates the use of multiple qualifiers in an SWHID:
swh:1:cnt:4d99d2d18326621ccdd70f5ea66c2e2ac236ad8b;origin=https://gitorious.org/ocamlp3l/ocamlp3l_cvs.git;visit=swh:1:snp:d7f1b9eb7ccb596c2622c4780febaa02549830f9;anchor=swh:1:rev:2db189928c94d62a3b4757b3eec68f0a4d4113f0;path=/Examples/SimpleFarm/simplefarm.ml;lines=9-15
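A qualified identifier like the one above can be taken apart with a few lines of Python. This is a minimal sketch that assumes qualifiers are appended to the core identifier as semicolon-separated <code>key=value</code> pairs, as in the example; it does not handle percent-encoded characters inside values, and the function name <code>split_qualifiers</code> is hypothetical.

```python
from typing import Dict, Tuple


def split_qualifiers(swhid: str) -> Tuple[str, Dict[str, str]]:
    """Sketch: separate the core identifier from its key=value qualifiers."""
    core, *parts = swhid.split(";")
    qualifiers: Dict[str, str] = {}
    for part in parts:
        # Split on the first "=" only, since values such as URLs may contain "=".
        key, sep, value = part.partition("=")
        if not key or not sep:
            raise ValueError(f"malformed qualifier: {part!r}")
        qualifiers[key] = value
    return core, qualifiers


if __name__ == "__main__":
    example = (
        "swh:1:cnt:4d99d2d18326621ccdd70f5ea66c2e2ac236ad8b"
        ";origin=https://gitorious.org/ocamlp3l/ocamlp3l_cvs.git"
        ";visit=swh:1:snp:d7f1b9eb7ccb596c2622c4780febaa02549830f9"
        ";anchor=swh:1:rev:2db189928c94d62a3b4757b3eec68f0a4d4113f0"
        ";path=/Examples/SimpleFarm/simplefarm.ml"
        ";lines=9-15"
    )
    core, qualifiers = split_qualifiers(example)
    print(core)
    for key, value in qualifiers.items():
        print(f"  {key}: {value}")
```

In this reading, the core identifier names the object itself, while the qualifiers record where it was found (origin, visit, anchor, path) and which lines are of interest.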
Standards
SWHID is an open standard licensed under the Community Specification License.{{Cite web |language=en |title=Copyright Section of SWHID Specification v1.2 |url=https://www.swhid.org/specification/v1.2/ |access-date=2025-05-24}}
SWHID was formalized as the ISO/IEC 18670 standard in April 2025.{{Cite web |language=en |title=ISO/IEC 18670:2025 |url=https://www.iso.org/standard/89985.html |website=ISO |access-date=2025-05-24}}
Creation and history
The SoftWare Hash IDentifier was developed by Software Heritage. Software Heritage's archives, identified by their SWHIDs, were publicly released starting in 2018.{{cite Q|Q134581205|url-status=live|trans-title=The CNRS supports Software Heritage}}
{{as of|2020}}, SWHIDs were in use for about nine billion versions of pieces of software, termed "artefacts".{{cite Q|Q134580517|url-status=live}} SWHIDs are integrated with research repositories including HAL, Zenodo and the French catalog of Academic Research Free Software.{{Cite web |language=en |title=About the site |url=https://logiciels.catalogue-esr.fr/readme |website=French Catalog of Academic Research Free Software |access-date=2025-05-24}} The identifier can also be used by package managers: Guix uses SWHIDs to retrieve source code from a software archive when it is unavailable at its original URL.{{Cite web |language=en |title=Identifying software |url=https://guix.gnu.org/fr/blog/2024/identifying-software |website=GNU Guix Blog |access-date=2025-05-27}}
The acronym SWHID originally referred to "Software Heritage Identifiers", which were used to catalog software artifacts in the early days of the Software Heritage archive.{{Cite web |language=en |title=SoftWare Hash IDentifier (SWHID) |url=https://www.softwareheritage.org/software-hash-identifier-swhid/ |website=Software Heritage |access-date=2025-05-24}} It later evolved into an open standard through a dedicated working group{{Cite web |language=en |title=SWHID working group |url=https://www.swhid.org/ |access-date=2025-05-24}} and was standardized as ISO/IEC 18670 in April 2025 under the more general name "Software Hash Identifier".{{Cite web |language=en |title=ISO/IEC 18670:2025 |url=https://www.iso.org/standard/89985.html |website=ISO |access-date=2025-05-24}}
Télécom Paris welcomed the ISO standardization, arguing that it is a significant step for global digital infrastructure and provides traceability of software affected by vulnerabilities.{{cite Q|Q134580605|url-status=live|trans-title=A significant advance for global digital infrastructure: the ISO/IEC 18670 standard is now official}} UNESCO stated that SWHID is useful for the reproducibility and long-term accessibility of software.{{cite Q|Q134581397|url-status=live}}
References
{{Reflist}}
External links
- {{Official website}}
- [https://www.iso.org/obp/ui/en/#iso:std:iso-iec:18670:ed-1:v1:en ISO/IEC 18670:2025 Specification v1.2]
{{ISO standards}}
{{comp-sci-stub}}