Archive file#Archive formats

{{Short description|File with the content of one or more files with their associated metadata}}

In computing, an archive file stores the content of one or more files, possibly compressed, with associated metadata such as file name, directory structure, error detection and correction information, commentary, compressed data archives, storage, and sometimes encryption. An archive file is often used to facilitate portability, distribution and backup, and to reduce storage use.{{Cite web |title=Archive File: What it's Used For |url=https://www.lifewire.com/what-is-an-archive-file-2625792 |access-date=2022-06-17 |website=Lifewire |language=en |archive-date=2024-07-11 |archive-url=https://web.archive.org/web/20240711021756/https://www.lifewire.com/what-is-an-archive-file-2625792 |url-status=live }}{{Cite web |date=2015-02-07 |title=Archive files |url=https://www.ibm.com/docs/en/zos/2.1.0?topic=routine-archive-files |access-date=2022-06-17 |website=www.ibm.com |language=en-us |archive-date=2023-09-07 |archive-url=https://web.archive.org/web/20230907001929/https://www.ibm.com/docs/en/zos/2.1.0?topic=routine-archive-files |url-status=live }}{{Cite web |date=2015-03-23 |title=What is Archiving And Why is it Important? |url=https://www.securedatamgt.com/blog/what-is-archiving/ |access-date=2022-06-17 |website=Secure Data MGT |language=en |archive-date=2022-05-24 |archive-url=https://web.archive.org/web/20220524041509/https://www.securedatamgt.com/blog/what-is-archiving/ |url-status=live }}

Applications

= Portability =

As an archive file stores file system information, including file content and metadata, it can be leveraged for file system content portability across heterogeneous systems. For example, a directory tree can be sent via email, files with unsupported names on the target system can be renamed during extraction, timestamps can be retained rather than lost during data transmission.{{Cite web |title=Data Portability and Platform Competition {{!}} Is User Data Exported From Facebook Actually Useful to Competitors? |url=https://archive.org/details/data_portability_and_platform_competition_-_is_user_data_exported_from_facebook_ |access-date=June 17, 2022 |website=Archive.org |pages=22}} Also, transfer of a single archive file may be faster than processing multiple files due to per-file overhead,{{Cite web |date=2020-06-17 |title=Why file transfer speeds of small vs large files could be different |url=https://kb.netapp.com/Advice_and_Troubleshooting/Data_Storage_Software/ONTAP_OS/Why_file_transfer_speeds_of_small_vs_large_files_could_be_different |access-date=2022-06-17 |website=NetApp Knowledge Base |language=en |archive-date=2022-01-01 |archive-url=https://web.archive.org/web/20220101141232/https://kb.netapp.com/Advice_and_Troubleshooting/Data_Storage_Software/ONTAP_OS/Why_file_transfer_speeds_of_small_vs_large_files_could_be_different |url-status=live }}{{Cite web |date=2018-10-10 |title=Why Small Files Take Longer to Copy Than Large Files |url=https://www.dq-int.co.uk/blog/why-small-files-take-longer-to-copy-than-large-files/ |access-date=2022-06-17 |website=Dataquest |language=en-GB |archive-date=2022-07-02 |archive-url=https://web.archive.org/web/20220702161208/https://www.dq-int.co.uk/blog/why-small-files-take-longer-to-copy-than-large-files/ |url-status=live }} and even faster if compressed.

= Software distribution =

Beyond archiving, archive files are often used for software distribution. When used in connection with a package manager, an archive must conform to a package format and is called a package. In particular, the format usually requires a manifest file.{{Cite web |first=Amit |last=Ashbel |title=Data Archiving: The Basics and 5 Best Practices |url=https://cloud.netapp.com/blog/clc-blg-data-archiving-the-basics-and-5-best-practices |access-date=2022-06-17 |website=cloud.netapp.com |language=en-us |archive-date=2022-01-19 |archive-url=https://web.archive.org/web/20220119105316/https://cloud.netapp.com/blog/clc-blg-data-archiving-the-basics-and-5-best-practices |url-status=live }} Examples include deb for Debian, JAR for Java, APK for Android, and self-extracting Windows Installer executables.

Features

Notable features supported for various archives include:

  • Concatenate multiple files in a single file
  • Store file metadata as data, including file name, timestamps, permissions, source storage, notes and description
  • Compression
  • Encryption
  • Error detection via checksums
  • Error correction code to fix errors
  • Splitting a large file into multiple, smaller files
  • File patches/updates (when recording changes since a previous archive)
  • Self-extraction
  • Self-installation

= Error detection and recovery =

Archive files often include parity checks and other checksums for error detection, for instance zip files use a cyclic redundancy check (CRC). RAR archives may include additional error correction data (called recovery records).{{Cite book |last=Drummond |first=James R. |url=https://faraday.physics.utoronto.ca/PVB/Drummond/Micro/ln_comm1.pdf |title=Parity, Checksums and CRC Checks |year=1997 |edition=1st |location=Toronto |pages=13 |language=En |access-date=2022-06-17 |archive-date=2020-10-31 |archive-url=https://web.archive.org/web/20201031235541/https://faraday.physics.utoronto.ca/PVB/Drummond/Micro/ln_comm1.pdf |url-status=live }}

Archive files that do not natively support recovery records can use separate parchive (PAR) files that allows for additional error correction and recovery of missing files in a multi-file archive.{{Cite web |last=text |title=What are PAR and PAR2 Files? |url=https://help.easynews.com/kb/article/72-what-are-par-and-par2-files/ |access-date=2022-06-17 |website=Easynews |language=en |archive-date=2024-07-11 |archive-url=https://web.archive.org/web/20240711021759/https://help.easynews.com/kb/article/72-what-are-par-and-par2-files/ |url-status=live }}

Format

{{anchor|archive formats}}

The format of an archive file is its archive format. Some formats are well-defined and some have become conventions supported by multiple vendors and communities.{{Cite web |title=What are Archive Files? |url=https://www.exefiles.com/en/extensions/file-types/archive/ |access-date=2022-06-17 |website=www.exefiles.com |archive-date=2022-05-28 |archive-url=https://web.archive.org/web/20220528110322/https://www.exefiles.com/en/extensions/file-types/archive/ |url-status=live }} As is common for all files, the format of an archive is generally indicated by file name extension and/or file header.{{Cite web |title=What Is a File Extension & Why Are They Important? |url=https://www.lifewire.com/what-is-a-file-extension-2625879 |access-date=2022-06-17 |website=Lifewire |language=en |archive-date=2022-06-03 |archive-url=https://web.archive.org/web/20220603202723/https://www.lifewire.com/what-is-a-file-extension-2625879 |url-status=live }}

Commonly used formats include zip, rar, 7z, and tar.{{Cite web |title=Common file name extensions in Windows |url=https://support.microsoft.com/en-us/windows/common-file-name-extensions-in-windows-da4a4430-8e76-89c5-59f7-1cdbbc75cb01 |access-date=2022-06-17 |website=support.microsoft.com |archive-date=2022-05-27 |archive-url=https://web.archive.org/web/20220527200605/https://support.microsoft.com/en-us/windows/common-file-name-extensions-in-windows-da4a4430-8e76-89c5-59f7-1cdbbc75cb01 |url-status=live }} Java introduced archive formats including jar (j for Java) and war (w for web) that store an entire runnable deployment; usually compressed.{{Cite web |last=Malefanem |first=Moses |title=Learning Java Network Programming |url=https://www.academia.edu/21445522 |access-date=2022-06-17 |archive-date=2023-09-07 |archive-url=https://web.archive.org/web/20230907003607/https://www.academia.edu/21445522 |url-status=live }}

See also

  • {{Annotated link|Comparison of archive formats}}
  • {{Annotated link|Digital container format}}
  • {{Annotated link|Disk image}}
  • {{Annotated link|File archiver}}
  • {{Annotated link|List of archive formats}}

References

{{Reflist}}

  • [http://www.pkware.com/documents/casestudies/APPNOTE.TXT "Application Note on the .ZIP file format"]- official white paper published by PKWARE, Inc.
  • [https://web.archive.org/web/20090611043446/http://datacompression.info/ArchiveFormats/tar.txt Tape Archive (.TAR) file format specification]- excerpt from File Format List 2.0 by Max Maischein
  • [https://web.archive.org/web/20050122223045/http://www-03.ibm.com/ibm/history/exhibits/701/701_1415bx26.html "IBM 726 Magnetic tape reader/recorder] from IBM Archives
  • [https://web.archive.org/web/20120702230325/http://www-03.ibm.com/ibm/history/exhibits/mainframe/mainframe_PP1401.html "1401 Data Processing System"] from IBM Archives