Binary Alignment Map
{{Short description|Raw data of genome sequencing}}
{{Infobox file format
| name = BAM file format
| icon =
| iconcaption =
| icon_size =
| screenshot =
| screenshot_size =
| caption =
| _noextcode =
| extension = .bam
| _nomimecode =
| mime =
| type_code =
| uniform_type =
| conforms_to =
| magic =
| developer = {{Plainlist|
- Heng Li
- Bob Handsaker
- Alec Wysoker
- Tim Fennell
- Jue Ruan
- Nils Homer
- Gabor Marth
- Gonçalo Abecasis
- Richard M. Durbin
- 1000 Genomes Project}}
| released =
| latest_release_version =
| latest_release_date =
| genre = Bioinformatics
| container_for =
| contained_by =
| extended_from = Tab-separated values
| extended_to =
| standard =
| free =
| url = {{URL|https://samtools.github.io/hts-specs/}}
}}
Binary Alignment Map (BAM) is a file format that holds the comprehensive raw data of genome sequencing;{{cite web | url=https://www.statnews.com/feature/game-of-genomes/season-one/ | title=Carl Zimmer's Game of Genomes, Season 1: Episode 3, BAM Reveals All | date=11 July 2016 | publisher=STAT | accessdate=2016-08-21}} it is the lossless, compressed binary representation of Sequence Alignment Map (SAM) files.{{cite journal | title=The Sequence Alignment/Map format and SAMtools | journal=Bioinformatics | date=2009-06-08 | author=Li, Heng | pmc=2723002 | pmid=19505943 | doi=10.1093/bioinformatics/btp352 | volume=25 | issue=16 | pages=2078–9| url=https://dash.harvard.edu/bitstream/handle/1/10246875/2723002.pdf?sequence=1 }}{{cite web | url=https://wiki.nci.nih.gov/display/TCGA/Binary+Alignment+Map | title=Binary Alignment Map | publisher=National Cancer Institute Wiki | accessdate=2016-08-21}}
Schema
BAM is the compressed binary representation of SAM (Sequence Alignment Map), a compact and indexable representation of nucleotide sequence alignments.{{Cite web |title=Genome Browser BAM Track Format |url=http://genome.ucsc.edu/goldenPath/help/bam.html |access-date=2022-05-05 |website=genome.ucsc.edu}} The goal of indexing is to retrieve the alignments that overlap a specific location quickly, without having to scan all of them. Before it can be indexed, a BAM file must be sorted by reference ID and then by leftmost coordinate.{{Cite journal |date=3 Jun 2021 |title=Sequence Alignment/Map Format Specification |url=https://samtools.github.io/hts-specs/SAMv1.pdf |journal=The SAM/BAM Format Specification Working Group}} BAM files are compressed in the BGZF format.
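The following minimal Python sketch illustrates why the sort order matters for region queries. It is not the indexing scheme used by real BAM indexes; the `Alignment` record and both function names are hypothetical and exist only for this example. Once records are sorted by reference ID and leftmost coordinate, a scan for overlaps can stop at the first record that starts at or beyond the end of the queried region.

```python
from typing import List, NamedTuple


class Alignment(NamedTuple):
    """Hypothetical, simplified alignment record (not the BAM record layout)."""
    ref_id: int  # index of the reference sequence
    pos: int     # 0-based leftmost coordinate
    end: int     # 0-based exclusive end coordinate
    name: str    # read name


def coordinate_sort(records: List[Alignment]) -> List[Alignment]:
    # Sort by reference ID first, then by leftmost coordinate.
    return sorted(records, key=lambda r: (r.ref_id, r.pos))


def overlapping(sorted_records: List[Alignment],
                ref_id: int, start: int, end: int) -> List[str]:
    """Return names of reads overlapping [start, end) on the given reference."""
    hits = []
    for r in sorted_records:
        # Because of the sort order, no later record can overlap the region.
        if (r.ref_id, r.pos) >= (ref_id, end):
            break
        if r.ref_id == ref_id and r.end > start:
            hits.append(r.name)
    return hits
```

On unsorted input the early `break` would be incorrect, which is one way to see why sorting is required before indexing.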
The structure of a BAM file includes a header section and an alignments section:{{Cite web |title=BAM File Format |url=https://support.illumina.com/help/BaseSpace_App_WGS_v5_OLH_15050955_02/Content/Source/Informatics/BAM-Format.htm |access-date=2022-05-05 |website=support.illumina.com}}
- Header: includes the sample name, sample length, and alignment method. Alignments in the alignments section are linked to specific information in the header section.
- Alignments: includes the read name, read sequence, read quality, alignment information, and custom tags. The read name includes the chromosome, start coordinate, alignment quality, and match descriptor string.
The alignments section includes the following tags:
- Read Group (RG)
- Barcode Tag (BC)
- Single-end alignment quality (SM)
- Paired-end alignment quality (AS)
- Edit distance tag (NM)
- Amplicon name tag (XN)
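As a rough illustration of the header section, the Python sketch below reads the header of a small BAM stream that is held entirely in memory. It assumes the layout given in the SAM/BAM specification (the magic bytes `BAM\1`, the header text, then a list of reference names and lengths, all with little-endian 32-bit integers) and relies on BGZF data being readable as ordinary gzip data. The function name `read_bam_header` is hypothetical, and decompressing a whole file at once is only practical for very small inputs.

```python
import gzip
import struct
from typing import List, Tuple


def read_bam_header(data: bytes) -> Tuple[str, List[Tuple[str, int]]]:
    """Return (header text, [(reference name, reference length), ...])."""
    raw = gzip.decompress(data)  # BGZF blocks are gzip-compatible members
    if raw[:4] != b"BAM\x01":
        raise ValueError("missing BAM magic bytes")

    (l_text,) = struct.unpack_from("<I", raw, 4)
    offset = 8
    # The header text is SAM header text; it may or may not be NUL-terminated.
    text = raw[offset:offset + l_text].decode("utf-8").rstrip("\x00")
    offset += l_text

    (n_ref,) = struct.unpack_from("<I", raw, offset)
    offset += 4
    refs = []
    for _ in range(n_ref):
        (l_name,) = struct.unpack_from("<I", raw, offset)
        offset += 4
        # l_name counts the trailing NUL byte, which is dropped here.
        name = raw[offset:offset + l_name - 1].decode("ascii")
        offset += l_name
        (l_ref,) = struct.unpack_from("<I", raw, offset)
        offset += 4
        refs.append((name, l_ref))
    return text, refs
```

The alignment records follow the reference list in the same decompressed stream; parsing them is omitted here to keep the sketch short.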
BAM uses a 0-based coordinate system, whereas SAM uses a 1-based coordinate system. Integer values in BAM optional fields can lie in the range [−2^31, 2^32).
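The coordinate difference and the integer range quoted above can be sketched in a few lines of Python. The function and constant names are hypothetical and chosen only for this illustration.

```python
BAM_INT_MIN = -2**31  # inclusive lower bound
BAM_INT_MAX = 2**32   # exclusive upper bound


def sam_pos_to_bam(pos_1based: int) -> int:
    """Convert a 1-based SAM position to a 0-based BAM position."""
    return pos_1based - 1


def bam_pos_to_sam(pos_0based: int) -> int:
    """Convert a 0-based BAM position to a 1-based SAM position."""
    return pos_0based + 1


def fits_in_bam_integer(value: int) -> bool:
    """Check whether an integer lies in the range [-2^31, 2^32)."""
    return BAM_INT_MIN <= value < BAM_INT_MAX
```

For example, the first base of a reference is position 1 in SAM and position 0 in BAM.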
Tools
A [http://samtools.sourceforge.net/swlist.shtml list of sequencing and analysis tools that work with SAM/BAM] is available on the SAMtools website.
See also
External links
- [https://samtools.github.io/hts-specs/SAMv1.pdf SAM format specification]
{{Portal bar|Biology}}
References
{{Reflist}}