FAM203B
{{Short description|Protein-coding gene in the species Homo sapiens}}
Family with sequence similarity 203, member B (FAM203B) is a protein encoded by the FAM203B gene (8q24.3) in humans.{{cite web|title=Predicted: protein FAM203B [Homo sapiens]|url=https://www.ncbi.nlm.nih.gov/protein/XP_001126758.1|publisher=NCBI Protein|accessdate=5 February 2013}}{{cite journal|title=Predicted: Homo sapiens family with sequence similarity 203, member B (FAM203B), mRNA|url=https://www.ncbi.nlm.nih.gov/nuccore/XM_001126758.4|publisher=NCBI Nucleotide|accessdate=5 February 2013|date=2012-10-30}} While FAM203B is found only in humans and possibly non-human primates, its paralog, FAM203A,{{cite web|title=FAM203 family|url=http://www.nextprot.org/db/term/FA-04837/proteins?referer=|publisher=NextProt Beta|accessdate=5 February 2013}} is highly conserved.{{cite web|title=HomoloGene: 48742, gene conserved in Eukaryota|url=https://www.ncbi.nlm.nih.gov/homologene/?term=FAM203B|publisher=NCBI HomoloGene|accessdate=18 January 2013}} The FAM203B protein contains two conserved domains of unknown function, DUF383 and DUF384, and no transmembrane domains. The protein has no known function, although the homolog of FAM203A in Caenorhabditis elegans (Y54H5A.2) is thought to help regulate the actin cytoskeleton.{{cite journal | vauthors = Fievet BT, Rodriguez J, Naganathan S, Lee C, Zeiser E, Ishidate T, Shirayama M, Grill S, Ahringer J | title = Systematic genetic interaction screens uncover cell polarity regulators and functional redundancy | journal = Nature Cell Biology | volume = 15 | issue = 1 | pages = 103–12 | date = January 2013 | pmid = 23242217 | pmc = 3836181 | doi = 10.1038/ncb2639 }}
{{Infobox_gene}}
__TOC__
Gene
FAM203B is located on the positive strand of the long arm of chromosome 8 at band 24.3 (8q24.3), spanning positions 76,368,898–76,371,411 in the human genome. The mRNA is 2,402 bp long, and the human gene has 6 predicted exons.{{cite web|title=FAM203B family with sequence similarity 203, member B [Homo sapiens (human)]|url=https://www.ncbi.nlm.nih.gov/gene/728071|publisher=NCBI Gene|accessdate=5 February 2013}} There are no known isoforms.
=Gene Neighborhood=
The pseudogene TSSK5P2 is located on the negative strand opposite FAM203B (145,440,975 - 145,443,775),{{cite web|title=TSSK5P2 testis-specific serine kinase 5 pseudogene 2 [Homo sapiens (human)]|url=https://www.ncbi.nlm.nih.gov/gene?cmd=Retrieve&dopt=full_report&list_uids=100131992|publisher=NCBI Gene|accessdate=10 May 2013}} while LOC377711 is located immediately downstream on the positive strand (145,448,755 - 145,485,896).{{cite web|title=LOC377711 HEAT repeat-containing protein 7A-like [Homo sapiens (human)]|url=https://www.ncbi.nlm.nih.gov/gene?cmd=Retrieve&dopt=full_report&list_uids=377711|publisher=NCBI Gene|accessdate=10 May 2013}} FAM203A, MROH1, and SCXB are located upstream of FAM203B.{{cite web|title=FAM203A family with sequence similarity 203, member A [Homo sapiens (human)]|url=https://www.ncbi.nlm.nih.gov/gene/51236|publisher=NCBI Gene|accessdate=10 May 2013}}
=Gene Expression=
Expression Profile: mRNA expression has been localized in many tissue types (immune, nervous, muscle, internal, secretory, and reproductive) in similar quantities and may therefore be ubiquitous.{{cite web|title=FAM203B Gene|url=https://www.genecards.org/cgi-bin/carddisp.pl?gene=FAM203B&search=FAM203B|publisher=Weizmann Institute of Science|accessdate=9 May 2013}}
Promoter: The predicted promoter region of FAM203B is located between 145,437,380 and 145,438,015 on Chromosome 8 and has a length of 636 bp.
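The stated length counts both endpoints as part of the promoter. A minimal Python sketch of that arithmetic, assuming the coordinates are 1-based and inclusive (the function name is illustrative and not taken from any cited tool):

```python
def inclusive_length(start: int, end: int) -> int:
    """Length in bp of a 1-based interval that includes both endpoints."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Predicted FAM203B promoter coordinates on chromosome 8, as given above.
promoter_length = inclusive_length(145_437_380, 145_438_015)
print(promoter_length)  # 636
```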
Protein
The function of FAM203B is not currently understood. The FAM203B protein has 390 amino acids, a molecular weight of 42.1 kDa,{{cite web|last=Brendel|first=Volker|title=SAPS (Statistical Analysis of PS)|url=http://workbench.sdsc.edu}} and an isoelectric point of 4.56.{{cite web|last=Toldo|first=Luca|title=PI (Isoelectric Point Determination)|url=http://workbench.sdsc.edu}}
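Molecular weight estimates of this kind are usually obtained by summing average residue masses and adding one water molecule for the free termini. The Python sketch below illustrates that calculation. It is not the SAPS implementation, the residue masses are standard average values, and the example peptide is an invented placeholder, not the FAM203B sequence.

```python
# Average residue masses in daltons (free amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # added once for the N-terminal H and C-terminal OH


def molecular_weight(seq: str) -> float:
    """Average molecular weight (Da) of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq.upper()) + WATER


# Hypothetical peptide, used only to show the call.
print(round(molecular_weight("MALPEL"), 1))

# Sanity check on the reported figures: 42.1 kDa spread over 390 residues.
average_residue_mass = 42_100 / 390
print(round(average_residue_mass, 1))  # 107.9
```

An average of roughly 108 Da per residue is somewhat low for a typical protein, which fits the alanine- and proline-rich composition of FAM203B.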
=Structure=
FAM203B contains two domains of unknown function: DUF383 (residues 110–288) and DUF384 (residues 292–349). The protein is alanine-, proline-, and leucine-rich, but poor in serine, asparagine, threonine, isoleucine, lysine, and phenylalanine. The following internal repeats can be found in the primary sequence: LPFL (26–29, 245–248), ELAP (70–73), GRAL (54–57, 111–114), and LAADPGL (88–94, 99–105). There are no positive, negative, mixed charge, or hydrophobic clusters; no transmembrane domains; and no clusters of amino acid multiplets. The secondary structure prediction generated by the Phyre 2.0 bioinformatic server shows only α-helices, almost all of which have high confidence values. The overall confidence value of the model is 99.5%.{{cite journal | vauthors = Kelley LA, Sternberg MJ | title = Protein structure prediction on the Web: a case study using the Phyre server | journal = Nature Protocols | volume = 4 | issue = 3 | pages = 363–71 | year = 2009 | pmid = 19247286 | doi = 10.1038/nprot.2009.2 | hdl = 10044/1/18157 | s2cid = 12497300 | url = http://spiral.imperial.ac.uk/bitstream/10044/1/18157/2/Nature%20Protocols_4_3_2009.pdf | hdl-access = free }}
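Internal repeats of this kind can be located by scanning a sequence for every occurrence of a motif. The Python sketch below reports 1-based, inclusive residue ranges in the same style as the list above. The toy sequence is invented for illustration and is not the FAM203B sequence.

```python
def find_motif(seq: str, motif: str) -> list[tuple[int, int]]:
    """Return 1-based inclusive (start, end) ranges of every motif occurrence.

    Overlapping occurrences are reported as well.
    """
    if not motif:
        raise ValueError("motif must not be empty")
    hits = []
    pos = seq.find(motif)
    while pos != -1:
        hits.append((pos + 1, pos + len(motif)))
        pos = seq.find(motif, pos + 1)
    return hits


# Hypothetical sequence containing two copies of LPFL and one of GRAL.
toy_sequence = "MALPFLGGRALKKLPFLAA"
print(find_motif(toy_sequence, "LPFL"))  # [(3, 6), (14, 17)]
print(find_motif(toy_sequence, "GRAL"))  # [(8, 11)]
```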
=Post-Translational Modifications=
There are at least six predicted phosphorylation sites in FAM203B: S17, S153, Y167, T223, S259, and S320.{{cite journal | vauthors = Blom N, Gammeltoft S, Brunak S | title = Sequence and structure-based prediction of eukaryotic protein phosphorylation sites | journal = Journal of Molecular Biology | volume = 294 | issue = 5 | pages = 1351–62 | date = December 1999 | pmid = 10600390 | doi = 10.1006/jmbi.1999.3310 | url = http://www.cbs.dtu.dk/services/NetPhos/ | url-access = subscription }} The FAM203B protein is also predicted to localize to the cytoplasm.{{cite web|last=Horton|first=Paul|title=PSORT II|url=http://psort.hgc.jp/form2.html}}
=Protein Interactions=
There are many possible transcription factor binding sites in the FAM203B promoter. Below is a table of the best possibilities, which have high confidence values, evolutionary conservation, and/or multiple possible binding sites in the promoter.
Table of Possible Transcription Factor Binding Sites in Predicted FAM203B Promoter:{{cite web|title=Genomatix El Dorado|url=http://www.genomatix.de/|accessdate=7 April 2013|archive-date=2 December 2021|archive-url=https://web.archive.org/web/20211202010908/https://www.genomatix.de/|url-status=dead}}
Transcription Factor | Start | End | Strand | Sequence |
---|---|---|---|---|
Winged-helix transcription factor IL-2 enhancer binding factor, forkhead box K2 | 6 | 22 | - | gacaggacAACAcaggg |
Hypermethylated in Cancer 1 | 49 | 61 | + | ccgTGCCagcctg |
Zinc finger transcription factor ZBP-89 | 94 | 116 | + | tggccactCCCCcattcagccct |
Kidney-enriched kruppel-like factor, KLF15 | 142 | 158 | + | gagccGGGGcgcgggcc |
Transcription factor II B recognition element | 149 | 155 | - | ccgCGCC |
Glial cells missing homolog 1, chorion-specific transcription factor GCMα | 159 | 173 | + | tcagaCCCTcagggc |
Transcription factor AP-2α | 161 | 175 | - | gggcCCTGagggtct |
Smad4 transcription factor involved in TGFβ signaling | 245 | 255 | - | gtaGTCTcggc |
Nuclear factor 1 | 278 | 298 | - | gatTTGGccgcctgccgcgtc |
ZF5 POZ domain zinc finger, zinc finger protein 161 | 295 | 309 | + | aatCGCGccgggcct |
Smad3 transcription factor involved in TGFβ signaling | 365 | 375 | - | ggcGTCTggcc |
Myeloid zinc finger protein MZF1 | 384 | 394 | - | gcGGGGagtta |
X-linked zinc finger protein | 397 | 407 | + | gcGGCCtggcc |
Myeloid zinc finger protein MZF1 | 406 | 416 | - | gaGGGGagggg |
Core promoter-binding protein with 5 kruppel-type zinc fingers | 423 | 445 | + | ccggtcCCGCcccttgagcccag |
X gene core promoter element 1 | 424 | 434 | - | ggGCGGgaccg |
Zinc finger and BTB domain-containing 7A | 479 | 501 | - | cgcaaCCCCgcccaccagaggag |
Kruppel-like factor 7 | 483 | 499 | + | tctggtgGGCGgggttg |
Erythroid kruppel-like factor | 533 | 549 | + | ggcaccggtcGGGTggc |
Hypermethylated in cancer 1 | 541 | 553 | - | tgcTGCCacccga |
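Several predicted sites overlap and lie on opposite strands, so their sequences can be cross-checked: the reverse complement of a minus-strand site should agree with the plus-strand sequence over the shared positions. The Python sketch below performs this check for three overlapping pairs from the table. It assumes that Start and End are inclusive promoter coordinates and that minus-strand sequences are written 5′ to 3′ along the minus strand; neither convention is stated in the source, and the helper names are illustrative.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence, returned in lower case."""
    return seq.lower().translate(COMPLEMENT)[::-1]


def plus_strand(site: tuple[int, int, str, str]) -> str:
    """Sequence of a site as read 5'->3' on the plus strand."""
    _start, _end, strand, seq = site
    return seq.lower() if strand == "+" else reverse_complement(seq)


def overlap_agrees(a, b) -> bool:
    """True if two overlapping sites imply the same plus-strand bases."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    if lo > hi:
        raise ValueError("sites do not overlap")
    part_a = plus_strand(a)[lo - a[0]: hi - a[0] + 1]
    part_b = plus_strand(b)[lo - b[0]: hi - b[0] + 1]
    return part_a == part_b


# (start, end, strand, sequence) copied from the table rows.
gcm = (159, 173, "+", "tcagaCCCTcagggc")
ap2 = (161, 175, "-", "gggcCCTGagggtct")
klf15 = (142, 158, "+", "gagccGGGGcgcgggcc")
bre = (149, 155, "-", "ccgCGCC")
zbtb7a = (479, 501, "-", "cgcaaCCCCgcccaccagaggag")
klf7 = (483, 499, "+", "tctggtgGGCGgggttg")

for pair in [(gcm, ap2), (klf15, bre), (zbtb7a, klf7)]:
    print(overlap_agrees(*pair))  # True for each pair
```

Agreement over the shared positions suggests the coordinates and strand labels in the table are internally consistent under the stated assumptions.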
There are several other proteins that may interact directly with the FAM203B protein including C1orf112, HEATR3, MRTO4, BYSL, GINS1, DKC1, TXNDC12, PWP2, IMP4, and NIP7.{{cite web|title=C8orf30B Predicted Functional Partners|url=http://string-db.org/version_9_05/newstring_cgi/show_network_section.pl?all_channels_on=1&interactive=yes&network_flavor=evidence&targetmode=proteins&identifier=9606.ENSP00000366623/|publisher=STRING: functional protein association networks|accessdate=9 May 2013}}{{Dead link|date=March 2024 |bot=InternetArchiveBot |fix-attempted=yes }}
Homology and Evolution
=FAM203A: Paralog=
FAM203A is 99% identical to FAM203B with only one amino acid difference (E264Q) due to a point mutation (G857C).{{cite journal | vauthors = Higgins DG, Bleasby AJ, Fuchs R | title = CLUSTAL V: improved software for multiple sequence alignment | journal = Computer Applications in the Biosciences | volume = 8 | issue = 2 | pages = 189–91 | date = April 1992 | pmid = 1591615 | doi = 10.1093/bioinformatics/8.2.189 | url = http://workbench.sdsc.edu | url-access = subscription }} This indicates that the duplication event that produced FAM203B 242,266 bp downstream from FAM203A occurred very recently in evolutionary history. The FAM203A protein is highly conserved and has orthologs in primates, rodents, ungulates, marsupials, amphibians, fish, fungi, plants, and at least one monotreme, one reptile, and one hemichordate.{{cite web|title=BLAST: Basic Local Alignment Search Tool|url=http://blast.ncbi.nlm.nih.gov/|publisher=NCBI BLAST|accessdate=5 February 2013}}
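The reported substitution can be sanity-checked against the standard genetic code: both glutamate codons (GAA, GAG) become glutamine codons (CAA, CAG) when the first base changes from G to C. The Python sketch below verifies this and computes the percent identity implied by one difference in 390 residues. The last step assumes that position 857 is numbered along the mRNA and that the change falls on the first base of codon 264. The source states neither, so the implied coding-sequence start is only a derived illustration.

```python
GLU_CODONS = {"GAA", "GAG"}  # glutamate (E)
GLN_CODONS = {"CAA", "CAG"}  # glutamine (Q)


def g_to_c_at_first_base(codon: str) -> str:
    """Apply a G->C substitution to the first base of a codon."""
    if codon[0] != "G":
        raise ValueError("codon does not start with G")
    return "C" + codon[1:]


# Every Glu codon maps to a Gln codon under a first-base G->C change.
mutated = {g_to_c_at_first_base(codon) for codon in GLU_CODONS}
print(mutated == GLN_CODONS)  # True

# Percent identity for one mismatch across 390 aligned residues.
protein_length = 390
identity = 100 * (protein_length - 1) / protein_length
print(round(identity, 2))  # 99.74

# Assumption: nucleotide 857 is an mRNA coordinate hitting base 1 of codon 264.
first_base_of_codon = 3 * (264 - 1) + 1            # position 790 within the CDS
implied_cds_start = 857 - first_base_of_codon + 1  # mRNA position 68
print(first_base_of_codon, implied_cds_start)      # 790 68
```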
=Orthologs and Homologs=
Table of FAM203B Paralog and Homologs:
Scientific Name | Common Name | Divergence from Humans (MYA){{cite journal | vauthors = Hedges SB, Dudley J, Kumar S | title = TimeTree: a public knowledge-base of divergence times among organisms | journal = Bioinformatics | volume = 22 | issue = 23 | pages = 2971–2 | date = December 2006 | pmid = 17021158 | doi = 10.1093/bioinformatics/btl505 | url = http://timetree.org/ | doi-access = free }} | NCBI Protein Accession | Gene Name | Protein Length | Sequence Similarity |
---|---|---|---|---|---|---|
Homo sapiens | Human | 0.0 | [https://www.ncbi.nlm.nih.gov/protein/NP_057542.2 NP_057542] | FAM203A | 390 | 100% |
Macaca mulatta | Rhesus macaque | 29.2 | [https://www.ncbi.nlm.nih.gov/protein/109087722 XM_001090013] | BRP16L | 396 | 94% |
Pan troglodytes | Chimpanzee | 6.3 | [https://www.ncbi.nlm.nih.gov/protein/XP_520011.2 XP_520011] | FAM203A | 395 | 98% |
Mus musculus | Mouse | 92.3 | [https://www.ncbi.nlm.nih.gov/protein/NP_067530.2 NP_067530] | FAM203A | 393 | 86% |
Sus scrofa | Wild boar | 94.2 | [https://www.ncbi.nlm.nih.gov/protein/311253283 XP_003125495] | FAM203A-like | 406 | 85% |
Monodelphis domestica | Gray short-tailed opossum | 162.6 | [https://www.ncbi.nlm.nih.gov/protein/334326436 XP_003340757] | FAM203A-like | 483 | 78% |
Columba livia | Rock dove | 296.0 | [https://www.ncbi.nlm.nih.gov/protein/EMC87403 EMC87403] | BRP16 (partial) | 194 | 64% |
Danio rerio | Zebrafish | 400.1 | [https://www.ncbi.nlm.nih.gov/protein/NP_001002522 NP_001002522] | FAM203A | 377 | 70% |
Xenopus tropicalis | Western clawed frog | 371.2 | [https://www.ncbi.nlm.nih.gov/protein/AAI60980 AAI60980] | LOC100145412 | 377 | 70% |
Xenopus tropicalis | Western clawed frog | 371.2 | [https://www.ncbi.nlm.nih.gov/protein/NP_001007916.1 NP_001007916] | FAM203A | 359 | 68% |
Strongylocentrotus purpuratus | Purple sea urchin | 742.9 | [https://www.ncbi.nlm.nih.gov/protein/72067022 XP_793139] | FAM203A-like | 372 | 62% |
Anolis carolinensis | Carolina anole | 301.7 | [https://www.ncbi.nlm.nih.gov/protein/327288414 XP_003228921] | BRP16L | 286 | 57% |
Saccoglossus kowalevskii | Acorn worm | 661.2 | [https://www.ncbi.nlm.nih.gov/protein/291239979 XP_002739897] | BRP16L | 362 | 61% |
Danio rerio | Zebrafish | 400.1 | [https://www.ncbi.nlm.nih.gov/protein/XP_002665502.1 XP_002665502] | BRP16L | 181 | 57% |
Saccharomyces cerevisiae | Budding yeast | 1369.0 | [https://www.ncbi.nlm.nih.gov/protein/NP_011703.3 NP_011703] | Hgh1p | 394 | 52% |
Arabidopsis thaliana | Thale cress | 1369.0 | [https://www.ncbi.nlm.nih.gov/protein/NP_172882 NP_172882] | Armadillo/beta-catenin-like repeats-containing | 339 | 49% |
There is one ortholog of FAM203B, brain protein 16-like (BRP16L), in Macaca mulatta, although no other primates appear to have orthologous proteins. There are two possible explanations for this anomaly: (1) the DNA of other primates has not been sequenced thoroughly in the genomic region of the FAM203B ortholog, or (2) FAM203B is the result of a gene duplication event unique to humans, meaning that BRP16L in M. mulatta resulted from an earlier duplication event unique to that species. The second explanation is supported by the following evidence:
- Like M. mulatta, Danio rerio has both a FAM203A gene and a BRP16L gene. The large amount of time since the divergence of the M. mulatta and D. rerio lineages suggests that these BRP16L genes are the result of separate duplication events.
- The BRP16L protein in D. rerio has a significant 3’ truncation compared to the M. mulatta protein, further supporting the hypothesis that these proteins evolved separately.{{cite web|title=Predicted: brain protein 16-like [Macaca mulatta]|url=https://www.ncbi.nlm.nih.gov/protein/109087722|publisher=NCBI Protein|accessdate=5 February 2013}}{{cite web|title=Predicted: brain protein 16-like [Danio rerio]|url=https://www.ncbi.nlm.nih.gov/protein/XP_002665502.1|publisher=NCBI Protein|accessdate=5 February 2013}}
- If the BRP16L genes in M. mulatta and D. rerio are the result of separate duplication events, then it is also possible that FAM203B and BRP16L in M. mulatta are the result of separate duplication events.
- BRP16 (brain protein 16) is an alias of FAM203A, and BRP16L (brain protein 16-like) is an alias of FAM203B. A gene named BRP16L simply means that the gene is related to FAM203A but not necessarily to FAM203B.
- FAM203A and FAM203B are located in the telomeric region of chromosome 8, an area of chromosomes that frequently experiences recombination events.
However, because FAM203A and FAM203B are so similar, it is difficult to determine whether these proteins are true orthologs of FAM203B or simply homologs.
=Phylogeny=
=Conserved Domains, Motifs, and Residues=
- ARM (armadillo/beta-catenin-like repeats-containing): Found in two homologs (FAM203A in Danio rerio and At1g14300 in Arabidopsis thaliana) and overlaps slightly with the beginning of the DUF383 domain. Related to the HEAT domain, consists of a 40-amino-acid tandemly repeated sequence motif, and is thought to mediate protein-protein interactions. Several eukaryotic genes contain ARM domains including armadillo in Drosophila melanogaster, beta-catenin, plakoglobin, and adenomatous polyposis coli in mammals.
- DUF383: Domain of unknown function 383
- DUF384: Domain of unknown function 384
File:FAM203B Conserved Domains.png
Every ortholog and homolog of FAM203B has a DUF383 domain and a DUF384 domain (except Anolis carolinensis, which is missing DUF384 due to a large 3' truncation{{cite web|title=Predicted: brain protein 16-like [Anolis carolinensis]|url=https://www.ncbi.nlm.nih.gov/protein/327288414|publisher=NCBI Protein|accessdate=10 May 2013}}). There is significant variation among mammals, marsupials, and monotremes as to where the DUF383 domain begins, whereas this variation is smaller in reptiles, amphibians, fish, invertebrates, plants, and fungi. Additionally, the DUF383 domain ends at the same location in all homologs, while the DUF384 domain starts and ends at roughly the same location in all homologs. There is high homology in the DUF384 domain (residues 292–349) and in the DUF383 domain (residues 154–288), and several amino acids are completely conserved across vertebrates, invertebrates, plants, and fungi, including Arg190, Gly219, Asn226, Lys273, and Lys338. Other highly conserved amino acids include Asn87, Lys88, Arg216, and Phe229.
References
{{Reflist|33em}}