FAM98C

{{Short description|Gene}}

{{#invoke:Infobox_gene|getTemplateData|QID=Q18051801}}

Family with sequence similarity 98, member C (FAM98C) is a gene that encodes the FAM98C protein. FAM98C has two aliases, FLJ44669 and hypothetical protein LOC147965.{{Cite web|title=FAM98C Gene - GeneCards {{!}} FA98C Protein {{!}} FA98C Antibody|url=https://www.genecards.org/cgi-bin/carddisp.pl?gene=FAM98C|access-date=2020-12-19|website=www.genecards.org}} FAM98C has two paralogs in humans, FAM98A and FAM98B. The FAM98C protein is leucine-rich, and its function has not yet been defined. FAM98C has orthologs in mammals, reptiles, and amphibians; its most distant known orthologs are in the amphibians Rhinatrema bivittatum and Nanorana parkeri.

Gene

= Locus =

The FAM98C gene is located at 19q13.2 in humans, on the "+" strand. It spans positions 38,403,135 to 38,409,088 of chromosome 19, so the primary transcript of the FAM98C gene is 5,954 base pairs in length. Neighboring genes include RASGRP4 and RYR1.
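The stated length follows from the two coordinates if they are treated as a 1-based, inclusive interval. The short sketch below checks that arithmetic; the function name is illustrative, and the inclusive-interval convention is an assumption rather than something the cited sources state.

```python
# Coordinates quoted in the text, assumed to be 1-based and inclusive.
GENE_START = 38_403_135
GENE_END = 38_409_088


def span_length(start: int, end: int) -> int:
    """Return the length of a 1-based, inclusive genomic interval."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


if __name__ == "__main__":
    print(span_length(GENE_START, GENE_END))  # 5954
```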

= Transcripts =

FAM98C has two known transcript variants.{{Cite web|title=FAM98C family with sequence similarity 98 member C [Homo sapiens (human)] - Gene - NCBI|url=https://www.ncbi.nlm.nih.gov/gene/?term=FLJ44669|access-date=2020-12-15|website=www.ncbi.nlm.nih.gov}} The first variant encodes the longer isoform of 349 amino acids.{{Cite web|title=protein FAM98C isoform 1 [Homo sapiens] - Protein - NCBI|url=https://www.ncbi.nlm.nih.gov/protein/NP_777565.3|access-date=2020-12-19|website=www.ncbi.nlm.nih.gov}} The second variant encodes a shorter isoform of 267 amino acids.{{Cite web|title=protein FAM98C isoform 2 [Homo sapiens] - Protein - NCBI|url=https://www.ncbi.nlm.nih.gov/protein/NP_001338604.1|access-date=2020-12-19|website=www.ncbi.nlm.nih.gov}} FAM98C is composed of eight exons.

Proteins

The FAM98C protein is 349 amino acids in length, with a predicted molecular weight of 37.3 kDa and a predicted isoelectric point of 6.89.{{cite journal | vauthors = Brendel V, Bucher P, Nourbakhsh IR, Blaisdell BE, Karlin S | title = Methods and algorithms for statistical analysis of protein sequences | journal = Proceedings of the National Academy of Sciences of the United States of America | volume = 89 | issue = 6 | pages = 2002–6 | date = March 1992 | pmid = 1549558 | pmc = 48584 | doi = 10.1073/pnas.89.6.2002 | bibcode = 1992PNAS...89.2002B | doi-access = free }} The composition of the FAM98C protein is notable for its abundance of leucine (16%) and its lysine-rich C-terminus. FAM98C contains a high-scoring positive charge segment with 6 consecutive lysine residues.
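Figures such as the leucine fraction and the run of consecutive lysines can be derived directly from a protein sequence. The sketch below shows one way to compute both; the `toy` sequence is a made-up placeholder, not the FAM98C sequence, and the function names are illustrative.

```python
from collections import Counter
from itertools import groupby


def residue_fraction(seq: str, residue: str) -> float:
    """Fraction of positions in seq occupied by the given one-letter residue."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return Counter(seq)[residue.upper()] / len(seq)


def longest_run(seq: str, residue: str) -> int:
    """Length of the longest uninterrupted run of the given residue."""
    residue = residue.upper()
    runs = (len(list(group)) for key, group in groupby(seq.upper()) if key == residue)
    return max(runs, default=0)


if __name__ == "__main__":
    # Hypothetical 20-residue sequence, used only to exercise the functions.
    toy = "MLLALGELLKRLAAKKKKKK"
    print(f"Leucine fraction: {residue_fraction(toy, 'L'):.0%}")
    print(f"Longest lysine run: {longest_run(toy, 'K')}")
```

Run against the real isoform 1 sequence (NP_777565.3), the same functions would be expected to reproduce the values quoted above, although that has not been verified here.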

= Domains and motifs =

FAM98C contains domain of unknown function 2465 (DUF2465), spanning amino acids 18-334.{{Cite web|title=HomoloGene - NCBI|url=https://www.ncbi.nlm.nih.gov/homologene/?term=Homo+sapiens+FAM98C|access-date=2020-12-19|website=www.ncbi.nlm.nih.gov}} This domain is unique to the FAM98 family and is conserved in all orthologs.{{Cite web|title=CDD Conserved Protein Domain Family: DUF2465|url=https://www.ncbi.nlm.nih.gov/Structure/cdd/cddsrv.cgi?uid=pfam10239|access-date=2020-12-19|website=www.ncbi.nlm.nih.gov}} DUF2465 is poorly characterized, but it is proposed to bind RNA. Among the paralogs, the domain in FAM98A binds mRNA, while FAM98B is associated with tRNA splicing.{{cite journal | vauthors = Dürnberger G, Bürckstümmer T, Huber K, Giambruno R, Doerks T, Karayel E, Burkard TR, Kaupe I, Müller AC, Schönegger A, Ecker GF, Lohninger H, Bork P, Bennett KL, Superti-Furga G, Colinge J | display-authors = 6 | title = Experimental characterization of the human non-sequence-specific nucleic acid interactome | journal = Genome Biology | volume = 14 | issue = 7 | pages = R81 | date = July 2013 | pmid = 23902751 | pmc = 4053969 | doi = 10.1186/gb-2013-14-7-r81 | doi-access = free }}

= Structure =

The secondary structure of FAM98C is predicted to be composed of approximately 46% alpha helix, 46% random coil and 7% extended strand.{{Cite web|last=Prof. T. Ashok Kumar|title=CFSSP: Chou & Fasman Secondary Structure Prediction Server|url=https://www.biogem.org/tool/chou-fasman/|access-date=2020-12-16|website=www.biogem.org}}{{Cite web|title=NPS@ : GOR4 secondary structure prediction|url=https://npsa-prabi.ibcp.fr/cgi-bin/npsa_automat.pl?page=/NPSA/npsa_gor4.html|access-date=2020-12-16|website=npsa-prabi.ibcp.fr}} However, no beta strands were found in any of the predicted secondary structures.{{Cite web|title=Bioinformatics Toolkit|url=https://toolkit.tuebingen.mpg.de/tools/ali2d|access-date=2020-12-16|website=toolkit.tuebingen.mpg.de}} The tertiary structure of FAM98C is predicted to have 10 alpha helices by the I-TASSER software.{{Cite web|title=I-TASSER server for protein structure and function prediction|url=https://zhanglab.ccmb.med.umich.edu/I-TASSER/|access-date=2020-12-19|website=zhanglab.ccmb.med.umich.edu}}{{cite journal | vauthors = Roy A, Kucukural A, Zhang Y | title = I-TASSER: a unified platform for automated protein structure and function prediction | journal = Nature Protocols | volume = 5 | issue = 4 | pages = 725–38 | date = April 2010 | pmid = 20360767 | doi = 10.1038/nprot.2010.5 | pmc = 2849174 }}

Gene level regulation

= Promoter =

The FAM98C promoter region (GXP_7536558) is 1,254 base pairs in length. Both the E2F-myc activator/cell cycle regulator family and the Krueppel-like transcription factor family had nineteen predicted binding sites on the promoter.{{Cite web|title=ElDorado: Annotation & Analysis|url=https://www.genomatix.de/online_help/help_eldorado/annotation_analysis_help.html|access-date=2020-12-16|website=www.genomatix.de|archive-date=2018-05-07|archive-url=https://web.archive.org/web/20180507085351/https://www.genomatix.de/online_help/help_eldorado/annotation_analysis_help.html|url-status=dead}}

= Expression pattern =

A GEO profile of multiple normal tissues showed that FAM98C is expressed ubiquitously, though not uniformly.{{Cite web|title=GEO DataSet Browser|url=https://www.ncbi.nlm.nih.gov/sites/GDSbrowser?acc=GDS3834|access-date=2020-12-19|website=www.ncbi.nlm.nih.gov}}{{Cite web|title=GEO Accession viewer|url=https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE14938|access-date=2020-12-19|website=www.ncbi.nlm.nih.gov}} The highest expression levels are in the jejunum, liver, and kidney.

= Sub-cellular localization =

The subcellular localization of FAM98C was predicted using the PSORT II tool.{{Cite web|title=PSORT II Prediction|url=https://psort.hgc.jp/form2.html|access-date=2020-12-16|website=psort.hgc.jp}} FAM98C is predicted to be localized in the nucleus (60.9%), followed by the mitochondria (21.7%) and then the cytoplasm (17.4%).
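The three percentages are what a nearest-neighbor vote would produce if 23 neighbors were split 14, 5, and 4 among the three compartments. The sketch below illustrates that arithmetic only; the vote counts are inferred from the reported percentages and are an assumption, not values taken from the PSORT II output.

```python
# Assumed neighbor votes, inferred from the reported percentages (not from PSORT II output).
ASSUMED_VOTES = {"nucleus": 14, "mitochondria": 5, "cytoplasm": 4}


def vote_percentages(votes: dict[str, int]) -> dict[str, float]:
    """Convert per-class neighbor counts into percentages rounded to one decimal."""
    total = sum(votes.values())
    if total == 0:
        raise ValueError("at least one vote is required")
    return {label: round(100 * count / total, 1) for label, count in votes.items()}


if __name__ == "__main__":
    for label, percent in vote_percentages(ASSUMED_VOTES).items():
        print(f"{label}: {percent}%")
```

Read this way, the prediction reflects a majority among a small number of similar reference proteins rather than a calibrated probability.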

Protein level regulation

= Post-translational modifications =

== Phosphorylation ==

FAM98C has three predicted phosphorylation sites, located at amino acid positions 225, 239, and 300, that are conserved in distant orthologs.{{Cite web|title=GPS 5.0 - Kinase-specific Phosphorylation Site Prediction|url=http://gps.biocuckoo.cn/|access-date=2020-12-16|website=gps.biocuckoo.cn}} The site at position 225 is a predicted tyrosine kinase site; phosphorylation at such sites can function as an "on" and "off" switch. The site at position 239 is a predicted calmodulin-dependent protein kinase site.{{Cite web|title=GPS 5.0 - Kinase-specific Phosphorylation Site Prediction|url=http://gps.biocuckoo.cn/|access-date=2020-12-19|website=gps.biocuckoo.cn}}

File:Domain_drawing.png

== SUMOylation ==

SUMOylation is a post-translational modification that regulates many proteins. The GPS-SUMO tool from the CUCKOO workgroup predicted SUMOylation sites at positions 347, 348, and 349.{{Cite web|title=GPS-SUMO: Prediction of SUMOylation Sites & SUMO-interaction Motifs|url=http://sumosp.biocuckoo.org/|access-date=2020-12-16|website=sumosp.biocuckoo.org|archive-date=2013-05-10|archive-url=https://web.archive.org/web/20130510131129/http://sumosp.biocuckoo.org/|url-status=dead}} These residues are conserved in even the most distant FAM98C orthologs.

Homology

= Paralogs =

FAM98C has only two paralogs, FAM98A and FAM98B.

= Orthologs =

Orthologs of FAM98C have been found in mammals, reptiles, and amphibians. The most distant orthologs are in amphibians, which diverged from the human lineage an estimated 351.8 million years ago (MYA). FAM98C is found only in Metazoa and is not present in protozoa. The table below lists a selection of orthologs of human FAM98C, ordered from the most recent to the most ancient date of divergence.{{Cite web|title=TimeTree :: The Timescale of Life|url=http://www.timetree.org|access-date=2020-12-19|website=www.timetree.org}}

{| class="wikitable"
!Sequence Number
!Genus species
!Common Name
!Taxonomic Group
![http://www.timetree.org Date of Divergence (MYA)]
!Accession Number
!Sequence Length (aa)
!Sequence Identity
!Sequence Similarity
|-
|1||Homo sapiens||Human||Primates||0||[https://www.ncbi.nlm.nih.gov/protein/NP_777565.3 NP_777565.3]||349||100%||100%
|-
|2||Pan troglodytes||Chimpanzee||Primates||6.7||[https://www.ncbi.nlm.nih.gov/protein/XP_524252.3?report=genpept XP_524252.3]||350||99%||99%
|-
|3||Microcebus murinus||Gray mouse lemur||Primates||73.8||[https://www.ncbi.nlm.nih.gov/protein/XP_012630183.1?report=genbank&log$=protalign&blast_rank=1&RID=T0WFMDFJ016 XP_012630183.1]||353||84%||88%
|-
|4||Octodon degus||Common degu||Rodentia||90||[https://www.ncbi.nlm.nih.gov/protein/XP_023577316.1 XP_023577316.1]||352||78%||84%
|-
|5||Ochotona princeps||American pika||Lagomorpha||90||[https://www.ncbi.nlm.nih.gov/protein/XP_004595135.1 XP_004595135.1]||353||77%||83%
|-
|6||Mus musculus||Mouse||Rodentia||90||[https://www.ncbi.nlm.nih.gov/protein/NP_001139495.1 NP_001139495.1]||344||74%||79%
|-
|7||Rattus norvegicus||Brown rat||Rodentia||90||[https://www.ncbi.nlm.nih.gov/protein/NP_001185513.1 NP_001185513.1]||344||73%||80%
|-
|8||Bos taurus||Cattle||Artiodactyla||96||[https://www.ncbi.nlm.nih.gov/protein/XP_002695017.1 XP_002695017.1]||353||81%||85%
|-
|9||Canis lupus familiaris||Dog||Carnivora||96||[https://www.ncbi.nlm.nih.gov/protein/XP_541643.2 XP_541643.2]||353||80%||83%
|-
|10||Leptonychotes weddellii||Weddell seal||Carnivora||96||[https://www.ncbi.nlm.nih.gov/protein/XP_006739473.1 XP_006739473.1]||345||79%||84%
|-
|11||Monodon monoceros||Narwhal||Artiodactyla||96||[https://www.ncbi.nlm.nih.gov/protein/XP_029092965.1 XP_029092965.1]||352||78%||84%
|-
|12||Desmodus rotundus||Common vampire bat||Chiroptera||96||[https://www.ncbi.nlm.nih.gov/protein/XP_024433437.1 XP_024433437.1]||355||77%||83%
|-
|13||Chrysochloris asiatica||Cape golden mole||Afrosoricida||105||[https://www.ncbi.nlm.nih.gov/protein/XP_006871606.1 XP_006871606.1]||348||75%||81%
|-
|14||Vombatus ursinus||Common wombat||Diprotodontia||159||[https://www.ncbi.nlm.nih.gov/protein/XP_027711296.1 XP_027711296.1]||358||64%||73%
|-
|15||Phascolarctos cinereus||Koala||Diprotodontia||159||[https://www.ncbi.nlm.nih.gov/protein/XP_020834255.1 XP_020834255.1]||358||64%||73%
|-
|16||Ornithorhynchus anatinus||Platypus||Monotremata||177||[https://www.ncbi.nlm.nih.gov/protein/XP_028920793.1 XP_028920793.1]||338||57%||65%
|-
|17||Chelonoidis abingdonii||Pinta Island tortoise||Testudines||312||[https://www.ncbi.nlm.nih.gov/protein/XP_032660367.1 XP_032660367.1]||329||44%||57%
|-
|18||Podarcis muralis||Common wall lizard||Squamata||312||[https://www.ncbi.nlm.nih.gov/protein/XP_028597878.1 XP_028597878.1]||330||43%||55%
|-
|19||Python bivittatus||Burmese python||Squamata||312||[https://www.ncbi.nlm.nih.gov/protein/XP_015745259.1 XP_015745259.1]||318||42%||58%
|-
|20||Nanorana parkeri||High Himalaya frog||Anura||351.8||[https://www.ncbi.nlm.nih.gov/protein/XP_018411523.1 XP_018411523.1]||351||38%||55%
|-
|21||Rhinatrema bivittatum||Two-lined caecilian||Gymnophiona||351.8||[https://www.ncbi.nlm.nih.gov/protein/XP_029475031.1 XP_029475031.1]||338||38%||51%
|}

File:FAM98C_evolution_rate.png

= Rate of Evolution =

FAM98C is rapidly evolving with a rate of divergence faster than both cytochrome C, a slowly evolving gene, and fibrinogen, a rapidly evolving gene.
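One common way to compare evolutionary rates is to convert percent identity into a corrected divergence and plot it against the date of divergence. The sketch below applies the Poisson correction, m = −100 · ln(identity), to a few identity values from the ortholog table. Whether the comparison with cytochrome c and fibrinogen used this particular correction is not stated in the sources, so treat it as an assumption.

```python
import math

# (species, divergence date in MYA, percent identity) taken from the ortholog table.
ORTHOLOGS = [
    ("Pan troglodytes", 6.7, 99),
    ("Mus musculus", 90, 74),
    ("Ornithorhynchus anatinus", 177, 57),
    ("Python bivittatus", 312, 42),
    ("Rhinatrema bivittatum", 351.8, 38),
]


def corrected_divergence(identity_percent: float) -> float:
    """Poisson-corrected divergence in percent: m = -100 * ln(identity fraction)."""
    fraction = identity_percent / 100
    if not 0 < fraction <= 1:
        raise ValueError("identity must be in the range (0, 100]")
    return -100 * math.log(fraction)


if __name__ == "__main__":
    for species, mya, identity in ORTHOLOGS:
        m = corrected_divergence(identity)
        print(f"{species}: {mya} MYA, corrected divergence {m:.1f}")
```

A steeper slope of corrected divergence against time indicates a faster-evolving protein, which is the sense in which FAM98C is described as rapidly evolving.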

Interacting proteins

FAM98C has been predicted to interact with DR1, LRRCC1, FAM83F, TMEM256, PRDM16, and SPRED1.{{Cite web|title=FAM98C protein (human) - STRING interaction network|url=https://string-db.org/cgi/network?taskId=bZTlID7jYHNe|access-date=2020-12-19|website=string-db.org}}{{Cite web|title=PSICQUIC View|url=http://www.ebi.ac.uk/Tools/webservices/psicquic/view/results.xhtml?conversationContext=1|access-date=2020-12-19|website=www.ebi.ac.uk}} LRRCC1 and TMEM256 were both reported alongside FAM98C as candidate novel genes associated with ciliopathies.{{cite journal | vauthors = Shaheen R, Szymanska K, Basu B, Patel N, Ewida N, Faqeih E, Al Hashem A, Derar N, Alsharif H, Aldahmesh MA, Alazami AM, Hashem M, Ibrahim N, Abdulwahab FM, Sonbul R, Alkuraya H, Alnemer M, Al Tala S, Al-Husain M, Morsy H, Seidahmed MZ, Meriki N, Al-Owain M, AlShahwan S, Tabarki B, Salih MA, Faquih T, El-Kalioby M, Ueffing M, Boldt K, Logan CV, Parry DA, Al Tassan N, Monies D, Megarbane A, Abouelhoda M, Halees A, Johnson CA, Alkuraya FS | display-authors = 6 | title = Characterizing the morbid genome of ciliopathies | journal = Genome Biology | volume = 17 | issue = 1 | pages = 242 | date = November 2016 | pmid = 27894351 | pmc = 5126998 | doi = 10.1186/s13059-016-1099-5 | doi-access = free }}

Clinical significance

In a bioinformatics study, FAM98C and nine other novel genes were identified as being associated with the prognosis of cholangiocarcinoma.{{cite journal | vauthors = Da Z, Gao L, Su G, Yao J, Fu W, Zhang J, Zhang X, Pei Z, Yue P, Bai B, Lin Y, Meng W, Li X | display-authors = 6 | title = Bioinformatics combined with quantitative proteomics analyses and identification of potential biomarkers in cholangiocarcinoma | journal = Cancer Cell International | volume = 20 | issue = 1 | pages = 130 | date = 2020-04-22 | pmid = 32336950 | pmc = 7178764 | doi = 10.1186/s12935-020-01212-z | doi-access = free }}

References

{{reflist}}