SSHHPS
{{rewrite|date=December 2024}}
{{cs1 config|name-list-style=vanc|display-authors=6}}
SSHHPS is an acronym for short stretches of homologous host pathogen sequences. The acronym was coined by Legler in a 2019 publication.{{cite journal | vauthors = Morazzani EM, Compton JR, Leary DH, Berry AV, Hu X, Marugan JJ, Glass PJ, Legler PM | title = Proteolytic cleavage of host proteins by the Group IV viral proteases of Venezuelan equine encephalitis virus and Zika virus | journal = Antiviral Research | volume = 164 | pages = 106–122 | date = April 2019 | pmid = 30742841 | pmc = 9575189 | doi = 10.1016/j.antiviral.2019.02.001 }} Legler used BLAST to search for host protein substrates of the nsP2 protease of the Venezuelan equine encephalitis virus (VEEV) and the protease of Zika virus. These viruses are Group IV (+)ssRNA viruses. Short sequences of ~20–25 amino acids from the viral polyprotein, each containing the scissile bond, were used to search the human proteome.{{cite journal | vauthors = Hu X, Compton JR, Legler PM | title = Analysis of Group IV Viral SSHHPS Using In Vitro and In Silico Methods | journal = Journal of Visualized Experiments | volume = 154 | issue = 154 | date = December 2019 | pmid = 31904018 | doi = 10.3791/60421 }} Many of the sequence alignments were spurious, while some matched well with the residues surrounding the scissile bond. When all known host proteins shown to be cut by viral proteases were consolidated into a table, it became clear that the targets were not random.{{cite journal | vauthors = Reynolds ND, Aceves NM, Liu JL, Compton JR, Leary DH, Freitas BT, Pegan SD, Doctor KZ, Wu FY, Hu X, Legler PM | title = The SARS-CoV-2 SSHHPS Recognized by the Papain-like Protease | journal = ACS Infectious Diseases | volume = 7 | issue = 6 | pages = 1483–1502 | date = June 2021 | pmid = 34019767 | pmc = 8171221 | doi = 10.1021/acsinfecdis.0c00866 }} Most were related to innate immunity, while others appeared to be related to viral pathogenesis and the virus-induced phenotype. Some hits were related to both.
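The query construction described above, a window of roughly 20–25 residues centered on the scissile bond, can be illustrated with a minimal Python sketch. The function name, the <code>flank</code> parameter, and the toy sequence are assumptions made for illustration only; they are not taken from the cited publications, and the toy sequence is not a real viral polyprotein.

```python
def cleavage_site_window(polyprotein: str, p1_index: int, flank: int = 12) -> str:
    """Return the residues surrounding a scissile bond.

    p1_index is the 0-based index of the P1 residue, i.e. the residue
    immediately N-terminal to the scissile bond. Up to `flank` residues
    are kept on each side of the bond (hypothetical default: 12 + 12 = 24).
    """
    if not 0 <= p1_index < len(polyprotein) - 1:
        raise ValueError("p1_index must leave at least one residue after the bond")
    bond = p1_index + 1                      # the bond sits between P1 and P1'
    start = max(0, bond - flank)             # clip at the N-terminus
    end = min(len(polyprotein), bond + flank)  # clip at the C-terminus
    return polyprotein[start:end]


# Artificial toy sequence (NOT a real polyprotein): 20 x A, then LKGG, then 20 x S.
toy_polyprotein = "A" * 20 + "LKGG" + "S" * 20
# The second G (index 23) is treated as P1, so the bond follows "LKGG".
window = cleavage_site_window(toy_polyprotein, p1_index=23)
print(window, len(window))
```

A window produced this way would then be pasted into a BLAST search form as the query; the search step itself is not shown here.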
The list of experimentally confirmed host targets of Group IV viral proteases includes key proteins involved in innate immunity, e.g. MAVS, RIG-I, STING, TRIF, and TRIM14. Histone H3, reported in 1984 as a target of foot-and-mouth disease virus, was one of the first host proteins shown to be cut by a viral protease.{{cite journal | vauthors = Grigera PR, Tisminetzky SG | title = Histone H3 modification in BHK cells infected with foot-and-mouth disease virus | journal = Virology | volume = 136 | issue = 1 | pages = 10–19 | date = July 1984 | pmid = 6330987 | doi = 10.1016/0042-6822(84)90243-5 }}{{cite journal | vauthors = Falk MM, Grigera PR, Bergmann IE, Zibert A, Multhaup G, Beck E | title = Foot-and-mouth disease virus protease 3C induces specific proteolytic cleavage of host cell histone H3 | journal = Journal of Virology | volume = 64 | issue = 2 | pages = 748–756 | date = February 1990 | pmid = 2153239 | pmc = 249169 | doi = 10.1128/jvi.64.2.748-756.1990 }} The histone tails are strategic targets of the viral proteases: their cleavage can shut down host cell transcription{{cite journal | vauthors = Tesar M, Marquardt O | title = Foot-and-mouth disease virus protease 3C inhibits cellular transcription and mediates cleavage of histone H3 | journal = Virology | volume = 174 | issue = 2 | pages = 364–374 | date = February 1990 | pmid = 2154880 | doi = 10.1016/0042-6822(90)90090-e }}{{cite journal | vauthors = Yi SJ, Kim K | title = Histone tail cleavage as a novel epigenetic regulatory mechanism for gene expression | journal = BMB Reports | volume = 51 | issue = 5 | pages = 211–218 | date = May 2018 | pmid = 29540259 | pmc = 5988574 | doi = 10.5483/BMBRep.2018.51.5.053 }} and the many effects of interferon.
[[File:VEEVcleavagesite.pdf|thumb|Viral proteases recognize sequence motifs. The subsite tolerances of the protease can vary, leading to the recognition of many sequences. The protease is complementary to many peptides.]]
Silencing
Silencing can occur at the level of DNA, RNA, and protein. The third mechanism, silencing at the protein level, involves proteases acting on proteins. SSHHPS cleavage is a type of target-specific co- or post-translational silencing.
{{CSS image crop
|Image = SSHHPS2.pdf
|bSize = 600
|cWidth =600
|cHeight =250
|oTop = 90
|oLeft = 40
}}Silencing can occur at the level of DNA, RNA, and protein. SSHHPS are short stretches of homologous host pathogen sequences. These sequences are found at the viral protease cleavage sites and correspond to specific proteins in the host. The cleavage of these sequences can be co- or post-translational. The original figure can be found in Morazzani, et al.
Predictions
= SARS-CoV-2 =
[[File:COVID Graph.pdf|thumb|400px|Clustering of potential host targets of viral proteases. Plot of PHI-BLAST search results for the SARS-CoV-2 papain-like protease. Hits can be found in a file that can be downloaded on the BLAST website. On the right side of the graph are the sequences that have the strongest alignment to residues at the cleavage site over the longest continuous stretch. For SARS-CoV-2 papain-like protease (PLpro) the host proteins with the highest similarity were the cardiac myosins, myomesin, and PROS1. A distribution can be extracted as there are multiple proteins under each point.
ADGRA2 (GPR124) is a common hit among neuroinvasive viruses. The alignments on the left side of the graph were more spurious and gapped. Blue points inside the gray circle correspond to proteins experimentally shown to be cut. The original figure can be found in Doctor, et al. ]]
Using PHI-BLAST and a sequence pattern (e.g. L[RK]GG), a shorter list of host targets could be obtained; however, the searches still produced hundreds of host targets ([https://www.youtube.com/watch?v=T42Qxlrb_Y4 YouTube video]). To sort and rank-order them, Legler used clustering. When 'percent positives' was plotted against 'alignment length' from the PHI-BLAST output file, the cleavable proteins were found to cluster on the right of the graph.{{cite journal | vauthors = Doctor KZ, Gilmour E, Recarte M, Beatty TR, Shifa I, Stangel M, Schwisow J, Leary DH, Legler PM | title = Automated SSHHPS Analysis Predicts a Potential Host Protein Target Common to Several Neuroinvasive (+)ssRNA Viruses | journal = Viruses | volume = 15 | issue = 2 | page = 542 | date = February 2023 | pmid = 36851756 | pmc = 9961674 | doi = 10.3390/v15020542 | doi-access = free }} The hit lists could then be sorted by alignment length and percent positives to produce a rank-ordered list, with the most likely substrates at the top and the less likely substrates at the bottom. This ranking, together with experimental data, became the basis for the first sequence-to-symptom software for viruses. An example of the software output can be found [https://www.researchgate.net/publication/368646727_COVID_SARS_MERS_EV71_WNV_NRV_EEEV_VEEV_ZIKV_3xlsx#fullTextFileContent here].
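The sorting step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the <code>Hit</code> record, its field names, the regular-expression check standing in for the PHI-BLAST pattern constraint, and the toy hits (<code>HOST_A</code> to <code>HOST_D</code>) are all hypothetical, and the sketch is not the published software.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One alignment from a hit list (hypothetical field names)."""
    protein: str
    aligned_seq: str          # host residues covered by the alignment
    alignment_length: int
    percent_positives: float


# The example pattern from the text, written as a regular expression.
MOTIF = re.compile(r"L[RK]GG")


def rank_hits(hits):
    """Keep hits containing the motif, then sort longest/most similar first."""
    with_motif = [h for h in hits if MOTIF.search(h.aligned_seq)]
    return sorted(
        with_motif,
        key=lambda h: (h.alignment_length, h.percent_positives),
        reverse=True,
    )


# Made-up toy hits; names, sequences, and scores are not real search results.
toy_hits = [
    Hit("HOST_A", "AALKGGSS", 8, 75.0),
    Hit("HOST_B", "TTLRGGAVVQ", 10, 70.0),
    Hit("HOST_C", "PPLQGGTTAAAA", 12, 90.0),   # lacks the motif, so it is dropped
    Hit("HOST_D", "QELRGGKKLM", 10, 80.0),
]
ranked = rank_hits(toy_hits)
for h in ranked:
    print(h.protein, h.alignment_length, h.percent_positives)
```

Sorting on the tuple (alignment length, percent positives) places the longest, most similar alignments first, which corresponds to the right-hand side of the plot described above.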
After sorting the hits, Legler found that the hits at the top of the list had similarities to the virus-induced phenotype. For the SARS-CoV-2 papain-like protease (PLpro), the cardiac myosins (MYH6, MYH7) were the strongest predicted hits; MYOM1, POT1, VWF, PROS1, HER4, and FOXP3 were also predicted, and the sequences were shown to be cleavable. A group at UCSF showed the cleavage of myofibrils in cardiomyocytes after infection with SARS-CoV-2.{{cite web | vauthors = Perez-Bermejo JA |title=Novel Coronavirus Could be Slicing Heart Muscles Cells into Pieces |url=https://www.statnews.com/wp-content/uploads/2020/09/Gladstone_Comparison_frag-CMs-COVID-19-1.jpg |website=Weather Channel |publisher=The Weather Channel |ref=Perez-Bermejo}}{{cite journal | vauthors = Pérez-Bermejo JA, Kang S, Rockwood SJ, Simoneau CR, Joy DA, Ramadoss GN, Silva AC, Flanigan WR, Li H, Nakamura K, Whitman JD, Ott M, Conklin BR, McDevitt TC | title = SARS-CoV-2 infection of human iPSC-derived cardiac cells predicts novel cytopathic features in hearts of COVID-19 patients | journal = bioRxiv | date = September 2020 | pmid = 32935097 | pmc = 7491510 | doi = 10.1101/2020.08.25.265561 | ref = Perez-Bermejo bioRx | publisher = Cold Spring Harbor Laboratory }} Fragments of the sarcomere are still visible, showing that the cleavage of the myofibrils occurs post-translationally and after the assembly of the myofibril.
The viral proteases have also been suspected in [https://www.wikidoc.org/index.php/COVID-19-associated%20coagulopathy COVID coagulopathy].{{cite journal | vauthors = Baroni M, Beltrami S, Schiuma G, Ferraresi P, Rizzo S, Passaro A, Molina JM, Rizzo R, Di Luca D, Bortolotti D | title = In Situ Endothelial SARS-CoV-2 Presence and PROS1 Plasma Levels Alteration in SARS-CoV-2-Associated Coagulopathies | journal = Life | volume = 14 | issue = 2 | page = 237 | date = February 2024 | pmid = 38398746 | pmc = 10890393 | doi = 10.3390/life14020237 | doi-access = free | bibcode = 2024Life...14..237B }} The PLpro of SARS-CoV-2 was able to cut sequences in PROS1 and VWF.
= Zika virus =
Zika virus has been associated with microcephaly and anencephaly. Using the sorting and graphical method described above, hits related to these phenotypes emerged, such as GIT1, FOXG1, and SFRP1. [https://www.ncbi.nlm.nih.gov/core/lw/2.0/html/tileshop_pmc/tileshop_pmc_inline.html?title=Click%20on%20image%20to%20zoom&p=PMC3&id=4363336_en-24-8-g001.jpg GIT1 knockout] mice develop microcephaly.{{cite journal | vauthors = Hong ST, Mah W | title = A Critical Role of GIT1 in Vertebrate and Invertebrate Brain Development | journal = Experimental Neurobiology | volume = 24 | issue = 1 | pages = 8–16 | date = March 2015 | pmid = 25792865 | pmc = 4363336 | doi = 10.5607/en.2015.24.1.8 }} Mice and rats have not been shown to develop microcephaly after infection with Zika virus (ZIKV). However, Goodfellow, et al.{{cite journal | vauthors = Goodfellow FT, Tesla B, Simchick G, Zhao Q, Hodge T, Brindley MA, Stice SL | title = Zika Virus Induced Mortality and Microcephaly in Chicken Embryos | journal = Stem Cells and Development | volume = 25 | issue = 22 | pages = 1691–1697 | date = November 2016 | pmid = 27627457 | pmc = 6453490 | doi = 10.1089/scd.2016.0231 }} showed that [https://pmc.ncbi.nlm.nih.gov/articles/PMC6453490/figure/f3/ chickens] can develop microcephaly when infected with ZIKV. SFRP1 is a predicted host protein substrate for the Zika viral protease, and the sequence at the predicted cleavage site is identical in humans and chickens, two species that both develop [https://people.com/celebrity/mother-says-zika-caused-sons-microcephaly/ microcephaly] after infection with Zika virus. SFRP1 is part of the Wnt signaling pathway. The loss of function of more than one protein may be needed to produce the virus-induced phenotype.
[[File:SFRP1.pdf|thumb]]
= HKU5 =
The SSHHPS for Pipistrellus bat coronavirus HKU5 (Bat-CoV HKU5) have been predicted and can be found [https://www.researchgate.net/publication/389880568_HKU5_PLpro_SSHHPS#fullTextFileContent here]. Analysis of the PLpro SSHHPS in HKU5 identified hits related to neurodevelopmental disorders, epilepsy, seizures, respiratory effects, lung inflammation, spinocerebellar ataxia, microphthalmia, ocular abnormalities, IBS, anhidrosis, hydrocephalus, hearing loss, elevated hemoglobin and hematocrit, skeletal dysplasia, microcephaly, nephrotic syndrome, among others. ADGRA2 was among the predictions.
Experimental confirmation
In 1996, Blom, et al.{{cite journal | vauthors = Blom N, Hansen J, Blaas D, Brunak S | title = Cleavage site analysis in picornaviral polyproteins: discovering cellular targets by neural networks | journal = Protein Science | volume = 5 | issue = 11 | pages = 2203–2216 | date = November 1996 | pmid = 8931139 | pmc = 2143287 | doi = 10.1002/pro.5560051107 }} created a neural network to predict the host targets of picornaviral proteases. One of the predicted hits was dystrophin (DMD). Badorff, et al.{{cite journal | vauthors = Badorff C, Lee GH, Lamphear BJ, Martone ME, Campbell KP, Rhoads RE, Knowlton KU | title = Enteroviral protease 2A cleaves dystrophin: evidence of cytoskeletal disruption in an acquired cardiomyopathy | journal = Nature Medicine | volume = 5 | issue = 3 | pages = 320–326 | date = March 1999 | pmid = 10086389 | doi = 10.1038/6543 }} confirmed that dystrophin could be cleaved by the enteroviral 2A protease. Lim, et al.{{cite journal | vauthors = Lim BK, Peter AK, Xiong D, Narezkina A, Yung A, Dalton ND, Hwang KK, Yajima T, Chen J, Knowlton KU | title = Inhibition of Coxsackievirus-associated dystrophin cleavage prevents cardiomyopathy | journal = The Journal of Clinical Investigation | volume = 123 | issue = 12 | pages = 5146–5151 | date = December 2013 | pmid = 24200690 | pmc = 3859391 | doi = 10.1172/JCI66271 }} went one step further and generated a transgenic mouse ("the uncleavable mouse" experiment). The knock-in mice had a mutation in the predicted 2A protease cleavage site in dystrophin that could not be cut by the viral protease. When the viral protease was expressed in cardiomyocytes the cleavage-resistant dystrophin inhibited the cardiomyopathy induced by the viral protease. This experiment brought the idea full circle, i.e. that the viral protease is related to the virus-induced phenotype (i.e. cardiomyopathy). Moreover, the experiment indicated that the clinical presentation could be predicted directly from the viral genome sequence. 
While Blom's predictions were accurate and could be confirmed by others, a common hit was never found across family or genus.
Conservation - a common hit among neuroinvasive viruses
Using Python, the PHI-BLAST searches and UniProt descriptions could be combined and automated, and the search could be repeated several times. Running the searches for 9 neuroinvasive viruses, Legler found that if the viruses were clustered by a common virus-induced phenotype (e.g. neuroinvasiveness), a common hit emerged. One protein common to all 9 hit lists was the orphan G-protein coupled receptor ADGRA2 (also known as GPR124). When ADGRA2 is knocked out in mouse models of ischemia and glioblastoma, blood-brain barrier (BBB) disruption is observed.{{cite journal | vauthors = Chang J, Mancuso MR, Maier C, Liang X, Yuki K, Yang L, Kwong JW, Wang J, Rao V, Vallon M, Kosinski C, Zhang JJ, Mah AT, Xu L, Li L, Gholamin S, Reyes TF, Li R, Kuhnert F, Han X, Yuan J, Chiou SH, Brettman AD, Daly L, Corney DC, Cheshier SH, Shortliffe LD, Wu X, Snyder M, Chan P, Giffard RG, Chang HY, Andreasson K, Kuo CJ | title = Gpr124 is essential for blood-brain barrier integrity in central nervous system disease | journal = Nature Medicine | volume = 23 | issue = 4 | pages = 450–460 | date = April 2017 | pmid = 28288111 | pmc = 5559385 | doi = 10.1038/nm.4309 }}{{cite journal | vauthors = Posokhova E, Shukla A, Seaman S, Volate S, Hilton MB, Wu B, Morris H, Swing DA, Zhou M, Zudaire E, Rubin JS, St Croix B | title = GPR124 functions as a WNT7-specific coactivator of canonical β-catenin signaling | journal = Cell Reports | volume = 10 | issue = 2 | pages = 123–130 | date = January 2015 | pmid = 25558062 | pmc = 4331012 | doi = 10.1016/j.celrep.2014.12.020 }} The cleavage sites for the viral proteases of the 9 neuroinvasive viruses were all found in this one protein; in some cases the cleavage site was predicted to be on the outside of the cell, and in other cases in the cytoplasm.
[[File:ADGRA2.pdf|thumb]]
Interestingly, the software did not predict a specific cleavage site sequence or a particular type of protease (e.g. serine, cysteine, aspartyl) but rather a general pathway and common target. A strategy to enter the brain may have been preserved during viral evolution.
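Finding a protein common to several hit lists amounts to a set intersection. The following Python sketch shows the idea under stated assumptions: the function name, the placeholder virus labels, and the toy hit lists (including the <code>HOST_*</code> entries) are hypothetical and do not reproduce the published hit lists or the published software.

```python
from typing import Iterable, Mapping, Set


def common_hits(hit_lists: Mapping[str, Iterable[str]]) -> Set[str]:
    """Return the proteins that appear in every virus's hit list."""
    sets = [set(proteins) for proteins in hit_lists.values()]
    if not sets:
        return set()
    # Intersect all per-virus sets; only proteins shared by all remain.
    return set.intersection(*sets)


# Toy hit lists keyed by placeholder virus labels (not real search output).
toy_hit_lists = {
    "VIRUS_1": ["ADGRA2", "HOST_A", "HOST_B"],
    "VIRUS_2": ["ADGRA2", "HOST_B"],
    "VIRUS_3": ["HOST_C", "ADGRA2"],
}
shared = common_hits(toy_hit_lists)
print(sorted(shared))
```

Because the intersection ignores where in each protein the alignment falls, a shared protein can emerge even when each virus's protease matches a different site within it, consistent with the observation above that no single cleavage site sequence was predicted.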
Origin
RNA viruses are known to acquire host sequences. In some cases whole enzymes have been acquired by viral genomes;{{cite journal | vauthors = Gorbalenya A |title=Host-related sequences in RNA viral genomes |journal=Seminars in Virology |date=1992 |volume=3 |page=359 |doi=10.1016/1044-5773(92)050359 |doi-broken-date=5 December 2024 |url=https://www.researchgate.net/publication/365960347}} the papain-like protease is a good example. Host genomes serve as the largest source of foreign genetic material.{{cite journal | vauthors = Gorbalenya A |title=Host-related sequences in RNA viral genomes |journal=Seminars in Virology |date=1992 |volume=3 |page=359 |doi=10.1016/1044-5773(92)050359 |doi-broken-date=5 December 2024 |url=https://www.researchgate.net/publication/365960347}} Using the RNA sequence of a viral protease cleavage site for SARS-CoV-2 and the bat genome, sequence matches can be found. In Group IV viruses, the protein sequences of the SSHHPS are shared among the virus, host, and reservoir, while the RNA sequences match sequences in the reservoir species, suggesting that they were acquired.
Timing of cleavages
Location of symptom information in viral genomes
The sequences associated with the virus-induced phenotypes for other viruses may be hidden in transcription factors, endonuclease cleavage sites, phosphorylation sites, etc. For Group IV (+)ssRNA viruses, the information can be found in the protease cleavage sites (the SSHHPS).
[[File:SeqSymptom.pdf|thumb]]
For Group VI (+)ssRNA retroviruses, the information may be in the protease cleavage sites{{cite journal | vauthors = Nie Z, Phenix BN, Lum JJ, Alam A, Lynch DH, Beckett B, Krammer PH, Sekaly RP, Badley AD | title = HIV-1 protease processes procaspase 8 to cause mitochondrial release of cytochrome c, caspase cleavage and nuclear fragmentation | journal = Cell Death and Differentiation | volume = 9 | issue = 11 | pages = 1172–1184 | date = November 2002 | pmid = 12404116 | doi = 10.1038/sj.cdd.4401094 }} and elsewhere.
Conservation
SSHHPS must show evidence of sequence homology between host and pathogen and of a host-pathogen interaction. The sequence in the viral genome may not be identical to the host DNA, but a short stretch of the protein sequence may match at the predicted protease cleavage site. A protein found in another species is considered a "homologue" if it shares a common evolutionary origin with the protein in the first species, meaning that both are derived from the same ancestral gene in a common ancestor. SSHHPS appear to be acquired rather than products of accumulated random mutations. David Baltimore proposed a copy-choice mechanism{{cite journal |last1=Simicic |first1=P |last2=Zidovec-Lepej |first2=S |title=A Glimpse on the Evolution of RNA Viruses: Implications and Lessons from SARS-CoV-2 |journal=Viruses |date=2022 |volume=15 |issue=1 |page=1 |doi=10.3390/v15010001 |doi-access=free |pmid=36680042|pmc=9866536 }} for RNA recombination in RNA viruses, in which the viral RNA-dependent RNA polymerase switches templates during negative strand synthesis.{{cite journal | vauthors = Kirkegaard K, Baltimore D | title = The mechanism of RNA recombination in poliovirus | journal = Cell | volume = 47 | issue = 3 | pages = 433–443 | date = November 1986 | pmid = 3021340 | pmc = 7133339 | doi = 10.1016/0092-8674(86)90600-8 }} Host genomes serve as the largest source of foreign genetic material for viruses. RNA has secondary structure, and pauses in replication may occur.
Whether certain RNA-binding proteins or enzymes{{cite journal | vauthors = Lindley SR, Subbaiah KC, Priyanka F, Poosala P, Ma Y, Jalinous L, West JA, Richardson WA, Thomas TN, Anderson DM | title = Ribozyme-activated mRNA trans-ligation enables large gene delivery to treat muscular dystrophies | journal = Science | volume = 386 | issue = 6723 | pages = 762–767 | date = November 2024 | pmid = 39541470 | doi = 10.1126/science.adp8179 | bibcode = 2024Sci...386..762L }} in the reservoir species (e.g. bats) affect or promote RNA recombination is still unclear.
References
{{Reflist}}