Protein pK_a calculations

In computational biology, protein pK_a calculations are used to estimate the pK_a values of amino acids as they exist within proteins. These calculations complement the pK_a values reported for amino acids in their free state, and are used frequently within the fields of molecular modeling, structural bioinformatics, and computational biology.

Amino acid pK_a values

pK_a values of amino acid side chains play an important role in defining the pH-dependent characteristics of a protein. The pH-dependence of the activity displayed by enzymes and the pH-dependence of protein stability, for example, are properties that are determined by the pK_a values of amino acid side chains.

The pK_a value of an amino acid side chain in solution is typically inferred from the pK_a values of model compounds (compounds that are similar to the side chains of amino acids). See Amino acid for the pK_a values of all amino acid side chains inferred in this way. Numerous experimental studies have also yielded such values, for example by use of NMR spectroscopy.

The table below lists the model pK_a values that are often used in a protein pK_a calculation; its third column gives values based on protein studies. Hass and Mulder (2015) Annu. Rev. Biophys. vol 44 pp. 53–75 [https://dx.doi.org/10.1146/annurev-biophys-083012-130351 doi 10.1146/annurev-biophys-083012-130351].

{| class="wikitable sortable"
! Amino acid !! Model pK_a !! pK_a from protein studies
|-
| Asp (D) || 3.9 || 4.0
|-
| Glu (E) || 4.3 || 4.4
|-
| Arg (R) || 12.0 || 13.5
|-
| Lys (K) || 10.5 || 10.4
|-
| His (H) || 6.08 || 6.8
|-
| Cys (C) (–SH) || 8.28 || 8.3
|-
| Tyr (Y) || 10.1 || 9.6
|-
| N-term || 8.0 ||
|-
| C-term || 3.6 ||
|}
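As a minimal illustration of how such model values are used, the sketch below evaluates the Henderson–Hasselbalch protonation probability of an isolated, non-interacting group. The function name and the dictionary are hypothetical helpers introduced here; the numbers are the model pK_a values from the table above, and the treatment ignores every protein effect discussed in the following sections.

```python
def fraction_protonated(ph: float, pka: float) -> float:
    """Henderson-Hasselbalch protonation probability of one isolated site."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Model pK_a values taken from the first numeric column of the table above.
MODEL_PKA = {
    "Asp": 3.9,
    "Glu": 4.3,
    "His": 6.08,
    "Cys": 8.28,
    "Tyr": 10.1,
    "Lys": 10.5,
    "Arg": 12.0,
}

# Print the protonated fraction of each free side chain at pH 7.
for name, pka in MODEL_PKA.items():
    print(f"{name}: {fraction_protonated(7.0, pka):.3f} protonated at pH 7")
```

At a pH equal to the pK_a, the function returns exactly one half, which is the defining property of the pK_a of an isolated group.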

The effect of the protein environment

Image:Back titration.jpg

When a protein folds, the titratable amino acids in the protein are transferred from a solution-like environment to an environment determined by the three-dimensional structure of the protein. For example, in an unfolded protein, an aspartic acid is typically in an environment that exposes the titratable side chain to water. When the protein folds, the aspartic acid could find itself buried deep in the protein interior with no exposure to solvent.

Furthermore, in the folded protein, the aspartic acid will be closer to other titratable groups in the protein and will also interact with permanent charges (e.g. ions) and dipoles in the protein. All of these effects alter the pK_a value of the amino acid side chain, and pK_a calculation methods generally calculate the effect of the protein environment on the model pK_a value of an amino acid side chain. Bashford (2004) Front Biosci. vol. 9 pp. 1082–99 [https://dx.doi.org/10.2741/1187 doi 10.2741/1187]; Gunner et al. (2006) Biochim. Biophys. Acta vol. 1757 (8) pp. 942–68 [https://dx.doi.org/10.1016/j.bbabio.2006.06.005 doi 10.1016/j.bbabio.2006.06.005]; Ullmann et al. (2008) Photosynth. Res. 97 vol. 112 pp. 33–55 [https://dx.doi.org/10.1007/s11120-008-9306-1 doi 10.1007/s11120-008-9306-1]; Antosiewicz et al. (2011) Mol. BioSyst. vol. 7 pp. 2923–2949 [https://dx.doi.org/10.1039/C1MB05170A doi 10.1039/C1MB05170A]

Typically, the effects of the protein environment on the amino acid pK_a value are divided into pH-independent effects and pH-dependent effects. The pH-independent effects (desolvation, interactions with permanent charges and dipoles) are added to the model pK_a value to give the intrinsic pK_a value. The pH-dependent effects cannot be added in the same straightforward way and have to be accounted for using Boltzmann summation, Tanford–Roxby iterations or other methods.

The interplay of the intrinsic pK_a values of a system with the electrostatic interaction energies between titratable groups can produce striking effects such as non-Henderson–Hasselbalch titration curves and even back-titration effects. A. Onufriev, D.A. Case and G. M. Ullmann (2001). Biochemistry 40: 3413–3419 [https://dx.doi.org/10.1021/bi002740q doi 10.1021/bi002740q]

The image on the right shows a theoretical system consisting of three acidic residues. One group (shown in blue) displays a back-titration event.
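The Boltzmann summation mentioned above can be sketched for a small system by enumerating every protonation state explicitly. The code below is an illustration under stated assumptions, not the procedure of any particular program: all sites are treated as acids that are neutral when protonated and charged when deprotonated, the hypothetical matrix `w` holds pairwise interaction energies in pK units (energy divided by RT ln 10) that apply only when both sites are deprotonated, and the intrinsic pK_a values are defined with all other sites protonated.

```python
import itertools


def titration_curves(pk_intr, w, ph_values):
    """Exact protonation probabilities by summing over all 2**N states.

    pk_intr   : intrinsic pK_a of each site (all other sites protonated)
    w         : symmetric matrix of interaction energies in pK units
                (assumed to act only between two deprotonated sites)
    ph_values : pH values at which to evaluate the curves
    Returns one list of per-site probabilities for each pH value.
    """
    n = len(pk_intr)
    states = list(itertools.product((0, 1), repeat=n))  # 1 = protonated
    curves = []
    for ph in ph_values:
        weights = []
        for x in states:
            # State energy in units of RT ln 10.
            g = sum(x[i] * (ph - pk_intr[i]) for i in range(n))
            g += sum(
                w[i][j] * (1 - x[i]) * (1 - x[j])
                for i in range(n)
                for j in range(i + 1, n)
            )
            weights.append(10.0 ** (-g))
        z = sum(weights)  # partition function at this pH
        curves.append(
            [sum(wt * x[i] for wt, x in zip(weights, states)) / z for i in range(n)]
        )
    return curves


# Two identical acids (intrinsic pK_a 4) that repel each other when charged.
print(titration_curves([4.0, 4.0], [[0.0, 2.0], [2.0, 0.0]], [3.0, 4.0, 5.0, 6.0]))
```

The cost grows as 2^N, so this direct enumeration is only practical for a handful of sites; for a single site it reduces to the Henderson–Hasselbalch curve.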

pK_a calculation methods

Several software packages and web servers are available for the calculation of protein pK_a values.

=Using the Poisson–Boltzmann equation=

Some methods are based on solutions to the Poisson–Boltzmann equation (PBE), often referred to as FDPB-based methods (FDPB stands for "finite difference Poisson–Boltzmann"). The PBE is a modification of Poisson's equation that incorporates a description of the effect of solvent ions on the electrostatic field around a molecule.
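For orientation, one common way of writing the PBE (here in SI units) is sketched below. The symbols are notation introduced for this illustration rather than taken from any particular program: $\varepsilon(\mathbf{r})$ is the position-dependent permittivity, $\phi(\mathbf{r})$ the electrostatic potential, $\rho^{\mathrm{f}}(\mathbf{r})$ the fixed charge density of the molecule, $c_i^{\infty}$ and $z_i$ the bulk concentration and valence of mobile ion species $i$, $e$ the elementary charge, $k_{\mathrm{B}}T$ the thermal energy, and $\lambda(\mathbf{r})$ a function that is 1 in ion-accessible regions and 0 elsewhere.

$\nabla \cdot \left[ \varepsilon(\mathbf{r}) \nabla \phi(\mathbf{r}) \right] = -\rho^{\mathrm{f}}(\mathbf{r}) - \sum_i c_i^{\infty} z_i e \, \lambda(\mathbf{r}) \exp\left( -\frac{z_i e \, \phi(\mathbf{r})}{k_{\mathrm{B}} T} \right)$

Setting the ionic sum to zero recovers Poisson's equation. Finite-difference methods solve a discretized form of this equation on a grid, often after linearizing the exponential term.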

The [http://newbiophysics.cs.vt.edu/H++/ H++ web server],{{Cite web |title=H++ (web-based computational prediction of protonation states and pK of ionizable groups in macromolecules) |url=http://newbiophysics.cs.vt.edu/H++/hppdetails.php |access-date=2023-01-26 |website=newbiophysics.cs.vt.edu}} the [https://web.archive.org/web/20070728080556/http://enzyme.ucd.ie/pKD pKD webserver],{{Cite journal |last1=Tynan-Connolly |first1=B. M. |last2=Nielsen |first2=J. E. |date=2006-12-22 |title=Redesigning protein pKa values |journal=Protein Science |language=en |volume=16 |issue=2 |pages=239–249 |doi=10.1110/ps.062538707 |issn=0961-8368 |pmc=2203286 |pmid=17189477}} [https://gunnerlab.github.io/Stable-MCCE/ MCCE2], [https://web.archive.org/web/20071202050537/http://agknapp.chemie.fu-berlin.de/karlsberg/ Karlsberg+], [https://www.itqb.unl.pt/labs/molecular-simulation/in-house-software/ PETIT] and [https://rtullmann.de/parts/gmct-gcem.html GMCT] use the FDPB method to compute pK_a values of amino acid side chains.

FDPB-based methods calculate the change in the pK_a value of an amino acid side chain when that side chain is moved from a hypothetical fully solvated state to its position in the protein. To perform such a calculation, one needs theoretical methods that can calculate the effect of the protein interior on a pK_a value, and knowledge of the pK_a values of amino acid side chains in their fully solvated states.
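The final step of such a calculation, converting a computed free energy difference into a pK_a shift, can be sketched as follows. This is a minimal illustration: the function name and the input value are hypothetical, and the free energy difference is assumed to be the deprotonation free energy in the protein minus that in the fully solvated state, as it might be obtained from an electrostatics calculation.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # molar gas constant in kJ/(mol K)


def pka_in_protein(model_pka, ddg_deprot_kj_mol, temperature_k=298.15):
    """Shift a model pK_a by a free energy difference.

    ddg_deprot_kj_mol is assumed to be
    dG_deprot(protein) - dG_deprot(solution), in kJ/mol.
    A positive value makes deprotonation harder and raises the pK_a.
    """
    rt_ln10 = R_KJ_PER_MOL_K * temperature_k * math.log(10.0)
    return model_pka + ddg_deprot_kj_mol / rt_ln10


# Hypothetical example: an Asp (model pK_a 3.9) whose deprotonation is
# penalized by 5.7 kJ/mol in the protein, for instance through desolvation.
print(round(pka_in_protein(3.9, 5.7), 2))
```

At room temperature, RT ln 10 is roughly 5.7 kJ/mol, so each pK unit of shift corresponds to about that much free energy.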

=Empirical methods=

A set of empirical rules relating the protein structure to the pK_a values of ionizable residues has been developed by Li, Robertson, and Jensen.{{Cite journal |last1=Li |first1=Hui |last2=Robertson |first2=Andrew D. |last3=Jensen |first3=Jan H. |date=2005-10-17 |title=Very fast empirical prediction and rationalization of protein pKa values |url=https://onlinelibrary.wiley.com/doi/10.1002/prot.20660 |journal=Proteins: Structure, Function, and Bioinformatics |language=en |volume=61 |issue=4 |pages=704–721 |doi=10.1002/prot.20660|pmid=16231289 |s2cid=38196246 |url-access=subscription }} These rules form the basis of the [https://biolib.com/bio-utils/propka/ web-accessible] program PROPKA for rapid predictions of pK_a values. A more recent empirical pK_a prediction program was released by Tan et al. as the [http://mspc.bii.a-star.edu.sg/tankp/ DEPTH web server].{{Cite journal |last1=Tan |first1=Kuan Pern |last2=Nguyen |first2=Thanh Binh |last3=Patel |first3=Siddharth |last4=Varadarajan |first4=Raghavan |last5=Madhusudhan |first5=M. S. |date=2013-07-01 |title=Depth: a web server to compute depth, cavity sizes, detect potential small-molecule ligand-binding cavities and predict the pKa of ionizable residues in proteins |url=http://academic.oup.com/nar/article/41/W1/W314/1111943/Depth-a-web-server-to-compute-depth-cavity-sizes |journal=Nucleic Acids Research |language=en |volume=41 |issue=W1 |pages=W314–W321 |doi=10.1093/nar/gkt503 |issn=1362-4962 |pmc=3692129 |pmid=23766289}}

=Molecular dynamics (MD)-based methods=

Molecular dynamics methods of calculating pK_a values make it possible to include full flexibility of the titrated molecule. Donnini et al. (2011) J. Chem. Theory Comp. vol 7 pp. 1962–78 [https://dx.doi.org/10.1021/ct200061r doi 10.1021/ct200061r]. Wallace et al. (2011) J. Chem. Theory Comp. vol 7 pp. 2617–2629 [https://dx.doi.org/10.1021/ct200146j doi 10.1021/ct200146j]. Goh et al. (2012) J. Chem. Theory Comp. vol 8 pp. 36–46 [https://dx.doi.org/10.1021/ct2006314 doi 10.1021/ct2006314].

Molecular dynamics-based methods are typically much more computationally expensive than approaches based on the Poisson–Boltzmann equation, and are not necessarily more accurate. Limited conformational flexibility can also be realized within a continuum electrostatics approach, e.g., by considering multiple amino acid side chain rotamers. In addition, commonly used molecular force fields do not take electronic polarizability into account, which could be an important property in determining protonation energies.

=Determining pK_a values from titration curves or free energy calculations=

From the titration curve of a protonatable group, one can read the so-called pK_a^{1/2}, which is equal to the pH value at which the group is half-protonated (i.e. at which 50% of such groups would be protonated). The pK_a^{1/2} is equal to the Henderson–Hasselbalch pK_a (pK_a^{HH}) if the titration curve follows the Henderson–Hasselbalch equation. Ullmann (2003) J. Phys. Chem. B vol 107 pp. 1263–71 [https://dx.doi.org/10.1021/jp026454v doi 10.1021/jp026454v]. Most pK_a calculation methods implicitly assume that all titration curves are Henderson–Hasselbalch shaped, and pK_a values in pK_a calculation programs are therefore often determined in this way. In the general case of multiple interacting protonatable sites, the pK_a^{1/2} value is not thermodynamically meaningful. In contrast, the Henderson–Hasselbalch pK_a value can be computed from the protonation free energy via

$\mathrm{p}K_{\mathrm{a}}^{\mathrm{HH}}(\mathrm{pH}) =
\mathrm{pH} - \frac{\Delta G^{\mathrm{prot}}(\mathrm{pH})}{\mathrm{RT} \ln10}$

and is thus in turn related to the protonation free energy of the site via

$\Delta G^{\mathrm{prot}}(\mathrm{pH}) = \mathrm{RT} \ln10 \; ( \mathrm{pH} - \mathrm{p}K_{\mathrm{a}}^{\mathrm{HH}} )$

The protonation free energy can in principle be computed from the protonation probability of the group, ⟨x⟩(pH), which can be read from its titration curve

$\Delta G^{\mathrm{prot}}(\mathrm{pH}) = -\mathrm{RT}\ln\left[ \frac{\langle x \rangle}{1-\langle x \rangle} \right]$
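The relations above can be combined into a short numerical sketch that converts a protonation probability read from a titration curve into a protonation free energy and a Henderson–Hasselbalch pK_a. The function names and the choice of kJ/mol are assumptions made for this illustration.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # molar gas constant in kJ/(mol K)


def protonation_free_energy(x_mean, temperature_k=298.15):
    """Protonation free energy in kJ/mol from the protonation probability."""
    if not 0.0 < x_mean < 1.0:
        raise ValueError("protonation probability must lie strictly between 0 and 1")
    return -R_KJ_PER_MOL_K * temperature_k * math.log(x_mean / (1.0 - x_mean))


def pka_hh(ph, x_mean, temperature_k=298.15):
    """Henderson-Hasselbalch pK_a at a given pH from the protonation probability."""
    dg = protonation_free_energy(x_mean, temperature_k)
    return ph - dg / (R_KJ_PER_MOL_K * temperature_k * math.log(10.0))


# A group that is 50% protonated at pH 7 has a Henderson-Hasselbalch pK_a of 7.
print(pka_hh(7.0, 0.5))
```

The explicit range check reflects the fact that the logarithm diverges as the probability approaches 0 or 1, which is where estimates taken from sampled titration curves become unreliable.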

Titration curves can be computed within a continuum electrostatics approach with formally exact but more elaborate analytical or Monte Carlo (MC) methods, or with inexact but fast approximate methods. MC methods that have been used to compute titration curves Ullmann et al. (2012) J. Comput. Chem. vol 33 pp. 887–900 [https://dx.doi.org/10.1002/jcc.22919 doi 10.1002/jcc.22919] are Metropolis MC Metropolis et al. (1953) J. Chem. Phys. vol 21 pp. 1087–1092 [http://dx.doi.org/10.1063/1.1699114 doi 10.1063/1.1699114]; Beroza et al. (1991) Proc. Natl. Acad. Sci. USA vol 88 pp. 5804–5808 [https://dx.doi.org/10.1073/pnas.88.13.5804 doi 10.1073/pnas.88.13.5804] or Wang–Landau MC. Wang and Landau (2001) Phys. Rev. E vol 64 pp 056101 [http://dx.doi.org/10.1103/PhysRevE.64.056101 doi 10.1103/PhysRevE.64.056101] Approximate methods that use a mean-field approach for computing titration curves are the Tanford–Roxby method and hybrids of this method that combine an exact statistical mechanics treatment within clusters of strongly interacting sites with a mean-field treatment of intercluster interactions. Tanford and Roxby (1972) Biochemistry vol 11 pp. 2192–2198 [https://dx.doi.org/10.1021/bi00761a029 doi 10.1021/bi00761a029]; Bashford and Karplus (1991) J. Phys. Chem. vol 95 pp. 9556–61 [https://dx.doi.org/10.1021/j100176a093 doi 10.1021/j100176a093]; Gilson (1993) Proteins vol 15 pp. 266–82 [https://dx.doi.org/10.1002/prot.340150305 doi 10.1002/prot.340150305]; Antosiewicz et al. (1994) J. Mol. Biol. vol 238 pp. 415–36 [https://dx.doi.org/10.1006/jmbi.1994.1301 doi 10.1006/jmbi.1994.1301]; Spassov and Bashford (1999) J. Comput. Chem. vol 20 pp. 1091–1111 [https://dx.doi.org/10.1002/(SICI)1096-987X(199908)20:11%3c1091::AID-JCC1%3e3.0.CO;2-3 doi 10.1002/(SICI)1096-987X(199908)20:11<1091::AID-JCC1>3.0.CO;2-3]
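A mean-field iteration in the spirit of the Tanford–Roxby method can be sketched as follows. This is an illustration under simplifying assumptions, not a reproduction of the published algorithm: all sites are acids that are charged only when deprotonated, the hypothetical matrix `w` holds pairwise interaction energies in pK units between deprotonated sites, and a damping factor is added here purely to stabilize the fixed-point iteration.

```python
def mean_field_titration(pk_intr, w, ph, damping=0.5, tol=1e-10, max_iter=1000):
    """Self-consistent mean-field protonation probabilities at one pH.

    Each site is treated as a Henderson-Hasselbalch group whose effective
    pK_a is shifted by the average charge state of all other sites.
    """
    n = len(pk_intr)
    # Start from the non-interacting Henderson-Hasselbalch probabilities.
    x = [1.0 / (1.0 + 10.0 ** (ph - pk)) for pk in pk_intr]
    for _ in range(max_iter):
        # Effective pK_a: intrinsic value plus mean-field interaction term.
        pk_eff = [
            pk_intr[i] + sum(w[i][j] * (1.0 - x[j]) for j in range(n) if j != i)
            for i in range(n)
        ]
        target = [1.0 / (1.0 + 10.0 ** (ph - pk)) for pk in pk_eff]
        # Damped update: mix the old and new estimates.
        new_x = [damping * a + (1.0 - damping) * b for a, b in zip(x, target)]
        if max(abs(a - b) for a, b in zip(new_x, x)) < tol:
            return new_x
        x = new_x
    raise RuntimeError("mean-field iteration did not converge")


# Two identical acids (intrinsic pK_a 4) that repel each other when charged.
print(mean_field_titration([4.0, 4.0], [[0.0, 2.0], [2.0, 0.0]], 5.0))
```

The cost per iteration scales with the square of the number of sites rather than exponentially, which is the main attraction of mean-field treatments; the price is that correlations between strongly coupled sites are neglected.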

In practice, it can be difficult to obtain statistically converged and accurate protonation free energies from titration curves if ⟨x⟩ is close to a value of 1 or 0. In this case, one can use various free energy calculation methods to obtain the protonation free energy, such as biased Metropolis MC, Beroza et al. (1995) Biophys. J. vol 68 pp. 2233–2250 [https://dx.doi.org/10.1016/S0006-3495(95)80406-6 doi 10.1016/S0006-3495(95)80406-6] free-energy perturbation, Zwanzig (1954) J. Chem. Phys. vol 22 pp. 1420–1426 [https://dx.doi.org/10.1063/1.1740409 doi 10.1063/1.1740409]; Ullmann et al. (2011) J. Phys. Chem. B vol 115 pp. 507–521 [https://dx.doi.org/10.1021/jp1093838 doi 10.1021/jp1093838] thermodynamic integration, Kirkwood (1935) J. Chem. Phys. vol 3 pp. 300–313 [https://dx.doi.org/10.1063/1.1749657 doi 10.1063/1.1749657]; Bruckner and Boresch (2011) J. Comput. Chem. vol 32 pp. 1303–1319 [https://dx.doi.org/10.1002/jcc.21713 doi 10.1002/jcc.21713]; Bruckner and Boresch (2011) J. Comput. Chem. vol 32 pp. 1320–1333 [https://dx.doi.org/10.1002/jcc.21712 doi 10.1002/jcc.21712] the non-equilibrium work method Jarzynski (1997) Phys. Rev. E vol 56 p. 5018 [https://dx.doi.org/10.1103/PhysRevE.56.5018 doi 10.1103/PhysRevE.56.5018] or the Bennett acceptance ratio method. Bennett (1976) J. Comput. Phys. vol 22 pp. 245–268 [https://dx.doi.org/10.1016/0021-9991(76)90078-4 doi 10.1016/0021-9991(76)90078-4]
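As an example of how such a method avoids the need to observe rare protonation events, the free-energy perturbation estimate can be written as below. The notation is introduced here for illustration: $E_0$ and $E_1$ are the energies (per mole) of a configuration with the site in its deprotonated and protonated form, respectively, and the average $\langle \cdot \rangle_0$ runs over configurations sampled with the site deprotonated.

$\Delta G^{\mathrm{prot}} = -\mathrm{RT} \ln \left\langle \exp\left( -\frac{E_1 - E_0}{\mathrm{RT}} \right) \right\rangle_0$

Only the deprotonated ensemble has to be sampled, so the estimate remains usable, at least in principle, when the protonated form is almost never populated.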

Note that the pK_a^{HH} value generally depends on the pH value. Bombarda et al. (2010) J. Phys. Chem. B vol 114 pp. 1994–2003 [https://dx.doi.org/10.1021/jp908926w doi 10.1021/jp908926w].

This dependence is small for weakly interacting groups like well-solvated amino acid side chains on the protein surface, but can be large for strongly interacting groups like those buried in enzyme active sites or integral membrane proteins. Bashford and Gerwert (1992) J. Mol. Biol. vol 224 pp. 473–86 [https://dx.doi.org/10.1016/0022-2836(92)91009-E doi 10.1016/0022-2836(92)91009-E]; Spassov et al. (2001) J. Mol. Biol. vol 312 pp. 203–19 [https://dx.doi.org/10.1006/jmbi.2001.4902 doi 10.1006/jmbi.2001.4902]; Ullmann et al. (2011) J. Phys. Chem. B vol 115 pp. 10346–59 [https://dx.doi.org/10.1021/jp204644h doi 10.1021/jp204644h]

While many protein pK_a prediction methods are available, their accuracies often differ significantly because of differences in strategy that range from subtle to drastic. Wanlei Wei, Hervé Hogues, and Traian Sulea (2023) J. Chem. Inf. Model. vol 63, iss 16, pp. 5169–5181 [https://pubs.acs.org/doi/10.1021/acs.jcim.3c00165]

References

External links

[https://web.archive.org/web/20050505074400/http://www.accelrys.com/products/dstudio/ AccelrysPKA] — Accelrys CHARMm based pK_a calculation
[http://newbiophysics.cs.vt.edu/H++/ H++] — Poisson–Boltzmann based pK_a calculations
[https://gunnerlab.github.io/Stable-MCCE/ MCCE2] — Multi-Conformation Continuum Electrostatics (Version 2)
[https://web.archive.org/web/20071202050537/http://agknapp.chemie.fu-berlin.de/karlsberg/ Karlsberg+] — pK_a computation with multiple pH adapted conformations
[http://www.itqb.unl.pt/labs/molecular-simulation/in-house-software/ PETIT] — Proton and Electron TITration
[https://rtullmann.de/parts/gmct-gcem.html GMCT] — Generalized Monte Carlo Titration

[http://mspc.bii.a-star.edu.sg/tankp/ DEPTH web server] — Empirical calculation of pK_a values using Residue Depth as a major feature
