A Self-Consistent Approach to Rotamer and Protonation State Assignments (RAPA): Moving Beyond Single Protein Configurations

  • ¹Ph.D. Program in Chemistry, The Graduate Center, City University of New York, New York, New York 10016, United States.
  • ²Ventus Therapeutics, Inc., 4800 Rue Levy, Montreal, Quebec H4R 2P7, Canada.
  • ³Department of Chemistry, Lehman College, City University of New York, New York, New York 10468, United States.
  • ⁴Ventus Therapeutics U.S. Inc., 100 Beaver St. Suite 201, Waltham, Massachusetts 02453, United States.
  • ⁵Ph.D. Program in Biochemistry, The Graduate Center, City University of New York, New York, New York 10016, United States.

Abstract

There are currently over 160,000 protein crystal structures obtained by X-ray diffraction with resolutions of 1.5 Å or greater in the Protein Data Bank. At these resolutions, hydrogen atoms do not resolve and heavy atoms such as oxygen, carbon, and nitrogen are indistinguishable. This leads to ambiguity in the rotamer and protonation states of multiple amino acids, notably asparagine, glutamine, histidine, serine, tyrosine, and threonine. When the rotamer and protonation states of these residues change, so too does the electrochemical surface of a binding site. A variety of computational approaches have been developed to assign states for these residues by investigating all possibilities and typically deciding on a single rotamer or protonation state for each residue that is consistent with the crystal structure. Here, we posit that multiple rotamer and protonation states are consistent with the resolved structure of a protein. We introduce a Rotamer and Protonation Assignment (RAPA) protocol that analyzes local hydrogen-bonding environments in resolved protein structures and identifies a set of unique rotamer and protonation states that are energetically consistent with the experimentally reported crystal structure. We evaluate the RAPA-predicted configurations in molecular dynamics simulations and find that, for each protein, multiple configurations maintain structures consistent with the X-ray results. In our initial evaluations of the RAPA protocol, we find that most proteins (69 of 77) have multiple energetically accessible rotamer and protonation state configurations; however, the total number is limited to 8 or fewer for most of the proteins (62 of 77). This suggests that there is no combinatorial explosion in the number of energetically accessible rotamer and protonation states for most proteins, and that investigating all such states is computationally feasible.
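
To illustrate why a combinatorial explosion might be expected in the first place, the following is a minimal sketch that counts and enumerates the unfiltered state space for a handful of ambiguous residues. It is not the RAPA protocol: the per-residue state lists in `ASSUMED_STATES` (including the choice of three discretized hydroxyl rotamers) and the helper names are assumptions made purely for illustration, and no hydrogen-bonding or energetic filtering is performed.

```python
from itertools import product
from math import prod

# Hypothetical per-residue candidate states (illustrative assumptions only,
# not values taken from the RAPA protocol).
ASSUMED_STATES = {
    "ASN": ["amide", "amide_flipped"],
    "GLN": ["amide", "amide_flipped"],
    # Three protonation states (HID, HIE, HIP), each with a possible ring flip.
    "HIS": ["HID", "HIE", "HIP", "HID_flipped", "HIE_flipped", "HIP_flipped"],
    # Hydroxyl hydrogen orientation, coarsely discretized into three rotamers.
    "SER": ["oh_rot0", "oh_rot1", "oh_rot2"],
    "THR": ["oh_rot0", "oh_rot1", "oh_rot2"],
    "TYR": ["oh_rot0", "oh_rot1"],
}


def naive_state_count(residues):
    """Number of configurations if every residue is varied independently."""
    return prod(len(ASSUMED_STATES[name]) for name in residues)


def enumerate_configurations(residues):
    """Yield every combination of candidate states, one tuple per configuration."""
    return product(*(ASSUMED_STATES[name] for name in residues))


if __name__ == "__main__":
    site = ["ASN", "HIS", "SER", "TYR"]
    print(f"Unfiltered configurations for {site}: {naive_state_count(site)}")
```

Under these assumed state lists the unfiltered count grows multiplicatively with each ambiguous residue, which is the growth that the abstract reports is not observed among the energetically accessible configurations for most proteins.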
