| Content Provider | ACM Digital Library |
|---|---|
| Author | Groth, Malgorzata; Ripoll, Daniel R.; Oldziej, Stanislaw; Liwo, Adam; Czaplewski, Cezary; Pillardy, Jaroslaw; Scheraga, Harold A.; Lee, Jooyoung; Rodziewicz-Motowidlo, Sylwia; Kaźmierkiewicz, Rajmund; Wawak, Ryszard J. |
| Abstract | United-residue models of polypeptide chains [3, 5, 19-22, 24, 31, 33] have long been of interest because they enable global conformational searches of proteins in real time, which in turn can facilitate ab initio protein structure prediction based solely on Anfinsen's thermodynamic hypothesis [1], according to which the native structure of a protein is a global minimum of its potential energy surface [32]. In the last few years, we developed a united-residue force field [20-22, 24], hereafter referred to as UNRES, in which a polypeptide chain is represented by a sequence of α-carbon (Cα) atoms linked by virtual bonds, with attached united side chains (SC) and united peptide groups (p) located midway between consecutive α-carbons (Figure 1). Only the united peptide groups and united side chains serve as interaction sites; the α-carbons serve only to define the geometry. All virtual-bond lengths (i.e., Cα-Cα and Cα-SC) are fixed; the Cα-Cα distance is taken as 3.8 Å, which corresponds to trans peptide groups, while the side-chain angles ($\alpha_{SC}$ and $\beta_{SC}$), as well as the virtual-bond angles θ and the virtual-bond dihedral angles γ, can vary. The energy of the virtual-bond chain is expressed by eq. (1): $U = \sum_{i<j} U_{SC_i SC_j} + \sum_{i \neq j} U_{SC_i p_j} + w_{el} \sum_{i<j-1} U_{p_i p_j} + w_{tor} \sum_{i} U_{tor}(\gamma_i) + w_{loc} \sum_{i} [U_b(\theta_i) + U_{rot}(\alpha_{SC_i}, \beta_{SC_i})] + w_{corr} U_{corr}$ (1). In contrast to all-atom force fields, in which multibody terms are only a small correction, multibody terms are an essential ingredient of coarse-grain united-residue force fields. This is because coarse-grain potentials are mean-field potentials: they correspond to the restricted free energy, F(X), calculated for a given configuration of the “coarse-grain” interaction sites (p and SC in the case of the UNRES force field; Figure 1) by averaging over the remaining, “less important” degrees of freedom, as expressed by eq. (2) [20]: $F(X) = -RT \ln\left\{ \frac{1}{V_Y} \int_{\Omega_Y} \exp[-E(X;Y)/RT]\, dV_Y \right\}$, with $V_Y = \int_{\Omega_Y} dV_Y$ (2),
where E(X; Y) is the original energy function, X denotes the vector of the degrees of freedom of the “coarse-grain” system (the virtual-bond angles θ, the virtual-bond dihedral angles γ, and the polar angles $\alpha_{SC}$ and $\beta_{SC}$ in UNRES), Y denotes the vector of the degrees of freedom over which the average is computed (e.g., the positions and orientations of solvent molecules, the side-chain dihedral angles χ, etc.), R is the gas constant, T is the absolute temperature, $\Omega_Y$ is the region of the Y subspace of variables over which the integration is carried out, and $V_Y$ is the volume of this region. Expanding F(X) of eq. (2) in a cumulant series in β = 1/RT, we obtain [20]: $F(X) \cong F(\beta, X) = U_1 - \frac{1}{2}(U_2 - U_1^2)\beta + \frac{1}{6}(U_3 - 3U_1U_2 + 2U_1^3)\beta^2 - \frac{1}{24}(U_4 - 3U_2^2 - 4U_1U_3 + 12U_1^2U_2 - 6U_1^4)\beta^3 + \ldots = \sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{k!} C_k(X)\, \beta^{k-1}$ (3), where $C_k$ is the k-th order cumulant and $U_1, U_2, \ldots, U_n$ are consecutive energy moments: $U_k = \frac{1}{V_Y} \int_{\Omega_Y} E(X;Y)^k\, dV_Y$ (4). Even if the original energy function E(X; Y) contains at most pairwise terms, the restricted free energy F(X) will in general contain higher-order terms that arise from the presence of higher energy moments in the cumulant expansion [eq. (3)]. The early version of UNRES [22, 24] did not contain multibody terms and was therefore suitable only for inverse folding (i.e., it could recognize the native fold corresponding to a given amino-acid sequence in a database of decoys taken from the PDB), but was not capable of de novo folding of a protein [22, 24]. The capability of de novo folding was achieved only after introducing multibody, or correlation, terms in the backbone electrostatic interactions [20, 23].
Similar conclusions about the role of backbone hydrogen bonding and other multibody terms have also been drawn by other workers [5, 10, 11]. The side-chain potential ($U_{SC_i SC_j}$) and the components of the local-interaction potential ($U_{tor}$, $U_b$, and $U_{rot}$) were parameterized based on distribution and correlation functions determined [22, 24] from a set of 195 high-resolution non-homologous structures from the Protein Data Bank (PDB) [2]. The peptide-group interaction potential $U_{p_i p_j}$ and the correlation terms pertaining to backbone hydrogen bonding ($U_{corr}$) were parameterized by averaging the all-atom ECEPP/2 [27, 28] potential. Finally, the relative weights of the energy terms were determined so as to maximize the energy gap between the native structure and the average energy of the non-native structures [24]; this was accomplished by minimizing the so-called Z-score function [6-9]. The version of the UNRES force field described above, combined with the Conformational Space Annealing (CSA) global-optimization procedure [16-18], was first tested on two simple helical proteins: the 10-58 fragment of the B-domain of staphylococcal protein A (a three-helix-bundle topology) and apo-calbindin D9K (a 75-residue protein with the topology of a four-helix bundle with an EF-hand motif) [15]. In both cases, the native structure and its mirror image were located; the native structure was lower in energy than the mirror image for protein A, and higher for apo-calbindin. A full-blown blind-prediction test was performed within the CASP3 experiment. We submitted predictions for seven targets, one of which, for the periplasmic protein HDEA from E. coli, turned out to be the most accurate among all the models submitted [30], including those from homology modeling and threading (Figure 2). We also achieved very good results for DNA b helicase, a 116-residue protein. These two proteins were assessed as particularly difficult targets [30] because of their rare folds.
Our predictions of most of the other targets were also fairly accurate [13, 14, 21, 30]. However, the version of the UNRES force field that includes only the multibody terms pertaining to hydrogen bonding produces β-structures that are too distorted. In our present work, we introduced two new types of multibody terms, $U^{mult}_{tor}$ and $U_{el,loc}$, that arise from the coupling between local interactions involving more than two consecutive peptide units, and between local and backbone electrostatic interactions, respectively. These terms comprise the third- through sixth-order terms in the cumulant expansion for the restricted free energy [eq. (3)]. These new terms should extend the applicability of our procedure to proteins that contain β-structure. By local-interaction energy ($E_{loc}$), we denote the conformational energy of an isolated peptide unit. $E_{loc}$ can be modeled by the energy surface of an N-acetyl-N'-methylamide derivative of the amino acid residue under study [35]. It is usually expressed as a function of the dihedral angles φ and ψ; however, for the purpose of implementing it in eqs. (2) and (3), we express it as a function of the dihedral angles of rotation $\lambda_1$ and $\lambda_2$ about the Cα-Cα bonds forming the peptide unit [29] (Figure 3). This provides a clear separation of the degrees of freedom into the “coarse” or “important” ones [X of eq. (2)] and the “fine” or “less important” ones [Y of eq. (2)], the “less important” ones being the dihedral angles λ [25, 26]. The dihedral angles φ and ψ describe both the “coarse” and the “fine” shape of the polypeptide backbone and therefore cannot be used directly in the calculation of the restricted free energy. The lowest-energy conformations (i.e., region C) [35] of terminally-blocked L-amino acid residues (the smallest peptide units) lie exactly in the region of the (φ, ψ) dihedral-angle space characteristic of β-structure.
Therefore, the mixed local and electrostatic energy moments in the cumulant expansion of the restricted free energy of the polypeptide chain [eq. (3)] contribute to the stabilization of β-structures. The contributions to consecutive energy moments [needed to compute the cumulant expansion for F(X) in eq. (3)] that include the products of the local and electrostatic energies have the following general form: $U_{k;el,loc} = \frac{1}{(2\pi)^{N_k}} \int \cdots \int E_{loc}^{j} E_{el}^{k-j}\, d\lambda_{i_1} d\lambda_{i_2} \cdots d\lambda_{i_{N_k}}$, j = 1, 2, …, k-1; k = 2, 3, … (5), where $\lambda_{i_1}, \lambda_{i_2}, \ldots, \lambda_{i_{N_k}}$ indicate the dihedral angles λ involved in the integration (their number, $N_k$, will vary depending on the contribution to the energy moment). Likewise, the contributions needed to determine the multiple-torsional terms ($U^{mult}_{tor}$) are expressed by eq. (6): $U^{mult}_{k;tor} = \frac{1}{(2\pi)^{N_k}} \int \cdots \int E_{loc}^{k}\, d\lambda_{i_1} d\lambda_{i_2} \cdots d\lambda_{i_{N_k}}$, k = 2, 3, … (6). To a very good approximation, $E_{loc}(\lambda_1, \lambda_2)$ can be expressed as a second-order Fourier series in $\lambda_1$ and $\lambda_2$. In our dipole model of the peptide group [25], the energy of electrostatic interaction between peptide groups is a second-order Fourier series in the angles λ (this follows directly from the energetics of dipole-dipole interactions [25]). Therefore, all energy moments involving the local and/or the electrostatic energy can be calculated analytically. In our earlier work [20], we developed an algorithm for calculating the moments of the electrostatic-interaction energy; this algorithm was used to derive the hydrogen-bonding multibody terms in UNRES. We have now generalized this algorithm to compute the energy moments involving both electrostatic and local interactions, and derived the terms in the cumulant expansion for F(X) [eq. (3)] up to the sixth order. Instead of writing the complicated formulas for the component terms here, we present them as graphs in Figures 4, 5a and 5b. Figure 4 includes the terms already present in the original version of UNRES, while Figures 5a and 5b show the new terms.
In Figure 4, the upper left graph represents the averaging of the square of the energy of the electrostatic interactions between the peptide groups; it corresponds to the contribution to $U_{p_i p_j}$ of eq. (1) coming from a pair of interacting peptide groups. The upper right graph corresponds to the averaging of the products of the local-interaction energies of two consecutive peptide units. As a result, the average becomes dependent on the virtual-bond dihedral angle γ centered at the Cα-Cα virtual bond connecting the two units. This term formally corresponds to $U_{tor}$ of eq. (1). Both of the graphs described above come from second moments of the energy of the all-atom chain. The middle and the bottom graphs represent the dominant three- and four-body contributions to the restricted free energy. They were derived in our earlier work [20] as the components of the third- and fourth-order terms in the cumulant expansion of the backbone electrostatic energy. However, they also include the averaging of the electrostatic energy between neighboring peptide groups, which is a part of the local-interaction energy of the peptide unit to which these peptide groups belong. Therefore, these terms should be considered as parts of $U_{el,loc}$. The third-order components of $U_{el,loc}$ presented in Figure 5a describe the correlation between the energy of the electrostatic interaction between two non-contiguous peptide groups and the energy of local interaction of the neighboring peptide units. As shown, these terms depend not only on the relative orientation of the two peptide groups, but also on the virtual-bond dihedral angles $\gamma_1$ and $\gamma_2$ centered on the corresponding Cα-Cα virtual-bond axes. In other words, a given orientation of two peptide groups induces a certain local fold of the respective portion of the polypeptide chain; that is, long-range interactions can propagate an ordered local structure.
Figure 6 shows that a parallel or antiparallel orientation of two interacting peptide groups, as in β-sheets, favors extended virtual-bond dihedral angles and thus propagates extended configurations of the polypeptide chain, as in β-strands (note the minimum at $\gamma_1 = \gamma_2 = \pm 180°$). The fifth- and sixth-order components displayed in Figure 5b should also be important, because they propagate the local fold of the chain if two pairs of neighboring peptide groups are in contact. The double torsional terms ($U^{mult}_{tor}$; represented as the middle graph at the bottom of Figure 5a) should also be important with regard to the formation of ordered structures. Their importance has already been pointed out by other workers [4]. From our preliminary analysis, it appears that these double torsional terms contribute to the stabilization of left-handed extended strands [which, in turn, lead to (the observed) right-handed β-sheets]. The other two third- and fourth-order terms shown in the bottom part of Figure 5a involve local and electrostatic interaction correlations within three and four adjacent peptide units, respectively; they can be important for the correct description of the geometry of β- or γ-turns and of the geometry of α-helices. To test the ability of UNRES augmented with the new correlation terms to reproduce the structure of β-sheets, we used the sequence of betanova, a 20-residue polypeptide that was recently designed as a minimal β-sheet model [12]. This peptide forms a stable three-stranded β-sheet in solution, as revealed by NMR spectroscopy [12]. The first calculation was carried out without the new features of the force field (i.e., the only multibody terms included were those shown in Figure 4), while the second one was carried out with the third-, fourth- and fifth-order correlation terms depicted in Figure 5; these terms pertain to the coupling between local and electrostatic interactions.
In both cases, the CSA method [16-18] was used to find the global minimum. Without sufficient terms responsible for the coupling between the local and backbone hydrogen-bonding interactions, the calculation leads effectively to a coil structure (Figure 7a). Including the multibody terms introduced in this work leads to the correct topology of betanova, with correct positions of both turns and correct contacts between the side chains [12]. It should be noted that this result was obtained even without systematic calibration of the weights of the new correlation contributions to the energy. At present, we are determining the weights of the new correlation terms in a systematic way by means of Z-score optimization. |
| Starting Page | 193 |
| Ending Page | 200 |
| Page Count | 8 |
| File Format | |
| ISBN | 1581131860 |
| DOI | 10.1145/332306.332544 |
| Language | English |
| Publisher | Association for Computing Machinery (ACM) |
| Publisher Date | 2000-04-08 |
| Publisher Place | New York |
| Access Restriction | Subscribed |
| Content Type | Text |
| Resource Type | Article |
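
The cumulant expansion of the restricted free energy in eqs. (2)-(4) of the abstract can be checked numerically on a toy problem. The sketch below is illustrative only and is not the UNRES implementation: the function `energy`, its coefficients `a` and `b`, the single "fine" angle `lam`, and the grid size are all assumptions made here. It computes the moments $U_k$ by averaging over one period, forms the first four cumulants, and compares the truncated series with the exact $F = -(1/\beta)\ln\langle\exp(-\beta E)\rangle$, with the energy expressed in units where β is a plain number.

```python
import math

def energy(lam, a=1.0, b=0.5):
    # Hypothetical toy energy E(X; Y): one "fine" angle lam (the Y variable);
    # a and b stand in for the dependence on the coarse-grain variables X.
    return a * math.cos(lam) + b * math.cos(2.0 * lam)

def grid(n_grid=2000):
    # Uniform grid over one full period of the fine angle.
    return [2.0 * math.pi * i / n_grid for i in range(n_grid)]

def moments(n_max=4):
    # U_k = (1/V_Y) * integral of E^k over Y, with V_Y = 2*pi here (eq. 4).
    lams = grid()
    return [sum(energy(l) ** k for l in lams) / len(lams)
            for k in range(1, n_max + 1)]

def cumulants(u):
    # First four cumulants written in terms of the moments U_1..U_4 (eq. 3).
    u1, u2, u3, u4 = u
    return [
        u1,
        u2 - u1 ** 2,
        u3 - 3 * u1 * u2 + 2 * u1 ** 3,
        u4 - 3 * u2 ** 2 - 4 * u1 * u3 + 12 * u1 ** 2 * u2 - 6 * u1 ** 4,
    ]

def free_energy_series(beta, order=4):
    # Truncated series: sum over k of (-1)^(k-1)/k! * C_k * beta^(k-1).
    c = cumulants(moments(4))
    return sum((-1) ** (k - 1) / math.factorial(k) * c[k - 1] * beta ** (k - 1)
               for k in range(1, order + 1))

def free_energy_exact(beta):
    # Direct evaluation of eq. (2) for the toy energy.
    lams = grid()
    z = sum(math.exp(-beta * energy(l)) for l in lams) / len(lams)
    return -math.log(z) / beta

if __name__ == "__main__":
    beta = 0.2
    print("exact     :", free_energy_exact(beta))
    print("2nd order :", free_energy_series(beta, order=2))
    print("4th order :", free_energy_series(beta, order=4))
```

For small β the fourth-order truncation should track the exact value more closely than the second-order one. Because the toy energy is a low-order Fourier series, its moments could also be obtained in closed form, which is the property the abstract relies on for the analytical evaluation of the energy moments.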