Instructions for a CcpNmr course covering making dihedral angle and distance restraints, NOE assignment & protein structure calculation using ARIA, violation analysis and structure validation.

Making Distance Restraints

This exercise looks at how we can use a CCPN project to get peak lists and restraints (both dihedral and distance) that can be used in an ARIA structure calculation, and how we can pass intermediate structural information back into Analysis to help with violation analysis and NOE peak assignment. It is assumed that the basic workings of Analysis are understood; otherwise the beginners course may be attempted first.

Open an existing project

For this last section we will be using programs that are part of the Extend-NMR software collection. The Extend-NMR project has its own graphical interface, but many of its components are also available from within Analysis.  If CcpNmr Analysis is not already open, start it up on the command line by typing:

-> analysis

From the main menu select M:Project:Open Project. Navigate to find and select the CcpnStructCourse3a project, then click [Open]. This project is at the same point at which we left the last project, if everything went according to plan.

You might get a warning that various files have moved location. You might also get a dialog with a list of spectra paths (because those also have moved location). If the paths are all in grey then just click the "All Done!" button at the bottom. If any path is in red then Analysis cannot find the corresponding spectrum data file, so either you need to tell Analysis where it is (by double clicking the path cell and navigating to the correct location) or accept that that particular spectrum will not have its contours displayed. When the project data is loaded select M:Window:window2 and a blue NOESY spectrum will hopefully appear.

To make a list of distance restraints from the assigned peaks in an NOE peak list, first go to M:Structure:Make Distance Restraints. At the top of the popup, change the peak list to C-NOESY:173:1 and set the Restraint Set pulldown to "1". We can leave all of the other parameters alone for demonstration purposes. The Restraint Distance Params section would allow us to specify how the NOE peak intensities relate to the distance bounds of any generated distance restraints. The default method is to calculate a target distance as peak volume raised to the power of -1/6 multiplied by some scaling factor, such that the reference intensity (which in this case defaults to the peak list's average volume) exactly corresponds to the reference distance (in this case 3.2 Angstroms). The upper and lower bounds of the distance restraint are calculated as fractional changes from the calculated target distance (default is 20% above and below) while observing absolute minimum and maximum values for the bounds (1.72 & 8.00 Angstroms respectively by default). The {Residue Ranges} and {Chem Shift Ranges} tabs would allow you to make restraints only for specific assigned regions of your molecule or for specific shift ranges.
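The default intensity-to-distance conversion described above can be sketched in a few lines of Python. This is only an illustration of the stated arithmetic, not the Analysis source code: the function name and arguments are hypothetical, and the decision to clip only the bounds (not the target) to the absolute limits is an assumption.

```python
def distance_restraint(volume, ref_volume, ref_distance=3.2,
                       lower_frac=0.2, upper_frac=0.2,
                       min_dist=1.72, max_dist=8.0):
    """Return (target, lower, upper) in Angstroms for one NOE peak volume.

    Hypothetical helper: defaults are the values quoted in the text.
    """
    if volume <= 0 or ref_volume <= 0:
        raise ValueError("volumes must be positive")

    # Scale so that the reference volume maps exactly onto the reference
    # distance: target = d_ref * (V / V_ref) ** (-1/6)
    target = ref_distance * (volume / ref_volume) ** (-1.0 / 6.0)

    # Bounds are fractional changes from the target distance ...
    lower = target * (1.0 - lower_frac)
    upper = target * (1.0 + upper_frac)

    # ... limited by the absolute minimum and maximum
    # (assumption: only the bounds are clipped, the target is left as is).
    lower = min(max(lower, min_dist), max_dist)
    upper = min(max(upper, min_dist), max_dist)
    return target, lower, upper


# A peak with exactly the reference (average) volume gives the reference distance.
target, lower, upper = distance_restraint(volume=1.0e6, ref_volume=1.0e6)
```

Because of the -1/6 power, a peak 64 times stronger than the reference gives a target of only half the reference distance, at which point the lower bound is held at the 1.72 Angstrom minimum.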

Making Restraints From Assigned Peaks

To calculate restraints for assigned peaks from the selected peak list simply press [Make Assigned Restraints]. After a short pause you will see the Restraints and Violations popup appear. This shows that you have one restraint set (a way of grouping related restraints and violations) containing an H-bond restraint list, which was already loaded via the FormatConverter, and a list of 600+ new restraints. Click on the row of the restraint list in the central table and then click on the {Restraints} tab. Note that you can also get to this point via the M:Structure:Restraints and Violations option.

The restraints popup will appear and in its table you will see the restraints listed. Each restraint has one green-coloured row. Note some restraints also have following grey rows. These grey rows are alternative distance pairs for restraints that are ambiguous, i.e. a possible connection between two different pairs of 1H resonances. Note that such ambiguous restraints can represent logical uncertainty (before an NOE is resolved) or real physical ambiguity where a peak is caused by two or more overlapping pairs of resonances. This project has already been through one ARIA run, which is why we have so many peaks with ambiguous assignments.

Making Restraints From Unassigned Peaks

There is a second common way to generate distance restraints, which is to match the chemical shifts of resonances to NOE peak positions, thus generating potentially highly ambiguous distance restraints. Such restraints would typically be used as input for an iterative structure generation program like ARIA, where they would eventually be filtered to select only the correct contributing resonance pairs. We could leave the matching of chemical shifts to the ARIA program by handing it peak lists rather than restraint lists, which is what we will demonstrate for the N-NOESY data. However, it is also possible to make such restraints in CCPN. Accordingly, the {Shift Match Tolerances} and {Network Anchoring} tabs in the M:Structure:Make Distance Restraints popup allow you to generate such distance restraints for peaks which do not have assignments. To generate distance restraints by shift matching, first set the peak list to "C-NOESY:173:1" and then click [Make Shift Match Restraints]. This command uses the current settings, although the {Chem Shift Ranges} and {Shift Match Tolerances} settings are relevant only to this command.

In the case of the shift-matching method, potentially ambiguous distance restraints are generated by simply matching peak positions to close chemical shifts. In the case of the network anchoring method, chemical shifts are also matched to peaks, but the ambiguous possibilities are refined by selecting only those NOE assignments that are supported by other, assigned NOEs or by covalent structure. Say, for example, that a peak could arise from a number of resonance pairs. Two resonances A & B are more likely to be a correct assignment for the peak if we know that they are both close to (or bound to) the same intermediary resonance C, and therefore must be close to each other.
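The two ideas above can be illustrated with a small, self-contained Python sketch. It is not how Analysis implements either method: the function names, the tolerance values and the example chemical shifts are all made up for illustration, and the "support" score is simply a count of shared intermediary resonances.

```python
def shift_match(peak_position, shifts, tolerances):
    """List the resonance pairs whose shifts match a 2D peak position.

    peak_position: (ppm in dimension 1, ppm in dimension 2)
    shifts:        dict of resonance name -> chemical shift (ppm)
    tolerances:    (tolerance for dimension 1, tolerance for dimension 2)
    """
    matches_per_dim = []
    for ppm, tol in zip(peak_position, tolerances):
        matches_per_dim.append(sorted(
            name for name, shift in shifts.items() if abs(shift - ppm) <= tol))
    first, second = matches_per_dim
    # Every combination is a possibility, hence the ambiguity.
    return [(a, b) for a in first for b in second if a != b]


def anchoring_support(pair, known_contacts):
    """Count intermediary resonances C known to be close to both A and B."""
    a, b = pair
    partners_a = {y for x, y in known_contacts if x == a} | \
                 {x for x, y in known_contacts if y == a}
    partners_b = {y for x, y in known_contacts if x == b} | \
                 {x for x, y in known_contacts if y == b}
    return len((partners_a & partners_b) - {a, b})


# Made-up shifts (ppm) for illustration only.
example_shifts = {"3GluH": 8.31, "3GluHga": 2.25, "3GluHgb": 2.29, "36IleHa": 4.10}
candidates = shift_match((8.30, 2.27), example_shifts, (0.04, 0.04))

# Made-up assigned contacts: A and B are both close to C.
example_contacts = [("A", "C"), ("B", "C"), ("A", "D")]
support = anchoring_support(("A", "B"), example_contacts)
```

In this toy case the peak matches two resonance pairs, so a shift-matched restraint would carry both as alternatives; a network anchoring step would then favour candidates with a non-zero support count.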

Merging and Splitting Restraint Lists

To prepare these newly generated restraint lists for the ARIA calculation we will merge and split them in order to generate restraints that are separated into "Unambiguous"  and "Ambiguous" categories. In ARIA we do this so that the "Ambiguous" restraints and peaks follow a different protocol; they enter the calculation after the unambiguous, more certain, restraints have formed the initial structure.

In the M:Structure:Restraints and Violations:{Restraint Lists} tab, merge the two lists that derive from the C-NOESY experiment by clicking on the two relevant rows (probably numbers 2 & 3) while holding down the <Ctrl> key. Now click [Merge Lists] at the bottom and [OK]. You will see that the restraints have been combined and there is now only one list from the C-NOESY. Then, for the remaining, enlarged restraint list, select its row and click [Split Ambig/Unambig]. The resulting lists are now ready for input to ARIA.

 

Making Dihedral Restraints from Chemical Shifts

Next we will generate restraints in a different manner; dihedral restraints from backbone chemical shifts. We will be using a program called DANGLE (Dihedral ANgles from Global Likelihood Estimates) which is embedded within Analysis. DANGLE estimates dihedral angles from chemical shifts in a similar manner to TALOS; i.e. it matches a chemical shift & sequence query to a structural database of known PHI/PSI angles and chemical shifts. However, DANGLE uses a different (Bayesian) method to produce an angle estimate and tolerance, compared to TALOS. The idea is to use Bayesian inference to infer what the range of likely PHI/PSI angles might be (using the chemical shifts) by checking all PHI/PSI combinations in 10 degree square bins to see how well such angles can be used to explain the data. Such an analysis allows for the user to see uncertainties in the angle predictions, including where the chemical shift to structure mapping is redundant and there are multiple regions in the Ramachandran plot which could explain the chemical shift data.

To run DANGLE select M:Structure:DANGLE: Predict Dihedrals. Note that at the top the Chain should be set as "GI:A", the Shift List as "ShiftList 2:2" and Max No. of Islands as 2. This simply specifies which data to use and how strict the analysis should be. Using two islands means that we will reject predictions that result in more than two discrete regions of the Ramachandran plot. To start the analysis press [Run Prediction] and accept "Run1" as the name for the job by pressing [OK] at the opportune moment. Please be aware that DANGLE will take several minutes to finish the calculation.
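To give a feel for what the island limit means, the following Python sketch counts discrete regions on a grid of 10 degree PHI/PSI bins. It is only an illustration: how DANGLE itself decides which bins are populated and which bins are connected is not described here, so the 4-neighbour connectivity, the periodic wrapping at +/-180 degrees and all names are assumptions.

```python
from collections import deque

N_BINS = 36  # 360 degrees divided into 10 degree bins


def count_islands(occupied):
    """Count connected regions among populated (phi_bin, psi_bin) cells.

    occupied: iterable of (i, j) bin indices, each in range(N_BINS).
    Cells are joined through their four edge neighbours, and the grid
    wraps around in both directions because angles are periodic.
    """
    unvisited = set(occupied)
    islands = 0
    while unvisited:
        islands += 1
        queue = deque([unvisited.pop()])
        # Flood-fill everything reachable from this starting cell.
        while queue:
            i, j = queue.popleft()
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                neighbour = ((i + di) % N_BINS, (j + dj) % N_BINS)
                if neighbour in unvisited:
                    unvisited.remove(neighbour)
                    queue.append(neighbour)
    return islands


def accept_prediction(occupied, max_islands=2):
    """Mimic the 'Max No. of Islands' check described in the text."""
    return count_islands(occupied) <= max_islands
```

With a limit of two, a prediction whose populated bins form three separate patches would be rejected, whereas a single patch that straddles the +/-180 degree edge still counts as one island under the wrapping assumption.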

Once the calculation is over you will see the main table filled in with PHI and PSI backbone dihedral angle predictions and their associated error ranges. Further, if you select a row in the main table you will see a plot in Ramachandran (PHI/PSI) space of where the likely angles are deemed to be. Click on the "7 Ser" row and note that there is a lot of red colour in the chart, indicating that DANGLE was not able to make a distinct choice of PHI/PSI: you should not use such a prediction in a structure calculation. Click on the [Next] button to get to "8 Lys". The prediction for this residue is somewhat better, and you could use this in a structure calculation (it has one discrete region) although the error bounds for such a dihedral restraint would be suitably large. Click on the  "12 Glu" row. This residue has a very precise range of predicted PHI/PSI angles. Such a residue could be used in a structure calculation with a high degree of confidence and proportionately narrow error margins.

Note that DANGLE also predicts the secondary structure of the residues, but that this calculation is not made from the angles, but directly from the measured secondary structures in the shift-structure database. To make the restraints themselves set the Restraint Set to "1", which will place the PHI/PSI dihedral restraints with our existing distance restraints and press [Commit Constraints].  View the generated restraints by going to M:Structure:Restraints & Violations:{Restraints}. Note that if you have a structural model for your protein you can see how the model's angles match with the DANGLE prediction.

Submitting a Structure Calculation to CCPN Grid

If for any reason you are not confident about the state of your CCPN project at this stage of the demonstration, you may like to open a pre-prepared CCPN project, which has all of the expected restraints present, before we do a structure calculation. In the Analysis menu bar select M:Project:Open Project then [Yes] to close the current project and [No] to not save. Navigate to find and select the CcpnStructCourse3b project, then click [Open].

To start the structure calculation using the restraints we have created we will launch a panel that is dedicated to setting up an ARIA job, which will be run remotely on the CcpnGrid service. The ARIA Setup panel is officially part of the Extend-NMR project, but we can avoid having to install all the Extend-NMR software and start it directly from Analysis by typing the following at the python prompt:

>>> top.activateAriaSetup()

The resulting panel allows us to control which data goes to the ARIA calculation from the CCPN project. For more fine-grained control you need to use the ARIA GUI; for example to change the annealing protocol.

Create a new job for ARIA to work on by pressing the green "New Run" button. This "Run" object links together all the data that goes into the same calculation. Ensuring that the {Input Data} and {Peak Lists} tabs are selected (they ought to be by default), change the pulldown menu at the bottom right to N-NOESY:182:1 and click [Add Peak List]; this will state that in this run ARIA should use this data. Now swap to the {Restraint Lists} tab and add restraint lists in a similar way: Select the list name from the pulldown menu and click [Add Restraint List]. Do this for all of the restraint lists, i.e. the H-bond, two distance and dihedral restraint lists. Now that all of the input data is set we have to tell ARIA how to run on the data.

Move to the {Run Settings} tab, found at the top. Here you can see some of the settings to control the ARIA run. In the lower table make sure that the "Ambiguous protocol?" column is set to "Yes" for the "Ambig" distance restraints and "No" for the "Unambig" distance restraints. If we wanted to run the ARIA structure calculation locally (noting that you will need to have a CNS executable and ARIA itself available) we could [Launch ARIA GUI], make relevant changes, save the ARIA project and then [Setup Project], which will put the ARIA data in a state where it can be run from the command line in the usual way.

For this demonstration you may run the ARIA calculation remotely using the prototype CCPN Grid service. To do this simply press the [Submit to CcpnGrid] button. For this demonstration use the user identification "test" and the password "test123". We will not wait for the final structure calculation, which will take some time (although this protein will take less than an hour on an unloaded server), but instead we will look at a structure calculation that has already been completed, and which uses exactly the same data as you have been looking at in the CCPN project.

Via a web browser go to the page http://webapps.ccpn.ac.uk/ccpngrid/status (if you do start a grid calculation, this is the page that you get to via the [Show web page] button). If asked, log on with UID: "test" password: "test123". Here you will see various ARIA calculations, one of which will have status "Finished" and be entitled "CcpnStructCourse...". Click on the [Results] button for this job and look at the available data. Note that we have the option to download an updated CCPN project which contains the newly calculated structures and a violation analysis, all of which remain linked to the NMR data in the CCPN project, so that for example we can easily jump from a violation to the offending point in the spectrum.


Importing Structure Calculation Results

You can see the CCPN project that resulted from this completed ARIA run by loading the project CcpnStructCourse3c: In the menu bar select M:Project:Open Project then [Yes] to close the current project and [No] to not save. Navigate to find and select the CcpnStructCourse3c project, then click [Open]. Although we are using a pre-prepared CCPN project here, a project generated by ARIA on the CCPN Grid simply has to be downloaded before it can be used in the same way. We will now have a look at the result of the structure calculation by using Analysis.

Structures

In the data that comes back from ARIA we will see that two structure ensembles have been entered into the CCPN project: one from the last ARIA iteration and one after the water refinement stage. To see the structures go to M:Structure:Structures. In the {Structure Models} tab, click the [Superpose & Calculate RMSDs] button at the top right and then [OK] to see how well the structural models align. To see a 3-dimensional representation of whichever structure is selected in the table, click [Viewer] at the top.

The controls for the structural viewer are as follows:

  • Rotate with middle-click & drag.

  • Zoom with the mouse wheel, or middle-click, <Shift> & drag.

  • Move with middle-click, <Ctrl> & drag.

The mouse right-click brings up a menu that allows you to change the display mode, spin, and print the structure. The left-click is used for atom selection. Try the atom selection by first ensuring that the N-NOESY peak list is selected at the top, click on an amide location on the structure (i.e. a blue atom) and then click [Show Peaks]. This will show a table containing all of the peaks that relate to connections from the selected atom in the structure. The numbers on the dashed lines represent the distances between the atoms.

The same sort of functionality is present in the Edit Assignment popup (M:Assignment:Assignment Panel). If you look at the NOESY spectrum in window2 and assign a peak (by pressing <a> with the cursor over a peak), you can see assignment possibilities via the [Show On Structure] button. Also note that because we now have a structure the Edit Assignment popup will show distances between the 1H resonance assigned in one peak dimension and the 1H possibilities in another peak dimension.

Referring back to the restraints popup (M:Structure:Restraints & Violations {Restraints}), set the structure pulldown menu to "2", the structure from the ARIA/CNS water refinement. Next set the restraint list as one of the Distance lists. You will see that the "Struc Value" is now filled in for the restraints and you are able to select any restraint rows (using <Shift>/<Ctrl> keys) and then click [Show Selected On Structure] to illustrate graphically where on the loaded structure the restraints apply.

Restraints & Violations

In the Restraints & Violations popup (M:Structure:Restraints & Violations), choose Restraint Set "2" at the top (these are the results from the ARIA run; set 1 was the input) and choose the {Restraint Lists} tab. You will see that ARIA has generated one or more restraint lists for each peak list and restraint list that was used as input data. Choose the second row labelled "ARIA_REJECT_run1_NOEs_it8", which represents the peaks/restraints that ARIA rejected, and select the {Restraints} tab. From the Structure pulldown menu at the top check that "2" is selected (the water refined ensemble) and in the Violation List pulldown menu select "1"; you will hopefully see that many of the rows become orange/red, indicating that these restraints caused violations. If you change the Restraint List pulldown menu to a list that is not labelled as "REJECT" you will see that there are far fewer violations.
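As a rough picture of what a violation is, the Python sketch below compares the distance measured in each model of an ensemble with a restraint's bounds and averages the excess. The names are hypothetical and the exact statistics reported by Analysis and ARIA may be computed differently; the r^-6 summed effective distance for ambiguous restraints follows the usual ARIA-style treatment of ambiguous distance restraints.

```python
def effective_distance(distances):
    """Combine alternative atom-pair distances of an ambiguous restraint.

    Uses the r^-6 sum: d_eff = (sum of d_i ** -6) ** (-1/6), so the
    result is dominated by the shortest contributing distance.
    """
    return sum(d ** -6 for d in distances) ** (-1.0 / 6.0)


def violation(distance, lower, upper):
    """Amount (Angstroms) by which a distance lies outside [lower, upper]."""
    if distance > upper:
        return distance - upper
    if distance < lower:
        return lower - distance
    return 0.0


def mean_violation(model_distances, lower, upper):
    """Average violation of one restraint over all models of an ensemble."""
    amounts = [violation(d, lower, upper) for d in model_distances]
    return sum(amounts) / len(amounts)


# Made-up distances for one restraint in a three-model ensemble.
example = mean_violation([3.0, 4.5, 5.0], lower=2.56, upper=3.84)
```

A restraint that is satisfied in some models and violated in others therefore gets a small but non-zero mean, which is one reason to inspect individual models as well as the ensemble average.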

Choose a restraint row for one of the violated (red/orange) restraints and click [Show Peaks]. This will open a table containing the peak (or peaks) that gave rise to the restraint. Note that a restraint may be linked to more than one peak, for example where there are symmetry related peaks that correspond to the same close resonance pair. In this Selected Peaks table click on the peak row and then the [Find Peak] button at the top. You will be whisked to the point in the spectra where the peak resides. From here you may choose to look at the peak assignment by using the <a> key with the mouse over the peak. You can get to the same point directly from the restraint by using the [Assign Peak] button from below the Restraints table. Note that ARIA has generated an assigned peak list, so when we look at the N-NOESY spectrum we see two, differently assigned, peak lists on top of one another. To switch off the new peak list in the spectrum window you can click on the [Peaks] button at the top and toggle the [N-NOESY:182:2] option.

Now go back to the list of restraints. Click on the heading of the "Mean Viol" column then scroll to the bottom of the table to see the largest violation. Select the last red row (probably restraint number 77) and click [Assign Peak]. In the Assignment Panel that appears/updates you will see that the original peak that was used to make the restraint does not have an assignment in the second dimension. This peak was not satisfactorily assigned during the structure calculation process, but this is not surprising given that no care was taken in generating an initial set of seed assignments; we deliberately wanted a project with lots of interesting violations. For the second "F2" peak dimension in the Assignment Panel click on the "Dist" heading (mid-right table) so that the possible assignments are presented in order of distance from the assigned amide proton. Hopefully, you will see that there are two resonances "3GluHga" and "3GluHgb", from the same residue as the amide, at 2.7 Angstroms and that the chemical shift difference to the peak position "Delta" is 0.044 PPM. These options are close enough in both chemical shift and structural distance to be good assignment candidates, so click on both the "3GluHga" and "3GluHgb" rows for the second dimension to assign them. The peak will now have contributions from both resonances.

Repeat this operation once more for the second to last restraint (number 22), again clicking [Assign Peak]. This time you will see that there is an assignment option for "36IleHa" which seems plausible. Although this is not the closest possibility in the structure ensemble (at least in what we have so far) its chemical shift difference is fairly small, unlike the "41ProHdb" option. Once we have checked all of the rejected and seriously violated restraints, adjusting their peaks accordingly, we would go through another round of structure calculation with a better starting set of NOE assignments. Although we can go through each problematic restraint in a fairly manual manner, there is a more automated procedure for checking NOE peak lists and curating their assignments using chemical shift and structure information.

 

Using Structures to Assign NOE Contributions

An alternative, more detailed, method of looking at NOE assignments is via the M:Assignment:NOE Contributions option. This is designed to make the whole process of comparing NOE peaks to a structural model efficient. Select this and in the "Peak List & Display Settings" choose the "NOE Peak list" for the NOESY experiment you are currently looking at, then double click the "Use?" column in the lower table for the relevant window, so that it becomes "Yes". Also set "Structure Display" to "Assigned" and "Mark Peak" to true. Now move to the {Peak Assignments} tab, select the peak in the spectrum window (left click & drag) and in the NOE Contribution popup click [Selected Peak]. This will show you, in terms of the close chemical shifts and distances from the selected structure, what the likely NOE assignments are.

By clicking on the rows in the NOE peaks table you will see that several things happen automatically: The view of the selected windows (in this case the selected window2) zooms and marks the selected peak; the graphical structure view highlights the assigned atom connections from the peak; and the lower table of the NOE Contributions popup shows the structurally possible, shift-matched resonance pairs ordered in terms of shift and geometric distance. Select the first peak in the table (10Tyr) and you will see that the top row for the possible assignments is "9SerHbb", which is close in chemical shift and only 2.939 Angstroms away. Click on the "9SerHbb" for the second column in the table and then [Assign Selected]. You will see that the upper table, spectrum window and structure view all update to reflect the new assignment.

Next, in the upper table select peak number 4 (62Glu) and you will see that the assignment possibilities are not so clear cut. Here, selecting [Predict Peaks] will use the entered structure and the known chemical shift values to predict the positions of peaks, near to the selected real peak, which correspond to close resonance pairs. These artificial peaks are labelled with the structural distance and are contained in an entirely separate list to the real peaks (so nothing is contaminated and they can be removed easily). In this instance it is probably appropriate to assign the peak to both of the top assignment possibilities (from residues 62Glu and 31Ala). While holding down the <Ctrl> key select both the "63AlaH" and "32LeuH" options in the table and click [Assign Selected]. We can confirm that this double assignment was the correct thing to do by looking for "return" peaks that show the reciprocal connection (looking from the point of view of the other amide). Clicking on [Find Reciprocal Peaks] you will see that the spectrum window splits into separate panels, with the original amide location on the left. Strips 2-4 show other amide positions with peaks that may possibly reciprocate the same proton-proton connection. Peaks for 63AlaH and 32Leu are indeed present and confirm the assignment in the second dimension we just made. Note also that 36Leu shows a potential return peak, but this is too far away in the structure and so does not appear in the table.

Note that if we change peak assignments then we have the choice of making (or letting ARIA make) a new set of restraints. Alternatively we can curate the existing restraints using the [Update Assignment From Peak] button (see M:Structure:Restraints & Violations {Restraints}), which will alter the distance restraint to reflect this new assignment.

 

Assignment & Structure Validation

The final part of these tutorial exercises is to look at how we can use the CING software to analyse and validate NMR and structure data. We will also consider the data checks that can be performed within Analysis, without the need for external programs, for example to find chemical shift outliers and bogus assignments.

Structure Validation

In the main menu select "M:Structure:CING: Validate Structures" and click the green [New Run] button in the upper right corner. This run specification will contain all of the data that CING will analyse. When a new run is made the {Structures} table will fill with the various models from an ensemble. Accepting this, we move on to the {Shifts & Measurements} tab. Here click [Add Measurement List] to add the shift list to the CING analysis. Move to the {Peak Lists} tab, and add the N-NOESY:182:2 peak list (this is the one that ARIA made) by selecting it from the lower right pulldown menu and selecting [Add Peak List]. Finally in the restraint lists tab add restraint list "Distance-2:1". This restraint list is the one that came back from ARIA and consequently lives in the second restraint set. Note that in general you may select as much data as you like, but we are just selecting a smaller subset for demonstration purposes.

Select the {Run Settings} tab. To submit the analysis press the [Submit Project] button. Note that this may take some time, so you may wish to let the demonstrator make a real submission and view some previously calculated results instead, thus sparing the iCING server. You are nonetheless free to use this server generally.

To view some precalculated validation data look at the following URL in a web browser:

http://nmr.cmbi.ru.nl/CASD-NMR-CING/data/GR/CGR26ACheshire/CGR26ACheshire.cittp

Assignment Quality Control

At the menu option M:Assignment:Quality Reports you can access a suite of analyses which generate automated reports about the quality and completeness of your assignments. The first tab simply lists how many of each kind of residue and atom type you have resonance assignments for. The second tab shows statistics for how many NOE (and other through-space) connections you have across your molecular system.

While the first two tabs are dedicated to analysing completeness, the last two tabs are dedicated to finding mistakes and inconsistencies in assignments. These tables look at the same kind of issues, but one presents the information from a resonance point of view, and the other from a peak point of view. Most notably the assignments are checked for the following criteria:

  • An assignment to a given atom is duplicated.

  • The standard deviation for a chemical shift is large.

  • A peak position is far from the chemical shift average (irrespective of a shift's deviation).

  • A resonance has an impossible number of covalently bound partners: e.g. an amide proton assigned to two HSQC peaks cannot be linked to both nitrogen resonances.

  • The chemical shift for an atom assignment is unlikely given the known chemical shift distributions (BMRB).

  • Peaks have unusual intensities or sign.

When something is potentially wrong, then cells of the tables are coloured red. A highlighted cell doesn't mean that something is definitely wrong, just that it warrants an explanation or further investigation. To find the peaks responsible for any aberration, simply click on the row in the table and click [Show Peaks]. In the resulting table you can assign or find any of the peaks in the various spectrum windows.
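Two of the checks listed above (a large standard deviation for a chemical shift, and a peak position far from the shift average) are simple enough to sketch in Python. The thresholds and names below are invented for illustration and are not the values or code that Analysis uses.

```python
from statistics import mean, pstdev


def check_resonance(positions, max_sd=0.02, max_delta=0.05):
    """Flag suspicious peak positions (ppm) assigned to one resonance.

    max_sd and max_delta are hypothetical thresholds in ppm.
    """
    average = mean(positions)          # the resonance's chemical shift
    deviation = pstdev(positions)      # spread of the assigned positions
    far_peaks = [p for p in positions if abs(p - average) > max_delta]
    return {
        "shift": average,
        "sd": deviation,
        "sd_flag": deviation > max_sd,  # would be coloured red in a report
        "far_peaks": far_peaks,         # peaks worth inspecting individually
    }


# Made-up positions: three consistent peaks and one likely misassignment.
report = check_resonance([8.30, 8.31, 8.30, 8.45])
```

As with the red cells in the tables, a flag from a check like this only marks something that needs an explanation; a genuinely broad or shifted peak can trigger it just as easily as a wrong assignment.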