Extend-NMR 4: Structure Calculation with ARIA
A structure calculation tutorial for ARIA 2.3 in an Extend-NMR context.
Analysis Tutorial Data
The first part uses: CcpnDemo005.tgz (151 MB total size)
The second part uses: CcpnDemo006.tgz (151 MB total size)
The third part uses: CcpnDemo007.tgz (153 MB total size)
Help Note
For details about document nomenclature, keyboard shortcuts and mouse operations, please see the Extend-NMR 1 section of this tutorial.
Protein Structure Calculation: ARIA & CCPN
The penultimate part of the exercise is to look at how we can use a CCPN project to get peak lists and restraints (both dihedral and distance) that can be used in an ARIA structure calculation, and how we can pass intermediate structural information back into Analysis to help with violation analysis and NOE peak assignment.
Open an existing project
Start Extend-NMR on the command line by typing:
-> extendNmr
When the Extend-NMR menu bar has appeared select M:Project:Open Project. Navigate to find and select the CcpnDemo005 project, then click [Open].
You might get a warning that various files have moved location. You might also get a dialog with a list of spectra paths (because those have also moved location). If the paths are all in grey then just click the "All Done!" button at the bottom. If any path is in red then Analysis cannot find the corresponding spectrum data file, so you either need to tell Analysis where it is (by double-clicking the path cell and navigating to the correct location) or accept that that particular spectrum will not have its contours displayed. When the project data is loaded select M:CcpNmr:Analysis and a blue NOESY spectrum should appear.
Distance Restraints
To make a list of distance restraints from the assigned peaks in an NOE peak list first go to M:Structure:Make Distance Restraints. At the top of the popup, change the peak list to C-NOESY:173:1 and set the Restraint Set pulldown to "1". We can leave all of the other parameters alone for demonstration purposes. The Restraint Distance Params section would allow us to specify how the NOE peak intensities relate to the distance bounds of any generated distance restraints. The default method is to calculate a target distance as peak volume raised to the power of -1/6 multiplied by some scaling factor, such that the reference intensity (which in this case defaults to the peak list's average volume) exactly corresponds to the reference distance (in this case 3.2 Angstroms). The upper and lower bounds of the distance restraint are calculated as fractional changes from the calculated target distance (default is 20% above and below) while observing absolute minimum and maximum values for the bounds (1.72 & 8.00 Angstroms respectively by default). The {Residue Ranges} and {Chem Shift Ranges} tabs would allow you to make restraints only for specific assigned regions of your molecule or for specific shift ranges.
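The default intensity-to-distance relationship described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the Analysis implementation: the function name and argument names are hypothetical, and the way the bounds are clipped to the absolute limits is an assumption.

```python
def noe_distance_restraint(intensity, ref_intensity, ref_distance=3.2,
                           fraction=0.20, min_bound=1.72, max_bound=8.00):
    """Return (target, lower, upper) distances in Angstroms for one NOE peak.

    Hypothetical helper: the defaults are the values quoted in the text,
    but the clipping behaviour is an assumption made for illustration.
    """
    if intensity <= 0 or ref_intensity <= 0:
        raise ValueError("intensities must be positive")

    # Choose the scale factor so that ref_intensity maps exactly onto ref_distance.
    scale = ref_distance / ref_intensity ** (-1.0 / 6.0)
    target = scale * intensity ** (-1.0 / 6.0)

    def clip(value):
        # Keep a bound within the absolute minimum and maximum limits.
        return max(min_bound, min(max_bound, value))

    lower = clip(target * (1.0 - fraction))
    upper = clip(target * (1.0 + fraction))
    return target, lower, upper


# A peak with exactly the reference intensity gets the reference distance,
# with bounds 20% either side of it.
target, lower, upper = noe_distance_restraint(1000.0, 1000.0)
print(round(target, 2), round(lower, 2), round(upper, 2))
```

Because of the -1/6 power, a peak 64 times stronger than the reference halves the target distance, so large intensity errors translate into comparatively small distance errors.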
Making Restraints From Assigned Peaks
To calculate restraints for assigned peaks from the selected peak list simply press [Make Assigned Restraints]. After a short pause you will see the Restraints and Violations popup appear. This shows that you have one restraint set (a way of grouping related restraints and violations) containing an H-bond restraint list, which was already loaded via the FormatConverter, and a list of over 1000 new restraints. Click on the row of the restraint list in the central table and then click on the {Restraints} tab. Note that you can also get to this point via the M:Structure:Restraints and Violations option.
The restraints popup will appear and in its table you will see the restraints listed, mostly as green coloured rows. Note that some restraints are also followed by grey rows. These grey rows indicate restraints that are ambiguous, i.e. a possible connection between two different pairs of 1H resonances. Note that such ambiguous restraints can represent logical uncertainty (before an NOE is resolved) or real physical ambiguity where a peak is caused by two or more overlapping pairs of resonances.
Making Restraints From Unassigned Peaks
There is a second common way to generate distance restraints, which is to match the chemical shifts of resonances to NOE peak positions, thus generating potentially highly ambiguous distance restraints. Such restraints would typically be filtered to select only the correct contributing resonance pairs, by iterative structure generation and violation analysis in a program like ARIA. One option is to leave the matching of chemical shifts to the ARIA program by handing it peak lists rather than restraint lists, which is what we will demonstrate for the N-NOESY data. However, it is also possible to make such restraints in CCPN. Accordingly, the {Shift Match Tolerances} and {Network Anchoring} tabs in the M:Structure:Make Distance Restraints popup allow you to generate such distance restraints for peaks which do not have assignments. To generate distance restraints by shift matching, set the peak list to "C-NOESY:173:1" and click [Make Shift Match Restraints]. This command uses the current settings; note that {Chem Shift Ranges} and {Shift Match Tolerances} are only relevant for this command.
In the case of the shift-matching method, potentially ambiguous distance restraints are generated by simply matching peak positions to close chemical shifts. In the case of the network anchoring method, chemical shifts are also matched to peaks, but the ambiguous possibilities are refined by selecting only NOE assignments from amongst the possibilities that are supported by other, assigned NOEs or covalent structure. Say, for example, that a peak could arise from a number of resonance pairs. Two resonances A & B are more likely to be a correct assignment for the peak if we know that they are both close to (or bound to) the same intermediary resonance, C, and therefore must be close to each other.
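The two ideas can be illustrated with a small, self-contained Python sketch. It is not the CCPN implementation: the function names, the resonance labels and the simple "count the shared neighbours" score are all assumptions chosen to show the principle.

```python
from itertools import product


def shift_match(peak_position, shifts, tolerances):
    """List the resonance pairs whose shifts match a 2D peak position.

    peak_position: (ppm_dim1, ppm_dim2)
    shifts:        {resonance_name: ppm}  (hypothetical labels)
    tolerances:    (tol_dim1, tol_dim2) in ppm
    """
    per_dim = []
    for ppm, tol in zip(peak_position, tolerances):
        per_dim.append(sorted(name for name, shift in shifts.items()
                              if abs(shift - ppm) <= tol))
    # Every combination of matching resonances is a possible assignment.
    return [(a, b) for a, b in product(*per_dim) if a != b]


def anchoring_support(pair, known_close):
    """Count intermediary resonances C known to be close to both A and B.

    known_close: pairs already supported by assigned NOEs or covalent structure.
    """
    neighbours = {}
    for x, y in known_close:
        neighbours.setdefault(x, set()).add(y)
        neighbours.setdefault(y, set()).add(x)
    a, b = pair
    return len(neighbours.get(a, set()) & neighbours.get(b, set()))


# Hypothetical data: one peak, four resonances, two known close contacts.
shifts = {"30HN": 8.20, "61HN": 8.90, "12HA": 4.31, "45HB2": 4.33}
pairs = shift_match((8.21, 4.32), shifts, (0.04, 0.04))
known = [("30HN", "29HA"), ("12HA", "29HA")]
for pair in pairs:
    print(pair, anchoring_support(pair, known))
```

Shift matching alone leaves both candidate pairs in the restraint; the anchoring score favours the pair that shares a known neighbour.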
Merging and Splitting Restraint Lists
To prepare these newly generated restraint lists for the ARIA calculation we will merge and split them in order to generate restraints that are separated into "Unambiguous" and "Ambiguous" categories. In ARIA we do this so that the "Ambiguous" restraints and peaks follow a different protocol; they enter the calculation after the unambiguous, more certain, restraints have formed the initial structure.
In the M:Structure:Restraints and Violations:{Restraint Lists} tab, merge the two lists that derive from the C-NOESY experiment by clicking on the two relevant rows (probably numbers 2 & 3) while holding down the <Ctrl> key. Now click [Merge Lists] at the bottom and [OK]. You will see that the restraints have been combined and there is now only one list from the C-NOESY. Then, for the remaining, enlarged restraint list, select its row and click [Split Ambig/Unambig]. These lists are now ready for input to ARIA.
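Conceptually, the merge and split operations amount to the following. This is a minimal sketch with a hypothetical data layout (each restraint is a dictionary holding a list of candidate resonance pairs); it does not use or reproduce the CCPN data model.

```python
def merge_lists(*restraint_lists):
    """Combine several restraint lists into one new list."""
    merged = []
    for restraint_list in restraint_lists:
        merged.extend(restraint_list)
    return merged


def split_ambiguous(restraints):
    """Separate restraints with one candidate pair from those with several."""
    unambiguous = [r for r in restraints if len(r["pairs"]) == 1]
    ambiguous = [r for r in restraints if len(r["pairs"]) > 1]
    return unambiguous, ambiguous


# Hypothetical restraints: one from assigned peaks, two from shift matching.
assigned = [{"pairs": [("30HN", "12HA")], "upper": 3.8}]
shift_matched = [
    {"pairs": [("30HN", "12HA"), ("30HN", "45HB2")], "upper": 5.0},
    {"pairs": [("61HN", "45HB2")], "upper": 4.2},
]

unambig, ambig = split_ambiguous(merge_lists(assigned, shift_matched))
print(len(unambig), len(ambig))
```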
Dihedral Restraints
Next we will generate restraints in a different manner: dihedral restraints from backbone chemical shifts. We will be using a program called DANGLE (Dihedral ANgles from Global Likelihood Estimates) which is embedded within Analysis. DANGLE estimates dihedral angles from chemical shifts in a similar manner to TALOS, i.e. it matches a chemical shift & sequence query to a structural database of known PHI/PSI angles and chemical shifts. However, DANGLE uses a different (Bayesian) method from TALOS to produce an angle estimate and tolerance. The idea is to use Bayesian inference to infer what the range of likely PHI/PSI angles might be (using the chemical shifts) by checking all PHI/PSI combinations in 10 degree square bins to see how well such angles can explain the data. Such an analysis allows the user to see uncertainties in the angle predictions, including cases where the chemical shift to structure mapping is redundant and multiple regions in the Ramachandran plot could explain the chemical shift data.
To run DANGLE select M:Structure:DANGLE:Predict Dihedrals. Note that at the top the Chain should be set as "GI:A", the Shift List as "ShiftList 2:2" and Max No. of Islands as 2. This simply specifies which data to use and how strict the analysis should be. Using two islands means that we will reject predictions that result in more than two discrete regions of the Ramachandran plot. To start the analysis press [Run Prediction] and accept "Run1" as the name for the job by pressing [OK] when prompted. Please be aware that DANGLE will take several minutes to finish the calculation.
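To make the notion of "islands" concrete, the sketch below counts discrete regions on a 36 x 36 grid of 10 degree PHI/PSI bins. It is an illustration only and is not DANGLE's own algorithm: the function name, the use of 4-neighbour connectivity and the wrap-around at +/-180 degrees are all assumptions.

```python
def count_islands(occupied, n_bins=36):
    """Count connected regions among occupied (phi_bin, psi_bin) cells.

    occupied: set of (i, j) bin indices, each in range(n_bins).
    Bins are treated as neighbours when they share an edge, and the grid
    wraps around because angles are periodic (assumption for this sketch).
    """
    remaining = set(occupied)
    islands = 0
    while remaining:
        islands += 1
        stack = [remaining.pop()]
        # Flood fill: remove every bin reachable from the starting bin.
        while stack:
            i, j = stack.pop()
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                neighbour = ((i + di) % n_bins, (j + dj) % n_bins)
                if neighbour in remaining:
                    remaining.remove(neighbour)
                    stack.append(neighbour)
    return islands


# Hypothetical prediction: one compact region plus one isolated bin.
likely_bins = {(11, 13), (11, 14), (12, 14), (24, 22)}
max_islands = 2
print(count_islands(likely_bins) <= max_islands)
```

With Max No. of Islands set to 2, a prediction like the one above would be kept, whereas one that scattered over three or more separate regions would be rejected.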
Once the calculation is over you will see the main table filled in with PHI and PSI backbone dihedral angle predictions and their associated error ranges. Further, if you select a row in the main table you will see a plot in Ramachandran (PHI/PSI) space of where the likely angles are deemed to be. Click on the "7 Ser" row and note that there is a lot of red colour in the chart, indicating that DANGLE was not able to make a distinct choice of PHI/PSI: you should not use such a prediction in a structure calculation. Click on the [Next] button to get to "8 Lys". The prediction for this residue is somewhat better, and you could use this in a structure calculation (it has one discrete region) although the error bounds for such a dihedral restraint would be suitably large.
Click on the "12 Glu" row. This residue has a very precise range of predicted PHI/PSI angles. Such a residue could be used in a structure calculation with a high degree of confidence and proportionately narrow error margins.
Note that DANGLE also predicts the secondary structure of the residues, but that this calculation is not made from the angles, but directly from the measured secondary structures in the shift-structure database. To make the restraints themselves set the Constraint Set to "1", which will place the PHI/PSI dihedral restraints with our existing distance restraints, and press [Commit Constraints]. View the generated restraints by going to M:Structure:Restraints & Violations:{Restraints}. Note that if you have a structural model for your protein you can see how the model's angles match with the DANGLE prediction.
Running ARIA
If for any reason you are not confident about the state of your CCPN project at this stage of the demonstration, before we do a structure calculation, you may like to open a pre-prepared CCPN project which has all of the expected restraints present. In the Extend-NMR menu bar select M:Project:Open Project then [Yes] to close the current project and [No] to not save. Navigate to find and select the CcpnDemo006 project, then click [Open].
To start the structure calculation using the restraints we have set up, return to the Extend-NMR GUI and select the {ARIA 2} tab. This panel allows us to control which data goes to the ARIA calculation from the CCPN project. For more fine-grained control, for example to change the annealing protocol, you need to use the ARIA GUI.
Create a new job for ARIA to work on by pressing the green "New Run" button. This "Run" object links together all the data that goes into the same calculation. Ensuring that the {Input Data} and {Peak Lists} tabs are selected (they ought to be by default), change the pulldown menu at the bottom right to N-NOESY:182:1 and click [Add Peak List]; this specifies that ARIA should use this data in this run. Now swap to the {Restraint Lists} tab and add restraint lists in a similar way: select the list name from the pulldown menu and click [Add Constraint List]. Do this for all of the restraint lists, i.e. the H-bond, two distance and dihedral restraint lists. Now that all of the input data is set we have to tell ARIA how to run on the data.
Move to the {Run Settings} tab, found just below the ARIA logo. Here you can see some of the settings to control the ARIA run. In the lower table make sure that the "Ambiguous protocol?" column is set to "Yes" for the peak list and the "Ambig" distance restraints and "No" for the "Unambig" distance restraints. If we wanted to run the ARIA structure calculation locally (noting that you will need to have a CNS executable available) we could [Launch ARIA GUI], make relevant changes, save the ARIA project and then [Setup Project], which will put the ARIA data in a state where it can be run from the command line in the usual way.
For this demonstration you may run the ARIA calculation remotely using the prototype CCPN Grid service. To do this simply press the [Submit to CcpnGrid] button, using the user identification "test" and the password "test123". We will not wait for the final structure calculation, which will take some time (although this protein will take less than an hour on an unloaded server), but instead we will look at a structure calculation that has already been completed, and which uses exactly the same data as you have been looking at in the CCPN project.
Via a web browser go to the page http://webapps.ccpn.ac.uk/ccpngrid/status (if you do start a grid calculation this is the page that you get to via the [Show web page] button). If asked, log on with UID: "test" password: "test123". Here you will see various ARIA calculations, one of which will have status "Finished" and be entitled "CcpnDemo006...". Click on the [Results] button for this job and look at the available data, making note of the fact that we have the option to download an updated CCPN project which contains the newly calculated structures and a violation analysis, all of which remain linked to the NMR data in the CCPN project, so that, for example, we can easily jump from a violation to the offending point in the spectrum.
Analysing Structure Calculation Results
You can see the CCPN project that resulted from this completed ARIA run by loading the project CcpnDemo007: in the Extend-NMR menu bar select M:Project:Open Project then [Yes] to close the current project and [No] to not save. Navigate to find and select the CcpnDemo007 project, then click [Open]. Select the {ARIA 2} tab and then the {Output Data} tab contained therein; you will see a list of all of the data that ARIA has passed back to the CCPN project: restraints, structures & peaks. We will now have a look at this data by using Analysis. If you do not have Analysis open, open it by going to M:CcpNmr:Analysis.
Structures
In the data that comes back from ARIA we will see that two structure ensembles have been entered into the CCPN project: one from the last ARIA iteration and one after the water refinement stage. To see the structures go to M:Structure:Structures. Go to the {Structure Models} tab, click the [Calculate RMSDs] button at the top right and then [OK] to see how well the structural models align. To see a 3-dimensional representation of whichever structure is selected in the table click [Viewer] at the top.
The controls for the structural viewer are as follows:
Rotate with middle-click & drag.
Zoom with the mouse wheel, or middle-click, <Shift> & drag.
Move with middle-click, <Ctrl> & drag.
The mouse right-click brings up a menu that allows you to change the display mode, spin, and print the structure. The left-click is used for atom selection. Try the atom selection by first ensuring that the N-NOESY peak list is selected at the top, then click on an amide location on the structure (i.e. a blue atom) and click [Show Peaks]. This will show a table containing all of the peaks that relate to connections from the selected atom in the structure. The numbers on the dashed lines represent the distances between the atoms.
The same sort of functionality is present in the Edit Assignment popup (M:Assignment:Assignment Panel). If you look at the NOESY spectrum in window2 and assign a peak (by pressing <a> with the cursor over a peak), you can see assignment possibilities via the [Show On Structure] button. Also note that because we now have a structure the Edit Assignment popup will show distances between the 1H resonance assigned in one peak dimension and the 1H possibilities in another peak dimension.
Referring back to the restraints popup (M:Structure:Restraints & Violations:{Restraints}), set the structure pulldown menu to "2", the structure from the ARIA/CNS water refinement, and set the restraint list as one of the Distance lists. You will see that the "Struc Value" is now filled in for the restraints and you are able to select any restraint rows (using <Shift>/<Ctrl> keys) and then click [Show Selected On Structure] to illustrate graphically where on the loaded structure the restraints apply.
Restraints & Violations
In the Restraints & Violations popup (M:Structure:Restraints & Violations), choose Restraint Set "2" at the top; these are the results from the ARIA run (set 1 was the input). Then choose the {Restraint Lists} tab. You will see that ARIA has generated one or more restraint lists for each peak list and restraint list that was used as input data. Choose one of the rows labelled "REJECT", which represents the peaks/restraints that ARIA rejected, and select the {Restraints} tab. From the Violation List pulldown menu at the top of the table select "1" and you should see that many of the rows become red, indicating that these restraints caused violations. If you change the Restraint List pulldown menu to a list that is not labelled as "REJECT" you will see that there are far fewer violations.
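As a rough illustration of what a violation means here, the sketch below assumes that a distance restraint is violated when the distance measured in the structure (the "Struc Value") falls outside the restraint's lower and upper bounds. The function name and the example numbers are hypothetical; the exact criteria used by ARIA and Analysis may differ.

```python
def violation_amount(struc_value, lower, upper):
    """Return how far a measured distance lies outside its bounds (Angstroms).

    A result of 0.0 means the restraint is satisfied by the structure.
    """
    if struc_value < lower:
        return lower - struc_value
    if struc_value > upper:
        return struc_value - upper
    return 0.0


# Hypothetical restraints: (structure distance, lower bound, upper bound).
restraints = [(3.0, 2.56, 3.84), (4.5, 2.56, 3.84), (6.9, 1.72, 5.00)]
for value, lower, upper in restraints:
    print(value, round(violation_amount(value, lower, upper), 2))
```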
Choose a restraint row for one of the violated (red/orange) restraints and click [Show Peaks]. This will open a table containing the peak (or peaks) that gave rise to the restraint. Note that a restraint may be linked to more than one peak, for example where there are symmetry-related peaks that correspond to the same close resonance pair. In this Selected Peaks table click on the peak row and then the [Find Peak] button at the top. You will be taken to the point in the spectra where the peak resides. From here you may choose to look at the peak assignment by using the <a> key with the mouse over the peak. Note that you can directly assign a peak via a restraint by using the [Assign Peak] button below the Restraints table.
Making and Checking NOESY Assignments with Structures
An alternative, more detailed, method of looking at NOE assignments is via the M:Assignment:NOE Contributions option. This is designed to make the whole process of comparing NOE peaks to a structural model efficient. Select this and in the "Peak List & Display Settings" choose the "NOE Peak list" for the NOESY experiment you are currently looking at, then double click the "Use?" column in the lower table for the relevant window, so that it becomes "Yes". Also set 'Mark Peaks' to true. Now move to the {Peak Assignments} tab, select the peak in the spectrum window (left click & drag) and in the NOE Contribution popup click [Selected Peak]. This will show you on the structure and in terms of the close chemical shifts (and distances given the selected structure) what the likely NOE assignments are.
By clicking on the rows in the NOE peaks table you will see that several things happen automatically: The view of the selected windows (in this case the selected Window 4) zooms and marks the selected peak; the graphical structure view highlights the possible atom connections that the peak could represent; and the lower table of the NOE Contributions popup shows the structurally possible, shift-matched resonance pairs ordered in terms of shift and geometric distance. In the lower table clicking on an assignment row and [Assign Selected] sets that assignment for that peak. Also, selecting [Predict Peaks] will use the entered structure and the known chemical shift values to predict the positions of peaks, near to the selected real peak, which correspond to close resonance pairs. These artificial peaks are labelled with the structural distance and are contained in an entirely separate list to the real peaks (so nothing is contaminated and they can be removed easily).
Note that if we change peak assignments then we have the choice of making (or letting ARIA make) a new set of restraints. Alternatively we can curate the existing restraints using the [Update Assignment From Peak] button (see M:Structure:Restraints & Violations:{Restraints}), which will alter the distance restraint to reflect this new assignment.