CcpNmr Course Day 3 of 3
Part three of a three-day CcpNmr course covering making dihedral angle and distance restraints, NOE assignment & protein structure calculation using ARIA, violation analysis and structure validation.
Making Dihedral Angle and Distance Restraints
The penultimate part of the exercise is to look at how we can use a CCPN project to get peak lists and restraints (both dihedral and distance) that can be used in an ARIA structure calculation, and how we can pass intermediate structural information back into Analysis to help with violation analysis and NOE peak assignment.
Open an existing project
For this last section we will be using programs that are part of the Extend-NMR software collection, so rather than starting Analysis we will initially launch the Extend-NMR graphical interface on the command line by typing:
-> extendNmr
When the Extend-NMR menu bar has appeared select M:Project:Open Project. Navigate to find and select the CcpnCourse3a project, then click [Open].
You might get a warning that various files have moved location. You might also get a dialog with a list of spectra paths (because those also have moved location). If the paths are all in grey then just click the "All Done!" button at the bottom. If any path is in red then Analysis cannot find the corresponding spectrum data file, so either you need to tell Analysis where it is (by double clicking the path cell and navigating to the correct location) or accept that that particular spectrum will not have its contours displayed. When the project data is loaded select M:CcpNmr:Analysis and a blue NOESY spectrum will hopefully appear.
Making Restraints From Assigned Peaks
To make a list of distance restraints from the assigned peaks in an NOE peak list first go to M:Structure:Make Distance Restraints. At the top of the popup, change the peak list to C-NOESY:173:1 and set the Restraint Set pulldown to "1". We can leave all of the other parameters alone for demonstration purposes. The Restraint Distance Params section would allow us to specify how the NOE peak intensities relate to the distance bounds of any generated distance restraints. The default method is to calculate a target distance as peak height raised to the power of -1/6 multiplied by some scaling factor, such that the reference intensity (in this case defaults to the peak list's average volume) exactly corresponds to the reference distance (in this case 3.2 Angstroms). The upper and lower bounds of the distance restraint are calculated as fractional changes from the calculated target distance (default is 20% above and below) while observing absolute minimum and maximum values for the bounds (1.72 & 8.00 Angstroms respectively by default). The {Residue Ranges} and {Chem Shift Ranges} tabs would allow you to make only restraints for specific assigned regions of your molecule or for specific shift ranges.
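The default intensity-to-distance conversion described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the code that Analysis runs: the function and argument names are hypothetical, the default numbers are the ones quoted in the text, and the way the limits are applied to both bounds is a guess.

```python
def intensity_to_restraint(intensity, ref_intensity, ref_distance=3.2,
                           lower_frac=0.2, upper_frac=0.2,
                           min_bound=1.72, max_bound=8.0):
    """Return (target, lower, upper) distances in Angstroms for one NOE peak.

    Hypothetical helper: the target scales as intensity**(-1/6), with the
    scaling chosen so that ref_intensity maps exactly onto ref_distance.
    """
    if intensity <= 0 or ref_intensity <= 0:
        raise ValueError("intensities must be positive")

    target = ref_distance * (float(intensity) / ref_intensity) ** (-1.0 / 6.0)

    def clamp(value):
        # Assumption: both bounds are kept inside the absolute limits
        return min(max(value, min_bound), max_bound)

    lower = clamp(target * (1.0 - lower_frac))
    upper = clamp(target * (1.0 + upper_frac))
    # Assumption: the target itself is left unclamped
    return target, lower, upper
```

For a peak at the reference intensity this gives a target of 3.2 Angstroms with bounds of 2.56 and 3.84 Angstroms; a peak 64 times weaker doubles the target distance to 6.4 Angstroms.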
To calculate restraints for assigned peaks from the selected peak list simply press [Make Assigned Restraints]. After a short pause you will see the Restraints and Violations popup appear. This shows that you have one restraint set (a way of grouping related restraints and violations) containing a H-bond restraint list, which was already loaded via the FormatConverter, and a list of over 1000 new restraints. Click on the row of the restraint list in the central table and then click on the {Restraints} tab. Note that you can also get to this point via the M:Structure:Restraints and Violations option.
The
restraints popup will appear and
in its table you will see the restraints listed, mostly as green
coloured rows. Note some restraints also have following grey rows.
These grey rows indicate restraints that are ambiguous, i.e. a possible
connection between two different pairs of 1H resonances. Note that such
ambiguous restraints can represent logical uncertainty (before an NOE
is resolved) or real physical ambiguity where a peak is caused by two
or more overlapping pairs of resonances.
Making Restraints From Unassigned Peaks
There is a second common way to generate distance restraints, which is to match the chemical shifts of resonances to NOE peak positions, thus generating potentially highly ambiguous distance restraints. Such restraints would typically be filtered to select only the correct contributing resonance pairs, by iterative structure generation and violation analysis in a program like ARIA. Firstly, we could leave the matching of chemical shifts to the ARIA program by handing the program peak lists rather than restraint lists, which is what we will demonstrate for the N-NOESY data. However, it is also possible to make such restraints in CCPN. Accordingly, the {Shift Match Tolerances} and {Network Anchoring} tabs in the M:Structure:Make Distance Restraints popup allow you to generate such distance restraints for peaks which do not have assignments. To generate distance restraints by shift matching, firstly set the peak list to "C-NOESY:173:1" and then click [Make Shift Match Restraints]. This command uses the current settings, but {Chem Shift Ranges} and {Shift Match Tolerances} are only relevant for this command.
In the case of the shift-matching method potentially ambiguous distance restraints are generated by simply matching peak positions to close chemical shifts. In the case of network anchoring method, chemical shifts are also matched to peaks, but the ambiguous possibilities are refined by selecting only NOE assignments from amongst the possibilities that are supported by other, assigned NOEs or covalent structure. Say, for example, that a peak could arise from a number of resonance pairs. Two resonances A & B are more likely to be a correct assignment for the peak if we know that they are close to (or bound to) the same intermediary resonance, C and therefore must be close to each other.
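The two ideas in this paragraph can be illustrated with a small, self-contained sketch. It is not how Analysis implements either method; the function names, the dictionary-based data layout and the resonance names in the usage note are all hypothetical, and the peak is reduced to its two 1H dimensions for simplicity.

```python
def shift_match(peak_position, shifts, tolerances):
    """Return every resonance pair whose shifts match a 2D peak position.

    peak_position: (ppm_dim1, ppm_dim2)
    shifts: dict mapping a resonance name to its chemical shift in ppm
    tolerances: (tol_dim1, tol_dim2) in ppm
    """
    dim1 = [name for name, ppm in shifts.items()
            if abs(ppm - peak_position[0]) <= tolerances[0]]
    dim2 = [name for name, ppm in shifts.items()
            if abs(ppm - peak_position[1]) <= tolerances[1]]
    # Every combination is a possible contribution, excluding self-pairs
    return sorted((a, b) for a in dim1 for b in dim2 if a != b)


def network_anchor(pairs, known_close):
    """Keep only the pairs that share an intermediary resonance.

    known_close: iterable of two-name tuples for resonances already known
    to be close (from assigned NOEs or covalent structure).
    """
    neighbours = {}
    for a, b in known_close:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    # A and B are supported if some resonance C is close to both of them
    return [(a, b) for a, b in pairs
            if neighbours.get(a, set()) & neighbours.get(b, set())]
```

For example, with shifts of 8.20 ppm for "2HN" and 4.50 and 4.52 ppm for "1HA" and "3HA", a peak at (8.21, 4.51) with tolerances of 0.04 ppm matches both ("2HN", "1HA") and ("2HN", "3HA"). If "2HN" and "1HA" are each known to be close to "2HA", network anchoring keeps only the first pair.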
Merging and Splitting Restraint Lists
To prepare these
newly generated restraint lists for the ARIA calculation we will merge
and split them in order to generate restraints that are separated into
"Unambiguous" and "Ambiguous" categories. In ARIA we do this so that
the "Ambiguous" restraints and peaks follow a different protocol; they
enter the calculation after the unambiguous, more certain, restraints
have formed the initial structure.
In the
M:Structure:Restraints and Violations:{Restraint Lists} tab, merge the
two lists that derive from the C-NOESY experiment by clicking on the two
relevant rows (probably numbers 2 & 3) while holding down the
<Ctrl> key. Now click [Merge Lists] at the bottom and [OK]. You
will see that the restraints have been combined and there is now only
one list from the C-NOESY. Then for the remaining, enlarged restraint
list, select its row and click [Split Ambig/Unambig]. These are now
ready for input to ARIA.
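As a rough picture of what merging and splitting do, the sketch below treats each restraint as a dictionary holding a list of contributing resonance pairs. This data layout and both function names are hypothetical and chosen purely for illustration; the real restraint objects live in the CCPN data model.

```python
def merge_lists(*restraint_lists):
    """Combine several restraint lists into one new list."""
    merged = []
    for restraint_list in restraint_lists:
        merged.extend(restraint_list)
    return merged


def split_ambiguous(restraints):
    """Separate restraints by their number of contributing resonance pairs.

    A restraint with exactly one contributing pair is unambiguous;
    one with two or more is ambiguous.
    """
    unambiguous = [r for r in restraints if len(r['items']) == 1]
    ambiguous = [r for r in restraints if len(r['items']) > 1]
    return unambiguous, ambiguous
```

Merging first and splitting afterwards means that ARIA receives one unambiguous and one ambiguous list per experiment, regardless of how the restraints were originally generated.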
Dihedral Restraints
Next we will generate restraints in a different manner; dihedral restraints from backbone chemical shifts. We will be using a program called DANGLE (Dihedral ANgles from Global Likelihood Estimates) which is embedded within Analysis. DANGLE estimates dihedral angles from chemical shifts in a similar manner to TALOS; i.e. it matches a chemical shift & sequence query to a structural database of known PHI/PSI angles and chemical shifts. However, DANGLE uses a different (Bayesian) method to produce an angle estimate and tolerance, compared to TALOS. The idea is to use Bayesian inference to infer what the range of likely PHI/PSI angles might be (using the chemical shifts) by checking all PHI/PSI combinations in 10 degree square bins to see how well such angles can be used to explain the data. Such an analysis allows for the user to see uncertainties in the angle predictions, including where the chemical shift to structure mapping is redundant and there are multiple regions in the Ramachandran plot which could explain the chemical shift data.
To run DANGLE select M:Structure:DANGLE: Predict
Dihedrals. Note that at the top the Chain should be set as
"GI:A", the Shift List as "ShiftList 2:2" and Max No. of Islands as 2.
This simply specifies which data to use and how strict the analysis
should be. Using two islands means that we will reject predictions that
result in more than two discrete regions of the Ramachandran plot. To
start the analysis press [Run Prediction] and accept "Run1" as the name
for the job by pressing [OK] at the opportune moment. Please be aware
that DANGLE will take several minutes to finish the calculation.
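To make the idea of "islands" concrete, the sketch below counts discrete regions among the occupied 10 degree bins of a Ramachandran grid. How DANGLE actually defines and counts its islands is not described here, so this is an assumption: bins that touch edge-to-edge are taken to belong to the same region, and the grid wraps around because -180 and +180 degrees are the same angle. The function name and the bin-index representation are hypothetical.

```python
def count_islands(occupied_bins, n_bins=36):
    """Count connected regions among occupied (phi_bin, psi_bin) indices.

    occupied_bins: iterable of (i, j) index pairs, each in range(n_bins).
    With 10 degree bins there are 36 bins along each axis.
    """
    remaining = set(occupied_bins)
    islands = 0
    while remaining:
        # Start a new island from any bin that has not been visited yet
        islands += 1
        stack = [remaining.pop()]
        while stack:
            i, j = stack.pop()
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                # The modulo makes the grid periodic in both angles
                neighbour = ((i + di) % n_bins, (j + dj) % n_bins)
                if neighbour in remaining:
                    remaining.remove(neighbour)
                    stack.append(neighbour)
    return islands
```

Under this assumption a prediction would be rejected when count_islands returns more than the Max No. of Islands setting.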
Once
the calculation is over you will see the main table filled in with PHI
and PSI backbone dihedral angle predictions and their associated error
ranges. Further, if you select a row in the main table you will see a
plot in Ramachandran (PHI/PSI) space of where the likely angles are
deemed to be. Click on the "7 Ser" row and note that there is a lot of
red colour in the chart, indicating that DANGLE was not able to make a
distinct choice of PHI/PSI: you should not use such a prediction in a
structure calculation. Click on the [Next] button to get to "8 Lys".
The prediction for this residue is somewhat better, and you could use
this in a structure calculation (it has one discrete region) although
the error bounds for such a dihedral restraint would be suitably large.
Click on the "12 Glu" row. This residue has a very
precise range of predicted PHI/PSI angles. Such a residue could be used
in a structure calculation with a high degree of confidence and
proportionately narrow error margins.
Note
that DANGLE also
predicts the secondary structure of the residues, but that this
calculation is not made from the angles, but directly from the measured
secondary structures in the shift-structure database. To make the
restraints themselves set the Restraint Set to "1", which will place
the PHI/PSI dihedral restraints with our existing distance restraints
and press [Commit Restraints]. View the generated
restraints by going to M:Structure:Restraints &
Violations:{Restraints}. Note that if you have a structural model for
your protein you can see how the model's angles match with the DANGLE
prediction.
NOE assignment & protein structure calculation using ARIA
If for any reason you are not confident about the state of your CCPN project at this stage of the demonstration before we do a structure calculation you may like to open a pre-prepared CCPN project, which has all of the expected restraints present: In the Extend-NMR menu bar select M:Project:Open Project then [Yes] to close the current project and [No] to not save. Navigate to find and select the CcpnCourse3b project, then click [Open].
To start the structure calculation using the
restraints we have set up, return to the Extend-NMR GUI and select the
{ARIA 2} tab. This panel allows us to control which data goes to the
ARIA calculation from the CCPN project. For more fine-grained control
you need to use the ARIA GUI; for example to change the annealing
protocol.
Create a new job for ARIA to work on by pressing
the green "New Run" button. This "Run" object links together all the
data that goes into the same calculation. Ensuring that the {Input
Data} and {Peak Lists} tabs are selected (they ought to be by default),
change the pulldown menu at the bottom right to N-NOESY:182:1 and click
[Add Peak List]; this will state that in this run ARIA should use this
data. Now swap to the {Restraint Lists} tab and add restraint lists in
a similar way: Select the list name from the pulldown menu and click
[Add Constraint List]. Do this for all of the restraint lists, i.e. the
H-bond, two distance and dihedral restraint lists. Now that all of the
input data is set we have to tell ARIA how to run on the data.
Move
to the {Run Settings} tab, found just below the ARIA logo. Here you can
see some of the settings to control the ARIA run. In the lower table
make sure that the "Ambiguous protocol?" column is set to "Yes" for the
peak list and the "Ambig" distance restraints and "No" for the
"Unambig" distance restraints. If we wanted to run the ARIA structure
calculation locally (noting you will have to have a CNS executable
available) we can [Launch ARIA GUI], make relevant changes, save the
ARIA project and then [Setup Project], which will put the ARIA data in
a state that it can be run from the command line in the usual way.
For this demonstration you may run the ARIA calculation remotely using the prototype CCPN Grid service. To do this simply press the [Submit to CcpnGrid] button. For this demonstration we are using the user identification "test" and the password "test123", which is passed to the CCPN server automatically for the demonstration. We will not wait for the final structure calculation, which will take some time (although this protein will take less than an hour on an unloaded server), but instead we will look at a structure calculation that has already been completed, and which uses exactly the same data as you have been looking at in the CCPN project.
Either click on [Show web page] in the "CCPN Grid progress" popup that appears, or using a web browser go to the page http://webapps.ccpn.ac.uk/ccpngrid/status. If asked, log on with UID: "test" password: "test123". Here you will see various ARIA calculations, one of which will have status "Finished" and be entitled "CcpnCourse3b...". Click on the [Results] button for this job and look at the available data, making note of the fact that we have the option to download an updated CCPN project which contains the newly calculated structures and a violation analysis, all of which remain linked to the NMR data in the CCPN project, so that, for example, we can easily jump from a violation to the offending point in the spectrum.
You can see the CCPN project that resulted from this completed ARIA
run by loading the project CcpnCourse3c: In the Extend-NMR menu bar
select M:Project:Open Project then [Yes] to close the current project
and [No] to not save. Navigate to find and select the
CcpnCourse3c project, then
click [Open]. Select the {ARIA 2} tab and then the {Output Data} tab
contained therein; you will see a list of all of the data that ARIA has
passed back to the CCPN project: restraints, structures & peaks. We
will now have a look at this data by using Analysis. If you do not have
Analysis open, open it by going to M:CcpNmr:Analysis.
Structures
In the data that comes back from ARIA we will see that two structure ensembles have been entered into the CCPN project: one from the last ARIA iteration and one after the water refinement stage. To see the structures go to M:Structure:Structures. In the {Structure Models} tab, click the [Calculate RMSDs] button at the top right and then [OK] to see how well the structural models align. To see a 3-dimensional representation of whichever structure is selected in the table, click the [Viewer] button at the top.
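For readers unfamiliar with the quantity being reported, the following sketch computes a coordinate RMSD of each model from the ensemble mean. It is a simplified illustration rather than the Analysis calculation: it assumes the models are already superimposed and list the same atoms in the same order, and the function name and input layout are hypothetical.

```python
import math


def rmsd_to_mean(models):
    """Return one RMSD (in the units of the coordinates) per model.

    models: list of models, each a list of (x, y, z) tuples with the
    atoms in the same order. No superposition is performed here.
    """
    n_models = float(len(models))
    n_atoms = len(models[0])

    # Mean position of every atom over the ensemble
    mean = [tuple(sum(model[a][k] for model in models) / n_models
                  for k in range(3))
            for a in range(n_atoms)]

    rmsds = []
    for model in models:
        squared = sum((model[a][k] - mean[a][k]) ** 2
                      for a in range(n_atoms) for k in range(3))
        rmsds.append(math.sqrt(squared / n_atoms))
    return rmsds
```

A low value for every model indicates a tightly converged ensemble; a single model with a much larger value than the rest is worth inspecting in the viewer.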
The controls for the structural viewer are as
follows:
Rotate with middle-click & drag.
Zoom with the mouse wheel, or middle-click, <Shift> & drag.
Move with middle-click, <Ctrl> & drag.
The mouse right-click brings up a menu that allows you to change the
display mode, spin, and print the structure. The left-click is used for
atom selection. Try the atom selection by first ensuring that the
N-NOESY peak list is selected at the top, click on an amide location on
the structure (i.e. a blue atom) and then click [Show Peaks]. This will
show a table containing all of the peaks that relate to connections
from the selected atom in the structure. The numbers on the dashed
lines represent the distances between the atoms.
The same sort of functionality is present in the Edit Assignment popup (M:Assignment:Assignment Panel).
If you look at the NOESY spectrum in window2, and assign a peak (by
pressing <a> with the cursor over a peak), you can see assignment
possibilities via the [Show On Structure] button. Also note that
because we now have a structure the Edit Assignment popup will show
distances between the 1H resonance assigned in one peak dimension and the
1H possibilities in another peak dimension.
Referring back to the restraints popup (M:Structure:Restraints & Violations
{Restraints}), set the structure pulldown menu to "2", the structure
from the ARIA/CNS water refinement, and set the restraint list as one
of the Distance lists. You will see that the "Struc Value" is now
filled-in for the restraints and you are able to select any
restraint rows (using
<Shift>/<Ctrl> keys) and then click [Show Selected On
Structure] to illustrate graphically where on the loaded structure the
restraints apply.
Violation analysis and structure validation
Restraints & Violations
In the Restraints &
Violations popup (M:Structure:Restraints
& Violations), choose Restraint Set "2" at the top; these are the
results from the ARIA run (set 1 was the input) and choose the
{Restraint Lists} tab. You will see that ARIA has generated one or more
restraint lists for each peak list and restraint list that was used as
input data. Choose one of the rows labelled "REJECT", which represents
the peaks/restraints that ARIA rejected, and select the {Restraints}
tab. From the Violation List pulldown menu at the top of the table
select "1" and you will hopefully see that many of the rows become red,
indicating that these restraints caused violations. If you change the
Restraint List pulldown menu to a list that is not labelled as "REJECT"
you will see that there are far fewer violations.
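The notion of a violated distance restraint can be sketched as follows. This is an illustration only: the function names are hypothetical, and the source does not state the exact rule Analysis or ARIA uses to flag a row. For ambiguous restraints the sketch uses the r^-6 summed effective distance that is commonly applied to ambiguous distance restraints, which is an assumption as far as this tutorial is concerned.

```python
def effective_distance(distances):
    """Combine the distances of all contributing pairs into one value.

    Uses an r**-6 sum, so the shortest contribution dominates. For a
    single distance the result is that distance.
    """
    return sum(float(d) ** -6 for d in distances) ** (-1.0 / 6.0)


def violation_amount(distance, lower, upper):
    """Return how far a distance lies outside its bounds (0.0 if inside)."""
    if distance > upper:
        return distance - upper
    if distance < lower:
        return lower - distance
    return 0.0
```

For example, a structural distance of 5.0 Angstroms against bounds of 1.8 and 4.0 Angstroms is violated by 1.0 Angstrom, whereas 3.0 Angstroms satisfies the restraint.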
Choose
a restraint row for one of the violated (red/orange) restraints and
click [Show Peaks]. This will open a table containing the peak (or
peaks) that gave rise to the restraint. Note that a restraint may be
linked to more than one peak, for example where there are symmetry
related peaks that correspond to the same close resonance pair. In this
Selected Peaks table click on the peak row and then the [Find Peak]
button at the top. You will be whisked to the point in the spectra
where the peak resides. From here you may choose to look at the peak
assignment by using the <a> key with the mouse over the peak.
Note you can directly assign a peak via a restraint by using the
[Assign Peak] button from below the Restraints table.
Making and Checking NOESY Assignments with Structures
An alternative, more detailed, method of looking at NOE assignments is via
the M:Assignment:NOE Contributions option. This is designed to make the
whole process of comparing NOE peaks to a structural model efficient. Select
this and in the "Peak List & Display Settings" choose the "NOE Peak
list" for the NOESY experiment you are currently looking at, then
double click the "Use?" column in the lower table for the relevant
window, so that it becomes "Yes". Also set 'Mark Peaks' to true. Now
move to the {Peak Assignments} tab, select the peak in the spectrum
window (left click & drag) and in the NOE Contribution popup click
[Selected Peak]. This will show you on the structure and in terms of
the close chemical shifts (and distances given the selected structure)
what the likely NOE assignments are.
By clicking on the rows in the NOE peaks table you will see that several things happen automatically: The view of the selected windows (in this case the selected Window 4) zooms and marks the selected peak; the graphical structure view highlights the possible atom connections that the peak could represent; and the lower table of the NOE Contributions popup shows the structurally possible, shift-matched resonance pairs ordered in terms of shift and geometric distance. In the lower table clicking on an assignment row and [Assign Selected] sets that assignment for that peak. Also, selecting [Predict Peaks] will use the entered structure and the known chemical shift values to predict the positions of peaks, near to the selected real peak, which correspond to close resonance pairs. These artificial peaks are labelled with the structural distance and are contained in an entirely separate list to the real peaks (so nothing is contaminated and they can be removed easily).
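The principle behind peak prediction can be shown with a short sketch: every pair of protons that is close in the structure and has known shifts gives an expected peak position, labelled with the structural distance. This is not the Analysis implementation; the function name, the dictionary inputs and the 5.0 Angstrom cutoff are assumptions made for illustration.

```python
import math


def predict_peaks(coords, shifts, max_distance=5.0):
    """Predict 2D 1H-1H peak positions from a structure and a shift list.

    coords: dict mapping an atom name to its (x, y, z) position
    shifts: dict mapping the same atom names to chemical shifts in ppm
    Returns a list of ((ppm_dim1, ppm_dim2), distance) tuples.
    """
    names = sorted(set(coords) & set(shifts))
    peaks = []
    for a in names:
        for b in names:
            if a == b:
                continue
            distance = math.sqrt(sum((p - q) ** 2
                                     for p, q in zip(coords[a], coords[b])))
            if distance <= max_distance:
                # Both (a, b) and (b, a) appear: one peak on each side
                # of the diagonal
                peaks.append(((shifts[a], shifts[b]), distance))
    return peaks
```

Keeping the predicted peaks in their own list, as Analysis does, means they can be compared with the real peaks and then discarded without touching the experimental data.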
Note
that if we change peak assignments then we have the choice of making
(or letting ARIA make) a new set of restraints. Alternatively we can
curate the existing restraints using the [Update Assignment From
Peak] button (see M:Structure:Restraints & Violations
{Restraints}), which will alter the distance restraint to reflect this
new
assignment.
Open an existing project
If the project "CcpnCourse3c" is not already open start Extend-NMR on the command line by typing:
-> extendNmr
When the Extend-NMR menu bar has appeared select M:Project:Open Project. Navigate to find and select the CcpnCourse3c project, then click [Open]. This loads the data that resulted from an ARIA calculation and was used in part 4 of this tutorial series.
Setting Up a CING Run
In the Extend-NMR GUI select the {CING} tab and click the green [New Run] button in the upper right corner. This run specification will contain all of the data that CING will analyse. When a new run is made the {Structures} table will fill with the various models from an ensemble. Accepting this, we move on to the {Shifts & Measurements} tab. Here click [Add Measurement List] to add the shift list to the CING analysis. Move to the {Peak Lists} tab, and add the N-NOESY:182:2 peak list (this is the one that ARIA made) by selecting it from the lower right pulldown menu and selecting [Add Peak List]. Finally in the restraint lists tab add restraint list "Distance-2:1". This restraint list is the one that came back from ARIA and consequently lives in the second restraint set. Note that in general you may select as much data as you like, but we are just selecting a smaller subset for demonstration purposes.
Select the {Run Settings} tab below the
CING logo. To submit the analysis press the [Submit Project] button.
Note that this may take some time, so you may wish to let the
demonstrator make a real submission and view some previously calculated
results, thus sparing the iCING server. You are nonetheless welcome to use
this server generally.
To view some precalculated validation data look at the following URL in a web browser:
http://nmr.cmbi.ru.nl/CASD-NMR-CING/data/GR/CGR26ACheshire/CGR26ACheshire.cittp
Simple Python Scripts for CCPN
The macro can be run either from the table directly, by selecting its row and clicking [Run], or from the main Analysis menu, which is often more convenient. To add it to the menu, double-click in the "In main menu?" column of the table. If you close this popup window you will now see the new macro appear in the M:Macro section of Analysis. If you make a change to the Python code for a macro that has been entered into Analysis, make sure you select M:Macro:Reload Menu Macros to ensure that the latest version of the code is being used.
def addMarksToPeaks(argServer, peaks=None):
  """Descrn: Adds position line markers to the selected peaks.
     Inputs: ArgumentServer, List of Nmr.Peaks
     Output: None
  """
  from ccpnmr.analysis.core.MarkBasic import createPeakMark

  if not peaks:
    peaks = argServer.getCurrentPeaks()
    # no peaks - nothing happens

  for peak in peaks:
    createPeakMark(peak, remove=False)


def calcAveragePeakListIntensity(argServer, peakList=None, intensityType='height'):
  """Descrn: Find the average height of peaks in a peak list.
     Inputs: ArgumentServer, Nmr.PeakList
     Output: Float
  """
  from ccpnmr.analysis.core.ConstraintBasic import getMeanPeakIntensity

  if not peakList:
    peakList = argServer.getPeakList()

    if not peakList:
      argServer.showWarning('No peak list selected')
      return

  answer = argServer.askYesNo('Use peak volumes? Height will be used otherwise.')
  if answer: # is true
    intensityType = 'volume'

  spec = peakList.dataSource
  expt = spec.experiment
  intensity = getMeanPeakIntensity(peakList.peaks, intensityType=intensityType)
  data = (intensityType,expt.name,spec.name,peakList.serial,intensity)
  argServer.showInfo('Mean peak %s for %s %s peak list %d is %e' % data)

  return intensity
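A new macro can follow the same pattern. The sketch below is a hypothetical example, not part of the CCPN distribution: it relies only on the getCurrentPeaks and showInfo calls already used in the macros above, and the StubArgumentServer class is an invented stand-in that lets the function be tried outside Analysis.

```python
def countSelectedPeaks(argServer, peaks=None):
  """Descrn: Reports how many peaks are currently selected.
     Inputs: ArgumentServer, List of Nmr.Peaks
     Output: Int
  """
  if not peaks:
    # Fall back to an empty list if nothing is selected
    peaks = argServer.getCurrentPeaks() or []

  count = len(peaks)
  argServer.showInfo('Number of selected peaks: %d' % count)

  return count


class StubArgumentServer(object):
  """Hypothetical stand-in for the Analysis ArgumentServer (testing only)."""

  def __init__(self, peaks):
    self.peaks = peaks
    self.messages = []

  def getCurrentPeaks(self):
    return self.peaks

  def showInfo(self, message):
    # Record the message instead of opening a dialog
    self.messages.append(message)
```

Inside Analysis the real ArgumentServer is passed in automatically when the macro is run, so the stub is only needed for trying the logic at a plain Python prompt.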
*End*