Creating non-canonical amino acid ChemComp files

ChemComp files

If you want to include an amino acid in a protein/peptide Chain in CcpNmr Analysis which is not one of the 20 standard natural amino acids, you will need a so-called ChemComp file for this amino acid. A ChemComp file is an .xml file which contains all the chemical information about your amino acid: atoms, bonds, protonation states, names, codes and information about magnetic equivalence, prochirality etc. The ChemComp files for the 20 natural amino acids, as well as some other common derivatives (e.g. the D-amino acids), are provided as part of your CcpNmr Analysis Version 3 distribution (see below for a full list of these). The .xml files themselves are stored in your Analysis installation path in the ccpnmr3.x.y/src/python/ccpnmodel/data/ccpnv3/ccp/molecule/ChemComp folder.
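If you want to check which ChemComp files your own installation provides, you can list the contents of that folder. The following is a minimal Python sketch, not part of CcpNmr Analysis: the function name list_chemcomp_files and its install_root argument are illustrative assumptions, and you need to pass in your actual ccpnmr3.x.y installation folder.

```python
from pathlib import Path

# Relative location of the ChemComp folder, as given in the text above.
CHEMCOMP_SUBPATH = Path("src/python/ccpnmodel/data/ccpnv3/ccp/molecule/ChemComp")


def list_chemcomp_files(install_root):
    """Return the sorted .xml files found below the ChemComp folder.

    install_root is the top-level ccpnmr3.x.y folder (hypothetical argument name).
    """
    folder = Path(install_root) / CHEMCOMP_SUBPATH
    if not folder.is_dir():
        raise FileNotFoundError(f"No ChemComp folder found at {folder}")
    # rglob also looks in subfolders, in case the files are not all at the top level.
    return sorted(folder.rglob("*.xml"))
```

For example, `for path in list_chemcomp_files("/path/to/ccpnmr3.x.y"): print(path.name)` prints one file name per line.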

Additionally, you can find over 3000 other ChemComp files for non-natural amino acids on the VuisterLab GitHub page at https://github.com/VuisterLab/CcpNmr-ChemComps/tree/master/data/pdbe/chemComp/archive/ChemComp/protein. You can download any of these if you need them for your peptide/protein Chain. The repository itself is not easy to search by molecule, so we recommend first using the PDB's Chemical Sketch Tool to find your molecule of interest in the PDB. The PDB code (which we refer to as the chemComp code) should be 3-5 characters in length. You can then enter this code in the search box in the left-hand sidebar of our GitHub repository to see if it is present.
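If you have downloaded or cloned the repository, the same lookup can be done locally. The sketch below is a hedged illustration only: the function names are made up for this example, and it assumes that each file name embeds the code between '+' signs (for example protein+Aib+<id>.xml). If your copy of the repository uses a different naming scheme, adjust the match accordingly.

```python
import re
from pathlib import Path

# chemComp codes are 3-5 in length; digits are accepted as well as letters,
# because codes such as "D11" and "Pg9" contain digits.
CODE_PATTERN = re.compile(r"[A-Za-z0-9]{3,5}")


def is_plausible_chemcomp_code(code):
    """Return True if the code has the expected length and characters."""
    return CODE_PATTERN.fullmatch(code) is not None


def find_chemcomp_files(clone_dir, code):
    """Return the .xml files below clone_dir whose name contains the code.

    ASSUMPTION: file names embed the code between '+' signs.
    The comparison ignores case, so "AIB" also finds "Aib".
    """
    if not is_plausible_chemcomp_code(code):
        raise ValueError(f"{code!r} does not look like a chemComp code")
    needle = f"+{code.lower()}+"
    return sorted(
        path for path in Path(clone_dir).rglob("*.xml")
        if needle in path.name.lower()
    )
```

An empty result means that no matching file was found under the assumed naming scheme, in which case it is worth double-checking with the search box on GitHub.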

If you can't find a ChemComp for the amino acid you require, you can create your own using ChemBuild.

ChemBuild

ChemBuild is a program written by Tim Stevens which is part of the CcpNmr suite of programs. The executable for this program is located in your ccpnmr3.x.y/bat (Windows) or ccpnmr3.x.y/bin (Linux/Mac) folder along with the other CcpNmr Analysis executables. To start the program, simply double-click ChemBuild.bat (Windows) or chemBuild (Linux/Mac) in your file browser, or run the command ./chemBuild from a terminal in the bin folder (Linux/Mac).
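If you want to locate the launcher from a script, the platform-dependent paths described above can be expressed as follows. This is only a sketch: the function name chembuild_launcher and its arguments are hypothetical, and the installation folder is something you must supply yourself.

```python
import platform
from pathlib import Path


def chembuild_launcher(install_root, system=None):
    """Return the expected path of the ChemBuild launcher.

    install_root is the ccpnmr3.x.y folder; system defaults to the current
    platform name as reported by platform.system() ("Windows", "Linux", "Darwin").
    """
    system = system or platform.system()
    root = Path(install_root)
    if system == "Windows":
        return root / "bat" / "ChemBuild.bat"
    # Linux and Mac share the same launcher location.
    return root / "bin" / "chemBuild"
```

The function only builds the path; it does not check that the file exists or start the program.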

ChemBuild is similar to many other chemical drawing programs, but it also adds information about magnetic equivalence and is particularly good at dealing with peptides/proteins composed of linked amino acids (or similarly polynucleotides). It also has the ability to read and write CCPN ChemComp files. A manual is available, but much of what you will need to know to create an amino acid ChemComp is either relatively intuitive or will be described here.

Creating your own bespoke ChemComp

Linked Amino Acid Templates

In the Compound Library on the left-hand side of the ChemBuild Desktop, open up the Amino Acids section and then drag the Linked Template onto the main canvas. On the right-hand side, select the Compound Variants tab. You will see that the amino acid template contains a number of different Compound Variants, often with different protonation states. Some variants are of the free amino acid, others of the amino acid at the N-terminal position in a chain (labelled start), at a middle position (labelled middle) or at the C-terminal position (labelled end). The linked variants will show little green arrows on the molecule where they link to neighbouring amino acids. Select the middle variant while you build your amino acid, so that you don't accidentally remove the links (this is particularly important if you will be replacing the amide hydrogen with a different chemical group).

Building your Amino Acid

You can build your amino acid by selecting, adding or deleting atoms and bonds.

Select atoms by clicking on them. Hold down the Ctrl key (the Cmd key on a Mac) to select multiple atoms in one go. You can also right-drag the mouse over multiple atoms to select several in one go. And finally, you can use Ctrl/Cmd+A to select all atoms.

Delete selected atoms by pressing the Delete key (or Fn+Backspace on a Mac). Alternatively, use the main menu (Edit / Delete atoms), the right-hand mouse menu (Edit / Delete selected atoms) or the toolbar (icon with the black atom with a red cross over it).

Add atoms by dragging them from the Elements section on the left-hand side or by pressing the letter keys for atoms with one-letter atomic symbols (C for carbon etc.). If you press C while another atom is selected, the new carbon atom will automatically be bonded to the selected atom, otherwise your new atom will be placed onto the canvas without any bonds. Note that each element contains black dots for any free valencies. You can automatically add hydrogen atoms to all remaining free valencies of your selected atoms by using the menus (Edit / Add Hydrogens) or toolbar button (green plus icon with two small grey H atoms attached).

Create bonds between atoms by dragging one atom towards another, such that two valency dots are close and become highlighted in green. Let go of the dragged atom and a bond will form between the two atoms.

Delete bonds by deleting atoms, or by selecting atoms and then using the menus (Edit / Remove bonds) or toolbar (icon with two bonded atoms and a red cross over the bond).

Rearranging atoms is possible by dragging any selected atoms with the mouse (if several atoms are selected they are dragged as a group). You can also use several buttons in the toolbar: the purple circular arrows will rotate the selected atom (group), the next two buttons will flip the atom (group) vertically or horizontally and the button with three atoms with green arrows between them will auto-arrange your atoms. These functions are also all available from the Edit menu.

You can also make use of the Compound Library on the left-hand side of the Desktop to drag particular chemical groups onto the canvas and connect them to your amino acid template.

Importing Molecules

It is also possible to import molecules in .pdb, .mdl or .mol2 format. Go to the Import menu and select the format you wish to import. You can then connect up your imported molecule to the amino acid template. But it is always important to start with the template, as you need the differently connected variants, and it isn't currently possible to create these yourself ab initio in ChemBuild.

Changing Labels

You may have noticed that the linked template starts with the standard C/N/CA/CB labelling for an amino acid. Additional groups will simply be labelled with numbers (C1, C2, H1 etc.). We suggest you adapt the atom labels to fit in with the usual Greek alphabet labelling. To change a label, simply double-click on it, change it as desired and press Enter.

Adding Magnetic Properties

You may have noticed that the hydrogen atoms of a methyl group are automatically connected via red dashed lines. This indicates that these atoms are magnetically equivalent. Other sets of atoms (e.g. the hydrogen atoms in a CH2 group) may be connected by blue dashed lines. This indicates sets of prochiral atoms. Most of these groups will be recognised automatically, but phenyl rings typically need to have the magnetic equivalence added manually. To do this, select the Atom Properties tab on the right-hand side of the ChemBuild Desktop. Select each of the pairs of carbon or hydrogen atoms that are magnetically equivalent in turn and then press the Set equivalents button in the NMR Groups section. For each pair of equivalent atoms you also need to make sure that the atom names match each other (e.g. CH1 and CH2, HZ1 and HZ2, C16_1 and C16_2, H74 and H75), i.e. they must share the same root and end with single-digit numbers which differ.
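The naming rule for a pair of equivalent atoms can be written down as a small check. The Python sketch below is an illustration of the rule as stated here, not ChemBuild's own code; the function name equivalent_name_root is hypothetical.

```python
DIGITS = "0123456789"


def equivalent_name_root(name_a, name_b):
    """Return the shared root if the two atom names follow the rule, else None.

    Rule: everything except the last character must be identical, and the
    last characters must be two different single digits.
    """
    if len(name_a) < 2 or len(name_a) != len(name_b):
        return None
    root_a, last_a = name_a[:-1], name_a[-1]
    root_b, last_b = name_b[:-1], name_b[-1]
    if last_a not in DIGITS or last_b not in DIGITS:
        return None
    if root_a != root_b or last_a == last_b:
        return None
    return root_a
```

With this check, CH1/CH2 share the root CH, C16_1/C16_2 share the root C16_ and H74/H75 share the root H7, whereas a pair such as HD1/HE2 is rejected because the roots differ.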

Compound Information

Before exporting your amino acid in ChemComp format, you need to edit the Compound Information. Select Compound Information on the right-hand side of the ChemBuild Desktop. Enter a name for your compound and set the CCPN MolType to protein. The CCPN Code can either be left as it is (it is randomly generated and almost certainly unique), or you can enter a three-letter code. If you do, make sure it does not clash with one of the three-letter codes used in the list of amino acids on GitHub, or else you won't be able to use your amino acid within CcpNmr Analysis. If you want to use your ChemComp within CcpNmr Analysis Version 2, then you will need to use a three-letter code.
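A clash check of this kind is easy to script once you have collected the codes that are already in use. The sketch below is hypothetical: the function name is made up, the short list only contains a few codes from the distribution table in this document, and comparing without regard to case is a cautious assumption rather than documented CcpNmr behaviour.

```python
def clashes_with_existing(code, existing_codes):
    """Return True if the code matches any existing code, ignoring case."""
    wanted = code.strip().lower()
    return any(wanted == existing.strip().lower() for existing in existing_codes)


# A few codes taken from the list of distributed ChemComps; a real check
# should use the full list, including the codes found on GitHub.
SOME_EXISTING_CODES = ["Ala", "Aib", "Dal", "Orn", "Pg9"]
```

For instance, `clashes_with_existing("AIB", SOME_EXISTING_CODES)` reports a clash, so a different code should be chosen.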

Exporting the ChemComp .xml file

Once your amino acid is ready for export, go to the Export menu, select CCPN ChemComp XML file and choose a destination. A folder containing the .xml file will be saved at that location.
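As a quick sanity check after exporting, you can confirm that the saved file is well-formed XML. The following Python sketch uses only the standard library and makes no assumptions about the ChemComp schema itself; the function name check_exported_xml is hypothetical and the export folder is whichever destination you chose.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def check_exported_xml(export_folder):
    """Map each .xml file name below export_folder to its root tag.

    Files that cannot be parsed as XML are mapped to None.
    """
    results = {}
    for path in sorted(Path(export_folder).rglob("*.xml")):
        try:
            results[path.name] = ET.parse(str(path)).getroot().tag
        except ET.ParseError:
            results[path.name] = None
    return results
```

This only tells you that the file can be parsed; it does not verify that the chemical content is what you intended.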

Amino Acid ChemComp files included in the CcpNmr Analysis Version 3 Distribution

3-Letter Code  Amino Acid name
Aba            (2R)-2-aminobutanoic acid
Aib            2-methylalanine
Ala            alanine
Arg            arginine
Asn            asparagine
Asp            aspartic acid
Bmt            4-methyl-4-[(E)-2-butenyl]-4,N-methyl-threonine
Cgu            gamma-carboxy-glutamic acid
Cso            S-hydroxycysteine
Css            S-mercaptocysteine
Cxm            N-carboxymethionine
Cys            cysteine
D11            D-phosphothreonine
Dal            D-alanine
Dar            D-arginine
Das            D-aspartic acid
Dcy            D-cysteine
Dgl            D-glutamic acid
Dgn            D-glutamine
Dha            2-aminoacrylic acid
Dhi            D-histidine
Dil            D-isoleucine
Div            D-isovaline
Dle            D-leucine
Dly            D-lysine
Dne            D-norleucine
Dng            N-formyl-D-norleucine
Dnm            N-methyl-D-norleucine
Dpn            D-phenylalanine
Dpr            D-proline
Dse            N-methyl-D-serine
Dsg            D-asparagine
Dsn            D-serine
Dth            D-threonine
Dtr            D-tryptophan
Dty            D-tyrosine
Dva            D-valine
Gln            glutamine
Glu            glutamic acid
Gly            glycine
His            L-histidine
Hsl            homoserine lactone
Hyp            4-hydroxyproline
Ile            isoleucine
Ioy            P-iodo-D-phenylalanine
Iyt            N-alpha-acetyl-3,5-diiodotyrosyl-D-threonine
Kcx            lysine NZ-carboxylic acid
Leu            leucine
Llp            2-lysine(3-hydroxy-2-methyl-5-phosphonooxymethyl-pyridin-4-ylmethane)
Lys            lysine
Med            D-methionine
Met            methionine
Mle            N-methylleucine
Mse            selenomethionine
Mva            N-methylvaline
Ndf            N-(carboxycarbonyl)-D-phenylalanine
Nle            norleucine
Ocs            cysteinesulfonic acid
Orn            ornithine
Pca            pyroglutamic acid
Pdd            N-(5'-phosphopyridoxyl)-D-alanine
Pg9            D-phenylglycine
Phe            phenylalanine
Pro            proline
Ptr            O-phosphotyrosine
Sar            sarcosine
Ser            serine
Thr            threonine
Trp            tryptophan
Tyr            tyrosine
Tys            sulfonated tyrosine
Val            valine