Quick Guide
Quick guide to CCPN software development
What is the 'data model'?
The data model is an abstract description of the data commonly used in NMR; other areas, such as protein production, are also included. For example, the NMR part of the data model describes an Experiment object - this corresponds to an NMR spectrum. This Experiment is linked to one or more ExpDim objects - these describe the different dimensions in the spectrum. This abstract description is represented and maintained graphically using the Unified Modelling Language (UML). The part of the data model describing Experiment and ExpDim looks like this in UML:

The boxes describe the Experiment and ExpDim objects. The entries inside the boxes are attributes that give meaning to the object. For example, you can set the name of an Experiment. Objects are then linked to each other - this is shown by the line between Experiment and ExpDim. The diamond on the link means that ExpDim is a child of Experiment - a dimension of a spectrum cannot exist unless the spectrum exists first.
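The parent-child link described above can also be sketched in code. The following is a minimal illustration in plain Python, not the CCPN API: the class layout, the `new_exp_dim` method and the attribute names other than `name` and `dim` are assumptions made for this example only.

```python
class ExpDim:
    """One dimension of a spectrum; always owned by an Experiment."""

    def __init__(self, experiment, dim):
        self.experiment = experiment  # parent link (the diamond end in UML)
        self.dim = dim                # dimension number, starting at 1


class Experiment:
    """An NMR experiment, corresponding to one spectrum."""

    def __init__(self, name):
        self.name = name      # attribute shown inside the UML box
        self.exp_dims = []    # children, in dimension order

    def new_exp_dim(self):
        # Hypothetical factory method: children are only created through
        # their parent, so an ExpDim without an Experiment cannot exist.
        exp_dim = ExpDim(self, dim=len(self.exp_dims) + 1)
        self.exp_dims.append(exp_dim)
        return exp_dim


experiment = Experiment(name="my_experiment")
first = experiment.new_exp_dim()
second = experiment.new_exp_dim()
print(experiment.name, [exp_dim.dim for exp_dim in experiment.exp_dims])
```

Creating each ExpDim through its Experiment is one simple way to express the "child of" relation; other designs would serve equally well.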
What are 'packages'?
The data model is split into packages. Each package describes a 'unit' of information that can be shared by other packages. For example, a template molecule is described in the 'Molecule' package, while a molecular system with 'real' molecules is described in the 'MolSystem' package. The 'Nmr' package uses information from the 'MolSystem' package, which could also be shared by an 'Xray' package if one were available. For this reason the data of each package are stored in a separate location.
What is the 'API'?
API stands for Application Programming Interface. With an API, the objects described by the data model can be manipulated in computer memory. Basically this means that the data is organized in a way that is consistent with the 'data model'. The API therefore also handles consistency checking of the objects (e.g. an Nmr Experiment object has to be linked to at least one ExpDim (experiment dimension)). The API is currently available in Python and Java, with an experimental C version being tested.
To continue with the example above, the objects that the API maintains in memory for a 3D spectrum are shown below (note that the values for the 'dim' attribute are filled in for the ExpDim objects to distinguish between them):
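The same situation can be approximated in code. The sketch below is a toy Python model, not the CCPN API: `ModelError`, `num_dim`, `check_valid` and `exp_dims` are hypothetical names chosen for this illustration. It builds one Experiment with three ExpDim objects and enforces the rule that an Experiment needs at least one ExpDim.

```python
class ModelError(Exception):
    """Raised when objects break a rule of this toy data model."""


class ExpDim:
    def __init__(self, experiment, dim):
        self.experiment = experiment  # parent Experiment
        self.dim = dim                # 1, 2, 3, ... distinguishes the dimensions


class Experiment:
    def __init__(self, name, num_dim):
        self.name = name
        self.exp_dims = [ExpDim(self, dim) for dim in range(1, num_dim + 1)]
        self.check_valid()

    def check_valid(self):
        # Rule taken from the text: at least one ExpDim per Experiment.
        if len(self.exp_dims) < 1:
            raise ModelError(f"Experiment {self.name!r} has no ExpDim")


# A 3D spectrum: one Experiment object and three ExpDim objects.
experiment = Experiment("my_3d_experiment", num_dim=3)
dims = [exp_dim.dim for exp_dim in experiment.exp_dims]
print(experiment.name, dims)

# An Experiment without dimensions is rejected by the consistency check.
try:
    Experiment("broken", num_dim=0)
    rejected = False
except ModelError as error:
    rejected = True
    print("rejected:", error)
```

Running the check at construction time means invalid objects never exist in memory; the real API may perform its checks at other points.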

Which programs use this 'API'?
Currently only the CcpNmr FormatConverter and CcpNmr Analysis applications interact directly with data inside the data model. The ARIA 2.1 and upcoming 2.2 software is fully compatible with the data model, as is the CLOUDS variant developed as part of CCPN. The QUEEN validation software from the CMBI works with CCPN via the FormatConverter. The focus is now on making as many applications as possible compatible with the data model - two ongoing European projects (Extend-NMR and EU-NMR) are already committed to making the software they develop work directly with CCPN.
How do I get my data into (and out of) the 'data model'?
The CcpNmr FormatConverter application allows you to import existing derived data formats (not raw spectra) into the data model. Export functions are also available, so it can be used as a converter between existing formats. Many formats are currently supported; see the FormatConverter format status page for an up-to-date list.
What is the advantage of having data inside the 'data model'?
- All programs that work with the data model 'understand' each other. For example, you can read data into the data model with the CcpNmr FormatConverter, start using CcpNmr Analysis straight away (provided it understands the raw data format of the spectrum), and transfer the information to ARIA for a structure calculation.
- Scripts that work on the data model can be used by every application that uses a data model API. For example, if a good automatic assignment script were written, it could be run from any data model based application.
- Import/export to foreign formats comes for free (see above). This allows you to store all your data in one place throughout a project while moving back and forth between different programs. Final export to an NMR-STAR file ready for deposition will also be included.
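The idea that import/export "comes for free" can be illustrated with a small sketch: once every format has one reader into a common model and one writer out of it, any format can be converted to any other. The example below is plain Python under stated assumptions; the `Peak` class, the two toy text formats 'a' and 'b', and all function names are invented for illustration and do not correspond to real FormatConverter code or real NMR file formats.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Peak:
    """Common in-memory representation (stands in for the data model)."""
    number: int
    ppm: float


def read_format_a(text: str) -> List[Peak]:
    # Toy format 'a': one "number,ppm" pair per line.
    peaks = []
    for line in text.strip().splitlines():
        number, ppm = line.split(",")
        peaks.append(Peak(int(number), float(ppm)))
    return peaks


def read_format_b(text: str) -> List[Peak]:
    # Toy format 'b': one whitespace-separated "ppm number" pair per line.
    peaks = []
    for line in text.strip().splitlines():
        ppm, number = line.split()
        peaks.append(Peak(int(number), float(ppm)))
    return peaks


def write_format_a(peaks: List[Peak]) -> str:
    return "\n".join(f"{peak.number},{peak.ppm}" for peak in peaks)


def write_format_b(peaks: List[Peak]) -> str:
    return "\n".join(f"{peak.ppm} {peak.number}" for peak in peaks)


READERS = {"a": read_format_a, "b": read_format_b}
WRITERS = {"a": write_format_a, "b": write_format_b}


def convert(text: str, source: str, target: str) -> str:
    # Every conversion goes through the common model, so supporting a new
    # format needs one reader and one writer, not one converter per pair.
    return WRITERS[target](READERS[source](text))


converted = convert("1,8.25\n2,7.9", source="a", target="b")
print(converted)
```

Any script written against `Peak` objects would likewise work regardless of which format the data originally came from.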
What else is planned?
- Further data modelling of other 'bio' areas.
- Development of a LIMS, which also includes information for sample preparation, protein production, ...
- New code for the data model to do things like automatic assignment, ...
- Full support for databases - so that you can store all information in a relational database while you work on it.