ccpn.util.nef package

Star, Cif, and NEF I/O.

Files are converted to and from a tree of nested objects (essentially enhanced, ordered dictionaries).

The GenericStarParser.py will work for any valid Star file; see this file for the precise reading and writing behaviour. Parsing comes in several modes, depending on how strictly the STAR standard is enforced.

NmrStar and NEF files have a more restricted syntax, with all items and loops contained within saveframes, and with saveframe and loop tags beginning with a prefix matching the saveframe and loop category. The StarIo.py code is used for these files; it converts the output of the GenericStarParser into a simpler object tree that makes use of these restrictions, strips mandatory tag prefixes, and converts strings to numerical values with some limited heuristics.
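The prefix stripping and string-to-number conversion described above can be pictured with a minimal sketch. It does not use the StarIo code; the helper names (strip_prefix, to_number) and the tags atom_count and offset are hypothetical, and the real heuristics may differ.

```python
def strip_prefix(tag, category):
    """Remove the leading '_<category>.' from a tag, if present."""
    prefix = "_" + category + "."
    return tag[len(prefix):] if tag.startswith(prefix) else tag


def to_number(text):
    """Try int, then float; fall back to the unchanged string."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


# Hypothetical saveframe items as a generic parser might return them (all strings)
raw = {
    "_nef_chemical_shift_list.sf_category": "nef_chemical_shift_list",
    "_nef_chemical_shift_list.atom_count": "42",
    "_nef_chemical_shift_list.offset": "1.5",
}
simple = {
    strip_prefix(tag, "nef_chemical_shift_list"): to_number(value)
    for tag, value in raw.items()
}
print(simple)
```

A plain try-int-then-float conversion would also turn a quoted string such as '42' into a number, which is one reason the generic parser keeps quoted and unquoted values apart.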

Submodules

ccpn.util.nef.CompareNef module

compareNef - a series of routines to compare the contents of two Nef files

Command Line Usage:

compareNef can be executed from the command line with a suitable script. An example can be found in AnalysisV3/bin/compareNef:

#!/usr/bin/env sh
export CCPNMR_TOP_DIR="$(dirname $(cd $(dirname "$0"); pwd))"
export ANACONDA3=${CCPNMR_TOP_DIR}/miniconda
export PYTHONPATH=${CCPNMR_TOP_DIR}/src/python:${CCPNMR_TOP_DIR}/src/c
${ANACONDA3}/bin/python ${CCPNMR_TOP_DIR}/src/python/ccpn/util/nef/compareNef.py $*

Usage: compareNef [options]

optional arguments:
-h, --help

show this help message

-H, --Help

Show detailed help

-i, --ignoreblockname

Ignore the blockname when comparing two Nef files. May be required when converting Nef files through different applications. May be used with -f and -b.

-f inFile1 inFile2, --file inFile1 inFile2

Compare two Nef files and print the results to the screen.

-b inDir1 inDir2 outDir, --block inDir1 inDir2 outDir

Compare Nef files common to directories inDir1 and inDir2. Write output *.txt for each file into the outDir directory.

-s, --screen

Output batch processing to the screen; the default is to write .txt files. May be used with -b.

-o, --overwrite

Overwrite existing .txt files. If false then files are appended with ‘(n)’ before the extension, where n is the next available number.

-c, --create

Automatically create directories as required.

-I, --ignorecase

Ignore case when comparing items.
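The option list above maps naturally onto argparse. The following is a sketch only, built from the usage text; the function name define_arguments and the help strings are illustrative, and the actual defineArguments() in the module may be organised differently.

```python
import argparse


def define_arguments():
    """Build a parser mirroring the documented compareNef options (-h is added by argparse)."""
    parser = argparse.ArgumentParser(prog="compareNef",
                                     description="Compare the contents of Nef files")
    parser.add_argument("-H", "--Help", action="store_true", help="Show detailed help")
    parser.add_argument("-i", "--ignoreblockname", action="store_true",
                        help="Ignore the blockname when comparing two Nef files")
    parser.add_argument("-f", "--file", nargs=2, metavar=("inFile1", "inFile2"),
                        help="Compare two Nef files and print the results to the screen")
    parser.add_argument("-b", "--block", nargs=3, metavar=("inDir1", "inDir2", "outDir"),
                        help="Compare Nef files common to inDir1 and inDir2")
    parser.add_argument("-s", "--screen", action="store_true",
                        help="Output batch processing to screen")
    parser.add_argument("-o", "--overwrite", action="store_true",
                        help="Overwrite existing .txt files")
    parser.add_argument("-c", "--create", action="store_true",
                        help="Automatically create directories as required")
    parser.add_argument("-I", "--ignorecase", action="store_true",
                        help="Ignore case when comparing items")
    return parser


# Parse an example command line: compareNef -i -f a.nef b.nef
options = define_arguments().parse_args(["-i", "-f", "a.nef", "b.nef"])
print(options.file, options.ignoreblockname, options.block)
```

The resulting namespace is the kind of object the compare routines receive as their options argument.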

Details of the contents of Nef files can be found in GenericStarParser. The general structure of a Nef file is:

DataExtent
  DataBlock
    Item
    Loop
    SaveFrame
      Item
      Loop

DataExtent, DataBlock and SaveFrame are Python OrderedDict with an additional ‘name’ attribute. DataBlocks and SaveFrames are entered in their container using their name as the key.

Loop is an object with a ‘columns’ list, a ‘data’ list-of-row-OrderedDict, and a name attribute set equal to the name of the first column. A loop is entered in its container under each column name, so that e.g. aSaveFrame['_Loopx.loopcol1'] and aSaveFrame['_Loopx.loopcol2'] both exist and both correspond to the same loop object.

Items are entered as a string key - string value pair.

the string value can be a dictionary
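The container layout described above can be mimicked with a few stand-in classes. This is a sketch under stated assumptions: Named and SketchLoop are hypothetical substitutes for the GenericStarParser classes, and the item tag is invented for illustration.

```python
from collections import OrderedDict


class Named(OrderedDict):
    """OrderedDict with an extra 'name' attribute (stand-in for DataExtent, DataBlock, SaveFrame)."""

    def __init__(self, name=None):
        super().__init__()
        self.name = name


class SketchLoop:
    """Stand-in for Loop: 'columns', 'data' (list of row OrderedDicts), name = first column."""

    def __init__(self, columns):
        self.columns = list(columns)
        self.name = self.columns[0]
        self.data = []


extent = Named("Root")
block = Named("data_example")
frame = Named("save_frame1")

# Containers are entered under their own name
extent[block.name] = block
block[frame.name] = frame

# Items are string key - string value pairs (hypothetical tag)
frame["_frame1.comment"] = "hello"

# A loop is entered under every one of its column names
loop = SketchLoop(["_Loopx.loopcol1", "_Loopx.loopcol2"])
loop.data.append(OrderedDict(zip(loop.columns, ["a", "b"])))
for column in loop.columns:
    frame[column] = loop

print(frame["_Loopx.loopcol1"] is frame["_Loopx.loopcol2"])
```

Because the same loop appears under several keys, code that iterates over a container has to take care not to process one loop more than once.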

Module Contents

compareNef contains the following routines:

compareNefFiles compare two Nef files and return a compare object

Searches through all objects: dataExtents, dataBlocks, saveFrames and Loops within the files. Comparisons are made for all data structures that have the same name. Differences for items within a column are listed in the form:

dataExtent:dataBlock:saveFrame:Loop: <Column>: columnName <rowIndex>: row --> value1 != value2

dataExtents, dataBlocks, saveFrames, Loops, columns present only in one file are listed, in the form:

dataExtent:dataBlock:saveFrame:Loop: contains --> parameter1
                                                  parameter2
                                                  ...
                                                  parameterN

A compare object is a list of nefItems of the form:

NefItem
  inWhich         a flag labelling which file the item was found in
                  1 = found in first file, 2 = found in second file, 3 = common to both
  List
    Item          multiple strings containing the comparison tree
    (,List)       the last item of which may be a list of items common to the tree

e.g., for parameters present in the first file:

[
  inWhich=1
  list=[dataExtent1, dataBlock1, saveFrame1, Loop1, [parameter1, parameter2, parameter3]]
]
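To make the shape of a compare list concrete, here is a minimal sketch that compares two flat dictionaries and records the result in the inWhich/list form described above. CompareItem and compare_containers are hypothetical stand-ins, not the module's nefItem or compare routines, and the layout chosen for differing values (inWhich=3) is an assumption.

```python
class CompareItem:
    """Stand-in for nefItem: an 'inWhich' flag plus a 'list' describing the tree."""

    def __init__(self, inWhich, path):
        self.inWhich = inWhich
        self.list = path

    def __repr__(self):
        return "CompareItem(inWhich=%d, list=%r)" % (self.inWhich, self.list)


def compare_containers(first, second, path):
    """Return CompareItems for names only in first (1), only in second (2), or differing (3)."""
    items = []
    only_first = [name for name in first if name not in second]
    only_second = [name for name in second if name not in first]
    if only_first:
        # One item whose last element is the list of names missing from the other side
        items.append(CompareItem(1, path + [only_first]))
    if only_second:
        items.append(CompareItem(2, path + [only_second]))
    for name in first:
        if name in second and first[name] != second[name]:
            items.append(CompareItem(3, path + [name, first[name], second[name]]))
    return items


frame1 = {"a": "1", "b": "2", "c": "3"}
frame2 = {"b": "2", "c": "4", "d": "5"}
path = ["dataExtent1", "dataBlock1", "saveFrame1"]
result = compare_containers(frame1, frame2, path)
for item in result:
    print(item)
```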

compareDataExtents      compare two DataExtent objects and return a compare list as above
compareDataBlocks       compare two DataBlock objects and return a compare list as above
compareSaveFrames       compare two SaveFrame objects and return a compare list as above
compareLoops            compare two Loop objects and return a compare list as above

compareNefFiles         compare two Nef files and return a compare list as above
batchCompareNefFiles    compare two directories of Nef files.
                        Nef files common to the specified directories are compared and the
                        compare lists are written to the third directory as .txt

printCompareList        print the compare list to the screen
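The batch mode depends on finding the files that exist in both input directories. The sketch below shows one way to do that with the standard library; common_nef_files is a hypothetical helper, and both the '.nef' filter and the '<name>.txt' report naming are assumptions rather than the behaviour of batchCompareNefFiles.

```python
import os
import tempfile


def common_nef_files(dir1, dir2):
    """Return the sorted names of .nef files present in both directories."""
    names1 = {n for n in os.listdir(dir1) if n.endswith(".nef")}
    names2 = {n for n in os.listdir(dir2) if n.endswith(".nef")}
    return sorted(names1 & names2)


with tempfile.TemporaryDirectory() as top:
    dir1 = os.path.join(top, "in1")
    dir2 = os.path.join(top, "in2")
    os.mkdir(dir1)
    os.mkdir(dir2)
    # Create empty example files in each directory
    for directory, names in ((dir1, ["a.nef", "b.nef"]),
                             (dir2, ["b.nef", "c.nef", "notes.txt"])):
        for name in names:
            open(os.path.join(directory, name), "w").close()
    common = common_nef_files(dir1, dir2)
    # Assumed naming of the report written for each compared file
    reports = [os.path.splitext(name)[0] + ".txt" for name in common]

print(common, reports)
```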

class ccpn.util.nef.CompareNef.Test_Compare_Files(methodName='runTest')[source]

Bases: unittest.case.TestCase

Test the comparison of nef files and print the results

test_Compare_BatchFiles()[source]

Compare the Nef files in two directories

test_Compare_Files()[source]

Load two files and compare

ccpn.util.nef.CompareNef.addToList(inList, cItem, nefList)[source]

Append cItem to the compare list. Currently adds one cItem with a list as the last element.

Parameters
  • inList – a list of items to add to the end of cItem

  • cItem – object containing the current tree to add to the list

  • nefList – current list of comparisons

Returns

list of type nefItem

ccpn.util.nef.CompareNef.batchCompareNefFiles(inDir1, inDir2, outDir, options)[source]

Batch compare the Nef files common to the two directories. For each file found, write the compare log to the corresponding .txt file.

Parameters
  • inDir1

  • inDir2

  • outDir

  • options – nameSpace holding the commandLineArguments

ccpn.util.nef.CompareNef.compareDataBlocks(dataBlock1, dataBlock2, options, cItem=None, nefList=None)[source]

Compare two dataBlocks, if they have the same name then check their contents

Parameters
  • dataBlock1 – first DataBlock object, of type GenericStarParser.DataBlock

  • dataBlock2 – second DataBlock object, of type GenericStarParser.DataBlock

  • options – nameSpace holding the commandLineArguments

  • cItem – list of str describing differences between nefItems

  • nefList – input of nefItems

Returns

list of type nefItem

ccpn.util.nef.CompareNef.compareDataExtents(dataExt1, dataExt2, options, cItem=None, nefList=None)[source]

Compare two dataExtents, if they have the same name then check their contents

Parameters
  • dataExt1 – first DataExtent object, of type GenericStarParser.DataExtent

  • dataExt2 – second DataExtent object, of type GenericStarParser.DataExtent

  • options – nameSpace holding the commandLineArguments

  • cItem – list of str describing differences between nefItems

  • nefList – input of nefItems

Returns

list of type nefItem

ccpn.util.nef.CompareNef.compareLoops(loop1, loop2, options, cItem=None, nefList=None)[source]

Compare two Loops

Parameters
  • loop1 – first Loop object, of type GenericStarParser.Loop

  • loop2 – second Loop object, of type GenericStarParser.Loop

  • options – nameSpace holding the commandLineArguments

  • cItem – list of str describing differences between nefItems

  • nefList – input of nefItems

Returns

list of type nefItem

ccpn.util.nef.CompareNef.compareNefFiles(inFile1, inFile2, options, cItem=None, nefList=None)[source]

Compare two Nef files and return comparison as a nefItem list

Parameters
  • inFile1 – name of the first file

  • inFile2 – name of the second file

  • options – nameSpace holding the commandLineArguments

  • cItem – list of str describing differences between nefItems

  • nefList – input of nefItems

Returns

list of type nefItem

ccpn.util.nef.CompareNef.compareSaveFrames(saveFrame1, saveFrame2, options, cItem=None, nefList=None)[source]

Compare two saveFrames, if they have the same name then check their contents

Parameters
  • saveFrame1 – first SaveFrame object, of type GenericStarParser.SaveFrame

  • saveFrame2 – second SaveFrame object, of type GenericStarParser.SaveFrame

  • options – nameSpace holding the commandLineArguments

  • cItem – list of str describing differences between nefItems

  • nefList – input of nefItems

Returns

list of type nefItem

ccpn.util.nef.CompareNef.defineArguments()[source]

Define the arguments of the program

Returns

argparse instance

ccpn.util.nef.CompareNef.import_parents(level=1)[source]
class ccpn.util.nef.CompareNef.nefItem(cItem=None)[source]

Bases: object

Holds the contents of a single Nef comparison.

inWhich    a flag labelling which file the item was found in
           1 = found in the first file, 2 = found in the second file, 3 = common to both
list       a list of strings containing the comparison information

ccpn.util.nef.CompareNef.printCompareList(nefList, inFile1, inFile2)[source]

Print the contents of the nef compare list to the screen

Output is in three parts:
  • items that are present only in the first file

  • items that are only in the second file

  • differences between objects that are common in both files

Parameters
  • nefList – list to print

  • inFile1 – name of the first file

  • inFile2 – name of the second file

ccpn.util.nef.CompareNef.printFile(thisFile)[source]

Print a file to the screen

ccpn.util.nef.CompareNef.printWhichList(nefList, whichType=0)[source]

List only those items that are of type whichType

Parameters
  • nefList – list to print

  • whichType – type to print

ccpn.util.nef.CompareNef.sizeNefList(nefList, whichType=0)[source]

List only those items that are of type whichType

Parameters
  • nefList – list to print

  • whichType – type to print

ccpn.util.nef.ErrorLog module


class ccpn.util.nef.ErrorLog.ErrorLog(logOutput=<built-in method write of _io.TextIOWrapper object>, loggingMode='standard', errorCode=0)[source]

Bases: object

A class to facilitate Logging of errors to stderr.

example:

errorLogging.logError('Error: %s' % errorMessage)

functions available are:

logError               write message to the current output
func = logger          return the current logger
logger = func          set the current logger
loggingMode = mode     set the logging mode, where mode is ‘standard’, ‘silent’ or ‘strict’
mode = loggingMode     return the current mode

current modes are:

‘standard’    errors are written to stderr, no errors are raised
‘silent’      no errors are raised, no output to stderr
‘strict’      errors are logged to stderr and errors are raised, to be handled by the calling functions
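The three logging modes can be illustrated with a small stand-in class. This is a sketch, not the ErrorLog implementation: SketchErrorLog is hypothetical, and raising RuntimeError in ‘strict’ mode is an assumption about which exception type is used.

```python
import sys


class SketchErrorLog:
    """Stand-in showing how 'standard', 'silent' and 'strict' modes differ."""

    MODES = ("standard", "silent", "strict")

    def __init__(self, logOutput=sys.stderr.write, loggingMode="standard"):
        if loggingMode not in self.MODES:
            raise ValueError("unknown logging mode: %r" % loggingMode)
        self.logger = logOutput          # any callable taking one string
        self.loggingMode = loggingMode

    def logError(self, message):
        # 'silent' suppresses output; the other modes write to the logger
        if self.loggingMode != "silent":
            self.logger(message + "\n")
        # only 'strict' raises, so the caller can trap the error (assumed exception type)
        if self.loggingMode == "strict":
            raise RuntimeError(message)


captured = []
log = SketchErrorLog(logOutput=captured.append, loggingMode="standard")
log.logError("Error: table does not exist")

log.loggingMode = "silent"
log.logError("Error: not shown")

log.loggingMode = "strict"
raised = False
try:
    log.logError("Error: fatal")
except RuntimeError:
    raised = True

print(captured, raised)
```

Passing a list's append method as the logger, as done here, is a convenient way to capture messages in tests instead of writing to stderr.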

NEFERRORS = {-17: 'error reading keys', -16: 'error reading attribute', -15: 'error reading attribute names', -14: 'bad add saveFrame', -13: 'bad categories', -12: '', -11: 'bad listType', -10: 'list type error', -9: 'bad table names', -8: 'bad multiColumnValues', -7: 'bad convert from string', -6: 'bad convert to string', -5: 'error saving file', -4: 'error loading file', -3: 'saveFrame does not exist', -2: 'table does not exist', -1: 'table error'}
property lastError

- None, immutable - Return the error code of the last action.

Returns

int

property lastErrorString

- None, immutable - Return the error string of the last action.

Returns

string

property logger

- None, mutable - Return the current logging function.

Returns

func; defaults to sys.stderr.write

profile of func: func(value:str)

property loggingMode

- None, mutable - Return the current logging mode. Current modes are:

‘standard’    errors are written to stderr, no errors are raised
‘silent’      no errors are raised, no output to stderr
‘strict’      errors are logged to stderr and errors are raised, to be handled by the calling functions

Returns

string

ccpn.util.nef.GenericStarParser module

Star-type file parser, agnostic between cif, mmcif, star, nef, etc.

Returns a nested data structure that matches the file, with DataExtent, DataBlock, SaveFrame, and Loop objects. Additional support for distinguishing between None, True, False, and int/float values and the strings that match their representations.

For behaviour better suited to NmrStar and NEF see ./StarIo.py

Usage:

parse(text, mode) to parse a text representation to nested objects

parseFile(fileName, mode) to load and parse a file

starObject.toString() will convert any object in the object structure, complete with contents, to a string that can then be written to file.

Reading behaviour

follows the specification in International Tables for Crystallography volume G section 2.1 with the following exceptions:

  • Nested loops are NOT supported

  • Global blocks are treated as simple data blocks. If the first data block in the file is a global_, it is named ‘global_’; globals elsewhere are named global_1, global_2, etc. This may cause trouble downstream if there is a DataBlock named e.g. ‘data_global_1’.

The object structure returned is:

DataExtent
  DataBlock
    Item
    Loop
    SaveFrame
      Item
      Loop

DataExtent, DataBlock and SaveFrame are Python OrderedDict with an additional ‘name’ attribute. DataBlocks and SaveFrames are entered in their container using their name as the key.

Loop is an object with a ‘columns’ list, a ‘data’ list-of-row-OrderedDict, and a name attribute set equal to the name of the first column. A loop is entered in its container under each column name, so that e.g. aSaveFrame['_Loopx.loopcol1'] and aSaveFrame['_Loopx.loopcol2'] both exist and both correspond to the same loop object.

Items are entered as a string key - string value pair.

All tags are preserved ‘as is’ (i.e. without stripping leading ‘_’, ‘data_’, or ‘save_’), basically to avoid the theoretical risk that a SaveFrame name might clash with an item name or loop column, or a DataBlock with a Global. Duplicate tags raise an error.

Quoted values are returned as str, whereas unquoted values are returned as a str subtype UnquotedValue(str). This allows later distinction between null, unknown, saveframe_reference and the equivalent quoted strings.
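The value of keeping a separate type for unquoted strings is easiest to see in a small example. The class below is re-declared as a stand-in so that the sketch is self-contained, and interpret is a hypothetical downstream helper; the mapping of '.', 'true' and 'false' follows the writing rules in this module's documentation, but where such a conversion actually happens is an assumption.

```python
class UnquotedValue(str):
    """Stand-in: a plain str whose only difference is its type."""


def interpret(value):
    """Convert special tokens only when they were unquoted in the file."""
    if isinstance(value, UnquotedValue):
        if value == ".":
            return None
        if value == "true":
            return True
        if value == "false":
            return False
    return value


print(interpret(UnquotedValue(".")))   # special token, unquoted in the file
print(interpret("."))                  # the same characters, but quoted in the file
```

Since UnquotedValue is a str subtype, code that does not care about the distinction can treat both kinds of value as ordinary strings.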

Writing behaviour

follows the specification in International Tables for Crystallography volume G section 2.1 with the following exceptions and additions.

  • Strings of type UnquotedValue are written as-is, without quoting them.

  • Note that e.g. ' "say" 'what'?' or " 'say"'"what"?" are valid quoted strings according to the standard, since the end-quote marker is a quotation marker FOLLOWED BY WHITESPACE.

  • Strings that cannot be quoted on one line (e.g. ''' "say" 'what' ''' ) are converted to multiline strings by appending a newline

  • Strings with internal linebreaks but no terminal linebreak are converted by appending a linebreak

  • Strings that cannot be quoted as multiline strings because they contain a line starting with ';' are converted by prepending a space (' ') to each line

  • Values None, True, False, NaN, Infinity and -Infinity are converted to UNQUOTED strings '.', 'true', 'false', 'NaN', 'Infinity' and '-Infinity', respectively.

  • Normal (not unquoted) strings that evaluate to a float are written in quotes. Also the literal strings '.', 'true', 'false', 'NaN', 'Infinity' and '-Infinity' are always written in quotes.

    This makes it possible (but NOT mandatory) to distinguish the null, boolean and float values from the equivalent strings
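The last two writing rules can be sketched as a small conversion function. This is a simplified illustration, not valueToStarString: sketch_value_to_star and looks_like_float are hypothetical, only single-token strings are handled, and whitespace, embedded quotes, multiline values and UnquotedValue are deliberately left out.

```python
import math

RESERVED = {".", "true", "false", "NaN", "Infinity", "-Infinity"}


def looks_like_float(text):
    """True if Python can read the string as a float."""
    try:
        float(text)
    except ValueError:
        return False
    return True


def sketch_value_to_star(value):
    """Convert a Python value to a STAR token, quoting strings that mimic special values."""
    if value is None:
        return "."
    if value is True:
        return "true"
    if value is False:
        return "false"
    if isinstance(value, float):
        if math.isnan(value):
            return "NaN"
        if math.isinf(value):
            return "Infinity" if value > 0 else "-Infinity"
        return repr(value)
    if isinstance(value, int):
        return str(value)
    text = str(value)
    if text in RESERVED or looks_like_float(text):
        # a real string that merely looks like a number or special value gets quoted
        return "'" + text + "'"
    return text


for example in (None, ".", True, "true", 2.5, "2.5", float("-inf"), "abc"):
    print(repr(example), "->", sketch_value_to_star(example))
```

On reading, a quoted '2.5' comes back as a plain str and an unquoted 2.5 as an UnquotedValue, so the round trip can preserve the difference between the number and the string.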

The toString functions in this module will work with loop rows implemented as either tuples, lists, or OrderedDicts, and accept an optional tag prefix for DataBlocks, SaveFrames, and Loops, to prepend to item and column names.

class ccpn.util.nef.GenericStarParser.DataBlock(name=None)[source]

Bases: ccpn.util.nef.GenericStarParser.StarContainer

DataBlock for general STAR object tree

tagPrefix = None
toString(indent='', separator='  ')[source]

Convert DataBlock to string, for writing

class ccpn.util.nef.GenericStarParser.DataExtent(name='Root')[source]

Bases: ccpn.util.nef.GenericStarParser.NamedOrderedDict

Top level container for general STAR object tree

toString(indent='', separator='  ')[source]
class ccpn.util.nef.GenericStarParser.GeneralStarParser(text, enforceSaveFrameStop=True, enforceLoopStop=False, padIncompleteLoops=False, allowSquareBracketStrings=False, lowerCaseTags=True)[source]

Bases: object

Parser for text corresponding to a STAR file with one or more data blocks, producing a nested object structure matching the file (see module documentation for details):

DataExtent
  DataBlock
    Loop
    Item
  SaveFrame
    Loop
    Item

Parameters (default values correspond to the International Tables for Crystallography standard):

  • text Text to parse

  • enforceSaveFrameStop : True. Raise an error for missing ‘save_’ terminators - Yes/No

  • enforceLoopStop : False. Raise an error for missing ‘stop_’ terminators - Yes/No

  • padIncompleteLoops : False. Pad final loop row with ‘.’ for missing values - Yes/No

  • allowSquareBracketStrings : False. Allow values starting ‘[’ or ‘]’ - Yes/No

  • lowerCaseTags : True. Convert all data and object names to lower case

parse()[source]
processDataName(value)[source]
processValue(value)[source]
class ccpn.util.nef.GenericStarParser.Loop(name=None, columns=None)[source]

Bases: object

Loop for general STAR object tree Attributes are:

  • name: string

  • columns: List of string column headers

  • data: List-of-rows, where rows are OrderedDicts

addColumn(columnName, paddingValue={})[source]

Add new column to loop. if paddingValue is set, including to None, rows with None

property columns

- None, immutable - Column names

newRow(values=None)[source]

Add new row, initialised from values

removeColumn(columnName, removeData=False)[source]

Remove column from loop. Will NOT work properly if called during parsing.

tagPrefix = None
toString(indent='   ', separator='  ')[source]

Stringifier function for loop.

Accepts (subtypes of) Loop with data as sequence of rows, where rows can be tuples, lists, or OrderedDicts. In all cases the values must be in the order given by the columns attribute

class ccpn.util.nef.GenericStarParser.LoopRow[source]

Bases: collections.OrderedDict

Loop row - OrderedDict with additional functionality

class ccpn.util.nef.GenericStarParser.NamedOrderedDict(name=None)[source]

Bases: collections.OrderedDict

addItem(tag, value)[source]
class ccpn.util.nef.GenericStarParser.SaveFrame(name=None)[source]

Bases: ccpn.util.nef.GenericStarParser.StarContainer

SaveFrame for general STAR object tree

tagPrefix = None
toString(indent='   ', separator='  ')[source]

Convert SaveFrame to string, for writing

class ccpn.util.nef.GenericStarParser.StarContainer(name=None)[source]

Bases: ccpn.util.nef.GenericStarParser.NamedOrderedDict

DataBlock or SaveFrame containing items and loops

multiColumnValues(columns)[source]

Get tuple of orderedDict of values for columns. Will work whether columns are in a loop or single values. If columns match a single loop or nothing, return the loop data. Otherwise return a tuple with a single OrderedDict. If no column matches, return None. If columns match more than one loop, throw an error.

exception ccpn.util.nef.GenericStarParser.StarSyntaxError[source]

Bases: ValueError

class ccpn.util.nef.GenericStarParser.UnquotedValue[source]

Bases: str

A plain string - the only difference is the type: ‘UnquotedValue’. Used to distinguish values from STAR files that were not quoted. STAR special values (like null, unknown, …) are only recognised if unquoted strings

ccpn.util.nef.GenericStarParser.extractMatchingNameSequence(name, matchNames)[source]

Get list of matchNames matching ‘name_1’, ‘name_2’, …, in order.
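One possible reading of this one-line description is sketched below. It assumes the function collects 'name_1', 'name_2', … for as long as consecutive numbers are present and stops at the first gap; that stopping rule and the helper name matching_name_sequence are assumptions, and the real function may behave differently.

```python
def matching_name_sequence(name, matchNames):
    """Collect name_1, name_2, ... from matchNames, in order, stopping at the first gap."""
    available = set(matchNames)
    result = []
    index = 1
    while "%s_%d" % (name, index) in available:
        result.append("%s_%d" % (name, index))
        index += 1
    return result


names = ["global_2", "other", "global_1", "global_4"]
sequence = matching_name_sequence("global", names)
print(sequence)
```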

ccpn.util.nef.GenericStarParser.parse(text, mode='standard')[source]

Parse STAR text string ‘text’. Standard settings allow skipping ‘stop_’ tags and strings starting with ‘[’ or ‘]’, but require ‘save_’ termination of SaveFrames and throw an error if the number of loop values do not match the number of columns.

‘strict’ and ‘lenient’ modes are available; mode='IUCr' follows the IUCr standard, which is like standard except that strings starting with ‘[’ and ‘]’ are not allowed

See GeneralStarParser class for details and control of individual settings
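The two modes whose behaviour is spelled out above can be written down as keyword settings for GeneralStarParser. This table is a sketch derived from the descriptions in this section, not copied from the source; MODE_SETTINGS and settings_for are hypothetical names, and the ‘strict’ and ‘lenient’ modes are omitted because their settings are not stated here.

```python
MODE_SETTINGS = {
    # 'standard': stop_ may be skipped, '[' / ']' strings allowed, save_ required,
    # and a loop value count that does not match the columns is an error (no padding)
    "standard": dict(enforceSaveFrameStop=True, enforceLoopStop=False,
                     padIncompleteLoops=False, allowSquareBracketStrings=True),
    # 'IUCr': like standard, except strings starting with '[' or ']' are not allowed
    "IUCr": dict(enforceSaveFrameStop=True, enforceLoopStop=False,
                 padIncompleteLoops=False, allowSquareBracketStrings=False),
}


def settings_for(mode):
    """Return a copy of the keyword settings for a documented mode."""
    try:
        return dict(MODE_SETTINGS[mode])
    except KeyError:
        raise ValueError("no settings documented for mode %r" % mode)


print(settings_for("standard"))
print(settings_for("IUCr"))
```

The ‘IUCr’ row matches the default values listed for the GeneralStarParser parameters, which the class documentation says correspond to the International Tables for Crystallography standard.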

ccpn.util.nef.GenericStarParser.parseFile(fileName, mode='standard')[source]

load generic STAR file and parse the contents

ccpn.util.nef.GenericStarParser.valueToStarString(value, quoteNumberStrings=False)[source]

Convert value to properly quoted STAR string

if quoteNumberStrings, strings that evaluate to a float (e.g. ‘1’, ‘2.7e5’, …) are put in quotes

ccpn.util.nef.NefImporter module

NefImporter - a series of routines for reading a Nef file and examining the contents.

Module Contents

  • Introduction
  • Error Handling
  • Examples
  • Nef File Contents

Introduction

NefImporter consists of two classes: NefImporter - a class for handling the top-level object, and NefDict for handling individual saveFrames in the dictionary.

NefImporter contains:

initialise           initialise a new dictionary
loadFile             read in the contents of a .nef file
saveFile             save the dictionary to a .nef file

getCategories        return the current categories defined in the Nef structure
getSaveFrameNames    return the names of the saveFrames within the file
hasSaveFrame         return True if the saveFrame exists
getSaveFrame         return the saveFrame of the given name
addSaveFrame         add a new saveFrame to the dictionary

get<name>            return the relevant structures of the Nef file
                     defined by the available categories, where <name> can be:
                         NmrMetaData
                         MolecularSystems
                         ChemicalShiftLists
                         DistanceRestraintLists
                         DihedralRestraintLists
                         RdcRestraintLists
                         NmrSpectra
                         PeakRestraintLinks
                     e.g. yourImport.getChemicalShiftLists()

add<name>            add a new saveFrame to the dictionary
                     defined by the available categories, where <name> can be:
                         ChemicalShiftList
                         DistanceRestraintList
                         DihedralRestraintList
                         RdcRestraintList
                         NmrSpectra
                         PeakLists
                         LinkageTables
                     e.g. addChemicalShiftList

toString             convert Nef dictionary to a string that can be written to a file
fromString           convert string to Nef dictionary

getAttributeNames    get a list of the attributes attached to the dictionary
getAttribute         return the value of the attribute
hasAttribute         return True if the attribute exists

lastError            error code of the last operation
lastErrorString      error string of the last operation

NefDict contains handling routines:

getTableNames        return a list of the tables in the saveFrame
getTable             return table from the saveFrame; it can be returned as an OrderedDict
                     or as a Pandas DataFrame
hasTable             return True if the table exists
setTable             set the table - currently not implemented

getAttributeNames    get a list of the attributes attached to the saveFrame
getAttribute         return the value of the attribute
hasAttribute         return True if the attribute exists

lastError            error code of the last operation
lastErrorString      error string of the last operation

Error Handling

Errors can be handled in three different modes:

‘silent’      errors are handled internally and can be interrogated with saveFrame.lastError,
              with no logging to stderr
‘standard’    errors are handled internally, error messages are logged to stderr
‘strict’      error messages are logged to stderr and errors are raised to be trapped by
              the calling functions

The error handling mode can be set at the instantiation of the object, e.g.

newObject = NefImporter(errorLogging='standard')

Examples

Here are a few examples of using the classes:

# load a Nef file
test = NefImporter(errorLogging=NEF_STANDARD)
test.loadFile('/Users/account/Projects/NefFile.nef')

# get categories
print(test.getCategories())

# get saveFrame names
names = test.getSaveFrameNames()
print(names)
names = test.getSaveFrameNames(returnType=NEF_RETURNALL)
print(names)
names = test.getSaveFrameNames(returnType=NEF_RETURNNEF)
print(names)
names = test.getSaveFrameNames(returnType=NEF_RETURNOTHER)
print(names)

# get a particular saveFrame
sf1 = test.getSaveFrame(names[0])

# convert NefImporter into a string for saving
ts = test.toString()
test.fromString(ts)

# getting tables from a saveFrame
print(sf1.getTableNames())
table = sf1.getTable('nmr_atom', asPandas=True)
print(table)
print(sf1.hasTable('nmr_residue'))
print(sf1.getAttributeNames())
print(sf1.hasAttribute('sf_framecode'))
print(sf1.hasAttribute('nothing'))
print(test.getSaveFrame(name='ccpn_assignment').getTable(name='nmr_residue', asPandas=True))

# saving a file
print('SAVE ', test.saveFile('/Users/ejb66/PycharmProjects/Sec5Part3testing.nef'))
print(test.lastError)

# test meta creation of category names
print(test.getMolecularSystems())

There are more examples in the __main__ function at the bottom of the module.

Nef File Contents

More details of the contents of Nef files can be found in GenericStarParser. The general structure of a Nef file is:

DataExtent
  DataBlock
    Item
    Loop
    SaveFrame
      Item
      Loop

DataExtent, DataBlock and SaveFrame are Python OrderedDict with an additional ‘name’ attribute. DataBlocks and SaveFrames are entered in their container using their name as the key.

Loop is an object with a ‘columns’ list, a ‘data’ list-of-row-OrderedDict, and a name attribute set equal to the name of the first column. A loop is entered in its container under each column name, so that e.g. aSaveFrame['_Loopx.loopcol1'] and aSaveFrame['_Loopx.loopcol2'] both exist and both correspond to the same loop object.

Items are entered as a string key - string value pair.

the string value can be a dictionary

class ccpn.util.nef.NefImporter.NefDict(inFrame, errorLogging='standard', hidePrefix=True)[source]

Bases: ccpn.util.nef.StarIo.NmrSaveFrame, ccpn.util.nef.ErrorLog.ErrorLog

An orderedDict saveFrame object for extracting information from the NefImporter

getAttribute(name)[source]

Return attribute ‘name’ if in the saveFrame.

Parameters
  • name

Returns

attribute

getAttributeNames()[source]

Return list of attributes in the saveFrame.

Returns

list or None

getTable(name=None, asPandas=False)[source]

Return the table ‘name’ from the saveFrame if it exists. If asPandas is True then return as a pandas dataFrame, otherwise return a list of saveFrames.

Parameters
  • name

  • asPandas

Returns

saveFrames, dataFrames or None

getTableNames()[source]

Return list of tables in the saveFrame.

Returns

list or None

hasAttribute(name)[source]

Return True if attribute ‘name’ is in the saveFrame.

Parameters
  • name

Returns

True or False

hasTable(name)[source]

Return True if table ‘name’ exists in the saveFrame.

Parameters
  • name

Returns

True or False

property hidePrefix

- None, mutable -

multiColumnValues(column=None)[source]

Return tuple of orderedDict of values for columns. Will work whether columns are in a loop or single values. If columns match a single loop or nothing, return the loop data. Otherwise return a tuple with a single OrderedDict. If no column matches, return None. If columns match more than one loop, throw an error.

Parameters
  • column

Returns

orderedDicts

setTable(name)[source]
class ccpn.util.nef.NefImporter.NefImporter(programName='Unknown', programVersion='Unknown', errorLogging='standard', hidePrefix=True)[source]

Bases: ccpn.util.nef.ErrorLog.ErrorLog

Object for accessing Nef data tree. The Nef data consist of a single NmrStar dataBlock (an OrderedDict), with (saveFrameName, NmrSaveFrame) key,value pairs

addChemicalShiftList(name, cs_units='ppm')[source]
addDihedralRestraintList(name, potential_type, restraint_origin=None)[source]
addDistanceRestraintList(name, potential_type, restraint_origin=None)[source]
addLinkageTable()[source]
addPeakList(name, num_dimensions, chemical_shift_list, experiment_classification=None, experiment_type=None)[source]
addRdcRestraintList(name, potential_type, restraint_origin=None, tensor_magnitude=None, tensor_rhombicity=None, tensor_chain_code=None, tensor_sequence_code=None, tensor_residue_type=None)[source]
addSaveFrame(name, category, required_fields=None, required_loops=None)[source]

Add a new saveFrame to NefImporter.

Parameters
  • name

  • category

  • required_fields

  • required_loops

property data: ccpn.util.nef.StarIo.NmrDataBlock

- ccpn.util.nef.StarIo.NmrDataBlock, immutable - Return the NmrDataBlock instance

deleteSaveFrame(name)[source]
fromString(text, mode='standard')[source]
getAttribute(name)[source]
getAttributeNames()[source]
getCategories()[source]
getChemicalShiftLists()[source]

Return the nef_chemical_shift_list saveFrames.

Returns

list or single item

getDihedralRestraintLists()[source]

Return the nef_dihedral_restraint_list saveFrames.

Returns

list or single item

getDistanceRestraintLists()[source]

Return the nef_distance_restraint_list saveFrames.

Returns

list or single item

getMolecularSystems()[source]

Return the nef_molecular_system saveFrames.

Returns

list or single item

getName(prePend=False) -> str[source]

Get the name as defined by the NmrDataBlock, optionally pre-pended with ‘nefData_’.

Returns

the name, or ‘’ if undefined

getNmrMetaData()[source]

Return the nef_nmr_meta_data saveFrames.

Returns

list or single item

getNmrSpectra()[source]

Return the nef_nmr_spectrum saveFrames.

Returns

list or single item

getPeakRestraintLinks()[source]

Return the nef_peak_restraint_link saveFrames.

Returns

list or single item

getRdcRestraintLists()[source]

Return the nef_rdc_restraint_list saveFrames.

Returns

list or single item

getSaveFrame(name)[source]
getSaveFrameNames(returnType='all')[source]
hasAttribute(name)[source]
hasSaveFrame(name)[source]
property hidePrefix

- None, mutable - Defines the current hidePrefix state:

True     Nef prefixes ‘nef_’ are hidden
False    Nef prefixes ‘nef_’ are not hidden

Prefixes are still used in the saveFrames but are not seen in general use.

Returns

the current hidePrefix state

property isValid: bool

- bool, immutable - Check whether the Nef object contains the required information.

Returns

True or False

loadFile(fileName=None, mode='standard') -> ccpn.util.nef.StarIo.NmrDataBlock[source]

Load and parse Nef-file fileName.

Parameters
  • fileName – path to a Nef-file

Returns

a NmrDataBlock instance

loadText(text, mode='standard') -> ccpn.util.nef.StarIo.NmrDataBlock[source]

Load and parse Nef-formatted text.

Parameters
  • text – Nef-formatted text

Returns

a NmrDataBlock instance

loadValidateDictionary(fileName=None, mode='standard')[source]

Load and parse a Nef dictionary file (in star format) to validate the nef file.

Parameters
  • fileName – path of Nef dictionary file; defaults to current definition dictionary file

  • mode

property path: str

- str, immutable - Return the path of the last read Nef file (empty if undefined).

renameSaveFrame(name, newName)[source]
saveFile(fileName=None)[source]
toString()[source]
property validErrorLog

- None, immutable - Return the error log from checking validity.

Returns

dict

ccpn.util.nef.NefImporter.import_parents(level=1)[source]

ccpn.util.nef.SafeOpen module

Functions to append a number to the end of a filename if it already exists

ccpn.util.nef.SafeOpen.getSafeFilename(path, mode='w')[source]

Get the first safe filename from the given path

Parameters
  • path – filepath and filename.

  • mode – open flags

Returns

Open file handle and new fileName

ccpn.util.nef.SafeOpen.safeOpen(path, mode)[source]

Open path, but if it already exists, add ‘(n)’ before the extension, where n is the first number found such that the file does not already exist. Returns an open file handle.

Usage: with safeOpen(path, [options]) as (fd, safeFileName):

fd is the file descriptor, to be used as with open, e.g., fd.read()
safeFileName is the new safe filename.

Parameters
  • path – filepath and filename.

  • mode – open flags

Returns

Open file handle and new fileName
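The ‘(n)’ naming rule can be sketched with the standard library alone. This is an illustration of the idea, not the SafeOpen code: first_safe_filename is a hypothetical helper, and starting the numbering at 1 is an assumption.

```python
import os
import tempfile


def first_safe_filename(path):
    """Return path if unused, else the first 'name(n).ext' that does not exist yet."""
    if not os.path.exists(path):
        return path
    root, ext = os.path.splitext(path)
    n = 1
    while os.path.exists("%s(%d)%s" % (root, n, ext)):
        n += 1
    return "%s(%d)%s" % (root, n, ext)


with tempfile.TemporaryDirectory() as folder:
    target = os.path.join(folder, "compare.txt")
    chosen = []
    # Ask for the same target three times; each call must avoid the files already written
    for _ in range(3):
        safe = first_safe_filename(target)
        with open(safe, "w") as fd:
            fd.write("log\n")
        chosen.append(os.path.basename(safe))

print(chosen)
```

Checking for existence and then opening leaves a small window in which another process could create the same file; opening with mode 'x' and retrying on FileExistsError would close that gap.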

ccpn.util.nef.Specification module

Code for handling NEF specification and metadata

class ccpn.util.nef.Specification.CifDicConverter(inputText, skipExamples=True, additionalBlocks=(), logger=None)[source]

Bases: object

Converts an mmcif .dic file, with program-specific additional datablocks, into a single NEF data structure, containing:

  1. A nef_specification saveframe, containing a dictionary_history loop and an item_type_list loop

  2. A saveframe for each saveframe_category in the specification. Each saveframe contains

    items:
      _nef_saveframe.sf_framecode
      _nef_saveframe.sf_category
      _nef_saveframe.is_mandatory
      _nef_saveframe.description
      _nef_saveframe.example

    A table for contained loops:

    loop_
      _nef_loop.category
      _nef_loop.is_mandatory
      _nef_loop.description
      _nef_loop.example

    And a table for contained items and loop columns:

    loop_
      _nef_item.name
      _nef_item.loop_category
      _nef_item.type_code
      _nef_item.is_mandatory
      _nef_item.is_key
      _nef_item.example_1
      _nef_item.example_2
      _nef_item.description

    The loop_category defines which loop the item belongs to (if empty it belongs directly inside the saveframe)

convertToNef()[source]

Convert RCSB .cif file into a nef specification summary file

extractGeneralDataFrame(rcsbDataBlock)[source]

Extract general data saveframe

extractItemDescription(inputSaveFrame)[source]

Extract item description

extractLoopDescription(inputSaveFrame)[source]

Extract loop description

extractSaveFrameDescription(inputSaveFrame)[source]

Extract saveframe description

ccpn.util.nef.Specification.extractByCategories(rcsbDataBlock)[source]

Get saveFrames describing SaveFrames, Loops, and items, respectively

ccpn.util.nef.Specification.getCcpnSpecification(filePath)[source]

Get NEF specification summary with ccpn-specific additions

ccpn.util.nef.Specification.transferLoop(genericContainer, saveFrame, inputTags)[source]

Transfer category.tag_x, … to loop named category with tags tag_x etc.

ccpn.util.nef.StarIo module

I/O for NEF and NmrStar formats, and other STAR variants satisfying the following requirements:

  • all plain tags in a saveframe start with a common prefix;

for NEF files this must be the ‘<sf_category>’ followed by ‘.’, and the framecode value must start with the ‘<sf_category>’ followed by underscore.

  • All loop column names start with ‘<loopcategory>.’

  • loopcategories share a namespace with tags within a saveframe

  • DataBlocks can contain only saveframes.

  • For NEF files the

Use the functions parseNmrStar, parseNef, parseNmrStarFile, parseNefFile.

The ‘File’ functions take a file name and pass the file contents to the corresponding parser.

The ‘NmrStar’ functions will read any Star file that satisfies the constraints above, while the ‘Nef’ functions will also enforce the NEF-specific constraints above.

On reading, tag prefixes (‘_’, ‘save_’, ‘data_’) are stripped, as are the parts of tags before the first ‘.’.
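
As a rough illustration of the stripping rule, the sketch below removes the leading prefix and, for item tags, the part before the first ‘.’. The helper strip_prefix is hypothetical and is not the StarIo code. It assumes lower-case prefixes and applies the ‘.’ rule only to tags that start with ‘_’.

```python
def strip_prefix(token):
    """Strip STAR prefixes from a raw name (hypothetical helper, not the StarIo code)."""
    if token.startswith("save_"):
        return token[len("save_"):]
    if token.startswith("data_"):
        return token[len("data_"):]
    if token.startswith("_"):
        tag = token[1:]
        # Drop everything up to and including the first '.', if there is one.
        _, dot, rest = tag.partition(".")
        return rest if dot else tag
    return token


for raw in ("_nef_sequence.chain_code", "save_nef_molecular_system", "data_example"):
    print(raw, "->", strip_prefix(raw))
```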

class ccpn.util.nef.StarIo.NmrDataBlock(name=None)[source]

Bases: ccpn.util.nef.GenericStarParser.DataBlock

DataBlock (OrderedDict) for NMRSTAR/NEF object tree

addSaveFrame(saveFrame)[source]

Add existing NmrSaveFrame to the DataBlock

newSaveFrame(name, category)[source]

Make new NmrSaveFrame and add it to the DataBlock

class ccpn.util.nef.StarIo.NmrDataExtent(name='Root')[source]

Bases: ccpn.util.nef.GenericStarParser.DataExtent

Top level container (OrderedDict) for NMRSTAR/NEF object tree

class ccpn.util.nef.StarIo.NmrLoop(name=None, columns=None)[source]

Bases: ccpn.util.nef.GenericStarParser.Loop

Loop for NMRSTAR/NEF object tree

The contents, self.data, is a list of OrderedDicts matching the column names. Rows can be modified or deleted from data, but adding new rows directly is likely to break; use the newRow function.

property category

- None, immutable - Loop category tag - synonym for name (unlike the case of SaveFrame)

property tagPrefix

- None, immutable - Prefix to use before item tags on output

class ccpn.util.nef.StarIo.NmrLoopRow[source]

Bases: ccpn.util.nef.GenericStarParser.LoopRow

class ccpn.util.nef.StarIo.NmrSaveFrame(name=None, category=None)[source]

Bases: ccpn.util.nef.GenericStarParser.SaveFrame

SaveFrame (OrderedDict) for NMRSTAR/NEF object tree

newLoop(name, columns)[source]

Make new NmrLoop and add it to the NmrSaveFrame

property tagPrefix

- None, immutable - Prefix to use before item tags on output

exception ccpn.util.nef.StarIo.StarValidationError[source]

Bases: ValueError

ccpn.util.nef.StarIo.parseNef(text, mode='standard')[source]

load NEF from string

ccpn.util.nef.StarIo.parseNefFile(fileName, mode='standard', wrapInDataBlock=False)[source]

parse NEF from file

If wrapInDataBlock is True, a missing DataBlock start will be provided.

ccpn.util.nef.StarIo.parseNmrStar(text, mode='standard')[source]

Load NMRSTAR from a string

ccpn.util.nef.StarIo.parseNmrStarFile(fileName, mode='standard', wrapInDataBlock=False)[source]

Parse NMRSTAR from file.

Parameters
  • fileName – path of the star-file to parse

  • mode – parsing mode: any of (‘lenient’, ‘strict’, ‘standard’, ‘IUCr’)

  • wrapInDataBlock – flag; if True a missing DataBlock start will be added

Returns

NmrDataBlock instance

ccpn.util.nef.StarIo.splitNefSequence(rows)[source]

Split a sequence of nef_sequence dicts assumed to belong to the same chain into a list of lists of sequentially linked stretches following the NEF rules

Note that missing linkings are treated as ‘middle’ and missing start/end tags are ignored, with the first/last residue treated, effectively, as linking ‘break’

Only unknown linking values and incorrect pairs of ‘cyclic’ tags raise an error

ccpn.util.nef.StarIo.string2FramecodeString(text)[source]

ccpn.util.nef.StarTokeniser module

STAR file tokenizer

Copyright © 2011, 2013 Global Phasing Ltd. All rights reserved.

Author: Peter Keller

This file forms part of the GPhL StarTools library.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  • Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

  • Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

  • If the regular expression used to match STAR/CIF data in the redistribution is not identical to that in the original version, this fact must be stated wherever the copyright notice is reproduced.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Created on 25 Nov 2013

@author: pkeller

Modified by Rasmus Fogh, CCPN project, 5/2/2016

class ccpn.util.nef.StarTokeniser.StarToken(type, value)

Bases: tuple

type

Alias for field number 0

value

Alias for field number 1
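
The "Alias for field number" entries are what Python generates for a named tuple, so StarToken presumably behaves like a two-field namedtuple. The standalone sketch below builds a stand-in with the same field names rather than importing the ccpn class. The token type label "TAG" and the value are made up for illustration.

```python
from collections import namedtuple

# Stand-in with the documented field names (not imported from ccpn).
StarToken = namedtuple("StarToken", ("type", "value"))

# Illustrative values only; real token type labels may differ.
token = StarToken(type="TAG", value="_nef_sequence.chain_code")

# Fields are reachable by name or by position, and the token unpacks like a tuple.
kind, text = token
print(token.type == token[0], token.value == token[1], kind, text)
```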

ccpn.util.nef.StarTokeniser.getTokenIterator(text)[source]

Return an iterator over all STAR tokens in a generic STAR file

ccpn.util.nef.Validator module

Module Documentation here

class ccpn.util.nef.Validator.Validator(nef=None, validateNefDict=None)[source]

Bases: object

isValid(nef=None, validNef=None)[source]

Return whether the Nef file is valid

property validationErrors

- None, immutable - Return the dict of validation errors

ccpn.util.nef.nef module

nef - nef handling routines; a series of routines to compare/verify Nef files

Command Line Usage:

nef is intended for execution from the command line with a suitable script. An example can be found in AnalysisV3/bin/nef:

    #!/usr/bin/env sh
    export CCPNMR_TOP_DIR="$(dirname $(cd $(dirname "$0"); pwd))"
    export ANACONDA3=${CCPNMR_TOP_DIR}/miniconda
    export PYTHONPATH=${CCPNMR_TOP_DIR}/src/python:${CCPNMR_TOP_DIR}/src/c
    ${ANACONDA3}/bin/python ${CCPNMR_TOP_DIR}/src/python/ccpn/util/nef/nef.py $*

Usage: nef [options]

optional arguments:
-h, --help

show this help message

-H, --Help

Show detailed help

--compare

Compare Nef files: with the following options

-i, --ignoreblockname

Ignore the blockname when comparing two Nef files. May be required when converting Nef files through different applications. May be used with -f and -d.

-f file1 file2, --files file1 file2

Compare two Nef files and print the results to the screen

-d dir1 dir2, --dirs dir1 dir2

Compare Nef files common to directories dir1 and dir2. Write output *.txt for each file into the output directory specified below.

-o outDir, --outdir

Specify output directory for batch processing

-s, --screen

Output batch processing to screen; default is to .txt files. May be used with -d.

-r, --replace

Replace existing .txt files. If false then files are appended with ‘(n)’ before the extension, where n is the next available number

-c, --create

Automatically create directories as required

-I, --ignorecase

Ignore case when comparing items

--same

Output similarities between Nef files; default is differences.

-a, --almostequal

Consider float/complex numbers to be equal if within the relative tolerance

-p, --places

Specify the number of decimal places for the relative tolerance

--verify

Verify Nef files

Can be used with switches: -f, -d
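
To make the -a and -p switches concrete, here is a minimal sketch of a relative-tolerance comparison. The helper nearly_equal is hypothetical, and mapping the number of places to a tolerance of 10**-places is an assumption about how the option is interpreted, not something confirmed by the nef.py source.

```python
import cmath
import math


def nearly_equal(a, b, places=8):
    """Compare two numbers using a relative tolerance of 10**-places.

    Hypothetical helper; the places-to-tolerance mapping is an assumption.
    """
    rel_tol = 10.0 ** -places
    if isinstance(a, complex) or isinstance(b, complex):
        return cmath.isclose(a, b, rel_tol=rel_tol)
    return math.isclose(a, b, rel_tol=rel_tol)


# The two floats differ by about one part in 10**5.
loose = nearly_equal(1.00000, 1.00001, places=3)
tight = nearly_equal(1.00000, 1.00001, places=8)
mixed = nearly_equal(2 + 1j, 2 + 1.0000001j, places=5)
print(loose, tight, mixed)
```

With a loose tolerance the two floats count as equal; with a tight one they are reported as different.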

Details of the contents of Nef files can be found in GenericStarParser. The general structure of a Nef file is:

DataExtent
  DataBlock
    Item
    Loop
    SaveFrame
      Item
      Loop

DataExtent, DataBlock and SaveFrame are Python OrderedDict with an additional ‘name’ attribute. DataBlocks and SaveFrames are entered in their container using their name as the key.

Loop is an object with a ‘columns’ list, a ‘data’ list-of-row-OrderedDict, and a name attribute set equal to the name of the first column. A loop is entered in its container under each column name, so that e.g. aSaveFrame[‘_Loopx.loopcol1’] and aSaveFrame[‘_Loopx.loopcol2’] both exist and both correspond to the same loop object.

Items are entered as a string key - string value pair. The string value can be a dictionary.
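
The container layout described above can be mimicked with a few lines of plain Python. The classes below are toy stand-ins written for this illustration, not the GenericStarParser classes, and the method name add_loop is hypothetical.

```python
from collections import OrderedDict


class Loop:
    """Toy loop: a column list plus a list of row dicts."""

    def __init__(self, columns):
        self.columns = list(columns)
        self.name = self.columns[0]  # name is set equal to the first column
        self.data = []               # list of OrderedDict rows


class SaveFrame(OrderedDict):
    """Toy container: an OrderedDict with an extra 'name' attribute."""

    def __init__(self, name):
        super().__init__()
        self.name = name

    def add_loop(self, loop):
        # Enter the same loop object under every one of its column names.
        for column in loop.columns:
            self[column] = loop


frame = SaveFrame("save_example")
frame["_Example.title"] = "demo"  # an item: string key, string value

loop = Loop(["_Loopx.loopcol1", "_Loopx.loopcol2"])
loop.data.append(OrderedDict(zip(loop.columns, ["a", "b"])))
frame.add_loop(loop)

print(frame["_Loopx.loopcol1"] is frame["_Loopx.loopcol2"])
```

Both column keys resolve to the same Loop object, which is the behaviour the description attributes to aSaveFrame[‘_Loopx.loopcol1’] and aSaveFrame[‘_Loopx.loopcol2’].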

Module Contents

nef.py contains the following routines:

compareNefFiles: compare two Nef files and return a comparison object.

Searches through all objects: dataExtents, dataBlocks, saveFrames and Loops within the files. Comparisons are made for all data structures that have the same name. Differences for items within a column are listed in the form:

    dataExtent:dataBlock:saveFrame:Loop:  <Column>: columnName  <rowIndex>: row  -->  value1 != value2

dataExtents, dataBlocks, saveFrames, Loops, columns present only in one file are listed, in the form:

    dataExtent:dataBlock:saveFrame:Loop:  contains -->  parameter1
                                                        parameter2
                                                        …
                                                        parameterN

A comparison object is a list of nefItems of the form:

NefItem
  inWhich         a flag labelling which file the item was found in
                  1 = found in first file, 2 = found in second file, 3 = common to both
  List
    Item          multiple strings containing the comparison tree
    (,List)       the last item of which may be a list of items common to the tree

e.g., for parameters present in the first file:

[
  inWhich=1
  list=[dataExtent1, dataBlock1, saveFrame1, Loop1, [parameter1, parameter2, parameter3]]
]

  • compareDataExtents – compare two DataExtent objects and return a comparison list as above

  • compareDataBlocks – compare two DataBlock objects and return a comparison list as above

  • compareSaveFrames – compare two SaveFrame objects and return a comparison list as above

  • compareLoops – compare two Loop objects and return a comparison list as above

  • compareNefFiles – compare two Nef files and return a comparison list as above

  • batchCompareNefFiles – compare two directories of Nef files. Nef files common to the specified directories are compared and the comparison lists are written to the third directory as .txt

  • printCompareList – print the comparison list to the screen

class ccpn.util.nef.nef.NEFOPTIONS(value)[source]

Bases: enum.Enum

An enumeration.

COMPARE = 'compare'
VERIFY = 'verify'

class ccpn.util.nef.nef.Test_compareFiles(methodName='runTest')[source]

Bases: unittest.case.TestCase

Test the comparison of nef files and print the results

test_commandLineParser()[source]

Test the output from the parser

test_compareBatchFiles()[source]

Compare the Nef files in two directories

test_compareDifferentFiles()[source]

Load two files and compare

test_compareObjects()[source]

Test the compareObjects method

test_compareSimilarFiles()[source]

Load two files and compare

test_verifyFiles()[source]

Load two files and verify

test_verifySingleFile()[source]

Load single file and verify

ccpn.util.nef.nef.batchCompareNefFiles(inDir1, inDir2, outDir, options)[source]

Batch compare the Nef files common to the two directories. For each file found, write the compare log to the corresponding .txt file.

Parameters
  • inDir1

  • inDir2

  • outDir

  • options – nameSpace holding the commandLineArguments
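
The idea of "files common to the two directories" can be sketched with a set intersection. This is an illustration only: the helper common_files, the .nef suffix filter and the report naming (b.nef giving b.txt) are assumptions, not taken from the nef.py source.

```python
import os
import tempfile


def common_files(dir1, dir2, suffix=".nef"):
    """Return sorted file names ending in suffix that exist in both directories."""
    names1 = {n for n in os.listdir(dir1) if n.endswith(suffix)}
    names2 = {n for n in os.listdir(dir2) if n.endswith(suffix)}
    return sorted(names1 & names2)


with tempfile.TemporaryDirectory() as d1, tempfile.TemporaryDirectory() as d2:
    layout = ((d1, ["a.nef", "b.nef", "notes.txt"]), (d2, ["b.nef", "c.nef"]))
    for folder, names in layout:
        for name in names:
            open(os.path.join(folder, name), "w").close()  # create empty files
    shared = common_files(d1, d2)
    # Assumed naming: each shared file gets a report with a .txt extension.
    reports = [os.path.splitext(name)[0] + ".txt" for name in shared]

print(shared, reports)
```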

ccpn.util.nef.nef.compareDataBlocks(dataBlock1, dataBlock2, options, cItem=None, nefList=None)[source]

Compare two dataBlocks; if they have the same name, check their contents

Parameters
  • dataBlock1 – first DataBlock object, of type GenericStarParser.DataBlock

  • dataBlock2 – second DataBlock object, of type GenericStarParser.DataBlock

  • options – nameSpace holding the commandLineArguments

  • cItem – list of str describing differences between nefItems

  • nefList – input of nefItems

Returns

list of type nefItem

ccpn.util.nef.nef.compareDataExtents(dataExt1, dataExt2, options, cItem=None, nefList=None)[source]

Compare two dataExtents; if they have the same name, check their contents

Parameters
  • dataExt1 – first DataExtent object, of type GenericStarParser.DataExtent

  • dataExt2 – second DataExtent object, of type GenericStarParser.DataExtent

  • options – nameSpace holding the commandLineArguments

  • cItem – list of str describing differences between nefItems

  • nefList – input of nefItems

Returns

list of type nefItem

class ccpn.util.nef.nef.compareItem(attribute=None, row=None, column=None, thisValue=None, compareValue=None)[source]

Bases: object

Holds the details of a compared loop/saveFrame item at a particular row/column (if required)

ccpn.util.nef.nef.compareLoops(loop1, loop2, options, cItem=None, nefList=None)[source]

Compare two Loops

Parameters
  • loop1 – first Loop object, of type GenericStarParser.Loop

  • loop2 – second Loop object, of type GenericStarParser.Loop

  • options – nameSpace holding the commandLineArguments

  • cItem – list of str describing differences between nefItems

  • nefList – input of nefItems

Returns

list of type nefItem

ccpn.util.nef.nef.compareNefFiles(inFile1, inFile2, options, cItem=None, nefList=None)[source]

Compare two Nef files and return comparison as a nefItem list

Parameters
  • inFile1 – name of the first file

  • inFile2 – name of the second file

  • options – nameSpace holding the commandLineArguments

  • cItem – list of str describing differences between nefItems

  • nefList – input of nefItems

Returns

list of type nefItem

ccpn.util.nef.nef.compareSaveFrames(saveFrame1, saveFrame2, options, cItem=None, nefList=None)[source]

Compare two saveFrames; if they have the same name, check their contents

Parameters
  • saveFrame1 – first SaveFrame object, of type GenericStarParser.SaveFrame

  • saveFrame2 – second SaveFrame object, of type GenericStarParser.SaveFrame

  • options – nameSpace holding the commandLineArguments

  • cItem – list of str describing differences between nefItems

  • nefList – input of nefItems

Returns

list of type nefItem

ccpn.util.nef.nef.defineArguments()[source]

Define the arguments of the program

Returns

argparse instance
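
A parser for a subset of the documented switches might look like the sketch below. It uses only the standard argparse module. The function name build_parser, the help strings and the default of 8 places are assumptions for illustration, not the actual defineArguments code.

```python
import argparse


def build_parser():
    """Rebuild a subset of the documented switches (illustrative only)."""
    parser = argparse.ArgumentParser(prog="nef", description="Compare or verify Nef files")
    parser.add_argument("--compare", action="store_true", help="compare Nef files")
    parser.add_argument("--verify", action="store_true", help="verify Nef files")
    parser.add_argument("-f", "--files", nargs=2, metavar=("file1", "file2"))
    parser.add_argument("-d", "--dirs", nargs=2, metavar=("dir1", "dir2"))
    parser.add_argument("-o", "--outdir", metavar="outDir")
    parser.add_argument("-i", "--ignoreblockname", action="store_true")
    parser.add_argument("-I", "--ignorecase", action="store_true")
    parser.add_argument("-a", "--almostequal", action="store_true")
    # The default number of places is an assumption, not taken from nef.py.
    parser.add_argument("-p", "--places", type=int, default=8)
    return parser


argv = ["--compare", "-f", "a.nef", "b.nef", "-a", "-p", "5"]
options = build_parser().parse_args(argv)
print(options.files, options.almostequal, options.places)
```

The resulting namespace plays the role of the options argument that the compare and verify functions receive.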

ccpn.util.nef.nef.import_parents(level=1)[source]

class ccpn.util.nef.nef.nefItem(cItem=None)[source]

Bases: object

Holds the contents of a single Nef comparison

  • inWhich – a flag labelling which file the item was found in: 1 = found in the first file, 2 = found in the second file, 3 = common to both

  • list – a list of strings containing the comparison information

ccpn.util.nef.nef.printCompareList(nefList, inFile1, inFile2, options)[source]

Print the contents of the nef compare list to the screen

Output is in three parts:
  • items that are present only in the first file

  • items that are only in the second file

  • differences between objects that are common in both files

Parameters
  • nefList – list to print

  • inFile1 – name of the first file

  • inFile2 – name of the second file

ccpn.util.nef.nef.printFile(thisFile)[source]

Print a file to the screen

ccpn.util.nef.nef.printOutput(*args, **kwds)[source]

Output a message

ccpn.util.nef.nef.printWhichList(nefList, options, whichType=whichTypes.NONE)[source]

List only those items that are of type whichType

Parameters
  • nefList – list to print

  • whichType – type to print

ccpn.util.nef.nef.processArguments(options)[source]

Process the command line arguments

ccpn.util.nef.nef.showError(msg, *args, **kwds)[source]

Show an error message

ccpn.util.nef.nef.showMessage(msg, *args, **kwds)[source]

Show a warning message

ccpn.util.nef.nef.sizeNefList(nefList, whichType=whichTypes.NONE)[source]

List only those items that are of type whichType

Parameters
  • nefList – list to print

  • whichType – type to print

ccpn.util.nef.nef.verifyFile(file, options)[source]

Verify a single file

Parameters
  • file

  • options

Returns

ccpn.util.nef.nef.verifyFiles(inFiles, options)[source]

Verify files

Parameters
  • inFiles

  • options – nameSpace holding the commandLineArguments

class ccpn.util.nef.nef.whichTypes(value)[source]

Bases: enum.Enum

An enumeration.

BOTH = 3
LEFT = 1
NONE = 0
RIGHT = 2
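
The enumeration values line up with the inWhich flag described for nefItem (1 = first file, 2 = second file, 3 = both). The sketch below shows how a comparison list could be filtered by type. The Item class and the select function are toy stand-ins, and storing inWhich as a plain integer equal to the enum value is an assumption.

```python
from enum import Enum


class WhichTypes(Enum):
    """Same member values as the documented whichTypes enumeration."""
    NONE = 0
    LEFT = 1
    RIGHT = 2
    BOTH = 3


class Item:
    """Toy stand-in for nefItem: an inWhich flag plus a list of strings."""

    def __init__(self, inWhich, strings):
        self.inWhich = inWhich
        self.list = strings


def select(items, which):
    # Keep only the items whose inWhich flag equals the requested type's value.
    return [item for item in items if item.inWhich == which.value]


items = [
    Item(1, ["extent", "block", "frameA"]),
    Item(2, ["extent", "block", "frameB"]),
    Item(3, ["extent", "block", "frameC", "value1 != value2"]),
    Item(1, ["extent", "block", "frameD"]),
]

left_only = select(items, WhichTypes.LEFT)
in_both = select(items, WhichTypes.BOTH)
print(len(left_only), len(in_both))
```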