Module Documentation here
Basic API (data storage) I/O functions
- class ccpnmodel.ccpncore.lib.Io.Api.DefaultIoHandler¶
Class to handle interactions with the user and logging. Should be subclassed for actual functionality; this default class simply does nothing.
- ccpnmodel.ccpncore.lib.Io.Api.absentOrRemoved(path: str, overwriteExisting: bool = False) -> bool ¶
Check if file is present, possibly removing it first.
- If path already exists:
  - If overwriteExisting:
    Delete the path. Return True.
  - Else if showYesNo:
    Ask the user if it is ok to delete the path. If yes, delete and return True. If no, return False.
This function is not intended to be used outside this module but could be.
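The decision flow described above can be sketched roughly as follows. This is an illustration only, not the module's code: `absent_or_removed_sketch` and the `ask_yes_no` callback are hypothetical names, and two behaviours are assumptions because the text does not state them: an absent path returns True, and an existing path with no permission to delete returns False.

```python
import os
import shutil
from typing import Callable, Optional


def absent_or_removed_sketch(path: str,
                             overwrite_existing: bool = False,
                             ask_yes_no: Optional[Callable[[str], bool]] = None) -> bool:
    """Return True if nothing exists at path once the function returns."""
    if not os.path.lexists(path):
        return True  # assumption: an absent path counts as success

    if overwrite_existing:
        allowed = True
    elif ask_yes_no is not None:
        # Hypothetical callback standing in for the showYesNo dialog.
        allowed = ask_yes_no("Delete %s?" % path)
    else:
        allowed = False  # assumption: without permission the path is kept

    if not allowed:
        return False

    # Remove either a directory tree or a single file/link.
    if os.path.isdir(path) and not os.path.islink(path):
        shutil.rmtree(path)
    else:
        os.remove(path)
    return True
```

Passing the dialog in as a callback keeps the sketch usable without a GUI; a caller could supply any function that maps a message to True or False.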
- ccpnmodel.ccpncore.lib.Io.Api.backupProject(project, dataLocationStores=None, skipRefData=True, clearOutDir=False)¶
Check that file on disk ends correctly
Check that topObject file on disk ends correctly
Clean up project preparatory to closing (close log handlers etc.)
- ccpnmodel.ccpncore.lib.Io.Api.copyV2ToV3Location(projectPath) -> str ¶
Copy V2 data to new directory with correct name and structure for V3
If the project is already V3, does nothing (except converting ‘xyz.ccpn/ccpn’ to ‘xyz.ccpn’).
Create backup of topObject in same directory as original file but with ‘.bak’ appended. This function is not intended to be used outside this module but could be.
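As a rough illustration of the ‘.bak’ convention described above, a backup copy next to the original file might be made like this. The name `backup_file_sketch` is hypothetical, and copying (rather than renaming) the file is an assumption; the text only says that a backup is created.

```python
import shutil


def backup_file_sketch(path: str) -> str:
    """Copy path to path + '.bak' in the same directory and return the new name."""
    backup_path = path + '.bak'
    # copy2 copies contents and metadata; an older .bak file is overwritten.
    shutil.copy2(path, backup_path)
    return backup_path
```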
Delete the temporary project directory, if there is one.
- ccpnmodel.ccpncore.lib.Io.Api.findCcpXmlFile(project, packageName, fileSearchString)¶
Finds an XML file by a file search pattern from all available package repositories
- ccpnmodel.ccpncore.lib.Io.Api.getRepositoryPath(project, repositoryName)¶
Load all data for a given root (version >= 2.0)
- ccpnmodel.ccpncore.lib.Io.Api.loadProject(path: str, projectName: str = None, askFile: function = None, askDir: function = None, suppressGeneralDataDir: bool = False, fixDataStores=True, applicationName='ccpn', useFileLogger: bool = True) -> ccpnmodel.ccpncore.api.memops.Implementation.MemopsRoot ¶
Loads a project file, checks and deletes unwanted project repositories, and changes the project repository path if the project has moved. Returns the project. (The project repository path is effectively the userData repository.)
askFile (if not None) has signature askFile(title, message, initial_value = ‘’).
askDir (if not None) has signature askDir(title, message, initial_value = ‘’).
Throws an IOError if there is an I/O error. Throws an ApiError if there is an API exception.
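A callback with the askFile/askDir signature given above could be built along these lines. This is a sketch under assumptions: `make_console_ask` is a hypothetical helper, and the return contract (a path string, or None when nothing is chosen) is not stated in the text.

```python
from typing import Callable, Optional


def make_console_ask(prompt_func: Callable[[str], str] = input) -> Callable[..., Optional[str]]:
    """Build a callback matching askFile/askDir(title, message, initial_value='')."""

    def ask(title: str, message: str, initial_value: str = '') -> Optional[str]:
        reply = prompt_func("%s\n%s [%s]: " % (title, message, initial_value)).strip()
        if reply:
            return reply
        # Assumption: an empty reply falls back to the initial value, or None.
        return initial_value or None

    return ask
```

The resulting function could then be supplied as the askFile or askDir argument; injecting `prompt_func` makes it possible to exercise the callback without a terminal.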
- ccpnmodel.ccpncore.lib.Io.Api.modifyPackageLocators(project, repositoryName, repositoryPath, packageNames, resetPackageLocator=True, resetRepository=False)¶
Resets package locators for specified packages to specified repository.
Use as, for example:
- resetPackageLocator: True will reset the package locator completely, removing old info.
  False will add the repository to the package locator.
- resetRepository: True will reset the url for the repository, even if it already exists.
  False will not reset the url for the repository if it already exists.
Returns the relevant repository.
- ccpnmodel.ccpncore.lib.Io.Api.movePackageData(root, newPackageName, oldPackageName)¶
Move all data from package oldPackageName to newPackageName
- ccpnmodel.ccpncore.lib.Io.Api.newProject(projectName, path: Optional[str] = None, overwriteExisting: bool = False, applicationName='ccpn', useFileLogger: bool = True) -> ccpnmodel.ccpncore.api.memops.Implementation.MemopsRoot ¶
Create and return a new project at the specified path (directory). If path is not specified, the current working directory is used. The path can be either absolute or relative. The ‘userData’ repository is pointed to the path. The ‘backup’ repository is pointed to the path + ‘_backup’ + CCPN_DIRECTORY_SUFFIX. If either of these paths already exists (either as a file or as a directory):
- If overwriteExisting:
  Delete the path.
- Else if showYesNo:
  Ask the user if it is ok to delete the path. If yes, delete. If no, return None.
- Else:
  Raise an IOError.
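The two repository locations described above can be derived as in the following sketch. It follows the description literally; `project_paths_sketch` is a hypothetical name, the value of `CCPN_DIRECTORY_SUFFIX` is an assumption inferred from the ‘xyz.ccpn’ naming used elsewhere on this page, and resolving relative paths to absolute ones is done here only for illustration.

```python
import os
from typing import Optional, Tuple

CCPN_DIRECTORY_SUFFIX = '.ccpn'  # assumption: inferred from the 'xyz.ccpn' naming


def project_paths_sketch(path: Optional[str] = None) -> Tuple[str, str]:
    """Return (userData path, backup path) following the description literally."""
    if path is None:
        path = os.getcwd()  # default to the current working directory
    user_data = os.path.abspath(path)  # relative paths are resolved against the cwd
    backup = user_data + '_backup' + CCPN_DIRECTORY_SUFFIX
    return user_data, backup
```

Whether the real function strips an existing suffix from the path before appending ‘_backup’ is not stated in the text, so the sketch does not attempt it.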
- ccpnmodel.ccpncore.lib.Io.Api.saveProject(project, newPath=None, changeBackup=True, createFallback=False, overwriteExisting=False, checkValid=False, changeDataLocations=False, useFileLogger: bool = True) -> bool ¶
Save the userData for a project to the location given by newPath (the url.path of the userData repository) if set, or to the existing location if not. Returns True if the save succeeded; otherwise returns False (or throws an error).
NB Changes to project in the function can NOT be undone, but previous contents of the undo queue are left active, so you can undo backwards.
If userData does not exist then throws IOError. If newPath is not specified then it is set to oldPath. If changeBackup, then also changes backup URL path for project. If createFallback, then makes copy of existing modified topObjects files (in newPath, not oldPath) before doing save:
- If newPath != oldPath and newPath exists (either as file or as directory):
  - If overwriteExisting: Delete the newPath.
  - Else if showYesNo: Ask the user if it is ok to delete the newPath. If yes, delete. If no, return without saving.
  - Else: Raise an IOError.
- Elif newProjectName != oldProjectName and there exists corresponding path (file/directory):
  - If overwriteExisting: Delete the path.
  - Else if showYesNo: Ask the user if it is ok to delete the path. If yes, delete. If no, return without saving.
  - Else: Raise an IOError.
If checkValid then does checkAllValid on project.
If changeDataLocations then copy to project directory.
If there is no exception or early return then at end userData is pointing to newPath.
Return True if save done, False if not (unless there is an exception).
- ccpnmodel.ccpncore.lib.Io.Api.setRepositoryPath(project, repositoryName, path)¶
Code for reading Fasta format files
Parse Fasta file and return sequences
Code for recognising and analysing external data formats. No longer in use as of version 3.1.0.
Pdb IO functions
- class ccpnmodel.ccpncore.lib.Io.Pdb.PdbRecordProcessor¶
Class for custom record processing
Fix atom record: make only globally acceptable changes; special-case handling is for later.
- ccpnmodel.ccpncore.lib.Io.Pdb.loadStructureEnsemble(molSystem: MolSystem, fil) -> StructureEnsemble ¶
Load PDB file into a new structure ensemble matching MolSystem. NB MolSystem is a required parameter for the data model, but there is no requirement that the data match.
- ccpnmodel.ccpncore.lib.Io.Pdb.readModelRecords(fil) -> Tuple[List[ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.PDBRecord], List[List[ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.PDBRecord]]] ¶
Read file or input stream, and return list-of-lists-of PDBRecords, one per model. All records are given in the first list; subsequent lists contain only ATOM records.
Read file or input stream, and return lists of PDBRecords, one per model. Header records are given in the first list.
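The per-model grouping described for readModelRecords can be illustrated on raw lines, without the PDBRecord classes. This is a minimal sketch, not the library's implementation: `split_models_sketch` is a hypothetical name, ENDMDL is taken as the delimiter that closes a model, and treating ATOM records of a file without ENDMDL as one implicit model is an assumption.

```python
from typing import Iterable, List, Tuple


def split_models_sketch(lines: Iterable[str]) -> Tuple[List[str], List[List[str]]]:
    """Return (all record lines, one list of ATOM lines per model)."""
    all_records: List[str] = []
    models: List[List[str]] = []
    current: List[str] = []
    for raw in lines:
        line = raw.rstrip('\r\n')
        if not line.strip():
            continue  # skip blank lines
        all_records.append(line)
        name = line[:6].strip()  # the record name occupies columns 1-6
        if name == 'ATOM':
            current.append(line)
        elif name == 'ENDMDL':
            models.append(current)  # ENDMDL closes the current model
            current = []
    if current:
        # Assumption: ATOM records not closed by ENDMDL form one implicit model.
        models.append(current)
    return all_records, models
```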
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.ANISOU¶
The ANISOU records present the anisotropic temperature factors. Columns 7 - 27 and 73 - 80 are identical to the corresponding ATOM/HETATM record.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.ATOM¶
The ATOM records present the atomic coordinates for standard residues. They also present the occupancy and temperature factor for each atom. Heterogen coordinates use the HETATM record type. The element symbol is always present on each ATOM record; segment identifier and charge are optional.
This should help older applications which do not use the element field of the ATOM record; these applications used column alignment to distinguish calcium (CA) from, say, an alpha-carbon (CA).
format resName correctly to allow for using 4-char resName fields
CHANGED FROM ORIGINAL - added
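The column-alignment convention mentioned above can be sketched as follows: in the fixed-width PDB layout the atom name occupies columns 13-16 and the element symbol columns 77-78, and a two-letter element symbol starts in column 13 while a one-letter symbol starts in column 14. The function name is hypothetical and the fallback is a heuristic only; four-character hydrogen names such as ‘HD11’ also start in column 13 and would be misread.

```python
def element_from_atom_line_sketch(line: str) -> str:
    """Guess the element symbol of a fixed-width ATOM/HETATM line."""
    # Prefer the explicit element field (columns 77-78) when it is filled in.
    element = line[76:78].strip()
    if element:
        return element.upper()
    # Fallback: use the alignment of the atom name in columns 13-16.
    name = line[12:16].ljust(4)
    if name[0].isalpha():
        return name[:2].strip().upper()  # 'CA  ' -> 'CA' (calcium)
    return name[1].upper()               # ' CA ' -> 'C' (alpha carbon)
```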
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.AUTHOR¶
The AUTHOR record contains the names of the people responsible for the contents of the entry.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.CAVEAT¶
CAVEAT warns of severe errors in an entry. Use caution when using an entry containing this record.
Returns a list of dictionaries with keys idCode and comment.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.CISPEP¶
CISPEP records specify the prolines and other peptides found to be in the cis conformation. This record replaces the use of footnote records to list cis peptides.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.COMPND¶
The COMPND record describes the macromolecular contents of an entry. Each macromolecule found in the entry is described by a set of token: value pairs, and is referred to as a COMPND record component. Since the concept of a molecule is difficult to specify exactly, PDB staff may exercise editorial judgment in consultation with depositors in assigning these names. For each macromolecular component, the molecule name, synonyms, number assigned by the Enzyme Commission (EC), and other relevant details are specified.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.CONECT¶
The CONECT records specify connectivity between atoms for which coordinates are supplied. The connectivity is described using the atom serial number as found in the entry. CONECT records are mandatory for HET groups (excluding water) and for other bonds not specified in the standard residue connectivity table which involve atoms in standard residues (see Appendix 4 for the list of standard residues). These records are generated by the PDB.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.CRYST1¶
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.CRYST2¶
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.CRYST3¶
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.CRYSTn¶
The CRYSTn (n=1,2,3) record presents the unit cell parameters, space group, and Z value. If the structure was not determined by crystallographic means, CRYSTn simply defines a unit cube.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.DBREF¶
The DBREF record provides cross-reference links between PDB sequences and the corresponding database entry or entries. A cross reference to the sequence database is mandatory for each peptide chain with a length greater than ten (10) residues. For nucleic acid entries a DBREF record pointing to the Nucleic Acid Database (NDB) is mandatory when the corresponding entry exists in NDB.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.END¶
The END record marks the end of the PDB file.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.ENDMDL¶
The ENDMDL records are paired with MODEL records to group individual structures found in a coordinate entry.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.EXPDTA¶
The EXPDTA record presents information about the experiment. The EXPDTA record identifies the experimental technique used. This may refer to the type of radiation and sample, or include the spectroscopic or modeling technique. Permitted values include:
  - ELECTRON DIFFRACTION
  - FIBER DIFFRACTION
  - FLUORESCENCE TRANSFER
  - NEUTRON DIFFRACTION
  - NMR
  - THEORETICAL MODEL
  - X-RAY DIFFRACTION
Returns a list of 2-tuples: (technique, comment) where technique is one of the accepted techniques.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.FORMUL¶
The FORMUL record presents the chemical formula and charge of a non-standard group. (The formulas for the standard residues are given in Appendix 5.)
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.HEADER¶
This section contains records used to describe the experiment and the biological macromolecules present in the entry: HEADER, OBSLTE, TITLE, CAVEAT, COMPND, SOURCE, KEYWDS, EXPDTA, AUTHOR, REVDAT, SPRSDE, JRNL, and REMARK records.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.HELIX¶
HELIX records are used to identify the position of helices in the molecule. Helices are both named and numbered. The residues where the helix begins and ends are noted, as well as the total length.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.HET¶
The HET records are used to describe non-standard residues, such as prosthetic groups, inhibitors, solvent molecules, and ions for which coordinates are supplied. Groups are considered HET if they are:
  - not one of the standard amino acids, and
  - not one of the nucleic acids (C, G, A, T, U, and I), and
  - not one of the modified versions of nucleic acids (+C, +G, +A, +T, +U, and +I), and
  - not an unknown amino acid or nucleic acid where UNK is used to indicate the unknown residue name.
HET records also describe heterogens for which the chemical identity is unknown, in which case the group is assigned the hetID UNK.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.HETATM¶
The HETATM records present the atomic coordinate records for atoms within “non-standard” groups. These records are used for water molecules and atoms presented in HET groups.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.HETNAM¶
This record gives the chemical name of the compound with the given hetID.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.HETSYN¶
This record provides synonyms, if any, for the compound in the corresponding (i.e., same hetID) HETNAM record. This is to allow greater flexibility in searching for HET groups.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.HYDBND¶
The HYDBND records specify hydrogen bonds in the entry.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.JRNL¶
The JRNL record contains the primary literature citation that describes the experiment which resulted in the deposited coordinate set. There is at most one JRNL reference per entry. If there is no primary reference, then there is no JRNL reference. Other references are given in REMARK 1.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.KEYWDS¶
The KEYWDS record contains a set of terms relevant to the entry. Terms in the KEYWDS record provide a simple means of categorizing entries and may be used to generate index files. This record addresses some of the limitations found in the classification field of the HEADER record. It provides the opportunity to add further annotation to the entry in a concise and computer-searchable fashion.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.LINK¶
The LINK records specify connectivity between residues that is not implied by the primary structure. Connectivity is expressed in terms of the atom names. This record supplements information given in CONECT records and is provided here for convenience in searching.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.MASTER¶
The MASTER record is a control record for bookkeeping. It lists the number of lines in the coordinate entry or file for selected record types.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.MODEL¶
The MODEL record specifies the model serial number when multiple structures are presented in a single coordinate entry, as is often the case with structures determined by NMR.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.MODRES¶
The MODRES record provides descriptions of modifications (e.g., chemical or post-translational) to protein and nucleic acid residues. Included are a mapping between residue names given in a PDB entry and standard residues.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.MTRIX1¶
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.MTRIX2¶
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.MTRIX3¶
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.MTRIXn¶
The MTRIXn (n = 1, 2, or 3) records present transformations expressing non-crystallographic symmetry.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.OBSLTE¶
OBSLTE appears in entries which have been withdrawn from distribution. This record acts as a flag in an entry which has been withdrawn from the PDB’s full release. It indicates which, if any, new entries have replaced the withdrawn entry. The format allows for the case of multiple new entries replacing one existing entry.
Processes continued record list to a list of dictionary objects. Each dictionary contains the data from one OBSLTE idCode.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.ORIGX1¶
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.ORIGX2¶
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.ORIGX3¶
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.ORIGXn¶
The ORIGXn (n = 1, 2, or 3) records present the transformation from the orthogonal coordinates contained in the entry to the submitted coordinates.
- exception ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.PDBError¶
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.PDBFile(iterable=(), /)¶
Class for managing a PDB file. This class inherits from a Python list object, and contains a list of PDBRecord objects. Load, save, edit, and create PDB files with this class.
Append object to the end of the list.
- insert(i, rec)¶
Insert object before index.
Loads a PDB file from File object fil.
Saves the PDBFile object in PDB file format to File object fil.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.PDBRecord¶
Base class for all PDB file records.
Read the PDB record line and convert the fields to the appropriate dictionary values for this class.
- reccat(rec_list, field)¶
Return the concatenation of field in all the records in rec_list.
- reccat_dictlist(rec_list, field, master_key)¶
- reccat_list(rec_list, field, sep)¶
Call reccat, then split the result by the separator.
- reccat_multi(rec_list, primary_key, translations)¶
Create a list of dictionaries from a list of records. This method has complex behavior to support translations of several PDB records into a Python format. The primary key is used to separate the dictionaries within the list, and the translations argument is a list of strings or 2-tuples. If the translation is a string, the value from the PDB record field is copied to the return dictionary. If the translation is a 2-tuple t, then t[0] is the return dictionary key whose value is a list formed from the list of PDB fields in t[1].
- reccat_tuplelist(rec_list, field, sep1, sep2)¶
Call reccat_list with sep1 as the list separator, then split the items into tuples by sep2.
Return a properly formed PDB record string from the instance dictionary values.
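The reccat family described above chains three small steps: concatenate a field across continuation records, split the result into items, then split each item into a tuple. The sketch below uses plain dictionaries in place of PDBRecord objects; the `_sketch` names are hypothetical, and stripping whitespace and dropping empty items are assumptions rather than documented behaviour.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]


def reccat_sketch(rec_list: List[Record], field: str) -> str:
    """Concatenate one field across all records (e.g. continuation lines)."""
    return ''.join(rec.get(field) or '' for rec in rec_list)


def reccat_list_sketch(rec_list: List[Record], field: str, sep: str) -> List[str]:
    """Concatenate the field, then split the result by the separator."""
    items = reccat_sketch(rec_list, field).split(sep)
    return [item.strip() for item in items if item.strip()]


def reccat_tuplelist_sketch(rec_list: List[Record], field: str,
                            sep1: str, sep2: str) -> List[Tuple[str, ...]]:
    """Split into items by sep1, then split each item into a tuple by sep2."""
    return [tuple(part.strip() for part in item.split(sep2))
            for item in reccat_list_sketch(rec_list, field, sep1)]
```

For example, a field holding `token: value;` pairs could be split with `sep1=';'` and `sep2=':'` to give a list of (token, value) tuples.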
- exception ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.PDBValueError(text)¶
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.REMARK¶
REMARK records present experimental details, annotations, comments, and information not included in other records. In a number of cases, REMARKs are used to expand the contents of other record types. A new level of structure is being used for some REMARK records. This is expected to facilitate searching and will assist in the conversion to a relational database.
CHANGED FROM ORIGINAL
gv: 3 Feb 2006: Omitted the remarkNum field, and moved all to text field
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.REVDAT¶
REVDAT records contain a history of the modifications made to an entry since its release.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.RecordProcessor¶
- process_pdb_records(pdb_rec_iter, filter_func=None)¶
Iterates over the PDB records in pdb_rec_iter, and searches the processor object for handling methods to read them. There are several choices for method names for the processor objects.
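Name-based dispatch of this kind might look like the sketch below. The method naming scheme (`process_<NAME>` with a `process_default` fallback), the dictionary records with a 'name' key, and the meaning of `filter_func` (return True to keep a record) are all assumptions made for illustration; the actual choices are not listed in the text above.

```python
class RecordProcessorSketch:
    """Dispatch each record to a handler method chosen by record name."""

    def process_pdb_records(self, pdb_rec_iter, filter_func=None):
        for rec in pdb_rec_iter:
            if filter_func is not None and not filter_func(rec):
                continue  # assumption: a falsy filter result skips the record
            # Hypothetical naming scheme: process_<NAME>, else process_default.
            handler = getattr(self, 'process_' + rec['name'], None)
            if handler is None:
                handler = getattr(self, 'process_default', None)
            if handler is not None:
                handler(rec)


class AtomCounter(RecordProcessorSketch):
    """Example subclass that counts ATOM records and everything else."""

    def __init__(self):
        self.atoms = 0
        self.other = 0

    def process_ATOM(self, rec):
        self.atoms += 1

    def process_default(self, rec):
        self.other += 1
```

A subclass only needs to define handlers for the record types it cares about; anything else falls through to the default handler or is ignored.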
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.SCALE1¶
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.SCALE2¶
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.SCALE3¶
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.SCALEn¶
The SCALEn (n = 1, 2, or 3) records present the transformation from the orthogonal coordinates as contained in the entry to fractional crystallographic coordinates. Non-standard coordinate systems should be explained in the remarks.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.SEQADV¶
The SEQADV record identifies conflicts between sequence information in the ATOM records of the PDB entry and the sequence database entry given on DBREF. Please note that these records were designed to identify differences and not errors. No assumption is made as to which database contains the correct data. PDB may include REMARK records in the entry that reflect the depositor’s view of which database has the correct sequence.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.SEQRES¶
The SEQRES records contain the amino acid or nucleic acid sequence of residues in each chain of the macromolecule that was studied.
Returns a dictionary with attributes chain_id, num_res, and sequence_list
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.SHEET¶
SHEET records are used to identify the position of sheets in the molecule. Sheets are both named and numbered. The residues where the sheet begins and ends are noted.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.SIGATM¶
The SIGATM records present the standard deviation of atomic parameters as they appear in ATOM and HETATM records. Columns 7 - 27 and 73 - 80 are identical to the corresponding ATOM/HETATM record.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.SIGUIJ¶
The SIGUIJ records present the standard deviations of anisotropic temperature factors scaled by a factor of 10**4 (Angstroms**2). Columns 7 - 27 and 73 - 80 are identical to the corresponding ATOM/HETATM record.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.SITE¶
The SITE records supply the identification of groups comprising important sites in the macromolecule.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.SLTBRG¶
The SLTBRG records specify salt bridges in the entry.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.SOURCE¶
The SOURCE record specifies the biological and/or chemical source of each biological molecule in the entry. Sources are described by both the common name and the scientific name, e.g., genus and species. Strain and/or cell-line for immortalized cells are given when they help to uniquely identify the biological entity studied.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.SPRSDE¶
The SPRSDE records contain a list of the ID codes of entries that were made obsolete by the given coordinate entry and withdrawn from the PDB release set. One entry may replace many. It is PDB policy that only the principal investigator of a structure has the authority to withdraw it.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.SSBOND¶
The SSBOND record identifies each disulfide bond in protein and polypeptide structures by identifying the two residues involved in the bond.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.TER¶
The TER record indicates the end of a list of ATOM/HETATM records for a chain.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.TITLE¶
The TITLE record contains a title for the experiment or analysis that is represented in the entry. It should identify an entry in the PDB in the same way that a title identifies a paper.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.TURN¶
The TURN records identify turns and other short loop turns which normally connect other secondary structure segments.
- class ccpnmodel.ccpncore.lib.Io.PyMMLibPDB.TVECT¶
The TVECT records present the translation vector for infinite covalently connected structures.
Reads a sequence of PDB lines from iterable sequence and converts them to the correct PDB record objects, then yields them.
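Converting lines to record objects lazily, as described above, can be sketched with a generator and a lookup table keyed by record name. The classes here are hypothetical stand-ins rather than the PyMMLibPDB classes, and falling back to a generic record for unknown names is an assumption.

```python
from typing import Dict, Iterable, Iterator, Type


class RecordSketch(dict):
    """Dictionary-like stand-in for a PDB record; 'line' keeps the raw text."""
    _name = ''

    def read(self, line: str) -> None:
        self['line'] = line


class AtomSketch(RecordSketch):
    _name = 'ATOM'


class EndSketch(RecordSketch):
    _name = 'END'


# Lookup table from record name (columns 1-6) to record class.
RECORD_CLASSES: Dict[str, Type[RecordSketch]] = {
    cls._name: cls for cls in (AtomSketch, EndSketch)
}


def iter_records_sketch(lines: Iterable[str]) -> Iterator[RecordSketch]:
    """Yield one record object per non-blank line."""
    for raw in lines:
        line = raw.rstrip('\r\n')
        if not line.strip():
            continue
        # Assumption: unknown record names fall back to the base class.
        cls = RECORD_CLASSES.get(line[:6].strip(), RecordSketch)
        rec = cls()
        rec.read(line)
        yield rec
```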