Introduction
Short description of the programs.
From: S Chandra Shekar (scshekar@medusa.bioc.aecom.yu.edu)
Date: 31 May 2005
See the author's web site for more/newer information.
- Scripts to extract information from, manipulate, or generate cross peak
(.xpk) files of 'nmrview' (a graphical package for viewing and analyzing nmr
data, by Bruce Johnson).
Also contains the directory xpk2nv w/ Fortran77, C and Perl
codes to simulate, from a given .xpk file, 2d/3d spectra in frequency domain by
modelling the spectra as a collection of non-interacting Gaussian oscillators.
The directory is grouped into subdirectories with mostly Perl scripts,
with some c-shell, awk and a few sed scripts - there are about 225 in all.
The usage information for almost all the scripts can be obtained just by typing
the name of the script (make the script executable first where that applies;
e.g., c-shell and perl scripts can be made executable, whereas the awk and sed
scripts are run as "awk -f script" or "sed -f script"). Most of the remaining
scripts have the usage information written in the code itself (so look near the
top of the source code).
NOTE: some of the scripts may use bmrb chemical shift information (so you
may need to provide that table). There is such a file, "bmrb.cs", in the
main directory, but you may want to use the latest updated table
from bmrb (BioMagResBank). Also, if any of the scripts have the path to
bmrb.cs hard-wired into them, you may have to edit it suitably.
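As an illustration of the Gaussian-oscillator model mentioned above for xpk2nv, here is a minimal Python sketch (not the xpk2nv code itself) that builds a 2d frequency-domain spectrum as a plain sum of one Gaussian per peak. The peak positions, line widths (sx, sy), heights, grid limits and function names are all made up for illustration; the actual C/Fortran77/Perl codes take their input from a given .xpk file and may differ in detail.

```python
import math


def axis(start, stop, n):
    """Return n evenly spaced ppm values from start to stop (inclusive)."""
    step = (stop - start) / (n - 1)
    return [start + k * step for k in range(n)]


def simulate_2d(peaks, x_axis, y_axis):
    """Sum one 2d Gaussian per peak on the grid; the result is indexed [y][x].

    "Non-interacting" here simply means that the peaks add linearly.
    """
    grid = [[0.0 for _ in x_axis] for _ in y_axis]
    for p in peaks:
        # A 2d Gaussian is separable, so compute the 1d profiles once per peak.
        gx = [math.exp(-0.5 * ((x - p["x"]) / p["sx"]) ** 2) for x in x_axis]
        gy = [math.exp(-0.5 * ((y - p["y"]) / p["sy"]) ** 2) for y in y_axis]
        for j, wy in enumerate(gy):
            row = grid[j]
            for i, wx in enumerate(gx):
                row[i] += p["height"] * wy * wx
    return grid


# Hypothetical hsqc-like peaks: x = 1H ppm, y = 15N ppm (values are assumptions).
peaks = [
    {"x": 8.44, "y": 105.4, "sx": 0.02, "sy": 0.3, "height": 18.9},
    {"x": 8.18, "y": 104.4, "sx": 0.02, "sy": 0.3, "height": 27.7},
]
h_axis = axis(8.0, 9.0, 101)      # 0.01 ppm per point
n_axis = axis(104.0, 106.0, 201)  # 0.01 ppm per point
spectrum = simulate_2d(peaks, h_axis, n_axis)
```

A 3d spectrum follows the same pattern with a third separable profile per peak; the cost grows with the number of peaks times the number of grid points, which is presumably why a faster, approximate version is worth having.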
| Name | Type | Purpose |
| aaShift | csh | extracts chemical shifts for a given type of amino acid from a bmrb (BioMagResBank) table (in the form of a text file; the pathname for the table should be changed according to local needs); cacbCS is a related script that extracts just the ca and cb shifts for the given amino acid type |
| autoass/xpk2autoAssgnFormat.pl | perl | reformat nmrview .xpk file for use with the auto-assignment package "autoassign"; usage: xpk2autoAssgnFormat.pl file.xpk exptName |
| bmrb.cs | txt | an input file for some of the scripts in this tree; it was copied from the BioMagResBank (bmrb) web site sometime in 2002. You may want to use the updated information from the bmrb web site instead, and you can place the file anywhere you wish; but if a script has the pathname of bmrb.cs hard-wired into it, you may have to edit the script to make the pathname correct. |
| bmrb2tatapro/ | directory | a set of awk scripts to transform .xpk files into a format suitable for use by the auto-assignment code tatapro |
| bmrb2xpk/ | directory | mostly perl scripts that generate, from a bmrb chemical shift (cs) table, an nmrview .xpk file corresponding to a given type of nmr experiment; e.g., hncacbAssgnd.pl generates a 3d .xpk file corresponding to the 3d, 3res hncacb experiment and fills in the assignment boxes, whereas hncacb.pl generates the .xpk file w/o the assignments. Almost all of the scripts (hncoca.pl, hnca.pl etc.) have a variant that carries the assignment information over into the .xpk file from the bmrb-based table files. |
| cacbCS | (c)sh | see the entry for "aaShift" above; this is one example where the full pathname of "bmrb.cs" has to be edited |
| diff.{awk,sed} | awk/sed | reformat output from unix command "diff"; see entry for sub{,.sed} |
| findXpk/ | directory | a bunch of perl and awk scripts for locating peaks in an nmr data set; the most extensively used (and hence probably the most reliable and relevant) of these are fb1.pl and fb2.pl, where "fb" stands for forward/backward. usage: fb1.pl {hn,file{A,B}}.xpk c1 c2 [htol] [ctol] [ntol] (defaults: htol=0.02 ctol=0.4 ntol=0.325); see also ~/xpkScrpts/findXpk/fb1.pl.readme. The usage line uses c-shell brace expansion, in which {a,b}.xpk means a.xpk b.xpk. As the usage line shows (it is printed if just the name of the perl script is typed), the script is given an hsqc.xpk, two other .xpk files corresponding to a 'sister' pair of 3d 3res experiments (giving the intra- and inter-residue connectivities, for example hncacb and hncocacb), and two carbon shifts w/ the same h and n frequencies; it then tries to find all possible matches arising from different h and n frequencies. Depending on whether the intra or the inter .xpk file is the 2nd .xpk file (hsqc.xpk is always the 1st .xpk file in the command line argument list), the possible matches are in either the forward (c-terminal) or the backward direction. |
| genXpk/ | directory | scripts to generate various types of peak lists (even a modification such as renumbering the peaks as consecutive natural numbers is treated as generating a new peak list). Some of the scripts that i have found very useful: 0) assgnHsqc1Hsqc2v0.pl; usage: assgnHsqc1Hsqc2v0.pl hsqcFile{1,2} [h1tol] [n15tol] (defaults: h1tol=0.05 n15tol=0.3125). This perl script transfers assignments from one 2d .xpk file to another 2d .xpk file if the peaks match in frequency and the peak in the 2nd .xpk file is unassigned; it does not touch peaks that are already assigned in the 2nd .xpk file. Its various versions differ in the selection criteria for printing the matched/unmatched peaks from the 2nd .xpk file and in the way the output is ordered. Some of the readme files in this folder may come in handy, but the best way to get familiar with these scripts is to explore them. 1) assgnHsqc2hcnVrsn0.pl (and similar scripts); usage: assgnHsqc2hcn.pl hsqc.xpk hcn.xpk [h1tol] [n15tol] (defaults: h1tol=0.05 n15tol=0.3125); fills in two of the three assignable boxes in a 3d 3res .xpk file from the assignments present in a 2d hsqc .xpk file. 1.1) fltrHsqc0Hsqc1v1.pl, fltrHsqcHcn.pl, hn1hn2cmmnAssgnd.pl, hsqc2hcnFltr.pl etc. filter 2d/3d .xpk files according to whether their peaks match a given 2d .xpk file along two pre-specified coordinate/frequency axes. 2) fmtXpk.pl: one of those scripts that does not print usage information when entered w/o arguments; it just waits for input, so it can operate on data streaming through a unix pipe, on unix file redirection, or on a file given as the argument to the command. It formats a given .xpk file so that "diff" can be used on different .xpk files more easily; most importantly, it tries to keep the length of each record (representing one peak) to a minimum, thus reducing the required window size; 3d .xpk file records can typically fit on a standard monitor screen this way (also see the script xtrctEtc/xtrct.pl, which takes an .xpk file from stdin and extracts only the most important fields of each record, making the .xpk files far less painful to view). 3) hn2trXpk.pl, tr2hnXpk.pl: convert the shifts from hsqc to trosy and vice versa for peaks in an .xpk file. 4) xeasy2xpk.pl: converts peak tables in xeasy format to .xpk files for nmrview. 5) orderAssgnmntsHsqc.pl: rewrites the .xpk file such that the assigned peaks are in increasing order of the residue numbers to which they are assigned. 6) hsqc2hcn.pl; usage: hsqc2hcn.pl hsqc.xpk cCarrierPpm cLbl; generates a pseudo 3d 3res .xpk file from an hsqc/trosy 2d .xpk file, w/ every peak placed at a user-specified frequency in the 3rd (typically carbon) dimension. 7) renum.pl: renumbers the peaks in an .xpk file. 8) renumAssgnmnts.pl; usage: renumAssgnmnts.pl xpkFile <offset> (default offset=0); this extremely useful script renumbers the assignments in an .xpk file (handy when the starting amino acid number of a given sequence is changed, based on one's need of the moment). 9) some of the above perl scripts have "awk" versions in this directory. |
| meg/ | directory | Prof. Mark E. Girvin's Perl scripts. They serve to assign a given sequence of connected cross peaks to a given sequence of a.a. residues; i.e., they translate peak connectivities into a possible sequence string. |
| nmrDrw2nv/d2n.pl | perl | converts nmrDraw-generated peak tables to an nmrview .xpk file; there is a sample input file in the same directory; usage: nmrDrw2nv/d2n.pl parFile; sample parFile: # lblX lblY hn n # swX swY 5204.34 2603.96 # larmorX larmorY 800.2338 81.0963 # nmDrawXpkFile test.tab |
| plScrpt{,0} | csh | generates a gnuplot script and executes it; the gnuplot script in turn generates a postscript file (look at the usage info in the script) |
| scrpts/ | directory | a small set of miscellaneous small scripts; just take a look and try; some of them are not finished; look for any updates via the link to my web site or e-mail me |
| seq/ | directory | collection of awk and perl scripts to output a protein sequence in various formats, some of which are required by other packages/programs, popular and not so popular, such as nmrview and tatapro (an automated nmr-data-to-protein-sequence assignment package). Consult the readme files therein and just try the scripts out. |
| sl/ | directory | scripts used extensively by the author in his day-to-day work on spin-labeled nmr studies of the protein under study. 0) reducahPH.pl: perl script to calculate the amount of reducing agent (phenyl hydrazine) to be added to the nmr sample; reducahPH0.pl usage: ~/xpkScrpts/sl/reducah.pl protMolarConcn(mM) sampleVol vol2bAdded. 1) xpk2ir.pl; usage: xpk2ir.pl {oxd,red}.xpk tauc <sclFctr> <oxdNoiseLvl> <redNoiseLvl> (defaults: sclFctr=1, noiseLvl=0); calculates intensity ratios from the nmrview peak lists of an 'oxidized' and a 'reduced' hsqc nmr data set; it also calculates the distance from the spin-labeled site to each xpk in question. It only deals w/ "assigned" peaks; however, the directory contains scripts that calculate the intensity ratios for any two cross peaks from the data pair that match in both frequency dimensions. 2) xpk2cnsBin.pl; usage: xpkScrpts/sl/xpk2cnsBin.pl parFile; sample parFile: # oxd.xpk oxd.xpk # red.xpk red.xpk # tauC 17e-9 # mutResNum 129 # cutOffRatio (above which, ir = cutOffRatio, d-d(cutOffRatio)<=100 0.85 # sclFctr 1; calculates the 'binned' distances and generates a constraint file for use w/ the cns (crystallography and nmr system) and xplorNih packages for structure refinement via molecular dynamics (xpk2dyana.pl, e.g., generates constraint files for the Dyana/Cyana packages). There are also a variety of other scripts here, e.g. to generate color codes and build them into pdb files so that molecular graphics programs such as molmol can display the residues according to the color code (e.g., the intensity ratios can be color coded); the color coding is achieved by assigning "temperature" factors; this is Prof. Girvin's idea. |
| strngs/ | directory | scripts to get chains of connectivity from 3res 3d/4d data (under development, but the scripts work surprisingly well even at this stage) |
| sub{,.sed} | (c)sh/sed | reformats output of unix command "diff" to make it very useful in terms of sorting and other operations |
| tubes/ | directory | awk scripts put together when i was still new to heavy-duty solution nmr of proteins; the idea was to get the chains of connected xpk's in 3res 3d nmr data; not worked on in a long time, just left there. |
| unfold.awk | awk | "awk" coding of the aliasing and unaliasing formulae that i came up with quite some time ago; interesting behavior; may still need some work. |
| xpk2nv/ | directory | powerful, beautiful and useful c-code (in a couple of versions, including a faster version w/ a reasonable compromise and a 'full' simulation version) to simulate 3d spectra (3res or otherwise), modeled as a set of noninteracting gaussian oscillators, using the information in nmrview .xpk files (which in turn can be synthesized from bmrb data using the scripts in the bmrb2xpk directory; see the entry for "bmrb2xpk"); only FREQUENCY domain data is generated. fortran77 and perl versions are also provided, but the c-code is the best (as fast as the fortran77 code and far more versatile). Whatever the means (c, fortran, perl), the generated binary file is converted to the nmrPipe data format (which can be viewed in nmrDraw) via nmrPipe scripts (template scripts are included) and then further converted into the nmrview format. One potential application is to generate frequency domain nD nmr data from the bmrb data bank. |
| xpk2xeasy/ | directory | contains a script to convert nmrview's .xpk file into xeasy format |
| xtrctEtc/ | directory | a bunch of scripts to extract info from .xpk files; many of them handle streaming data (i.e. they read from stdin, so no usage information is given by just typing the command w/o arguments; you have to look into the code itself to know how to use it). There are also some readme files which may come in handy. 0) for example, ------------------------------------------------------------------- cat ...pathname..../hncacb.xpk | xtrct.pl |tail 1563 8.727 44.306 105.548 -1.24897 {56.hn} {?} {56.n} 1564 8.442 55.092 105.429 1.65900 {68.hn} {?} {68.n} 1565 8.436 45.884 105.426 -1.15888 {68.hn} {?} {68.n} 1566 8.440 45.299 105.400 18.89294 {68.hn} {?} {68.n} 1568 8.440 39.485 105.365 -3.13722 {68.hn} {?} {68.n} 1570 8.461 45.848 105.132 -1.31767 {68.hn} {?} {68.n} 1577 8.183 45.447 104.417 27.68684 {71.hn} {?} {71.n} 1578 8.181 42.526 104.406 -3.45622 {71.hn} {?} {71.n} 1579 8.185 56.074 104.352 3.44162 {71.hn} {?} {71.n} 1581 8.181 46.104 104.357 -1.32899 {71.hn} {?} {71.n} ------------------------------------------------------------------- extracts the peak number, the proton, carbon and nitrogen chemical shifts, the peak intensity and the assignments, if any, from the .xpk file; the .xpk file had 27 entries for each xpk and could not fit on a screen w/ 80 characters. The above output could be further piped into other programs or unix commands such as lp, grep, sort etc. 1) similarly, "assgndXpks.awk" will find all records w/ assignments; usage: awk -f assgndXpks.awk xpkFile OR cat xpkFile|awk -f assgndXpks.awk 2) assgndXpkList{,1} are c-shell scripts that work, respectively, w/ an .xpk file given as an argument and w/ streaming data from stdin; they print to stdout a single column of the xpk numbers in the .xpk file that have assignment boxes filled in; the output is a numerically sorted list. 3) cunt{1,2}multiAssgn help extract records in an .xpk file w/ multiple assignments in them. 4) fmtPpm.awk is a useful script which takes a ppm.out file (from nmrview) and turns it into a 'horizontal' table (i.e. with all the chemical shifts for a given residue on a single line). 5) rmmbrStrpsByXpk.awk helps manage recording/remembering the strips in the strips and strips2 utilities of nmrview |
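The tolerance matching described in the genXpk entry for assgnHsqc1Hsqc2v0.pl (transfer an assignment only when the peaks match in frequency and the peak in the 2nd list is unassigned) can be sketched roughly as below. This is a hedged Python illustration, not a port of the perl script: the in-memory peak records, the field names ("h", "n", "assign"), the use of "?" for an empty assignment box (borrowed from the xtrct.pl output shown in the table), and the rule of skipping ambiguous matches are all assumptions. Only the default tolerances h1tol=0.05 and n15tol=0.3125 come from the usage line.

```python
def transfer_assignments(src_peaks, dst_peaks, h1tol=0.05, n15tol=0.3125):
    """Copy assignments from src_peaks onto matching, unassigned dst_peaks.

    Peaks are dicts with 1H shift "h", 15N shift "n" and label "assign";
    "?" marks an unassigned peak. A new list is returned and neither
    input list is modified.
    """
    updated = []
    for dst in dst_peaks:
        new = dict(dst)
        if dst["assign"] == "?":  # never touch peaks that are already assigned
            matches = [
                s for s in src_peaks
                if s["assign"] != "?"
                and abs(s["h"] - dst["h"]) <= h1tol
                and abs(s["n"] - dst["n"]) <= n15tol
            ]
            if len(matches) == 1:  # assumption: leave ambiguous matches alone
                new["assign"] = matches[0]["assign"]
        updated.append(new)
    return updated


# Hypothetical data: an assigned hsqc list and a second, partly assigned list.
hsqc1 = [
    {"h": 8.440, "n": 105.400, "assign": "68"},
    {"h": 8.183, "n": 104.417, "assign": "71"},
]
hsqc2 = [
    {"h": 8.450, "n": 105.500, "assign": "?"},   # within tolerance of residue 68
    {"h": 8.190, "n": 104.400, "assign": "70"},  # already assigned, left as is
    {"h": 7.500, "n": 120.000, "assign": "?"},   # no match, stays unassigned
]
updated = transfer_assignments(hsqc1, hsqc2)
print([p["assign"] for p in updated])
```

Filtering a peak list against a reference list, as in item 1.1 of the same entry, would reuse the same two-axis tolerance test and simply keep or drop the record instead of copying the label.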