Analysis software

From Biowerkzeug Wiki
Revision as of 15:47, 30 June 2008 by Oliver (talk | contribs) (basic analysis list (Woolf Wiki + changes))

Running simulations is often the easy part. The hard part is extracting meaningful information from gigabytes of trajectory data. This list can serve as a starting point. For most advanced uses, however, one will probably have to write analysis code in Python, Perl, Tcl, C/C++, bash, or any other language that "gets the job done".
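
As an illustration of what such hand-written analysis code can look like, here is a minimal Python sketch that computes an RMSD time series relative to the first frame. It rests on several assumptions that are not part of this page: frames are already in memory as lists of (x, y, z) tuples, all atoms are weighted equally, and no superposition (fitting) is performed. The function names and the toy trajectory are hypothetical.

```python
import math


def rmsd(frame_a, frame_b):
    """Root-mean-square deviation between two coordinate sets.

    No superposition is performed and all atoms are weighted equally.
    """
    if len(frame_a) != len(frame_b):
        raise ValueError("frames must contain the same number of atoms")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(frame_a, frame_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(frame_a))


def rmsd_series(frames):
    """RMSD of every frame relative to the first frame."""
    reference = frames[0]
    return [rmsd(reference, frame) for frame in frames]


# Hypothetical toy trajectory: 3 frames of 2 atoms each.
toy_trajectory = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)],
    [(0.0, 3.0, 4.0), (1.0, 0.0, 0.0)],
]

series = rmsd_series(toy_trajectory)
print(series)
```

For real trajectories one would normally read the frames with one of the libraries listed below and fit each frame to the reference before computing the RMSD.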

"Native" tools

Many of the MD packages come with their own analysis tools or scripting language. Sometimes it is possible to convert data formats between packages and use the other package's analysis tools.

Gromacs analysis tools
One of the strengths of Gromacs is that it comes with a large number of useful analysis tools that make many standard analysis tasks simple to perform
NAMD/VMD
VMD can be used through its GUI or scripted in Tcl to great effect
Charmm
Charmm is feature-rich, but its scripting language has a steep learning curve
LAMMPS/pizza
Pizza.py is a Python library geared towards output from LAMMPS

MD Analysis libraries

MDAnalysis
a Python library for analyzing DCD trajectories (in conjunction with a PSF topology file)
Amber/ptraj
command-line based analysis
MMTK
Another Python-based framework for analysis is the Molecular Modelling Toolkit. However, it does not natively read Charmm DCD files, which can make it cumbersome to use.
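
To illustrate the MDAnalysis entry above, the following sketch loads a PSF/DCD pair and collects the radius of gyration for each frame. It assumes a recent MDAnalysis release (the interface has changed since this page was written); the function name, the file names in the usage note, and the default selection string are hypothetical.

```python
def radius_of_gyration_series(psf_path, dcd_path, selection="protein"):
    """Return a list of (frame index, radius of gyration) tuples.

    Assumes a recent MDAnalysis release is installed. The import is done
    inside the function so that this module loads even without it.
    """
    import MDAnalysis as mda

    # Topology (PSF) first, then the trajectory (DCD).
    universe = mda.Universe(psf_path, dcd_path)
    atoms = universe.select_atoms(selection)

    series = []
    # Iterating over the trajectory updates the atom positions in place.
    for ts in universe.trajectory:
        series.append((ts.frame, atoms.radius_of_gyration()))
    return series
```

A call would look like `radius_of_gyration_series("system.psf", "trajectory.dcd")`, where both file names are placeholders for one's own simulation output.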


Specialized tools

HOLE
Oliver Smart's program to trace out pore surfaces and estimate single channel conductances.
CAVER
CAVER provides rapid, accurate, and fully automated calculation of pathways leading from buried cavities to the outside solvent in static and dynamic protein structures. The calculated pathways can be visualized in the graphics program PyMOL to examine the anatomy and dynamics of entrance tunnels. CAVER can analyse any molecular structure, including proteins, nucleic acids, and inorganic materials. It is available as an online version or PyMOL plugin, suitable for calculating pathways in discrete protein structures, and as a stand-alone version that can analyse trajectories from molecular dynamics simulations.
dssp
Definition of secondary structure of proteins given a set of 3D coordinates. The DSSP program defines secondary structure, geometrical features and solvent exposure of proteins, given atomic coordinates in Protein Data Bank format. The program does NOT PREDICT protein structure. According to the Science Citation Index (July 1995), the program has been cited in the scientific literature more than 1000 times.
STAMP
Structural Alignment of Multiple Proteins. STAMP is a package for the alignment of protein sequences based on three-dimensional (3D) structure. It provides not only multiple alignments and the corresponding "best-fit" superimpositions, but also a systematic and reproducible method for assessing the quality of such alignments. It also provides a method for scanning protein 3D structure databases. In addition to structure comparison, the STAMP package provides input for programs that display and analyse protein sequence alignments and tertiary structures. Please note that, although STAMP outputs a sequence alignment, it is a program for 3D structures, and NOT sequences.
swinker
finds and calculates helix hinges. It optionally finds the hinge point and calculates kink and swivel angles.
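
To make the notion of a kink angle concrete, here is a rough Python sketch that estimates the angle between the axes of the two helix segments on either side of a hinge. This is not swinker's algorithm. It assumes that each segment axis can be approximated by the vector between the centroids of its first and last four Cα atoms (roughly one helical turn each), and all function names are hypothetical.

```python
import math


def centroid(points):
    """Arithmetic mean of a list of (x, y, z) points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def segment_axis(ca_coords, window=4):
    """Crude helix axis: vector from the centroid of the first `window`
    C-alpha atoms to the centroid of the last `window` C-alpha atoms."""
    if len(ca_coords) < 2 * window:
        raise ValueError("segment too short for the chosen window")
    start = centroid(ca_coords[:window])
    end = centroid(ca_coords[-window:])
    return tuple(e - s for s, e in zip(start, end))


def angle_degrees(u, v):
    """Angle between two vectors in degrees."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0.0:
        raise ValueError("zero-length axis vector")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))


def kink_angle(ca_before, ca_after, window=4):
    """Angle between the axes of the segments before and after the hinge."""
    return angle_degrees(segment_axis(ca_before, window),
                         segment_axis(ca_after, window))


# Hypothetical test geometry: points on two straight lines 30 degrees apart.
tilt = math.radians(30.0)
before = [(0.0, 0.0, 1.5 * i) for i in range(8)]
after = [(1.5 * i * math.sin(tilt), 0.0, 12.0 + 1.5 * i * math.cos(tilt))
         for i in range(8)]
kink = kink_angle(before, after)
print(round(kink, 3))
```

The swivel angle additionally requires a reference direction perpendicular to the first axis and is not covered by this sketch.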

General purpose mathematical packages

Scientific Python and pylab
a MATLAB-like Python module with sophisticated analysis and plotting capabilities
MATLAB
Mathematica
R
R is a language and environment for statistical computing and graphics. R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, ...) and graphical techniques, and is highly extensible. One of R's strengths is the ease with which well-designed publication-quality plots can be produced, including mathematical symbols and formulae where needed. Great care has been taken over the defaults for the minor design choices in graphics, but the user retains full control.