HPSandbox

HPSandbox is a package of Python modules and example scripts for experimenting with the two-dimensional HP lattice model of Dill and Chan. It is ideally used as a teaching tool, or as a way to quickly prototype 2D lattice simulation ideas with easy-to-use extensible code -- a "sandbox" if you will.

Download

HPSandbox.tar.gz (2.4M)

NOTE: HPSandbox.tar.gz takes up about 47M of total space when uncompressed, and takes a few minutes to unpack. Most of this disk space is needed to store all the native states of foldable sequences. These are kept in separate files, so you can use your OS's filesystem to search by filename.

HP_designing.tar.gz (6.3M) - uncompresses to (?)

This is a set of text files originally from Anders Irback's group. ( http://www.thep.lu.se/~anders/ ). These contain native states for all foldable sequences up to chain lengths of 25. See the README file that goes with this for its full pedigree (and parent reference).

For future reference, these archives can be downloaded at http://laplace.compbio.ucsf.edu/~voelzv/HPSandbox

To unpack:

>> gunzip HPSandbox.tar.gz
>> tar xvf HPSandbox.tar

Frequently Asked Questions (FAQ)

What can HPSandbox do?

So far, HPSandbox can either 1) enumerate all conformations, or 2) perform Monte Carlo "dynamics", for 2-dimensional, square-lattice "bead-on-a-string" type chains.

How long a chain can I simulate?

It depends on how long you are willing to wait. For instance, all conformations of 16-mers can be enumerated in a few minutes on a typical personal computer. Each increase in chain length adds a factor of about 2.7 to the calculation.
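
As a rough illustration of that scaling, the sketch below extrapolates enumeration time from a reference point. The 3-minute figure for 16-mers is only an assumed stand-in for "a few minutes", and the function name is hypothetical (it is not part of HPSandbox). Like the other sketches in this document, it is written for current Python rather than the Python 2.3/2.4 the package was tested with.

```python
def estimated_minutes(chain_length, ref_length=16, ref_minutes=3.0, growth=2.7):
    """Extrapolate enumeration time, assuming a constant growth factor per added bead.

    ref_minutes is an assumed value, not a measurement.
    """
    return ref_minutes * growth ** (chain_length - ref_length)


if __name__ == "__main__":
    for n in (16, 18, 20, 25):
        print(n, "beads: about", round(estimated_minutes(n), 1), "minutes")
```

Under these assumptions, every four extra beads cost roughly a factor of 2.7^4, or about 53, in run time.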

Can I use other potentials besides the HP model?

Sure! But you'll have to put it in yourself. The code is designed for HP sequences, so if you want to study another model that uses beads of only two flavors, you only need to modify the Monty.energy() method. More complicated models would require a more thorough, but straightforward, reworking of the code.
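
To give a sense of what such an energy function computes, here is a minimal standalone sketch of the standard HP energy: each pair of H beads that are lattice neighbors but not chain neighbors contributes one contact. The function names, the coordinate-list argument, and the epsilon parameter are illustrative assumptions; this is not the actual body or signature of Monty.energy().

```python
def hh_contacts(hpstring, coords):
    """Count non-bonded H-H nearest-neighbor pairs on the square lattice."""
    position = {xy: i for i, xy in enumerate(coords)}
    count = 0
    for i, (x, y) in enumerate(coords):
        if hpstring[i] != 'H':
            continue
        # Look only to the right and upward so each lattice pair is counted once.
        for neighbor in ((x + 1, y), (x, y + 1)):
            j = position.get(neighbor)
            if j is not None and hpstring[j] == 'H' and abs(i - j) > 1:
                count += 1
    return count


def hp_energy(hpstring, coords, epsilon=-1.0):
    """Energy = epsilon per H-H contact (epsilon < 0 favors contacts)."""
    return epsilon * hh_contacts(hpstring, coords)


if __name__ == "__main__":
    # A 4-bead chain folded into a square; beads 0 and 3 touch.
    square = [(0, 0), (0, 1), (1, 1), (1, 0)]
    print(hp_energy("HPPH", square))
```

A different two-flavor potential could be tried by changing which pairs are counted or by assigning different weights to H-H, H-P, and P-P contacts.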

What are the reference(s) for the HP 2D Model?

Lau and Dill (1989) is the first use of the HP model, while Dill et al. (1995) is a more comprehensive review.

What movesets are used for the Monte Carlo routines in HPSandbox?

The movesets are described in Dill et al. Protein Science 4: 561-602, 1995.

Can I freely use/modify/sell-for-profit this code?

Yes/yes/no. Feel free to modify this code as needed, as long as you keep it publicly available.

How can I share with the world the awesome modifications I've made to the basic HPSandbox code?

Contact me at my current email address: vvoelz.at.stanford.edu

I want to publish some research that used the HPSandbox code. How should I cite it?

HPSandbox. Copyright 2007, Vincent Voelz.

Documentation

HPSandbox Package Contents

Chain.py An object to represent the 2D HP lattice chain and its attributes, with method functions.
Config.py A data structure to hold configuration parameters.
Monty.py A collection of functions to perform Monte Carlo move-set operations on the Chain() object.
Replica.py A container object, to hold the Chain() and Monty() objects.
Trajectory.py A set of functions for creating, reading, writing, and organizing trajectory files.

/examples A directory of example scripts
/sequences Containing descriptions of the native states of foldable sequences:
/clist contact state lists for chain lengths 10 through 21
/conf coordinates (conformations) for chain lengths 10 through 19
COUNTS text file counts of all unique (nonsymmetric) conformations for a given chain length

Setup

This package has been tested with Python 2.3 and 2.4. Older/newer versions may work too, but haven't been tested.

In order to get these example scripts to work correctly, you need to set up the following:

*** IMPORTANT SETUP NOTES!!! ***

Examples

The best way to get started is to run the example scripts in /HPSandbox/examples. See the README file in that directory for test scripts and examples showing how to use the HPSandbox functions. Two example programs are included to illustrate the use of the HPSandbox objects:

enumerate.py

usage:  enumerate.py [configfile]
Try:    enumerate.py enumerate.conf

This program will read in an HP chain specified in the configure file,
and perform a full enumeration of conformational space.

The program tabulates:
    1) the density of states (in energies/contacts)
    2) the number density of unique contact states, i.e. disjoint collections
       of microscopic conformations all sharing a unique set of interresidue contacts. 

These values are printed as output.

mcrex.py

usage:  mcrex.py [configfile]
Try:    mcrex.py mcrex.conf

This program will read in an HP chain and run parameters specified in the configure file,
and perform a replica exchange Monte Carlo simulation.

For the example "mcrex.conf", an 11-mer sequence is simulated, and the program ends when
the native conformation (contact state) is found.
A directory of results is output to directory ./mcrex_data
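
For readers new to replica exchange, the sketch below shows the standard textbook criterion for swapping configurations between two replicas at different temperatures. The source does not state the exact rule mcrex.py uses, so treat this as an assumption; the function names and the k_boltzmann parameter are hypothetical.

```python
import math
import random


def swap_probability(energy_i, temp_i, energy_j, temp_j, k_boltzmann=1.0):
    """Standard replica-exchange acceptance: min(1, exp((beta_i - beta_j) * (E_i - E_j)))."""
    beta_i = 1.0 / (k_boltzmann * temp_i)
    beta_j = 1.0 / (k_boltzmann * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0 else math.exp(delta)


def attempt_swap(energy_i, temp_i, energy_j, temp_j, rng=random):
    """Return True if the two replicas should exchange configurations."""
    return rng.random() < swap_probability(energy_i, temp_i, energy_j, temp_j)


if __name__ == "__main__":
    # The hot replica (T=2) has found a lower energy than the cold one (T=1):
    print(swap_probability(-2.0, 1.0, -5.0, 2.0))
    # The reverse situation is accepted only some of the time:
    print(swap_probability(-5.0, 1.0, -2.0, 2.0))
```

With this rule, a swap that moves the lower-energy configuration to the colder replica is always accepted, which is what lets low-temperature replicas escape local traps.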

Class Library

The following documentation can be obtained using the pydoc standard module of python. For example:

>>> from HPSandbox import *
>>> import pydoc
>>> pydoc.help(Chain)


CLASSES
    Chain
    
    class Chain
     |  An object to represent the 2D HP lattice chain and its attributes, with method functions.
     |  
     |  Methods defined here:
     |  
     |  __init__(self, config)
     |      Initialize the Chain object.
     |  
     |  contactstate(self)
     |      Return the contact state of the chain as a list of (res1,res2) contacts (duples),
     |      where the residue numbering starts at 0.
     |  
     |  grow(self)
     |      Add a new link onto the chain vector, updating the coords and viability correspondingly.
     |  
     |  hpstr2bin(self)
     |      Convert a string of type 'HPHPHPPPHHP' to a list of 1s and 0s.
     |  
     |  lastvec(self)
     |      Report the last entry on the list.
     |  
     |  nonsym(self)
     |      Many of the conformations are related by rotations and reflections.
     |      We define a "non-symmetric" conformation to have the first direction '0'
     |      and the first turn be a '1' (right turn)
     |      
     |      nonsym() returns 1 if the vec list is non-symmetric, 0 otherwise
     |  
     |  shift(self)
     |      Shifts the chain vector to the 'next' list, according to an enumeration scheme where
     |      the most distal chain vector is incremented 0->1->2->3.  After 3, the most distal vector
     |      element is removed, and the next most distal element is incremented.  If there are multiple
     |      "3" vectors, this process is done recursively.
     |      
     |      Example:
     |          [0,0,0,0] --> [0,0,0,1] 
     |          [0,0,1,2] --> [0,0,1,3] 
     |          [0,1,0,3] --> [0,1,1] 
     |          [0,3,3,3] --> [1]
     |      
     |      This operation is very useful for enumerating the full space of chain conformations.
     |      shift()  will also update the coords and the viability, accordingly.
     |      
     |      RETURN VALUES
     |      
     |          returns 1 if it's the last possible "shift" --> i.e. if it's all 3's, the search is done
     |          returns 0 otherwise
     |  
     |  vec2coords(self, thisvec)
     |      Convert a list of chain vectors to a list of coordinates (duples).
     |  
     |  viability(self, thesecoords)
     |      Return 1 if the chain coordinates are self-avoiding, 0 if not.
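
The enumeration scheme described by grow(), shift(), vec2coords() and viability() can be sketched as standalone functions. This is a minimal reconstruction from the docstrings above, not the package's code. It assumes the {0,1,2,3} = {n,e,s,w} direction convention given in the move3 docstring, assumes that growing a chain appends a '0' link, and leaves the list empty once the search is done. It counts all self-avoiding conformations, including those related by symmetry.

```python
# Lattice steps for directions 0, 1, 2, 3 = north, east, south, west.
STEPS = {0: (0, 1), 1: (1, 0), 2: (0, -1), 3: (-1, 0)}


def vec2coords(vec):
    """Convert a list of chain vectors into a list of (x, y) bead coordinates."""
    coords = [(0, 0)]
    for direction in vec:
        dx, dy = STEPS[direction]
        x, y = coords[-1]
        coords.append((x + dx, y + dy))
    return coords


def viable(coords):
    """True if no lattice site is occupied twice (the chain is self-avoiding)."""
    return len(set(coords)) == len(coords)


def shift(vec):
    """Advance vec in place to the 'next' list; return True when the search is done."""
    while vec and vec[-1] == 3:
        vec.pop()              # drop trailing 3s, e.g. [0,1,0,3] -> [0,1,0]
    if not vec:
        return True            # the list was all 3s: nothing left to try
    vec[-1] += 1               # e.g. [0,1,0] -> [0,1,1]
    return False


def count_conformations(n_beads):
    """Count self-avoiding walks with n_beads beads by grow-and-shift enumeration."""
    if n_beads < 2:
        raise ValueError("need at least two beads")
    n_links = n_beads - 1
    vec = [0]
    count = 0
    done = False
    while not done:
        if viable(vec2coords(vec)):
            if len(vec) == n_links:
                count += 1
                done = shift(vec)
            else:
                vec.append(0)  # grow: add a new link (assumed to start at 0)
        else:
            done = shift(vec)  # prune: skip every extension of a dead end
    return count


if __name__ == "__main__":
    for n in range(2, 9):
        print(n, count_conformations(n))
```

Because a non-viable partial chain is shifted rather than grown, whole branches of overlapping conformations are skipped, which is what keeps the growth factor per bead well below 4.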


    Config

    class Config
     |  A data structure to hold all the configuration data for an HP model calculation.
     |  
     |  Methods defined here:
     |  
     |  __init__(self, filename=None)
     |      Initialize the configuration data structure, and default values.
     |  
     |  print_config(self)
     |  
     |  read_configfile(self, filename)
     |      Read in configuration parameters from file.  The file should have formatted rows
     |      consisting of two fields, separated by white-space (or any non-printing characters, like tabs):
     |      
     |      HPSTRING              PHPPHPPPHP 
     |      INITIALVEC           [0,0,0,0,0,0,0,0,0,0]
     |      ....
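
A reader for this two-field format can be sketched as follows. This is an illustration under stated assumptions, not the actual read_configfile() implementation: the function name is hypothetical, values that look like Python literals (such as the INITIALVEC list) are converted, and lines without two fields are silently skipped.

```python
import ast


def parse_config_lines(lines):
    """Parse 'KEY  value' rows into a dict, splitting on the first run of whitespace."""
    params = {}
    for line in lines:
        fields = line.split(None, 1)
        if len(fields) != 2:
            continue                      # blank or malformed row: skipped (assumption)
        key, value = fields[0], fields[1].strip()
        try:
            params[key] = ast.literal_eval(value)   # e.g. "[0,0,0]" -> [0, 0, 0]
        except (ValueError, SyntaxError):
            params[key] = value                     # plain strings such as PHPPHPPPHP
    return params


if __name__ == "__main__":
    example = [
        "HPSTRING              PHPPHPPPHP ",
        "INITIALVEC           [0,0,0,0,0,0,0,0,0,0]",
    ]
    print(parse_config_lines(example))
```

To read from a file, pass the open file object, which yields one line at a time.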


    DistRestraint

    class DistRestraint
     |  For now, this is a harmonic constraint over a squared distance D = d^2
     |  where D = sum_{i,j} d^2_ij over all contacts.
     |  
     |  Methods defined here:
     |  
     |  D(self, chain)
     |      Return the sum of squared-distances over the selected contacts.
     |  
     |  __init__(self, contacts, kspring)
     |      Initialize the DistRestraint object
     |  
     |  energy(self, chain)
     |      return the energy of the distance restraint
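
The restraint described above can be sketched as two small functions. The docstring does not give the prefactor or any reference distance, so the form E = 0.5 * kspring * D used here is an assumption, as are the function names and the use of a plain coordinate list in place of a Chain object.

```python
def sum_squared_distances(coords, contacts):
    """D = sum over the selected (i, j) contacts of the squared distance d_ij^2."""
    total = 0
    for i, j in contacts:
        (xi, yi), (xj, yj) = coords[i], coords[j]
        total += (xi - xj) ** 2 + (yi - yj) ** 2
    return total


def restraint_energy(coords, contacts, kspring):
    """Harmonic penalty on D; the 0.5 prefactor is assumed, not taken from the source."""
    return 0.5 * kspring * sum_squared_distances(coords, contacts)


if __name__ == "__main__":
    extended = [(0, 0), (0, 1), (0, 2), (0, 3)]
    print(restraint_energy(extended, [(0, 3)], kspring=2.0))
```

The energy falls as the restrained beads approach each other, so adding it to the HP energy biases the chain toward the chosen contacts.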


    Monty

    class Monty
     |  A collection of functions to perform Monte Carlo move-set operations on an HP lattice Chain object.
     |  
     |  Methods defined here:
     |  
     |  __init__(self, config, temp, chain)
     |      Initialize the Monte Carlo object...
     |  
     |  energy(self, chain)
     |      Calculate potential energy of the chain.
     |  
     |  metropolis(self, replica)
     |      Accept Chain.nextvec over Chain.vec according to a Metropolis criterion.
     |  
     |  move1(self, replica)
     |      Apply moveset 'MC1' to the chain:
     |      (i)  three-bead flips
     |      (ii) end flips
     |      
     |      REFERENCE: Dill and Chan, 1994, 1996.
     |  
     |  move2(self, replica)
     |      Apply moveset 'MC2' to the chain:
     |      (i)   three-bead flips
     |      (ii)  end flips
     |      (iii) crankshaft moves
     |      (iv)  rigid rotations
     |      
     |      REFERENCE:  Dill and Chan, 1994, 1996
     |  
     |  
     |  move3(self, replica)
     |      Apply moveset 'MC3' to the chain.
     |      This is just a simple set to change the direction of a single chain link.
     |      Example:
     |          [0,0,0,0,0] --> [0,0,1,0,0]
     |      where {0,1,2,3}={n,e,s,w} direction
     |      
     |      About 5% viable moves are expected.
     |  
     |  move4(self, replica)
     |      Apply moveset 'MC4' to the chain:
     |      This is another very simple moveset, which just changes one angle in a rigid rotation.
     |      Like 'MC3', this generates about 5% viable moves.
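
The combination of a simple move proposal with a Metropolis test can be sketched as below. This is a standalone illustration, not the Monty implementation: the function names and signatures are hypothetical, and choosing a new direction that differs from the old one is an assumption about how an 'MC3'-style move picks its replacement.

```python
import math
import random


def propose_mc3(vec, rng=random):
    """Return a copy of vec with one randomly chosen link pointed in a new direction."""
    new_vec = list(vec)
    i = rng.randrange(len(new_vec))
    # Pick among the three directions other than the current one (assumption).
    new_vec[i] = rng.choice([d for d in range(4) if d != new_vec[i]])
    return new_vec


def metropolis_accept(old_energy, new_energy, temp, k_boltzmann=1.0, rng=random):
    """Accept downhill moves always, uphill moves with probability exp(-dE / kT)."""
    delta = new_energy - old_energy
    if delta <= 0:
        return True
    return rng.random() < math.exp(-delta / (k_boltzmann * temp))


if __name__ == "__main__":
    rng = random.Random(0)
    print(propose_mc3([0, 0, 0, 0, 0], rng))
    print(metropolis_accept(-2.0, -1.0, temp=0.5, rng=rng))
```

In a full simulation the proposed vector would first be checked for self-avoidance, and only viable proposals would be passed to the Metropolis test.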


    Replica

    class Replica
     |  A container object, to hold the Chain() and Monty() objects
     |  
     |  Methods defined here:
     |  
     |  __init__(self, config, repnum)
     |      Initialize the Replica() object.


    Trajectory

    class Trajectory
     |  A set of functions for creating, reading, writing, and organizing trajectory files
     |  
     |  Methods defined here:
     |  
     |  __init__(self, replicas, config)
     |      Initialize the trajectory object
     |  
     |  cleanup(self, replicas)
     |      Write any remaining points in the trajectory and energy buffers to file,
     |      and close any open file handles.
     |  
     |  dump_enequeue(self, replica)
     |      Dumps the queued energy values to the respective files and clears them for further use.
     |  
     |  dump_trjqueue(self, replica)
     |      Dump the queue to the respective files and clear them for future use.
     |  
     |  mkdir(self, pathname)
     |      Automatically create directory if it doesn't exist
     |  
     |  queue_ene(self, replica)
     |      Queue an energy value to the buffer, for writing to file.
     |  
     |  queue_trj(self, replica)
     |      Queue a trajectory point to the buffer for writing to file
     |  
     |  write_eneheader(self, filename, replica)
     |      Write column headers for the energy file.