Reimplementation of the DSSP program in Python

Implementation of the DSSP approach for secondary structure assignment

An attempt to implement the historic 1983 version of DSSP (Define Secondary Structure of Proteins), as described in: Kabsch W, Sander C. Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers. 1983;22(12):2577-2637. PMID: 6667333; UI: 84128824.

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

A 1est.dssp file is also included. It is genuine output from the original DSSP program, provided for comparison.
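
For such a comparison, the per-residue secondary structure codes can be extracted from a DSSP file. The sketch below is not part of this project. It assumes the classic fixed-width DSSP layout (residue records after the `  #  RESIDUE` header line, amino acid in column 14, structure code in column 17), and the function names `read_ss` and `agreement` are hypothetical.

```python
def read_ss(lines):
    """Return the secondary structure string from the lines of a DSSP file."""
    codes = []
    in_residues = False
    for line in lines:
        if line.startswith("  #  RESIDUE"):
            in_residues = True          # residue records start after this header
            continue
        if not in_residues or len(line) < 17:
            continue
        if line[13] == "!":             # chain-break marker, not a real residue
            continue
        ss = line[16]
        codes.append(ss if ss != " " else "-")   # blank = no assigned structure
    return "".join(codes)


def agreement(ss_a, ss_b):
    """Fraction of positions with identical codes; lengths must match."""
    if len(ss_a) != len(ss_b):
        raise ValueError("assignments cover a different number of residues")
    if not ss_a:
        return 1.0
    return sum(a == b for a, b in zip(ss_a, ss_b)) / len(ss_a)
```

With files on disk, `read_ss(open("1est.dssp"))` would give the reference assignment, which could then be passed to `agreement` together with an assignment produced by this program, provided that output was saved in the same layout.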

Prerequisites

You need Python 3 and Conda on your machine; Conda is used to set up the environment in which DSSP runs. This program has been tested with Python 3.7.4.

This program uses only the Python standard library, following the "batteries included" principle, except for the experimental PyMOL library, which is obtained from the schrodinger Conda channel.

Installing

Use Conda to create the environment:

conda env create --file environment.yml

Activate the environment:

conda activate dssp

Running

You should now be ready to run the program, for example:

python3 src/dssp.py data/1est.pdb

For the extended output, add the -v (or --verbose) flag:

python3 src/dssp.py data/1est.pdb -v

To get usage help, run:

python3 src/dssp.py data/1est.pdb -h
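
The contents of src/dssp.py are not shown here. As an illustration only, a command line with one positional PDB file and a -v/--verbose flag could be declared with the standard `argparse` module roughly as follows; the argument name `pdb_file`, the help strings, and the function `build_parser` are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser with one positional input file and a verbose flag."""
    parser = argparse.ArgumentParser(
        prog="dssp.py",
        description="Assign secondary structure to a PDB file (DSSP approach).",
    )
    # Hypothetical argument name; the real script may call it differently.
    parser.add_argument("pdb_file", help="path to the input PDB file")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="print the extended output")
    return parser


# Parse an explicit argument list instead of sys.argv, for demonstration.
args = build_parser().parse_args(["data/1est.pdb", "-v"])
print(args.pdb_file, args.verbose)
```

`argparse` adds the -h/--help option automatically, which is why no explicit help flag is declared.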

Output

To understand the output, here is a short description of the DSSP output fields, taken from the official website: https://swift.cmbi.umcn.nl/gv/dssp/

  • RESIDUE

two columns of residue numbers. First column is DSSP's sequential residue number, starting at the first residue actually in the data set and including chain breaks; this number is used to refer to residues throughout. Second column gives crystallographers' 'residue sequence number','insertion code' and 'chain identifier' (see protein data bank file record format manual), given for reference only.

  • AA

one letter amino acid code

  • BP1

residue number of first bridge partner

  • TCO

cosine of angle between C=O of residue I and C=O of residue I-1. For α-helices, TCO is near +1, for β-sheets TCO is near -1. Not used for structure definition.

  • KAPPA

virtual bond angle (bend angle) defined by the three Cα atoms of residues I-2,I,I+2. Used to define bend (structure code 'S').

  • ALPHA

virtual torsion angle (dihedral angle) defined by the four Cα atoms of residues I-1,I,I+1,I+2. Used to define chirality (structure code '+' or '-').

  • PHI PSI

IUPAC peptide backbone torsion angles

  • X-CA Y-CA Z-CA

echo of Cα atom coordinates
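
The geometric fields above (TCO, KAPPA, ALPHA) reduce to a cosine, an angle, and a dihedral between atom positions. The following is a minimal standard-library sketch, not the code from this repository. The function names and the use of `(x, y, z)` tuples are assumptions, and KAPPA is taken here as the angle between the vectors Cα(I-2)→Cα(I) and Cα(I)→Cα(I+2), so that a straight chain gives 0°.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _norm(a):
    return math.sqrt(_dot(a, a))


def _cosine(u, v):
    return _dot(u, v) / (_norm(u) * _norm(v))


def tco(c_i, o_i, c_prev, o_prev):
    """Cosine of the angle between C=O of residue I and C=O of residue I-1."""
    return _cosine(_sub(o_i, c_i), _sub(o_prev, c_prev))


def kappa(ca_m2, ca_i, ca_p2):
    """Bend angle in degrees from the Cα atoms of residues I-2, I, I+2."""
    c = _cosine(_sub(ca_i, ca_m2), _sub(ca_p2, ca_i))
    c = max(-1.0, min(1.0, c))          # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(c))


def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in degrees defined by four points."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n2 = _cross(b2, b3)
    x = _dot(_cross(b1, b2), n2)
    y = _norm(b2) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))


def alpha(ca_m1, ca_i, ca_p1, ca_p2):
    """Virtual torsion angle from the Cα atoms of residues I-1, I, I+1, I+2."""
    return dihedral(ca_m1, ca_i, ca_p1, ca_p2)
```

The same `dihedral` helper could serve for PHI and PSI by passing the appropriate backbone atoms instead of Cα positions.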

Author

License

This project is licensed under the CeCILL-C License; see the CeCILL-B FREE SOFTWARE LICENSE AGREEMENT for details.