Reimplementation of the DSSP program in Python
Last commit by Thomas Forest (d82c83a75d), 5 years ago: Final commit for evaluation with report under /doc folder
An attempt to implement the historic 1983 version of DSSP (Define Secondary Structure of Proteins), as described in: Kabsch W, Sander C. Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers. 1983; 22: 2577-2637. PMID: 6667333; UI: 84128824.
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.
Note that a 1est.dssp file is present. It is a real output from the original DSSP program, used for comparison.
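To make that comparison concrete, here is a minimal sketch of how the secondary structure column could be read from a reference DSSP file and compared with another assignment. The function names `read_dssp_structure` and `agreement` are hypothetical and are not part of this project. The column positions (residue number in characters 6-10, amino acid in character 14, structure code in character 17) follow the fixed-width layout of the classic DSSP format; whether this program's own output uses exactly the same layout is an assumption.

```python
def read_dssp_structure(lines):
    """Return (residue_number, amino_acid, structure_code) for each residue.

    Assumes the classic fixed-width DSSP layout: the per-residue table
    starts after the line beginning with '  #  RESIDUE'.
    """
    records = []
    in_table = False
    for line in lines:
        if line.startswith("  #  RESIDUE"):
            in_table = True          # table header found, data follows
            continue
        if not in_table or len(line) < 17:
            continue                 # file header or truncated line
        if line[13] == "!":
            continue                 # chain break marker, not a residue
        records.append((line[5:10].strip(), line[13], line[16]))
    return records


def agreement(reference, predicted):
    """Fraction of positions where two structure strings agree."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(reference, predicted))
    return matches / len(reference)
```

A possible use is to open the reference file, pass the file object to `read_dssp_structure`, join the third element of each record into a string, and call `agreement` against the string produced from this program's output. A blank structure code means that no secondary structure element was assigned to the residue.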
You need Python 3 and Conda on your machine, as Conda is used to set up the environment in which DSSP runs. This program has been tested with Python 3.7.4.
This program only uses the Python standard library, in keeping with the "batteries included" principle, except for the experimental PyMOL library, which is obtained from the schrodinger Conda channel.
Use conda to create the environment:
conda env create --file environment.yml
Activate the environment:
conda activate dssp
You should now be ready to run the program, for example:
python3 src/dssp.py data/1est.pdb
For the extended output, add the -v (or --verbose) flag:
python3 src/dssp.py data/1est.pdb -v
To get usage help, run:
python3 src/dssp.py data/1est.pdb -h
To help you understand the output, here is a short description of the DSSP output columns, taken from the official website: https://swift.cmbi.umcn.nl/gv/dssp/
RESIDUE: two columns of residue numbers. The first column is DSSP's sequential residue number, starting at the first residue actually in the data set and including chain breaks; this number is used to refer to residues throughout. The second column gives the crystallographers' 'residue sequence number', 'insertion code' and 'chain identifier' (see the Protein Data Bank file record format manual), given for reference only.
AA: one-letter amino acid code.
BP1: residue number of the first bridge partner.
TCO: cosine of the angle between C=O of residue I and C=O of residue I-1. For α-helices, TCO is near +1; for β-sheets, TCO is near -1. Not used for structure definition.
KAPPA: virtual bond angle (bend angle) defined by the three Cα atoms of residues I-2, I, I+2. Used to define bend (structure code 'S').
ALPHA: virtual torsion angle (dihedral angle) defined by the four Cα atoms of residues I-1, I, I+1, I+2. Used to define chirality (structure code '+' or '-').
PHI, PSI: IUPAC peptide backbone torsion angles.
X-CA, Y-CA, Z-CA: echo of Cα atom coordinates.
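The geometric quantities described above (the TCO cosine, the bend angle and the virtual torsion angle, reported by DSSP as KAPPA and ALPHA) can be sketched with the standard library alone. This is a minimal illustration, not the code in src/dssp.py: the function names are hypothetical, and atoms are assumed to be given as (x, y, z) tuples. The bend angle is taken here as the angle between the vectors Cα(I-2)→Cα(I) and Cα(I)→Cα(I+2), so that a straight chain gives 0°, which is the convention used in the Kabsch and Sander definition of a bend.

```python
import math


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a):
    return math.sqrt(dot(a, a))


def cosine(u, v):
    """Cosine of the angle between two vectors, clamped to [-1, 1]."""
    return max(-1.0, min(1.0, dot(u, v) / (norm(u) * norm(v))))


def tco(c_i, o_i, c_prev, o_prev):
    """Cosine of the angle between C=O of residue I and C=O of residue I-1."""
    return cosine(sub(o_i, c_i), sub(o_prev, c_prev))


def kappa(ca_im2, ca_i, ca_ip2):
    """Bend angle in degrees from the Cα atoms of residues I-2, I, I+2."""
    return math.degrees(math.acos(cosine(sub(ca_i, ca_im2),
                                         sub(ca_ip2, ca_i))))


def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in degrees defined by four points.

    With the Cα atoms of residues I-1, I, I+1, I+2 this gives the virtual
    torsion angle; with backbone atoms it gives torsions such as phi or psi.
    """
    b0, b1, b2 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b0, b1), cross(b1, b2)   # normals of the two planes
    x = dot(n1, n2)
    y = dot(b1, cross(n1, n2)) / norm(b1)
    return math.degrees(math.atan2(y, x))
```

For example, `kappa((0, 0, 0), (1, 0, 0), (2, 0, 0))` is 0 for three collinear Cα atoms, and `dihedral` returns 0 for a cis arrangement of the four points.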
This project is licensed under the CeCILL-C License - see CeCILL-B FREE SOFTWARE LICENSE AGREEMENT for details.