# Implementation of the DSSP approach for secondary structure assignment

An attempt to implement the historic 1983 version of DSSP (Define Secondary Structure of Proteins):

Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Kabsch W, Sander C. Biopolymers. 1983;22:2577-2637. PMID: 6667333; UI: 84128824.

## Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.

Note that a 1est.dssp file is present. It is real output from the original DSSP program, provided for comparison.

### Prerequisites

You need **Python 3** and **Conda** on your machine; Conda is used to set up the runtime environment for DSSP. This program has been tested with Python 3.7.4.

Following the "batteries included" principle, this program only uses the standard Python library, except for the experimental PyMOL library, which is obtained from the schrodinger Conda channel.

### Installing

Use conda to create the environment

```
conda env create --file environment.yml
```

Activate the environment

```
conda activate dssp
```

### Running

You should now be ready to run the program, for example with

```
python3 src/dssp.py data/1est.pdb
```

For the extended output, use the -v (or --verbose) flag

```
python3 src/dssp.py data/1est.pdb -v
```

To get usage help, run

```
python3 src/dssp.py data/1est.pdb -h
```

## Output

To help read the output, here is a short description of the DSSP columns, adapted from the official website: https://swift.cmbi.umcn.nl/gv/dssp/

* RESIDUE two columns of residue numbers. First column is DSSP's sequential residue number, starting at the first residue actually in the data set and including chain breaks; this number is used to refer to residues throughout.
Second column gives the crystallographers' 'residue sequence number', 'insertion code' and 'chain identifier' (see the Protein Data Bank file record format manual), given for reference only.
* AA one-letter amino acid code
* BP1 residue number of first bridge partner
* TCO cosine of the angle between C=O of residue I and C=O of residue I-1. For α-helices, TCO is near +1; for β-sheets, TCO is near -1. Not used for structure definition.
* KAPPA virtual bond angle (bend angle) defined by the three Cα atoms of residues I-2, I, I+2. Used to define bend (structure code 'S').
* ALPHA virtual torsion angle (dihedral angle) defined by the four Cα atoms of residues I-1, I, I+1, I+2. Used to define chirality (structure code '+' or '-').
* PHI PSI IUPAC peptide backbone torsion angles
* X-CA Y-CA Z-CA echo of Cα atom coordinates

## Author

* **FOREST Thomas** - *M2BI* - [Univ-Paris-Diderot.fr](https://www.univ-paris-diderot.fr/) - thomas.forest@etu.univ-paris-diderot.fr

## License

This project is licensed under the CeCILL-C License - see [CeCILL-B FREE SOFTWARE LICENSE AGREEMENT](http://cecill.info/licences/Licence_CeCILL-B_V1-en.html) for details
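
## Worked example for the TCO, KAPPA and ALPHA columns

The geometric columns described in the Output section can be illustrated with a few lines of standard-library Python. This is a minimal sketch, not the code in `src/dssp.py`: the function names (`tco`, `kappa`, `alpha`) and the plain `(x, y, z)` tuple representation are assumptions made for illustration. It also assumes that KAPPA is measured as the angle between the vectors Cα(I-2)→Cα(I) and Cα(I)→Cα(I+2), so that a perfectly straight chain gives 0°.

```python
import math


def sub(a, b):
    """Vector a - b for 3D points given as (x, y, z) tuples."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a):
    return math.sqrt(dot(a, a))


def cosine(u, v):
    """Cosine of the angle between u and v, clamped to [-1, 1]."""
    c = dot(u, v) / (norm(u) * norm(v))
    return max(-1.0, min(1.0, c))


def tco(c_i, o_i, c_prev, o_prev):
    """TCO: cosine of the angle between C=O of residue I and of residue I-1."""
    return cosine(sub(o_i, c_i), sub(o_prev, c_prev))


def kappa(ca_im2, ca_i, ca_ip2):
    """KAPPA in degrees from the Calpha atoms of residues I-2, I, I+2.

    Assumed convention: angle between (CA[I] - CA[I-2]) and (CA[I+2] - CA[I]).
    """
    return math.degrees(math.acos(cosine(sub(ca_i, ca_im2), sub(ca_ip2, ca_i))))


def alpha(ca_im1, ca_i, ca_ip1, ca_ip2):
    """ALPHA in degrees: dihedral over the Calpha atoms of I-1, I, I+1, I+2."""
    b1 = sub(ca_i, ca_im1)
    b2 = sub(ca_ip1, ca_i)
    b3 = sub(ca_ip2, ca_ip1)
    # atan2 form of the dihedral angle, which keeps the sign of the torsion
    y = norm(b2) * dot(b1, cross(b2, b3))
    x = dot(cross(b1, b2), cross(b2, b3))
    return math.degrees(math.atan2(y, x))


# Toy coordinates, not taken from 1est.pdb
straight = kappa((0, 0, 0), (1, 0, 0), (2, 0, 0))
right_angle = kappa((0, 0, 0), (1, 0, 0), (1, 1, 0))
quarter_turn = alpha((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
parallel_co = tco((0, 0, 0), (0, 0, 1), (3, 0, 0), (3, 0, 1))
```

With these toy coordinates, the straight chain has no bend, the second triple bends by a right angle, and the two parallel C=O vectors give a TCO of +1, as expected for a helix-like arrangement.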