Oxypred2: Oxygen binding proteins prediction and analysis

 


Help


 

 

 

  1. Sequence Submission

    Input Sequence:- The server provides a single option for submitting a query sequence: the user pastes the protein sequence into the input box provided.

  2. Prediction Approach:-

    a). Composition of amino acids:- The amino acid composition is the fraction of each of the 20 amino acids in a protein, so it describes the protein as a 20-dimensional vector. With this approach, the maximum accuracy was 83.45%, with sensitivity 93.77%, specificity 73.28% and MCC 0.77. In sub-class prediction of oxy-proteins, the maximum accuracies were 96.25%, 88.11%, 90.63%, 87.47%, 90.02% and 80.32%, with MCC 0.78, 0.53, 0.80, 0.30, 0.87 and 0.60, for leghemoglobin, myoglobin, hemerythrin, erythrocruorin, hemocyanin and hemoglobin respectively.
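    As an illustration of the 20-dimensional vector described above, the following is a minimal Python sketch, not the server's own code. The function name, the alphabetical residue order and the choice to ignore non-standard residues are assumptions made for this example.

```python
from collections import Counter

# The 20 standard amino acids, one-letter code.
# Alphabetical order is an assumption; the server's feature order is not documented here.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid as a 20-element list."""
    counts = Counter(sequence.upper())
    # Assumption: non-standard symbols (X, B, Z, gaps) are ignored.
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Toy sequence used only for demonstration.
vector = amino_acid_composition("MKVLAAGIVK")
print(len(vector))
```

    Each element of `vector` is the fraction of one residue type, so the elements sum to 1.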

    b). Dipeptide Composition:- The dipeptide composition describes the protein as a vector of 400 dimensions, one for each possible pair of adjacent amino acids. With the dipeptide composition method, the maximum accuracy was 82.69%, with sensitivity 93.19%, specificity 72.33% and MCC 0.76. In sub-class classification, the maximum accuracies were 92.60%, 87.70%, 87.00%, 82.99%, 77.74% and 87.93%, with MCC 0.86, 0.83, 0.32, 0.49, 0.54 and 0.87, for leghemoglobin, hemerythrin, erythrocruorin, myoglobin, hemoglobin and hemocyanin respectively.
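    The 400-dimensional dipeptide vector can be sketched in Python as follows. This is an illustrative example rather than the server's implementation; the function name, the ordering of the 400 pairs and the use of overlapping adjacent pairs are assumptions.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 20 x 20 = 400 ordered residue pairs (ordering is an assumption).
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return the fraction of each adjacent residue pair as a 400-element list."""
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    # Slide a window of width 2 along the sequence (overlapping pairs).
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # pairs with non-standard residues are skipped
            counts[pair] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard dipeptides")
    return [counts[dp] / total for dp in DIPEPTIDES]


# Toy sequence used only for demonstration.
dp_vector = dipeptide_composition("MKVLAAGIVK")
print(len(dp_vector))
```

    A sequence of length N yields N - 1 overlapping pairs, so short sequences produce a very sparse vector.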

    c). Position Specific Scoring Matrix (PSSM):-

    PSSM profiles were developed using the gpsr_1.0 package, which is freely available for Linux/Windows at http://www.imtech.res.in/raghava/gpsr/. The PSSM profiles of oxygen binding proteins were generated against the nr database downloaded from NCBI (ftp://ftp.ncbi.nih.gov/blast/db/). For a sequence of length N, an N x 20 position specific scoring matrix m is computed from the PSI-BLAST alignment output, where m[i, j] provides information on the evolutionary conservation of residue type j at sequence position i.

    All the profiles were generated using a suite of programs. First, seq2pssm_imp was used to calculate the PSSM in column format without any normalization. It generates the PSSM by running PSI-BLAST against the non-redundant protein database for a chosen number of iterations (e.g. 3) with a cut-off value of 0.001. Next, pssm_n2 was executed to normalize the PSSM profile using the formula (value - min)/(max - min). Finally, pssm_comp and col2svm computed the PSSM composition and wrote it in SVM_light input format (a 400-point vector representing the substitution rate of each amino acid into any other). The resulting 400-dimensional composition vector was provided as input to build the SVM.

    PSSM profiles of oxy and non-oxy proteins were generated for SVM prediction. The maximum accuracy was 89.20%, with sensitivity 97.33%, specificity 79.86% and MCC 0.85. In sub-class prediction of oxy-proteins, the accuracies were 94.22%, 93.46%, 94.84%, 90.81%, 96.87% and 89.74%, with MCC 0.82, 0.94, 0.94, 0.83, 0.86 and 0.72, for erythrocruorin, hemocyanin, hemerythrin, hemoglobin, leghemoglobin and myoglobin respectively.
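    The normalization and composition steps can be illustrated with a short Python sketch. This is not the gpsr code: the function names are hypothetical, taking min and max over the whole matrix is an assumption, and the composition shown here (summing the rows of each residue type and dividing by the sequence length) is one common way to obtain a 400-element vector that may differ from what pssm_comp does.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # column order is an assumption


def min_max_normalize(pssm):
    """Scale every score with (value - min) / (max - min).

    Assumption: min and max are taken over the whole N x 20 matrix.
    """
    flat = [value for row in pssm for value in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0.0 for _ in row] for row in pssm]
    return [[(value - lo) / (hi - lo) for value in row] for row in pssm]


def pssm_composition(sequence, pssm):
    """Collapse an N x 20 profile into a 400-element vector.

    Rows are grouped by the residue at each position and summed,
    giving a 20 x 20 table that is divided by N and flattened.
    """
    if not pssm or len(sequence) != len(pssm):
        raise ValueError("sequence and PSSM must be non-empty and equally long")
    index = {aa: k for k, aa in enumerate(AMINO_ACIDS)}
    sums = [[0.0] * 20 for _ in range(20)]
    for residue, row in zip(sequence.upper(), pssm):
        k = index.get(residue)
        if k is None:  # skip non-standard residues
            continue
        for j, score in enumerate(row):
            sums[k][j] += score
    n = len(sequence)
    return [value / n for block in sums for value in block]


# Toy 2 x 20 profile for the sequence "AC"; the scores are made up for illustration.
toy_pssm = [
    [5 if j == 0 else -3 for j in range(20)],
    [9 if j == 1 else -1 for j in range(20)],
]
normalized = min_max_normalize(toy_pssm)
features = pssm_composition("AC", normalized)
print(len(features))
```

    The flattened `features` list has the same length as the composition vector described above and could be written out in whatever format the classifier expects.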

  4. Prediction Results:

The prediction results are presented in a user-friendly format.

Summary of query sequence:- 
This part provides information about the submitted sequence, such as the sequence itself, its length and the date of scanning. It also lists the chosen prediction approaches.