Last Update: 2024-11-29
Khosrow Khalifeh, Kamal Naderlu, Seyedeh Akram Shirdel and Sakineh Tofighi
Parsprot, a new algorithm for processing and analysing protein and nucleic acid codes in Bioinformatics
We have developed a new algorithm named "Parsprot" for processing and analyzing protein and nucleic acid sequences and their related structures, taking a raw sequence and a PDB file as input. Treating amino acid or nucleotide codes as letters, any given sequence can be broken into two-letter, three-letter or, in general, k-letter words, taking into account the degeneracy of sequential residues in the raw sequences of proteins and nucleic acids. The number of possible k-letter words is therefore 20^k for protein sequences and 4^k for nucleic acid sequences, where k is the number of letters in each word. The output of the algorithm can be used to extract sequence-based information about related proteins and genes, suitable for use in different areas of bioinformatics and in neural network computations. It is also possible to determine how many times each residue and each k-letter word occurs in the different secondary structural elements, as well as their interactions with neighboring residues in the 3D structure of the protein. Further refinement of the method and its broader applications are currently in progress. The program was developed as a Java web application, which provides a user-friendly interface.
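The word-counting idea described above can be illustrated with a minimal Java sketch. It is not the Parsprot implementation: the class name `KLetterWords`, the method names `possibleWords` and `countWords`, and the choice of an overlapping sliding window that advances one letter at a time are all assumptions made for illustration, since the abstract does not say how words are extracted.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration class; not part of Parsprot.
class KLetterWords {

    // Number of possible k-letter words over an alphabet of the given size:
    // 20^k for the 20 amino acid codes, 4^k for the 4 nucleotide codes.
    static long possibleWords(int alphabetSize, int k) {
        long n = 1;
        for (int i = 0; i < k; i++) {
            n = Math.multiplyExact(n, (long) alphabetSize); // throws on overflow
        }
        return n;
    }

    // Counts every k-letter word in the sequence.
    // Assumption: words overlap, i.e. the window slides by one letter.
    static Map<String, Integer> countWords(String sequence, int k) {
        if (k < 1) {
            throw new IllegalArgumentException("k must be at least 1");
        }
        Map<String, Integer> counts = new TreeMap<>(); // sorted by word
        for (int i = 0; i + k <= sequence.length(); i++) {
            counts.merge(sequence.substring(i, i + k), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(possibleWords(20, 2));    // prints 400
        System.out.println(possibleWords(4, 3));     // prints 64
        System.out.println(countWords("ATGATG", 2)); // prints {AT=2, GA=1, TG=2}
    }
}
```

A table of counts like this, computed per sequence, is one way such word frequencies could be turned into fixed-length feature vectors for the neural network computations mentioned in the abstract.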