After you click the ENCODING link, you are taken to this page. There are two options for the encoding process: you can enter your protein sequences directly, or you can upload a .fasta file.
If you have a .fasta file, upload it by pressing the CHOOSE FILE button. A file browser window opens so that you can locate your .fasta file. Select the file and press the OPEN button to finish.
To type sequences manually, enter them in the text area. To enter more than one sequence, press Enter between sequences. As soon as you start typing, the encoding methods and the CALCULATE button appear.
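The one-sequence-per-line rule for the text area can be pictured with a short Python sketch. This is only an illustration, not the tool's own code: the function names `split_sequences` and `invalid_residues` are hypothetical, and restricting input to the 20 standard amino acid letters is an assumption made here for the example.

```python
# Assumption: only the 20 standard amino acid letters count as valid residues.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")

def split_sequences(text):
    """Split text-area input into sequences, one per non-empty line."""
    sequences = []
    for line in text.splitlines():
        seq = line.strip().upper()
        if seq:  # skip blank lines between sequences
            sequences.append(seq)
    return sequences

def invalid_residues(sequence):
    """Return the sorted characters that are not standard residue letters."""
    return sorted(set(sequence) - AMINO_ACIDS)

# Made-up input: three lines typed with Enter between them, one blank line.
typed = "MKTAYIAKQR\n\ngavlipfw\nMKXB\n"
sequences = split_sequences(typed)
problems = {s: invalid_residues(s) for s in sequences if invalid_residues(s)}
```

Here `sequences` holds the three typed sequences in upper case, and `problems` flags the last one because it contains letters outside the assumed alphabet.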
Select the encoding method from the drop-down menu and press the CALCULATE button to start the encoding process.
If you select the RCM method, you also need to set its rank.
The sequence table is not shown until a file has been chosen. After the upload, a table listing the sequences and their IDs appears. A search box at the right corner of the table lets you find an ID or a sequence. The PREVIOUS and NEXT buttons at the bottom of the table let you page through all the IDs and sequences in your .fasta file.
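To see how a .fasta file maps onto a table of IDs and sequences, the following minimal Python sketch may help. It is not the tool's actual parser: `parse_fasta` is a hypothetical name, the record IDs in the example are made up, and taking the first whitespace-delimited token of each header line as the ID is an assumption.

```python
def parse_fasta(text):
    """Return a list of (record_id, sequence) pairs from FASTA-formatted text."""
    records = []
    current_id = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if current_id is not None:
                records.append((current_id, "".join(chunks)))
            # Assumption: the ID is the first token after the '>' character.
            header = line[1:].split()
            current_id = header[0] if header else ""
            chunks = []
        elif current_id is not None:
            # Sequence lines of one record are joined into a single string.
            chunks.append(line.upper())
    if current_id is not None:
        records.append((current_id, "".join(chunks)))
    return records

# Made-up example with two records; the first sequence spans two lines.
example = """>sp|P12345|EXAMPLE_1 hypothetical protein
MKTAYIAKQR
QISFVKSHFS
>sp|P67890|EXAMPLE_2
GAVLIPFW
"""
records = parse_fasta(example)
```

Each pair in `records` corresponds to one row of the table: the ID in one column and the full sequence in the other.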
Each row has a checkbox on the left side of the table. Use these checkboxes to choose the individual sequences you want to encode.
To encode all the sequences in the .fasta file, press the SELECT ALL button. To select only the active page, check the box at the top left of the table.
After you click the SELECT ALL button, it changes to UNSELECT ALL.
After selecting the sequences, choose your encoding method and press the CALCULATE button to start the encoding process.
If you select the RCM method, you also need to set its rank.
When the encoding process finishes, the result table for your encoding method is displayed.
Finally, to download the results, press the DOWNLOAD button. A drop-down menu appears, from which you choose the output format. The supported formats are .txt, .xls, .csv, and .arff.
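As a rough picture of how the .csv and .arff outputs differ, here is a small Python sketch that writes the same numeric result rows in both layouts. The function names, the relation name `encoding`, the column names, and the values are all hypothetical; the exact layout of the files the tool produces may differ.

```python
import csv
import io

def to_csv(header, rows):
    """Render rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()

def to_arff(relation, header, rows):
    """Render rows as ARFF text, declaring every column as numeric."""
    lines = ["@relation " + relation, ""]
    lines += ["@attribute {} numeric".format(name) for name in header]
    lines += ["", "@data"]
    lines += [",".join(str(value) for value in row) for row in rows]
    return "\n".join(lines) + "\n"

# Made-up feature columns and values, for illustration only.
header = ["A", "C", "D"]
rows = [[0.5, 0.25, 0.25], [0.0, 1.0, 0.0]]
csv_text = to_csv(header, rows)
arff_text = to_arff("encoding", header, rows)
```

The CSV text carries the column names in its first line, while the ARFF text declares them as `@attribute` lines before the `@data` section.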
First, upload your .xls or .xlsx file, which can be in one of two formats. The first is the pathogen-host interaction data file that you download from phisto.org. The second is a file with two columns: the pathogen UniProt ID and the host UniProt ID, in that order.
A file browser window appears. Locate your file and press OPEN to finish the upload.
Your .xls or .xlsx file is displayed as a table with columns such as Pathogen, Pathogen ID, Taxonomy ID, Uniprot ID... The most important columns are PATHOGEN ID and HUMAN ID, which together show the interaction between pathogen and host.
To select all the rows on the active page, check the box at the top left of the table.
To download the rows you selected, press the DOWNLOAD button and choose the output format you want.
To download the protein sequences of the selected pathogens, press the PATHOGEN SEQUENCE DOWNLOAD button. A menu appears in which you can choose to download either the selected pathogens or all the pathogens in the file.
To download the protein sequences of the selected hosts, press the HOST SEQUENCE DOWNLOAD button. A menu appears in which you can choose to download either the selected hosts or all the hosts in the file.
You can convert .fasta, .xls, .xlsx, .csv, and .arff files to the .txt, .xls, .csv, and .arff formats. First, upload your file by clicking the CHOOSE FILE button.
A file browser window appears. Locate your file and press the OPEN button to finish the upload.
After the upload, select the new format for your file from the drop-down menu and press CONVERT to start the conversion.
When the conversion is finished, your new file is downloaded automatically.
Select the search type by pressing its header. The section opens immediately.
If you have the ID of a protein sequence, choose SINGLE SEARCH to see the whole sequence of the protein.
To download the protein sequence you searched for, click the DOWNLOAD button and select the output format.
If you have many sequence IDs, choose MULTIPLE SEARCH. While typing the IDs, press Enter to separate them.
To download the protein sequences you searched for, click the DOWNLOAD button and select the output format.
If you have a .txt, .xls, .xlsx, .csv, or .arff file that contains a single column, choose FILE UPLOAD SEARCH.
To download the protein sequences you searched for, click the DOWNLOAD button and select the output format.
Reference: M. Bhasin and G. P. S. Raghava, “Classification of nuclear receptors based on amino acid composition and dipeptide composition,” J. Biol. Chem., vol. 279, no. 22, pp. 23262–23266, 2004.
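As a rough illustration of amino acid composition, the Python sketch below computes the fraction of each of the 20 standard residues in a sequence. It is a minimal sketch, not the tool's implementation: the function name is hypothetical, and the alphabetical column order, the use of fractions rather than percentages, and the skipping of non-standard letters are assumptions.

```python
# Assumption: columns follow the alphabetical order of one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def amino_acid_composition(sequence):
    """Return a 20-element list with the fraction of each standard residue."""
    sequence = sequence.upper()
    # Assumption: letters outside the 20 standard residues are ignored.
    counted = [residue for residue in sequence if residue in AMINO_ACIDS]
    total = len(counted)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counted.count(aa) / total for aa in AMINO_ACIDS]

# Made-up four-residue sequence: two A, one C, one D.
vector = amino_acid_composition("AACD")
```

For this toy sequence the first three entries of `vector` (A, C, D) are 0.5, 0.25, and 0.25, and all other entries are zero.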
Reference: M. Bhasin and G. P. S. Raghava, “Classification of nuclear receptors based on amino acid composition and dipeptide composition,” J. Biol. Chem., vol. 279, no. 22, pp. 23262–23266, 2004.
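Dipeptide composition, which the paper cited above also covers, counts adjacent residue pairs instead of single residues and so yields 400 values. The Python sketch below shows the idea under stated assumptions: the function name is hypothetical, pairs are taken from overlapping windows, pairs containing a non-standard letter are skipped, and values are fractions of the counted pairs.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered pairs, in alphabetical order: AA, AC, AD, ...
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]

def dipeptide_composition(sequence):
    """Return a 400-element list with the fraction of each adjacent pair."""
    sequence = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(sequence) - 1):
        pair = sequence[i:i + 2]
        # Assumption: pairs with a non-standard letter are skipped.
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]

# Made-up sequence with the overlapping pairs AA, AA, and AC.
vector = dipeptide_composition("AAAC")
```

In this example the entry for AA is two thirds and the entry for AC is one third, because the sequence contains three overlapping pairs.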
Reference: I. Dubchak, I. Muchnik, S. R. Holbrook, and S. H. Kim, “Prediction of protein folding class using global description of amino acid sequence,” Proc. Natl. Acad. Sci. U. S. A., vol. 92, no. 19, pp. 8700–8704, 1995.
Reference: J. Shen, J. Zhang, X. Luo, and W. Zhu, “Predicting protein–protein interactions based only on sequences information,” Proc. …, vol. 104, no. 11, pp. 4337–4341, 2007.
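The conjoint triad idea associated with the paper cited above can be sketched as follows: each residue is mapped to one of seven classes, and every window of three consecutive classes is counted, which gives 7 x 7 x 7 = 343 features. The sketch is an illustration only. The function name is hypothetical, the seven-class grouping is written from memory of that paper and should be checked against it, and the normalization step described there is left out, so the values are raw counts. Whether the tool follows the same conventions is an assumption.

```python
from itertools import product

# Assumed seven-class grouping of the 20 residues (verify against the paper).
CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
RESIDUE_TO_CLASS = {
    aa: str(index)
    for index, group in enumerate(CLASSES, start=1)
    for aa in group
}
# All 343 class triads, from "111" to "777".
TRIADS = ["".join(t) for t in product("1234567", repeat=3)]

def conjoint_triad_counts(sequence):
    """Return 343 raw counts of class triads over overlapping windows."""
    classes = [RESIDUE_TO_CLASS.get(residue) for residue in sequence.upper()]
    counts = dict.fromkeys(TRIADS, 0)
    for i in range(len(classes) - 2):
        window = classes[i:i + 3]
        # Assumption: windows containing a non-standard letter are skipped.
        if None not in window:
            counts["".join(window)] += 1
    return [counts[t] for t in TRIADS]

# Made-up sequence; its classes are 1, 1, 1, 5, 6.
vector = conjoint_triad_counts("AGVKD")
```

For this toy sequence the triads 111, 115, and 156 each occur once, and every other entry is zero.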
Reference: J. Chen, H. Liu, J. Yang, and K. C. Chou, “Prediction of linear B-cell epitopes using amino acid pair antigenicity scale,” Amino Acids, vol. 33, no. 3, pp. 423–428, 2007.
Reference: S. Maetschke, M. Towsey, and M. Bodén, “Blomap: an Encoding of Amino Acids Which Improves Signal Peptide Cleavage Site Prediction,” in Proceedings of the 3rd Asia-Pacific Bioinformatics Conference, 2005, pp. 141–150.
Reference: J. Guo, Y. Lin, and Z. Sun, “A Novel Method for Protein Subcellular Localization: Combining Residue-Couple Model and SVM,” in Proceedings of the 3rd Asia-Pacific Bioinformatics Conference, 17–21 January 2005, Singapore, pp. 117–129.
Reference: W. R. Taylor, “The classification of amino acid conservation,” J. Theor. Biol., vol. 119, no. 2, pp. 205–218, Mar. 1986.
Reference: M. Gök and A. T. Özcerit, “A new feature encoding scheme for HIV-1 protease cleavage site prediction,” Neural Comput. Appl., vol. 22, no. 7–8, pp. 1757–1761, 2013.
Reference: J. Ruan, K. Wang, J. Yang, L. A. Kurgan, and K. Cios, “Highly accurate and consistent method for prediction of helix and strand content from primary protein sequences,” Artif. Intell. Med., vol. 35, no. 1–2, pp. 19–35, 2005.