Multi Hit, Dropoff Percentage and NCM-2: Three Improvements in BLAST
339-346
Correspondence
Dr Deepak Garg, Computer Science Department, Thapar University,
Patiala-147004, India. Email: deep108@yahoo.com, Ph: +91-9815599654
Various algorithms are used in medical processes to improve the speed, sensitivity and accuracy of the computations and analyses that the underlying experiments involve.
The aim of this paper is to suggest three improvements, namely Multi Hit, Dropoff percentage and NCM-2 in the BLAST algorithm.
BLAST (Basic Local Alignment Search Tool) is a popular tool for finding patterns in genomic sequences. As the volume of sequence data grows exponentially, so does the need for more advanced algorithms that improve the accuracy, speed and sensitivity of pattern discovery tools in bioinformatics.
First Improvement: Word matches in a pairwise sequence alignment are initialized using either a single-hit or a two-hit algorithm. If a 3-hit, or more generally an n-hit, algorithm is used instead, the results improve in general and improve dramatically for some specific species and sequences.
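The n-hit idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes exact k-letter word matches (real BLAST also scores neighbourhood words against a threshold), and the names `word_hits`, `n_hit_seeds`, `k`, `n` and `window` are hypothetical. A seed is reported only when n non-overlapping word hits fall on the same diagonal within a fixed window.

```python
from collections import defaultdict


def word_hits(query, subject, k=3):
    """Return (q_pos, s_pos) pairs where an exact k-letter word matches.

    Assumption: exact matches only, as a simplification of BLAST word hits.
    """
    index = defaultdict(list)
    for i in range(len(query) - k + 1):
        index[query[i:i + k]].append(i)
    hits = []
    for j in range(len(subject) - k + 1):
        for i in index.get(subject[j:j + k], ()):
            hits.append((i, j))
    return hits


def n_hit_seeds(query, subject, k=3, n=3, window=40):
    """Return (diagonal, q_pos) seeds supported by n hits on one diagonal.

    A seed is emitted at the query position of the n-th non-overlapping hit
    when all n hits lie within `window` residues of each other.
    """
    by_diag = defaultdict(list)
    for i, j in word_hits(query, subject, k):
        by_diag[j - i].append(i)  # hits on the same diagonal share j - i

    seeds = []
    for diag, positions in by_diag.items():
        positions.sort()
        chain = []  # non-overlapping hits currently inside the window
        for p in positions:
            if chain and p - chain[-1] < k:
                continue  # overlaps the previous hit, so it adds no support
            chain.append(p)
            while p - chain[0] > window:
                chain.pop(0)  # forget hits that fell out of the window
            if len(chain) >= n:
                seeds.append((diag, p))
                chain = []  # start collecting support for the next seed
    return seeds
```

With `n=1` the sketch behaves like single-hit seeding and with `n=2` like two-hit seeding, so the three settings can be compared on the same pair of sequences.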
Second Improvement: BLAST uses a drop-off score when calculating the highest-scoring pairs between two sequences. We propose a change to how the threshold score that determines whether a subsequence is included in the result is calculated: using a drop-off percentage instead of a drop-off score gives better results for some sequences.
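The difference between the two stopping rules can be sketched in Python as follows. The abstract does not give the exact formula, so this sketch assumes that the percentage is taken relative to the best score seen so far. It also assumes ungapped extension to the right with a simple match/mismatch score rather than a substitution matrix. The function `extend_right` and its parameters are hypothetical names.

```python
def extend_right(query, subject, q_start, s_start, *,
                 dropoff=None, dropoff_pct=None, match=1, mismatch=-1):
    """Ungapped right extension; returns (best_score, best_length).

    Exactly one of `dropoff` (absolute score) or `dropoff_pct`
    (percentage of the best score so far, an assumed definition)
    must be given.
    """
    if (dropoff is None) == (dropoff_pct is None):
        raise ValueError("give exactly one of dropoff or dropoff_pct")

    score = best = 0
    length = best_len = 0
    i, j = q_start, s_start
    while i < len(query) and j < len(subject):
        score += match if query[i] == subject[j] else mismatch
        length += 1
        if score > best:
            best, best_len = score, length
        # Allowed fall below the best score: fixed, or scaled by the best.
        if dropoff is not None:
            allowed = dropoff
        else:
            allowed = best * dropoff_pct / 100.0
        if best - score > allowed:
            break  # the score fell too far, so stop extending
        i += 1
        j += 1
    return best, best_len
```

Under this assumed definition, the tolerance grows with the quality of the alignment found so far: a long, high-scoring extension may cross a short run of mismatches that would stop a low-scoring one.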
Third Improvement: We propose an NCM-2 approach for normalizing BLAST values for simple regions. This approach is based on the natural properties of amino acid sequences. The algorithms were run on a Linux ES platform on a Compaq Presario with 2GB RAM and compared with the original BLAST.