Frameshift Analysis and Sequencing Error Detection
GENIO/frame detects frameshifts in human coding sequences with high accuracy. The frameshift analysis is based on the G+C-content-dependent interleaved 6-tuple (hexamer) entropy of coding and non-coding sequences. The system can be extended to other organism-specific word entropies. Our results show that GENIO/frame reports single and dual frameshifts with high specificity and, in most cases, locates them to within +/- 10 nucleotides. Sensitivity and specificity are nearly independent of G+C content. Sequences should be at least about 150 nt long for reliable frameshift detection. GENIO/frame has been evaluated on non-redundant coding sequences of 301 human genes. The results are shown in GENIO/frame GeneTest. I'm grateful for your comments and suggestions.
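To make the idea of frame-specific hexamer entropy more concrete, here is a minimal Python sketch. It is not the GENIO/frame implementation: the function names, the 150 nt window, the 30 nt step, and the reading of "interleaved" as hexamers sampled at every third position are all assumptions made for illustration. It only computes the empirical Shannon entropy of codon-aligned hexamers in each of the three frames, together with the G+C content of each window. A real detector would presumably compare such statistics against tables trained on coding and non-coding sequences, chosen according to G+C content.

```python
import math
from collections import Counter


def hexamer_entropy(seq, frame):
    """Shannon entropy (bits) of hexamers sampled every 3 nt, starting at `frame`.

    Hypothetical helper: sampling at every third position is an assumed
    reading of "interleaved" hexamers.
    """
    hexamers = [seq[i:i + 6] for i in range(frame, len(seq) - 5, 3)]
    if not hexamers:
        return 0.0
    total = len(hexamers)
    counts = Counter(hexamers)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def gc_content(seq):
    """Fraction of G and C nucleotides in `seq` (0.0 for an empty string)."""
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def frame_profile(seq, window=150, step=30):
    """Slide a window along `seq`; return (start, G+C, [entropy per frame]).

    `step` should be a multiple of 3 so that frame indices keep the same
    meaning from one window to the next. Both defaults are assumptions.
    """
    seq = seq.upper()
    profile = []
    for start in range(0, max(len(seq) - window, 0) + 1, step):
        chunk = seq[start:start + window]
        entropies = [hexamer_entropy(chunk, f) for f in range(3)]
        profile.append((start, gc_content(chunk), entropies))
    return profile
```

In a profile like this, a frameshift might show up as a position where the frame with the most coding-like statistic changes from one window to the next. With the short windows used here, raw entropy alone is a weak signal, so treat the sketch as an outline of the bookkeeping and not as a working detector.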