Basecalling Using Hidden Markov Models

NCJ Number

309117

Journal

Journal of the Franklin Institute, Volume: 341, Issue: 1-2, Dated: 2004, Pages: 23-36

Author(s)

Petros Boufounos; Sameh El-Difrawy; Dan Ehrlich

Date Published

March 2004

Length

14 pages

Annotation

The authors discuss the suitability and potential benefits of hidden Markov models as a basecalling tool and present a very simple model that matches the overall performance of PHRED in a preliminary evaluation. They discuss in detail the motivation and theoretical background for their research, model selection for DNA sequencing, the generation of training data, model training, and model implementation and results.

Abstract

In this paper, the authors propose hidden Markov models (HMMs) to model electropherograms from DNA sequencing equipment and to perform basecalling. They model the state emission densities using artificial neural networks and modify the Baum–Welch re-estimation procedure to perform training. Moreover, they develop a method that exploits consensus sequences to label training data, thus minimizing the need for hand labeling. They propose the same method for locating an electropherogram in a longer DNA sequence. They also perform a careful study of the basecalling errors and propose alternative HMM topologies that might further improve performance. Their results demonstrate the potential of these models. Based on these results, the authors conclude by suggesting further research directions. (Published Index Provided)

Date Published: March 1, 2004
