TITLE
A High-Throughput DNA Sequence Aligner for Microbial Ecology Studies

AUTHOR(S)
Schloss, Patrick D.
PUB. DATE
December 2009
SOURCE
PLoS ONE;2009, Vol. 4 Issue 12, p1
SOURCE TYPE
Academic Journal
DOC. TYPE
Article
ABSTRACT
As the scope of microbial surveys expands with the parallel growth in sequencing capacity, a significant bottleneck in data analysis is the ability to generate a biologically meaningful multiple sequence alignment. The most commonly used aligners have varying alignment quality and speed, tend to depend on a specific reference alignment, or lack a complete description of the underlying algorithm. The purpose of this study was to create and validate an aligner with the goal of quickly generating a high-quality alignment and having the flexibility to use any reference alignment. Using the simple nearest alignment space termination algorithm, the resulting aligner operates in linear time, requires a small memory footprint, and generates a high-quality alignment. In addition, the alignments generated for variable regions were of as high a quality as the alignment of full-length sequences. As implemented, the method was able to align 18 full-length 16S rRNA gene sequences and 58 V2 region sequences per second to the 50,000-column SILVA reference alignment. Most importantly, the resulting alignments were of a quality equal to SILVA-generated alignments. The aligner described in this study will enable scientists to rapidly generate robust multiple sequence alignments that are implicitly based upon the predicted secondary structure of the 16S rRNA molecule. Furthermore, because the implementation is not connected to a specific database, it is easy to generalize the method to reference alignments for any DNA sequence.
ACCESSION #
58081296

Tags: NUCLEOTIDE sequence;  RNA;  NUCLEIC acids -- Analysis;  ALGORITHMS;  MICROBIAL ecology;  DNA;  GENETIC polymorphisms
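
The throughput figures in the abstract can be turned into a rough run-time estimate. The sketch below is only a back-of-the-envelope illustration: the two rates come from the abstract, while the function name `estimated_seconds`, the 100,000-sequence workload, and the reading of "linear time" as linear in the number of sequences are assumptions made here, not details from the article.

```python
# Rates reported in the abstract (sequences aligned per second against the
# 50,000-column SILVA reference alignment).
FULL_LENGTH_RATE = 18.0  # full-length 16S rRNA gene sequences per second
V2_REGION_RATE = 58.0    # V2 region sequences per second


def estimated_seconds(n_sequences: int, rate_per_second: float) -> float:
    """Estimate run time in seconds.

    Assumption (hypothetical helper, not from the article): run time grows
    linearly with the number of sequences at a constant rate.
    """
    if n_sequences < 0:
        raise ValueError("n_sequences must be non-negative")
    if rate_per_second <= 0:
        raise ValueError("rate_per_second must be positive")
    return n_sequences / rate_per_second


if __name__ == "__main__":
    # The 100,000-sequence workload is an illustrative assumption.
    for label, rate in (("full-length", FULL_LENGTH_RATE),
                        ("V2 region", V2_REGION_RATE)):
        minutes = estimated_seconds(100_000, rate) / 60
        print(f"{label}: about {minutes:.1f} minutes for 100,000 sequences")
```

Actual run times would depend on hardware, sequence length, and the reference alignment used, so these numbers are best treated as order-of-magnitude guidance.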

 

Related Articles

  • DNA meets its match. Hayden, Thomas // U.S. News & World Report;2/24/2003, Vol. 134 Issue 6, p45 

    Discusses the relationship between deoxyribonucleic acid (DNA) and ribonucleic acid (RNA). Consideration of RNA interference, a process in which RNA blocks the flow of orders from DNA; Characteristics of epigenetics, the ways in which cells can alter the function of their genes without altering...

  • Phylogenetic Inference with Weighted Codon Evolutionary Distances. Criscuolo, Alexis; Michel, Christian J. // Journal of Molecular Evolution;Apr2009, Vol. 68 Issue 4, p377 

    We develop a new approach to estimate a matrix of pairwise evolutionary distances from a codon-based alignment based on a codon evolutionary model. The method first computes a standard distance matrix for each of the three codon positions. Then these three distance matrices are weighted...

  • A "Nano-Dial" Molecular Computing Model Based on Circular DNA. Cheng Zhang; Jing Yang; Jin Xu; Shudong Wang // Current Nanoscience;Jun2010, Vol. 6 Issue 3, p285 

    A novel molecular computing model based on circular DNA was developed to solve a 3-coloring graph problem. This computing model uses circular DNA and works as to dial a number. The method of selecting true solutions is similar to dialing on a telephone. Moreover, the key methods in this model...

  • An Index Based K-Partitions Multiple Pattern Matching Algorithm. Bhukya, Raju; Somayajulu, D. V. L. N. // International Journal of Network Security (2152-5064);Apr2011, Vol. 2 Issue 2, p18 

    The study of pattern matching is one of the fundamental applications and an emerging area in computational biology. Searching DNA-related data is a common activity for molecular biologists. In this paper we explore the applicability of a new pattern matching technique called Index based K-partition...

  • Human F7 sequence is split into three deep clades that are related to FVII plasma levels. Sabater-Lleal, Maria; Soria, José Manuel; Bertranpetit, Jaume; Almasy, Laura; Blangero, John; Fontcuberta, Jordi; Calafell, Francesc // Human Genetics;Feb2006, Vol. 118 Issue 6, p741 

    It is widely accepted that FVII levels are strongly, consistently, and independently related to cardiovascular risk. These levels are influenced by genetic and environmental factors. Among the genetic factors, only a limited number of polymorphisms in the F7 gene have been reported, and they...

  • HapMap Complete. Schmidt, Charles W. // Environmental Health Perspectives;Oct2005, Vol. 113 Issue 10, pA662 

    This article describes the International HapMap Project, a consortium of researchers and funding agencies from the U.S., Japan, China, Nigeria, Canada, and Great Britain, which is set to release an enhanced version of its haplotype map. The HapMap currently characterizes a total of 4 million...

  • Inferring haplotypes at the NAT2 locus: the computational approach. Sabbagh, Audrey; Darlu, Pierre // BMC Genetics;2005, Vol. 6, p1 

    Background: Numerous studies have attempted to relate genetic polymorphisms within the N-acetyltransferase 2 gene (NAT2) to interindividual differences in response to drugs or in disease susceptibility. However, genotyping of individual single-nucleotide polymorphisms (SNPs) alone may not always...

  • Complete Mitochondrial DNA Sequence of Conger myriaster (Teleostei: Anguilliformes): Novel Gene Order for Vertebrate Mitochondrial Genomes and the Phylogenetic Implications for Anguilliform Families. Inoue, Jun G.; Miya, Masaki; Tsukamoto, Katsumi; Nishida, Mutsumi // Journal of Molecular Evolution;Apr2001, Vol. 52 Issue 4, p311 

    The complete nucleotide sequence of the mitochondrial genome was determined for a conger eel, Conger myriaster (Elopomorpha: Anguilliformes), using a PCR-based approach that employs a long PCR technique and many fish-versatile primers. Although the genome [18,705 base pairs (bp)] contained the...

  • Bio-Cryptography: A Possible Coding Role for RNA Redundancy. Regoli, M. // AIP Conference Proceedings;3/10/2009, Vol. 1101 Issue 1, p368 

    The RNA-Crypto System (shortly RCS) is a symmetric key algorithm to cipher data. The idea for this new algorithm starts from the observation of nature. In particular from the observation of RNA behavior and some of its properties. The RNA sequences have some sections called Introns. Introns,...
