Miriam Payá Milans
University of Tennessee, USA
Title: Exploring the design of RNA-Seq analysis pipeline
Biography
Abstract
Transcriptome analysis with RNA-Seq data is well established in model organisms, but data analysis in other species can be less straightforward. Compared to other kingdoms, far fewer genome sequencing projects have been carried out in plants, which makes the study of crop species more challenging. For example, in working with blueberries, we have more than one species of interest, fewer genomic resources than many model plant systems, and various levels of polyploidy. When developing a workflow of software tools to analyze these data, a researcher must choose among numerous algorithms at each step. We have explored some of the current options available to analyze RNA-Seq data in two situations: first, when the closest reference genome is from a different species, and second, when a polyploid species is being sequenced but the closest reference genome is from a diploid progenitor species. Results obtained with a related-species reference genome are compared against those obtained with de novo transcriptome assemblies. Further, comparisons are made among software choices for read correction, quality trimming, and read mapping. We conclude that different software packages and approaches influence RNA-Seq analysis, and we recommend selecting parameters that maximize the desired metrics when working with polyploid species and/or a distant reference genome.
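The recommendation to select parameters that maximize a desired metric can be illustrated with a minimal sketch. This is not the pipeline used in the study: the function name `best_configuration`, the `option_grid` layout, and the `score` callback are all hypothetical, and in practice `score` would run the chosen tools and return whichever metric the researcher cares about (for example, the fraction of reads mapped).

```python
from itertools import product


def best_configuration(option_grid, score):
    """Exhaustively try every combination of options and keep the best one.

    option_grid: dict mapping a pipeline step (hypothetical names, e.g.
                 "trimmer", "mapper") to the list of choices to try.
    score:       user-supplied callable that takes one configuration (a dict)
                 and returns a number, where higher is better.
    Returns a (configuration, score) tuple for the highest-scoring combination.
    """
    steps = list(option_grid)
    best = None
    for choices in product(*(option_grid[step] for step in steps)):
        config = dict(zip(steps, choices))
        value = score(config)
        # Keep the first configuration seen, then any strictly better one.
        if best is None or value > best[1]:
            best = (config, value)
    return best
```

An exhaustive search like this is only practical for a small number of steps and choices, since the number of combinations grows multiplicatively with each added step.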