MOSCA: an automated pipeline for integrated metagenomics and metatranscriptomics data analysis

Bibliographic Details
Main Author: Sequeira, J. C. (author)
Other Authors: Rocha, Miguel (author), Alves, M. M. (author), Salvador, Andreia Filipa Ferreira (author)
Format: conferencePaper
Language: eng
Published: 2019
Subjects:
Online Access: http://hdl.handle.net/1822/56365
Country: Portugal
Oai: oai:repositorium.sdum.uminho.pt:1822/56365
Description
Summary: Metagenomics (MG) and Metatranscriptomics (MT) approaches open new perspectives on the interpretation of biological systems composed of complex microbial communities. Handling large sequencing datasets, extracting the desired information, and interpreting the results are major challenges in meta-omics studies. Several bioinformatics pipelines exist for MG data analysis, and fewer for MT. To date, none performs a complete analysis integrating both MG and MT data, including the assembly of reads into contigs, functional and taxonomic annotation of identified genes, differential gene expression analysis, and the comparison of multiple samples. Here, we present Meta-Omics Software for Community Analysis (MOSCA), which was designed for this purpose. It integrates RNA-Seq analysis with whole-genome sequencing data used as the reference. Raw sequencing reads undergo preprocessing for quality trimming and rRNA removal and are then assembled into contigs, which are subsequently annotated against a reference database. MOSCA performs differential gene expression analysis and provides graphical visualization of the results and comparison of multiple samples. The pipeline was validated, and its reproducibility assessed, using simulated MG and MT datasets.
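
To make the differential gene expression step mentioned in the summary more concrete, the following is a minimal, generic sketch of comparing per-gene MT read counts between two samples. It is not MOSCA's implementation or API: the function names, the gene identifiers, the counts, and the choice of counts-per-million normalization with a pseudocount are all assumptions made for illustration. A real analysis would rely on replicates and a dedicated statistical method.

```python
import math


def counts_per_million(counts):
    """Scale raw read counts so that each sample sums to one million."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no mapped reads")
    return {gene: 1e6 * n / total for gene, n in counts.items()}


def log2_fold_change(cond_a, cond_b, pseudocount=1.0):
    """Return log2(cond_b / cond_a) per gene on normalized counts.

    The pseudocount (an illustrative choice) avoids division by zero
    for genes that are absent from one of the two samples.
    """
    cpm_a = counts_per_million(cond_a)
    cpm_b = counts_per_million(cond_b)
    genes = sorted(set(cpm_a) | set(cpm_b))
    return {
        g: math.log2(
            (cpm_b.get(g, 0.0) + pseudocount) / (cpm_a.get(g, 0.0) + pseudocount)
        )
        for g in genes
    }


# Hypothetical MT read counts per annotated gene in two samples.
sample_a = {"geneA": 100, "geneB": 300, "geneC": 600}
sample_b = {"geneA": 400, "geneB": 300, "geneC": 300}

lfc = log2_fold_change(sample_a, sample_b)
for gene, value in lfc.items():
    print(f"{gene}\t{value:+.2f}")
```

Positive values indicate higher relative expression in the second sample, negative values lower. Normalizing before taking the ratio keeps differences in sequencing depth between samples from being mistaken for expression changes.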