Resumo: | Satellite DNA (satDNA) sequences constitute the major component of constitutive heterochromatin (CH) and have been considered one of the most fascinating and intriguing repetitive DNA elements of eukaryotic genomes. For many years, satDNA was considered “junk” and a transcriptionally inert fraction of eukaryotic genomes. Today it is generally accepted that satDNAs play important structural and functional roles in genomes, such as in genome architecture, chromosomal reorganization during evolution, and genome regulation, mainly driven by satellite transcripts or satellite non-coding RNAs (satncRNAs). Centromeric satncRNAs have been highlighted as crucial players in remodeling/CENP-A deposition and correct kinetochore assembly, both essential for proper chromosome segregation. The advent of Next Generation Sequencing (NGS) strategies and the overwhelming advances in genome sequencing technologies have provided a massive amount of sequencing data from hundreds of model and non-model species. Similarly, a growing number of bioinformatics tools and strategies have been established for the genome-wide identification and characterization of repetitive genome elements, namely satDNAs, collectively termed the satellitome. In recent decades, rodents belonging to the genus Peromyscus have emerged as model systems across a variety of scientific disciplines, including chromosomal evolution, and the release of the first Peromyscus representative genome sequence (from P. maniculatus) was the gateway to further dissecting the repetitive content of this genome. A bioinformatics pipeline based on the Tandem Repeats Finder algorithm and an integrated analysis of sequence similarity was thus defined in this work. It allowed the identification of 21 distinct families of large tandem repeats (array length larger than 2 kb) in the Peromyscus maniculatus genome, the majority of which are satellite- or transposable element-related families presenting a tandem organization. 
Two satDNA families orthologous to those of the rat and mouse genomes were recognized for the first time in the P. maniculatus genome: RNSAT1 and MMSAT4, respectively. The most prevalent satDNA family of the P. maniculatus satellitome corresponded to the previously described Peromyscus satDNA, PMSat, an AT-rich satDNA with a monomer size of 345 bp. Physical mapping of PMSat conducted in four Peromyscus species (P. eremicus, P. maniculatus, P. leucopus and P. californicus) revealed that PMSat is mainly located at the active centromeres and pericentromeric regions of all chromosomes, and at other constitutive heterochromatin-rich regions such as the telomeres and p-arms of some chromosomes. In all the studied species, PMSat showed a high degree of nucleotide conservation, despite the different number of PMSat copies per genome. Our results strongly suggest that the evolution of PMSat was driven by copy number fluctuations, and that the high similarity among Peromyscus and non-Peromyscus species may reflect non-concerted evolutionary events. Also, in light of the karyotype differences among these species, the many chromosome polymorphisms found in Peromyscus species, and the distinct patterns of CH and PMSat locations, we hypothesize that PMSat evolutionary molecular events may have promoted Peromyscus karyotype variation and genome evolution. Furthermore, PMSat copy number fluctuations, promoted by molecular mechanisms such as unequal crossing-over and rolling circle amplification, are clearly observed in the heterochromatin additions found in some species of the genus, namely in the P. eremicus genome. The transcriptional analysis of PMSat in proliferative cells from all the studied Peromyscus species uncovered a positive correlation between PMSat expression and DNA copy number in each genome. 
Despite the pronounced variation in transcript levels, the analysis of specific cell cycle phases revealed a similar transcriptional profile throughout the cell cycle: PMSat satncRNAs accumulate mostly at the G2/M transition and at the onset of mitosis, and are restricted to the nucleus. To gain more insight into the putative function(s) of PMSat transcripts, a functional assay based on PMSat RNA knockdown in proliferative P. eremicus cells was performed; it pointed to a potential role for these transcripts as key players in kinetochore assembly and centromeric function. Moreover, according to the putative transcription factor binding sites found in the PMSat monomer sequence, RNA polymerase II may be the enzyme that transcribes this satellite family under a variety of cell conditions, namely in response to cellular stresses. The work presented in this thesis uncovered PMSat not only as the trigger of Peromyscus karyotype evolution but also as a crucial element of centromeric function and chromosome segregation fidelity, roles that seem to be carried out by its derived satncRNAs.
|