Resumo: | The efficiency of protein synthesis is highly dependent on codon usage and codon context. Indeed, the choice of particular synonymous codons is constrained by neighbouring codons (codon context) to optimize mRNA decoding speed and accuracy. This is related to spatial (steric) effects created by the need to accommodate 3 tRNAs in the ribosome A-, P- and E-sites. Since these tRNAs interact with each other, with their cognate codons and with various structural domains of rRNAs, the structure of the 6-nucleotide RNA helix formed by the anticodon-codon interactions is strongly influenced by the type of codons and tRNAs present in the ribosome decoding centre. To ensure proper tRNA selection and correct codon decoding, the rRNA monitors the structure of the codon-anticodon RNA helix. We hypothesized that large-scale comparative analysis of 3 consecutive codons, corresponding to the ribosome A-, P- and E-site codons, would unveil novel codon biases and "bad" codon combinations that are error prone. For this, we have built a software package that counts codon triplets in complete assemblies of open reading frames (ORFeomes) and used the ORFeome sequences of 12 fungal species, including Aspergillus fumigatus, Saccharomyces cerevisiae, and Candida albicans, to validate our working hypothesis. We have used data mining methodologies to explore this large dataset of 220,000 combinations of 3 consecutive codons, and extracted the most biased contexts. The data showed that three-codon contexts are species-specific, although major context rules could also be found. Interestingly, biases introduced at DNA replication and transcription levels, namely trinucleotide repeats, play an important role in the evolution of ORFeomes. Candida albicans revealed unique features and very strong context biases. For example, codon triplet bias is much stronger in C. albicans than in other species and it has a very high number of consecutive codon repeats, which comprise up to 6% of the total ORFeome.
|
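The triplet counting described in the abstract can be illustrated with a minimal Python sketch. This is not the software package from the work itself. The function names (codon_triplets, count_triplets) are hypothetical. The sketch assumes that each ORF is supplied as an in-frame nucleotide string, that a trailing partial codon is dropped, and that stop codons are counted like any other codon. Each triplet is read 5' to 3', so its three codons correspond to the E-, P- and A-site positions in that order.

```python
from collections import Counter


def codon_triplets(orf):
    """Yield every run of 3 consecutive codons in one in-frame ORF.

    Each triplet is a tuple in 5'->3' order, i.e. (E-site, P-site, A-site).
    Assumption: the sequence starts in frame; a trailing partial codon is dropped.
    """
    seq = orf.upper()
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    # Slide a window of three codons along the ORF, one codon at a time.
    for i in range(len(codons) - 2):
        yield tuple(codons[i:i + 3])


def count_triplets(orfs):
    """Count codon triplets over an iterable of ORF sequences (a toy ORFeome)."""
    counts = Counter()
    for orf in orfs:
        counts.update(codon_triplets(orf))
    return counts


if __name__ == "__main__":
    # Toy input for illustration only, not real ORFeome data.
    toy_orfeome = ["ATGAAAAAAAAATAA", "atgaaaaaaGG"]
    for triplet, n in count_triplets(toy_orfeome).most_common():
        print("-".join(triplet), n)
```

Raw counts like these would still need to be compared against an expected frequency (for example, one derived from single-codon usage) before any context could be called biased. That statistical step is left out of the sketch.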