Metabarcoding (identification of the plant, animal, and fungal taxa present in an environmental sample) is rapidly gaining importance in ecology, food safety, pest identification, and disease surveillance. It has a compelling advantage over traditional approaches for obtaining data on species distributions; however, it is often difficult to detect all the species present in a bulk sample using high-throughput sequencing (HTS). This can be attributed in part to the shorter read lengths that most HTS instruments generate. Moreover, most HTS platforms are not portable, which makes in situ field-based sequencing infeasible. Oxford Nanopore sequencing platforms such as the MinION are an exception: they are portable and provide longer reads, albeit with rather high error rates (~12-15%). We used a freshwater mock community of 50 Operational Taxonomic Units (OTUs) to test the capacity of the Oxford Nanopore MinION, coupled with a rolling circle amplification protocol, to provide long-read metabarcoding results. We also propose a new Python pipeline that explores error profiles of nanopore consensus sequences, mapping accuracy, and overall community representation within a complex bulk sample. Using our molecular and bioinformatics workflow, we were able to estimate the diversity of the tested freshwater mock community with an average sequence accuracy of >99% for 1D2 sequencing on the nanopore platform. We also showed that the high error rates associated with long-read single-molecule sequencing can be mitigated by using a rolling circle amplification protocol. Future bioassessment programs will benefit tremendously from such portable, highly accurate, species-level metabarcoding, and it appears that we have reached a point where cost-effective field-based DNA metabarcoding is possible.
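The intuition behind pairing rolling circle amplification with nanopore reads is that one long read carries several copies of the same amplicon, so independent errors can be voted out. The sketch below is not the pipeline described in the abstract. It is a minimal illustration that assumes the copies have already been split out and aligned to equal length, and that errors are substitutions only. The function names `majority_consensus` and `identity` are hypothetical.

```python
from collections import Counter


def majority_consensus(copies):
    """Column-wise majority vote over equal-length, pre-aligned copies.

    Assumption: the copies are already aligned; indels are not handled.
    """
    if not copies:
        raise ValueError("need at least one copy")
    length = len(copies[0])
    if any(len(copy) != length for copy in copies):
        raise ValueError("copies must be pre-aligned to the same length")
    # zip(*copies) yields one tuple of bases per alignment column.
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*copies)
    )


def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)
```

With three copies that each carry one substitution at a different position, every column still has a two-to-one majority for the true base, so the consensus recovers the original sequence even though each copy is imperfect. Real reads also contain insertions and deletions, which is why practical workflows align the copies before building a consensus.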
The fate of mutations in populations depends on their impact on the fitness of the individual that carries them. This fitness effect depends, in turn, on the location of the mutation in the genome: a mutation occurring in a non-coding region generates a new allele that will evolve neutrally, while a mutation located within a functional region can have deleterious or advantageous effects that further depend on the function of the underlying gene. Yet within a given gene, mutations can have very distinct effects. For genes encoding a macromolecule, RNA or protein, an important determinant of these effects is the structure of the encoded molecule. Here I will present some insights that we gained regarding the impact of structure on the evolution of sequences, with a focus on protein-encoding sequences. In particular, we ask the following questions: (1) what is the distribution of adaptive mutations along 3D protein structures, and (2) to what extent does protein structure generate coevolution between positions? To leverage information about the distribution of fitness effects, we relied on comparative genome analyses. I will present two statistical approaches: an extension of the McDonald-Kreitman approach that allows inferring the rate of adaptive non-synonymous substitutions by modeling the distribution of fitness effects of mutations, and a substitution mapping procedure used for inferring coevolving positions.
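For orientation, the classic McDonald-Kreitman estimator that the extension builds on can be written in a few lines. The sketch below is the basic, unextended form, alpha = 1 - (Ds * Pn) / (Dn * Ps), which does not model the distribution of fitness effects and is therefore not the method presented in the talk. The function name `mk_alpha` and the counts used with it are hypothetical.

```python
def mk_alpha(dn, ds, pn, ps):
    """Classic McDonald-Kreitman estimate of the proportion of adaptive
    non-synonymous substitutions.

    dn, ds: non-synonymous and synonymous divergence (fixed differences)
    pn, ps: non-synonymous and synonymous polymorphism counts
    Note: this basic form ignores slightly deleterious mutations, which
    is the bias that DFE-based extensions aim to correct.
    """
    if dn <= 0 or ps <= 0:
        raise ValueError("dn and ps must be positive")
    return 1.0 - (ds * pn) / (dn * ps)
```

With illustrative counts of dn=40, ds=100, pn=20, ps=100, the ratio of non-synonymous to synonymous changes is twice as high between species as within, and the estimate is 0.5. When the two ratios are equal, the estimate is 0, as expected under neutrality.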
The COVID-19 pandemic originated in Wuhan, China, in December 2019 and became one of the worst global health crises ever. In Turkey, the first confirmed cases were announced in early March, and serious containment measures have been in place since then. Here, we present a different approach, a Bayesian negative binomial multilevel model with mixed effects, for projecting the COVID-19 pandemic, and apply this model to the Turkish case. We predicted daily confirmed cases and cumulative numbers for June 6th to June 26th with 80%, 95%, and 99% prediction intervals (PI). Our projections showed that if compliance with the measures continues and no drastic changes occur in diagnosis or management protocols, the epidemic curve will tend to decrease over this time interval. In addition, the predictive validity analysis suggests that the proposed model's projections should stay within the 95% PI band for the first 12 days of the projection period.
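Prediction intervals of this kind are typically read off simulated draws from the fitted model. The sketch below does not fit the multilevel model described in the abstract. It only shows, under the assumption that posterior predictive draws of daily counts are already available, how central 80%, 95%, and 99% intervals and a simple coverage check could be computed. All function names are hypothetical.

```python
def quantile(sorted_values, q):
    """Linearly interpolated quantile of an already sorted, non-empty list."""
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1 - frac) + sorted_values[upper] * frac


def prediction_interval(draws, level):
    """Central interval; level=0.95 spans the 2.5% to 97.5% quantiles."""
    if not draws:
        raise ValueError("need at least one draw")
    if not 0 < level < 1:
        raise ValueError("level must be between 0 and 1")
    ordered = sorted(draws)
    tail = (1 - level) / 2
    return quantile(ordered, tail), quantile(ordered, 1 - tail)


def coverage(observed, draws_per_day, level=0.95):
    """Fraction of observed daily counts inside that day's interval."""
    if len(observed) != len(draws_per_day) or not observed:
        raise ValueError("need one non-empty set of draws per observation")
    hits = 0
    for y, draws in zip(observed, draws_per_day):
        low, high = prediction_interval(draws, level)
        hits += low <= y <= high
    return hits / len(observed)
```

Calling `prediction_interval(draws, level)` with `level` set to 0.80, 0.95, and 0.99 gives the three bands, and `coverage` reports how many held-out days fell inside a chosen band, which is one simple way to express predictive validity.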
RSG-Turkey is a member of the International Society for Computational Biology (ISCB) Student Council (SC) Regional Student Groups (RSG). We are a non-profit community composed of early-career researchers interested in computational biology and bioinformatics.