Density based clustering and error correction of metabarcodes in Nanopore sequencing using the novel bioinformatics algorithm ASHURE – Bilgenur Baloğlu
Metabarcoding (identification of the plant, animal, and fungal taxa present in an environmental sample) is rapidly gaining importance in ecology, food safety, pest identification, and disease surveillance. It has a compelling advantage over traditional approaches for obtaining data on species distributions; however, it is often difficult to detect all the species present in a bulk sample using High-throughput Sequencing (HTS). This can be attributed in part to the shorter read lengths that most HTS instruments generate. Moreover, most HTS platforms are not portable, which makes in situ field-based sequencing infeasible. Oxford Nanopore sequencing platforms such as the MinION are an exception, and they also provide longer reads, albeit limited by rather high error rates (~12-15%). We used a freshwater mock community of 50 Operational Taxonomic Units (OTUs) to test the capacity of the Oxford Nanopore MinION, coupled with a rolling circle amplification protocol, to provide long-read metabarcoding results. We also propose a new Python pipeline that explores error profiles of nanopore consensus sequences, mapping accuracy, and overall community representation within a complex bulk sample. Using our molecular and bioinformatics workflow, we were able to estimate the diversity of the tested freshwater mock community with an average sequence accuracy of >99% for 1D2 sequencing on the nanopore platform. We also showed that the high error rates associated with long-read single-molecule sequencing can be mitigated by using a rolling circle amplification protocol. Future bioassessment programs will benefit tremendously from such portable, highly accurate, species-level metabarcoding, and it appears that we have reached a point where cost-effective field-based DNA metabarcoding is possible.
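To give some intuition for why reading the same barcode several times (as rolling circle amplification allows) can mitigate a high per-read error rate, the following is a minimal back-of-the-envelope sketch, not the ASHURE algorithm itself. It assumes, purely for illustration, that each copy of a base is miscalled independently with probability `p` and that only substitutions occur; real nanopore errors include insertions and deletions and are not independent. The function name `no_strict_majority_prob` and the chosen copy counts are hypothetical. Under these assumptions, the consensus at a position can only be wrong if the correct base fails to hold a strict majority among the copies, so that probability is an upper bound on the per-base consensus error.

```python
from math import comb


def no_strict_majority_prob(p: float, n: int) -> float:
    """Probability that the correct base does NOT hold a strict majority
    among n independent copies, each miscalled with probability p.

    Illustrative model only: independent, substitution-only errors.
    """
    # The correct base lacks a strict majority when at least ceil(n / 2)
    # of the n copies are wrong.
    k_min = (n + 1) // 2
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(k_min, n + 1)
    )


if __name__ == "__main__":
    per_read_error = 0.15  # upper end of the ~12-15% range quoted above
    for copies in (1, 3, 5, 9, 15):  # hypothetical numbers of repeats
        bound = no_strict_majority_prob(per_read_error, copies)
        print(f"{copies:2d} copies: consensus error bound = {bound:.6f}")
```

With a single copy the bound equals the raw error rate, and it shrinks as more copies of the same molecule are combined, which is the qualitative effect the abstract reports.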