Location: Massachusetts Institute of Technology (MIT), USA
I’m looking for collaborators interested in long-read (PacBio CLR/HiFi and Oxford Nanopore) sequence analysis and in string and graph algorithms, for a possible project on accelerating long-read genome assembly with sketching algorithms. The project would build on our recent work on minimizer-space de Bruijn graphs (Ekim, Berger, Chikhi, RECOMB 2021 & Cell Systems), which produces highly contiguous PacBio HiFi assemblies quickly and memory-efficiently. It would focus on incorporating long Oxford Nanopore reads to polish the assembly graph generated from PacBio HiFi reads, and potentially on extending the assembler to produce phased diploid assemblies. We can also discuss other sketching methods for improving contiguity.
Ideally, collaborators would have some familiarity with genome assembly algorithms, but it’s not required. Some coding experience is required; the codebase is currently in Rust, which doesn’t have a steep learning curve: https://github.com/ekimb/rust-mdbg
If you’re interested, feel free to get in touch with me at firstname.lastname@example.org. Thanks in advance!