UVictoria Maud Menten Institute / Mathematical and Statistical Biology Seminar: Lila Kari
Topic
Informatics and Extremophile Genomics
Speakers
Lila Kari
Details
Although biologists discover and classify thousands of new species each year, an estimated 95% of the more than 20 million multicellular species on Earth remain unnamed and unclassified. Our research aligns with the long-term goals of the Planetary Biodiversity Mission, which aims to map all multicellular life by 2045, and with the challenge of deciphering the "Rosetta Stone" of genomics by understanding the mathematical structure underlying genomic sequences.
In this talk, I discuss mathematical representations of DNA sequences and their integration with supervised machine learning and unsupervised deep learning techniques for ultrafast, accurate, and scalable genome classification across all taxonomic levels. I also present our recent findings, which provide compelling evidence that adaptations to extreme temperatures and pH leave a distinct environmental imprint on the genomic signatures of microbial extremophiles. Notably, our use of unsupervised learning on unlabelled DNA sequences has identified several instances of extremophile microbes that, despite their significant evolutionary divergence, share similar genomic signatures linked to the extreme environments they inhabit.
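As a rough illustration of what a mathematical representation of a DNA sequence can look like, the sketch below builds a normalized k-mer frequency vector and compares two such vectors by Euclidean distance. This is an assumption for illustration only: the abstract does not say which representation or distance the talk uses, and the function names `kmer_signature` and `euclidean` are hypothetical.

```python
from itertools import product
from math import sqrt


def kmer_signature(seq, k=3):
    """Return the normalized k-mer frequency vector of a DNA sequence.

    Illustrative assumption: k-mer frequencies are one common choice of
    "genomic signature"; the talk may use a different representation.
    The vector has 4**k entries, ordered lexicographically over A, C, G, T.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        # Windows containing ambiguous symbols such as N are skipped.
        if window in counts:
            counts[window] += 1
    total = sum(counts.values())
    return [counts[m] / total if total else 0.0 for m in kmers]


def euclidean(u, v):
    """Euclidean distance between two equal-length signature vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
```

Vectors of this kind could then be passed to a supervised classifier or an unsupervised clustering method; sequences with similar composition would have a small distance between their signatures.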