MacDATA Seminar: The Mathematics of Genomes by Lila Kari
March 23, 2018
Hamilton Hall Rm 305
3:30pm to 5:30pm
“In the same way we use the 26 letters of the alphabet to write text, and the 2 bits zero and one to write computer code, the 4 basic DNA units (Adenine, Cytosine, Guanine, Thymine) are used by Nature to encode information as DNA strands. Theoretically, a DNA strand can be viewed as a “word” over the 4-letter alphabet {A, C, G, T}, and the mathematical structure of such words has implications for their biological structure and function.
I describe our investigation into the Chaos Game Representation of a DNA sequence as a potential “genomic signature” for its species, and the usability of such genomic signatures for species identification and classification. The potential impact of such an alignment-free universal classification method could be significant, given that 86% of existing species on Earth and 91% of species in the oceans still await classification.”
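
For readers unfamiliar with the Chaos Game Representation (CGR) mentioned in the abstract, the following is a minimal Python sketch of the general idea, not the speaker's implementation. It assumes the commonly used construction: start at the center of the unit square and, for each nucleotide, move halfway toward that nucleotide's corner. The corner assignment and the names CORNERS, cgr_points, and cgr_grid are illustrative assumptions, not taken from the talk.

```python
# Assumed corner assignment for the unit square; other conventions exist.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(sequence):
    """Return the list of CGR points (x, y) for a DNA word over {A, C, G, T}."""
    x, y = 0.5, 0.5  # start at the center of the square
    points = []
    for base in sequence.upper():
        if base not in CORNERS:
            raise ValueError(f"unexpected symbol: {base!r}")
        cx, cy = CORNERS[base]
        # Move halfway from the current point toward the corner of this base.
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def cgr_grid(sequence, k=2):
    """Count CGR points in a 2^k x 2^k grid.

    After k steps the grid cell of a point is determined by the last k
    bases, so each cell count is the number of occurrences of one k-mer.
    The first k - 1 points still depend on the starting center and are
    skipped.
    """
    size = 2 ** k
    grid = [[0] * size for _ in range(size)]
    for x, y in cgr_points(sequence)[k - 1:]:
        col = min(int(x * size), size - 1)
        row = min(int(y * size), size - 1)
        grid[row][col] += 1
    return grid
```

Calling cgr_grid on two sequences with the same k gives two count grids of equal size that can be compared directly, without aligning the sequences. This is one way to read the phrase "alignment-free" in the abstract; the distance measure used for the comparison is left open here.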