Biography
Biography: Change L Tan
Abstract
Statement of the Problem: Two complementary goals of biological research are to understand how each organism works and how it relates to other organisms. Specifically, we want to know the function of all genes and non-genes (i.e., all the regions of a genome that do not encode any gene) of each organism, and how its genes and non-genes compare with those of other organisms. Progress in DNA sequencing has generated large amounts of sequence data, and many computer programs have been developed to interpret these data, especially to identify and analyze the similarities among genes and genomes. Unfortunately, in the zeal to find similarities, the differences among genes and genomes are often not simply ignored but intentionally masked, trimmed, or filtered. As the number of genes or organisms being compared increases, the amount of deleted data increases exponentially. The tragic consequence is that the very data we need to answer a question such as “what makes a dog a dog, instead of a cat” are cut out, and much of our hard-earned data are rendered invisible.
Methodology & Theoretical Orientation: I propose a holistic approach to the problem, using all the data: the complete sequences of whole genes and all the genes of whole genomes. Instead of cherry-picking only those regions of genomes that are similar enough to be aligned, every section should be carefully inspected.
Conclusion & Significance: Often the very life of an organism hinges on a small part of its genome, even a single base pair, as shown by the identification of many lethal point mutations. Ignoring "irrelevant" data, however benign that may appear, can be devastating. A tremendous amount of knowledge stands to be gained by comparing and contrasting different life forms using both the similar and the different sequences of their genes and genomes.
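To make the concern in the Statement of the Problem concrete, the following is a minimal sketch, not the author's method. It assumes a toy, hand-made alignment (hypothetical data, with `-` marking a gap) and a hypothetical helper `split_columns`. The helper separates the columns that a strict similarity filter would keep from those it would discard.

```python
def split_columns(alignment):
    """Partition alignment columns into conserved and variable indices.

    A column counts as conserved only if it has no gap ('-') and every
    sequence carries the same base there. This is an illustrative rule,
    not the rule of any particular trimming program.
    """
    lengths = {len(seq) for seq in alignment}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    conserved, variable = [], []
    for i, column in enumerate(zip(*alignment)):
        if "-" not in column and len(set(column)) == 1:
            conserved.append(i)
        else:
            variable.append(i)
    return conserved, variable


# Toy alignment (hypothetical data, not real sequences).
toy_alignment = [
    "ATGCCGTA-A",
    "ATGCAGTAGA",
    "ATGC-GTTGA",
]

# Compare the first two sequences, then all three.
_, variable_two = split_columns(toy_alignment[:2])
conserved, variable = split_columns(toy_alignment)

print("variable columns with 2 sequences:", variable_two)
print("variable columns with 3 sequences:", variable)
print("fraction a strict filter would discard:",
      len(variable) / len(toy_alignment[0]))
```

In this toy case, adding the third sequence raises the number of variable columns from two to three. A holistic analysis in the spirit of the abstract would keep the `variable` columns for inspection rather than dropping them.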