Change-point detection and sequence alignment: Statistical problems of genomics

by Nancy R. Zhang

Publisher: ProQuest / UMI
Publication Date: Saturday, March 18, 2006
Number of Pages: 136
ISBN: 0542287366

Book Summary:
In Part I of this thesis, we will study the problem of estimating the number of change-points in a data series that is hypothesized to have undergone abrupt changes. We examine two different models: Gaussian data points with a changing mean, and a Poisson process with a changing rate parameter. This problem can be approached from a model selection perspective, where model complexity grows with the number of change-points. The classic Bayes Information Criterion (BIC) statistic cannot be used because of irregularities in the likelihood function. By asymptotic approximation of the Bayes factor, we derive a “modified BIC” that is theoretically justified for the change-point models that we study. The analysis of array comparative genomic hybridization (array-CGH) data serves as both an application of and a source of inspiration for the Gaussian model. Array-CGH measures the number of chromosome copies at each genome location of a cell sample, and is useful for finding regions of genome deletion and amplification in tumor cells. The new modified BIC statistic will be tested on array-CGH data sets and compared to existing methods. Variations on the basic change-point model that are inspired by array-CGH data will also be discussed.

In Part II, we will switch to a different problem: the characterization of scores of optimal local sequence alignments. This problem was inspired by the comparison of protein and DNA sequences in biology. The specific question that we will ask is: For which scoring functions does the optimal local alignment score grow logarithmically with sequence length? We will define the concept of “Local Optimality” and use it to prove a sufficient condition on the scoring function for logarithmic growth of the optimal score for gapped alignments. “Local Optimality” refers to the fact that, in an optimal alignment, no local change around a gap can increase the overall score. We will use numerical studies to compare our result based on local optimality with previous results, and will also draw some theoretical connections.


