Book Summary: This dissertation is concerned with methods of model selection and parameter optimization for the kernels of support vector machines. Two popular sub-sampling methods, cross-validation and bootstrapping, are used to estimate the generalization error. Newer methods instead use probability bounds on the generalization error. These are referred to as structural risk minimization methods. Originally introduced by Vapnik and Chervonenkis, they were furthered by the works of Blanchard, Koltchinskii, and others. These methods essentially involve the sum of two functionals: one is an empirical estimate associated with a specific loss function, and the other is a penalty associated with the theoretically derived probability bounds. We use a specific type of structural risk minimization problem associated with support vector machines and show that this method of kernel optimization gives results similar to those of cross-validation and bootstrapping, and that these new methods can be performed in a much more timely manner.

Microarray experiments involve the analysis of thousands of genes at the same time. In order to classify two distinct groups within a given population, one must find a select number of discriminating genes that optimally classify the groups. In learning theory this is often referred to as feature selection, or kernel sieving. First, we discuss methods for ranking the genes for feature extraction. Second, we use our methods for model selection to determine how many genes are needed for classification in two microarray experiments while maintaining a minimal error rate.

The final application we look at is gene function classification. This problem involves analyzing yeast genes and their associated biological function as determined by the yeast genome's gene ontology. It has been shown that linear kernels for support vector machines are sufficient for classifying gene function using phylogenetic profiles. Vert went further, using phylogenetic trees to construct Bayesian tree kernels with fixed parameters that perform the same classification with much improvement. We go further still and optimize these parameters, showing that there is considerable improvement in the performance of these Bayesian tree kernels.