Adaptive Learning Algorithms
In Partial Fulfillment of the Requirements
for the Degree of
Doctor of Philosophy
California Institute of Technology
(Defended February 11, 2008)
All Rights Reserved
I would like to thank all the people who, through their valuable advice and support,
made this work possible. First and foremost, I am grateful to my advisor, Dr.
Yaser Abu-Mostafa, for his support, assistance, and guidance throughout my time at
Caltech.
I would also like to thank my colleagues at the Learning Systems Group, Ling
Li and Hsuan-Tien Lin, for many stimulating discussions and for their constructive
input and feedback. I would also like to thank Dr. Malik Magdon-Ismail, Dr. Amir
Atiya and Dr. Alexander Nicholson for their helpful suggestions.
I would like to thank the members of my thesis committee, Dr. Yaser Abu-
Mostafa, Dr. Alain Martin, Dr. Pietro Perona and Dr. Jehoshua Bruck, for taking
the time to review this thesis and for their helpful suggestions and guidance.
Finally, I would like to thank my family and friends for their continuing love and support.
This thesis is in the field of machine learning: the use of data to automatically learn
a hypothesis that predicts the future behavior of a system. It summarizes three of my
research projects.
We first investigate the role of margins in the phenomenal success of boosting
algorithms. AdaBoost (Adaptive Boosting) is an algorithm for generating an ensem-
ble of hypotheses for classification. The superior out-of-sample performance of Ad-
aBoost has been attributed to the fact that it can generate a classifier which classifies
the points with a large margin of confidence. This led to the development of many
new algorithms focusing on optimizing the margin of confidence. It was observed,
however, that directly optimizing the margins leads to poor performance. This apparent
contradiction has been the topic of a long unresolved debate in the machine-learning
community. We introduce new algorithms which are expressly designed to test the
margin hypothesis and provide concrete evidence which refutes the margin argument.
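To make the margin notion concrete, here is a minimal, illustrative AdaBoost sketch with decision stumps (this is a textbook-style reconstruction, not the implementation studied in the thesis). The normalized margin of an example (x, y) is y·f(x) divided by the sum of the hypothesis weights, so it lies in [-1, 1] and is positive exactly when the ensemble classifies the point correctly:

```python
import numpy as np

def stump_predict(X, feature, threshold, polarity):
    # Decision stump: classify +1/-1 by thresholding a single feature.
    return polarity * np.where(X[:, feature] <= threshold, 1, -1)

def fit_stump(X, y, w):
    # Exhaustively pick the stump minimizing the weighted training error.
    best = None
    for f in range(X.shape[1]):
        for thr in np.unique(X[:, f]):
            for pol in (1, -1):
                err = np.sum(w[stump_predict(X, f, thr, pol) != y])
                if best is None or err < best[0]:
                    best = (err, f, thr, pol)
    return best

def adaboost(X, y, T=10):
    n = len(y)
    w = np.full(n, 1.0 / n)              # example weights, start uniform
    ensemble = []                        # (alpha, feature, threshold, polarity)
    for _ in range(T):
        err, f, thr, pol = fit_stump(X, y, w)
        err = max(err, 1e-10)            # avoid division by zero
        alpha = 0.5 * np.log((1 - err) / err)
        pred = stump_predict(X, f, thr, pol)
        w *= np.exp(-alpha * y * pred)   # upweight misclassified points
        w /= w.sum()
        ensemble.append((alpha, f, thr, pol))
    return ensemble

def margins(ensemble, X, y):
    # Normalized margin: y * f(x) / sum(alpha), a value in [-1, 1].
    score = sum(a * stump_predict(X, f, thr, pol)
                for a, f, thr, pol in ensemble)
    total = sum(a for a, _, _, _ in ensemble)
    return y * score / total
```

The margin distribution over the training set is the quantity the margin-based explanations of AdaBoost appeal to.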
We then propose a novel algorithm for adaptive sampling under a monotonicity
constraint. The typical learning problem takes examples of the target function as
input information and produces a hypothesis that approximates the target as an
output. We consider a generalization of this paradigm that takes different types of
information as input and produces only specific properties of the target as output.
This is a very common setup which occurs in many different real-life settings where
the samples are expensive to obtain. We show experimentally that our algorithm
achieves better performance than existing methods such as the Staircase procedure.
One of the major pitfalls in machine learning research is that of selection bias.
It is usually introduced unintentionally through choices made during the learning
process, which often lead to over-optimistic estimates of performance. In the third
project, we introduce a new methodology for systematically reducing selection bias.
Experiments show that using cloned datasets for model selection can lead to better
performance and reduce the selection bias.
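As a self-contained illustration of selection bias (not the cloned-dataset methodology itself), the sketch below selects the best of many purely random classifiers on one test set. Every model's true accuracy is exactly 0.5 by construction, yet the score used for selection comes out inflated, while an independent holdout set scores the chosen model honestly:

```python
import random

def selection_bias_demo(n_models=100, n_points=200, seed=7):
    # Labels and predictions are all fair coin flips, so every "model"
    # has true accuracy exactly 0.5. Reporting the selection-set score
    # of the winning model overestimates performance; the holdout score
    # of the same model does not.
    rng = random.Random(seed)
    labels_sel = [rng.random() < 0.5 for _ in range(n_points)]
    labels_hold = [rng.random() < 0.5 for _ in range(n_points)]

    def accuracy(preds, labels):
        return sum(p == t for p, t in zip(preds, labels)) / len(labels)

    best_sel_acc, hold_acc_of_best = 0.0, 0.0
    for _ in range(n_models):
        preds_sel = [rng.random() < 0.5 for _ in range(n_points)]
        preds_hold = [rng.random() < 0.5 for _ in range(n_points)]
        a = accuracy(preds_sel, labels_sel)
        if a > best_sel_acc:
            best_sel_acc = a
            hold_acc_of_best = accuracy(preds_hold, labels_hold)
    return best_sel_acc, hold_acc_of_best
```

The gap between the two returned numbers is precisely the selection bias that data used for model selection can no longer measure.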