Course: Machine Learning
(Alternate name for the course) Yes, there are learning methods other
than neural nets (and genetic algorithms).
Edward (Ned) S. Blurock
(Home Page)
E-Mail:
Edward.Blurock@forbrf.lth.se
Snail-Mail
Combustion Physics
Lund University
P.O. Box 118
SE-221 00 LUND
Sweden
Phone: +46 46 222 1402
FAX: +46 46 0885
These are all the course materials and slides (to date) for the
winter semester 94 lecture on Machine Learning.
The purpose of the course is to introduce a class of learning methods
that work with (what I will call) predicate descriptions of
the training examples. Neural nets are not covered because they are the
subject of another course in this series, and genetic algorithms are not
covered (extensively) because they belong under optimization
and search techniques. ID3 is used as the introduction to learning
methods in general, and specific problems affecting this and other
learning methods (e.g. missing data, pruning) will be discussed within
the framework of this method. Other methods will also be introduced:
Michalski's AQ and Conceptual Clustering, and the class of incremental
concept formation algorithms: EPAM, UNIMEM and COBWEB (and maybe
CLASSIT).
Apart from the theoretical introduction to machine learning,
experiments in parameter specification and analysis (using the
ANALYSIS system) will be performed. The course is itself an
experiment in "Learn by Doing".
Lecture 1: Introduction to Machine Learning
A very general overview of what one is trying to accomplish
with machine learning. The emphasis of this course is on methods by
which qualitative information can be extracted.
- Handouts:
- Lecture Notes I:
- Scope of the lecture and what one wants to accomplish with
machine learning
- Lecture Notes II:
- An Example using the machine learning program Analysis
- Exercises:
- None
- Overheads:
- Intro
Lecture 2: What is needed for machine learning (The Analysis Program)
A tour of the parts of the Analysis program and, at the same time,
an introduction to the essential parts of a machine learning
calculation that produces a decision tree.
- Handouts:
- Overview of Analysis
- Exercises:
- 1 and 2
- Overheads:
- Analysis 4.0
Lecture 3: Details of the ID3 Calculation
The ID3 method is explained in more detail: first an intuitive
introduction, then a more in-depth explanation based on Quinlan's
paper.
- Handouts:
- ID3
- J. R. Quinlan: Induction of Decision Trees, Machine
Learning, vol 1, p 81-106 (1986)
- The Analysis Directory
- Exercises:
- 3 and 4
- The ID3 Selection measure and experiments in decision tree
size.
- Overheads:
- Details of the ID3 Method
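The ID3 selection measure from the exercises above can be sketched roughly as follows. This is a minimal Python illustration of entropy and information gain, not the code used in Analysis; the function names and the tiny data set are assumptions made up for the example.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in entropy of `target` obtained by splitting on `attribute`."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


def best_attribute(rows, attributes, target):
    """Attribute with the largest information gain (the choice ID3 makes at a node)."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))


# Hypothetical toy data: "outlook" separates the classes, "windy" does not.
ROWS = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "rain", "windy": "no", "play": "yes"},
    {"outlook": "rain", "windy": "yes", "play": "yes"},
]

print(best_attribute(ROWS, ["outlook", "windy"], "play"))
```

Applying best_attribute recursively to each subset of rows is what grows the decision tree; the sketch stops at the choice for a single node.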
Lecture 4: Issues in Decision Tree Making I
Two aspects of decision tree making are explored: Missing Values
and Selection Criteria. The lecture is based on several papers.
- Handouts:
- J. R. Quinlan: Unknown Attribute Values in Induction, Proc.
6th Int'l Workshop Machine Learning, 1989
- A review and comparison of several ways to deal with missing
values
- John Mingers: An Empirical Comparison of Selection Measures
for Decision Tree Induction, Machine Learning, vol 3, p 319-342 (1989)
- A review of several selection measures
- Wray Buntine and Tim Niblett: A Further Comparison of
Splitting Rules for Decision Tree Induction, Machine Learning (1993)
- This paper was included not only because of the more exact
comparisons of splitting rules, but also because of the outline of data
sets and their properties.
- Exercises:
- Experiments with missing data (5-6)
- A simplified experiment is set up to explore the effects
of substituting various values for missing attributes.
- Representing a polynomial for machine learning analysis (7)
- This is the first step in trying to represent a polynomial
system in terms of a set of parameters. The purpose is to illustrate
the problems involved in representation.
- Selection Measures
- Calculation of selection measures for a few parameters (in
the voting data set)
- Overheads:
- Missing Data
- Mingers Paper
- Buntine-Niblett Paper
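The missing-data experiments above substitute values for missing attributes. One simple substitution strategy, filling in the most common known value, might look like the sketch below. The MISSING marker, the function names and the sample rows are assumptions for illustration only; this is not a description of how Analysis handles missing values, and it is only one of several possible strategies.

```python
from collections import Counter

MISSING = None  # assumed marker for an unknown attribute value


def most_common_value(rows, attribute):
    """Most frequent known value of `attribute` among the rows."""
    known = [r[attribute] for r in rows if r[attribute] is not MISSING]
    return Counter(known).most_common(1)[0][0]


def fill_missing(rows, attribute):
    """Return copies of the rows with missing values of `attribute` replaced."""
    fill = most_common_value(rows, attribute)
    return [
        dict(r, **{attribute: fill}) if r[attribute] is MISSING else dict(r)
        for r in rows
    ]


# Hypothetical voting-style records with one unknown vote.
VOTES = [
    {"vote": "y", "party": "A"},
    {"vote": "y", "party": "A"},
    {"vote": "n", "party": "B"},
    {"vote": MISSING, "party": "A"},
]

FILLED = fill_missing(VOTES, "vote")
print(FILLED[3])
```

Comparing the tree built from the filled data with the tree built after substituting a different value shows how sensitive the result is to the substitution.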
Lecture 5: Issues in Decision Tree Making II
The Liu and White paper is used to explain some of the problems in
attribute selection and why the quality of selection measures is an
issue at all. Another issue is that of pruning the decision
tree. The Mingers paper is used because it introduces several pruning
methods.
- Handouts:
- W.Z. Liu and A.P. White: The importance of Attribute
Selection Measures in Decision Tree Induction
- An example of a study with random attributes and random
selection criteria
- John Mingers: An Empirical Comparison of Pruning Methods for
Decision Tree Induction
- Five methods of pruning are discussed
- Exercises:
- Polynomial Analysis w.r.t. Calculation Times (9-10)
- Overheads:
- Liu and White Paper
- Pruning
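To give a feel for what pruning does, here is a small sketch of one commonly described approach, reduced-error pruning: a subtree is replaced by a leaf whenever that does not increase the error on a separate validation set. The tree representation (nested dictionaries with a stored majority class), the names and the data are all assumptions for this example and are not taken from the handouts or from Analysis.

```python
def classify(tree, row):
    """Follow the tree down to a leaf; a leaf is simply a class label."""
    while isinstance(tree, dict):
        child = tree["children"].get(row[tree["attribute"]])
        if child is None:  # unseen attribute value: fall back to majority class
            return tree["majority"]
        tree = child
    return tree


def errors(tree, rows, target):
    """Number of rows the tree misclassifies."""
    return sum(classify(tree, r) != r[target] for r in rows)


def prune(tree, rows, target):
    """Bottom-up pruning against the validation rows that reach each node."""
    if not isinstance(tree, dict):
        return tree
    attr = tree["attribute"]
    children = {
        value: prune(child, [r for r in rows if r[attr] == value], target)
        for value, child in tree["children"].items()
    }
    candidate = {"attribute": attr, "children": children, "majority": tree["majority"]}
    leaf = tree["majority"]
    # Keep the simpler leaf when it is at least as accurate as the subtree.
    if errors(leaf, rows, target) <= errors(candidate, rows, target):
        return leaf
    return candidate


# Hypothetical tree with an overfitted "windy" test under "sunny".
TREE = {
    "attribute": "outlook",
    "majority": "yes",
    "children": {
        "sunny": {
            "attribute": "windy",
            "majority": "no",
            "children": {"yes": "no", "no": "yes"},
        },
        "rain": "yes",
    },
}
VALIDATION = [
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "rain", "windy": "no", "play": "yes"},
    {"outlook": "rain", "windy": "yes", "play": "yes"},
]

PRUNED = prune(TREE, VALIDATION, "play")
print(PRUNED)
```

In this example the "windy" test disappears because a plain "no" leaf classifies the validation rows at least as well, while the "outlook" test at the root is kept.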
Lecture 7: Bayesian Statistics and Decision Theory
Since one of the purposes of machine learning is to make
decisions, Bayesian statistics is introduced with examples of the
expected accuracy of predictions. Part of the class is
used to discuss the polynomial parameters that were created.
- Handouts and References:
- "Chapter 19: Bayesian Inference", Introductory Statistics,
Wonnacott, Thomas H. and Wonnacott, Ronald J.
- Prior and posterior probabilities and likelihood functions
are introduced with examples, first from binary decisions and then using
normal and binomial distributions.
- Overheads:
- Bayesian Inference
- Exercises and Discussion:
- Continuing work on the Polynomial problem: Discussion of
what set of descriptors one could make to describe the special set of
30 polynomials. Assignments were given to create these descriptors.
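The step from prior to posterior probabilities in the binary-decision case can be illustrated with a few lines of Python. The hypothesis names and the numbers below are invented for the example and do not come from the Wonnacott chapter.

```python
def posterior(prior, likelihood):
    """Bayes' rule: posterior is proportional to prior times likelihood."""
    joint = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(joint.values())  # total probability of the observation
    return {h: p / evidence for h, p in joint.items()}


# Hypothetical binary decision: is a part faulty, given a positive test?
PRIOR = {"faulty": 0.1, "ok": 0.9}
LIKELIHOOD_OF_POSITIVE = {"faulty": 0.9, "ok": 0.2}

POST = posterior(PRIOR, LIKELIHOOD_OF_POSITIVE)
print(POST)
```

With these assumed numbers the posterior probability of "faulty" is 0.09 / (0.09 + 0.18), i.e. one third: the positive test raises the probability from 10% but still leaves "ok" as the more likely hypothesis.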
Lecture 8: Inductive Learning: Generalization and Specialization
The concepts and notation of inductive learning in terms of
generalization and specialization are introduced. This is used to put
the inductive learning concepts on a somewhat more formal footing and to
introduce the star concept and the AQ Algorithm.
- Handouts and References:
- A Theory and Methodology of Inductive Learning, Michalski,
Ryszard S. in Machine Learning: An Artificial Intelligence Approach
- Overheads:
- Generalization and Specialization
- Star and AQ
- Exercises and Discussion:
- Continuing work on the Polynomial problem: The data sets have
been submitted and a short intuitive analysis (using Analysis) is
examined.
- The data sets and experiments:
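As an informal illustration of generalization on predicate-style descriptions, the sketch below treats a description as a conjunction of conditions, each allowing a set of values for one attribute. It shows two generalization steps, dropping a condition and adding an alternative value. The representation and the names are assumptions for this example; the star and AQ procedures themselves are not implemented here.

```python
def covers(description, example):
    """True if the example satisfies every condition of the description."""
    return all(example[a] in allowed for a, allowed in description.items())


def drop_condition(description, attribute):
    """Generalize by removing the condition on one attribute."""
    return {a: v for a, v in description.items() if a != attribute}


def add_alternative(description, attribute, value):
    """Generalize by allowing one more value for an attribute."""
    widened = dict(description)
    widened[attribute] = description[attribute] | {value}
    return widened


# Hypothetical description: shape is round AND color is red.
DESCRIPTION = {"shape": {"round"}, "color": {"red"}}
EXAMPLE = {"shape": "round", "color": "blue"}

print(covers(DESCRIPTION, EXAMPLE))
print(covers(drop_condition(DESCRIPTION, "color"), EXAMPLE))
print(covers(add_alternative(DESCRIPTION, "color", "blue"), EXAMPLE))
```

Both steps yield a description that covers everything the original covered and more; specialization runs the other way, by adding conditions or removing allowed values.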
Lecture 9: Inductive Learning: Generalization and Specialization
Half of the lecture is devoted to explaining the star technique in
more detail and (on the blackboard) giving examples of its
implementation in Analysis (i.e. the descriptions as a set of
predicates). The other half of the lecture is a more intensive
discussion of the polynomial problem.
- Handouts and References:
- Learning from Observation: Conceptual Clustering, Michalski,
Ryszard S. in Machine Learning: An Artificial Intelligence Approach
- Overheads:
- Conceptual Clustering
- Exercises and Discussion:
- Continuing work on the Polynomial problem: A more intensive
discussion of the polynomial problem