Introduction to ILP.ppt_麦多课文库mydoc123.com

资源描述

1、Introduction to ILP,ILP = Inductive Logic Programming = machine learning logic programming = learning with logic,Introduced by Muggleton in 1992,(Machine) Learning,The process by which relatively permanent changes occur in behavioral potential as a result of experience. (Anderson) Learning is constr

2、ucting or modifying representations of what is being experienced. (Michalski) A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E. (Mitchell),Machine Lear

3、ning Techniques,Decision tree learning Conceptual clustering Case-based learning Reinforcement learning Neural networks Genetic algorithms and Inductive Logic Programming,Why ILP ? - Structured data,Seed example of East-West trains (Michalski)What makes a train to go eastward ?,Why ILP ? Structured

4、data,Mutagenicity of chemical molecules (King, Srinivasan, Muggleton, Sternberg, 1994)What makes a molecule to be mutagenic ?,Why ILP ? multiple relations,This is related to structured data,has_car,car_properties,Why ILP ? multiple relations,Genealogy example: Given known relations father(Old,Young)

5、 and mother(Old,Young) male(Somebody) and female(Somebody) learn new relations parent(X,Y) :- father(X,Y). parent(X,Y) :- mother(X,Y). brother(X,Y) :-male(X),father(Z,X),father(Z,Y).Most ML techniques cant use more than 1 relation e.g.: decision trees, neural networks, ,Why ILP ? logical foundation,

6、Prolog = Programming with Logicis used to represent: Background knowledge (of the domain): facts Examples (of the relation to be learned): facts Theories (as a result of learning): rules Supports 2 forms of logical reasoning Deduction Induction,Prolog - definitions,Variables: X, Y, Something, Somebo

7、dy Terms: arthur, 1, 1,2,3 Predicates: father/2, female/1Facts: father(christopher,victoria). female(victoria). Rules: parent(X,Y) :- father(X,Y).,Logical reasoning: deduction,From rules to facts,B T |- E,mother(penelope,victoria). mother(penelope,arthur). father(christopher,victoria). father(christ

8、opher,arthur).,parent(X,Y) :- father(X,Y). parent(X,Y) :- mother(X,Y).,parent(penelope,victoria). parent(penelope,arthur). parent(christopher,victoria). parent(christopher,arthur).,Logical reasoning: induction,From facts to rules,B E |- T,mother(penelope,victoria). mother(penelope,arthur). father(ch

9、ristopher,victoria). father(christopher,arthur).,parent(X,Y) :- father(X,Y). parent(X,Y) :- mother(X,Y).,parent(penelope,victoria). parent(penelope,arthur). parent(christopher,victoria). parent(christopher,arthur).,Induction of a classifier or Concept Learning,Most studied task in Machine Learning G

10、iven: background knowledge B a set of training examples E a classification c C for each example e Find: a theory T (or hypothesis) such that B T |- c(e), for all e E,Induction of a classifier: example,Example of East-West trains B: relations has_car and car_properties (length, roof, shape, etc.)ex.:

11、 has_car(t1,c11), shape(c11,bucket) E: the trains t1 to t10 C: east, west,Why ILP ? - Structured data,Seed example of East-West trains (Michalski)What makes a train to go eastward ?,Induction of a classifier: example,Example of East-West trains B: relations has_car and car_properties (length, roof,

12、shape, etc.)ex.: has_car(t1,c11) E: the trains t1 to t10 C: east, west,Possible T: east(T) :-has_car(T,C), length(C,short), roof(C,_).,Induction of a classifier: example,Example of mutagenicity B: relations atom and bondex.: atom(mol23,atom1,c,195). bond(mol23,atom1,atom3,7). E: 230 molecules with k

13、nown classification C: active and nonactive w.r.t. mutagenicityPossible T:active(Mol) :-atom(Mol,A,c,22), atom(Mol,B,c,10),bond(Mol,A,B,1).,c22,c10,Learning as search,Given: Background knowledge B Theory Description Language T Positives examples P (class +) Negative examples N (class -) A covering r

14、elation covers(B,T,e) Find: a theory that covers all positive examples (completeness) no negative examples (consistency),Learning as search,Covering relation in ILP covers(B,T,e) B T |- e A theory is a set of rules Each rule is searched separately (efficiency) A rule must be consistent (cover no neg

15、atives), but not necessary complete Separate-and-conquer strategy Remove from P the examples already covered,Space exploration,Strategy? Random walk Redundancy, incompleteness of the search Systematic according to some ordering Better control = no redundancy, completeness The ordering may be used to

16、 guide the search towards better rulesWhat kind of ordering?,Generality ordering,Rule 1 is more general than rule 2 = Rule 1 covers more examples than rule 2 If a rule is consistent (covers no negatives)then every specialisation of it is consistent too If a rule is complete (covers all positives)the

17、n every generalisation of it is complete too Means to prune the search space 2 kinds of moves: specialisation and generalisation Common ILP ordering: -subsumption,Generality ordering,parent(X,Y):-,parent(X,Y):- female(X),parent(X,Y) :- father(X,Y),parent(X,Y) :- female(X), mother(X,Y),parent(X,Y) :-

18、 female(X), father(X,Y),consistent rule,specialisation,Search biases,“Bias refers to any criterion for choosing one generalization over another other than strict consistency with the observed training instances.” (Mitchell) Restrict the search space (efficiency) Guide the search (given domain knowle

19、dge) Different kinds of bias Language bias Search bias Strategy bias,Choice of predicates:roof(C,flat) ? roof(C) ? flat(C) ? Types of predicates :east(T) :- roof(T), roof(C,3) Modes of predicates :east(T) :- roof(C,flat)east(T) :- has_car(T,C), roof(C,flat) Discretization of numerical values,Languag

20、e bias,Search bias,The moves direction in the search space Top-down start: the empty rule (c(X) :- .) moves: specialisations Bottom-up start: the bottom clause ( c(X) :- B.) moves: generalisations Bi-directional,Strategy bias,Heuristic search for a best rule Hill-climbing: Keep only one rule efficie

21、nt but can miss global maximum Beam search: also keep k rules for back-tracking less greedy Best-first search: keep all rules more costly but complete search,A generic ILP algorithm,procedure ILP(Examples)Initialize(Rules, Examples) repeatR = Select(Rules, Examples)Rs = Refine(R, Examples)Rules = Re

22、duce(Rules+Rs, Examples) until StoppingCriterion(Rules, Examples) return(Rules),A generic ILP algorithm,Initialize(Rules,Examples): initialize a set of theories as the search starting points Select(Rules,Examples): select the most promising candidate rule R Refine(R,Examples): returns the neighbours

23、 of R (using specialisation or generalisation) Reduce(Rules,Examples): discard unpromising theories (all but one in hill-climbing, none in best-first search),ILPnet2 www.cs.bris.ac.uk/ILPnet2/,Network of Excellence in ILP in Europe 37 universities and research institutes Educational materials Public

24、ations Events (conferences, summer schools, ) Description of ILP systems Applications,ILP systems,FOIL (Quinlan and Cameron-Jones 1993): top-down hill-climbing search Progol (Muggleton, 1995): top-down best-first search with bottom clause Golem (Muggleton and Feng 1992): bottom-up hill-climbing sear

25、ch LINUS (Lavrac and Dzeroski 1994): propositionalisation Aleph (Progol), Tilde (relational decision trees), ,ILP applications,Life sciences mutagenecity, predicting toxicology protein structure/folding Natural language processing english verb past tense document analysis and classification Engineer

26、ing finite element mesh design Environmental sciences biodegradability of chemical compounds,The end,A few books on ILP J. Lloyd. Logic for learning: learning comprehensible theories from structured data. 2003. S. Dzeroski and N. Lavrac, editors. Relational Data Mining. September 2001. L. De Raedt, editor. Advances in Inductive Logic Programming. 1996. N. Lavrac and S. Dzeroski. Inductive Logic Programming: Techniques and Applications. 1994.,

展开阅读全文