First Order Regression

Aram Karalic

Ph.D. Thesis

Thesis supervisor: prof.dr. Ivan Bratko


thesis.ps (907k)


Abstract

A system named FORS (First Order Regression System) has been developed that is capable of inducing numerical concepts in first-order logic. FORS utilizes some useful ILP concepts (such as background knowledge), alleviates some shortcomings of existing propositional regression systems, and integrates some of their advantages. Our goal was to develop a system that can: induce first-order logic concepts that incorporate continuous variables; make use of background knowledge in intensional form; model dynamic systems (learn from time series); partition the attribute space into subspaces and find a submodel for each subspace; and handle noisy data. FORS was successfully applied in several synthetic and real-world domains, where these requirements proved necessary and useful.

FORS constructs a model in the form of a Prolog program. A covering approach, similar to that of FOIL, is used. The clause-building part of the algorithm works top-down: it starts with the most general candidate clause, which covers the entire example set, and then specializes the clause by adding literals. Beam search guides clause construction through the space of possible clauses.
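The covering loop and the top-down beam search described above can be illustrated with a minimal Python sketch. It is not the FORS implementation: the literal form (threshold tests on numeric attributes), the scoring function (mean squared error around the mean of the covered examples), the ordered rule list, and all function and parameter names are assumptions made for illustration only.

```python
from statistics import mean

# A clause is a list of literals (attr_index, op, threshold); [] is the most general clause.
def covers(clause, x):
    return all((x[a] <= t) if op == "<=" else (x[a] > t) for a, op, t in clause)

def coverage(clause, data):
    return sum(1 for x, _ in data if covers(clause, x))

def score(clause, data):
    """Lower is better: (MSE around the mean of covered examples, -coverage)."""
    ys = [y for x, y in data if covers(clause, x)]
    m = mean(ys)
    return (sum((y - m) ** 2 for y in ys) / len(ys), -len(ys))

def refinements(clause, data):
    """Specialize a clause by adding one threshold literal taken from the data."""
    seen = set()
    for x, _ in data:
        for a, v in enumerate(x):
            for op in ("<=", ">"):
                lit = (a, op, v)
                if lit not in clause and lit not in seen:
                    seen.add(lit)
                    yield clause + [lit]

def build_clause(data, beam_width=3, max_literals=2, min_cover=2):
    """Top-down beam search starting from the most general clause."""
    best = []
    beam = [best]
    for _ in range(max_literals):
        candidates = [r for c in beam for r in refinements(c, data)
                      if coverage(r, data) >= min_cover]
        if not candidates:
            break
        candidates.sort(key=lambda c: score(c, data))
        beam = candidates[:beam_width]
        if score(beam[0], data) < score(best, data):
            best = beam[0]
    return best

def learn(data):
    """Covering loop: build a clause, remove the examples it covers, repeat."""
    rules, remaining = [], list(data)
    while remaining:
        clause = build_clause(remaining)
        covered = [y for x, y in remaining if covers(clause, x)]
        rules.append((clause, mean(covered)))
        remaining = [(x, y) for x, y in remaining if not covers(clause, x)]
    return rules

def predict(rules, x):
    """Return the prediction of the first rule whose clause covers x."""
    for clause, value in rules:
        if covers(clause, x):
            return value
    return None

# Toy data: a step function of a single attribute.
data = [((x,), 1.0 if x <= 3 else 5.0) for x in range(8)]
rules = learn(data)
```

In this toy run the first clause isolates the lower plateau and the remaining examples are covered by the most general clause, so the rule list behaves like a small piecewise-constant model.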

As a part of the system, a pruning method based on the Minimum Description Length (MDL) principle was developed that can also handle continuous variables. MDL pruning turned out to help build more comprehensible models while preserving their predictive power.
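The idea behind MDL-based pruning with continuous variables can be sketched as follows. The encoding used in the thesis is not given here, so this example substitutes a crude two-part code (a parameter cost of half of log2(n) bits per parameter plus a Gaussian code length for the residuals); the function names and the `precision` floor are assumptions. The sketch only decides whether an extra regression parameter pays for itself.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def description_length(residuals, n_params, precision=1e-3):
    """Crude two-part code length in bits (an assumed encoding, not the thesis's)."""
    n = len(residuals)
    # Floor the variance so that a perfect fit does not yield log2(0).
    variance = max(sum(r * r for r in residuals) / n, precision ** 2)
    data_bits = 0.5 * n * math.log2(2 * math.pi * math.e * variance)
    model_bits = 0.5 * n_params * math.log2(n)
    return model_bits + data_bits

def choose_model(xs, ys):
    """Keep the linear term only if it shortens the total description."""
    mean_y = sum(ys) / len(ys)
    const_residuals = [y - mean_y for y in ys]
    slope, intercept = fit_line(xs, ys)
    linear_residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    dl_const = description_length(const_residuals, n_params=1)
    dl_linear = description_length(linear_residuals, n_params=2)
    return "constant" if dl_const <= dl_linear else "linear"

xs = list(range(10))
noise = [0.1 if i % 2 == 0 else -0.1 for i in xs]
trend = [2 * x + 1 + e for x, e in zip(xs, noise)]   # a real linear dependence
flat = [3 + e for e in noise]                        # no dependence on x
```

On the `trend` data the slope greatly shortens the encoding of the residuals, so the linear model is kept; on the `flat` data the extra parameter costs more bits than it saves and is pruned.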

In experiments with physical domains, FORS rediscovered Kepler's third law of planetary motion and the ideal gas law (including rediscovery of the gas constant and the absolute temperature scale).
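As a rough illustration of what rediscovering Kepler's third law from data involves, the sketch below fits a straight line to the logarithms of orbital period and semi-major axis. This log-log regression is a stand-in technique, not the procedure used by FORS, and the planetary values are approximate textbook figures rather than the data used in the thesis.

```python
import math

# Approximate values: (semi-major axis in AU, orbital period in years).
PLANETS = {
    "Mercury": (0.387, 0.241),
    "Venus": (0.723, 0.615),
    "Earth": (1.000, 1.000),
    "Mars": (1.524, 1.881),
    "Jupiter": (5.203, 11.862),
    "Saturn": (9.537, 29.457),
}

def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

log_a = [math.log(a) for a, _ in PLANETS.values()]
log_t = [math.log(t) for _, t in PLANETS.values()]

# log T = exponent * log a + log k, i.e. T = k * a ** exponent
exponent, log_k = fit_line(log_a, log_t)
k = math.exp(log_k)
```

The fitted exponent comes out close to 3/2 and the constant close to 1 in these units, which is the law T^2 = a^3.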

In all the real-world domains, models using linear regression appealed most to the experts, since linear regression greatly increased the expressive power of the models in comparison with pure first-order logic. The experts were able to thoroughly analyse the induced models, carefully exploit the selected regions of the attribute space, and more reliably evaluate the model's quality inside those regions.

Modelling of water behavior in a surge tank showed that FORS can successfully handle time series data. Additionally, FORS's ability to partition the attribute space, coupled with its linear regression capabilities, proved crucial in inducing a useful model without prior knowledge of the "absolute value" function.
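The point about the absolute value function can be illustrated with a small sketch: a model that splits one attribute at a threshold and fits a separate line in each region reproduces |x| exactly, even though no single line can. The exhaustive split search and all names below are assumptions for illustration and do not reproduce how FORS selects its partitions.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def squared_error(points, line):
    slope, intercept = line
    return sum((y - (slope * x + intercept)) ** 2 for x, y in points)

def fit_two_region_model(xs, ys, min_points=2):
    """Try every split x <= threshold and keep the one with the lowest total error."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(min_points, len(pairs) - min_points + 1):
        left, right = pairs[:i], pairs[i:]
        left_line = fit_line(*zip(*left))
        right_line = fit_line(*zip(*right))
        error = squared_error(left, left_line) + squared_error(right, right_line)
        if best is None or error < best[0]:
            best = (error, left[-1][0], left_line, right_line)
    _, threshold, left_line, right_line = best
    return threshold, left_line, right_line

def predict(model, x):
    threshold, left_line, right_line = model
    slope, intercept = left_line if x <= threshold else right_line
    return slope * x + intercept

xs = list(range(-5, 6))
ys = [abs(x) for x in xs]          # the target is never given to the learner as a function
model = fit_two_region_model(xs, ys)
```

The learned model has a line with slope close to -1 on one side of the threshold and a line with slope close to +1 on the other, which together behave like the absolute value on this range.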

Models induced from the Lake of Bled data describe the growth of algae in the lake adequately. Newly induced background literals defining seasons make the induced models more comprehensible. For the expert, this was not the first attempt at modelling the behavior of the Lake of Bled, and experiments with FORS confirmed his opinion that not much more can be done with the current data without additional, more precise and more frequent measurements.

An expert in the domain of steel grinding reports being satisfied with the results of machine learning, since our models enabled him to grasp some additional process properties that he would not have been able to discover with classical statistical tools alone. However, he suggests that the machine learning approach should not be used on its own, but should be considered a powerful supplement to existing instruments. In this domain it also turned out that simple background knowledge can sometimes significantly improve the usefulness of the induced models.

FORS was also used in the domain of finite element mesh design and for the construction of rules for predicting the mutagenic activity of nitroaromatic compounds, where its performance was at the level of other machine learning algorithms used in those domains.

During the work on the electrical discharge machining process, a data acquisition environment was established that enabled us to monitor and record crucial process parameters as well as the operator's control actions. Models were induced which, according to the expert, capture the main behaviour patterns of the operator. During the knowledge acquisition process, several important guidelines emerged, mainly concerning interaction with the domain experts. They confirm that the comprehensibility of the induced model plays an important role in behaviour cloning, and confirm again that successful modelling of expert behaviour cannot rely only on interviews with an expert.

Keywords

artificial intelligence
machine learning
regression
learning from examples
continuous class
first-order logic
data mining
induction
inductive logic programming