Using machine learning techniques in the construction of models, III. Learning systems with regression

Boris Kompare, Saso Dzeroski, Aram Karalic, Ivan Bratko, Milijan Sisko, Sven Erik Jorgensen.

Abstract

In Kompare et al. (1994), an introduction to and a brief overview of the artificial intelligence (AI) tools that can be used for automatic data analysis, system identification, and model construction in ecology were given. Dzeroski et al. presented and exemplified rule induction techniques for river water quality classification based on physical, chemical, and biological data. In this paper we proceed with a more thorough description of some machine learning (ML) tools that learn by regression and can construct regression trees or Prolog programs.

Examples of successful applications of these tools to the prediction of algal growth in the Lagoon of Venice, Italy, and the Lake of Bled, Slovenia, are given. Tools for constructing regression trees were also applied to the classification of watercourses into quality classes. It is shown that regression trees depict, in a compact and easily understandable form, the most relevant factors (processes) that govern the system under consideration.
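To illustrate what a regression tree is, the following minimal Python sketch grows a small tree by repeatedly choosing the attribute and threshold that most reduce the squared error of the target variable. It is not one of the tools used in this paper; the attribute names (temperature, nitrogen, biomass), the data values, and all function names are hypothetical and chosen only for illustration.

```python
# Minimal regression-tree sketch (illustrative only; NOT the tools used in the paper).
# Attribute names and measurement values below are invented for this example.

def mean(values):
    return sum(values) / len(values)

def sse(values):
    """Sum of squared errors of the values around their mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def best_split(rows, target):
    """Return the (attribute, threshold) pair that minimises the summed SSE
    of the two resulting subsets, or None if no split reduces the error."""
    best = None
    best_err = sse([r[target] for r in rows])
    for attr in (a for a in rows[0] if a != target):
        values = sorted({r[attr] for r in rows})
        # Candidate thresholds lie midway between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [r[target] for r in rows if r[attr] <= threshold]
            right = [r[target] for r in rows if r[attr] > threshold]
            err = sse(left) + sse(right)
            if err < best_err:
                best_err, best = err, (attr, threshold)
    return best

def build_tree(rows, target, min_rows=4, max_depth=3, depth=0):
    """Recursively split the rows; a leaf predicts the mean target value."""
    split = None
    if len(rows) >= min_rows and depth < max_depth:
        split = best_split(rows, target)
    if split is None:
        return {"predict": mean([r[target] for r in rows])}
    attr, threshold = split
    left = [r for r in rows if r[attr] <= threshold]
    right = [r for r in rows if r[attr] > threshold]
    return {
        "attribute": attr,
        "threshold": threshold,
        "left": build_tree(left, target, min_rows, max_depth, depth + 1),
        "right": build_tree(right, target, min_rows, max_depth, depth + 1),
    }

def predict(tree, row):
    """Follow the tests from the root down to a leaf and return its value."""
    while "predict" not in tree:
        branch = "left" if row[tree["attribute"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["predict"]

# Hypothetical measurements: biomass is high only at high temperature.
rows = [
    {"temperature": 8.0, "nitrogen": 0.9, "biomass": 1.0},
    {"temperature": 10.0, "nitrogen": 0.4, "biomass": 1.2},
    {"temperature": 12.0, "nitrogen": 0.7, "biomass": 1.1},
    {"temperature": 20.0, "nitrogen": 0.5, "biomass": 5.0},
    {"temperature": 22.0, "nitrogen": 0.8, "biomass": 5.3},
    {"temperature": 24.0, "nitrogen": 0.6, "biomass": 4.9},
]

tree = build_tree(rows, target="biomass")
print(tree)
print(predict(tree, {"temperature": 9.0, "nitrogen": 0.5}))
```

In this toy data set the root test falls on temperature, which shows how the tree itself points to the factor that explains most of the variation in the target.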

The software tools described here generate their results from raw data (measurements) only. The tools are general and usually do not need any background knowledge about the domain. The programs used were not modified in any way to suit the domain or the particular examples. In contrast to statistical methods, ML tools elicit, formulate, and present the learned knowledge in an easily understandable manner, which gives new, clear views of the domain studied. ML tools thus help to identify and understand the key processes in the observed system, for which a purely deductive model might not (yet) exist.

Keywords: Ecological modelling, Artificial intelligence, Machine Learning, Data analysis, System identification.