Producing More Comprehensible Models While Retaining Their Performance

Abstract

Rissanen's Minimum Description Length (MDL) principle is adapted to handle continuous attributes in the Inductive Logic Programming setting. The resulting coding scheme is then applied as an MDL pruning mechanism. The behavior of MDL pruning is tested in a synthetic domain with artificially added noise at different levels and in two real-life problems: modelling the surface roughness of a grinding workpiece and modelling the mutagenicity of nitroaromatic compounds. The results indicate that MDL pruning is a successful, parameter-free noise-fighting tool in real-life domains: it guards against building overly complex models while retaining model accuracy.

Paper.ps