Refine
Document Type
- Working Paper (2)
Language
- English (2)
Has Fulltext
- yes (2)
Keywords
- Test Function (2) (remove)
When researchers and practitioners in the field of
computational intelligence are confronted with real-world
problems, the question arises which method is the best to
apply. Nowadays, there are several, well established test
suites and well known artificial benchmark functions
available.
However, relevance and applicability of these methods to
real-world problems remains an open question in many
situations. Furthermore, the generalizability of these
methods cannot be taken for granted.
This paper describes a data-driven approach for the
generation of test instances, which is based on
real-world data. The test instance generation uses
data-preprocessing, feature extraction, modeling, and
parameterization. We apply this methodology on a classical
design of experiment real-world project and generate test
instances for benchmarking, e.g. design methods, surrogate
techniques, and optimization algorithms. While most
available results of methods applied on real-world
problems lack availability of the data for comparison,
our future goal is to create a toolbox covering multiple
data sets of real-world projects to provide a test
function generator to the research community.
Benchmark experiments are required to test, compare, tune, and understand optimization algorithms. Ideally, benchmark problems closely reflect real-world problem behavior. Yet, real-world problems are not always readily available for benchmarking. For example, evaluation costs may be too high, or resources are unavailable (e.g., software or equipment). As a solution, data from previous evaluations can be used to train surrogate models which are then used for benchmarking. The goal is to generate test functions on which the performance of an algorithm is similar to that on the real-world objective function. However, predictions from data-driven models tend to be smoother than the ground-truth from which the training data is derived. This is especially problematic when the training data becomes sparse. The resulting benchmarks may not reflect the landscape features of the ground-truth, are too easy, and may lead to biased conclusions.
To resolve this, we use simulation of Gaussian processes instead of estimation (or prediction). This retains the covariance properties estimated during model training. While previous research suggested a decomposition-based approach for a small-scale, discrete problem, we show that the spectral simulation method enables simulation for continuous optimization problems. In a set of experiments with an artificial ground-truth, we demonstrate that this yields more accurate benchmarks than simply predicting with the Gaussian process model.