Refine
Year of publication
- 2016 (3) (remove)
Document Type
- Working Paper (2)
- Report (1)
Language
- English (3)
Has Fulltext
- yes (3)
Keywords
- Automated Learning (1)
- Benchmarking (1)
- Ensemble Methods (1)
- Function Approximation (1)
- Funktionstest (1)
- Globale Optimierung (1)
- Imputation (1)
- Maschinelles Lernen (1)
- Model Selection (1)
- Modelierung (1)
- Modeling (1)
- Optimierung (1)
- Optimization (1)
- Surrogate Models (1)
- Test Function (1)
- Time Series (1)
- Univariate Data (1)
When using machine learning techniques for learning a function approximation from given data it is often a difficult task to select the right modeling technique.
In many real-world settings is no preliminary knowledge about the objective function available. Then it might be beneficial if the algorithm could learn all models by itself and select the model that suits best to the problem.
This approach is known as automated model selection. In this work we propose a
generalization of this approach.
It combines the predictions of several into one more accurate ensemble surrogate model. This approach is studied in a fundamental way, by first evaluating minimalistic ensembles of only two surrogate models in detail and then proceeding to ensembles with three and more surrogate models.
The results show to what extent combinations of models can perform better than single surrogate models and provides insights into the scalability and robustness of the approach. The study focuses on multi-modal functions topologies, which are important in surrogate-assisted global optimization.
Data pre-processing is a key research topic in data mining because it plays a
crucial role in improving the accuracy of any data mining algorithm. In most
real world cases, a significant amount of the recorded data is found missing
due to most diverse errors. This loss of data is nearly always unavoidable.
Recovery of missing data plays a vital role in avoiding inaccurate data
mining decisions. Most multivariate imputation methods are not compatible
to univariate datasets and the traditional univariate imputation techniques
become highly biased as the missing data gap increases. With the current
technological advancements abundant data is being captured every second.
Hence, we intend to develop a new algorithm that enables maximum
utilization of the available big datasets for imputation. In this paper, we
present a Seasonal and Trend decomposition using Loess (STL) based
Seasonal Moving Window Algorithm, which is capable of handling patterns
with trend as well as cyclic characteristics. We show that the algorithm is
highly suitable for pre-processing of large datasets.
When researchers and practitioners in the field of
computational intelligence are confronted with real-world
problems, the question arises which method is the best to
apply. Nowadays, there are several, well established test
suites and well known artificial benchmark functions
available.
However, relevance and applicability of these methods to
real-world problems remains an open question in many
situations. Furthermore, the generalizability of these
methods cannot be taken for granted.
This paper describes a data-driven approach for the
generation of test instances, which is based on
real-world data. The test instance generation uses
data-preprocessing, feature extraction, modeling, and
parameterization. We apply this methodology on a classical
design of experiment real-world project and generate test
instances for benchmarking, e.g. design methods, surrogate
techniques, and optimization algorithms. While most
available results of methods applied on real-world
problems lack availability of the data for comparison,
our future goal is to create a toolbox covering multiple
data sets of real-world projects to provide a test
function generator to the research community.