TY - RPRT
A1 - Fischbach, Andreas
A1 - Zaefferer, Martin
A1 - Stork, Jörg
A1 - Friese, Martina
A1 - Bartz-Beielstein, Thomas
T1 - From Real World Data to Test Functions
N2 - When researchers and practitioners in the field of computational intelligence are confronted with real-world problems, the question arises which method is best to apply. Nowadays, several well-established test suites and well-known artificial benchmark functions are available. However, the relevance and applicability of these methods to real-world problems remain an open question in many situations. Furthermore, the generalizability of these methods cannot be taken for granted. This paper describes a data-driven approach for the generation of test instances, which is based on real-world data. The test instance generation uses data preprocessing, feature extraction, modeling, and parameterization. We apply this methodology to a classical design-of-experiments real-world project and generate test instances for benchmarking, e.g., design methods, surrogate techniques, and optimization algorithms. While most published results for methods applied to real-world problems do not make the underlying data available for comparison, our future goal is to create a toolbox covering multiple data sets of real-world projects to provide a test function generator to the research community.
T3 - CIplus - 6/2016
KW - Modeling
KW - Optimization
KW - Benchmarking
KW - Test Function
KW - Modellierung
KW - Optimierung
KW - Funktionstest
Y1 - 2016
U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:hbz:832-cos4-4326
ER -

TY - RPRT
A1 - Chandrasekaran, Sowmya
A1 - Zaefferer, Martin
A1 - Moritz, Steffen
A1 - Stork, Jörg
A1 - Friese, Martina
A1 - Fischbach, Andreas
A1 - Bartz-Beielstein, Thomas
T1 - Data Preprocessing: A New Algorithm for Univariate Imputation Designed Specifically for Industrial Needs
N2 - Data preprocessing is a key research topic in data mining because it plays a crucial role in improving the accuracy of any data mining algorithm. In most real-world cases, a significant amount of the recorded data is missing due to a wide variety of errors. This loss of data is nearly always unavoidable. Recovery of missing data plays a vital role in avoiding inaccurate data mining decisions. Most multivariate imputation methods are not compatible with univariate datasets, and traditional univariate imputation techniques become highly biased as the missing data gap increases. With current technological advancements, abundant data is being captured every second. Hence, we intend to develop a new algorithm that enables maximum utilization of the available big datasets for imputation. In this paper, we present a Seasonal Moving Window Algorithm based on Seasonal and Trend decomposition using Loess (STL), which is capable of handling patterns with trend as well as cyclic characteristics. We show that the algorithm is highly suitable for preprocessing large datasets.
T3 - CIplus - 7/2016
KW - Time Series
KW - Imputation
KW - Univariate Data
Y1 - 2016
U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:hbz:832-cos4-4331
ER -

TY - RPRT
A1 - Flasch, Oliver
A1 - Friese, Martina
A1 - Zaefferer, Martin
A1 - Bartz-Beielstein, Thomas
A1 - Branke, Jürgen
T1 - Learning Model-Ensemble Policies with Genetic Programming
N2 - We propose to apply typed Genetic Programming (GP) to the problem of finding surrogate-model ensembles for global optimization on compute-intensive target functions. In a model ensemble, base models such as linear models, random forest models, or Kriging models, as well as pre- and post-processing methods, are combined. In theory, an optimal ensemble will join the strengths of its constituent base models while avoiding their weaknesses, offering higher prediction accuracy and robustness. This study defines a grammar of model ensemble expressions and searches the set for optimal ensembles via GP. We performed an extensive experimental study based on 10 different objective functions and 2 sets of base models. We arrive at promising results: on unseen test data, our ensembles do not perform significantly worse than the best base model.
T3 - CIplus - 3/2015
KW - Modellierung
KW - Optimierung
KW - Ensemble Methods
KW - Genetic Programming
KW - Surrogate-Model-Based Optimization
Y1 - 2015
U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:hbz:832-cos-787
ER -

TY - RPRT
A1 - Friese, Martina
A1 - Bartz-Beielstein, Thomas
A1 - Emmerich, Michael
T1 - Building Ensembles of Surrogate Models by Optimal Convex Combination
N2 - When using machine learning techniques to learn a function approximation from given data, it is often difficult to select the right modeling technique. In many real-world settings, no preliminary knowledge about the objective function is available. It might then be beneficial if the algorithm could learn all models by itself and select the model that best suits the problem. This approach is known as automated model selection. In this work we propose a generalization of this approach. It combines the predictions of several surrogate models into one more accurate ensemble surrogate model. This approach is studied in a fundamental way, by first evaluating minimalistic ensembles of only two surrogate models in detail and then proceeding to ensembles with three and more surrogate models. The results show to what extent combinations of models can perform better than single surrogate models and provide insights into the scalability and robustness of the approach. The study focuses on multi-modal function topologies, which are important in surrogate-assisted global optimization.
T3 - CIplus - 4/2016
KW - Globale Optimierung
KW - Maschinelles Lernen
KW - Function Approximation
KW - Surrogate Models
KW - Model Selection
KW - Ensemble Methods
KW - Automated Learning
Y1 - 2016
U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:hbz:832-cos4-3480
ER -

TY - RPRT
A1 - Breiderhoff, Beate
A1 - Bartz-Beielstein, Thomas
A1 - Naujoks, Boris
A1 - Zaefferer, Martin
A1 - Fischbach, Andreas
A1 - Flasch, Oliver
A1 - Friese, Martina
A1 - Mersmann, Olaf
A1 - Stork, Jörg
T1 - Simulation and Optimization of Cyclone Dust Separators
N2 - Cyclone dust separators are devices often used to filter solid particles from flue gas. Such cyclones are supposed to filter as many solid particles from the carrying gas as possible. At the same time, they should introduce only a minimal pressure loss to the system. Hence, collection efficiency has to be maximized and pressure loss minimized. Both the collection efficiency and the pressure loss are heavily influenced by the cyclone's geometry. In this paper, we optimize seven geometrical parameters of an analytical cyclone model. Furthermore, noise variables are introduced to the model, representing the non-deterministic structure of the real-world problem. This is used to investigate robustness and sensitivity of solutions. Both the deterministic and the stochastic model are optimized with an SMS-EMOA. The SMS-EMOA is compared to a single-objective optimization algorithm. For the harder, stochastic optimization problem, a surrogate-model-supported SMS-EMOA is compared against the model-free SMS-EMOA. The model-supported approach yields better solutions with the same run-time budget.
T3 - CIplus - 4/2013
KW - Soft Computing
KW - Evolutionärer Algorithmus
KW - Mehrkriterielle Optimierung
KW - Entstauber
KW - Simulation
KW - Surrogat-Modellierung
KW - Sequentielle Parameter Optimierung
KW - Zyklon Entstauber
KW - Multiobjective Optimization
KW - Multi-Criteria Optimization
KW - Surrogate Modeling
KW - Sequential Parameter Optimization
KW - Cyclone Dust Separator
Y1 - 2013
U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:hbz:832-cos-470
ER -

TY - RPRT
A1 - Friese, Martina
A1 - Stork, Jörg
A1 - Ramos Guerra, Ricardo
A1 - Bartz-Beielstein, Thomas
A1 - Thaker, Soham
A1 - Flasch, Oliver
A1 - Zaefferer, Martin
T1 - UniFIeD Univariate Frequency-based Imputation for Time Series Data
N2 - This paper introduces UniFIeD, a new data preprocessing method for time series. UniFIeD can cope with large intervals of missing data. A scalable test function generator, which allows the simulation of time series with different gap sizes, is also presented. An experimental study demonstrates that (i) UniFIeD shows significantly better performance than simple imputation methods and (ii) UniFIeD is able to handle situations where advanced imputation methods fail. The results are independent of the underlying error measurements.
T3 - CIplus - 5/2013
KW - Zeitreihe
KW - Prognose
KW - Datenanalyse
KW - Vorverarbeitung
KW - Zeitreihenanalyse
KW - Fehlende Daten
KW - Time-series
KW - Missing Data
KW - Imputation
Y1 - 2013
U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:hbz:832-cos-493
SN - 2194-2870
ER -