OPUS 4 | Search

Feature Selection for Surrogate Model-Based Optimization (2020)

Rehbach, Frederik ; Gentile, Lorenzo ; Bartz-Beielstein, Thomas

We propose a hybridization approach called Regularized-Surrogate-Optimization (RSO) aimed at overcoming difficulties related to high-dimensionality. It combines standard Kriging-based SMBO with regularization techniques. The employed regularization methods use the least absolute shrinkage and selection operator (LASSO). An extensive study is performed on a set of artificial test functions and two real-world applications: the electrostatic precipitator problem and a multilayered composite design problem. Experiments reveal that RSO requires significantly less time than Kriging to obtain comparable results. The pros and cons of the RSO approach are discussed and recommendations for practitioners are presented.

Variable Reduction for Surrogate-Based Optimization (2020)

Rehbach, Frederik ; Gentile, Lorenzo ; Bartz-Beielstein, Thomas

Real-world problems such as computational fluid dynamics simulations and finite element analyses are computationally expensive. A standard approach to mitigating the high computational expense is Surrogate-Based Optimization (SBO). Yet, due to the high-dimensionality of many simulation problems, SBO is not directly applicable or not efficient. Reducing the dimensionality of the search space is one method to overcome this limitation. In addition to the applicability of SBO, dimensionality reduction enables easier data handling and improved data and model interpretability. Regularization is considered as one state-of-the-art technique for dimensionality reduction. We propose a hybridization approach called Regularized-Surrogate-Optimization (RSO) aimed at overcoming difficulties related to high-dimensionality. It couples standard Kriging-based SBO with regularization techniques. The employed regularization methods are based on three adaptations of the least absolute shrinkage and selection operator (LASSO). In addition, tree-based methods are analyzed as an alternative variable selection method. An extensive study is performed on a set of artificial test functions and two real-world applications: the electrostatic precipitator problem and a multilayered composite design problem. Experiments reveal that RSO requires significantly less time than standard SBO to obtain comparable results. The pros and cons of the RSO approach are discussed, and recommendations for practitioners are presented.
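
The variable-screening idea behind LASSO can be illustrated with a minimal sketch. This is not the authors' RSO implementation or any of their three LASSO adaptations; it is a plain coordinate-descent LASSO, assuming the objective (1/2n)·||y − Xβ||² + λ·||β||₁. The names `soft_threshold`, `lasso_coordinate_descent`, and `select_variables` are hypothetical. Variables whose coefficient shrinks to exactly zero are dropped before a surrogate such as Kriging would be fitted on the remaining ones.

```python
def soft_threshold(z, lam):
    """Soft-thresholding operator: shrinks z toward zero by lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Illustrative LASSO fit (hypothetical helper, not the RSO code).

    Minimizes (1/2n) * ||y - X beta||^2 + lam * ||beta||_1
    by cycling through the coordinates of beta.
    """
    n, d = len(X), len(X[0])
    beta = [0.0] * d
    for _ in range(n_iter):
        for j in range(d):
            rho = 0.0   # correlation of feature j with the partial residual
            norm = 0.0  # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(
                    X[i][k] * beta[k] for k in range(d) if k != j
                )
                rho += X[i][j] * (y[i] - pred_without_j)
                norm += X[i][j] ** 2
            if norm > 0.0:
                beta[j] = soft_threshold(rho / n, lam) / (norm / n)
            else:
                beta[j] = 0.0
    return beta


def select_variables(beta):
    """Indices of variables that LASSO kept (non-zero coefficient)."""
    return [j for j, b in enumerate(beta) if b != 0.0]


# Toy data: y depends only on the first of three orthogonal inputs.
X = [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
y = [3.0, 3.0, -3.0, -3.0]
beta = lasso_coordinate_descent(X, y, lam=0.5)
kept = select_variables(beta)
```

On this toy data only the first variable is kept, so a downstream surrogate would be fitted in one dimension instead of three. The penalty `lam` is an assumed tuning parameter; larger values remove more variables.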

Parallelized Bayesian Optimization for Problems with Expensive Evaluation Functions (2020)

Rebolledo, Margarita ; Rehbach, Frederik ; Eiben, A.E. ; Bartz-Beielstein, Thomas

Many black-box optimization problems rely on simulations to evaluate the quality of candidate solutions. These evaluations can be computationally expensive and very time-consuming. We present an approach to mitigate this problem by taking into consideration two factors: the number of evaluations and the execution time. We aim to keep the number of evaluations low by using Bayesian optimization (BO), which is known to be sample efficient, and to reduce wall-clock times by executing evaluations in parallel. Four parallelization methods using BO as optimizer are compared against the inherently parallel CMA-ES. Each method is evaluated on all 24 objective functions of the Black-Box-Optimization-Benchmarking test suite in their 20-dimensional versions. The results show that parallelized BO outperforms the state-of-the-art CMA-ES on most of the test functions, also on higher dimensions.

Parallelized Bayesian Optimization for Expensive Robot Controller Evolution (2020)

Rebolledo, Margarita ; Rehbach, Frederik ; Eiben, A.E. ; Bartz-Beielstein, Thomas

An important class of black-box optimization problems relies on using simulations to assess the quality of a given candidate solution. Solving such problems can be computationally expensive because each simulation is very time-consuming. We present an approach to mitigate this problem by distinguishing two factors of computational cost: the number of trials and the time needed to execute the trials. Our approach tries to keep the number of trials down by using Bayesian optimization (BO), which is known to be sample efficient, and to reduce wall-clock times by executing trials in parallel. We compare the performance of four parallelization methods and two model-free alternatives. Each method is evaluated on all 24 objective functions of the Black-Box-Optimization-Benchmarking (BBOB) test suite in their five-, ten-, and 20-dimensional versions. Additionally, their performance is investigated on six test cases in robot learning. The results show that parallelized BO outperforms the state-of-the-art CMA-ES on the BBOB test functions, especially for higher dimensions. On the robot learning tasks, the differences are less clear, but the data do support parallelized BO as the ‘best guess’, winning in some cases and never losing.

Sensor Placement for Contamination Detection in Water Distribution Systems (2020)

Rebolledo, Margarita ; Chandrasekaran, Sowmya ; Bartz-Beielstein, Thomas

Sensor placement for contaminant detection in water distribution systems (WDS) has become a topic of great interest, with the aim of securing a population's water supply. Several approaches can be found in the literature, with differences ranging from the objective selected for optimization to the methods used to solve the optimization problem. In this work, we aim to give an overview of current work in sensor placement with a focus on contaminant detection for WDS. We present some of the objectives for which the sensor placement problem is defined, along with common optimization algorithms and toolkits available to help with algorithm testing and comparison.

Technical Report: Flushing Strategies in Drinking Water Systems (2021)

Rebolledo, Margarita ; Chandrasekaran, Sowmya ; Bartz-Beielstein, Thomas

Drinking water supply and distribution systems are critical infrastructure that has to be well maintained for the safety of the public. One important tool in the maintenance of water distribution systems (WDS) is flushing. Flushing is a process carried out periodically to clean sediments and other contaminants from the water pipes. Given the differences in topography, water composition, and supply demand between WDS, no single flushing strategy is suitable for all of them. In this report, a non-exhaustive overview of optimization methods for flushing in WDS is given. Implementations of optimization methods for the flushing procedure and for flushing planning are presented. Suggestions are given as possible options for optimizing existing flushing planning frameworks.

Modelling Zero-inflated Rainfall Data through the Use of Gaussian Process and Bayesian Regression (2018)

Rebolledo Coy, Margarita Alejandra ; Bartz-Beielstein, Thomas

Rainfall is a key parameter for understanding the water cycle. Accurate rainfall measurement is vital in the development of hydrological models. By means of indirect measurement, satellites can nowadays estimate rainfall around the world. However, these measurements are not always accurate. As a first approach to generating a bias-corrected rainfall estimate from satellite data, the performance of Gaussian process and Bayesian regression is studied. The results show that the Gaussian process is the better option for this dataset but leave room for improvement in both modelling strategies.

Modeling and Optimization of a Robust Gas Sensor (2016)

Rebolledo C., Margarita A. ; Krey, Sebastian ; Bartz-Beielstein, Thomas ; Flasch, Oliver ; Fischbach, Andreas ; Stork, Jörg

In this paper, we present a comparison of different data-driven modeling methods. The first instance of a data-driven linear Bayesian model is compared with several linear regression models, a Kriging model, and a genetic programming (GP) model. The models are built on industrial data for the development of a robust gas sensor. The data contain a limited number of samples and have a high variance. The mean squared error of the models on a test dataset is used as the comparison criterion. The results indicate that standard linear regression approaches as well as Kriging and GP show good results, whereas the Bayesian approach, despite requiring additional resources, does not lead to improved results.

Trinkwasser-Sicherheit mit Predictive Analytics und Oracle [Drinking Water Safety with Predictive Analytics and Oracle] (2017)

Moritz, Steffen ; Bartz-Beielstein, Thomas ; Strohschein, Jan ; Seger, Ralf ; Gross, Dimitri

Contamination in the water network can pose an immediate danger to large parts of the population. Potential hazards arise not only from possible criminal acts and terrorist attacks. Operational disruptions, system failures, and natural disasters can also lead to contamination.

Meta-model based optimization of hot rolling processes in the metal industry (2016)

Jung, Christian ; Zaefferer, Martin ; Bartz-Beielstein, Thomas ; Rudolph, Günter

To maximize the throughput of a hot rolling mill, the number of passes has to be reduced. This can be achieved by maximizing the thickness reduction in each pass. For this purpose, exact predictions of roll force and torque are required. Hence, the predictive models that describe the physical behavior of the product have to be accurate and cover a wide range of different materials. Due to market requirements, many new materials are tested and rolled. If these materials are chosen to be rolled more often, a suitable flow curve has to be established. Determining those flow curves in the laboratory is not reasonable because of the cost and time involved. A strong demand for quick parameter determination and for the optimization of flow curve parameters at minimum cost is the logical consequence. Therefore, parameter estimation and optimization with real data collected during previous runs is a promising idea. Producers benefit from this data-driven approach and gain considerable flexibility when rolling new materials, optimizing current production, and increasing quality. This concept would also allow the optimization of flow curve parameters that have already been treated by standard methods. In this article, a new data-driven approach for predicting the physical behavior of the product and setting important parameters is presented. We demonstrate how the prediction quality of the roll force and roll torque can be optimized sustainably. This offers the opportunity to continuously increase the workload in each pass to the theoretical maximum while product quality and process stability can also be improved.

Is Social Learning More Than Parameter Tuning? (2017)

Heinerman, Jacqueline ; Stork, Jörg ; Rebolledo Coy, Margarita Alejandra ; Hubert, Julien ; Eiben, A.E. ; Bartz-Beielstein, Thomas ; Haasdijk, Evert

Social learning enables multiple robots to share learned experiences while completing a task. The literature offers examples where robots trained with social learning reach a higher performance compared to their individual learning counterparts. No explanation has been advanced for that observation. In this research, we present experimental results suggesting that a lack of tuning of the parameters in social learning experiments could be the cause. In other words: the better the parameter settings are tuned, the less social learning can improve the system performance.

UniFIeD Univariate Frequency-based Imputation for Time Series Data (2013)

Friese, Martina ; Stork, Jörg ; Ramos Guerra, Ricardo ; Bartz-Beielstein, Thomas ; Thaker, Soham ; Flasch, Oliver ; Zaefferer, Martin

This paper introduces UniFIeD, a new data preprocessing method for time series. UniFIeD can cope with large intervals of missing data. In addition, a scalable test function generator is presented, which allows the simulation of time series with different gap sizes. An experimental study demonstrates that (i) UniFIeD performs significantly better than simple imputation methods and (ii) UniFIeD is able to handle situations where advanced imputation methods fail. The results are independent of the underlying error measurements.

Building Ensembles of Surrogate Models by Optimal Convex Combination (2016)

Friese, Martina ; Bartz-Beielstein, Thomas ; Emmerich, Michael

When using machine learning techniques to learn a function approximation from given data, it is often difficult to select the right modeling technique. In many real-world settings, no preliminary knowledge about the objective function is available. In that case, it might be beneficial if the algorithm could learn all models by itself and select the model that best suits the problem. This approach is known as automated model selection. In this work, we propose a generalization of this approach. It combines the predictions of several surrogate models into one more accurate ensemble surrogate model. This approach is studied in a fundamental way, by first evaluating minimalistic ensembles of only two surrogate models in detail and then proceeding to ensembles with three and more surrogate models. The results show to what extent combinations of models can perform better than single surrogate models and provide insights into the scalability and robustness of the approach. The study focuses on multi-modal function topologies, which are important in surrogate-assisted global optimization.
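
For the minimal two-model case, a convex combination can be sketched as follows. This is an illustration only, not the authors' procedure: it assumes the weight is chosen to minimize the squared error on some held-out data, and the names `optimal_convex_weight` and `combine` are hypothetical. For predictions a and b and targets y, the ensemble w·a + (1 − w)·b has a closed-form least-squares weight, which is then clipped to [0, 1] to keep the combination convex.

```python
def optimal_convex_weight(pred_a, pred_b, y):
    """Weight w in [0, 1] minimizing sum((y - (w*a + (1-w)*b))**2).

    Hypothetical helper: the unconstrained least-squares solution is
    <y - b, a - b> / ||a - b||^2; clipping to [0, 1] gives the optimum
    of this one-dimensional convex problem on the interval.
    """
    num = sum((yi - bi) * (ai - bi) for ai, bi, yi in zip(pred_a, pred_b, y))
    den = sum((ai - bi) ** 2 for ai, bi in zip(pred_a, pred_b))
    if den == 0.0:
        return 0.5  # identical predictions: every weight gives the same ensemble
    return min(1.0, max(0.0, num / den))


def combine(pred_a, pred_b, w):
    """Ensemble prediction as a convex combination of two models."""
    return [w * ai + (1.0 - w) * bi for ai, bi in zip(pred_a, pred_b)]


# Toy example: model A overshoots by 1, model B undershoots by 1.
y_true = [1.0, 2.0, 3.0]
model_a = [2.0, 3.0, 4.0]
model_b = [0.0, 1.0, 2.0]
w = optimal_convex_weight(model_a, model_b, y_true)
ensemble = combine(model_a, model_b, w)
```

In this toy case the two errors cancel at w = 0.5, so the ensemble is more accurate than either single model. With three or more models the weights would have to be found by a constrained optimizer over the simplex, which this sketch does not cover.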

Learning Model-Ensemble Policies with Genetic Programming (2015)

Flasch, Oliver ; Friese, Martina ; Zaefferer, Martin ; Bartz-Beielstein, Thomas ; Branke, Jürgen

We propose to apply typed Genetic Programming (GP) to the problem of finding surrogate-model ensembles for global optimization on compute-intensive target functions. In a model ensemble, base-models such as linear models, random forest models, or Kriging models, as well as pre- and post-processing methods, are combined. In theory, an optimal ensemble will join the strengths of its constituent base-models while avoiding their weaknesses, offering higher prediction accuracy and robustness. This study defines a grammar of model ensemble expressions and searches the set for optimal ensembles via GP. We performed an extensive experimental study based on 10 different objective functions and 2 sets of base-models. We arrive at promising results: on unseen test data, our ensembles do not perform significantly worse than the best base-model.

From Real World Data to Test Functions (2016)

Fischbach, Andreas ; Zaefferer, Martin ; Stork, Jörg ; Friese, Martina ; Bartz-Beielstein, Thomas

When researchers and practitioners in the field of computational intelligence are confronted with real-world problems, the question arises which method is the best to apply. Nowadays, there are several well-established test suites and well-known artificial benchmark functions available. However, the relevance and applicability of these methods to real-world problems remain an open question in many situations. Furthermore, the generalizability of these methods cannot be taken for granted. This paper describes a data-driven approach for the generation of test instances, which is based on real-world data. The test instance generation uses data preprocessing, feature extraction, modeling, and parameterization. We apply this methodology to a classical design-of-experiments real-world project and generate test instances for benchmarking, e.g., design methods, surrogate techniques, and optimization algorithms. While most available results of methods applied to real-world problems lack availability of the data for comparison, our future goal is to create a toolbox covering multiple data sets of real-world projects to provide a test function generator to the research community.

CAAI - A Cognitive Architecture to Introduce Artificial Intelligence in Cyber-Physical Production Systems (2020)

Fischbach, Andreas ; Strohschein, Jan ; Bunte, Andreas ; Stork, Jörg ; Faeskorn-Woyke, Heide ; Moriz, Natalia ; Bartz-Beielstein, Thomas

This paper introduces CAAI, a novel cognitive architecture for artificial intelligence in cyber-physical production systems. The goal of the architecture is to reduce the implementation effort for the usage of artificial intelligence algorithms. The core of the CAAI is a cognitive module that processes declarative goals of the user, selects suitable models and algorithms, and creates a configuration for the execution of a processing pipeline on a big data platform. Constant observation and evaluation against performance criteria assess the performance of pipelines for many and varying use cases. Based on these evaluations, the pipelines are automatically adapted if necessary. The modular design with well-defined interfaces enables the reusability and extensibility of pipeline components. A big data platform implements this modular design supported by technologies such as Docker, Kubernetes, and Kafka for virtualization and orchestration of the individual components and their communication. The implementation of the architecture is evaluated using a real-world use case.

Konviviale Künstliche Intelligenz: Definition und Entwicklung eines Vorgehensmodells [Convivial Artificial Intelligence: Definition and Development of a Process Model] (2023)

Dusdal, Markus ; Richard, Schulz ; Haag, Christoph ; Bartz-Beielstein, Thomas

The work describes the development and spread of artificial intelligence (AI) and the associated challenges and opportunities. It emphasizes that, despite the obvious benefits of AI, there are concerns about undesirable side effects caused by faulty or abusive applications. To address these challenges, an approach called "convivial artificial intelligence" is proposed. This approach aims at a harmonious interplay between AI and humans and stresses the need for human-centered design in the development and implementation of AI models.

Data Preprocessing: A New Algorithm for Univariate Imputation Designed Specifically for Industrial Needs (2016)

Chandrasekaran, Sowmya ; Zaefferer, Martin ; Moritz, Steffen ; Stork, Jörg ; Friese, Martina ; Fischbach, Andreas ; Bartz-Beielstein, Thomas

Data pre-processing is a key research topic in data mining because it plays a crucial role in improving the accuracy of any data mining algorithm. In most real-world cases, a significant amount of the recorded data is missing due to a wide variety of errors. This loss of data is nearly always unavoidable. Recovery of missing data plays a vital role in avoiding inaccurate data mining decisions. Most multivariate imputation methods are not compatible with univariate datasets, and the traditional univariate imputation techniques become highly biased as the missing data gap increases. With current technological advancements, abundant data is being captured every second. Hence, we intend to develop a new algorithm that enables maximum utilization of the available big datasets for imputation. In this paper, we present a Seasonal and Trend decomposition using Loess (STL) based Seasonal Moving Window Algorithm, which is capable of handling patterns with trend as well as cyclic characteristics. We show that the algorithm is highly suitable for pre-processing of large datasets.

EventDetectR – An Open-Source Event Detection System (2020)

Chandrasekaran, Sowmya ; Rebolledo, Margarita ; Bartz-Beielstein, Thomas

EventDetectR is an efficient Event Detection System (EDS) capable of detecting unexpected water quality conditions. The approach uses multiple algorithms to model the relationship between various multivariate water quality signals. The residuals of the models are then used to construct the event detection algorithm, which provides a continuous measure of the probability of an event at every time step. The proposed framework was tested for water contamination events with industrial data from automated water quality sensors. The results showed that the framework is reliable, with improved performance, and is highly suitable for event detection.

Simulation and Optimization of Cyclone Dust Separators (2013)

Breiderhoff, Beate ; Bartz-Beielstein, Thomas ; Naujoks, Boris ; Zaefferer, Martin ; Fischbach, Andreas ; Flasch, Oliver ; Friese, Martina ; Mersmann, Olaf ; Stork, Jörg

Cyclone dust separators are devices often used to filter solid particles from flue gas. Such cyclones are supposed to filter as many solid particles from the carrying gas as possible. At the same time, they should introduce only a minimal pressure loss to the system. Hence, collection efficiency has to be maximized and pressure loss minimized. Both the collection efficiency and the pressure loss are heavily influenced by the cyclone's geometry. In this paper, we optimize seven geometrical parameters of an analytical cyclone model. Furthermore, noise variables are introduced to the model, representing the non-deterministic structure of the real-world problem. This is used to investigate the robustness and sensitivity of solutions. Both the deterministic and the stochastic model are optimized with an SMS-EMOA. The SMS-EMOA is compared to a single-objective optimization algorithm. For the harder, stochastic optimization problem, a surrogate-model-supported SMS-EMOA is compared against the model-free SMS-EMOA. The model-supported approach yields better solutions with the same run-time budget.
