Benchmark experiments are required to test, compare, tune, and understand optimization algorithms. Ideally, benchmark problems closely reflect real-world problem behavior. Yet, real-world problems are not always readily available for benchmarking. For example, evaluation costs may be too high, or resources are unavailable (e.g., software or equipment). As a solution, data from previous evaluations can be used to train surrogate models which are then used for benchmarking. The goal is to generate test functions on which the performance of an algorithm is similar to that on the real-world objective function. However, predictions from data-driven models tend to be smoother than the ground-truth from which the training data is derived. This is especially problematic when the training data becomes sparse. The resulting benchmarks may not reflect the landscape features of the ground-truth, are too easy, and may lead to biased conclusions.
To resolve this, we use simulation of Gaussian processes instead of estimation (or prediction). This retains the covariance properties estimated during model training. While previous research suggested a decomposition-based approach for a small-scale, discrete problem, we show that the spectral simulation method enables simulation for continuous optimization problems. In a set of experiments with an artificial ground-truth, we demonstrate that this yields more accurate benchmarks than simply predicting with the Gaussian process model.
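The difference between estimation (prediction) and simulation can be illustrated with a minimal sketch. It assumes a one-dimensional problem, a squared-exponential kernel with a fixed length scale, and a small evaluation grid. The function names are hypothetical, and the simulation step uses a plain eigendecomposition of the posterior covariance as a stand-in, not the spectral method described in the abstract.

```python
import numpy as np

def sq_exp_kernel(a, b, length_scale=0.3):
    """Squared-exponential covariance between 1-D point sets a and b."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_predict_and_simulate(x_train, y_train, x_grid, n_sim=3, jitter=1e-6, seed=0):
    """Return the posterior mean (estimation) and conditional samples (simulation)."""
    rng = np.random.default_rng(seed)
    K = sq_exp_kernel(x_train, x_train) + jitter * np.eye(len(x_train))
    Ks = sq_exp_kernel(x_grid, x_train)
    Kss = sq_exp_kernel(x_grid, x_grid)

    # Estimation: the posterior mean, which tends to be smooth.
    mean = Ks @ np.linalg.solve(K, y_train)

    # Simulation: draw from the posterior, which keeps the covariance structure.
    cov = Kss - Ks @ np.linalg.solve(K, Ks.T)
    cov = 0.5 * (cov + cov.T)                 # enforce symmetry
    w, V = np.linalg.eigh(cov)
    w = np.clip(w, 0.0, None)                 # remove tiny negative eigenvalues
    A = V * np.sqrt(w)                        # A @ A.T equals the clipped covariance
    z = rng.standard_normal((len(x_grid), n_sim))
    sims = mean[:, None] + A @ z
    return mean, sims.T

x_train = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
y_train = np.sin(2 * np.pi * x_train)
x_grid = np.linspace(0.0, 1.0, 41)
mean, sims = gp_predict_and_simulate(x_train, y_train, x_grid, n_sim=4)
```

Each row of `sims` passes close to the training data but varies between the observed points, whereas `mean` is a single smooth curve.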

Surrogate-based optimization relies on so-called infill criteria (acquisition functions) to decide which point to evaluate next. When Kriging is used as the surrogate model of choice (also called Bayesian optimization), one of the most frequently chosen criteria is expected improvement. We argue that the popularity of expected improvement largely relies on its theoretical properties rather than empirically validated performance. A few results from the literature show evidence that, under certain conditions, expected improvement may perform worse than something as simple as the predicted value of the surrogate model. We benchmark both infill criteria in an extensive empirical study on the ‘BBOB’ function set. This investigation includes a detailed study of the impact of problem dimensionality on algorithm performance. The results support the hypothesis that exploration loses importance with increasing problem dimensionality. A statistical analysis reveals that the purely exploitative search with the predicted value criterion performs better on most problems of five or more dimensions. Possible reasons for these results are discussed. In addition, we give an in-depth guide for choosing the infill criterion based on prior knowledge about the problem at hand, its dimensionality, and the available budget.
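The two infill criteria compared here can be sketched as follows, assuming minimization and a surrogate that returns a predictive mean and standard deviation for each candidate. The candidate tuples and the helper `pick_candidate` are hypothetical and only illustrate how the criteria can disagree.

```python
import math

def expected_improvement(mu, sigma, f_best):
    """Expected improvement for minimization at a point with mean mu and std sigma."""
    if sigma <= 0.0:
        return max(f_best - mu, 0.0)
    z = (f_best - mu) / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))          # standard normal CDF
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)   # standard normal PDF
    return (f_best - mu) * cdf + sigma * pdf

def pick_candidate(candidates, criterion, f_best):
    """candidates: list of (name, mu, sigma); returns the name of the chosen point."""
    if criterion == "pv":   # predicted value: purely exploitative
        return min(candidates, key=lambda c: c[1])[0]
    if criterion == "ei":   # expected improvement: balances mean and uncertainty
        return max(candidates, key=lambda c: expected_improvement(c[1], c[2], f_best))[0]
    raise ValueError(f"unknown criterion: {criterion}")

f_best = 1.0
candidates = [("A", 0.9, 0.01),   # slightly better mean, almost no uncertainty
              ("B", 1.2, 1.0)]    # worse mean, large uncertainty
choice_pv = pick_candidate(candidates, "pv", f_best)
choice_ei = pick_candidate(candidates, "ei", f_best)
```

In this toy setting the predicted value criterion selects the certain point `A`, while expected improvement prefers the uncertain point `B`.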

The availability of several CPU cores on current computers enables
parallelization and increases the computational power significantly.
Optimization algorithms have to be adapted to exploit these highly
parallelized systems and evaluate multiple candidate solutions in
each iteration. This issue is especially challenging for expensive
optimization problems, where surrogate models are employed to
reduce the load of objective function evaluations.
This paper compares different approaches for surrogate model-based
optimization in parallel environments. Additionally, an easy-to-use
method, which was developed for an industrial project, is
proposed. All described algorithms are tested with a variety of
standard benchmark functions. Furthermore, they are applied to
a real-world engineering problem, the electrostatic precipitator
problem. Expensive computational fluid dynamics simulations are
required to estimate the performance of the precipitator. The task
is to optimize a gas-distribution system so that a desired velocity
distribution is achieved for the gas flow throughout the precipitator.
The vast number of possible configurations leads to a complex
discrete-valued optimization problem. The experiments indicate
that a hybrid approach works best, which proposes candidate solutions
based on different surrogate model-based infill criteria and
evolutionary operators.

Increasing computational power and the availability of 3D printers provide new tools for the combination of modeling and experimentation. Several simulation tools can be run independently and in parallel, e.g., long-running computational fluid dynamics simulations can be accompanied by experiments with 3D printers. Furthermore, results from analytical and data-driven models can be incorporated. However, there are fundamental differences between these modeling approaches: some models, e.g., analytical models, use domain knowledge, whereas data-driven models do not require any information about the underlying processes.
At the same time, data-driven models require input and output data, but analytical models do not. Combining results from models with different input-output structures might improve and accelerate the optimization process. The optimization via multimodel simulation (OMMS) approach, which is able to combine results from these different models, is introduced in this paper.
Using cyclonic dust separators as a real-world simulation problem, the feasibility of this approach is demonstrated and a proof-of-concept is presented. Cyclones are popular devices used to filter dust from the emitted flue gases. They are applied as pre-filters in many industrial processes including energy production and grain processing facilities. Pros and cons of this multimodel optimization approach are discussed and experiences from experiments are presented.

Surrogate-assisted optimization has proven to be very successful when applied to industrial problems. Using a data-driven surrogate model of an objective function during an optimization cycle has many benefits: the surrogate is cheap to evaluate and provides information about both the objective landscape and the parameter space. In preliminary work, it was investigated how surrogate-assisted optimization can help to optimize the structure of a neural network (NN) controller. In this work, we focus on how surrogates can help to improve the direct learning process of a transparent feed-forward neural network controller. As an initial case study, we consider a manageable real-world control task: the elevator supervisory group control (ESGC) problem, using a simplified simulation model. We use this model as a benchmark that should indicate the applicability and performance of surrogate-assisted optimization for this kind of task. While the optimization process itself is not considered expensive in this case, the results show that surrogate-assisted optimization is capable of outperforming metaheuristic optimization methods for a low number of evaluations. Furthermore, the surrogate can be used for significance analysis of the inputs and weighted connections to further exploit problem information.

The performance of optimization algorithms relies crucially on their parameterizations. Finding good parameter settings is called algorithm tuning. Using a simple simulated annealing algorithm, we will demonstrate how optimization algorithms can be tuned using the Sequential Parameter Optimization Toolbox (SPOT). SPOT provides several tools for automated and interactive tuning. The underlying concepts of the SPOT approach are explained. This includes key techniques such as exploratory fitness landscape analysis and response surface methodology. Many examples illustrate how SPOT can be used for understanding the performance of algorithms and gaining insight into algorithm behavior. Furthermore, we demonstrate how SPOT can be used as an optimizer and how a sophisticated ensemble approach is able to combine several meta-models via stacking.

To maximize the throughput of a hot rolling mill, the number of passes has to be reduced. This can be achieved by maximizing the thickness reduction in each pass. For this purpose, exact predictions of roll force and torque are required. Hence, the predictive models that describe the physical behavior of the product have to be accurate and cover a wide range of different materials.
Due to market requirements, many new materials are tested and rolled. If these materials are chosen to be rolled more often, a suitable flow curve has to be established. Determining those flow curves in the laboratory is not reasonable because of the cost and time involved. The logical consequence is a strong demand for quick parameter determination and for optimizing flow curve parameters at minimum cost. Therefore, parameter estimation and optimization with real data collected during previous runs are a promising idea. Producers benefit from this data-driven approach and gain considerable flexibility when rolling new materials, optimizing current production, and increasing quality. This concept would also allow optimizing flow curve parameters that have already been treated by standard methods. In this article, a new data-driven approach for predicting the physical behavior of the product and setting important parameters is presented.
We demonstrate how the prediction quality of the roll force and roll torque can be optimized sustainably. This offers the opportunity to continuously increase the workload in each pass to the theoretical maximum while product quality and process stability can also be improved.

When designing or developing optimization algorithms, test functions are crucial to evaluate
performance. Often, test functions are not sufficiently difficult, diverse, flexible or relevant to real-world
applications. Previously,
test functions with real-world relevance were generated by training a machine learning model based on
real-world data. The model estimation is used as a test function.
We propose a more principled approach using simulation instead of estimation.
Thus, relevant and varied test functions
are created which represent the behavior of real-world fitness landscapes.
Importantly, estimation can lead to excessively smooth test functions
while simulation may avoid this pitfall. Moreover, the simulation
can be conditioned by the data, so that the simulation reproduces the training data
but features diverse behavior in unobserved regions of the search space.
The proposed test function generator is illustrated with an intuitive, one-dimensional
example. To demonstrate the utility of this approach it
is applied to a protein sequence optimization problem.
This application demonstrates the advantages as well as practical limits of simulation-based
test functions.

Cyclone separators are popular devices used to filter dust from the emitted flue gases. They are applied as pre-filters in many industrial processes including energy production and grain processing facilities.
Increasing computational power and the availability of 3D printers provide new tools for the combination of modeling and experimentation, which are necessary for constructing efficient cyclones. Several simulation tools can be run in parallel, e.g., long-running CFD simulations can be accompanied by experiments with 3D printers. Furthermore, results from analytical and data-driven models can be incorporated. There are fundamental differences between these modeling approaches: some models, e.g., analytical models, use domain knowledge, whereas data-driven models do not require any information about the underlying processes.
At the same time, data-driven models require input and output data, whereas analytical models do not. Combining results from models with different input-output structure is of great interest. This combination inspired the development of a new methodology. An optimization via multimodel simulation approach, which combines results from different models, is introduced.
Using cyclonic dust separators (cyclones) as a real-world simulation problem, the feasibility of this approach is demonstrated. Pros and cons of this approach are discussed and experiences from the experiments are presented.
Furthermore, technical problems, which are related to 3D-printing approaches, are discussed.

The use of surrogate models is a standard method to deal with complex, real-world
optimization problems. The first surrogate models were applied to continuous
optimization problems. In recent years, surrogate models gained importance
for discrete optimization problems. This article, which consists of three
parts, addresses this development. The first part presents a survey of model-based
methods, focusing on continuous optimization. It introduces a taxonomy,
which is useful as a guideline for selecting adequate model-based optimization
tools. The second part provides details for the case of discrete optimization
problems. Here, six strategies for dealing with discrete data structures are introduced.
A new approach for combining surrogate information via stacking
is proposed in the third part. The implementation of this approach will be
available in the open source R package SPOT2. The article concludes with a
discussion of recent developments and challenges in both application domains.
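The idea of combining surrogate information via stacking can be sketched in a few lines. This is an illustrative sketch only and not the SPOT2 implementation: the base models are simple polynomial fits, the meta-learner is an unconstrained least-squares combination, and all function names are hypothetical.

```python
import numpy as np

def fit_poly(x, y, degree):
    """Least-squares polynomial base model; returns a predict function."""
    coef = np.polyfit(x, y, degree)
    return lambda x_new: np.polyval(coef, x_new)

def stack(x, y, degrees=(0, 1, 2), n_folds=5):
    """Learn combination weights from out-of-fold predictions of the base models."""
    folds = np.arange(len(x)) % n_folds
    oof = np.zeros((len(x), len(degrees)))
    for k in range(n_folds):
        train, test = folds != k, folds == k
        for j, d in enumerate(degrees):
            # Each base model predicts only points it has not seen during fitting.
            oof[test, j] = fit_poly(x[train], y[train], d)(x[test])
    # Meta-learner: least-squares weights mapping base predictions to targets.
    weights, *_ = np.linalg.lstsq(oof, y, rcond=None)
    # Refit the base models on all data for the final ensemble.
    models = [fit_poly(x, y, d) for d in degrees]

    def predict(x_new):
        return sum(w * m(x_new) for w, m in zip(weights, models))

    return predict, weights

x = np.linspace(-1.0, 1.0, 20)
y = 1.0 + 2.0 * x + 3.0 * x ** 2
predict, weights = stack(x, y)
```

Because the toy data are exactly quadratic, the learned weights place essentially all mass on the quadratic base model.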

Data pre-processing is a key research topic in data mining because it plays a
crucial role in improving the accuracy of any data mining algorithm. In most
real-world cases, a significant amount of the recorded data is found missing
due to a wide variety of errors. This loss of data is nearly always unavoidable.
Recovery of missing data plays a vital role in avoiding inaccurate data
mining decisions. Most multivariate imputation methods are not compatible
with univariate datasets, and the traditional univariate imputation techniques
become highly biased as the missing data gap increases. With the current
technological advancements, abundant data is being captured every second.
Hence, we intend to develop a new algorithm that enables maximum
utilization of the available big datasets for imputation. In this paper, we
present a Seasonal and Trend decomposition using Loess (STL) based
Seasonal Moving Window Algorithm, which is capable of handling patterns
with trend as well as cyclic characteristics. We show that the algorithm is
highly suitable for pre-processing of large datasets.

When researchers and practitioners in the field of
computational intelligence are confronted with real-world
problems, the question arises which method is the best to
apply. Nowadays, there are several well-established test
suites and well-known artificial benchmark functions
available.
However, the relevance and applicability of these methods to
real-world problems remain an open question in many
situations. Furthermore, the generalizability of these
methods cannot be taken for granted.
This paper describes a data-driven approach for the
generation of test instances, which is based on
real-world data. The test instance generation uses
data-preprocessing, feature extraction, modeling, and
parameterization. We apply this methodology to a classical
design-of-experiments real-world project and generate test
instances for benchmarking, e.g., design methods, surrogate
techniques, and optimization algorithms. While most
available results of methods applied to real-world
problems lack availability of the data for comparison,
our future goal is to create a toolbox covering multiple
data sets of real-world projects to provide a test
function generator to the research community.

This final report describes the results achieved in the project "CI-basierte mehrkriterielle Optimierungsverfahren für Anwendungen in der Industrie" (CIMO; CI-based multi-criteria optimization methods for industrial applications) between November 2011 and the end of October 2014. Suitable solution methods were developed for expensive optimization problems from industry. The focus was on methods from computational intelligence (CI) and surrogate modeling. These make it possible to address important challenges of expensive, complex optimization problems. The methods developed can take several conflicting objectives into account, integrate different hierarchy levels of the problem into the optimization, respect constraints, process vector-valued as well as structured data (combinatorial optimization), and reduce the need for expensive or time-consuming objective function evaluations. The methods were applied primarily to a problem from power plant engineering, namely optimizing the geometry of a centrifugal separator (also: cyclone) that filters dust from exhaust gases. The optimization problem posed by these centrifugal separators involves conflicting objectives (e.g., pressure loss and collection efficiency). Cyclones can be evaluated with expensive computational fluid dynamics (CFD) simulations, among other approaches, but simple analytical equations are also available as estimates. Combining the two shows, by way of example, how hierarchy levels of an optimization problem can be linked using the project's methods. Beyond this main application, it was also shown that the methods can be applied successfully in many other areas: biogas production, water management, and the steel industry.
The particular challenges of the problems and methods addressed offer many important research opportunities for future projects, which the project partners are currently preparing.

We propose to apply typed Genetic Programming (GP) to the problem of finding surrogate-model ensembles for global optimization on compute-intensive target functions. In a model ensemble, base-models such as linear models, random forest models, or Kriging models, as well as pre- and post-processing methods, are combined. In theory, an optimal ensemble will join the strengths of its constituent base-models while avoiding their weaknesses, offering higher prediction accuracy and robustness. This study defines a grammar of model ensemble expressions and searches the set for optimal ensembles via GP. We performed an extensive experimental study based on 10 different objective functions and 2 sets of base-models. We arrive at promising results: on unseen test data, our ensembles do not perform significantly worse than the best base-model.

An essential task for the operation and planning of biogas plants is the optimization of substrate feed mixtures. Optimizing the monetary gain requires the determination of the exact amounts of maize, manure, grass silage, and other substrates. Accurate simulation models are mandatory for this optimization, because the underlying chemical processes are very slow. The simulation models themselves may be time-consuming to evaluate, hence we show how to use surrogate-model-based approaches to optimize biogas plants efficiently. In detail, a Kriging surrogate is employed. To improve the model quality of this surrogate, we integrate cheaply available data into the optimization process. To do so, multi-fidelity modeling methods such as Co-Kriging are employed. Furthermore, a two-layered modeling approach is employed to avoid deterioration of model quality due to discontinuities in the search space. At the same time, the cheaply available data is shown to be very useful for initialization of the employed optimization algorithms. Overall, we show how biogas plants can be efficiently modeled using data-driven methods, avoiding discontinuities as well as including cheaply available data. The application of the derived surrogate models to an optimization process is shown to be very difficult, yet successful for a lower problem dimension.

This paper introduces UniFIeD, a new data preprocessing method for time series. UniFIeD can cope with large intervals of missing data. In addition, a scalable test function generator is presented, which allows the simulation of time series with different gap sizes. An experimental study demonstrates that (i) UniFIeD performs significantly better than simple imputation methods and (ii) UniFIeD is able to handle situations where advanced imputation methods fail. The results are independent of the underlying error measurements.

Cyclone dust separators are devices often used to filter solid particles from flue gas. Such cyclones are supposed to filter as many solid particles from the carrying gas as possible. At the same time, they should introduce only a minimal pressure loss to the system. Hence, collection efficiency has to be maximized and pressure loss minimized. Both the collection efficiency and the pressure loss are heavily influenced by the cyclone's geometry. In this paper, we optimize seven geometrical parameters of an analytical cyclone model. Furthermore, noise variables are introduced to the model, representing the non-deterministic structure of the real-world problem. This is used to investigate the robustness and sensitivity of solutions. Both the deterministic and the stochastic models are optimized with an SMS-EMOA. The SMS-EMOA is compared to a single-objective optimization algorithm. For the harder, stochastic optimization problem, a surrogate-model-supported SMS-EMOA is compared against the model-free SMS-EMOA. The model-supported approach yields better solutions with the same run-time budget.

Multi-criteria optimization has gained increasing attention during the last decades. This article illustrates the multi-criteria features implemented in the statistical software package SPOT. It describes related software packages such as mco and emoa and gives a comprehensive introduction to simple multi-criteria optimization tasks. Several hands-on examples are used for illustration. The article is well-suited as a starting point for performing multi-criteria optimization tasks with SPOT.

Formerly, multi-criteria optimization algorithms were often tested using tens of thousands of function evaluations. In many real-world settings, function evaluations are very costly or the available budget is very limited. Several methods were developed to solve these cost-intensive multi-criteria optimization problems by reducing the number of function evaluations by means of surrogate optimization. In this study, we apply different multi-criteria surrogate optimization methods to improve (tune) an event-detection software for water-quality monitoring. For tuning two important parameters of this software, four state-of-the-art methods are compared: S-Metric-Selection Efficient Global Optimization (SMS-EGO), S-Metric-Expected Improvement for Efficient Global Optimization (SExI-EGO), Euclidean Distance based Expected Improvement (Euclid-EI, here referred to as MEI-SPOT due to its implementation in the Sequential Parameter Optimization Toolbox SPOT), and a multi-criteria approach based on SPO (MSPOT). Analyzing the performance of the different methods provides insight into the working mechanisms of cutting-edge multi-criteria solvers. As one of the approaches, namely MSPOT, does not consider the prediction variance of the surrogate model, it is of interest whether this can lead to premature convergence on the practical tuning problem. Furthermore, all four approaches will be compared to a simple SMS-EMOA to validate that the use of surrogate models is justified on this problem.

There is a strong need for sound statistical analysis of simulation and optimization algorithms. Based on this analysis, improved parameter settings can be determined. This will be referred to as tuning. Model-based investigations are common approaches in simulation and optimization. The sequential parameter optimization toolbox (SPOT), which is implemented as a package for the statistical programming language R, provides sophisticated means for tuning and understanding simulation and optimization algorithms. The toolbox includes methods for tuning based on classical regression and analysis of variance techniques; tree-based models such as classification and regression trees (CART) and random forests; Gaussian process models (Kriging); and combinations of different meta-modeling approaches. This article exemplifies how an existing optimization algorithm, namely simulated annealing, can be tuned using the SPOT framework.
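To make the notion of tuning concrete, the following sketch shows a basic simulated annealing routine with two tunable parameters (initial temperature and cooling factor) and a plain grid search over them. The grid search is only a stand-in assumption: SPOT itself uses sequential, model-based tuning, and none of the names below belong to the SPOT API.

```python
import math
import random

def simulated_annealing(f, x0, temp=1.0, cooling=0.95, step=0.5, iters=200, seed=0):
    """Minimize a 1-D function f; returns the best point found and its value."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best_x, best_f = x, fx
    for _ in range(iters):
        cand = x + rng.gauss(0.0, step)
        fc = f(cand)
        # Accept improvements always, deteriorations with Boltzmann probability.
        if fc <= fx or rng.random() < math.exp((fx - fc) / temp):
            x, fx = cand, fc
            if fx < best_f:
                best_x, best_f = x, fx
        temp = max(temp * cooling, 1e-12)   # geometric cooling, kept positive
    return best_x, best_f

def tune(f, x0, temps, coolings, seeds=range(5)):
    """Average the final result over several seeds for each parameter setting."""
    results = {}
    for t in temps:
        for c in coolings:
            scores = [simulated_annealing(f, x0, temp=t, cooling=c, seed=s)[1]
                      for s in seeds]
            results[(t, c)] = sum(scores) / len(scores)
    best = min(results, key=results.get)
    return best, results

def sphere(x):
    return x * x

best_x, best_f = simulated_annealing(sphere, x0=3.0)
best_setting, results = tune(sphere, 3.0, temps=[0.1, 10.0], coolings=[0.8, 0.99])
```

Averaging over several seeds matters because simulated annealing is stochastic, so a single run per setting would give a noisy comparison.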