Data pre-processing is a key research topic in data mining because it plays a
crucial role in improving the accuracy of any data mining algorithm. In most
real-world cases, a significant amount of the recorded data is missing
due to a wide variety of errors. This loss of data is nearly always unavoidable.
Recovering missing data plays a vital role in avoiding inaccurate data
mining decisions. Most multivariate imputation methods are not compatible
with univariate datasets, and the traditional univariate imputation techniques
become highly biased as the gap of missing data grows. With current
technological advancements, abundant data is captured every second.
Hence, we intend to develop a new algorithm that makes maximum
use of the available big datasets for imputation. In this paper, we
present a Seasonal Moving Window Algorithm based on Seasonal and Trend
decomposition using Loess (STL), which is capable of handling patterns
with trend as well as cyclic characteristics. We show that the algorithm is
highly suitable for pre-processing large datasets.
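
The idea of filling a gap from values at the same seasonal phase can be illustrated with a minimal sketch. This is not the algorithm from the paper: it omits the STL decomposition entirely, and the function name `seasonal_window_impute`, the `period` and `window` parameters, and the toy data are all assumptions made for illustration.

```python
from statistics import mean


def seasonal_window_impute(series, period, window=2):
    """Fill None gaps with the mean of observed values at the same seasonal phase.

    Simplified illustration only: no trend handling and no STL step.
    """
    filled = list(series)
    n = len(series)
    for t, value in enumerate(series):
        if value is not None:
            continue
        # Collect observed values that lie whole periods before or after t.
        neighbours = []
        for k in range(1, window + 1):
            for idx in (t - k * period, t + k * period):
                if 0 <= idx < n and series[idx] is not None:
                    neighbours.append(series[idx])
        # Leave the gap untouched if no seasonal neighbour was observed.
        if neighbours:
            filled[t] = mean(neighbours)
    return filled


# Toy series with period 3 and one missing value (hypothetical data).
data = [1.0, 5.0, 2.0, 1.0, None, 2.0, 1.0, 5.0, 2.0]
result = seasonal_window_impute(data, period=3)
```

Because the window only looks at the same phase in neighbouring cycles, a long gap does not pull the estimate toward unrelated parts of the cycle, which is the kind of bias the abstract attributes to traditional univariate techniques.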

When researchers and practitioners in the field of
computational intelligence are confronted with real-world
problems, the question arises which method is the best to
apply. Nowadays, several well-established test
suites and well-known artificial benchmark functions
are available.
However, the relevance and applicability of these methods to
real-world problems remain an open question in many
situations. Furthermore, the generalizability of these
methods cannot be taken for granted.
This paper describes a data-driven approach for the
generation of test instances, which is based on
real-world data. The test instance generation uses
data pre-processing, feature extraction, modeling, and
parameterization. We apply this methodology to a classical
real-world design-of-experiments project and generate test
instances for benchmarking, e.g., design methods, surrogate
techniques, and optimization algorithms. Because the data
behind most published results on real-world problems
are not available for comparison,
our future goal is to create a toolbox covering multiple
data sets of real-world projects to provide a test
function generator to the research community.

In this paper, we present a comparison of different data-driven modeling methods. A first instance of a data-driven linear Bayesian model is compared with several linear regression models, a Kriging model, and a genetic programming (GP) model.
The models are built on industrial data from the development of a robust gas sensor.
The data contain a limited number of samples and show high variance.
The mean squared error of the models on a test dataset is used as the comparison criterion.
The results indicate that standard linear regression approaches as well as Kriging and GP show good results,
whereas the Bayesian approach, although it requires additional resources, does not lead to improved results.
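
The comparison criterion described above, mean squared error on a held-out test set, can be sketched as follows. The two models (a one-variable least-squares line and a constant mean baseline), the function names, and the toy numbers are assumptions for illustration; they are not the models or the sensor data from the paper.

```python
def fit_simple_linear(xs, ys):
    """Ordinary least squares for one input variable; returns a predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def mean_squared_error(model, xs, ys):
    """Average squared prediction error of `model` on the given data."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Synthetic toy data (hypothetical), split into training and test parts.
train_x, train_y = [0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0]
test_x, test_y = [4.0, 5.0], [9.0, 11.0]

linear = fit_simple_linear(train_x, train_y)
train_mean = sum(train_y) / len(train_y)
baseline = lambda x: train_mean  # always predicts the training mean

# Every model is scored on the same unseen test data.
scores = {
    "linear": mean_squared_error(linear, test_x, test_y),
    "baseline": mean_squared_error(baseline, test_x, test_y),
}
best_model = min(scores, key=scores.get)
```

Scoring all candidates on the same test set keeps the comparison fair when, as in the abstract, the models differ widely in complexity.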

Cyclone dust separators are devices often used to filter solid particles from flue gas. Such cyclones are supposed to filter as many solid particles from the carrier gas as possible. At the same time, they should introduce only a minimal pressure loss to the system. Hence, collection efficiency has to be maximized and pressure loss minimized. Both the collection efficiency and the pressure loss are heavily influenced by the cyclone's geometry. In this paper, we optimize seven geometrical parameters of an analytical cyclone model. Furthermore, noise variables are introduced to the model, representing the non-deterministic structure of the real-world problem. This is used to investigate the robustness and sensitivity of solutions. Both the deterministic and the stochastic model are optimized with an SMS-EMOA. The SMS-EMOA is compared to a single-objective optimization algorithm. For the harder, stochastic optimization problem, a surrogate-model-supported SMS-EMOA is compared against the model-free SMS-EMOA. The model-supported approach yields better solutions with the same run-time budget.
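
SMS-EMOA selects solutions by their contribution to the dominated hypervolume. The sketch below shows that selection criterion for two minimized objectives, assuming the cyclone problem is cast as minimizing both collection loss and pressure loss. The function names, the reference point, and the objective values are hypothetical, and the full algorithm (variation operators, non-dominated sorting) is omitted.

```python
def hypervolume_2d(points, reference):
    """Area dominated by `points` and bounded by `reference` (both minimized)."""
    area = 0.0
    best_f2 = reference[1]
    # Sweep in order of the first objective and add one rectangle per
    # point that improves on the best second objective seen so far.
    for f1, f2 in sorted(points):
        if f1 >= reference[0]:
            continue  # outside the reference box, contributes nothing
        if f2 < best_f2:
            area += (reference[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return area


def least_contributor(points, reference):
    """Return the point whose removal loses the least hypervolume."""
    total = hypervolume_2d(points, reference)
    contributions = []
    for i, point in enumerate(points):
        rest = points[:i] + points[i + 1:]
        contributions.append((total - hypervolume_2d(rest, reference), point))
    return min(contributions)[1]


# Hypothetical objective vectors: (collection loss, pressure loss).
population = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0), (3.0, 3.0)]
reference_point = (5.0, 5.0)

total_hv = hypervolume_2d(population, reference_point)
discarded = least_contributor(population, reference_point)
```

In this toy population, the point `(3.0, 3.0)` is dominated by `(2.0, 2.0)`, so it contributes no hypervolume and is the one a steady-state selection step would discard.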

When designing or developing optimization algorithms, test functions are crucial to evaluate
performance. Often, test functions are not sufficiently difficult, diverse, flexible or relevant to real-world
applications. Previously,
test functions with real-world relevance were generated by training a machine learning model based on
real-world data. The model estimation is used as a test function.
We propose a more principled approach using simulation instead of estimation.
Thus, relevant and varied test functions
are created which represent the behavior of real-world fitness landscapes.
Importantly, estimation can lead to excessively smooth test functions
while simulation may avoid this pitfall. Moreover, the simulation
can be conditioned by the data, so that the simulation reproduces the training data
but features diverse behavior in unobserved regions of the search space.
The proposed test function generator is illustrated with an intuitive, one-dimensional
example. To demonstrate the utility of this approach, it
is applied to a protein sequence optimization problem.
This application demonstrates the advantages as well as practical limits of simulation-based
test functions.
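
The difference between estimation and conditional simulation can be written out under the assumption that the underlying model is a Gaussian process (Kriging) with kernel $k$; the abstract itself does not name the model class, so the notation below ($\hat{\mu}$, $\mathbf{K}$, $\mathbf{k}(x)$, $s$) is illustrative only.

```latex
% Estimation: the predictor (posterior mean), typically a smooth function.
\hat{y}(x) = \hat{\mu} + \mathbf{k}(x)^{\top} \mathbf{K}^{-1} \left( \mathbf{y} - \mathbf{1}\hat{\mu} \right)

% Unconditional simulation: a random realization of the process,
% with \mathbf{s} denoting its values at the training points x_1, \dots, x_n.
s(x) \sim \mathcal{GP}\left( \hat{\mu},\, k \right), \qquad
\mathbf{s} = \left( s(x_1), \dots, s(x_n) \right)^{\top}

% Conditional simulation: the realization is corrected so that it
% reproduces the training data.
s_c(x) = s(x) + \mathbf{k}(x)^{\top} \mathbf{K}^{-1} \left( \mathbf{y} - \mathbf{s} \right)
```

At a training point $x_i$, the vector $\mathbf{k}(x_i)^{\top} \mathbf{K}^{-1}$ reduces to the $i$-th unit vector, so $s_c(x_i) = y_i$: the simulation matches the data there, while away from the data it keeps the variability of $s(x)$ instead of reverting to the smooth predictor.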

Surrogate-assisted optimization has proven to be very successful when applied to industrial problems. The use of a data-driven surrogate model of an objective function during an optimization cycle has many benefits, such as being cheap to evaluate and providing information about both the objective landscape and the parameter space. In preliminary work, it was investigated how surrogate-assisted optimization can help to optimize the structure of a neural network (NN) controller. In this work, we focus on how surrogates can help to improve the direct learning process of a transparent feed-forward neural network controller. As an initial case study, we consider a manageable real-world control task: the elevator supervisory group control (ESGC) problem, using a simplified simulation model. We use this model as a benchmark that should indicate the applicability and performance of surrogate-assisted optimization for this kind of task. While the optimization process itself is in this case not considered expensive, the results show that surrogate-assisted optimization is capable of outperforming metaheuristic optimization methods for a low number of evaluations. Furthermore, the surrogate can be used for significance analysis of the inputs and weighted connections to further exploit problem information.
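
The general loop behind surrogate-assisted optimization can be sketched as below. Everything here is an assumption for illustration: the one-dimensional objective stands in for an expensive simulation, and the inverse-distance-weighted interpolator is a deliberately simple surrogate, not the model used in the paper. The search is also purely exploitative, whereas practical methods balance exploitation against exploration.

```python
import random


def expensive_objective(x):
    """Hypothetical stand-in for a costly simulation run."""
    return (x - 0.3) ** 2


def idw_surrogate(xs, ys, power=2.0):
    """Build an inverse-distance-weighted interpolator from evaluated points."""
    def predict(x):
        weighted = []
        for xi, yi in zip(xs, ys):
            distance = abs(x - xi)
            if distance == 0.0:
                return yi  # exact hit on an already evaluated point
            weighted.append((1.0 / distance ** power, yi))
        total = sum(w for w, _ in weighted)
        return sum(w * yi for w, yi in weighted) / total
    return predict


def surrogate_assisted_search(objective, budget=10, n_init=4,
                              n_candidates=200, seed=1):
    """Spend `budget` true evaluations, guided by a cheap surrogate."""
    rng = random.Random(seed)
    # Initial design: evenly spaced points in [0, 1].
    xs = [i / (n_init - 1) for i in range(n_init)]
    ys = [objective(x) for x in xs]
    while len(xs) < budget:
        model = idw_surrogate(xs, ys)
        # Screen many candidates on the surrogate, evaluate only the best.
        candidates = [rng.random() for _ in range(n_candidates)]
        x_new = min(candidates, key=model)
        xs.append(x_new)
        ys.append(objective(x_new))
    best = min(range(len(xs)), key=ys.__getitem__)
    return xs[best], ys[best], len(xs)


best_x, best_y, n_evaluations = surrogate_assisted_search(expensive_objective)
```

The true objective is called only once per iteration, while the surrogate absorbs the 200 candidate evaluations, which is why the approach pays off when the evaluation budget is small.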