### Refine

#### Document Type

- Working Paper (12) (remove)

#### Keywords

- Optimization (4)
- Benchmarking (3)
- Modeling (3)
- Optimierung (3)
- Parallelization (2)
- Artificial intelligence (1)
- Bayesian Optimization (1)
- Big Data (1)
- Big data platform (1)
- Cognition (1)

When designing or developing optimization algorithms, test functions are crucial to evaluate
performance. Often, test functions are not sufficiently difficult, diverse, flexible or relevant to real-world
applications. Previously,
test functions with real-world relevance were generated by training a machine learning model based on
real-world data. The model estimation is used as a test function.
We propose a more principled approach using simulation instead of estimation.
Thus, relevant and varied test functions
are created which represent the behavior of real-world fitness landscapes.
Importantly, estimation can lead to excessively smooth test functions
while simulation may avoid this pitfall. Moreover, the simulation
can be conditioned by the data, so that the simulation reproduces the training data
but features diverse behavior in unobserved regions of the search space.
The proposed test function generator is illustrated with an intuitive, one-dimensional
example. To demonstrate the utility of this approach it
is applied to a protein sequence optimization problem.
This application demonstrates the advantages as well as practical limits of simulation-based
test functions.

Surrogate-assisted optimization has proven to be very successful if applied to industrial problems. The use of a data-driven surrogate model of an objective function during an optimization cycle has many bene ts, such as being cheap to evaluate and further providing both information about the objective landscape and the parameter space. In preliminary work, it was researched how surrogate-assisted optimization can help to optimize the structure of a neural network (NN) controller. In this work, we will focus on how surrogates can help to improve the direct learning process of a transparent feed-forward neural network controller. As an initial case study we will consider a manageable real-world control task: the elevator supervisory group problem (ESGC) using a simplified simulation model. We use this model as a benchmark which should indicate the applicability and performance of surrogate-assisted optimization to this kind of tasks. While the optimization process itself is in this case not onsidered expensive, the results show that surrogate-assisted optimization is capable of outperforming metaheuristic optimization methods for a low number of evaluations. Further the surrogate can be used for signi cance analysis of the inputs and weighted connections to further exploit problem information.

As the amount of data gathered by monitoring systems increases, using computational tools to analyze it becomes a necessity.
Machine learning algorithms can be used in both regression and classification problems, providing useful insights while avoiding the bias and proneness to errors of humans. In this paper, a specific kind of decision tree algorithm, called conditional inference tree, is used to extract relevant knowledge from data that pertains to electrical motors. The model is chosen due to its flexibility, strong statistical foundation, as well as great capabilities to generalize and cope with problems in the data. The obtained knowledge is organized in a structured way and then analyzed in the context of health condition monitoring. The final
results illustrate how the approach can be used to gain insight into the system and present the results in an understandable, user-friendly manner

The availability of several CPU cores on current computers enables
parallelization and increases the computational power significantly.
Optimization algorithms have to be adapted to exploit these highly
parallelized systems and evaluate multiple candidate solutions in
each iteration. This issue is especially challenging for expensive
optimization problems, where surrogate models are employed to
reduce the load of objective function evaluations.
This paper compares different approaches for surrogate modelbased
optimization in parallel environments. Additionally, an easy
to use method, which was developed for an industrial project, is
proposed. All described algorithms are tested with a variety of
standard benchmark functions. Furthermore, they are applied to
a real-world engineering problem, the electrostatic precipitator
problem. Expensive computational fluid dynamics simulations are
required to estimate the performance of the precipitator. The task
is to optimize a gas-distribution system so that a desired velocity
distribution is achieved for the gas flow throughout the precipitator.
The vast amount of possible configurations leads to a complex
discrete valued optimization problem. The experiments indicate
that a hybrid approach works best, which proposes candidate solutions
based on different surrogate model-based infill criteria and
evolutionary operators.

An important class of black-box optimization problems relies on using simulations to assess the quality of a given candidate solution. Solving such problems can be computationally expensive because each simulation is very time-consuming. We present an approach to mitigate this problem by distinguishing two factors of computational cost: the number of trials and the time needed to execute the trials. Our approach tries to keep down the number of trials by using Bayesian optimization (BO) –known to be sample efficient– and reducing wall-clock times by parallel execution of trials. We compare the performance of four parallelization methods and two model-free alternatives. Each method is evaluated on all 24 objective functions of the Black-Box-Optimization- Benchmarking (BBOB) test suite in their five, ten, and 20-dimensional versions. Additionally, their performance is investigated on six test cases in robot learning. The results show that parallelized BO outperforms the state-of-the-art CMA-ES on the BBOB test functions, especially for higher dimensions. On the robot learning tasks, the differences are less clear, but the data do support parallelized BO as the ‘best guess’, winning on some cases and never losing.

To maximize the throughput of a hot rolling mill,
the number of passes has to be reduced. This can be achieved by maximizing the thickness reduction in each pass. For this purpose, exact predictions of roll force and torque are required. Hence, the predictive models that describe the physical behavior of the product have to be accurate and cover a wide range of different materials.
Due to market requirements a lot of new materials are tested and rolled. If these materials are chosen to be rolled more often, a suitable flow curve has to be established. It is not reasonable to determine those flow curves in laboratory, because of costs and time. A strong demand for quick parameter determination and the optimization of flow curve parameter with minimum costs is the logical consequence. Therefore parameter estimation and the optimization with real data, which were collected during previous runs, is a promising idea. Producers benefit from this data-driven approach and receive a huge gain in flexibility when rolling new
materials, optimizing current production, and increasing quality. This concept would also allow to optimize flow curve parameters, which have already been treated by standard methods. In this article, a new data-driven approach for predicting the physical behavior of the product and setting important parameters is presented.
We demonstrate how the prediction quality of the roll force and roll torque can be optimized sustainably. This offers the opportunity to continuously increase the workload in each pass to the theoretical maximum while product quality and process stability can also be improved.

When researchers and practitioners in the field of
computational intelligence are confronted with real-world
problems, the question arises which method is the best to
apply. Nowadays, there are several, well established test
suites and well known artificial benchmark functions
available.
However, relevance and applicability of these methods to
real-world problems remains an open question in many
situations. Furthermore, the generalizability of these
methods cannot be taken for granted.
This paper describes a data-driven approach for the
generation of test instances, which is based on
real-world data. The test instance generation uses
data-preprocessing, feature extraction, modeling, and
parameterization. We apply this methodology on a classical
design of experiment real-world project and generate test
instances for benchmarking, e.g. design methods, surrogate
techniques, and optimization algorithms. While most
available results of methods applied on real-world
problems lack availability of the data for comparison,
our future goal is to create a toolbox covering multiple
data sets of real-world projects to provide a test
function generator to the research community.

This paper introduces CAAI, a novel cognitive architecture for artificial intelligence in cyber-physical production systems. The goal of the architecture is to reduce the implementation effort for the usage of artificial intelligence algorithms. The core of the CAAI is a cognitive module that processes declarative goals of the user, selects suitable models and algorithms, and creates a configuration for the execution of a processing pipeline on a big data platform. Constant observation and evaluation against performance criteria assess the performance of pipelines for many and varying use cases. Based on these evaluations, the pipelines are automatically adapted if necessary. The modular design with well-defined interfaces enables the reusability and extensibility of pipeline components. A big data platform implements this modular design supported by technologies such as Docker, Kubernetes, and Kafka for virtualization and orchestration of the individual components and their communication. The implementation of the architecture is evaluated using a real-world use case.

Data pre-processing is a key research topic in data mining because it plays a
crucial role in improving the accuracy of any data mining algorithm. In most
real world cases, a significant amount of the recorded data is found missing
due to most diverse errors. This loss of data is nearly always unavoidable.
Recovery of missing data plays a vital role in avoiding inaccurate data
mining decisions. Most multivariate imputation methods are not compatible
to univariate datasets and the traditional univariate imputation techniques
become highly biased as the missing data gap increases. With the current
technological advancements abundant data is being captured every second.
Hence, we intend to develop a new algorithm that enables maximum
utilization of the available big datasets for imputation. In this paper, we
present a Seasonal and Trend decomposition using Loess (STL) based
Seasonal Moving Window Algorithm, which is capable of handling patterns
with trend as well as cyclic characteristics. We show that the algorithm is
highly suitable for pre-processing of large datasets.

The use of surrogate models is a standard method to deal with complex, realworld
optimization problems. The first surrogate models were applied to continuous
optimization problems. In recent years, surrogate models gained importance
for discrete optimization problems. This article, which consists of three
parts, takes care of this development. The first part presents a survey of modelbased
methods, focusing on continuous optimization. It introduces a taxonomy,
which is useful as a guideline for selecting adequate model-based optimization
tools. The second part provides details for the case of discrete optimization
problems. Here, six strategies for dealing with discrete data structures are introduced.
A new approach for combining surrogate information via stacking
is proposed in the third part. The implementation of this approach will be
available in the open source R package SPOT2. The article concludes with a
discussion of recent developments and challenges in both application domains.