Refine
Year of publication
Document Type
- Report (17)
- Working Paper (16)
- Article (8)
- Preprint (5)
- Book (4)
- Conference Proceeding (1)
- Doctoral Thesis (1)
Language
- English (52)
Has Fulltext
- yes (52)
Keywords
- Optimierung (15)
- Optimization (13)
- Benchmarking (5)
- Modellierung (5)
- Simulation (5)
- Globale Optimierung (4)
- Soft Computing (4)
- Disaster Risk Reduction (3)
- Evolutionärer Algorithmus (3)
- Kriging (3)
Institute
- Fakultät für Informatik und Ingenieurwissenschaften (F10) (26)
- Fakultät 10 / Institut für Informatik (11)
- Fakultät 09 / Institut für Rettungsingenieurwesen und Gefahrenabwehr (4)
- Fakultät 02 / Köln International School of Design (2)
- Fakultät 04 / Institut für Versicherungswesen (2)
- Fakultät 10 / Institut für Data Science, Engineering, and Analytics (2)
- Fakultät 08 / Institut für Fahrzeugtechnik (1)
- Institut für Technologie und Ressourcenmanagement in den Tropen und Subtropen (ITT) (1)
We propose to apply typed Genetic Programming (GP) to the problem of finding surrogate-model ensembles for global optimization on compute-intensive target functions. In a model ensemble, base-models such as linear models, random forest models, or Kriging models, as well as pre- and post-processing methods, are combined. In theory, an optimal ensemble will join the strengths of its constituent base-models while avoiding their weaknesses, offering higher prediction accuracy and robustness. This study defines a grammar of model ensemble expressions and searches the resulting set for optimal ensembles via GP. We performed an extensive experimental study based on 10 different objective functions and 2 sets of base-models. We arrive at promising results: on unseen test data, our ensembles do not perform significantly worse than the best base-model.
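As a minimal illustration of what an ensemble expression in such a grammar evaluates to (all model functions and weights here are invented stand-ins, not the paper's actual base-models), a weighted-average ensemble node could be sketched as:

```python
import numpy as np

# Hypothetical base models: each maps an input array x to predictions.
def linear_model(x):
    return 1.2 * x + 0.5          # stand-in for a fitted linear model

def smooth_model(x):
    return np.sin(x) + x          # stand-in for e.g. a Kriging model

def ensemble(x, models, weights):
    """Convex combination of base-model predictions."""
    preds = np.stack([m(x) for m in models])
    w = np.asarray(weights)[:, None]
    return (w * preds).sum(axis=0)

x = np.linspace(0, 1, 5)
y = ensemble(x, [linear_model, smooth_model], [0.5, 0.5])
```

A GP search over the grammar would then vary which base-models appear in such expressions and how they are combined.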
When using machine learning techniques to learn a function approximation from given data, it is often difficult to select the right modeling technique. In many real-world settings, no prior knowledge about the objective function is available. It can then be beneficial if the algorithm learns all models by itself and selects the model that best suits the problem. This approach is known as automated model selection. In this work we propose a generalization of this approach: it combines the predictions of several models into one more accurate ensemble surrogate model. This approach is studied in a fundamental way, by first evaluating minimalistic ensembles of only two surrogate models in detail and then proceeding to ensembles with three and more surrogate models. The results show to what extent combinations of models can perform better than single surrogate models and provide insights into the scalability and robustness of the approach. The study focuses on multi-modal function topologies, which are important in surrogate-assisted global optimization.
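The minimalistic two-model case can be sketched as a convex combination whose mixing weight is chosen to minimize validation error (the two surrogate predictions below are synthetic stand-ins, not models from the study):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 2 * np.pi, 40)
y_true = np.sin(x)

# Two hypothetical surrogates with different error characteristics.
pred_a = y_true + rng.normal(0, 0.3, x.size)   # noisy but unbiased
pred_b = y_true + 0.2                          # biased but smooth

def val_mse(w):
    """Validation MSE of the two-model ensemble w * A + (1 - w) * B."""
    combo = w * pred_a + (1 - w) * pred_b
    return np.mean((combo - y_true) ** 2)

# Grid search over the mixing weight; endpoints recover the single models.
weights = np.linspace(0, 1, 101)
best_w = weights[np.argmin([val_mse(w) for w in weights])]
```

By construction, the best mixed ensemble is never worse on the validation data than either single surrogate, which is the intuition behind combining models.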
This paper introduces UniFIeD, a new data preprocessing method for time series. UniFIeD can cope with large intervals of missing data. Additionally, a scalable test function generator is presented, which allows the simulation of time series with different gap sizes. An experimental study demonstrates that (i) UniFIeD shows significantly better performance than simple imputation methods and (ii) UniFIeD is able to handle situations where advanced imputation methods fail. The results are independent of the underlying error measures.
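The two ingredients of such a study — a generator for series with a controllable gap, and a simple imputation baseline to compare against — can be sketched as follows (the generator and the linear-interpolation baseline are illustrative choices, not the paper's actual methods):

```python
import numpy as np

def make_gappy_series(n=200, gap_start=80, gap_size=30, seed=1):
    """Hypothetical test-series generator: noisy sine with one large NaN gap."""
    rng = np.random.default_rng(seed)
    y = np.sin(np.linspace(0, 6 * np.pi, n)) + rng.normal(0, 0.1, n)
    y[gap_start:gap_start + gap_size] = np.nan
    return y

def linear_impute(y):
    """Simple baseline: linear interpolation across the missing interval."""
    y = y.copy()
    idx = np.arange(y.size)
    mask = np.isnan(y)
    y[mask] = np.interp(idx[mask], idx[~mask], y[~mask])
    return y

y = make_gappy_series()
filled = linear_impute(y)
```

Varying `gap_size` then makes the comparison between simple and advanced imputation methods scalable, as the abstract describes.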
A pension system is resilient if it is able to absorb external (temporary) shocks and to adapt to (long-term) shifts of the socio-economic environment. Defined benefit (DB) and defined contribution (DC) pension plans behave contrastingly with respect to capital market shocks and shifts: while DB-plan benefits are not affected by external shocks, they totally lack adaptability with respect to fundamental changes; DC-plans automatically adjust to a changing environment, but any external shock has a direct impact on the (expected) pensions. By adding a collective component to DC-plans, one can make these collective DC (CDC) plans shock-absorbing, at least to a certain degree. In our CDC pension model we build a collective reserve of assets that serves as a buffer against capital market shocks, e.g. stock market crashes. The idea is to transfer money from the collective reserve to the individual pension accounts whenever capital markets slump and to feed the collective reserve whenever capital markets are booming. This mechanism is particularly valuable for age cohorts that are close to retirement. It is clear that withdrawing assets from or adding assets to the collective reserve is essentially a transfer of assets between the age cohorts. In our near-reality model we investigate the effect of stock market shocks and interest rate (and mortality) shifts on a CDC pension system. We are particularly interested in the question to what extent a CDC pension system is actually able to absorb shocks and whether the intergenerational transfer of assets via the collective reserve can be regarded as fair.
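The buffer mechanism described above — feed the reserve in booms, draw on it in slumps — can be sketched as a single accounting step (all thresholds and sharing rules here are hypothetical, not the paper's calibration):

```python
# Sketch of the collective-reserve mechanism: in boom years part of the
# excess return feeds the reserve, in slump years the reserve tops up
# the individual accounts.

def step(account, reserve, market_return, target=0.03, share=0.5):
    gross = account * (1 + market_return)
    if market_return > target:               # boom: feed the collective reserve
        surplus = account * (market_return - target) * share
        return gross - surplus, reserve + surplus
    if market_return < 0 and reserve > 0:    # slump: buffer the shock
        shortfall = min(-market_return * account, reserve)
        return gross + shortfall, reserve - shortfall
    return gross, reserve

account, reserve = 100.0, 10.0
for r in [0.08, 0.02, -0.15]:                # boom, normal year, crash
    account, reserve = step(account, reserve, r)
```

In the crash year the reserve is drawn down to cushion the account, which illustrates why the mechanism matters most for cohorts close to retirement.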
Collective Defined Contribution Plans – Backtesting Based on German Capital Market Data 1950 - 2022
(2022)
Using historical capital market data for Germany (1950-2022), we analyze and compare (individual) defined contribution (IDC) and collective defined contribution (CDC) pension plans. To this end, we define simple asset liability management rules that govern a CDC pension plan and compare these to IDC-plans with the same asset allocation. Our main result is that CDC pension plans allow for a significant improvement of the risk-return profile compared to individual pension plans. We consider different risk measures. This empirical study affirms the theoretical results based on stochastic CDC-models.
Social learning enables multiple robots to share learned experiences while completing a task. The literature offers examples where robots trained with social learning achieve higher performance than their individually learning counterparts. No explanation has been advanced for that observation. In this research, we present experimental results suggesting that a lack of parameter tuning in social learning experiments could be the cause. In other words: the better the parameter settings are tuned, the less social learning can improve the system performance.
To maximize the throughput of a hot rolling mill, the number of passes has to be reduced. This can be achieved by maximizing the thickness reduction in each pass, which requires exact predictions of roll force and torque. Hence, the predictive models that describe the physical behavior of the product have to be accurate and cover a wide range of different materials. Due to market requirements, many new materials are tested and rolled. If such a material is to be rolled more often, a suitable flow curve has to be established. Determining these flow curves in the laboratory is not reasonable because of the associated cost and time. A strong demand for quick parameter determination and for flow curve parameter optimization at minimum cost is the logical consequence. Therefore, parameter estimation and optimization based on real data collected during previous runs is a promising idea. Producers benefit from this data-driven approach and gain considerable flexibility when rolling new materials, optimizing current production, and increasing quality. The concept also allows the optimization of flow curve parameters that have already been treated by standard methods. In this article, a new data-driven approach for predicting the physical behavior of the product and setting important parameters is presented. We demonstrate how the prediction quality of the roll force and roll torque can be optimized sustainably. This offers the opportunity to continuously increase the workload in each pass towards the theoretical maximum, while product quality and process stability can also be improved.
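The core of such a data-driven approach — fitting a predictive model to process data collected during previous runs — can be sketched with a least-squares regression (the variables, coefficients, and synthetic mill-log data below are invented for illustration, not the paper's actual model):

```python
import numpy as np

# Synthetic stand-in for mill-log data: thickness reduction and rolling
# temperature as inputs, roll force as the quantity to predict.
rng = np.random.default_rng(2)
n = 100
reduction = rng.uniform(0.1, 0.4, n)        # relative thickness reduction
temp = rng.uniform(900, 1100, n)            # rolling temperature (deg C)
force = 50 + 400 * reduction - 0.02 * temp + rng.normal(0, 1, n)

# Ordinary least squares on [1, reduction, temp].
X = np.column_stack([np.ones(n), reduction, temp])
coef, *_ = np.linalg.lstsq(X, force, rcond=None)
predicted = X @ coef
```

In practice, the model family would be richer (e.g. flow-curve parameterizations fitted per material), but the data flow — historical runs in, calibrated predictor out — is the same.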
Architectural approaches are considered to simplify the generation of re-usable building blocks in the field of data warehousing. While SAP’s Layered Scalable Architecture (LSA) offers a reference model for creating data warehousing infrastructure based on SAP software, extended reference models are needed to guide the integration of SAP and non-SAP tools. Therefore, SAP’s LSA is compared to the Data Warehouse Architectural Reference Model (DWARM), which aims to cover the classical data warehouse topologies.
In the present paper, a calculation tool for the lifetime prediction of composite materials is developed, with a focus on local multiaxial stress states and different local stress ratios within each lamina. The approach is based on repetitive, progressive in-plane stress calculations using classical laminate theory, with subsequent analysis of the material stressing effort and use of appropriate material degradation models. Experimental S-N curve data are used to generate anisotropic constant life diagrams for a closer examination of critical fracture planes under any given combination of local stress ratios. The model is verified against various balanced angle plies and multi-directional laminates with arbitrary stacking sequences and varying stress ratios throughout the analysis. Different sections of the model, such as residual strength and residual stiffness, are examined and verified over a wide range of load cycles. The obtained results agree very well with the analyzed experimental data.
In this paper we present a comparison of different data-driven modeling methods. A data-driven linear Bayesian model is compared with several linear regression models, a Kriging model, and a genetic programming model. The models are built on industrial data for the development of a robust gas sensor. The data contain a limited number of samples and exhibit high variance. The mean squared error of the models on a test dataset is used as the comparison criterion. The results indicate that standard linear regression approaches as well as Kriging and genetic programming show good results, whereas the Bayesian approach, despite requiring additional resources, does not lead to improved results.
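The comparison protocol — fit several model families on a training split, rank them by test-set mean squared error — can be sketched with synthetic data and polynomial models standing in for the regression, Kriging, and GP models (all details here are illustrative, not the study's setup):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(0, 1, (60, 1))
y = 3 * x[:, 0] ** 2 + rng.normal(0, 0.05, 60)   # small, noisy dataset

# Train/test split: 40 samples for fitting, 20 held out for comparison.
x_tr, y_tr, x_te, y_te = x[:40], y[:40], x[40:], y[40:]

def fit_poly(deg):
    """Least-squares polynomial fit; returns the test-set MSE."""
    c = np.polyfit(x_tr[:, 0], y_tr, deg)
    return np.mean((np.polyval(c, x_te[:, 0]) - y_te) ** 2)

# Each model family is scored by the same criterion.
mse = {deg: fit_poly(deg) for deg in (1, 2, 3)}
```

The model with the lowest held-out MSE wins the comparison, exactly as in the abstract's evaluation strategy.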
Modelling Zero-inflated Rainfall Data through the Use of Gaussian Process and Bayesian Regression
(2018)
Rainfall is a key parameter for understanding the water cycle. An accurate rainfall measurement is vital in the development of hydrological models. By means of indirect measurement, satellites can nowadays estimate the rainfall around the world. However, these measurements are not always accurate. As a first approach to generating a bias-corrected rainfall estimate using satellite data, the performance of Gaussian process and Bayesian regression is studied. The results show the Gaussian process to be the better option for this dataset but leave room for improvement in both modelling strategies.
Sensor placement for contaminant detection in water distribution systems (WDS) has become a topic of great interest, aiming to secure a population's water supply. Several approaches can be found in the literature, differing both in the objective that is optimized and in the methods used to solve the optimization problem. In this work we give an overview of current work on sensor placement with a focus on contaminant detection in WDS. We present some of the objectives for which the sensor placement problem is defined, along with common optimization algorithms and toolkits available to help with algorithm testing and comparison.
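One common formulation of the problem is coverage maximization: each candidate sensor location detects a set of contamination scenarios, and sensors are placed to detect as many scenarios as possible. A standard greedy heuristic for this (on an invented toy network, not any WDS from the literature) looks like:

```python
# Toy coverage data: candidate node -> set of contamination scenarios
# it would detect. Invented for illustration.
coverage = {
    "n1": {1, 2, 3},
    "n2": {3, 4},
    "n3": {4, 5, 6},
    "n4": {1, 6},
}

def greedy_placement(coverage, k):
    """Pick up to k sensors, each time adding the node with the largest
    marginal gain in newly covered scenarios."""
    chosen, covered = [], set()
    for _ in range(k):
        best = max(coverage, key=lambda n: len(coverage[n] - covered))
        if not coverage[best] - covered:
            break                      # no node adds new coverage
        chosen.append(best)
        covered |= coverage[best]
    return chosen, covered

sensors, detected = greedy_placement(coverage, k=2)
```

Because the coverage objective is submodular, this greedy strategy carries a well-known approximation guarantee, which is one reason it appears so often in the sensor placement literature.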
Drinking water supply and distribution systems are critical infrastructure that has to be well maintained for the safety of the public. One important tool in the maintenance of water distribution systems (WDS) is flushing, a process carried out periodically to clean sediments and other contaminants out of the water pipes. Given the different topographies, water compositions, and supply demands of individual WDS, no single flushing strategy is suitable for all of them. In this report, a non-exhaustive overview of optimization methods for flushing in WDS is given. Implementations of optimization methods for the flushing procedure and for flushing planning are presented. Suggestions are given as possible options to optimize existing flushing planning frameworks.
Many black-box optimization problems rely on simulations to evaluate the quality of candidate solutions. These evaluations can be computationally expensive and very time-consuming. We present an approach to mitigate this problem by taking two factors into consideration: the number of evaluations and the execution time. We aim to keep the number of evaluations low by using Bayesian optimization (BO) – known to be sample efficient – and to reduce wall-clock times by executing evaluations in parallel. Four parallelization methods using BO as the optimizer are compared against the inherently parallel CMA-ES. Each method is evaluated on all 24 objective functions of the Black-Box Optimization Benchmarking (BBOB) test suite in their 20-dimensional versions. The results show that parallelized BO outperforms the state-of-the-art CMA-ES on most of the test functions, also in higher dimensions.
An important class of black-box optimization problems relies on simulations to assess the quality of a given candidate solution. Solving such problems can be computationally expensive because each simulation is very time-consuming. We present an approach to mitigate this problem by distinguishing two factors of computational cost: the number of trials and the time needed to execute the trials. Our approach tries to keep down the number of trials by using Bayesian optimization (BO) – known to be sample efficient – and to reduce wall-clock times by parallel execution of trials. We compare the performance of four parallelization methods and two model-free alternatives. Each method is evaluated on all 24 objective functions of the Black-Box Optimization Benchmarking (BBOB) test suite in their five-, ten-, and 20-dimensional versions. Additionally, their performance is investigated on six test cases in robot learning. The results show that parallelized BO outperforms the state-of-the-art CMA-ES on the BBOB test functions, especially for higher dimensions. On the robot learning tasks, the differences are less clear, but the data do support parallelized BO as the ‘best guess’, winning in some cases and never losing.
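The wall-clock argument — many slow trials evaluated concurrently instead of sequentially — can be sketched with a stand-in objective whose cost is simulated by a sleep (the candidates and objective are invented; a real BO loop would propose the batch from an acquisition function rather than use a fixed list):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def expensive_objective(x):
    """Stand-in for a time-consuming simulation; the sleep mimics its cost."""
    time.sleep(0.1)
    return sum(xi ** 2 for xi in x)

# A batch of candidate solutions, e.g. proposed by a parallel infill strategy.
candidates = [(0.0, 0.0), (1.0, 2.0), (-1.0, 0.5), (2.0, 2.0)]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    values = list(pool.map(expensive_objective, candidates))
wall = time.perf_counter() - start       # ~0.1 s instead of ~0.4 s sequential

best = candidates[values.index(min(values))]
```

For CPU-bound simulations one would use process- or cluster-level parallelism instead of threads, but the accounting of "number of trials" versus "wall-clock time" is the same.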
We propose a hybridization approach called Regularized-Surrogate-Optimization (RSO) aimed at overcoming difficulties related to high-dimensionality. It combines standard Kriging-based SMBO with regularization techniques. The employed regularization methods use the least absolute shrinkage and selection operator (LASSO). An extensive study is performed on a set of artificial test functions and two real-world applications: the electrostatic precipitator problem and a multilayered composite design problem. Experiments reveal that RSO requires significantly less time than Kriging to obtain comparable results. The pros and cons of the RSO approach are discussed and recommendations for practitioners are presented.
Real-world problems such as computational fluid dynamics simulations and finite element analyses are computationally expensive. A standard approach to mitigating the high computational expense is Surrogate-Based Optimization (SBO). Yet, due to the high-dimensionality of many simulation problems, SBO is not directly applicable or not efficient. Reducing the dimensionality of the search space is one method to overcome this limitation. In addition to the applicability of SBO, dimensionality reduction enables easier data handling and improved data and model interpretability. Regularization is considered as one state-of-the-art technique for dimensionality reduction. We propose a hybridization approach called Regularized-Surrogate-Optimization (RSO) aimed at overcoming difficulties related to high-dimensionality. It couples standard Kriging-based SBO with regularization techniques. The employed regularization methods are based on three adaptations of the least absolute shrinkage and selection operator (LASSO). In addition, tree-based methods are analyzed as an alternative variable selection method. An extensive study is performed on a set of artificial test functions and two real-world applications: the electrostatic precipitator problem and a multilayered composite design problem. Experiments reveal that RSO requires significantly less time than standard SBO to obtain comparable results. The pros and cons of the RSO approach are discussed, and recommendations for practitioners are presented.
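The regularization step — using LASSO to shrink irrelevant input dimensions to zero before fitting the surrogate — can be sketched with a minimal iterative soft-thresholding (ISTA) solver on synthetic data (this is a generic LASSO sketch, not the three adaptations used in the paper):

```python
import numpy as np

# Synthetic high-dimensional data: only 2 of 10 inputs actually matter.
rng = np.random.default_rng(4)
X = rng.normal(size=(80, 10))
beta_true = np.array([2.0, -1.5] + [0.0] * 8)
y = X @ beta_true + rng.normal(0, 0.1, 80)

def lasso_ista(X, y, lam=4.0, steps=500):
    """Minimal LASSO via iterative soft-thresholding (ISTA):
    minimizes 0.5 * ||X b - y||^2 + lam * ||b||_1."""
    L = np.linalg.norm(X, 2) ** 2          # Lipschitz constant of the gradient
    beta = np.zeros(X.shape[1])
    for _ in range(steps):
        grad = X.T @ (X @ beta - y)
        z = beta - grad / L
        beta = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
    return beta

beta = lasso_ista(X, y)
selected = np.flatnonzero(np.abs(beta) > 1e-3)   # surviving input dimensions
```

A Kriging surrogate would then be fitted only on the `selected` dimensions, which is the dimensionality reduction that makes SBO tractable again.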
Surrogate-based optimization relies on so-called infill criteria (acquisition functions) to decide which point to evaluate next. When Kriging is used as the surrogate model of choice (also called Bayesian optimization), one of the most frequently chosen criteria is expected improvement. We argue that the popularity of expected improvement largely relies on its theoretical properties rather than empirically validated performance. A few results from the literature show evidence that, under certain conditions, expected improvement may perform worse than something as simple as the predicted value of the surrogate model. We benchmark both infill criteria in an extensive empirical study on the ‘BBOB’ function set. This investigation includes a detailed study of the impact of problem dimensionality on algorithm performance. The results support the hypothesis that exploration loses importance with increasing problem dimensionality. A statistical analysis reveals that the purely exploitative search with the predicted value criterion performs better on most problems of five or higher dimensions. Possible reasons for these results are discussed. In addition, we give an in-depth guide for choosing the infill criterion based on prior knowledge about the problem at hand, its dimensionality, and the available budget.
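The two criteria compared above differ in one term. For minimization, with predicted mean mu, predictive standard deviation s, and best observed value f_min, expected improvement is EI = (f_min - mu) * Phi(z) + s * phi(z) with z = (f_min - mu) / s, while the predicted-value criterion simply ranks points by mu. A self-contained sketch (the example inputs are invented):

```python
import math

def expected_improvement(mu, s, f_min):
    """EI for minimization: predicted mean mu, predictive sd s,
    best observed value f_min. Closed form under a Gaussian posterior."""
    if s <= 0:
        return max(f_min - mu, 0.0)        # degenerate case: no uncertainty
    z = (f_min - mu) / s
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))   # standard normal CDF
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (f_min - mu) * cdf + s * pdf

# An exploitative point (low mean, low sd) vs an exploratory one (high sd):
ei_exploit = expected_improvement(mu=0.9, s=0.05, f_min=1.0)
ei_explore = expected_improvement(mu=1.2, s=0.8, f_min=1.0)
```

The `s * pdf` term rewards uncertainty, so EI can prefer the high-variance point even though its predicted mean is worse than f_min; the predicted-value criterion never does, which is exactly the exploration/exploitation difference the study benchmarks.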
The availability of several CPU cores on current computers enables parallelization and increases the computational power significantly. Optimization algorithms have to be adapted to exploit these highly parallelized systems and evaluate multiple candidate solutions in each iteration. This issue is especially challenging for expensive optimization problems, where surrogate models are employed to reduce the load of objective function evaluations.
This paper compares different approaches for surrogate model-based optimization in parallel environments. Additionally, an easy-to-use method, which was developed for an industrial project, is proposed. All described algorithms are tested with a variety of standard benchmark functions. Furthermore, they are applied to a real-world engineering problem, the electrostatic precipitator problem. Expensive computational fluid dynamics simulations are required to estimate the performance of the precipitator. The task is to optimize a gas-distribution system so that a desired velocity distribution is achieved for the gas flow throughout the precipitator. The vast amount of possible configurations leads to a complex discrete-valued optimization problem. The experiments indicate that a hybrid approach works best, which proposes candidate solutions based on different surrogate model-based infill criteria and evolutionary operators.
Land-use intensification and urbanisation processes are degrading ecosystem services in the Guapiaçu-Macacu watershed in the state of Rio de Janeiro, Brazil. Paying farmers to forgo agricultural production activities in order to restore natural watershed services might be a viable means of securing water resources over the long term for the approximately 2.5 million urban water users in the region. This study quantified the costs of changing current land-use patterns to enhance watershed services. These costs are compared to estimates of the avoided water treatment costs for the public potable water supply as a proxy of willingness-to-pay for watershed services. Farm-household data was used to estimate the opportunity costs of abandoning current land uses in order to allow natural vegetation succession; a process that is very likely to improve water quality in terms of reducing erosion and subsequently water turbidity. Opportunity cost estimates were extrapolated to the watershed scale based on land-use classifications and a vulnerability analysis for identifying priority areas for watershed management interventions. Water quality and treatment cost data from the primary local water treatment plant (principal water user in the study area) were analysed to assess the potential demand for watershed services. The conversion of agricultural land uses for the benefit of watershed service provision was found to entail high opportunity costs in the study area, which is near the city of Rio de Janeiro. Alternative, relatively low-cost practices that support watershed conservation do exist for the livestock production systems. Other options include: implementing soil conservation techniques, permanent protection of areas that are vulnerable to erosion, protecting and restoring riparian and headwater areas, and applying more sustainable agricultural practices. 
These measures have the potential to directly reduce the amount of sediment and nutrients reaching water bodies and, in turn, decrease the costs of treatment required for providing the potable water supply. Based on treatment costs, the state water utility company’s willingness-to-pay for watershed services alone will not be sufficient to compensate farmers for forgoing agricultural production activities in order to improve the provision of additional watershed services. The results suggest that the opportunity costs of land-cover changes at the scale needed to improve water quality will likely exceed the cost of additional investments in water treatment. Monetary incentives conditioned on specific adjustments to existing production systems could offer a complementary role for improving watershed services. The willingness-to-pay analysis, however, only focused on chemical treatment costs and one of a potentially wide range of ecosystem services provided by the natural vegetation in the Guapiaçu-Macacu watershed (water quality maintenance for potable water provision). Other ecosystem services provided by forest cover include carbon sequestration and storage, moderation of extreme weather events, regulation of water flows, landscape aesthetics, and biodiversity protection. Factoring these additional ecosystem services into the willingness-to-pay equation is likely to change the conclusions of the assessment in favour of additional conservation action, either through payments for ecosystem services (PES) or other policy instruments. This effort contributes to the growing body of related scientific literature by offering additional knowledge on how to combine spatially explicit economic and environmental information to provide valuable insights into the feasibility of implementing PES schemes at the scale of entire watersheds. 
This is relevant to helping inform decision-making processes with respect to the economic scope of incentive-based watershed management in the context of the Guapiaçu-Macacu watershed. Furthermore, the findings of this research can serve long-term watershed conservation initiatives and public policy in other watersheds of the Atlantic Forest biome by facilitating the targeting of conservation incentives for cost-effective watershed management.