Data pre-processing is a key research topic in data mining because it plays a
crucial role in improving the accuracy of any data mining algorithm. In most
real-world cases, a significant portion of the recorded data is missing
because of a wide variety of errors. This loss of data is nearly always unavoidable.
Recovery of missing data plays a vital role in avoiding inaccurate data
mining decisions. Most multivariate imputation methods are not compatible
with univariate datasets, and traditional univariate imputation techniques
become highly biased as the missing data gap widens. With current
technological advancements, abundant data is being captured every second.
Hence, we intend to develop a new algorithm that enables maximum
utilization of the available big datasets for imputation. In this paper, we
present a Seasonal and Trend decomposition using Loess (STL) based
Seasonal Moving Window Algorithm, which can handle patterns with both
trend and cyclic characteristics. We show that the algorithm is
highly suitable for pre-processing large datasets.
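
To make the idea of seasonal imputation on a univariate series concrete, the following is a minimal sketch and not the algorithm presented in the paper. It assumes the seasonal period is known and fills each gap with the mean of the observed values at the same seasonal phase in neighbouring periods. It performs no STL decomposition. The function name `seasonal_window_impute` and its `period` and `window` parameters are hypothetical and chosen only for illustration.

```python
from typing import List, Optional


def seasonal_window_impute(
    series: List[Optional[float]], period: int, window: int = 2
) -> List[float]:
    """Fill gaps with the mean of same-phase values from nearby seasons.

    Hypothetical helper, not the paper's algorithm: for a missing index i it
    averages the observed values at i - k*period and i + k*period for
    k = 1..window. If none of those are observed, it falls back to the mean
    of all observed values.
    """
    observed = [v for v in series if v is not None]
    if not observed:
        raise ValueError("series has no observed values")
    overall_mean = sum(observed) / len(observed)

    filled = []
    for i, value in enumerate(series):
        if value is not None:
            # Observed values are kept unchanged.
            filled.append(value)
            continue
        # Collect observed values at the same phase in neighbouring seasons.
        neighbours = []
        for k in range(1, window + 1):
            for j in (i - k * period, i + k * period):
                if 0 <= j < len(series) and series[j] is not None:
                    neighbours.append(series[j])
        if neighbours:
            filled.append(sum(neighbours) / len(neighbours))
        else:
            filled.append(overall_mean)
    return filled


# Toy univariate series with period 4 and a gap of two consecutive values.
data = [1.0, 2.0, 3.0, 4.0, 1.0, None, None, 4.0, 1.0, 2.0, 3.0, 4.0]
result = seasonal_window_impute(data, period=4, window=1)
```

Because the sketch only looks at values one or more full periods away, the width of the gap does not affect the estimate as long as neighbouring seasons are observed. This property is what makes seasonal approaches attractive when gaps are long. A method that also models trend, such as one built on an STL decomposition, would be needed for series whose level drifts over time.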