ML-Based Wind Turbine Condition Monitoring

Normal behavior models with CUSUM anomaly detection for predictive bearing maintenance

Schematic overview of the Normal Behavior Model framework for wind turbine condition monitoring
57% Fewer False Alarms
1,120 h Detection Lead Time

My Role

I supervised this MSc thesis by James Kanoksilp at DTU Wind & Energy Systems. Over four months of weekly meetings, I guided the research direction, helped frame the problem as an end-to-end detection pipeline (not just a prediction task), and advised on the systematic optimization methodology—particularly the CUSUM parameter grid search, the target-lag tradeoff analysis, and the bias correction strategy that ultimately produced the best result.

Project Overview

Wind turbine O&M costs account for 25-30% of offshore lifecycle costs. This project developed Normal Behavior Models (NBMs) that learn healthy operating patterns from SCADA data and flag deviations as potential bearing failures. The approach combines ML prediction (XGBoost and LSTM) with CUSUM anomaly detection, systematically optimizing the full pipeline from feature selection through alarm thresholds.

The study used open SCADA data from 5 wind turbines provided by EDP, covering 80 operational parameters at 10-minute intervals over 2 years. The pipeline was evaluated on generator and gearbox bearing failure detection, with the goal of minimizing false alarms while maintaining early detection within a 60-day pre-failure window.

Dataset: Failure Events

Timeline of failure events across 5 wind turbines from the EDP SCADA dataset

Annotated failure events across 5 turbines over the 2-year monitoring period. Generator bearing, gearbox bearing, and hydraulic system failures are the key detection targets.

XGBoost vs LSTM

Cross-validation revealed that XGBoost consistently outperformed LSTM by 10-15% on both RMSE and MAE for bearing temperature prediction. XGBoost achieved optimal performance with lag step 3 and 15-17 SHAP-ranked features, while LSTM showed stable but lower accuracy across configurations.

Cross-validation performance comparison: XGBoost lag steps vs LSTM sequence lengths

Cross-validation RMSE and MAE for XGBoost (varying lag steps) and LSTM (varying sequence lengths) on generator and gearbox bearing temperature prediction.

The Target Lag Tradeoff

A critical insight: including lagged target features as inputs dramatically improves prediction accuracy (MAE drops ~80% for generator bearings) but can mask early degradation signals. When the model uses recent target history to predict the next value, it tracks degradation so closely that subtle pre-failure drift disappears from the residuals. Excluding lagged targets produces worse predictions but, once the residual bias is corrected, clearer anomaly signals.

Generator bearing temperature prediction with and without lagged target features

Generator bearing prediction comparison: with target lag (top, low error but masked degradation) vs without (bottom, higher error but visible pre-failure drift).
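The two input configurations compared above can be sketched as follows. This is a minimal illustration, not the thesis code: the function name `build_lagged_rows`, the signal names, and the choice to give exogenous signals lags 0 to n while the target only ever contributes lags 1 to n are all assumptions made for the example.

```python
from typing import Dict, List, Sequence, Tuple


def build_lagged_rows(
    signals: Dict[str, Sequence[float]],
    target: str,
    n_lags: int,
    include_target_lags: bool,
) -> Tuple[List[str], List[List[float]], List[float]]:
    """Build (feature_names, X, y) for predicting target[t].

    Exogenous signals contribute their values at t, t-1, ..., t-n_lags.
    The target contributes t-1, ..., t-n_lags only when
    include_target_lags is True (hypothetical switch for this sketch).
    """
    exogenous = sorted(name for name in signals if name != target)
    columns = [(name, lag) for name in exogenous for lag in range(0, n_lags + 1)]
    if include_target_lags:
        columns += [(target, lag) for lag in range(1, n_lags + 1)]

    feature_names = [f"{name}_lag{lag}" for name, lag in columns]
    X: List[List[float]] = []
    y: List[float] = []
    # Start at n_lags so every requested lag exists for the first row.
    for t in range(n_lags, len(signals[target])):
        X.append([signals[name][t - lag] for name, lag in columns])
        y.append(signals[target][t])
    return feature_names, X, y


# Hypothetical toy signals, not values from the EDP dataset.
toy = {
    "gen_bearing_temp": [50.0, 51.0, 52.0, 53.0],
    "ambient_temp": [10.0, 11.0, 12.0, 13.0],
}
names_without, X_without, y_without = build_lagged_rows(toy, "gen_bearing_temp", 1, False)
names_with, X_with, y_with = build_lagged_rows(toy, "gen_bearing_temp", 1, True)
```

With `include_target_lags=False` the model has to explain the bearing temperature from operating conditions alone, so a slow thermal drift tends to show up in the residuals instead of being absorbed by the previous temperature reading.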

CUSUM Parameter Optimization

The Cumulative Sum (CUSUM) algorithm detects shifts in the mean of a process by accumulating prediction residuals. The upper and lower CUSUM scores are updated at each time step:

$$S^+_t = \max\left(0,\; S^+_{t-1} + (e_t - k)\right)$$

$$S^-_t = \min\left(0,\; S^-_{t-1} + (e_t + k)\right)$$

where $e_t$ is the prediction residual, $k$ is the sensitivity parameter, and an alarm is raised when $S^+_t > h$ or $|S^-_t| > h$.
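The update rule can be sketched in a few lines of Python. This is an illustrative implementation rather than the one used in the thesis; in particular, resetting both scores to zero after an alarm is an assumption made here so that a single excursion is counted once.

```python
from typing import Iterable, List, Tuple


def cusum(
    residuals: Iterable[float], k: float, h: float
) -> Tuple[List[float], List[float], List[int]]:
    """Two-sided CUSUM over prediction residuals.

    Returns the upper scores, the lower scores, and the indices at
    which either score exceeds the threshold h in magnitude.
    """
    s_pos, s_neg = 0.0, 0.0
    upper: List[float] = []
    lower: List[float] = []
    alarms: List[int] = []
    for t, e in enumerate(residuals):
        s_pos = max(0.0, s_pos + (e - k))  # accumulates positive shifts
        s_neg = min(0.0, s_neg + (e + k))  # accumulates negative shifts
        upper.append(s_pos)
        lower.append(s_neg)
        if s_pos > h or -s_neg > h:
            alarms.append(t)
            # Assumption: restart accumulation after each alarm.
            s_pos, s_neg = 0.0, 0.0
    return upper, lower, alarms


# Toy residuals: healthy (zero) followed by a sustained positive shift.
upper, lower, alarms = cusum([0.0] * 5 + [1.0] * 5, k=0.5, h=1.0)
```

In this toy run the upper score grows by 0.5 per step once the shift starts and first exceeds h on the third shifted sample, so `alarms` is `[7]`.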

The sensitivity (k) and threshold (h) parameters were systematically optimized via grid search for each model-component combination. The optimization minimized false alarms subject to successful detection within the 60-day window. For generator bearing detection, CUSUM optimization reduced LSTM false alarms from 14 to 6 (a 57% reduction) while maintaining a detection lead time of 1,120 hours (about 47 days). Component-specific requirements emerged: generator bearings benefit from a larger k (k=1.5), which discards small residuals so that only spike-based failures accumulate, while gearbox bearings need a smaller k (k=0.2) so that gradual degradation can accumulate.

CUSUM parameter grid search for XGBoost generator bearing detection

Grid search over CUSUM k and h for XGBoost generator bearing detection. Red regions fail to detect within 60 days; color intensity shows false alarm count for successful configurations.
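The constrained search described above (fewest false alarms, subject to at least one alarm inside the pre-failure window) can be sketched as follows. The function names, the toy residual series, and the tie-break in favor of longer lead time are assumptions for illustration; the thesis may have scored configurations differently.

```python
from itertools import product
from typing import Dict, List, Optional, Sequence


def alarm_indices(residuals: Sequence[float], k: float, h: float) -> List[int]:
    """Indices where a two-sided CUSUM exceeds h (scores reset after each alarm)."""
    s_pos, s_neg, alarms = 0.0, 0.0, []
    for t, e in enumerate(residuals):
        s_pos = max(0.0, s_pos + (e - k))
        s_neg = min(0.0, s_neg + (e + k))
        if s_pos > h or -s_neg > h:
            alarms.append(t)
            s_pos, s_neg = 0.0, 0.0
    return alarms


def grid_search(
    residuals: Sequence[float],
    failure_idx: int,
    window: int,
    k_values: Sequence[float],
    h_values: Sequence[float],
) -> Optional[Dict[str, float]]:
    """Pick (k, h) with the fewest false alarms among detecting configurations."""
    best = None
    for k, h in product(k_values, h_values):
        alarms = alarm_indices(residuals, k, h)
        in_window = [a for a in alarms if failure_idx - window <= a < failure_idx]
        if not in_window:
            continue  # fails the detection constraint
        false_alarms = sum(1 for a in alarms if a < failure_idx - window)
        lead_steps = failure_idx - in_window[0]
        # Assumption: ties on false alarms are broken by longer lead time.
        candidate = (false_alarms, -lead_steps, k, h)
        if best is None or candidate < best:
            best = candidate
    if best is None:
        return None
    return {"k": best[2], "h": best[3], "false_alarms": best[0], "lead_steps": -best[1]}


# Toy series: one isolated spike, then a slow drift before the failure at index 61.
toy_residuals = [0.0] * 20 + [3.0] + [0.0] * 20 + [0.6] * 20
result = grid_search(toy_residuals, failure_idx=61, window=20,
                     k_values=[0.2, 1.0], h_values=[1.0, 3.0])
```

On this toy series the larger k misses the drift entirely, while k=0.2 with the higher threshold ignores the isolated spike and still alarms inside the window, which mirrors the red (no detection) and shaded (false alarm count) regions of the grid.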

Best Result: 1 False Alarm

The best configuration combined XGBoost on raw (unfiltered) temperature data with a 100-day rolling mean bias correction. This achieved generator bearing failure detection with only 1 false alarm—and that single alarm corresponded to a real sensor replacement event. The CUSUM detected the failure through small but persistent prediction error drift, exactly the type of subtle signal the algorithm was designed for.

XGBoost generator bearing detection with bias correction achieving 1 false alarm

XGBoost generator bearing detection with 100-day rolling mean bias correction. CUSUM parameters k=0.03, h=1.5. Only 1 false alarm over the full test period.
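A rolling mean bias correction of this kind could look like the sketch below. It is a minimal illustration under stated assumptions: the correction uses only past residuals (a trailing window that excludes the current sample), and the window length is derived from the 10-minute SCADA interval. Whether the thesis used a trailing or centered window is not stated here.

```python
from collections import deque
from typing import Deque, Iterable, List

SAMPLES_PER_DAY = 24 * 6                 # 10-minute intervals
WINDOW_100_DAYS = 100 * SAMPLES_PER_DAY  # 14,400 samples


def rolling_mean_bias_correction(residuals: Iterable[float], window: int) -> List[float]:
    """Subtract the mean of the previous `window` residuals from each residual."""
    history: Deque[float] = deque(maxlen=window)
    running_sum = 0.0
    corrected: List[float] = []
    for e in residuals:
        # Assumption: the bias estimate uses past samples only (causal).
        bias = running_sum / len(history) if history else 0.0
        corrected.append(e - bias)
        if len(history) == window:
            running_sum -= history[0]  # value about to be evicted by the deque
        history.append(e)
        running_sum += e
    return corrected


# Toy check with a short window: a constant offset of 2.0 is removed,
# and a later jump to 5.0 stands out as a residual of 3.0.
corrected = rolling_mean_bias_correction([2.0, 2.0, 2.0, 5.0], window=2)
```

The corrected residuals would then be passed to CUSUM, which assumes a zero-centered input when the turbine is healthy.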

Key Takeaways

  • Better prediction accuracy does not mean better fault detection. The target lag tradeoff shows that overly accurate models can suppress the very degradation signals you need to detect.
  • Bias correction is essential. A 100-day rolling mean adjustment cut XGBoost false alarms from 7 to 1 by re-centering the residuals on zero, which is the condition CUSUM is designed to operate under.
  • Component-specific optimization is unavoidable. Generator and gearbox bearings require fundamentally different CUSUM parameters, feature counts, and preprocessing strategies.
  • XGBoost outperforms LSTM end-to-end. It leads not only in prediction accuracy (10-15% better RMSE/MAE) but also in detection robustness, owing to greater resilience to input magnitude variations.
  • Two distinct detection mechanisms emerge. Median filtering enables spike-based detection (longer lead times, event-dependent), while raw data with bias correction enables drift-based detection (more reliable, shorter lead times).