Wind speed prediction for site selection and reliable operation of wind power plants in coastal regions using machine learning algorithm variants

Mollick, Tajrian; Hashmi, Galib; Sabuj, Saifur Rahman

doi:10.1186/s40807-024-00098-z

Research
Open access
Published: 01 February 2024

Tajrian Mollick¹,
Galib Hashmi² &
Saifur Rahman Sabuj¹

Sustainable Energy Research volume 11, Article number: 5 (2024)

Abstract

Predicting wind speeds to support site selection and the consistent operation of wind power plants in coastal regions is a global challenge. The output of wind turbines fluctuates with changes in wind speed, and the unpredictable nature of wind patterns makes wind power facilities vulnerable. An effective strategy for addressing this unpredictability is to forecast wind speeds at specific locations during wind power plant operations. While previous research has explored various machine learning algorithms to tackle these issues, satisfactory results have not been achieved, and Bangladesh faces challenges in this regard, especially in low-wind speed areas. This study aims to identify the most accurate machine learning-based algorithm to forecast the short-term wind speed of two areas (Kutubdia and Cox's Bazar) located on the eastern coast of Bangladesh. Wind speed data spanning 21.5 years, from January 2001 to June 2022, were obtained from two sources: the Bangladesh Meteorological Department and the website of NASA. Wind speed has been forecasted using 14 different regression-based machine learning models, each of which is reviewed in the study. The experimental results highlight the exceptional predictive performance of a boosting-based ensemble method known as categorical boosting, especially when forecasting wind speed data obtained from NASA. On the testing data, the coefficients of determination are 0.8621 and 0.8758 for wind speed in Kutubdia and Cox's Bazar, respectively. The study underscores the critical importance of prioritizing optimal turbine site selection for wind power facilities in Bangladesh. This approach can benefit stakeholders, including engineers and project owners associated with wind projects.

Introduction

Rapid economic growth and improved lifestyles have increased human energy consumption. However, reliance on conventional fossil fuels like natural gas, coal, and oil results in pollution and contributes to global warming. As these resources are non-renewable and finite, nations increasingly invest in renewable energy sources to meet their present and future needs. Wind energy, being readily available and pollution-free, has emerged as a prominent renewable energy solution (Anjum, 2014; Bharani & Sivaprakasam, 2022). Therefore, wind power plants are rapidly evolving globally to address the growing demand for cleaner and more sustainable power. In the last 20 years, the installed capacity of wind power has grown rapidly, as depicted in Fig. 1, which shows global yearly wind power generation. Wind-generated power is projected to top the renewable energy sector by producing around 7932.5 TWh of electricity in 2030 (IEA, 2023). Several nations are currently taking appropriate action. However, for many countries, like Bangladesh, the contribution of wind power is quite minor.

Bangladesh aims to achieve high-income country status by 2041, which emphasizes the need for a sustainable and uninterrupted power supply to drive industrialization. With a forecasted electricity demand of 82,292 MW in 2041, the country faces challenges due to depleting natural gas reserves and dependency on imported fuels. The current energy mix relies heavily on natural gas, and the depletion of reserves by 2028 poses a threat (Babu et al., 2022). Diesel imports for power plants and nuclear power plant limitations further complicate the quest for self-sufficiency. Moreover, Bangladesh, which aims to reduce greenhouse gas emissions by 21.85% by 2030, faces the dual challenge of increasing energy consumption and decreasing CO₂ emissions to achieve the Sustainable Development Goals (SDGs) by 2030 and advanced nation status by 2041 (Das et al., 2020). The current energy mix of Bangladesh is natural gas 64.36%, furnace oil 21%, coal 33.54%, coal 9.52%, solar 0.84%, hydro 1.25%, and wind 0.01% (“Share of primary energy from wind” & Our World in Data, 2023). Embracing renewable energy practices becomes crucial for efficient energy utilization and environmental sustainability. The United States Agency for International Development (USAID), Bangladesh, and the Government of Bangladesh (GoB) collaborated to assist the National Renewable Energy Laboratory (NREL) in conducting a recent national wind resource assessment in Bangladesh (Babu et al., 2022). According to the NREL assessment, Bangladesh has more than 20,000 km² of land with a wind speed of 5.75–7.75 m/s, which corresponds to a gross wind potential of over 30,000 MW (Siddique et al., 2021). The findings indicate that the entire coastline area, e.g., Cox’s Bazar, Patenga, Teknaf, Kutubdia, Char Fassion, and Kuakata, falls into the zone that is commercially important for wind power production through small and medium-scale wind farms.

Therefore, if the right laws, programs, and technological innovations are implemented, wind can become a key contributor among renewable energies in tackling the energy crisis (Siami-Namini et al., 2018). However, wind energy is an intermittent renewable energy source (IRES) because its fluctuating nature means it cannot be dispatched on demand. Forecasting the wind speed of a location before constructing a wind power plant may be the answer to the unpredictability of wind speed. Moreover, accurate wind speed predictions during the operation of the wind power plant could aid stakeholders in making vital decisions, such as those regarding wind power storage or grid transmission activity (Shi et al., 2022). Thus, to identify optimal sites for wind energy plants and guarantee operational safety, researchers concentrate on developing precise predictions of wind speed (Babu et al., 2022).

A thorough study of the literature shows that wind speed forecasting approaches are classified along two basic dimensions: time horizon and modeling theory (as depicted in Fig. 2). In terms of time horizon, four kinds of wind speed prediction are possible: very short-time (a few seconds), short-time (30 min–6 h), medium-time (6 h–1 day), and long-time (more than 1 day) (Babu et al., 2022). Operational engineers, armed with advance predictions of wind speed ranging from the short term to the long term, can make a variety of decisions to optimize the performance and efficiency of wind energy operations. Based on wind speed estimates available three hours in advance, they can strategically optimize wind turbine operations by altering angles and speeds for maximal energy capture. They use energy storage based on anticipated wind conditions, distribute resources wisely, and effectively integrate wind energy into the power grid (Yousuf et al., 2019). Anticipated variations in energy production inform financial planning, while safety protocols are implemented in advance of extreme weather. To guarantee the efficient and secure operation of wind energy systems, engineers also plan grid connections and implement environmental impact mitigation strategies during certain wind conditions (Santhosh et al., 2020; Yousuf et al., 2019).

Similar to the time horizon, modeling theory is classified into four types of forecasting models: persistence methods, physical models, conventional statistical models, and models based on artificial intelligence (AI) (Chang, 2014). The persistence method tends to be more accurate than other wind forecasting techniques in very short-term forecasting. However, as the prediction horizon expands, its accuracy declines rapidly. Physical models are good for long-term forecasting, but they are time-consuming due to the numerous computations required. Statistical models are used to ascertain the mathematical relationship between inputs and outputs under the assumption of linear correlations. Despite their extensive use in research, their effectiveness has fallen short of expectations because they are ineffective in identifying nonlinear interactions (Chang, 2014). A large subset of AI is machine learning (ML), which aims to train the computer to comprehend situations and perform actions that are both advantageous and beneficial to the environment after training it on a previously stated dataset (Jagdale et al., 2022). An examination of existing literature reveals that ML algorithms can be categorized into supervised, unsupervised, semi-supervised, and reinforcement learning (Sarker, 2021). A supervised learning algorithm determines a mapping function that maps the input variable to the output variable. If the mapping function is a neural network with multiple hidden layers, the approach becomes deep learning (DL), a subclass of ML that can intelligently evaluate data on a large scale (Babu et al., 2022). ML and DL have been widely employed in the field of prediction because of their superior prediction capability over conventional prediction models (Tarek et al., 2023).
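The persistence method mentioned above can be illustrated with a minimal sketch. It assumes the common definition in which the forecast for a future step is simply the most recent observation; the function names and the sample wind speeds are hypothetical and are not taken from the study.

```python
def persistence_forecast(series, horizon=1):
    """Forecast each value as the observation `horizon` steps earlier."""
    if not 1 <= horizon < len(series):
        raise ValueError("horizon must be between 1 and len(series) - 1")
    return list(series[:-horizon])


def mean_absolute_error(actual, predicted):
    """Average absolute difference between paired values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical wind speeds in m/s at 3-h steps (illustrative values only).
speeds = [3.2, 3.5, 3.1, 4.0, 4.4, 3.9, 3.6, 3.8]

for horizon in (1, 2, 4):
    predicted = persistence_forecast(speeds, horizon)
    actual = speeds[horizon:]  # observations aligned with each forecast
    print(horizon, round(mean_absolute_error(actual, predicted), 3))
```

Because it needs no training, a persistence forecast of this kind is often used as a baseline against which learned models are compared.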

A detailed investigation of the literature shows that wind speed forecasting can be performed using the following ML algorithms: multiple linear regression (MLR), support vector regression (SVR), lasso regression, ridge regression, random forest (RF), light gradient boosting machine (LightGBM), extreme gradient boost (XGBoost), and long short-term memory networks (LSTM) (Elsaraiti & Merabet, 2021; Hanoon et al., 2022; Krishnaveni et al., 2021; Malakouti, 2023; Mohsin et al., 2021; Salah et al., 2022; Senthil Kumar P, 2019; Shawon et al., 2021; Xie et al., 2021). Air pressure, temperature, humidity, and wind speed were used as input variables in these models. Numerous studies pointed out that the multi-variable long short-term memory network model (MV-LSTM) methodology is more effective than techniques like autoregressive moving average (ARMA) and single-variable LSTM (Elsaraiti & Merabet, 2021). Additionally, different ML techniques, including bagged regression trees (BTs), SVR, and Gaussian process regression (GPR), were adopted by many researchers for the weekly prediction of wind speed (Hanoon et al., 2022). A variety of ML methods, such as MLR, ridge, lasso, RF, SVR, and LSTM, were applied in a different study to predict wind speed for a specific weather station. These models incorporated wind direction, temperature, pressure, timestamp, and other variables for precise estimation. Notably, the RF and LSTM-RNN models outperformed other approaches for accurate wind speed forecasting (Salah et al., 2022). To anticipate short-term wind speed at certain ground observation stations, an MV-LSTM was also developed in a different study (Xie et al., 2021). ML models were also deployed in another study to forecast wind speed and electricity generation in a SCADA system. Six techniques, including adaptive boosting (AdaBoost) and LightGBM, were applied in Malakouti (2023). Outcomes achieved from the ensemble technique with cross-validation were promising: the wind power and wind speed predictions had root mean square errors (RMSEs) of 11.78 and 0.2080, respectively. Although several studies have successfully used a variety of ML models to anticipate wind speed, there is still untapped potential to attain better outcomes given the inconstancy of wind. Previous studies have mostly concentrated on certain areas and used a limited set of ML models. There is a distinct need for more research on how these models perform in a wider range of geographic contexts, such as areas with changing climates or contrasting weather patterns. Moreover, earlier studies did not adequately consider important factors associated with site and turbine selection, leaving a substantial gap in addressing this crucial part of the study's goals.
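The two evaluation metrics named in this article, the root mean square error (RMSE) and the coefficient of determination (R²), can be computed as in the following sketch. It uses the standard textbook definitions; the function names and the sample values are hypothetical and do not reproduce any result reported in the studies cited above.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between paired observations and forecasts."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and forecast wind speeds in m/s.
observed = [2.0, 4.0, 6.0, 8.0]
forecast = [3.0, 4.0, 6.0, 7.0]

print(round(rmse(observed, forecast), 4))
print(round(r_squared(observed, forecast), 4))
```

RMSE is expressed in the units of the predicted quantity, so values for wind power and wind speed are not directly comparable, whereas R² is dimensionless.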

Wind energy initiatives in Bangladesh have been predominantly concentrated in specific regions, leaving a significant portion of the country unexplored in terms of wind energy projects. Few studies on short-term wind speed forecasting have been conducted in Bangladesh, which hinders the effective communication of mitigation and adaptation strategies to project stakeholders. This research addresses the knowledge gap by focusing on Kutubdia and Cox’s Bazar, situated in the southeastern region of Bangladesh and known for their favorable wind potential. This study employs fourteen well-established ML models to forecast wind speed at 3-h intervals, utilizing a 21.5-year weather dataset from the Bangladesh Meteorological Department (BMD) and NASA's website. The urban environment of the BMD stations suggests a relatively low wind potential, prompting the use of the NASA dataset, which reveals preferable wind energy availability. The application of diverse ML models enhances the accuracy of wind speed predictions, offering valuable insights for site and turbine selection, operational safety measures, and the uninterrupted performance of wind power systems.

System model

The comprehensive methodology employed for this research is fully depicted in Fig. 3. The six fundamental stages of this procedure are succinctly outlined below:

Step 1. Data collection and formatting: Initially, the observed data (wind speed, wind direction, temperature, humidity, and pressure) of the two coastal areas at 3-h intervals from January 1, 2001, to June 30, 2022 (62,808 data samples over 21.5 years) have been collected from two sources: i) BMD (Kutubdia and Cox's Bazar weather stations) and ii) the website of NASA (Data Access Viewer) (“POWER | Data Access Viewer”, 2023). Data formatting is done by removing irrelevant data and rearranging the required parts.
Step 2. Exploratory data analysis: Conducting exploratory data analysis aids in gaining a deeper understanding of the data's underlying patterns. It is fundamental to the structure of any machine-learning algorithm. In this part, descriptive statistics are analyzed to extract knowledge from the formatted data.
Step 3. Data preprocessing: Before applying the ML models, data preprocessing is an essential stage in shaping an optimal data structure. Poorly preprocessed data can compromise the efficiency and performance of machine-learning models, resulting in suboptimal outcomes. This preprocessing phase covers tasks such as handling missing values, extracting and selecting features, and normalizing data.
Step 4. Train-test splitting: Following preprocessing, the dataset is partitioned into three subsets: i) the training set (70%), ii) the validation set (15%), and iii) the test set (15%).
Step 5. Model optimization and training: In this stage, 14 distinct regression-based ML methods, including MLR, Lasso, Ridge, Elastic Net, KNN, DT, GBR, RF, XGBoost, LightGBM, CatBoost, LSTM, and GRU, are deployed to predict the wind speed three hours ahead. For model optimization, k-fold cross-validation is implemented with and without parameter tuning.
Step 6. Forecasting and performance evaluation: The trained model is assessed on the validation dataset, and its performance is compared with its performance on the training dataset. If the disparity is minimal, the forecasts for the test dataset are cross-checked against the observed data to ascertain the accuracy of the system. A comprehensive assessment of wind resources has been carried out using both observed and predicted wind speeds, demonstrating the detailed advantages of forecasting.
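The data sizes in Steps 1 and 4 can be checked with a short sketch. It assumes one reading every three hours with no gaps, a split taken in chronological order, and lagged wind speeds as inputs for the three-hour-ahead target; none of these details is confirmed by the text above, and all function names are hypothetical.

```python
from datetime import datetime, timedelta

START = datetime(2001, 1, 1)
END = datetime(2022, 7, 1)  # exclusive bound: first instant after June 30, 2022
STEP = timedelta(hours=3)

# Number of 3-h readings in the period (assumes a complete record).
n_samples = (END - START) // STEP
print(n_samples)


def split_sizes(n, train_frac=0.70, val_frac=0.15):
    """Sizes of the training, validation, and test subsets for n samples."""
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    return n_train, n_val, n - n_train - n_val


def make_supervised(series, n_lags):
    """Pair each value with the n_lags values that precede it.

    With 3-h data, the target is the reading three hours after the last lag.
    """
    features, targets = [], []
    for i in range(n_lags, len(series)):
        features.append(series[i - n_lags:i])
        targets.append(series[i])
    return features, targets


print(split_sizes(n_samples))

# Hypothetical wind speeds in m/s, used only to show the pairing.
X, y = make_supervised([3.2, 3.5, 3.1, 4.0, 4.4], n_lags=2)
print(X, y)
```

Under these assumptions the count matches the 62,808 samples stated in Step 1. A chronological split is only one option; it keeps future readings out of the training subset, which a shuffled split would not.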

Training models: renowned predictive algorithms

Predictive ML models based on regression present a versatile range of techniques for forecasting wind speed, each possessing unique strengths and suitability for different contexts. The choice of a model involves different aspects, including the properties of the data, the computing capacity, and the specific requirements of the forecasting goal. A brief synopsis of the models used in this study is provided in Table 1.

Table 1 Details of regression models utilized in this research
