
Arid Land Geography ›› 2020, Vol. 43 ›› Issue (4): 880-888. doi: 10.12118/j.issn.1000-6060.2020.04.03

• Climatology and Hydrology •

Drought prediction based on machine learning models in the northern part of Haihe River Basin

ZHAO Mei-yan¹, HU Tao¹, ZHANG Yu-hu², PU Xiao², GAO Feng³

  1. School of Mathematical Sciences, Capital Normal University, Beijing 100048, China;
  2. College of Resources Environment & Tourism, Capital Normal University, Beijing 100048, China;
  3. National Meteorological Information Center, Beijing 100081, China
  • Received: 2019-11-12  Revised: 2020-03-19  Online: 2020-07-25  Published: 2020-11-18

Abstract: Drought is one of the major natural disasters. Improving the accuracy of drought prediction provides reliable data to support drought response and risk prevention, and building suitable drought prediction models is a current research hotspot. Machine learning models such as the artificial neural network (ANN), wavelet neural network (WNN), support vector regression (SVR) and random forest (RF) are widely used for drought forecasting. This paper explores and compares the forecasting ability and stability of WNN, SVR and RF in the northern part of the Haihe River Basin, China. The region lies upstream of Beijing and Tianjin and is an important industrial and agricultural production area in China. Its total area is 8.34×10⁵ km². It has a temperate monsoon climate with an average annual precipitation of 490 mm. The models are based on the standardized precipitation index (SPI) at four time scales (3, 6, 9 and 12 months). The SPI was calculated from daily precipitation data recorded at eight meteorological stations in the northern part of the Haihe River Basin from 1960 to 2010. The SPI series were then predicted using the WNN, SVR and RF models separately. The three models were compared using the Kendall rank correlation coefficient (Kendall), the Kolmogorov-Smirnov (K-S) test and the mean absolute error (MAE). The following results were observed: (1) The prediction abilities of the WNN and SVR models vary across time scales, with WNN best suited to SPI-12 and SVR best suited to SPI-6. (2) For SPI-3 and SPI-12, RF gave the best prediction performance (Kendall > 0.898, MAE < 0.05); for SPI-6 and SPI-9, SVR gave the best prediction performance (Kendall > 0.95, MAE < 0.04). (3) The stability of prediction performance differed among the models, with RF being the most stable, followed by SVR.
(4) The differences in prediction performance can be explained as follows. The convex optimization underlying SVR avoids the WNN weakness of becoming trapped in a local optimum, which improves prediction performance. RF combines diversified regression trees, which reduces the negative influence of weak learners and improves prediction accuracy and stability. Furthermore, RF is the most capable of the three models at coping with noisy precipitation data. This paper presents a comprehensive analysis of the drought prediction performance of multiple models at multiple SPI time scales and preliminarily explores the mechanisms behind the differences among models. The results provide alternative models and research ideas for the northern part of the Haihe River Basin and beyond.
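
The three evaluation measures named in the abstract (Kendall rank correlation, the K-S test and MAE) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The function names, the tau-b variant of Kendall's coefficient, the use of a two-sample K-S statistic between observed and predicted SPI, and the sample values are all assumptions made for illustration only.

```python
from bisect import bisect_right
from math import sqrt


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def kendall_tau(observed, predicted):
    """Kendall tau-b rank correlation (assumed variant; handles ties)."""
    concordant = discordant = ties_x = ties_y = 0
    n = len(observed)
    for i in range(n):
        for j in range(i + 1, n):
            dx = observed[i] - observed[j]
            dy = predicted[i] - predicted[j]
            if dx == 0 and dy == 0:
                continue  # tied in both series: contributes to neither term
            if dx == 0:
                ties_x += 1
            elif dy == 0:
                ties_y += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    denom = sqrt((concordant + discordant + ties_x)
                 * (concordant + discordant + ties_y))
    return (concordant - discordant) / denom


def ks_statistic(sample_a, sample_b):
    """Two-sample K-S statistic: largest gap between the empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    largest_gap = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


if __name__ == "__main__":
    # Hypothetical SPI values, not data from the paper.
    spi_observed = [-1.2, -0.4, 0.1, 0.8, 1.5]
    spi_predicted = [-1.0, -0.5, 0.2, 0.7, 1.4]
    print("MAE:", round(mean_absolute_error(spi_observed, spi_predicted), 3))
    print("Kendall tau-b:", round(kendall_tau(spi_observed, spi_predicted), 3))
    print("K-S statistic:", round(ks_statistic(spi_observed, spi_predicted), 3))
```

Under this reading, a Kendall value close to 1 together with a small MAE and a small K-S statistic would indicate that a model reproduces both the ordering and the distribution of the observed SPI series.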

Key words: drought, SVR, RF, WNN, SPI, the northern part of Haihe River Basin