Drought prediction based on machine learning models in the northern part of Haihe River Basin

doi:10.12118/j.issn.1000-6060.2020.04.03

Abstract

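Abstract (translated from the Chinese): Improving the accuracy of drought prediction provides reliable data support for drought response and risk prevention in river basins, and building and selecting suitable drought models is a current research focus. Using the standardized precipitation index (SPI) at four time scales (3, 6, 9 and 12 months) as the drought indicator, this study built drought prediction models for the northern part of the Haihe River Basin with three machine learning algorithms, wavelet neural network (WNN), support vector regression (SVR) and random forest (RF), and assessed model performance and stability with three tests: Kendall, K-S and MAE. The results show that: (1) The performance of the WNN and SVR models differs across SPI time scales; WNN is best suited to predicting the 12-month SPI and SVR to the 6-month SPI. (2) For the 3- and 12-month SPI, RF performs best (Kendall > 0.898, MAE < 0.05); for the 6- and 9-month SPI, SVR performs best (Kendall > 0.95, MAE < 0.04). (3) The models differ in the stability of their predictive performance: RF is the most stable, followed by SVR. (4) The differences among the three models arise mainly because SVR recasts training as a convex optimization problem, which overcomes the tendency of WNN to become trapped in local optima and so improves predictive performance, while RF ensembles diverse regression trees, which reduces the negative influence of weak learners and improves both prediction accuracy and stability; RF is also better able to handle noisy precipitation data.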

Keywords: drought, WNN, SVR, RF, SPI, northern part of the Haihe River Basin

Abstract: Drought is one of the major natural disasters. Improving the accuracy of drought prediction provides reliable data to support drought response and risk prevention, and constructing suitable drought prediction models is a current research hotspot. Machine learning models such as the artificial neural network (ANN), wavelet neural network (WNN), support vector regression (SVR) and random forest (RF) are widely used for drought forecasting. This paper explores and compares the forecasting ability and stability of WNN, SVR and RF in the northern part of the Haihe River Basin, China. The northern part of the Haihe River Basin lies upstream of Beijing and Tianjin and is an important industrial and agricultural production area in China. Its total area is 8.34×10⁵ km². It has a temperate monsoon climate with an average annual precipitation of 490 mm. The models are based on the standardized precipitation index (SPI) at four time scales (3, 6, 9 and 12 months). The SPI was calculated from daily precipitation data recorded at eight meteorological stations in the northern part of the Haihe River Basin from 1960 to 2010. The SPI series were then predicted using the WNN, SVR and RF models separately. The three machine learning models were compared using the Kendall rank correlation (Kendall), the Kolmogorov-Smirnov (K-S) test and the mean absolute error (MAE). The results are as follows: (1) The prediction ability of the WNN and SVR models varies across time scales, with WNN best suited to SPI-12 and SVR best suited to SPI-6. (2) For SPI-3 and SPI-12, RF gave the best prediction performance (Kendall > 0.898, MAE < 0.05). For SPI-6 and SPI-9, SVR gave the best prediction performance (Kendall > 0.95, MAE < 0.04). (3) The stability of prediction performance differed among the models, with RF the most stable, followed by SVR. (4) The differences in prediction performance arise mainly for the following reasons: SVR is solved as a convex optimization problem, which overcomes the tendency of WNN to fall into a local optimum and thereby improves prediction performance. RF ensembles diverse regression trees, which reduces the negative influence of weak learners and improves the prediction accuracy and stability of the model. Furthermore, RF copes better than the other models with precipitation data that contain noise. This paper presents a comprehensive analysis of the drought prediction performance of multiple models at multiple SPI time scales and makes a preliminary exploration of the mechanisms behind the differences among models. The results provide alternative models and research ideas for the northern part of the Haihe River Basin and beyond.

Key words: drought, SVR, RF, WNN, SPI, the northern part of Haihe River Basin
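
The three evaluation measures named in the abstract (Kendall, K-S and MAE) can be illustrated with a minimal sketch. The abstract does not state which variants were used, so the choices here are assumptions: Kendall's tau-a without tie correction, the two-sample K-S statistic between observed and predicted SPI values, and the plain mean absolute error. The function names and the sample SPI values are hypothetical and are not taken from the paper.

```python
from bisect import bisect_right
from itertools import combinations


def kendall_tau(observed, predicted):
    """Kendall rank correlation (tau-a): (concordant - discordant) / all pairs."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length series with at least 2 values")
    concordant = discordant = 0
    for i, j in combinations(range(len(observed)), 2):
        s = (observed[i] - observed[j]) * (predicted[i] - predicted[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n_pairs = len(observed) * (len(observed) - 1) // 2
    return (concordant - discordant) / n_pairs


def ks_statistic(observed, predicted):
    """Two-sample K-S statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(observed), sorted(predicted)
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


def mean_absolute_error(observed, predicted):
    """Average absolute difference between paired observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two equal-length, non-empty series")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical SPI values, for illustration only (not data from the study).
spi_observed = [-1.2, -0.4, 0.3, 0.9, 1.5]
spi_predicted = [-1.0, -0.5, 0.1, 1.1, 1.4]

tau = kendall_tau(spi_observed, spi_predicted)
ks = ks_statistic(spi_observed, spi_predicted)
mae = mean_absolute_error(spi_observed, spi_predicted)
```

Under these assumptions, a higher Kendall value together with a lower MAE indicates closer agreement between predicted and observed SPI, which matches how the thresholds in results (2) are reported. A smaller K-S statistic indicates that the predicted and observed values have more similar distributions.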

ZHAO Mei-yan (赵美言), HU Tao (胡涛), ZHANG Yu-hu (张玉虎), PU Xiao (蒲晓), GAO Feng (高峰). Drought prediction based on machine learning models in the northern part of Haihe River Basin[J]. Arid Land Geography (干旱区地理), 2020, 43(4): 880-888.
