
Arid Land Geography, 2019, Vol. 42, Issue (6): 1404-1414. doi: 10.12118/j.issn.1000-6060.2019.06.20

• Earth Information Science •

Hyperspectral detection of soil organic matter content based on random forest algorithm

BAO Qing-ling1,2, DING Jian-li1,2, WANG Jing-zhe1,2, CAI Liang-hong1,2

  1. Key Laboratory of Wisdom City and Environmental Modeling of Autonomous Region Higher Education Institutions, College of Resources and Environmental Science, Xinjiang University, Urumqi 830046, Xinjiang, China
  2. Key Laboratory of Oasis Ecology, Ministry of Education, Xinjiang University, Urumqi 830046, Xinjiang, China
  • Received: 2019-03-20 Revised: 2019-07-18 Online: 2019-11-15 Published: 2019-11-18
  • Corresponding author: DING Jian-li
  • About the author: BAO Qing-ling (1993-), male, master's student, from Ili, Xinjiang, mainly engaged in soil remote sensing research. E-mail: 13139805801@163.com
  • Funding:
    Special Fund for Key Laboratories of the Xinjiang Autonomous Region (2016D03001); Xinjiang Autonomous Region Science and Technology Support Project (201591101

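Abstract (translated from the Chinese): To find an approach that both retains spectral information and allows rapid, accurate detection of soil organic matter content, this study used 73 soil sampling points within the Weigan River-Kuqa Oasis in southern Xinjiang together with their corresponding hyperspectral data. The spectral data were preprocessed with the wavelet transform and mathematical transformations, and the spectral curves of the wavelet-decomposed and reconstructed spectra were compared across organic matter contents and soil types. Correlation analysis was used to determine the maximum wavelet decomposition level and to screen sensitive bands. Grey relational analysis and a random forest prediction and classification model were then used to analyze the importance of the characteristic spectra at each decomposition level, and finally a multiple linear prediction model was built on the optimal characteristic spectra and evaluated. The results show that: (1) The spectral curves of cultivated soil and forest soil vary more gently than those of saline soil and desert soil, and at the water absorption bands the saline soil curve has the deepest absorption valley. (2) The correlation between the wavelet-decomposed spectra and soil organic matter content first decreases and then increases as the number of decomposition levels grows; at level 6 the characteristic spectral curve and the number of sensitive bands stabilize, so level 6 was taken as the maximum decomposition level of the wavelet transform. (3) Compared with grey relational analysis, the random forest model screened the factors of each wavelet decomposition level in line with expectations; ranked from highest to lowest influence on soil organic matter content, they are L3-(1/LgR)′, L4-(1/LgR)′, L6-(1/LgR)′, L5-(1/LgR)′, L2-(1/LgR)′, L0-1/LgR, L1-1/LgR. (4) Among the wavelet-decomposed spectra, characteristic spectra in the mid-frequency range estimate soil organic matter content in arid areas better than those in the high- and low-frequency ranges, and the model built on L-MC has the highest accuracy. The study shows that monitoring soil organic matter content from soil spectra with a machine learning classification method combined with wavelet decomposition can effectively reduce interference from noisy bands and improve the classification and prediction accuracy of characteristic bands.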
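The spectral transformations named above, 1/LgR and its first derivative (1/LgR)′, and the correlation screening of sensitive bands can be illustrated with a minimal Python sketch. It assumes that LgR is the base-10 logarithm of reflectance R with 0 < R < 1, approximates the derivative by finite differences, and selects bands by an absolute Pearson correlation threshold. The threshold value and all function names are illustrative assumptions, not the authors' procedure.

```python
import math

def inv_log_reflectance(spectrum):
    """Return 1/lg(R) per band; assumes 0 < R < 1 so lg(R) is never zero."""
    return [1.0 / math.log10(r) for r in spectrum]

def first_derivative(values, wavelengths):
    """Finite-difference derivative: central inside, one-sided at both ends.

    Needs at least two bands with strictly increasing wavelengths.
    """
    n = len(values)
    deriv = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        deriv.append((values[hi] - values[lo]) / (wavelengths[hi] - wavelengths[lo]))
    return deriv

def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def sensitive_bands(spectra, som, threshold=0.5):
    """Indices of bands whose |r| with SOM reaches the (assumed) threshold.

    spectra: one list of band values per soil sample; som: SOM content per sample.
    """
    picked = []
    for j in range(len(spectra[0])):
        column = [sample[j] for sample in spectra]
        if abs(pearson(column, som)) >= threshold:
            picked.append(j)
    return picked
```

In practice the screening criterion is often a significance level rather than a fixed |r| cutoff; the fixed cutoff is used here only to keep the sketch short.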


Abstract:

To explore how to retain spectral information while accurately detecting soil organic matter (SOM) content, this paper investigated the use of spectral processing techniques, namely wavelet decomposition and the random forest method, to estimate SOM content from spectroscopy data and to analyze the spectral curves of the wavelet-decomposed and reconstructed spectra in different soil types. Soil samples were collected in the Weigan River Oasis of Kuqa County, a typical arid-area oasis in the north-central Tarim Basin, Xinjiang, China, and their SOM content was determined. The ASD FieldSpec FR was used to measure the spectra of the soil samples, and the spectral data were preprocessed by wavelet decomposition and mathematical transformation. The discrete wavelet transform (DWT) supports multi-scale analysis, so it was used to decompose the soil near-infrared spectra at multiple scales and to compare the spectral curves of the reconstructed spectra across organic matter contents and soil types. Correlation analysis was used to determine the maximum wavelet decomposition layer and to filter sensitive bands. Grey correlation analysis and the random forest method were then combined to analyze the significance of the characteristic spectra at each decomposition layer, and finally a multivariate linear prediction model of SOM content was established from the optimal characteristic spectra. The results were as follows: (1) The reflectance of each wavelet-decomposed spectrum decreases as organic matter content increases, and the spectral curves of cultivated soil and forest soil change more gradually than those of saline soil and desert soil.
(2) The correlation between the wavelet-decomposed spectra and SOM content first decreases and then increases as the decomposition layer increases. At the sixth layer, the characteristic spectral curve and the number of sensitive bands become stable, so this layer was taken as the maximum decomposition layer of the wavelet transform. (3) Compared with grey correlation analysis, the random forest model screens the factors of each wavelet decomposition layer in line with expectations, ranking them in descending order of impact on SOM content as L3-(1/LgR)′, L4-(1/LgR)′, L6-(1/LgR)′, L5-(1/LgR)′, L2-(1/LgR)′, L0-1/LgR, L1-1/LgR. (4) Across all SOM estimation models, the model based on L-MC has the highest accuracy. The research shows that monitoring SOM content from soil spectra with a machine learning classification method combined with wavelet decomposition can effectively reduce noise band interference and improve the classification and prediction accuracy of feature bands. The random forest prediction and classification model has significant advantages over traditional linear prediction and classification approaches such as grey correlation analysis: it not only outperforms grey correlation analysis in statistical results, but also shows better reliability and stability in predictive ability. The results could provide a scientific reference and support for the study of soil nutrients in the arid zone and for local precision agriculture.
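The multi-scale decomposition and reconstruction described in the abstract can be sketched as follows. The abstract does not name the wavelet basis, so this example assumes the Haar wavelet purely for simplicity, and it assumes that the spectrum at layer n is the low-frequency reconstruction after n decomposition steps (with layer 0 being the original spectrum). The function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_step(signal):
    """One Haar DWT level: return (approximation, detail) of an even-length signal."""
    approx = [(signal[i] + signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse_step(approx, detail):
    """Invert one Haar level, interleaving the two recovered samples per pair."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out

def reconstruct_level(signal, level):
    """Low-frequency reconstruction at `level`.

    Decomposes `level` times, discards (zeroes) all detail coefficients,
    and inverts back to the original length. The signal length must be
    divisible by 2**level in this sketch; real spectra would need padding.
    """
    approx = list(signal)
    for _ in range(level):
        approx, _detail = haar_step(approx)
    for _ in range(level):
        approx = haar_inverse_step(approx, [0.0] * len(approx))
    return approx

# Example: each added layer smooths the toy spectrum further.
spectrum = [1.0, 3.0, 5.0, 7.0, 2.0, 2.0, 6.0, 10.0]
layers = {n: reconstruct_level(spectrum, n) for n in range(4)}
```

With the Haar basis, layer n simply replaces every block of 2**n neighbouring bands by its mean, which makes the loss of fine spectral detail at deeper layers easy to see.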

Key words: hyperspectral, soil organic matter content, wavelet transform, random forest
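For readers unfamiliar with the grey correlation (grey relational) analysis that the abstract compares against random forest, a minimal sketch of the commonly used Deng formulation is given below. It assumes min-max scaling of every sequence and the conventional distinguishing coefficient ρ = 0.5; neither choice is stated in the abstract, and the function names are hypothetical.

```python
def min_max_scale(seq):
    """Scale a sequence to [0, 1]; assumes the sequence is not constant."""
    lo, hi = min(seq), max(seq)
    return [(v - lo) / (hi - lo) for v in seq]

def grey_relational_grades(reference, factors, rho=0.5):
    """Grey relational grade of each factor sequence against the reference.

    reference: e.g. SOM content per sample.
    factors: list of sequences, e.g. one spectral feature per wavelet layer.
    rho: distinguishing coefficient (0.5 is the conventional default, assumed here).
    """
    ref = min_max_scale(reference)
    scaled = [min_max_scale(f) for f in factors]
    # Absolute difference between reference and each factor at every sample.
    deltas = [[abs(r - v) for r, v in zip(ref, f)] for f in scaled]
    d_min = min(min(row) for row in deltas)
    d_max = max(max(row) for row in deltas)
    if d_max == 0:
        # Every factor coincides with the reference after scaling.
        return [1.0] * len(factors)
    grades = []
    for row in deltas:
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        grades.append(sum(coeffs) / len(coeffs))
    return grades

# Example: the first factor tracks the reference exactly, the second does not.
grades = grey_relational_grades([1, 2, 3, 4], [[2, 4, 6, 8], [4, 1, 3, 2]])
```

A higher grade means the factor's shape follows the reference more closely, so factors can be ranked by grade in the same way the abstract ranks the wavelet layers.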