Agricultural cropland extent and areas of South Asia derived using Landsat satellite 30-m time-series big-data using random forest machine learning algorithms on the Google Earth Engine cloud

Gumma, M K and Thenkabail, P S and Teluguntla, P G and Oliphant, A and Xiong, J and Giri, C and Pyla, V and Dixit, S and Whitbread, A M (2019) Agricultural cropland extent and areas of South Asia derived using Landsat satellite 30-m time-series big-data using random forest machine learning algorithms on the Google Earth Engine cloud. GIScience & Remote Sensing (TSI). pp. 1-21. ISSN 1548-1603


South Asia (India, Pakistan, Bangladesh, Nepal, Sri Lanka and Bhutan) has a staggering 900 million people (~43% of the population) who face food insecurity or severe food insecurity according to the Food Insecurity Experience Scale (FIES) of the Food and Agriculture Organization (FAO) of the United Nations. The existing coarse-resolution (≥250-m) cropland maps lack precision in the geo-location of individual farms and have low map accuracies. This also results in uncertainties in the cropland areas calculated from such products. Therefore, the overarching goal of this study was to develop a high spatial resolution (30-m or better) baseline cropland extent product of South Asia for the year 2015 using Landsat satellite time-series big-data and machine learning algorithms (MLAs) on the Google Earth Engine (GEE) cloud computing platform. To eliminate the impact of clouds, 10 time-composited Landsat bands (blue, green, red, NIR, SWIR1, SWIR2, Thermal, EVI, NDVI, NDWI) were derived for each of three time periods over 12 months (monsoon: Days of the Year (DOY) 151–300; winter: DOY 301–365 plus 1–60; and summer: DOY 61–150), using the 8-day data from Landsat-8 and Landsat-7 for the years 2013–2015, for a total of 30 bands plus a slope band derived from the global digital elevation model (GDEM). This 31-band mega-file big data-cube was composed for each of the five agro-ecological zones (AEZs) of South Asia and formed the baseline data for image classification and analysis. The knowledge base for the Random Forest (RF) MLAs was developed using spatially well spread-out reference training data (N = 2179) in the five AEZs. The classification was performed on GEE for each of the five AEZs using the well-established knowledge base and RF MLAs on the cloud. Map accuracies were measured using independent validation data (N = 1185).
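The seasonal windows and the 31-band count described above can be illustrated with a minimal Python sketch. It is not the authors' GEE code: the function names, the band-name pattern and the rejection of DOY 366 are assumptions made for illustration, while the DOY ranges and the band list are taken from the abstract.

```python
# Seasonal windows as inclusive day-of-year (DOY) ranges, as stated in the abstract.
SEASONS = {
    "monsoon": [(151, 300)],
    "winter": [(301, 365), (1, 60)],  # winter wraps around the end of the year
    "summer": [(61, 150)],
}

# The 10 time-composited bands listed in the abstract.
BANDS = ["blue", "green", "red", "NIR", "SWIR1", "SWIR2",
         "thermal", "EVI", "NDVI", "NDWI"]


def season_for_doy(doy):
    """Return the season whose DOY window contains `doy` (hypothetical helper)."""
    # Assumption: only DOY 1-365 is handled, because the abstract does not
    # say how a leap-year day 366 was treated.
    if not 1 <= doy <= 365:
        raise ValueError(f"DOY out of range: {doy}")
    for name, ranges in SEASONS.items():
        for start, end in ranges:
            if start <= doy <= end:
                return name
    raise AssertionError("the windows should cover DOY 1-365")


def stack_band_names():
    """List the band names of the data cube: 3 seasons x 10 bands, plus slope."""
    # The "<season>_<band>" naming pattern is an illustrative assumption.
    names = [f"{season}_{band}" for season in SEASONS for band in BANDS]
    names.append("slope")  # slope band derived from the GDEM
    return names


if __name__ == "__main__":
    print(season_for_doy(200))
    print(len(stack_band_names()))
```

The three windows partition DOY 1–365 without gaps or overlaps, so every 8-day acquisition falls into exactly one seasonal composite.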
The survey showed that the South Asia cropland product had a producer’s accuracy of 89.9% (errors of omission of 10.1%), a user’s accuracy of 95.3% (errors of commission of 4.7%) and an overall accuracy of 88.7%. The national and sub-national (district) areas computed from this cropland extent product explained 80–96% of the variability when compared with the national statistics of the South Asian countries. The full-resolution imagery can be viewed by zooming in to any location in South Asia or the world at www.croplands.org, and the cropland products of South Asia can be downloaded from the Land Processes Distributed Active Archive Center (LP DAAC) of the National Aeronautics and Space Administration (NASA) and the United States Geological Survey (USGS):
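For readers unfamiliar with the accuracy terms used above, the following Python sketch shows how producer's, user's and overall accuracy relate to a two-class (cropland versus non-cropland) error matrix. The function name and the sample counts are hypothetical and are not the study's validation data.

```python
def cropland_accuracy(tp, fn, fp, tn):
    """Accuracy measures for the cropland class from a 2x2 error matrix.

    tp: reference cropland samples mapped as cropland
    fn: reference cropland samples mapped as non-cropland (omission errors)
    fp: reference non-cropland samples mapped as cropland (commission errors)
    tn: reference non-cropland samples mapped as non-cropland
    """
    total = tp + fn + fp + tn
    return {
        # Producer's accuracy = 1 - omission error rate.
        "producers_accuracy": tp / (tp + fn),
        # User's accuracy = 1 - commission error rate.
        "users_accuracy": tp / (tp + fp),
        # Overall accuracy counts correct samples of both classes.
        "overall_accuracy": (tp + tn) / total,
    }


if __name__ == "__main__":
    # Illustrative counts only, not taken from the paper.
    for name, value in cropland_accuracy(tp=90, fn=10, fp=5, tn=95).items():
        print(f"{name}: {value:.3f}")
```

Because overall accuracy also depends on how well the non-cropland class is mapped, it can be lower than both the producer's and the user's accuracy of the cropland class.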

Item Type: Article
Divisions: Research Program : Innovation Systems for the Drylands (ISD)
CRP: CGIAR Research Program on Climate Change, Agriculture and Food Security (CCAFS)
CGIAR Research Program on Water, Land and Ecosystems (WLE)
CGIAR Research Program on Grain Legumes and Dryland Cereals (GLDC)
Uncontrolled Keywords: Google Earth Engine, random forest, Landsat, cloud computing, 30-m South Asia croplands
Subjects: Others > GIS Techniques/Remote Sensing
Others > South Asia
Others > Food Security
Depositing User: Mr Arun S
Date Deposited: 04 Feb 2020 06:27
Last Modified: 15 Mar 2021 09:01
Official URL:
Acknowledgement: This research was supported by the CGIAR Research Program on Climate Change, Agriculture and Food Security (CCAFS) and the CGIAR Research Program on Water, Land and Ecosystems (WLE), which are carried out with support from the CGIAR Trust Fund and through bilateral funding agreements. For details visit and donors. Funding was also provided by NASA MEaSUREs, through the NASA ROSES solicitation, for a period of 5 years (1 June 2013 to 31 May 2018). The NASA Making Earth System Data Records for Use in Research Environments (MEaSUREs) grant number is NNH13AV82I and the USGS Sales Order number is 29039. We gratefully acknowledge this support. The United States Geological Survey (USGS) provided significant direct and indirect supplemental funding through its Land Resources Mission Area (LRMA), National Land Imaging Program (NLIP), and Land Change Science (LCS) program. We gratefully acknowledge this. This research is a part of the Global Food Security-support Analysis Data Project at 30-m (GFSAD30). The project was led by USGS in collaboration with NASA Ames, the University of New Hampshire (UNH), the University of Wisconsin (UW), NASA Goddard Space Flight Center (GSFC), and Northern Arizona University (NAU). Finally, special thanks to Ms. Susan Benjamin, Director of the Western Geographic Science Center (WGSC) of USGS, and Mr. Larry Gaffney, Administrative Officer of WGSC of USGS, for their support and encouragement throughout this research. We would like to thank Dr Sunil Dubey, Assistant Director, MNCFC, for providing sub-national statistics.