1. Introduction
Land cover (LC) is an important variable for studying the functional and morphofunctional changes occurring in the global ecological environment (Feddema et al., 2005). Thus, LC mapping is considered an important field of study related to global environmental change and sustainability (Running, 2008; Zhang et al., 2014). Through the efforts of many scientific communities, data from satellite sensor systems have been applied to global land cover (GLC) mapping because of their frequent coverage of the earth’s surface. To date, several GLC datasets based on remote sensing data have arisen from various initiatives, such as the International Geosphere-Biosphere Programme (IGBP) LC dataset (Loveland and Belward, 1997; Loveland et al., 2000), University of Maryland (UMD) LC dataset (Hansen et al., 2000), global land cover 2000 (GLC2000) dataset (Bartholomé and Belward, 2005), global LC map (GlobCover) (Arino et al., 2008; Bontemps et al., 2011), Moderate Resolution Imaging Spectroradiometer (MODIS) LC dataset (MOD12Q1 and MCD12Q1) (Friedl et al., 2002; Friedl et al., 2010), and 30-m global LC dataset (GlobeLand30) with 10 classes for the years 2000 and 2010 (Chen et al., 2011).
The primary purpose of these global mapping projects is to serve the scientific and research communities by producing a variety of GLC datasets (Russell et al., 2014). These GLC datasets reflect the characteristics of Earth’s surface well. However, LC datasets that are derived from different satellite sensors, with differing spatial resolutions, classification methods, classification schemes, and approaches, are certain to have some discrepancies (Yang et al., 2017). Reported global area-weighted accuracies are 66.9% (IGBP), 69% (UMD), 68.6% (GLC2000), 67.1% (GlobCover2009), 78.3% (MODIS), and more than 80% (GlobeLand30).
Although these GLC datasets have been widely applied in various fields in Korea, they have some drawbacks. First, there are many validated GLC datasets in China according to the reference data portals provided by the Global Observation of Forest Cover and Land Cover Dynamics (GOFC-GOLD) program. However, only one validation point is present in MODIS MCD12Q1 (Gwangneung coniferous forest, Korea). Moreover, there are no validation points for GLC2000 or GlobCover2009 (https://www.gofcgold.wur.nl/) in South Korea. Therefore, users of GLC datasets may question whether existing GLC datasets are capable of reflecting LC in South Korea. In addition, much of the land cover in some GLC datasets has been misclassified as unrealistic types such as savanna or snow and ice (Park and Suh, 2015).
The purpose of this study is to assess the classification accuracy of several GLC datasets that are widely used in Korea. The primary steps and contributions of this study are summarized as follows: (1) four GLC datasets (GLC2000, GlobCover2009, MCD12Q1, and GlobeLand30) were selected for evaluation, using a level-2 LC map of South Korea as a reference dataset to assess accuracy; (2) the level-2 LC map of Korea was resampled to the spatial resolution of each GLC dataset; and (3) four accuracy measures (i.e., overall, user’s, and producer’s accuracies, and the kappa coefficient) were derived from the confusion matrix established for each GLC map based on the reference dataset.
2. Data and Methodology
1) Global land cover datasets
The importance of LC data has been recently recognized in Korea, as GLC datasets are very useful in numerous fields including environmental and ecosystem management and meteorology. The purpose of this study is to assess the accuracy of four GLC datasets (GLC2000, GlobCover2009, MCD12Q1, and GlobeLand30) that are widely used in Korea.
The global LC for the year 2000 (GLC2000) was produced by an international partnership of 30 research groups coordinated by the European Joint Research Centre (JRC) (Bartholomé and Belward, 2005). GLC2000 makes use of the VEGA2000 dataset, which includes 14 months of preprocessed daily global data acquired by the VEGETATION instrument onboard the SPOT 4 satellite, and was derived from 19 regional subsets with region-specific classification schemes. GLC2000 follows the land-cover classification system (LCCS), a major contribution developed by the Food and Agriculture Organization (FAO) of the United Nations (UN). The overall accuracy of GLC2000 was 68.6%, as assessed using Landsat data and 1265 samples distributed throughout Asia, Africa, and Europe (Mayaux et al., 2006).
The GlobCover2009 project was developed to generate a global LC map at 300-m spatial resolution using full-resolution data acquired by the Medium Resolution Imaging Spectrometer (MERIS) sensor onboard the Environmental Satellite (ENVISAT) from 1 January 2009 to 31 December 2009. This product comprises 22 LC classes defined by the UN LCCS. The classification module consists of per-pixel supervised classification for the urban and wetland classes and unsupervised classification for the remaining classes to create clusters with similar spectral and temporal characteristics (Congalton et al., 2014). For GlobCover2009, Bontemps et al. (2011) assessed the overall accuracy weighted by the class area, which reached 67.5% using a set of 2,190 globally homogeneous and heterogeneous points.
Table 1. Characteristics of the four global land cover (GLC) datasets used in this study
The MODIS LC dataset (MCD12Q1) has been widely applied. Collection 5 MCD12Q1 was released in 2009 and has an annual update cycle. MCD12Q1 provides five data layers, corresponding to five global LC schemes, with a spatial resolution of 500 m. In this study, we used the MCD12Q1 layer based on the IGBP classification scheme. The IGBP layer was classified using a decision tree algorithm that analyzed a full year of 8-day MODIS nadir bidirectional reflectance distribution function (BRDF)-adjusted reflectance (NBAR) data. The cross-validation for MCD12Q1 2010 reported per-class user’s accuracies in the range of 60% to 90%, with the overall area-weighted global accuracy estimated at 78.3% (Friedl et al., 2010).
Finally, the most recent GLC dataset is GlobeLand30. This product was presented to the UN by the National Administration of Surveying, Mapping and Geoinformation of China (NASG) in September 2014. This dataset offers a detailed portrait of land cover, with 10 classes for 2000 and 2010. It is the first GLC product with a fine spatial resolution of 30 m and is based on Landsat Thematic Mapper (TM), Enhanced Thematic Mapper Plus (ETM+), and multispectral images from the Chinese Environmental Disaster Alleviation Satellite (HJ-1). Nine LC types (tundra is not included) and a total of 159,874 sample pixels were evaluated for accuracy assessment of GlobeLand30 2010. The overall accuracy of the product was estimated to be greater than 80%, and the kappa coefficient was 0.75 (Chen et al., 2015).
2) Reference dataset
The Ministry of Environment (MOE) of South Korea offers three types of LC maps, which are divided into level-1, level-2, and level-3 LC maps according to their spatial resolution. The level-2 LC map was produced in 2007 and classified Korea into 22 types using Landsat TM+, IRS-1C, SPOT5, and KOMPSAT2 satellite imagery (Fig. 1). In this study, a reference dataset was produced using the level-2 LC map for accuracy assessment of the four GLC maps. Most GLC maps have different class types and class numbers based on their classification schemes. Thus, we aggregated the different classes found in the GLC datasets into 7 classes. In addition, the 22 classes of the level-2 LC map were aggregated into the same 7 classes (Table 2). The aggregated level-2 map was then converted into four reference datasets, one at the spatial resolution of each GLC dataset (Fig. 2).
Fig. 1. 2007 Level-2 land cover (LC) map of South Korea with 22 classes.
Fig. 2. Reference datasets based on the spatial resolution of each GLC dataset. (a) 30 m, (b) 300 m, (c) 500 m, and (d) 1,000 m.
Table 2. Classification scheme employed in this study. The GLC schemes are converted to match the 7 classes
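The aggregation and resampling steps described in this section can be sketched as follows. This is only an illustration under stated assumptions: the study does not say which resampling rule was applied, so a majority (modal) rule is assumed here; the class codes in `AGGREGATION` and both function names are hypothetical, and the actual 22-to-7 mapping is the one given in Table 2.

```python
from collections import Counter

# Hypothetical lookup from fine class codes to aggregated class names.
# The real 22-to-7 mapping is listed in Table 2 and is not reproduced here.
AGGREGATION = {
    11: "urban", 12: "urban",
    21: "croplands",
    31: "forest", 32: "forest",
    71: "water",
}


def aggregate(grid, lookup):
    """Relabel every cell of a 2-D class grid through a lookup table."""
    return [[lookup[code] for code in row] for row in grid]


def majority_resample(grid, factor):
    """Coarsen a grid by keeping the most frequent class in each
    factor x factor block (assumed rule, not stated in the study)."""
    rows, cols = len(grid), len(grid[0])
    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            block = [
                grid[i][j]
                for i in range(r, min(r + factor, rows))
                for j in range(c, min(c + factor, cols))
            ]
            coarse_row.append(Counter(block).most_common(1)[0][0])
        coarse.append(coarse_row)
    return coarse


# Toy 4 x 4 fine-resolution grid coarsened by a factor of 2.
fine = [
    [31, 31, 11, 12],
    [31, 21, 11, 11],
    [71, 71, 21, 21],
    [71, 31, 21, 31],
]
reference = majority_resample(aggregate(fine, AGGREGATION), 2)
print(reference)
```

Aggregating before resampling, as done here, lets sub-classes of the same aggregated class vote together within a block; reversing the order could give a different result for mixed blocks.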
3) Kappa analysis
Confusion matrices and kappa analysis were used to assess the accuracy of the four GLC maps. Kappa analysis is a discrete multivariate technique used in accuracy assessment to statistically determine if one confusion matrix differs significantly from another (Bishop et al., 1975). A commonly used measure is the kappa coefficient of agreement (Congalton and Green, 2008). The kappa coefficient is a flexible index derived by Cohen (1960) for use when chance agreement between two datasets is a concern. It can be calculated as
\(\hat{K}=\frac{p_{o}-p_{c}}{1-p_{c}}\) (1)
where \(p_{o}\) is the observed proportion of agreement (i.e., the actual agreement) and \(p_{c}\) is the proportion of agreement that is expected to occur due to chance (i.e., the chance agreement). Landis and Koch (1977) proposed the following standards for strength of agreement for the kappa coefficient: ≤0 = poor, 0.01-0.20 = slight, 0.21-0.40 = fair, 0.41-0.60 = moderate, 0.61-0.80 = substantial, and 0.81-1.0 = almost perfect.
In this study, four accuracy measures (i.e., overall, user’s, and producer’s accuracies, and the kappa coefficient) were derived from the confusion matrix of each GLC map, which was established based on the reference dataset (shown in Fig. 2).
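As a minimal sketch of Eq. (1), the kappa coefficient can be computed from a square confusion matrix, with the chance agreement taken as the sum over classes of the products of the row and column marginal proportions (Cohen, 1960). The function names and the toy matrix are illustrative assumptions; the strength labels follow the thresholds quoted above.

```python
def kappa_coefficient(matrix):
    """Cohen's kappa from a square confusion matrix of counts or areas."""
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    # Observed agreement: share of the total on the diagonal.
    p_o = sum(matrix[i][i] for i in range(k)) / n
    # Chance agreement: sum of products of row and column marginals.
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    p_c = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    return (p_o - p_c) / (1 - p_c)


def strength_of_agreement(kappa):
    """Label a kappa value using the Landis and Koch (1977) bands."""
    if kappa <= 0:
        return "poor"
    bands = [(0.20, "slight"), (0.40, "fair"),
             (0.60, "moderate"), (0.80, "substantial")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"


# Toy two-class matrix: p_o = 0.85, p_c = 0.50, so kappa is about 0.70.
toy = [[40, 10],
       [5, 45]]
k_hat = kappa_coefficient(toy)
print(round(k_hat, 2), strength_of_agreement(k_hat))
```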
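The remaining three measures can be sketched in the same way. The layout assumed here (rows for the GLC map classes, columns for the reference classes) is a convention chosen for illustration, not necessarily the layout of the confusion matrices in this study; under it, user’s accuracy divides the diagonal by the row total and producer’s accuracy divides it by the column total. The function name and the toy values are hypothetical.

```python
def accuracy_measures(matrix, labels):
    """Overall, user's, and producer's accuracies from a confusion matrix.

    Assumed layout: matrix[i][j] is the amount mapped as class i
    whose reference class is j.
    """
    k = len(labels)
    total = sum(sum(row) for row in matrix)
    overall = sum(matrix[i][i] for i in range(k)) / total
    users, producers = {}, {}
    for i, name in enumerate(labels):
        row_total = sum(matrix[i])                       # mapped as class i
        col_total = sum(matrix[r][i] for r in range(k))  # reference class i
        # A class absent from the map or the reference has no defined accuracy.
        users[name] = matrix[i][i] / row_total if row_total else None
        producers[name] = matrix[i][i] / col_total if col_total else None
    return overall, users, producers


# Toy example with two classes.
labels = ["forest", "croplands"]
toy = [[60, 20],
       [10, 10]]
overall, users, producers = accuracy_measures(toy, labels)
print(overall, users, producers)
```

A class that covers a very small share of the reference map has a small column total, so a handful of misplaced cells is enough to pull its producer’s accuracy down sharply.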
3. Results
Because the classification scheme used in GlobeLand30 differs from those of the other three GLC maps, it is difficult to compare their accuracy directly. Thus, we determined the accuracy of the four GLC maps based on an aggregated classification scheme (with classes of urban, croplands, forest, grasslands, wetlands, barren, and water) (Fig. 3).
Fig. 3. Spatial patterns of LC classes for four GLC datasets based on an aggregated classification scheme with seven classes. (a) GLC2000, (b) GlobCover2009, (c) MCD12Q1, and (d) GlobeLand30.
We compared the areas covered by these seven classes among the four GLC datasets and the level-2 LC map (Fig. 4). The areas of urban land and croplands are very similar among MCD12Q1, GlobeLand30, and the level-2 LC map. However, the urban area in GLC2000 and GlobCover2009 is less than 1%. Whereas the area of croplands in GLC2000 is almost 50%, croplands make up less than 5% in GlobCover2009. Forest, which occupies the largest area among the 7 classes, ranges from almost 50% to 70%. Forest cover is the lowest in GLC2000, and that of GlobCover2009 is relatively similar to the level-2 LC map. For grasslands, the level-2 LC map indicates less than 1%, which all four GLC datasets overestimated. On the other hand, the area covered by wetlands was underestimated in the four GLC datasets. The barren area was largest in the GlobCover2009 dataset (almost 20%), whereas the other datasets had less than 1% barren land. Water areas were relatively similar in all LC maps.
Fig. 4. Area comparison of seven classes among the four GLC datasets and the level-2 LC map
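An area comparison of this kind amounts to tabulating, for each map, the share of cells falling in each class. The sketch below assumes equal-area cells, so that cell counts are proportional to area; the function name and the toy grid are hypothetical.

```python
from collections import Counter


def class_area_percentages(grid):
    """Percentage of the grid covered by each class, assuming equal-area cells."""
    counts = Counter(cell for row in grid for cell in row)
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


# Toy 2 x 5 grid: 6 forest cells, 3 cropland cells, 1 urban cell.
toy_grid = [
    ["forest", "forest", "forest", "croplands", "urban"],
    ["forest", "forest", "forest", "croplands", "croplands"],
]
print(class_area_percentages(toy_grid))
```

Running the same function on each GLC grid and on the reference grid gives per-class percentages that can be compared directly, regardless of the spatial resolution of each map.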
Table 3 shows the confusion matrices between the four GLC maps and the reference datasets based on the level-2 LC map. Four accuracy measures (i.e., overall, user’s, and producer’s accuracies, and the kappa coefficient) were derived from these confusion matrices.
Table 3. Confusion matrices for the four GLC maps (unit: km²)
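A confusion matrix expressed in area units can be built by cross-tabulating a GLC grid against a reference grid of the same resolution and adding the cell area for every pair of co-located cells. The sketch below assumes the two grids are already co-registered and labelled with the same class names; the function name, the row/column convention (rows for the map, columns for the reference), and the toy data are assumptions for illustration.

```python
def confusion_matrix(map_grid, ref_grid, labels, cell_area_km2=1.0):
    """Cross-tabulate two co-registered class grids into an area matrix.

    matrix[i][j] accumulates the area (km^2) mapped as labels[i]
    where the reference class is labels[j].
    """
    index = {name: i for i, name in enumerate(labels)}
    matrix = [[0.0] * len(labels) for _ in labels]
    for map_row, ref_row in zip(map_grid, ref_grid):
        for mapped, reference in zip(map_row, ref_row):
            matrix[index[mapped]][index[reference]] += cell_area_km2
    return matrix


# Toy 2 x 2 grids with 500 m cells, i.e. 0.25 km^2 per cell.
labels = ["forest", "croplands"]
glc = [["forest", "forest"],
       ["croplands", "forest"]]
ref = [["forest", "croplands"],
       ["croplands", "forest"]]
print(confusion_matrix(glc, ref, labels, cell_area_km2=0.25))
```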
Fig. 5 shows the producer’s and user’s accuracies for the seven classes and four GLC maps. The producer’s and user’s accuracies of urban area for GLC2000 and GlobCover2009 had contradictory results, which were related to the small urban area mapped in GLC2000 and GlobCover2009. In GlobCover2009, urban land use covered the same area as croplands. Forest, which occupied the largest proportion of our study area, had high producer’s and user’s accuracies. In contrast to forest, the producer’s and user’s accuracies for grasslands in the four GLC datasets were very low. These results stem from the small area of grasslands on the level-2 LC map. Except for GlobeLand30, producer’s and user’s accuracies for wetlands were very low, and this result was related to spatial resolution. The spatial resolutions of the three GLC datasets (GlobCover2009, MCD12Q1, and GLC2000) were 300, 500, and 1,000 m, respectively. The producer’s and user’s accuracies for barren area were very low except for the producer’s accuracy for GlobCover2009. This discrepancy stems from the large barren area in GlobCover2009. The producer’s and user’s accuracies for water are similar to those for wetlands. Due to the spatial resolutions of GLC2000 and MCD12Q1, these two GLC datasets can rarely distinguish regions where water and other land covers mix.
Fig. 5. The producer’s and user’s accuracies of each class for four GLC datasets in South Korea. (a) producer’s and (b) user’s accuracy.
Among the four GLC datasets, GlobeLand30 has the highest overall accuracy (77.59%) and kappa coefficient (0.61), and these results arise from its spatial resolution (Table 4). Compared to the other GLC datasets, the spatial resolution of GlobeLand30 is very fine (30 m). The accuracy measures of MCD12Q1 are slightly lower than those of GlobeLand30, at 75.51% and 0.55. This difference is derived from the classification scheme, as GLC2000 and GlobCover2009 use LCCS, whereas MCD12Q1 uses IGBP. GlobCover2009 has the lowest overall accuracy (57.99%) and kappa coefficient (0.26) because it misclassified most croplands as forest or barren.
Table 4. Overall accuracies and kappa coefficients of the four GLC maps
4. Conclusions
The national accuracy of GLC products is of great importance to ecosystem and environmental research. This paper presents the results of an accuracy assessment in South Korea of four commonly used GLC datasets: GLC2000, GlobCover2009, MCD12Q1, and GlobeLand30. First, we compared the area of seven classes between the four GLC datasets and a level-2 LC map. The urban and cropland areas were very similar among MCD12Q1, GlobeLand30, and the level-2 LC map. Forest, which occupies the largest area among the seven classes, covers from 50% to 70% of the study area. Barren land area was greatest in GlobCover2009, and was less than 1% in all other datasets. The area of water was similar on all LC maps. We then determined the accuracy of the four GLC datasets based on an aggregated classification scheme with seven classes (i.e., urban, croplands, forest, grasslands, wetlands, barren, and water). GlobeLand30 had the highest overall accuracy (77.59%), and the second-highest was MCD12Q1 (75.51%). The overall accuracies of GLC2000 and GlobCover2009 were 68.38% and 57.99%, respectively. GlobeLand30 was released at the end of September 2014 and is the most recent GLC dataset. Although the overall accuracy (77.59%) of GlobeLand30 2010 in South Korea is slightly lower than its global overall accuracy (80.30%), this dataset can be applied to support a variety of national and international scientific endeavors in South Korea.
Acknowledgment
This work was supported by a Research Grant of Pukyong National University (year 2017).
References
- Arino, O., P. Bicheron, F. Achard, F. Latham, R. Witt, and J.L. Weber, 2008. GLOBCOVER-the most detailed portrait of Earth, European Space Agency Bulletin, 136: 25-31.
- Bartholomé, E. and A. Belward, 2005. GLC2000: a new approach to global land cover mapping from Earth observation data, International Journal of Remote Sensing, 26(9): 1959-1977. https://doi.org/10.1080/01431160412331291297
- Bishop, Y., S. Fienberg, and P. Holland, 1975. Discrete Multivariate Analysis: Theory and Practice, MIT Press, Cambridge, MA, USA.
- Bontemps, S., P. Defourny, E.V. Bogaert, O. Arino, V. Kalogirou, and J.R. Perez, 2011. GLOBCOVER 2009 - Products description and validation report, European Space Agency.
- Chen, J., J. Chen, A. Liao, X. Cao, L. Chen, X. Chen, C. He, G. Han, S. Peng, M. Lu, W. Zhang, X. Tong, and J. Mills, 2015. Global land cover mapping at 30 m resolution: A POK-based operational approach, ISPRS Journal of Photogrammetry and Remote Sensing, 103: 7-27. https://doi.org/10.1016/j.isprsjprs.2014.09.002
- Chen, J., J. Chen, P. Gong, A. Liao, and C. He, 2011. Higher resolution GLC mapping, Geomatics World, 2: 12-14.
- Congalton, R. and K. Green, 2008. Assessing the Accuracy of Remotely Sensed Data: Principles and Practices (2nd Edition), CRC Press, FL, USA.
- Congalton, R., J. Gu, K. Yadav, P. Thenkabail, and M. Ozdogan, 2014. Global land cover mapping: a review and uncertainty analysis, Remote Sensing, 6(12): 12070-12093. https://doi.org/10.3390/rs61212070
- Feddema, J.J., K.W. Oleson, G.B. Bonan, L.O. Mearns, L.E. Buja, G.A. Meehl, and W.M. Washington, 2005. The importance of land-cover change in simulating future climates, Science, 310(5754): 1674-1678. https://doi.org/10.1126/science.1118160
- Friedl, M.A., D. Sulla-Menashe, B. Tan, A. Schneider, N. Ramankutty, A. Sibley, and X. Huang, 2010. MODIS collection 5 global land cover: algorithm refinements and characterization of new datasets, Remote Sensing of Environment, 114(1): 168-182. https://doi.org/10.1016/j.rse.2009.08.016
- Friedl, M.A., D.K. McIver, J.C. Hodges, X. Zhang, D. Muchoney, A.H. Strahler, C.E. Woodcock, S. Gopal, A. Schneider, and A. Cooper, 2002. Global land cover mapping from MODIS: algorithms and early results, Remote Sensing of Environment, 83(1-2): 287-302. https://doi.org/10.1016/S0034-4257(02)00078-0
- Hansen, M.C., R.S. Defries, J.R.G. Townshend, and R. Sohlberg, 2000. Global land cover classification at 1 km spatial resolution using a classification tree approach, International Journal of Remote Sensing, 21(6-7): 1331-1364. https://doi.org/10.1080/014311600210209
- Landis, J.R. and G.G. Koch, 1977. The measurement of observer agreement for categorical data, Biometrics, 33: 159-174. https://doi.org/10.2307/2529310
- Loveland, T.R. and A.S. Belward, 1997. The IGBP-DIS global 1 km land cover data set, DIS-Cover: First results, International Journal of Remote Sensing, 18: 3289-3295. https://doi.org/10.1080/014311697217099
- Loveland, T.R., B.C. Reed, J.F. Brown, D.O. Ohlen, Z. Zhu, L. Yang, and J.W. Merchant, 2000. Development of a global land cover characteristics database and IGBP DISCover from 1 km AVHRR data, International Journal of Remote Sensing, 21(6-7): 1303-1330. https://doi.org/10.1080/014311600210191
- Mayaux, P., H. Eva, J. Gallego, A. Strahler, M. Herold, S. Agrawal, S. Naumov, E. Eduardo, C.M. Di Bella, C. Ordoyne, Y. Kopin, and P.S. Roy, 2006. Validation of the Global Land Cover 2000 Map, IEEE Transactions on Geoscience and Remote Sensing, 44(7): 1728-1739. https://doi.org/10.1109/TGRS.2006.864370
- Park, J. and M. Suh, 2015. Improvement of MODIS land cover classification over the Asia-Oceania region, Korean Journal of Remote Sensing, 31(2): 51-64 (in Korean with English abstract). https://doi.org/10.7780/kjrs.2015.31.2.1
- Running, S.W., 2008. Ecosystem disturbance, carbon, and climate, Science, 321(5889): 652-653. https://doi.org/10.1126/science.1159607
- Russell, G.C., J. Gu, K. Yadav, P. Thenkabail, and M. Ozdogan, 2014. Global land cover mapping: A review and uncertainty analysis, Remote Sensing, 6(12): 12070-12093. https://doi.org/10.3390/rs61212070
- Yang, Y., P. Xiao, X. Feng, and H. Li, 2017. Accuracy assessment of seven global land cover datasets over China, ISPRS Journal of Photogrammetry and Remote Sensing, 125: 156-173. https://doi.org/10.1016/j.isprsjprs.2017.01.016
- Zhang, Z.X., X. Wang, X.L. Zhao, B. Liu, L. Yi, L.J. Zuo, Q.K. Wen, F. Liu, J.Y. Xu, and S.G. Hu, 2014. A 2010 update of national land use/cover database of China at 1:100000 scale using medium spatial resolution satellite images, Remote Sensing of Environment, 149: 142-154. https://doi.org/10.1016/j.rse.2014.04.004