Development of an R-based Spatial Downscaling Tool to Predict Fine Scale Information from Coarse Scale Satellite Products

Kwak, Geun-Ho;Park, No-Wook;Kyriakidis, Phaedon C.;

doi:10.7780/kjrs.2018.34.1.6

대한원격탐사학회지 (Korean Journal of Remote Sensing)

제34권1호
/
Pages.89-99
/
2018
/
1225-6161(pISSN)
/
2287-9307(eISSN)

대한원격탐사학회 (Korean Society of Remote Sensing)

DOI QR Code

Development of an R-based Spatial Downscaling Tool to Predict Fine Scale Information from Coarse Scale Satellite Products

Kwak, Geun-Ho (Department of Geoinformatic Engineering, Inha University) ;
Park, No-Wook (Department of Geoinformatic Engineering, Inha University) ;
Kyriakidis, Phaedon C. (Department of Civil Engineering and Geomatics, Cyprus University of Technology)

투고 : 2018.02.09
심사 : 2018.02.19
발행 : 2018.02.28

https://doi.org/10.7780/kjrs.2018.34.1.6 인용 PDF KSCI HTML

PDF 다운로드

⟨ 이전 논문 다음 논문 ⟩

초록

Spatial downscaling is often applied to coarse scale satellite products with high temporal resolution for environmental monitoring at a finer scale. An area-to-point regression kriging (ATPRK) algorithm is regarded as effective in that it combines regression modeling and residual correction with area-to-point kriging. However, an open source tool or package for ATPRK has not yet been developed. This paper describes the development and code organization of an R-based spatial downscaling tool, named R4ATPRK, for the implementation of ATPRK. R4ATPRK was developed using the R language and several R packages. A look-up table search and batch processing for computation of ATP kriging weights are employed to improve computational efficiency. An experiment on spatial downscaling of coarse scale land surface temperature products demonstrated that this tool could generate downscaling results in which overall variations in input coarse scale data were preserved and local details were also well captured. If computational efficiency can be further improved, and the tool is extended to include certain advanced procedures, R4ATPRK would be an effective tool for spatial downscaling of coarse scale satellite products.

키워드

1. Introduction

Many environmental parameters such as precipitation, temperature,soil moisture, and vegetation index have been provided from Earth observation satellite imagery and products. The analysis and modeling based on satellite products depend heavily on spatial and temporal resolutions of the satellite products. Using satellite products with appropriate spatial resolutions is crucial for both global/regional and local analyses. The revisiting time or temporal resolution is also important in the proper selection of satellite products for periodic monitoring. Periodic environmental monitoring and time-series analysis are possible using geostationary satellite products owing to their high temporal resolutions. However, the spatial resolution of geostationary satellite products is too coarse to perform analysis at a local scale (Park, 2013; Park et al., 2016). The increase in spatial resolution, called spatial downscaling (Atkinson, 2013), is often required to fully utilize thematic information from coarse scale satellite products with high temporal resolution (Park, 2013; Kim and Park, 2017).

To generate thematic information at a finer scale from coarse scale satellite products, various downscaling algorithms have been proposed and applied to spatial downscaling of various environmental parameters such as precipitation (Immerzeel et al., 2009; Jia et al., 2011; Park, 2013; Park et al., 2016), land surface temperature (Hutengs and Vohland, 2016; Yooet al., 217), and evapotranspiration (Ke et al., 2016). Particularly, among spatial downscaling algorithms, Kyriakidis (2004) proposed area-to-point (ATP) kriging, a modification of conventional univariate kriging, which can properly account for variations within coarse scale blocks. Park (2013) extended ATP kriging to a multivariate framework by combining regression-based trend estimates with ATP kriging-based residual correction. This algorithm was also applied to pan-sharpening, and later named area-to-point regression kriging (ATPRK) by Wang et al. (2015). ATPRK is known as an effective algorithm owing to its capacity not only for integration of auxiliary data available at a fine scale during spatial downscaling, but also for proper prediction of fine scale residuals from coarse scale residuals(Park, 2013; Park et al., 2016). If linear regression and ATP kriging are combined for spatial downscaling, the average of downscaled values within each coarse scale block is the same as the attribute values of the input coarse scale block, which is called a coherence or consistency property (Kyriakidis, 2004; Goovaerts, 2008; Park, 2013).

Because ATPRK combines regression modeling and residual correction, both procedures should be considered in implementing the ATPRK algorithm. Many commercial and open source statistical packages can be applied to regression modeling. The open source statistical environment R(RCoreTeam, 2017) provides several geostatistical packages such a gstat(Pebesma, 004) and geo R (Ribeiro Jr. and Diggle, 2016), which are useful in the implementation of geostatistical algorithms. However, only conventional point kriging algorithms are provided in most R packages. To the best of our knowledge, an open source package or tool has not yet been developed for the implementation of ATPRK.

The main objective of this study is to develop an R-based tool to implement ATPRK for spatial downscaling of coarse scale satellite products. The tool named R4ATPRK (R for area-to-point regression kriging) is developed to efficiently downscale coarse scale satellite products using R and related packages. The applicability and implementation issues of the R4ATPRK tool, as investigated via an experiment on spatial downscaling of coarse scale land surface temperature (LST) products, are described in this paper.

2. Spatial DownscalingAlgorithm:ATPRK

The ATPRK algorithm adopted as the primary spatial downscaling algorithm in this study is theoretically based on component decomposition. More specifically, ATPRK decomposes the attribute of interest into deterministic trend and stochastic residual components (Park, 2013). The deterministic trend components are estimated using regression modeling based on the statistical relationships between input coarse scale data and auxiliary variables available at a finer scale. The stochastic residual components that cannot be explained by the auxiliary variables are estimated using ATP riging. The final downcaling result is obtained by adding the two components.

The overall procedure for ATPRK is shown in Fig. 1. The salient features of ATPRK synthesized from Park (2013) and Park et al. (2016) are given in this paper. The input target data, which are regarded as dependent variables for regression modeling, are only available at a coarse scale. Therefore, regression modeling is conducted at a coarse scale after the auxiliary variables at a fine scale have been upscaled to the coarse scale. Linear/non-linear models and machine learning algorithms such as random forests and support vector machines can be employed for regression modeling. Assuming that the statistical relationships quantified at a coarse scale are preserved at a fine scale (Park, 2013), the regression models estimated at a coarse scale are applied directly to the auxiliary variables at a fine scale, and then, trend components at a fine scale are obtained. It should be noted that above assumption is valid only for linear regression models.

OGCSBN_2018_v34n1_89_f0001.png 이미지

Fig. 1. Workflow of the area-to-point regression kriging algorithm adopted to develop the R4ATPRK tool.

After the trend components at a fine scale have been estimated, the next step is to estimate residual components at a fine scale using ATP kriging. Suppose there are N coarse scale residuals in the study area {R(vn), n=1, …, N}, where vn denotes the nth coarse scale block. ATP kriging predicts residual componets (R(un)) at a fine scale loation (un) within the nth coarse scale block by a weighted linear combination of neighboring residuals at a coarse scale (Kyriakidis, 2004). ATP simple kriging can be formulated as:

$R\left(\mathbf{u}_{n}\right)=\sum_{\alpha=1}^{L\left(\mathbf{u}_{n}\right)} \lambda_{\alpha}\left(\mathbf{u}_{n}\right) R\left(v_{\alpha}\right)$ (1)

where λα(un) is a simple kriging weight assigned to the neighboring coarse scale residual component at an estimation location. L(un) denotes the number of neighboring residual blocks closest to the estimation location.

The vector of ATP kriging weight (λ) is computed by solving the following block simple kriging system in matrix notation:

Kλ = k (2)

where $K=\bar C (v_\alpha, v_\beta)$ is the block-to-block covariance between the αth and βth blocks, and $K=\bar C (v_\alpha, u_\beta)$ is the block-to-point covariance between the αth block and the estimation location (u_n).

The block-to-block covariance (K) is approximated by averaging the covariance values between any two points discretizing two blocks(Goovaerts, 2008; Hansen and Mosegaard, 2008). Similarly, the block-to-point covariance(k)is computed by averaging the covariance values between the point location and a set of points discretizing the block. A point-support covariance (equivalently, ariogram) required for the above covariance computation is generally estimated using variogram deconvolution (Journel and Huijbregts, 1978; Goovaerts, 2008).

Once both trend and residual components have been estimated at a fine scale, the final downscaling results are obtained by adding residual components to trend components. Assuming that coarse scale data are the linear average of fine scale point values, the average of the trend estimates at a fine scale is the same as the trend component values at a coarse scale if linear regression is applied. Additionally, residual estimates by ATP kriging also satisfy the coherence property. Thus, the final downscaling result, which isthe sum of the trend and residual components, can satisfy the coherence property and reproduce the original coarse scale block values when upscaled.

3. Description of R4ATPRK Tool

The R4ATPRK tool was developed using the R language (version 3.3.2). SeveralRpackages were also used for data import, regression modeling, raster data processing, and basic geostatistical analysis (Table 1). This tool was tested on both Windows and Linux operating systems.

Table 1. List of R packages used for R4ATPRK

OGCSBN_2018_v34n1_89_t0001.png 이미지

1) Input preparation

R4ATPRK requires two types of input: the first input type is atarget variable to be downscaled, and the second input type is auxiliary variable data that are associated with the target variable and should be available at a finer scale. R4ATRK uses the raster data-type for both coarse scale data and fine scale auxiliary data, because the processing time of raster data is much less than that of vector type point data. All input data should be geocoded to the same coordinate within the study area of interest. The spatial resolution offline scale auxiliary data is the target resolution for spatial downscaling. The file format of all input data is an ESRI ASCII format with headers (Fig. 2). The basic structure of the ESRI ASCII raster has the header information at the beginning of the file followed by data values. The first six lines provide general information about the input data (e.g., numbers of columns and rows, pixel size, no data value, and the coordinates of the upper lower left corner). The lines following the header information are considered to be data values. A parameter file, which includes the data path and name, and the number of fine scale auxiliary datasets, is used to manage the input data conveniently. This tool is executed through a function named kriging, which uses the input parameter file and the number of neighboring pixels as parameters.

OGCSBN_2018_v34n1_89_f0002.png 이미지

Fig. 2. Input data file format

2) Trend component estimation

Prior to regression modeling at a coarse scale, the fine scale auxiliary varibles are first upscaled to the spatial resolution of coarse scale data via linear averaging. Because a lot of regression models can be available in R packages, different regression models can e easily applied to estimate trend component. If residuals are directly computed from trend components estimated by non-linear regression models, the coherence property cannot be satisfied. Therefore, this tool employed multiple linear regression modeling as a basic regression model using the stats library in R.

# Multiple linear regression

MLR.model <- lm(Coarse ~ ., data = Block.data)

3) Residual component estimation

Prediction of fine scale residual components from coarse scale residuals is implemented using ATP kriging. ATP kriging consists of three processing steps: (1) neighboring block search, where block denotes a coarse scale pixel, (2) construction of covariance matrix, and (3) calculation of kriging weights and prediction of residual components.

The KnnLookup function in the SearchTrees library is used to search neighboring blocks rapidly. This function provides the index information of neighboring blocks according to the pre-specified number of neighboring blocks.

# Neighboring block search

Tree <- createTree(pcoarsegrid@coords) Newdat <- matrix(pcoarsegrid@coords[i,], ncol =2)

Block.loc <- as.numeric(knnLookup(Tree, newdat = Newdat, k = nmax))

Two types of covariance are equired to compute ATP kriging weights: block-to-block covariance and block-to-point covariance. Due to the time limit, the variogram deconvolution for estimation of a point-support variogram model is not yet included in the current version of R4ATPRK. Instead, our in-house Fortran code developed for the implementation of variogram deconvolution is first used, and then, the parameters of the estimated point-support variogram model(e.g., nugget effects, sill, range, and model type) are imported into the tool.

Because covariance is a function of distance, the spDists function is used to calculate the distance between any two points. The variogramLine function is also used to retrieve the variogram value corresponding to the distance. This step is one of the most CPU-intensive procedures. Because regular raster data are used as input, it is not necessary to calculate the distance between any two points multiple times. To avoid many re-calculation of the same distance and covariance values, distance calculation and retrieval of covariance can be performed only once, and a covariance look-up table that contains covariance values according to the separation distance is prepared

(Hansen and Mosegaard, 2008; Liu and Journel, 2009).

# Calculation of covariance values and preparation of a look-up table

D1 <- matrix(as.numeric(spDists(matrix (Point.Dist, nrow = 1), matrix(Block.coord, ncol = 2))), ncol = 1)

vmm <- .Internal(mean(variogramLie (MLR.result$vm$var_model, dist_vector = D1, covariance = FALSE)))

lookup$dist[Pdist.no] <- BTP.Dist[m];

lookup$vm[Pdist.no] <- vmm

In addition, all variables in a data frame are converted to a numeric matrix, ad two-dimensional covariance matrices are also converted to a one-dimensional array for further fast computation of covariance. Because prediction points within each block have the same neighboring blocks, they can share the same block-to-block covariance matrix. Therefore, it is not necessary to calculate the same inverse of the block-to-block covariance matrix at every prediction point. Instead, the inverse of the block-to-block covariance matrix is calculated once for prediction points within the block, and used for multiplication to the different block-to-point covariance matrix. Through this procedure, ATP kriging weights are computed, and the residual components at a fine scale are predicted by a weighted linear combination of neighboring coarse scale data.

# Calculation of ATP kriging weights and estimation of residuals

Wts <- solve(BTB.matrix) %*% BTP.matrix

value <- as.numeric(MLR.result$res[Block.loc]%*% Wts)

pred[PtoB.idx] <- value

4) Output

The downscaling result is obtained by summing trend and residual components. In this tool, the downscaling result is saved as a geotiff or binary file. A pdf format file can also be generated using the plot function.

# Savig the downscaling result as geotiff and binary files

writeRaster(fine.grid$outATP, Output_name, format = ”GTiff”, overwrite = TRUE)

wFile <- file(Output_name, “wb”)

writeBin(getValues(fine.grid$outATP), wFile)

close(wFile)

4. Experiment

1) Study area and data

An experiment on spatial downscaling of coarse scale LST products over South Korea was conducted to test the applicability of R4ATPRK. The LST was selected as a target variable for spatial downscaling due to its high spatial continuity and the availability of environmental variables at a fine scale. It should be noted that the purpose of this downscaling experiment is to demonstrate the applicability of R4ATPRK, not to analyze the details of local LST features in the study area. The LST products, which were generated from Himawari-8 AHI data and provided by Kongju National University, were used for the downscaling experiment. Two LST products acquired at 03:00 KST on August 10, 2016 (hereafter referred to as D1) and at 15:00 KST on August 11, 2016 (referred to as D2) were used as input coarse scale data (Fig. 3). The LST products were geocoded to Transverse Mercator coordinates with a spatial resolution of 2 km. Eleven environmental variables that are associated with variations of LST were used as fine scale auxiliary datasets (Table 2). The spatial resolution of auxiliary variables with different spatial resolutions was set to 500 m hrough resampling. Thus, the downscaling result was generated at a spatial resolution of 500 m.

OGCSBN_2018_v34n1_89_f0003.png 이미지

Fig. 3. Himawari-8 AHI LST products used for the spatial downscaling experiment: (a) D1 (03:00 KST on August 10, 2016) and (b) D2 (15:00 KST on August 11, 2016). White on the Korea peninsula denotes cloud pixels. The polygons are administrative boundaries.

Table 2. Fine scale auxiliary variables used for LST downscaling

OGCSBN_2018_v34n1_89_t0002.png 이미지

2) Downscaling results

Downscaling results generated by R4ATPRK are shown in Fig. 4. Cloud pixels were masked out in the downscaling results.It can be seen that overall patterns of downscaling results are very similar to the input coarse scale LST shown in Fig. 3, and local details originated from fine scale auxiliary variables are well depicted. As shown in Fig. 5, the local variations of LST by topography and land-cover types are well reflected in the downscaling results. In addition, the downscaling results satisfied the coherence property, as shown in Fig. 6. These results confirm that downscaling results by R4ATPRK can delineate the local details of LST while preserving the overall characteristics of input coarse scale LST.

OGCSBN_2018_v34n1_89_f0004.png 이미지

Fig. 4. Downscaling results generated by R4ATPRK: (a) D1 and (b) D2.

OGCSBN_2018_v34n1_89_f0005.png 이미지

Fig. 5. Zoomed view of the Himawari-8 AHI LST product (D1) and the downscaling results in Jeju (upper row) and Daejeon (lower row). The 2 percent clip stretch is aplied to highlight the temperature variations.

OGCSBN_2018_v34n1_89_f0006.png 이미지

Fig. 6. Scatterplots of coarse scale LST values versus upscaled LST values within each coarse scale block: (a) D1 and (b) D2.

To check the computational efficiency, CPU times elapsed in the implementation of R4ATPRK were measured and compared. Elapsed CPU times were measured on a single Intel® Core™ i5-6600 at 3.30 GHz.For comparison purposes, the code that repetitively calculates the same inverse of the block-to-block covariance matrix and does matrix multiplication at all prediction points was prepared and implemented. The coarse scale LST at a 2 km spatial resolution is composed of 177 by 305 pixels. The number of overland pixels is 27,675 for D1 and 18,188 for D2, respectively. The fine scale explanatory variables at a 500m spatial resolution are composed of 708 by 1,200 pixels and the number of over-land pixels is 404,115. The elapsed CPU time for R4ATPRK that does not repeat calculation of the block-to-block covariance matrix and matrix multiplication for prediction points within the same block was approximately 466 and 344 seconds for D1 and D2, respectively (Table 3). Meanwhile, when matrix inversion and multiplication for solving the ATP kriging weighting system were repeated at all prediction points, the CPU cost was much higher, compared to the cost using R4ATPRK. Although further improvement is still required in terms of speed, the current version of R4ATPRK showed cmputationally efficient performance with lower CPU cost.

Table 3. Comparison of elapsed CPU times for D1 and D2 data (unit: seconds)

OGCSBN_2018_v34n1_89_t0003.png 이미지

5. Conclusion

In this study, we developed an R-based tool R4ATPRK that implements the ATPRK algorithm for spatial downscaling of coarse scale satellite products. R4ATPRK is based on a theoretically sound downscaling algorithm, and can generate downscaling results with high performance in capturing local details, as well as major patterns of input coarse scale data. Although the applicability of R4ATPRK was tested on regular raster data such as satellite LST products, the tool is flexible in that it can be applied to spatial data with irregular shapes because any objects with irregular shapes are discretized by internal points.

Many aspects of R4ATPRK should be further considered to improve its efficiency. When input coarse scale data consist of a large number of pixels and much dense discretization of each coarse scale pixel is required, calculation of ATP kriging weights might be CPU intensive. To reduce heavy computational loads, a fast Fourier transform method (Liu and Journel, 2009; Guan et al., 2011) could be employed for covariance calculation. Inclusion of the Fortran code prepared for the point-support variogram estimation into R4ATPRK will also be carried out in future.

Acknowledgment

This work was supported by ‘Development of Geostationary Meteorological Satellte Ground Segment’ program funded by National Meteorological Satellite Center (NMSC) of Korea Meteorological Administration (KMA), and also by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning (NRF-2015R1A1A1 A05000966). Th authors thank Prof. Myoung-Seok Suh from Kongju National University for providing the Himawari LST products. Constructive comments and suggestions by two anonymous reviewers are also greatly appreciated.

참고문헌

Atkinson, P.M., 2013. Downscaling in remote sensing, International Journal of Applied Earth Observation and Geoinformation, 22: 106-114. https://doi.org/10.1016/j.jag.2012.04.012
Becker, G., 2012. SearchTrees: Spatial Search Trees, R package version 0.5.2, https://CRAN.R-project.org/package=SearchTrees, Accessed on Jan. 22, 2018.
Bivand, R. and C. Rundel, 2017. rgeos: Interface to Geometry Engine - Open Source (GEOS), R package version 0.3-23, https://CRAN.Rproject.org/package=rgeos, Accessed on Jan. 22, 2018.
Bivand, R, T. Keitt, and B. Rowlingson, 2017. rgdal: Bindings for the Geospatial Data Abstraction Library, R package version 1.2-8, https://CRAN.R-project.org/package=rgdal, Accessed on Jan. 22, 2018.
Goovaerts, P., 2008. Kriging and semivariogram deconvolution in the presence of irregular geographical units, Mathematical Geosciences, 40(1): 101-128. https://doi.org/10.1007/s11004-007-9129-1
Guan, Q., P.C. Kyriakidis, and M.F. Goodchild, 2011. A parallel computing approach to fast geostatistical areal interpolation, International Journal of Geographical Information Science, 25(8): 1241-1267. https://doi.org/10.1080/13658816.2011.563744
Hansen, T.M. and K. Mosegaard, 2008. VISIM: Sequential simulation for linear inverse problems, Computers & Geosciences, 34(1): 53-76. https://doi.org/10.1016/j.cageo.2007.02.003
Hijmans, R.J., 2016. raster: Geographic Data Analysis and Modeling, R package version 2.5-8, https://CRAN.R-project.org/package=raster, Accessed on Jan. 22, 2018.
Hutengs, C. and M. Vohland, 2016. Downscaling land surface temperatures at regional scales with random forest regression, Remote Sensing of Environment, 178: 127-141. https://doi.org/10.1016/j.rse.2016.03.006
Immerzeel, W.W., M.M. Rutten, and P. Droogers, 2009. Spatial downscaling of TRMM precipitation using vegetative response on the Iberian Peninsula, Remote Sensing of Environment, 113(2): 362-370. https://doi.org/10.1016/j.rse.2008.10.004
Jia, S., W. Zhu, A. Lu, and T. Yan, 2011. A statistical spatial downscaling algorithm of TRMM precipitation based on NDVI and DEM in the Qaidam Basin of China, Remote Sensing of Environment, 115(12): 3069-3079. https://doi.org/10.1016/j.rse.2011.06.009
Journel, A.G. and C.J. Huijbregts, 1978. Mining Geostatistics, Academic Press, New York, NY, USA.
Ke, Y., J. Im, S. Park, and H. Gong, 2016. Downscaling of MODIS one kilometer evapotranspiration using Landsat-8 data and machine learning approaches, Remote Sensing, 8(3):215, doi:10.3390/rs8030215.
Kim, Y. and N.-W. Park, 2017. Impact of trend estimates on predictive performance in model evaluation for spatial downscaling of satellite based precipitation data, Korean Journal of Remote Sensing, 33(1): 25-35. https://doi.org/10.7780/kjrs.2017.33.1.3
Kyriakidis, P.C., 2004. A geostatistical framework for area-to-point spatial interpolation, Geographical Analysis, 36(3): 259-289. https://doi.org/10.1111/j.1538-4632.2004.tb01135.x
Liu, Y. and A.G. Journel, 2009. A package for geostatistical integration of coarse and fine scale data, Computers & Geosciences, 35(3): 527-547. https://doi.org/10.1016/j.cageo.2007.12.015
Park, N.-W., 2013. Spatial downscaling of TRMM precipitation using geostatistics and fine scale environmental variables, Advances in Meteorology, 2013: 237126, doi:10.1155/2013/237126.
Park, N.-W., S. Hong, P.C. Kyriakidis, W. Lee, and S.-J. Lyu, 2016. Geostatistical downscaling of AMSR2 precipitation with COMS infrared observations, International Journal of Remote Sensing, 37(16): 3858-3869. https://doi.org/10.1080/01431161.2016.1204031
Pebesma, E.J., 2004. Multivariable geostatistics in S: the gstat package, Computers & Geosciences, 30(7): 683-691. https://doi.org/10.1016/j.cageo.2004.03.012
R Core Team, 2017. R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria, https://www.R-project.org, Accessed on Jan. 22, 2018.
Ribeiro Jr., P.J. and P.J. Diggle, 2016. geoR: Analysis of Geostatistical Data, R package version 1.7-5.2, https://CRAN.R-project.org/package=geoR, Accessed on Jan. 22, 2018.
Wang, Q., W. Shi, P.M. Atkinson, and Y. Zhao, 2015. Downscaling MODIS images with area-topoint regression kriging, Remote Sensing of Environment, 166: 191-204. https://doi.org/10.1016/j.rse.2015.06.003
Yoo, C., J. Im, S. Park, and D. Cho, 2017. Thermal characteristics of Daegu using land cover data and satellite-derived surface temperature downscaled based on machine learning, Korean Journal of Remote Sensing, 33(6-2): 1101-1118 (in Korean with English abstract). https://doi.org/10.7780/kjrs.2017.33.6.2.6