High-Resolution Satellite Image Super-Resolution Using Image Degradation Model with MTF-Based Filters

  • Minkyung Chung (Department of Civil and Environmental Engineering, Seoul National University) ;
  • Minyoung Jung (Lyles School of Civil Engineering, Purdue University) ;
  • Yongil Kim (Department of Civil and Environmental Engineering, Seoul National University)
  • Received : 2023.05.31
  • Accepted : 2023.07.06
  • Published : 2023.08.31

Abstract

Super-resolution (SR) has great significance in image processing because it enables downstream vision tasks with high spatial resolution. Recently, SR studies have adopted deep learning networks and achieved remarkable SR performance compared to conventional example-based methods. Deep-learning-based SR models generally require low-resolution (LR) images and the corresponding high-resolution (HR) images as a training dataset. Due to the difficulties in obtaining real-world LR-HR datasets, most SR models have used only HR images and generated LR images with predefined degradation such as bicubic downsampling. However, SR models trained on simple image degradation do not reflect the properties of the images and often result in deteriorated SR quality when applied to real-world images. In this study, we propose an image degradation model for HR satellite images based on the modulation transfer function (MTF) of an imaging sensor. Because the proposed method determines the image degradation based on the sensor properties, it is more suitable for training SR models on remote sensing images. Experimental results on HR satellite image datasets demonstrated the effectiveness of applying MTF-based filters to construct a more realistic LR-HR training dataset.

Keywords

1. Introduction

Super-resolution (SR) is a technique for reconstructing high-resolution (HR) images from their low-resolution (LR) counterparts (Freeman et al., 2000) and has been extensively investigated in various application fields, including mobile devices, medical imaging, face recognition, and remote sensing (Lepcha et al., 2022). SR is widely recognized as an ill-posed problem because a single LR image can correspond to multiple HR images. Recent studies have attempted to solve this problem by adopting deep learning networks and have achieved superior performance compared to conventional example-based methods. Since the seminal work by Dong et al. (2016) introduced the use of convolutional neural networks (CNNs) for SR, many CNN-based SR models have been proposed utilizing various learning strategies such as residual learning (Kim et al., 2016; Lim et al., 2017), recursive learning (Tai et al., 2017), and adversarial learning (Ledig et al., 2017; Wang et al., 2018).

Unlike SRCNN (Dong et al., 2016), which performed the SR process on LR images upsampled with bicubic interpolation, most recent CNN-based SR models have adopted a post-upsampling structure that increases the size of the input image at the end of the network for computational efficiency. Fig. 1 illustrates the general structure of CNN-based SR models using post-upsampling. Typically, these models consist of convolutional layers that extract features from the input LR images, followed by upsampling layers that increase the size of the LR images. Following the success of SRCNN (Dong et al., 2016), VDSR (Kim et al., 2016) and EDSR (Lim et al., 2017) employed residual learning to leverage the learning capability of deeper networks. Furthermore, Zhang et al. (2018c) proposed RDN, which utilizes residual dense blocks to extract local features from densely connected convolutional layers. In addition, RCAN (Zhang et al., 2018b) and HAN (Niu et al., 2020) incorporated a channel attention mechanism into a residual structure to efficiently learn more useful channel-wise features. On the other hand, D-DBPN (Haris et al., 2018) utilized an iterative up- and downsampling approach to enrich image features and provide multiple HR features from different depths.

Fig. 1. The flowchart illustrates the general structure of CNN-based SR models using post-upsampling. Typically, the network consists of convolutional layers that extract features from the input LR images and upsampling layers that increase the size of the LR images to match the size of the HR images.
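
The post-upsampling structure can be summarized in a few lines of code. The following is a minimal PyTorch sketch of the structure in Fig. 1, with a plain convolutional feature extractor on the LR input followed by PixelShuffle upsampling layers; the layer widths and depths are illustrative assumptions, not those of any specific model discussed here.

```python
import torch
import torch.nn as nn

class PostUpsamplingSR(nn.Module):
    """Generic post-upsampling SR network: LR features -> x4 upsampling at the end."""
    def __init__(self, in_ch=3, feat=64, n_blocks=8, scale=4):
        super().__init__()
        assert scale == 4, "sketch assumes the x4 scale factor used in this study"
        body = [nn.Conv2d(in_ch, feat, 3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(n_blocks):
            body += [nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU(inplace=True)]
        self.body = nn.Sequential(*body)          # feature extraction on the LR grid
        self.upsample = nn.Sequential(            # upsampling layers at the end (two x2 steps)
            nn.Conv2d(feat, feat * 4, 3, padding=1), nn.PixelShuffle(2),
            nn.Conv2d(feat, feat * 4, 3, padding=1), nn.PixelShuffle(2),
            nn.Conv2d(feat, in_ch, 3, padding=1),
        )

    def forward(self, lr):
        return self.upsample(self.body(lr))
```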

Deep-learning-based SR models generally require training datasets consisting of LR images and the corresponding HR images. However, due to the difficulties in obtaining real-world LR-HR datasets, most SR studies have used only HR images and generated LR images by applying degradation to the HR images. The degradation process from an HR image (\(I_{HR}\)) to an LR image (\(I_{LR}\)) can be expressed as:

\(\begin{aligned}I_{L R}=\left(I_{H R} * k\right) \downarrow_{s}+n\end{aligned}\)       (1)

where \(k\) and \(n\) indicate the blur kernel and noise, respectively, and \(\downarrow_{s}\) denotes downsampling with scale factor \(s\). However, many SR models have used bicubic downsampling to generate LR images from HR images (Kim et al., 2016; Ledig et al., 2017; Tai et al., 2017; Haris et al., 2018; Wang et al., 2018). Such simple degradation is not appropriate for real-world applications because the other key factors of image degradation (blur and noise) are not considered during the training of SR models. Some researchers have attempted to alleviate the gap between bicubic downsampling and real-world degradation by including a blur kernel and noise within the image degradation model (Zhang et al., 2018b; Zhang et al., 2018c; Dai et al., 2019; Niu et al., 2020). The most widely used degradation is a 7 × 7 Gaussian kernel with a standard deviation of 1.6 or 3, regardless of the image properties of the training datasets. Other recent studies have involved additional networks for kernel estimation to produce synthetic LR images that have similar degradation properties to real-world LR images (Bell-Kligler et al., 2019; Ji et al., 2020).
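
As an illustration of Eq. (1), the following is a minimal NumPy/SciPy sketch of the degradation pipeline, assuming an HR image stored as a float array in [0, 1] with shape (H, W, C); the cubic-spline resize stands in for bicubic downsampling, and the default arguments are illustrative assumptions rather than values prescribed here.

```python
import numpy as np
from scipy.ndimage import convolve
from skimage.transform import resize

def degrade(hr, kernel=None, scale=4, noise_std=0.0, rng=None):
    """I_LR = (I_HR * k) downsampled by `scale`, plus additive Gaussian noise n."""
    rng = rng or np.random.default_rng(0)
    img = hr
    if kernel is not None:                        # blur each band with kernel k
        img = np.stack([convolve(img[..., b], kernel, mode="reflect")
                        for b in range(img.shape[-1])], axis=-1)
    h, w = img.shape[:2]
    lr = resize(img, (h // scale, w // scale),    # downsampling with scale factor s
                order=3, anti_aliasing=False)
    if noise_std > 0:                             # additive Gaussian noise n
        lr = lr + rng.normal(0.0, noise_std, lr.shape)
    return np.clip(lr, 0.0, 1.0)
```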

Based on the improvements in SR performance, deep-learning-based methods have been adopted for SR of remote sensing images (Wang et al., 2022). Following the common protocols, most deep-learning-based SR models have used predefined degradation, such as bicubic downsampling, to generate an LR-HR dataset (Lu et al., 2019). In addition, some recent studies used a pre-trained degrader (Zhang et al., 2020) or a downsample generator (Zhang et al., 2022) to generate a more realistic training dataset. However, because the imaging geometry of a sensor mounted on a satellite platform is far more constrained than that of the cameras used to capture the natural images common in computer vision, we assumed that the degradation model of satellite images could be simulated by considering the imaging sensor properties. In particular, HR satellite images are usually provided as a pair of panchromatic (PAN) and multispectral (MS) images. These paired satellite images thus provide a favorable opportunity to construct and validate realistic LR-HR image datasets.

Modulation transfer function (MTF)-based low-pass filtering is a widely accepted downsampling method in satellite image fusion (Vivone et al., 2015). Given their effectiveness and adequacy for HR satellite images, MTF-based filters have been extensively investigated for satellite images from various imaging sensors (Kallel, 2015; Palsson et al., 2016). Therefore, in this study, we introduced MTF-based Gaussian filters into an image degradation model to construct a more realistic LR-HR satellite image dataset for deep-learning-based SR without any additional network components. To the best of our knowledge, this work is the first to integrate deep-learning-based SR models with MTF-based image degradation by exploiting the domain knowledge of remote sensing. This study focused on how the choice of image degradation model affects the training dataset generation and the image quality of the resulting SR images.

The remainder of this study is organized as follows. Section 2 describes the HR satellite image datasets used in this study and the proposed image degradation method based on MTF-based filters. Section 3 presents the experimental results of the proposed degradation model on real-world satellite image datasets. Finally, Section 4 presents the conclusions of this study.

2. Materials and Methods

2.1. Data Description

In this study, we used satellite images obtained from WorldView-3 (WV3) and WorldView-2 (WV2) (Fig. 2). The WV3 satellite provides PAN and MS band images with spatial resolutions of 0.31 m and 1.24 m, respectively (Table 1). We obtained two WV3 images, denoted as WV3-1 and WV3-2, which were captured over Pyeongdong Industrial Park in Gwangju, Republic of Korea, with a temporal interval of approximately one year (May 26, 2017 and May 4, 2018). On the other hand, WV2 provides PAN images with a resolution of 0.46 m and MS band images with a resolution of 1.84 m. The WV2 image was acquired over the same region as the WV3 images on May 19, 2021. The satellite images were selected considering land-cover composition, temporal similarity, and the availability of image data. In particular, the two WV3 datasets were intended to verify whether the analysis results remain applicable across images from the same sensor.

Fig. 2. HR satellite images used in this study: WorldView-3 and WorldView-2 images acquired over the Pyeongdong Industrial Complex located in Gwangju, Republic of Korea, on (a) May 26, 2017; (b) May 4, 2018; and (c) May 19, 2021.

Table 1. Specifications of images used in this study

To generate the LR-HR datasets, we employed the Gram-Schmidt adaptive (GSA) algorithm (Aiazzi et al., 2007) for pansharpening of the paired PAN and MS images, and the resulting pansharpened MS images were utilized as HR images. Depending on the source of the LR images, the satellite image datasets in this study were categorized into two groups: synthetic and real-world satellite image datasets. For a synthetic satellite image dataset, we utilized the pansharpened MS images as HR images and the images downsampled from the HR images via the image degradation model as LR images. In contrast, for a real-world satellite image dataset, the original MS images were used as LR images.

2.2. Image Degradation Model with MTF-Based Filters

In many previous studies on satellite image fusion, Gaussian low-pass filters have been widely used as MTF filters because Gaussian filters can be tuned to closely match the MTF of specific sensors (Vivone et al., 2015). Thus, MTF-based filters have been utilized for pansharpening to extract high-frequency details from PAN images and for evaluation purposes (Palsson et al., 2016). In the frequency domain, the Gaussian filter is expressed as a one-dimensional Gaussian curve whose value at the Nyquist frequency equals the sensor's amplitude response (\(N_A\)):

\(\begin{aligned}e^{-\frac{x^{2}}{2 \sigma^{2}}}=N_{A}\end{aligned}\)       (2)

The amplitude response at the Nyquist frequency can either be provided by the manufacturer or estimated when exact MTF values are not available. The \(N_A\) values for MS sensors are generally known to be around 0.3 (Kallel, 2015). Since \(\begin{aligned}x=f_{c} \times \frac{n}{2}\end{aligned}\) and the cut-off frequency \(f_c\) can be expressed as the scale ratio between the PAN and MS images (Aiazzi et al., 2006), Eq. (2) can be reformulated as follows:

\(\begin{aligned}\sigma=\sqrt{-\frac{x^{2}}{2 \log \left(N_{A}\right)}}=\sqrt{-\frac{\left(\frac{f_{c}}{2} \times n\right)^{2}}{2 \log \left(N_{A}\right)}}=\sqrt{-\frac{\left(\frac{n}{2 \times s}\right)^{2}}{2 \log \left(N_{A}\right)}}\end{aligned}\)       (3)

where n and s refer to the size of the Gaussian kernel and the scale factor, respectively. s is set to four because PAN images usually have four times higher spatial resolution than MS images. Since s and \(N_A\) are fixed by the scale factor and sensor specifications, the standard deviation (σ) of the Gaussian filter is determined solely by the kernel size.
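
The relation in Eq. (3) can be implemented directly. Below is a minimal sketch that computes σ and builds the corresponding two-dimensional kernel, assuming the natural logarithm implied by the exponential form of Eq. (2); the default scale factor and Nyquist amplitude follow the values discussed in the text.

```python
import numpy as np

def mtf_sigma(n, s=4, nyquist_amp=0.3):
    """Standard deviation of the MTF-matched Gaussian for an n x n kernel (Eq. 3)."""
    x = n / (2.0 * s)                      # x = f_c * n / 2 with cut-off f_c = 1 / s
    return np.sqrt(-(x ** 2) / (2.0 * np.log(nyquist_amp)))

def mtf_gaussian_kernel(n, s=4, nyquist_amp=0.3):
    """Normalized n x n Gaussian kernel whose sigma follows Eq. (3)."""
    sigma = mtf_sigma(n, s, nyquist_amp)
    ax = np.arange(n) - (n - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()
```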

In this study, we employed MTF-based filters to simulate the image degradation from HR to LR images and thereby generate a paired LR-HR training dataset. To verify the effectiveness of using MTF-based Gaussian filters for SR of HR satellite images, we defined four image degradation models covering three key factors of the degradation process: downsampling, blur, and noise. The first model, the bicubic (BI) model, utilized bicubic downsampling to generate LR images, which is the approach adopted in most SR studies. The second model, the blur-downsample (BD) model, blurred the HR image using an MTF-based Gaussian kernel and then downsampled it with a specified scale factor. The third model, the downsample-noise (DN) model, downsampled the HR image and added Gaussian noise to the downsampled image. The Gaussian noise was set to have a zero mean and a standard deviation of 0.02, considering the noise level of the satellite images (Choi and Kim, 2020; Choi et al., 2021).

Lastly, the blur-downsample-noise (BDN) model combined the BD and DN models, sequentially degrading the HR image by blurring with an MTF-based filter, downsampling, and injecting Gaussian noise. The notation for the image degradation models (BI, BD, and DN) follows Zhang et al. (2018b) and Zhang et al. (2018c), where similar degradation models were utilized.
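
The four models differ only in whether the MTF-based blur and the noise term are applied. The following is a minimal configuration sketch reusing the `degrade` and `mtf_gaussian_kernel` sketches given earlier; the 19 × 19 kernel size anticipates the WV3 result of Section 3.1, and the 0.02 noise level follows the setting stated above.

```python
DEGRADATION_MODELS = {
    "BI":  {"blur": False, "noise": False},   # bicubic downsampling only
    "BD":  {"blur": True,  "noise": False},   # MTF-based blur, then downsampling
    "DN":  {"blur": False, "noise": True},    # downsampling, then Gaussian noise
    "BDN": {"blur": True,  "noise": True},    # blur, downsampling, and noise
}

def make_lr(hr, model="BDN", scale=4, kernel_size=19, nyquist_amp=0.3, noise_std=0.02):
    """Generate a synthetic LR image from an HR image under the chosen model."""
    cfg = DEGRADATION_MODELS[model]
    kernel = mtf_gaussian_kernel(kernel_size, scale, nyquist_amp) if cfg["blur"] else None
    return degrade(hr, kernel=kernel, scale=scale,
                   noise_std=noise_std if cfg["noise"] else 0.0)
```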

2.3. Implementation Details

In the training process, the SR models were trained on synthetic datasets generated with the four image degradation models: BI, BD, DN, and BDN. The trained SR models were then tested on real-world datasets. To be specific, the SR models trained on the synthetic datasets were applied to the real-world datasets without any fine-tuning, to avoid diluting the influence of each degradation method on the SR performance in real-world cases. To train and test the SR models, the HR images were divided into sub-images of 512×512 pixels, which correspond to 128×128 LR images. As a result, the WV3-1 and WV3-2 datasets comprised 1,208 images and 1,136 images, respectively. These datasets were split into training, validation, and test datasets in a ratio of 6:2:2. In the training phase, we randomly cropped HR image patches with a size of 256×256 pixels for every iteration. Data augmentation techniques were then applied, including random horizontal and vertical flips, as well as rotations of 90°, 180°, or 270°. The SR networks were trained for 100 epochs using the Adam optimizer with β1 = 0.9, β2 = 0.999, and ε = 10⁻⁸. The learning rate was initialized to 10⁻⁴ for all layers and decreased by a factor of 10 after half of the total epochs.
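
The following is a minimal PyTorch sketch of these training settings (random flips and rotations, L1 loss, Adam, and a step decay at the halfway point); `model` and `train_loader` are placeholders for an SR network and a dataset yielding 64 × 64 LR / 256 × 256 HR patch pairs, not the authors' released code.

```python
import random
import torch
import torch.nn.functional as F

def augment(lr, hr):
    """Random horizontal/vertical flip and 90/180/270-degree rotation."""
    if random.random() < 0.5:
        lr, hr = torch.flip(lr, [-1]), torch.flip(hr, [-1])
    if random.random() < 0.5:
        lr, hr = torch.flip(lr, [-2]), torch.flip(hr, [-2])
    k = random.randint(0, 3)
    return torch.rot90(lr, k, [-2, -1]), torch.rot90(hr, k, [-2, -1])

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4,
                             betas=(0.9, 0.999), eps=1e-8)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=50, gamma=0.1)

for epoch in range(100):
    for lr_img, hr_img in train_loader:
        lr_img, hr_img = augment(lr_img, hr_img)
        loss = F.l1_loss(model(lr_img), hr_img)   # L1 loss (L2 for SRResNet/D-DBPN)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    scheduler.step()                              # learning rate /10 after 50 epochs
```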

3. Experiments Results and Analysis

3.1. Optimal Gaussian Kernel Size Analysis

As the standard deviation of the MTF-based Gaussian kernel is determined by the kernel size, determining the proper kernel size is important to reflect the actual image properties in the image degradation procedure. Therefore, we performed optimal Gaussian kernel size analysis in advance and used the derived kernel size in the following evaluation of the image degradation models.

To derive the optimal kernel size for the MTF-based low-pass filter, the HR images were blurred using Gaussian kernels with kernel sizes varying from 5 × 5 to 51 × 51. In the calculation of the standard deviation of the MTF-based Gaussian kernel from Eq. (3), the MTF amplitudes at the Nyquist frequency (\(N_A\)) were set to 0.3 and 0.23 for WV3 and WV2, respectively, following previous studies (Kallel, 2015; Palsson et al., 2016). The blurred HR images were then downsampled using bicubic interpolation with a scale factor of four. The resulting LR images were evaluated by peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM), with the corresponding real-world LR images as references. By comparing the synthesized LR images with the real-world LR images, the optimal kernel size for satellite images was derived in a sensor-specific manner.
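
A minimal sketch of this kernel-size search, reusing the `degrade` and `mtf_gaussian_kernel` sketches from Section 2 and the scikit-image quality metrics, is given below; `hr` and `real_lr` are assumed to be float arrays in [0, 1], with `real_lr` being the original MS image at one quarter of the HR size.

```python
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def search_kernel_size(hr, real_lr, scale=4, nyquist_amp=0.3):
    """Score synthetic LR images against the real LR image for odd kernel sizes 5..51."""
    scores = {}
    for n in range(5, 52, 2):
        kernel = mtf_gaussian_kernel(n, scale, nyquist_amp)
        synth_lr = degrade(hr, kernel=kernel, scale=scale)   # blur, then downsample
        scores[n] = (
            peak_signal_noise_ratio(real_lr, synth_lr, data_range=1.0),
            structural_similarity(real_lr, synth_lr, data_range=1.0, channel_axis=-1),
        )
    best_n = max(scores, key=lambda n: scores[n][0])         # kernel size with peak PSNR
    return best_n, scores
```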

Fig. 3 illustrates how the image similarity indices (PSNR and SSIM) change with respect to the Gaussian kernel size. Higher PSNR and SSIM values indicate higher image similarity, implying that the synthesized LR image more closely resembles the real-world LR image. Both WV3 datasets (WV3-1 and WV3-2) show similar trends in PSNR and SSIM: the metrics increase until the kernel size reaches 19 × 19 and then decrease as the kernel size grows further. Despite the temporal interval between image acquisitions and the differences in imaging conditions, the WV3-1 and WV3-2 datasets both show peak PSNR and SSIM values when a 19 × 19 kernel is applied. Similarly, the WV2 dataset shows a unimodal distribution in PSNR and SSIM, with the peak occurring at a kernel size of 13 × 13. Although WV3 and WV2 capture images with the same imaging instrument (the WorldView-110 camera), the two sensors operate at different altitudes of 617 km and 770 km, respectively. This altitude difference results in WV3 requiring a larger kernel size (19 × 19) than WV2 (13 × 13) to generate LR images similar to real-world LR images. Hence, the derived optimal MTF-based Gaussian filters are 19 × 19 with σ = 2.323 for the WV3 datasets and 13 × 13 with σ = 1.702 for the WV2 dataset. It should be noted that the properties of the resulting Gaussian kernels differ from those of the commonly used Gaussian blur kernel (7 × 7 with σ = 1.6) (Zhang et al., 2017; Zhang et al., 2018d) and also differ between the sensors. However, the analysis was performed using a limited number of images, and further investigation is required to confirm the applicability of the derived kernel size to larger-scale image datasets.

Fig. 3. Gaussian kernel size analysis results from WV3-1, WV3-2, and WV2 datasets based on (a) PSNR and (b) SSIM.

Fig. 4 provides a clear visualization of how the image degradation model can affect the quality of the resulting LR images. Comparing the synthetic LR image generated through bicubic interpolation (Fig. 4d) with the LR images generated using the Gaussian blur kernel (Figs. 4a–c), it is evident that the clarity of the object boundaries is reduced when the Gaussian blur kernel is applied, similar to what is observed in the real-world LR image (Fig. 4e). These differences in LR image quality affect the SR performance, and SR models trained on synthetic LR images from bicubic downsampling often fail to achieve satisfactory results when applied to real-world LR images. Therefore, based on these observations, the proposed method aims to enhance the SR performance for real-world images by simulating the image degradation based on the sensor properties.

Fig. 4. Comparison of the simulated LR images with real-world LR and HR images: Synthetic LR images generated from Gaussian kernel with varying kernel size (n): (a) n = 9, (b) n = 19, (c) n = 29, (d) synthetic LR images obtained from bicubic downsampling, (e) real-world LR images (MS images), and (f) HR images (pansharpened MS images). For the convenience of comparison, the LR images are enlarged to the size of the HR image.

3.2. Evaluation of Image Degradation Models

To evaluate the influence of the image degradation model on SR performance, four image degradation models were applied to the WV3-1 and WV3-2 datasets. For comparison, several state-of-the-art deep-learning-based SR models were implemented to determine whether the SR performance was affected by the image degradation or by the SR model: SRResNet (Ledig et al., 2017), EDSR (Lim et al., 2017), D-DBPN (Haris et al., 2018), RRDBNet (Wang et al., 2018), RCAN (Zhang et al., 2018b), RDN (Zhang et al., 2018c), and HAN (Niu et al., 2020). For a fair comparison among the SR models, the same hyperparameters were applied for training except for the loss function. The loss functions used in the SR models were selected from L1 and L2 losses, as proposed in the original studies. Thus, SRResNet (Ledig et al., 2017) and D-DBPN (Haris et al., 2018) utilized L2 loss, and the remaining SR models used L1 loss.

The SR performance was evaluated using PSNR, SSIM, and the learned perceptual image patch similarity (LPIPS) (Zhang et al., 2018a). Although PSNR and SSIM are the most widely used metrics for evaluating image quality in SR products, these conventional metrics tend to focus on image fidelity rather than human perception. In contrast, LPIPS was designed to reflect human perception by measuring the perceptual similarity between images. While higher PSNR and SSIM values indicate better image quality, a lower LPIPS value is desirable because it measures the distance between multi-layer features extracted from a pre-trained network.
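
The three metrics can be computed as in the following minimal sketch, assuming SR and HR images as torch tensors of shape (1, 3, H, W) in [0, 1]; the `lpips` package released with Zhang et al. (2018a) and scikit-image are used here for illustration, not the exact evaluation code of this study.

```python
import lpips
import torch
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

lpips_fn = lpips.LPIPS(net="alex")        # perceptual distance: lower is better

def evaluate(sr, hr):
    sr_np = sr.squeeze(0).permute(1, 2, 0).cpu().numpy()
    hr_np = hr.squeeze(0).permute(1, 2, 0).cpu().numpy()
    psnr = peak_signal_noise_ratio(hr_np, sr_np, data_range=1.0)
    ssim = structural_similarity(hr_np, sr_np, data_range=1.0, channel_axis=-1)
    dist = lpips_fn(sr * 2 - 1, hr * 2 - 1).item()   # LPIPS expects inputs in [-1, 1]
    return psnr, ssim, dist
```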

Tables 2 and 3 present a quantitative evaluation of the SR models on the WV3-1 and WV3-2 datasets. The results indicate that all the SR models tend to yield better image quality with BD and BDN than with BI and DN across all three evaluation metrics. In particular, the LPIPS values demonstrate the clear superiority of the proposed degradation model with MTF-based filters for all SR models. The experimental results suggest that the conventional BI degradation model, which does not consider the other key factors of image degradation (blur and noise), can lead to deteriorated SR image quality in real-world applications. These findings are inconsistent with earlier observations (Zhang et al., 2018b; Zhang et al., 2018c), where the BD model showed inferior SR performance compared to the BI model. Such disagreement highlights the importance of employing a proper degradation method to enhance the SR performance.

Table 2. Evaluation of image degradation models on WV3-1 image datasets

The best- and the second-best performances for each SR model are indicated in bold and underlined, respectively.

Table 3. Evaluation of image degradation models on WV3-2 image datasets

The best- and the second-best performances for each SR model are indicated in bold and underlined, respectively.

In addition, as shown in Fig. 5, using LR images from BD and BDN recovers more details and produces clearer SR images than using LR images from the degradation models without MTF-based filters (BI and DN). Although the experimental results indicate that the proposed image degradation method can improve the SR performance, it is difficult to presume that the derived Gaussian kernel can perfectly simulate the degradation characteristics of real-world LR images, because real-world blur kernels are known to be anisotropic and inconsistent within an image (Zhou and Süsstrunk, 2019). Nonetheless, the proposed image degradation with MTF-based filters can be easily implemented based on sensor properties with a lower computational burden, as it does not require an additional network for image degradation.

Fig. 5. Visual comparison of LR and SR images from four image degradation models. SR results are retrieved from RDN using real-world LR images as input. For the convenience of comparison, the LR images are enlarged to the size of the HR image.

The SR outputs generated from the SR networks can be utilized for various downstream applications. Previous studies have demonstrated that improving the spatial resolution can enhance the performance of subsequent tasks such as classification (Xiong et al., 2020) and object detection (Rabbi et al., 2020). However, it is important to note that the SR process may alter the spectral values of the images. Therefore, further investigation is required to assess the suitability of using SR outputs for precise spectral analysis.

3.3. Analysis of Gaussian Kernel Size and SR Performance

In this section, Gaussian kernels of various sizes were applied to analyze the image quality of the resulting SR images from the WV3-1 dataset. The SR results were retrieved from the RDN and RCAN models, which showed superior SR performance in Section 3.2.

As shown in Fig. 6, both PSNR and SSIM maintain relatively stable values up to a kernel size of 19 × 19 and then decrease rapidly as the blur kernel size increases. These trends are the same for both SR models. While PSNR and SSIM show similar trends with respect to kernel size, LPIPS reveals a more pronounced difference, presenting its lowest values for the 19 × 19 kernel. As the highest perceptual image quality was achieved with the optimal kernel size derived in this study, the experimental results confirm that the image degradation used for dataset generation can affect the image quality of the SR outputs in real-world cases. In this regard, the proposed degradation method can effectively simulate the image degradation from HR to LR images based on the sensor properties.

Fig. 6. Comparison of SR performance depending on Gaussian kernel size of MTF-based filters on WV3-1 dataset: SR performance is evaluated by (a) PSNR, (b) SSIM, and (c) LPIPS.

4. Conclusions

In this study, we proposed an image degradation model with MTF-based Gaussian filters to generate a realistic LR-HR image dataset for deep-learning-based SR of HR satellite images. The experimental results on real-world satellite image datasets demonstrated that the commonly used degradation model is not appropriate for remote sensing images due to the differences in image properties. Drawing on remote sensing domain knowledge, the proposed method reflects the properties of the imaging sensor in the image degradation model and proves its adequacy for HR satellite images by achieving a considerable improvement in SR performance. Our degradation model can be further validated with multi-sensor remote sensing images for real-world SR applications. In future studies, further investigation is required to consider other unique properties of satellite sensors within the deep learning framework.

Acknowledgments

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (Grant: NRF-2023R1A2C2005548) and by the Korea Agency for Infrastructure Technology Advancement (KAIA) grant funded by the Ministry of Land, Infrastructure and Transport (Grant: RS-2022-00155763). The Institute of Engineering Research at Seoul National University provided research facilities for this work.

Conflict of Interest

No potential conflict of interest relevant to this article was reported.

References

  1. Aiazzi, B., Alparone, L., Baronti, S., Garzelli, A., and Selva, M., 2006. MTF-tailored multiscale fusion of high-resolution MS and Pan imagery. Photogrammetric Engineering & Remote Sensing, 72(5), 591-596. https://doi.org/10.14358/PERS.72.5.591
  2. Aiazzi, B., Baronti, S., and Selva, M., 2007. Improving component substitution pansharpening through multivariate regression of MS+Pan data. IEEE Transactions on Geoscience and Remote Sensing, 45(10), 3230-3239. https://doi.org/10.1109/TGRS.2007.901007
  3. Bell-Kligler, S., Shocher, A., and Irani, M., 2019. Blind super-resolution kernel estimation using an internal-GAN. In: Wallach, H. M., Larochelle, H., Beygelzimer, A., d'Alché-Buc, F., Fox, E. A., Garnett, R. (eds.), Advances in neural information processing systems 32: Annual conference on neural information processing systems 2019, NeurIPS, pp. 284-293. https://proceedings.neurips.cc/paper/2019/hash/5fd0b37cd7dbbb00f97ba6ce92bf5add-Abstract.html
  4. Choi, Y., Han, S., and Kim, Y., 2021. A no-reference CNN-based super-resolution method for KOMPSAT-3 using adaptive image quality modification. Remote Sensing, 13(16), 3301. https://doi.org/10.3390/rs13163301
  5. Choi, Y., and Kim, Y., 2020. A no-reference super resolution for satellite image quality enhancement for KOMPSAT-3. In Proceedings of the IGARSS 2020 - 2020 IEEE International Geoscience and Remote Sensing Symposium, Waikoloa, HI, USA, Sept. 26-Oct. 2, pp. 220-223. https://doi.org/10.1109/IGARSS39084.2020.9324422
  6. Dai, T., Cai, J., Zhang, Y., Xia, S.-T., and Zhang, L., 2019. Second-order attention network for single image super-resolution. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, June 15-20, pp. 11057-11066. https://doi.org/10.1109/CVPR.2019.01132
  7. Dong, C., Loy, C. C., He, K., and Tang, X., 2016. Image super-resolution using deep convolutional networks. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(2), 295-307. https://doi.org/10.1109/TPAMI.2015.2439281
  8. Freeman, W. T., Pasztor, E. C., and Carmichael, O. T., 2000. Learning low-level vision. International Journal of Computer Vision, 40, 25-47. https://doi.org/10.1023/A:1026501619075
  9. Haris, M., Shakhnarovich, G., and Ukita, N., 2018. Deep back-projection networks for super-resolution. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, June 18-23, pp. 1664-1673. https://doi.org/10.1109/CVPR.2018.00179
  10. Ji, X., Cao, Y., Tai, Y., Wang, C., Li, J., and Huang, F., 2020. Real-world super-resolution via kernel estimation and noise injection. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Seattle, WA, USA, June 14-19, pp. 1914-1923. https://doi.org/10.1109/CVPRW50498.2020.00241
  11. Kallel, A., 2015. MTF-adjusted pansharpening approach based on coupled multiresolution decompositions. IEEE Transactions on Geoscience and Remote Sensing, 53(6), 3124-3145. https://doi.org/10.1109/TGRS.2014.2369056
  12. Kim, J., Lee, J. K., and Lee, K. M., 2016. Accurate image super-resolution using very deep convolutional networks. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, June 27-30, pp. 1646-1654. https://doi.org/10.1109/CVPR.2016.182
  13. Ledig, C., Theis, L., Huszar, F., Caballero, J., Cunningham, A., Acosta, A. et al., 2017. Photo-realistic single image super-resolution using a generative adversarial network. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, July 21-26, pp. 105-114. https://doi.org/10.1109/CVPR.2017.19
  14. Lepcha, D. C., Goyal, B., Dogra, A., and Goyal, V., 2022. Image super-resolution: A comprehensive review, recent trends, challenges and applications. Information Fusion, 91, 230-260. https://doi.org/10.1016/j.inffus.2022.10.007
  15. Lim, B., Son, S., Kim, H., Nah, S., and Lee, K. M., 2017. Enhanced deep residual networks for single image super-resolution. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Honolulu, HI, USA, July 21-26, pp. 1132-1140. https://doi.org/10.1109/CVPRW.2017.151
  16. Lu, T., Wang, J., Zhang, Y., Wang, Z., and Jiang, J., 2019. Satellite image super-resolution via multiscale residual deep neural network. Remote Sensing, 11(13), 1588. https://doi.org/10.3390/rs11131588
  17. Niu, B., Wen, W., Ren, W., Zhang, X., Yang, L., Wang, S. et al., 2020. Single image super-resolution via a holistic attention network. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J. M. (eds.), Computer vision - ECCV 2020, Springer, pp. 191-207. https://doi.org/10.1007/978-3-030-58610-2_12
  18. Palsson, F., Sveinsson, J. R., Ulfarsson, M. O., and Benediktsson, J. A., 2016. Quantitative quality evaluation of pansharpened imagery: Consistency versus synthesis. IEEE Transactions on Geoscience and Remote Sensing, 54(3), 1247-1259. https://doi.org/10.1109/TGRS.2015.2476513
  19. Rabbi, J., Ray, N., Schubert, M., Chowdhury, S., and Chao, D., 2020. Small-object detection in remote sensing images with end-to-end edge-enhanced GAN and object detector network. Remote Sensing, 12(9), 1432. https://doi.org/10.3390/rs12091432
  20. Tai, Y., Yang, J., and Liu, X., 2017. Image super-resolution via deep recursive residual network. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, July 21-26, pp. 2790-2798. https://doi.org/10.1109/CVPR.2017.298
  21. Vivone, G., Alparone, L., Chanussot, J., Mura, M. D., Garzelli, A., Licciardi, G. A. et al., 2015. A critical comparison among pansharpening algorithms. IEEE Transactions on Geoscience and Remote Sensing, 53(5), 2565-2586. https://doi.org/10.1109/TGRS.2014.2361734
  22. Wang, X., Yi, J., Guo, J., Song, Y., Lyu, J., Xu, J. et al., 2022. A review of image super-resolution approaches based on deep learning and applications in remote sensing. Remote Sensing, 14(21), 5423. https://doi.org/10.3390/rs14215423
  23. Wang, X., Yu, K., Wu, S., Gu, J., Liu, Y., Dong, C. et al., 2018. ESRGAN: Enhanced super-resolution generative adversarial networks. In: Leal-Taixé, L., Roth, S. (eds.), Computer vision - ECCV 2018 Workshops, Springer, pp. 63-79. https://doi.org/10.1007/978-3-030-11021-5_5
  24. Xiong, Y., Guo, S., Chen, J., Deng, X., Sun, L., Zheng, X., and Xu, W., 2020. Improved SRGAN for remote sensing image super-resolution across locations and sensors. Remote Sensing, 12(8), 1263. https://doi.org/10.3390/rs12081263
  25. Zhang, J., Xu, T., Li, J., Jiang, S., and Zhang, Y., 2022. Single-image super resolution of remote sensing images with real-world degradation modeling. Remote Sensing, 14(12), 2895. https://doi.org/10.3390/rs14122895
  26. Zhang, K., Zuo, W., Gu, S., and Zhang, L., 2017. Learning deep CNN denoiser prior for image restoration. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, July 21-26, pp. 2808-2817. https://doi.org/10.1109/CVPR.2017.300
  27. Zhang, K., Zuo, W., and Zhang, L., 2018d. Learning a single convolutional super-resolution network for multiple degradations. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, June 18-23, pp. 3262-3271. https://doi.org/10.1109/CVPR.2018.00344
  28. Zhang, N., Wang, Y., Zhang, X., Xu, D., Wang, X., Ben, G. et al., 2020. A multi-degradation aided method for unsupervised remote sensing image super resolution with convolution neural networks. IEEE Transactions on Geoscience and Remote Sensing, 60, 1-14. https://doi.org/10.1109/TGRS.2020.3042460
  29. Zhang, R., Isola, P., Efros, A. A., Shechtman, E., and Wang, O., 2018a. The unreasonable effectiveness of deep features as a perceptual metric. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, June 18-23, pp. 586-595. https://doi.org/10.1109/CVPR.2018.00068
  30. Zhang, Y., Li, K., Li, K., Wang, L., Zhong, B., and Fu, Y., 2018b. Image super-resolution using very deep residual channel attention networks. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.), Computer vision - ECCV 2018, Springer, pp. 294-310. https://doi.org/10.1007/978-3-030-01234-2_18
  31. Zhang, Y., Tian, Y., Kong, Y., Zhong, B., and Fu, Y., 2018c. Residual dense network for image super-resolution. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, June 18-23, pp. 2472-2481. https://doi.org/10.1109/CVPR.2018.00262
  32. Zhou, R., and Süsstrunk, S., 2019. Kernel modeling super-resolution on real low-resolution images. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, South Korea, Oct. 27-Nov. 2, pp. 2433-2443. https://doi.org/10.1109/ICCV.2019.00252