Ferdowsi University of Mashhad
Journal of Geography and Environmental Hazards
ISSN 2322-1682, 8(2), 2019-06-22
Evaluation of the SPA Algorithm and the Feasibility of using MM5 Model Output to Estimate the Cloud Gaps in MODIS LST Images
pp. 133-147, article 33393
DOI: 10.22067/geo.v0i0.65717
Language: FA
Negar Siabi, Ferdowsi University of Mashhad
Seyed Hossein Sanaeinejad, Ferdowsi University of Mashhad, ORCID 0000-0002-4013-4359
Bijan Ghahraman, Ferdowsi University of Mashhad
Journal Article, 2017-07-02
1 Introduction
Land Surface Temperature (LST) is an important parameter controlling the exchange of heat and water between the surface and the atmosphere (Li, Tang, Wu, Ren, Yan, Wan, Trigo, & Sobrino, 2013). Remote sensing images are now one of the most important data sources for estimating LST (Hengl, Heuvelink, Perčec Tadić, & Pebesma, 2012). However, air pollutants, cloud cover, and sensor failures cause substantial data loss in these images, known as image gaps. Several methods have been proposed to estimate the missing values. They fall into three categories: spatial, temporal, and spatio-temporal. Temporal methods estimate missing data from the time series of each pixel, the best known being the Savitzky-Golay filter (Savitzky & Golay, 1964). In addition to such simple solutions, more sophisticated methods, such as that of Brooks, Thomas, Wynne, and Coulston (2012) and the harmonic analysis of Zhou, Jia, and Menenti (2015), have been introduced and applied to various remote sensing products. Simple interpolation methods such as nearest neighbor, spline, and Inverse Distance Weighting (IDW) are spatial methods; most of them are based on a weighted average. Compared with simple interpolation, approaches that use auxiliary data have attracted more interest from researchers. Chen, Zhu, Vogelmann, Gao, and Jin (2011) proposed the Neighborhood Similar Pixel Interpolator (NSPI). Zhu, Liu, and Chen (2012) improved NSPI with geostatistics and named the result the Geostatistical Neighborhood Similar Pixel Interpolator. Gerber, de Jong, Schaepman, Schaepman-Strub, and Furrer (2018) proposed a spatio-temporal algorithm based on subsetting for estimating the missing values of MODIS NDVI data. The estimation results are influenced by factors such as the structure of the algorithm, the scenarios used, the type of variable, and the study region.
Finding a suitable auxiliary image for cloudy days, or providing a sufficiently long time series, is also among the most important factors influencing the results of these methods (Kandasamy, Baret, Verger, Neveux, & Weiss, 2013). To address this problem, time series from other satellites are most often used. These images may differ from the target images in spectral structure and can degrade the performance of the algorithms. Only a few studies, such as Jang, Kang, Kim, Lee, Kim, Kim, and Hirata (2010), have used the outputs of numerical prediction models to produce spatially and temporally continuous remote sensing data. They concluded that the outputs of numerical prediction models could be used on cloudy days.
The aim of this study is to evaluate the approach proposed by Gerber, de Jong, Schaepman, Schaepman-Strub, and Furrer (2018) for reconstructing MODIS LST images, and to study the feasibility of using MM5 model outputs as auxiliary data for estimating the missing values of the images.
2 Materials and Methods
The study area comprises the North Khorasan, Khorasan Razavi, and South Khorasan provinces in northeastern Iran, located between 55 and 61 degrees east and between 30 and 38 degrees north. The total area of the region is 313,000 square kilometers, and the overall climate is semi-arid to arid (Ahmadian, Sheibani, Araqi, Shirmohammadi, & Mojarad, 2001).
Two types of data were used in this study: (1) LST images from the Level 3 MODIS product (MOD11A2), with a spatial resolution of 1 km and a time interval of 8 days; and (2) MM5 model output data, in the form of images with a spatial resolution of 0.5 × 0.5 degrees for the period 2000-2010. The data were obtained from NASA and NOAA web pages for the latitude and longitude range of the study area. The Subset-Predict Algorithm (SPA) proposed by Gerber et al. (2018) was selected as the base method. It uses a spatio-temporal approach to estimate the missing values of remote sensing images and is suitable for data sets with a four-dimensional array structure. The missing values are predicted in two main steps: (1) subsetting, and (2) predicting the missing values from the subsets. The approach was implemented for MODIS LST images in MATLAB. The inputs of the algorithm were an original image and a series of auxiliary images. Two tests were designed to estimate the missing values: the first used the MODIS LST time series as auxiliary data, and the second used the MM5 output data. The Root Mean Square Error (RMSE), Mean Difference (AD), and Coefficient of Determination (R2) were used to evaluate the performance of the SPA approach in the two tests.
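The two steps described above can be illustrated with a deliberately simplified sketch. It is not the SPA implementation of Gerber et al. (2018) and not the authors' MATLAB code: the function name `fill_gaps`, the parameters `half_window` and `min_pairs`, and the use of a least-squares line as the predictor are all assumptions made for illustration. Grids are plain nested lists on a common grid, with `None` marking a missing pixel.

```python
from statistics import mean


def fill_gaps(target, auxiliary, half_window=1, min_pairs=3):
    """Fill None cells of `target` using a co-registered `auxiliary` grid.

    Hypothetical, simplified subset-then-predict scheme (not the SPA itself).
    """
    rows, cols = len(target), len(target[0])
    filled = [row[:] for row in target]  # copy so the input stays untouched
    for r in range(rows):
        for c in range(cols):
            if target[r][c] is not None or auxiliary[r][c] is None:
                continue
            # Step 1 (subset): valid (auxiliary, target) pairs in the window.
            pairs = []
            for i in range(max(0, r - half_window), min(rows, r + half_window + 1)):
                for j in range(max(0, c - half_window), min(cols, c + half_window + 1)):
                    if target[i][j] is not None and auxiliary[i][j] is not None:
                        pairs.append((auxiliary[i][j], target[i][j]))
            if len(pairs) < min_pairs:
                continue  # too little information: leave the pixel unfilled
            # Step 2 (predict): least-squares line fitted on the subset.
            xs = [p[0] for p in pairs]
            ys = [p[1] for p in pairs]
            x_bar, y_bar = mean(xs), mean(ys)
            sxx = sum((x - x_bar) ** 2 for x in xs)
            if sxx == 0:
                filled[r][c] = float(y_bar)  # flat auxiliary data: use the mean
            else:
                slope = sum((x - x_bar) * (y - y_bar) for x, y in pairs) / sxx
                filled[r][c] = y_bar + slope * (auxiliary[r][c] - x_bar)
    return filled


# Toy example: the auxiliary grid differs from the target by a constant offset.
demo_auxiliary = [[10, 11, 12], [13, 14, 15], [16, 17, 18]]
demo_target = [[12, 13, 14], [15, None, 17], [18, 19, 20]]
demo_filled = fill_gaps(demo_target, demo_auxiliary)
print(demo_filled[1][1])
```

In this sketch the auxiliary grid could stand for either an earlier LST image (test 1) or an MM5 field resampled to the MODIS grid (test 2); the resampling itself is not shown.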
3 Results and Discussion
The simulation results show that the SPA method achieves good accuracy. The average error obtained is 1.487 degrees Celsius. For comparison, Kilibarda, Hengl, Heuvelink, Gräler, Pebesma, Perčec Tadić, and Bajat (2014) reported a mean error of ± 2.5 degrees Celsius in the reconstruction of LST images of 2011. Running the SPA algorithm with the MM5 outputs (test 2) was less accurate than test 1. This may be due to the uncertainty of the MM5 model in predicting the surface temperature. Visual inspection of the images showed that the spatial pattern of the LST trend was preserved in the estimated values and that the algorithm did not impose an artificial pattern on the images. The algorithm reconstructed the missing values in most places while maintaining the spatial pattern both at the edges of the gaps and inside them. Only in the upper-right quarter of the gap area did the temperature pattern differ from the original LST image. In that part of the gap, the spatial pattern obtained with the MM5 model output reflects the temperature trend better than the one obtained with the remote sensing time series.
The spatial distribution map of the mean error in the gap region showed that, in most pixels, the error is in the range of 0 to 2 degrees Celsius. An error of more than 6 degrees Celsius is seen in a small number of pixels in the center of the gap. This may be due to the structure of the method and the subsetting of neighborhoods, or to the extreme changes in topography in the area. In contrast to the approach proposed by Chan and Shen (2001), the SPA method accurately simulated the missing values at both edges of the gap. Given that the maximum error occurs at the center of the gap in both tests, there is likely a structural problem in the SPA algorithm that needs further investigation.
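As a small illustration of how such an error map can be summarized, the sketch below computes the share of gap pixels falling into absolute-error classes. The function name `error_bin_shares` and the class edges of 2 and 6 degrees Celsius are assumptions chosen to match the ranges mentioned above; the middle class (2 to 6 degrees) is not reported in the text.

```python
def error_bin_shares(estimated, observed, edges=(2.0, 6.0)):
    """Return the fraction of pixels in each absolute-error class.

    With the default edges the classes are [0, 2], (2, 6] and (6, inf)
    degrees Celsius. Cells that are None in either grid are skipped.
    """
    counts = [0] * (len(edges) + 1)
    total = 0
    for est_row, obs_row in zip(estimated, observed):
        for est, obs in zip(est_row, obs_row):
            if est is None or obs is None:
                continue
            error = abs(est - obs)
            # The class index is the number of edges the error exceeds.
            counts[sum(error > edge for edge in edges)] += 1
            total += 1
    if total == 0:
        raise ValueError("no pixel pairs to compare")
    return [count / total for count in counts]
```

Called on the estimated and original LST values inside a gap, the first returned share corresponds to the pixels with an error of 0 to 2 degrees Celsius and the last to those above 6 degrees Celsius.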
Validation of the method revealed that the RMSE of test 1 is lower than that of test 2, meaning that the algorithm is more accurate in test 1. According to the AD index, the SPA algorithm underestimated the missing values in both tests. In addition, the correlation coefficient in test 1 was 0.8% higher than in test 2.
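The three validation indices can be sketched as follows. The source does not give their formulas, so the definitions here are assumptions: AD is taken as the mean of estimated minus observed values (a negative value then indicates underestimation), and R2 is taken as the squared Pearson correlation. The function names are illustrative.

```python
import math


def rmse(estimated, observed):
    """Root Mean Square Error between paired values."""
    n = len(observed)
    return math.sqrt(sum((e - o) ** 2 for e, o in zip(estimated, observed)) / n)


def mean_difference(estimated, observed):
    """Mean of (estimated - observed); negative means underestimation."""
    n = len(observed)
    return sum(e - o for e, o in zip(estimated, observed)) / n


def r_squared(estimated, observed):
    """Squared Pearson correlation between estimated and observed values."""
    n = len(observed)
    e_bar = sum(estimated) / n
    o_bar = sum(observed) / n
    s_eo = sum((e - e_bar) * (o - o_bar) for e, o in zip(estimated, observed))
    s_ee = sum((e - e_bar) ** 2 for e in estimated)
    s_oo = sum((o - o_bar) ** 2 for o in observed)
    if s_ee == 0 or s_oo == 0:
        raise ValueError("R2 is undefined when one series is constant")
    return s_eo ** 2 / (s_ee * s_oo)
```

Applied to the estimated and original LST values of the artificially masked pixels, a lower `rmse` and a higher `r_squared` for test 1 than for test 2, together with a negative `mean_difference` in both, would correspond to the pattern reported above.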
4 Conclusion
In this study, the spatio-temporal SPA method proposed by Gerber et al. (2018) was used to estimate the missing values of MODIS LST images and to reconstruct the images for the years 2000-2010 in northeastern Iran. The results of the SPA method in both tests showed that it was able to estimate the missing values accurately. They also showed that the errors obtained were within the acceptable range for temperature data (Ferguson & Wood, 2010). Running the SPA algorithm with the MM5 output maps was less accurate than test 1. The algorithm did not impose an artificial pattern on the images, and it retrieved the missing values while maintaining the spatial pattern both at the edges of the gap and inside it. In contrast, many existing methods, such as that of Chen et al. (2004), cannot estimate all of the missing pixels.
The spatial distribution maps of the error show that the highest simulation error occurs in a number of pixels in the central region of the gap. The results of this study showed that, in the absence of sufficient temperature information for a region, data from the MM5 model can be used to fill in the missing pixels and maintain the spatio-temporal continuity of the remote sensing images.
https://geoeh.um.ac.ir/article_33393_e9209b5dbd28e93c602db682580f57b8.pdf