Austin J. Riley | 16 August 2024
GIST 602B: Vector Spatial Analysis
Abstract
The choice of an optimal interpolation method for estimating precipitation distribution is essential for a variety of resource management issues. However, achieving an accurate representation of precipitation at unsampled locations is difficult, especially in places with spatially diverse rainfall patterns. The climate of the Hawaiian Islands is highly diverse over a very small area, and because of this, the assumption of second-order stationarity is not met. To evaluate the performance of spatial interpolation methods in the islands, the inverse distance weighting (IDW) method was compared to the geostatistical ordinary kriging (OK) method for interpolating mean May precipitation for the island of Hawai'i for the period 1970–2007. The performance of both methods was assessed using root mean square error (RMSE) cross-validation statistics, and rainfall surface maps were produced. The overall RMSE was 41.67 mm for IDW and 33.12 mm for OK, and the surface produced by OK was smoother than that produced by IDW. Results indicate that OK outperformed IDW despite the influence of non-stationarity from steep rainfall gradients.
Introduction
Precipitation data are important for many environmental studies, especially those related to water resources. Although measurements from individual rain gauges are appropriate at small scales (Wagner et al., 2012), achieving an accurate representation of precipitation at large scales typically requires spatial interpolation from these measurements (Chaubey et al., 1999; Tabios & Salas, 1985). A wide range of interpolation methods exists, from simple techniques such as inverse distance weighting (IDW) (Teegavarapu et al., 2009) to more complex geostatistical techniques such as ordinary kriging (OK) (Buytaert et al., 2006; Li & Heap, 2014). The choice of an optimal interpolation method is essential to research in hydrology, terrestrial ecosystem processes, regional climate, and regional impacts of global change (Frazier et al., 2015). There is some debate, however, as to which method is best or most appropriate in specific circumstances (Li & Heap, 2014).
In the Hawaiian Islands, a diverse terrain and a persistent trade wind inversion (TWI) lead to extremely complex rainfall patterns (Leopold et al., 1951). Achieving an accurate representation of these patterns is therefore a difficult task, even with a relatively dense network of stations. To evaluate the performance of spatial interpolation methods in places with spatially diverse rainfall patterns, a comparative analysis within a geographic information system (GIS) was selected. Frazier et al. (2015) compared OK, ordinary cokriging (OCK), and kriging with an external drift (KED) for interpolating month-year rainfall in the Hawaiian Islands, finding that OK performed best. However, others have suggested that kriging is not suitable for monthly precipitation interpolation because the assumption of second-order stationarity is usually not met (Nalder & Wein, 1998; Chen et al., 2009). In addition, the performance of kriging is generally hampered by the presence of steep gradients (Nalder & Wein, 1998), such as those found in places with climatic diversity. The objective of this study was thus to compare a widely used deterministic method (i.e., IDW) to the geostatistical OK method for interpolating mean monthly rainfall with fewer stations in the island of Hawai'i. In particular, the following questions are addressed:
1. Is there an optimal method for interpolating monthly precipitation distribution in the island of Hawai'i?
2. Is it appropriate to use geostatistical interpolation methods in places with spatially diverse rainfall patterns?
Methods
A. Study Area
The area under consideration is the Big Island of Hawai'i, the southernmost island of the Hawaiian Island chain. The six other major islands (Kaua'i, O'ahu, Moloka'i, Lāna'i, Maui, and Kaho'olawe) were not considered in this study because previous work using a similar number of rain gauge stations to those currently available has already been conducted (Frazier et al., 2015). No rainfall data were available for the island of Ni'ihau. The main Hawaiian Islands are located in the Pacific Ocean between 18.9° and 22.24° N latitude and 160.25° and 154.8° W longitude, with the island of Hawai'i between 18.9° and 20.27° N latitude and 156.06° and 154.8° W longitude. The islands have a total land area of 16,637 km² (Juvik & Juvik, 1998), and the island of Hawai'i comprises 63% of this total. The complex topography of the Hawaiian Islands and the large elevation range (0–4205 m) create a great diversity of mean annual rainfall (Frazier et al., 2015), ranging from 204 to 10,271 mm (Giambelluca et al., 2013). In addition, the average rainfall gradients for some places in the islands are among the steepest in the world (Frazier et al., 2015).
B. Data
Over 2,000 rain gauge stations have operated across the islands over the past 150 years (Giambelluca et al., 2013). However, many stations were discontinued throughout the 1970s and 1980s, and as of 2012, only 404 gauges were still in operation (Frazier et al., 2015). Fortunately for this study, the Hawai'i Statewide GIS Program maintained by the Office of Planning and Sustainable Development contains over 1,100 stations for the period 1970–2007 prior to the discontinuance. These data were downloaded from the Hawaii Statewide GIS Data Portal and were brought into ArcGIS Pro 3.3 as a CSV file. Because the month of May had the lowest cross-validation statistics for the island of Hawai'i using OK in the Frazier et al. (2015) study, the dataset was filtered to represent only mean rainfall for May. The ArcGIS XY Table To Point tool was used to create a point feature class based on the x- and y-coordinates from the table, and a separate feature layer for stations in the island of Hawai'i was created using the Make Feature Layer tool. The total number of remaining stations was 260, representing a station density of about 0.025 stations per km², or roughly one station per 40 km² (Figure 1).
Figure 1. Study area depicting the 260 rain gauge stations in the island of Hawai'i.
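The station density quoted above can be checked with a few lines of arithmetic. The sketch below uses only the figures given in the text (total land area, the 63% share, and the 260 stations); the variable names are illustrative.

```python
# Figures taken from the text; variable names are illustrative only.
total_land_area_km2 = 16_637   # all main Hawaiian Islands (Juvik & Juvik, 1998)
hawaii_island_share = 0.63     # share of the total held by the island of Hawai'i
n_stations = 260               # stations retained for mean May rainfall

island_area_km2 = total_land_area_km2 * hawaii_island_share
stations_per_km2 = n_stations / island_area_km2
km2_per_station = island_area_km2 / n_stations

print(f"Island area: {island_area_km2:.0f} km^2")
print(f"Density: {stations_per_km2:.3f} stations per km^2")
print(f"Coverage: one station per {km2_per_station:.1f} km^2")
```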
C. Interpolation Techniques
The mean May rainfall dataset from the Hawaii Statewide GIS Program served as the primary input for both the IDW and OK interpolation methods. IDW, the first method performed, is a simple deterministic technique based on the assumption that interpolated values are influenced more by nearby observations than by distant ones (Wang et al., 2014). IDW weights the observations closer to the interpolation location more heavily than those farther away, such that the weighting is a function of the distance between the point of interest and the sampling points (Sun et al., 2009). The usual mathematical expression of IDW is shown as:

$$Z = \frac{\sum_{i=1}^{n} Z_i / d_i^{p}}{\sum_{i=1}^{n} 1 / d_i^{p}}$$
where Z is the predicted value at the interpolation point; Z_i is the value of sampling point i (i = 1, 2, ..., n); n is the number of sample points; d_i is the distance between the interpolation point and sampling point i; and p is the power parameter, which is a positive real number (Wang et al., 2014). The principal factor affecting the accuracy of IDW is the value of p (Borges et al., 2014); the optimal value (i.e., 2.5) was determined by minimizing the root mean square error (RMSE) calculated from the cross-validation procedure in the Geostatistical Wizard tool in ArcGIS Pro 3.3. Furthermore, a standard neighborhood using the 10 closest sample points was specified.
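To make the weighting concrete, the following is a minimal NumPy sketch of IDW using the settings described above (p = 2.5 and the 10 closest sample points). It is not the ArcGIS implementation; the function name `idw_predict` and the station values in the example are hypothetical and serve only as an illustration.

```python
import numpy as np

def idw_predict(xy_known, z_known, xy_query, power=2.5, k=10):
    """Inverse distance weighted estimate at each query point.

    Uses the k nearest sample points and weights w_i = 1 / d_i**power.
    Illustrative sketch only; not the ArcGIS Geostatistical Wizard code.
    """
    xy_known = np.asarray(xy_known, dtype=float)
    z_known = np.asarray(z_known, dtype=float)
    xy_query = np.atleast_2d(np.asarray(xy_query, dtype=float))
    k = min(k, len(z_known))

    preds = np.empty(len(xy_query))
    for j, q in enumerate(xy_query):
        d = np.linalg.norm(xy_known - q, axis=1)  # distance to every station
        nearest = np.argsort(d)[:k]               # indices of the k closest
        d_k, z_k = d[nearest], z_known[nearest]
        if d_k[0] == 0.0:
            # Query coincides with a station: IDW returns the measured value.
            preds[j] = z_k[0]
            continue
        w = 1.0 / d_k ** power
        preds[j] = np.sum(w * z_k) / np.sum(w)
    return preds

# Hypothetical stations (x, y in km) and rainfall values (mm), for illustration.
stations = [[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]]
rain_mm = [120.0, 80.0, 200.0, 150.0]
print(idw_predict(stations, rain_mm, [[2.0, 3.0], [5.0, 5.0]]))
```

Because the weights decay as 1 / d^p, a larger p concentrates the estimate on the closest stations, while a smaller p produces a flatter surface closer to the neighborhood mean.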
In contrast to IDW, kriging is a geostatistical method that accounts for the spatial correlation and statistical relationships between measured points (Krige, 1966). Among the most popular of the kriging methods is OK (Wang et al., 2014), which assumes that the mean is constant but unknown and focuses on spatial components by using only the sample points within the local neighborhood to estimate the predicted value (Luo et al., 2008). OK uses a semivariogram to assess the dissimilarity between points in the local neighborhood. A spherical model was fitted to the empirical semivariogram of the mean May rainfall data using the Geostatistical Wizard tool; this model was selected because its approximately linear behavior near the origin matched the empirical semivariogram (Figure 2). In addition, OK performs best when the data are approximately normally distributed, but because precipitation data are often positively skewed, the estimated semivariances may be less reliable (Frazier et al., 2015). Log and Box-Cox transformations were tested to assess their effect on model performance. Neither transformation improved the RMSE; therefore, OK was performed on the original, untransformed data.
Figure 2. Semivariogram and optimal fitted model (i.e., spherical) of the OK method for the 260 stations.
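For readers unfamiliar with the spherical model, the sketch below evaluates its standard form: the semivariance rises from the nugget, is approximately linear near the origin, and levels off at the sill once the separation distance reaches the range. The function name and the nugget, partial sill, and range values are hypothetical; they are not the parameters fitted in this study.

```python
import numpy as np

def spherical_semivariogram(h, nugget, partial_sill, range_):
    """Standard spherical semivariogram model.

    gamma(h) = nugget + partial_sill * (1.5*(h/a) - 0.5*(h/a)**3)  for 0 < h <= a
    gamma(h) = nugget + partial_sill                                for h > a
    gamma(0) = 0
    where a is the range.
    """
    h = np.asarray(h, dtype=float)
    ratio = np.clip(h / range_, 0.0, 1.0)  # caps the curve at the sill beyond a
    gamma = nugget + partial_sill * (1.5 * ratio - 0.5 * ratio ** 3)
    return np.where(h > 0, gamma, 0.0)

# Hypothetical parameters for illustration (not the fitted values from Figure 2).
lags_km = np.array([0.0, 5.0, 10.0, 20.0, 40.0])
print(spherical_semivariogram(lags_km, nugget=100.0, partial_sill=900.0, range_=20.0))
```

Near the origin the cubic term is negligible, so the curve has a slope of about 1.5 × partial sill / range, which is the linear behavior referred to above.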
Results
The RMSE was used in this study to evaluate the accuracy of both interpolation methods. RMSE is a commonly used cross-validation statistic that measures prediction accuracy; its value approximates the average deviation of the predicted values from the measured values (Esri, 2024). In a typical cross-validation, the original sample is randomly partitioned into two datasets, with one used to train a model and the other used to validate the model (Wang et al., 2014). The overall cross-validation results (scatter plots of predicted values versus measured values) for the island of Hawai'i in May are shown for both methods in Figure 3, indicating that OK outperformed IDW. The overall RMSE cross-validation statistics for IDW and OK were 41.67 mm and 33.12 mm, respectively. Although both methods showed a similar spatial pattern of rainfall, predicting the most rainfall at lower elevations in the east, the rainfall surface produced by OK was generally smoother than that produced by IDW (Figure 4). This is because the smoothness of an IDW surface depends directly on the values of the surrounding sample points (Munyati & Sinthumule, 2021). Because a smaller neighborhood size was selected for IDW (n = 10), it is reasonable that OK produced the smoother surface.
Figure 3. Cross-validation results for the island of Hawai'i in the month of May for both methods tested.
Figure 4. Final rainfall maps for the island of Hawai'i in the month of May created by both methods tested.
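The RMSE statistic reported above can be sketched as follows. This example assumes a leave-one-out scheme, in which each station is withheld in turn and predicted from the remaining stations; that scheme, the helper names, the simple all-points IDW predictor, and the synthetic data are all assumptions made for illustration and do not reproduce the study's values.

```python
import numpy as np

def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    measured = np.asarray(measured, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((predicted - measured) ** 2)))

def leave_one_out_predictions(xy, z, predict):
    """Predict each station from all the others (assumed leave-one-out scheme)."""
    xy = np.asarray(xy, dtype=float)
    z = np.asarray(z, dtype=float)
    preds = np.empty(len(z))
    for i in range(len(z)):
        keep = np.arange(len(z)) != i  # mask that withholds station i
        preds[i] = predict(xy[keep], z[keep], xy[i])
    return preds

def simple_idw(xy_known, z_known, query, power=2.0):
    """Compact all-points IDW predictor, used here only as a stand-in model."""
    d = np.linalg.norm(xy_known - query, axis=1)
    w = 1.0 / np.maximum(d, 1e-12) ** power
    return np.sum(w * z_known) / np.sum(w)

# Synthetic stations with an east-west rainfall gradient plus noise (illustrative).
rng = np.random.default_rng(0)
xy = rng.uniform(0.0, 100.0, size=(50, 2))
z = 100.0 + 2.0 * xy[:, 0] + rng.normal(0.0, 5.0, size=50)

preds = leave_one_out_predictions(xy, z, simple_idw)
print(f"Leave-one-out RMSE: {rmse(z, preds):.2f} mm")
```

Passing the predictor as a function keeps the cross-validation loop independent of the interpolation method, so the same loop could score any interpolator that accepts known coordinates, known values, and a query location.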
Discussion
Previous research suggests that the two major factors influencing interpolation quality are the frequency of rain and the stationarity of the precipitation distribution (Chen et al., 2009). The monthly climate of the Hawaiian Islands contains a great deal of diversity in a very small area (Giambelluca et al., 2013), and because of this, the assumption of second-order stationarity is not met (Nalder & Wein, 1998). However, results in this study show that the OK method still performs better than IDW despite this non-stationarity. The overall RMSE was 41.67 mm for IDW and 33.12 mm for OK, and the rainfall surface produced by OK was generally smoother than that produced by IDW. Although RMSE was not provided in the Frazier et al. (2015) study, results in this study agree that OK is the optimal method in the island of Hawai'i, even when compared with a widely used deterministic method. Finally, it is acknowledged that the accuracy of this interpolation may be limited by the smaller number of sample points used in both methods. Frazier et al. (2015) included 348 rain gauge stations in the island of Hawai'i, but because available data were sparse, this study included 88 fewer stations. OK is therefore considered the overall optimal interpolation method irrespective of small reductions in sample size.
Conclusion
Mean May rainfall maps for the island of Hawai'i were produced for the period 1970–2007 using IDW and OK. Based on the cross-validation results, OK was found to outperform IDW, which is in agreement with other precipitation studies (Frazier et al., 2015; Chen et al., 2009). Thus, geostatistical methods are found to be appropriate in the Hawaiian Islands despite the potential impact of non-stationarity from steep rainfall gradients. Further research could investigate whether environmental variables not yet tested (e.g., direction of winds or slope orientation) might improve performance in more advanced geostatistical methods such as KED.
References
Borges, P. D. A., Franke, J., da Anunciação, Y. M. T., Weiss, H., & Bernhofer, C. (2016). Comparison of spatial interpolation methods for the estimation of precipitation distribution in Distrito Federal, Brazil. Theoretical and Applied Climatology, 123, 335-348.
Buytaert, W., Celleri, R., Willems, P., De Bievre, B., & Wyseure, G. (2006). Spatial and temporal rainfall variability in mountainous areas: A case study from the south Ecuadorian Andes. Journal of Hydrology, 329(3-4), 413-421.
Chaubey, I., Haan, C. T., Salisbury, J. M., & Grunwald, S. (1999). Quantifying model output uncertainty due to spatial variability of rainfall. JAWRA Journal of the American Water Resources Association, 35(5), 1113-1123.
Chen, D., Ou, T., Gong, L., Xu, C. Y., Li, W., Ho, C. H., & Qian, W. (2009). Spatial interpolation of daily precipitation in China: 1951–2005. Advances in Atmospheric Sciences, 27, 1221-1232.
Esri. (2024). Using cross validation to assess interpolation results. Available at: https://pro.arcgis.com/en/pro-app/latest/help/analysis/geostatistical-analyst/performing-cross-validation-and-validation.htm.
Frazier, A. G., Giambelluca, T. W., Diaz, H. F., & Needham, H. L. (2016). Comparison of geostatistical approaches to spatially interpolate month-year rainfall for the Hawaiian Islands. International Journal of Climatology, 36, 1459-1470.
Giambelluca, T. W., Nullet, M. A., & Schroeder, T. A. (1986). Rainfall atlas of Hawaii. Department of Land and Natural Resources, State of Hawaii.
Juvik, S. P., Juvik, J. O., & Paradise, T. R. (Eds.). (1998). Atlas of Hawai'i. University of Hawaii Press.
Krige, D. G. (1966). Two-dimensional weighted moving average trend surfaces for ore-evaluation. J. South Afr. Inst, 66, 13-18.
Leopold, L. B., Landsberg, H., Stidd, C. K., Yeh, T. C., Wallén, C. C., Carson, J. E., ... & Leopold, L. B. (1951). The geographic distribution of average monthly rainfall, Hawaii. On the Rainfall of Hawaii: A Group of Contributions, 24-33.
Li, J., & Heap, A. D. (2014). Spatial interpolation methods applied in the environmental sciences: A review. Environmental Modelling & Software, 53, 173-189.
Luo, W., Taylor, M. C., & Parker, S. R. (2008). A comparison of spatial interpolation methods to estimate continuous wind speed surfaces using irregularly distributed data from England and Wales. International Journal of Climatology: A Journal of the Royal Meteorological Society, 28(7), 947-959.
Munyati, C., & Sinthumule, N. I. (2021). Comparative suitability of ordinary kriging and Inverse Distance Weighted interpolation for indicating intactness gradients on threatened savannah woodland and forest stands. Environmental and Sustainability Indicators, 12, 100151.
Nalder, I. A., & Wein, R. W. (1998). Spatial interpolation of climatic normals: Test of a new method in the Canadian boreal forest. Agricultural and Forest Meteorology, 92(4), 211-225.
Sun, Y., Kang, S., Li, F., & Zhang, L. (2009). Comparison of interpolation methods for depth to groundwater and its temporal and spatial variations in the Minqin oasis of northwest China. Environmental Modelling & Software, 24(10), 1163-1170.
Tabios III, G. Q., & Salas, J. D. (1985). A comparative analysis of techniques for spatial interpolation of precipitation. JAWRA Journal of the American Water Resources Association, 21(3), 365-380.
Teegavarapu, R. S., Tufail, M., & Ormsbee, L. (2009). Optimal functional forms for estimation of missing precipitation data. Journal of Hydrology, 374(1-2), 106-115.
Wagner, P. D., Fiener, P., Wilken, F., Kumar, S., & Schneider, K. (2012). Comparison and evaluation of spatial interpolation schemes for daily rainfall in data scarce regions. Journal of Hydrology, 464, 388-400.
Wang, S., Huang, G. H., Lin, Q. G., Li, Z., Zhang, H., & Fan, Y. R. (2014). Comparison of interpolation methods for estimating spatial distribution of precipitation in Ontario, Canada. International Journal of Climatology, 34(14).