Seminar Analysis & Modeling
Course description
The course 'Analysis & Modeling' was held as a seminar by Prof. Strobl and featured a wide range of possible topics and case studies. Working either individually or in groups of two, we had to work on one of those topics and present our results at the end of the semester. Our group decided to work on 'precipitation interpolation with co-Kriging', a highly specialized and complex topic.
Approach
1. Collecting data
As a first step we had to collect precipitation data from at least 150 stations to get plausible results. From the start it was clear that we wanted to focus on the state of Salzburg. That meant we also had to find data from the areas around Salzburg, so that the interpolation would remain plausible near the border. The collected data covers Tyrol, East Tyrol, Salzburg, Carinthia, Upper Austria, the western part of Styria, the southern part of Bavaria and the northern part of South Tyrol in Italy. All in all, we gathered data from 172 weather stations. The first place to go is of course the 'Zentralanstalt für Meteorologie und Geodynamik' (ZAMG) in Austria. In addition, we had a look at the website of the government of South Tyrol and at the website klimadiagramme.de.
The collection of precipitation data was one thing. As we also wanted to include elevation data in our calculations, we had to find a digital elevation model (DEM). In the end we decided to use NASA's SRTM data, which was acquired during the mission 'STS-99' in February 2000 in order to obtain a digital elevation model for most regions of the world. This data is freely available at a ground resolution of 90 meters, which is fine enough for our project.
2. Data exploration
Before starting the actual analysis, it is important to gain an idea of the structure of the data in order to be able to validate the results. If you do not know what your data looks like, you cannot be sure whether your results are plausible. Our data exploration consisted of three steps:
- histograms
- Voronoi maps
- semivariogram cloud
The picture to the left was calculated with ArcGIS 9 and shows the histogram of our 172 measurement points. A histogram is simply a frequency distribution, with the number of occurrences on the y-axis and the precipitation values on the x-axis. The statistics in the top right corner show a range from 582 mm to 2165 mm of precipitation. Most values cluster around 800 to 900 mm. The histogram is used to get a first impression of the data.
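The binning behind such a histogram can be sketched in a few lines of Python. This is only an illustration: the precipitation values and the bin width of 100 mm are made-up assumptions, not the project's data or the ArcGIS settings.

```python
from collections import Counter

def histogram(values, bin_width=100):
    """Count how many values fall into each bin; keys are the lower bin edges."""
    counts = Counter()
    for value in values:
        lower_edge = int(value // bin_width) * bin_width
        counts[lower_edge] += 1
    return dict(sorted(counts.items()))

# Hypothetical annual precipitation totals in mm (NOT the 172 station values).
sample_mm = [582, 640, 810, 835, 870, 890, 905, 1240, 2165]
bin_width = 100
hist = histogram(sample_mm, bin_width)

# Print a simple text histogram: one '#' per station in the bin.
for lower_edge, count in hist.items():
    print(f"{lower_edge:>4} to {lower_edge + bin_width:>4} mm: {'#' * count}")
```

The longest bar marks the bin in which most stations fall, which is the kind of first impression the histogram is meant to give.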
The next step is the creation of a Voronoi map, also known as Thiessen polygons. The basic idea is to get a visual impression of the influence of each point on its surrounding area. The most straightforward approach is to link two points and construct the perpendicular bisector of the connecting line. If you link all point pairs with each other, construct all perpendicular bisectors and intersect the resulting half-planes, the result is a Voronoi map.
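The bisector construction has a simple consequence: every location inside a polygon is closer to that polygon's station than to any other station. A minimal sketch of this nearest-station rule on a coarse grid is shown here. The station names and coordinates are hypothetical, and this is not how ArcGIS builds the polygons internally.

```python
import math

def nearest_station(point, stations):
    """Return the name of the station closest to `point` (Euclidean distance)."""
    return min(stations, key=lambda name: math.dist(point, stations[name]))

# Hypothetical station coordinates in km on a flat plane.
stations = {"A": (0.0, 0.0), "B": (10.0, 0.0), "C": (0.0, 10.0)}

# Label every cell of an 11 x 11 grid with its nearest station.
# Cells that share a label approximate one Voronoi (Thiessen) polygon.
grid = [[nearest_station((x, y), stations) for x in range(11)]
        for y in range(11)]

# Print with y increasing upwards, like a map.
for row in reversed(grid):
    print("".join(row))
```

Cells that are exactly equidistant from two stations lie on a bisector. In this sketch they are simply assigned to whichever station comes first in the dictionary.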
The next step in our data exploration, the semivariogram cloud, already leads towards geostatistical interpolation methods. It is a pairwise comparison of all sample points, plotted with the distance between the two points on the x-axis and the semivariance (colloquially, the "dissimilarity" of the values) on the y-axis. Because every point is compared with every other point, the semivariogram cloud contains quite a lot of points, namely 14,706 for our 172 stations. Each pair is counted only once: when we compare point A with point B, we do not need to compare point B with point A as well, as the result would be the same. The prefix "semi" (from Greek, "half") refers to the semivariance itself, which is defined as half the squared difference between the two values. The picture above already gives some hints about the autocorrelation between the points. High autocorrelation shows up where both the distance and the semivariance are low, that is, in the lower left corner of the graph. The opposite holds for low autocorrelation, which we find in the upper right corner, where both the distance and the semivariance are high. The semivariogram cloud is used for "fine-tuning" and for digging deeper into autocorrelation.
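To make the pairwise comparison concrete, here is a small sketch that builds a semivariogram cloud from a list of points. It assumes the usual definition of the semivariance of a pair as half the squared difference of the two values. The three sample points are invented for illustration and are not project data.

```python
import math
from itertools import combinations

def semivariogram_cloud(points):
    """points: list of (x, y, z) tuples.

    Returns one (distance, semivariance) tuple per unordered pair of points.
    """
    cloud = []
    for (x1, y1, z1), (x2, y2, z2) in combinations(points, 2):
        distance = math.hypot(x2 - x1, y2 - y1)
        semivariance = 0.5 * (z1 - z2) ** 2
        cloud.append((distance, semivariance))
    return cloud

# Hypothetical stations: (x in km, y in km, precipitation in mm).
points = [(0, 0, 800), (3, 4, 900), (6, 8, 1200)]
cloud = semivariogram_cloud(points)
for distance, semivariance in cloud:
    print(f"distance {distance:5.1f} km, semivariance {semivariance:8.1f}")

# Number of unordered pairs for 172 stations: 172 * 171 / 2.
pair_count = math.comb(172, 2)
print(pair_count)
```

The last line reproduces the number of points in the cloud for 172 stations, since each unordered pair contributes exactly one point.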
3. Interpolation with Co-Kriging
"Kriging is an advanced geostatistical procedure that generates an estimated surface from a scattered set of points with z-values. More so than other interpolation methods supported by ArcGIS Spatial Analyst, a thorough investigation of the spatial behavior of the phenomenon represented by the zvalues should be done before you select the best estimation method for generating the output surface" Source ARCIS RESOURCE CENTER 2011
The method belongs to the so-called geostatistical interpolation methods, and the basic idea is to discover patterns and properties in the measured values and then apply them. The semivariogram describes the theoretical spatial correlation, while Kriging estimates the spatial distribution of the observed data. Co-Kriging is basically the same as Kriging, but it lets you take additional variables into account. In our case these additional variables are the elevation values of the SRTM dataset. Using additional variables allows for a better prediction of the dependent variable, which in our case is the unknown precipitation value between the known measured values.
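In symbols, a common textbook form of the ordinary co-Kriging estimator is a weighted sum over both variables. The notation here is an assumption chosen for illustration and is not taken from the project or from ArcGIS: Z_1 is precipitation at the n stations, Z_2 is elevation at m DEM locations, s_0 is the location to be estimated, and the weights lambda_i and mu_j are derived from the fitted semivariograms and cross-semivariogram.

```latex
\hat{Z}_1(s_0) = \sum_{i=1}^{n} \lambda_i \, Z_1(s_i) + \sum_{j=1}^{m} \mu_j \, Z_2(s_j),
\qquad \sum_{i=1}^{n} \lambda_i = 1, \quad \sum_{j=1}^{m} \mu_j = 0
```

The two constraints on the weights keep the estimate unbiased. Without the second sum, the expression reduces to ordinary Kriging of precipitation alone.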
4. Result