My previous post discussed Doug Keenan's climate contest. I wondered how accurate we could actually expect to be in such a situation. I assume that the temperature series is a simple random walk, possibly with a constant drift term. We want to see how accurately we can determine whether there is a drift term in the random walk or not.
So, again just using Excel, I created 1000 series of 134 observations, each distributed as Normal(mu, 0.11), where mu is the drift term. For 250 series I set mu to 0.01, for 250 series I set it to -0.01, and for 500 I set it to 0. I then computed the usual t-test for the significance of the sample mean for each series.
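For readers who would rather not use Excel, the same experiment can be sketched in plain Python. This is not the spreadsheet I used, only an approximation of it under stated assumptions: 0.11 is taken to be the standard deviation of each observation, the t critical value is replaced by its normal approximation (reasonable at 133 degrees of freedom), and the names `simulate` and `summarize` are made up for this illustration.

```python
import math
import random
import statistics

N_OBS = 134    # observations per series, as in the post
SIGMA = 0.11   # ASSUMPTION: 0.11 is the standard deviation, not the variance


def t_statistic(xs):
    """One-sample t statistic for the null hypothesis that the mean is zero."""
    n = len(xs)
    return statistics.fmean(xs) / (statistics.stdev(xs) / math.sqrt(n))


def simulate(seed=0):
    """Return a list of (mu, t_statistic) pairs, one per simulated series."""
    rng = random.Random(seed)
    drifts = [0.01] * 250 + [-0.01] * 250 + [0.0] * 500
    results = []
    for mu in drifts:
        xs = [rng.gauss(mu, SIGMA) for _ in range(N_OBS)]
        results.append((mu, t_statistic(xs)))
    return results


def summarize(results, alpha):
    """Count two-sided rejections at level alpha.

    Uses the normal approximation to the t critical value (an assumption
    of this sketch, not something taken from the spreadsheet).
    """
    crit = statistics.NormalDist().inv_cdf(1 - alpha / 2)
    n_null = sum(1 for mu, _ in results if mu == 0)
    hits = sum(1 for mu, t in results if mu != 0 and abs(t) > crit)
    false_alarms = sum(1 for mu, t in results if mu == 0 and abs(t) > crit)
    correct = hits + (n_null - false_alarms)
    return hits, false_alarms, correct


if __name__ == "__main__":
    results = simulate()
    for alpha in (0.05, 0.10):
        hits, false_alarms, correct = summarize(results, alpha)
        print(f"alpha={alpha:.2f}: rejected with drift={hits}, "
              f"rejected without drift={false_alarms}, correct={correct}")
```

The counts will differ from run to run and from my spreadsheet, since they depend on the random draws; changing `seed` gives a feel for how much they move around.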
Only 127 of the 1000 t-tests were significant at the 5% level and 201 at the 10% level. Using a 10% significance level, statistical power (the rate of correct rejection of the false null hypothesis of no drift) is 29%. Using a 5% significance level, power is 20%. There is no distortion of the actual "size" of the test, that is, the rate of incorrect rejections of a true null.
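So, combining this information, if you use this method and a 10% significance level you will get about 595 correct classifications out of 1000 of whether a random walk has a drift or not: roughly 145 of the 500 series with drift are correctly flagged (29% power) and roughly 450 of the 500 series without drift are correctly left alone (10% size). That is far below the 900 required to win the contest.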
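As a rough cross-check on these figures, the power of the test can also be approximated analytically rather than by simulation. The sketch below assumes that 0.11 is the standard deviation and treats it as known, so it uses a z-test in place of the t-test; the function names `approx_power` and `expected_correct` are hypothetical and exist only for this example.

```python
import math
from statistics import NormalDist


def approx_power(mu, sigma, n, alpha):
    """Approximate two-sided power for testing mean == 0.

    ASSUMPTION: sigma is known, so the normal distribution stands in
    for the t distribution.
    """
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)
    shift = abs(mu) / (sigma / math.sqrt(n))  # standardized drift
    # Probability of landing in the upper or the lower rejection region.
    return (1 - z.cdf(crit - shift)) + z.cdf(-crit - shift)


def expected_correct(power, alpha, n_drift=500, n_null=500):
    """Expected correct classifications if the test's size equals alpha."""
    return n_drift * power + n_null * (1 - alpha)


if __name__ == "__main__":
    for alpha in (0.05, 0.10):
        power = approx_power(0.01, 0.11, 134, alpha)
        print(f"alpha={alpha:.2f}: power={power:.3f}, "
              f"expected correct={expected_correct(power, alpha):.0f}")
```

Under these assumptions the approximate power comes out at about 18% for the 5% level and about 28% for the 10% level, in line with the simulated 20% and 29%, and the expected number of correct classifications stays close to 590. The drift is simply too small relative to the noise for 134 observations to reveal it reliably.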
Of course, it seems that Keenan's data is a bit more complicated than this and may or may not have any relevance to the actual nature of climate data or the nature of the climate change problem.
You can download my data here. The first column is the drift term used and the first row indicates the years and the statistics columns.