Monday, December 24, 2012

Merry Christmas, and plans for the New Year

Merry Christmas to all, and hopefully, relief from floods and snow. Different problems here.

Last year I produced a graphics gallery at Christmas, and I was hoping to have an update here. But there are some new things coming, so I'll do it when they are out. They are mostly to do with my new enthusiasm for large data sets selectively downloaded in response to user requests (via XMLHTTPRequest). I tried here combining the Google Maps survey of stations with the machinery of the climate plotter. I'd like to do the same with the globe plots. I'll add globe plots of GHCN temperature for at least a century of year averages, and over decades, and combine that with the trend data to make a universal globe plot.

I'm also thinking about how to put monthly data in the climate plotter. There are issues about handling seasonality, and there are the mechanics of keeping it updated. But I think that can be mechanised.

I'll also do some analysis of the ISTI data when it is out of beta.

Wishing you all well for the New Year.

Saturday, December 15, 2012

November GISS Temp unchanged from October

This isn't breaking news - I'm running late this month. But I wanted to give the usual comparison with TempLS. The GISS land/sea monthly anomaly was 0.68°C in November; October had been readjusted down to this after an initial 0.69°C. Time series and graphs are shown here.

As usual, I compare the previously posted TempLS distribution to the GISS plot.

Here is GISS:

And here is the previous TempLS spherical harmonics plot:


Previous Months

December 2011
August 2011

More data and plots

Friday, December 14, 2012

Universal station locator and history plotter

This is next in the series of things that can be done with XMLHTTPRequest. It merges the capability of the Google Maps display of stations with the machinery of the climate plotter. But the key new thing is that the station location information can be backed up with a store of temperature histories, which can be plotted on demand.

So what we have is a map which allows you to choose a category of stations to show with markers. The usual Google Maps interactivity works. You can choose from different data sets - currently there are GHCN, CRUTEM 4 and the new (and beta) ISTI. BEST will be there soon [Update - it's there now]. Mousing over the markers shows the name. But if you click, you not only get station information as before, but a plot of the annual temperatures in the record (for that data set). You can add to the plot to compare, say, the records of different data sets. Or you can compare GHCN adjusted with unadjusted, or stations at different locations. And you can smooth and regress. There is an information window that shows the numbers.

How to use it - choosing stations

The map controls are in the table bottom right. Families of controls are indicated by background color. The marker buttons in the top row (None, yellow, pink) are the ones that create actions, in line with the current state of the other selection buttons.

The world is divided into regions, because the larger sets (BEST and ISTI) will make everything about the map very slow if everything is shown. I'd suggest beginning by choosing a single region. The region numbering is shown by a small map under the plot space. You should also choose one or more datasets - GHCN is shown by default, but can be unset. Gadj means GHCN adjusted. You'll probably only want one station for each color. BEST is coming, but not there yet.

You can also choose a subset of those stations. You have to unset the All checkbox, and set the checkbox of the choices you want. The inequality buttons toggle.

When you have chosen a color, markers representing your choices will appear. The choice "None" makes them disappear - often useful. Mouseover the markers to see the names, and when you click, detailed information will appear in the frame bottom left, and a curve will be plotted, or added to the plot. A handle for the curve will be added to the section headed.

Managing the plot

The x and y axes are active. You can click on the pink bars to translate. The step is equal to the distance from the green marker in the middle. The blue bars translate too, but with a fixed point, so the scale changes. On the y-axis, the top stays still, and the bottom point translates (and in between proportionately). For the x-axis, it's the right end that stays still.

There is a column of controls headed Prop/value. To use these you set a value on the right, select handles of curves that you want to apply the change to on the left, and then click the prop button to make it happen. For regression you choose type and years (all is default). Colors you can choose from the vertical bar between plot and map. The offset is not incremental - it is what it shows.

There are more usage details on the climate plotter page.


The plot shows annual values. These were taken from the monthly data by averaging (unweighted by days). Missing values were infilled by the monthly average for that station. If more than three months were missing, the year was omitted. I've also omitted sites with less than three eligible years.
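The annual averaging rule just described can be sketched as follows. This is only an illustration of the stated rule, not the code behind the plotter; the function name, the dict-of-lists input layout and the parameter names are assumptions.

```python
def annual_means(monthly, max_missing=3, min_years=3):
    """Annual means for one station (illustrative sketch).

    monthly: dict mapping year -> list of 12 monthly values (None = missing).
    Returns dict year -> annual mean, or {} if too few eligible years.
    """
    # Long-term average for each calendar month at this station,
    # used to infill missing months.
    clim = []
    for m in range(12):
        present = [vals[m] for vals in monthly.values() if vals[m] is not None]
        clim.append(sum(present) / len(present) if present else None)

    annual = {}
    for year in sorted(monthly):
        vals = monthly[year]
        n_missing = sum(1 for v in vals if v is None)
        if n_missing > max_missing:
            continue  # more than three months missing: omit the year
        filled = [clim[m] if vals[m] is None else vals[m] for m in range(12)]
        if any(v is None for v in filled):
            continue  # no monthly average available to infill with
        annual[year] = sum(filled) / 12.0  # unweighted by days in the month

    # Stations with fewer than three eligible years are dropped entirely.
    return annual if len(annual) >= min_years else {}
```

A year with one missing January, for example, gets that January replaced by the station's average January before the twelve values are averaged.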

Wednesday, December 12, 2012

November TempLS Global Temp down 0.02°C

I see GISS has already posted (no change) - you have to be early to get ahead of them lately. But I'll produce the normal pair of posts with TempLS results and then comparison.

The TempLS analysis, based on GHCNV3 land temperatures and the ERSST sea temps, showed a monthly average of 0.52°C for November, down from 0.54 °C in October. I had reported 0.52 °C for October, but late data raised it a bit. These are small changes. There are more details at the latest temperature data page.

Below is the graph (lat/lon) of temperature distribution for November. I've also included a count and map of the stations that have reported to this date.

This spherical harmonics plot is done with the GISS colors and temperature intervals, and as usual I'll post a comparison when GISS comes out.

And here, from the data page, is the plot of the major indices for the last four months:

Here is the map of 3334 "stations" which contributed to this report. That's even lower than last month. So I probably couldn't have done this analysis any earlier.

Saturday, December 8, 2012

TempLS correlation with other indices.

Since June 2011 I've been posting monthly TempLS global averages, before the other surface indices appear. The purpose of this haste is partly to see how well it performs in comparison, uninfluenced by "peeking". Here is a recent monthly comparison, with links to earlier months. I post the data here.

So it's now time for a review on how well TempLS tracks. Along the way, I found some interesting results on how the main indices track each other.

Data plot

The data sources are:
HADCrut 4
Gistemp Land/Ocean
NOAA Global Land Ocean
RSS MSU Lower Troposphere
UAH Lower Troposphere
and TempLS. The data is tabulated

So here's a plot of the indices for those 17 months, set to a common anomaly base period of 1979-2000. Generally the surface-based (non-satellite) indices follow each other pretty closely:
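Putting indices on a common base period amounts to subtracting, for each calendar month, that index's own mean over the base years. A minimal sketch, assuming each index is held as a dict keyed by (year, month); the function name and data layout are hypothetical, and the post does not say whether the offsets were computed per month or as a single mean:

```python
def rebase(series, start=1979, end=2000):
    """Re-express an anomaly series relative to its own start..end mean.

    series: dict mapping (year, month) -> anomaly in deg C.
    Offsets are computed separately for each calendar month (an assumption).
    """
    sums = {}
    for (year, month), value in series.items():
        if start <= year <= end:
            total, count = sums.get(month, (0.0, 0))
            sums[month] = (total + value, count + 1)
    offsets = {month: total / count for month, (total, count) in sums.items()}
    # Months with no data in the base period cannot be rebased and are dropped.
    return {
        (year, month): value - offsets[month]
        for (year, month), value in series.items()
        if month in offsets
    }
```

After this step every index has zero mean over 1979-2000, so differences between indices are not contaminated by their different native base periods.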

Now to show more detail of the differences, I'll plot the monthly differences between TempLS and the others. I'll arbitrarily zero the plots in a staggered way to make a point:

Now it becomes clearer. TempLS tracks NOAA very well, HADCrut 4 a little less, GISS less again, and the lower troposphere indices rather poorly.

There is, of course, a good reason for this. TempLS and NOAA use very similar datasets - GHCN land data, and ERSST. TempLS uses unadjusted GHCN, but there is very little adjustment in this time frame.


I wanted to see also how the other indices track each other, and to give a statistically testable measure. An obvious one is just the standard deviation of the scatter seen in the figure above. Here is a table of that measure for each pairing:

Standard Deviations of differences (°C)


The differences are marked - 0.0183°C for NOAA vs 0.0653°C for GISS, relative to TempLS.
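For reference, the scatter measure used here is simple to compute. The sketch below uses the sample standard deviation; the post does not say whether the sample or population form was used, so that choice, like the function name, is an assumption.

```python
import statistics

def sd_of_differences(index_a, index_b):
    """Standard deviation of the month-by-month difference of two indices.

    index_a, index_b: equal-length sequences of monthly anomalies (deg C)
    covering the same months. A constant offset between the two indices
    (for example from different base periods) does not affect the result.
    """
    if len(index_a) != len(index_b):
        raise ValueError("series must cover the same months")
    diffs = [a - b for a, b in zip(index_a, index_b)]
    return statistics.stdev(diffs)
```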

Another measure is the correlation coefficient ρ for the monthly changes. This has the advantage that it can be easily tested for significance, with the formula for t-value:

t = ρ*sqrt((n-2)/(1-ρ*ρ))
where n is number of months. As usual, t is significantly above zero at 95% confidence if it exceeds 1.96. Actually, the significance is diminished by autocorrelation etc. Still, in cases of interest it clears that level by a wide margin.
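As a worked illustration of the formula, here is a small sketch that takes two monthly series, forms the month-to-month changes, and returns ρ and t. The function names are hypothetical, and n is passed in explicitly so that it can be set as in the text.

```python
import math

def monthly_changes(series):
    """First differences: change from each month to the next."""
    return [b - a for a, b in zip(series, series[1:])]

def pearson(x, y):
    """Ordinary correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_value(rho, n):
    """t = rho*sqrt((n-2)/(1-rho*rho)), as in the formula above."""
    return rho * math.sqrt((n - 2) / (1.0 - rho * rho))
```

With n = 17, for instance, ρ = 0.5 gives t = 0.5*sqrt(15/0.75), about 2.24, which is just over the 1.96 threshold.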

Correlation coefficients of monthly changes


t-value of monthly changes


The correlation of TempLS with all the indices is significantly positive, although with GISS barely so, over this period.

Here's a graphical representation of the correlation. The circle areas are proportional to the t-value of the pairing. Big means close tracking. In fact, the area is proportional to ρ*sqrt(1/(1-ρ*ρ)); there's no difference for one plot, but it means that when I compare to different periods, the circles do not inflate with the longer period.
The best correlations are in fact between TempLS and HADCrut and NOAA, which likely indicates the commonality of their data sources. There is also quite good tracking between the satellite indices. It seems that the different methods used have less effect than the different data sets.
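The circle sizing can be written out explicitly. This is a sketch of the stated rule only (area proportional to ρ*sqrt(1/(1-ρ*ρ)), with the n-dependent factor left out); the function name and the scale parameter are assumptions, and negative correlations are simply rejected here.

```python
import math

def circle_radius(rho, scale=1.0):
    """Radius of a circle whose AREA is scale * rho / sqrt(1 - rho^2).

    Dropping the sqrt(n-2) factor of the t-value keeps circles for
    different period lengths on the same footing.
    """
    if not 0.0 <= rho < 1.0:
        raise ValueError("rho must lie in [0, 1)")
    area = scale * rho / math.sqrt(1.0 - rho * rho)
    return math.sqrt(area / math.pi)
```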

Longer periods

I looked at the 17 months for which TempLS made predictions. But comparisons between other indices are valid beyond that period. As indeed are comparisons with TempLS, because in calculating the monthly values I actually didn't peek.

The story is very similar. All the correlations are now highly significant. I'll just show below the circle plot for periods of five and ten years:

Correlations over 5 years | Correlations over 10 years
Correlation of TempLS and GISS seems better over the longer periods, and with NOAA not quite so good.



There are interesting patterns of correlation between the various temperature indices. Those using similar datasets correlate very well. GISS, which uses a more diverse set, behaves rather differently.

TempLS fits very well into the NOAA/HADCrut grouping.

Thursday, December 6, 2012

Using present expectation anomalies for station data.

As I foreshadowed in a recent post, for plotting recent monthly data I wanted to shift from anomalies based on a past period (1961-1990) to one based on the present. For each station and each month, I would use the present value of a weighted linear regression as the expectation, and the anomaly would be the deviation from that.

The reason was mainly that I suspected that irregular happenings in the history of the stations were distorting the anomaly base, and creating noise in the anomaly plot which isn't needed. In my most recent post I traced the prominent deviations due to Nitchequon and Shahr-e-kord to gaps in the record and noted big (and probably correct) adjustments made by GHCN.

I've done it, and the monthly maps now use this basis. I think it has been very successful in removing this source of error. Of course, it also means that the anomalies do not give any measure of AGW. For that the right source is the trend map.

Below the jump, I'll illustrate the improvement.

I haven't updated some earlier maps (June 2012, Nov 2011), so you can use these for comparison. Here is a snapshot of North America for June:

June 2012 using anomaly base 1961-1990 | June 2012 using anomaly from present estimate

Not only is the Nitchequon dip in Quebec gone, but the US is very much smoother. I have often commented previously how these plots seem less smooth in the US; that may well be due to a greater frequency of station changes. Anyway, it's much less true now.

It also shows more emphatically how spatially correlated are the changes in individual station monthly averages.

Wednesday, December 5, 2012

Visualizing the need for homogenization

I've put up two recent posts which show temperature results for individual stations using a shaded mesh. One shows monthly anomalies relative to 1961-1990, or 1975, and the other shows trends. There's an interesting spatial consistency, with exceptions.

The exceptions may be climate. But they may also be the effects of things happening to stations. This is what homogenization is designed to overcome, and I think there are some good illustrations here.

I usually use GHCN unadjusted readings, mainly because people like to argue over adjustments, and I think for the headline effects they don't make much difference. But these spatial plots show that they can, and it's probably for the good.


I mentioned in the monthly post the strange behaviour of this cold place in Quebec. Here are a couple of snapshots:

October 2012 | June 2012
You can see the big blue dip in NW Quebec. That's caused by Nitchequon. The plots are of anomalies wrt 1961-1990. The same thing is seen in most recent months. It always seems unusually cold in Nitchequon.

But it's likely that that is because it was never as warm (in 1975 etc) as we thought. Here is the unadjusted GHCN history:

You can see a lot of missing years from 1984 to 2000 and even later. I've omitted years with less than 9 months of data, but that isn't the issue. Most of those years had none at all. And there's a lot of scope for something to have changed during the gap.

The GHCN adjustment process picked this up. Here's what they have:

Those past temperatures have been adjusted down. That would stop Nitchequon standing out as a place of ongoing (relative) cold.

Ideally, I'd show you the plot with a recalculated anomaly base. But I haven't done that, because as foreshadowed, I'm moving away from using a past basis at all.

Update - see below for another example


Trends are more subject to measurement vagaries, especially long term trends. And this tends to show up as visible inhomogeneity. My trend plot now offers the option of using adjusted data.

Here is a picture of 30 year trends (to present) in N America

It's a bit more irregular in the US than Canada (this often seems to happen), but not too much. But going further back, to 1892, you get this:

Data is rather sparse in Canada now, but the US shows a lot of variability. How much is due to measurement vagaries?

Well, here are the corresponding adjusted plots:

Adjusted GHCN 1982-2011 | Adjusted GHCN 1892-2011

You can see that the shorter term makes not much difference, but the longer term smoothes a lot. This is of course not surprising - it's what homogenization should do. I'm just noting that it does, and there did appear to be a problem.

Still, it's not always like that. Here's what homogenization does in Europe for 1892-2011:

Unadjusted GHCN 1892-2011 | Adjusted GHCN 1892-2011

It wasn't bad before adjustment, and may be worse after.

No real conclusions here, though I do notice a pattern suggesting that inhomogeneity is more of a problem in the US. But you can try your own cases.

Update - another Nitchequon

I went looking for more examples. Some really stand out. Here are the last three months of a place listed as SHAHRE... in W Iran (It is the city of Shahr-e-kord).

October 2012 | Sep 2012 | Aug 2012

I've shown the last with mesh lines. The pattern continues back. And the cause is evident in the data, this time shown on one plot:

Again, big gaps in the data, and the base period adjusted down in GHCN.

Station trends - more

This is the second in the series of large datasets made available by XMLHTTPRequest. I had shown a globe map of station trends. I was limited to 3 time periods and even then there was some awkwardness because I had to use a single mesh to save download time.

Now I can do many periods, each with its own mesh. The resulting plot is shown below. All the periods end at present - I could do selected past periods too, but couldn't think of a scheme for preselecting.

I originally called this a cherrypicker's guide, because it picks out the locations where the trends have been negative. But it also puts it in proportion - there are more positive trends than negative. The color scheme often obscures that, because I center the rainbow colors on the midpoint of the data, which is often well above the zero trend, which is down among the blues. I think the spatial homogeneity is worth noting. Nearby stations tend to warm and cool together.

Anyway, the plot is below the jump. Or you can go here to see it in a separate window. As with monthly data, you can select different time ranges, ask to see the nodes and mesh - just refresh when you've selected. Click on the small map to reorient the globe. Click on the main globe to bring up the data for the nearest station.

Update: I've put up corresponding data using GHCN adjusted. Check the box and refresh to see it.

How it works

The flat map at top right is your navigator. If you click a point in that, the sphere will rotate so that point appears in the centre.
The buttons below allow modification. Set what you want, and press refresh. You can show stations, and the mesh, and magnify 2×, 4×, or 8× (by setting both). You can click again to unset (and press refresh).

Then you can click in the sphere. At the bottom on the right, the nearest station name, lat/lon and trend will appear. It's easier to do this with stations displayed.

Data details

These are as for the previous post.

Tuesday, December 4, 2012

On Anomalies for Stations

I've recently posted a map (one of a series) of monthly temperature anomalies for individual stations. I've been thinking about what kind of anomaly is really appropriate here.

Some skeptics don't like anomalies, and say only real temperatures should be plotted. But then the plot is dominated by the variations in altitude and latitude. In January it's cold in Moscow and warm in Booligal. We knew that. If you hear that it was 15°C in Rome last month, you'll ask "but what is it normally?".

You need the anomaly, because that's the real information in the month's readings. And a plot should show that. The anomaly is the difference between what is observed and what you expect.

But what expectation? More below the jump.

Indices like GISS and HADCrut use a thirty-year period to calculate averages on which to base anomalies. That's the expectation, so deviation from it includes global warming. That has to be related to a fixed period.

I used that for the map anomalies. There is a practical difficulty that a station with October 2012 readings and a substantial history may not have enough information in, say, 1961-1990. This is the period that I used. So a reasonable thing to do is to use other information and get a regression estimate for 1975. That will avoid bias from a warming trend.
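A regression estimate for 1975 can be sketched like this: fit a straight line to whatever years the station has for a given calendar month, and read off the fitted value at 1975, the midpoint of 1961-1990. The function name and argument layout are hypothetical, and ordinary (unweighted) least squares is assumed.

```python
def regression_value(years, temps, target=1975):
    """Value at `target` of the ordinary least squares line through
    (years, temps), e.g. all available Octobers for one station.
    """
    n = len(years)
    if n < 2:
        raise ValueError("need at least two years to fit a line")
    ybar = sum(years) / n
    tbar = sum(temps) / n
    slope = (
        sum((y - ybar) * (t - tbar) for y, t in zip(years, temps))
        / sum((y - ybar) ** 2 for y in years)
    )
    # The fitted line passes through (ybar, tbar).
    return tbar + slope * (target - ybar)
```

Because the fitted value is taken at the middle of the base period, a station whose record sits mostly after 1990 is not given a base that is biased warm by the trend.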

What do we really want?

The idea of that was that the anomaly will include global warming since 1975. And indeed, recent anomalies are mostly positive. However, this isn't obvious, because my graphing scheme shifts the color map relative to the range. So because of warming, small positive anomalies are shown with bluish colors. That would be the same whatever base period was used.

Looking at a monthly map, global warming isn't news. Even relative warming like that in the Arctic isn't new. Seeing a reddish Arctic month after month may not be what we need. Because it's all pushed into the upper color range, there isn't much new information.


One thing that I think is important in these plots is that you get an idea of spatial consistency. Where it's hot, most stations nearby are hot. The colors are fairly smooth. This is only true if the anomaly base is also consistent.

There is a station Nitchequon, in NW Quebec, which shows up with consistent low anomalies relative to neighbors. Otherwise Canada has mostly good consistency. I suspect the anomaly base is wrong. Nitchequon has a fairly long record, including quite a lot in 1961-90, but is missing many years from 1985 to 2005. Temperatures after the break are much lower than before. The adjusted version moves these later numbers way up. I'm using unadjusted GHCN. That's not so important in absolute terms, but, unadjusted, it does produce the marked dip in the plot.

Incidentally, I think the Nitchequon story does show how inhomogeneities can really stand out to be identified.


What does the expected value really mean? I could produce a value that allowed for ENSO, solar forcing etc. This might well be a lower variance estimate. But I think most users would expect to see those effects reflected in the anomalies, not removed from them. So there is a middle ground to be found.

My current thinking.

I think that I should plot anomalies relative to the current mean values (for month) with adjustment for trend. That would be the expected value. It has the advantage that it would avoid issues with past jumps, as at Nitchequon. And it does show the information that is new with each month.

I think the best way to do it is with a weighted least squares fit to a linear model, as with TempLS. I'd fit a model for each station:
The L's are offsets constant for each month (m) ("monthly averages") and J is a linear progression over years (y). The weighting would be an exponential decay back in time, with a time constant of maybe thirty years. This would give higher weight to recent data. The anomaly would be the residual.
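One reading of this description is a model of the form T(m,y) ≈ L_m + J*(y - now), fitted by weighted least squares. The sketch below follows that reading and is not the TempLS code: the function names, the input layout, and the choice to measure years from the present (so that each L_m is directly the present expected value for month m) are all assumptions. Rather than a general solver, it uses the fact that, for a given slope, the best offset for each month is a weighted mean, which leaves a single equation for J.

```python
import math

def fit_station(obs, now, tau=30.0):
    """Weighted least squares fit of T = L_m + J*(year - now).

    obs: list of (year, month, temp) for one station.
    Weights decay exponentially back in time with time constant tau (years).
    Returns (L, J): L maps month -> expected present value, J is deg/year.
    """
    weights = [math.exp(-(now - year) / tau) for year, _, _ in obs]

    # Weighted means of (year - now) and temperature within each month.
    sw, sx, st = {}, {}, {}
    for (year, month, temp), w in zip(obs, weights):
        sw[month] = sw.get(month, 0.0) + w
        sx[month] = sx.get(month, 0.0) + w * (year - now)
        st[month] = st.get(month, 0.0) + w * temp
    xbar = {m: sx[m] / sw[m] for m in sw}
    tbar = {m: st[m] / sw[m] for m in sw}

    # Slope from the month-centred data (the offsets are profiled out).
    num = sum(w * ((y - now) - xbar[m]) * (t - tbar[m])
              for (y, m, t), w in zip(obs, weights))
    den = sum(w * ((y - now) - xbar[m]) ** 2
              for (y, m, t), w in zip(obs, weights))
    J = num / den
    L = {m: tbar[m] - J * xbar[m] for m in sw}
    return L, J

def anomaly(L, J, now, year, month, temp):
    """Residual from the fitted expectation."""
    return temp - (L[month] + J * (year - now))
```

With this parametrisation the anomaly for the current month is just the reading minus L_m, and old jumps in the record matter less because their weight has decayed.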

I'll think about it a bit more, but I'll probably redo the data for the previous post.
Update - it has now been done as described here.

Monday, December 3, 2012

Monthly station surface temperature shown on globe

I've been discovering new things in Javascript. I have been much constrained by data download time. JS frowns on interactive downloading - you generally have to download all data initially, as part of the code. However, there is a newish feature, XMLHTTPRequest, which allows download in response to user choices (with restrictions on domains). This means I can make very large datasets available to select from. I've also found new ways of compacting them, which I'll write about later.

My initial exercise was the plot that I have sometimes shown for recent months (eg June). It's based solely on the data reported for that month (plus the anomaly base). But now you can select any month you like (currently only for this century). The data is downloaded when you ask, so there isn't a huge initial wait. It's a plot based purely on the station data for GHCN V3 unadjusted and ERSST. For SST a "station" is a 4°x4° lat/lon cell. A triangle mesh is fitted and used for color shading between stations.

As before, you can rotate the globe by selecting focus points on the top right map. You can magnify, display stations and mesh, and click to print numerical data (on the right). There are more details of that below.

The plot is below. You can also click here to see it in a separate tab/window. More discussion and user guidance follows.

How it works - details

The flat map at top right is your navigator. If you click a point in that, the sphere will rotate so that point appears in the centre.
The buttons below allow modification. Set what you want, and press refresh. You can show stations, and the mesh, and magnify 2×, 4×, or 8× (by setting both). You can click again to unset (and press refresh).

When you select a month/year, you also have to refresh. Using the navigator automatically refreshes.

Then you can click in the sphere. At the bottom on the right, the nearest station name and anomaly will appear. Lat/Lon and date are also shown. You may want to have stations displayed when you click.

Data details

Anomalies are relative to the 1961-1990 period. Where stations did not have enough data there, I took extra years and did a linear regression, and used the 1975 value.
Update - as foreshadowed in later posts, I've now switched to a weighted linear regression estimate of present month value as the basis for the anomaly. The weight function is an exponential with a time constant of thirty years. The results are smoother.

As mentioned, I'm using GHCN v3 for station data. I've downloaded late Nov 2012, and I probably won't update past months regularly, but I'll try to add future months as they appear. I'll update older data occasionally, since late stations will appear. I have taken a more conservative approach to GHCN - anything with a quality flag is not shown. That loses some good data, and I may review.

ERSST shows frozen sea as -1.8°C, the temperature below the ice. I've eliminated these readings, as they don't reflect climate.

The shading is not ideal, but is what HTML 5 provides. It gets two nodes in the triangle exactly right, and I've done the best I could with the third. Where there are big variations, you'll sometimes see nodes with adjacent shading which differs in some triangles. Usually the majority is correct.

Saturday, November 24, 2012

Climate data postings

For a couple of years I've been running a daily script to track new monthly temperature indices and also sea ice data. The script posts them on the data page.

I'm upgrading this - I now use wget to monitor the data sites. A by-product is that I have a listing of the times as posted by the source, and I thought I could add that to the site, newest first. I've put that table at the head of the list on the data page. Time zone is US Eastern (eg NY). Here's how it looks at the moment:

Date          Name                                     Size (Kb)  Source
Nov 22 08:56  GHCN V3 station temps                    12196      source
Nov 20 13:53  HadCRUT 4 land/sea temp anomaly          209        source
Nov 20 00:59  NOAA land/sea temp anomaly               29         source
Nov 19 15:42  GISS land/sea temp anomaly               16         source
Nov 18 00:06  HADSST2 mean                             4145       source
Nov 18 00:05  CRUTEM CRU global mean Station anomaly   30         source
Nov 09 07:48  UAH lower trop anomaly                   65         source
Nov 05 12:58  RSS-MSU Lower trop anomaly               37         source
Nov 04 21:21  ERSST NOAA SST grid                      3274       source

Indices are in black - spatial data (large) in brown. I can fairly easily add new sources to the list. I've chosen one representative file from each source, expecting that others of interest are likely to be posted at the same time. The list will be maintained on the data page. I also plan to post a history file (when there is some history). Some sources like GISS update their monthly postings frequently, so the date you see here may not be the first posting for the month.

Saturday, November 10, 2012

October GISS Temp up 0.08°C

The GISS land/sea monthly anomaly rose from 0.61°C in September to 0.69°C in October. This came out just as I posted TempLS, which was unusually early in the month for GISS. Furthermore, GHCN is rather tardy in its posting, so there are fewer stations than usual for the time of the month. TempLS went down by 0.02°C from September. Time series and graphs are shown here.

As usual, I compare the previously posted TempLS distribution to the GISS plot.

Here is GISS:

And here is the previous TempLS spherical harmonics plot:


Previous Months

December 2011
August 2011

More data and plots

October TempLS Global Temp down 0.02°C

The TempLS analysis, based on GHCNV3 land temperatures and the ERSST sea temps, showed a monthly average of 0.52°C for October, down from 0.54 °C in September. There are more details at the latest temperature data page.

Below is the graph (lat/lon) of temperature distribution for October. I've also included a count and map of the stations that have reported to this date.

This spherical harmonics plot is done with the GISS colors and temperature intervals, and as usual I'll post a comparison when GISS comes out.

And here, from the data page, is the plot of the major indices for the last four months:

Here is the map of 3514 "stations" which contributed to this report. Reporting seems to be slow this month - there are about 500 fewer than usual for this time.

Thursday, October 18, 2012

New ISTI dataset - duplicates

This is my third post on the new beta release of the ISTI temperature database. In the first post, a Google Maps display, I noticed a number of stations which appeared to be duplicates. So I thought I'd check more comprehensively.

I first ordered the inventory alphabetically by name. A complication here is that 430 have no name. Some still showed up as duplicates.

The next step was to collect pairs of adjacent stations whose data began in the same or adjacent year. Then I did a rough distance check and retained pairs for which the sum of lat and longitude differences (absolute value) was less than 1°. That's within about 110 km at most near the equator (1° of latitude is about 111 km), requiring greater closeness near the poles. In fact most pairs at this stage have near identical coordinates.
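The screening steps above can be sketched as follows. The field names (id, name, lat, lon, start) are hypothetical and are not the ISTI inventory's actual column names; the sketch also ignores longitude wrap-around at ±180°.

```python
def candidate_duplicates(stations, max_deg=1.0):
    """Flag alphabetically adjacent stations that look like duplicates.

    stations: list of dicts with keys "id", "name", "lat", "lon", "start"
    (start = first year of data). Returns a list of (id_a, id_b) pairs.
    """
    ordered = sorted(stations, key=lambda s: s["name"])
    pairs = []
    for a, b in zip(ordered, ordered[1:]):
        # Records must begin in the same or an adjacent year.
        if abs(a["start"] - b["start"]) > 1:
            continue
        # Rough closeness test: sum of absolute lat and lon differences.
        if abs(a["lat"] - b["lat"]) + abs(a["lon"] - b["lon"]) < max_deg:
            pairs.append((a["id"], b["id"]))
    return pairs
```

Because only neighbours in the sorted list are compared, the test is cheap, but it misses duplicates filed under different spellings.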

That left 1077 pairs. I've made a list as a zipped CSV file here.

There will be some missing. I suspect Vienna/Wien are duplicates, but are missed alphabetically. The two Trondheims I noticed are assigned coords too far apart. And of course, my test doesn't prove duplication - just flags for checking.

Tuesday, October 16, 2012

New ISTI temperature dataset - station trends map

This is my second post on the new beta release of the ISTI temperature database. The first post was a Google Maps interactive map of the stations. This time we have an interactive global trend map in the style I did for GHCN.

Again, the dataset is large, and takes a few seconds to load, so I have put it below the fold. It is a globe that shows individual station trends with shading on a triangular mesh. The shading color is accurate at the stations themselves. You can display the stations and mesh, and click to pick up the numerical information. There is a little navigator map that lets you reorient the globe as you wish. Maps are available for periods of 30, 45 and 60 years to present. You can magnify 2x, 4x or 8x.

I should emphasise that these are trends for individual stations - there is no modelling or smoothing, except for the triangle shading interpolation. I find that valuable in that it shows the spatial correlation (or lack of). I was interested to see if the larger ISTI set gave a similar result to GHCN.

I think it does. A notable feature of all these plots, whether for period averages or trends, is that the US seems more of a patchwork than ROW. This is of course partly the higher density of stations, but I think there really is less coherence. Perhaps there is a quality issue (associated with the large numbers).

Anyway, the map is below the jump. There is some further discussion of the methods in this plot from last November. Below the map I've written a little about the numerics.


How the trends are calculated.

The trends are calculated using monthly data over calendar years (so I didn't use 2012 months). There is a tricky issue with seasonality. If you take the trend of a year from autumn to autumn, monthly, you'll get an uptrend even if the year ends as it began. The calendar year is mostly winter to winter or summer to summer, so the effect is much diminished. I allowed for this using the method from TempLS. Instead of just one intercept, I fit (OLS) twelve monthly offsets (means) as well as the slope. This subtracts out the seasonal variation and gives the underlying trend.
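The seasonality effect, and the fix with twelve offsets, can be seen in a small sketch. It is an illustration under assumed names and data layout, not the TempLS code. The first function fits a single intercept and slope; the second fits one offset per calendar month plus a common slope, using the fact that the slope can then be computed from month-centred data.

```python
def naive_trend(obs):
    """OLS slope (per year) with a single intercept.
    obs: list of (year, month, temp), month in 1..12.
    """
    times = [y + (m - 1) / 12.0 for y, m, _ in obs]
    temps = [t for _, _, t in obs]
    n = len(obs)
    xbar, tbar = sum(times) / n, sum(temps) / n
    return (sum((x - xbar) * (t - tbar) for x, t in zip(times, temps))
            / sum((x - xbar) ** 2 for x in times))

def seasonal_trend(obs):
    """OLS slope (per year) with twelve monthly offsets and one slope."""
    groups = {}
    for y, m, t in obs:
        groups.setdefault(m, []).append((y + (m - 1) / 12.0, t))
    num = den = 0.0
    for pts in groups.values():
        xbar = sum(x for x, _ in pts) / len(pts)
        tbar = sum(t for _, t in pts) / len(pts)
        # Centring within each month removes that month's offset.
        num += sum((x - xbar) * (t - tbar) for x, t in pts)
        den += sum((x - xbar) ** 2 for x, _ in pts)
    return num / den
```

On a purely seasonal series with no long-term change, running from October to September, the single-intercept fit reports a positive trend while the twelve-offset fit reports essentially zero.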

Sea Temperatures

As usual I've added SST's by taking a published gridded set and putting artificial stations at the grid centres. Previously I've used ERSST, this time I used HADSST2. The emphasis is on the land data, but the mesh goes haywire without ocean nodes. The HAD grid is coarser, so the ocean trends are more patchy. Because of the arbitrary placement of stations at 5x5 grid centres, it can happen that they appear on land. Please excuse.

Update - There are some odd hotspots in southern oceans for the longer time periods. This is an artefact. It is caused by an arrangement I have which enables me to use the same mesh for all three time intervals (meshes take time to download). I use a single mesh with a node for every station with any admissible trend - a station has to have 80% of months reporting to be assigned a trend over a period. Where for a time period a station doesn't have a trend, I assign an interpolated value for coloring purposes. The station is not shown, so the effect should be a minor upset in the shading only. However, there are in those parts some stations which do not have any connected stations with trends (for that time interval - worst at 60 years). Then the interpolation goes wrong, and the shading shows artificial heat. I don't think it happens anywhere on land.

Monday, October 15, 2012

New ISTI Temperature database - stations in Google Maps with GHCN

A few days ago, Peter Thorne of NOAA noted in a comment that a new initiative, ISTI, with NOAA involvement, has released a large new database of surface temperatures. It's on a similar scale to BEST, and I'll do comparisons in due course. But as an initial step, I thought it would be useful to put together a Google Maps presentation, as I did for GHCN.

The beta release is here. I used the recommended merge (3 Oct version). The data combines TMIN, TMAX and TAVG (and is big!); I extracted the TAVG. There were 39430 stations in the inventory.

This is a big set for the GM application, so I've divided it into 8 regions with about 5000 each. You can look at them all at once if you like, but it will run very slowly. Selecting one or two regions is much better. There is a little map at the right of the display showing where they are. Because the data takes several seconds to download, I've put the map beneath the fold.

The idea of the map is that it shows stations with tags with information that you can pop up by clicking. But the main use is that you can filter by categories. You can choose ranges of start date (of data), end date, duration and altitude. The mechanics are that you make these selections, select regions, and then press one of the colors (for tag). What you've asked for will appear in that color, additional to what was there before. A special (and useful) color is invisible. The range choices combine with "or" logic, so you get "and" by making what you don't want go away. Because it is "or", you need to suppress the "All" button to make other choices. The buttons toggle. The middle columns with gt and lt signs also toggle.

I've included GHCN stations for comparison. There is a checkbox for each database. To compare you'll probably want to display in one color with one box ticked, and in another with the other box ticked (only one at a time). The green pin is useful here, in case of overlaps.

The map starts out blank, waiting for you to choose a region (or two). It's below the jump:


The map initially shows no markers. You need to select regions. I'd recommend starting with just one. If you click one of the marker colors, you'll then see a mass of markers in that region. You can filter some out with the invisible button - remember to toggle the "All" button when making selections. You could filter out, say, all stations starting after 1850. The selections are only operative if the left radio button is on.

You can ask for a different selection and a different color. It is the color request that creates actions. This second request doesn't erase markers already showing, though it will change the colors of those that qualify.

Note that choosing a small number of regions helps with performance, but other choices which reduce tags on screen do not help (the tags are there but invisible).

Worked example

I was curious about whether ISTI had more really old data than GHCN. As I found earlier, BEST v1 has a similar number of stations overall, but little new before 1850. So I did this:
  1. Set Region 6
  2. Unset All, set StartYr, and change the textbox to 1800.
  3. Under Actions, click Yellow.
  4. Unset ISTI, set GHCN
  5. Under Actions, click Pin.
So I see a lot of yellow tags in Europe, some with green pins, and some green pins on their own. Clicking on the pins brings up info including start dates. Some things I notice:
  • In UK, GHCN has Gordon Castle, Greenwich and Manchester. ISTI doesn't, but has a Central England, which may cover the last two.
  • Both have a Trondheim in the right place, with similar dates (ISTI 1761-2012, GHCN 1761-1981). But ISTI has TRONDHEIM_VAERNES, about 2.5° further E, from 1762-2011. Duplicate?
  • GHCN has Lund, 1753-1773. ISTI not.
  • ISTI has two Budapest records, both starting in 1780. GHCN has one.
  • ISTI has San Fernando, from 1786, GHCN not.
  • ISTI has two Prague records (PRAHA-KLEMENTINUM,PRAHA-RUZNY), one starting in 1771, the other in 1775. GHCN has Praha-Ruzyne.
  • ISTI has a record for Vienna and one for Wien. Both start in 1775.

So ISTI has a few extras in that period in Europe, some of which may be duplicates.

Saturday, October 13, 2012

September GISS Temp up 0.03°C

The GISS land/sea monthly anomaly rose from 0.57°C in August to 0.60°C in September. This was almost exactly the same rise as shown by TempLS. Time series and graphs are shown here

As usual, I compare below the previously posted TempLS distribution to the GISS plot.

Here is GISS:

And here is the previous TempLS spherical harmonics plot:


Thursday, October 11, 2012

Fancy graphics and climate data

Over the last two years I've been exploring various ways of using interactivity to make climate data more accessible and attractive. I've been trying to maintain a gallery, but that takes time, and the effort has been lagging.

So I want to write a post which just summarizes the techniques used, to act mainly as a catalogue with pointers.

Here is a list of headings:

  • KML, Google Earth
  • Spaghetti graphs
  • Trend plots
  • Earth projections
  • HTML 5 and triangle meshes
  • WebGL
  • Google Maps
  • Climate Plotter

KML, Google Earth

The first thing I tried was writing Google Earth files, using their API language KML (of the XML family). This was really just to organise the data provided by the various surface temperature collectors (GHCN, GSOD, CRUTEM, BEST etc). I could show the stations with various colored and sized pins; summary information would appear in balloons on clicking. Then I found an even more useful capability with folders. GE lets you turn the folders on and off. So if I list stations in folders according to their decade of commencement, for example, the user can show a plot with any range of decades they want.

The downside is that the user needs to download the KML files, or KMZ (a zipped equivalent). They are not excessively big. I have a collection here.

Anyway, the posts on this are here:

Fun with Google Earth, GSOD and KML

More Google Earth station data

Still more KML and Google Earth

GHCN KML visualisation by years

GE visualisation of changes to GHCN stations 1990...

A KMZ file for the BEST stations

A combined KMZ file for BEST, GHCN, GSOD and CRUTE...

Spaghetti graphs

I then looked at different ways of making "spaghetti" graphs (many tangled curves) more readable. I began with just color experiments, then fancy patterns, then a GIF animation (more here and here). But a suggestion from Eli, with advice from TheFordPrefect, got me using Javascript.

The user could, by hovering the mouse, select individual components of the plot in black contrast. Another such plot here included a general R program for generating code from a CSV data file. And this ice extent plot doesn't pick out curves but lets you focus on time ranges.

Trend plots

This series started with an ordinary graphic which showed, in a color triangle, how the totality of possible trends from a single time series, over any time interval, could be visualised. This was then adorned with JS facilities for choosing different datasets. Then options were added so that the time series would be plotted at the same time, with the interval in question and its trend displayed. This would be updated by clicking on the colored triangle, or by controls on the time series graph. Numerical information, later including confidence limits, could be displayed. Here is a list:
A JS gadget for viewing temperature trends.
A picture of statistically significant warming.
Observed SST and model trends
Significant trend differences
Significant trends in Foster/Rahmstorf
Combined GMST trend viewer

Earth projections

I've never been happy with the various projections which attempt to render the whole earth on a page. I have preferred spherical projections, even though it requires several views. But JS creates a Google Earth like possibility - a spherical projection in which you can adjust the viewpoint.

Every month recently I have published a flat projection of a spherical harmonics fit to the temperature anomalies, so this was a clear candidate. The globe views were typically from the corners of a surrounding cube. Examples are here:

A Javascript worldview for surface temp.
April 2012 temperatures up 0.2°C

HTML 5 and triangle meshes

HTML 5 Canvases are relatively new. They offer huge flexibility for interactive graphing. This was developed in the climate plotter to be described, but the first application was to shading global plots. The canvas lets you prescribe colors at the corners of a triangle and it will then shade continuously between them, or at least between two of them. Getting all three right is harder to manage, unlike Gouraud shading, which does it automatically.

The big merit of this is that I can show temperature anomalies with no model fitting, so that colors will be correct at the stations. With the above caveat, it may be that only two thirds will be exactly right, but that's enough to see what is going on. When that is done, the mesh and stations can be shown, and the user can click to bring up station names and details. Again, a spherical projection is used, and can be user-rotated to an arbitrary view. I provide a navigator map (flat projection) on which you can click to choose the point to focus on. Examples are here:
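For comparison, here is a Python sketch of what Gouraud-style shading does inside one triangle: each interior point gets a weighted average of the three corner values, with barycentric weights. This illustrates the "all three corners right" ideal, not the canvas workaround used in the plots; the function names are hypothetical.

```python
def barycentric(p, a, b, c):
    """Weights of point p relative to triangle corners a, b, c (2D tuples)."""
    (x, y), (xa, ya), (xb, yb), (xc, yc) = p, a, b, c
    det = (yb - yc) * (xa - xc) + (xc - xb) * (ya - yc)
    wa = ((yb - yc) * (x - xc) + (xc - xb) * (y - yc)) / det
    wb = ((yc - ya) * (x - xc) + (xa - xc) * (y - yc)) / det
    return wa, wb, 1.0 - wa - wb

def shade(p, corners, values):
    """Interpolate corner values (e.g. station anomalies) at point p."""
    wa, wb, wc = barycentric(p, *corners)
    return wa * values[0] + wb * values[1] + wc * values[2]

corners = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
anomalies = [1.0, 2.0, 4.0]
centre_value = shade((1 / 3, 1 / 3), corners, anomalies)
```

At each corner the weights reduce to (1, 0, 0), (0, 1, 0) or (0, 0, 1), so the shaded value equals the station value exactly, which is the property wanted here.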

Nov temps displayed with HTML 5
Cherrypicker's guide to station trends
Visualizing 2011 temperature anomalies


WebGL

This new facility, described in a recent post, makes available the full power of GL, and is potentially an improvement on HTML5 with spherical triangle meshing. However, support from browsers is still patchy, and the JavaScript that implements it is more difficult to modify - GL has its own style that it is perilous to disrupt. Some day it will be the best option.

Google Maps

This is a different style of using Javascript. It recreates the facilities of GE with KML, but in a browser window. Google makes its API available, and, as in Maps, the globe can be traversed and color-coded flags shown, with balloons of information appearing in response to clicks. Control is by the familiar JS gadgets such as radio buttons.

Google Maps display of GHCN stations, with Javas...

Climate Plotter

I think the most advanced use of Javascript and HTML 5 is with the Climate Plotter. This interactively draws plots from a store of annual climate data. You can form regressions (of just about anything with anything); you can plot against an image background. You can interactively rescale, show different axes with different units; adjust anomaly basis intervals. The history is here:

Interactive JS climate plotter (update)
Interactive climate plotting news.
Interactive JS plotter Ver 2

Snafu with comments

Probably in conjunction with my introducing the Captcha system, my system imposed moderation. I didn't realise this, and so the queue built up. My apologies here, especially to Girma. I've released all the comments. I've also removed the Captcha for the moment. I'll try to make sure moderation is not re-imposed.

Wednesday, October 10, 2012

WebGL - dynamic global temperature maps

This is an advance on the HTML 5 that I've been using for some previous dynamic temperature plots. Some Web browsers now support WebGL, a Javascript version of OpenGL for 3D plotting. I've been playing with it for monthly temperature maps - here's September.

What you'll notice is a globe that responds to mouse dragging just like Google Earth - in fact I believe GE, at least initially, used a version of WebGL. The main advantage for presentation is better shading - Gouraud shading instead of the rather kludgy HTML5 canvas version.

A downside is that support for WebGL is quite patchy, and implementation depends also on your graphics card. I've found Chrome is fine; Firefox produces a fragment of picture and then gives an error message, and of course IE is way behind. Some other browsers have the capability but disable it by default. I understand the reason is that it creates a vulnerability to denial-of-service (DoS) attacks which send dynamic but very slow pictures.

I don't think WebGL will replace my HTML 5 versions for a while. I don't have as much control, so I can't, for example, allow you to click on the picture to bring up local station info.

The other downside is that the files are fairly large, and take a few seconds to download. So I've put them below the jump. There is the WebGL version and a snapshot of the HTML5 version. The color scheme is the same - I haven't figured out yet how to put a bar on the WebGL version. It's a direct plot from the TempLS September station anomalies - exact for each station, and shaded elsewhere.

I've added a technical update describing the methods I used here.

So here is the globe. Give it a spin! The left button rotates, the middle button/wheel enlarges, and the right button changes field of view (which can also enlarge).

Here is a snapshot from the HTML 5 version. You can see that the shading is more ragged.

Technical update

I should say more about how this is done. I use my R program which does the interactive monthly presentations, in conjunction with an excellent R package, rgl, in which Duncan Murdoch has a big role. This enables me to show in an R GUI the spinnable globe as you see it (browser permitting).

Then there is an rgl routine, writeWebGL, which generates an HTML page with Javascript. I could use that directly, but it is bulky (about 2.5 MB, mostly data). The JS is beautifully written, but the data layout is extravagant. I've been able to cut it down to about 500 KB, with some added Javascript. I show that via an iframe, because I need to run the program when loading is complete. Normally that would be a flag in the body element, but Blogger controls that, hence the iframe.

Tuesday, October 9, 2012

September TempLS Global Temp up 0.04°C

The TempLS analysis, based on GHCNV3 land temperatures and the ERSST sea temps, showed a monthly average of 0.53°C for September, up from 0.49°C in August. Last month also showed a higher rise with late data. There are more details at the latest temperature data page.

Below is the graph (lat/lon) of temperature distribution for September. I've also included a count and map of the stations that have reported to this date.

This spherical harmonics plot is done with the GISS colors and temperature intervals, and as usual I'll post a comparison when GISS comes out.

And here, from the data page, is the plot of the major indices for the last four months:

Here is the map of 4087 "stations" which contributed to this report.

Thursday, October 4, 2012


I've been battling with some persistent spammers in comments. I've invoked Captcha word verification for the time being - I hope I can remove it when things quieten down.

Monday, October 1, 2012

A necessary adjustment - Time of Observation

Sceptics complain a lot about adjustments made in indexing temperatures. Rarer is an acknowledgement of the argument for the adjustments. The fact is that if an adjustment is appropriate, then it is required. It's not optional.

This post will set out the quantitative basis for one of the larger adjustments to USHCN, a frequent object of this complaint. This is TOBS, the time of observation. It arises because USHCN gets its data from a wide variety of observers, many voluntary. The time at which min/max thermometers are read and reset is recommended but not mandated, but is on record. For many stations it has changed, and this matters.

In this post I take a USCRN station, Boulder, Colorado, with hourly data from 2009-2011. I calculate the effect of varying the notional reading time of a min/max thermometer. There is a positive bias of about 1.3°F if it is read in mid-afternoon, tapering to nearly nil around midnight. There is potentially a cooling bias in the morning, though for this site it was small.

But firstly, a discussion of why temperature measurement is relevant to the climate debate, and what kind of measure should be used.

The role of temperature measurement

The climate debate is about potential global warming caused by the addition of carbon dioxide and other greenhouse gases to the atmosphere. Sometimes the impression is created that the basis for worry is in fact the observation of rising temperatures, and if doubt can be cast on this observation, the worry goes away.

This is not true. The case for AGW is now, and always has been, based on the physics of the greenhouse effect. Addition of GHG's to the atmosphere leads to warming. The amount of warming (climate sensitivity) is not perfectly known, but there are reasonable estimates.

GHG's have increased, so a rise in temperature should be discernible. If there were none, then AGW would be in doubt. That is why a study of historical measurements is important. But history shows that temperatures have risen. There are of course uncertainties about all measurements, and there are short-term fluctuations which can obscure trends. But the rise is consistent with AGW. It is not the proof of AGW.

Measuring daily temperature

Every now and then a post like this appears, in which someone discovers that the measure of daily temperature commonly used (Tmax+Tmin)/2 is not exactly what you'd get from integrating the temperature over time. It's not. But so what? They are both just measures, and you can estimate trends with them.
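A tiny Python illustration of the difference between the two measures. The hourly values are synthetic, made up purely to show the arithmetic; they are not station data.

```python
def minmax_mean(hourly):
    """The traditional daily measure, (Tmax + Tmin) / 2."""
    return (max(hourly) + min(hourly)) / 2

def integrated_mean(hourly):
    """The time-averaged temperature from evenly spaced readings."""
    return sum(hourly) / len(hourly)

# Synthetic day: 18 cool hours at 50°F and 6 warm hours at 70°F.
day = [50.0] * 18 + [70.0] * 6
```

For this lopsided day the min/max measure gives 60 while the hourly average gives 55. The two differ in level, but each can still be tracked over time to estimate a trend.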

The reason (Tmax+Tmin)/2 is used is that a very long history is available. In pre-electronic days, observers used min-max thermometers like the one on the right. The pins are pushed up as the mercury rises, and do not descend until reset. Note that the minimum scale is reversed. Typically, once a day the position of the pins is read and the pins are moved (e.g. with a magnet) back to sit on the current position of the mercury. The reading gives the max and min of the previous day, but the time when they occurred is not shown. Typically an observer would record the location of the pins and the temperature at the time of reading.

Regular hourly readings have only been widely available since the introduction of MMTS a couple of decades ago.

Note that with the min-max thermometer, if you reset the max when the temperature is falling, the temperature may not return to that level for the whole of the next day. In that case, the next max you read will be the value that you set it to. This is, as I'll show, why time of observation matters.

USHCN Adjustments

Discussions of USHCN adjustments often refer to this plot:

Note that this is V1, and so out of date, but it does show the TOBS effect. This paper of Vose et al has more details, and describes the underlying cause thus:
[3] The majority of weather stations in the U.S. Cooperative Observing Network (and therefore in HCN) are staffed by volunteers. Consequently, the network has no mandatory time at which daily measurements must be taken. Most individuals prefer observing times other than midnight, resulting in an observation day that differs from the standard calendar day. For example, at a station where the volunteer reads the thermometers at 0800 LST, the observation day extends from 0800 LST the previous day to 0800 LST on the current day. At a station where the volunteer reads the thermometers at 1700 LST, the observation day starts and ends 9 hours later. Nevertheless, the observations at both stations are recorded for the same calendar day.

[4] When the observation day differs from the calendar day, a "carry over" bias of up to 2.0°C is introduced into monthly mean temperatures. This bias occurs when atmospheric conditions cause a temperature from one day to be ascribed to the following day. For instance, suppose an observer reads the maximum and minimum thermometers at 1700 LST on April 1, then a cold front passes through the area overnight. If the temperature on April 2 never exceeds the value at 1700 LST on April 1 (when the thermometers were last reset), then the recorded maximum will actually be the temperature at 1700 LST on April 1. This temperature will be higher than if the 24-hour measurement ended at midnight, and because the monthly mean is computed by averaging the daily maximums and minimums, the mean for April will likewise be artificially high. In general, this carryover phenomenon results in a warm bias for observation days ending in the afternoon and a cool bias for those ending in the morning.

Station observations - Boulder Colorado

As mentioned, I was looking around for a station with a good set of hourly readings for some years, with few missing values. I first looked at Washington, DC, but there were lots of gaps. So I thought the USCRN station at Boulder was promising, and indeed it had 2009-2011 with only 38 missing hours (which I interpolated). To simplify, I used MST (Mountain Standard Time) only, no daylight saving. All temperatures are in Fahrenheit.

Diurnal pattern

The diurnal pattern varies through the year. But here is a graph of the hourly averages (°F) for all of those three years. As expected there is an afternoon maximum and a minimum in the early morning.

Time of observation effect

This is simulated, supposing that we took the max and min of 24-hour blocks. Often the max measured is the afternoon max of each day, and the min is the early morning min. Then the time of observation doesn't matter.

But sometimes the reset value is not reached again in the next 24 hours. Then the "max" recorded is not a real max - it reflects the warmth of the previous 24-hr period, rather than the cold of the next. So it is a warm bias. If the same thing happens with the min, it is a cold bias.

The following tableau shows the frequency of times of max measurement subject to this notional reset. Here and later it is assumed that the reset occurred just before the time stated. The times are 0:00, 6:00, 9:00, 14:00, 17:00 and 21:00 MST. Any of the plots can be expanded - just right click and View.

If you look at the first plot, with reset at midnight, you see the expected afternoon peak of maxima, but also a peak at midnight. This says that about 80 times in 3 years the maximum of a calendar day occurred at midnight. This implies that the weather turned cold after midnight. The max doesn't reflect how cold it became. There is a smaller peak at 11pm, reflecting the fewer occasions on which a warm front came through.

Resetting at 6 am, there are still a few days where there is a measured max there. But moving on to 5 pm, there is now a very marked peak. In fact, for more than a fifth of days, the 5 pm temperature is higher than for the next 24 hours. This is significant because the NWS recommendation had been to reset at 5 pm.

A more sensitive histogram is of the durations between maxima, shown below for the same reset times. "Normal" is about 24 hours. A short interval, or a long one of near 48 hours, indicates that the same peak is effectively being counted twice. This is very marked at 2 pm, but also strong at 5 pm (and significant at 9 am).

I'll show this side-lobe effect as a daily cycle by plotting the variance of the histogram. This increases with the size of the side lobes:
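A Python sketch of that calculation, again on synthetic data and with hypothetical function names. Ties within a block are resolved by taking the first occurrence of the max, which is an assumption of this sketch.

```python
def max_times(hourly, reset_hour):
    """Hour index at which each 24-hour block's maximum was recorded."""
    times = []
    for start in range(reset_hour, len(hourly) - 23, 24):
        block = hourly[start:start + 24]
        times.append(start + block.index(max(block)))
    return times

def interval_variance(hourly, reset_hour):
    """Variance of the gaps (hours) between successive recorded maxima."""
    times = max_times(hourly, reset_hour)
    gaps = [b - a for a, b in zip(times, times[1:])]
    mean = sum(gaps) / len(gaps)
    return sum((g - mean) ** 2 for g in gaps) / len(gaps)

# Synthetic data: one warm day then three cold ones, peak at 15:00.
hourly = [base + 10 - abs(h - 15) for base in (60, 40, 40, 40) for h in range(24)]

var_midnight = interval_variance(hourly, 0)
var_evening = interval_variance(hourly, 17)
```

With a midnight reset every max falls at 15:00, the gaps are all 24 hours and the variance is zero. With a 5 pm reset the first max is the carried-over reset value, the next gap stretches to 46 hours, and the variance jumps to 121.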

Minima behave somewhat similarly, though with the afternoon effects replaced by morning. Here is the variance plot for the difference between minima. In fact doubling of minima is rarer, perhaps because the minima themselves vary less, or because the daily minimum is less peaky.

Temperature bias

So here is a plot, as a function of reset time, of
  • the three year average max, Tmax
  • the three year average min, Tmin
  • (Tmax+Tmin)/2, the min/max used in indices (in black)
Each is plotted relative to its mean.

Each temperature should be adjusted to restore it to a standard reset time. Vose et al quoted above say this should be midnight. This adjustment really only is important if the time of observation changes, introducing an apparent trend. Again Vose et al explain how changes did occur in USHCN. I've also plotted each individual year, just for the min/max, to show that the pattern is fairly reproducible from year to year:
So there is every reason to expect that adjustments calculated from the present hourly obs can be applied to past readings where we only know the min/max and time of obs.


The range shown here is quite large (for this one station), about 1.5°F, while the TOBS adjustment in practice was only about 0.3°F over a century. Not all stations changed their time of obs, and those that did typically changed from about 5 pm to 9 am, which only has a fraction of the full effect.