Results of 8 months of WXSIM-Lite use

I just compiled data from mid October to present, for five different percentages of use of WXSIM-Lite in WXSIM. I used only forecasts which were common to all five setups, to give a really valid comparison. There are a total of 966 forecasts, most of which were after the GFS changeover in January. I had initially used 0, 50, 80, and 100% mix, then later split the 50 into a 40 and a 60, with each of these “inheriting” shared 50% data. I also temporarily messed up the 60% one (some forecasts were zero %), so this larger set has a bit of a glitch in the middle. I analyzed a subset (exactly half of the total, as it turned out), all since the GFS changeover and with completely consistent mix percentages.

The relatively smoother, smaller data set shows an optimum near 60% mixture, with improvement of 24% relative to WXSIM and 20% relative to WXSIM-Lite. This is exactly the kind of thing I was hoping for when I started this project! :slight_smile:
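For anyone wanting to run this kind of comparison on their own logs, here is a minimal Python sketch. It assumes the mix is a plain linear weighting of the two temperature forecasts, which may not be exactly what WXSIM does internally. The function names (`blend`, `mean_abs_error`, `relative_improvement`) are made up for illustration.

```python
def blend(wxsim_temp, lite_temp, lite_pct):
    """Weighted average of two forecasts; lite_pct is the WXSIM-Lite share (0-100).

    Assumption: a plain linear mix, not necessarily WXSIM's internal formula.
    """
    w = lite_pct / 100.0
    return (1.0 - w) * wxsim_temp + w * lite_temp


def mean_abs_error(forecasts, observed):
    """Mean absolute error between paired forecast and observed temperatures."""
    if len(forecasts) != len(observed) or not observed:
        raise ValueError("need two non-empty lists of equal length")
    return sum(abs(f - o) for f, o in zip(forecasts, observed)) / len(observed)


def relative_improvement(baseline_mae, new_mae):
    """Percent reduction in error relative to a baseline error."""
    return 100.0 * (baseline_mae - new_mae) / baseline_mae
```

Read this way, an improvement of 24% relative to WXSIM means the blended mean absolute error is 76% of the WXSIM-only error.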

I do see a slight difference in the optimum mix for maxes versus mins, and I think some of this is due to an internal adjustment I made a while back, based on very preliminary data. I plan to tweak this in the next (pretty soon) version of WXSIM.

See attached graphic.

Tom


First, thanks to Chris for getting this web site back up! :smiley:

Thanks to Doug Paulley, I’ve added some more data to my analysis of what mixture of WXSIM-Lite data works best, and by how much it helps. Doug sent me the results of about three months of forecasts, with MANY different mix percentages (I’ve used perhaps most, but not all of his data here). It has some glitches in it, but overall it tells about the same story as my 8 months of data. I’ve added Doug’s to the graph and posted it here. Note that his errors are in Celsius degrees, which are the equivalent Fahrenheit-degree errors divided by 1.8. Not a bad thing here, though, as that allowed me to display both his and mine on the same graph.
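The reason only the factor of 1.8 appears is that a forecast error is a temperature difference, so the 32-degree offset cancels out. A small Python sketch of the distinction (the function names are illustrative only):

```python
def error_c_to_f(err_c):
    """Convert a temperature difference (such as a forecast error) from C to F degrees."""
    return err_c * 1.8


def error_f_to_c(err_f):
    """Convert a temperature difference from F to C degrees."""
    return err_f / 1.8


def temp_c_to_f(temp_c):
    """Convert an actual temperature reading; here the 32-degree offset does apply."""
    return temp_c * 1.8 + 32.0
```

For example, an error of 1.5 C degrees corresponds to 2.7 F degrees.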

Putting it all together, using lots of curve-fitting and averaging (which I won’t describe right now), I’ve got a pretty solid conclusion: WXSIM-Lite data should generally be mixed in with somewhere between 60 and 75% weighting. The low end of this range may be best for sites where the customization was always pretty good anyway, at least with autolearn in effect. The high end may be better for more “troublesome” sites, where my customization was not quite satisfactory. For me, it turns out about 63% works best. Doug’s data suggests a “sweet spot” around 72%.
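The curve-fitting behind these numbers is not described here. As one simple illustration of how a sweet spot could be estimated from a few tested percentages, the Python sketch below fits a parabola through three (mix %, mean absolute error) points and returns the mix at its vertex. This is an assumed stand-in technique, and the function names and sample points are hypothetical, not data from either site.

```python
def parabola_vertex_x(p1, p2, p3):
    """Return the x position of the vertex of the parabola through three points.

    Each point is (mix_pct, mean_abs_error). The vertex is a minimum only
    when the fitted curve opens upward, i.e. the middle error is the lowest.
    """
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    num = (x2 - x1) ** 2 * (y2 - y3) - (x2 - x3) ** 2 * (y2 - y1)
    den = (x2 - x1) * (y2 - y3) - (x2 - x3) * (y2 - y1)
    if den == 0:
        raise ValueError("points are collinear; no unique vertex")
    return x2 - 0.5 * num / den


def synthetic_error(mix_pct):
    """Made-up error curve with its minimum placed at 63%, used only as a check."""
    return 2.0 + 0.001 * (mix_pct - 63.0) ** 2


points = [(m, synthetic_error(m)) for m in (40.0, 60.0, 80.0)]
print(round(parabola_vertex_x(*points), 1))  # 63.0
```

With real data the points will be noisy, so a least-squares fit over all tested percentages would likely be more robust than a three-point fit.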

Another interesting issue is comparing WXSIM (alone) and WXSIM-Lite (alone). The data I have shows WXSIM-Lite better, on average, by about 6% relative to WXSIM. The optimal mix for all these data combined (around 65%) results in about a 19% improvement over WXSIM (raw), meaning the mean absolute temperature error is about 81% as big, and a 14% improvement over WXSIM-Lite (raw). Since over half of this data comes from my own, quite well customized site, my suspicion is that most sites should see an even bigger improvement, perhaps in the range of 20-25% over WXSIM alone and about 10-15% over WXSIM-Lite alone.
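These three percentages hang together, which can be checked with a little arithmetic. The Python sketch below expresses each error as a fraction of the WXSIM-only error and then compares the blend against WXSIM-Lite. The helper names are invented for this example.

```python
def error_ratio(improvement_pct):
    """Error as a fraction of the baseline error, given a percent improvement."""
    return 1.0 - improvement_pct / 100.0


def improvement_between(ratio_new, ratio_old):
    """Percent improvement of one error over another, both relative to the same baseline."""
    return 100.0 * (1.0 - ratio_new / ratio_old)


lite_vs_wxsim = error_ratio(6)    # WXSIM-Lite error is about 0.94 of WXSIM's
blend_vs_wxsim = error_ratio(19)  # blended error is about 0.81 of WXSIM's
blend_vs_lite = improvement_between(blend_vs_wxsim, lite_vs_wxsim)
print(round(blend_vs_lite, 1))  # 13.8
```

That 13.8% rounds to the 14% improvement over WXSIM-Lite quoted above.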

Sam and I have changed servers (to a higher capacity virtual private server than before), so hopefully this will pretty much eliminate the reliability issues we’ve had off and on.

Interestingly, both Doug’s data and mine show almost exactly the same “sweet spot” mean absolute errors: about 1.5 C (2.7 F) for days 2-4, with quite a mix of seasons, including the more challenging transitional time of spring. I don’t have equivalent numbers for “official” forecasts for the same period, though I do have some data from nine years ago which shows the National Weather Service with an error of about 2.9 F and The Weather Channel about 2.7 F, using essentially the same months (I’d say give or take about 0.2 degrees for each of these to account for possible difficulty differences between the two time periods, and then one might guess a 0.1 F improvement in the last 9 years). In summary, the WXSIM/WXSIM-Lite combination is just about “state of the art”, comparing very favorably with the best other forecast sources. It’s not clear here that it’s the best, but it’s definitely about as good as anything out there, and may well be the best for some locations.

I keep trying! :smiley:

Tom


Great analysis Tom :smiley:

Just realized something: the NWS and TWC data were forecast for and validated against an official site (the Atlanta Airport). One of WXSIM’s potential strengths is the ability to tailor the forecast specifically for your exact location, with whatever local “quirks” it might have. Given this, I think there will be many cases where WXSIM (combined with WXSIM-Lite) beats all other available forecasts (as attempted to be applied to your exact spot).

Tom

One more thing: I made a slight change in WXSIM based on these results. It reduces the disparate treatment I was giving max and min temperatures in this mixing thing. This should allow a slight additional improvement (maybe only 1% or so). The new beta version of the main program is at

www.wxsim.com/wxsim.exe

I’m working towards a new release fairly soon, so please let me know if there are any issues with this one.

Thanks!

Tom