Operation has been slightly spotty in the past few days, due to limited disk space.
The system makes a backup of all data, every night. Given that, for example, the Ottawa Airport has daily records going back to 1938,
plus backups of all data captured since early this year (one of my prime tenets is that the preserved input data must be able to
reconstitute the full dataset--just in case), that's a lot of data on an older, storage-limited computer.
I've freed up some space and will now teach the system to run one full backup per week, plus incremental backups of the input data only.
That'll greatly cut down on space requirements while still preserving the full dataset.
There is a major error in the YOW precipitation data. I'll correct that in the next week or so (by reconstituting the dataset).
I've been tinkering with the system now and then, when I have time. Over the last month or so, I've added weather warnings, an events
calendar, and graphics for current and forecast conditions. The images I created myself; they're cartoonish, but they get the job done.
I'll package them up and release them free, at some point; if you'd like a copy, email me at mizar64@gmail.com.
I'm now incorporating some JavaScript into the pages. It's trivial to interface the stats system with JS, through data
tags in each page. This allows for on-the-fly conversions (and once I decide on how to effect data persistence, you'll be able to set a
preference when you visit the site), or for math (differences between actuals and normals, for example), without
having to add more code to the system and further complicate the output language.
I'm quite pleased with the Events system; the calendar is a simply-structured text file, and some fairly flexible tag syntax gives access to
events sorted by time and category. Coupled with the introduction of macro loops, it allows things like showing lunar events separately from
the other events in a given time frame (check out any of the year pages), or all of the seasonal milestones over a given period forward or
backward.
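Purely as an illustration (the real file's layout differs, and these entries are just examples), picture one line per event:

    2012-06-04 | lunar    | Full Moon
    2012-06-05 | astro    | Transit of Venus, starting near sunset
    2012-06-20 | seasonal | Summer solstice

with page tags that select events by date range and category, and macro loops that repeat a chunk of output once per matching event.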
I'm working on a scheme to give some access to the webcam archives (and will soon write a program to generate an interesting thumbnails poster).
It's turning into a scheme to index and allow access to custom input data, such as lists of dates and filenames, or other data collected on a
periodic basis. That data, too, will be available via some clever tag language. Coding won't actually take very long for this, as I can largely
adapt code from existing routines.
Here and there, I'm continuing to work on the pages, to make them more readable. As I'm really not adept with design, it goes slowly and with a
very utilitarian theme. Now, pay me to do this eight hours a day, and I might get good at it.
I'll play with graphics again, in the near future, and probably introduce some more charts.
No work has been done on the program code for a while, in favour of structural and cosmetic work on the web pages themselves. My next coding
work will involve bugfixes and a forecast parser, leading to the addition of weather condition and forecast graphics.
There now is a separate set of pages for the Ottawa Airport, with the same in-depth analyses as for CTO data. In fact, the dataset is larger,
including wind, visibility and precipitation data which cannot be collected reliably at CTO. We're still working on the colour scheme. Next up
will be a Regional section, with somewhat expanded data for the sites we track; and then a National section covering the capitals coast-to-coast.
I won't bother adding a World section, as global weather summaries infest the Internet like fleas on a stray.
Data-capture accuracy is close to 100 percent since the last hardware/software update; we're now confident enough to resume data feeds to WUG and
PWS.
We didn't bother addressing Tuesday's Venus transit, as, like the partial eclipse a few weeks ago, it began so late in the day that there really
wasn't much to see. I did watch the beginning of the event and tried to snap some low-quality photos, but no dice. I was able to
give Goddess (the other pink-skinned inhabitant at the CTO Weather Centre) a peek at the event, by projecting the sun's image onto the kitchen
wall with a pair of binoculars; but it was impossible to hold still enough to snap a picture.
The media actually got things right about the transit, though they were less diligent than usual in warning people not to view the event without
appropriate eye protection. Every century or so (the gaps alternate between roughly 105 and 122 years), we see a pair of Venus transits (I'll call them VTs, for short), eight years apart. This
was the second transit, the first having been in 2004; the next won't come until 2117. I'm not making plans for that event.
Most folks are a bit puzzled about why VTs are so rare, though perhaps not bothered enough to ask why. The answer involves several factors, but
bear with me. First, Venus and Earth orbit the sun at different speeds, Venus faster than Earth by dint of being only about two-thirds as far out.
This means that Venus laps us regularly, lining up between the Earth and the sun about every nineteen months. However, Venus' orbit is tilted relative to Earth's
by about three and a half degrees--and the Sun is only a half-degree wide. Also, because Venus is only about two-thirds as far from the sun as Earth, the
apparent angle between Venus and the sun, as seen from here, is magnified, as it were. This means that, except for two specific windows, just a few
days wide, in our 365-day orbit, Venus will pass slightly
above or below the Sun while overtaking us in her lower, faster orbit. It just so happens that when these factors are combined (i.e. both planets
arriving at just the right place and time), it's roughly a century between alignments, at which time we get one transit with Venus passing above or below the sun's
equator, and a second, eight years later, with Venus transiting to the other side of the sun's equator. In 2004, Venus was north of the solar
equator; this year, south.
Each pair of transits occurs on the same side of our orbit, and therefore in the same month (June, this
time round); the following pair of transits, 105 years later, takes place on the other side of our orbit (December), and the gap from that pair back to the
next June pair is longer, about 122 years. So, the whole cycle is actually about 243 years long.
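For the arithmetically inclined, here's the back-of-the-envelope version (round orbital periods, my own sums, not part of the original reasoning above): Venus circles the sun in about 224.7 days, Earth in 365.25, so the interval between successive line-ups (the synodic period) is

    1 / (1/224.7 - 1/365.25) ≈ 584 days, or roughly 19 months.

The eight-year pairing comes from a near-resonance: 13 Venus years (13 x 224.7 ≈ 2,921 days) is almost exactly 8 Earth years (8 x 365.25 = 2,922 days), so eight years after one transit both planets are back in nearly the same spots for a second pass; the small mismatch is what prevents a third.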
You might find it interesting to know that, as one moves farther out from the sun, Venus transits become more common and last longer, due to
perspective.
This information can help to illuminate astronomers' search for extrasolar planets, which orbit other stars. The transit search method monitors stars
continuously, looking for short periods when they dim very slightly. If those tiny dimmings occur at regular intervals, it's an
indication that a planet is orbiting that distant star, each time passing in front of it from our point of view. Given some knowledge of the star's
properties--mass, diameter and luminosity--and the orbital period, astronomers can deduce from the transit data the approximate diameter of the
extrasolar planet (its mass usually comes from follow-up measurements of the tiny wobble it induces in its star). Comparing a planet's mass with its diameter can tell us about its composition. For example, solid 'terrestrial' planets like
Earth (density: 5) are much smaller and much denser than gas giants like Jupiter and Saturn (the former being about 10 times larger in diameter
than Earth, and the latter being less dense than water [density 1]).
Grams-per-cubic-centimetre, if you must know.
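To put numbers on why small planets are hard to spot (my own back-of-the-envelope figures, not from any particular survey): the dimming during a transit is roughly the ratio of the two discs' areas,

    dimming ≈ (planet radius / star radius)^2

so a Jupiter-sized planet crossing a sun-like star (radius ratio about one-tenth) blocks about 1% of the light, while an Earth-sized planet (ratio about 1/109) blocks less than 0.01%.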
In order to be seen transiting its parent star, an extrasolar planet's orbit must be aligned with us to within a few degrees. This means that most
of the stars we look at won't exhibit transiting planets. The fact that a significant percentage of the stars being observed do, in fact, have
planets implies that they are very common and will far outnumber the several-hundred-billion stars in our galaxy. Bear in mind that Earth-sized
planets are extremely difficult to detect with current methods and therefore are not well-represented in what's already been found. Multiply that
by the several-hundred-billion galaxies in the observable universe, and you run out of zeroes fast. Now go read up on the
Drake Equation, and you'll better understand what these results mean for the probability
that there is other intelligent life out there.
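For reference, the Drake Equation in its usual form is a chain of multiplied factors:

    N = R* x fp x ne x fl x fi x fc x L

the galaxy's rate of star formation, the fraction of stars with planets, the number of potentially habitable worlds per system, the fractions of those on which life, then intelligence, then detectable technology arise, and the average lifetime of a detectable civilization. Transit surveys are finally putting real numbers on the second and third factors.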
Understand: I don't believe that any of that intelligent life has visited this planet as yet. I also don't read comic books, collect figurines,
or play mediaeval dress-up. Nor do I call myself a 'maker' just because I can put together a simple Heathkit. But I digress.
This year's transit will also directly aid in the search for extrasolar planets. Again, if we know something of the composition of a star (usually
hydrogen, helium and small amounts of other relatively-light elements), then by studying its spectrum during a planetary transit, we can learn a
bit about that planet's atmosphere. The chemical constituents will create 'absorption lines' in the star's spectrum. Any absorption lines which
appear during a transit betray the planet's atmospheric makeup. By studying this phenomenon during Venus' transit, astronomers can better
calibrate their extrasolar observations. In fact, just to make it more challenging, they had the Hubble stare at the Moon during the Venus
transit--for obvious reasons, if you think about it.
Micro-rant: I see that the Harper Government is eliminating the Environmental Advisory Council, on the grounds that it's been saying things the
government doesn't want to hear. Please see my previous rant about neoconservative ignorance; and I refer you further to the Simpsons
comet-impact episode which ends with: "Let's go burn down the observatory, so this can never happen again!"
Art ridicules stupid; stupid imitates art. 'Nuff said.
-Bill
Solar Eclipse
2012-05-20
Happy Victoria Day weekend. As so often seems to happen for this occasion, we're enjoying full summer weather.
A few brief notes about today's annular solar eclipse.
An annular eclipse occurs because the moon's orbit around the Earth is not a perfect circle, but a slight oval; this means that the moon's
distance from us varies slightly during its four-week orbit. When the moon aligns exactly with the sun and the earth, there is an
eclipse. If the eclipse occurs near the far point of the moon's orbit ("apogee" -- 'away from Earth'), we get an annular eclipse, featuring a thin ring of sunlight
surrounding the black circle of the moon. If the moon is closer, a total eclipse occurs. (When closest to Earth in its orbit, the moon is
said to be at "perigee" -- 'nearest to Earth'.)
Today's eclipse begins in east Asia, travels across the Pacific, makes landfall in Oregon, and continues towards the US Southwest.
Due to the timing and angles of today's event, only folks roughly west of the Mississippi will see the full eclipse; anywhere east of that, the
sun sets before the event is over. In Ontario, the sun sets before maximum eclipse, and anywhere east, it sets before the eclipse even begins,
and the Moon's shadow lifts back into space until next time.
Here's what to expect in Ottawa tonight - and if you want to see it, you'll have to prepare now.
In Ottawa, the eclipse event begins at 8:17 in the evening, when the moon will begin to take a tiny bite out of the sun. Fourteen minutes later,
the sun sets. Mid-eclipse would be roughly an hour after beginning, with about 60% of the sun (now well below the horizon) obscured. This would
be about 9:15, local time. This means nightfall is going to happen more quickly than normal. After 9:15, the pace will slow, and what twilight
remains should persist almost until the usual time.
Want to see the eclipse? If you're in a location where you can view the sun right down to the horizon, you'll have a chance of seeing the eclipse.
I've got a spot staked out nearby. I hope to snap a few appropriately amateurish photos.
Protect your eyes at all costs! Even near the horizon, unfiltered sunlight can damage your eyes very quickly. There are several
ways to view the sun more safely. Punch a small hole into a thin sheet of cardboard; you can then use it like a lens, to project an image of the
sun onto a flat, light-coloured surface. Aluminized mylar, one or two layers, often works well for direct viewing; e.g. Pop Tarts wrappers -- but
do be cautious, and look through it for brief glimpses. Don't stare--there's nothing dynamic to watch.
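A rule of thumb for the pinhole method, for what it's worth: since the sun spans about half a degree, the projected image is roughly 1/110th as wide as the hole-to-surface distance--about 9 mm of sun per metre of separation. Farther away gives a bigger but dimmer image.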
If you want to take photos, you are well-advised to protect your camera in a similar way. Unfiltered sunlight, concentrated through the camera's
lens, can easily damage the imaging sensor and ruin your camera, in which case all you can do is take flawed pictures of things, with your flawed
camera, and eventually be hailed as a genius by WIRED News.
And this brings me to my next rant.
This probably won't interest you, but today's eclipse is the latest in a long series of events which are mathematically related (successive members come one Saros--roughly 18 years and 11 days--apart). It is, in fact,
the same series that produced the near-miss eclipse event of May 10, 1994, here in Ottawa. That event occurred in the afternoon and was visible
in full. Indeed, Ottawa was hardly more than 100 km from the centreline, so the sun was reduced to a thin crescent that day.
Prior to the eclipse, the usual nonsense started. In a bank lineup, I overheard one bingo-hall denizen warn another not to hang her laundry
outside during the eclipse, as the eclipse would burn little crescent-shaped pinholes into her sheets. I wanted to ask her why she didn't feel this
would ignite the whole region in a flaming maelstrom -- but I doubt she'd have understood 'maelstrom.' The media, for its part, did a great job of
warning the public not to view the eclipse with the naked eye--then blew themselves out of the water by referring to 'concentrated sunlight' during
the eclipse. I still have difficulty accepting that people can be that stupid; honestly. But it went beyond that. Schools were locked down, lest
Little Johnny impulsively peer upward and blind himself. Rather than a wonderful demonstration of practical astronomy, the event was turned into
a mysterious cosmic threat, to be shunned in fear. I'm surprised they didn't insist on waiting on an All-Clear from God.
We may as well go back to living in caves.
There is a novel, Fallen Angels, by Jerry Pournelle and Larry Niven. It's a blatantly transparent ass-kissing of sci-fi
fans in general; but it does do an excellent job of portraying the growing scientific ignorance creeping into modern society and its leadership.
In fact, one could say that politics increasingly views science as the enemy: money wasted on sneering academics bent on destroying the
economy through lies and manipulated data. The Harper government is slashing expenditures on science. StatsCan and Environment Canada have been
eviscerated to the point of ineffectuality. (Half a year, and EC still can't get daily weather stats online from the new third-party data source.)
The global-warming deniers continue to grow in number and stridency, by the day, even as climatic
models (which can now accurately predict the present from a starting point well in the past--I'll leave it to you to figure out what that implies)
demonstrate beyond any doubt that the warming is indeed human-caused, accelerating, and worse than originally thought. It's become conventional
wisdom among the under-40 population that the Apollo missions to the moon were faked; in fact, more people believe in vampires than in the moon
landings, even though simple mathematics demonstrate that vampires cannot exist, and the moon-landing deniers have been soundly debunked since
the day they first started their ignorant bleating. But, hey, if you don't know math, don't know how anything
works, you're certainly not about to trust it, are you?
Okay. Rant ended. Go find a Twilight movie to watch, or maybe an alien autopsy or a faked-moon-landings documentary on the
'Learning' Channel -- or perhaps treat yourself to the studied wisdom of the Wild Rose Party or its mother ship, the Teabaggers.
-Bill
We're Back
2012-05-06
As you can see, we're back up and running.
We had a hardware failure in mid-April. As my priority right now is locating suitable employment, I took my time getting the system back up
and running. What I've got now is a somewhat more-robust system.
The system still consists of two computers, networked. For local weather-data capture, I'm now using an ancient HP T5700 thin client that was
essentially useless for anything else. This
device uses a Transmeta Crusoe processor which almost, but not quite, completely emulates a 686-class processor. Long story
short: with
considerable effort, I managed to install Arch Linux (which fortuitously doesn't use the one instruction that the Crusoe chip doesn't
implement) and a basic GUI environment, and my custom OCR program. The original webcam was replaced with a higher-resolution model. The new
cam delivers a sharper, brighter image which has significantly enhanced the OCR program's accuracy; currently running at 72+ hours without a
scanning error. Data feeds to Wunderground and PWSWeather will resume later today.
Note: in installing the purist Arch flavour of Linux, I discovered that I'm not a Linux guru yet. At the same time, I was struck yet again by the strange
attitude, widely prevalent in the Linux and open-source communities, that rejects making products 'user-friendly' and instead demands that the
user become intimately familiar with the underpinnings of the system, in order to use it to satisfactory purpose. I completely disagree; the user
should never have to tinker with the internals. If it can be tinkered with, there should be a simple, GUI way to do it. Even
nominally consumer-oriented products such as Ubuntu fall flat, particularly where 'non-free' software must be added after-the-fact and configured,
just to make the stock installation useful from a pragmatic point of view. Well, the rest of the world has moved on from that kind of monkey-
business.
'RTFM' was a cute catchphrase among hackers in the 1980s, but let's pull up our socks and move a few decades forward, shall we? If Linux wants
wide acceptance, it's going to have to focus, in a big way, on making its power truly, intuitively, and transparently accessible to
the consumer. There's a reason why Apple dominates the personal-electronics market in the present era, with Linux at the other end of the list,
still expecting Granny to play with .ini files in vi. Go ahead, click the link
and behold the modern, user-friendly glory of vi.
Carrying on (it kills me, by the way, that web browsers refuse to render two spaces after a colon or full stop): the data-capture system
transmits the readings directly to the server via HTTP. In the event of a communications failure, the OCR
program falls back on attempting to transmit the readings via a text file and standard file shares. If that doesn't work, the results are
cached until a transmission method becomes available again. This lets the system coast through server reboots and downtime without errors.
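For the curious, here's the shape of that logic in Free Pascal. To be clear, this is a sketch with stubbed-out transmit routines and made-up names, not the actual OCR-program code:

    program FallbackSketch;
    {$mode objfpc}{$H+}
    uses
      SysUtils, Classes;

    // Stub: the real program POSTs the reading to the server over HTTP here.
    function PostViaHttp(const Reading: string): Boolean;
    begin
      Result := False;  // pretend the network is down, to exercise the fallbacks
    end;

    // Stub: the real program writes the reading to a text file on a file share.
    function WriteToShare(const Reading: string): Boolean;
    begin
      Result := False;  // pretend the share is unreachable as well
    end;

    // Last resort: append the reading to a local cache file, to retry later.
    procedure CacheLocally(const Reading: string);
    const
      CacheFile = 'pending-readings.txt';  // hypothetical file name
    var
      Cache: TStringList;
    begin
      Cache := TStringList.Create;
      try
        if FileExists(CacheFile) then
          Cache.LoadFromFile(CacheFile);
        Cache.Add(Reading);
        Cache.SaveToFile(CacheFile);
      finally
        Cache.Free;
      end;
    end;

    procedure SubmitReading(const Reading: string);
    begin
      if PostViaHttp(Reading) then Exit;   // preferred path
      if WriteToShare(Reading) then Exit;  // fallback path
      CacheLocally(Reading);               // held until a path comes back
    end;

    begin
      SubmitReading('2012-05-06 14:05,21.4,35,101.2');  // made-up sample reading
    end.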
The server (again, an ancient Toshiba Tecra 8000 laptop that continues to motor on faithfully--raps twice on skull) runs the statistical
/ web-output system, in realtime. A multi-level timing routine keeps data input synchronized with output, to maximize the 'freshness' of the
output data. The scheduling is completely configurable.
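To give a flavour of it (the real multi-level routine is more configurable than this; the job names and intervals below are just examples), the core idea is simply "run each job when its interval has elapsed":

    program SchedSketch;
    {$mode objfpc}{$H+}
    uses
      SysUtils;

    type
      TJob = record
        Name:     string;
        Interval: Double;     // seconds between runs
        NextRun:  TDateTime;
      end;

    var
      Jobs: array[0..1] of TJob;
      i: Integer;

    begin
      // Ingest new readings often; regenerate the output less often.
      Jobs[0].Name := 'ingest'; Jobs[0].Interval := 90;  Jobs[0].NextRun := Now;
      Jobs[1].Name := 'output'; Jobs[1].Interval := 300; Jobs[1].NextRun := Now;

      while True do
      begin
        for i := Low(Jobs) to High(Jobs) do
          if Now >= Jobs[i].NextRun then
          begin
            WriteLn(FormatDateTime('hh:nn:ss', Now), '  running ', Jobs[i].Name);
            // ...the real system would call the appropriate routine here...
            Jobs[i].NextRun := Now + Jobs[i].Interval / 86400.0;  // seconds -> days
          end;
        Sleep(1000);  // check once a second
      end;
    end.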
Please keep an eye on the CTO Weather Cam over the next couple of weeks; the trees are leafing out in earnest.
Finally, I note in the news this morning a story which outlines the double danger of climatic warming. It seems that our early spring this
year (the second in three years) caused early blossoming among many of Ontario's fruit trees, especially in the southern part of the province.
A subsequent unseasonal cold snap (a phenomenon that will continue to occur even with climatic warming) killed or damaged many of those blossoms.
As a result, fruit yields this year are expected to be way down.
As you may have noticed, things are improving in little ways, here and there.
The automated system has been running stably, without incident, for over a week. Bugfixes and new features are being implemented, and the
server's software will be refreshed tonight from the latest build.
New pages are being added; in the next few weeks, every menu item will come alive. Forecast-handling will be improved, and warnings will be
added.
Custom queries, mentioned previously, should show up in the next couple of weeks, along with an expanded package of analysis graphics. The
goal there will be to look up information on any time period in the database. I'm also working on implementing mathematical calculations;
for example, compare this month's temperatures with normals, or compare one location's humidex with another's; I can do the same with dates.
Gonna take a bit of work, but it'll show up eventually.
While I've been ranting on about my stats system, we've been enjoying an early spring, and for the second time in three years. Temperatures since
mid-winter have run three to four weeks ahead of normal; March would have passed as a slightly-disappointing April, and the first week of April
has featured temperatures close to the monthly average. We've had record-setting temperatures
in the high-twenties, at a time of year when the existing records were in the teens.
I'm looking forward to getting some of the new graphics online, such as a year-by-year graph for observations at the airport, going back to
1938; then you can see for yourself what global warming is doing right here in Ottawa. From my own research, I'm seeing a significant
acceleration in warming over the past decade; something of a tipping-point which many climatologists have warned of.
If there's one thing I'm good at, it's learning. Quickly. Skills-wise, anyway.
The weather system is now running in realtime on the server. System load is very light. The scheduler needs a little fine-tuning, but
I'm really impressed with the efficiency already. A bunch of new current-readings-only locations
will be added, to give a more-complete synoptic view of Eastern Ontario. I can basically add as many as I want. I'm still tweaking
the output routines, to correct properly for DST in all situations and to parse all the tag options properly. Easy work, just tedious.
The CTO Weather Centre readings are now transmitted over the network, directly to the server and without using text files, with buffering
in case the server is unavailable
due to a network error or scheduled reboot. If I opened up a port, the system could happily accept readings from any other place on
Earth. This is exactly how it's done at sites like The Weather Underground or
PersonalWeatherStations.net. As a result, our live readings are posted within about 90 seconds.
The graphics are updated on a five-minute cycle.
Most of the tweaks will show up later today, when I update the server software from the development version. Things will get much neater
and more... correct.
Data-capture will be refined some more; I'll try to eliminate as many bad readings as possible. There are a lot of things I can do in that
regard, including comparing strings of readings, checking the source image to see if it's worth scanning at all (I'll soon post a picture
of our trailing-edge setup), rejecting readings which change too quickly, checking against predefined acceptable-value ranges, checking
against a few quirks of the capture setup (especially false 8s and 0s). More tweaking of the image-analysis
routines will also help; I can scan and analyze the values in different ways. I'll also have to teach it to calculate Humidex and
Wind Chill values; that's just simple math.
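Simple math indeed. For the record, here are the standard Environment Canada formulas written out in Free Pascal; this is an illustration, not code lifted from WxPro:

    program ComfortSketch;
    {$mode objfpc}{$H+}
    uses
      Math;

    function Humidex(TempC, DewPointC: Double): Double;
    var
      e: Double;  // vapour pressure in hPa, derived from the dew point
    begin
      e := 6.11 * Exp(5417.7530 * (1.0 / 273.16 - 1.0 / (DewPointC + 273.16)));
      Result := TempC + 0.5555 * (e - 10.0);
    end;

    function WindChill(TempC, WindKmh: Double): Double;
    begin
      // Meant for temperatures near or below freezing and winds of 5+ km/h
      Result := 13.12 + 0.6215 * TempC
                - 11.37 * Power(WindKmh, 0.16)
                + 0.3965 * TempC * Power(WindKmh, 0.16);
    end;

    begin
      WriteLn('Humidex at 30 C, dew point 22 C: ', Humidex(30.0, 22.0):0:1);    // ~39
      WriteLn('Wind chill at -10 C, 20 km/h:    ', WindChill(-10.0, 20.0):0:1); // ~-18
    end.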
Most of what's on the to-do list at this point is blue-sky stuff, plus getting the website itself into shape. Both I'll do at my leisure,
while enjoying the results of my efforts to date. I now have a working, automated statistical/reporting system that will give me the
information I want, when I want it. I need to turn the software into a service (a daemon under POSIX systems, and a service under Windows),
rather than just have it run continuously in a session; I have some more data-gathering automation to work on, some tag extensions to make
site-building easier, and then some geekily-useful features like value and temporal arithmetic (moonrise in 36 minutes; yesterday's high was
2.5 degrees above Normal), perhaps some basic scripting, custom data
queries (show me the daily mean temperature, maximum wind gust and average visibility for January 6 to February 2, 1941)... and whatever
else I can think of that'll be fun to set up and play with.
I offer a parting thought for the day. While I've worked with many different computer platforms since the early Eighties, the constant over the
past 25+ years has been DOS and then Windows. I started playing with Linux nearly 20 years ago, when it was still a talking-dog sort
of curiosity, and have used it for a home server for
at least the past decade. In the last three or four years, I've been programming on it, too. The Free Pascal / Lazarus package is great for
cross-platform programming; and Pascal's way of doing things has always agreed very closely with mine. That said, I've noticed something
lately: increasingly I'm using my Windows computers to do things I've always done (producing a weekly radio show, because switching to new
or different software means compatibility issues with your archival production files), or things where there just isn't a good Linux
alternative. I'm using Linux, on the other hand, for most of my geek projects, or anything else new. I'm finding it easier to develop with, more efficient with
resources, and more stable than Windows 7 (and I don't say that lightly). Tying different processes together--or at times eliminating
them altogether--is much easier with a bash script than trying to use DOS/Windows' limited batch language; and it's not worth it to me to learn
Windows PowerShell's tricky little language. The cron scheduler is far more powerful than Windows' cutesy Scheduled Tasks facility. I could
go on. Linux helps me get things done, plain-and-simple.
There's been steady progress on the software side, all week. And we recorded some radio comedy this morning.
Bad-data-undo is implemented in the testing version. I've worked a bit more on the site pages. I'm now monitoring readings from a number of other locations;
and, as you can see, there's now a regional temperatures map which will evolve with the rest of the site. I now have a simple GUI
utility to input daily temp/precip stats or issue data-kill orders. Spending a couple of hours on that will soon save me a lot of time
monkeying with input text files.
The regional temperatures map, by the way, is generated via a script file; it's templated, and the system fills in the necessary values.
Another script runs it; it then calls ImageMagick Convert to superimpose the values onto the map image. The binary part of the system
(working name: WxPro, written in Free Pascal) required no modification. I've stolen the present image from the Ontario Cattlemen's
Association website. Promise to eat a steak this week so they'll turn a blind eye, okay?
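The call itself is nothing exotic. Purely as an illustration (file names and pixel coordinates invented here; the real thing is a templated shell script), the equivalent driven from Free Pascal would look something like:

    program MapSketch;
    {$mode objfpc}{$H+}
    uses
      Process;

    var
      CmdOutput: string;

    begin
      // Draw one temperature value onto the base map at a chosen pixel position,
      // writing the result to a new image. One -annotate per station, in practice.
      if RunCommand('convert',
          ['region-base.png',
           '-pointsize', '18', '-fill', 'black',
           '-annotate', '+215+140', '23.4',
           'region-current.png'],
          CmdOutput) then
        WriteLn('map updated')
      else
        WriteLn('convert failed: ', CmdOutput);
    end.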
I'd mentioned before that the system was designed from the outset to handle multiple locations ('weather stations')--up to 200 in the
current version. Having learned my codin' during the dinosaur days of the early 1980s, I've always been extremely careful about
memory usage: don't keep it in precious RAM if you ain't using it; and don't give it more space than it needs. WxPro is designed along
those lines; nonetheless, memory usage may become a
concern on my 128-MB ancient-laptop server, if I add many more sites. Two things will help: running realtime on the server and processing
data as they appear, instead of loading-processing-outputting everything every five minutes; and configuring how much data should
be kept, processed or output for any given location. For example, I like seeing the current conditions for Bancroft, Ontario, which
often experiences weather which will later reach Ottawa. I don't necessarily want to waste memory and processing resources on the data,
or perhaps on outputting the results. I'm adding in a lot of flexibility in this regard; some locations may be current-conditions-only,
for example, and require minimal resources, generating minimal output.
I say all this because, as you've noticed, data from other than the original two locations (I don't think Current Conditions in the
Living Room are statistically significant to anybody outside of the building) have begun to appear; see the Regional Summary table on the
Temperatures page. I've added them in to test the system, and because it's what I want on my weather site. :-) In the near future, each
location CTO is tracking will have its own sub-site, and some locations will have more information than others. Some, like our Weather
Centre and the Ottawa Airport, will be full-fledged analysis sites; others, like Bancroft, probably won't ever tell you more than what's
happening right now. To be sure, the readings will be archived, at zero memory cost, for future use. I'm pretty sure I can write some
code to sift through the available data and learn how to generate accurate short-term forecasts; the kind you actually need before you head
out the door. No one seems to do that right now; but radar images require interpretation in conjunction with other data, such as motion
and evolution over time. J.Q. Public doesn't know or care how to factor in the temperature, barometric pressure, relative humidity,
wind direction, and their recent history, to figure out whether that radar blob near Carleton Place is likely to cause trouble for the
electric mower in the next 30 minutes. I'm certain I can teach my computers to be helpful with that.
To get things running on the server in realtime, WxPro needs a scheduler. Not particularly tricky. I'm working on this and tidying-up
some of the internals, to help prevent future bugs and make it much easier to add a finer grain to everything. I'm sure we'll
be into realtime operation a week from now.
On a different note: currently, Eastern Ontario is enjoying very unseasonably-warm weather. This whole late-winter period has been
reminiscent of 2010--if memory serves, the warmest year on record in Ottawa and a great many other places.
Two years ago, I found thousands of spring flowers on March 7; usually, except on sun-rich embankments, there's little before month's
end. This year has been eerily similar.
March here is less and less a winter month, and increasingly often a real spring month.
Record-setting high temperatures this week are forecast to be in the low- to mid-twenties. When I arrived in Ottawa at the end of March,
1986, there was similar, also record-setting, weather. That was two full weeks later by the calendar, during the period of maximum seasonal change.
Think about that; and at some point in the next month or so, I'll post a chart where you can see the climate change for yourself,
straight from the data.
All of the statistical routines are now in place, and nearly the entire data tagset has been implemented. On the way, a funny thing
happened that will negate a lot of further programming.
As you'll see on the "Today" page, I've taught the system how to generate tabular data. The last-24 view is particularly useful, as it
helps me quickly track down bad data which can be nullified through another nifty feature I'm working on: search-and-destroy. The
implementation was surprisingly quick, as it mostly hooked into existing code.
Here's the cute part: because of this feature, I won't have to write any code whatsoever in order for the system to be able to
export customized listings of any kind of statistical information it stores or generates. Just set up a template page for the output,
defining the data fields to export and the time period. The system does the rest of the work--just as with the pages that make up the
CTO website. Hot damn!
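To illustrate the idea (the tag names here are invented; the real syntax differs), an export template might amount to:

    [loop days from 2012-03-01 to 2012-03-07]
    [date], [temp.max], [temp.min], [precip.total]
    [endloop]

and the system emits one line per day, filled in from the database, exactly as it does for the website pages.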
As mentioned, I'm working on an automated way of nullifying bad data, and rolling back any effects they've had on Extremes, Records,
Means or Totals, after the fact.
In the morning, I'll add support for Daylight Saving Time, which begins this weekend. That, too, won't take long. I'll continue
to work on the web pages themselves. Much of the rest of the development work will be blue-sky stuff, including a graphical control
program, more data analyses, better charts and graphics, more work on forecasts and the addition of public weather warnings, data
interchange with my graphical analysis program; things like that. I'll probably also start collecting data from a number of other
locations around Eastern Ontario, to see what can be done with that. Might also be nice to program a query server, to let you explore
the datasets online; right now, everything's a fixed page.
But, for now, it's almost fully functional and runs unattended. Not bad for three months of hacking in my spare time.
So, while the aesthetics of the site will continue to change, the bulk of the information won't. To be sure, the available data will
be expanded significantly, with lots of monthly and yearly extremes, plus all-time records. And, at some point, I've got to write and
produce some more radio comedy. Hey, at least I don't read comic books or play Dungeons and Dragons.
As you can see, we've made great progress in recent days.
Realtime data are being captured reliably; the system scans for new inputs every five minutes. Normals are now supported, along with
importing of same plus daily/monthly/annual statistics. Sun and Moon data have been imported. The data-embedding tagset is largely
implemented, save for extremes/records data and system information.
The charts have returned, and you can expect to see some new ones in the next little while, plus lots of other new graphics.
Forecasts are being captured and processed. I've been looking at the syntax of Environment Harper Canada's [mostly] computer-generated
forecasts, and it looks pretty simple to parse and extract some data from those.
You'll notice that while most of the data displayed are recorded here at CTO, a few items, such as winds and precipitation, have been
borrowed from EC. Eventually there'll be separate pages for each, plus some additional ones for forecasts, warnings, records, other
locations, etc.
At this point, there are a couple of big things still to implement, and lots of little things. As of Midnight tonight, we're officially
operational in terms of data capture, processing and archiving. Once the main weather-stats system is largely completed, we'll work on
a few companion projects to make the whole thing easier to maintain. Because the system's been designed to maintain data for multiple
locations and has a small resource footprint, it could easily keep stats for a whole network of locations--say, all of Eastern Ontario.
And from that would come some interesting opportunities for analysis and "nowcasting". :-)
Incidentally, with the transition to Daylight Saving Time just a couple of weekends away, we've decided that the system will operate
strictly on Standard Time, year-round, translating to DST only for outputs. This means that, as of March 11, daily statistics will be
calculated from 01:00 Daylight Time one day to 00:59 the next (i.e. 00:00 through 23:59 Standard Time). This guarantees that every day
can hold a reliable 24 hours of data. That said, few daily extremes (of the meteorological type, anyway) are set between Midnight and
1am.
The ugly, marked-up pages you're seeing are an indication that data-output routines are being implemented. Our goal is to make every
red mark disappear. Progress will be slow, at first, as supporting routines are written; it'll pick up steam as it goes, thanks to those
support routines.
It took surprisingly little time to hack up a program to capture Environment Harper Canada data and leave it for the new system to
process. We'll play with that when there's time; it's not a priority anymore.
Live data-capture has been operating on a test basis since yesterday afternoon. The OCR program was updated to deal with negative
temperatures and the new system's data-input format. Daily and monthly stats are being calculated automatically. Some attention will
have to be paid to bad-data filtering (sometimes the OCR program makes a mistake) before operational use begins.
Next up: data export to HTML pages. A large set of tags will allow data to be embedded into any text document, with a lot of
flexibility in formatting. It'll take a while to implement properly, but you should see live data begin to appear in the next 24-48
hours.
Once the basic tag set is implemented, hourly XML data capture from Environment Canada will resume. Following that, the remainder of
the dataset (normals and records, and related calculations and output tags) will be implemented, bulk-data import will be implemented
to allow historical and 'official' records to be imported and analyzed, and graphics-generation will be implemented. Much of this later work
will borrow heavily from the old system, as my later coding was much cleaner than the initial.
This whole endeavour was originally intended to be a quick-and-dirty hack to capture, display and record some live data. It kind of
snowballed and evolved as it went. Poor planning led to sloppy code that was difficult to maintain. The new system's architecture and
specifications were extensively planned before a single line of code was written; as a result, it's much easier to expand and maintain,
and new features will appear regularly.
The dataset is now fully defined; implementation will occur modularly. The system is now capable of gathering and storing stats and
calculating extremes (highs/lows) for the day. Data-locking is now being implemented, to protect 'official' values once input. (For
example, the daily High and Low don't usually occur just as the hourly reading is taken; therefore, the officially published values
usually differ from, i.e. exceed, the values WxPro will calculate by default from the periodic readings.) Once official values are pulled in,
observation values (e.g. bad datestamp, archival data) won't overwrite them. Surprisingly easy to implement; my code is clean and
modular.
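In rough terms (illustrative types and names, not the actual WxPro structures), the idea is simply a lock flag on each stored value:

    program LockSketch;
    {$mode objfpc}{$H+}

    type
      TDailyValue = record
        Value:  Double;
        Locked: Boolean;  // True once an 'official' figure has been imported
      end;

    procedure UpdateValue(var V: TDailyValue; NewValue: Double; Official: Boolean);
    begin
      if V.Locked and not Official then
        Exit;                          // ordinary observations can't overwrite a locked value
      V.Value  := NewValue;
      V.Locked := V.Locked or Official;
    end;

    var
      HighTemp: TDailyValue;

    begin
      HighTemp.Value := -999; HighTemp.Locked := False;
      UpdateValue(HighTemp, 21.7, False);  // running high from periodic readings
      UpdateValue(HighTemp, 22.3, True);   // official published high: locks it in
      UpdateValue(HighTemp, 19.0, False);  // late or archival reading: ignored
      WriteLn(HighTemp.Value:0:1);         // 22.3
    end.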
At that point, we'll take a break and turn our attention to WxOCR, the program that captures data from an LCD display. It needs to be
taught WxPro's particular data-exchange format, and while I'm at it I'll probably make it a bit friendlier. That one, in particular, I'll
end up releasing as an open-source project. It's just too niftily cute, and a great example of a surprisingly large effort to solve a
surprisingly simple problem: how do I get the information from the LCD display, on my cheapo home weather station, into the computer?
The coolest feature: it even shows you, graphically, how the data-capture process works. With a decent interface, anyone could set it
up to read almost any LCD display with just a few minutes of aligning and clicking. As an open-source project, perhaps it'll attract a
programmer with the time to give it the attention it deserves.
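To give a flavour of what 'reading an LCD' means in practice, here is the textbook seven-segment idea boiled down; WxOCR's actual image-analysis code is considerably more involved, but the last step amounts to this:

    program SegSketch;
    {$mode objfpc}{$H+}

    type
      // Segments a..g of a seven-segment digit, True = lit
      TSegments = array[0..6] of Boolean;

    const
      Patterns: array[0..9] of TSegments = (
        (True,  True,  True,  True,  True,  True,  False),  // 0
        (False, True,  True,  False, False, False, False),  // 1
        (True,  True,  False, True,  True,  False, True),   // 2
        (True,  True,  True,  True,  False, False, True),   // 3
        (False, True,  True,  False, False, True,  True),   // 4
        (True,  False, True,  True,  False, True,  True),   // 5
        (True,  False, True,  True,  True,  True,  True),   // 6
        (True,  True,  True,  False, False, False, False),  // 7
        (True,  True,  True,  True,  True,  True,  True),   // 8
        (True,  True,  True,  True,  False, True,  True)    // 9
      );

    // Match the observed on/off pattern against the ten digit patterns
    function DecodeDigit(const S: TSegments): Integer;
    var
      d, i: Integer;
      Match: Boolean;
    begin
      for d := 0 to 9 do
      begin
        Match := True;
        for i := 0 to 6 do
          if Patterns[d][i] <> S[i] then
            Match := False;
        if Match then
          Exit(d);
      end;
      Result := -1;  // unreadable: the kind of thing the bad-data filter catches
    end;

    var
      Sample: TSegments = (True, False, True, True, False, True, True);  // a '5'

    begin
      WriteLn(DecodeDigit(Sample));  // prints 5
    end.

In practice the webcam image is sampled at a handful of small regions per digit (that's the 'aligning and clicking' part) and thresholded to get the on/off pattern.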
Sorry; I digress.
After a bit more tweaking, the system will learn how to export data to web pages. When that happens (and, as you can see, it's not that
far down the road), live data will gradually reappear on these pages, followed by statistics, records and normals, charts and graphs,
and so on. In time, it'll outstrip the old system and become fully automated. Someday, if traffic climbs into the double-digits, I might
even toss in some sidebar ads, get the site hosted and fulfil the dream of a complete, public CTO site, full of actual real information and
learnin', and graced with a cheesy, leaf-laden inner-city motif.
The system is being rewritten from the ground up. Useful routines are being borrowed from the old code. The new system is being designed
inherently to support multiple locations and multiple years, with minimal data loading. It will be self-maintaining, in that it will
automatically create all data and file structures necessary to accommodate new time periods in the input data. Efficient scheduling will
cut down on needless recalculating.
Because of the programming approach (dataset and inputs-handling first, then worry about analyses and outputs), a whole lot of programming
will be followed by a whole lot of debugging, followed by a lot more programming, a bunch more debugging and the beginning of operational
use. But when fresh data do begin to appear on the site, expect frequent updates.
Given a rainy afternoon, we may find a temporary way to put current-conditions data back online, in the meantime; just the basics, using the
old system.
The CTO Weather Webcam remains online. We're archiving one picture per hour, now, with thoughts of later turning it into a video or perhaps
creating some sort of graphical analysis. I have something in mind.
The focus, this time around, is on realtime data and statistics from CTO's own weather station, with Environment Canada data presented as an
augment or for comparison. I'm really displeased that their handing-over of the hourly observations to a private concern has created large
gaps in their recent daily statistics. Here at CTO, I'm one person and have probably four regular visitors. Environment Canada has a staff
of hundreds to thousands and serves the country; I hold them to a somewhat higher standard.
As the new changes come online, we'll describe them here, and at some point we'll summarize the architecture of the new system.