Thursday, July 30, 2009

California Highway Patrol incident feed for bicycles

I've whipped up a script that takes the California Highway Patrol live traffic incident data feed from the 20 CHP dispatch centers around the state and filters for anything mentioning bikes or bicycles. You can see the results here.

At the time I'm writing this post, there's a single incident between Woodland and Davis, CA: a bicyclist riding in the number 1 lane on State Route 113 just south of County Road 25A is reported to be a traffic hazard. "1097 in the CD" means the responding officer is parked in the center divider. I have no idea if bikes are permitted on this state highway or not.

Location: SB SR113 JSO CR25A, Woodland - 7/30/2009 9:46:53 AM
1125 - Traffic Hazard
    -- 9:56AM 1097 IN THE CD
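A decoder for these abbreviations could start as a plain lookup table. The sketch below is only an illustration in Python: the expansions come from the explanation of this one incident above, "SB" as "southbound" and "1097" as "officer on scene" are my assumptions, and the table is nowhere near a complete list of CHP codes.

```python
import re

# Expansions inferred from this post's single sample incident.
# Illustrative only: this is not a complete or official CHP code list.
ABBREVIATIONS = {
    "SB": "southbound",            # assumption: direction prefix
    "SR113": "State Route 113",
    "JSO": "just south of",
    "CR25A": "County Road 25A",
    "1097": "officer on scene",    # assumption based on the post's reading
    "CD": "center divider",
}

def expand(text):
    """Replace every known token with its plain-English expansion."""
    def swap(match):
        token = match.group(0)
        # Unknown tokens are returned unchanged.
        return ABBREVIATIONS.get(token, token)
    return re.sub(r"[A-Z0-9]+", swap, text)

print(expand("SB SR113 JSO CR25A"))
print(expand("1097 IN THE CD"))
```

Anything not in the table passes through untouched, so the output is never worse than the raw log line.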

Right now, this is a rough proof of concept, which means:
  • I grab the incidents as they're available on the CHP server.
  • No caching of incidents, by me or anybody else.
  • No permanent links of incident reports.
  • When an incident ages out of the CHP dispatch center's feed, it is gone forever.
  • Good probability of false positives. You'll see reports of motorcycles and anything mentioning "bike lane." I doubt I'll change this heuristic much.
  • You need to decode the terse radio codes and abbreviations yourself. I may work this decoding into the application.
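To show why the false positives happen, here is a minimal sketch of a keyword filter of this kind. It is written in Python rather than the language of the actual script, the pattern is my guess at the heuristic, and the sample log lines are invented for illustration.

```python
import re

# Assumed heuristic: a case-insensitive substring match.
# "bike" also matches "motorbike" and "bike lane", hence the false positives.
BIKE_PATTERN = re.compile(r"bike|bicycl", re.IGNORECASE)

def mentions_bike(incident_text):
    """Return True if the incident text contains a bike-related keyword."""
    return BIKE_PATTERN.search(incident_text) is not None

# Hypothetical log lines, not real CHP data.
samples = [
    "BICYCLIST RIDING IN THE NUMBER 1 LANE",
    "MOTORBIKE DOWN ON RIGHT SHOULDER",
    "VEH PARKED IN BIKE LANE",
    "STALLED VEH IN THE CD",
]

for line in samples:
    print(mentions_bike(line), line)
```

Only the first sample involves a bicyclist, yet the first three all match. Tightening the pattern would drop some of these at the risk of missing real incidents.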

To make this truly useful, I need to:
  • Poll the CHP server automatically every 10 minutes or so and spit the results out to Twitter, an XML feed, or to email.
  • Cache the incidents; and
  • Provide permanent links to the incident information.
I can do all of these things; I just need the motivation and the time. Please let me know if this would be interesting or useful to you, and feel free to leave suggestions for improvement.
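The caching and permanent-link items could be sketched roughly as follows. This is a Python illustration under my own assumptions: SQLite stands in for whatever database ends up being used, the table layout is hypothetical, and the permanent ID is simply a hash of the location and report time, on the guess that those two fields stay the same between polls.

```python
import hashlib
import sqlite3

def incident_id(location, reported_at):
    """Derive a stable key from fields assumed not to change between polls."""
    key = f"{location}|{reported_at}".encode("utf-8")
    return hashlib.sha1(key).hexdigest()[:12]

def open_cache(path=":memory:"):
    """Open (or create) the cache; the schema here is hypothetical."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS incidents ("
        "id TEXT PRIMARY KEY, location TEXT, reported_at TEXT, details TEXT)"
    )
    return db

def store(db, location, reported_at, details):
    """Insert the incident if unseen; return (id, True if it was new)."""
    iid = incident_id(location, reported_at)
    cur = db.execute(
        "INSERT OR IGNORE INTO incidents VALUES (?, ?, ?, ?)",
        (iid, location, reported_at, details),
    )
    db.commit()
    return iid, cur.rowcount == 1

db = open_cache()
args = ("SB SR113 JSO CR25A, Woodland", "7/30/2009 9:46:53 AM",
        "1125 - Traffic Hazard")
print(store(db, *args))  # first poll: the incident is new
print(store(db, *args))  # later poll: same ID, not new
```

A scheduled job that polls every 10 minutes would call `store` for each matching incident and pass only the new ones on to Twitter, the feed, or email. The ID doubles as the path component of a permanent link.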

In the meantime -->

Update: I've modified the script to accept a query string, so you can enter your own search criteria. Examples:


mattbot said...

Not sure if this will do everything you want, but it seems like this is the kind of thing you could do with Yahoo Pipes:

Could make your development a little easier...

Yokota Fritz said...

Thanks Matt. I'm a big user and fan of Yahoo Pipes.

The CHP info is realtime and short lived and isn't RSS. Pipes _can_ check an RSS feed as often as every 10 minutes if you ping Pipes, but the crawl interval is unknown (to me) for non-RSS data sources. I experimented a little with Pipes with the CHP feed but it was missing a _lot_ of incidents.

The part where Pipes could possibly help me, which is digesting the XML info, is trivial to program. The PHP code looks something like this:

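$xml = file_get_contents($CHP_URL);      // fetch the raw feed (false on failure)
$parsed = simplexml_load_string($xml);   // parse it into a SimpleXMLElement
foreach ($parsed->Center as $center) {   // walk each Center element in the feed
    // ... etc ...
}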

I just look at the details I'm interested in and search for the keywords I need.

To do this as a cron job, I plan to rewrite it in Perl or Python, save the cached info in a MySQL database, update an RSS file as needed, and use the Twitter API to send new incidents to Twitter. The Web UI part of it will just query the database instead of going to the CHP feed directly as it does right now.
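As a rough idea of the RSS step, here is a minimal Python sketch that turns cached incidents into an RSS 2.0 document using only the standard library. The site URL, the link layout, and the incident dictionary keys are all placeholders I made up; the database query and the Twitter call are left out.

```python
import xml.etree.ElementTree as ET

def build_rss(incidents, site_url="https://example.com/chp"):
    """Build an RSS 2.0 document from incident dicts (hypothetical keys)."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "CHP bicycle incidents"
    ET.SubElement(channel, "link").text = site_url
    ET.SubElement(channel, "description").text = (
        "CHP traffic incidents mentioning bikes or bicycles"
    )
    for inc in incidents:
        item = ET.SubElement(channel, "item")
        # Placeholder permanent-link scheme: one page per cached incident ID.
        permalink = f"{site_url}/incident/{inc['id']}"
        ET.SubElement(item, "title").text = inc["title"]
        ET.SubElement(item, "link").text = permalink
        ET.SubElement(item, "guid").text = permalink
        ET.SubElement(item, "description").text = inc["details"]
    return ET.tostring(rss, encoding="unicode")

feed = build_rss([
    {
        "id": "abc123",  # hypothetical cache key
        "title": "1125 - Traffic Hazard: SB SR113 JSO CR25A, Woodland",
        "details": "9:56AM 1097 IN THE CD",
    }
])
print(feed)
```

The cron job would regenerate this file only when a poll turns up incidents that are not already in the database.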

bikesgonewild said...

this conversation still in english ???...

...i'm sure i'm displaying my ignorance but just askin'...