Interesting Items from OSCON 2008 – Dealing with Streaming Data

This is a collection of links to things discussed, or just mentioned, at OSCON that I found interesting enough to note. Hopefully one of a series for OSCON 2008, as time allows.

These items are from a great talk on “A Streaming Database” by Rafael J. Fernández-Moctezuma at PDXPUG day.

Hancock is a C-based domain-specific language designed to make it easy to read, write, and maintain programs that manipulate large amounts of relatively uniform data. In addition to C constructs, Hancock provides domain-specific forms to facilitate large-scale data processing.

The CQL continuous query language (google)
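
For anyone new to the idea: a continuous query stays registered and produces a fresh result each time new data arrives, instead of running once over stored rows. The sketch below is plain Python, not CQL syntax, and it is not taken from the talk. The class name SlidingWindowAverage, the 10-second window, and the sample readings are all made up for illustration. It keeps a running average over a sliding time window, which is roughly the kind of thing a windowed stream query computes.

```python
from collections import deque


class SlidingWindowAverage:
    """Running average of the readings seen in the last `window_seconds`.

    Hypothetical illustration only; not part of CQL or any stream engine.
    """

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.items = deque()  # (timestamp, value) pairs, oldest first
        self.total = 0.0

    def push(self, timestamp, value):
        """Add one reading and return the current windowed average."""
        self.items.append((timestamp, value))
        self.total += value
        # Evict readings that have fallen out of the time window.
        while self.items[0][0] <= timestamp - self.window_seconds:
            _, old_value = self.items.popleft()
            self.total -= old_value
        return self.total / len(self.items)


query = SlidingWindowAverage(window_seconds=10)
for ts, reading in [(0, 10.0), (4, 20.0), (12, 30.0)]:
    # Each arrival yields an updated answer; nothing is re-scanned.
    print(ts, query.push(ts, reading))
```

The point of the design is that state is updated incrementally per arrival, so the cost of each new answer does not grow with the total length of the stream.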

Borealis is a distributed stream processing engine. Borealis builds on previous efforts in the area of stream processing: Aurora and Medusa.

CEDR is the Complex Event Detection and Response project from Microsoft Research.

Google Protocol Buffers “allow you to define simple data structures in a special definition language, then compile them to produce classes to represent those structures in the language of your choice”.
That seems similar to Thrift, which is “a software framework for scalable cross-language services development. It combines a powerful software stack with a code generation engine to build services that work efficiently and seamlessly between languages”.

Concurrency and Erlang, and more

Just found the excellent Concurrency and Erlang page by André Pang. (I’m not sure how I got there, but I started from a post by Pedro Melo.)

The page has great links to quality articles and resources with commentary and context for each. It also includes sections specific to C, Objective-C, C++, Java, Python, JavaScript, and Haskell.

What, no Perl? Well, using threads in Perl 5 is rather painful. I’ve never had to use threads with Perl 5 (beyond making DBI thread safe a few years ago) and I’d be happy to never have to.

On the other hand, I believe people are using threads successfully, though I’ve no handy links for you beyond pointing out that CPAN offers a number of solid Thread:: modules.

All this reminded me that I’d never got around to reading Parrot’s Concurrency design document. So I did. I liked it as a statement of direction, though it’s a little thin on the interaction between schedulers.

I couldn’t find many interesting links discussing both Parrot and Erlang. An O’Reilly Radar post called Parrot and Multi-threading from September 2007 was hopeful.

I’m still wondering if Parrot could act as a virtual machine for Erlang. I think that would be a valuable test case for the quality and scalability of the concurrency design.