Architecture Behind Tracksperanto-web

So it’s been quite a while since tracksperanto-web went into production, with more than 300 files converted so far and great results. I hope it has already saved matchmovers worldwide a few sleepless nights, and nothing could be better.

I would like to give a short breakdown of how Tracksperanto on the web is put together (it is an interesting case for background processing). The problem with this kind of application (converters, to be specific) is that a conversion might take quite some time. To deal with that, I decided to write the web UI for Tracksperanto so that requests never block on processing, which means the app makes extensive use of multitasking.

Basically, the workers in the web app are fork-and-forget. Very simple: the worker gets spun off directly from the main application, writes its status into Redis while it runs and, at the end of the job, into the database. Tracksperanto jobs use a lot of processor time and a lot of memory, and with Ruby nobody can guarantee that jobs won’t leak memory. fork() is tops here, since once the job has completed the forked worker just dies off, releasing any memory that was consumed in the process.
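To make the fork-and-forget idea concrete, here is a minimal sketch using only the Ruby standard library. It is not the actual Tracksperanto-web code: the `spawn_worker` helper and the placeholder job are made up for illustration, and the real worker would run the conversion and report its status along the way.

```ruby
# Hypothetical helper, not taken from Tracksperanto-web itself.
# Runs the given block in a forked child and returns right away with its pid.
def spawn_worker(&job)
  pid = fork do
    status = 0
    begin
      job.call
    rescue StandardError
      status = 1
    ensure
      # exit! leaves the child without running at_exit handlers
      # inherited from the parent process.
      exit!(status)
    end
  end
  # Let Ruby reap the child in a background thread so it does not
  # linger as a zombie; the parent never waits for it explicitly.
  Process.detach(pid)
  pid
end

# The parent is not blocked while the (placeholder) job runs.
pid = spawn_worker { sleep 0.1 }
puts "worker #{pid} started, parent carries on"
```

Whatever memory the block allocates lives in the child only, so it goes back to the operating system as soon as the child exits.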

While processing is taking place, the worker process writes its status into Redis, which is perfect for this kind of message bus application. Tracksperanto is designed so that every component can report its own status, and a simple progress bar can be constructed to display the current state. So basically we constantly (many times per second) write the status of the job into Redis: the percent complete and the last status message. To let the user see how processing is going, I’ve made an action in Sinatra that quickly polls Redis for the status and returns it to the polling JavaScript as a JSON hash.
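The status round trip can be sketched roughly like this. The names `JobStatus` and `MemoryStore` and the key format are assumptions made for the example, not the real code. `MemoryStore` only stands in for the key-value store so that the sketch runs in a single process without a server; it exposes `set` and `get`, which the redis gem's client also provides. Across a real fork the worker and the web process do not share memory, which is exactly why an external store is needed.

```ruby
require "json"

# Hypothetical in-memory stand-in for the key-value store.
# Only good for trying the sketch out in one process.
class MemoryStore
  def initialize
    @data = {}
  end

  def set(key, value)
    @data[key] = value
    "OK"
  end

  def get(key)
    @data[key]
  end
end

# Hypothetical wrapper around the status of one job.
class JobStatus
  def initialize(store, job_id)
    @store = store
    @key = "job:#{job_id}:status" # key format is an assumption
  end

  # Called by the worker, possibly many times per second.
  def report(percent, message)
    @store.set(@key, JSON.generate("percent" => percent, "message" => message))
  end

  # Called by the polling action; returns a JSON string for the browser.
  def to_json_response
    @store.get(@key) || JSON.generate("percent" => 0, "message" => "Waiting")
  end
end

store = MemoryStore.new
status = JobStatus.new(store, 42)
status.report(35, "Parsing trackers")
puts status.to_json_response
```

In the web app, a polling action would only need to look up the job and return the result of `to_json_response` as its body, so each poll costs one key lookup and never touches the database.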

This scheme has the following benefits:

  1. Status reporting does not load the database (not needed and the information is hardly crucial).
  2. No accumulated memory leaks (whatever a worker leaks is released when it exits).
  3. No Ruby daemon processes
  4. Start/stop control is tied into the webserver.

Note: lots of stuff removed from this post since it’s no longer relevant.