Asynchronous Processing with girl_friday

I want to introduce you to my new gem, girl_friday. The problem: current asynchronous processing tools for Ruby are too inefficient and too complex.

Efficiency

It’s sad to admit, but with Ruby, processing 5 messages at the same time commonly means spinning up 5 processes, each of which boots your application and loads your code into memory. If each process is 100MB, that’s 500MB, most of which is redundant code.

Threads are the best answer, long term. Threads are hard to get right if you are managing them yourself but they are simpler to use than Fibers and give us real parallelism in JRuby and the upcoming Rubinius 2.0 release. Ruby 1.9’s threading isn’t quite as good but it is still useful for typical IO-heavy, server-side systems. girl_friday uses Actors for safe and simple concurrency on top of Ruby threads. With actors, we get the benefits of threads with fewer drawbacks! Since we are using multiple threads in a single process, the memory overhead is far less than booting another process.
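
As a rough illustration of why threads still help with IO-heavy work under a global interpreter lock, here is a minimal sketch using only Ruby's standard library. It is not girl_friday code: the `fake_io` and `run_concurrently` helpers are made-up names, and `sleep` stands in for a blocking network call, which gives up the lock while it waits.

```ruby
# Sketch only: IO-bound work overlaps across threads inside one process.
# fake_io is a hypothetical stand-in for a socket read or HTTP call;
# sleep releases the interpreter lock much as blocking IO does.
def fake_io(seconds)
  sleep(seconds)
  :done
end

# Runs `count` simulated IO calls on separate threads and returns
# their results together with the wall-clock time taken.
def run_concurrently(count, seconds)
  started = Time.now
  threads = Array.new(count) { Thread.new { fake_io(seconds) } }
  results = threads.map(&:value) # value joins the thread and returns its result
  [results, Time.now - started]
end

results, elapsed = run_concurrently(5, 0.2)
puts "#{results.size} simulated IO calls finished in #{elapsed.round(2)}s"
```

Run serially, the five calls would take about a second. On threads they should finish in roughly the time of one call, all within a single process and a single copy of your application code.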

Complexity

With girl_friday, your queue processing happens in-process with the rest of your application. You don’t need a separate project, deployment, process monitoring and alerts, etc. If your application is running, so is girl_friday.

Usage

You define your queues and how to process incoming messages when your application starts:

    # config/initializers/girl_friday.rb
    EMAIL_QUEUE = GirlFriday::WorkQueue.new(:user_email) do |msg|
      UserMailer.registration_email(msg).deliver
    end

then just push a message hash with your data onto the queue to be processed:

    EMAIL_QUEUE << { :name => @user.name, :email => @user.email }

Dead simple by design.

Design

Each queue in girl_friday is composed of a supervisor actor and a set of worker actors. The supervisor actor is the only one that manages the internal state of a queue. It receives work to perform and hands work to workers as they become available to process more work. If you are interested in the nitty-gritty detail, here’s the WorkQueue class to peruse.
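
The supervisor and worker arrangement described above can be sketched with plain threads and `Queue` from the standard library. To be clear, this is not the real WorkQueue class. `SketchQueue`, its message tags (`:work`, `:ready`, `:shutdown`) and the `STOP` marker are all assumptions made up for illustration, and error handling is left out, so a processor block that raises would stall this sketch.

```ruby
require 'thread'

# Hypothetical sketch, not girl_friday's WorkQueue: one supervisor thread
# owns all queue state and hands messages to workers as they become idle.
class SketchQueue
  STOP = Object.new # private marker telling a worker to exit

  def initialize(size, &processor)
    @inbox = Queue.new # every event the supervisor reacts to
    @processor = processor
    @size = size
    @supervisor = Thread.new { supervise }
  end

  # Push a message; returns self so pushes can be chained.
  def <<(msg)
    @inbox << [:work, msg]
    self
  end

  # Finish the work already queued, then stop the workers.
  def shutdown
    @inbox << [:shutdown]
    @supervisor.join
  end

  private

  def supervise
    pending = [] # waiting messages, touched only by this thread
    idle = []    # mailboxes of workers that asked for work
    workers = Array.new(@size) { spawn_worker }
    stopping = false
    until stopping && pending.empty? && idle.size == @size
      type, payload = @inbox.pop
      case type
      when :work     then pending << payload
      when :ready    then idle << payload
      when :shutdown then stopping = true
      end
      # Hand out work while there is both a message and an idle worker.
      idle.shift << pending.shift until idle.empty? || pending.empty?
    end
    idle.each { |mailbox| mailbox << STOP }
    workers.each(&:join)
  end

  def spawn_worker
    mailbox = Queue.new
    Thread.new do
      loop do
        @inbox << [:ready, mailbox] # tell the supervisor we are free
        msg = mailbox.pop
        break if msg.equal?(STOP)
        @processor.call(msg)
      end
    end
  end
end

sent = Queue.new
queue = SketchQueue.new(3) { |msg| sent << msg[:email] }
queue << { :email => 'a@example.com' } << { :email => 'b@example.com' }
queue.shutdown
puts "processed #{sent.size} messages"
```

Because only the supervisor thread ever touches `pending` and `idle`, no locks are needed around them. Everything else talks to it by sending messages, which is the property that makes the actor style easier to reason about than shared state.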

Advanced Options

girl_friday has a number of nice options built-in already:

  • Send worker errors to Hoptoad Notifier or a custom error processor
  • Persist jobs to Redis so a restart does not lose queued jobs
  • Asynchronous callbacks (call a block with the result when the message is processed)
  • Runtime metrics for monitoring
  • Clean shutdown (stop processing new jobs)

See the girl_friday wiki for more specifics about each of these options.
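
To give a feel for what the error processor and asynchronous callback options do conceptually, here is a small standard-library sketch. It does not use girl_friday's API (see the wiki for the real option names). `process_async` and its arguments are hypothetical, invented only to show the flow: the result goes to a callback, and any exception goes to an error handler along with the message that caused it.

```ruby
# Hypothetical helper, not part of girl_friday: process one message on a
# background thread, pass the result to an optional callback block, and
# route any exception to the error handler instead of losing it.
def process_async(msg, processor, error_handler, &callback)
  Thread.new do
    begin
      result = processor.call(msg)
      callback.call(result) if callback
    rescue => e
      # Errors raised by the callback end up here too.
      error_handler.call(e, msg)
    end
  end
end

report = lambda { |error, msg| warn "failed on #{msg.inspect}: #{error.message}" }
double = lambda { |n| n * 2 }

worker = process_async(21, double, report) { |result| puts "result: #{result}" }
worker.join # only so this script waits for the output before exiting
```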

Caveats and the Future

girl_friday supports Ruby 1.9.2, Rubinius 1.2.3 and JRuby 1.6.0 and above. Moving forward, I’d like to see a web UI added for girl_friday, like Resque’s web UI. If you want to help out, please fork the GitHub project and send pull requests!

  • Jared Carroll

    Mike,

    What are some of your concerns with fibers? I was first introduced to fibers in some of your past work and they seemed like a good compromise in place of threads.

    I’ve never deployed a production Rails app on JRuby or Rubinius. Have you ever used either of those rubies successfully on an app?

    Thanks for the gem, I’m going to have to go learn about actors now ;)

  • http://mikeperham.com Mike Perham

    Fibers are a way to get synchronous behavior out of an asynchronous, reactor-based system (i.e. EventMachine). I have serious issues with the reactor (it makes testing very difficult, for one) but I just think Fibered systems are brittle. Accidentally add one blocking IO mechanism and your process can easily lock up. Debugging is difficult, IO errors can be swallowed silently, etc.

    Fibers are a solution but I just don’t think they will ever achieve popular usage by Rails developers due to the issues I’ve cited.

    • Jared Carroll

      On my current project, which is attempting to be completely non-blocking with respect to I/O, we’re using fibers, and testing has been difficult. It’s also been tough finding the right gems and researching what is currently going on under the hood of the various event-based gems. I’d definitely prefer to move away from fibers.

  • Michael Wynholds

    I would much prefer a good thread model (like Java’s) to a reactor. Fibers can make this async code look nicer, but threads do the same.

    So you say that Rubinius will have lightweight threads in 2.0?

  • http://mikeperham.com Mike Perham

    Rubinius 2.0 will have native threads with no GIL, making it possible to peg multiple cores with a single Ruby process so threads become much more scalable. Only JRuby can do that today. I’d bet they are targeting a release announcement for RailsConf.

  • meskyanichi

    Mike, this is brilliant.

    I have a few questions for you:

    1) If you run MRI 1.9’s native threads, which has the global interpreter lock, wouldn’t this block, for example, the Rails app? If it does, I’m assuming it only blocks for an extremely short time after the “heavy/long” processing is finished, am I right? Or how does this work? Could you give an example?

    2) JRuby (and the Hydra branch of Rubinius) both support native OS threads without a global interpreter lock. Now this definitely makes sense and I can see how well it’ll work here. I have also been experimenting with JRuby on Rails, but wasn’t really that successful yet due to errors and lack of Java web server knowledge. Rubinius however basically worked for my application as a “drop-in replacement”. But what web servers can I use if I want to use Girl Friday with my Rails (or Sinatra or other Rack) apps, like Thin, Unicorn, Passenger, Mongrel, etc.? What works? And does it work out of the box? Do I, for example, need to use the --threaded option on the Thin web server, or is that purely to wrap requests in threads before they even hit the Rack application, and so unrelated? Do I need to un-comment the config.threadsafe! in a Rails app in order to use GirlFriday or can I leave it alone?

    3) If using Rubinius, what application server(s) (Thin, Unicorn, Passenger, Mongrel, etc) do you recommend? Or does it not matter?

    4) Would this work on Heroku’s platform with MRI 1.9.2? (It runs Thin iirc, but I’m not sure how they deal with threading)

    Pardon if some of these questions are obvious! Interested to learn more about this topic (concurrency/threading).

    Thanks!

    • http://mikeperham.com Mike Perham

      Great questions! You know your stuff.

      1) Yes, there is still a GIL but remember that using a Socket for I/O will release the lock so with Threads you will get the same or better concurrency than Fibers since Fibers generally only yield during I/O. Yes, any CPU processing the girl_friday message handler does will block the Rails app because of the GIL. Generally server-side applications are I/O-heavy, not CPU-heavy.

      2) Generally any server that is compatible with threads should work with girl_friday. I recommend Rainbows, a variant of Unicorn, using a ThreadPool for the concurrency model. Passenger and Unicorn will not work – they assume one request per worker process. Threads are not compatible with EM-based servers.

      http://rainbows.rubyforge.org/

      Yes, I would make sure your Rails app is using config.threadsafe!, especially in non-development environments. I’m wondering if I can add some smarts to girl_friday to support Rails’s development mode since code autoloading is incredibly useful.

      3) I would try to use Rainbows with Rubinius. Eric Wong (author of Rainbows) is very responsive in my experience if you have a problem or need a little help.

      4) I honestly have no idea. Give it a try and let me know? :-D

  • http://hedgehogshiatus.com Hedgehog

    Would be interested in what you see as GF’s advantages over DripDrop?
    Or even what disadvantages you see in DripDrop?

    • http://mikeperham.com Mike Perham

      DripDrop isn’t really related to girl_friday. DD is a low-level toolkit allowing you to build network services which all speak one language using 0MQ and EM. girl_friday allows you to add background processing to your Ruby app with 5-10 lines of code. girl_friday has no networking aspect and uses actors (threads), not events.

  • Michael van Rooijen

    Thanks for the detailed information! I have tried Rainbows with Rubinius, but, like all the other app servers I tried, it’s incredibly slow so I’m quite sure that I’m doing something wrong.

    Basically all I did was generate a new Rails app, generate a Post scaffold, enable config.threadsafe!, add rainbows to the Gemfile, use rbx-head-nhydra, add a rainbows.rb configuration file and set [worker_processes 1] [use :ThreadPool] and [worker_connections 10] (and I tried various numbers). But regardless of what I try the response time is ridiculous. It comes out at around 10-30 seconds per request. Same goes for Thin etc. However, the same configuration with Rainbows on MRI 1.9.2 yields about 200 requests per second, which is super fast. I also tried the stable Rubinius (with the GIL) and it performs just as poorly as the hydra branch, which doesn’t have the GIL. I’m wondering what I’m doing wrong. Even under heavy load (ab -n 2000 -c 200) MRI 1.9.2 just yields the 200req/sec while Rubinius does only 6req/sec according to ab, and it also includes 165 failed requests.

    Do you have any ideas? I probably have some incorrect configuration because I doubt anyone would use it in production like this.

    Could I also use Thin with Girl Friday? That’s actually what Heroku uses iirc. If it works then I’ll try it out soon.

    Thanks!

    • http://mikeperham.com Mike Perham

      Give thin a try and let me know what happens. I’m not confident it will work well but it’s an interesting experiment to see what will happen.

      I’m going to play with rbx and rainbows this week and will report back when I have more info.

      • Michael van Rooijen

        Sure thing! Looking forward to your results! :)

  • andy

    What’s the best way to ask questions about girl_friday? This comment thread? A mailing list? The github issue tracker?

    • http://mikeperham.com Mike Perham

      Issue tracker would probably be best. If the wiki documentation doesn’t answer your question, enter a bug and I’ll add content.