Programming concurrent code with threads and shared state is hard to get right. Actors are an attempt to build a safer concurrency model for application developers to use. Erlang uses the actor model as the basis for its concurrency, and while Ruby doesn’t have actors built in, actors can be layered on top of Ruby threads. In this post, I want to introduce you to actors with some examples.
The core idea behind actors is message passing. The concurrent pieces of your code do not share variables; instead they send each other messages, and each message contains a copy of the state to be processed. Since nothing is shared, you don’t need locks and you avoid the data races that come with shared state. A simple idea, but powerful in practice! Rubinius has an actor API built in, so let’s take a look at some examples using that API.
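Before turning to Rubinius, the mailbox idea can be sketched in plain Ruby with nothing but the standard `Thread` and `Queue` classes. This is only an illustration under my own assumptions: `TinyActor` and `sum_with_actor` are hypothetical names, not part of the Rubinius actor API, and the messages here are integers, which are immutable and therefore safe to hand over as-is.

```ruby
# A toy actor: one thread plus one mailbox (a thread-safe Queue).
# TinyActor is a hypothetical class for illustration only; it is not
# the Rubinius Actor API.
class TinyActor
  STOP = :__stop__

  def initialize(&handler)
    @mailbox = Queue.new
    @thread = Thread.new do
      loop do
        msg = @mailbox.pop        # blocks until a message arrives
        break if msg == STOP
        handler.call(msg)         # messages are handled one at a time
      end
    end
  end

  # Deliver a message to the actor's mailbox.
  def <<(msg)
    @mailbox << msg
    self
  end

  # Ask the actor to finish its queued messages, then wait for it.
  def stop
    @mailbox << STOP
    @thread.join
  end
end

# The running total is only ever touched by the actor's thread while the
# actor is alive; the caller reads it after the actor has stopped.
def sum_with_actor(numbers)
  total = 0
  adder = TinyActor.new { |n| total += n }
  numbers.each { |n| adder << n }
  adder.stop
  total
end

puts sum_with_actor(1..100)
```

No lock appears anywhere in the calling code, because the only synchronization point is the mailbox itself.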
You’ll need to install Rubinius:
    rvm install rbx
    rvm use rbx
We’ll start with a simple ping/pong example [full gist]:
    require 'actor'

    pong = nil
    ping = Actor.spawn do
      loop do
        count = Actor.receive
        break puts(count) if count > 1000
        pong << (count + 1)
      end
    end
    pong = Actor.spawn do
      loop do
        count = Actor.receive
        break puts(count) if count > 1000
        ping << (count + 1)
      end
    end
    ping << 1
    sleep 1
Here we spawn two actors that take turns incrementing a counter from 1 until it passes 1000, without any locks. This is possible because the two are simply sending each other the latest value for the counter. One gotcha: the messages you send should not contain references to mutable objects, just basic types. Rubinius’s actor API does not marshal the message, so you need to be careful not to fall into the trap of sharing state between actors.
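The gotcha is easy to reproduce in plain Ruby. The sketch below uses a standard `Queue` as a stand-in for an actor mailbox that does not marshal its messages; `send_shared` and `send_copy` are hypothetical helper names I made up for the comparison. One way to avoid the trap, assuming your messages are marshalable, is to deep-copy them with `Marshal` before sending.

```ruby
# A Queue hands over the very same object, much like an actor API that
# does not marshal its messages.
def send_shared(mailbox, msg)
  mailbox << msg
end

# Deep-copy the message first so the receiver gets its own object graph.
def send_copy(mailbox, msg)
  mailbox << Marshal.load(Marshal.dump(msg))
end

mailbox = Queue.new

original = { "count" => [1, 2, 3] }
send_shared(mailbox, original)
received = mailbox.pop
received["count"] << 4      # the receiver mutates its message...
p original["count"]         # ...and the sender's data changed too: [1, 2, 3, 4]

original = { "count" => [1, 2, 3] }
send_copy(mailbox, original)
received = mailbox.pop
received["count"] << 4
p original["count"]         # the sender's data is untouched: [1, 2, 3]
```

Copying costs time and memory on every send, so sticking to small, immutable messages (numbers, symbols, frozen strings) is often the cheaper discipline.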
What about error handling? What if an actor dies? What if we want a pool of actors? This is the purpose behind the spawn_link API. Hang on to your hats, this example gets a lot more complex [full gist]:
    require 'actor'
    require 'rubinius_fix'

    Ready = Struct.new(:this)
    Work = Struct.new(:msg)

    processor = Proc.new do |msg|
      raise msg.to_s if msg % 7 == 0
      print "Doing some hard work for #{msg}, boss!\n"
    end

    @supervisor = Actor.spawn do
      supervisor = Actor.current
      work_loop = Proc.new do
        loop do
          work = Actor.receive
          result = processor.call(work.msg)
          supervisor << Ready[Actor.current]
        end
      end

      Actor.trap_exit = true
      ready_workers = []
      10.times do |x|
        # start N workers
        ready_workers << Actor.spawn_link(&work_loop)
      end
      loop do
        Actor.receive do |f|
          f.when(Ready) do |who|
            # SNIP
          end
          f.when(Work) do |work|
            ready_workers.pop << work
          end
          f.when(Actor::DeadActorError) do |exit|
            print "Actor exited with message: #{exit.reason}\n"
            ready_workers << Actor.spawn_link(&work_loop)
          end
        end
      end
    end

    10.times do |idx|
      @supervisor << Work[idx]
    end
    sleep 1
Here we create 10 units of work to process and hand those items to the supervisor. The supervisor receives each unit of work, picks a ready worker and hands the work to it. Notice the only mutable state is local to the supervisor actor – it manages the ready_workers array. I’ve simplified much of the inner workings of the supervisor. Note the require of a Rubinius fix: the latest Rubinius release (1.2.3) has a bug which prevents exit notification from working. This bug will be fixed in the next release.
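If you don’t have Rubinius handy, the same supervisor shape can be approximated in plain Ruby. The sketch below is my own assumption of how that might look, not the Rubinius API: `run_supervised`, the `:ready` and `:exit` tags and the `pool_size` parameter are all hypothetical, and a rescued exception stands in for the exit notification that spawn_link and trap_exit provide. It assumes a pool of at least one worker.

```ruby
# Plain-Ruby sketch of a supervised worker pool built on Thread and Queue.
# All names here are hypothetical; a rescued exception plays the role of
# the exit notification.
def run_supervised(jobs, pool_size: 3)
  events   = Queue.new   # the supervisor's mailbox
  results  = []          # only the supervisor (calling thread) touches these
  failures = []

  # Start one worker thread and return its inbox.
  spawn_worker = lambda do
    inbox = Queue.new
    Thread.new do
      begin
        while (job = inbox.pop) != :stop
          raise "bad job #{job}" if job % 7 == 0
          events << [:ready, inbox, job * 2]   # report the result, ask for more
        end
      rescue => e
        events << [:exit, inbox, e.message]    # worker died: notify supervisor
      end
    end
    inbox
  end

  ready     = Array.new(pool_size) { spawn_worker.call }
  pending   = jobs.to_a.dup
  in_flight = 0

  until pending.empty? && in_flight.zero?
    # Hand out work while there are both jobs and idle workers.
    while !pending.empty? && !ready.empty?
      ready.pop << pending.shift
      in_flight += 1
    end

    tag, inbox, payload = events.pop   # wait for the next notification
    in_flight -= 1
    case tag
    when :ready
      results << payload
      ready << inbox                   # the worker is idle again
    when :exit
      failures << payload
      ready << spawn_worker.call       # replace the dead worker
    end
  end

  ready.each { |inbox| inbox << :stop }
  { results: results.sort, failures: failures.sort }
end

p run_supervised((1..10).to_a)
```

With the jobs 1 through 10, job 7 kills its worker, the supervisor records the failure and starts a replacement, and the other nine jobs come back doubled. As in the Rubinius version, the list of ready workers is mutated only by the supervisor.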
This example shows several really useful ideas:
trap_exit and spawn_link together, which notify the supervisor actor when a worker actor dies

You can envision the next step in complexity: a processing pipeline of steps, each with a different-sized pool of workers, but I’ll let you implement that! That’s enough for this introduction to actors. In my next blog post I’ll introduce you to my latest gem which has a working supervisor/worker pool with many more features than shown here. If you have questions, please leave a comment.
If you want to dive further into actors, I’d suggest looking at a few different projects on GitHub. The Revactor and Celluloid projects are actor-based. Tony Arcieri and MenTaLguY are the two guys who’ve worked on actors in Ruby the most over the years. Both are a constant source of new and interesting projects for concurrency. Of course, don’t forget me too!