Testing Doesn’t Scale

The Ruby community’s obsession with testing is unrivaled. Over the years, Rubyists have gone from old school TDD using test/unit, to modern BDD with RSpec and finally to comprehensive integration testing, including JavaScript support, via Cucumber. The goal was tests at all layers and to get as close as possible to simulating a real browser.

There’s no question that extensive test suites allow us to rapidly develop and deploy apps with confidence. Unfortunately, as apps grow, so do their test suites. Eventually a once-agile test suite becomes massive enough to slow development. Then it’s hard to find anyone willing to work on the app.

I’ve seen this happen time and time again on apps and I know I’m not alone. An app’s testing strategy has now become a major factor in its long-term success. How can we manage these large test suites? Let’s take a look at some possible solutions that might be able to help us out.

Continue reading

Posted in Design, Process | 33 Comments

Improving Resque’s memory efficiency

Resque is a very popular message queueing system for Rails applications.  Here’s how I recently improved the memory efficiency of a Carbon Five customer’s Resque processing farm by 68x!

The Problem

This customer has an existing investment in Resque and is a heavy user of a third-party Java API, so they need to run their Resque workers using JRuby.  Unfortunately this gets them the worst of all worlds:

  • JRuby does not fork so they don’t get the benefit of memory isolation by working in a child process
  • The JVM is relatively demanding in terms of memory
  • Resque is single threaded

To scale to the levels they wanted to get to, they projected that they would need to run hundreds of Resque processes, each consuming 512MB.  Insanity! If your problem requires a lot of concurrency to solve, using lots of large processes is a terrible idea.

The Solution


Figure 1 – The improvement is obvious. I’m still trying to figure out how to provision 2.25 machines though.

To fix this I spent a few days modifying Resque to use multiple threads when on JRuby. The changes were relatively straightforward; Resque was already thread safe so no brain surgery was required. There were three major changes:

  1. Modify the main processing method so it spawns N threads, each of which runs the work loop
  2. Modify the redis connection to use a connection pool so the connection does not become a point of contention with lots of threads
  3. Modify the signal handling so Resque can shut down gracefully (applications cannot use SIGQUIT in JRuby)
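
To make the shape of those three changes concrete, here is a minimal sketch using only Ruby's standard library. It is not the code from the modified Resque: `TinyPool`, `run_workers`, and the string "connections" are hypothetical stand-ins, and closing the job queue stands in for the signal-driven graceful shutdown.

```ruby
require "thread"

# Hypothetical stand-in for a pooled Redis connection; a real worker would
# hold Redis clients here instead of plain strings.
class TinyPool
  def initialize(size)
    @slots = Queue.new
    size.times { |i| @slots << "conn-#{i}" }
  end

  # Check a connection out, yield it, and always return it to the pool.
  def with
    conn = @slots.pop
    yield conn
  ensure
    @slots << conn if conn
  end
end

# Spawn thread_count threads that each run the same work loop until the
# shared job queue is closed and drained.
def run_workers(jobs, pool, thread_count)
  results = Queue.new
  threads = Array.new(thread_count) do
    Thread.new do
      # pop returns nil once the queue is closed and empty, ending the loop
      while (job = jobs.pop)
        pool.with { |conn| results << [job, conn] }
      end
    end
  end
  threads.each(&:join)
  Array.new(results.size) { results.pop }
end

jobs = Queue.new
20.times { |i| jobs << i }
jobs.close # stand-in for graceful shutdown: finish what is queued, then stop

processed = run_workers(jobs, TinyPool.new(3), 5)
puts "processed #{processed.size} jobs"
```

With fewer connections than threads, workers block briefly on the pool instead of all contending for one shared connection, which is the point of the second change.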

Before my changes they were running 9 machines with a total of 135 processes, each with 512MB of RAM, for each test run. Subsequent testing has shown that a single JRuby process with 135 threads and 1GB of RAM performs just as quickly, so they’ve gone from 68GB to 1GB of RAM and from nine machines to just one. Now instead of a large processing farm, they just need a small garden. :-)
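
For anyone checking the 68x figure, the arithmetic works out as follows (a quick sketch; the variable names are mine, the numbers are the ones quoted above):

```ruby
processes      = 135
mb_per_process = 512

before_gb = processes * mb_per_process / 1024.0 # total RAM across all processes
after_gb  = 1.0                                 # single JRuby process

puts "before: #{before_gb} GB, after: #{after_gb} GB, ratio: #{(before_gb / after_gb).round}x"
```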

You can find my modified Resque project on github.

Posted in Web | 6 Comments

Behavior Driven Development for node.js

On Tuesday night, the San Francisco Server Side JavaScripters had an excellent meetup hosted by the folks at CBS Interactive. I gave a short talk on using zombie.js with jasmine-node to test drive your node.js application under automation. Here is the video:

Here are the slides from my presentation:
http://node-bdd.heroku.com

And here is the source code:
https://github.com/blindsey/node-bdd

Enjoy and good luck with your integration tests! Hope to see you at the next meetup.

Posted in Process, Web | 1 Comment

Valium extracts Model attributes without instantiating ActiveRecord objects

I was recently on a project that captures and logs data as ActiveRecord models.  Each datum had 10 or so numeric attributes.  One story required pulling out all the values for a particular attribute in a time range (e.g. all the temperatures for the last week).  This could involve thousands of rows from the database.  I cringed at the thought of instantiating that many ActiveRecord objects.  I was not, however, eager to start executing raw SQL in the project just for efficiency; it usually violates DRY, makes testing harder and makes the code more fragile.

Enter Valium, a neat gem that makes it easy to use your existing ActiveRecord code – but only pull out the attributes you want.  No ActiveRecord instantiation.  Ernie Miller, the creator, has a great post on the hows and whys of using this gem.  After testing it out, I was very pleased.  It was exactly the solution I was looking for.  All you do is add .values_of :field_1, :field_2, … :field_n at the end of your Relation.

Note:  I updated this post to use 0.4.0’s new syntax.

Old code changed from

fields = Dummy.since(date).select(:some_field).map(&:some_field)
fields_and_dates = Dummy.since(date).select('some_field, sample_date') \
     .collect{|d| [d.some_field, d.sample_date] }

To

fields = Dummy.since(date).value_of :some_field
fields_and_dates = Dummy.since(date).values_of :some_field, :sample_date

Not only does this look cleaner (especially the case where you are extracting multiple fields), but you are not instantiating a bunch of ActiveRecord objects.  You can reuse your existing scopes and other Relations.  No SQL required. For a quick performance test, I created a Dummy model with 20 string fields.  I seeded the database with 20,000 of these models, where each field was just the current timestamp as a string.  I then timed how long it took to extract 2 of the 20 fields using each method:

# Method 1, creating AR objects for each row
Dummy.select('field_1, field_2').collect{ |d| [d.field_1, d.field_2]}

# Method 2, using Valium to extract data w/out AR objects
Dummy.values_of :field_1, :field_2

The second (Valium) version was 4x faster. When I pulled out more fields, the performance gap narrowed. For 5, 10 and 20 fields, Valium was still about 3x faster. This is just an informal test based on execution time, and doesn’t consider the object churn and memory involved in creating ActiveRecord objects for the first (non-Valium) method. Ernie posted some benchmarks (and his script) comparing using select/map to Valium.
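
If you want to run a similar informal timing yourself, a harness along these lines would do. This is only a sketch: `Row` is a plain Struct standing in for a model (it is not ActiveRecord and does not use Valium), `best_of` is a hypothetical helper, and the two lambdas would be replaced by the two queries above. It prints whatever timings your machine produces; it does not reproduce my numbers.

```ruby
require "benchmark"

# Hypothetical stand-in for a model with 20 string fields; not ActiveRecord.
FIELDS = (1..20).map { |i| :"field_#{i}" }
Row = Struct.new(*FIELDS)

stamp = Time.now.to_s
raw_rows = Array.new(20_000) { Array.new(FIELDS.size, stamp) }

# Method 1 analogue: wrap every row in an object, then read two attributes.
with_objects = lambda do
  raw_rows.map { |values| Row.new(*values) }.collect { |d| [d.field_1, d.field_2] }
end

# Method 2 analogue: read the two values straight from the raw row.
without_objects = lambda do
  raw_rows.map { |values| [values[0], values[1]] }
end

# Run a block several times and keep the best wall-clock time.
def best_of(runs, &block)
  Array.new(runs) { Benchmark.realtime(&block) }.min
end

t1 = best_of(5, &with_objects)
t2 = best_of(5, &without_objects)
puts format("objects: %.4fs, raw values: %.4fs", t1, t2)
```

Taking the best of several runs smooths out warm-up and garbage collection noise, which matters when the difference being measured is largely object churn.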

Posted in Web | 2 Comments

A Look at Test Generation in Cucumber and ScalaCheck

In a typical agile project the test suite grows roughly twice as fast as the non-test code. As developers, our goal should be the minimum amount of test code that specifies the behavior of the app. This often leads to using “clever” metaprogramming techniques to reduce boilerplate and overall lines of test code. Metaprogramming works fairly well, but the tradeoff is usually a cryptic implementation and hard-to-debug test failures. Some testing frameworks, however, offer alternatives to metaprogramming. Two such tools are the Ruby gem Cucumber and Scala’s ScalaCheck.

Continue reading

Posted in Design | 2 Comments

Deploying node.js on Amazon EC2

After nearly a month of beating my head against the wall that is hosted node.js stacks — with their fake beta invites and non-existent support — I decided it was time to take matters into my own hands. Amazon Web Services (AWS) offers 12 months of a micro instance for free (as in beer) with 10 GB of disk and 613 MB of memory. This is perfect for an acceptance server running node. All you need to do is sign up with a new email address and provide a credit card. Totally worth it. After 12 months, the price will jump to roughly $15 a month.

I’m a huge fan of Debian and its progeny Ubuntu. The guys over at http://www.alestic.com/ do a great job of providing Amazon Machine Images (AMIs) that are production ready. I chose to use Ubuntu 10.04 LTS because it will be supported until April of 2015. The 64-bit AMI for the us-east region is ami-63be790a. Feel free to choose one that best suits your needs.
Continue reading

Posted in Ops, Web | 30 Comments

Agile Team and Product Building: Our 2012 SXSW Panel Proposals

This year’s SXSW was a great experience; we saw old friends, made new ones, gave presentations on agile games and starting Cassandra, and learned a lot from others. We’ll be back and hopefully, with your help, presenting a few of the panels we’ve proposed around the ideas of agile development, team building, and product design.

Perhaps we can convince you to vote for the “Art of Persuasion” panel? David Hendee has brought together a jury consultant, a biologist, a psychologist and an entrepreneur in a cross discipline discussion on how to harness the lessons of behavior change without necessarily changing your product.

Of course, there comes a time when you’ll want to make a change, add a feature, or develop a whole new product. So how to vet those ideas? The “Fake it Till You Make It” panel will cover tools and techniques to rapidly prototype, user test, and iterate the features in an agile design process.

The ability to rapidly deploy features on the web has become a standard and expected practice, but shifting it to native platforms such as iOS and Android, with external review processes and outdated clients, can be a tricky proposition. Developers who have gone through the transition share the lessons and tools learned in “Agile apps: effective mobile & native development”

In “Continuous Planning: Story Mapping for Agile Teams”, Jeff Patton and our own Ben Lindsey walk through not only the writing and prioritization of stories but also the importance of collecting and acting on process data for better planning.

Finding and retaining the right people on your team is not simply a matter of their skill or your own resources, especially when everyone else is trying to do the same. In “Company Culture During the Gold Rush”, Christian Nelson leads a discussion on how concretely communicating amorphous values plays a great part in signing up talented individuals and getting them to stay.

We don’t just want to share our experiences; we want to hear yours too, either at the sessions or over a drink. So, please follow the links above to vote; it’s an important part of the selection process. And when you see any of us next year (or sooner!), say “Hello” and we’ll grab that drink.

Posted in Everything Else | Leave a comment

Designing mobile APIs – dynamic content

On mobile devices, native UIs offer superior responsiveness and performance, but web views offer flexible layouts and data-driven content. How can we combine the strengths of both to produce a highly responsive UI that can display dynamic data from a remote server?
Continue reading

Posted in Mobile | Leave a comment

Join Us for CarboNite: A Half-Day Hack-athon

Every two weeks at Carbon Five we have a half-day Project Time where we get to work on projects of our own choosing, with colleagues from different teams. The goal is to create new libraries, learn new technologies, generate subjects to discuss in our weekly brown-bag, and of course keep Carbon Five being the fun place it is. Project Time then leads into a bi-weekly Hack Night at both our office locations where we invite you, our friends, to come in with your own projects.

This week’s Project Time is going to be different, as we open up our doors to you for CarboNite! Starting at 12:30pm (though you can drop by anytime), in both San Francisco and Los Angeles, we will pitch, create, and present a project in a seven-hour window. Prizes for Nerdiest, Most Fun, and Most Creative will then be voted on by the group. No coding required – designers and other types are welcome as there is no shame in a presentation based prototype. And we definitely want you outside partners to join in; in fact, bonus points will be awarded for mixed-office teams!

So please share the details with your friends and join the fun.

Posted in Everything Else | Leave a comment

Designing mobile APIs – error handling

While designing an API I need to provide reliable error responses to both protocol and application level errors. Here I’ll consider any error response generated outside our application stack to be a protocol error, while errors returned from my application’s codebase are classified as application level errors. On a recent project I started to conflate these two categories, which would have made the client implementation unnecessarily difficult if my pair had not intervened.
Continue reading

Posted in Mobile | Leave a comment