Testing Doesn’t Scale

The Ruby community’s obsession with testing is unrivaled. Over the years, Rubyists have gone from old-school TDD using test/unit, to modern BDD with RSpec, and finally to comprehensive integration testing, including JavaScript support, via Cucumber. The goal was to have tests at every layer and to get as close as possible to simulating a real browser.

There’s no question that extensive test suites allow us to rapidly develop and deploy apps with confidence. Unfortunately, as apps grow, so do their test suites. Eventually a once-agile test suite becomes massive enough to slow development, and then it’s hard to find anyone willing to work on the app.

I’ve seen this happen time and time again, and I know I’m not alone. An app’s testing strategy has become a major factor in its long-term success. How can we manage these large test suites? Let’s take a look at some possible solutions.

Parallelizing Tests

The first solution most people look into is hardware. Splitting a test suite into groups and then running each group in a separate process, or even on a separate machine, will reduce test times. parallel_tests, testjour and hydra are a few gems that take this approach.
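To make the idea of "splitting a test suite into groups" concrete, here is a minimal sketch in plain Ruby. The split_into_groups helper and the file names are hypothetical, and this is not how any of the gems above work internally; real tools typically balance groups with something smarter than a round-robin deal.

```ruby
# Hypothetical helper (not part of any gem): deal test files into
# `group_count` groups, round-robin, so each group can be handed to
# its own process or machine.
def split_into_groups(files, group_count)
  groups = Array.new(group_count) { [] }
  # Sort first so every machine computes the same grouping.
  files.sort.each_with_index do |file, index|
    groups[index % group_count] << file
  end
  groups
end

# Made-up file names, for illustration only.
files = %w[
  spec/models/user_spec.rb
  spec/models/order_spec.rb
  spec/controllers/orders_controller_spec.rb
  spec/requests/checkout_spec.rb
  spec/helpers/price_helper_spec.rb
]

split_into_groups(files, 2).each_with_index do |group, index|
  puts "group #{index}: #{group.join(' ')}"
end
```

Dealing files out by count ignores how long each file takes to run, so one slow group can still dominate the total time.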

Parallelizing your tests works; they will run faster. However, I’ve always had issues with setup and configuration, whether it’s a problem with a particular version of Ruby or a tool that doesn’t run on Linux. Consistent test runs have also been a challenge. Random test failures that are difficult to debug are often blamed on obscure causes like race conditions. Hardware isn’t free either, so in the end cost will determine how well this works for you.

Run Fewer Tests

This is an interesting approach. Instead of running the full test suite before integrating your changes, you run just a subset of it. A continuous integration server is then responsible for running the full suite.
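One way to pick the subset is to run only the specs that correspond to the files you changed. The sketch below assumes a naming convention where app/models/user.rb is covered by spec/models/user_spec.rb; the spec_for and specs_to_run helpers are hypothetical, and the convention is an assumption rather than something every app follows.

```ruby
# Hypothetical mapping from a changed file to the spec that covers it.
# Assumes app/foo/bar.rb is tested by spec/foo/bar_spec.rb.
def spec_for(path)
  # A changed spec file should simply be run itself.
  return path if path.start_with?("spec/") && path.end_with?("_spec.rb")
  # Anything outside app/ (docs, config, ...) has no obvious spec.
  return nil unless path.start_with?("app/") && path.end_with?(".rb")

  path.sub(%r{\Aapp/}, "spec/").sub(/\.rb\z/, "_spec.rb")
end

def specs_to_run(changed_files)
  changed_files.map { |path| spec_for(path) }.compact.uniq
end

# Made-up list of changed files; in practice this might come from
# your version control system.
changed = %w[
  app/models/user.rb
  spec/models/user_spec.rb
  README.md
  app/controllers/orders_controller.rb
]

puts specs_to_run(changed)
```

A mapping like this only sees direct counterparts. It has no idea that a change to a model can break a controller or an integration test, which is why the full suite still has to run somewhere.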

Running fewer tests is probably the easiest and most widely used way of dealing with a slow test suite. Of course, you can still break the build when your change breaks something outside the subset of tests you ran. Broken builds will become more common and you’ll gradually start to lose confidence in your main development line. And your test suite is still slow.

Mock Everything

Mocking is a testing technique where every object is tested in isolation and all external dependencies (e.g. a database) are removed from the tests. Because the dependencies are mocked, end-to-end integration tests are still necessary to verify that the real pieces work together.
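As a small illustration of testing an object in isolation, here is a sketch with a hand-rolled stand-in instead of a mocking library. OrderTotal, FakeTaxRates and rate_for are hypothetical names invented for this example; in a real app the collaborator might be a database-backed model.

```ruby
# Hypothetical class under test: computes an order total using a tax
# rate looked up from a collaborator.
class OrderTotal
  def initialize(tax_rates)
    @tax_rates = tax_rates
  end

  def total_cents(subtotal_cents, region)
    rate = @tax_rates.rate_for(region)
    (subtotal_cents * (1 + rate)).round
  end
end

# Hand-rolled stand-in for the collaborator. It returns a canned rate
# and records how it was called, so no database is touched.
class FakeTaxRates
  attr_reader :requested_regions

  def initialize(rate)
    @rate = rate
    @requested_regions = []
  end

  def rate_for(region)
    @requested_regions << region
    @rate
  end
end

fake = FakeTaxRates.new(0.25)
total = OrderTotal.new(fake).total_cents(1000, "CA")

raise "wrong total" unless total == 1250
raise "collaborator not consulted" unless fake.requested_regions == ["CA"]
puts "isolated test passed"
```

The test is fast because nothing real is involved, but it also proves nothing about whether the real tax lookup behaves like the fake.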

Every time I’ve seen mocking used heavily, it has turned out badly. Usually this was due to a large amount of ugly, brittle mocking code. Elegant mocking is an art and everyone seems to have their own style. It can also be a challenge to find a team of developers who are all pro-mocking and understand it. When someone is first exposed to mocking and the resulting test code, their reaction is almost always negative. I tend to feel the same way. Mocking is always a tough sell. In the end you’ll have faster tests, but the resulting test code is complicated and difficult to maintain.

Service-Oriented Design

In Service-Oriented Design with Ruby and Rails, author Paul Dix outlines an app architecture built on services. Each service is a separate, smaller app supporting a single aspect of the main app. The main app then interfaces with these services.

This isn’t a new idea, but it feels like a very promising approach. If you took an app with, say, a 45-minute test suite and broke it into several smaller apps, each smaller app could potentially have a more focused, faster test suite.

The tradeoff here is complexity, i.e. overengineering. Each developer now has to run several smaller apps just to run the main app. The main app’s test suite is also going to need to mock out the smaller services. I’ve yet to attempt this architecture, mainly because it’s very hard to convince your teammates that a brand new app needs to be partitioned into several smaller apps. A team is more likely to support this approach as a refactor of a mature, monolithic app. Unfortunately, at that point no one seems to have the energy, time or money for a large-scale, time-consuming refactor.
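To show what "mocking out the smaller services" might look like from the main app’s side, here is a minimal sketch. The UsersClient class, the /users/42 path and the JSON shape are all hypothetical, and this is not taken from the book; it only assumes that the main app talks to a service through a small client whose transport can be swapped out.

```ruby
require "json"

# Hypothetical client the main app uses to talk to a separate
# "users" service.
class UsersClient
  # `transport` is anything that responds to call(path) and returns a
  # JSON string. In production it would wrap a real HTTP request.
  def initialize(transport)
    @transport = transport
  end

  def name_for(id)
    body = @transport.call("/users/#{id}")
    JSON.parse(body).fetch("name")
  end
end

# In the main app's test suite, a lambda with canned responses stands
# in for the real service, so the service does not need to be running.
canned = { "/users/42" => JSON.generate({ "name" => "Ada" }) }
fake_transport = ->(path) { canned.fetch(path) }

client = UsersClient.new(fake_transport)
puts client.name_for(42)
```

Every canned response is one more thing that can drift from what the real service returns, which is part of the complexity cost described above.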

Fewer Features (Simpler Apps)

Most agile-developed apps have a very similar life. Features are written, discussed, estimated, and then put in a release. Releases usually happen every 1-2 weeks, possibly every month. Add in 5-10 developers and a year’s time, and odds are you’ll end up with a codebase that no one wants to touch, mainly because of a disgustingly slow test suite.

Maybe agile gives us a little too much flexibility? As a developer, you try to push back, but there are only so many times you can tell a client “no”; it’s their app you’re building, not yours. It’s even harder if the client has the money to spend and you don’t have any new work in the pipeline. However, if you can convince a client to take a simpler approach, that means fewer features and fewer tests.

Can Testing Scale?

Any of these techniques can alleviate some of the pain of a slow test suite, but sadly they end up being nothing more than stopgap solutions. I’ve tried most of them and I still don’t prefer one over another. Maybe we need to choose what we test more carefully instead of testing everything by default. More tests always sounds great at first, but it doesn’t mean much to a 2-year-old Rails app with a 60-minute test suite.

Don’t get me wrong, I love testing. I enjoy writing test code more than non-test code, but it’s sad to watch a promising young app get bombarded with tests and, within a few months, become completely miserable to work on. Unfortunately there’s no straightforward solution right now. Maybe it’s time to jump ship. Node.js? Scala?

tl;dr

  • Rubyists are obsessed with testing
  • This obsession is costing us our agility
  • There are ways to speed up tests
  • The reality is that scaling tests is hard