Multithreaded Testing


Every now and then you’ll work on something that needs to handle requests from multiple concurrent threads in a special way. I say “special way” because in a web application, everything needs to handle being executed concurrently, and there are a slew of techniques for handling this (prototypes, thread locals, stateless services, etc.). Here’s an example of what I mean by “special”…

On my current project, we have a queue of articles that need human-user attention (i.e. editorial moderation). Each article must be doled out to only one moderator, and there are multiple instances of the web application servicing requests in the cluster. Imagine tens of thousands of articles per day and a team of moderators churning through them. We can’t rely on Java synchronization because it only works within a single JVM instance, not across server instances.

The simplified version of the service interface we’re working on looks like this:

public interface ArticleService {
    Article findNextArticleForModeration();
}

What makes this interesting is that we must ensure that the service doesn’t hand out the same Article to more than one user. This is impossible to assert using a single thread. We’ve all been told that multiple threads and automated testing don’t mix. It’s generally true, and such tests should be avoided if at all possible, but in some cases they are the only way we can truly assert specific behavior. I’ve found a pretty simple way to do this type of testing in a reliable, consistent, and non-disruptive manner. Even though the technique leverages the concurrency utilities built into Java 1.5, most of the engineers who have seen it were surprised that such testing is so easy to implement.

Given the above service interface, here’s a test that will assert that no single article is given out to more than one invoker of the method findNextArticleForModeration(). The scenario we’re simulating is 10 users feverishly moderating a queue of 250 articles as quickly as possible.

// Needs java.util.* and java.util.concurrent.*, plus static imports of
// assertThat, is, not and nullValue from JUnit/Hamcrest.
public void findNextArticleForModerationStressTest() throws Exception {
    final int ARTICLE_COUNT = 250;
    final int THREAD_COUNT = 10;

    // Create test data and callable tasks
    Set<Article> testArticles = new HashSet<Article>();

    Collection<Callable<Article>> tasks = new ArrayList<Callable<Article>>();
    for (int i = 0; i < ARTICLE_COUNT; i++) {
        // Test data
        testArticles.add(new Article());

        // Tasks - each task makes exactly one service invocation.
        tasks.add(new Callable<Article>() {
            public Article call() throws Exception {
                return articleService.findNextArticleForModeration();
            }
        });
    }

    // Execute tasks
    ExecutorService executorService = Executors.newFixedThreadPool(THREAD_COUNT);
    // invokeAll() blocks until all tasks have run...
    List<Future<Article>> futures = executorService.invokeAll(tasks);
    // ...so the pool can be released before asserting.
    executorService.shutdown();
    assertThat(futures.size(), is(ARTICLE_COUNT));

    // Assertions
    Set<Object> articleIds = new HashSet<Object>(ARTICLE_COUNT);
    for (Future<Article> future : futures) {
        // get() will throw an exception if an exception was thrown by the service.
        Article article = future.get();
        // Did we get an article?
        assertThat(article, not(nullValue()));
        // Did the service lock the article before returning?
        assertThat(article.isLocked(), is(true));
        // Is the article id unique (see Set.add() javadoc)?
        assertThat(articleIds.add(article.getId()), is(true));
    }
    // Did we get the right number of article ids?
    assertThat(articleIds.size(), is(ARTICLE_COUNT));
}

The test starts off by creating 250 test articles to be moderated. It also creates 250 ‘tasks’, each designed to make a single service invocation of findNextArticleForModeration(). The real magic happens in Executors.newFixedThreadPool() and executorService.invokeAll(). The first creates a new ExecutorService backed by a thread pool of the specified size. This is a generic ExecutorService that is designed to churn through tasks using all of the threads in the pool. invokeAll blocks until every task has finished executing. In this test, 10 threads will rip through 250 tasks, each making a single call to our service and capturing the result of that call. Each task execution results in a Future, which is a handle to the results of the task (and more).
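To see the newFixedThreadPool() and invokeAll() mechanics in isolation, here is a minimal self-contained sketch. It is not the project's code: the class name InvokeAllSketch, the method drainConcurrently, and the in-memory ConcurrentLinkedQueue standing in for the real ArticleService are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class InvokeAllSketch {

    /** Hands out taskCount queued ids using threadCount threads and returns the distinct ids. */
    public static Set<Integer> drainConcurrently(int taskCount, int threadCount) {
        // Hypothetical stand-in for the real service: a thread-safe in-memory queue of ids.
        final Queue<Integer> queue = new ConcurrentLinkedQueue<Integer>();
        for (int i = 0; i < taskCount; i++) {
            queue.add(i);
        }

        // One task per id; each task makes exactly one "service" call.
        List<Callable<Integer>> tasks = new ArrayList<Callable<Integer>>();
        for (int i = 0; i < taskCount; i++) {
            tasks.add(new Callable<Integer>() {
                public Integer call() {
                    return queue.poll();
                }
            });
        }

        ExecutorService pool = Executors.newFixedThreadPool(threadCount);
        try {
            Set<Integer> handedOut = new HashSet<Integer>();
            // invokeAll() blocks until every task has completed.
            for (Future<Integer> future : pool.invokeAll(tasks)) {
                Integer id = future.get();
                // Set.add() returns false for a duplicate, i.e. an id handed out twice.
                if (id == null || !handedOut.add(id)) {
                    throw new IllegalStateException("missing or duplicate id: " + id);
                }
            }
            return handedOut;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(drainConcurrently(250, 10).size()); // prints 250
    }
}
```

Swapping the thread-safe queue for a plain LinkedList is a quick way to see how such a test can fail when the thing being tested is not safe for concurrent use, although a failure on any given run is not guaranteed.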

Iterating over each resulting future, we make several assertions. The most important one is the last, where we assert that every task is given a unique Article. Thanks to the natural semantics of Set, this is easy to do in an elegant way. Another useful, though unexpected, feature is that if an exception occurs during the task execution, an ExecutionException will be thrown when get() is called on the corresponding Future. If our service fails for some reason, the test will fail because no exceptions are expected.
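The exception behavior of Future.get() can be shown on its own. In this small sketch, the class name FutureFailureSketch and the failing task are hypothetical; the task stands in for a service call that blows up, and get() rethrows the failure wrapped in an ExecutionException whose cause is the original exception.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FutureFailureSketch {

    /** Runs one failing task and returns the message of the exception that get() surfaces. */
    public static String causeMessageOfFailedTask() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<String> future = pool.submit(new Callable<String>() {
                public String call() {
                    // Hypothetical failure standing in for a service error.
                    throw new IllegalStateException("queue unavailable");
                }
            });
            try {
                future.get();
                return null; // not reached: the task always fails
            } catch (ExecutionException e) {
                // The original exception thrown inside call() is available as the cause.
                return e.getCause().getMessage();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return null;
            }
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(causeMessageOfFailedTask()); // prints "queue unavailable"
    }
}
```

In a test method declared with throws Exception, no catch block is needed: the ExecutionException simply propagates and fails the test.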

This technique makes simulating a multi-threaded environment in a test easy and readable. It’s important to only use this technique when it’s really necessary. The resulting test is more of an integration test than a unit test, and its run time is orders of magnitude longer than a unit test’s, so overuse of the technique will inflate the time it takes to run the tests. After I’ve finished working on the component under test, I reduce the test-data size and thread count to a level at which the test still provides value but is no longer a stress test (e.g. 10 articles and 2 threads). The next time the component is being worked on, the developer can crank up the values and run the tests to be confident that the behavior isn’t broken.
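One way to make the sizes easy to crank up without editing the test is to read them from system properties with small defaults. This is only a sketch of that idea, not what the original project does: the class name StressSettings and the property names stress.articleCount and stress.threadCount are made up for illustration.

```java
public class StressSettings {

    /** Number of articles to create; defaults to a small, fast value. */
    public static int articleCount() {
        // Integer.getInteger reads a system property and falls back to the default.
        return Integer.getInteger("stress.articleCount", 10);
    }

    /** Number of threads in the pool; defaults to the minimum that is still concurrent. */
    public static int threadCount() {
        return Integer.getInteger("stress.threadCount", 2);
    }

    public static void main(String[] args) {
        System.out.println(articleCount() + " articles, " + threadCount() + " threads");
    }
}
```

The test would then use StressSettings.articleCount() and StressSettings.threadCount() in place of the hard-coded constants, and a developer working on the component could pass larger values as JVM -D flags for a stress run.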

The complete source for a working example of this technique is available here. You’ll need Maven (or IntelliJ IDEA 7.x) to build and run the test. By default, the tests run against an in-memory H2Database instance, but if you look at you’ll see configurations for PostgreSQL and MySQL as well.

Happy testing!


  Comments: 7

  1. OR… Terracotta and fuhgetaboutit.

  2. (it should be noted that the previous comment was for threads across multiple VMs, but doesn’t really speak to testing in a multithreaded environment, which this article does a pretty good job of doing).

  3. You could also look into using a latch or a cyclic barrier. Java Concurrency in Practice’s chapter on testing describes these quite thoroughly.

  4. Anthony Williams

    Though this technique is certainly useful, and something that I’ve applied myself, it is worth noting that this is not a flawless technique. What it does is throw a load of threads at a problem and see whether anything bad happens. However, if there is a race condition that is especially timing-sensitive, this sort of testing still won’t necessarily find it: it may well happen that the bad timing just didn’t happen. You can bet that race conditions that don’t happen in testing *will* happen on your customer’s system.

    Also, there is the issue of visibility. On single CPU systems, or dual-core systems that share a cache between cores, threads may see changes to data made by other threads even when suitable synchronization is not being used. If the application is then run on a true multi-processor system, such latent bugs may be exposed.

    So: this is a worthwhile tool to have, but it’s not a silver-bullet.

  5. I’m not sure when I last saw a software engineering “silver bullet”; perhaps never.

    This style of testing should be used sparingly and only to assert behavior that can’t be tested some other, more deterministic way. I do think it can be very effective, and while there is always a chance the “bad behavior” doesn’t surface for small sets, that chance rapidly nears zero once you increase the sample size and thread count. For example, for the project where this technique was first employed, we consistently saw issues with sample sets as small as 10 with 2 threads. When we were close to a solution, we cranked the sample set to 1k and 5k with 50-100 threads to see if it would break under extreme load. With that data size and load, we were confident that we’d found the right solution.

    There may be scenarios where multi-CPU configurations expose some unexpected behavior, but in the example and sample code, there are no such scenarios. There is no shared state between invocations and you can run as many concurrent threads as the VM can handle, on any number of CPUs, without issue. It’s a rather simple solution to a fairly common real-world problem.

  6. Anthony Williams

    I forget who it was who said “testing can’t prove a system correct; it can only demonstrate that it isn’t”.

    Testing is all about getting confidence that your code is correct. As you say, this technique can find bugs, and so when it then ceases to find bugs in a given bit of code even as you increase the load it gives you confidence that your code is working.

    However, I’ve been doing lots of multi-threaded work in C++ recently, and I’m painfully aware that multi-threaded synchronization is one aspect that is particularly hard to test: you need to carefully think through the synchronization requirements before coding.

  7. Check out this new extension for JUnit posted here
