Same Team, New Tricks

Michael Avila

I am writing this to explore the following line of thought:

  • We can increase the value we generate without expanding our talent pool
  • Improving quality creates the most value
  • Inflexibility keeps us from improving
  • Ignoring queues is the primary cause of our inflexibility
  • Kanban is the method for managing our queues
  • Once flexibility exists, we can use it to add value by increasing quality

I will be exploring this in two major sections. The first will describe why flexibility matters. The second will prescribe a possible solution: Kanban. Get comfortable; this post is a little long.

Why Flexibility Matters

Writing software is a social activity. All aspects of the software we produce are inextricably tied to the people involved in its design and their interactions when designing it. (We call this idea Conway’s Law.) Essentially, the solution at any time is an emergent property of the team. Changing this provisional solution sometimes requires changing the team itself. Of course, change is common, making organizational flexibility particularly important.

Being a flexible organization means having the right people available at the right time to solve the right problems, especially when those problems were not planned for. It means increasing quality opportunistically, not only when we have time scheduled for it. Everyone benefits the closer we come to this ideal. More often than not, though, organizations lack this kind of flexibility, along with the quality that can come with it.

Unchecked, inflexibility causes us to do inferior work by squandering our own talent. If we miss opportunities to increase quality, then the work is inferior compared to what it could have been. It does not matter whether we passed on the opportunity or missed it completely: everyone loses.

Understandably, we have a difficult time measuring the value lost by what we have missed or passed on. Less acceptable, however, is allowing this to take our attention away from what our inflexibility ultimately costs us: organizational-actualization.

Where Inflexibility Comes From

Work takes time to complete, including time spent waiting in what we call queues. Lead time is the name we give to this total amount of time. Queues increase lead time because work has to wait in them. Longer queues result in longer lead times. Longer lead times result in decreased responsiveness, quality, and motivation.

  • decreased responsiveness because people have work to get done before they can give you or your problem any attention
  • decreased quality because decreased responsiveness prevents getting help when it is needed, and because the longer work waits in a queue, the more likely it is to need rework; putting work back through the process (reworking) can decrease quality further
  • decreased motivation because being busy making low quality software sucks

We want short lead times and high quality. Better work, faster.
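
One way to put rough numbers on the link between queues and lead time is Little's Law from queueing theory. The post does not cite it, so treat this as an outside illustration: for a stable process, average lead time equals the average amount of work in the system divided by the average completion rate. The function name and the figures below are made up for the example.

```python
def average_lead_time(items_in_system: float, throughput_per_day: float) -> float:
    """Little's Law: average lead time = items in system / throughput."""
    if throughput_per_day <= 0:
        # Nobody is completing work, so the lead time is indefinite.
        return float("inf")
    return items_in_system / throughput_per_day


# Hypothetical team that finishes 2 items per day.
short_queue = average_lead_time(items_in_system=6, throughput_per_day=2)
long_queue = average_lead_time(items_in_system=20, throughput_per_day=2)
```

With the same throughput, 6 items in the system means an average wait of 3 days, while 20 items means 10 days.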

The prevailing method for staying profitable is ensuring that work keeps coming in and people are kept busy working on it. If you’re paying someone to do something and they’re not, then you are losing money. In this way, it is easy to quantify the cost of having talent wait around doing nothing. But what does it cost to have work waiting around not being worked on? The answer is our flexibility.

The Effects of Queues

Imagine, for a moment, an organization that has no members. As work comes in, it queues, resulting in an indefinite lead time. If we want to prevent work from sitting in this queue forever we have to bring someone in to work on it. Meet Alice, our first member – she gets work done. Lead time is now some definite number. Work might still queue up, but now we have confidence that, given enough time, Alice will complete it.

If Alice is completing work faster than new work is arriving, then Alice has to wait. If Alice is working on something when new work arrives, then the new work has to wait. Either way, as long as work or people exist, something, somewhere in the organization is waiting. This is why we will never see lead times of zero.

Let’s change the arrival rate for work coming into the organization to the chaotic rate we experience in practice: everything from sudden influxes to complete starvations of work. We need to deal with this unpredictable arrival pattern; it is plain unsustainable. Meet Bob, our second member – he manages the arrival of work.

Bob has two jobs. First, he prevents Alice from becoming starved for work, which is the type of thing a sales or account person usually does. Second, he deals with influxes of work, which is the type of thing a manager usually does. The point is that the work no longer goes straight to Alice. Work still queues up in front of Alice, but only at the rate it leaves Bob, not at the unpredictable rate that work comes into the organization itself. This is why we call this queue between Bob and Alice a buffer.

Thanks to Bob, Alice now has a manageable amount of work coming in. To make things interesting, increase Bob’s productivity so that work is coming in faster than Alice can finish it. The buffer between Alice and Bob is now growing faster than it is shrinking, as is the lead time. We need help! Meet Carroll, our third member – she gets work done too. Carroll takes her work from the same queue as Alice, which allows Bob to keep bringing in work at the increased rate.
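
The growing buffer can be sketched as a tiny day-by-day simulation. This is a simplification with fixed rates, and the specific numbers (Bob brings in 3 items a day, each worker finishes 2) are assumptions chosen only to illustrate the point.

```python
def simulate_buffer(days: int, arrivals_per_day: int, capacity_per_day: int) -> list:
    """Return the buffer size at the end of each day."""
    buffer_size = 0
    history = []
    for _ in range(days):
        buffer_size += arrivals_per_day                  # work Bob brings in
        completed = min(buffer_size, capacity_per_day)   # workers pull what they can
        buffer_size -= completed
        history.append(buffer_size)
    return history


# Hypothetical rates: 3 items arrive per day, each worker completes 2 per day.
alice_alone = simulate_buffer(days=5, arrivals_per_day=3, capacity_per_day=2)
alice_and_carroll = simulate_buffer(days=5, arrivals_per_day=3, capacity_per_day=4)
```

With Alice alone the buffer grows by one item every day; once Carroll pulls from the same queue, it stays empty at the end of each day.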

The lead times for our organization depend on the number and size of the queues our work needs to wait around in. We can occasionally sidestep this wait time through prioritization, allowing work to rush through, but this has limits. If you give priority to everything, then priority no longer matters. Prioritization can only be part of the solution.

Let’s bring our organization just a step closer to reality. First, by preferring that work waits around rather than people. Second, by completing work less often than starting it. What happens? Assuming the work is flowing in, people keep busy by:

  1. starting new work
  2. continuing work in-process (WIP)
  3. completing current work in-process (and finally moving it out of the organization)
  4. doing nothing relevant

While waiting for work to be accepted as “complete”, people are expected to be busy. They really only have the first two options. This means more work is coming into the organization than is going out. The result: over time, at high capacity utilization, our queues grow very quickly. Keeping people busy actually undermines our efforts to reduce lead time. Organizational inflexibility comes from long lead times that themselves come from missing the presence and impact of queues.
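
To get a feel for how sharply queues grow at high utilization, here is a sketch using the textbook M/M/1 queueing model (random arrivals, a single worker). The post does not name a model, so this choice is an assumption; under it, the average number of items waiting is utilization squared divided by one minus utilization.

```python
def average_queue_length(utilization: float) -> float:
    """Average number of items waiting in an M/M/1 queue: rho^2 / (1 - rho)."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return utilization ** 2 / (1 - utilization)


for rho in (0.5, 0.8, 0.9, 0.95):
    print(f"{rho:.0%} busy -> about {average_queue_length(rho):.1f} items waiting")
```

Under this model the queue holds roughly half an item at 50 percent utilization, about 3 at 80 percent, about 8 at 90 percent, and about 18 at 95 percent. The last few points of "busy" cost far more than the first.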

What can we do? Limit the amount of work in our queues. An empty queue has no work to pull and full queues prevent you from starting new work. Both give you the same thing: time to make things better.
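
A limited queue is easy to sketch. The class and method names here are hypothetical; the only idea taken from the text is that a full queue refuses new work and an empty one has nothing to pull.

```python
class LimitedQueue:
    """A queue that refuses new work once it reaches its limit."""

    def __init__(self, limit: int):
        self.limit = limit
        self.items = []

    def try_start(self, item: str) -> bool:
        """Add work if there is room; a False result is the signal to do something else."""
        if len(self.items) >= self.limit:
            return False
        self.items.append(item)
        return True

    def pull(self):
        """Take the oldest item, or None when the queue is empty."""
        return self.items.pop(0) if self.items else None


queue = LimitedQueue(limit=2)
accepted = [queue.try_start(name) for name in ("A", "B", "C")]
```

Here the third item is turned away. Both the refused start and an empty pull are the moments the text describes as time to make things better.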

Generating Flexibility

Limiting the size of queues is an attractive proxy for managing lead time and in turn responsiveness, quality, and talent retention. There are a number of disciplines that have already done great work in this area. Toyota, through both their manufacturing and product development divisions, is the notable case. Most of us have tacit knowledge of queues through our own experiences with them at grocery stores, banks, and anywhere else people form into lines. There is also the field of telecommunications, rich with long, costly, risky queues that must be managed. In each case the essential ingredient to the solution is the same: slack.

Slack is the antithesis of inflexibility. It is “… the time when you are 0 percent busy.” (DeMarco, p. 6). When you are zero percent busy you are available, and being available has its benefits. Here are two we care about in particular (p. 34): better responsiveness, and flexibility, a capacity for ongoing organizational redesign.

Creating slack is not necessarily difficult. You can give people slack one day a month. You can give people slack one day a week, known as 20 percent time. You can give certain members of your team permanent or temporary slack, known as floating. You can give people slack between projects. Clearly, you have options. What we want is a solution that lets us generate slack naturally and consistently, in response to the current work, throughout the entire organization.

Software organizations, particularly those that consult, experience chaotic arrival patterns of new work, unpredictable availability of talent, and sudden changes in throughput. These are all challenges that Kanban, the method, is well equipped to help with.

Hello Kanban

If you are already familiar with kanban, even slightly, then you probably know that “kanban” is a Japanese word that translates to “signboard”. Kanban, the method, uses cards on a board to signal when new work should be started. Software organizations leveraging Kanban might exhibit the following traits:

  • kanban boards are used to visualize work
  • all work is assigned to a “class of service”, managing expectations and building trust
  • simple policies determine process, allowing teams to self-organize
  • each class of service is allocated a limited number of cards, limiting total WIP
  • an agreeable amount of slack always exists somewhere, providing room to improve

Kanban Boards

A kanban board is some area that shows work on cards, typically divided up in some logical way. Not all kanban boards need to be the same (Skarin). How work is visualized will likely differ from one part of the organization to another. Less so between different organizations. The kanban board we are going to start with is a very simple board inspired by the Scrum methodology. This board has three columns that the work flows through: todo, started, done.

Before work is started, the card representing it sits in the “todo” column. When the work is in-process the card sits in the “started” column. Once the work is done and ready to be released we put the card in the “done” column. Work leaves the “done” column upon customer approval. Simple.
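
The three-column board can be modeled in a few lines. This is only a sketch: the function names and the "released" label for work that leaves the board are assumptions made for the example.

```python
COLUMNS = ["todo", "started", "done"]


def new_board() -> dict:
    """Create an empty board with one list of cards per column."""
    return {column: [] for column in COLUMNS}


def advance(board: dict, card: str) -> str:
    """Move a card one column to the right; approved work leaves the board."""
    for index, column in enumerate(COLUMNS):
        if card in board[column]:
            board[column].remove(card)
            if index + 1 < len(COLUMNS):
                board[COLUMNS[index + 1]].append(card)
                return COLUMNS[index + 1]
            return "released"  # customer approved the work sitting in "done"
    raise KeyError(f"{card!r} is not on the board")


board = new_board()
board["todo"].append("login page")
advance(board, "login page")  # todo -> started
```

Calling advance twice more would move the card to "done" and then off the board.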

WIP Limits

Preventing too much work from building up is where the value of adopting Kanban comes from. Limits prevent work from building up more than we want. To start, we are going to limit two things: the amount of work in each column on the kanban board and the amount of work assigned to a particular class of service.

Agree on a total number of cards; this number is the WIP limit. A generous limit might be double the number of contributors, although this varies widely between teams. On a team of ten this means a WIP limit of twenty. Divide this total across the columns on the board, using the cost-of-delay for each column to determine how much of the total WIP limit to allocate to that column.

For example, take our scrummy kanban board. The cost-of-delay for work that has already been started is greater than that of work that has not yet been started. Because of this the “started” column will be more limited than the “todo” column. We allow more work to sit around in the “todo” and “done” columns than we do in the “started” column.

Why limit the started column more than the other two? Change is expensive, and all work is subject to it. The more you work on something, the more there is to change. Older decisions tend to be significantly more expensive to change than decisions that were made more recently. For this reason, once we start working on something we want to finish it before moving on to something else.
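
Here is one possible way to turn that reasoning into numbers. The text does not give a formula, so the rule used here (each column's share is inversely proportional to its relative cost-of-delay) and the cost values themselves are assumptions for illustration only.

```python
def allocate_limits(total_wip: int, cost_of_delay: dict) -> dict:
    """Split a total WIP limit so costlier-to-delay columns get smaller limits."""
    weights = {column: 1 / cost for column, cost in cost_of_delay.items()}
    total_weight = sum(weights.values())
    limits = {column: int(total_wip * w / total_weight) for column, w in weights.items()}
    # Hand any cards lost to rounding down to the cheapest-to-delay columns first.
    leftover = total_wip - sum(limits.values())
    for column in sorted(cost_of_delay, key=cost_of_delay.get)[:leftover]:
        limits[column] += 1
    return limits


# Hypothetical relative costs: delaying started work costs twice as much.
limits = allocate_limits(20, {"todo": 1, "started": 2, "done": 1})
```

For the team of ten with a total limit of twenty, this gives 8 cards for "todo", 4 for "started", and 8 for "done". Treat the split as a starting point to adjust, not a rule.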

Classes of Service

For Kanban to work we need to manage expectations, both within the organization and with our customers. We can do this by offering and consistently delivering work according to our classes of service. Start with a few simple classes that you are confident you can deliver on. Trust between the customer and the team will grow more and more as we continue to fulfill the promises we have made. Here are four classes we should consider starting with (Anderson):

  1. Expedite
  2. Fixed Delivery Date
  3. Standard
  4. Intangible

The first two classes should be self-explanatory. Work in the expedite class will be rushed through the system. Fixed Delivery Date means work will be completed by a certain date. Standard work will be completed within a specified amount of time once it has been started, say 15 business days. Work classified intangible is recognized as valuable, but there is not a clear understanding of the cost-of-delay for that work. These classes are ordered from highest to lowest priority. Indicate visually which class of service each work item belongs to (e.g. use different colored cards).
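
The priority ordering can be sketched with an enumeration. The card colors and the next_card helper are hypothetical; only the four class names and their order come from the list above.

```python
from enum import IntEnum


class ServiceClass(IntEnum):
    """Lower value means higher priority."""
    EXPEDITE = 1
    FIXED_DELIVERY_DATE = 2
    STANDARD = 3
    INTANGIBLE = 4


CARD_COLORS = {  # hypothetical color scheme for the board
    ServiceClass.EXPEDITE: "red",
    ServiceClass.FIXED_DELIVERY_DATE: "orange",
    ServiceClass.STANDARD: "yellow",
    ServiceClass.INTANGIBLE: "green",
}


def next_card(todo: list):
    """Pick the highest-priority card; ties keep their arrival order."""
    return min(todo, key=lambda card: card[1]) if todo else None


todo = [
    ("update docs", ServiceClass.INTANGIBLE),
    ("checkout bug", ServiceClass.EXPEDITE),
    ("new report", ServiceClass.STANDARD),
]
chosen = next_card(todo)
```

In this example the expedited "checkout bug" card is pulled first even though it arrived second.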

Now our job is to limit the amount of work assigned to each class of service. How should we do this? For the columns on our kanban board we used the cost-of-delay. Does this work for classes of service too? No. Balancing the capacity we allocate for these different classes of service requires a bit more delicacy.

First, expedited work does not count against our WIP limit, but it is severely limited itself. We can really only expedite one thing at a time.

Fixed Delivery Date is an easy class to throw everything into, but it’s much harder to deliver on these promises. When all work has a fixed delivery date we often experience low due date performance (DDP). Low DDP indicates we are not fulfilling our promises. Not fulfilling our promises erodes trust quickly. So we want to limit the amount of work in this class more than we might be comfortable with. This is what makes the Standard and Intangible classes so important.

Starting off, be less restrictive with the limit on standard WIP. Change this as your organization’s needs change, which, as already mentioned, will happen often. A good example of this is the intangible class. Take the situation where large infrastructural changes need to be made, but they can be done piecemeal. In this situation the limit on intangible work should be much lower than usual.

Remember that changing the WIP constraint for a particular class of service will cause the other classes of service to either gain or lose capacity. The WIP is limited to a fixed amount. For our imaginary organization that fixed amount is twenty.
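
The fixed total can be sketched as a small rebalancing helper. The starting split of the twenty cards and the function name are assumptions; the constraints taken from the text are that the total stays at twenty and that expedited work is limited separately.

```python
TOTAL_WIP = 20        # the imaginary organization's limit
EXPEDITE_LIMIT = 1    # tracked separately; does not count against TOTAL_WIP


def rebalance(limits: dict, changed: str, new_limit: int, absorb: str) -> dict:
    """Change one class's limit and let another class absorb the difference."""
    updated = dict(limits)
    delta = new_limit - updated[changed]
    if updated[absorb] - delta < 0:
        raise ValueError(f"{absorb} cannot give up {delta} cards")
    updated[changed] = new_limit
    updated[absorb] -= delta
    assert sum(updated.values()) == TOTAL_WIP  # the total never changes
    return updated


# Hypothetical starting split of the twenty cards.
limits = {"fixed_delivery_date": 6, "standard": 10, "intangible": 4}
after = rebalance(limits, changed="fixed_delivery_date", new_limit=4, absorb="standard")
```

Tightening the fixed-delivery-date class from 6 to 4 cards hands the 2 freed cards to the standard class, which goes from 10 to 12.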


High capacity utilization is a tradeoff. You gain the illusion of progress (motion without progress), but you ultimately lose value due to low quality. Kanban can help:

  1. visualize the work so you can see it piling up
  2. utilize classes of service to manage expectations and build trust
  3. limit queues of work to control lead times
  4. use the downtime created by our limits to increase quality

What does this get you? Nothing by itself. The underlying values of your organization, the forces acting on the decisions being made, will take over from here. If the organization values quality, then it will use its newfound downtime to improve it. If not, then who knows what will happen. If the organization understands the benefit well enough to begin adopting these ideas, then chances are good that they do indeed value quality and so will benefit from the ideas.

Further Reading