iPhone Distributed Computing Fallacy #1: the network is reliable

As iPhone and web developers we have a number of useful abstractions available for working with network requests. Unfortunately none of them can actually spare us from needing to consider the realities of an unreliable network, especially when working with mobile devices. Fortunately with a little foresight and a few good patterns we can build reliable applications which gracefully handle real world network conditions. Let’s consider the 8 classic “fallacies of distributed computing” and how we can avoid them when writing iOS applications.

The fallacies of distributed computing:

  1. The network is reliable.
  2. Latency is zero.
  3. Bandwidth is infinite.
  4. The network is secure.
  5. Topology doesn’t change.
  6. There is one administrator.
  7. Transport cost is zero.
  8. The network is homogeneous.

Fallacy #1: “the network is reliable”.

What actually happens?

  • Your network requests will fail at unpredictable intervals.
  • A device will report that it has a network connection but your requests will still fail.
  • You will successfully send data to a server and never receive confirmation. You will eventually send the same data more than once.
  • Your data will be corrupted in flight or arrive incomplete.
  • The host you are trying to reach will be unavailable.

How am I supposed to handle that?

Queue and retry requests intelligently in your app.
Use Apple’s Reachability class (or the underlying SCNetworkReachability interface) to get asynchronous updates about the device’s network availability but understand the limits of this API:

A remote host is considered reachable when a data packet, sent by an application into the network stack, can leave the local device. Reachability does not guarantee that the data packet will actually be received by the host.

Unreachable hosts are probably actually unreachable and not worth connecting to until the network becomes available, but reachable hosts are still going to see connection failures and may never actually receive or respond to a request.
In addition it can take a device many seconds to bring a network interface online, so an app may be running for some time before you can determine if a network connection is available.
I have found that it usually makes sense to keep a queue of active network requests. I can let the queue fill up when the network is unreachable, retry connections which fail when it makes sense to do so, and remove requests from the queue if their results are no longer needed or they seem unlikely to ever succeed. The strategy for managing this queue very much depends on the app and the nature of the requests. A request to check for new data may not be worth retrying, while a request to save a user’s data to a server might need to persist or be recreated if the user quits the app.
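
As a rough illustration of that kind of queue, here is a minimal in-memory sketch in Swift. Everything in it is hypothetical: `QueuedRequest`, `Outcome`, `RequestQueue`, the `isReachable` flag (which an app would update from its Reachability callbacks) and the backoff numbers are assumptions made for the example, not part of any Apple API and not necessarily the right policy for a given app.

```swift
import Foundation

/// Hypothetical description of one queued request; not part of any Apple API.
struct QueuedRequest {
    let id: Int
    let maxAttempts: Int      // e.g. 1 for "check for new data", more for a user's save
    var attempts = 0
}

/// Outcome reported by whatever transport actually performs the request.
enum Outcome { case success, transientFailure, permanentFailure }

/// Holds requests while the network is unreachable and retries transient
/// failures up to a per-request limit.
final class RequestQueue {
    private(set) var pending: [QueuedRequest] = []
    var isReachable = false   // assumed to be driven by reachability updates

    func enqueue(_ request: QueuedRequest) { pending.append(request) }

    /// Drop a request whose result is no longer needed.
    func cancel(id: Int) { pending.removeAll(where: { $0.id == id }) }

    /// Suggested wait before the next drain: 1s, 2s, 4s, ... capped at 60s.
    static func backoff(afterAttempt attempt: Int) -> TimeInterval {
        return min(60, pow(2, Double(max(0, attempt - 1))))
    }

    /// Try every pending request once using `send`.
    /// Returns the ids of the requests that completed successfully.
    func drain(using send: (QueuedRequest) -> Outcome) -> [Int] {
        guard isReachable else { return [] }    // let the queue fill up
        var completed: [Int] = []
        var stillPending: [QueuedRequest] = []
        for var request in pending {
            request.attempts += 1
            switch send(request) {
            case .success:
                completed.append(request.id)
            case .transientFailure where request.attempts < request.maxAttempts:
                stillPending.append(request)     // worth another try later
            case .transientFailure, .permanentFailure:
                break                            // unlikely to ever succeed; drop it
            }
        }
        pending = stillPending
        return completed
    }
}
```

The sketch is synchronous to keep it short. A real app would call `drain` when the network becomes reachable and again after waiting roughly `backoff(afterAttempt:)`, and would write requests that must survive the user quitting the app to disk rather than holding them only in memory.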

Support dropped connections or repeated requests.
Most of my applications use some form of HTTP request to communicate with a server, and that server needs to handle the consequences of an unreliable network.
As I drag my iPhone between cell towers, through tunnels, and around buildings I lose and occasionally regain my network connection. WebKit and the NSURL classes can handle most of those interruptions, but my server is going to see long-lived connections and timeouts. If those open connections use significant system resources my server may not be able to handle additional connections even if plenty of extra bandwidth is available.
Additionally my app is going to try to be clever and retry failed requests if it doesn’t get a success response from the server. Sometimes it is only the success response which is lost, so the server will receive multiple copies of the same request. If the server cannot distinguish between a user trying to create two new records and a request to create a single record sent twice, I’m going to run into trouble. Depending on the app I might handle this by disallowing duplicate records or by including a sequence number, timestamp, or some other identifier with each request to identify duplicates.
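
One way to picture the "some other identifier" option is a per-request ID that the client generates once and reuses on every retry. The sketch below is only an illustration under that assumption: `CreateRecord` and `RecordStore` are made-up names, and the in-memory dictionary stands in for whatever storage a real server would use.

```swift
import Foundation

/// Hypothetical "create record" request. The client generates `requestID`
/// once and sends the same value on every retry of this logical request.
struct CreateRecord {
    let requestID: UUID
    let title: String
}

/// Stand-in for the server-side handler. A real server would keep the seen
/// identifiers in its database rather than in memory.
final class RecordStore {
    private(set) var records: [String] = []
    private var resultsByRequestID: [UUID: Int] = [:]

    /// Returns the index of the created record. A repeated `requestID` gets
    /// the original result back instead of creating a second record.
    func handle(_ request: CreateRecord) -> Int {
        if let existing = resultsByRequestID[request.requestID] {
            return existing
        }
        records.append(request.title)
        let index = records.count - 1
        resultsByRequestID[request.requestID] = index
        return index
    }
}
```

With this in place a retried request and a deliberate second record look different to the server: the retry carries the same `requestID`, while a user creating another record with the same title produces a new one.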
When loading content for a UIWebView I want to consider how that content should appear when some resources cannot be loaded. Do I want to show the page as soon as possible and allow images and other resources to load as they become available? Do I want to show some loading state until the entire page is ready?

Handle intermittent connections or no connectivity gracefully.
Finally my app needs to account for an unreliable network in its visual design. I don’t want to interrupt the user with a modal every time a request fails and is retried. Nor do I want to create a stack of modals when many requests fail at once. Where possible I want the user to know the app is working on their request, reveal when it cannot be completed, and allow the user to continue using the app until the request is done.
In practice this leads to apps which sync rather than simply fetch and push, which cache requests and responses even if the user dismisses the view which started the request, and which try to avoid blocking the user while a request is in flight.
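
To make the "no stack of modals" idea concrete, here is a small sketch that folds every in-flight and failed request into a single status value the UI could show as one unobtrusive indicator. `SyncStatus` and `SyncStatusTracker` are hypothetical names invented for this example, and the choice to show failures ahead of in-progress work is just one possible policy.

```swift
/// Hypothetical user-visible state for all network work:
/// one indicator, not one alert per request.
enum SyncStatus: Equatable {
    case idle
    case working(count: Int)
    case failed(count: Int)
}

struct SyncStatusTracker {
    private var inFlight: Set<Int> = []
    private var failed: Set<Int> = []

    /// A request started, or is being retried after an earlier failure.
    mutating func started(_ id: Int) {
        failed.remove(id)
        inFlight.insert(id)
    }

    mutating func succeeded(_ id: Int) {
        inFlight.remove(id)
    }

    /// Called only once retries are exhausted, not on every failed attempt.
    mutating func gaveUp(_ id: Int) {
        inFlight.remove(id)
        failed.insert(id)
    }

    /// The user acknowledged the failure indicator.
    mutating func dismissFailures() {
        failed.removeAll()
    }

    /// Many simultaneous failures collapse into a single `.failed` state.
    var status: SyncStatus {
        if !failed.isEmpty { return .failed(count: failed.count) }
        if !inFlight.isEmpty { return .working(count: inFlight.count) }
        return .idle
    }
}
```

Because individual retries never reach `gaveUp`, the user only hears about requests that genuinely cannot be completed, and the rest of the app stays usable in the meantime.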

Testing an unreliable network
The network is always unreliable, except on my development machine where everything works perfectly to ensure I never see networking bugs. As part of my manual app testing, or to reproduce a bug, I run the Charles proxy to simulate poor network connectivity on a local Wi-Fi network. Setting breakpoints in the proxy allows me to selectively interrupt requests and create a reliably unreliable network.
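
The same "reliably unreliable" idea can also be carried into automated tests by swapping the real network for a stub that replays a fixed script of failures. This is a separate technique from the proxy approach above, and the sketch is built entirely on assumptions: `FakeResult` and `ScriptedTransport` are hypothetical types, and treating a timeout as "never reached the server" is a simplification.

```swift
/// Hypothetical outcomes for a test transport; not an Apple API.
enum FakeResult: Equatable {
    case ok(String)
    case timedOut       // request never reached the server (simplification)
    case lostResponse   // server got the request, client never heard back
}

/// Replays a fixed script of outcomes so a test can reproduce an exact
/// sequence of network failures. Once the script runs out, requests succeed.
struct ScriptedTransport {
    private var script: [FakeResult]
    private(set) var deliveredToServer = 0   // requests the "server" saw

    init(script: [FakeResult]) {
        self.script = script
    }

    mutating func send(_ body: String) -> FakeResult {
        let next: FakeResult = script.isEmpty ? .ok(body) : script.removeFirst()
        // A lost response still counts as delivered; a timeout does not.
        if next != .timedOut {
            deliveredToServer += 1
        }
        return next
    }
}
```

A test that retries through `[.timedOut, .lostResponse]` sees the third attempt succeed while `deliveredToServer` is already 2, which is exactly the duplicate-request situation the server has to tolerate.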