Engineering for Testability 

Session 414 WWDC 2017

Unit testing is an essential tool to consistently verify your code works correctly. Discover techniques for designing the code of your app so that it can be easily tested. Find out the best practices for developing a test suite that evolves with your app and scales as your app grows.

[ Cheering ]

[ Applause ]

Hello, everyone.

Welcome to Engineering for Testability.

My name is Brian Croom, and I work on the Xcode Team.

My colleague Greg and I, want to share some things that we’ve been learning about testability and what it means for the process of developing great apps.

I’ll start by talking about what it means for an app’s code to be testable.

And we’ll see some examples of techniques that can be applied to improving testability of an existing code.

Then Gregg will come up, and talk about some ways of working with your test suite to help ensure that it helps support the development of your app, even as it grows in size and complexity.

So, let’s dive in, and talk about testability.

I remember a while back, when I was first learning about writing tests.

I’d been hearing a lot about how a test suite could help during the development of my app.

How it could provide confidence, that the code I was writing was working the way it was supposed to.

How it could help patch regressions in my code as my app grew and changed over time.

And how it could serve as executable documentation for my code.

But, I would start writing a test, and get stuck before I’d gotten very far.

It felt as if my app’s code were actively trying to prevent me from writing the test.

It took a while, but eventually I discovered that I was structuring my code in a way that interfered with effective testing.

To explore what this means, let’s first take a look at a unit test for some code that is readily testable, sorting an array.

This is a test that exercises the Sorted Method from the Swift Standard Library.

It begins by setting up an array of unsorted values.

It calls the sorted method on that array.

And then asserts that the returned array has the values in the expected order.

Generalizing this, we can see that the test structure consists of first, setting up any required input state or values, then, calling the code being tested, and finally, asserting that the returned output is correct.

This is sometimes referred to as the Arrange Act Assert Pattern.

So, we’ve seen that sorted is a nicely testable piece of code.

But I know what you’re thinking, “My app’s code doesn’t have a single sorting algorithm in it.”

In my experience, most of the code in apps, looks pretty different than the sorting algorithm.

Still, there are some characteristics of the sorted method that we can strive to achieve in our app’s code, and to make them more testable.

Specifically, testable code provides a way for its clients to have control over all of the inputs that it operates on.

It provides a way for its clients to inspect any output being generated.

And it avoids relying on internal state that could affect the code’s behavior later on.

I want to use code examples to demonstrate a couple of techniques that can be used to help application code take on these characteristics, and thus, improve its testability.

The first technique is how to introduce protocols and parameterization into a piece of code.

For this example, imagine a document browser app, that is capable of previewing documents of various types, and has the ability to switch to a different app for viewing it in more detail or editing it.

The previous screen we see here, includes an open button for this, along with a segmented control for choosing [inaudible] open for viewing or editing.

So, let’s have a look at the first try at some code for this screen.

So, the event handler in the view controller that gets invoked when that Open button is tapped.

It starts with some business logic for constructing a URL to be used to request that iOS switch to the other app.

Then, it uses the shared UI application instance, provided by UIKit, to determine whether the system is able to handle this Open request, and perform the open URL action if it is.

And if not, it calls a helper method to show some UI to direct the user to install the other app.

Now, I want to write some tests to make sure that this Open button is working the way it’s supposed to.

There are a couple of different ways to approach testing this.

One option would be to write a UI test.

We ask to launch the app, navigate to this screen, tap on the [inaudible] control, to peck and open mode, tap the Open button, and then verify that the phone switched to show the other app.

While this would work, there are some drawbacks to this approach.

For one, the task would probably take a while to run, especially if I wanted to expand it to test with several different documents types for the different open modes.

The bigger problem, however, is that a UI test would have no way to inspect that URL that was being generated, to request that iOS switch apps.

And that URL is really something that I want to be able to inspect more precisely.

So, it really feels like a unit test is more appropriate for the situation.

So, let’s see what it would take to write a unit test for this code.

First, you need an instance of the view controller to work with.

My view controller uses a storyboard for its UI, so I’ll ask UI Storyboard to give me an instance of the view controller.

Then we need to load the view to populate that [inaudible] control property, so that we it’s populated by the view data.

We can then configure that for our open mode.

Provide a document to work with.

And with the setup finally complete, we’re now ready to call the method being tested.

But what do we do now?

It’s not so clear what kind of assertion we could write here.

Let’s go back to the code, and look more closely at what makes this challenging to test.

For one thing, just being in a view controller, made the methods test more complicated.

You’re going to jump through some hoops to get the view controller instance to work with.

Then here, we’re pulling input state directly from the view.

[Inaudible] of the test had to force the view to load, and then indirectly provide the input by setting a property on some sub-view.

The biggest problem though was this usage of a UI application shared instance.

The return value from this call to can open URL, is effectively another input for the method.

But since this relies on global system state, there’s no programmatic way for the test to control the result of this query.

Nor is there a good way for a unit test to observe the side effects of opening a URL.

In fact, after calling this, the test render app would actually be sent to the background, and there’d be no way to bring it back to the foreground afterwards.

So, let’s see what we can do to improve the testability of this code.

We can start by getting it out of the view controller.

Let’s introduce a new document opener class, to encapsulate this logic and behavior.

The open mode and document inputs, should now be provided a simple method arguments that the test can pass indirectly.

But we still have to fix the problems caused by that shared UI application incidence.

What can we do about that?

Well, to start, we can stop using that [inaudible] accessor directly into method.

Let’s add an initializer to the class that lets us pass in a particular application instance.

We can provide a default value for the argument, so the [inaudible] in the view controller, doesn’t have to worry about this detail.

Back in the open method, we then switch over to use the application instance that was passed in.

Let’s see how far this refactoring gets us.

If we try to rewrite our test now, with the document open or in its current state, we’ll still find ourselves getting stuck.

You want to pass in an application instance that we have control over.

So, you can imagine sub-classing UI application, overriding the can open URL and open methods, to get the control that we need.

However, it turns out that UI application strictly enforces its singleton nature.

And throws an exception to try to make a second instance, even if it’s a subclass.

So, instead of using sub-classing, to get the control that we need, let’s instead add a protocol, URL Opening.

We go to protocol, two methods, with precisely the same signatures as the application methods that we’ve been using up to this point.

We still want UI application to be the primary implementation of this protocol.

So, we’ll throw an extension on UI application to give it the URL opening conformance.

Since the method signatures line up exactly, you don’t have to add any additional code in the extension to get the conformance.

With a protocol in place, let’s update the document opener to use the protocol instead of requiring UI application itself.

First, we change the property and initialize a parameter to accept any implementation of the URL opening protocol.

Note that we’re still able to keep the shared UI application instance with the default argument as a convenience when we use it in the view controller.

A final change requires the document opener simply to adopt the URL opener property name.

With that, let’s return to the test and see how the pieces fit together.

Since UI Application doesn’t give us the control that we need in our test, we want to create a secondary mock implementation of the protocol, to use in its place.

Here we add a sub-implementation of the two methods.

The can open URL method acts as an input from the document opener.

So, the test needs to have control over this input.

We can get that by having the implementation return the value of a property that the test can set beforehand.

The open method, act as an output from the document opener.

The test wants to be able to access any URL that was passed into this method.

We can achieve that by stashing the URL into a property for the test to to read afterwards.

So, let’s go ahead and write the test.

First, we create an instance of our mock URL opener that we just wrote, and configure the input using the can open property.

And create a document opener.

And we pass in that mock URL opener as the argument.

With that setup finished, we can call the open method, passing in document and open mode values, to act as the rest of the input.

And we can then assert that the opened URL property of the mock URL opener, has been set to the expected URL.

This one assertion, is testing both at the open method was called at all, as well as the URL passed in contains the correct data.

With that test under our belt, you could continue to write tests for other variations of input data, such as when the can open property is set to false.

But we have a lot more to cover, so let’s just leave it there.

So, in this example, you performed a few refactorings to allow us to write unit tests for our code.

We pulled out explicit references to a shared singleton instance, and replaced them with parameterized input to offer substitution.

This is sometimes referred to as a dependency injection.

In this example, we use an initializer parameter to achieve this.

We could also have used a property setter, or a parameter of the method being tested.

We created a protocol to decouple the code from the concrete class that it previously depended on.

And we created a test implementation to use in its place.

And it gave us the control that we needed over the inputs, and the visibility into the outputs.

Next, I want to look at how separating out logic from effects, can also be used to enhance testability.

The example here is an on-disk cache class, which might be used by an app for faster retrieval of assets that have been previously downloaded from a server.

This cache defines a script for representing the items that it stores.

It defines the path to the item on the file system, how long it’s been in the cache, and its size on disk.

And it provides a way to get the set of all the items currently stored in the cache.

The method that you want to look at now, though, is a cleanup method.

It’s meant to be called periodically, to ensure that the cache doesn’t grow to take up too much space in the file system.

So, let’s have a look at the starting implementation for this method.

First, it asks for the set of all the current items in the cache, sorts them from newest to oldest.

But then it [inaudible] over all of these items, keeping track of the total size of the items already seen.

Once you’ve seen enough items then we reached our maximum size, we begin removing the rest of them from the file system.

So, let’s think about how to test this method.

What are the inputs?

What are the outputs?

Well, one input is the parameter specifying how large you want the cache to be able to grow.

This is a simple integer, and a method parameter, so the test already has full control over it.

The other input, is the list of items that are currently stored in the cache.

We didn’t take the time to look at how this is implemented, but the key point is that it uses a file manager to retrieve a list of files from the disc.

This means that the input is actually derived from the file system, which is the dependency that the tests would need to deal with.

The clean cache method has no return value.

So, its output can’t be data.

Rather, it’s the side effect of a certain set of files having been removed from the disc.

Because of this dependency on the file system, tests for this method would need to deal with a file manager in the file system.

For setup, they might need to create a temporary directory and populate it with a bunch of files of certain sizes and give them particular timestamps to provide the input.

To validate the output, you would need to then return to the file system to see which files are still there.

One way to approach this could be to use the protocols and parameterization techniques that we’ve already seen.

You could introduce a file manager protocol that has the methods that we need for getting a list of files, and for removing a file.

And then create a test implementation that would allow specifying the list of files that would be returned, and a query of which ones have been removed.

If we do this though, we’re still interacting indirectly with the code that we’re trying to test mediated by the file manager.

Instead, let’s try something a little bit different.

We could take our clean cache method and factor out the logic responsible for deciding while files should be removed, the clean-up policy, which you can then interact with more directly.

Let’s see how that might work.

To clearly define the APIs we’re going to work with, let’s first define clean-up policy as a protocol.

It just needs a single method items to remove.

Notice how the type signature that we’ve given it, looks a bit different than from the clean cache method that we started with.

As input, the new method takes a set of cache item values.

And for output, it returns another set of them, but that is the ones to be removed.

So, let’s see how we can implement this protocol using the algorithm that we previously saw from the cache class.

Well, first define a property for that max size input.

That’ll let us specify this max size, and we construct the policy.

Then we add the items to remove method that the protocol requires.

We want to inspect each of the items passed into the method, and build up a set of items to remove, and return that to the method when we’re done.

To populate the set, we’ll loop over all of the items from newest to oldest, summing up the size of all the items we’ve seen so far.

Once we’ve reached the maximum size, we start adding the rest to the set of items to remove.

Looking at this code, we can see that the side effects that were prone in the earlier version have been removed.

What remains is the underlying algorithm, taking data as input, and returning some data as output.

We can also visualize the data flow that we’ve achieved by doing this.

Notice that the code has taken on a very functional style.

Data in; data out.

With the logic factored out in this way, we’ve enabled ourselves to write clear, concise tests that put the algorithm through its paces.

All we have to do is define an input set with some cache items, create an instance of the type, calling the method, directly passing in the values that it needs, and then asserting that the returned items match the expected result.

With a code in this form, we now have easy control over the inputs, visibility into the outputs, and there’s no hidden state to contend with at all.

It’s very reminiscent of the test for the sorted method that we saw at the beginning.

This allows the test to be easy to read, with minimal distraction from the essentials of what is being tested to run very quickly because it has no dependency on slow resources like the file system.

And to be deterministic because all of the data in use is fully self-contained.

Taking a peek back at the original clean cache method, after the extraction, we see that there’s very little left.

All we’re doing is asking the clean-up policy for the list of items to remove, and iterating over all of them, removing them from the file system.

To test what remains, we could decide to introduce that file manager protocol and testable implementation to allow writing a very isolated unit tests for it.

Or we might decide that a couple of integration tests are sufficient to give the confidence that we need that the code is doing the right thing.

This thin layer of [inaudible] code that’s left.

So, in this example, we looked at how to extract business logic and algorithms into separate types, away from the code, using side effects.

When doing this, the algorithms tend to take on a rather functional style, using value types to describe the inputs and the outputs.

This allows for straightforward unit tests that exercise the algorithm in as much detail as you require.

We’re left with a small amount of code that perform side effects based on the computer data.

This bit is often a good candidate for testing with integration tests in order to track that its interaction with the rest of the system is working properly.

So, to wrap up, we saw examples that allowed us to explore two sets of techniques that allow us to structure or ask code, so that tests have control over the code’s inputs, and visibility into its outputs, thereby allowing us to write effective unit tests for it.

And now, I want to call up my colleague, Greg Tracy, to talk about how to create a test suite that scales with your app as it grows.

[ Applause ]

Hi, everyone.

My name is Greg.

And I also work on Xcode.

Earlier, Brian showed you some techniques about how to make app code more testable.

Now, I want to show you how to make the accompanying test code more scalable.

To do this, we’ll look at a few methods to make tests faster, more readable, and more modularized.

I also want to mention that many of the techniques that Brian described earlier, can also be applied to test code.

Here, we’re going to go through some additional tips.

First, I’ll talk about having a balance between UI and unit tests.

Then, dive into code that helps test scale with the focus on UI test code.

And then, I’ll talk about the importance of having quality in test code.

Striking the right balance between UI and unit tests.

I sometimes like to view my distribution of tests as a pyramid.

At the top, you have your UI tests, and at the bottom, you have your unit tests.

These [inaudible] pyramid structure we usually have more unit tests than we will UI tests.

This is usually because unit tests can run much more quickly than UI tests.

Now between UI and unit tests, we also have integration tests.

However, today, we’re going to focus more on just UI and unit tests.

Aside from the distribution, we might also look at the pyramid as a way to do maintenance costs.

Generally, UI tests tend to have a higher maintenance cost because of the number of things that can happen.

Unit tests on the other hand, have a lower maintenance cost.

So, if a unit test fails, it’s usually immediately obvious what’s gone wrong.

With UI tests, it’s like casting a wide net where you can get failures that are difficult to understand, or might not be relevant to the test at hand.

So, it can be a bit more tricky.

While the testing pyramid is a great way to represent distributions of our distribution of tests, it doesn’t represent every situation.

In fact, you might think of testing as a spectrum, rather than a pyramid.

It’s often the case that some UI and unit tests, exist on opposite ends of the spectrum, or opposite ends of the pyramid.

Some UI tests might be more like unit tests, and a unit test might interact with several different modules of code and not just single, isolated bits.

The pyramid is just a good approximation.

It’s not written in stone.

When thinking about these two kinds of tests, we need to consider each of their strengths.

Unit tests are great at testing small bits of code that might be hard to reach without access to all of our app source code.

UI tests on the other hand, are great when you need to test large chunks of code, working together.

Of course, we need to keep in mind that unit tests do have access to all of our app’s source, whereas UI tests do not.

Focusing more on UI tests, let’s look at a few things you could to do improve the quality of your test code.

By making some of the changes I’m about to suggest, we can make it easier to create tests that scale alongside our app code.

We’ll look at abstracting UI element queries, creating objects in utility functions, which can then be placed in a library for later use, and utilizing keyboard shortcuts.

So, first we’ll look at abstracting UI element queries.

Say I have an app that has several buttons in a view controller.

And each button is at the same level in the view hierarchy.

The only difference is the name of each button.

Instead of writing out this query seven times, let’s wrap this up in a method.

We can now modify each of our queries to use the new method we just created.

However, I might even go further.

Since each method calls the same, except for the name, let’s put all those names in array and just loop through them.

This adds some benefit of maintainability for this code.

If I add an extra button in the future, I don’t have to add a new line of code.

I just have to add an extra button name to the array.

By nature of what a UI test is, we’re issuing a lot of these queries.

So, if you’re using the same query multiple times, store it as a variable.

Even if it’s only part of a query, store it somewhere.

Also, if you have queries that are very similar, consider creating a helper method around that query.

The code will look a lot cleaner and become much more readable.

In terms of scaling our test suite, the use of shorter lines of code of shorter lines of test code and thoughtfully named helper methods, will make it faster and easier to implement new tests when the time comes.

So, that was abstracting UI element queries.

Now, let’s move on to creating objects in utility functions.

I have this game I’ve been working on, and for each test, I want to change some settings.

Now, this is not a great example of scalable code.

Because I’ve been recently working with this app, I’m familiar with how everything is laid out.

I understand exactly what’s going on.

However, later, if I was to come back to this code after a few weeks, or better yet, somebody not familiar with my code has to sit down and read what I wrote, it might not make all that much sense.

I first have to realize that I have a Settings page that I need to get in and out of.

And I then have to realize that between those two lines, I’m going through a difficulty page, setting the difficulty, then I’m going through a sound page, and setting the sound.

Coming after an extended period of time, I might not understand why I have two back tabs at the bottom.

I would have to run through this test to actually see this happen.

And if the test were broken because of some change in the actual UI, I wouldn’t be able to run my test, and I would be able to see what I wanted to see.

I’d be clueless as to why the code was written the way it was.

To fix this, let’s try to abstract away some of this logic into helper methods.

We can create a method to set the difficulty.

And then similarly, we can create a method to set the sound.

But can we do a little better?

Sure. We can instead, of instead of passing stream typed arguments, let’s utilize enums.

That way, Xcode can helps determine if the arguments we’re passing are even valid, before we even compile.

Now, looking at the code from before, if we replace some of the code with our new helper methods, we reduce what we had before.

And this is already starting to look a lot better.

What about the initial jump in and out of the Settings page?

Can we improve this as well?

I think we can.

Let’s make a game app class.

And in this class, I’ll include the enums I defined earlier for difficulty and sound.

I’ll also include the helper methods from before that set those settings.

We’ll create another method called Configure Settings, that takes the two settings as inputs, and we’ll migrate the setup logic from before, into the configure method.

Back to where we were before, now that we’ve created this game app class, we can take away all the code we wrote before, and just use a single call, the configure method.

This looks a lot more readable to me than what we had before.

Now, if I wrote more tests than needed to set the settings, I would just call our configure method.

And if I needed to or if I decided that I wanted to add more settings to my app, I would just have to update our configure settings method to handle these additional settings.

From the example, one of the most important things to do when trying to scale your tests, is to create abstraction that you can later put into a library suite.

By doing this, we’re encapsulating common workflows that can be applied to more than one test.

This also means that we’re able to share test code across different platforms.

And, of course, by sharing code, we improve maintainability.

If something related to an abstracted workflow changes, we only have to update our code in a single place, as opposed to several.

One other improvement that I want to mention, in our configure method, and new in Xcode this year, we can add an XCTContent.runActivity block to our code.

This makes it so that when we run our test, instead of having a log that contains all the actions that we made at the top level, we can nest our logging, using runActivity.

This helps organize our logging to make things look a little bit cleaner.

For more information regarding the test activity feature, check out the earlier talk about What’s New in Testing?

Now, let’s move on to Utilizing Keyboard Shortcuts for macOS UI Tests.

Let’s say I have an app where the user can pick a color for their text, using the standard macOS color picker.

And I’m writing a test to verify that the color is set correctly.

The typical way to bring up the color picker in my app is by opening the Format menu, and navigating to the Font sub-menu, and finally choosing Show Colors.

I can write this in my UI test like this.

But there’s a faster way to do this that’ll scale better as my test [inaudible] grows.

Notice that the Show Colors menu item has an associated keyboard shortcut.

Rather than using multiple lines of code to bring up the color picker, we can just use one line of code using the shortcut.

And for the sake of readability, I might use a wrapper to make this call for me.

So, not only is this less code to maintain in my tests, it’s less code that has to be run that isn’t directly relevant to the actual test I’m trying to perform.

So, in an example test method, I was previously going through the menu to get the color picker to show.

Using the new wrapper method, we remove all those extra lines and reduce to a single method.

Not only does this make my test faster, it makes my code look a lot more readable.

Looking at what we just saw, if you’re writing a UI test for a macOS application, you can offer a keyboard shortcut instead of going through the menu bar.

I’m still going to have at least one test that ensures that bringing up the color picker via the menu bar, works properly, but that doesn’t need to be repeated across every single test that involves the color picker.

In the process of using keyboard shortcuts, we make our test code more compact by skipping extra steps needed to work through the UI, sometimes reducing multiple lines of code to a single line, which can help for readability.

Finally, I want to stress to all of you that writing good tests is about writing good code.

It’s easy to treat tests as an afterthought.

Usually, we focus on trying to make our app the best it can possibly be, by investing all this time in writing beautiful app code that adheres to all these principles of good design.

We might have this additional requirement of writing test code.

It might be something tacked on at the end or done in a hurry just to check off a checkbox.

But we can’t allow this to happen.

Without the same attention to detail, test code isn’t going to scale the same way our app code might.

So, to that end, test code is important even though it isn’t shipping.

Also note that the test suite should support the evolution of your app, and not hinder change.

With low quality test code, it becomes a burden to have to update your test whenever you make a change to your app.

But by consciously designing test code with quality in mind, our ability to scale won’t be inhibited by poorly designed tests.

And of course, coding principles that apply to app code, also apply to test code.

Test code and app code should be viewed equally.

And here’s an idea for you.

We should have code reviews for test code, not just code reviews with test code.

Having code reviews that are exclusively for test code, ensures that somebody else is checking your work, or [inaudible] with a test cover, and it’s a chance to further improve the tests themselves.

Now, I want to leave you with a thought.

App code and the tests that verify it, are really two halves of a whole.

When you update your app code, you’ll need to update your test code too.

We need to think of app code and the test code, as part of the same thing, our code.

By making our code more testable, as Brian discussed earlier, and by treating test code with the same care as your app code, you improve the quality of the whole app.

We ought to be proud of both halves, and treat each with the care and attention that they deserve.

For more information and resources regarding this session, you can visit the link listed on the screen.

Here are a few related sessions.

One that happened yesterday, and a few that happened in previous years.

You can check those out online, or through the WWDC app.

And with that, I’d like to thank you for your attention, and I hope you enjoy the rest of the conference.

[ Applause ]

Apple, Inc. AAPL
1 Infinite Loop Cupertino CA 95014 US