Designing Code for Performance

Session 224 WWDC 2013

Effective use of the right data structures can make a big difference in the responsiveness of an app. Come learn about the performance characteristics of the Foundation collections, how to select one that best fits your needs, and how to design software to use them efficiently.

[ Silence ]

Good morning and welcome.

Thank you.

[ Applause ]

Thank you.

My name is Quinn Taylor, I'm an Internal Applications Engineer at Apple and I'm excited to be talking today about Designing Code for Performance.

It's great to see such a big crowd here, obviously, you're all interested in performance as well and that's fantastic.

So, to start off, I want to give you a little bit of introduction, the motivation for this talk, why it came to be.

It's no secret that the raging success of the App Store has led to a huge influx of new developers to the platform.

As you've heard Tim say on Tues on Monday, about two-thirds of you are here for this conference for the first time, that's fantastic.

Many of you are in fact new to programming in the recent past.

You come from a diverse array of backgrounds and experience levels and that's great.

You have a lot of different kinds of wonderful apps that you can contribute for our customers in the App Store that they love.

However, across all these domains and different types of applications, one thing that's constant is that everyone uses data structures and it's common to have performance issues.

No matter what your app does, you're going to be using arrays and dictionaries and so on and it's important to understand how that can affect your application.

When you have issues like this, often they're puzzling unless you have some knowledge of what's going on under the hood and what's happening so you can evaluate your app and improve.

So, the goal of this session is really to teach you how to fish.

I'm not here to give you a one half tip about how you can fix this specific problem, but really to look at the grand scheme of things, understand when it comes to data structures, how can I make my app perform the best possible.

OK. So the things we're going to cover today, first off, when to focus on performance, how to evaluate what we call computational complexity, how to choose and use data structures in your application, and last, how to design your code for performance.

We'll give some concrete examples.

So to start off, when to focus on performance, and you think, is it a trick question, the answer is always, right?

That's true to certain extent and we will talk about that in detail.

To start off with a quote, I have this on my wall, this is form Steve Jobs, "We don't get a chance to do that many things, and everyone should be really excellent.

We've all chosen to do this with our lives.

So it better be damn good.

It better be worth it."

I love this quote because it motivates me to make my applications the best that I possibly can and try to eek performance out of every angle.

Have you noticed that he says, "Everyone should be really excellent," it's not necessarily perfect.

We as developers know that our applications have flaws.

We do the best that we can to make them as polished as possible before we hand them over, but we have a limited amount of time.

Steve also said, "You have to pick carefully.

Innovation is saying no to 1,000 things."

So how do we strike the balance here when it comes to performance?

You may have seen a quote from Donald Knuth, a famous computer scientist where he says, "Premature optimization is the root of all evil."

People tend to throw this around at developer forums and some people will use this almost an excuse to say, "You don't need to worry about performance at all, right?

It's not going to be a big deal."

Sometimes that's the case.

But unfortunately, we usually only see this middle sentence of the quote.

But he said a lot more, he said, "We should forget about small efficiencies about 97 percent of the time.

Yet we should not pass up our opportunities in that critical 3 percent.

So there comes a time when focusing on performance is really important for your application.

I like to summarize this by saying, "Optimize performance when it will make a meaningful difference."

That's what we're going to be talking about today, is how I can choose to see is this going to make a difference for my application?

So, to talk about that, there's a principle called Amdahl's Law which is really about helping you pick your battles in performance.

So this law was proposed by Gene Amdahl, a very famous computer architect, and basically, it has to do with predicting the maximum improvement that you can expect by speeding up some portion of your code.

This depends obviously on what percentage of time that code is taking to begin with to see what kind of speed that you can achieve.

And the payoff is much larger for the dominant piece of code, the thing that's taking the most time, this appear fairly obvious.

So the question is, will the payoff of improving the performance of this of piece of code be worth the effort, the time that it's going to take me to do that?

And this applies directly to concurrency.

In fact, as this law was originally stated, it had to do with multi-core processing and breaking your code up into multiple pieces, and that has great tie ends as well to Grand Central Dispatch and using blocks.

So let me give you an example of Amdahl's Law.

Say that you have a process that has two segments A and B and one takes 80 seconds and one takes 20 seconds, all right.

So we have a certain amount of time.

Now, if you can spend a bit of time and optimize and cut the time spent by process B segment B in half, then you have a great win.

Now you're 90 percent of your previous performance.

However, if you can apply the same effort to speed up process A and cut that time in half, now you've gone to 60 percent.

This is a much bigger win and this is what we talk about when we say identifying bottlenecks, looking at the actual problems in your code.

So that's where you really want to focus on performance.

You know, it's great to have a performance win in segment B but everyone has things that they have to do.

So you got to choose wisely what you work on.

So back to Donald Knuth, he talked about premature optimization.

Now, premature optimization, generally leads to unnecessary complexity in your code.

You take something that's simple and to try to get more performance out of it, you change and tweak things and make it more complex and clever.

Sometimes that's great.

But if it ain't broke, don't fix it.

You don't have to fix something that's not necessarily a problem for your code.

You have a ton of features that you need to implement for your app and polishing it to make it great for you users.

So I'd like to focus on something I call informed design which leads to elegant and efficient code.

Informed design is all about considering your performance early on, even during the design phase.

You can do that before you even write a line of code.

And it helps to intelligently avoid problems that you can actually face in the real world, rather than premature optimization and fixing your problem that may not be such a big problem after all.

And this is useful because it can help you avoid designing slowness into your application only to fix it later.

So if you can think about it upfront, it's a big win.

OK. So now let's move to talking about how to design for performance.

If you have a performance issue that you've identified in your code, there's three broad ways that you can resolve that.

The first is, don't do it.

If there's unnecessary work, then you can completely cut it out.

The second is do it as rarely as possible, and the third is do it as efficiently as possible.

Now, these are generally in increasing order of difficulty.

It's very easy to just remove code that you no longer need, but getting something to be more efficient can be really difficult.

In order to answer these questions and choose which approach works for your scenario, you really have to have some context.

Is the work necessary?

If not, I may be able to remove it.

Is redundant work being done?

Doing the same thing over and over again, I may be able to reuse that work.

Or is there a more efficient way?

And that last question is really the most tricky.

How do I know if I can do better than what I'm doing right now?

So, to answer that question, we're going to the next portion of the talk where we talk about computational complexity and cost.

Now, these are some big words so I'll explain out for you in kind of broad terms.

So we're talking about the cost of code, and by this, I don't mean how much you pay for an app in the store or how much it cost to you to have people develop it.

Every piece of code takes some amount of time to run because there's work being done, and it's obvious that more work takes more time.

However, it's not necessarily obvious that sometimes you can have really short code that does a lot of work and it can hide some complexity, right?

There's a cost associated with a particular code, perhaps an API Caller, so on, or you're using a library.

Now, the effects of data growth can vary quite a bit when you're choosing an algorithm and it can really impact you performance, particularly as your data size grows.

It may be disproportional affect the number of objects that you add.

And consequently, small tests often won't turn up this kind of problems.

We do we try to do as much testing as we can before we send our app out into the world, but you've probably all experienced that your customers will use you app in new and exciting and sometimes terrifying ways and throw a lot of data added in ways that you didn't anticipate.

And sometimes these performance issues will crap out where you least want them to.

OK. Fortunately, this kind of complexity can often be analyzed without even running the code.

Much like the Xcode static analyzer can do that.

And the key to this is understanding the complexity of the code that you're running and how much work is being done.

Now, Computer Science has entire semester courses devoted to this that leaves sophomore students really puzzled.

We don't the have time to get into entire semester but I'm going to give you kind of a crash course and help you have a framework for understanding complexity and analyzing your code.

So, in Computer Science, we talk about something called "Big O" notation.

So this is a way by ranking algorithms by efficiency, generally, by their time efficiency, or memory efficiency, or so on.

And the letter O stands for the order, the order of growth of this and we'll talk about this in detail.

And it really relates to how the performance changes as the scale of the work that you're doing when you increase the number of objects you're working with for example.

So, with Big O notation, what we're really seeking to do is approximate the worst case behavior for an algorithm, the most work that you might ever have to do.

Ideally with an algorithm, you may be able to skip out early and so on if you're doing searches and that type of thing.

But Big O is concerned with how bad could this possibly be?

So then that's my absolute worst case of performance.

And in general, we ignore coefficients and lower order terms and logarithmic bases because what we really care about is the order.

Whether it's n or n squared as we'll go into detail in just a moment.

Now, for any given task, it's important to realize that there are inherent limits.

There are some things that just take time.

We'd like to do everything instantly but that's not always possible.

A perfect example of this, Fred Brooks, a famous software engineer, "Nine women can't make a baby in one moth."

As many women you assign to it, you'd like to do it concurrently, it will be much more convenient for them but it's simply not possible.

There's a hard limit on this, right?

So we're going to talk about Big O complexity.

And I want to identify for you a few key and common order functions.

So I have a table here, in the first column, notation, you'll see how you might see it written and these are pronounced Big O of 1 or order 1, order log n, order n and so on.

You might also hear it referred to by the name constant time, logarithmic time, linear time, quadratic time, and so on 'cause they're interchangeable.

And then provided in the third column some examples of common algorithms or tasks that fall into these different categories.

I won't go into them in detail here on this slide but this is great for reference to come back and see where my task actually fit.

And we'll talk a little bit more in detail about identifying which of these your task belongs to.

Now, a few of these are most common.

You'll see this all over the place and they're very common for algorithms that you'll be working with.

So I'll talk about these in detail.

So the first, before I get into some specific details, let me show you a comparison of these order functions.

So it's easy to see them in a table but maybe it doesn't hit home why this is important.

First off, this is what order n squared complexity looks like, like a parabola, order n log n, order n, order log n, and order 1 or constant time.

Now, the first thing that you can see is these diverged really rapidly.

When you get to a large number of items across the bottom scale, you have some of this that it gets really complex and shoots its way up towards the top.

That means your performance is going to be bad.

You're doing that much more work, right?

The other thing to notice is that down at the bottom left corner, they're really close together.

So for small scale, it may not be necessary to find a really optimal algorithm.

This would be arranged where you want to watch out for premature optimization.

Am I really going to have an issue here, because they're all pretty close together.

So those are some things to be aware off.

So, I'm going to go through some examples of a few sample pieces of code that have somebody's complexities.

And remember, the key for this is, is that we want to look of the growth at the growth of the amount of work as the size of variant [inaudible] increases.

So first is a very simple one.

Let's say that we have an array of integers and I have a function where I pass in an integer value and an index, and I simply want to know does this value appear at that index.

Now you could see we're only doing one operation and you may know that indexing into an array is extremely fast.

So, in this case, it's order 1 regardless of what index I choose, it's going to return nearly instantly.

So if we have a sample array, you can see I'm in query 1 index and say that that value is not there.

Here I found it, and there I've not.

So I don't have to look at anything else in the array, just one spot.

So that makes this order 1.

If we change this up just slightly, now we have an order and algorithm.

So, now I'm interested in seeing if a particular value appears anywhere in an array, not just at given index.

In order to do that, because I don't know where it would appear, if it's there at all, I have to look through each index in the array.

So we have a for loop when we start at the beginning and we go towards the end and then we test to see if it's there.

If at any point I find it, I return yes.

But the worst case is that it's not there in the array at all.

And I have to check thought every single index in order to determine that.

So, if you could see here, I have to march one by one down the array until I find the object or determine that it's not there.

So in worst case, order n.

For order n squared, let's say we have a similar array, I give it a value and I want to say, does this value appear at least twice anywhere in this array?

In order to do that, I have to compare every two possible pairings of indexes in the array to see if they're equal.

So what that means is that I have to do two for loops.

I have an outer for loop and an inner for loop and each of this is an order n algorithm.

And inside this for loop once I get to comparing two indexes, you see, I compare each of these two to see if they're equal and if at any point I find them, I can bail out and say yes.

And you can see that we're doing a lot more work here.

J is moving back and forth like a mad man and I is just trying to keep up, and finally here at the end, we find those two matching values.

Now in the worst case, there may be no values at all.

These two values may not be present in the array or the value may not be present twice in the array but we have to go to the very end to find that out.

Now, some of you may look at this and realize this is a rather naive implementation.

I'm starting the j loop at zero every time.

So, for example, when I get to the very end of the array and j starts at the beginning, I've already compared that one, I was at the beginning and j was at the end.

So that's kind of silly they have to do that.

So instead you could choose to start the inner for loop at i plus 1 for example and you eliminate half of your comparisons.

That's great, that's a great win.

So now we're in the n squared over 2, but it still ends for a growth.

So the key to this is although it's now much faster for a given input size, it's still going to have the same growth properties as you add one more object or one more item to the array, you're going to add n comparisons.

That's the important thing from this, so order n squared.

OK. So, calculating complexity.

We've seen that you can combine this order functions.

For the n squared example I just showed, we have two for loops and each of those has order n and we have nested complexities, we multiply those together, n times n, we get n squared.

When you have sequential complexities, we add those together and then we take the largest term.

So for example, this function were introduced to n squared.

We have one n squared call and two n and three that are constant time but all we care about it the n squared, nothing else, because that's the growth of this function.

OK. So you see that was source code.

However, sometimes you may not have access to the source code or not be aware of what work it's doing, and in such cases, you can often estimate.

You can consider what the code would have to do, what kind of work it's trying to accomplish, and another complimentary alternative is to profile with instruments.

This can be a great way to find performance issues in your code.

There are many sessions from previous years about using instruments and in fact, Xcode 5 as you saw earlier this week introduces new debug gauges including memory and CPU is that it can help you with identifying this kind of issues.

Now, some APIs may appear to be very similar but it's important not to confuse them when you're estimating, just look at the name and say, "Oh, I know what that is" because of the name.

For example, we have a containsObject API on both NSArray and NSSet.

But it would be folly to assume that they're the same.

So let's look at NSArray first.

At the containsObject, we want to see if a given object appears anywhere in the array.

What work does it do?

Well, in this case, the documentation actually tells us explicitly that it sends the isEqual message to each object in the array until one of them returns true or which is the end of the array.

OK. So it goes to each object.

That certainly sounds like order n complexity and that's certainly a valid assumption.

I think that that's pretty reasonable.

Now, we may think, well, what if I can enumerate these concurrently and compare the objects?

That would be sort of a win.

And it would, it would reduce it by a constant factor but it would still be a linear operation with the size of the array.

OK. So let's see NSSet.

NSSet also has containsObject, it has the same object.

I give it an object, it does the same thing.

It tells me whether it's in the collection or not.

It looks just like the one that we just saw in NSArray.

Is it also order n?

You might assume that it would be.

However, it's actually order 1, it's a constant time.

Regardless of the size of the set, containsObject is always really fast, that's great.

There's a caveat, please don't go start replacing NSArray with NSSet anywhere in your code because it's fast.

There's context to understand about why is this and if it will work for you.

So, the reason that it's faster is it must be doing less work.

It's doing the same amount of work, roughly speaking, regardless of how large my collection is.

So talk about that.

The reason that this is is something called hashing.

We'll talk about hash-based organization.

So, NSSet uses a hash table for storage as this NSDictionary.

Every object has a deterministic hash value.

You may have seen the hash method.

This is defined on NSObject we'll talk about in detail.

And it returns an unsigned integer value that you could use to store this object in a hash table.

And equal objects should always have the same hash.

We'll talk about this in more detail and give examples.

When in a hash table, objects are grouped into "buckets", parts of the array based on the hash that they provide.

And the structure has a hash function that takes a hash and maps it to a given bucket.

It can decide that this range of hashes goes first in the bucket, and this one and so on, and it can divide this up.

And the goal of this hash function is to achieve uniform distribution so that you have about the same number of objects in all the buckets in the hash table.

When you achieve this lookup in a hash table, it really only has to consider the objects that are in a particular bucket.

It knows if an object is going to be there it will be in this bucket and it can check isEqual on only a few objects or none at all if there's none in the bucket.

This is an obvious win.

If you have an array that's a thousand objects, you may have to check up to a thousand items in order to see if it's there.

But with the set, you may have to check only a handful.

This is great.

So let me give you a concrete example with an animation here.

Say we have a hash function and we have a hash table we're going to store some objects in.

Let's say that we want to store the names of all Apple CEOs past and present.

So let's start off with Tim Cook.

We ran this through our hash function and it determines that Tim Cook by the hash belongs in bucket 2, great.

So we set him there.

We have Steve Jobs and that gets determined to go in bucket 0, great, no problem.

OK. Now we're going to add Gil Amelio and our hash function decides that it goes in the same bucket as Steve Jobs.

We have a collision in our hash table, rather unfortunate and somewhat awkward.

[laughter] In order to see if we should insert this object, we have to compare so we check isEqual.

And in fact, we see that Steve Jobs is not equal to Gil Amelio, thank goodness.

And so what happens is we move one aside and we chain this together, they both occupy the same bucket, great.

Now we have Michael Spindler, no collision here, goes into bucket 5.

We have John Sculley, and once again we get a hash table collision.

So the hash function decides it goes in the same place.

So once again, we compare each of these objects to see if this is equal.

They are there not equal so we move them aside and we stick that in the same bucket.

So, this is all well and good except you'll notice that now the majority of my objects are all in one bucket in the hash table and half of my hash table is completely empty.

I'm not using it.

This is obviously not optimal.

If I have some sort of simple query to see if someone is in the list for example, and my hash function happens to navigate to the same bucket, I now have to check through each of those objects to see if they're equal to Bill Gates to determine no he's not in the table.

All right, so this is definitely not optimal.

Fortunately, something like NSSet can choose to optimize the hash function on the fly.

If it realizes that your performance is not as good as you could expect it reserves the right to be able to move objects around, to change the hash function and redistribute them evenly.

So here we're in a much better situation and our lookups are restored to being really fast.

OK, great.

So how can you participate in this?

Defining your identity for your object is how you take part of this hash-based identity.

So NSObject as I mentioned defines an isEqual method and a hash method and they're functionally equivalent to comparing the pointers, the object addresses for these two objects, and returning just the address itself.

Now, Apple-provided subclasses of NSObject will override this as necessary.

For example, NSString has its own hash, NSDate, NSNumber and so on, there's a variety that do this, and in fact, arrays, NSSets themselves also have their own hash and isEqual.

Custom objects that you create should choose to override these methods if pointer quality is not sufficient for your needs.

For example, say that you have two person objects and you want them to be considered equal if the Social Security Number is the same for both of them, something to that effect.

And you can learn a lot more about this.

This will be in the slides for reference afterward about objects comparison and implementing these methods.

OK. There are a few rules of the road if you're going to implement these on your own object.

If isEqual returns true for two objects, then the hash must be the same for both of these.

This is critical because if you have two objects that should be the same but have two different hashes, they could get put in different places in a hash table and you won't be able to find them when you're looking for them.

You'll get incorrect results.

However, the same hash value does not necessarily imply that isEqual must be true.

You can have two objects that have the same hash value but isEqual will be the tie breaker.

If you decide to implement isEqual, you really should also define hash for this very reason so that you can guarantee that they are the same.

Now, a good hash as you're implementing one should minimize collisions.

A poor hash will really cause your performance to tank.

All right, for example, in the worst case, let's say that you're like, "I don't know about this hash thing, I'm just going to return one for all of my objects."

They all have the same hash which means they're all going to go to the same bucket in the hash table which means you've now turned an NSSet into an NSArray and killed your performance.

That's the opposite of what we want.

So we want to be careful about that and we'll talk about that in just a moment.

One other key important thing is that when you have hash lookup, your hashes really should be stable and predictable.

If your hash tends to change while it's in the collection, it's really bad.

You're not going to be able to find it.

So there are two options really.

The first is not to modify and object in a collection.

If you have a mutable string and you've put it in as the key or an object in a set, for example, and you change that string and the hash will change and you're going to have all sorts of hilarity and hair-pulling that ensues.

The second option is to not base your hash on mutable state.

This may be an option for you 'cause that's something they consider as well.

So let me show you sample implementation.

You all have the WWDC app on your phones and iPads and there is a news object, as you see in the news update throughout the week, we have these news objects to represent that.

This is a simplified representation.

Here we're just looking as the title and the time stamp.

So I want to focus on these two implementations.

Now, for hash, you see that we need to return some hash value and we're just returning the hash that NSString provides us from this title property.

However, if we have two news items that may have the same title and thus have the same hash, we can break the tie by calling isEqual and compare both the title and the time stamp.

Let me also check to see if it's the same object.

So, there are a lot of examples of things like this in the documentation that I encourage you to check out afterward.

OK. So next I want to talk to you about data structures performance, kind of applying what we've learned.

We've got the map view part out of the way about Big O complexity and hashes and so on.

So I'm going to talk to you a little but about choosing data structures and choosing data structures and the performance characteristics about them.

So let's start with an analogy.

Say that you have books that you need to organize.

There's a lot of different ways that you could choose to organize books.

Say you could sort them by topic or author, title, even size or color.

You can choose an arbitrary characteristic and choose to sort them that way.

Everyone does this with their CD collections and the iTunes have solved that problem now.

But everyone has their own preferred way to sort things and organize them.

However, each of these options makes some things really easy and others difficult.

Say that you've chosen to organize alphabetically by author and then later you want to find all the New York Times' best sellers, or you want to find children's books, those are scattered all over in your collection and it's not easy to look them up when they're sorted by author.

So this is something to be aware of.

It's also important to plan for scale when appropriate as you're choosing your data structures.

For example, it's a completely different thing to organize a small bookshelf here where you can say, "Go find the small green book" and instantly you've all found it because it's small versus organizing a really large library.

Things that will work at small scale may not necessarily when you get to large amounts of data.

You have to have a more intelligent way to organize things.

So, choosing a strategy, every data structure has some tradeoffs.

It has things that it's best at and things that is not as well suited for.

And you want to use the one that best fits your needs.

When you choose a data structure that's not the best fit, it really hurts your performance, for example, using a hammer to drive in a screw, not the ideal tool for the situation.

Even if you have the right tool, you may not be using it optimally.

And if you have a hammer with the nail and you just pressed down really hardly as hard as you can, it's not going to be as effective as whacking that really hard, right?

So, it's really important to choose the right thing for your needs and use it correctly.

And we encourage you to always prefer to use the built-in API whenever possible.

The collections that are available on the foundation framework are extensively tested.

They've been around for years.

We use them just as much as you do and we want them to be really, really fast just as much as you do.

This also using the built-in frameworks also guarantees that in the future when we have improvements such as you'll see in OS X Mavericks and in iOS 7, that you will also get those improvements for free.

Your apps will suddenly become faster which is great.

So, getting down to actually choosing a data structure, there's a few things you have to know for context.

The first is knowing what you need.

Is it important that my object have an order?

Will I have duplicates in my collection?

Do I have to be able to add or remove objects, have it mutable?

What operations are most critical for me to be really fast?

What am I going to be doing with my structure?

And even where my data come from and go to, for example, you may choose differently if your data will come from user input or from a property list file or Core Data or JSON over the web or so on.

There's certain things that may change your decision.

And the second thing is to know what to expect from your collection, which we'll go into in detail.

There are certain things that are broadly to across all of our collections.

For example, getting the count of items in a collection is always really fast.

It's constant time.

We cache that for you because that's a common operation.

Enumerating through all the objects in a collection is generally linear time because you have to visit each of the objects.

Even if you have a set or dictionary, look up this fast but going through each on this going to take the same out of time.

And other operations will vary by the collection.

So, before we go over individual data structures, I want to share this and note about mutability.

The best guideline with mutability, we got a lot of questions about this, is to use it when you need it, when it's semantically correct, when you're adding and removing objects from a collection.

And mutable collections do have benefits.

If you don't need mutability, it can be to your benefit to use an immutable collection instead.

Most foremost among these benefits is thread safety.

If a collection is immutable such as an NSArray, you know that no other thread can add or remove objects and you don't have to worry about exceptions or crashes.

Also, there are memory and speed optimizations that are possible with immutable collections because we know they're not going to change in size.

We can allocate just the exact right amount of memory for the objects that you've provided.

Now, another option is to make a collection immutable afterwards.

Make a copy of it that you can use for quick reference and get those benefits but still be able to build up your collection without a remove.

And lastly, if you do use immutability, it's often great if you can help us to help you.

You know how your collection is being used.

We design it for a broad range of different uses and you know specifically where you're going to do with it.

And there's times that you can provide hence help us get you the best performance.

One example is initializing a collection with initWithCapacity.

Now, what this does is gives us kind of a hint about how many objects you're likely to store in a given collection.

So for example, if you say "I'm only going to store three items in this mutable array and add and remove them", you can provide that capacity upfront and we can optimize it accordingly.

Now, don't It's not wise to try to outsmart the collections and ask for a really large capacity.

The collections are really finally tuned and optimized to take care of that.

And even if you do provide the capacity hint upfront, the collection can dynamically grow and shrink as objects were added and removed.

So this just provides us an upfront way to kind of get in the ballpark of what you need.

OK. So I'm going to go on a quick tour of some of the data structures in foundations in the most valuable players.

Give you a brief overview and talk about their performance characteristics.

First, NSArray and MutableArray.

You're all familiar with this.

It's a work host of the Cocoa library.

It's an order collection.

It always indexed access and you can have duplicate objects in an array.

Operations that are really fast are anything that includes indexing: objectAtIndex, the firstObject and lastObject.

Adding and removing at either end of a mutable array is actually really fast as well.

This may come as a surprise adding at the end would seem really fast but you may think that if you had at the beginning you'd have to shift everything over.

However, that tends to be a common use case for a lot of our developers.

So we've optimized that and we've even the guarantee that inserting at the front, the very front of a mutable array is also really fast.

It's such a great thing to be aware of as well, to not be afraid to insert in the front.

Slower operations include anything that involves search and looking through any of these indexes as we've already mentioned before.

Also, adding and removing at an arbitrary index somewhere in the middle of the array, not either of the ends, would tend to be a little bit slower.

One specialty operation I'd like to call is binary search.

If you're not familiar with this, binary search is a quick way to narrow down your search.

There are a couple pre-conditions.

You have to have some sorted range of data.

So if I know that my array is already sorted in ascending or descending order, I could use binary search to quickly narrow down and cut out half of the objects each time.

And this is a great win because this is an order login operation as compared to order in.

if you remember from the graph, order in was a line that goes up at a 45-degree angle and order login stays really close to the bottom.

It's got great performance as your objects has a scale.

Next is NSSet and NSMutableSet.

This is an unordered collection, it does not allow duplicates, and it has hash-based lookup as we discussed previously.

In sense, adding, removing and searching for objects are really fast, precisely because of the hash-based lookup.

Now, there are some specialty operations that are really convenient on NSSet.

For example, set arithmetic.

If you think of a Venn diagram where you've got two circles that overlap, you can find the intersection between them, you can see who one is completely containing the other and so on.

And if you have mutable sets, you can extend this and intersect the set and modify one set to only contain the objects that also appear in another set or minus set or union set and so on.

This can be really convenient for merging and separating different collections of objects.

Good thing to be aware of.

Caveats with NSSet, when you convert from an array to an NSSet, you can do that but you do lose your ordering of the array and you lose any duplicates that may appear in the array.

Those will be stripped out with the set.

If you convert back to an array, you may get a smaller array and you almost certainly get one that's in a different order that the one you passed in.

Related to this is that you can't store a set directly in a property list or in JSON as we'll talk about in a little bit.

You can convert to an array and back as you're reading them out, but these are caveats to be aware of.

OK. A brief of note on NSCountedSet.

This is really a handy one that many people are not aware.

It's also unordered just a like a set, no duplicates and hash lookup.

It's a subclass of NSMutableSet so it has all the same operations and the same characteristics.

Its big trick though is it attracts the net insertion count of each object.

So, object still only appear once in a count of set, however, if you insert the same object again, the count for that object will increase to two and so on.

If you remove that object, it decreases and if it's removed it set 0.

This can be really convenient if you're trying to count up occurrences of a particular object.

It may be words and text or something like that.

It's kind of a histogram binning type of thing, so great thing to be aware of.

Next is NSDictionary and NSMutableDictionary.

This you're also very familiar with.

It's an unordered collection, it has key value entries to store those associated together, it has unique key values, and it uses hash lookup just a like a set.

So, adding, removing and searching are also really fast in the dictionary, anything where you're looking for an object by the key, adding or removing and searching.

Specialty operations, one thing that's convenient is it has some handy API for reading and writing directly from a property list file.

And also one other thing that was added in 10.8 in iOS 6 is really handy.

If you have a mutable array and you know upfront which keys will ever appear in this mutable dictionary, sorry, mutable dictionary, you can provide a key set in array of those keys upfront.

And there's a class method on the NSDictionary.

You give it an array of keys, it will give you a sharedKeySet object.

If you provide that one creating an NSMutableDictionary, it can create a really optimal hash algorithm for organizing those particular keys.

It knows upfront what's likely to be there.

And it can be a great situation if you need mutability but you want speed and you know the keys you're going to have upfront.

Caveats of NSDictionary, you're probably familiar with the fact that any keys that you provide to NSDictionary are going to be copied in.

They have to conform to the NSCopying Protocol.

And you should never mutate an object that is a dictionary key.

The hash will change very likely and you probably won't be able to find it and you'll get incorrect results.

This is a major thing to be aware of.

NSOrderedSet and NSMutableOrderedSet are an ordered collection, similar to arrays, they don't allow duplicates like I said, and they support both index and hash-base lookup.

This class was introduced in iOS 7 iOS 5 and OS X 10.7.

And it's effectively across between a set and array.

It's not a subclass of either one although you can get an array or set representation that one convenient thing about this are their live-updating.

So as objects are added or remove to the mutable set, those array and set representations will be updated as well.

Now, a caveat of this is that you're going to have increased memory usage.

You can't get the best of both worlds with as a free launch.

So because we're maintaining ordering, we have an array internally.

And because we're guaranteeing uniqueness and no difficult objects, we also have a set.

So roughly double memory usage.

And also when you If you want to store an ordered set in a property list, you'd have to convert it to an array and then back as you bring it out.

NSIndexSet and NSMutableIndexSet differ because they don't store objects, they store indexes, integer values, so it's a collection of unique indexes and these are really handy for referencing some subset of object in an NSArray.

And it's really handy especially because you can avoid the memory overhead of making a copy or saying objects at indexes, you can give it an index set and it gives you a new auto released array with those objects in it.

If you want to avoid that overhead, you can just enumerate through the objects at particular indexes without creating a new array and it's really handy for that.

Index set has really efficient storage and coalescing of indexes.

So if you have a mutable index set and you add the indexes 1 and 3, it will store those separately.

If you add 2 as well, it's intelligent enough to coalesce and say, "I'm going to store the range 1 to 3."

And if you add more indexes that are consecutive, it will group those altogether.

So it's really efficient with storage.

It also supports set arithmetic such as intersection, subset, and difference like we saw with mutable sets.

One caveat of this is when you're using index set and you apply it to an NSMutableArray, if that array changes after you've obtain the index set, you can have all sorts of bad results.

For example, you have a mutable array, you've gotten a group of indexes, and then you remove all the objects or some of them in the array.

If you try to get objects as indexes, you can easily get a range exception because that index no longer exists in array.

So this is something to be aware of.

NSMapTable and NSHashTable are very interesting.

They are very similar to NSMutableDictionary and NSMutablSet.

There's a couple of key differences.

They offer a lot more flexibility with some of these additional options.

You can use pointer identity rather than choosing to use isEqual as you would in these other collections.

It can contain any kind of pointer not just in an object.

It could be a void star or any kind of pointer you can store in here.

You can optionally say that you want weak references so that if all the strong references to an object stored as a key or value in a map table for example should those go away then that entry would automatically be removed for you.

And this works under ARC as well with zeroing references.

And you can optionally say that you don't want to copy on insert as you add an object to a map table, for example, it won't copy the key.

You can specify that.

Now, you can't convert a map table or a hash table that contains these raw C pointers directly to their counterpart, mutable dictionary or so on, because they don't translate over well, and also a caution to be aware of premature optimization.

This is one of the situations that where you think, this is going to be just what I need, but you'll sacrifice some of your compatibility with APIs that require an NSDictionary.

And for most cases, it's probably not necessary.

It's a great tool to be aware of but it's not something that you're generally going.

This was also introduced in OS X 10.5 and of course binding iOS release you should be able to use this broadly throughout your applications.

And last is NSCache.

This is also similar to NSMutableDictionary and it's a great tool to have as well.

Primarily, there's a few big benefits.

First and foremost is that its thread safe unlike in mutable dictionary.

It doesn't copy your keys unlike in mutable dictionary and it supports auto-removal under memory pressure.

So when the OS needs to reclaim memory, it can automatically boot out entries from this cache.

This is ideal for objects that you can regenerate or obtain again easily.

For more detail on this, I highly recommend the talk Building Efficient OS X apps.

A lot of this also applies to iOS as well.

It focuses a lot on memory usage and this [inaudible] and energy.

And it will go into match more depth about NSCache and purgeable table data and what you can do to work well and be a good citizen when memory pressure comes to bear.

So, wrapping up these data structures, I have just two brief asides as it relates to data structures and storing them.

So first property list.

You're all familiar with them that you can store hierarchies of data, XML or binary format.

Each of you at least has an info.plist in your application and may use them in other places.

They support property list objects.

It's a set of six objects in foundation, arrays, data, date, dictionary, number, and string.

Any others that you can, for example, convert to a string, you would have to use NSCoding and archive it into an NSData and store it in the property list.

So you do have flexibility there.

It's good to be aware though that mutability is not preserved.

If I build up mutable dictionaries and arrays in a hierarchy that I like and then store it to property list and read it back out, they're no longer mutable.

I'll get the immutable variance to those.

That's important to know.

Property lists are generally not efficient for lots of binary data like encoding things into NSData and stick them in there, or for really large files, particularly if you're using and XML format.

There may be better alternatives to look into.

You'll see property list commonly with NSUserDefaults for storing user preferences for your application, it's a common way to interact with it.

NSPropertyListSerialization is also a really handy way to deal with this for input and output of property list.

And I encourage you to look at NSKeyedArchiver and NSKeyedUnarchiver if you have needs to store other data in a property list.

The second aside is on JSON.

This is JavaScript Object Notation and originated in JavaScript but it's supported across a broad variety of platforms.

It's a lightweight data interchange format that's human readable and it's really handy, and it's commonly used for things like web services or sending things between different platforms.

And it works with a few Foundation classes.

Some of them are similar to property list.

You'll see array, dictionary, number and string, but there's also NSNull.

If you're not familiar with this, this is null place holder.

Collections in Cocoa don't allow you to store nil inside of a collection, but JSON does allow null objects, and you'll see those transfer across NSNull place folders.

And there are also some restrictions such as the top level object in JSON needs to be a dictionary and you can only use strings for the keys in your dictionary.

You can't use numbers or other things.

OK. And you should definitely be aware of NSJSONSerialization.

This was introduced a few releases ago and it's a really convenient way to deal with JSON.

It supports reading and writing.

You can even pass an object and see if it will form valid JSON.

You can also specify whether you want mutability when you read objects out of JSON which is different from property list.

And this class has been improved and iterated a lot over time.

It's fast and it's built-in, you don't have to link in a static library, for example, on iOS.

You can count on the improvements to efficiency and performance over time.

And you'll see even more improvements in iOS 7 and in Mavericks.

OK. So that wraps up about data structure and performance.

That's a lot of information.

For our final section, I'm going to kind of tie this all together and talk about real world applications.

How we can take all this about "Big O" complexity and talking about data structures performance and how fast they are and apply that to my actual code.

So I'm going to give you example again from our code.

From the WWDC App, we refresh the sessions.

You all saw this on Monday.

You're anxiously waiting for those TBAs to disappear and see what the sessions were going to be.

And we have this functionality to refresh those.

So the data is stored and Core Data on the application, on your device, and we periodically fetch updates from the server to check if there's new information.

Now, we did notice some performances issues in early versions of the applications.

At first we were getting great speed, but as the conference became more fleshed out and more sessions were added, the speed got progressively slower and noticeably slower not just a little bit but a lot more over time.

And it became kind of unbearable.

And when we profile that we found that's there was non-linear growth.

We obviously had some sort of complexity problem doing too much work.

So I'm going to walk you through a simple example here.

So this is what we did at first.

We have an array of sessions that we get from the server, we just fetched, and then we iterate through each of those, so we want to handle each of these refreshed sessions so we have a for loop that's order and complexity.

And then we do a core data fetch to see what we have locally on our device.

We look for a session with that particular ID and then we see if I had a session with that then I'm going to merge them together.

If this is a new session, I'm going to insert it in the Core Data.

So we were seeing non-linear performance.

And probably the suspicion is this may be n squared.

Well, the for loop has to be there and that the part at the end about merging sessions looks pretty simple.

So the work must be somewhere in between, right here in the core data fetch.

And let's think about what work it has to do.

Well, we're forming a predicate to search where we say, "Find me a WWDC session objection with the session ID as equal to this session I just gave you.

So when I tell core data to do that, we think, what would they have to do?

Well, because it may not have ordering on the other side, probably what it's doing is it's going through each of the sessions and seeing if that session ID matches which is an order n algorithm, right?

So there we have our hidden work.

We have an order n loop and order n for finding the session as core data fetch.

That's where n squared is.

So, let's just hoist that right out, let take the core data out, we'll optimize that a little bit.

Let's say that we now change it so that instead of fetching one session at a time, we get the list of session ideas that we've just gotten and we fetch all of those at the same time.

Now we have them all in one array and that's great.

We should see order and performance now, except we don't.

If you have keen eyes, you'll notice there's still some work left inside the loop.

Now, although we know what session it is and we've just moved the session the problem or the work out of core data of finding that session, we'd moved into a line where we're calling index of object.

Now we have an array of all these sessions but we're still doing a linear search.

We're stepping through each of these until we find the one with the right session.

So, we haven't really eliminated the problem.

But there is a great way around it.

Since we're using session IDs that are unique across these, do a slight modification, and after we've gotten all those sessions upfront, we create a dictionary where the objects are the sessions themselves and we key them by the session ID.

This is going to be unique.

And now, inside the loop, you can see that we just have look at that dictionary and find the one the session has that particular session ID, and look up in a dictionary is order 1, it's no longer order n.

So we just solved our problem.

We've taken this algorithm from order n squared to order n, and now all of you can get your session updates really quickly.

There's not 5,000 of you waiting forever for all those sessions to reload.

So that's a real world example from our own application.

So, I want to also give a couple of tips about eliminating extra work.

Really what you want to do is minimize redundancy and in particular expensive code.

So what we just saw in the previous example was each time to the for loop we we're searching through the same array just for a different session ID.

If there's a way that we can get that out of a loop and do that work just once, do it less frequently, that's a big one.

Also, take advantage of faster lookup when possible like we did at the NSDictionary.

Sets can also be useful for this one, that's appropriate.

Using mutable collections and strings when it makes sense can also be a big performance win.

And I'll show you an example of that in just a moment.

And also streamline how you access your data.

Don't make it hard for yourself to get to the data that you need as quickly as possible.

You can organize your data in such a way that it's structured intelligently and your lookup will be really fast.

And again, try not to reinvent the wheel.

There's times where you need to write your own code but there are situation, for example, joining an array of components with strings.

You may have a group of strings that you want to comma separate.

You could write your own code to do that and append into a string and so on but there's already API where you can give an array of objects and give it some string that you want as the delimiter and it can join them together for you.

So wherever possible, use those as your benefit from the optimizations and eliminate extra work, you know, that we're doing as fast as we can think to do it.

So, an example on eliminating this extra work, let's say that you have a method that takes an array and I want to scan through that and do something with these objects and if the result is not nil, I want to add it to a new array and I'm going to return that back from this function, from this method.

The big problem here is really each time we're creating a new immutable copy of an array.

This is an obvious no-no and we can avoid this.

Instead of doing that, you can create a mutable array upfront and add the objects to the array which is going to be really fast, and at the end when we're done, we can call copy which gives us back an immutable NSArray and we can return that.

So we take a full advantage of the mutability when it make sense for us and we still fulfill the contract of our method by saying we'll return an actual NSArray.

And you could do some other things with appending to it, NSMutableString or NSMutableData and so on.

The key takeaway from this whole section and I'd like to say, don't leave performance on the table.

If there's performance there just waiting for you to take it and be faster, don't leave it there.

Take advantage of it.

You know, if I could be using an NSDictionary for this rather than an NSArray, I might as well do that, you know.

You have to weigh the benefits, the pros and cons of choosing either way.

But I really encourage you to think deeply about the performance that you're going to expect to see with your data structures.

And so, a lot more information that you can found out is Dave DeLong is our App Frameworks Evangelist and he'll be a great point man for any questions you have on this and also on the Cocoa Labs, in Developer Forums, and then I have three documentation links that are really useful for more in-depth understanding of collections, property lists, and archive sand serializations and so on.

There's two related sessions, I mentioned the one on Building Efficient OS X Apps and Hidden Gems in Cocoa and Cocoa Touch is immediately following in this room.

This is a great overview of a lot of things you may not be familiar with or even aware of throughout the Cocoa and Cocoa Touch frameworks, also in XCode and different things like that.

So just to summarize, I have a few key takeaways that I want you to remember if nothing else from the session.

The first is that complexity kills large-scale performance.

If you have a lot of work and your complexity grows over time, when you get to large amounts of data, your performance will tank.

You need to be aware of that.

Second, know how much work your code is doing.

Know what it's actually trying to accomplish, that's where a lot of the hidden complexity lies.

Avoid redundancy and strive for efficiency, it's the last two points of how you can resolve this kind of performance issues.

If you're doing something over and over again, take it out of a loop.

Do it just once.

If you're doing something in an inefficient way, try to find a better algorithm.

Focus on the biggest performance wins first as we talked about Andahl's Law.

Find the bottlenecks and eliminate them first.

Don't prematurely optimize.

Prefer to use the built-in collections and API wherever possible so you can benefit from optimizations and improvements over time.

Design according to your particular needs.

Choose a data structure that fits how you need to access it.

Something that will make it really fast to the tasks that you need to do and it's going to be convenient.

And last, think about performance early.

There's a difference between premature optimization and being having common sense about the performance of your application.

When you use some of this knowledge about understanding how complexity can affect performance, it enables you to make great decisions upfront about how to store your data so that you can access it quickly, you'll be happy, and your users will be delighted with the performance of your application.

Thank you.

[ Applause ]

Apple, Inc. AAPL
1 Infinite Loop Cupertino CA 95014 US