Advanced Performance Optimization on iPhone OS, Part 1

Session 135 WWDC 2010

Quick, nimble, and efficient apps provide the best user experience on iPhone OS. Learn advanced techniques for optimizing memory usage and making efficient network requests. Get expert instruction on how to use Instruments and other tools to examine your application's memory footprint, monitor network activity, and track CPU utilization to fine tune application performance. This is the first part of a two part series.

David Chan: Good afternoon.

My name is David Chan.

I'm on the iOS Performance Team, and I'll be joined later by my colleague Peter Handel, who's on the iOS Power Team.

So today, we're going to talk about Advanced Performance Optimization on iPhone OS.

And in particular this is Part 1 of two.

We're going to be talking about animations, scrolling, responsiveness, and battery life today.

"The iPad is a far slower machine than a modern MacBook in terms of raw hardware performance, but feels faster in many ways, because you never have to wait for it."

This is a great quote.

Basically, what this is saying is this is why you're here today.

People love using the iPad, the iPhone and iPod Touch because it's a magical experience.

And a large part of that experience is you never have to wait for it.

So, great performance is all about creating an outstanding user experience.

So today, we're going to be covering the most advanced topics, for our most advanced developers.

We assume that you've already written an iPhone application, you've played with all the different aspects of it, and you've seen the challenges.

Today, we're going to be covering animation and scrolling, responsiveness, and battery life.

Tomorrow, we're going to be covering memory, databases, and I/O.

Be sure to show up for that one as well.

So, across both of these talks, we're going to be trying to give you a framework to solve your application performance challenges.

Now what do we mean by that?

First, we want you to learn as much about the system as possible.

You're going to use that knowledge as a mental model so that when you come up with performance issues you can think creatively about solving them.

And finally, we want you to really measure progress.

We built great tools into the iPhone SDK and we want you to use them to see exactly what's happening on your system.

Don't guess; be able to see that the changes you're making are making real progress.

So let's jump in.

Animation and scrolling is first up, so let's begin.

So what we're going to be covering today is we're going to go behind the scenes of the animation.

Now you're all familiar with how to create an animation.

We're going to see what happens after you commit that animation.

We're going to go into how to keep your animations responsive, how to keep them smooth and how to keep all these nice scroll views in your system very, very nice and smooth.

So, here's a timeline diagram of a typical animation.

Now it has three stages and we'll walk through them one by one.

First, you create your animation.

Now this is pretty simple.

You've probably seen this before.

You use UIView to create the animation.

You change some part of your view hierarchy.

You maybe change some properties.

And then you commit it.

So that's step two.

Now this is where the system calls your layoutSubviews and drawRect methods.

Now this is when the system is ready to take your animation, ship it over to the render server, and have that animation show up for the user.

And that's the third stage.

Every single frame is rendered by the render server for the length of your animation.

So let's start with step one.

So, this should be pretty simple code for you guys, pretty familiar stuff.

We're creating a view hierarchy starting with that inside view.

We're putting a little scale on to that.

So that starts out very small.

We're going to begin the animation and add that new view hierarchy to our existing view hierarchy and then bring the transform up so it will become full size.

So let's see what that looks like, great.

So, this is what we just saw.

We saw the existing view hierarchy there and then we add a new part to that.

So that's the card with image and that great label.

So as you can see, they're not quite filled in yet.

The next thing that happens in stage two is that the animation is prepared for commit by calling layoutSubviews and drawRect on each of those new views.

So they just got filled in and now they're ready for commit.

So, we have the new view hierarchy on the left here and what the render server thinks is going to be displayed on screen.

And a major part of commit is that this gets sent over.

So now the two are in sync, and the transaction of the animation is committed to the render server like you would commit a transaction to a database.

So now we're on to stage three.

Now, for every single frame of your animation, for the length of the animation, the render server goes through these four steps.

It takes the current time and looks over the tree that it has for your application and sees what animations it needs to update for the current time.

So in our example, we have that scale starting very, very small and coming up to full size.

Now, every 1/60th of a second, that scale gets interpolated to a new value so it gets just a little bit bigger, a little bigger on every frame.
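To make the per-frame interpolation concrete, here is a minimal Python sketch; it is not code from the session. It assumes a linear timing curve (the system's real timing curves may differ), a start scale of 0.01, and the roughly 300 millisecond duration mentioned later in the talk. The function names are hypothetical.

```python
FRAME_RATE = 60  # target frames per second, as stated in the talk


def interpolated_scale(start, end, duration, elapsed):
    """Linearly interpolate a scale value, clamped to the animation's duration.

    Linear timing is an assumption made for illustration only.
    """
    progress = min(max(elapsed / duration, 0.0), 1.0)
    return start + (end - start) * progress


def scales_per_frame(start, end, duration):
    """Return the scale the renderer would use at each 1/60 s frame boundary."""
    frame_count = round(duration * FRAME_RATE)
    return [
        interpolated_scale(start, end, duration, frame / FRAME_RATE)
        for frame in range(frame_count + 1)
    ]


# Hypothetical values: grow from 1% to full size over 0.3 seconds.
scales = scales_per_frame(0.01, 1.0, 0.3)
print(len(scales), scales[0], scales[-1])
```

Each entry in `scales` is slightly larger than the previous one, which is the "a little bigger on every frame" behavior described above.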

So the second part is we calculate the screen region that needs to be updated.

This is important because our render server is really, really great at figuring out, "Well, we don't need to update the whole screen, we just need to update the things that have changed."

So it can walk over the tree and figure this out in a very, very nice way.

The third thing it does is it takes the whole view hierarchy and constructs a scene using quads, using the images that you drew and your image assets as textures, and creates a series of graphics commands for the GPU to render this scene.

Once it's handed off to the GPU, it tells it to present the rendered update to the display, and this is a lot of stuff.

And again, it happens on every single frame.

Every 1/60th of a second, that's about 16 milliseconds.

Not a long time right?

So that's behind the scenes of animations.

Let's talk about what can go wrong in stage two.

So, like I said before we create, commit and render an animation.

And when you create the animation, you can often set a delay or, I'm sorry, a duration, and when you create that duration, you're basically saying, "I want the animation to start and end within this amount of time."

So as you can see, the start of that duration starts as soon as you create the animation.

Now if you spend too much time preparing and committing that animation, you know, maybe you spent too much time drawing, you spent too much time laying out subviews, creating views, that can end up with a pretty serious delay.

So what can you do?

Well, the first thing that we'd say you should do is to draw less when you're preparing.

So only invalidate views that need to be updated; only call setNeedsDisplay on visible views, because if you call it on hidden views, they'll still get drawn; and only implement drawRect when absolutely needed.

Now even if you have an empty drawRect, this still matters, because the system has to allocate a backing store for it.

Second, you should invalidate smaller regions of large views.

If you have large views and, let's say, you implement something like a painting program, we have a great mechanism built into the system: you can implement a smart drawRect and use setNeedsDisplayInRect so that you can, for example in the painting program, only invalidate the regions around touches.
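The saving from invalidating only the region around a touch can be sketched with plain rectangle arithmetic. This Python sketch is an illustration, not session code: the `Rect` type, the `dirty_rect_for_touch` helper, the 320 by 480 canvas, and the 10 point brush radius are all assumptions. In a real app, the resulting rectangle is what you would hand to setNeedsDisplayInRect.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: float
    y: float
    width: float
    height: float

    @property
    def area(self) -> float:
        return self.width * self.height

    def intersection(self, other: "Rect") -> "Rect":
        """Return the overlapping region, or a zero-sized rect if none."""
        left = max(self.x, other.x)
        top = max(self.y, other.y)
        right = min(self.x + self.width, other.x + other.width)
        bottom = min(self.y + self.height, other.y + other.height)
        return Rect(left, top, max(0.0, right - left), max(0.0, bottom - top))


def dirty_rect_for_touch(x, y, brush_radius, view_bounds):
    """Square around the touch point, clipped to the view's bounds."""
    stamp = Rect(x - brush_radius, y - brush_radius,
                 2 * brush_radius, 2 * brush_radius)
    return stamp.intersection(view_bounds)


canvas = Rect(0, 0, 320, 480)  # hypothetical full-screen painting view
dirty = dirty_rect_for_touch(100, 100, 10, canvas)
print(f"redraw {dirty.area:.0f} of {canvas.area:.0f} pixels")
```

Under these assumptions a single touch invalidates a few hundred pixels instead of the whole canvas, which is why a drawRect that respects the dirty rectangle can be much cheaper.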

And finally, if that isn't your kind of program, you know, you should think about taking large views and decomposing them into the parts that stay the same and the parts that change.

So the other part of preparing an animation is if you use image assets.

Your image assets are going to get decompressed in stage two.

So we often see big delays because people are using, you know, very, very large images or formats that just aren't appropriate for the device.

So try to decompress and rescale big images sparingly.

This will, you know, keep your animations quite responsive.

And try to use the, you know, formats that are optimized for iPhone.

iPhone-optimized PNGs, the ones that are added to your Xcode projects, are great, and JPEGs and TIFFs have different tradeoffs.

JPEGs are small on disk but have, you know, perhaps lost quality, whereas TIFFs are very large and so they take a little bit more time to read off of storage.

Finally, if you create any custom CGImages, we highly encourage you to use the UIGraphics convenience functions.

This will take care of all the nitty-gritty details.

And the reason I'm mentioning this is that if you get those details a little bit wrong, it's possible that the system will have to copy those images for you into the right format.

So try to avoid those.

And you can use the Color Copied Images debug option in the Core Animation Instrument to see them.

So this is a big topic.

People want to know how to make their animation smoother.

So we're going to go through and talk about exactly what happens behind the scenes.

We're going to talk about some specific examples of how you can improve your animations and we're going to look at a new feature in iPad and iOS 4 that I call dynamic flattening that can help you create smoother animations.

So let's begin.

So, the render server tries to render the frames of your animation 60 times per second.

This again is not very, very much time to do any form of rendering.

So, fewer pixels to render, means smoother animations.

That means fewer input pixels.

So, if you have very, very large images for small view, that's not going to be great.

If you have a lot of blended views, you're going to end up with a lot more output pixels.

And finally, fewer rendering passes also mean better animation.

So like I said in the beginning, we want to get you guys to measure what's going on here.

So we're going to be using the Core Animation Instrument and this is really simple to use.

I hope every one of you has tried using this.

You just plug in your device, launch Instruments, select your application, and hit record, and it'll show you the number of frames that were rendered in the last second.

It gives you on a second by second basis that count.

Now, when you're measuring, you want to make sure that you measure the base line as in what it looks like right now and any changes that you make.

So you make sure that you're actually making improvements.

So I said before that this is a count.

Now, one thing to keep in mind is our animation from before was only about 300 milliseconds or so.

And when I measured this the first time, I only saw that I got 18 frames and I was really disappointed; I thought that was quite slow.

But it turns out that's actually the max that we're shooting for, right?

So, 60 frames per second is our target rate, and although I only rendered 18 frames, that was the correct number.

So you have to do that division if you're going to be measuring stuff at subsecond levels.
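The division is short enough to write out. This Python sketch is an illustration rather than anything shown in the session; the function names are hypothetical, and durations are given in whole milliseconds to keep the arithmetic exact.

```python
TARGET_FPS = 60  # target frame rate named in the talk


def expected_frames(duration_ms, target_fps=TARGET_FPS):
    """Frames a perfectly smooth animation of this length would render."""
    return round(duration_ms * target_fps / 1000)


def effective_fps(frames_rendered, duration_ms):
    """Convert a raw frame count over a sub-second animation to a rate."""
    return frames_rendered * 1000 / duration_ms


print(expected_frames(300))    # frames to expect for a 300 ms animation
print(effective_fps(18, 300))  # rate implied by 18 frames in 300 ms
```

Comparing the counted frames against `expected_frames` for the animation's duration tells you whether a low per-second count is a real problem or just a short animation.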

But there's one thing that you can do, and it's a really neat trick that I actually like a lot.

You can lengthen the animation to a few seconds for much, much better measurements.

And this helps reduce measurement jitter.

This allows you to get the timing just right.

And here we actually see that when I do that, I was really happy and surprised that I got 60 frames a second over the course of that animation.

So, fewer pixels to render mean smoother animation.

Now I've mentioned that the render server actually looks over the whole view hierarchy and tries to figure out exactly which screen regions need to be rendered.

Now you can actually see what it figured out it needed to render, using the Core Animation Instrument again.

And we're going to be using the Flash Updated Regions checkbox.

Now, what this will do is it will cause parts of your application that are being updated by the renderer to flash yellow.

Let's take a look at what that looks like.

So, we're bringing in the card, it flashed yellow, that's exactly what we want to see.

So that gives you kind of a baseline to figure what parts of the application I actually need to fiddle with for this animation that I care about.

So, let's jump in to one specific way that you can improve the smoothness of your animation.

We said this before.

You want to reduce the amount of view blending in your animation and in your view hierarchy.

So again, let's take a look at how you can see that.

So in the Core Animation Instrument, we're going to be checking Color Blended Layers.

And let's take a look at what that looks like.

So, we have the same view from before except that now the opaque regions are shaded green and the blended regions are shaded red.

And as you can see towards the top, the regions that are even more deeply blended are a darker red.

So, why does this matter?

Well, the graphics system can perform only a certain number of pixel operations per frame to maintain a smooth frame rate.

And blending requires more operations per on-screen pixel.

So if you're just putting down an opaque view onto screen, you can just write each of those pixels out, right?

If you have to blend, the graphics system has to read the value there and then write the ending pixel because it needs to figure out what it's actually blending with, right?

So the second reason is, the graphic system supports efficient hidden surface removal.

That is if something is fully occluded then we can get rid of it.

And it doesn't even have to touch that surface.

Never even gets rendered.

But it can only avoid views that are completely occluded by opaque views.

So by making sure that all the views that should be opaque are opaque, you can help out the graphics system and it can render your animation very smoothly.

So, I mentioned that the graphics system can perform a certain number of operations per frame and we're going to take a look at what that means for opaque views and for blended views.

So here on the left, we have that view of my animation with Color Blended Layers on and you can see the opaque ones are green and the blended ones red.

And on the right, we have a rectangle that represents the approximate number of pixel operations per frame at 60 frames per second.

So let's see what happens when we bring over the opaque pixel, the opaque views when they get drawn.

I'm sorry, when they get rendered.

So as you can see they're overlapping over on the actual view and they're overlapping here as well.

Now why is that?

That's because our graphics system supports something called deferred rendering.

Basically, what that allows us to do is figure out what's overlapping what, so we don't need to spend any more time drawing things, I'm sorry, rendering things that are just going to be overlapped anyway.

However, we can still have these blended views left.

So let's see what happens when we bring those over.

So as you can see, these blended views don't take advantage of this optimization, and so they use up even more pixel operations per frame.

So what can you do about it?

Well, first it's important to understand how views get marked as needing to be blended.

Contents determine blending.

So there are three ways that happens.

First, views that are drawn are opaque by default, and you have to actually set the opaque flag to NO and implement drawRect.

Once you do that, it's blended.

Second, if you use image assets like PNGs, they can often contain an alpha channel.

If you look at your image assets with Preview and hit the Get Info button, you can actually see whether or not your PNGs have alpha.

And if you didn't intend for that image asset to be blended in your system, then you really should go in to your Image Editor, resave that, and get rid of that alpha.

The third way that views can become blended is by creating opaque, sorry, creating custom CGImages.

And if you use the UIGraphics convenience functions, we make it really easy to have you pass YES for opaque and then the image ends up being opaque.

So these are the ways that contents become blended or opaque.

And the next question is, well, what else can you do about it, right?

Once you've fixed all the accidental blending in your views, let's say you still have a lot of blending.

Well, the next thing to remember is that it's the number of pixels that are rendered and the number of pixels that are blended that impact the performance of this.

And so, if you decompose a large blended view into the parts that actually need to be blended and the parts that are still opaque, even though that ends up being more views, that still ends up with better performance.
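A rough cost model shows why the decomposition pays off. This Python sketch is an illustration under stated assumptions, not a measurement: it charges one operation per opaque pixel (a write) and two per blended pixel (a read plus a write, as described earlier), and the 320 by 100 view with a 10 pixel blended strip is a hypothetical example.

```python
OPAQUE_OPS_PER_PIXEL = 1   # assumption: write only
BLENDED_OPS_PER_PIXEL = 2  # assumption: read the destination, then write


def pixel_ops(width, height, blended):
    """Approximate per-frame pixel operations for one view."""
    per_pixel = BLENDED_OPS_PER_PIXEL if blended else OPAQUE_OPS_PER_PIXEL
    return width * height * per_pixel


# One large view marked as blended, although only a thin strip needs alpha.
single_blended_view = pixel_ops(320, 100, blended=True)

# The same content split into an opaque body and a small blended strip.
decomposed = (pixel_ops(320, 90, blended=False)
              + pixel_ops(320, 10, blended=True))

print(single_blended_view, decomposed)
```

Under this model the two-view version does a bit more than half the pixel work of the single blended view, even though the hierarchy has more views.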

OK. So that's view blending.

Now we're going to move on to talking about Offscreen rendering.

So what is Offscreen rendering?

Well, to achieve certain effects on our system, the compositor or the render server needs to use a temporary offscreen region in order to achieve the final effect.

And one way of thinking about this is how a painter will use their color palette and take two colors together and mix them before painting on their final canvas.

So why is this slow?

Well, besides the fact that you are rendering more pixels, you're rendering to this offscreen context and then taking that and then rendering to the final display.

You're also switching between the main and offscreen contexts, and that will stall the graphics pipeline and, you know, really hurt performance.

So let's take a look at how you can detect this in your own animations.

So again, we're going to use the Core Animation Instrument and we're going to be checking the Color Offscreen-Rendered Yellow flag.

Now this shades yellow portions of your animation that had to be rendered offscreen and then back to the main screen.

So this isn't quite as easy as blended views.

I can't just tell you to go find the blended views that aren't supposed to be blended and get rid of them.

Avoiding this kind of offscreen rendering requires some creative solutions.

So let's take a look at a couple of examples and some workarounds that we've come up with for you.

So let's say I have an animation where I take an image with a background color and I fade opacity from solid to blank.

So it looks something like this: we have the image, we've set the background color, and we begin an animation, set the alpha to 0, and it fades away.

Now to composite correctly, the image needs to be composited over the color at full opacity offscreen and then blended into the view.

So let's take a look at that.

So, you can see that the blue in the iTunes icon and the greens in the face in the photos look right.

They're faded over black.

They're just dimmed a little bit.

Now, what do I mean by "correctly"? One naive way of trying to avoid this kind of offscreen rendering is just to say, what happens if I just blend that orange color into the background at the lower opacity and then blend in the image?

So you basically break this out into two separate layers.

Let's take a look at what that looks like.

So as you can see, the image doesn't look quite right.

You know, the blue isn't quite right and, you know, the face in the background gets this orange tint.
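The orange tint falls out of the blending arithmetic. This Python sketch is an illustration, not session code; it assumes standard source-over blending, non-premultiplied RGB values in the range 0 to 1, a fully opaque background color, and a half-way point in the fade. The function names and color values are hypothetical.

```python
def over(src, src_alpha, dst):
    """Source-over blend of one RGB color onto an opaque destination."""
    return tuple(s * src_alpha + d * (1 - src_alpha) for s, d in zip(src, dst))


def fade_correct(image, image_alpha, color, backdrop, fade):
    """Flatten image over its background color first, then fade the result."""
    flattened = over(image, image_alpha, color)  # the offscreen pass
    return over(flattened, fade, backdrop)


def fade_naive(image, image_alpha, color, backdrop, fade):
    """Fade the background color and the image as two separate layers."""
    tinted_backdrop = over(color, fade, backdrop)
    return over(image, image_alpha * fade, tinted_backdrop)


blue, orange, black = (0.0, 0.0, 1.0), (1.0, 0.5, 0.0), (0.0, 0.0, 0.0)

correct = fade_correct(blue, 1.0, orange, black, fade=0.5)
naive = fade_naive(blue, 1.0, orange, black, fade=0.5)
print(correct, naive)
```

In the correct version an opaque blue pixel simply dims toward black. In the naive version the half-faded orange layer shows through the half-faded image, so red and green leak into a pixel that should contain only blue.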

So let's take a look at some other ways that you might approach this.

So one workaround is to composite the background color and image in drawRect first just like that.

And then when you fade it out, so here we've actually drawn it, and then when it's actually faded out, when it's being rendered, it doesn't have to go offscreen and just fades out very nicely.

So this falls in the category of thinking about what the graphic system needs to do to render that offscreen and do it ahead of time using Core Graphics in drawRect.

So, one more workaround.

So if we're fading over a static background like here, this is great.

We have a black background.

You know, nothing is changing behind it.

There are no patterns.

You can try fading in a view that contains the background over the view instead.

So it might look something like this.

We just create a new view with the bounds of our view.

We set the background color to black and then we fade in from 0 to 1 instead.

So let's see what that looks like.

It's great, it looks exactly the same.

So the lesson here is, if you see a situation where you have offscreen rendering, there are often alternate ways of getting the same visual effect by using a different technique.

Let's take a look at another example.

So new on iPad and iOS 4, CALayers now support this great property called cornerRadius.

And this allows you to get these nice rounded corners on your views really easily; you just pop that property on.

Now, animating a view with a rounded corner mask, as you will find out, requires the renderer to go offscreen, render your image, and then apply the mask that contains the rounded corner.

Now, this actually applies to all masking that's non-pixel aligned.

So if you have an arbitrary mask layer that you assign to a CALayer, you'll see this as well.

Or if you're moving a view that has clipsToBounds set and it's moving on non-pixel boundaries.

In any case, let's take a look at the rounded corner example.

So, what can we do about it?

Well, again, we have two workarounds.

The first is to try to achieve the same effect in Core Graphics ahead of time.

New on iPad and iOS 4 is this great UIBezierPath API, which allows us to just draw a path for that rounded rect with those great rounded corners, and we set that as the clipping region for whatever we draw into our background.

So when we draw, the rounded corners come with it.

Now of course, this only works if the corners don't have to clip anything that actually goes outside of the rounded areas.

But in our original situation, it works great.

So the second workaround is to decompose rounded corners into separate views.

So basically, all that means is that you're creating four small views that are positioned around your view, each with a little black sliver drawn into it.

And here's some code for the top left corner.

That's pretty simple.

Again, this is one of those cases where you can achieve the same visual effect just by using a different technique and you can avoid offscreen rendering.

OK, those are some ways that you can avoid offscreen rendering.

Remember fewer pixels to render means smoother animations.

And fewer rendering passes also mean smoother animations.

So let's talk about that new feature I talked about, dynamic flattening.

Now this is new in iOS 4 and on the iPad.

And the reason we added this is that animating changes to a complex view hierarchy can be choppy.

So here we have the little subhierarchy that we added to our existing hierarchy in our app for the Core Animation.

Now why do I say this is slow?

Well, it renders the hierarchy on every single frame.

So as it's scaling in, it walks over that tree, renders those subviews together, and then scales it.

So this animation would be smoother with a flattened hierarchy, right?

You do the work once.

You draw it in drawRect and then just have that scale over time, which is surely a lot faster.

But as I'm sure you guys know, it's kind of a pain to change your whole view hierarchy, and then you lose the dynamism of being able to move views around independently.

So now you can flatten without changing your view hierarchy using shouldRasterize.

Let's see how you can use it.

So like I said it's a CALayer property.

Generally, the way you want to use it is you want to turn it on before animations and turn it off after animations.

Let's take a look at some code.

So this is using the new animateWithDuration using blocks.

So I'm just going to quickly walk through this example.

So we created that scale before.

Before the animation starts, we set shouldRasterize.

And then during the animation, we just set the transform to identity so that brings it all the way up, and then when the animation is done the system will call my completion block here, and all that it does is set shouldRasterize back to NO.

So let's see what that looks like.

Let's see how it works.

So here we have the subview hierarchy again.

And we're going to hint to the compositor that it should render this view hierarchy offscreen and then cache it.

Now I just warned you a whole lot about avoiding offscreen rendering.

But I want to assure you, this time this is offscreen rendering for good.

And this is how it's going to work.

When we render for the first time, it's actually going to render into this offscreen region and then it's cached.

We're actually going to keep this around from frame to frame.

And then on each step of the animation, it's going to get rendered over just like that.

So this is actually really, really nice but it can hurt more than it can help.

Don't turn it on everywhere because there's a limited cache size.

And if you start setting lots of views with shouldRasterize, you're going to overflow the cache, and that ends up in a really, really bad situation, much worse than before, because essentially you're rendering every single view that you set with shouldRasterize offscreen and then back onto the screen, and we just talked about how doing that in every frame can really, really hurt your animation performance.

And like any good cache, it throws away old results.

So if you change anything in your view hierarchy during your animation, the render server actually has to throw away that cached copy and then render a brand new one in order to actually show the proper results.

So make sure you don't change anything in your view hierarchy while you have shouldRasterize on, otherwise you end up rendering offscreen without great performance.
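Both failure modes, overflowing the cache and invalidating it, can be seen in a toy model. This Python sketch is an illustration only: the talk does not describe the render server's cache size or eviction policy, so the capacity of 4 entries and the least-recently-used eviction are assumptions, and every name here is hypothetical.

```python
from collections import OrderedDict


class RasterCache:
    """Toy model of a bounded cache of rasterized layers (LRU is assumed)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # layer id -> cached content version

    def needs_offscreen_render(self, layer_id, version):
        """Return True when the layer must be re-rendered offscreen."""
        if self.entries.get(layer_id) == version:
            self.entries.move_to_end(layer_id)  # cache hit, mark as recent
            return False
        self.entries[layer_id] = version        # miss or stale contents
        self.entries.move_to_end(layer_id)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)    # evict least recently used
        return True


def offscreen_renders(layer_count, capacity, frames, mutate_every_frame=False):
    """Count offscreen renders over an animation of the given frame count."""
    cache = RasterCache(capacity)
    renders = 0
    for frame in range(frames):
        for layer in range(layer_count):
            version = frame if mutate_every_frame else 0
            if cache.needs_offscreen_render(layer, version):
                renders += 1
    return renders


print(offscreen_renders(3, 4, 60))        # layers fit in the cache
print(offscreen_renders(6, 4, 60))        # more layers than cache slots
print(offscreen_renders(1, 4, 60, True))  # contents change every frame
```

In this model, layers that fit in the cache are rendered offscreen once each, while an overflowing cache or contents that change every frame force an offscreen render on every frame, which is the bad case described above.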

So that's smooth animations.

Remember, rendering fewer pixels means smoother animation. That applies to blending, which means fewer output pixels, and it applies to rendering passes as well, so reduce the amount of offscreen rendering.

So let's talk about scrolling.

Now, I talked a lot about animations first.

But a lot of you probably care a little bit more about scrolling. Now, why did I do that?

Well, it turns out that each frame of scrolling is a small animation.

When you flick that scroll view, every 1/60th of a second it issues a brand new animation by calculating a new scroll position, and that's the implicit animation.

It's going to prepare and commit that animation.

So if you have a new cell coming on screen, your layoutSubviews gets called, your cellForRowAtIndexPath gets called.

And the compositor has to render a brand new frame.

So, the animation advice I gave earlier totally applies.

Prepare your cells very quickly and then render very quickly.

So here's another timeline diagram like I showed you before.

As you can see these animations are squished together really tight because they have to happen within 16 milliseconds in order to get that nice scrolling effect.

So first thing that happens is that we create an animation implicitly by calculating a new scroll position.

We prepare and commit the animation.

So this is what happens when your cell gets laid out and this is where all the drawing happens.

And finally the frame is rendered.

Now, when I say the frame is rendered I really do mean the whole table view gets rendered, right?

Because every time you scroll, each of those cells is moving to a different position.

So, prepare cells quickly.

Now, there are two major parts to preparation, right?

There's layout and there's drawing.

And on the layout side, that's when the table view gets to tell you, "Well, you've adjusted the scroll position.

Now give me your new cell if there's a new cell appearing on screen."

So you want to use dequeueReusableCellWithIdentifier.

Reuse those table cells.

It's not necessarily an advanced tip but we do have to mention it.

You will save a ton of time creating objects and backing stores for each of those cells.

And, you know, be sure to use a unique identifier for each group of similar cells.

If you have lots of different kinds of cells in your table view, don't spend the time to transform one to another.

Use a little bit extra memory and give them different identifiers.

You can save time and you can get those cells up really quickly.
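The effect of reuse with one identifier per kind of cell can be modeled in a few lines. This Python sketch is an illustration of the idea behind dequeueReusableCellWithIdentifier, not UIKit's implementation; the `CellReusePool` class, the dictionary standing in for a cell, and the row counts are all hypothetical.

```python
from collections import defaultdict, deque


class CellReusePool:
    """Toy reuse queue: one list of free cells per reuse identifier."""

    def __init__(self):
        self.free = defaultdict(list)
        self.created = 0

    def dequeue(self, identifier):
        """Hand back a recycled cell of this kind, or create a new one."""
        if self.free[identifier]:
            return self.free[identifier].pop()
        self.created += 1  # the expensive path: new object and backing store
        return {"identifier": identifier, "row": None}

    def recycle(self, cell):
        self.free[cell["identifier"]].append(cell)


def scroll(total_rows, visible_rows, identifier_for_row):
    """Scroll through every row once and return how many cells were created."""
    pool = CellReusePool()
    on_screen = deque()
    for row in range(total_rows):
        if len(on_screen) == visible_rows:
            pool.recycle(on_screen.popleft())  # top cell scrolled off screen
        cell = pool.dequeue(identifier_for_row(row))
        cell["row"] = row                      # reconfigure for the new row
        on_screen.append(cell)
    return pool.created


print(scroll(1000, 10, lambda row: "text"))
print(scroll(1000, 10, lambda row: "photo" if row % 2 == 0 else "text"))
```

In both runs the number of cells created stays at the number of visible rows rather than growing with the table, and a recycled cell is always already the right kind, so nothing has to be transformed from one layout to another.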

So the second part of preparation is drawing, and we've told a lot of you in the past to flatten the view hierarchy of your cells.

And that's actually a really good idea.

I like that.

Because what it does is it reduces the amount of time that it takes to render those cells in the table view.

And so you end up with a nice scrolling effect, right?

It's very, very smooth.

However, I have seen some applications where it scrolls nice and smoothly right up until you get a new cell and then it jumps.

Now, what can happen here is too much cell drawing, you can spend a lot of time drawing all of these views into the same cell and that isn't great.

That's not a great experience.

Your table view isn't scrolling very smoothly.

So this is a nuanced point and, of course, you're going to have to measure and experiment to see what works for you.

But if you have that kind of scroll view in your application, where it scrolls very smoothly and then it jumps when you get a new cell, first measure it.

But if you find that you spend a lot of time drawing cells, I have a tip for you.

So elements that need to be rasterized anyway, text labels, things with paths: be sure to just flatten those together. That makes sense.

Just flatten all those labels together.

Those don't need to be composited by the render server.

Do it once in Core Graphics, you'll be happy.

But for elements that are just images, let's say you would put them into an image view, you might consider letting the rasterizer handle sorry, the renderer handle a few of those.

And basically, what you're doing is you're balancing the time on the CPU spent when you're creating a new cell and the amount of time spent on the GPU on every scroll change.

So, like I said, there are two halves, you want to prepare cells quickly and render quickly.

Now, all of the lessons that we talked about from smooth animations apply here.

Fewer pixels to render means smoother scrolling as well.

So simplify the structure of your view hierarchies.

If you have any unnecessary or invisible views, just get rid of them.

You want to reduce the amount of view blending as much as possible.

So, use Color Blended Layers to see what's going on in your table views, and reduce any offscreen rendering that you might have.

That would actually be really, really bad in this situation.

And again, new to iPad and iOS 4, you can try to use the dynamic flattening property, the shouldRasterize property rather, to flatten your cell hierarchies if you haven't already.

And this might be a nice way of making scrolling performance just a little bit better for the cost.

One caveat though.

Your cell animations will not look great if you keep this on all the time.

So if you end up doing a rotation, you want to turn this off just before the rotation starts.

And for any edit animations, you want to turn this off before the edit animation starts.

So, we shipped a lot of devices that run iPhone OS and iOS 4.

Last year we shipped the iPhone 3GS and the new iPod Touch.

And these have twice the CPU power of the previous generation, twice the RAM, and the GPUs are way faster. But what that means is there is a big gap between what you're developing on, if you're using an iPhone 3GS, and the iPhone 3G.

And we have millions of customers who have iPhone 3Gs and iPod Touches.

And we want your apps to look great on them.

So if you can, keep around one of these devices and make sure to test on the devices you intend to target.

iOS 4 runs on the 3G, the iPod Touch and iPhone 4.

And we want everything to look great across those.

Now, I want to talk about the iPad.

We shipped this a couple of months ago and we think it's great.

It has an even faster CPU with the A4.

And even though it has 5 times as many pixels and about the same graphics capability, we doubled the bandwidth of the bus.

And so we think that this has great graphics performance as well.

And finally, the brand new iPhone 4.

Again, it has the A4 chip so the CPU is a lot faster because, hey, now you have 4 times as many pixels to draw.

And even though you have 4 times as many pixels, you're really going to want to get your hands on one of these to make sure that all your animations look smooth.

Because things are going to double and you want to make sure that everything looks great.

So that's animation and scrolling.

Let's talk a little bit about keeping your applications snappy and responsive.

So the key part of responsiveness is simply, do not make your users wait.

We're going to talk about how you can measure some of these things.

We're going to talk a little bit about launch, interaction delays and just a couple of notes about CPU optimization.

So Time Profiler Instrument, this is new in iPhone SDK 4, not Xcode 4, iPhone SDK 4.

And it's actually a wonderful statistical sampling profiling tool.

And what that means is every millisecond you can see what's happening in your program.

It can take a look at, you know, the stacks of exactly where things are using the CPU and if things are blocking.

So, the way you should use this during your development process is if you come up with a performance issue, use this to measure what's happening during that scenario.

Measure first, and then as you drill down, you can actually find some of the problems.

You'll actually see the different landmarks of your code and be able to see, wow, I didn't realize that that was going to take so long.

So by default, this shows time spent on the CPU.

If you want, you can use this great check box here.

You get to it just by hitting the information button there; it's called "All Thread States."

It actually shows the time spent in blocking as well.

Now, for stuff that's running on the main thread, that's hugely important.

You can use this tool to narrow down to exactly what's happening in your program on the main thread.

And if you see things blocking, that's not kind of the normal, you know, blocking on new events, then you want to chase down after those.

So once you found your problem, you want to measure exactly how long that particular section of code is taking.

So you want to take a baseline.

Again, what it looks like right now.

And as you make changes to try to make that faster, you want to actually see real changes.

We recommend just simply timing the start and end using CFAbsoluteTimeGetCurrent.

Now for those of you that want to know, this is wall clock time and it's user time, too.

So simply use it like this.

It's pretty easy to use.
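The slide itself isn't reproduced in this transcript, so here is a minimal sketch of the start-and-end timing idea in plain C. `CFAbsoluteTimeGetCurrent` is the call named in the talk; the helper names `now_seconds`, `time_section`, and `busy_work`, and the `timespec_get` fallback for non-Apple platforms, are assumptions added for illustration.

```c
#if defined(__APPLE__)
#include <CoreFoundation/CoreFoundation.h>

/* Wall-clock seconds from the call recommended in the talk. */
double now_seconds(void) {
    return CFAbsoluteTimeGetCurrent();
}
#else
#include <time.h>

/* Stand-in for other platforms (an assumption, not from the talk). */
double now_seconds(void) {
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}
#endif

/* Hypothetical workload standing in for the section being measured. */
void busy_work(void) {
    volatile unsigned long total = 0;
    for (unsigned long i = 0; i < 1000000UL; i++) {
        total += i;
    }
}

/* Returns the wall-clock seconds that elapsed while work() ran. */
double time_section(void (*work)(void)) {
    double start = now_seconds();
    work();
    return now_seconds() - start;
}
```

Calling `time_section(busy_work)` once before a change and once after gives the baseline and the new number to compare. Because this is wall-clock time, it is worth repeating a measurement a few times before trusting a small difference.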

So I want to talk a little bit about launch.

If you were at the performance optimization on iPhone talk, we talked a little bit about it there, too.

Now, I encourage everybody to measure, right.

But it's a little bit tricky to measure the total amount of time during launch.

You can start by measuring the amount of time between the start of main and the end of applicationDidFinishLaunching.

That gives you a good sense of what's happening.

But the other thing to do maybe is to time launch using Time Profiler.

Now you can't use this as an absolute measurement because obviously there's sampling going on.

But it's actually a really, really useful tool for relative measurement as you're making changes.

And it helps you figure out what your application is actually doing at launch.

So what can you do?

Well, do only what's necessary on launch.

Can you defer the work that you see that your application is doing?

Could you do it on demand?

We have a philosophy in our application development of being lazy.

If you can be lazy and do it on demand, that's great, because the user might not need that work ever at all.
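As a rough illustration of the "be lazy" philosophy, the sketch below builds a table the first time it is looked up instead of at launch. Everything in it (`lookup_table`, `table_lookup`, `table_loads`, and the squares used as placeholder data) is hypothetical and not from the talk, and it is not thread-safe as written.

```c
#include <stdbool.h>
#include <stddef.h>

#define TABLE_SIZE 256

typedef struct {
    bool loaded;
    int entries[TABLE_SIZE];
} lookup_table;

static lookup_table table;   /* zero-initialized: not loaded yet */
int table_loads = 0;         /* counts how often the expensive setup ran */

/* Hypothetical expensive setup that we do not want to pay for at launch. */
static void load_table(lookup_table *t) {
    for (int i = 0; i < TABLE_SIZE; i++) {
        t->entries[i] = i * i;
    }
    t->loaded = true;
    table_loads++;
}

/* Builds the table on first use only; later calls reuse it. */
int table_lookup(size_t index) {
    if (!table.loaded) {
        load_table(&table);
    }
    return table.entries[index % TABLE_SIZE];
}
```

If the user never reaches the feature that calls `table_lookup`, the setup cost is never paid at all.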

The second point is, reduce the number of linked frameworks.

I know when I'm developing I sometimes try out the brand new frameworks and add them into my project to try out a brand new feature.

That's great, you should experiment.

But before you ship and build the final product, you should make sure to remove those frameworks from your Xcode project.

Because when you have those in there, the system will actually try to load those at launch, and that can cause I/O and cause initializers to run.

You want to reduce that as much as possible.

And if you're using third-party libraries, you want to look out for static initializers.

Now these are usually C++ methods that are called to initialize a class and you can detect these using these environment variables.

The other thing that you want to look out for is weak exports.

These are somewhat rare but we've actually seen them in the field.

And you can quickly check for these using otool like this.

This is how you set up environment variables to run in Xcode if you haven't seen that before.

And we have some sample output of what gets printed with the print statistics and print initializers options.

So interaction delays, the key thing here, simply do not block the main thread.

And by that I mean not just blocking operations.

But if you have any operation that's taking longer than about a few milliseconds or so, you really want to spin it off into the background.

We have lots of scroll views on our system and people love idly playing with them.

And if you block for a few frames, it can be really disconcerting.

So, long-running tasks should be spun off into the background.

And you should try to factor these into executable units of work so that you can really show progress to the user if it's really long running.

Remember to make UI updates back onto the main thread once you actually do this.

And in iOS 4 it's really easy to do with NSOperationQueue and blocks.

So here we have some sample code.

And what this is doing is just creating an image, and we're going to do some custom drawing to that on the background thread.

And when that's ready, we're going to post it to the main thread with an image view.

So the first thing we do is we just create an operation with this block.

And here, we're just creating an image context with options.

By the way this is now thread safe in iOS 4.

You can totally use this on the background.

It's great.

And then grab the image from that current context.

Now, so our drawing is all done here and maybe that's like 100 milliseconds to 200 milliseconds or so.

Now we want to post that to the main thread.

Now of course, you can't really modify stuff on the main thread from background threads because a lot of the UIKit code isn't thread safe here.

So what can we do about that?

How do we get the image to the main thread?

Well with blocks, it's really easy.

All you do is create a new operation with this block on the main queue, run on the main thread, and we just create a new UIImageView and use that image.

Pretty simple, huh?

So the next tip I have about responsiveness: always make URL requests asynchronous.

So I've seen this code, it's really easy to use sendSynchronousRequest, boom, it's done.

Unfortunately, you don't know what the users are going to have in terms of network connectivity, right.

They could be somewhere where the network is a little bit flaky and they start to make the connection that kind of goes through and, you know, it doesn't take very long for people to get frustrated.

So it's a little bit more code but it's worth it.

Use connectionWithRequest, implement the delegate of NSURLConnection, and you'll have this all happen in the background, and then your main thread is nice and free for users to interact with.

You'll get callbacks when you receive data, when things fail, and when it's finished loading.

One more note, spikes in memory usage can cause delays.

And why is this?

Well it turns out that to accommodate higher memory usage, code is evicted.

And what I mean by that is the code that's on the system is usually filling up the rest of the memory that's free.

And if you spike memory usage like this, the system actually has to kick out something in order to give you more memory.

And what it usually kicks out is code.

And so as you see, it will spike up there with very little code left in the system.

And when you bring memory back down, it doesn't just magically fill in.

It has to be read back from storage before you can proceed, and that can take a long time.

This is probably one of the most common and unexplained delays that people will find when their application is unresponsive.

You'll do some operations, you'll sample it, and you'll say, "Well, I'm not spending a whole lot of CPU time here.

Where is the time going?"

Oftentimes, it's reading back the code that your application needs to proceed: framework code, your code, system libraries.

So one final note about responsiveness: we have some great tools on the system, including Time Profiler, that allow you to actually find hot spots in your code.

And as you could see it will actually give you statistics about each individual line of code and even to the point of each individual instruction.

It's pretty handy.

So one tip we have about this, besides, you know, of course making sure that your algorithms are as optimized as possible, is to use a feature that we have built into our CPUs, and it's called "vector processing."

And what vector processing is, it's the way that we can use the chips to process many elements at once.

So let's say about four elements at a time in this situation.

So let's say we have some sample code here which is pretty simple: we're just walking along this array and we're summing up the total values into this float.

In iOS 4, we have a new framework called "Accelerate" and this is great stuff.

This is all really, really highly optimized code that you can just use out of the box that will give us this vector processing to do lots of operations at once.

So in this case, we're using the function that sums the vector elements.

And it's one simple line of code and operates on this vector four at a time.
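The slide code is not part of this transcript, so the following is only a sketch of the comparison being described. It assumes the Accelerate routine in question is `vDSP_sve`, which sums the elements of a float vector; the function names `sum_scalar` and `sum_vector` are made up for illustration, and the four-lane fallback exists only so the sketch compiles where Accelerate is not available.

```c
#include <stddef.h>

#if defined(__APPLE__)
#include <Accelerate/Accelerate.h>
#endif

/* The simple version: walk the array and add one element at a time. */
float sum_scalar(const float *values, size_t count) {
    float total = 0.0f;
    for (size_t i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}

/* The vectorized version: a single library call on Apple platforms. */
float sum_vector(const float *values, size_t count) {
#if defined(__APPLE__)
    float total = 0.0f;
    vDSP_sve(values, 1, &total, count);   /* stride of 1: every element */
    return total;
#else
    /* Stand-in (an assumption): four partial sums, mimicking 4-wide lanes. */
    float lanes[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    size_t i = 0;
    for (; i + 4 <= count; i += 4) {
        lanes[0] += values[i];
        lanes[1] += values[i + 1];
        lanes[2] += values[i + 2];
        lanes[3] += values[i + 3];
    }
    float total = (lanes[0] + lanes[1]) + (lanes[2] + lanes[3]);
    for (; i < count; i++) {
        total += values[i];               /* leftover tail elements */
    }
    return total;
#endif
}
```

Note that summing in a different order can change the last bits of a floating-point result, so the two functions are only guaranteed to agree exactly on inputs like small integers.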

So again, don't make users wait, measure the problem situations, look for situations where you can improve the interaction time in your application.

So with that I'm going to hand things off to my colleague Peter Handel and he'll be talking about power and battery life.

[Applause] Thank you.

[ Applause ]

Peter Handel: Hi, everyone.

My name is Peter Handel.

I'm an iOS Power Engineer and I've been doing that for almost four years now.

I'd like to share with you some tips and tricks and how you can improve the battery life of your application in three key areas.

When using the radio to send and receive data, when using Core Location to figure out where your device is located.

And when using the CPU and GPU to get your work done and draw on the screen.

First the network, transmitting data over 3G is one of the most power intensive things you can do.

This is exacerbated by the fact that 3G networks keep the 3G radios in a high-power state for a few seconds after data transmission.

Therefore, if you were to send and receive even just a little bit of data, every few seconds, you'd keep those power-hungry 3G radios in a high-power state the entire time.

That's one of the quickest ways I know how to drain your battery.

So how can we enjoy the high speed and wide availability of 3G while still maintaining excellent battery life?

Here's a few tips.

First off, use the Activity Monitor tool which is part of Instruments to figure out how much networking your application is doing.

Next, coalesce your data into large chunks rather than transmitting a thin stream of data.

If you notice that your application is transmitting a thin stream of data, this may be because you're polling across the network to check to see whether an event has occurred on the server.

Try to avoid this at all cost.

Let me repeat that.

Try to avoid polling over the network at all costs.

We came across an application, a little chat application, which checked with the server every few seconds to see whether a new chat had come in.

And as you can imagine, it just chewed through the battery.

Instead, try to use the Apple Push Notification service if you can.

Also, minimize the amount of data transmitted.

Use a compact data format or maybe even compress your data before you transmit it.

And finally, be real careful when you reuse legacy or third party code because oftentimes this code will assume that it's just on Ethernet.

So for the 3G radio chip, let that chip idle.
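The coalescing advice above can be sketched roughly as follows: small messages accumulate in a buffer and go out in one large transmission, so the radio has a chance to idle in between. All of the names here (`batch_buffer`, `batch_append`, `batch_flush`, `record_send`) and the 4096-byte capacity are hypothetical; a real app would supply its own send function and decide when to flush.

```c
#include <stddef.h>
#include <string.h>

#define BATCH_CAPACITY 4096

/* Hypothetical transport hook: called once per coalesced transmission. */
typedef void (*send_fn)(const unsigned char *bytes, size_t length, void *context);

typedef struct {
    unsigned char bytes[BATCH_CAPACITY];
    size_t used;
    send_fn send;
    void *context;
} batch_buffer;

void batch_init(batch_buffer *b, send_fn send, void *context) {
    b->used = 0;
    b->send = send;
    b->context = context;
}

/* Sends everything queued so far as one transmission, if there is any. */
void batch_flush(batch_buffer *b) {
    if (b->used > 0) {
        b->send(b->bytes, b->used, b->context);
        b->used = 0;
    }
}

/* Queues a message; flushes first only when the buffer would overflow.
   Returns 0 on success, -1 if the message can never fit. */
int batch_append(batch_buffer *b, const void *message, size_t length) {
    if (length > BATCH_CAPACITY) {
        return -1;
    }
    if (b->used + length > BATCH_CAPACITY) {
        batch_flush(b);
    }
    memcpy(b->bytes + b->used, message, length);
    b->used += length;
    return 0;
}

/* Example send function that only records what would have been sent. */
typedef struct {
    size_t calls;
    size_t total_bytes;
} send_stats;

void record_send(const unsigned char *bytes, size_t length, void *context) {
    send_stats *stats = context;
    (void)bytes;
    stats->calls++;
    stats->total_bytes += length;
}
```

With this shape, a hundred 4-byte messages cause one radio wake-up at flush time rather than a hundred.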

From a power perspective, Wi-Fi uses roughly half the power of 3G.

Now this obviously depends on network characteristics, but it's kind of a rule of thumb.

Also, note that the Wi-Fi network will allow the Wi-Fi radios to enter low power state immediately after transmission.

Because of these 2 things, your application may want to know when it's on Wi-Fi versus when it's on the cell network.

To check this, use the kSCNetworkReachabilityFlagsIsWWAN flag.
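A minimal sketch of how that flag might be interpreted, assuming the flags value was already obtained from `SCNetworkReachabilityGetFlags`. The `classify_link` function and the `link_kind` names are hypothetical. The fallback constants are stand-ins so the sketch compiles off-device; they are assumed to mirror the SystemConfiguration values.

```c
#include <stdint.h>

#if defined(__APPLE__)
#include <TargetConditionals.h>
#endif

#if defined(__APPLE__) && TARGET_OS_IPHONE
#include <SystemConfiguration/SystemConfiguration.h>
#define FLAG_REACHABLE kSCNetworkReachabilityFlagsReachable
#define FLAG_IS_WWAN   kSCNetworkReachabilityFlagsIsWWAN
#else
/* Stand-in values for building off-device (an assumption). */
#define FLAG_REACHABLE (1u << 1)
#define FLAG_IS_WWAN   (1u << 18)
#endif

typedef enum {
    LINK_NONE,           /* host not reachable */
    LINK_WIFI_OR_OTHER,  /* reachable, not over the cell network */
    LINK_CELLULAR        /* reachable over the cell network (WWAN) */
} link_kind;

/* Interprets a reachability flags value for one host. */
link_kind classify_link(uint32_t flags) {
    if ((flags & FLAG_REACHABLE) == 0) {
        return LINK_NONE;
    }
    if ((flags & FLAG_IS_WWAN) != 0) {
        return LINK_CELLULAR;
    }
    return LINK_WIFI_OR_OTHER;
}
```

An app could use the result to defer large, non-urgent transfers while the answer is `LINK_CELLULAR`.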

Where does 2G fit into this mix?

Well, from a power perspective, it fits in roughly halfway between Wi-Fi and 3G.

Also, like Wi-Fi, the 2G network will allow the 2G radio to enter the low power state immediately after data transmission, and that's the radios.

Next, Core Location.

Judging by the number of apps in the App Store that use Core Location, you guys love it and your customers love it too.

If you haven't used it, Core Location is an API that with just a few lines of code which I have up here, will allow your device to figure out where it's located to varying degrees of accuracy.

However, be sure to only use the least amount of accuracy you can because the higher level of accuracy uses more power.

For example, if you have a coffee shop finder application and Core Location can tell you that you're here at the Moscone Center, that's probably good enough to know that there's a coffee shop right across the street.

So in this situation, you use the hundred meters accuracy.

Next, the distanceFilter.

This dictates how often you receive location changed updates.

Be sure to set it appropriately because the default is to receive every single notification.

And as you can imagine, this would lead to a lot of unnecessary events, higher CPU usage, and worse battery life.

Be sure to call stopUpdatingLocation as soon as you reach your desired level of accuracy.

Also, note that Core Location will manage the GPS power for you.

What this means is that for example in our coffee shop finder application, if your user is looking at the map and they decided to go in the preferences part of your application, call stopUpdatingLocation immediately.

And then if a few seconds later they go back to the map, go ahead and call startUpdatingLocation and Core Location will pick up right where it left off.

So for the GPS chip, let that chip idle.

Note, the same is true for Core Motion, which is the new iOS 4 API.

After you call the start update functions, be sure to call the matching stop update functions.

Also, if your application goes in the background, be sure to turn off the sensors when that happens if you like.

Note that if your application would like to be notified of significant location changes, or if you want to use region monitoring, instead of just having Core Location running all the time, use the new iOS 4 APIs which let you do this.

And I have that up here.

And that's Core Location. Finally, the CPU and GPU.

You might be wondering why we're talking about performance and power in the same presentation.

Well, it turns out that if you optimize for performance, you get better battery life thrown in for free.

This is because fast code uses less CPU time which uses less power.

So for the CPU, let that chip idle.

So as you know, iOS 4 is an event-based operating system.

Now as I mentioned earlier in the networking portion of this talk, there are certain situations where you might want to poll to check whether an event has occurred or something has changed.

Try to avoid this and instead subscribe to an event.

But we don't have events for everything, so in some situations you may have to poll.

Try to reduce the frequency with which you poll.

For example, if you poll every 30th of a second, try dropping that down to every tenth of a second or even every second to see if there's any user-visible impact.

For example, we see a lot of sample code on the Internet which recommends that you figure out whether the device is being shaken by continuously polling the accelerometer.

Don't do this.

Instead, use the Shake API to figure out when your device is being shaken.

Next, be bursty.

Try to consolidate your CPU usage into short bursts.

This will allow the CPU to enter that idle state I've been talking about.

Note that this may require you to restructure your code or possibly you can use a different algorithm.

How do you know when your code is being nice and bursty?

Well, use the Time Profiler tool, which is part of Instruments, to check your CPU activity level.

For example, we found that during audio playback, we were able to get much better battery life by decompressing large chunks of audio at once rather than decompressing small pieces continually.

Next, procrastination, because who doesn't like to procrastinate?

Because if you put it off long enough, you just might not have to do it at all.

For example, we came across a game which saved its state every few seconds.

You can imagine it's not really good for the CPU.

It's not very good for battery life either.

Instead, maybe you could save the state when the user reaches a milestone or a checkpoint, or maybe even when the user quits the game. That's the time to save it.

When using the GPU, pick a fixed frame rate.

We recommend about 30 frames per second and enforce this using the CADisplayLink rather than using NSTimer.

This will help you minimize the appearance of dropped frames and also help you avoid the situation where your app is continuously drawing as quick as possible.

On the other end of the spectrum, if a frame has not changed, don't redraw it.

For example, if you have a chess application and your user is looking at the pieces, contemplating the next brilliant move, don't be updating the screen at 30 frames a second if nothing has changed.
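One common way to get this behavior is a "needs redraw" flag that is checked on each display tick. The sketch below is only an illustration of that idea for the chess example; `board_view`, `board_move`, and `board_tick` are hypothetical names, and in a real app the tick would be driven by something like a CADisplayLink callback.

```c
#include <stdbool.h>

typedef struct {
    int last_from;      /* most recent move, kept only for illustration */
    int last_to;
    bool needs_redraw;  /* set whenever the visible state changes */
    int frames_drawn;   /* how many frames were actually rendered */
} board_view;

/* Records a move and marks the board as changed. */
void board_move(board_view *view, int from, int to) {
    view->last_from = from;
    view->last_to = to;
    view->needs_redraw = true;
}

/* Called once per display tick. Draws only if something changed and
   returns whether a frame was rendered. */
bool board_tick(board_view *view) {
    if (!view->needs_redraw) {
        return false;   /* nothing changed: skip the frame entirely */
    }
    /* The actual drawing would happen here. */
    view->frames_drawn++;
    view->needs_redraw = false;
    return true;
}
```

While the user is thinking, every tick returns early and no drawing work is issued.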

Finally, be sure to check out the Energy Diagnostics tool, which is part of Instruments.

And I was in the session this morning, session 309 which delved into this extensively.

So be sure to check out the video once it's available.

So to summarize, for the radios we learned that data transmission is very expensive.

So we coalesce and compress data as much as possible.

With Core Location, we use the least amount of accuracy we can get away with and we call stopUpdatingLocation as soon as we can.

And on the CPU and GPU, we optimize our performance and get better battery life for free, we're bursty, and we procrastinate as much as possible.

And on the GPU, we use a fixed frame rate, 30 frames per second, and we don't unnecessarily redraw the screen.

So to summarize, let those chips idle.

Thank you.


[ Applause ]

David Chan: Thanks Peter.

So in summary, use your knowledge about the systems, come up with creative solutions.

Always measure the baseline and the changes you make to make sure that the changes you're making are improvements.

About animations, fewer pixels to render, means smooth animations.

Make sure to prepare your cells and render very quickly for smooth scrolling.

Don't block the main thread and let those chips idle.

Thank you for coming to our talk.

Here are some related sessions for more information.

Be sure to come tomorrow to part 2 of this talk.

We'll be covering memory, databases, how to use the data APIs on our system and I/O.

And right after this, there's an Optimizing Core Data Performance on iPhone OS session that I highly recommend you go to if you use Core Data or you're planning on using Core Data in any of your applications.

Here are some other sessions that have already happened that are great reference material for some topics that we covered today.

Thank you very much for coming.
