John Harper: Welcome to Session 425, Core Animation in Practice, Part 2.
And my name is John Harper.
I'm part of the Core Engineering team and mostly working on Core Animation and everything relating to that.
And this talk is kind of a continuation of the first talk.
The idea is that the first talk was kind of setting the scene, going over all the broad details.
And this one, we want to just dive into certain areas, not taking a broad slice, but just looking at things we thought you might find interesting, to try and give you a better understanding of what's happening and some new things you can be using.
So, things we're going to go over, we're basically going to split the talk into three sections.
The first section will be kind of a mixed grab bag of different APIs, some new, some things we think you might find useful.
And then we're going to spend quite a bit of time on performance relating to Core Animation and basically try to give you an idea of how the GPU, Core Animation, all these things fit together and affect the performance you see in your applications.
Finally, we're just going to talk very quickly about the new kind of High-DPI screen on the new iPhone and what that means to your graphics rendering from a Core Animation perspective.
So, let's get right into it.
So, the first API I want to talk about, the first kind of section, I guess, is drop shadows.
And, obviously, shadows are a very important part of your applications, in that they can often give a lot of depth to your visual display, and they can really make something stand out.
They just make things look a lot more natural, often.
So, in the past, we've had a full set of shadow APIs on the Mac platform, and you can set things like the shadow radius, and the shadow color, the path, all this kind of stuff, but we've never actually supported them on the iPhone platforms, because, really, the performance just wasn't there.
And so when we were doing the iPad, it really became apparent that we really do need some kind of shadow support, so we brought all those APIs back onto the iPhone OS, but we added some new features, just to make the performance a lot more acceptable on these kind of lower end devices, compared to the Mac.
So, the new API we added is this thing called the shadowPath, and so the idea here is that, typically, if you say "Turn on shadows on your layer," what that means is that we have to take the alpha channel of the composited content, blur it to get that nice kind of blown-out shadow look, and then apply it underneath the layer, and that blurring step in particular is a lot of work for the GPU, and doing it every frame really doesn't work very well on certain GPUs.
So, the idea of a shadowPath is that this is a way you can tell us where your layer is opaque.
And, obviously, once we know the opaque region, we can use that to kind of cache the shadow as a bitmap.
So, we can render it, blur it, and just keep it around.
And then as long as you don't change the path, the shadow will be there forever, and we can reuse it from frame to frame very cheaply.
So, I just want to show a very quick demo of this, to give you the idea of why you should be using it.
So, we have this demo app, and you can see that we have a number of layers, just colored rectangles bouncing around the screen.
And so what I'm going to do, first of all, is just turn on shadows in the old naive way, where we basically just set the shadowOpacity of the layer to be non-zero.
And when we do that, you can see that we have nice shadows, but we have pretty horrible performance, especially when I bring this up a bit.
It's really nasty.
So, very simple.
All we had to do here is set the shadowPath to be a rectangle in this case, because that's the shape of the layer.
And as soon as we turn that on, everything becomes way faster and much smoother.
[applause] So, probably, that shows you why you should be caring about this.
And now let's just go through that example in a little more detail.
So, as I said, we have a number of layers.
Each of them was created in this way.
We created a layer.
We set its rectangle.
We set a background color with some random hue.
And then next, when we enabled shadows, we really just set the shadow opacity, set the shadow radius, and set the shadow offset to move the shadow vertically under the layer a little bit to make it look more natural.
And then when we hit the final button to make everything fast, all that was happening is this, and so you can see what we're doing here is we're going through UIBezierPath to create a path for the rectangle, which is the shape of the layer.
And then UIBezierPath has a nice feature where you can ask it for a CGPath, and, obviously, the CGPath will be the underlying representation of that path, and then we can use that CGPath to set the shadowPath of the layer, because, obviously, Core Animation is a lower-level API than UIKit, so we really only deal in Core Graphics objects.
And so, really, that's all you had to do.
There's just one extra line or two extra lines.
You just put them up like I did here to get those faster shadows.
And, obviously, this isn't restricted only to rectangular things.
Because it's a path, you can create round rects, or complex shapes.
Really, anything you can imagine can be put in that path and used to create the outline for your shadow.
The next topic, moving on from shadows, is shape layers.
And, typically, when you're using Core Animation layers or UIViews, and you have some content that isn't just a slab of color, then what you're always using is a bitmap, and you'll probably be drawing into the bitmap or providing a CGImageRef, and that's fine, in many cases, but there are problems with that, if you're trying to, for example, scale the layer.
Then, obviously, since you have a bitmap, the resolution of the bitmap is fixed, and when you scale it up or down, you get blurriness, or aliasing, or whatever.
And, also, you can't really animate the contents of that image, that bitmap, because, you know, again, it's fixed.
So, in certain cases, shape layers can be a way to avoid those problems.
A shape layer is really just a layer which draws a path with a fill or a stroke.
And so, because the drawing of the path is deferred until composite time, when we know the resolution that the layer is being drawn at, then we really get a nice scalable result, where the path will stay sharp, no matter what scale factor you apply to it.
Similarly, we have some support for animating paths and interpolating between two path states.
So, if you, for example, have a line, and you want to have it a certain width, and then animate from point A to point B, then this is a really easy way to do that without using any images or any bitmaps.
So, we should talk a little bit about the performance here, because it's not as obvious as just using an image.
Firstly, because we're really just storing the contents as a path, it really does use very little memory.
I mean, if you have an image, the memory usage is fixed in terms of width by height; whereas, if you have a path, then you could have a million by million path shape, but it might only have four line segments, and that's probably only like 20 floats or something.
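To put rough numbers on that comparison, here is a back-of-the-envelope sketch in Python. It assumes 4 bytes per pixel (RGBA at 8 bits per channel) and 4-byte floats, neither of which is stated in the talk, and the function names are made up for illustration.

```python
def bitmap_bytes(width, height, bytes_per_pixel=4):
    """Memory for an uncompressed bitmap: grows with width times height.

    bytes_per_pixel=4 is an assumption (RGBA, 8 bits per channel).
    """
    return width * height * bytes_per_pixel


def path_bytes(float_count, bytes_per_float=4):
    """Memory for a path stored as a flat list of floats.

    The size depends only on how many numbers describe the shape,
    not on how large the shape is drawn. bytes_per_float=4 is an assumption.
    """
    return float_count * bytes_per_float


# A 1024 x 768 image versus a path of about 20 floats (the speaker's rough figure).
print(bitmap_bytes(1024, 768))  # 3145728
print(path_bytes(20))           # 80
```

Scaling the shape up changes the first number and leaves the second alone, which is the point being made about memory.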
On the other hand, because, like I said, we defer the rendering of the path a lot longer, so we can get these advantages, we'll be rendering it probably every frame, and so that can take more CPU if you're not careful, if you have a very complex path or a thousand of these things or something like that.
Finally, another plus, I guess, is that with images, you may have heard us talk in the past about how you must avoid blending and all that kind of thing; the nice thing about shapes is that since we know the shape ahead of time, because we know what the path is, when we draw it to the screen, we can ignore all those transparent areas, so you really only pay the cost for the regions with color.
And so, again, if you have, for example, a diagonal line, then if you had that in an image, you would have to pay the cost of drawing all those transparent pixels to the screen, because the GPU doesn't know what's transparent and what isn't; whereas, with the shape layer, if you had that line stored as a path, we know exactly where the color is, and we can just slam down those pixels that are colored and ignore everything else.
And so, in summary, we really think this can be really useful, but you have to be judicious about where you actually use it.
And, typically, you only need to really use it when you want to take advantage of the features of this scalable/animatable content.
And, typically, it's best for a few semi-large elements.
Again, if you have thousands of these things, you may run into issues.
So, again, I want to run through a quick demo of this.
So, really, what we have here, again, some number of layers created, and we can up the number to whatever we want.
And then each of these layers is a shape layer, and you can see that it's animating between two arrow shapes.
And so this is trying to show the second of those points, which is that shapes can be animated; you can see that if you had an image, you really wouldn't have any way to do this without redrawing those arrows every frame; whereas, now we can just have two paths representing arrows, and let Core Animation interpolate between them as it can.
And just to show you the second nice feature about the shape layer, which is that it's scalable.
If I apply a scale transform to the common container of these layers... oops, trying to get this to work.
Here we go.
Hopefully, you can see that, even though we scaled in, we didn't lose any resolution.
We're staying perfectly sharp on the arrow points, and it's still animating at a fairly decent frame rate.
And, obviously, we're only paying the cost to render the bits you can actually see, so it really doesn't make a whole lot of difference that we now have these things massively huge.
So, again, I want to just go through the code quickly.
So, first of all, we're going to create the two paths for each layer.
In this case, we have some function which just creates an arrow of random shape, using a number of random numbers.
Just for convenience, we're going to get the bounding boxes of each path and union them together, so we have one rectangle which represents the overall bounding rect of those paths.
And then we're going to create a shape layer, whose fill color is set to some random color, whose position is somewhere on the screen, and then whose bounds is the bounding rect that we computed.
And then to set up the actual path, you can see we didn't bother setting the layer path property at all, because we're going to animate that.
So, we create a basic animation, which gives us that from-to behavior and targeting the path property, and then we set the from and to values of the animation to be these two paths we created.
And, obviously, animations often work on numbers, but, in general, they can be anything that can be interpolated, and we know that we can interpolate between two CG path objects.
And so once we have that, we just set some common timing properties, give it a duration, give it some kind of easing, set it to pulse back and forth forever.
And then finally we just add that to the layer.
We're not going to specify a key, because we don't ever want to reference this again, so we just let it sit on the layer forever.
And so you can see we did that 20 times, or whatever, and got those pulsing arrows.
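As a rough picture of what animating between two paths means, here is a small Python sketch. The talk does not say how Core Animation interpolates CGPath objects, so this assumes the simplest scheme: both paths have the same number of points and each point moves in a straight line from its "from" position to its "to" position. The names `interpolate_path`, `narrow_arrow`, and `wide_arrow` are hypothetical.

```python
def lerp(a, b, t):
    """Linear interpolation between two numbers for t in [0, 1]."""
    return a + (b - a) * t


def interpolate_path(from_points, to_points, t):
    """Blend two paths point by point (assumed scheme, see the note above)."""
    if len(from_points) != len(to_points):
        raise ValueError("paths must have the same number of points")
    return [
        (lerp(x0, x1, t), lerp(y0, y1, t))
        for (x0, y0), (x1, y1) in zip(from_points, to_points)
    ]


# Two made-up arrow-like outlines with matching point counts.
narrow_arrow = [(0.0, 0.0), (10.0, 5.0), (0.0, 10.0)]
wide_arrow = [(0.0, -10.0), (30.0, 5.0), (0.0, 20.0)]

halfway = interpolate_path(narrow_arrow, wide_arrow, 0.5)
print(halfway)  # [(0.0, -5.0), (20.0, 5.0), (0.0, 15.0)]
```

Because only the control points are interpolated, each in-between shape is still a path and can be drawn sharply at whatever scale it ends up on screen.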
One of the things we found, when developing the iPad, in particular, with its larger screen and more complex content, is that when you have a lot of things animating around the screen at once, that puts a lot of strain on the compositing system, because if something is moving, even if you're actually only moving one layer, but it has like 100 sublayers, like some kind of view hierarchy of a table view or whatever, then you don't just pay the cost for moving the one thing.
Obviously, you have to rerender all of those things, and if it's a complex render tree, then that can be an expensive task, which can take enough time that you've dropped below 60 frames per second, below the nice smooth behavior we want.
So, we've added a new feature in iPhone OS 3.2 for the iPad, and in later releases, where you can now ask us to basically take a subtree, a layer and all its sublayers, and cache that in a bitmap on the render tree side.
So, to do that, you really just set this shouldRasterize property.
And what this is telling us is that you're asking us to convert that layer tree into an image every time we render it.
And so that's kind of a flattening act.
It's taking this tree and converting it to just one bitmap.
And then the beneficiality of that, if that's a word, is that we can then reuse that bitmap whenever we can.
So, we create this bitmap, and, ideally, the thing you asked us to rasterize and cache won't be changing from frame to frame, so in that case, we rendered it a previous time into a bitmap, and then we can just use that bitmap again and again to stamp it into subsequent frames.
Obviously, if we can do that, if we can get some reuse, then we can avoid a lot of that extra rendering, and we can get much better animation performance.
So, I wanted to show kind of a diagrammatic example of what I'm talking about, because it may not be immediately obvious.
So, if you think about a very simple layer tree we have here, we have a background color, a layer with a background color.
It has two sublayers, an image, and some text, and then all of that is parented into another layer, which is setting some kind of 50 percent scaling matrix.
So, if we don't have any of this caching stuff, and we add that layer tree to our view, then what's going to happen when it renders is that each of those three layers that actually provide content are going to render one after the other.
So, they render into the frame buffer one by one: first the color, then the image, then the text.
Then you can see that, well, you can't see, but I'm going to tell you, they rendered at 50 percent resolution here, because there is the scaling.
So, they didn't render at 100 percent and scale.
They just rendered directly into the screen.
Now, if I set this middle layer to rasterize, then, implicitly, what you're asking us to do is create a second buffer here, so we now have the frame buffer, and we have this caching buffer, which is going to start to hold everything in that subtree.
So, now you can imagine what's going to happen when I render this again.
Instead of rendering to the screen, we're going to render those three things again into the caching buffer, but this time, hopefully, you can see they're a lot larger, because they're actually rendering at the native resolution of the layer, instead of through that transform matrix, because we're just caching that subtree.
And so, obviously, once we have that cached, the rendering system will take that and just copy it to the screen through the actual matrix, and we end up in the same place we were before.
Of course, the nice thing is that we've done that once, but if we need to render that again, then we don't have to go back to the layer tree.
We can just take the cache buffer and just copy it straight to the frame buffer.
We don't have to render the layer tree.
That's, obviously, in this case, only three items, but if there were 3,000, we could get a huge performance win there, because we just skipped all of that work.
And, again, another example of why this helps you is, imagine that you change the scaling matrix from 50 percent to 25 percent, and probably we would create an animation to animate that scale change, and so now we have this thing cached.
Again, every frame of that animation can be rendering out of the cache, and it just takes a single kind of imaging operation to take the cached version to the screen.
So, hopefully, you can see that this really can make a big difference in certain cases.
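To see how much work the cache can save in the 3,000-layer case, here is a deliberately crude Python cost model. It assumes the subtree never changes during the animation and counts every layer draw and every cache-to-screen copy as one unit of work; real costs depend on pixel counts and the GPU, so treat the numbers as illustrative only. Both function names are made up.

```python
def draws_without_cache(sublayers, frames):
    """Every frame redraws every layer in the subtree."""
    return sublayers * frames


def draws_with_cache(sublayers, frames):
    """The subtree is drawn once into the cache, then each frame is one copy.

    Assumes the cached content stays valid for the whole animation.
    """
    fill_cache_once = sublayers
    copy_per_frame = frames
    return fill_cache_once + copy_per_frame


# One second of animation at 60 frames per second over a 3,000-layer subtree.
print(draws_without_cache(3000, 60))  # 180000
print(draws_with_cache(3000, 60))     # 3060
```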
So, again, I'm going to show a demo of that now.
What we have here is yet another shape layer.
This one is much more complex.
So, this is actually from an SVG, and it has about 300 path segments, I think.
So, you can see we can render one of these things at approaching full frame rate, but I'll add a few more, and the performance is getting pretty shoddy.
Add a few more, and we're chugging along.
And, obviously, I can prove that they're still shape layers, because when I zoom in, you can see all the detail there, and there's no pixelation.
So, go back to the zoomed-out state, and what I'm going to do is I'm just going to set that shouldRasterize property on each of these shape layers, each of the butterfly layers, and go... oops, twice.
So, you can see that when I do that, the butterflies just get cached in a bitmap instead of rendering to the screen every time.
We get this nice, beautifully smooth animation.
Of course, the problem with this is that, although it's nice and smooth, we've lost one of the good features of the shape layer, which is, now, when we zoom in, I don't know if you can see that, but now everything is pixelated, because we asked it to cache.
And so now what we're doing is, instead of rasterizing the shape, we're scaling the bitmap.
There are ways to work around this.
There's another property called rasterizationScale, where you can ask for the cached version to be cached at a certain scale factor, but for now we're just going to ignore that.
So, I just want to talk a little bit more about why you shouldn't use this now.
So, it looks great, and it can be really useful, in some cases, but you have to be really careful with this API we've been talking about.
Firstly, a lot of these devices don't have a huge amount of memory, so anything you are caching is taking memory from something else.
Bitmaps can be large, especially on larger screen devices.
Also, obviously, the caches are fixed size, so, once you ask too many things to be cached, then some of those won't fit, and you won't get the benefits.
If you ask us to cache, but then are unable to get the reuse, then that's actually worse than if you hadn't asked us to cache at all.
And the reason for that is that the shouldRasterize property is kind of an API contract.
You're asking us to always convert it to a bitmap, because that has some side effects, like these pixelation effects.
And so we really need to always use the same results, no matter what the other circumstances are.
So, for example, if you ask us to cache 1,000 layers, and then there are only ten of them that can be reused from frame to frame, then the other 990 will be rendered into a buffer and then rendered to the screen every frame, and that can be pretty expensive.
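The 1,000-layer example can be turned into simple arithmetic. The Python sketch below assumes, purely for illustration, that rendering a layer and copying a cached bitmap to the screen each cost one unit; the talk gives no actual figures, and the function names are hypothetical.

```python
def per_frame_cost_uncached(layers, render_cost=1):
    """Without rasterization, each layer is rendered straight to the screen."""
    return layers * render_cost


def per_frame_cost_rasterized(layers, reused, render_cost=1, copy_cost=1):
    """With rasterization requested on every layer.

    Layers whose cache is reused only pay for the copy to the screen.
    Layers that miss the cache pay to render into a buffer and then
    to copy that buffer to the screen.
    """
    missed = layers - reused
    return reused * copy_cost + missed * (render_cost + copy_cost)


print(per_frame_cost_uncached(1000))        # 1000
print(per_frame_cost_rasterized(1000, 10))  # 1990
```

Under these assumed unit costs, asking for caching without getting reuse roughly doubles the per-frame work, which is the sense in which it is worse than not caching at all.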
Also, as we saw, rasterization locks the scale down.
And one last point, which is a little more esoteric, but people have run into it, which is the rasterization or the caching happens at a very precise point in this kind of pipeline of rendering operations.
Really, what we're doing is we're taking the layer, and all its sublayers, and its contents, and copying that to an image, and then taking that kind of thing and stamping it into its parent.
And the act of compositing it into its parent is another step in the rendering process, and that's where masking happens.
So, if you have a mask layer applied to a layer, to put a nice shape around it, then that is going to be working on the cached version, and so the masking operation itself, which is also fairly expensive, will not get any benefit from the caching, at that point.
So, obviously, if you want to deal with that, you can just turn on caching on the superlayer, and, hopefully, that'll solve it.
Okay, so one new API in iPhone OS, iOS 4, I should say, has to do with keyframe animation.
So, let's talk about that.
So, as you may have seen in the previous Core Animation talk, keyframe animations are another type of animation object, which, instead of just moving between two points, move a value between n points.
So, for example, in this case, we have four points, and we're moving some point up, down, whatever.
But you can see here the lines are very straight.
There's no curves, so it might be okay for what you want, but, typically, you want a more natural kind of animation movement.
And you can do that with the previous set of APIs, but it's fairly tricky in that you need to either create a set of timing functions to apply to each segment, or use a Bezier path.
And in both those cases, you have to be very careful to preserve continuity through the transition points.
You have to make sure the tangents of each side line up and all that stuff.
So, we've added a new feature, which is basically what we call a new calculation mode for the animation, and a calculation mode is just how we do the interpolation.
So, whereas, before we were using a linear calculation mode, we've added a new one called Cubic.
And most of you probably know, a Cubic interpolation is not just looking at two points to get the interpolated point.
It looks at the surrounding points, as well.
Because of that, it preserves the continuity through the points.
So, when I set that, instead of having this flat, angular curve, we get something fairly similar, but now the transitions are a lot smoother.
And so this is actually using something called a Catmull-Rom spline to fit those points, but there is a fair amount of customizability here, and there are three other properties on the animation, called tension, continuity, and bias, which just let you kind of yank the tangents a little more, but without ever giving you the possibility that you're going to lose that continuity, at least unless you really want to.
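For a feel of the difference between the linear and cubic modes, here is a Python sketch of the textbook uniform Catmull-Rom formula next to plain linear interpolation. The talk names the spline but not its exact parametrization, so the uniform variant is an assumption, and tension, continuity, and bias are not modeled here.

```python
def lerp(p1, p2, t):
    """Linear mode: only the two surrounding keyframes matter."""
    return p1 + (p2 - p1) * t


def catmull_rom(p0, p1, p2, p3, t):
    """Uniform Catmull-Rom value between p1 and p2 for t in [0, 1].

    p0 and p3 are the neighboring keyframes. They do not move the curve's
    endpoints, but they set the slope at p1 and p2, which is what keeps
    the motion smooth as it passes through each keyframe.
    """
    t2 = t * t
    t3 = t2 * t
    return 0.5 * (
        2 * p1
        + (-p0 + p2) * t
        + (2 * p0 - 5 * p1 + 4 * p2 - p3) * t2
        + (-p0 + 3 * p1 - 3 * p2 + p3) * t3
    )


# Four made-up keyframe values that zig-zag, like the up-and-down example.
k0, k1, k2, k3 = 0.0, 10.0, 0.0, 10.0

for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    # Same segment (k1 to k2) evaluated in both modes.
    print(t, lerp(k1, k2, t), catmull_rom(k0, k1, k2, k3, t))
```

Both modes hit the keyframes exactly at t = 0 and t = 1; the cubic one differs in between because it also looks at the keyframes on either side.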
So, it's a very quick thing, and, hopefully, it's very easy to use.
If you need to use it, hopefully, it'll make things a lot easier.
So, another animation topic, which I guess Michael touched on earlier, but I wanted to talk about, too, when you apply a rotation animation, there are really two ways to do that.
You can either use the transform property and, obviously, in that case, you're interpolating matrices, which means that, to represent angles, the angle is modulo one turn, because that's just what matrices do.
Or you can use this other subproperty called rotation.z and then interpolate that as a 1d value.
That avoids this kind of modulo issue, because you can animate your angles, say, from 0 to 720 degrees, but you have a whole other set of issues to deal with, which is this Euler angle problem.
And so what you're really asking us to do here is take the matrix, the transform property, and extract the three Euler angles for that matrix, and then interpolate those.
And the problem with that is that it works fairly well, if you're only animating one of them.
But once you start touching multiple of these Euler angles, like you want to do a y animation and a z, then you get into a whole world of pain, I guess, because these things really don't concatenate nicely.
You can get gimbal lock issues, where they align to the same plane.
And it can be a nasty issue, so what we added, I guess, a couple of releases ago, now, is a new valueFunction property, and the value function is really just a way to apply a function to the interpolant to get the value we set on the property.
And so, obviously, you know, the interpolant is what we're interpolating, and so if we want to think about that for this rotation problem, then what we're going to do is like the Euler angle rotation animation.
We're going to interpolate a 1d value between 0 and 2 pi, or 0 and 360.
But we're going to set that to the transform property of the layer, which, as you know, is a matrix, not a 1d value.
So, we have to apply this makeRotationMatrix function to turn that 1d value into the matrix.
But, hopefully, you can see that by doing this we've avoided all these problems with the previous two methods of doing this, which are we can represent any angles.
We get complete control over how the interpolation happens through those angles, and we don't have to worry about any of those Euler angles.
So, if you have two animations, both setting the transform property and with the additive mode set, then they will animate correctly, and you should get exactly what you wanted.
So, again, just a really quick example of what this looks like.
So, we create an animation for the transform.
We set the from and to values to be 0 and 2 pi, and then we just set the value function to be an instance of the CAValueFunction class.
And right now there's no way to create your own functions, but we give you a bunch of useful ones.
So, in this case, we want to have the function which takes a single value and creates a matrix, which is a rotation about the z axis, which is the normal kind of rotation.
So, when we get that, that's going to do what we saw on the previous slide, and then finally we'll just set the duration, add the animation to the layer, and off we go.
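The reason the value function helps can be shown with a little matrix arithmetic. The Python sketch below uses 2x2 rotation matrices rather than the 4x4 layer transform, and it assumes a plain component-wise blend when interpolating matrices; Core Animation's actual matrix interpolation is not described in the talk. Whatever the blend, the start and end matrices for 0 and 2 pi are the same matrix, so the full turn is lost. The helper names are made up.

```python
import math


def rotation_z(angle):
    """2x2 rotation matrix for an angle in radians (stand-in for a z rotation)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s], [s, c]]


def lerp_matrix(a, b, t):
    """Component-wise blend of two matrices (an assumed interpolation scheme)."""
    return [
        [x + (y - x) * t for x, y in zip(row_a, row_b)]
        for row_a, row_b in zip(a, b)
    ]


start_angle, end_angle = 0.0, 2 * math.pi
t = 0.25  # a quarter of the way through the animation

# Interpolating the matrices: both endpoints are (numerically) the identity,
# so every in-between frame is the identity too and nothing appears to rotate.
via_matrices = lerp_matrix(rotation_z(start_angle), rotation_z(end_angle), t)

# Interpolating the 1d angle and then applying the function: a quarter turn.
via_function = rotation_z(start_angle + (end_angle - start_angle) * t)

print(via_matrices)
print(via_function)
```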
So, one final animation point, which is, typically, you often want to find out when the animations have completed.
You want to modify your layer tree at that point, add new content, or remove them.
You want to chain animations, in some cases.
And so previous to iOS 4 and Mac OS X Snow Leopard, the only way to do that was to create the animations explicitly, and then set the delegate property, and that works fine, but often explicitly creating animations is more work than you have to do otherwise, because we have this whole implicit animation feature.
So, we now have this other way of getting completion callbacks, which uses the Objective-C block syntax.
And so setting this transaction completion block property will tell us, the runtime, that this is a block of code.
And any animations I create from this point, I want you to remember the block of code they're associated with.
And then when all of those animations have completed, that's when you fire off the block to run on the main thread, and it gets to do whatever completion work it needs to do.
So, a lot less typing than creating explicit animation subclasses, delegates, and all that kind of thing.
So, again, in this example, we created a block, and then we just set these two properties, opacity and position, and, obviously, what we're trying to do here is we're ramping down the opacity to 0 and moving this layer somewhere far over to the right.
So, you can probably guess we're trying to move this thing offscreen.
So, then, when the block runs that we set up earlier, it's just going to remove the layer and then, presumably, do some other cleanup work.
And that's a nice way to animate some types of things, where you need to do this work afterwards, but implicit animations are totally enough for what you need to do.
Okay. So, that's really all the API mixture.
So, just in summary, I think the most important point of this section is, if you're going to do shadows on the embedded iPhone platform devices, then you really must use the shadowPath.
Just not setting that is really not acceptable for performance.
I would say that 100 times out of 100 it would never be good enough for what you want.
So, you really do need to set the shadowPath.
Secondly, CAShapeLayer, although not useful for everything, in some cases, can really save you, because it gets around all these limitations of bitmaps, and the performance is really good enough to have a few of these things running around, as we saw, and getting you this nice rendering quality.
And then, finally, think about if you're coding up some kind of animated UI, and the performance really isn't good enough, then using the shouldRasterize property to try and get some kind of caching out of it is often a really good way to improve performance.
But I must stress, like I said, it's really a last resort kind of feature in that you don't want to do it unless you really have to.
Okay. So, the next thing I want to talk about is performance.
Specifically, I want to build up a picture of how to think about performance of graphics rendering on the iPhone and the Mac.
I'll mostly focus on the iPhone, although all of this stuff is really applicable to both platforms.
So, we're going to build up from the bottom, up from the hardware through to the API.
And so the first question you really want to ask is, what do GPUs do?
A GPU, obviously, being a graphics card or a graphics processor.
And so you may have seen this kind of diagram before.
This is like one way we program the GPU, and this is not really what I'm talking about.
We really don't care about this.
This is for OpenGL programmers.
And we really program at a much different level than this.
We don't deal with lots of vertices.
So, we're really just thinking about triangles.
So, let's get rid of that, and let's think about GPUs.
In our eyes, the GPU is really just the device to convert triangles to pixels.
Obviously, the pixels live in a frame buffer, a piece of memory somewhere.
And so we have multiple types of triangles.
Firstly, we can have a triangle with a color.
You can see that.
We can have triangles with an image, and we can have triangles that aren't opaque, so they need to be composited with what's beneath them.
The interesting point there, from a performance standpoint, at least, is that the first two were both opaque, so they really don't care what's beneath them, and they can just write that color directly into the frame buffer; whereas, the third one, the non-opaque one, really needs to do some math to compute the final pixel.
So, it's going to look at what's underneath it, do some kind of plus, multiply thing, and then write that back in.
So, already we can see that blended triangles have a higher performance cost, more GPU cycles required for them, than opaque triangles.
So, that's very useful if you're thinking about your UI, right?
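The "plus, multiply thing" can be made concrete with a small Python sketch. The talk does not name the blend equation, so this assumes the common source-over rule on non-premultiplied colors; the function names are made up for illustration.

```python
def write_opaque(src, dst):
    """Opaque case: the destination pixel is simply overwritten.

    dst is accepted only to show that it never has to be read.
    """
    return src


def blend_source_over(src, alpha, dst):
    """Non-opaque case: read the destination, then multiply and add.

    Assumed formula: result = src * alpha + dst * (1 - alpha), per channel.
    """
    return tuple(s * alpha + d * (1 - alpha) for s, d in zip(src, dst))


red = (1.0, 0.0, 0.0)
blue = (0.0, 0.0, 1.0)

print(write_opaque(red, blue))            # (1.0, 0.0, 0.0)
print(blend_source_over(red, 0.5, blue))  # (0.5, 0.0, 0.5)
```

The blended path needs an extra read of the frame buffer plus the arithmetic for every pixel it touches, which is where the additional GPU cycles go.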
And then, finally, one other thing to think about is that we're not just talking about one memory buffer.
We can draw triangles into a piece of memory and then use that as the source image to apply to another triangle.
And so you can see here, I took the content I previously rendered and mapped it across some other set of vertices.
That's really all we can talk about for the GPU.
So, the question then becomes how do we take your view hierarchy, and how do we map that onto that set of primitives?
The answer is really very simple, which is we just take your layers and map them into triangles.
So, this is an image I reused, so kind of like triangles, but the idea is that each of these rectangles is really just two triangles.
Obviously, you can split along the diagonal between opposite corners and get two triangles out of each of these rectangles.
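As a quick illustration of the rectangle-to-triangles mapping, here is a Python sketch that splits a rectangle along one diagonal. The vertex ordering and the choice of diagonal are arbitrary assumptions; the talk does not say which ones the renderer uses.

```python
def rect_to_triangles(x, y, width, height):
    """Split a rectangle into two triangles that share one diagonal."""
    top_left = (x, y)
    top_right = (x + width, y)
    bottom_right = (x + width, y + height)
    bottom_left = (x, y + height)
    return [
        (top_left, top_right, bottom_right),
        (top_left, bottom_right, bottom_left),
    ]


def triangle_area(a, b, c):
    """Area of a triangle from its three corner points."""
    return abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2


triangles = rect_to_triangles(0, 0, 4, 3)
print(triangles)
print(sum(triangle_area(*tri) for tri in triangles))  # 12.0
```

The two triangles cover the rectangle exactly, so their areas add up to width times height.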
So, specifically, if your layer has a background color, or your view has a background color, then we would draw two colored triangles in that color into whatever we're drawing to.
Similarly, if you have an image applied to the contents of the layer, then we draw two triangles with an image, and, obviously, you can see, depending on the opacity of those contents, we may have to turn blending on for them.
And then, again, more complex compositing effects will use that other feature we talked about, which is to render somewhere else, and then use that to do some extra math and copy that back to the screen.
And similarly, we can do caching.
We saw just before that when we cache something, we render it into one buffer and then copy it back.
And also things like masking, or filters, if you're on a Mac.
All these things have to do extra work.
They can't just render directly to the screen.
It turns out that's a big deal for the GPU, because it interrupts its flow.
You can think of the GPU like an oil tanker.
It's moving along, and if you need to stop it and point it somewhere else, it's a big operation that takes a lot of time.
So, one more thing.
Obviously, at the Core Animation level, we typically don't even bother sending content to the screen that's covered by opaque regions.
Again, if you're thinking about performance, you need to know that, because you really just need to look at the visible areas of your app.
For example, your application on the iPhone is probably sitting on top of the SpringBoard icons, the home screen.
And if we didn't do this, then that would contribute to the performance of your application.
But since we do, you can really stop caring about anything below that first opaque layer.
Now we get to the interesting part, I guess, which is, given that we know roughly what we're doing, at least in very broad strokes, then what are the costs involved here?
What are the expensive things we have to care about?
So, we can break this down into three points.
Basically, how many destination pixels are we going to touch in the screen or the temporary frame buffer?
How many pixels do we have to read to generate that content?
So, obviously, we have triangles with images.
We need to read some number of source pixels to generate each destination pixel.
And then, finally, how many times do we switch buffers?
And so we basically give these a name.
We have write bandwidth, read bandwidth, which, obviously, measures the memory bandwidth.
We really just have to think of how many pixels, really.
And then the big one is how many times do we have to switch buffers?
So, just a few quick examples of when you run into these.
Firstly, if you have too much non-opaque content, then your application will probably be limited by the amount of writing the GPU has to do, the number of destination pixels it has to touch, because, obviously, translucent things have to be drawn along with whatever is behind them, whereas anything covered by opaque things wouldn't have to be.
Secondly, if you have too many large images, then you're probably going to be limited by the amount of data the GPU is having to read every frame.
And then, again, if you have too many masking operations, then the GPU will be switching between render targets, and the performance will get lost that way.
Typically, what will happen is, at any one point, your app will be bottlenecked behind one of these three points, and so you'll do some work, fix that, and then you'll find it's faster, but not quite fast enough, and then you have to switch and look at one of the other points.
So, at this point, I want to switch and try and put some examples behind all this talk.
And so what I have here is a sample application.
This is available on the WWDC website.
Hopefully, you'll be able to find it.
It's called the Core Animation Image Browser.
So, I'm just going to run it once, so you can see what it does.
I'm going to build it, compile it, and switch over to the device, and when it comes up, we have this kind of image browser app.
And I could flick, and it scrolls slowly.
The structure of this is pretty simple.
We have a scroll view.
We have a view controller, and we have a subclass of the scroll view, which kind of lays out these item layers, and each one of these is a view, but it has a custom layer backing it.
So, really, we're going to spend a lot of time looking at how that item layer is implemented and what we can do to make it faster.
So, this is the app.
I'm just going to give you a very quick run through.
So, we have some code.
We have an app delegate, a view controller.
The view controller is really just taking a bunch of URLs from the app bundle and then passing them on to the scroll view.
The scroll view has a little bit of code to do this layout, and so the layout method is really just, like I said, creating an item view for every image that was given, image URL that was given, and then initializing the image view, the item view, with the URL.
And then it's going to add that to itself as a subview.
There's some other stuff down here we'll talk about later.
So, the item view is really simple.
All that it does is it has an init method, but, more importantly, it implements layerClass to redirect UIKit to use our own layer subclass as the backing of this view.
We don't want to use CALayer.
We want to use our own one with all its custom code.
We return our class from this method, and that's what happens.
And then when we initialize ourselves, we just basically pass on the image URL into the layer that was created for us by UIKit.
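As a rough sketch of that arrangement in Swift (the sample itself is not reproduced here, so `ItemView`, `ItemLayer`, and `imageURL` are hypothetical names): the view overrides `layerClass` so UIKit creates the custom layer, and the initializer simply hands the URL to that layer.

```swift
import UIKit

// Hypothetical layer subclass; the real sample's layer does much more.
final class ItemLayer: CALayer {
    // URL of the image this item shows (loading is left out of this sketch).
    var imageURL: URL?
}

// Hypothetical view that asks UIKit for an ItemLayer backing.
final class ItemView: UIView {
    // UIKit calls this to decide which CALayer subclass backs the view.
    override class var layerClass: AnyClass { ItemLayer.self }

    init(frame: CGRect, imageURL: URL) {
        super.init(frame: frame)
        // UIKit has already created the backing layer; just pass the URL on.
        (layer as? ItemLayer)?.imageURL = imageURL
    }

    required init?(coder: NSCoder) {
        super.init(coder: coder)
    }
}
```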
So, like I said, most of the code, in fact pretty much all of it is in this image or item layer.
And you can see it has a bunch of methods.
And what do I want to talk about?
Right. So, I guess the final point here is that, obviously, I don't have time to write code here, so I kind of cheated and added a header file with a bunch of different options we can turn on or off.
So, the first thing I want to do is I want to recompile with this thread option enabled, because, as you saw, maybe this thing took a really long time to start up, and we don't want to be waiting for that every time we test something.
So, by setting this use image thread, all that's going to do is we're going to create a background thread, and we're going to arrange for our images to be loaded on the background thread and then set into the layer as they arrive, rather than just doing it all at once ahead of time.
And that's really not a Core Animation performance thing, but it makes this a lot more usable.
So, let's run it again.
Okay, so now you can see that the images are loading, as we go, and that's a lot better.
But performance now is what we want to look at, and performance here is really bad.
So, the first thing we want to change here is we want to look at how the shadows are drawn.
You saw each of those items had a shadow, and I'm afraid I did the thing I told you that you really shouldn't do, which is I basically just set up the shadow properties in my init method, set the radius and offset, and then let it auto-generate the shadows.
That actually works pretty well in the Simulator, but in the Simulator, we have a very fast CPU to do all that rendering for us.
So, I'm going to go back to my options, and I'm going to say, okay, let's use the shadowPath this time.
And so, hopefully, when I rerun this, we'll see okay, so, see, we still have the same shadows, but, well, it's a little faster, not massively so.
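A minimal sketch of the shadowPath change, assuming a plain rectangular item (the helper name `applyShadow` and the specific radius, offset, and opacity values are illustrative, not taken from the sample). Giving Core Animation an explicit path means it does not have to derive the shadow shape from the layer's rendered contents.

```swift
import QuartzCore

// Illustrative helper: configure a drop shadow with an explicit shape.
func applyShadow(to layer: CALayer) {
    layer.shadowOpacity = 0.5
    layer.shadowRadius = 4
    layer.shadowOffset = CGSize(width: 0, height: 3)
    // With shadowPath set, the shadow shape is known up front instead of
    // being computed from the layer's alpha on every composite.
    layer.shadowPath = CGPath(rect: layer.bounds, transform: nil)
}

let item = CALayer()
item.bounds = CGRect(x: 0, y: 0, width: 200, height: 150)
applyShadow(to: item)
```

The path is in the layer's own coordinate space, so it needs to be set again whenever the bounds change, for example from a layout pass.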
Okay, well, anyway, so we know we still have work to do here, right?
So, hopefully, you can see it's better.
Now, the other thing I want to look at is the images.
This is going to be really hard to make out, but the other bad thing we did is, when we loaded the images, we just took the CGImage that UIKit loaded for us, and we assigned it directly to the layer's contents, because we can do that, and it works.
And so here, we have nine images onscreen.
These images are actually 1024 X 768, which is screen sized.
So, you can imagine, when I composite this, I'm actually asking the GPU to read nine times the screen size in image data, which is quite a lot of memory.
I guess this screen is roughly a megapixel.
So, now that's roughly nine million pixels.
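Working the numbers through exactly, as a small sketch (the 4 bytes per pixel figure is an assumption for a typical RGBA bitmap, not something stated in the talk): a 1024 X 768 image is a little under a megapixel, so nine of them come to about seven million pixels read per frame.

```swift
// Sizes quoted in the talk.
let imageWidth = 1024
let imageHeight = 768
let imagesOnScreen = 9

let pixelsPerImage = imageWidth * imageHeight             // 786,432
let pixelsReadPerFrame = imagesOnScreen * pixelsPerImage  // 7,077,888

// Assumption: 4 bytes per pixel (8-bit RGBA).
let bytesPerPixel = 4
let bytesReadPerFrame = pixelsReadPerFrame * bytesPerPixel // 28,311,552

print(pixelsPerImage, pixelsReadPerFrame, bytesReadPerFrame)
```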
A really easy way to fix that, which is well, for me, I'm just going to turn on this, but [laughter] I'm going to tell you what I actually did now.
[laughter] So, right.
So, instead of using the contents property: you can see that before, when an image finished loading on the background thread, I ended up here in my didChangeValueForKey, because when I set the image property, I want to push that to the layer, and so, before, I was setting the contents of the layer to be the image that had been loaded.
Then calling setNeedsLayout, just so I can update the bounds and the shadow shape, basically.
But, so, what I'm going to do is I'm not going to set the layer contents to be the image, because that's how we get this nasty behavior, where we have all this image data being downsampled on the fly every frame.
I'm going to call setNeedsDisplay, and at the same time, I'm going to implement the drawInContext method.
And so my drawInContext is going to do a bit of work.
First of all, I'm going to take advantage of the fact that now that I'm actually drawing, I can get rid of the composited shadow entirely by just asking Core Graphics to draw it for me.
But, mainly, I'm going to fetch the image, and I'm going to draw the image directly into the layer's backing store.
But, obviously, I'm going to draw it at the size I want it, not the original size, so firstly I'm going to have Core Graphics do that downsampling, which is going to get a much, much better result than having the GPU do it, because this is kind of bread and butter for Core Graphics.
And, secondly, obviously, when we come to composite, we have prescaled content, which is now the right size, so at that point, instead of compositing nine times the screen size in image data, we're going to have roughly one screen's worth.
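A hedged sketch of that change in Swift, where `drawInContext` is spelled `draw(in:)`. The class name `ThumbnailLayer` and its `image` property are hypothetical, and the Core Graphics shadow mentioned above is left out to keep the example short. The layer asks for a redraw when the image changes and lets Core Graphics scale the image down to the layer's own size.

```swift
import QuartzCore

// Hypothetical layer that draws a prescaled copy of its image
// instead of handing the full-size bitmap to the compositor.
final class ThumbnailLayer: CALayer {
    var image: CGImage? {
        // Ask Core Animation to call draw(in:) again. Call this on the main
        // thread if the image was loaded on a background thread.
        didSet { setNeedsDisplay() }
    }

    override func draw(in ctx: CGContext) {
        guard let image = image else { return }
        ctx.saveGState()
        #if !os(macOS)
        // Assumption: on iOS the layer's context has a top-left origin,
        // so flip it to keep the CGImage upright.
        ctx.translateBy(x: 0, y: bounds.height)
        ctx.scaleBy(x: 1, y: -1)
        #endif
        // Let Core Graphics do the downsampling, once, at layer size.
        ctx.interpolationQuality = .high
        ctx.draw(image, in: bounds)
        ctx.restoreGState()
    }
}
```

The backing store created for the layer is the size of the layer, so the compositor reads thumbnail-sized data each frame rather than the original image.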
So I think I flipped that option, so let's recompile again.
Okay. So, firstly, if you're looking at this on the actual device, you'd see it looks better, but immediately you see the performance is way, way better now.
And just by having the right amount of image data for the right size screen, we can give the GPU the amount of work that it really likes to be doing, instead of way, way more.
Okay. There's one more thing I think I should show you here.
I need to restart this.
So, I'm going to run I guess I should switch back.
I'm going to run this, this time, using the same version of the application, but I'm going to run it using Instruments, specifically, the Core Animation instrument.
So, this is messed up.
Maybe I have to kill this.
There we go.
Right, so, you can see we have okay, I have to I didn't rehearse this.
So, anyway, what I wanted to turn on is this Color Blended Layers option, and I want to switch back.
You can see that you can see I can't scroll for one thing.
But, anyway, the point here is that we're asking Core Animation to tell us where the opaque pixels are, and where the non-opaque pixels are.
So, we can see this example.
Obviously, the background is green, so that's good.
But all these images are being blended every frame.
And if you look at them, they're opaque, right?
So, we really don't need to do that.
We can just have them mark themselves as opaque and get rid of the Alpha channel.
In this case, they had a shadow, as well, but we can cheat there.
We can just draw white into the background of the layer, because we know the background is white.
So, I'm going to put that to the background.
And so we have this other option, which, I can't remember exactly what it does, but it sets the opaque property of the layer.
And you can see where it's saying here, if our layer is opaque, when we draw it, we're just going to fill the background with white.
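A minimal sketch of that trick, assuming the background behind the items really is white (the class name `OpaqueItemLayer` is hypothetical): mark the layer opaque so its backing store has no alpha channel, and paint every pixel yourself before drawing the content.

```swift
import QuartzCore

// Hypothetical layer that fills its background when marked opaque.
final class OpaqueItemLayer: CALayer {
    override func draw(in ctx: CGContext) {
        if isOpaque {
            // An opaque layer promises to cover all of its pixels,
            // so fill with the known background color (white) first.
            ctx.setFillColor(gray: 1, alpha: 1)
            ctx.fill(bounds)
        }
        // The image (and any Core Graphics shadow) would be drawn on top here.
    }
}

let itemLayer = OpaqueItemLayer()
itemLayer.bounds = CGRect(x: 0, y: 0, width: 200, height: 150)
// Tells Core Animation it can skip blending for this layer.
itemLayer.isOpaque = true
```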
And so I run this again, and, hopefully, this time, it's going to be a lot greener.
You probably won't see a difference in performance, because we weren't really stressing that aspect of the GPU in this app, but this would be useful in your other cases. Except I didn't hit save.
Okay. Okay, here we go.
Okay, so now it's all green, and it's still scrolling really smoothly, and probably if I put a few more images in here, it would be smoother than it was before.
And that's kind of what you want to look for, just little tricks where you can minimize the amount of compositing that's going on and get that extra bit of performance.
So, one last thing I wanted to show here is let's turn off that color thing, first of all, while I remember.
So, one final thing, which is a little similar, but I wanted to show you another way of doing a common feature, which is when you have a scroller, and you want it to have a masked, feathered edge, so I set this to 1, and then recompile.
Then we'll see the app has changed a little bit, which is near the top and bottom of the scroll layers, we have a feathered edge.
Right. So, you can see it fading, right?
I see this every now and then.
And so you can also see where we lost a bunch of performance now.
It's not as bad as it was, but it's still, obviously, chunky-ish, and so we want it to be as fast as it was, but let's, first of all, just look at what we did here.
So, really, I turned on this soft edge layer, and this is going to be added as the mask of the scroll view, because we want to use a masking operation to kind of just gradually clip out the edges of the scroller.
And so what this is, really, is just a layer, and it has a layoutSublayers method, so that whenever its size changes, it gets to reconfigure itself.
And in this case, we're basically going to create two layers I'm sorry, three layers.
We're going to create two gradient layers, one for each edge, each ramping between 0 and 1.
And then we're just going to create a solid layer in the middle, and together it reads as one nice gradient, which is going to ramp from 0 to 1, stay at 1 through the middle, and then ramp back from 1 to 0.
And you can see, when we set that as the mask of the layer, we get the effect we wanted, because our mask kind of dissolves the content and then applies it to the background.
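A rough sketch of the masked version. Note the swap: instead of the three sublayers described above, this uses a single `CAGradientLayer` with four color stops, which gives the same 0 to 1 to 1 to 0 alpha ramp. The function name `makeFeatherMask` and the `fade` parameter are hypothetical.

```swift
import QuartzCore

// Illustrative helper: build a vertical alpha ramp (0 -> 1 ... 1 -> 0)
// to use as a mask. Only the alpha of the colors matters for a mask.
func makeFeatherMask(for bounds: CGRect, fade: CGFloat) -> CAGradientLayer {
    let mask = CAGradientLayer()
    mask.frame = bounds
    let clear = CGColor(gray: 0, alpha: 0)
    let solid = CGColor(gray: 0, alpha: 1)
    mask.colors = [clear, solid, solid, clear]
    // Convert the fade distance into a fraction of the height.
    let f = bounds.height > 0 ? min(fade / bounds.height, 0.5) : 0
    mask.locations = [0, NSNumber(value: Double(f)), NSNumber(value: Double(1 - f)), 1]
    return mask
}

let scroller = CALayer()
scroller.bounds = CGRect(x: 0, y: 0, width: 320, height: 400)
scroller.mask = makeFeatherMask(for: scroller.bounds, fade: 40)
```

This gives the look, but setting `mask` is exactly what forces the extra offscreen pass discussed next.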
But as I was saying, the performance here wasn't good enough, and that's because, let's see, if I switch on the Color Offscreen option, and then switch back to the app, you can see the whole screen is yellow.
And what that means is that we're basically taking an extra offscreen rendering pass, which is that thing I showed you earlier, where we draw a bunch of triangles into a buffer, and then use that as the source for another drawing operation.
And that's, you know, this whole kind of dependency chain gets created, and it makes performance pretty nasty.
So, another thing to look at, is you want to get rid of this kind of thing.
You want to basically eliminate all of this yellow.
And just like before, with the shadows, I could draw the shadows on a white background.
Again, we know the scroller background here is white, or at least it's static, and so I don't really need to be doing masking here, even though you may think of this as a masking operation.
I can turn this around and really just composite a white gradient on the top and bottom of the scroller, and I get the same effect, right?
So, I have another magic option, which will oops do that for me.
So, I switch this to 2, then what I'm going to do now is, if I find the right view, you can see I have this piece of code here, which is setting up this soft edge layer.
So, if edges equals 1, I'm going to use the mask, which is what we were doing.
But, in this case, I'm just going to add this as a sublayer on top of everything else, and then I have to make sure now that the gradient is inverted, because before we wanted to mask off the edges.
Now we want to cover them up, so we want the opacity to be, basically, in the opposite places.
So, whereas before I had the gradient running one way, now I'm just going to flip it over and drop the layer in the middle, because we no longer need it.
So, let's run that.
Yeah, I'm not going to make that mistake twice, maybe.
So, I still have the Color Offscreen option enabled, but you can see it's no longer firing, because we don't have any offscreen rendering.
We just have two white gradients, one on the top, one on the bottom, to cover up the pixels we want to hide.
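The overlay version might look something like this sketch, assuming the background behind the scroller is white (`makeEdgeOverlays` and `fade` are hypothetical names). Each gradient is solid white at the outer edge and fades to transparent white toward the content, and both are ordinary blended sublayers, so nothing is rendered offscreen.

```swift
import QuartzCore

// Illustrative helper: two white gradients that cover the scroller's edges
// instead of masking them. Assumes a white background behind the content.
func makeEdgeOverlays(for bounds: CGRect, fade: CGFloat) -> [CAGradientLayer] {
    let white = CGColor(gray: 1, alpha: 1)
    let clearWhite = CGColor(gray: 1, alpha: 0)

    // Edge at minY: opaque white at the edge, fading inward.
    let first = CAGradientLayer()
    first.frame = CGRect(x: bounds.minX, y: bounds.minY,
                         width: bounds.width, height: fade)
    first.colors = [white, clearWhite]

    // Edge at maxY: the inverse ramp.
    let second = CAGradientLayer()
    second.frame = CGRect(x: bounds.minX, y: bounds.maxY - fade,
                          width: bounds.width, height: fade)
    second.colors = [clearWhite, white]

    return [first, second]
}

let container = CALayer()
container.bounds = CGRect(x: 0, y: 0, width: 320, height: 400)
// Add the overlays last so they sit on top of the scrolling content.
for overlay in makeEdgeOverlays(for: container.bounds, fade: 40) {
    container.addSublayer(overlay)
}
```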
And, obviously, you can see the performance is back where we want it to be.
And just to prove, to maybe make it a little more obvious, I'm going to turn back on the color blended layers option, and you can see exactly where these gradients are now sitting.
And, obviously, they have to be blended, because they have an opacity ramp in them.
Okay. [applause] Okay, so I just want to summarize this, go over these three things, talk about what we were just seeing.
So, firstly, let's get rid of those blended layers.
You need to minimize the number of alpha-blended pixels to minimize the amount of write bandwidth.
And first of all, there's one way to see that, which is you turn on this Color Blended Layers option.
That's in Instruments, and Instruments only works for the devices.
And so if you're running on a Mac, or you're running on the Simulator, you can't use Instruments to turn these options on yet, but what you can do is you can set environment variables.
And so, if you're running on the Mac, you can just set this environment variable in your Xcode project and then run your application, and you get exactly the same behavior.
Or if you're running, say, the iPhone Simulator, then you can set this environment variable when you run the Simulator.
It's not particularly hard to do.
You just have to make sure to run the Simulator from the command line before you start Xcode, with whatever environment variables set that you want, and then when your app comes up, it'll have all those things preset.
So to get rid of the alpha channels, you need to make sure that any CGImageRefs which have opaque data, which includes the ones we were looking at there, don't have an alpha channel, because the alpha channel is how we decide whether to blend.
We don't look at any properties of the layer.
We just look at, does the contents of the layer have an alpha channel?
And so, if you're drawing into the layer, obviously, you don't get to touch the CGImageRef, because there probably isn't one, but you can set this layer opaque property, which is going to tell us that when we create the bitmap for you to draw into, we don't create an alpha channel, and so it's kind of two ways of doing the same thing.
And, finally, another point which is a little more interesting is that, say you have an image, and it may have, say, a translucent border, but an opaque center.
And what you'll probably do, to start with, is have one image and just put it on the screen.
But, obviously, since it has that non-opaque edge, you have to have the whole thing have an alpha channel.
And if the image is really large, compared to the border, then that can be pretty expensive, in terms of compositing costs.
So what you can do, in those cases, is you can basically just cut up your artwork into multiple images.
You could do a strip at the top, a strip at the bottom, a strip down each edge, and then you could have the center bit be an opaque image, like a JPEG or something.
And that will save you a lot of performance, if your image is large.
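To put a rough number on that saving, here is a small worked example. The sizes are made up for illustration: a 1024 X 768 image with an 8-pixel translucent border on every side.

```swift
// Hypothetical artwork: screen-sized image with an 8-pixel translucent border.
let width = 1024
let height = 768
let border = 8

// One image with an alpha channel: every pixel is blended.
let blendedAsOneImage = width * height                           // 786,432

// Split artwork: the center becomes an opaque image...
let opaqueCenter = (width - 2 * border) * (height - 2 * border)  // 758,016
// ...and only the four border strips still need blending.
let blendedWhenSplit = blendedAsOneImage - opaqueCenter          // 28,416

print(blendedAsOneImage, opaqueCenter, blendedWhenSplit)
```

With these numbers, the blended area drops to under 4 percent of what it was.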
So, the read bandwidth is really, really simple, which is just use images that, as much as possible, match the screen resolution.
When I say images, I really mean bitmaps of any type.
So, layers that draw are exactly the same: when you draw into them, they create a bitmap the size of the layer, so you want to make sure you're drawing them at the right size to match the screen.
Yes, don't use megapixel images to create thumbnails, because it doesn't look good, and it doesn't work well.
And so, again, there is an option in Instruments for this, "Color Misaligned Images."
This one is a little tricky to get to understand correctly, because we've changed it recently.
So, I'll try and explain what it does.
If you are on iOS 4, then this will draw two colors.
If you have an image which is just shifted a little bit, maybe its edges aren't quite pixel aligned, it'll draw pink.
If you have an image which is scaled, which is really what we're talking about here, then it'll draw yellow.
On previous releases, I think it would draw pink for both of those cases.
So we really added that as one way to help you track down low-res content in a high-DPI app, and things like that, but you can use it to find any general scaling issues.
And then, again, rendering passes, this is really often the most important thing to get right.
And, typically, unless you're doing very small offscreen things, you need to have only one rendering pass per frame to get good performance.
And so often, you really need to trick your way into that.
You can't just set all this compositing stuff up in the most obvious way.
You really have to think about what you're doing and what you really need, and just try and drive the number of passes down by turning on that thing that's not the bullet I was expecting.
So, complex compositing, things like masking, group opacity in some cases, if you have that enabled, and filters on the Mac, will all require this offscreen rendering.
And then I was about to say, use the Color Offscreen Instruments option to basically show you wherever you have this offscreen rendering.
Obviously, this option draws a yellow tint over any layers that are drawn offscreen, and so if you have multiple offscreen passes, it'll draw yellow, on top of yellow, on top of yellow, and so it gets darker and darker.
So, that gives you a nice way just to gauge exactly how bad it is.
And then one final thing, the feature we talked about earlier, caching the contents of layers as a bitmap, that is actually involving offscreen rendering itself.
If you get it working correctly by which I mean you actually get some cache reuse from frame to frame, because the contents of that cache subtree isn't changing, there's not too much demand on the cache memory, then that can really hide those extra rendering passes, because you can push them into that subtree that's been rendered once, and then reuse just the image from frame to frame.
Yes, but there is always the caveat, which is you really need to make sure it's working; otherwise, you could be making things worse for yourself.
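For reference, turning the cache on is only a couple of properties. This is a sketch: the helper name `enableRasterization` is hypothetical, and passing 2 as the screen scale assumes a high-DPI device.

```swift
import QuartzCore

// Illustrative helper: cache a layer subtree as a bitmap.
func enableRasterization(on layer: CALayer, screenScale: CGFloat) {
    // Render the subtree once into a bitmap and reuse it from frame to frame.
    layer.shouldRasterize = true
    // Match the cache resolution to the screen so it isn't scaled up.
    layer.rasterizationScale = screenScale
}

let badge = CALayer()
badge.bounds = CGRect(x: 0, y: 0, width: 64, height: 64)
enableRasterization(on: badge, screenScale: 2)
```

This only pays off when the subtree's contents stay the same between frames. If they change every frame, the cache is rebuilt each time and the extra offscreen pass is pure cost.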
Okay, so one more slide on performance.
So, to sum it all up, there's really a very simple algorithm here to look for performance.
Obviously, this involves all those color whatever options.
But what you really care about is why your frame rate isn't at 60 frames a second.
Get rid of extra rendering passes, get rid of really large images, and just get rid of extra non-opaque content.
And you just have to keep cycling around and around and, obviously, eliminating extra Core Graphics drawing, as well, but at the end of the day, you just have to do the hard work and get the performance where you want it.
Okay, so that's enough about performance.
The final section of the talk is going to be just a little bit about high DPI.
And, obviously, we all saw the new iPhone with the massively high DPI screen.
And so I don't know if you've been to any of the UIKit lectures, talks about how that's going to be exposed as programming API, but I'm not going to talk about that.
I just want to give you an idea of how you can use this stuff on the Core Animation level.
So, what's really happening here is when we have the high DPI phone is that you give us the layer tree, or you give us a view tree, which has a layer tree backing it.
And it gets composited to the screen. This is a picture of a new iPhone, unfortunately, but if we had an old iPhone, then what we'd have is a screen-sized layer with a bitmap which is 320 X 480 pixels large.
And then if we were going to display that application on a high DPI device, then your layer tree is exactly the same.
But what happens is that when the UIWindow is created, it's going to add a scaling transform onto the root of your layer tree.
So, we now have this kind of 200 percent, 2X scaling transform, which just blows everything up 2X.
And I don't know if you can see this.
Maybe you can, actually.
But when we take a 320 X 480 bitmap and blow it up twice, then we get pixelation.
And so, obviously, if you had that, you would get very little, if any, benefit from the high DPI screen.
So, we added some features in Core Animation to work around this.
Namely, we have a new property on the layer called contentsScale, and what contentsScale does is tell us the scale factor of the contents of the layer, and the contents is, obviously, the image.
So, in this case, we're drawing text, and the relationship between my layer geometry and the screen geometry is that I get scaled by 2X, so I'm going to set contentsScale to 2, and what that's going to do is implicitly change the size of that bitmap context from 320 X 480 to twice that, 640 X 960.
And then the nice thing, though, is that we'll just hide this from you by setting the matrices on the Core Graphics context so that you still think you're drawing into a 320 X 480 buffer, but Core Graphics will take care of the extra work that's required to get the high-resolution content.
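A small sketch of that relationship (the helper `backingStoreSize` is hypothetical and just spells out the arithmetic; Core Animation does this sizing internally): the layer's geometry stays at 320 X 480, and the bitmap it draws into is that size multiplied by `contentsScale`.

```swift
import QuartzCore

// Illustrative helper: the pixel size of the bitmap a drawing layer gets,
// i.e. its bounds multiplied by its contentsScale.
func backingStoreSize(of layer: CALayer) -> CGSize {
    CGSize(width: layer.bounds.width * layer.contentsScale,
           height: layer.bounds.height * layer.contentsScale)
}

let textLayer = CALayer()
// Geometry is unchanged from the old screen size.
textLayer.bounds = CGRect(x: 0, y: 0, width: 320, height: 480)
// Ask for twice as many pixels per unit of layer geometry.
textLayer.contentsScale = 2

print(backingStoreSize(of: textLayer))  // 640 x 960
```

Drawing code keeps using the 320 X 480 coordinate space; only the number of pixels behind it changes.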
And so zooming in, this is just to hammer this home one more time.
We magnify the old content, and then we set the content scale properly.
The buffer changes size, and you get finer grain, more pixels per inch or whatever.
And it's going to, obviously, look great, because that's going to match the natural resolution of the screen.
You get the highest possible DPI.
So, one final point is that, for UIKit, the way they've chosen to expose this to apps is to preserve compatibility, to make sure that your window is still 320 X 480.
In some cases, you may want to think about, for example, if you have graphics content, and you really want to get down to the native resolution of the display, you want to have a 640 X 960 layer, just so you can position things in inches, exactly correctly for some reason, then there's no reason you can't just undo that matrix.
UIWindow has this 2X matrix, but anywhere in your layer tree, you can apply the inverse of that, which is a scale by half, a 50 percent matrix, and that will set things up correctly, and then your layer will be, again, in the native coordinates, the native scaling space, and it will match up right.
Okay, so I'm really almost running out of time here.
So, basically, you have a 2X scale factor.
The geometry is the same.
contentsScale should be used for content.
And like I said earlier, rasterizationScale for rasterizing, and you can undo the scale matrix when you need to.
Okay, so one more slide.
So, if you take anything out of this, hopefully, it'll be maybe these three things.
One, whenever you need to, use shadowPath, or rather, whenever you're using shadows, use shadowPath.
Two, whenever you don't have quite the right performance, but it seems like this could help, use shouldRasterize.
And three, really think about what your layers mean to the graphics card, and try to think of them in terms of triangles and opacity, and what have you, and just try to make some kind of mental calculations.
And then now we're really done with five seconds to spare.
[laughter] Okay, thank you very much.