Adopting Metal, Part 1

Session 602 WWDC 2016

Metal provides the best access to the GPU on iOS, tvOS, and macOS, enabling you to maximize the graphics and compute potential of your apps and games. Get introduced to the essential concepts behind Metal, its low-overhead architecture, streamlined API, and support for efficient multi-threading. Start learning how to code with Metal in a walkthrough of rendering a basic scene.

[ Music ]

[ Applause ]

Good afternoon, and welcome to Adopting Metal, Part I.

I'm Warren Moore from the GPU Software Team, and I'm joined by my colleague, Matt Collins, who will be driving the demos today.

I want to start off by asking a deceptively simple question: what is Metal?

You've heard us say that Metal is Apple's low-overhead API for GPUs, that it has a unified graphics and compute language, that it's built for efficient multithreading, and that it's designed for our platforms.

And all of this is true, but Metal is a lot more than Metal.framework.

Metal is supported by additional frameworks and tools and so on.

And they make it a lot more than just the Metal framework API.

In particular, last year we introduced MetalKit, which includes utilities for doing common tasks like interacting with UIKit and AppKit and loading textures, as well as Metal Performance Shaders, which allow you to do common tasks such as image processing, and contain hand-tuned, highly optimized shaders that you can drop right into your app to do these tasks.

Metal is also tightly integrated with our developer tools, Xcode and Instruments.

When you have shaders in your app, they're actually compiled right along with your app and included in your app bundle, due to Metal's integration with Xcode.

And the GPU Frame Debugger allows you to take a snapshot of your app at any given point, and see exactly what's going on.

Metal System Trace in Instruments allows you to get an ongoing view of the performance and behavior of your Metal apps.

So two years ago, we introduced Metal on iOS, and since then, we've brought Metal to macOS and tvOS.

So it really has broad support across our platforms.

And it's also widely supported by our hardware.

It's supported on our desktop and mobile architectures from Apple, AMD, Intel, and NVIDIA, and this includes all Apple Macs introduced since 2012, and all iOS devices since 2013, as well as the new Apple TV.

So Metal gives your applications access to the performance and power of the GPU in literally hundreds of millions of our most popular products.

And Metal is also a foundational technology on these platforms.

It powers Core Graphics, Core Animation, as well as our Games and Graphics Libraries such as SpriteKit, SceneKit, and Model I/O.

And it's also an important component in key system applications like Preview and Safari.

And Metal has been widely adopted by developers of all sizes, from AAA Studios, game engine providers, independent developers, and creators of professional tools, and they've built amazing games and apps across all of our platforms.

These are just a few examples, but I'd like to highlight a couple.

For instance, Fancy Guo used Metal to dramatically improve performance and bring amazing visual effects to their highly popular MORPG, Furious Wings.

And Metal has also been used to build inspiring professional content creation tools, like the upcoming version of Affinity Photo for iPad.

And I'd like to show you just a quick preview of what's coming.

This is Affinity Photo, built by Serif Labs.

And they're building a fully featured photo editing app for the iPad Pro, allowing them to achieve truly stunning results.

And this year at WWDC, we want to give you the tools to help you start using Metal to build amazing experiences in your apps as well.

We have a lot of phenomenal content this year at WWDC, five sessions dedicated to Metal.

Of course, this is the first session, Adopting Metal, Part I.

And during this session, we'll talk a little bit about some foundational concepts in Metal, go on to talk about doing 2D drawing and then actually add lighting, texturing, and animation as well as we move into 3D.

In Part II of this session, happening in this room after this session, we'll talk about dynamic data management and go on and talk about some of the finer points of synchronizing the GPU and CPU, and really taking your performance to the next level with multi-threaded encoding.

Of course, we're also going to talk about what's new in Metal.

And there's really a tremendous list of new features that you probably saw teased during the Platforms State of the Union yesterday.

I won't go through all of these in detail, but if you're interested in implementing any of these in your apps, you should definitely check out the What's New sessions.

And finally, we have an awesome talk on advanced Shader optimization.

And this is really a hardcore talk for the people who want to get the absolute most out of their Metal Shaders.

We'll talk specifically about how the hardware works and how you can use Metal to really drive it to the max, and tune your Shader code.

Throughout the course of these sessions, we'll build a sample project, starting with just a simple Hello Triangle, the Hello World of graphics programming.

And then as I mentioned, we'll move to animation and texturing.

And in Part II, we'll take it to the next level and talk about updating object data in real time and also, performing draw calls across multiple threads.

Now of course, we have to make some assumptions about who you are.

We assume that you're familiar with the fundamentals of graphics programming, ideally with a programmable pipeline.

So you're familiar with Shaders and so on.

And also of course that you're interested in actually using Metal to make your games and apps even more awesome than they already are.

I assume that everybody here is on the same page with that.

That's why you're here, right?

So just to go through the agenda.

We'll kick things off with a conceptual overview that will sort of introduce the philosophy of Metal and why Metal is shaped the way it is.

Then we'll actually get right down to the nitty gritty and talk about creating a Metal device.

We'll go on to talk about loading data into memory that's accessible by the GPU.

And we'll talk about the Metal shading language briefly.

We'll talk about creating pre-validated pipeline states.

And then talk about issuing GPU commands, including draw calls.

And then we'll finish up with a discussion of how to perform animation and texturing in Metal.

Part II will take things even further, and I've already mentioned what we'll discuss there.

So let's just forge ahead.

Starting off with the conceptual overview.

There are a few things that I want to emphasize here.

Use an API that matches the hardware and driver.

Favor explicit over implicit.

And do expensive work less often.

Let's start with using an API that matches the hardware.

Metal is a thoroughly modern API.

And by that, I mean that it integrates with and exposes the latest hardware features, and it matches very closely to how the hardware actually works.

And being a comparatively new API, it's very thin and has no historical cruft that you get with other legacy APIs.

So there are no fancy tricks required for low overhead operation.

It's baked in to how Metal is shaped and how it operates at the most fundamental level.

And fortunately, it's unified by design across all of our platforms.

When we say that we want to favor explicit over implicit operation, we mean that we put in your hands the responsibility to exert explicit control over how commands are submitted to the GPU, as well as how you manage and synchronize your data.

And this puts a lot of responsibility on you, but with great responsibility comes great performance.

So, just to illustrate what we mean when we say "to do expensive work less often," there are kind of three regimes of time that we can think about.

The time that your app is built, the time that your app is loading, loading assets and so on, and then draw time, the things that happen 60 times per second.

So with a legacy API, like OpenGL, you pay the cost of work like state validation every time you issue a draw call.

You take the hit for recompiling Shaders on the fly in the worst case.

And all this adds overhead on top of the necessary work of encoding the actual work for the GPU, like your draw calls.

With Metal, we push some of this work earlier in the process.

So, as I alluded to earlier, Shader compilation can actually happen at application build time with Metal.

And additionally, we allow you to validate the state that you're going to be using for your draw calls in advance at load time, so you don't pay that cost every time you issue a draw call.

Instead, the only work that remains to be done when you're issuing your draw calls, is to do your draw calls.

So with that conceptual overview, let's talk about where the rubber hits the road, and that's, the Metal device.

So there's a class, MTLDevice, and it's an abstract representation of your GPU.

And it functions as the root object in your Metal app, meaning that you'll use it to create things like command queues, resources, pipeline state objects, and other objects that you'll be using.

It's very easy to create a Metal device.

You'll just call MTLCreateSystemDefaultDevice.

Now devices are persistent objects, so you'll probably want to create one at the beginning of your application and then hold onto a reference to it throughout your application's life cycle.

It's just that easy.
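As a rough illustration of creating the device once and keeping it alive, here is a minimal Swift sketch. The `Renderer` class is a hypothetical holder invented for this example and is not something shown in the session; `MTLCreateSystemDefaultDevice` is the call named above, and it returns nil when no Metal-capable GPU is available.

```swift
import Metal

// Hypothetical root object that owns the device for the app's lifetime.
final class Renderer {
    // Devices are persistent: create one up front and keep the reference.
    let device: MTLDevice

    // Fails (returns nil) when the system has no Metal-capable GPU.
    init?() {
        guard let device = MTLCreateSystemDefaultDevice() else { return nil }
        self.device = device
    }
}
```

An app would typically create one such object at launch and pass its `device` to whatever code needs to build queues, buffers, or pipeline states.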

So now let's talk a little bit about how to get data into a place where the GPU can access it, so that you can issue your draw calls.

And in Metal, we'll store our data in buffers.

So buffers are just allocations of memory that can store data in any format that you choose.

These might be vertex data, index data, constant data.

And you write data into these buffers and then access them later on in your vertex and fragment functions.

Let's take a look at what that might look like.

So here is an example of a couple of buffers that you might create, as you're loading your data.

We have a vertexBuffer, containing some vertices.

And an indexBuffer, containing some contiguous indices.

To get a little bit more concrete, each instance of this vertex type might be a Swift struct that contains a position vector for the vertex, as well as a color for the vertex.

You can just lay them out contiguously in memory.

Now let's talk about how you actually create buffers.

So the API for this is on the device that you've already created.

And you simply call newBufferWithLength to get a buffer of a particular size that doesn't have any data loaded into it.

Or you can call newBufferWithBytes and pass a pointer to data that already lives in memory.

And Metal will then copy that data into the newly created Metal buffer, and it will be ready for your use.

You can also memcpy into the contents pointer of the buffer if you choose.
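The "allocate, then copy into the contents pointer" route might look roughly like the sketch below. It uses the current Swift spelling `makeBuffer(length:options:)`, which differs from the 2016 name mentioned in the talk. The helper name `makeFilledBuffer` and the choice of shared storage (so CPU writes are visible to the GPU without extra synchronization calls) are assumptions made for this example.

```swift
import Metal

// Hypothetical helper: allocate an empty buffer, then copy data in by hand.
func makeFilledBuffer(device: MTLDevice, values: [Float]) -> MTLBuffer? {
    let byteCount = values.count * MemoryLayout<Float>.stride
    guard byteCount > 0,
          let buffer = device.makeBuffer(length: byteCount,
                                         options: .storageModeShared) else {
        return nil
    }
    // contents() is a raw pointer to the buffer's memory; this is the
    // Swift equivalent of a memcpy into that pointer.
    values.withUnsafeBytes { source in
        if let base = source.baseAddress {
            buffer.contents().copyMemory(from: base, byteCount: byteCount)
        }
    }
    return buffer
}
```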

So, since we're going to be showing a 2D triangle as the first part of our demo, let's talk about defining the geometry for this triangle here.

So, since we want to keep the vertex shader and fragment shader as simple as possible, we'll actually provide these coordinates in Clip Space.

And Metal's Clip Space is interesting.

It differs from some APIs and it's similar to some APIs.

This is like the DirectX Clip Space.

It runs from negative 1 to 1 in X, negative 1 to 1 in Y, and zero to 1 in Z.

So this is the coordinate space that we'll specify our vertices in.

So in code, it looks like this.

We create a Swift array of vertices, and then we just append vertices each with a position and a color.

Now, we don't strictly need to use indexed drawing for such a simple use case, but we'll go ahead and create an indexBuffer and append the indices 0, 1, and 2, which correspond of course to the first, second, and third vertices of our triangle.

And then we'll create a couple of buffers with our device.

So we'll create the vertexBuffer by calling newBuffer(withBytes:), which loads our vertex data into this Metal buffer, and we'll call newBuffer(withBytes:) again and pass the index data and get back the indexBuffer.

So, now that we have our data in memory, let's talk a little bit about Metal's unique shading language.
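The slide code is not reproduced in this transcript, so the following is only a sketch of what the triangle setup described above could look like. The `Vertex` layout (a four-component position and a four-component color), the specific coordinates and colors, the 16-bit index type, and the helper name `makeTriangleBuffers` are all assumptions. It uses the current Swift spelling `makeBuffer(bytes:length:options:)` rather than the 2016 name.

```swift
import Metal

// Assumed vertex layout: clip-space position plus an RGBA color.
struct Vertex {
    var position: SIMD4<Float>
    var color: SIMD4<Float>
}

// Three vertices given directly in clip space (x and y in -1...1, z in 0...1).
let triangleVertices = [
    Vertex(position: SIMD4<Float>( 0.0,  0.5, 0, 1), color: SIMD4<Float>(1, 0, 0, 1)),
    Vertex(position: SIMD4<Float>(-0.5, -0.5, 0, 1), color: SIMD4<Float>(0, 1, 0, 1)),
    Vertex(position: SIMD4<Float>( 0.5, -0.5, 0, 1), color: SIMD4<Float>(0, 0, 1, 1)),
]

// Indices of the first, second, and third vertices.
let triangleIndices: [UInt16] = [0, 1, 2]

// Copies both arrays into newly created Metal buffers.
func makeTriangleBuffers(device: MTLDevice) -> (vertexBuffer: MTLBuffer, indexBuffer: MTLBuffer)? {
    let vertexBytes = triangleVertices.count * MemoryLayout<Vertex>.stride
    let indexBytes = triangleIndices.count * MemoryLayout<UInt16>.stride
    guard let vertexBuffer = device.makeBuffer(bytes: triangleVertices,
                                               length: vertexBytes,
                                               options: .storageModeShared),
          let indexBuffer = device.makeBuffer(bytes: triangleIndices,
                                              length: indexBytes,
                                              options: .storageModeShared) else {
        return nil
    }
    return (vertexBuffer, indexBuffer)
}
```

Using `MemoryLayout<Vertex>.stride` rather than `size` keeps the byte count correct if the struct ever gains padding between array elements.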

The Metal shading language is an extended subset of C++14, and it's a unified language for graphics and compute, meaning that you can do a whole lot more than just 3D graphics with it.

It really just lets you write programs for the GPU.

So here's a block diagram of the stages of the pipeline.

And what we're really talking about now is the vertex and fragment processing stages.

Each of these has an associated function that you'll write, that's used to either process the vertices or the fragments that will wind up on screen.

Syntax-wise, it looks a little bit like this.

And we're not actually going to go through this in any detail.

I just want to call your attention to these function qualifiers: vertex and fragment.

You'll notice that right out in front of these functions, unlike in say a regular C++ program, we actually have these qualifiers that denote which stage this function is associated with.

So we have a vertex function up top, and a fragment function down below.

And I'll show you shortly how to actually associate these with your pipeline so that you can use them to draw.

And we'll also look at the internals of these functions for our 2D demo and later on for our 3D demo.

Now, I've mentioned a couple of times that Metal allows you to compile your Shaders directly into your app bundle.

And the way that happens is, if you have even a single .metal file in the Compile Sources phase of your project, then Xcode will automatically generate what's called a Metal library file, the default.metallib file, and copy it into your bundle at the time your application is built, with no further effort on your part.

So there's the insides of your app bundle.

There's your default.metallib.

So just to recap.

You can build Metal shaders at build time.

Again, if you have a .metal file in your app, it will be compiled by Xcode using the Metal toolchain.

And that produces default.metallib, which will wind up in your app bundle.

And the natural question you have at this point is, "Well, how do you actually get these functions at runtime?"

And the answer is that you'll use a class called MTLLibrary.

So an MTLLibrary is a collection of these compiled function objects, produced by the compiler.

And there are multiple ways to create it.

You can go through the flow that we just discussed, which is to build a default.metallib into your app bundle, and then load it at runtime.

You can also build .metallibs with a command line toolchain that we provide.

And you can also build directly from a source string at runtime if you're for example, building Shaders through string concatenation.

So in code, it looks like this.

In order to load up the default.metallib, you simply call newDefaultLibrary on your existing Metal device.

And there's other API for loading from, for example, an offline-compiled .metallib, or from source.

And you can consult the API docs for that.
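To make two of those library-loading paths concrete, here is a hedged Swift sketch. The shader source is a made-up pass-through pair (the names `vertex_passthrough` and `fragment_passthrough`, and the `VertexIn`/`VertexOut` structs, are assumptions, not the functions from the slides), and `loadLibraries` is a hypothetical helper. The Swift method names are the current ones (`makeDefaultLibrary()`, `makeLibrary(source:options:)`), which differ from the 2016 spelling used in the talk.

```swift
import Metal

// Assumed minimal shaders, written in the Metal shading language.
// Note the `vertex` and `fragment` qualifiers in front of each function.
let shaderSource = """
#include <metal_stdlib>
using namespace metal;

struct VertexIn  { float4 position; float4 color; };
struct VertexOut { float4 position [[position]]; float4 color; };

vertex VertexOut vertex_passthrough(const device VertexIn *vertices [[buffer(0)]],
                                    uint vertexID [[vertex_id]]) {
    VertexOut out;
    out.position = vertices[vertexID].position;
    out.color = vertices[vertexID].color;
    return out;
}

fragment float4 fragment_passthrough(VertexOut in [[stage_in]]) {
    return in.color;
}
"""

// Hypothetical helper showing two of the ways to obtain a library.
func loadLibraries(device: MTLDevice) throws -> (bundled: MTLLibrary?, fromSource: MTLLibrary) {
    // Loads default.metallib from the main bundle; nil when the bundle has none.
    let bundled = device.makeDefaultLibrary()
    // Compiles shader source at runtime; throws on compile errors.
    let fromSource = try device.makeLibrary(source: shaderSource, options: nil)
    return (bundled, fromSource)
}
```

Compiling from source at runtime is mainly useful for generated shaders; for fixed shaders the build-time path avoids that cost at launch.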

So, you have an MTLLibrary.

What do you get from an MTLLibrary?

You get an MTLFunction.

Now an MTLFunction is simply an object that represents a single function.

And it's associated with a particular pipeline stage.

Remember that we saw the diagram earlier, the vertex or fragment stage.

And we also have an additional function qualifier called kernel, that signifies a data-parallel compute function.

So here's that code snippet again, and you can see that the vertex function name here is vertex transform, and the fragment function name is fragment lighting.

And I rehash this so that I can show you the API for loading these functions from your library, which looks like this.

We simply call newFunctionWithName and pass a string that represents the name of your function, and get back a Metal function object, and then hold onto it.
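Looking functions up by name might be sketched as follows. The helper `loadShaderFunctions` is hypothetical, and the current Swift spelling `makeFunction(name:)` is used in place of the 2016 name. The check on `functionType` is an optional extra, added here only to show that each function object knows which stage it belongs to.

```swift
import Metal

// Hypothetical helper: fetch a vertex/fragment pair from a library by name.
func loadShaderFunctions(from library: MTLLibrary,
                         vertexName: String,
                         fragmentName: String) -> (vertex: MTLFunction, fragment: MTLFunction)? {
    // makeFunction(name:) returns nil when the library has no such function.
    guard let vertexFunction = library.makeFunction(name: vertexName),
          let fragmentFunction = library.makeFunction(name: fragmentName) else {
        return nil
    }
    // Each function carries the stage it was qualified with in the shader source.
    guard vertexFunction.functionType == .vertex,
          fragmentFunction.functionType == .fragment else {
        return nil
    }
    return (vertexFunction, fragmentFunction)
}
```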

Now, I'll show you how to actually use all these objects in a moment.

But that was just a brief introduction to the Metal Shading language.

So let's talk about building pre-validated pipeline states.

But first, let's motivate it a little.

So, with an API like OpenGL, you're often setting a lot of state.

And then you issue draw calls.

And in between those, the driver is obligated to validate that the state you've set is in fact a valid state, and again, in the worst case, you can even pay the cost of recompiling shaders at runtime.

This is what we want to avoid.

So with Metal, it looks more like this.

You set a pre-validated pipeline state object, and maybe set a few other bits of ancillary state, and then issue your draw call.

Now what we're trying to do here is to reduce the overhead of draw calls by again, pushing more work earlier into the process.

So here are a few examples of state that you'll want to set on your pipeline state object, which we'll talk about in a moment, versus state that you can set pretty much anytime when you're drawing.

You'll notice that in the left hand column, the state that you'll set on the pipeline state includes the vertex and fragment function that will be used to draw.

And it also includes things like your alpha blending state.

On the right hand side, instead we see the state that you can set before issuing any given draw call, including the front face winding and the cull mode.

So let's talk about how you actually create objects that contain this pre-validated state.

The chief object is the MTLRenderPipelineState.

It represents a sort of configuration of the GPU pipeline, and it contains a set of validated state that you'll create during load time.

Like devices, RenderPipelineStates are persistent objects that you'll want to keep alive throughout the lifetime of your application, though if you have a lot of different functions, you can create pipeline state objects asynchronously while your app is running.

To actually create a RenderPipelineState, we don't create one directly.

Instead, we use an object called a Descriptor that bundles up all the parameters that we're going to use to create this RenderPipelineState.

Often in Metal, we'll create Descriptor objects that really just bring together all of the different parameters that we need to create yet another object.

And so for the RenderPipelineState object, that's called a Render Pipeline Descriptor.

You'll notice that it contains pointers to the vertex function and fragment function, as I mentioned earlier.

And it also contains a collection of attachments.

And attachments signify the type of texture that will be rendered into when we actually do our drawing.

Now in Metal, all rendering is rendered to texture, but we don't need pointers to those textures right up front.

Instead we just need you to supply the pixel formats that you'll be rendering into so that we can optimize the pipeline state for them.

Additionally, if you're using a Depth or Stencil Buffer, then you can also specify the pixel format of those targets.

So once you've constructed a render pipeline descriptor, you can pass it off to your Metal device and get back a MTLRenderPipelineState object.

Let's take a look at that in code.

Here's a minimal configuration for a RenderPipelineState.

You'll notice that we're setting our vertex function and fragment function properties to the vertex and fragment function objects that we created earlier from our library.

And we're also configuring the pixel format of the primary color attachment to be .bgra8Unorm, which is one of our renderable and displayable pixel formats.

This represents basically the texture that will ultimately be drawn to when we do our drawing.

And finally, once we've constructed that pipeline descriptor, we can use the newRenderPipelineState function on the device to actually get back this pre-validated object.
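A minimal configuration along the lines just described could be sketched like this. `makePipelineState` is a hypothetical helper; the properties and the `.bgra8Unorm` format come from the description above, and the Swift method name is the current `makeRenderPipelineState(descriptor:)` rather than the 2016 spelling.

```swift
import Metal

// Hypothetical helper: build a pre-validated pipeline state at load time.
func makePipelineState(device: MTLDevice,
                       vertexFunction: MTLFunction,
                       fragmentFunction: MTLFunction,
                       colorPixelFormat: MTLPixelFormat = .bgra8Unorm) throws -> MTLRenderPipelineState {
    let descriptor = MTLRenderPipelineDescriptor()
    descriptor.vertexFunction = vertexFunction
    descriptor.fragmentFunction = fragmentFunction
    // Only the pixel format is needed here, not the texture itself.
    descriptor.colorAttachments[0].pixelFormat = colorPixelFormat
    // Validation (and any final shader compilation) happens in this call,
    // which is why it belongs at load time rather than draw time.
    return try device.makeRenderPipelineState(descriptor: descriptor)
}
```

The returned object is meant to be stored and reused across frames, not rebuilt per draw.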

I want to emphasize once more that PipelineStates are persistent objects, and you should create them during load time and keep them around as you do your device and resources.

You can switch among them when doing drawing in order to achieve different effects.

You'll generally have about one per pair of vertex and fragment functions.

So now that we've talked about how to construct pre-validated state and how to load some of your resources into memory, let's talk about actually issuing GPU commands, including draw calls.

We'll go through this in several stages.

We'll talk about interfacing with UIKit and AppKit, talk a bit about the Metal command submission model, and then get into render passes and draw calls, and finally, how to present your content on the screen.

So in terms of interacting with UIKit and AppKit, we're going to use a utility from MetalKit called MTKView.

And MTKView is a cross-platform view class.

It inherits from NSView on macOS and from UIView on iOS and tvOS.

And it reduces the amount of code that you have to write in order to get up and running on Metal.

For example, it creates and manages a CAMetalLayer for you, which is a specialized CALayer subclass that interacts with the window server or with the display loop in order to get your content on screen.

It can also, by use of CVDisplayLink or CADisplayLink, manage the draw callback cycle for you by issuing periodic callbacks in which you'll actually do your drawing.

And it also manages the textures that you'll be rendering into.

I want to emphasize one particular aspect of what this does for you, and that's drawables.

So, inside of the CAMetalLayer that's managed by your MTKView, there is a collection of drawables.

And drawables wrap a texture that will ultimately be displayed on screen.

And these are kept in an internal queue, and then reused across frames because they're comparatively expensive and they need to be managed by the system because they actually are very tightly bound to how things actually get displayed on the screen.

So we manage them for you and we hand you the drawable object that wraps up one of these textures that you can draw into.

So there are numerous properties that you can configure on an MTKView to determine how it manages the textures that you're going to be drawing into.

In particular, you can set a clear color that will determine which color the primary color target is cleared to.

You can specify the color pixel format, which should match the color format that you specified on your pipeline state object, as well as specifying a depth and/or stencil pixel format.

And this last property is probably the most important.

This is where we set the delegate.

So, MTKView doesn't actually do any drawing in and of itself.

You can either subclass it, or you can implement a delegate that's responsible for doing the drawing.
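Configuring those view properties might look like the sketch below. The helper name `configure(view:device:delegate:)`, the opaque black clear color, and the `.depth32Float` depth format are choices made for this example, not values taken from the session.

```swift
import MetalKit

// Hypothetical helper: set the MTKView properties discussed above.
func configure(view: MTKView, device: MTLDevice, delegate: MTKViewDelegate) {
    view.device = device
    // Color that the primary color target is cleared to each frame.
    view.clearColor = MTLClearColor(red: 0, green: 0, blue: 0, alpha: 1)
    // Should match the pixel format given to the render pipeline descriptor.
    view.colorPixelFormat = .bgra8Unorm
    // Optional: have the view manage a depth texture as well.
    view.depthStencilPixelFormat = .depth32Float
    // The view holds its delegate weakly, so the caller must keep it alive.
    view.delegate = delegate
}
```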

And we'll talk through the latter use case.

So let's take a look at what you have to do in order to become an MTKView delegate.

It really boils down to implementing two methods: drawableSizeWillChange and draw.

So in drawableSizeWillChange, you are responsible for responding to things like the window resizing or the device rotating.

So for example, if your projection matrix is dependent on the window size, then this gives you an opportunity to respond to that instead of rebuilding it every frame.

So the draw method will be called periodically, in order for you to actually encode the commands that you want to have executed, including your draw calls.

And we're not showing the complete internals of that method here, but this is just a taste of what's to come when we talk about command submission.

So you'll create a commandBuffer, do some things with it, and then commit it.

We'll talk a lot more about that in a moment, but this is sort of your hook for doing the drawing if you're using MTKView.
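A skeleton delegate with those two methods could be sketched as follows. The class name `FrameDelegate` and its `aspectRatio` property are hypothetical; the aspect ratio stands in for whatever size-dependent state (such as a projection matrix) an app would rebuild on resize. The method signatures are the current Swift forms of the two `MTKViewDelegate` requirements.

```swift
import MetalKit

// Hypothetical delegate showing the two MTKViewDelegate methods.
final class FrameDelegate: NSObject, MTKViewDelegate {
    let commandQueue: MTLCommandQueue
    // Stand-in for size-dependent state such as a projection matrix.
    private(set) var aspectRatio: Float = 1

    init(commandQueue: MTLCommandQueue) {
        self.commandQueue = commandQueue
        super.init()
    }

    // Called on window resize or device rotation, not every frame.
    func mtkView(_ view: MTKView, drawableSizeWillChange size: CGSize) {
        guard size.height > 0 else { return }
        aspectRatio = Float(size.width / size.height)
    }

    // Called periodically; this is where commands get encoded.
    func draw(in view: MTKView) {
        guard let commandBuffer = commandQueue.makeCommandBuffer() else { return }
        // ... encode render passes and draw calls here ...
        commandBuffer.commit()
    }
}
```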

And we recommend using MTKView, especially as you're getting started, because it takes care of a lot of things for you.

So let's talk about Metal's command submission model.

This is the picture that we're going to be building up over the next several slides.

And it's not important for you to memorize everything that's going on here.

We're going to be building this up.

This is just sort of an overview.

The objects that we're going to be constructing as we go along.

So, Metal's Command Submission Model is fairly explicit, meaning that you're obligated to construct and submit commandBuffers yourself.

And you can think of a commandBuffer as a parcel of work to be executed by the GPU, in contrast to what we're calling a Metal buffer, which stores data.

Command buffers store work to be done by the GPU.

And commandBuffer submission is under your control, meaning that when you have a commandBuffer that you've constructed, you're obligated to tell the GPU when it's ready to be executed.

We'll talk all about this in a moment.

Additionally, we'll talk about command encoders which are objects that are used to translate from API calls into work for the GPU.

It's important to realize that these command encoders perform no deferred state validation.

So all of the pre-validated state that's bundled up in your pipeline state objects, we assume that that's valid because we validated it in advance.

And so there's no additional work to be done by the encoder or the driver at the point that you're issuing commands to be rendered.

Additionally, Metal's Command Submission Model is inherently multi-threaded, which allows you to construct multiple Command Buffers in parallel, and have your app decide the execution order.

This allows you to scale to, and beyond, tens of thousands of draw calls per frame.

Adopting Metal Part II will talk about this in depth, but I wanted to mention it now to hint at what's to come.

So let's talk a little bit more about these objects in depth.

The first thing we'll talk about is the Command Queue.

And the Command Queue, which corresponds to a Metal class called MTLCommandQueue, manages the work that has been queued up for the device to execute.

Like the device and resources and pipeline states, queues are persistent objects that you'll want to create up front and then keep a handle on for the lifetime of your app.

You'll often only need to create one.

And this is how thread safety is introduced into the Metal API in the sense that you can create Command Buffers and render into and use them on multiple threads.

And the queue allows you to create and commit them in a thread safe fashion, without you having to do your own locking.

It's really simple to create a Command Queue.

You simply call newCommandQueue on your device, and you'll get back an MTLCommandQueue.

Of course, a queue can't do much unless you actually put work into it, so let's talk about that.

So I've already mentioned Command Buffers.

And Command Buffers are the parcels of work to be executed by the GPU, and in Metal, they're represented by a class called MTLCommandBuffer.

So an MTLCommandBuffer contains a set of commands to be executed by the GPU, and these are each enqueued onto a Command Queue for scheduling by the driver.

These, in contrast to almost everything we've talked about thus far, are transient objects, meaning that you'll create one or more of them per frame, and then encode commands into them, and then let them go off to the GPU.

You won't reuse them.

You won't hold onto a reference to them.

They're just fire and forget.

To create a Command Buffer, you simply call commandBuffer on a Command Queue.
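The split between a persistent queue and transient command buffers can be sketched like this. `CommandSubmitter` and `submitEmptyFrame` are hypothetical names, the Swift calls are the current `makeCommandQueue()` and `makeCommandBuffer()`, and the blocking wait is there only so the example can report whether the work finished; a real renderer would not normally stall the CPU this way.

```swift
import Metal

// Hypothetical wrapper: one long-lived queue, one short-lived buffer per frame.
final class CommandSubmitter {
    // Persistent: created once and kept for the lifetime of the app.
    let queue: MTLCommandQueue

    init?(device: MTLDevice) {
        guard let queue = device.makeCommandQueue() else { return nil }
        self.queue = queue
    }

    // Creates, commits, and discards a command buffer with no encoded work.
    func submitEmptyFrame() -> Bool {
        // Transient: a fresh command buffer for each submission.
        guard let commandBuffer = queue.makeCommandBuffer() else { return false }
        commandBuffer.label = "Empty frame"
        // Submission is explicit: nothing runs until commit() is called.
        commandBuffer.commit()
        // Demonstration only: block until the GPU has finished.
        commandBuffer.waitUntilCompleted()
        return commandBuffer.status == .completed
    }
}
```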

So we've talked a bit about Buffers, and Queues, and now let's actually talk about how we get data and commands into a commandBuffer, and that's done with a special class of objects called Command Encoders.

And there are several types of Command Encoders, including Render, Blit, and Compute.

And these each allow you to do different things.

And they all have this common thread, though, of allowing you to encode work into a Command Buffer.

So for example, a Render Command Encoder will allow you to set state and perform draw calls.

A Compute Command Encoder will allow you to enqueue work for the GPU to execute in a data parallel fashion that's not rendering work.

That's your GPGPU and other stuff like that.

And the Blit Command Encoder allows you to copy data between buffers and textures, and vice versa.

We're going to look in detail at the Render Command Encoder in this session.

And as I mentioned, it has the responsibility of encoding commands.

And each Render Command Encoder encodes the work of a single pass into a Command Buffer.

So you'll issue some state changes and then you'll issue some draws and it manages a set of Render target attachments, that represent the textures that are going to be drawn into by this one particular pass.

So schematically, what we're talking about here is sort of the last stage.

You can see we have these attachments sort of hanging off the framebuffer write stage of the pipeline.

And if we were doing multi pass rendering, then one or more of the render targets in this pass might become inputs for a subsequent pass.

But this is sort of the single pass, simple use case.

So again, the attachments represent the textures that we're going to be drawing into at the end of this pass.

So in terms of actually creating a render command encoder, we use another type of descriptor object, called a RenderPassDescriptor.

So a RenderPassDescriptor contains a collection of attachments, each of which has associated load and store actions, a clear color or clear value, and an associated MTLTexture to be rendered into.

And we'll talk a little bit more about load and store actions in a couple of slides.

But the important thing to realize here, is that you'll be constructing a RenderPassDescriptor at the beginning of your frame, and actually associating it with the textures that are going to be drawn.

So in contrast to the renderPipelineState object that only needs to know the pixel format, this is sort of where the rubber hits the road and you actually have to give us the textures that we're going to be drawing into.

So again, a RenderPassDescriptor contains a collection of render pass attachments, each of which might be a color, depth, or stencil target, and refers to a texture to render into.

And it also specifies these things called load and store actions.

Let's talk in more depth about what that actually means.

So, at the beginning of a pass, you have your color buffer and your depth buffer and they contain unknown content.

And in order to actually do any meaningful work, we'll need to clear them.

And we do this by setting their associated load action on the RenderPassDescriptor.

So we set a load action of clear on the color and depth targets, and that clears them to their corresponding clear color or clear value, as the case may be.

Then we'll do some drawing, which will actually put the results of our draw calls into these textures.

And then the store action will be performed.

And the store action here is going to be one of two things.

The store action of store signifies that the result of rendering should actually be written back to memory and stored.

And in the case of the color buffer, we're actually going to present it potentially on screen.

In the case of the depth buffer, we're really only using it when we're actually drawing and rendering, and so we don't care about where the results of that go at the end of the pass.

So we can set a store action of "Don't Care" in order to save some bandwidth.

This is an optimization that you can do if you don't actually need to write back the results of rendering into the render target.

So to go in a little bit more depth on load and store actions, these determine how texture contents are handled at the start and end of your pass.

In addition to the Clear load action that we just saw, there's also a load action of Load, which allows you to load the pixel contents of your textures with the results of a previous pass.

There's also "Don't Care."

For example, if you're going to be rendering across all pixels of a given target, then you don't actually care what was in the texture previously, nor do you need to clear it because you know that you're going to actually be setting every single pixel to some value.

So that's another way that you can optimize, if you know that in fact you are going to be hitting every single pixel in this pass.
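As a rough sketch of the load and store actions just described, this is how a render pass descriptor might be configured by hand in Swift. The helper name and the clear color are assumptions made for illustration, and in practice MTKView builds this descriptor for you.

```swift
import Metal

// Hypothetical helper: configures one color target and one depth target.
// The textures may be nil here; a real pass needs actual render targets.
func makeRenderPassDescriptor(colorTexture: MTLTexture?,
                              depthTexture: MTLTexture?) -> MTLRenderPassDescriptor {
    let descriptor = MTLRenderPassDescriptor()

    // Color: clear at the start of the pass, write the result back at the end.
    descriptor.colorAttachments[0].texture = colorTexture
    descriptor.colorAttachments[0].loadAction = .clear
    descriptor.colorAttachments[0].clearColor = MTLClearColor(red: 0, green: 0, blue: 0, alpha: 1)
    descriptor.colorAttachments[0].storeAction = .store

    // Depth: clear at the start, but skip the write-back to save bandwidth.
    descriptor.depthAttachment.texture = depthTexture
    descriptor.depthAttachment.loadAction = .clear
    descriptor.depthAttachment.clearDepth = 1.0
    descriptor.depthAttachment.storeAction = .dontCare

    return descriptor
}
```

If every pixel of the color target is going to be overwritten, the color load action could be `.dontCare` instead of `.clear`.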

Now, I could walk you through how to create a RenderPassDescriptor and then create a Render Command Encoder, but fortunately, MTKView makes this really easy on you.

You saw earlier that we configured the MTKView with a couple of properties that by now I hope you've become familiar with, like the clear color and the texture formats of your render targets.

So you can actually just ask the view for its current RenderPassDescriptor and you'll get back a configured RenderPassDescriptor that you can then go on to use to create a Render Command Encoder.

And this is how you do that.

You simply call renderCommandEncoder on your Command Buffer.

Now it's important to note here that currentRenderPassDescriptor is potentially a blocking call.

And the reason for that is that it will actually call into the CAMetalLayer's nextDrawable function, which we won't talk about in detail now, but which is used to obtain the drawable that wraps the texture that can be presented on screen.

And because that is a finite resource, if there is not a drawable currently available, if all of them are in flight, then this call will block.

So it's something to be aware of.

So we've talked about loading resources into memory, we've talked about creating pre-validated state, and now we've talked about creating Render Passes and Render Command Encoders.

So how do we actually get data into our Shaders?

First, we need to talk a little bit about argument tables.

So argument tables are mappings from Metal resources to Shader parameters.

And each type of resource that you're going to be working with, such as a buffer or a texture, has its own separate argument table.

So you can see here on the right, that we have the buffer argument table and the texture argument table, each of which contains a couple of resources that are mapped to particular indices in the argument table.

Now the number of slots that are available in any given argument table is actually dependent upon the device.

So you should query for them.

Let's make that a little bit more concrete.

So, on the Render Command Encoder there's a function called setVertexBuffer, and you'll notice that it has three parameters.

It takes a buffer, an offset, and an index.

So this last parameter is what we care about the most because it's our argument table index.

So this is sort of the host side of setting resources that are going to be used in your Shader.

And there's a corresponding Shader side which looks like this.

So this is in the Metal shading language.

Inside your Shader file, you'll specify that each given parameter that corresponds to a resource that you want to access has an attribute that looks like this.

So, this is the first buffer index.

Buffer Index Zero in the Argument Table, corresponds to the buffer that we just set back on our Render Command Encoder.

And we'll look at a little bit more about this in detail when we actually talk about doing drawing in 2D.

We've already created a renderPipelineState, but we actually need to tell our Render Command Encoder which pipeline state to use before doing any drawing.

So this is the API for that.

We simply call setRenderPipelineState with the previously created PipelineState object, and that configures the pipeline with the Shaders that we created earlier and that we're going to be using to draw.

Now of course, the RenderPipelineState has an associated vertex and fragment function.

So let's take a look at the vertex and fragmentFunction that we're actually going to be using to draw in 2D.

Back in Metal Shading language, it looks like this.

So, this is basically a pass through vertex function which means that we're not going to be doing any fancy math in here.

It's really just going to copy all these attributes, straight on through.

So, the first parameter to this function is a list of vertices which is the buffer that we just bound.

And the second parameter is this thing that's attributed with the vertex_id attribute, which is something that's going to be populated by Metal with the index of the vertex that we're currently operating on.

And the reason that's important is because the vertexBuffer contains all the vertices, and we can access it at random.

But what we actually want to do in our vertex function is operate on one particular vertex at a time.

So this tells us which vertex we're operating on.

So, we create an instance of this struct VertexOut, which represents all the varying properties of the vertex that we want to pass through to the rasterizer.

So we create an instance of this and set its position to the position vector of the vertex indexed at vertexId.

And similarly for the color.

And this just passes that data on through from the vertexBuffer to the struct that will be interpolated by the rasterizer.

And then we return that structure back on out.

Now let's look at the fragmentFunction.

It's even simpler.

So we take in the interpolated struct, using the stage_in attribute, and that signifies that this is the data that's coming in from the rasterizer.

And we just extract the color from the incoming structure, and then pass it back on out.

And so, what's happened in this process is the vertices, which were already specified in Clip Space in this example, are being interpolated and then rasterized, and then for each fragment that we're processing, we simply return the interpolated color that was created for us by the rasterizer.
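The slide code isn't part of this transcript, so here is a minimal sketch of what such a pass-through pair might look like. The Metal shading language source is kept in a Swift string and compiled at run time with makeLibrary(source:options:). The struct layouts, the function names, and the Swift helper are all assumptions made for illustration, and an app would normally compile its shaders with Xcode instead.

```swift
import Metal

// Metal shading language source for the pass-through functions.
// Struct layouts and function names are assumptions made for this sketch.
let passThroughShaderSource = """
#include <metal_stdlib>
using namespace metal;

struct VertexIn {
    float4 position;
    float4 color;
};

struct VertexOut {
    float4 position [[position]];  // clip-space position for the rasterizer
    float4 color;                  // interpolated across the triangle
};

vertex VertexOut vertex_pass_through(const device VertexIn *vertices [[buffer(0)]],
                                     uint vertexId [[vertex_id]]) {
    VertexOut out;
    out.position = vertices[vertexId].position;
    out.color = vertices[vertexId].color;
    return out;
}

fragment float4 fragment_pass_through(VertexOut in [[stage_in]]) {
    return in.color;
}
"""

// Hypothetical helper: compiles the source above into a library at run time.
func makePassThroughLibrary(device: MTLDevice) throws -> MTLLibrary {
    return try device.makeLibrary(source: passThroughShaderSource, options: nil)
}
```

The `[[buffer(0)]]` attribute is the shader side of the argument table: it matches whatever buffer the host code binds at vertex buffer index 0.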

So once we've specified the RenderPipelineState which contains our vertex and fragmentFunction, we can also set additional state, kind of like the stuff that I mentioned earlier, including the front facing state.

So if you want to specify a different front facing winding order, than Metal's default of clockwise, then you can do that here.

It's a lot of configuration, but we're actually about to see some draw calls happen, right now.

So Metal has numerous functions for drawing geometry, including indexed, instanced, and indirect, but we'll just look at basic indexed drawing.

Let's say that we want to draw that triangle, at long last.

So, here we call drawIndexedPrimitives, and we specify that the primitive type is triangle because we want to draw a triangle.

We pass an index count of three to signify that we want to draw a single triangle, and then we also specify the type of indices.

We made our Swift array a collection of UInt16s earlier, so we mirror that here.

And we also pass in the indexBuffer that we created earlier that signifies which vertices should be drawn, and then we pass in an offset of zero.

And this is actually going to result in a single triangle being drawn to the screen.

We might also additionally set some more state and issue other draw calls, but for the purposes of this first demo, this is all there is to it.

So in order to conclude a render pass, we simply call endEncoding on the Render Command Encoder.

To recap all of that, you will request a RenderPassDescriptor at the beginning of your frame.

Then you'll create a Render Command Encoder with that Descriptor.

Set the RenderPipelineState.

Set any other necessary state.

Issue draw calls, and finally end encoding.

So here's a recap of all the code that we've seen thus far.

Nothing new here.

Exactly what we've seen and exactly what I just said.

Create a Render Command Encoder, set state, set state, bind some buffers, and draw.
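The recap code itself isn't captured in this transcript, so the steps above can be sketched roughly as follows. This uses the current Swift spellings of the Metal calls. The function name and parameters are assumptions, and it presumes the pipeline state, vertex buffer, and index buffer were created earlier as described.

```swift
import Metal

// Hypothetical helper that encodes one render pass drawing a single triangle.
func encodeTrianglePass(commandBuffer: MTLCommandBuffer,
                        descriptor: MTLRenderPassDescriptor,
                        pipelineState: MTLRenderPipelineState,
                        vertexBuffer: MTLBuffer,
                        indexBuffer: MTLBuffer) {
    // Create a Render Command Encoder from the descriptor.
    guard let encoder = commandBuffer.makeRenderCommandEncoder(descriptor: descriptor) else {
        return
    }

    // Set the pipeline state, then any other state (here, the winding order).
    encoder.setRenderPipelineState(pipelineState)
    encoder.setFrontFacing(.counterClockwise)

    // Bind the vertices at index 0 of the vertex buffer argument table.
    encoder.setVertexBuffer(vertexBuffer, offset: 0, index: 0)

    // Three UInt16 indices describe one triangle.
    encoder.drawIndexedPrimitives(type: .triangle,
                                  indexCount: 3,
                                  indexType: .uint16,
                                  indexBuffer: indexBuffer,
                                  indexBufferOffset: 0)

    // Conclude the render pass.
    encoder.endEncoding()
}
```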

So you've rendered all this great content, but how do you actually get it on the screen?

It's pretty straightforward.

So, the first color attachment of your render pass is usually a drawable's texture that you've gotten either from a CAMetalLayer or from an MTKView.

So in order to request that that texture actually get presented on screen, you can actually just call present on the commandBuffer, and pass in that drawable, and that will be displayed to the screen once all the preceding passes are complete.

Then to finally actually finish up the frame, since we've been encoding into this commandBuffer, we need to signify that we're done with the commandBuffer by calling commit.

Committing tells the driver that the commandBuffer's ready to be executed by the GPU.

So to recap that, we created a command queue at start-up, and since it's a persistent object, we hold onto a reference to it.

Each frame we create a commandBuffer, encode one or more rendering passes into it with a render command encoder.

Present the drawable to the screen and then commit the commandBuffer.

And now I'm going to hand things off to my colleague Matt, to walk us through the demo of drawing in 2D.

Thanks Warren.

[ Applause ]

So here's the proof.

A 2D triangle, this is the Metal triangle demo as you can tell by our awesome title, is very simple.

Just a triangle, three colors on the ends interpolated nicely over the edges.

Let's take a look at the code.

Now first I want to show you what it takes to become a delegate of MTKView, and Warren mentioned we have two functions to implement.

So here we have mtkView(_:drawableSizeWillChange:).

And this is what is called when you need to respond to changes in your window.

This sample is very simple so we didn't actually implement it.

We'll leave that up to you guys for your own applications.

And the other thing is simply the draw.

We chose to put this into a render function.

So, when our draw gets called, we go into our render.

Render's also quite simple.

I just wanted to show you.

We take MTKView's currentRenderPassDescriptor, you just grab it out like Warren said, and then you create your encoder with it.

And I'd like to draw your attention here, to pushDebugGroup.

And this is how you talk to the awesome Metal tools.

So when you do a frame capture, this will then sort all of your draws by whatever debug group you've had.

So here, we have one draw and a draw triangle and then we pop the debug group after we've drawn, and so this draw will show up labeled as "Draw Triangle."
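The demo source isn't reproduced in this transcript, so here is a minimal sketch of a delegate along the lines Matt describes. The class name, the encodePass hook, and the structure are assumptions. Only the two MTKViewDelegate methods and the Metal calls themselves are real API.

```swift
import MetalKit

// Hypothetical renderer acting as the MTKView delegate.
final class Renderer: NSObject, MTKViewDelegate {
    // Persistent object created once at start-up.
    private let commandQueue: MTLCommandQueue

    // Hook for the caller to set state and issue draw calls on the encoder.
    var encodePass: (MTLRenderCommandEncoder) -> Void = { _ in }

    init?(device: MTLDevice) {
        guard let queue = device.makeCommandQueue() else { return nil }
        commandQueue = queue
        super.init()
    }

    // Called when the drawable size changes; left empty, as in the demo.
    func mtkView(_ view: MTKView, drawableSizeWillChange size: CGSize) {}

    // Called once per frame.
    func draw(in view: MTKView) {
        // currentRenderPassDescriptor can block until a drawable is available.
        guard let commandBuffer = commandQueue.makeCommandBuffer(),
              let descriptor = view.currentRenderPassDescriptor,
              let drawable = view.currentDrawable,
              let encoder = commandBuffer.makeRenderCommandEncoder(descriptor: descriptor) else {
            return
        }

        // Everything between push and pop is grouped under this label
        // in a GPU frame capture.
        encoder.pushDebugGroup("Draw Triangle")
        encodePass(encoder)
        encoder.popDebugGroup()
        encoder.endEncoding()

        // Present once the preceding passes complete, then hand off to the GPU.
        commandBuffer.present(drawable)
        commandBuffer.commit()
    }
}
```

To use it, you would keep a strong reference to the renderer and assign it to the view's delegate property, since the view does not retain its delegate.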

Let's take a look at the Shader.

Now Warren mentioned, we had structs.

We have a VertexIn struct, which is the format of the data we're putting into the Shader, as you see.

That's just a position and a color.

And we have the VertexOut struct, which is what we're passing down to the rasterizer.

And you see here the position has been tagged with this position attribute.

And this represents the Clip Space position.

And every vertex Shader that you have or vertex function, sorry, must have one of these.

And as you saw, these should look kind of familiar, they're very simple.

Vertices come in.

We have a pass through.

And you write them out.

And in the fragmentFunction, we take in the vertices that came out of the rasterizer and we read the color, and send that down.

So that's the simple triangle demo.

I'll send it back to you, Warren.

Thanks Matt.

So, we've shown how to actually draw 2D content.

And 2D is cool, but you know what's even cooler?

Three-D. So let's talk a little bit about animation and texturing in Metal.

Alright, well, we'll go through this in a couple of stages.

We'll talk about how to actually get into 3D.

And we'll talk about animating with a constant buffer, and then we'll talk a little bit about texturing and sampling.

In order to move into 3D, whereas we've been specifying our vertices in Clip Space, we now need to specify them in a model local space.

And then multiply them by a suitable model view projection matrix, in order to move them back into Clip Space.

And we'll also add properties for a vertex normal as well as texture coordinates so that we can actually use those in our fragmentFunction, to determine lighting and to determine how to apply the texture map.

So, here's our extended vertex.

We have removed the color attribute and we've added in a normal vector as well as a set of texture coordinates.

And similarly to how we had in 2D, we'll just be adding on a new buffer that will store all the constants that we need to reference from our various vertex and fragmentFunctions in order to actually transform those vertices appropriately.

Now, you'll notice that the outline of this buffer is dashed, and there's a good reason for that.

Because I don't want to create another Metal buffer in order to manage this tiny amount of data.

This is only a couple of matrices.

And it turns out that Metal actually has an awesome API for binding very small buffers and managing them for you.

So again, for small bits of data, less than about 4 kilobytes, you can use this API, setVertexBytes, and pass it a pointer directly to your data.

And of course, tell us what size it is.

And Metal will create or reuse a buffer that contains that data.

And again, you can actually specify the argument table index here, specifying it as 1, because our vertices are already bound at Index 0, so we bind at Index 1 so that we can then read from that, inside of our functions.

So let's take a look at how our functions actually change and respond to this.

Before that, we'll see an example of how to actually call setVertexBytes inside your application code.

So, we'll create this Constants struct that contains the two matrices that we're going to be multiplying by: the Model View Projection Matrix, and the normal matrix, which is the matrix that transforms the normal from local space into eye space.

We'll construct them using whatever matrix utilities we're comfortable with, and then multiply them together.

And finally use setVertexBytes, passing a reference to that structure and then Metal will copy that into again, this implicit buffer that's going to be used for drawing in our subsequent draw call.
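A minimal sketch of that application code might look like the following, using the simd module for the matrices. The struct and helper names are assumptions. Taking the upper-left 3x3 of the model-view matrix as the normal matrix is a simplification that only holds when the transform has no non-uniform scale.

```swift
import Metal
import simd

// Hypothetical constants struct. Its layout has to match the struct that the
// shader reads at buffer index 1.
struct Constants {
    var modelViewProjectionMatrix = matrix_identity_float4x4
    var normalMatrix = matrix_identity_float3x3
}

// Builds the constants from matrices made with whatever utilities you prefer.
func makeConstants(projection: float4x4, modelView: float4x4) -> Constants {
    let c = modelView.columns
    // Assumption: no non-uniform scale, so the upper-left 3x3 is sufficient.
    let normalMatrix = float3x3(columns: (
        SIMD3<Float>(c.0.x, c.0.y, c.0.z),
        SIMD3<Float>(c.1.x, c.1.y, c.1.z),
        SIMD3<Float>(c.2.x, c.2.y, c.2.z)))
    return Constants(modelViewProjectionMatrix: projection * modelView,
                     normalMatrix: normalMatrix)
}

// Copies the small struct into an implicit buffer bound at vertex index 1.
func bindConstants(_ constants: Constants, to encoder: MTLRenderCommandEncoder) {
    var constants = constants
    encoder.setVertexBytes(&constants,
                           length: MemoryLayout<Constants>.size,
                           index: 1)
}
```

Because Metal copies the bytes at the time of the call, the struct can live on the stack and be rebuilt every frame.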

Now, last year at WWDC, we introduced an awesome framework called Model I/O, and Model I/O contains a lot of awesome utilities.

But one of the great things about Model I/O is that it also allows you to generate common shapes.

And because of MetalKit, it actually has very tight integration with Metal so that you can create vertex data that can be rendered directly by Metal.

So, instead of actually specifying all these vertices by hand, I can for example, draw my model in some sort of content creation package, export it, and load it with Model I/O.

Or in this case, generate it procedurally.

So let's take a look at that in code.

So I want to generate some vertexBuffers that represent this cube.

Well, in order to actually get Model I/O to speak in Metal, I'll create this thing called a MeshBufferAllocator.

So MTKMeshBufferAllocator is the glue between Model I/O and Metal.

By passing a device to a Mesh Buffer Allocator, we allow Model I/O to create Metal buffers directly and then hand them back to us.

So we create an MDLMesh using this utility method boxWithExtent, etcetera, pass in our allocator, and this will create an MDLMesh - a Model I/O Mesh - that contains the relevant data.

We then need to extract it by using MetalKit's utilities that are provided for this purpose.

And that looks like this.

So first, we generate an MTKMesh that takes in the MDLMesh that we just generated, as well as a device.

And then in order to extract the vertexBuffer, we just index into the mesh and pull it out.

Similarly, for the indexBuffer.

And there are also a couple of parameters here that we've already seen that we'll need to supply to our draw call.

But the emphasis here is on the fact that it's very easy to use Model I/O to generate procedural geometry and subsequently pull out buffers that you can use directly in Metal.
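As a rough sketch of those steps in Swift (the function names, the unit extent, and the single segment per axis are assumptions made for illustration):

```swift
import MetalKit
import ModelIO

// Hypothetical helper: generates a unit cube whose data lives in Metal buffers.
func makeCubeMesh(device: MTLDevice) throws -> MTKMesh {
    // The allocator is the glue that lets Model I/O create Metal buffers.
    let allocator = MTKMeshBufferAllocator(device: device)

    let mdlMesh = MDLMesh(boxWithExtent: SIMD3<Float>(1, 1, 1),
                          segments: SIMD3<UInt32>(1, 1, 1),
                          inwardNormals: false,
                          geometryType: .triangles,
                          allocator: allocator)

    // Wrap the Model I/O mesh so its buffers can be pulled out for drawing.
    return try MTKMesh(mesh: mdlMesh, device: device)
}

// Hypothetical helper: binds the vertex buffer and draws every submesh.
func drawMesh(_ mesh: MTKMesh, with encoder: MTLRenderCommandEncoder) {
    let vertexBuffer = mesh.vertexBuffers[0]
    encoder.setVertexBuffer(vertexBuffer.buffer, offset: vertexBuffer.offset, index: 0)

    for submesh in mesh.submeshes {
        encoder.drawIndexedPrimitives(type: submesh.primitiveType,
                                      indexCount: submesh.indexCount,
                                      indexType: submesh.indexType,
                                      indexBuffer: submesh.indexBuffer.buffer,
                                      indexBufferOffset: submesh.indexBuffer.offset)
    }
}
```

The submesh carries the primitive type, index count, and index type, which are the same parameters the earlier triangle draw call supplied by hand.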

And now let's talk a little bit about textures.

We have our vertex data.

We want to apply a texture map to it to add a little bit more detail.

Well, as you know, textures are blocks of memory in some pre-specified, pixel format.

And they predominantly are used to store image data.

In Metal, it's no great surprise that you create textures with a descriptor object, specifically a Metal Texture Descriptor.

And texture descriptors are parameter objects that bring together texture properties like height and width and pixel format, and are used by the device to actually generate the texture object, a Metal texture.

Let's take a look at that.

So we have these convenience functions on Metal Texture Descriptor that allow you to ask for the descriptor that corresponds to a 2D texture, supplying only the necessary parameters: height, width, pixel format, and whether or not you want it to be mipmapped.

You can then ask for a new texture by calling newTexture on the device.
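Here is a small sketch of those two steps. In current Swift the device call is spelled makeTexture(descriptor:), where the session says newTexture. The helper names and the RGBA8 pixel format are assumptions.

```swift
import Metal

// Hypothetical helper: describes a mipmapped 2D texture.
func makeTextureDescriptor(width: Int, height: Int) -> MTLTextureDescriptor {
    return MTLTextureDescriptor.texture2DDescriptor(pixelFormat: .rgba8Unorm,
                                                    width: width,
                                                    height: height,
                                                    mipmapped: true)
}

// Hypothetical helper: asks the device for a texture matching the descriptor.
// The returned texture has storage but no image content yet.
func makeEmptyTexture(device: MTLDevice, width: Int, height: Int) -> MTLTexture? {
    let descriptor = makeTextureDescriptor(width: width, height: height)
    return device.makeTexture(descriptor: descriptor)
}
```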

Now, this texture doesn't actually have any image content in it, so you'll need to use a method like Replace Region or similar.

You can consult the docs for that, but we're going to use yet another utility to make that a little bit easier today.

And that's called MTKTextureLoader.

So this is a utility provided by MetalKit, and it can load images from a number of sources, including your asset catalogs or from a file URL, or from CGImages that you already have sitting in memory, in the form of an NSImage or a UIImage.

And this generates and populates Metal textures of the appropriate size and format that correspond to the image data that you already have.

Now let's take a look at that in code.

So you can create an MTKTextureLoader by simply passing your Metal device.

You'll get back a TextureLoader, and you can subsequently fetch a data asset or what have you from your asset catalog.

And as long as you get the data back, then you can call textureLoader.newTexture, and hand it the data, and it will hand you back a Metal texture.
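A minimal sketch of that flow, assuming the image is stored as a data asset in the app's asset catalog. The function name, the error type, and the asset name passed in are hypothetical.

```swift
import MetalKit
#if canImport(UIKit)
import UIKit
#elseif canImport(AppKit)
import AppKit
#endif

// Hypothetical error for a missing asset-catalog entry.
enum TextureLoadingError: Error {
    case assetNotFound(String)
}

// Hypothetical helper: loads a data asset by name and turns it into a texture.
func loadTexture(named assetName: String, device: MTLDevice) throws -> MTLTexture {
    let loader = MTKTextureLoader(device: device)

    // Fetch the raw image data from the asset catalog.
    guard let asset = NSDataAsset(name: assetName) else {
        throw TextureLoadingError.assetNotFound(assetName)
    }

    // The loader picks the size and pixel format from the image data.
    return try loader.newTexture(data: asset.data, options: nil)
}
```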

You might also be acquainted with the notion called Samplers.

Now, Samplers in Metal are distinct objects from textures.

They're not bound together.

And Samplers simply contain the state related to texture sampling.

So parameters such as filtering modes, address modes, as well as level of detail.

And so we support all those shown here.

In order to get a Sampler state that we'll bind later on in our Render Command encoder, to do textured drawing, we'll create a Metal Sampler Descriptor, and that looks like this.

So we create an empty Metal Sampler Descriptor that has default properties, and we specify whichever properties we want.

Here, I'm specifying that we want the texture to repeat in both axes, and that when minifying, we want to use nearest filtering, and when magnifying, we want to use linear filtering.

So once we've created this descriptor object, we call newSamplerState, and we get back a Metal Sampler State Object, that we can subsequently use to bind and sample from a texture.

In the Render Command Encoder, the API looks like this.

We created a texture, so we set it at Slot Zero of the Fragment Texture Argument Table.

And then we bind our Sampler State at Index Zero of the Sampler State Argument Table for the fragmentFunction.
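The sampler setup and the two bindings just described might be sketched like this. The helper names are assumptions, and the device call is spelled makeSamplerState(descriptor:) in current Swift.

```swift
import Metal

// Hypothetical helper: repeat in both axes, nearest when minifying,
// linear when magnifying.
func makeRepeatingSamplerDescriptor() -> MTLSamplerDescriptor {
    let descriptor = MTLSamplerDescriptor()
    descriptor.sAddressMode = .repeat
    descriptor.tAddressMode = .repeat
    descriptor.minFilter = .nearest
    descriptor.magFilter = .linear
    return descriptor
}

// Hypothetical helper: asks the device for the immutable sampler state.
func makeRepeatingSampler(device: MTLDevice) -> MTLSamplerState? {
    return device.makeSamplerState(descriptor: makeRepeatingSamplerDescriptor())
}

// Binds the texture and the sampler at index 0 of their respective
// fragment argument tables.
func bindTexture(_ texture: MTLTexture,
                 sampler: MTLSamplerState,
                 to encoder: MTLRenderCommandEncoder) {
    encoder.setFragmentTexture(texture, index: 0)
    encoder.setFragmentSamplerState(sampler, index: 0)
}
```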

And let's look at those functions in turn.

So the vertex function this time around, will multiply by the MVP Matrix that we're going to get out of a constant buffer.

It will then transform the vertex positions from model local space into Clip Space, which is what we're obligated to return from the vertex function.

And it will also transform the vertex normals from model local space into Eye Space, so that we can do our lighting.

Here's what it looks like in code.

So notice that we've added a parameter attributed with Buffer 1, and like I mentioned earlier, this corresponds to the constants buffer.

So we've created a struct type in our Metal shading language code that corresponds to the Constants struct that we created in our Swift code, and that allows us to fetch out the Model View Projection and normal matrices.

And again, this is bound at Argument Table Index 1.

So that corresponds to the attribute that you see there.

So, to actually move into Clip Space, we index once again into the vertexBuffer at Vertex ID.

Get out a position vector.

Multiply it by the MVP matrix and assign it to the outgoing struct.

Similarly, for the normal.

And also, we just copy through the texture coordinates to the outgoing struct as well.

And all of these of course will be interpolated by the rasterizer.

So we just go ahead and return that struct.

The fragmentFunction is a little bit more involved than previously.

We want to actually compute some basic lighting, so we'll include two terms of ambient and diffuse lighting, and also sample from the texture [inaudible] you just bound to apply the texture to the surface.

It looks like this.

We're not going to talk through this in exacting detail, but the important thing to note here is that we've added a parameter that corresponds to the texture that we've created and bound, we've given it an access qualifier of sample which allows us to sample from it.

It's sitting at Argument Table Index Zero.

The Sampler State that we created is sitting at Argument Slot Zero for the sampler, and all we need to do to actually read a filtered value from the texture is call sample on the texture.

So tex2D.sample takes the sampler state, as well as the texture coordinates, and gives us back the color vector.

We'll also go ahead and do all of our fancy lighting, but I won't talk through it in any detail.

But it's just dependent upon the dot product between the normal and the lighting direction.

And we specified some constants related to the light earlier in our Shader file that we'll see during the demo.

And that's pretty much it.

So we constructed the color for this particular fragment by multiplying through the value that we sample from the texture, by the lighting intensity, to result in an animated textured lit cube.
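Since the slide code isn't in the transcript, here is a sketch of what the 3D vertex and fragment functions might look like. The source is again held in a Swift string and compiled at run time. The vertex layout, the light values, and all names are assumptions made for illustration, not the session's actual shader.

```swift
import Metal

// Metal shading language source for a transformed, lit, textured mesh.
// Layouts, constants, and names are assumptions made for this sketch.
let litTexturedShaderSource = """
#include <metal_stdlib>
using namespace metal;

struct Constants {
    float4x4 modelViewProjectionMatrix;
    float3x3 normalMatrix;
};

constant float3 ambientLightIntensity = float3(0.1, 0.1, 0.1);
constant float3 diffuseLightIntensity = float3(0.9, 0.9, 0.9);
constant float3 lightDirection = float3(1.0, 1.0, 1.0);

struct VertexIn {
    packed_float3 position;
    packed_float3 normal;
    packed_float2 texCoords;
};

struct VertexOut {
    float4 position [[position]];
    float3 normal;
    float2 texCoords;
};

vertex VertexOut vertex_transform(const device VertexIn *vertices [[buffer(0)]],
                                  constant Constants &constants [[buffer(1)]],
                                  uint vertexId [[vertex_id]]) {
    VertexOut out;
    float3 position = float3(vertices[vertexId].position);
    float3 normal = float3(vertices[vertexId].normal);
    out.position = constants.modelViewProjectionMatrix * float4(position, 1.0);
    out.normal = constants.normalMatrix * normal;
    out.texCoords = float2(vertices[vertexId].texCoords);
    return out;
}

fragment float4 fragment_lit_textured(VertexOut in [[stage_in]],
                                      texture2d<float, access::sample> tex2d [[texture(0)]],
                                      sampler sampler2d [[sampler(0)]]) {
    float3 surfaceColor = tex2d.sample(sampler2d, in.texCoords).rgb;
    float diffuseFactor = saturate(dot(normalize(in.normal), normalize(lightDirection)));
    float3 intensity = ambientLightIntensity + diffuseLightIntensity * diffuseFactor;
    return float4(surfaceColor * intensity, 1.0);
}
"""

// Hypothetical helper: compiles the source above into a library at run time.
func makeLitTexturedLibrary(device: MTLDevice) throws -> MTLLibrary {
    return try device.makeLibrary(source: litTexturedShaderSource, options: nil)
}
```

The packed vertex layout assumes positions, normals, and texture coordinates are interleaved in one buffer with a 32-byte stride. The layout of a generated mesh should be checked against its vertex descriptor before relying on this.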

And I will now let Matt show you exactly that.

Alright, let's take a look at this demo.

Here's the Metal texture mesh.

You can see, it's a very complicated cube.

Some simple lighting, and texturing, on a nice colored background.

Go ahead and admire it in all its glory, and now we'll take a look at the Shader.

So you can see some new stuff in our Shader compared to last time.

The first thing we take a look at is this constant struct.

This corresponds to the Swift struct that has a 4x4 Model View Projection Matrix and a 3x3 normal matrix, and those are used for the appropriate transforms.

As Warren mentioned, we have some light data here.

Ambient light intensity, which is quite low.

And the diffuse light intensity, which is quite high, and the direction of the light that we'll use to actually compute the dot product.

Our input and output structs are slightly different.

We've got a little more information that we need to pass down now.

We have position.

We have the normal, which we needed for the lighting, and the texture coordinates which we need to apply the texture.

And similarly, when we output from our vertex function, we need that same data again.

So let's take a look at the vertex function.

Just as Warren said, it's basically just a couple of simple matrix multiplies, and then a pass through for the texture coordinates.

And a quick look at our fragmentFunction, which is exactly what Warren just showed you.

Now let's see how the renderer looks.

A little more going on now.

So we have a little bit of animation.

So we need to update a little time step to know how much to rotate our cube by.

So here we have a little helper function, update with time step, to update my time step.

And that will change our constants.

Just like Warren said, we don't have much data that we'd like to send over to the GPU, so we use setVertexBytes to send a small structure over, which is the two matrices from before.

And that's what we'll use to compute the animated positions of our vertices.

Set the texture and the samplers, and issue your draws.

Highly recommend you guys always remember to push your debug groups so you know, exactly what you're looking at if you're going to look at a frame capture later on.

Present your drawable and commit, and then you're done.

Cool. Thanks again, Matt.

So with these adopting Metal sessions, we really wanted to take advantage of the fact that we've had a couple of years now, teaching Metal, and introducing awesome new utilities that make Metal easy to use.

And so we hope that this two-part session is useful for that.

You've seen that Metal is a powerful and low overhead GPU programming technology, and fortunately now, you've become acquainted with some of the APIs that are available inside of it.

Metal is very much informed by how the GPU actually operates, and, you know, philosophically, of course, we want you to push as much expensive work up front as possible.

And so you've seen sort of some of the ways that that informs the API as well.

And the emphasis of course is not on the restrictions that that entails, but of course, the power that it imbues you with.

So you've seen how explicit memory management and command submission can let you work a little bit smarter, in the sense that if you know how your application is shaped and you know what it's doing, then you can take the reins and control the GPU directly.

And of course, over the next few sessions on Metal here at WWDC this year, we'll show you even more that Metal has in store.

And then of course, it will be your turn to go and build awesome new experiences.

So for more information on this session, Session Number 602, you can go to this URL, and of course, there are some related sessions.

Part II will be happening in this very room, very shortly.

And tomorrow we have, What's New in Metal, Parts I and II.

And the Advanced Metal Shader Optimization talk that I mentioned.

So thank you, and have a wonderful WWDC.

[ Applause ]
