From Art to Engine with Model I/O

Session 610 WWDC 2017

Model I/O makes it easy to bridge the divide between artists' tools and your custom engine. See how to build a Model I/O-powered art pipeline to bring assets from content creation tools into a Metal 2-based renderer. Learn strategies for baking 3D content into structures optimal for real-time rendering, and best practices for working with animation data, common mesh and model formats, textures, and materials.

Hi everybody, I'm Nick Porcino.

I work in the Games Technologies Team, and I'm here to talk to you today about taking your art all the way from asset to engine using Model I/O. So, welcome.

So, I'd like to begin by refreshing you on what Model I/O is.

Model I/O is Apple's toolkit for building pipelines.

You can import and export 3D assets in a variety of industry standard file formats, such as common ones like Wavefront OBJ and more modern ones like Pixar's USD.

It represents geometry, materials, lighting, cameras, voxels, and all kinds of other things.

It does data format conversions, so you can get your assets coming in whatever format they're originally authored in and you can conform them to particular strides and layouts that you might need for Metal.

And, there's a variety of processing tools and we'll review them briefly at the end of the talk.

So, this year, there's a bunch of improvements in Model I/O.

I'm going to refer you to the developer website to learn more details, but just to give you a taste of what's coming up.

There's improvements to the importers.

We've got lots of feedback from people who've loaded one exotic file format or another and run into issues.

We've corrected things as needed.

We've introduced support for skinned character animation.

And that's going to come up in a bit.

We're supporting blend shapes, so if you have a character with, say, a smile and a frown, you can go from one to the other with this data structure.

And we've got transform stacks, which correspond to how data looks in a program like, for example, Maya, where the animation is going to be separated out into rotations and scales and translations and put in certain orders.

Last year, we just gave you a matrix, this year we can compose the components of the animation just the same as you do in an authoring tool.

Now, most importantly, Model I/O gives you a consistent view on your data.

So, if you open up an asset, no matter what format it was originally, Model I/O is going to normalize it so that as you traverse it, you can write one loop of code, it's going to know how to go through all of the data in a consistent way every time and for every asset that you ever load.

So, this particular asset that I've got up here on the screen is just a simple thing with a car.

It's got a camera, it's got a light.

The car is broken down into components.

It's got materials.

No matter what scene I load up, it's going to come in like this and it's going to be easy to traverse.

So, that's why Model I/O is really easy and helpful in pipelines.

Now, we want to make something that looks like this.

So it's going to be my little game.

And, the game is going to be composed of art assets that were prepared in another tool like Maya for example or Blender or whatever you like.

And there's going to be models.

There's going to be materials on those models.

There's going to be animations, textures of various sorts.

And we're going to put a scene together from a lot of different files.

Now, when an artist creates an asset, they're in a pretty sophisticated environment with a lot of tools at their disposal.

So, for an artist the art and the tool like Maya or Blender is very much like source code for a programmer.

It's got all kinds of things that are super helpful for iteration and development, but that don't actually make it into the runtime.

You don't ship your source code; you ship your object code.

So, just like you compile source into object code for your application, we're going to compile assets to be optimal for an engine.

Now, it's pretty tempting to make a nice little UI-based tool.

So maybe some Drag and Drop, and lots of things to click and slide around and so on and so forth, and it's going to be fun to build that tool and it's going to be fun to use the first few times.

But then, I did my first car, and then the artist gave me 12 more cars, and it's like really, I have to drag all of them in, and I have to click all of those buttons again and again and again?

That kind of a tool becomes overwhelming pretty quickly.

What we're going to talk about is how to scale that work using a pipeline.

So, we're going to start with the artwork.

We're going to use an exporter to turn it into an asset.

We're going to use Model I/O to transform that asset into engine-ready data.

We're going to load that engine-ready data into our engine.

We're going to convert it into Metal buffers and then we're going to make a pretty picture.

So, the very first step is to export the art.

Now, in the example that we've put together, we're using Maya and we have a little Python script that goes through Maya, finds all the things.

It traverses complex hierarchies and files and it exports an asset file.

Now, as I mentioned before, the choice of file formats is pretty important, and we're not going to use something that's, you know, a bit long in the tooth like Wavefront OBJ, we're going to use something fresh and modern which is Pixar's Universal Scene Description file format.

Now, I'm just going to say a few words about it and first of all, there is a website.

You can see the URL there.

And you can find out all the details and information that you possibly want to know about USD there.

Now, Pixar's Universal Scene Description file format has been in use at their site for years in the production of feature animation film.

And as you know, or you might recall from last year's SceneKit presentation, we integrated Universal Scene Description into the operating systems, iOS and macOS, last year.

And we've been working with Pixar to improve that integration over time and add new features.

Now, what makes Universal Scene Description like super powerful compared to what we might have had before is the fact that we can take a whole ton of files and we can compose them together to make a complex scene.

So, in this particular sample here, I've decomposed the pieces from our race track, or from our game with a race track, into the building, the race track, a tire wall, a car, and some wheels.

We're hierarchically composing that all together into a single file.

And, just like that diagram I showed earlier where Model I/O will read everything into an easy-to-traverse format, that's what we're going to get when we load this file into Model I/O.

Now, the other great thing, another great thing you get from Universal Scene Description is this idea of variations.

So, this is a really powerful tool for artists to use when they're putting a scene together.

If I want to have lots of cars on that race track, what I can do with USD is make a single car file, and I can have some shading variations.

You can see them here, yellow, green, red.

And modeling variations, with a fin and not with a fin.

So, I can reference them into the file, pick which versions that I want, and then Model I/O will essentially flatten that all down so that when you're traversing the data structures and looking to the data to make your buffers, it's going to find the right things for you.

And I also want to mention that Universal Scene Description has an ASCII format and a fast binary format.

So, the mesh with all of the thousands of vertices, normals, and so on, I'm going to export as binary.

But on the other hand, if I just want to noodle around with the data, I can write text that looks like this. Here I've got a world with an animation on a car; the car is referenced, and I'm actually changing just the color on this car. I put it in the animation scene, and it's not animated, but you know, it's the general principle here.

And this is really powerful for just making variations, iterating on your assets offline, and experimenting.

So, what we're going to build is a little tool that's going to take that Universal Scene Description asset and turn it into engine-ready data.

So, command line tool.

The thing that we're going to get from having a command line tool is that it's going to be repeatable.

It's repeatable because it's got command line arguments.

The operation of tools can be consistent because we've got well defined inputs and outputs and parameters.

It's going to be scriptable.

So, you can batch your tools, you can sequence your tools.

It's going to give us the scalability that we didn't have with the graphical user interface tool, because we can have automation without intervention.
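
As a rough illustration of why well-defined command line arguments make a tool repeatable and scriptable, here is a minimal sketch of argument parsing for such a baker. The `--input` and `--output` flag names, the `BakerOptions` type, and the `parseBakerOptions` function are all hypothetical; they are not taken from the session's sample code.

```swift
import Foundation

// Hypothetical options for one baker run; the flag names are assumptions.
struct BakerOptions: Equatable {
    var input: String
    var output: String
}

// Parses arguments of the form: baker --input <asset> --output <file>.
// Returns nil when a flag is unknown or a required value is missing.
func parseBakerOptions(_ arguments: [String]) -> BakerOptions? {
    var input: String?
    var output: String?
    var index = 1 // skip the executable name
    while index < arguments.count {
        let flag = arguments[index]
        let hasValue = index + 1 < arguments.count
        switch flag {
        case "--input" where hasValue:
            input = arguments[index + 1]
            index += 2
        case "--output" where hasValue:
            output = arguments[index + 1]
            index += 2
        default:
            return nil // unknown flag or missing value
        }
    }
    guard let input = input, let output = output else { return nil }
    return BakerOptions(input: input, output: output)
}
```

In a real tool the array would come from `CommandLine.arguments`, so a script can run the same invocation over every asset in a folder and get the same result each time.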

For example, the artist might have a Dropbox folder somewhere, where they're just going to throw all of their assets in whenever they're ready to be integrated into the engine, we can have a little script monitoring that directory.

Whenever it notices there's new files, the processes that we're going to talk about in a moment can automatically run, make it engine-ready, and then move it so that when you build in Xcode, all the assets will be automatically imported and ready for your game.

And finally, this kind of tool is composable, and that's what the little arc on the right side of the diagram indicates.

It's composable in the sense that if I've got multiple tools and they can all read and write the same blocks of data, then I can feed one tool into another.

I might do one tool to extract all the meshes and light map them.

I might do another tool to find all the textures and make a list of textures.

And I make lots of little tools like that and put them together into a workflow.

So, in this sample, it's a very simple one.

And it doesn't go into a whole lot of you know best practices but it does give you a simplified data format that's really easy to understand so that you can match what we're talking about versus what you see in the code.

There's no compression because you don't want to obscure anything about what's going on.

This thing's intended to be a jumping off point for you to start to build your own pipeline tools and elaborate them to match your own engines and data formats.

So, I'd also like to talk a little bit about the toy engine that we're going to put together.

It's got a really simple renderer in it.

It's all written in Swift and Metal.

It's a single-pass forward renderer, physically-based shader, mesh instancing, skinned and animated meshes, multiple materials.

Ticking off lots of bullet points there.

It's got a straightforward rendering loop.

We have on the left side of the diagram, some meshes to draw.

For everything that we want to draw, we're going to set transform buffer, skinning data, vertex buffers, set our pipeline state, material uniforms, fragment textures.

And we're going to draw indexed primitives to make the pretty picture.

So, we're going to call that tool the baker.

So what are we going to bake?

We're going to bake, first of all, geometry and transformations, what something looks like and where it sits.

Texture paths and materials so that when we're drawing those somethings we know what they're going to look like.

Instancing data, so if we have, say, more than one wheel, we're going to have information that tells us where to put many copies of that wheel in an efficient way.

Transform animation, so that things can be animated.

And finally, we're going to talk about skinning and character animation.

So, first of all, geometry and transformations.

Now, from basic computer graphics you're probably familiar with a scene graph.

We've got a transformational hierarchy here.

A couple of transform nodes A and B.

And A is perhaps a world node, and maybe it doesn't provide anything but an identity transformation.

B is a transformation that says where is the car in the world.

And my racing car has one wheel; pretend you can see all the other wheels off the bottom of my slide here.

And we've got a body.

So, the way a transform hierarchy works is if I move the parent node everything underneath that parent node moves together as a unit.

So, what we want to do is we want to get that information into our engine so that it can render it ultimately.

So we want to compactly encode that in a way that's easy to store and easy to read and doesn't require me to fix up pointers or any of that kind of thing.

So, what I'm going to do is I'm going to flatten and linearize, linearize that array.

I hope I don't have to say that word again.

So, we're going to make an array of local transformations.

So the first one, A, is probably identity, the world matrix.

Then another matrix to tell us where is the car.

Another matrix to tell me where is the wheel.

Another matrix to tell me where is the car's body.

I'm going to assign indices because those are going to be really useful in a moment.

So, 0, 1, 2, 3 that was the in-order traversal of the scene graph.

Now, I'm going to encode the tree.

So, I'm going to make another array of parent indices.

The first entry is nil.

The world has no parent.

The car's root B has a parent at index 0, which is the world.

The wheel is parented under the B transform node, as would the other wheels be.

The body is also parented under that node.

So, we've encoded a graph.
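
To make that encoding concrete, here is a minimal sketch of flattening a small tree into a node array plus a parallel array of parent indices. The `Node` class is a toy stand-in for the objects a baker would walk in an imported asset, and `FlatHierarchy` and `flatten` are hypothetical names. The sketch assumes a depth-first traversal that visits each parent before its children.

```swift
// A toy scene-graph node; a stand-in for the objects in an imported asset.
final class Node {
    let name: String
    var children: [Node]
    init(_ name: String, children: [Node] = []) {
        self.name = name
        self.children = children
    }
}

// Flat arrays produced by the traversal; both are indexed by visit order.
struct FlatHierarchy {
    var names: [String] = []        // which node sits at each index
    var parentIndices: [Int?] = []  // nil marks a node with no parent
}

func flatten(_ root: Node) -> FlatHierarchy {
    var flat = FlatHierarchy()
    // Visit a node, record it, then visit its children with this node's index.
    func visit(_ node: Node, parent: Int?) {
        let index = flat.names.count
        flat.names.append(node.name)
        flat.parentIndices.append(parent)
        for child in node.children {
            visit(child, parent: index)
        }
    }
    visit(root, parent: nil)
    return flat
}

// The slide's hierarchy: world A, car root B, then a wheel and the body.
let scene = Node("A", children: [Node("B", children: [Node("wheel"), Node("body")])])
let flat = flatten(scene)
```

Because the result is just two arrays of plain values, it can be written to disk and read back without fixing up any pointers.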

Finally, we're going to want to be able to tell the engine what to draw, so we're going to draw a wheel, which is at index 2.

And we're going to draw the body which is at index 3.

So, we've described our scene in a way that's really easy to write out.
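
On the engine side, a flattened scene like this can be resolved into world transforms in one pass. The following is a sketch under stated assumptions: matrices are plain row-major arrays of 16 floats standing in for a simd matrix type, the example uses translations only, and `worldTransforms` is a hypothetical helper rather than the sample engine's code.

```swift
// Row-major 4x4 matrix stored as 16 floats; a stand-in for a simd matrix.
typealias Matrix4 = [Float]

func translation(_ x: Float, _ y: Float, _ z: Float) -> Matrix4 {
    return [1, 0, 0, x,
            0, 1, 0, y,
            0, 0, 1, z,
            0, 0, 0, 1]
}

func multiply(_ a: Matrix4, _ b: Matrix4) -> Matrix4 {
    var result = Matrix4(repeating: 0, count: 16)
    for row in 0..<4 {
        for col in 0..<4 {
            var sum: Float = 0
            for k in 0..<4 {
                sum += a[row * 4 + k] * b[k * 4 + col]
            }
            result[row * 4 + col] = sum
        }
    }
    return result
}

// Assumes parents precede their children in the flattened order, so a single
// forward pass can resolve every world transform.
func worldTransforms(locals: [Matrix4], parents: [Int?]) -> [Matrix4] {
    var world: [Matrix4] = []
    for (index, local) in locals.enumerated() {
        if let parent = parents[index] {
            world.append(multiply(world[parent], local))
        } else {
            world.append(local)
        }
    }
    return world
}

let locals = [translation(0, 0, 0),   // 0: world (identity)
              translation(10, 0, 0),  // 1: car root
              translation(1, 0, 2),   // 2: wheel, relative to the car
              translation(0, 1, 0)]   // 3: body, relative to the car
let parents: [Int?] = [nil, 0, 1, 1]
let meshNodeIndices = [2, 3]          // the nodes the engine should draw
let world = worldTransforms(locals: locals, parents: parents)
```

Moving the car root at index 1 changes the world transforms of both the wheel and the body, which is the "moves together as a unit" behavior of the hierarchy.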

And now, we also just need to tell what to draw.

So, that's going to be a vertex descriptor, which is an array that's going to tell Metal these are normals, these are texture coordinates, these are positions, the actual vertex buffers themselves.

And then the index buffer that just says, you know, these indices correspond to these triangles in the vertex buffers.
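
As a small illustration of what a vertex descriptor conveys, here is a sketch that computes the stride and attribute offsets of an interleaved vertex and shows how an index buffer groups vertices into triangles. The `VertexAttribute` and `VertexLayout` types are hypothetical simplifications, not Model I/O's or Metal's descriptor classes, and the sketch assumes every attribute is made of 32-bit floats.

```swift
// One attribute within an interleaved vertex, measured in 32-bit floats.
struct VertexAttribute {
    let name: String
    let floatCount: Int
}

struct VertexLayout {
    let attributes: [VertexAttribute]

    // Bytes occupied by one complete vertex.
    var stride: Int {
        return attributes.reduce(0) { $0 + $1.floatCount } * MemoryLayout<Float>.size
    }

    // Byte offset of the named attribute from the start of a vertex.
    func offset(of name: String) -> Int? {
        var offset = 0
        for attribute in attributes {
            if attribute.name == name { return offset }
            offset += attribute.floatCount * MemoryLayout<Float>.size
        }
        return nil
    }
}

let layout = VertexLayout(attributes: [
    VertexAttribute(name: "position", floatCount: 3),
    VertexAttribute(name: "normal", floatCount: 3),
    VertexAttribute(name: "textureCoordinate", floatCount: 2)
])

// Two triangles sharing an edge: four vertices, six indices.
let indices: [UInt16] = [0, 1, 2, 2, 1, 3]
let triangleCount = indices.count / 3
```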

So, it's really easy to do this with our consistent data structures.

We're going to run through all of the objects in the MDLAsset object after we've imported it.

If the object can be cast to MDLMesh, we're going to fetch out the vertexDescriptor which tells us that we've got positions and normals.

For all of the vertex buffers, that are in that mesh, we're just going to create an NSData from the vertex buffer bytes and how long it is, and the dot dot dot just tells us that we're going to store that NSData somewhere for encoding.

Then, we're going to run through the submeshes, and just a quick note about what a submesh is.

And since I've just introduced that word, if we just think about that wheel, the wheel had a rubber tire and a metal rim, right?

So, we're probably going to have two materials and therefore two draw calls.

But it's one mesh.

They're going to share a lot of vertices, like on the intersection between the tire and the rim.

So, we're going to make two submeshes to just index the rim and the tire with their own independent index buffers and meshes.
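
The wheel example can be sketched as one shared pool of vertices with a separate index list per material. The `Submesh` and `Mesh` types below are hypothetical simplifications of that idea, and the index values are made up for illustration.

```swift
// One index list and one material per submesh; vertices live in the mesh.
struct Submesh {
    let materialName: String
    let indices: [UInt32]
}

struct Mesh {
    let vertexCount: Int
    let submeshes: [Submesh]

    // Each submesh becomes one indexed draw call with its own material.
    var drawCallCount: Int { return submeshes.count }

    // Vertices referenced by more than one submesh, such as those on the seam.
    var sharedVertexIndices: Set<UInt32> {
        var seen = Set<UInt32>()
        var shared = Set<UInt32>()
        for submesh in submeshes {
            let used = Set(submesh.indices)
            shared.formUnion(used.intersection(seen))
            seen.formUnion(used)
        }
        return shared
    }
}

// Vertices 2 and 3 sit where the tire meets the rim and are used by both.
let wheel = Mesh(vertexCount: 6, submeshes: [
    Submesh(materialName: "rubber", indices: [0, 1, 2, 2, 1, 3]),
    Submesh(materialName: "metal", indices: [2, 3, 4, 4, 3, 5])
])
```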

So, we run through them, we cast them.

If we successfully cast them, we create some NSDatas and stash them for storage.
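
Creating a data object from a buffer's bytes and length might look like the following sketch. It uses Foundation's `Data` and a plain Swift array as a stand-in for a mapped mesh buffer, so the `makeData` helper and the sample positions are assumptions for illustration only.

```swift
import Foundation

// Copies `length` bytes starting at `bytes` into a Data value. In a baker,
// the pointer and length would come from a mapped vertex or index buffer.
func makeData(bytes: UnsafeRawPointer, length: Int) -> Data {
    return Data(bytes: bytes, count: length)
}

// Stand-in for a vertex buffer: three positions as tightly packed floats.
let positions: [Float] = [0, 0, 0,  1, 0, 0,  0, 1, 0]
let vertexData = positions.withUnsafeBytes { (buffer: UnsafeRawBufferPointer) -> Data in
    makeData(bytes: buffer.baseAddress!, length: buffer.count)
}

// Reading the floats back shows that the copy preserved the layout.
let restored: [Float] = vertexData.withUnsafeBytes { (raw: UnsafeRawBufferPointer) in
    Array(raw.bindMemory(to: Float.self))
}
```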

Finally, for all of the objects in the asset, we're just going to find if there's a transform on the object, then grab its matrix and store it in the array.

We'll NSEncode and archive it, and these are the buffers that are going to go off to the disk.

There's the mesh data (descriptors, vertex buffers, and index buffers), the scene data, which is the linearization of the indices of the hierarchy, and finally the transformation data, the actual places to put those things.
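
Here is one way the archive step could be sketched. Note the swap: instead of the keyed archiving mentioned in the talk, this uses Swift's `Codable` with `JSONEncoder`, purely because it keeps the example short and self-contained. The `BakedScene` type and its field names are hypothetical, and a vertex descriptor field is left out for brevity.

```swift
import Foundation

// Hypothetical container for the buffers described above.
struct BakedScene: Codable, Equatable {
    var vertexBuffers: [Data]       // raw vertex bytes, one entry per buffer
    var indexBuffers: [Data]        // raw index bytes, one entry per submesh
    var parentIndices: [Int?]       // flattened hierarchy; nil marks the root
    var meshNodeIndices: [Int]      // which nodes the engine should draw
    var localTransforms: [[Float]]  // 16 floats per node
}

func archive(_ scene: BakedScene) throws -> Data {
    return try JSONEncoder().encode(scene)
}

func unarchive(_ data: Data) throws -> BakedScene {
    return try JSONDecoder().decode(BakedScene.self, from: data)
}

let identity: [Float] = [1, 0, 0, 0,  0, 1, 0, 0,  0, 0, 1, 0,  0, 0, 0, 1]
let exampleScene = BakedScene(
    vertexBuffers: [Data([1, 2, 3, 4])],
    indexBuffers: [Data([0, 1, 2])],
    parentIndices: [nil, 0, 1, 1],
    meshNodeIndices: [2, 3],
    localTransforms: [identity, identity, identity, identity])
```

The archived `Data` can then be written out with `write(to:)` and decoded by the engine at load time; a production pipeline would likely prefer a more compact binary encoding.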

Next, we're going to fetch out all of the material data.

So, for every submesh, we're going to find out that there's possibly a material.

And if there's a material, we're going to find the parameters that are needed by our shader.

If our shader needs, say, diffuse color and roughness, then we'll ask the MDLMaterial, hey, have you got those values, and then we're going to check, is it a scalar value or is it a texture, and we're going to grab that out and record it.

Once again, the code's really straightforward.

If the submesh has a material, run through all the properties.

What I'm not showing you here is just filtering out the ones that we actually care about for our run-time shader.

But once we've got to the point of filtering it out, we're then going to say, hey property are you a string or URL?

If so, we're referring to a texture and we'll just write out the texture path for later.

Otherwise we're going to check and are you just a uniform property like are you a float value or a color or something like that, and if that's what we've found then we'll write that out.
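
The decision described here, texture path versus uniform value, can be sketched with two small enums. `PropertyValue` is a hypothetical stand-in for the value a material property carries, and `BakedProperty` is a hypothetical record of what the baker writes out; neither is a Model I/O type.

```swift
import Foundation

// Stand-in for the value carried by a material property.
enum PropertyValue {
    case string(String)
    case url(URL)
    case float(Float)
    case float3(Float, Float, Float)
}

// What the baker records for the engine.
enum BakedProperty: Equatable {
    case texturePath(String)
    case uniform([Float])
}

func bake(_ value: PropertyValue) -> BakedProperty {
    switch value {
    case .string(let path):
        return .texturePath(path)        // strings are treated as texture paths
    case .url(let url):
        return .texturePath(url.path)    // URLs also refer to textures
    case .float(let scalar):
        return .uniform([scalar])        // for example, roughness
    case .float3(let r, let g, let b):
        return .uniform([r, g, b])       // for example, a base color
    }
}
```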

And then once again, here's the data that we wrote out in the previous step which was the scene graph and the mesh.

Now we'll write out the material uniforms and the texture paths.

Finally, instancing.

So, this is where our car gets to have more than one wheel I think.

So, you're going to probably want to use a single mesh more than one time.

So, now my car has two wheels.

Now, it's kind of a waste to store that in memory more than once, right?

It's the same wheel multiplied lots of times.

So, Model I/O has a thing on the MDLAsset called a masters array.

When you load one of Pixar's USD files that uses instancing to replicate data, Model I/O notices that and collects all of those replicated objects into the masters array. Instead of storing the individual meshes in the nodes, it just stores an MDLObject that refers back to the masters array.

And so that's how we get reuse.

And since Metal has great instancing facilities, that's going to stand us in good stead.

So, once again, we're going to flatten the hierarchy and linearize the arrays, just as we did before.

And now on the right, you can see I've got two wheels and the body in that array at indices 2, 3, and 4.

But we want to batch those things together, so let's just go ahead and do that.

So, we've grouped the tires together, and now we've got the body down on the bottom.

So, we're just going to store a tiny little bit more data, which is the instance count.

There's two wheels and one body.

Eventually there'll be four wheels, trust me.

So, we've got the data that we stored out earlier, and to the scene composition data we're just going to add the instance count.

And that's all it takes to get an instanced scene with lots of materials and objects.
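
Grouping instances by mesh and counting them could be sketched as follows. The `Instance` and `InstanceBatch` types and the `batchInstances` function are hypothetical, and the mesh indices are assumed to refer to entries in a list of shared master meshes.

```swift
// One placed object: which shared mesh it uses and which node positions it.
struct Instance {
    let meshIndex: Int
    let nodeIndex: Int
}

// A run of instances sharing one mesh, ready for a single instanced draw.
struct InstanceBatch: Equatable {
    let meshIndex: Int
    let nodeIndices: [Int]
    var instanceCount: Int { return nodeIndices.count }
}

func batchInstances(_ instances: [Instance]) -> [InstanceBatch] {
    // Sorting by mesh (then node) puts every use of a mesh side by side.
    let sorted = instances.sorted {
        ($0.meshIndex, $0.nodeIndex) < ($1.meshIndex, $1.nodeIndex)
    }
    var batches: [InstanceBatch] = []
    for instance in sorted {
        if let last = batches.last, last.meshIndex == instance.meshIndex {
            // Same mesh as the previous instance: grow the current batch.
            batches[batches.count - 1] = InstanceBatch(
                meshIndex: last.meshIndex,
                nodeIndices: last.nodeIndices + [instance.nodeIndex])
        } else {
            batches.append(InstanceBatch(meshIndex: instance.meshIndex,
                                         nodeIndices: [instance.nodeIndex]))
        }
    }
    return batches
}

// Mesh 0 is the wheel and mesh 1 is the body; nodes 2 and 4 are wheels.
let batches = batchInstances([
    Instance(meshIndex: 0, nodeIndex: 2),
    Instance(meshIndex: 1, nodeIndex: 3),
    Instance(meshIndex: 0, nodeIndex: 4)
])
```

Each batch's instance count is the number an engine would pass to an instanced draw call for that mesh.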

Now, with that I'd like to hand it off to Nicholas to show you how that's all starting to come together.

[ Applause ]

I'd like to show you how easy it is to take our assets and turn them into engine-ready data using Model I/O.

I have here two folders and in the first folder we have our art assets.

It contains animation data, cars, it has skinned animation, it has a bunch of materials.

So what we want to do is we want to create a baker that takes this data, turns it into engine data, and puts it into this second folder here.

So, what we have here is our baker project and what we're going to do is we're going to slowly extend it to extract more and more data out of those art assets.

So starting out with the simplest, let's export the geometry and transforms.

To do that we're going to walk the scene graph hierarchy and look for any object of MDLMesh type.

Then we're going to store the vertexDescriptor, all the vertexBuffers, and then we're going to iterate through all the submeshes and grab the index buffers.

We're going to walk through the scene graph once again, and this time we're going to look for any objects that have a transform component.

If they do, then we simply store the matrix.

And that's all for the first example.

And so let's go ahead and run this.

And you'll notice in that second folder we now have a new file and that's our engine-ready data.

This second project here will be our engine and it's going to read in that data and it's going to render it.

So let's see what we have so far.

So as you notice, we have two cars on a race track, but there's no color.

So let's extend the baker to also support materials.

So in addition to looking for the index buffers on the submesh, we're going to look to see if it has a material property.

If it does, then there are five properties that we care about, associated with the semantics baseColor, metallic, roughness, bump, and ambientOcclusion.

And when we read in the properties, we're going to iterate through all of them and check the type.

If the type is of float or float3 we assume it's a uniform and we're going to record that.

Otherwise, if it's a string or a URL we're going to record the texture paths.

So now, let's run this second example and see what kind of output we get in our engine.

So now we have a race track and two cars, but now with materials.

Let's further extend this and support instancing.

So, before, we only considered meshes while traversing the scene graph.

But now, we want to consider all meshes that live in the masters array of the asset.

So we walk through the masters, collecting all objects that are of MDLMesh type and storing them just like we did before.

In addition, we also need to record all objects that refer to those masters.

And we can find that on the instance property.

We then sort the instances by mesh, and then grab the instance count.

And that's it.

So, let's go ahead and run our scene again.

And now we have multiple cars, rendering using instancing.

Back to you Nick.

[ Applause ]

Next, we're going to talk about transform animation.

So, transform animation is transforms that vary over time.

So, let's just consider our little simple scene graph again.

Now, I've got a car sitting on the start line, and I'm just going to want to do an animation where maybe the body is going to wiggle a little a bit before it starts, and then the car is going to drive away.

So in order to accomplish that, I'm going to need to record some animation data on the body node D and on the root transform of the whole object to move him away, which is B.

I'm going to record out animation tracks for both of those two nodes.

And once again with Model I/O, that's really easy to do.

As before, we run over all of the objects looking for the transform components, and when we find them we're going to be appending them.

But now we're going to do one more thing, which is, we're going to ask the transform if it's got any keyed times on it.

Now, if there's no keyed time, that's 0 count, then we're just going to use it as is.

And if there's 1 keyed time, we're just going to treat it as constant.

So we're just looking for counts that are greater than 1.

So, we're going to actually use a really exotic and cool piece of Swift here.

I like this part.

We're going to use a map closure.

So, what we're going to do is we're going to sample the animation at times and we're going to create a new array of transforms corresponding to those times that we'll append to our buffer.

And to just pick apart that mapping operation a little bit, the first line says sampleTimes.map.

The thing that's not showing on the slide is where did sampleTimes come from?

So, it's another array of keyTimes and you can do two things here.

One is you can make the sampleTimes array just the transform.keyTimes array. Or, instead of getting only the times that the artist put in the file, if you want to, for example, sample at a constant frame rate, you could synthesize an array of times at the frame rate that you care about, say 60 frames a second.

And so, when you do this mapping operation, the closure takes the transform and gets the local transform from it at the times corresponding to the values in the array.
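
The shape of that mapping can be sketched without Model I/O. In the sketch below, `Track` is a hypothetical keyframed scalar standing in for an animated transform, its linear interpolation is an assumption made for illustration, and `sampleTimes(from:to:framesPerSecond:)` is a hypothetical helper that synthesizes times at a constant frame rate.

```swift
import Foundation

// A keyframed scalar track standing in for a transform's animation.
struct Track {
    let keyTimes: [TimeInterval]
    let values: [Double]  // assumed to have one value per key time

    // Only more than one key means the value actually varies over time.
    var isAnimated: Bool { return keyTimes.count > 1 }

    // Linear interpolation between neighbouring keys, clamped at both ends.
    func value(atTime time: TimeInterval) -> Double {
        guard let first = keyTimes.first, let last = keyTimes.last else { return 0 }
        if time <= first { return values[0] }
        if time >= last { return values[values.count - 1] }
        for i in 1..<keyTimes.count where time < keyTimes[i] {
            let t = (time - keyTimes[i - 1]) / (keyTimes[i] - keyTimes[i - 1])
            return values[i - 1] + t * (values[i] - values[i - 1])
        }
        return values[values.count - 1]
    }
}

// Evenly spaced sample times covering [start, end] at the given frame rate.
func sampleTimes(from start: TimeInterval, to end: TimeInterval,
                 framesPerSecond: Double) -> [TimeInterval] {
    let frameCount = max(0, Int(((end - start) * framesPerSecond).rounded()))
    return (0...frameCount).map { start + Double($0) / framesPerSecond }
}

let track = Track(keyTimes: [0, 1], values: [0, 10])
let times = sampleTimes(from: 0, to: 1, framesPerSecond: 4)
// The map closure asks the track for its value at each sample time.
let samples = times.map { track.value(atTime: $0) }
```

Passing `track.keyTimes` instead of `times` would reproduce only the keys the artist authored, which is the other option mentioned above.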

I thought that was really cool.

So, here's the data that we've output already.

And it's straightforward to just encode the animated local transforms.

So, finally, skinning and character animation.

So, we're just going to take a little car here and he's a cartoon car apparently.

We're going to make him able to wiggle his nose and otherwise animate.

So, as we've seen before, the mesh is going to have geometry and all the same sorts of buffers and things that we've already talked about, but it has a new thing which is an embedded skeleton.

I hope you can see the little green bones.

They are going out into the wheels and there's some going down the spine of the car.

Now, those bones are bound to the vertices through a painting process that the artists do.

So, the one on the left has got a bone that's bound to the front of the car and heavily weighted to the bumper and the nose.

And then the one on the right we've selected one of the bones on the back of the car connected to the wheel, so when that one moves it's going to affect the wing on the back of the car and that wheel.

So, I should also mention that that kind of data requires a little bit of extra work in your shader.

We've got some more information coming along for the ride that we didn't have before.

And specifically, we've got the jointWeights per vertex and the jointIndices which are a small array of indices to what we'll call the matrix palette of joints that correspond to the vertex positions.

So, if two joints or bones are influencing a particular vertex, then the indices of those bones will come along with the vertex, with some weights, so that when I transform them in the shader it'll all be combined together and the vertex will move to its final deformed position.
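
The per-vertex blend can be sketched on the CPU as follows, assuming the common linear blend approach where each influencing joint transforms the vertex and the results are mixed by weight. The real work happens in a vertex shader; the `SkinnedVertex` type, the `skin` function, and the row-major `Matrix4` arrays here are hypothetical stand-ins.

```swift
// Row-major 4x4 matrix as 16 floats; a stand-in for a simd matrix.
typealias Matrix4 = [Float]
typealias Point = (x: Float, y: Float, z: Float)

// Applies the matrix to a point, treating the point as having w = 1.
func transform(_ m: Matrix4, _ p: Point) -> Point {
    return (m[0] * p.x + m[1] * p.y + m[2] * p.z + m[3],
            m[4] * p.x + m[5] * p.y + m[6] * p.z + m[7],
            m[8] * p.x + m[9] * p.y + m[10] * p.z + m[11])
}

// Per-vertex skinning inputs: parallel arrays of palette indices and weights.
struct SkinnedVertex {
    let position: Point
    let jointIndices: [Int]
    let jointWeights: [Float]
}

// Transform the vertex by each influencing joint, then mix by weight.
func skin(_ vertex: SkinnedVertex, palette: [Matrix4]) -> Point {
    var result: Point = (0, 0, 0)
    for (joint, weight) in zip(vertex.jointIndices, vertex.jointWeights) {
        let moved = transform(palette[joint], vertex.position)
        result.x += weight * moved.x
        result.y += weight * moved.y
        result.z += weight * moved.z
    }
    return result
}

let identity: Matrix4 = [1, 0, 0, 0,  0, 1, 0, 0,  0, 0, 1, 0,  0, 0, 0, 1]
let raised: Matrix4   = [1, 0, 0, 0,  0, 1, 0, 2,  0, 0, 1, 0,  0, 0, 0, 1]
// A vertex influenced equally by a joint at rest and a joint moved up by 2.
let vertex = SkinnedVertex(position: (1, 0, 0), jointIndices: [0, 1], jointWeights: [0.5, 0.5])
let skinned = skin(vertex, palette: [identity, raised])
```

With equal weights the vertex ends up halfway between where each joint alone would have placed it.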

So, there's more data involved in order to encode the skeleton separately from the geometry and other transformations that are going on.

And that's the skeleton down there at the bottom of the diagram.

Let's just isolate it.

So, as we did before, we're going to traverse the graph and assign indices according to traversal order 0, 1, 2, 3.

We're going to encode the parents in the skeleton graph just as we did for the geometry graph.

And so, I'm not going to go into how it works in particular, because it's exactly the same as before.

Now, we're also going to encode for each of the bones that actually influences a vertex, the index of that bone in the hierarchy and the inverse bind pose.

On the previous slide, with the shader there is a bit of math there that referred to some sort of a palette matrix.

The inverse bind pose is going to be an extra bit of math that you need in order to get the vertices into the right space to be easily blended.
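
One common way to combine these pieces, shown here as a sketch rather than the sample's exact math, is to multiply each joint's current world transform by its inverse bind matrix to get that joint's palette entry. The example uses translation-only matrices so the inverse bind matrix can be written down directly; `matrixPalette` and the row-major `Matrix4` arrays are hypothetical.

```swift
// Row-major 4x4 matrix as 16 floats; a stand-in for a simd matrix.
typealias Matrix4 = [Float]

func translation(_ x: Float, _ y: Float, _ z: Float) -> Matrix4 {
    return [1, 0, 0, x,
            0, 1, 0, y,
            0, 0, 1, z,
            0, 0, 0, 1]
}

func multiply(_ a: Matrix4, _ b: Matrix4) -> Matrix4 {
    var result = Matrix4(repeating: 0, count: 16)
    for row in 0..<4 {
        for col in 0..<4 {
            var sum: Float = 0
            for k in 0..<4 {
                sum += a[row * 4 + k] * b[k * 4 + col]
            }
            result[row * 4 + col] = sum
        }
    }
    return result
}

// palette[i] = jointWorld[i] * inverseBind[i]. The inverse bind matrix moves a
// vertex into the joint's space; the world matrix moves it to the current pose.
func matrixPalette(jointWorld: [Matrix4], inverseBind: [Matrix4]) -> [Matrix4] {
    return zip(jointWorld, inverseBind).map { pair in multiply(pair.0, pair.1) }
}

// The joint was bound at x = 2, so its inverse bind matrix translates by -2.
let inverseBind = [translation(-2, 0, 0)]
// At the bind pose the palette entry is the identity: the mesh is undeformed.
let restPalette = matrixPalette(jointWorld: [translation(2, 0, 0)], inverseBind: inverseBind)
// After the joint moves up by 1, the palette entry is just that upward move.
let posedPalette = matrixPalette(jointWorld: [translation(2, 1, 0)], inverseBind: inverseBind)
```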

And I'd refer you to the sample for the details of that transformation.

We're going to go through each one of those and store those matrices and indices.

And then finally, to make an animation clip, we're going to record the animation that corresponds to each one of those bones in the clip.

So, in code it looks very much like what we've seen before.

We're going to go through the object and find out if the object has skin.

The skin, corresponding to the skeleton et cetera, is encoded in Model I/O's new MDLSkinDeformerComponent.

So if we found a skin deformer component, we're once again going to take advantage of this Swift map closure to take the jointBindTransforms that Model I/O read from the file and stored.

We're going to use this simd inverse to invert all of them because that's what the math needs, and store it in an array.

And then, here's all the data that we've stored to date.

And we're going to put out the skeletal data, the inverse bind transforms, and the joint to palette mapping, and the skeleton parent indices.

And so, I'd like to ask Nicholas to come up again and show us what it looks like now.

[ Applause ]

So where we last left off, we had a race track with multiple cars rendering using instancing.

Now, in addition let's also support animation.

So, before when we traversed the scene graph and we looked for any object that had a transform component, we assumed it was constant.

Now we want to know if the transform is time varying.

And the easiest way to find that out is to see if the keyTimes.count is greater than 1.

If it is, then for the purposes of this sample, we're going to sample it in regular intervals.

So, we sample them and then we store them.

And that's it for animation.

So, let's go ahead and run this example and see what kind of output we get in our engine.

So, now you notice that the front cars are taking off.

So, finally let's go ahead and add in skinning.

Let's extend the baker to support skinning now.

So, in addition to any mesh data that you may need, there's additional skin data that you might need, and so we check to see if an MDLMesh has a component conforming to MDLSkinDeformer.

If it does, then there are two bits of additional information we need: we need to know how the skeleton is bound to the skin mesh, and we need to know its animation data.

So, we find the paths of all the bound skeleton joints in the jointPaths array.

And then, we find the bind pose of the skeleton in the jointBindTransforms.

So, now that we know what our skeleton looks like, let's go ahead and time sample the skeleton's joint transforms just like we did the objects' local transforms.

So we time sample it in regular intervals, store the matrix, and then we decompose it into a quaternion rotation, translation, and we store it in an animation clip.

So, let's go ahead and run this example.

So now we have a skinned car.

So, to recap on what we've done.

We were able to construct a simple baker using Model I/O that exported geometry and transforms, and with a bit of code we were able to extend it to support materials, instancing, animation, and skinned animation.

All 5 of these examples and the engine are available in this session's sample code, which you can modify for your own engine's needs.

Back to you Nick.

[ Applause ]

All right then.

So, quick recap.

We've shown taking artwork all the way from your asset creation program to Pixar's Universal Scene Description file format.

We used Model I/O to transform that asset into engine-ready data and we encoded it and archived it off to the disk.

We put together a little game engine with a simple renderer using Swift and Metal.

And we loaded all that data up and we animated it and we drew some pretty pictures.

So, what's next?

Well, I'd encourage you to have a look at the other facilities that Model I/O's got built into it.

There's a whole ton of tools that are useful for building your own tools.

Your own tools for that pipeline chain.

For example, if you have a scene composed of a bunch of objects, you can perform a light mapping operation.

Model I/O will cast lots of rays, it'll bounce light around.

It'll make a parameterization for the scene, store the data all out for you.

It's got tools to do things like UV unwrapping, so we've taken a little airplane and carved it apart into the logical chunks that are ready for painting.

It has other operations like here we're calculating ambient occlusion.

So, we've taken the little airplane, we've done ray casting to compute accessibility of the surface from the outside and encoded that as a signal on the surface of the plane, so that your shader can render an object to be more physically grounded in your scene.

Here's another fun thing that you can do.

We've got all kinds of tools for dealing with 360-degree imagery, which would be very helpful for like making panoramic spheres for VR and whatnot.

So, on the very left there you can see a 360-degree picture that was taken with you know one of those funny little cameras.

And Model I/O can convert it to a cube map ready for hardware.

It can also take a cube map and convert it back to that other format.

And then the two blurry columns down on the side are precomputing the irradiance convolution for physically-based shading for you.

So, we're creating a bunch of coefficients that are ready for your shader so that if we dropped an object into the scene it would feel physically situated.

So, there's all kinds of things like that in there.

And I encourage you to go off and explore.

So, as Nicholas mentioned, this sample is available for download on this session's website.

So, please go grab like the little car and find the other, you know, missing wheels.

And, there's a bunch of other sessions that are worth reviewing in order to learn more about these topics.

There was Introducing Metal 2 and What's New in SceneKit and I'd also refer you to the What's New in SceneKit session last year when we went into some details about the integration with USD.

And from 2015 there was also the introductory session on Model I/O where we go into quite a lot of detail about the various data structures and operations on those structures.

So, with that.

Thanks for showing up and hope you enjoyed it.

[ Applause ]
