Introducing ARKit 3 

Session 604 WWDC 2019

ARKit is the groundbreaking augmented reality (AR) platform for iOS that can transform how people connect with the world around them. Explore the state-of-the-art capabilities of ARKit 3 and discover the innovative foundation it provides for RealityKit. Learn how ARKit makes AR even more immersive through understanding of body position and movement for motion capture and people occlusion. Check out additions for multiple face tracking, collaborative session building, a coaching UI for on-boarding, and much more.

[ Music ]

Good afternoon.

[ Applause ]

Hello, everyone.

And welcome to our session on introducing ARKit 3.

My name is Andreas.

I’m an engineer on the ARKit team.

And I couldn’t be more excited to be here today to tell you all about the third major release of ARKit.

When we introduced ARKit in 2017, we made iOS the largest AR platform in the world, bringing AR to hundreds of millions of iOS devices.

And this is really significant for you because that’s the wide audience that you can reach with the apps and the games you write.

Our mission has been from the beginning to make it really simple for you to write your first augmented reality app even if you’re a new developer.

But we also want to give you the tools at hand that you need to create really advanced and sophisticated experiences.

So if you look at the App Store today, we can see that you did an amazing job.

You built great apps and great games.

So let’s just have a look at a few of them now.

Bringing your game ideas into AR can make them so much more engaging and physical, like Angry Birds.

You can now play it in AR, actually walk around those structures and identify the best spot to shoot, and then have the slingshot right in your own hand and fire off those angry birds.

ARKit also works really well for large scale use cases.

iScape is a tool for outdoor landscaping.

You can place bushes and trees and watch your next garden remodeling project come to life in AR right in your own garden or backyard.

Last year, with ARKit 2, we introduced USDZ, a new 3D file format for exchange, made for AR.

Wayfair uses this to place virtual furniture right in your home.

By leveraging ARKit’s advanced scene understanding features like environment texturing, these objects really blend nicely with your living room.

And Lego is making use of ARKit’s 3D object detection capabilities.

It finds your physical Lego set and enhances it in AR.

And thanks to ARKit’s multi-user support, you can even play together with your friends.

So these are just a few examples of what you have created.

ARKit takes care of all the technical details and does the heavy lifting for you to make augmented reality work, so that you can really focus on creating great experiences around it.

Let’s do a quick recap of the three main pillars of functionality that ARKit provides to you.

First, there is tracking.

Tracking determines where your device is with regard to the environment so that virtual content can be positioned accurately and updated correctly on top of the camera image in real time.

This creates then the illusion that virtual content is actually placed in the real world.

ARKit also provides you with different tracking technologies such as World Tracking, face tracking, or image tracking.

On top of tracking, we have scene understanding.

With scene understanding, you can identify surfaces, images, and 3D objects in the scene and attach virtual content right on top of them.

Scene understanding also learns about the lighting and even textures in the environment to help make your content look really realistic.

And finally, rendering.

It brings your 3D content to life.

We’ve been supporting different renderers like SceneKit, SpriteKit, and Metal.

And now, this year also RealityKit, designed from the ground up with augmented reality in mind.

So with this year’s release of ARKit, we’re making a huge leap forward.

Not only are your experiences going to look even better and feel more natural, you will also be able to create entirely new experiences for use cases that haven’t been possible before.

Thanks to many new features we’re bringing with ARKit 3, like people occlusion, motion capture, collaborative sessions, simultaneous use of the front and back camera, tracking multiple faces, and many more.

We’ve got a lot to cover so let’s dive right in and we’re starting with people occlusion.

Let’s have a look at this scene here.

So in order to create a convincing AR experience, it’s important to position virtual content accurately and also to match the real world lighting.

So let’s bring in a virtual espresso machine here and put it right on the table.

But wait, when people are in the frame, as in this example, it can easily break the illusion, because you would expect the person in the front to actually cover the espresso machine.

So with ARKit 3 and people occlusion, you can now solve that problem.

[ Applause ]

Thank you.

So let’s have a look at how this is done.

Virtual content by default is rendered on top of the camera image.

As you can see here, for a pure tabletop experience, this is fine, but if any people in the frame are in front of that object, the augmentation doesn’t look correct anymore.

So what ARKit 3 now does for you is, thanks to machine learning, recognize people present in the frame and then create a separate layer that has only the pixels containing these people.

We call that segmentation.

Then we can render that layer on top of everything else.

Let’s have a look at the composite image.

It’s looking much better now.

But if you look closely, it’s still not entirely correct.

So the person in the front is now occluding the espresso machine.

But if you zoom in here, then you can see that also the person in the back is rendered on top of the virtual object, although she is actually standing behind the table.

So the virtual model in that case should occlude her, not the other way around.

Now, that happened because we did not take the distance of people from the camera into account.

So ARKit 3 uses advanced machine learning to do an additional depth estimation step. With this estimate of how far those segmented people are away from the camera, we can now correct the rendering order and make sure to render people up front only if they are actually closer to the camera.

And thanks to the power of Apple Neural Engine, we are able to do this on every frame in real time.

So let’s have a look at the composite image now.

You see that people are occluding the virtual content just as you would expect resulting in a convincing AR experience.

[ Applause ]

That is great.

Thank you.

So people occlusion enables virtual content to be rendered behind people.

It also works for multiple people in the scene, and it even works if people are only partially visible. In the example before, the woman behind the table was not visible with her full body, but it still worked.

Now, this is really significant because it does not only make your experiences look way more realistic than before, it also means that you can now create experiences that haven’t been possible before.

For example, think of a multiplayer game where you have people in the frame together with your virtual content.

People occlusion is integrated right in ARView and ARSCNView as well.

And thanks to depth estimation, we can provide you with an approximation of the distance of the people detected with regard to the camera.

We’re using Apple Neural Engine to do that work.

So people occlusion will work on devices with an A12 processor or later.

So let’s have a look at how to turn this on in API.

We have a new property on ARConfiguration called frameSemantics.

This will give you different kinds of semantic information of what’s in the current frame.

You can also check if a certain semantic is available on the specific configuration or device with an additional method on the ARConfiguration.

Specific for people occlusion, there are two methods available that you can use.

One option is person segmentation.

This will provide you with just the segmentation of people rendered on top of the camera image.

That’s the best choice if you know that people will always be standing up front and your virtual content will always be behind those people.

For example, a green screen use case, except that you don’t need a green screen anymore.

And the other option is person segmentation with depth.

This will provide you with additional depth estimation of how far those people are away from the camera.

That’s the best choice if people might be visible together with virtual content either behind or in front of that content.

If you do your own rendering using Metal, or for advanced use cases, you can also directly access the pixel buffers with the segmentation and the estimated depth data on every ARFrame.
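
The options described above could be wired up roughly as follows. This is a minimal sketch, not the code shown in the session: the helper names `runPeopleOcclusion` and `inspectBuffers` are hypothetical, while `frameSemantics`, `supportsFrameSemantics(_:)`, `segmentationBuffer`, and `estimatedDepthData` are the ARKit 3 API names.

```swift
import ARKit

/// Hypothetical helper: turns on people occlusion with depth when supported.
func runPeopleOcclusion(on session: ARSession) {
    let configuration = ARWorldTrackingConfiguration()

    // Check availability on this configuration and device before opting in.
    if ARWorldTrackingConfiguration.supportsFrameSemantics(.personSegmentationWithDepth) {
        // Use .personSegmentation instead if people are always in front of the content.
        configuration.frameSemantics.insert(.personSegmentationWithDepth)
    } else {
        print("People occlusion is not available on this device.")
    }
    session.run(configuration)
}

/// Hypothetical helper for custom renderers: reads the per-frame pixel buffers.
func inspectBuffers(of frame: ARFrame) {
    // Both buffers are nil unless the matching frame semantics are enabled.
    guard let segmentation = frame.segmentationBuffer,
          let depth = frame.estimatedDepthData else { return }
    print("segmentation width:", CVPixelBufferGetWidth(segmentation),
          "depth width:", CVPixelBufferGetWidth(depth))
}
```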

So now, let me show you how people occlusion works in a live demo.

[ Applause ]

So here in Xcode, I have a small sample project using the new RealityKit API.

And let me quickly walk you through what it does.

So in our viewDidLoad method, we’re creating an AnchorEntity looking for horizontal planes and we’re adding this anchor entity to the scene.

Then, we’re retrieving a URL of a model to load and load it using ModelEntity’s asynchronous model loading API.

We add the entity as a child of our anchor, and also install gestures so that I will be able to drag the object around on the plane.

So what this does, thanks to RealityKit, is automatically set up a World Tracking configuration because we know that we need World Tracking for plane estimation.

And then, as soon as a plane is detected, the content will automatically be placed.
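
The setup walked through above might look something like the sketch below. It is an approximation rather than the demo project itself: the class name, the outlet, and the model name `"toy"` are assumptions, and the model is loaded by name instead of by URL to keep the example short.

```swift
import UIKit
import RealityKit
import Combine

final class OcclusionDemoViewController: UIViewController {   // hypothetical name
    @IBOutlet var arView: ARView!
    private var loadRequest: AnyCancellable?   // keeps the async load alive

    override func viewDidLoad() {
        super.viewDidLoad()

        // Anchor that attaches to the first horizontal plane RealityKit finds.
        let anchor = AnchorEntity(plane: .horizontal)
        arView.scene.addAnchor(anchor)

        // "toy" is a placeholder for a model bundled with the app.
        loadRequest = ModelEntity.loadModelAsync(named: "toy")
            .sink(receiveCompletion: { completion in
                if case .failure(let error) = completion {
                    print("Unable to load model:", error)
                }
            }, receiveValue: { [weak self] model in
                // Gestures need collision shapes to hit-test against.
                model.generateCollisionShapes(recursive: true)
                anchor.addChild(model)
                self?.arView.installGestures(.all, for: model)
            })
    }
}
```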

Now, this is not using people occlusion yet.

But I already have a stub to turn this on.

It’s called togglePeopleOcclusion and what I want to implement is a method that lets me switch people occlusion on and off when the user taps the screen.

So let’s go ahead and implement it now.

So the first thing I’m going to do is actually check if my World Tracking configuration supports the person segmentation with depth frame semantics.

It’s recommended that you always do that, because if the code is run on a device that does not have Apple Neural Engine and this frame semantic is not supported, we want to gracefully handle that case.

So if that’s the case, we would just display a message to the user that people occlusion is not available on that device.

And let’s actually move on and implement the toggle.

I’m going to do a switch statement on the frameSemantics property of our configuration here.

And if personSegmentationWithDepth is part of the frameSemantics, then we remove it and tell the user that people occlusion is now turned off.

And I just need to implement the other case as well.

If we don’t have person segmentation enabled, then we insert that frame semantic into the frameSemantics property and display a message that we now turn people occlusion on.

Now, I need to rerun the updated configuration on my session.

So let me retrieve the session from the ARView and call run with a configuration that I’ve just updated here.

So now, the implementation of my togglePeopleOcclusion method is finished. Now I need to make sure that it’s actually called when the user taps the screen.

I already installed a tap gesture recognizer and here in my onTap method, I just call togglePeopleOcclusion.

And that’s all I need to do.
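
A toggle along the lines of the one implemented in the demo could be sketched like this. The demo uses a switch statement on `frameSemantics`; this version uses `contains` instead, and it is written as a free function taking the `ARView`, which is an assumption made to keep the example self-contained.

```swift
import ARKit
import RealityKit

/// Sketch of a people occlusion toggle; call it from a tap gesture handler.
func togglePeopleOcclusion(in arView: ARView) {
    // Handle unsupported devices gracefully, as recommended in the session.
    guard ARWorldTrackingConfiguration.supportsFrameSemantics(.personSegmentationWithDepth),
          let configuration = arView.session.configuration as? ARWorldTrackingConfiguration else {
        print("People occlusion is not available on this device.")
        return
    }

    if configuration.frameSemantics.contains(.personSegmentationWithDepth) {
        configuration.frameSemantics.remove(.personSegmentationWithDepth)
        print("People occlusion off")
    } else {
        configuration.frameSemantics.insert(.personSegmentationWithDepth)
        print("People occlusion on")
    }

    // Rerun the session so the updated configuration takes effect.
    arView.session.run(configuration)
}
```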

And now, let me go ahead and build that code and run it on my device here.

And we already see that the plane was detected and the content placed.

I can also move it around, thanks to the gestures that I’ve added.

You see that RealityKit has added a nice grounding shadow.

And now, let’s actually have a look at people occlusion.

Right now, it’s still turned off.

So if I bring in my hand now, you see that the content is always rendered on top.

This is the behavior that you know from ARKit 2.

Now, let me turn it on and bring in my hand again.

And you’ll see that now [applause] the virtual object is actually covered.

[ Applause ]

And that’s people occlusion in ARKit 3.

[Applause] Thank you.

So, let’s talk about another exciting new feature of ARKit 3 which is motion capture.

With motion capture, you can track the body of a person which can then for example be mapped to a virtual character in real time.

Now, this could only be done with external setup and special equipment before.

Now, with ARKit 3, it takes just a few lines of code and works right on your iPad or iPhone.

So, motion capture lets you track a human body both in 2D and in 3D.

And it provides you with a skeleton representation of that person.

This, for example, enables you to drive a virtual character.

And this is made possible by advanced machine learning algorithms running on Apple Neural Engine.

So it’s available on devices with an A12 or later processor.

Let’s look at 2D body detection first.

How to turn this on?

We have a new frame semantics option called bodyDetection.

This is supported on the World Tracking configuration and on image and orientation tracking configurations as well.

So you’ll simply add this to your frame semantics and call run on the session.

Now, let’s have a look at the data we will be getting back.

Every ARFrame delivers an object of type ARBody2D in the detectedBody property if a person was detected.

This object contains a 2D skeleton.

And the skeleton will provide you with all the joint landmarks in normalized image space.

They are being returned in a flat hierarchy in an array because this is most efficient for processing.

But you will also be getting a skeleton definition.

And thanks to the skeleton definition, you have all the information how to interpret the skeleton data.

In particular, it contains information about the hierarchy of joints, for example, the fact that the hand joint is a child of the elbow joint.

And you will also be provided with names for joints that can then be used for easier access.
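
As a rough illustration of the 2D data described above, the following sketch enables body detection and reads the skeleton from each frame. The class name `BodyDetectionDelegate` is hypothetical; the property and method names are the ARKit ones as best understood, so treat it as a starting point rather than reference code.

```swift
import ARKit

final class BodyDetectionDelegate: NSObject, ARSessionDelegate {   // hypothetical name
    func run(on session: ARSession) {
        let configuration = ARWorldTrackingConfiguration()
        if ARWorldTrackingConfiguration.supportsFrameSemantics(.bodyDetection) {
            configuration.frameSemantics.insert(.bodyDetection)
        }
        session.delegate = self
        session.run(configuration)
    }

    func session(_ session: ARSession, didUpdate frame: ARFrame) {
        // detectedBody is nil when no person is found in this frame.
        guard let skeleton = frame.detectedBody?.skeleton else { return }
        let definition = skeleton.definition

        // Joints arrive as a flat array; the definition supplies names and hierarchy.
        for (index, landmark) in skeleton.jointLandmarks.enumerated() {
            let parentIndex = definition.parentIndices[index]
            let parentName = parentIndex >= 0 ? definition.jointNames[parentIndex] : "none"
            _ = (definition.jointNames[index], parentName, landmark)   // use as needed
        }

        // Named lookup for a joint of interest, in normalized image space.
        if let head = skeleton.landmark(for: .head) {
            print("Head at", head)
        }
    }
}
```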

So let’s have a look at what this looks like.

Here’s a person we detected in our frame.

And that’s the 2D skeleton provided by ARKit.

As mentioned before, important joints are named to make it easy for you to find out the position of a particular one you’re interested in, for example, the head or the right hand.

So this was 2D.

Now, let’s have a look at 3D motion capture.

3D motion capture lets you track a human body in 3D space and it provides you with a three-dimensional skeleton representation.

It also provides you scale estimation to let you determine the size of the person that is being tracked.

And the 3D skeleton is anchored in world coordinates.

Let’s see how to use this in API.

We are introducing a new configuration called ARBodyTrackingConfiguration.

So this lets you use 3D body tracking but it also provides the 2D body detection that we have seen before.

So the frame semantics is turned on by default in that configuration.

In addition, this configuration also tracks the device’s position and orientation and provides selected World Tracking features such as plane estimation or image detection.

So with that, you have even more possibilities in what you can do using body tracking in your AR app.

In order to set it up, you simply create the body tracking configuration and run it on your session.

Note that we also have API to check if that configuration is supported on the current device.

So, now, when ARKit is running and detects a person, it will add a new type of anchor, an ARBodyAnchor.

This will be provided to you in the session didAdd anchors callback, just like any other anchor type that you know.

And just like any anchor, it also has a transform providing you with a position and orientation of the detected person in world coordinates.

In addition, you will be getting the scale factor and a reference to the 3D skeleton.
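
A minimal sketch of the 3D flow described above, assuming a hypothetical `BodyTrackingDelegate` class that owns the delegate callbacks. It checks support, runs the body tracking configuration, and reads the transform, scale factor, and one joint from each new body anchor.

```swift
import ARKit

final class BodyTrackingDelegate: NSObject, ARSessionDelegate {   // hypothetical name
    func run(on session: ARSession) {
        // Body tracking is only available on supported devices.
        guard ARBodyTrackingConfiguration.isSupported else {
            print("Body tracking is not supported on this device.")
            return
        }
        session.delegate = self
        session.run(ARBodyTrackingConfiguration())
    }

    func session(_ session: ARSession, didAdd anchors: [ARAnchor]) {
        for case let bodyAnchor as ARBodyAnchor in anchors {
            // Position and orientation of the detected person in world coordinates.
            let bodyInWorld = bodyAnchor.transform
            let scale = bodyAnchor.estimatedScaleFactor

            // Joint transforms are relative to the body anchor; compose to get world space.
            if let headInBody = bodyAnchor.skeleton.modelTransform(for: .head) {
                let headInWorld = bodyInWorld * headInBody
                print("scale:", scale, "head position:", headInWorld.columns.3)
            }
        }
    }
}
```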

Let’s have a look at what this looks like.

You see already that it’s much more detailed than the 2D skeleton.

So the yellow joints are the ones that will be delivered to the user with motion capture data.

The white ones are leaf joints that are additionally available in the skeleton.

These are not actively tracked so the transform is static with regard to the tracked parent.

But of course, you can directly access each of those joints and retrieve their world coordinates.

Again, we also have labels for the most important ones, and an API to query them by name so that you can easily find out about a particular joint you’re interested in.

Now, I’m sure that you can come up with a lot of great use cases for this new API, but I want to talk about one particular use case that might be interesting for many of you, which is animating 3D characters.

By using ARKit in combination with RealityKit, you can drive a model based on the 3D skeleton pose and it’s really simple to do.

All you need is a rigged mesh.

You can find an example for that in one of our sample apps that you can download on the session home page, but of course, you’re also free to make your own in a content creation tool of your choice.

Let’s see how easy it is to do that in code.

It’s built right in RealityKit API.

And the major class we will be using is the BodyTrackedEntity.

So here is a code example using RealityKit API.

The first thing you’re going to do is create an AnchorEntity of type body and add this anchor to the scene.

Next, you’re going to load the model.

In our case, it’s called robot.

We’re using the asynchronous loading API for that.

And in the completion handler, you will be getting the BodyTrackedEntity that we now just need to add as a child to our bodyAnchor.

And that’s already all you need to do.

So as soon as ARKit now adds the ARBodyAnchor to the session, the 3D pose of the skeleton will automatically be applied to the virtual model in real time.
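
The steps just described might be written as in the sketch below. The wrapper class `CharacterLoader` is hypothetical, and `"robot"` stands in for a rigged model bundled with the app. It also assumes the view's session is running a body tracking configuration.

```swift
import RealityKit
import Combine

final class CharacterLoader {   // hypothetical wrapper
    private var request: AnyCancellable?   // keeps the async load alive

    func attachRobot(to arView: ARView) {
        // Anchor that follows the tracked person.
        let bodyAnchor = AnchorEntity(.body)
        arView.scene.addAnchor(bodyAnchor)

        // Load the rigged mesh asynchronously as a BodyTrackedEntity.
        request = Entity.loadBodyTrackedAsync(named: "robot")
            .sink(receiveCompletion: { completion in
                if case .failure(let error) = completion {
                    print("Unable to load character:", error)
                }
            }, receiveValue: { character in
                // Once parented to the body anchor, the skeleton pose drives the model.
                bodyAnchor.addChild(character)
            })
    }
}
```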

And that’s how simple it is to do motion capture with ARKit 3.

[Applause] Thank you.

So, now, let’s talk about simultaneous front and back camera.

ARKit lets you do World Tracking with the back-facing camera and face tracking with the TrueDepth camera system on the front.

Now, one really highly requested feature from you was to enable user experiences with the front and the back camera together.

Now, with ARKit 3, you can do that.

So with this new feature, you can build AR experiences using both cameras at the same time.

And what this means for you is, that you can now build two new types of use cases.

First, you can create World Tracking experiences.

So, using the back-facing camera, but also benefit from face data captured with the front camera.

And you can create Face Tracking experiences that make use of the full device orientation and position in 6 degrees of freedom.

All of this is supported on A12 devices and later.

Let’s see an example.

So here we’re running World Tracking with plane estimation but we also placed a face mesh on top of the plane and are updating it in real time with the facial expressions captured through the front camera.

So let’s see how to use concurrent front and back camera in API.

First, let’s create a World Tracking configuration.

Now, the configuration that I choose determines which camera stream is actually displayed on the screen.

So in that case, it would be the back-facing camera.

Now I’m turning on the new userFaceTrackingEnabled property and run the session.

This will cause me to receive face anchors.

And I can then use any information from those anchors, like the face mesh, blend shapes, or the anchor’s transform itself.

Now, note, since we are working with world coordinates here, the user face transform will be placed behind the camera, which means that in order to visualize the face, you would need to translate it to a location somewhere in front of the camera.
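
A short sketch of the World Tracking variant, under the assumption that face anchors are handled in a function called from the session delegate. The function names are hypothetical; `userFaceTrackingEnabled` and `supportsUserFaceTracking` are the ARKit names.

```swift
import ARKit

/// Runs World Tracking on the back camera and also requests face data from the front camera.
func runWorldTrackingWithUserFace(on session: ARSession) {
    let configuration = ARWorldTrackingConfiguration()
    if ARWorldTrackingConfiguration.supportsUserFaceTracking {
        configuration.userFaceTrackingEnabled = true
    }
    session.run(configuration)
}

/// Hypothetical handler: call from session(_:didUpdate:) with the updated anchors.
func handleFaceAnchors(in anchors: [ARAnchor]) {
    for case let face as ARFaceAnchor in anchors {
        // Blend shape coefficients range from 0 to 1.
        let smile = face.blendShapes[.mouthSmileLeft]?.floatValue ?? 0
        // face.transform sits behind the camera in world space;
        // move any visualization in front of the camera before showing it.
        print("smile:", smile, "vertices:", face.geometry.vertices.count)
    }
}
```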

Now, let’s also look at the face tracking configuration.

You create your face tracking configuration just as you would do it always and set worldTrackingEnabled to true.

And then, after you run the configuration, you can access in every frame, for example in the session didUpdate frame callback, the transform of the current camera position.

And you can then use that for whatever use case you have in mind.
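
The face tracking variant could look like this sketch. In Swift the flag is exposed as `isWorldTrackingEnabled`; the class name `FaceWithWorldTracking` is hypothetical.

```swift
import ARKit

final class FaceWithWorldTracking: NSObject, ARSessionDelegate {   // hypothetical name
    func run(on session: ARSession) {
        guard ARFaceTrackingConfiguration.isSupported else { return }
        let configuration = ARFaceTrackingConfiguration()
        // Adds 6 degrees of freedom device tracking on supported devices.
        if ARFaceTrackingConfiguration.supportsWorldTracking {
            configuration.isWorldTrackingEnabled = true
        }
        session.delegate = self
        session.run(configuration)
    }

    func session(_ session: ARSession, didUpdate frame: ARFrame) {
        // Device position and orientation in world coordinates for this frame.
        let cameraTransform = frame.camera.transform
        _ = cameraTransform   // use for whatever the experience needs
    }
}
```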

And that’s simultaneous front and back camera in ARKit 3.

We think that you will be able to enable many great new use cases with this new API.

[ Applause ]

Thank you.

[ Applause ]

And now, let me hand it over to Thomas, who will tell you all about collaborative sessions.

[ Applause ]

Thank you.

Thank you, Andreas.

Good afternoon, everyone.

My name is Thomas and I’m part of the ARKit team.

So let’s talk about collaborative sessions.

In ARKit 2, you could create multiuser experiences with the ability to save and load world maps.

You had to save a map on one device and send it to another one in order for your users to jump into the same experience again.

It was a one-time map sharing experience, and after that point, the users wouldn’t share the same information anymore.

Well, with collaborative sessions in ARKit 3, we now continuously share your mapping information across the network.

This allows you to create ad hoc multiuser experiences where your users are more easily accessing the same session.

Additionally, we also allow you to share or we actually share ARAnchors across all devices.

All those anchors are identifiable by the session ID stored on each of them.

Note that at this point, all the coordinate systems are independent from each other, even though we still share the information under the hood.

Let me show you how this works.

So in this video here, we can see two users.

Pay attention to the colors.

One user will be showing feature points in green and another user will be showing feature points in red.

As they move around the environment, they’re starting to map the environment and add more feature points to it.

At this point, and this is the internal representation of their internal maps, they don’t know about each other’s maps.

As they move around, they gather more feature points.

When they gather more feature points in the scene, and you can see the internal maps here, pay attention to the colors: once they find a matching point, those internal maps merge into each other and form one single map, which means that each user now shares the other’s scene understanding.

As they move around, they map even more information.

And they continue sharing that under the hood.

Additionally, ARKit 3 provides you with ARParticipantAnchors that allow you to understand where another user is in real time in your environment.

This is really handy if you want to display for example an icon or something to represent that user.

As mentioned earlier, ARKit 3 also shares ARAnchors under the hood, meaning that if you add an anchor on one device, it will automatically show up on the other device.

Let’s now have a look at how it works in code.

As Andreas mentioned earlier, ARKit is really well integrated with RealityKit.

If you want to enable the collaborative session with RealityKit, it’s pretty simple.

You first need to set up your Multipeer Connectivity session.

The Multipeer Connectivity framework is an Apple framework that allows for discovery and peer-to-peer connections.

Then, you need to pass this Multipeer Connectivity session to the ARView scene’s synchronization service.

And finally, as with every ARKit experience, you have to set up your ARWorldTrackingConfiguration, set the isCollaborationEnabled flag to true, and run that configuration on the session.
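
In code, the RealityKit path could be sketched as follows. The function name is hypothetical, and the `MCSession` is assumed to have been created and connected elsewhere (peer discovery is outside the scope of this example).

```swift
import ARKit
import RealityKit
import MultipeerConnectivity

/// Sketch: enable a collaborative session in RealityKit over an existing MCSession.
func startCollaboration(in arView: ARView, over mcSession: MCSession) {
    do {
        // Hand the Multipeer Connectivity session to the scene's synchronization service.
        arView.scene.synchronizationService = try MultipeerConnectivityService(session: mcSession)
    } catch {
        print("Unable to create synchronization service:", error)
    }

    let configuration = ARWorldTrackingConfiguration()
    configuration.isCollaborationEnabled = true
    arView.session.run(configuration)
}
```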

And that’s it.

So what’s going to happen then?

So when you set the isCollaborationEnabled flag to true and run that configuration on the session, ARKit will essentially enable a new method on the ARSessionDelegate for you to transfer that data.

In the RealityKit use case, we’ll take care of it, but if you’re using ARKit with another renderer, then you’ll have to send that data across the network.

This data is called AR collaboration data.

ARKit can create at any point in time an AR collaboration data package that you then have to forward again to other users.

This is not limited to only two users.

You can have a large number of users in that session.

During that process, ARKit will generate additional AR collaboration data that you will have to forward to other devices and broadcast that data.

Let’s see how that works in code.

So you first need to set up Multipeer Connectivity, as in this example, or any network framework of your choice, and make sure that your devices are sharing the same session.

When they do, you need to create the ARWorldTrackingConfiguration with the isCollaborationEnabled flag set to true.

Then you need to run the configuration.

At this point, you will have a new method available on the delegate where you will be receiving some collaboration data.

Upon receiving that data, you need to make sure to broadcast it on the network to other users that are also in this collaboration session.

Upon reception of that data on the other devices, you need to update your AR session so that it knows about this new data.

And that’s it.

This collaboration session data automatically exchanges all the user-created ARAnchors.

Each anchor is identifiable by a session ID so that you can understand which device or which AR session the anchor is coming from.
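
When not using RealityKit, the send and receive steps above could be sketched like this. The class `CollaborationController` and its `receive(_:)` entry point are hypothetical; the networking here uses Multipeer Connectivity, but any transport would do.

```swift
import ARKit
import MultipeerConnectivity

final class CollaborationController: NSObject, ARSessionDelegate {   // hypothetical name
    let arSession: ARSession
    let mcSession: MCSession   // assumed to be connected elsewhere

    init(arSession: ARSession, mcSession: MCSession) {
        self.arSession = arSession
        self.mcSession = mcSession
        super.init()
    }

    func start() {
        let configuration = ARWorldTrackingConfiguration()
        configuration.isCollaborationEnabled = true
        arSession.delegate = self
        arSession.run(configuration)
    }

    // ARKit hands over collaboration data; broadcast it to the other participants.
    func session(_ session: ARSession, didOutputCollaborationData data: ARSession.CollaborationData) {
        guard !mcSession.connectedPeers.isEmpty,
              let encoded = try? NSKeyedArchiver.archivedData(withRootObject: data,
                                                              requiringSecureCoding: true) else { return }
        // Critical data should arrive reliably; optional data may be dropped.
        let mode: MCSessionSendDataMode = data.priority == .critical ? .reliable : .unreliable
        try? mcSession.send(encoded, toPeers: mcSession.connectedPeers, with: mode)
    }

    /// Call this from MCSessionDelegate's session(_:didReceive:fromPeer:).
    func receive(_ data: Data) {
        if let collaborationData = try? NSKeyedUnarchiver.unarchivedObject(
            ofClass: ARSession.CollaborationData.self, from: data) {
            arSession.update(with: collaborationData)
        }
    }

    func session(_ session: ARSession, didAdd anchors: [ARAnchor]) {
        for anchor in anchors {
            if anchor is ARParticipantAnchor {
                print("Another participant joined at", anchor.transform.columns.3)
            } else if anchor.sessionIdentifier != session.identifier {
                print("Anchor shared by another session:", anchor.identifier)
            }
        }
    }
}
```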

As mentioned earlier, the ARParticipantAnchor represents in real time the participant position which can be very handy in some of your use cases.

So this is how you create collaborative sessions.

[ Applause ]

Let’s now talk about coaching.

When you create an experience, an AR experience, coaching is really important.

You really want to guide your users whether they are new or returning users into your AR experience.

It’s not a trivial process.

And sometimes, it’s hard for you to understand or even to guide the user to that new experience.

Throughout that process, you have to react to certain tracking events.

Sometimes the tracking gets limited because the user moves way too fast.

So far, we’ve been providing you with Human Interface Guidelines that gave you some guidance for the onboarding experience.

Well this year, we’re embedding that in a UIView and we call it the AR Coaching View.

This is a built-in overlay that you can directly embed in your AR applications.

It guides your users to a really good tracking experience.

It provides a consistent design throughout your applications so that your users are very familiar with it.

You actually may have seen that design before.

We have it in AR Quick Look and in Measure.

This new UI overlay automatically activates and deactivates based on the different tracking events.

And you can also adjust certain coaching goals.

Let’s have a look at some of those overlays.

So in the AR Coaching View, we have multiple overlays.

The onboarding UI provides the users the ability to understand what you’re looking for, and in this case, surfaces.

Most of the time, your experience requires a surface to place content onto.

So if you enable plane detection on your configuration, then this overlay will automatically show up.

Secondly, we have another overlay that helps the user understand that they have to move around a little bit more to gather additional features so that tracking works best.

And then finally, we have another overlay which helps your user relocalize against certain environments, in case you lost tracking for example, or if the app went into the background.

Let’s look at one example.

So, in this example, we’re asking a user to move the device around to find a new plane, and as soon as the user is moving around and gathering more features, the content can be placed and the view deactivates automatically.

So you don’t have to do anything.

Everything is handled automatically.

[ Applause ]

Let’s have a look at how you can set that up.

Again, this is really easy.

As it’s a simple UIView, you have to set it up as a child of another UIView.

Ideally, you set it as a child of the ARView.

Then, you need to connect the session to the coaching view so that the coaching view knows what events to react to.

Or you need to connect the session provider outlet of the coaching view to the session provider itself if you’re using a storyboard for example.

Optionally, you can set a bunch of delegates if you want to react to certain events that the view is giving you.

And finally, you can also provide a set of specific coaching goals if you want to disable certain functionalities.
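
The setup steps above could be sketched in code like this. The function name `addCoachingOverlay` is hypothetical, and the choice of `.horizontalPlane` as the goal is only an example.

```swift
import ARKit
import RealityKit

/// Sketch: add the coaching overlay as a subview of an ARView.
func addCoachingOverlay(to arView: ARView, delegate: ARCoachingOverlayViewDelegate?) {
    let overlay = ARCoachingOverlayView()

    // Connect the session so the overlay knows which tracking events to react to.
    overlay.session = arView.session
    overlay.delegate = delegate

    // Coaching goal: guide the user until a horizontal plane is found.
    overlay.goal = .horizontalPlane
    overlay.activatesAutomatically = true

    // Cover the whole AR view and follow its size changes.
    overlay.frame = arView.bounds
    overlay.autoresizingMask = [.flexibleWidth, .flexibleHeight]
    arView.addSubview(overlay)
}
```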

Let’s look at some of those delegates.

So we have three new methods on the ARCoachingOverlayViewDelegate.

Two of them can react to activation and deactivation.

So you can choose if you still want to enable that throughout the experience, or if you think, for example, that once a user has seen it, they don’t need to see it anymore.

Additionally, you can react to relocalization abort requests. By default, the coaching view will give your users a new UI button with which they can abort relocalization and restart the session or reset the tracking, for example.
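
A possible delegate implementation is sketched below. The class name `CoachingHandler` is hypothetical, and resetting tracking in the last callback is one reasonable response rather than the required one.

```swift
import ARKit

final class CoachingHandler: NSObject, ARCoachingOverlayViewDelegate {   // hypothetical name
    func coachingOverlayViewWillActivate(_ coachingOverlayView: ARCoachingOverlayView) {
        // For example, hide the app's own UI while coaching is shown.
    }

    func coachingOverlayViewDidDeactivate(_ coachingOverlayView: ARCoachingOverlayView) {
        // For example, show the app's UI again; tracking is good enough now.
    }

    func coachingOverlayViewDidRequestSessionReset(_ coachingOverlayView: ARCoachingOverlayView) {
        // The user gave up on relocalization: restart with a fresh map.
        guard let session = coachingOverlayView.session,
              let configuration = session.configuration else { return }
        session.run(configuration, options: [.resetTracking, .removeExistingAnchors])
    }
}
```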

So this new view is really handy for your applications so that you can make sure that you’ve got that consistent design and help your users.

Let’s now talk about Face Tracking.

In ARKit 1, we enabled Face Tracking with the ability to track one face.

Now in ARKit 3, we have the ability to do multi-face tracking of up to three faces concurrently.

Additionally, you can also recognize a person when they leave the frame and come back again, giving you the same face anchor ID again.

[ Applause ]

So multi-Face Tracking tracks up to three faces concurrently and provides you with a persistent face anchor ID so that you can make sure to recognize one user throughout the session.

If you restart a new session, then this ID disappears and a new one comes up.

Enabling this is really easy.

We have two new properties on the ARFaceTrackingConfiguration.

The first one allows you to query how many faces can be tracked concurrently in one session on that specific device.

And the other one allows you to set the number of faces that you want to track concurrently.
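
The two properties could be used as in this small sketch; the function name is hypothetical.

```swift
import ARKit

/// Sketch: track as many faces as this device supports.
func runMultiFaceTracking(on session: ARSession) {
    guard ARFaceTrackingConfiguration.isSupported else { return }
    let configuration = ARFaceTrackingConfiguration()

    // Query the device limit, then request that many concurrently tracked faces.
    let supported = ARFaceTrackingConfiguration.supportedNumberOfTrackedFaces
    configuration.maximumNumberOfTrackedFaces = supported
    session.run(configuration)
}
```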

And that’s Multi-Face Tracking.

[ Applause ]

Let’s now talk about a new tracking configuration that we called ARPositionalTrackingConfiguration.

So this new tracking configuration is intended for tracking only use cases.

You often had a use case where you didn’t really need the camera backdrop to be rendered, for example.

Well, this is made for that use case.

We can achieve low power consumption with the ability to lower the capture frame rate and also the camera resolution while still keeping your rendering rate at 60 hertz.
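
For a tracking-only use case, a minimal sketch might look like this. The class name is hypothetical; the device pose is read from each frame's camera.

```swift
import ARKit

final class PositionalTracker: NSObject, ARSessionDelegate {   // hypothetical name
    func run(on session: ARSession) {
        guard ARPositionalTrackingConfiguration.isSupported else { return }
        session.delegate = self
        session.run(ARPositionalTrackingConfiguration())
    }

    func session(_ session: ARSession, didUpdate frame: ARFrame) {
        // 6 degrees of freedom device pose; no camera backdrop needs to be rendered.
        let devicePose = frame.camera.transform
        _ = devicePose
    }
}
```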

Next, let’s talk about some of the scene understanding improvements we’ve made this year.

Image detection and image tracking have been around for some time now.

We can now this year detect up to 100 images at the same time.

We also provide you with the ability to detect the scale of a printed image, for example.

Oftentimes, when an application requires a user to use an image to place content and scale that content accordingly, the image might be printed at a different size, for example on a different paper size.

With this automatic scale estimation, you can now detect the physical size and adjust the scale accordingly.

We also have the ability to query at run time the quality of an image that you’re passing to ARKit when you want to create a new AR reference image.
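
The image improvements above map to API roughly as sketched here. The function names and the `assumedWidth` parameter are hypothetical; the reference images are assumed to be provided by the app.

```swift
import ARKit

/// Sketch: detect images and let ARKit estimate their physical size.
func runImageDetection(on session: ARSession, with images: Set<ARReferenceImage>) {
    let configuration = ARWorldTrackingConfiguration()
    configuration.detectionImages = images
    // Estimate the real printed size instead of trusting the declared physical width.
    configuration.automaticImageScaleEstimationEnabled = true
    session.run(configuration)
}

/// Sketch: check at run time whether an image is suitable as a reference image.
func validateReferenceImage(_ cgImage: CGImage, assumedWidth: CGFloat) {
    let reference = ARReferenceImage(cgImage, orientation: .up, physicalWidth: assumedWidth)
    reference.validate { error in
        if let error = error {
            print("Image is not suitable for detection:", error.localizedDescription)
        }
    }
}

/// Sketch: read the estimated scale from a detected image anchor.
func estimatedWidth(of anchor: ARImageAnchor) -> CGFloat {
    // estimatedScaleFactor relates the declared size to the observed size.
    return anchor.referenceImage.physicalSize.width * anchor.estimatedScaleFactor
}
```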

We’ve also made improvements to our object detection algorithms.

With machine learning, we can enhance those object detection algorithms and provide you with faster recognition that is also more robust across different environments.

Oftentimes, you had to scan a specific object in an environment so that it works perfectly in another one.

Now, this is a bit more flexible.

Finally, another area of scene understanding which is really important is plane estimation.

Oftentimes, you need plane estimation to place content.

Well, with machine learning, we’re making plane detection even more accurate, more robust, and faster.

Let’s have a look at an example.

With machine learning, not only can we extend those planes on the ground when features are detected, but the floor actually goes even further.

But we also have the ability to detect walls on the side when no feature points are present.

And this is thanks to machine learning.

[ Applause ]

As you saw in the previous video, we had a couple of classifications on the planes.

Well, this is done again with machine learning.

And last year, we introduced five different classifications, wall, floor, ceiling, table, and seat.

Well, this year, we’re adding two additional ones.

We are adding the ability to detect doors and windows.
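
To make the seven classifications concrete, here is a small sketch that maps a plane anchor's classification to a display string. The function name `label(for:)` and the returned strings are hypothetical.

```swift
import ARKit

/// Hypothetical helper: returns a human-readable label for a plane anchor.
func label(for plane: ARPlaneAnchor) -> String {
    // Plane classification is only available on some devices.
    guard ARPlaneAnchor.isClassificationSupported else { return "unsupported" }

    switch plane.classification {
    case .wall:    return "wall"
    case .floor:   return "floor"
    case .ceiling: return "ceiling"
    case .table:   return "table"
    case .seat:    return "seat"
    case .door:    return "door"     // new this year
    case .window:  return "window"   // new this year
    default:       return "unknown"  // covers the .none(...) status cases
    }
}
```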

As mentioned earlier, plane estimation is really important to place content in the world.

This is actually ideal for object placement.

You always want your objects to be placed on a surface.

Well, this year with the new raycasting API, you can place your content even more easily and more precisely, and it is even more flexible.

It supports any kind of surface alignment.

So you’re not always bound to vertical and horizontal anymore.

But also, you can track your raycast all the time.

Meaning that as you move your device around and ARKit detects more information about the environment, it can accurately place your object on top of the physical surface as those planes are evolving.

Let’s see how you can enable that in ARKit.

Well, it starts with creating a raycast query.

A raycast query has three parameters.

The first one decides from where you want to perform that raycast.

In this example, we’re doing this from the screenCenter.

Then you need to tell it what you want to allow in order to place that content or get the transforms back.

And additionally, you need to tell it which alignment you want.

It can be horizontal, vertical, or any.

Then you need to pass that query on to the trackedRaycast method on your AR session.

This method has a callback that allows you to react to the new transforms and results given to you by that raycast so that you can adjust your content or your anchors accordingly.

And then finally, when you are done with this raycast, you can just stop it.
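
The steps above can be sketched as follows. This is not the code shown on the slide: the class name `PlacementController`, its methods, and the choice of `.estimatedPlane` as the allowed target are assumptions made for illustration, and an `ARSCNView` is assumed as the view.

```swift
import ARKit
import SceneKit

/// Hypothetical controller that keeps a node glued to the surface
/// under the screen center while a tracked raycast is running.
final class PlacementController {
    private var trackedRaycast: ARTrackedRaycast?

    func startPlacing(_ node: SCNNode, in sceneView: ARSCNView) {
        // 1. Where to cast from: the center of the view.
        let screenCenter = CGPoint(x: sceneView.bounds.midX,
                                   y: sceneView.bounds.midY)

        // 2. What to allow and 3. which alignment (horizontal, vertical, or any).
        guard let query = sceneView.raycastQuery(from: screenCenter,
                                                 allowing: .estimatedPlane,
                                                 alignment: .any) else { return }

        // The handler is called again whenever ARKit refines the result.
        trackedRaycast = sceneView.session.trackedRaycast(query) { results in
            guard let result = results.first else { return }
            node.simdWorldTransform = result.worldTransform
        }
    }

    func stopPlacing() {
        // Stop receiving updates once the content is placed.
        trackedRaycast?.stopTracking()
        trackedRaycast = nil
    }
}
```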

Those are some of the raycasting improvements that we’ve done this year.

Let’s move on to some of the visual coherence enhancements we’ve made.

So this year, we have this new ARView that allows you to activate and deactivate different render options on demand.

They can also be automatically deactivated and activated based on your device capability.
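
Toggling a render option on an ARView might look like this minimal sketch. The helper name `setEffect` is hypothetical. Note that the options are phrased as "disable" flags, so enabling an effect means removing the flag.

```swift
import RealityKit

/// Hypothetical helper: turns one visual effect on or off by removing or
/// inserting the matching "disable" flag in the view's render options.
func setEffect(_ disableFlag: ARView.RenderOptions,
               enabled: Bool,
               on arView: ARView) {
    if enabled {
        arView.renderOptions.remove(disableFlag)
    } else {
        arView.renderOptions.insert(disableFlag)
    }
}

/// Example: keep depth of field, but switch motion blur off on demand.
func applyLowCostRendering(to arView: ARView) {
    setEffect(.disableDepthOfField, enabled: true, on: arView)
    setEffect(.disableMotionBlur, enabled: false, on: arView)
}
```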

Let’s look at an example of this.

You’ve probably seen that video before.

Look at how the Quadrocopter actually moves around and how the objects are moving on the surface as well, and how real all of this looks.

When everything disappears, you actually don’t even realize that those objects were virtual.

Let’s look at some of those visual coherence enhancements we’ve made.

Let’s look again at the beginning of that video and let’s pause for a second when those balls are rolling on the table.

Here you can see the depth of field effect which is a new feature of RealityKit.

Your AR experiences are designed for, you know, small and big rooms.

The camera on your iOS device always adjusts its focus to the environment, and the depth of field feature allows you to adjust the focus on the virtual content so that it matches your physical one and the object blends perfectly into the environment.

Additionally, when you move the camera quickly or when the object moves quickly, you can see that blurriness occurring.

Most of the time, when you have a regular renderer and no motion blur effect, your virtual content stands out.

And it doesn’t really blend nicely in the environment.

Well, thanks to the VIO camera motion and the sensor parameters, we can synthesize the motion blur.

And we can apply it on the virtual object so that it blends perfectly in your environment.

This is available on ARSCNView and on ARView as well.

Let’s look again at this example where everything looks really good.

Two additional APIs that we’re making available for visual coherence enhancement this year are HDR environment textures and camera grain.

When you place your content, you really want that content to reflect the real world.

Well, with high dynamic range, you can capture those highlights even in bright-light environments, which makes your content more vibrant.

Well, with ARKit this year, you can actually request those HDR environment textures so that your content looks even better.

Additionally, we also have a camera grain API.

You’ve probably noticed, when you have an AR experience in a very low-light environment, how your content looks really shiny compared to the camera image.

Every camera produces some grain.

And especially in low light, this grain can be a bit heavier.

Well, with this new camera grain API, we can make sure to apply the same grain patterns on your virtual content so that it blends nicely and doesn’t stand out.
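
Requesting HDR environment textures and turning on camera grain could be sketched like this. The function names are hypothetical, and an `ARSCNView` is assumed as the renderer.

```swift
import ARKit

/// Hypothetical helper: world tracking with HDR environment textures requested.
func makeCoherentConfiguration() -> ARWorldTrackingConfiguration {
    let configuration = ARWorldTrackingConfiguration()
    // Environment texturing must be on for HDR textures to have any effect.
    configuration.environmentTexturing = .automatic
    configuration.wantsHDREnvironmentTextures = true
    return configuration
}

/// Hypothetical helper: asks the SceneKit-based view to render camera grain
/// and motion blur on virtual content.
func enableCoherenceEffects(on sceneView: ARSCNView) {
    sceneView.rendersCameraGrain = true
    sceneView.rendersMotionBlur = true
}

/// For a custom renderer, the grain data is exposed on each frame instead.
func grainIntensity(of frame: ARFrame) -> Float {
    return frame.cameraGrainIntensity
}
```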

So those are some of the visual coherence enhancements for this year.

But we didn’t stop there.

I think there’s one feature that a lot of you have been requesting.

When you develop an AR experience, you often have to go to a certain location in order to prototype or test your experience.

And most of the time, after you go there, you come back to your desk, you develop your experience, and then you want to go back again.

Well, this year with the Reality Composer app, you can now record an experience or a sequence.

Meaning that you can go to the place where your experience will be happening to capture the environment.

ARKit will make sure to save the sensor data alongside the video stream into a movie file container so that you can take it with you and put it in Xcode.

At that point, the Xcode scheme settings now have a new field that allows you to select that file.

When that file is selected and you press Run with the device attached to Xcode, you can replay that experience at your desk.

This is ideal for prototyping, and even better for tweaking your AR configuration with different parameters and seeing how the experience looks.

You can even react to certain tracking [ Applause ]

So this is great.

I think you’ve got a lot of different tools this year in ARKit 3 where you can enhance your multiuser experiences with collaborative sessions and multi-face tracking.

You can improve the realism of all your AR apps with RealityKit’s new coherence effects, and also the new visual effects on the ARWorldTrackingConfiguration.

You can enable new use cases with the new motion capture, and also the simultaneous face and back camera.

And, of course, there are lots of improvements under the hood on existing features.

As an example, with object detection and machine learning.

And last but not least, the record and replay workflow I think will make your design and your experience prototyping even better than before.

So I’m really looking forward to you downloading our samples from the website.

We also have a couple of labs, one tomorrow and one on Thursday, where I hope you will come and ask us the questions you have, or just chat.

And then we also have two in-depth sessions, one of which is about bringing people into AR, which will talk more about people occlusion and motion capture.

And the second one is about collaborative AR experiences.

I hope you enjoy the rest of the conference.

Have a great day.


[ Applause ]
