Developing CarPlay Systems, Part 2 

Session 723 WWDC 2016

Learn how CarPlay integrates with your car’s infotainment system. Understand how CarPlay is designed to work with your car’s resources including the display, speakers, microphone, user inputs, steering wheel controls, instrument cluster and sensors.

[ Music ]

Hello. My name is Tanya, and I want to welcome you to the second session about developing CarPlay systems.

The video for Part 1 is also available.

So if you haven’t watched it, definitely check it out.

In this session, we’ll cover more detail on how CarPlay is integrated into a typical automotive infotainment system.

We’ll start with a system overview, then talk about volume and resource management, and finish by discussing application state management.

Now, let’s dive into an overview of the system.

If you watched the first video, you will recall the software components you need to implement a CarPlay receiver.

In summary, you need an IP-based link to the head unit, either over USB or over Wi-Fi, an iAP2 channel for data exchange, an instance of the CarPlay Communication Plug-in implementing the control protocol, an audio framework to play or record sound, and a video framework to render the video stream.

However, those components are just one of the sub-systems in a typical automotive head unit.

In addition to the CarPlay functionality, your unit has its own native interface, audio sources, and logic.

Both together build the complete head unit.

Then you add all hardware resources within the vehicle: Microphone, speakers, displays, sensors, user input elements, and you have the complete system.

OK, so we have the full system.

Now the question is how do those seemingly competing sub-systems work with each other?

Let’s understand it.

First, let’s look at how sensor data is exchanged.

And we will take location data as an example.

Location information includes data from a GPS receiver or sensor data like wheel speed or yaw rate.

The data reaches the device through iAP2, must always be available, and is requested by the phone when needed.

Next, let’s look at the instrument cluster or heads-up display.

These secondary displays are used to show metadata provided by the device.

Such metadata could include the currently-playing music track, the active phone call, or starting with iOS 10, the next turn-by-turn direction.

Again, the data is transferred through the iAP2 protocol and exchanged on demand.

Now let’s see what happens on the primary display.

Of course, to interact with the content shown on the display, the user must either use a touchscreen, a knob controller, or both.

So we can consider the availability of the user input devices together with the display.

When the CarPlay UI is active, the unit is rendering the video stream on the primary display.

And both the screen and the user input are used by the device.

But if the native user interface is shown on the screen, then both the display and the user input are routed to the native sub-system.

However, additional user controls on the steering wheel, like the Voice Recognition button, Next/Previous Track, may be linked to CarPlay even when the native UI is shown on the display.

That way, Siri can always be launched or the next audio track played.

OK, now that we looked at the user interface integration, let’s talk about audio.

Audio in the car can be roughly separated into three categories.

The first one provides access to the speakers and the microphone, and it’s used for phone calls or voice recognition.

Then we have audio playback for any media content like radio or music.

And last, we have the alerts category which is used for navigation prompts or any other higher-priority alerts.

The same three categories exist within the native sub-system.

Within CarPlay, we have a Main audio channel which carries phone call, voice recognition, and media playback audio.

And an Alternate audio channel which is reserved for higher-priority prompts and sounds such as navigation prompts or new message notifications.

So the media channel can carry either music coming from the phone or media playback from one of the native audio sources.

OK, now that we have looked at media playback, let’s see what happens when there is a phone call through CarPlay.

The phone call needs the audio channel which provides access to both the microphone and the speakers.

So media playback is interrupted and the audio is switched over to that channel.

In addition to switching the audio channel, the screen switches to CarPlay to show the phone call UI.

And metadata information on the instrument cluster is updated as well.

Of course, if the native system also supports phone calls outside of CarPlay, then that phone call is played and shown on the in-car displays.

But while on a native phone call, the user can be using Apple Maps for navigation.

So each time there is an approaching turn, Apple Maps will play a sound through the announcement channel to notify the user.

The phone call continues to be routed through the speakers of the car, but the turn-by-turn notification sounds are also heard, as they are mixed with the call audio.

Of course, in such a situation, we would not want the user to hear the full spoken navigation prompt, but just an alert tone indicating an upcoming turn.

And if the user wants to see where the next turn is, the main display may show Apple Maps.

And last, after the phone call ends, music continues to play through the native sub-system as that was what the user was listening to prior to the phone call.

Now, the navigation announcement will include the full-spoken direction as the user is no longer on a phone call.

All right.

As you see, integrating CarPlay is a rather complex task, especially with regard to sharing hardware resources.

But before we get into those details, let’s look at volume controls.

Volume management.

As you might know, a usual automotive head unit shows the volume indicator while the user is operating the volume knob, and that volume knob operates the volume of the currently-playing audio.

The same principle applies for CarPlay audio, but there is a different volume level setting for each of the major CarPlay applications.

Let’s look through them.

During Siri playback, the volume knob sets the volume specific to voice interactions.

For an incoming phone call, the volume of the ringtone is controlled.

When the user is on a phone call, the phone call audio is controlled.

And if a turn is approaching and navigation prompts are played, the user can change the navigation volume.

And remember, media could be playing in the background but its volume will be lowered during those prompts and cannot be adjusted.

Once the announcement is over and the music volume has ramped back up, the knob controls the media volume level.
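A rough way to picture these per-application volume settings is as a set of independent volume contexts, where the knob always adjusts whichever context is currently audible. A minimal sketch; the names and default levels here are illustrative, not part of the CarPlay specification:

```python
# Sketch: independent volume levels per CarPlay application context.
# The knob always adjusts the context that is currently audible; a
# ducked background stream (media during a navigation prompt) is not
# adjustable until it ramps back up.
class VolumeModel:
    CONTEXTS = ("voice", "ringtone", "phoneCall", "navigation", "media")

    def __init__(self):
        self.levels = {c: 0.5 for c in self.CONTEXTS}
        self.active = "media"  # context the knob currently targets

    def begin(self, context):
        """A higher-priority sound starts; the knob now targets it."""
        self.active = context

    def end(self):
        """The prompt or call ends; the knob targets media again."""
        self.active = "media"

    def turn_knob(self, delta):
        ctx = self.active
        self.levels[ctx] = min(1.0, max(0.0, self.levels[ctx] + delta))
        return ctx, self.levels[ctx]

vm = VolumeModel()
vm.begin("navigation")          # a turn prompt starts playing
ctx, level = vm.turn_knob(0.2)  # the user turns the knob up
vm.end()                        # prompt over; knob controls media again
```

Note that turning the knob during the prompt never touches the media level, matching the behavior described above.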

All right, now let’s look at resource management.

We are going to take a look at which resources are managed, talk about how they are managed, and go over a couple of typical examples.

So we saw that there are multiple hardware resources, but let's understand which of them are managed.

And there are only two: the mainScreen, representing the main display in the car, and mainAudio, the resource giving access to the car’s audio system.

Those resources can either be taken or borrowed.

When you take a resource, it belongs to you for an unlimited amount of time.

It’s basically yours.

When you borrow a resource, you may use it for a while, but you have to give it back after you’re finished.
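The take/borrow model can be sketched as a tiny state machine; the `Resource` class and its method names below are illustrative, not actual protocol calls:

```python
# Sketch: the two transfer types. "take" means permanent ownership;
# "borrow" must be returned (unborrowed) when you are finished.
class Resource:
    def __init__(self, owner):
        self.owner = owner
        self.lender = None  # set while the resource is borrowed

    def take(self, who):
        self.owner, self.lender = who, None  # yours, nothing to return

    def borrow(self, who):
        self.lender, self.owner = self.owner, who

    def unborrow(self):
        self.owner, self.lender = self.lender, None  # give it back

screen = Resource("native")
screen.borrow("carplay")  # e.g. phone call UI shown for the call's duration
screen.unborrow()         # call ends; the screen returns to the native UI
```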

OK, so let’s have the native user interface take over the display permanently.

The user can be using native navigation, or listening to FM radio, or adjusting some car settings.

What triggers such a permanent switch to the native UI?

The native UI can take the screen when the user presses a hard key or switches to the native UI from the entry point within the CarPlay UI, or by using the native voice recognizer to call out a specific native application.

Now let’s say the user selects the Apple CarPlay icon from the main menu.

In this case, CarPlay takes over the display as the user explicitly requested it.

Again, what can trigger such a permanent transition?

Any hard key linked to CarPlay.

Or, as we saw, through any CarPlay button within the native UI.

Or through Siri, for example, by saying, “Open Maps.”

Now, there are other applications that need to borrow a resource.

Remember, they just need it for a while and will give it back.

Which applications are we talking about?

That’s phone calls, voice interactions, notifications, or alerts.

Let’s look at an example.

The native UI has taken the Main Screen resource and is permanently showing on the display.

Now the user gets a call through CarPlay.

The CarPlay UI borrows the screen and is shown for the duration of the call.

Once the call ends, we go back to the native UI and stay there until further user action.

Then, let’s look at the Main Audio resource.

Main Audio can be split into four major types.

Each type is used for a different CarPlay application as it provides access to a different hardware resource.

Media is audio output only and is used for any media playback.

Alert is also output only and plays ringtones and timer alerts.

speechRecognition is used for Siri as it adds access to the microphone.

Same for the telephony type, which is used for phone calls.

And same for the default, which is used for undefined audio.

But don’t forget the second audio channel used for navigation announcement.

alternateAudio is not managed.

It is basically always available, so there is no need to take it or borrow it.

alternateAudio is mixed with any of the audio types within the mainAudio channel, and it is always accessible.
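One way to summarize the audio types and the hardware access each implies is a small table; this mapping is a sketch based on the descriptions above, not the protocol's actual schema:

```python
# Sketch: mainAudio types and the hardware access each implies, based
# on the descriptions above. alternateAudio is unmanaged: it is always
# available and mixed on top of whatever mainAudio type is active.
MAIN_AUDIO_TYPES = {
    "media":             {"output"},           # any media playback
    "alert":             {"output"},           # ringtones, timer alerts
    "speechRecognition": {"output", "input"},  # Siri; adds the microphone
    "telephony":         {"output", "input"},  # phone calls
    "default":           {"output", "input"},  # undefined audio
}

def needs_microphone(audio_type):
    return "input" in MAIN_AUDIO_TYPES[audio_type]

def is_managed(channel):
    # Only mainScreen and mainAudio are ever taken or borrowed.
    return channel in ("mainScreen", "mainAudio")
```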

And with this, I’m going to hand off to Tom to talk about the resource manager.

Thanks, Tanya.

Hi. I’m Tom, and I also work on the CarPlay Engineering Team.

So now that you’ve learned which resources need to be managed and you know you can either take or borrow those resources, let’s talk about how you go about managing them.

To be able to distribute the resources between the two sub-systems, we need some type of arbitrator.

And we call such an arbitrator a resource manager.

So what does the resource manager do?

It has three main tasks.

First, it holds the current state of the whole system.

Second, it follows a set of strict rules to decide which system will get the resource.

Third, based on the current state and the set of the rules, it assigns the resource to one party or the other.
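Those three tasks can be sketched as a toy arbitrator; the numeric priority rule here is a made-up placeholder, since the real rules come from the CarPlay specification:

```python
# Toy arbitrator for the three tasks above: hold the system state,
# apply a rule to competing requests, assign the resource and notify.
class ResourceManager:
    def __init__(self):
        # 1. Current state of the whole system.
        self.owner = {"mainScreen": "carplay", "mainAudio": "carplay"}

    def arbitrate(self, resource, requests):
        """requests: list of (requester, priority) pairs."""
        # 2. A strict rule decides who gets the resource.
        winner = max(requests, key=lambda r: r[1])[0]
        # 3. Assign it and notify; the state changes only here.
        self.owner[resource] = winner
        return {"modesChanged": {resource: winner}}

rm = ResourceManager()
note = rm.arbitrate("mainScreen", [("native", 2), ("carplay", 1)])
# The native UI may show content only after receiving this notification.
```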

So let’s see how this works in practice.

Let’s say both the native UI and CarPlay need to show something on the screen.

So both send a request to have ownership of the display.

Then, the resource manager checks the internal state and decides which of the two should get access to the display.

Let’s say, in this case, the native UI request had a higher priority and the screen is assigned to it.

The resource manager sends a notification that the screen can now be used by the native UI.

Only then can the native UI show content on the screen.

It’s important to note that this state is not changed until the resource manager has sent an update.

So where does the resource manager live?

Is it part of the native system or of CarPlay?

Well, when we designed CarPlay, we asked ourselves the same question.

We considered the complexity of exchanging detailed information about the resources.

We considered the future.

What, if any, of those rules needed to be adjusted?

What if new applications became available through CarPlay that the vehicle had no existing designs to handle?

What solution would give us the greatest flexibility five years down the road?

So we decided to implement a system where the complexity on the native system is lower and it’s easier to update after the vehicle has been in the customer’s hands for a while.

Hence, the resource manager is implemented within iOS.

And because the resource manager is the component with which the native system interacts, we refer to the iPhone as the controller, and the head unit as the accessory.

But don’t forget, all of the iPhone applications request the same resources in the same way as the native UI would.

Next. What commands can you use to interact with the resource manager?

It’s simple.

You only need two commands.

The changeModes command is used to request or release a resource, and the modesChanged command is used to describe the current state.

changeModes is the request sent by the head unit, the accessory in this case, to the resource manager, the controller.

This changeModes command states what the accessory wants to do with the resource.

It declares why it needs it and who can take or borrow the resource after it is transferred.

modesChanged is the notification sent by the controller back to the head unit.

modesChanged provides the current state, which describes who owns each resource in the system.

It is sent so the accessory knows whether a resource has been transferred and the owner has changed.
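As an illustration only, the two commands could be pictured as messages like these; the field names are invented for readability, and the real wire format is defined in the CarPlay specification available through the MFi Program:

```python
# Illustration only: plausible shapes for the two commands. Field names
# are guesses for readability, not the specification's actual layout.
def change_modes(resource, transfer_type, reason, constraints):
    """Accessory -> controller: request or release a resource."""
    return {
        "changeModes": {
            "resource": resource,           # e.g. "mainScreen"
            "transferType": transfer_type,  # "take" or "borrow"
            "reason": reason,               # why the accessory needs it
            "constraints": constraints,     # who may take/borrow it next
        }
    }

def modes_changed(owners):
    """Controller -> accessory: the current owner of each resource."""
    return {"modesChanged": {"owners": dict(owners)}}

req = change_modes("mainAudio", "take", "FM radio", {"borrow": "anytime"})
state = modes_changed({"mainAudio": "accessory", "mainScreen": "controller"})
```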

Now we’ll talk about how resource management works in more detail.

We will start with the simple action of switching audio from iOS audio to FM.

Then, we’ll talk about native voice recognition, then how to handle a backup camera case where you would not want iPhone apps to interrupt you, and lastly, we’ll walk through an example where Siri triggers music playback.

So let’s look at playing FM radio.

Let’s say iPhone music is playing through the car speakers; then the user wants to play FM radio through the native system and presses the Radio button.

The unit sends a changeModes request to have ownership of the speakers, and it takes the resource, as the user might continue listening to FM radio for a prolonged period of time.

The controller assigns the audio to the head unit and sends a modesChanged notification.

And the head unit is the new owner of the audio resource, so it can now start playing FM radio.

The mainAudio resource is permanently assigned to the head unit at this point.

So, in summary, this example showed us when to take a resource and that the native system should not use that resource before it is the owner of that resource.

Let’s look at the next example using native voice recognition.

FM radio is still playing from our last example.

Let’s see what happens if the user triggers the native voice recognizer.

The unit requests to temporarily own the display and audio, so the transferType should be set to borrow the audio and screen.

Both the mainScreen and the mainAudio are transferred to the accessory.

And the native voice recognizer starts.

Once the voice dialogue is finished, the head unit returns the borrowed resources by sending an unborrow command.

And since the unit was playing FM radio before, the resource is again assigned back to the head unit and FM radio can resume.

Now you may be wondering, “Why would the accessory need to borrow the resource if it was already the owner?”

Well, that’s a great question.

Whenever the resource manager needs to evaluate a request to change the ownership of a resource, it must know what the current state of the system is.

The resource manager needs to know why you’re using the resources so that it can make the right decision if somebody else needs the resource later on.

Let’s move on to our next example, showing the backup camera.

Let’s say CarPlay audio is playing and the native UI is showing on the screen.

When the user switches into reverse gear, the unit borrows the screen.

However, notice it also tells the resource manager that the screen cannot be borrowed again.

The resource manager assigns the display to the native UI and also notes the limitation.

So the backup camera is now shown.

CarPlay audio continues to play, but now the user gets a phone call.

The iPhone cannot show any content on the screen as the head unit has constrained the access, but the ringtone plays through the speakers.

So how did that happen?

With the changeModes command, the unit told the resource manager which rules apply to the resources in its possession.

By setting the borrowConstraint to anytime, the unit allows any application to borrow the resource.

By setting it to user-initiated, only events triggered by the user can borrow it.

With never, no application can use the resource, as with this example.
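The three borrowConstraint values can be sketched as a single check; the value strings mirror the talk, while the function itself is illustrative:

```python
# Sketch: evaluating a borrow request against the constraint that the
# current owner declared when it took the resource.
def may_borrow(constraint, user_initiated):
    if constraint == "anytime":
        return True               # any application may borrow, any time
    if constraint == "user-initiated":
        return user_initiated     # only events triggered by the user
    if constraint == "never":
        return False              # e.g. while the backup camera is shown
    raise ValueError(f"unknown constraint: {constraint}")

# Backup camera case: the screen was taken with constraint "never", so
# an incoming call cannot borrow the screen (only the ringtone plays).
```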

As you see, the constraints have a major impact on the system behavior, so it’s very important that they are used properly.

Only restrict access if there’s a need for immediate user attention.

Do not restrict permanent ownership when CarPlay is connected.

So that was using backup camera.

And the last example we want to look at is what happens when Siri plays music.

Again, we start with FM radio playing.

The user triggers Siri.

Siri is launched on the phone and the resource manager tells the native system that the resources have been assigned for Siri.

Siri shows up and the user can ask, “Siri, play some music.”

Siri returns the display and audio resources, but the music app now needs the audio resource to play music.

So the resource manager notifies the unit that mainAudio is now with the phone, but the screen returns to the accessory.

The head unit does not continue to play FM radio, and so iPhone audio is heard in the car.

So what did we see here?

On two occasions, the resource manager changed the state without a request by the head unit.

And this is completely fine because the apps on the device request resources as well.

And if they get them, the unit is notified about it.

So, how to handle this?

First, after each modesChanged, you check if you’re the owner.

If you are, resume your activity.

Otherwise, you should wait for CarPlay to take action.

Second, do not ignore resource switches to the phone when they are triggered by the phone.
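Those two rules can be sketched as a small handler on the head-unit side; the callback names are illustrative:

```python
# Sketch: head-unit handling of a modesChanged update. Resume your own
# activity only if you are the owner; otherwise wait for CarPlay to act.
def on_modes_changed(owners, resume_native, wait_for_carplay):
    if owners.get("mainAudio") == "accessory":
        resume_native()      # e.g. resume FM radio
    else:
        wait_for_carplay()   # the phone owns audio now; do not ignore it

events = []
on_modes_changed({"mainAudio": "controller"},
                 lambda: events.append("resume"),
                 lambda: events.append("wait"))
```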

OK, so far, we’ve looked at examples showing how resources are being exchanged.

But there is another situation where both sides want to accomplish the same task.

So let’s look at how applications are managed.

We need to manage native and CarPlay applications which have similar features.

But what are those applications?

They are active route guidance, phone calls, and voice recognition.

We manage those using appStates.

But what is an appState?

We use appStates to record which side is currently engaged in a shared application, since such an application can only be active on the accessory or the controller at any given time.

So the three different appStates are TurnByTurn, PhoneCall, and Speech.

So let’s have a look at how that route guidance appState is managed.

When turn-by-turn navigation is started on the accessory, it uses a changeModes command to update the TurnByTurn appState to be active.

In this example, it is a last-in-wins situation.

So the controller grants the TurnByTurn appState to the accessory, returning a modesChanged command showing the accessory as currently running route guidance.

Now let’s look at what happens when the user sets a new destination using Apple Maps while the native system is already guiding.

The user says, “Siri, take me to the closest coffee shop.”

Now, the iPhone has started route guidance.

So the controller updates the TurnByTurn appState.

This is the instruction for the accessory to stop its own native turn-by-turn guidance.

So now, the Apple Maps turn-by-turn directions are shown on the display and announcement played over audio.

And there will be no conflict with the native guidance system.
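The last-in-wins rule for shared appStates can be sketched like this; the class and method names are illustrative:

```python
# Sketch: last-in-wins for shared appStates that can only be active on
# one side (accessory or controller) at a time.
class AppStates:
    def __init__(self):
        self.active = {"TurnByTurn": None, "PhoneCall": None, "Speech": None}

    def activate(self, app_state, side):
        """Returns the side that must now stop its own instance, if any."""
        previous = self.active[app_state]
        self.active[app_state] = side
        return previous

states = AppStates()
states.activate("TurnByTurn", "accessory")           # native guidance starts
loser = states.activate("TurnByTurn", "controller")  # Apple Maps takes over
```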

So that was route guidance.

Now let’s look at a phone call.

If the unit supports a second phone over Bluetooth Hands-Free Profile and there is an ongoing phone call, that call uses mainAudio exclusively.

The phone call cannot be interrupted.

So what happens if the user gets a phone call on the CarPlay device?

The CarPlay call rings only on the device as it should not interrupt the ongoing conversation.

This is managed using the PhoneCall appState.

And lastly, let’s look at voice interactions.

If the unit has a native voice recognizer, then while it’s running, it has borrowed the resources and is exclusively using the speaker, microphone, and screen.

But it can be interrupted at any time by user-initiated events.

So if Siri is triggered by the user, native voice recognition ends and Siri continues on the in-car display.

The Speech appState is used to achieve this.

So, in summary, CarPlay relies on the same resources as your native system and is designed to coexist with your native user experience.

For a great CarPlay experience, consider resource handling for each use case and follow the CarPlay design recommendations.

The CarPlay specifications are available through the MFi Program.

If you haven’t watched Developing CarPlay Systems, Part 1 yet, I’d encourage you to go and check that out.

For more information on this talk, please visit the URL on the screen.
