What’s New in Core Audio for iOS

Session 602 WWDC 2013

Core Audio is the professional-grade technology for playing, processing and recording audio on iOS. Learn how your apps can take advantage of the latest advances in Core Audio. Discover how to send and receive audio between multiple apps, allowing for advanced mixing, effects, and sound generation.

[ Silence ]

Good morning everyone.

My name is Tony Guetta and I'm the Manager in the Core Audio group at Apple.

And today, I'm going to talk to you about what's new in Core Audio for iOS.

We're going to begin with a very high level overview of some of the new audio features in iOS 7.

And for the majority of this session, we're going to spend our time focused on one new technology in particular that we think you're going to be very excited about.

So, let's dive in to the list of new features.

First is Audio Input Selection and with input selection, your application now has the ability to specify which audio input it would like to use in certain situations.

So for example, if the user had a wired headset plugged into his or her device, but your app wanted to continue to use the built-in microphone for input, you now have the capability to control that.

With input selection, you can also choose which microphone you'd like to use on our multi-mic platforms.

And on devices that support it, such as the iPhone 5, you can take advantage of microphone beamforming processing to set an effective microphone directivity by specifying a polar pattern such as cardioid or subcardioid.

We've made some enhancements to multichannel audio on iOS 7.

And through the use of the AVAudioSession API, you can now discover the maximum number of input and output channels that are supported by the current audio route as well as being able to specify your preferred number of input and output channels.

For audio outputs that support it, such as HDMI, you can obtain audio channel labels which associate a particular audio channel with the description of a physical speaker location such as front left, front right, center and so on.

We've added some extensions to OpenAL to enhance the gaming audio experience in iOS 7, starting with the ability to specify a spatialization rendering quality on a per-sound source basis.

Now, you might use this to specify a very high quality rendering algorithm for the important sound sources in your game.

But a less CPU-intensive algorithm for the less important sound sources in your game.

We've also made some improvements to our high quality spatialization rendering algorithm.

And also added the ability to support rendering to multichannel output hardware when it's available.

Finally, we've added an extension to allow capturing the output of the current OpenAL 3D rendering context.

We've added time-pitch capabilities to Audio Queue.

So your application can now control the speed up and slow down of Audio Queue playback both in terms of time and in frequency.

We've enhanced the security around audio recording in iOS 7 and we now require explicit user approval before your application can do audio input.

Now, the reason for doing this is to prevent a malicious application from being able to record a user without him or her knowing it.

The way that this works is very similar to the way that the location service's permission mechanism works.

In that the user is presented with a modal dialog requesting his or her permission to use the audio input.

The decision is made on a per-application basis and it is a one-time decision.

However, if you'd like to go in and change your mind at a later time, you can always go into the Settings application to do that.

Until the user has given your application permission to use audio input, you will get silence so you need to be prepared to handle that.

Now, what actually triggers the dialog being presented to the user is an attempt by your application to use an audio session category that would enable input, such as the record category or play and record.

However, if you'd like to have control over when the user is presented with this dialog so that it can happen at a more opportune time for your application, we've added some API in AVAudioSession for you to be able to do that.

Finally, just a note on the AudioSession API.

As we mentioned at last year's conference, the C version of the AudioSession API is officially being deprecated in iOS 7.

So, we hope that over the course of the past year, you've all had the opportunity to move your applications over to using the AVAudioSession API.

So, here is a summary of the features that we just discussed.

We're not going to spend any more time today going over any of these topics in more detail.

So, if you have any questions about these or would like a more detailed overview of any of these items, we encourage you to come by our labs either later today or tomorrow morning and we'd be happy to discuss them with you in more detail.

I'd also encourage you to have a look at the documentation in the various header files that I outlined in the course of going through each of these topics.

So for the remainder of this session, we're going to focus on one new technology in particular that again, we think you're going to be very excited about and that's Inter-App Audio.

So what is Inter-App Audio?

Well, as the name implies, Inter-App Audio provides the ability to stream audio between applications in real-time.

So, if you have a really cool effects application and you want to integrate that into your DAW application, you now have the ability to do that.

We've built Inter-App Audio on top of existing Core Audio APIs so it should be very easy for you to integrate into your existing applications and deploy quickly to the App Store.

Because it's built into the operating system, the solution is very efficient with zero additional latency and should provide for a stable platform for the evolution of the feature over time.

Now, before we get into any of the technical details of how Inter-App Audio works, I'd like to invite up Alec from the GarageBand team to give you a demo.

[ Applause ]

Thanks Tony.

Am I up?

Yeah.

My name is Alec, I am a product designer for GarageBand and Logic and I'm going to switch over here to my iPad.

So, what I want to do today is give a quick demonstration about how we have been working with the development version, kind of a sneak peek into a development version of GarageBand and how we're doing some experiments with Inter-App Audio.

So, what I have up here is just a simple four-track song in GarageBand. I'm going to play a little bit so you can get an idea of what it sounds like.

[ Music ]

OK. So the first thing I want to do is I want to add a little keyboard part to this.

But instead of using one of the built-in instruments in GarageBand, I want to use an instrument, on the system that is not part of GarageBand.

So to do that, I'm going to go out to the GarageBand instrument browser.

Now, what we see here are the instruments that ship with GarageBand, part of GarageBand.

And then we have a new icon here, Music Apps.

I'm going to tap on that and we see the icons of other apps on the system which are audio apps.

So, I'm going to click on sampler one here and we'll see that the sampler launches in the background.

Now here it is with the UI in the foreground and we can hear it.

Now, you see there's a transport here and that's, this transport is remotely controlling the transport of GarageBand.

So, when I press the record button, what we're going to hear is a count off from GarageBand and then the track that I just played and I'll record over the top of it.

[ Music ]

Brilliant musical passage.

So now, if you look up at this transport again you'll see that there's a GarageBand icon.

When I tap on that icon, I switch back to the GarageBand application and now, in the tracks view, a new track has been added with this little keyboard part that I played, and we can listen to it.

[Background Music] And add some keyboard to it.

[ Music ]

So, that was bringing audio from another application, controlling that application in its interface, and recording that in GarageBand.

The next thing I want to do is I want to process an input from GarageBand.

So, I'm going to put on my little guitar here and we'll go to the guitar amp in GarageBand.

Now, this guitar amp is part of one of the instruments built into GarageBand and I'm going to turn on input monitoring so I can hear myself.

[Background Music] You guys got it out there?

It's a little phase switch.

OK. There you go, that's more rock and roll.

OK. So, that's a good sound right?

That's using the guitar amp from GarageBand.

What I want to do though is I want to process it with another effect on my system.

So again, I'm going to go into the input settings in GarageBand and if you see about halfway down this list, it says Effect App.

I'm going to tap on that and we can see a list of apps on my system that are effects so I'm going to click on this Audio Delay.

[Music] So, there is the delay but it's not really the settings I want.

So, I'm going to tap on the Effect icon and switch to the Effects Interface.

I'm going to take the feedback down here a little bit and the mix.

[Music] OK.

So that's a little bit better.

So now, what I'm doing is I'm taking the input through GarageBand, sending it out to this effect and bringing it back into GarageBand.

Then I can hit record.

[ Music ]

Now, if we switch back to the tracks view, we can see that a new region has been recorded in GarageBand and if I play.

[ Music ]

There is the source with the delay added to it, recorded in GarageBand.

So that's just a quick overview of how we're doing some experiments inside this development version of GarageBand with the new Inter-App Audio APIs.

And next, we're going to bring up Doug to give you a little more detail about how some of the stuff works under the hood.

[ Applause ]

Thank you Alec.

Hi, my name is Doug Wyatt, I'm a plumber in the Core Audio group.

I'd like to present to you some of the details of the Inter-App Audio APIs.

So, conceptually here, we have two kinds of applications which we call the host application and the node application.

The fundamental distinction between these two applications is that the host is where we ultimately want the audio coming from the node application to end up.

So, GarageBand in this example was a host application.

It was receiving audio from the sampler application and from the delay effect application.

So, given these two kinds of applications, we're going to look at APIs for how node applications can register themselves with the system and how host applications can discover those registered node applications.

We'll look at how host applications can initiate connections through the system to node applications.

And once those connections are established, the two applications can stream audio between each other.

But, again, primarily the destination has to be the host, which could optionally send audio to the node if the node is providing an effect.

We'll look at how furthermore host applications can send MIDI events to node applications to control their audio rendering.

So for example, with that sampler application, the host could have been actually sending the MIDI notes to the sampler and receiving the rendered audio back.

We'll look at some interfaces where the host can express information about its transport controls and transport state and timeline position to node applications.

And finally, we'll look at how node applications can remotely control host applications.

So, let's look inside host applications.

So, this is your basic standalone music or audio application on iOS.

We have the AURemoteIO audio unit and its function is to connect to the audio input and output system with zero, well, very low latency using pretty much the same mechanisms as on the desktop but always through the RemoteIO audio unit.

So, feeding the audio unit, we have the host's audio engine and that can be constructed in a number of ways, we'll show some examples later.

But for the purposes of Inter-App Audio, the host engine connects to a node application by instantiating a node audio unit.

This is another Apple supplied audio unit that, in effect, creates the bridge to the remote node application to communicate audio with it.

Now, on the node side, a node application is also by default a normal application with an AURemoteIO that can play and record as always and it's got its own audio engine.

What's a little different here is in the Inter-App scenario, the node application has its input and output redirected from the mic and speaker to the host application.

So, that's the node application.

So, you see then we're implementing this API as a series of extensions to the existing AudioUnit.framework APIs.

The host sees the node application as an audio unit that it communicates with, and the node's AURemoteIO unit gets redirected to the host, so the node's communication to the host application is through that IO unit.

So, to express the capabilities of these node applications and to distinguish them a bit from the existing audio unit types, we have these four new types.

They all are the same in that they produce audio output but they differ in what input they receive from the host.

We have remote generators which require no input.

We have remote instruments which take MIDI input to produce output, audio output.

We have effects which are audio in and out.

And finally, we have music effects which take both audio and MIDI input and produce audio.

So, node applications use these component types to describe their capabilities.

And furthermore one node application may actually have multiple sets of capabilities and may wish to present itself in multiple ways to hosts.

As a simple example, a node application may produce audio just fine on its own, in which case, it's a generator.

It may optionally be able to respond to MIDI in producing that audio.

So, it can also be an instrument.

So, such a node application could publish itself as two different audio components with separate capabilities.

Another example of that is an application like a guitar amp simulator where the application appears to the user as an effect because audio is going in, it's being processed in some way and then it comes out.

But from the host's point of view, this application can appear either as a generator or an effect and the node can publish itself either way.

For example, if a node says I'm a generator, it can continue to receive microphone or a line input from a guitar directly from the underlying AURemoteIO while only sending the audio output to the host.

So again, that's generator mode.

Or a host application like GarageBand might have a prerecorded guitar track and want to process that through the guitar amp simulator.

The guitar amp simulator can function fully as an effect, not communicate with the audio hardware at all, and just communicate the two audio streams between itself and the host.

Let's move on and look at some of the requirements for the Inter-App Audio feature. It's available on most iOS 7 compatible devices, the exception being the iPhone 4.

And on the iPhone 4, you don't really have to deal with this specially because what will happen is node applications, if they attempt to register themselves with the system, those calls will just fail silently, the system will ignore them.

And on the host side, the host will simply see no node applications on the system.

Both host and node applications need to have a new entitlement called "inter-app-audio" and you can set this for your application in the Xcode Capabilities tab.

Furthermore, most applications will want to have audio in their UIBackgroundModes.

Most especially hosts, for obvious reasons, because hosts will keep running their engines when nodes are in the foreground.

Also, nodes like the guitar amp simulator I just mentioned may want to continue accessing the mic and to be able to do that, they too need to have the audio background mode.

One final requirement for nodes in particular is the MixWithOthers AudioSessionCategoryOption.

Hosts can go either way on this one.

We'll get into that in more detail later.

OK. Getting in to the nuts and bolts of the APIs here, let's look at how node applications can register themselves with the system.

So, there are two pieces to registration for a node application.

The first is an Info.plist entry called AudioComponents.

So, the presence of this Info.plist entry makes the app discoverable and launchable to the system.

The system knows, oh, I've got one of these node applications installed.

The second part of registration is for the node application to call AudioOutputUnitPublish which checks in that registration that it advertised in its Info.plist.

It says, I've been launched and here I am ready to communicate.

So let's look at those two pieces in a little detail.

So here is the AudioComponents entry in the Info.plist.

Its value is an array and in that array, there is a dictionary for every AudioComponent that the node wants to register.

And in that dictionary, if you're familiar with AudioComponentDescriptions already, you'll see some familiar fields there.

There is the type, subtype and manufacturer along with the name and the version number.

So, that completely describes the AudioComponent that the node application is advertising.

So, moving on to the second part of the registration here, this is when the node application launches, the first piece of code here is the node's normal process for creating its AURemoteIO when it launches.

It creates an AudioComponentDescription describing the Apple AURemoteIO instance, it uses AudioComponentFindNext to go find the AudioComponent for the AURemoteIO.

And finally, it creates an instance of the AURemoteIO and this is something just about every audio and music app today will do for creating a low latency IO channel.

What's new is that the node application, to participate in Inter-App Audio is now going to connect that IO Unit that it just created with the component description that was published in the Info.plist entry.

So, to do that, we see in this code here that the node creates an AudioComponentDescription which matches the one in the Info.plist we saw a moment ago.

It supplies the name and version number and passes all that along with the AURemoteIO instance to a new API called AudioOutputUnitPublish.

So again, that connects what was advertised in the Info.plist with the actual RemoteIO instance in the application to which the host application will connect as we'll see in a little bit.

So, to make this all work, a requirement of the node application is to publish that RemoteIO unit when it launches because the node application is going to get launched by host applications at times when the user wants to use them.

And so, the node application basically has to acknowledge, I'm here, I've been launched.

And so, you can see then why the Info.plist entry and the call to AudioOutputUnitPublish must have the same component descriptions, names, and versions.

One note here is that by convention, the component name should contain your manufacturer name and application name and that lets host applications sort the available node applications by manufacturer name if they like.

So, that's the registration process for node applications, let's look at how host applications can discover those registrations.

So again, if you've used the AudioComponent calls before, this should look fairly familiar.

What we have here is a loop where we want to iterate through all of the components on the system because we're looking for nodes and there are multiple types.

So, the simplest way to do that is to create a wild card component description and that's the searchDesc, it's full of zeros.

And so then, this loop will call AudioComponentFindNext repeatedly and that will yield in turn each of the AudioComponents on the system which are in the local variable comp.

When we find null, then we've gotten to the end of the list of all the components in the system and we're done with our loop, we'll have found them all.
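
The shape of that loop can be tried out without the audio frameworks. The sketch below is a portable C model and not the real API: `SketchDesc`, `sketch_find_next`, and `sketch_count_matches` are hypothetical names, the registry is just an array, and component flags are ignored. It only mirrors the contract described above: a zero field in the search description is a wildcard, passing null starts the search, and a null result means the list is exhausted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for AudioComponentDescription. */
typedef struct {
    uint32_t type;
    uint32_t subType;
    uint32_t manufacturer;
} SketchDesc;

/* A zero in the search description matches anything. */
static bool field_matches(uint32_t wanted, uint32_t actual)
{
    return wanted == 0 || wanted == actual;
}

/* Returns the first entry after `previous` (or from the start when
   `previous` is NULL) that matches `search`, or NULL at the end. */
const SketchDesc *sketch_find_next(const SketchDesc *registry, size_t count,
                                   const SketchDesc *previous,
                                   const SketchDesc *search)
{
    assert(search != NULL);
    assert(registry != NULL || count == 0);

    size_t start = (previous == NULL) ? 0 : (size_t)(previous - registry) + 1;
    for (size_t i = start; i < count; i++) {
        const SketchDesc *c = &registry[i];
        if (field_matches(search->type, c->type) &&
            field_matches(search->subType, c->subType) &&
            field_matches(search->manufacturer, c->manufacturer)) {
            return c;
        }
    }
    return NULL;
}

/* The discovery loop: keep asking for the next match until NULL. */
size_t sketch_count_matches(const SketchDesc *registry, size_t count,
                            const SketchDesc *search)
{
    size_t found = 0;
    const SketchDesc *comp = NULL;
    while ((comp = sketch_find_next(registry, count, comp, search)) != NULL) {
        found++;
    }
    return found;
}
```

With an all-zero search description the loop visits every entry, which is why the host then has to inspect each found description to pick out the node types.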

Now, for each component on the system, what we want to do is call AudioComponentGetDescription and this will supply to us the AudioComponent description of the actual unit as opposed to that wild card that we used for searching.

So now, in foundDesc, we can look at its component type and see if it's one of the four inter-app audio unit types that we're interested in, the RemoteEffect, RemoteGenerator, RemoteInstrument, and RemoteMusicEffect.

If we see one of those, then we know we found the node.

OK. So the host has found a node.

So now, I'm going to walk through a little bit of code here from one of our sample apps.

It creates an Objective-C object of its own just as a way of storing information about the nodes that it's found.

And it calls this class RemoteAU and it stores away into it the component description that was found and the AudioComponent that was found.

It also fetches the component's name and stores that in the field of the RemoteAU object.

It sets the image from AudioComponentGetIcon which is a new API call which works with inter-app audio.

This gives you the icon of the node application.

We can also discover the time at which the user last interacted with the node app and this can be useful if we want to sort a list of available node applications by time of when they were most recently used, the way the home screen does.

So we've gathered up all this information about the node application, and now we've built an array from which we can drive a table view and present the user with a choice of node applications to deal with.

One wrinkle though is having cached all that information in an array, it can become stale and out of sync with the system.

Most notably, that happens when apps are installed and deleted. So if you find yourself caching a list of components like this, you should probably also listen to this new notification that we supply; its name is AudioComponentRegistrationsChangedNotification.

So, you can pass that to NSNotificationCenter to register for a notification.

In this example, we're supplying a block to be called, and in that block, which is called when the registrations have changed, we can refresh that cached list of audio units we built.

So, that's the process of discovering node apps for host.

So now, we've built up a table and maybe the user has selected one of them and in the host application now, we want to actually establish a connection to the node application so let's look at how that works.

The first step is very simple because we held on to the AudioComponent of that node.

Now, all we have to do is create an instance of that component and now, we have an audio unit through which we can communicate with the node.

It's worth mentioning that this is the moment at which the node application will get launched into the background if it's not already running.

And we'll look at all the mechanics of what happens on the node side of that later.

Right now, we're just going to focus on the host side.

So, the host has to do a fair few steps here to get ready to stream audio between itself and the node.

Most importantly, the host must be communicating with the node using the same sample rate as the hardware.

So, to be absolutely sure the hardware sample rate is what it's supposed to be, we should be making our audio session active if we haven't already.

So, once having done that, then we can specify the audio stream basic description which is a detailed description of the audio format that the host wishes to use to communicate with the node.

So, we can choose mono or stereo.

In this example, I've chosen stereo.

Here is where we're using the hardware sample rate. And these lines of code here are basically specifying 32-bit floating-point, non-interleaved.

Now, the host application can choose any format it likes here and the system will perform whatever conversions are necessary as long as there's not a sample rate conversion being requested.

Again, you must use the sample rate that matches the hardware.
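
The format being described, stereo 32-bit floating-point non-interleaved at the hardware sample rate, can be written out field by field. This is a sketch under assumptions rather than the slide's code: `SketchASBD` is a hypothetical struct whose fields mirror `AudioStreamBasicDescription`, the flag and format ID values are what I recall from `CoreAudioTypes.h`, and the native-endian flag is left out because it is zero on little-endian devices.

```c
#include <assert.h>
#include <stdint.h>

#define FOURCC(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | ((uint32_t)(c) << 8) | (uint32_t)(d))

/* Hypothetical mirror of AudioStreamBasicDescription (same field names). */
typedef struct {
    double   mSampleRate;
    uint32_t mFormatID;
    uint32_t mFormatFlags;
    uint32_t mBytesPerPacket;
    uint32_t mFramesPerPacket;
    uint32_t mBytesPerFrame;
    uint32_t mChannelsPerFrame;
    uint32_t mBitsPerChannel;
    uint32_t mReserved;
} SketchASBD;

/* Assumed values of kAudioFormatLinearPCM and three kAudioFormatFlag* bits. */
#define SKETCH_FORMAT_LINEAR_PCM          FOURCC('l', 'p', 'c', 'm')
#define SKETCH_FLAG_IS_FLOAT              (1u << 0)
#define SKETCH_FLAG_IS_PACKED             (1u << 3)
#define SKETCH_FLAG_IS_NON_INTERLEAVED    (1u << 5)

/* Builds a 32-bit float, non-interleaved linear PCM description.
   hardwareSampleRate must be the rate reported by the active audio session. */
SketchASBD sketch_float_format(double hardwareSampleRate, uint32_t channels)
{
    assert(hardwareSampleRate > 0.0);
    assert(channels > 0);

    SketchASBD f = {0};
    f.mSampleRate       = hardwareSampleRate;
    f.mFormatID         = SKETCH_FORMAT_LINEAR_PCM;
    f.mFormatFlags      = SKETCH_FLAG_IS_FLOAT | SKETCH_FLAG_IS_PACKED |
                          SKETCH_FLAG_IS_NON_INTERLEAVED;
    /* Non-interleaved: each channel has its own buffer, so the per-frame
       and per-packet sizes describe a single channel's sample. */
    f.mBytesPerFrame    = (uint32_t)sizeof(float);
    f.mFramesPerPacket  = 1;
    f.mBytesPerPacket   = f.mBytesPerFrame * f.mFramesPerPacket;
    f.mChannelsPerFrame = channels;
    f.mBitsPerChannel   = 8 * (uint32_t)sizeof(float);
    return f;
}
```

Calling `sketch_float_format(44100.0, 2)` would describe stereo at 44.1 kHz, but only if 44.1 kHz is what the hardware is actually running at.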

All right.

Now, we have built up an audio stream basic description, and we can use AudioUnitSetProperty on the node AudioUnit for the stream format property. Since it's in the output scope, this is specifying the output format of the audio we need to receive from the node.

If we're working with a generator or instrument, which don't take audio input, that's it, we're done, we've just specified the output format.

But if we're dealing with an effect, then we should also specify the input format.

And in many cases, it's going to be identical to the output format and so, we can make that same call using the input scope to set the input format that we're going to supply to the node.

So, having specified formats, we can look at how we're going to get audio from the host into the node, and this is starting to get into the details of multiple ways that your host may be interacting with the node AudioUnit.

Now, since we're connecting input, this is only for effects and the host at this point can supply input to a node from another audio unit using AUGraphConnectNodeInput.

AUGraph is a higher level API which I'm just going to touch on a few times today but you can use AUGraph to build up graphs or a series of connections between audio units.

The other way to make a connection to the node's input from some other audio unit is with the AudioUnitProperty MakeConnection.

Alternatively, a host can simply supply a callback function with the SetRenderCallback property.

This callback function gets called at render time and the host supplies the audio samples to be given to the node.
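
The job of such a callback is to hand over exactly the number of frames asked for, every time. Here is a simplified sketch of that idea in portable C. It is not the real callback signature (the actual render callback also receives action flags, a timestamp, a bus number, and an AudioBufferList), and `SketchSource` and `sketch_supply_input` are hypothetical names. It copies mono samples from a host-owned buffer and pads with silence when the source runs out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical host-side state passed to the callback as its context. */
typedef struct {
    const float *samples;   /* mono source audio owned by the host */
    size_t       length;    /* total frames available in samples */
    size_t       position;  /* next frame to hand out */
} SketchSource;

/* Callback-shaped function: always fills `out` with `frames` samples.
   Returns 0 to mean "no error". */
int sketch_supply_input(void *context, float *out, size_t frames)
{
    SketchSource *src = (SketchSource *)context;
    assert(src != NULL && out != NULL);
    assert(src->position <= src->length);

    size_t remaining = src->length - src->position;
    size_t toCopy = frames < remaining ? frames : remaining;

    if (toCopy > 0) {
        memcpy(out, src->samples + src->position, toCopy * sizeof(float));
    }
    /* Zero-fill whatever the source could not provide so the node never
       receives stale or uninitialized samples. */
    if (frames > toCopy) {
        memset(out + toCopy, 0, (frames - toCopy) * sizeof(float));
    }
    src->position += toCopy;
    return 0;
}
```

The design choice worth noting is that the function does no allocation or locking, which is the usual expectation for code that runs at render time.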

Now, as far as connecting the output of the node, this too depends on the way you built your host engine.

If you're using audio units, you'll want to connect the node output to some other audio unit.

You can use the MakeConnection property again.

If however you're pulling audio into a custom engine, then you would call AudioUnitRender but there's no setup at this time for that.

We'll look at the rendering process in more detail a little later.

OK. One last bit of mechanics here that a host needs to do to establish a reliable connection to a node, or actually to reliably handle bad things happening with the node, is to look out for what happens when nodes become disconnected.

This could happen automatically if the node app crashes, or if the system ejects it from memory because it's under memory pressure.

Also, if the host fails to render the node application regularly enough, the system will break the connection.

When these things happen, then the node AudioUnit becomes, in effect, a zombie, meaning that there's still an audio unit there.

You can make API calls on it and they won't crash, but you will get errors back, and that's the error that you'll get back, the InstanceInvalidated error.

The mechanics of establishing that disconnection callback - we call AudioUnitAddPropertyListener for this new property IsInterAppConnected.

Here is what you would do in the connection listener, you can fetch current value of the property and see if it is 0 and if the local variable here connected has become zero, then you know the node application has become disconnected and you should react accordingly.

So, all of that prep work has led us up to the point where we're ready to actually initialize the node.

Now, the AudioUnitInitialize call basically says to the system and the other AudioUnit: here, allocate all of the resources you need for rendering.

In the case of inter-app audio, the system at this point is also allocating some resources on behalf of that connection such as the buffers between the applications and the real-time rendering thread in the node application.

So, it's important to realize that this is the point at which you are beginning to consume resources and as such, you have the responsibility now of calling AudioUnitRender regularly on this node audio unit.

So, that's the process of setting up a host to communicate with a node.

You activate your audio session, you set your stream formats, you connect your audio input, add a disconnection listener, and finally call AudioUnitInitialize.

So having done that, you're at the point now where you're ready to begin streaming audio between the two applications.

Let's look inside a host application's engine in more detail.

This is kind of a wonderfully simple way to do things if you can get your work done using Apple AudioUnits.

So, the green box, the green dotted lines box represents a host engine but those red boxes inside are all Apple supplied audio units.

So, there is the AURemoteIO, and we have a mixer AudioUnit feeding that.

And feeding the mixer, we have a file player AudioUnit and the node AudioUnit.

But of course, there are many things you would want to do with audio that Apple doesn't give you AudioUnits for.

If you want to do that, then you're going to write some code of your own, represented by the green box with the squiggly brackets.

So here, your engine is feeding the AURemoteIO and if you've written an app like this before, you know the way to provide input to an AURemoteIO from your own engine is with the SetRenderCallback property.

And now in this case, to fetch the audio from the node AudioUnit, you would call AudioUnitRender.

OK. So, that's a bunch of stuff about how a host application interacts with a node.

One final nice thing to do for the user here is to provide a way for the user to bring the node application to the foreground.

So, we can do this by asking the audio unit for a PeerURL and this URL is only valid during the life of the connection.

You don't want to hold on to it because it's not going to be useful later.

But right before the user wants to switch in response to that Icon tap or whatever, you can fetch the PeerURL then pass that to UIApplication and ask it to open that URL and that will accomplish the switch of bringing the node application to the foreground.

So, let's go back just a little bit and look at how node applications see the process of becoming connected to hosts.

So, the most important thing to think about here as the author of a node application is that when the user opens your application explicitly from the home screen, you're launched into the foreground state, you're ready to start making music.

But if you're being launched from the context of a host application, you're actually going to get launched into the background state and there are some limitations about what you can do in this state and there's also a requirement here.

You can't start running from the background but you must create and publish your I/O unit as I showed earlier.

So, it's probably going to be necessary and useful in your node application to ask UIApplication what the state is, whether you're in the background or in the foreground, and proceed accordingly.

So, node applications to find out when they're becoming connected and disconnected can also listen for the IsInterAppConnected property just as I described for host applications earlier.

For a node application, you listen to this property on your AURemoteIO instance.

So, in your property listener, you can notice the transitions of this property value from zero to one.

When you see it becoming true, then you know that your output unit has been initialized underneath you and that you should set your audio session active if you're going to access the microphone.

You should at this time start running because that's kind of your final step of consent saying, My engine is all hooked up and ready to render, start pulling on me.

You can, at this time, start running even if you are in the background.

This is the exception to the rule about running in the background.

When you are connected to the host, you can start running in the background.

One further note, if you want to draw an icon representing the host that you've become connected to, there's a new API called AudioOutputUnitGetHostIcon.

Pertaining further to the IsInterAppConnected property, you also want to watch for the transition to zero or false meaning that the host has disconnected from you.

What you want to do at this point is understand that your output unit has been uninitialized and stopped out from underneath you.

Now, if you were accessing the microphone, you should set your session inactive at this time.

However, you might, in some situations, find yourself disconnected while in the foreground.

Maybe the host application crashed or the system didn't have enough memory to keep it running.

So, if that happens, you probably do want to start running and keep your audio session active or make it active if it isn't already.

But again, you can only make your session active and start running when you're in the foreground.

So, just to reemphasize that.

Your node application can start if you've been connected to the host or you're in the foreground, but you can keep running into the background if you're connected, of course, or if you are in some other standalone non-inter-app scenario where your app wants to keep running into the background.
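The start rule for nodes can be written down as a small decision function. This is only a sketch in portable C, not Core Audio code: the `NodeAction` values and both function names are hypothetical, and a real node would make this decision inside its IsInterAppConnected property listener after asking UIApplication for the application state.

```c
#include <stdbool.h>

/* Hypothetical action codes for this sketch; they are not Core Audio constants. */
typedef enum {
    NODE_ACTION_START_IO,           /* make the session active if needed, start the I/O unit */
    NODE_ACTION_DEACTIVATE_SESSION  /* the I/O unit was stopped for us; release the session  */
} NodeAction;

/* A node may start its I/O unit only when it is connected to a host
   or when the app is in the foreground. */
static bool node_may_start_io(bool connectedToHost, bool inForeground)
{
    return connectedToHost || inForeground;
}

/* What to do when the connected flag changes to `connectedToHost`. */
static NodeAction node_action_on_connection_change(bool connectedToHost,
                                                   bool inForeground)
{
    if (connectedToHost) {
        /* Connected: starting is allowed even from the background. */
        return NODE_ACTION_START_IO;
    }
    /* Disconnected: carry on standalone only if the user can see the app. */
    return inForeground ? NODE_ACTION_START_IO
                        : NODE_ACTION_DEACTIVATE_SESSION;
}
```

Keeping the rule in one place makes it harder to start rendering from the background by accident while not connected.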

Let's look again now at a few different scenarios involving how nodes render audio.

This is your normal standalone mode when the user has launched you.

You've got your engine connected to the RemoteIO, connected to the audio I/O system.

If you're a generator or instrument, you may have your output completely redirected to the host.

But if you leave your input bus enabled and you advertise yourself as a generator or instrument, then you continue to receive input from the microphone even while your output has been redirected to the host application.

Now, this doesn't add any extra latency because the system is smart enough to deliver the microphone input to your application first, and then in that same I/O cycle, the host application will pull your output.

In the final node rendering scenario, if you're an effect, both your input and output streams are connected to the host rather than the audio I/O system.

Node applications can also use that PeerURL property I described earlier to show an icon as Alec did in his demo.

He showed the Garageband icon in the sampler app.

So, you can fetch that URL from your AURemoteIO instance in this case to accomplish the switch.

OK. Back on the host side of things, there are a few considerations about stopping audio rendering.

The normal API calls for doing this are AudioOutputUnitStop or AUGraphStop, and what you want to do at this point is promptly uninitialize your AudioUnit representing the node.

That releases the resources that were allocated when you initialized it and it releases you from the promise to keep rendering frequently.

You can turn around and reinitialize when the user wants to start communicating again, or if you're completely done with that node AudioUnit, you can call AudioComponentInstanceDispose, and that's what you would do if the user, for example, explicitly breaks the connection or if you discover that the node application has become invalidated.

So, that's the process of audio rendering.

Next, I'd like to look at how we can communicate MIDI events from host applications to node applications.

Now, this of course, is for remote instrument and remote music effect nodes.

You would want to use this if you have MIDI events that are tightly coupled to your audio that's being rendered.

It lets you sample-accurately schedule MIDI note-ons, control events, pitch-bends, et cetera.

But this is not recommended as a way of communicating clock and time code information.

That's sort of a funny way to communicate that: you're using seven-bit numbers to break up timing information.

We actually have a better way to do that.

I should also mention that this does not replace the Core MIDI framework, which still has a role when you're dealing with USB MIDI input and output devices or, for example, the MIDI network driver.

You might also be dealing with applications that don't support inter-app audio and you still want to communicate with them.

So, let's look at how a host application can send MIDI events.

You might do something like this in an app like the sampler demo app Alec showed.

It had an on-screen keyboard.

So, whenever the user touches the key, you send a note-on.

When the key is released, you send a note-off.

So, the APIs for sending MIDI events are in the header file MusicDevice.h and there is a function in there called MusicDeviceMIDIEvent.

Here, you pass the node AudioUnit and the three-byte MIDI message.

And here, offsetSampleFrames, the final parameter, would be used for sample-accurate scheduling, but since we're doing this in kind of a UI context, we don't really have that kind of sample accuracy.

I'll get into how we do that in a moment.

So, we just pass an offsetSampleFrames of zero, and that note-on will appear at the beginning of the next rendered buffer.
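For the on-screen keyboard case, the three bytes are ordinary MIDI 1.0 channel messages. The helpers below are a sketch in portable C with hypothetical names (`MidiMessage3`, `midi_note_on`, `midi_note_off`); they only build the bytes. A host would then hand the status and two data bytes to MusicDeviceMIDIEvent together with a sample offset of zero.

```c
#include <stdint.h>

/* Hypothetical helper type: the three bytes of a MIDI channel message. */
typedef struct {
    uint8_t status;  /* message type in the high nibble, channel in the low nibble */
    uint8_t data1;   /* note number, 0..127 */
    uint8_t data2;   /* velocity, 0..127 */
} MidiMessage3;

/* Note-on: status 0x90 combined with the channel (0..15). */
static MidiMessage3 midi_note_on(uint8_t channel, uint8_t note, uint8_t velocity)
{
    MidiMessage3 m = {
        (uint8_t)(0x90 | (channel & 0x0F)),
        (uint8_t)(note & 0x7F),
        (uint8_t)(velocity & 0x7F)
    };
    return m;
}

/* Note-off: status 0x80 combined with the channel; velocity 0 in this sketch. */
static MidiMessage3 midi_note_off(uint8_t channel, uint8_t note)
{
    MidiMessage3 m = {
        (uint8_t)(0x80 | (channel & 0x0F)),
        (uint8_t)(note & 0x7F),
        0
    };
    return m;
}
```

A key-down handler would build `midi_note_on(0, 60, 100)` for middle C on the first channel, and the matching key-up handler would build `midi_note_off(0, 60)`.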

Now, if we do want to do sample-accurate scheduling, then we have to schedule our MIDI events on the same thread that's rendering the audio, because in that thread context, we can say where the MIDI events need to land relative to the beginning of that audio buffer.

For instance, if that audio buffer is 1,024 frames, we might do some math and figure out, oh, that note-on needs to land 412 samples into that buffer, and we can specify that in our call to MusicDeviceMIDIEvent.

Now, of course, we can call MusicDeviceMIDIEvent any number of times to schedule any number of events for one render cycle.

I just put these next to each other to emphasize that you have to be in the rendering thread context to be able to schedule sample-accurately.
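The "some math" for the 412-sample example might look like the following. This is a sketch under assumptions: the host is assumed to keep event times and the buffer start time as sample counts on one shared timeline, and `midi_offset_in_buffer` is a hypothetical helper name. The result is the value a host would pass as the sample offset when scheduling the event for this render cycle.

```c
#include <stdint.h>

/* Returns the frame offset at which an event lands inside the buffer that
   starts at bufferStartSampleTime and is framesInBuffer long, or -1 if the
   event belongs to some other render cycle. All times are sample counts on
   the same timeline (an assumption of this sketch). */
static int64_t midi_offset_in_buffer(int64_t eventSampleTime,
                                     int64_t bufferStartSampleTime,
                                     uint32_t framesInBuffer)
{
    int64_t delta = eventSampleTime - bufferStartSampleTime;
    if (delta < 0 || delta >= (int64_t)framesInBuffer) {
        return -1;  /* already past, or not due until a later buffer */
    }
    return delta;
}
```

With a 1,024-frame buffer starting at sample 10240, an event due at sample 10652 lands at offset 412, while an event due at sample 11264 belongs to the next buffer.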

Now, if you're using AUGraph and you want to schedule sample-accurately, it's similar but a little different because you're not calling AudioUnitRender; the graph is doing that on your behalf.

So, the way to do this, there's an AUGraph API that lets you get called back in the render context and that's AUGraphAddRenderNotify.

That gives you a callback function that the graph calls at the beginning of the render cycle before actually pulling audio from the node.

And that turns out to be precisely the correct time to call MusicDeviceMIDIEvent to schedule events for that render cycle.

So, that's the process of sending MIDI events. Let's look at how nodes receive MIDI events.

So, we have two basic functions for sending: MusicDeviceMIDIEvent and MusicDeviceSysEx.

And we have two corresponding callback functions for the use of the node application, the MIDIEventProc and the MIDISysExProc.

So, in the node application, here we have an example of MIDIEventProc.

Well, it doesn't do much but here is where you receive each event that's coming from the host and typically, you would just save it up in a local structure and use it the next time you render a buffer because this function will get called at the beginning of each render cycle with new events that apply to that render cycle.

So, having created that callback function, we can populate a structure of callbacks.

You can notice I left the SysExProc null, that just means I'm not going to get called if there is any SysEx.

We use AudioUnitSetProperty to install those callbacks and now, on the node application side, I'm going to receive each MIDIEvent as it arrives.
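The "save it up in a local structure and use it the next time you render" pattern can be sketched as a small fixed-size list. Everything here is hypothetical (the type names, the capacity, the single global list), and it assumes the callback and the render code run in the same render context, as described above, so no locking is shown.

```c
#include <stddef.h>
#include <stdint.h>

#define PENDING_MIDI_CAPACITY 64  /* arbitrary size chosen for this sketch */

typedef struct {
    uint8_t  status, data1, data2;
    uint32_t offsetFrames;  /* where in the upcoming buffer the event lands */
} PendingMidiEvent;

typedef struct {
    PendingMidiEvent events[PENDING_MIDI_CAPACITY];
    size_t count;
} PendingMidiList;

/* One list for the single instrument in this sketch; zero-initialized. */
static PendingMidiList gPendingMidi;

/* Body of a MIDI event callback: remember the event. Returns 0 when full. */
static int pending_midi_push(PendingMidiList *list, uint8_t status,
                             uint8_t data1, uint8_t data2,
                             uint32_t offsetFrames)
{
    if (list->count >= PENDING_MIDI_CAPACITY) {
        return 0;  /* drop the event rather than allocate while rendering */
    }
    PendingMidiEvent e = { status, data1, data2, offsetFrames };
    list->events[list->count++] = e;
    return 1;
}

/* Called while rendering a buffer: hand each saved event to `apply`
   (if one is given), then empty the list. Returns the number consumed. */
static size_t pending_midi_drain(PendingMidiList *list,
                                 void (*apply)(const PendingMidiEvent *))
{
    size_t n = list->count;
    if (apply != NULL) {
        for (size_t i = 0; i < n; i++) {
            apply(&list->events[i]);
        }
    }
    list->count = 0;
    return n;
}
```

A fixed capacity avoids memory allocation in the render path; a real instrument would pick a size that matches how many events it expects per buffer.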

So, that's how hosts can send MIDI to nodes.

Let's look now at how hosts can communicate their transport and timeline information to nodes.

So, the important thing about this model is that the host is always the master here.

The nodes can just find out where the host is and synchronize to that.

We'll look at how the host can communicate its musical position as well as the state of its transport.

And all of this is highly precise: it's called in, and pertains to, the render context.

So here too, we have a structure full of callback functions, we'll look at each of these.

So, this is probably the most common one that a host will implement, this is called the BeatAndTempo callback.

Here, the host can say, for the beginning of the current audio buffer, where it is in the track, and that could be in between beats.

The host can also communicate what the current tempo is.

And so with these two pieces of information, even with only these two pieces of information, the node can do beat-synchronized effects from the host, for instance.
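As an illustration of what a node can derive from just the beat position and the tempo, the sketch below computes how many frames into the current buffer the next beat falls. The function name is hypothetical, the sample rate is assumed to be known to the node, and the beat position is assumed to be non-negative.

```c
/* Frames from the start of the current buffer to the next beat boundary.
   currentBeat is the host's position at the start of the buffer (it may fall
   between beats), tempoBPM is beats per minute, sampleRate is frames per
   second. Returns 0 when the buffer starts exactly on a beat.
   Assumes currentBeat >= 0 and tempoBPM > 0. */
static double frames_until_next_beat(double currentBeat, double tempoBPM,
                                     double sampleRate)
{
    double framesPerBeat = sampleRate * 60.0 / tempoBPM;
    double wholeBeats    = (double)(long long)currentBeat;  /* truncate */
    double fraction      = currentBeat - wholeBeats;

    if (fraction == 0.0) {
        return 0.0;
    }
    return (1.0 - fraction) * framesPerBeat;
}
```

At 120 beats per minute and 44,100 frames per second a beat lasts 22,050 frames, so a buffer that starts at beat 4.25 reaches beat 5 after 16,537.5 frames. A tempo-synced delay or tremolo could use that figure to line its cycle up with the host.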

There's also some more detailed musical location information supplied by the host such as the current time signature.

And finally, the host can communicate some bits of transport state, most notably whether it's playing or recording.

There's also a facility for the host to express whether it's cycling or looping.

So here too, we're installing a set of callback functions on an audio unit.

The host populates the host callback info structure, installs the callback functions that it implements and calls AudioUnitSetProperty.

So, once the host does this, the system will call those callbacks at the beginning of each render cycle and communicate that information over to the node process, where the node application will have access to it.

And the way the node application gets that access is by fetching the host callback property.

It will receive that structure full of function pointers.

They won't actually point to functions in the host process, of course.

We can't make a cross process call there, but the information, as I just said, has been communicated over to the node.

And it can access them there within its own process.

There are some considerations of thread safety here.

Most importantly, since this information is accurate as of the beginning of the render cycle, if you call it in some other context, you might get inconsistent results.

It's easiest if you fetch this information on the render thread but of course, there are some cases where you want to observe a transport state for instance.

So, we give you a better way to receive notifications of transport state changes on a non-render thread context.

You can install this property listener for the HostTransportState and get a callback on a non-render thread.

Okay, so that's the process of transport and timeline information.

Finally, I'd like to look at the whole mechanism by which node applications can send remote control events to host applications.

To accomplish that, we have something called AudioUnitRemoteControlEvents.

Now, there's something called RemoteControlEvents in UIKit as well.

Those are kind of in a different world.

These are more specific to the needs of audio applications.

So with these events, the node can control the host application's transport.

And for now, we have these three events defined.

You, being a node application, can toggle the host's play or pause state and its recording state, and the node, through an event, can send the host back to the beginning of the song or track.

We do have some sample applications where our node applications have some standard looking transport controls.

And we'd like to encourage you to check those out and use them in your application so that we can have a consistent look and feel for these controls.

So, looking at how node applications can send RemoteControlEvents, first, we want to find out whether the host actually is listening and is going to support them because if it doesn't, maybe we don't even want to bother drawing the transport controls at all.

So to do that, we can fetch this property HostReceivesRemoteControlEvents.

And to actually send the RemoteControlEvent, the node calls AudioUnitSetProperty using the RemoteControlToHost property.

And the value of that property is the actual control to be sent, toggle record in this example.

So, there's a node sending a RemoteControlEvent.

Here is a host receiving one or rather preparing to receive them, I should say.

So, to do that, the host creates a block called the listenerBlock and in that block, the host simply takes the incoming AudioUnitRemoteControlEvent and passes it to one of its own methods called handleRemoteControlEvent.

Now, that block is in turn a property value for the RemoteControlEventListener property so the host only has to set that property on the node AudioUnit and that accomplishes the installation of the listener for RemoteControlEvents.
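What a host's handler does with the three events might be sketched like this. The enum and struct below are hypothetical stand-ins in portable C, not the Core Audio types, and what each event does to the transport (for example, whether toggling record also starts playback) is a design decision for the host that this sketch does not try to settle.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the three remote control events. */
typedef enum {
    REMOTE_TOGGLE_PLAY_PAUSE,
    REMOTE_TOGGLE_RECORD,
    REMOTE_REWIND
} RemoteEvent;

/* Hypothetical host transport state. */
typedef struct {
    bool   playing;
    bool   recording;
    double positionBeats;
} HostTransport;

/* Returns the transport state after applying one remote control event. */
static HostTransport host_handle_remote_event(HostTransport t, RemoteEvent event)
{
    switch (event) {
    case REMOTE_TOGGLE_PLAY_PAUSE:
        t.playing = !t.playing;
        break;
    case REMOTE_TOGGLE_RECORD:
        t.recording = !t.recording;
        break;
    case REMOTE_REWIND:
        t.positionBeats = 0.0;  /* back to the start of the song or track */
        break;
    }
    return t;
}
```

Because the function takes and returns the state by value, the host can apply the result on whichever thread owns its transport.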

Next, I'd like to bring up my colleague Harry Tormey to show you some of these other aspects of the inter-app audio API in action.

Thanks Doug.

Hey everybody, my name is Harry Tormey and I work with Doug in the Core Audio Group at Apple.

And today, I'm going to be giving you a demonstration of some of the sample applications we're going to be releasing on the developer portal to illustrate how inter-app audio works.

The first demo I'm going to be giving you is of a host application connecting to a sampler node application and sending it some MIDI events.

So what you see on the screen up there is a host application, and I'm going to bring up a list of all the remote instrument node applications installed on this device by touching the add instrument button.

So, none of these applications are currently running.

They have just published themselves with their audio component descriptions.

When I select one of these applications from the list, it will launch into the background and connect to the host application.

So I'm going to do that, I'm going to select the sampler.

OK. So, you can see the sampler's icon up there underneath the instrument label.

That means it's connected to the host application.

So, I'm going to bring up a keyboard in the host by touching the show keyboard button and I'm going to send some MIDI events from the host to the sampler by playing the keys.

[Music] Totally awesome.

OK. So, what if I want to change the sample bank that the sampler is using?

Well, I'm going to have to go to the sampler and do that.

I'm going to do that by touching the sampler's icon.

We're now in a separate application and I'm going to select a different sample bank to use so how about something nice like a harpsichord?

Let me just do that there.

OK. So now, we're in harpsichord, I'm going to touch the host icon there and go back to the host application.

Touch the show keyboard again and listen for it.

[ Music ]

That's a harpsichord.

OK. So, the next thing that I'm going to show you is how to use the callbacks that the host application has published to get the time code of the host application when it records and plays back things.

So once again, I'm going to go to the sampler by touching its icon, and I'm going to touch the record button and record some audio in the host, so I'm sending a remote message to the host.

[Music] I'm going to stop recording by touching the record button again.

Now, what I want you to pay attention to is the blue text over the play button.

This text is going to be updated with the callbacks that the host application has published, and we're going to use this to display a time code indicating how far into the recording we are.

So, I'm going to touch the play button and watch that text.

[Music] So, if I do that again and I go to the host application, you'll see the time code is consistent across both applications so let me do that.

Let me press play again and go back to the host application.

[Music] Okay, so for my grand finale, I'm going to add an effect and that effect is going to be the delay effect that you saw.

So once again, I touch the add effect button.

It shows you all of the effects that are installed on this device.

I'm going to select the delay one, it's going to launch it and connect to the host.

OK. So in the host, if I touch the show keyboard button again and play a note, it's going to be delayed.

[Music] How about that?

Much cooler than remote controlled cars.

Okay everyone, that's me, these demos are all up on the developer portal and I'm done with my demo so back over to you Doug and thank you very much.

[Applause]

Thank you Harry.

Hey, I found the right button.

So, back to some more mundane matters here.

Dealing with audio session interruptions, both host and node applications need to deal with audio session interruptions.

Here, the usual rules apply namely that your AURemoteIO gets stopped underneath you.

But furthermore, in a host application, the system will uninitialize any node AudioUnits that you have open.

This will reclaim the resources I've been talking about that you acquire when you initialize the node AudioUnit.

One other bit of housekeeping here, you can make your application more robust if you handle a media services reset correctly.

It's a little bit hard to test this sometimes. Oops, let me find my way back.

But if you implement this, your application will survive calamities.

So, when this happens, you will find out that all of your inter-app audio connections have been broken and the component instances have been invalidated.

So, in a host application, you should dispose your node AudioUnit and your AURemoteIO.

And in a node application, you should also dispose your AURemoteIO.

So in general, it's simplest to dispose your entire audio engine including those Apple Audio objects.

And then, start over from scratch as if your app has just been launched and that's the simplest way to robustly handle the media services being reset.

Some questions that have come up in showing this feature to people, in talking with them, can you have multiple host applications?

Yes, if they are all mixable.

If one is unmixable, of course, it will interrupt everything else as it takes control.

Also, if you were to have multiple hosts that are mixable and one node application, only one host can connect to that node at a time.

Can you have multiple node applications?

Yes, Harry just showed us that that's more than possible.

A couple of debugging tips here. You may find, when creating a node application, that you're having trouble getting it to show up in host applications.

If you see that happening, you should watch the system log.

We try to leave some clues for you there in the form of error messages.

If there's a problem with your Info.plist entry, which is unfortunately a little bit easy to do, we'll tell you that, and I would recommend comparing your Info.plist with the one in one of our example applications.

I should also mention here the infamous error 12985, which many people stub their toes on in a lot of different contexts.

I can tell you that what it means is operation denied.

And in the context of inter-app audio, you're likely to hit it if you start playing from the background.

We do hope to in an upcoming release give that a proper name and maybe another value but in any case, if you do see it, that's what it means.

So we've looked at how node applications register themselves with the system, hosts discover them.

Hosts create connections to node applications.

Once that connection is up, host and node apps can stream audio to and from each other.

Host apps can send MIDI to node applications.

Hosts can communicate their transport and timeline information.

And finally, we have seen how nodes can remotely control hosts so I think if you have an existing music or audio application, it's not that much work to convert it to a node.

It's mostly adding a little bit of code to deal with the transitions to and from the connected state, and you can look at how that works in the example apps we have posted.

Creating a host application is a bit more work, but you're using existing API for audio units, and there's a lot of history there as well as a lot of power and flexibility.

We also encourage you to look at our sample applications.

They'll help you with a lot of the little ins and outs and we're really looking forward to the great music apps you're going to make.

On to some housekeeping matters here, if you wish to talk to an Apple Evangelist, there's John Geleynse.

Here are some links to some documentation and our developer forums.

This is the only Core Audio Session this year but here are some other media sessions later this week that you might be interested in.

Thank you very much.

[ Silence ]
