Introducing CloudKit

Session 208 WWDC 2014

CloudKit is the framework that powers iCloud on iOS and OS X, now available directly in your app. Learn how you can take advantage of its feature-rich API to store and query your own custom data and assets in iCloud.

My name is Olivier Bonnet.

I'm the engineering manager for CloudKit on the client side.

So, up until today Apple had many iCloud technologies and they looked like this to you.

We had iCloud Drive, iCloud Core Data, iCloud Photo Library, and how they interacted with the Apple iCloud servers was kind of a mystery.

So today we're lifting some of the mystery by introducing CloudKit.

So let's look at what we're going to cover here in this session.

First we're going to start with what is CloudKit.

We're going to walk you through the couple of steps you need to enable CloudKit in your application and start using it.

We're going to do a walkthrough of the different APIs, we're going to talk about how CloudKit interacts with iCloud user accounts, and last but not least we'll cover when to use CloudKit as opposed to the other iCloud APIs we already have on the platforms.

So let's start.

What is CloudKit?

You'll notice the "new" badge, but the whole session is new, so it's shown only on that slide.

[Laughter] CloudKit is a way to give you access to iCloud servers and we really mean it.

CloudKit is the foundation for both iCloud Drive and iCloud Photo Library.

Those features were written from scratch on top of CloudKit.

CloudKit is supported on both OS X and iOS.

It's a new public framework, CloudKit.framework.

Check it out in the SDK.

CloudKit uses the iCloud accounts infrastructure.

That means that if there's a logged in iCloud account on the device we'll use that to identify the user.

If there's none, we'll provide read only anonymous access.

CloudKit supports both a concept of public and private databases.

You can see public databases as a soup of data that all your users have access to.

Private databases are meant to store the actual data just on behalf of a specific user.

CloudKit supports both structured and bulk data.

You can use it to store large files on iCloud servers and we'll take care of transmitting them efficiently from the iCloud servers.

Importantly CloudKit is a transport technology.

It doesn't provide any local persistence.

It enables you to send and receive data from the servers.

So let's start.

How do you enable CloudKit in your application?

So first you'll want to navigate to your application's Capabilities pane in Xcode.

See that big iCloud off switch?

You want to turn it on and you want to check the CloudKit checkbox here.

At that point you're ready.

There's no fourth step.

Your app is ready to read and write data to Apple's iCloud server and to walk you through the API on how to do this I'm going to hand over to Paul to chat about it.

[Applause]

All right.

Thank you, Olivier.

[ Applause ]

So, as Olivier said, my name is Paul.

I work on the CloudKit client framework, and I'm very excited to talk to you guys today about CloudKit.

We're going to start off by talking about the fundamental CloudKit objects.

These are going to be the set of objects that you're initially exposed to when you open up and start playing with the CloudKit framework.

Just running through a quick list of them: we're going to talk about containers, we're going to talk about databases, we're going to talk about records, we're going to talk about record zones and record identifiers, we're going to talk about references and we're going to talk about assets.

It's quite a list.

This is going to be a really fun and really jam-packed session.

So let's get started and we're going to talk about containers.

This is sort of the idealized model of your application talking to iCloud.

Now you guys are application developers, you guys are client developers, and you know that when your application is running on a client, whether that client is an iPhone or a MacBook, yours is not the only process running on that client.

Rather yours is one of many.

Now on your client your process is going to be siloed, sandboxed in some ways.

In some cases, it's a literal sandbox.

Certainly in others you're running in your own memory space.

This concept of taking your client and actually running it separated from other clients is pretty powerful.

It has got a couple of advantages.

It helps with security, it helps with stability and it helps with privacy.

So as we are figuring out how we wanted to build CloudKit, we thought to ourselves how can we take these three advantages and replicate them up in the server?

So here's what we did.

Just as your client is one of many, I'm sorry, your application is one of many running on the client, so too the part of iCloud that you're talking to is one of many up on iCloud.

We call these different silos containers.

So containers.

Containers are exposed in the CloudKit framework as the CKContainer class.

CK is our prefix and you're going to see this all over the place.

By default one application talks to one container.

Containers afford us the ability to segregate data.

That means that your application can read and write data to iCloud, another application can read and write data to iCloud and the two datasets up on the server will not be intermingled.

In addition to data segregation, this containerization of iCloud storage allows us to encapsulate user information.

Now as Olivier mentioned, CloudKit involves using the iCloud account infrastructure and we want to give you some limited access to that iCloud account and we want to make sure that we're doing so in a privacy-conscious manner.

So in order to do that, we encapsulate user information.

User information available to your application is going to be container scoped and, therefore, different than the view of user information seen by another application.

Containers are managed by you the developer.

You're going to be managing them via the WWDR portal.

It's important to note that the namespace of containers is global to all developers, so when you're choosing a name for your container make sure that you're using a reverse DNS name.

Now, right up there just like a little while ago I said that by default there is one container to one application, a one to one mapping.

We think this is going to be successful for 99% of use cases and certainly as you go and start using CloudKit it's going to be great as you start exploring the framework.

We recognize that there are some scenarios where you need a more complex mapping so we support a many to many model.

What we mean is that multiple applications can coordinate on the same iCloud container.

Also a single application can talk to multiple iCloud containers.

On to databases.

One of the chief purposes of CloudKit is the ability to take your object model in your application and replicate that up to the server.

So when we start thinking about how we want to present this modeling to you we thought to ourselves how can we divide up application objects?

Obviously all objects in my application are not treated equally.

One of the first things that we noticed was there's a fundamental difference in the audience of data.

Some data is intended to be used by the user that created it.

If you imagine an application where I'm writing up notes and I want to see my notes everywhere else, that's my data.

I create it, I consume it.

On the other hand, there's use for what we call public data.

This is data that can be created by the user for the benefit of a community; think perhaps a review of a restaurant.

Or it could be information that you the developer have uploaded to iCloud because it's useful to your application.

In either case, the audience is not just a single user but a community of people.

So how did we solve the fact that we've got these different types of data?

Let's break open a container and have a look inside.

Inside of a container you're going to notice first and foremost the public database.

This is the soup, this is where all of the public and communal data co-lives, co-mingles.

Additionally, you're going to notice that there are private databases, and you're going to find that there's an individual private database for each user of your application.

Now this is sort of the 50,000 foot overview of what the iCloud infrastructure looks like, but you know, how much do you guys care about that?

What you guys are interested in is what does this infrastructure look like to me the client running on a phone or running on a Mac.

Obviously you're only going to have access to the currently logged in iCloud user.

So rather than seeing a public database and a gajillion private databases, your view is going to look a little bit more like this.

You're going to have a choice between a public database or the private database that is correlated to the currently logged in iCloud user.

So databases.

Databases are exposed in our API as the CKDatabase class.

Every application has access to two of them, the public and the private.

Let's have a look at a little bit of code.

The container is the initial entry point into CloudKit.

Here we see that I am talking to my default container and getting its public database.

I can also talk to the default container and get its private database.
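
The slide code is not part of this transcript. As a rough sketch in Swift, assuming an app that has the CloudKit capability enabled, getting the two databases might look like the following. The container identifier in the second function is hypothetical and only illustrates how a named (shared) container would be addressed.

```swift
import CloudKit

// Returns the public and private databases of the app's default container.
// CKContainer.default() needs the iCloud/CloudKit entitlement at runtime,
// so these functions are meant to be called from inside a configured app.
func defaultDatabases() -> (publicDB: CKDatabase, privateDB: CKDatabase) {
    let container = CKContainer.default()
    return (container.publicCloudDatabase, container.privateCloudDatabase)
}

// Addresses a specific container by name, for example one shared between
// several apps. The identifier is a made-up reverse DNS name.
func sharedPublicDatabase() -> CKDatabase {
    let container = CKContainer(identifier: "iCloud.com.example.PartyPlanner")
    return container.publicCloudDatabase
}
```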

Let's have a look at the differences between these 2 databases.

As we mentioned, the desired audience is going to be different between the public database and the private database.

The public db is for shared data, the private db is for the user's own data.

As such, we have different requirements for whether or not an iCloud account needs to exist on the client.

In the private database, since I am reading and writing a user's data, if we have no notion of the user, that is, if we don't have a logged-in iCloud account, there's really no utility in moving this information to and from the server.

So we require an iCloud account to be logged in if you want to use the private database.

In the public database since the audience is more communal, we allow you read only anonymous access to the public database.

You'll recall from yesterday's talk that CloudKit is free with limits.

So we've got some pretty aggressive quotas.

That being said we still need to account for where data is being used so that we can talk to you when we start approaching those limits.

Data stored in the public database is accounted for on the developer's quota.

Data stored in the private database is accounted for on the user's quota.

By default data written into the public database is world readable.

Data written to the private database is user readable.

Again it's the user's own data so they're really the only ones to have access to it in the private database.

We recognize that world readable is not an appropriate permission for a lot of data in the public database so we give you the ability to edit these permissions on a record class level.

The mechanism by which you edit these permissions is something that we call the iCloud Dashboard roles.

iCloud Dashboard is the administrative interface into CloudKit.

I invite you guys to come back on Thursday for the advanced talk.

We're going to go into a lot of detail about the dashboard.

Suffice it to say there's an ability to set ACLs so that a user is a member of a role and a role can have certain enlarged access to a class of records.

In the private database, things start out in lockdown there's no need to edit them.

The user is the only one to create data and the user is the only one to be able to read that data.

When I say that the user is the only one to be able to read that data, I really mean it.

You as the developer do not have access into somebody else's private database.

Their data is their own.

All right so those are databases.

Let's talk about records.

So, here's our model that we have so far.

We've got a container, within containers are databases.

Let's keep going down the rabbit hole and crack open a database.

Inside the database we see that it's full of records.

Records are exposed in our framework as the CKRecord class.

They are the mechanism by which you move structured data to and from CloudKit.

CKRecords wrap key value pairs.

Lest you think that a CKRecord is just a glorified dictionary, there are some additional attributes that make it worthy of being its own class.

To start off with records have a record type.

If CloudKit is a mechanism by which you take your object graph from your application and move it into CloudKit, then let's continue with that analogy.

An instance of an object in your application is equivalent to an instance of a CKRecord.

Similarly the class of the object in your application is equivalent to the record type of the CKRecord.

Records have a just in time schema.

You do not need to tell CloudKit about what your data looks like before you hand CloudKit your data.

Hand us your data and we'll figure it out.

CloudKit, excuse me, CKRecord also supports a raft of metadata.

For example, a record understands when it was created and who created it.

It understands when a record was last modified and who last modified it.

Lastly a record contains a notion of a change tag.

A change tag is a version of a record.

It represents a specific revision of this record.

It's used so that we can have a lightweight way of determining whether or not the client and the server have the same version of a record.

Let's talk a little bit about record values, CKRecords wrap key value pairs.

What are the acceptable value types that you can put into a CKRecord?

Well, we've got your usual suspects, your plist types: your strings, your numbers, your datas and your dates.

We think that especially in the public database domain location is an interesting scenario.

So, CLLocation is a native type that you can set on a CKRecord.

You can set CKReferences and CKAssets and we're going to go over what those are in just a moment.

Lastly any value can be a single instance.

I can have a string or a date, or it can be a homogeneous array.

I can have an array of numbers or an array of CKAssets.

Let's have a look at a little bit of code.

Here we see a CKRecord.

The CKRecord initializer takes a record type because that's an invaluable and necessary piece of a record.

You can set objects and get objects from a record using a dictionary syntax or a keyed subscripting syntax.

We also give you the ability to enumerate all keys on a record so that you can dump the entire key value pair.

Let's have a look at a specific example.

Now, throughout this talk my example is going to be an application that I've created for me and my friends.

This application allows us to create parties and we're going to stick party records in the public database.

A party is structured data: it's going to have a summary, a start date and an end date, and it might also have additional metadata that we associate with it, pictures of the party, et cetera.

So, how do I create a party record?

Well I create one just as you might imagine with a party record type.

I can now set values on it and I can read values from it using the dictionary or keyed subscripting syntax.
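
The slide itself is not in the transcript, so here is a minimal Swift sketch of what creating and reading a party record could look like. The field names (`summary`, `startDate`, `endDate`) are assumptions based on the description above, not names confirmed by the session.

```swift
import CloudKit
import Foundation

// A record is created with its record type.
let party = CKRecord(recordType: "Party")

// Keyed subscripting syntax for setting values.
party["summary"] = "Post-session get-together"
party["startDate"] = Date()

// Dictionary-style syntax does the same thing.
let end = Date().addingTimeInterval(2 * 60 * 60)
party.setObject(end as NSDate, forKey: "endDate")

// Reading values back requires a cast to the expected type.
let summary = party["summary"] as? String
let endDate = party.object(forKey: "endDate") as? Date

// Enumerate every key to dump the whole set of key-value pairs.
for key in party.allKeys() {
    print(key, String(describing: party[key]))
}
```

Nothing here touches the network; the record only exists in memory until it is saved to a database.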

Those are records.

Let's talk about record zones.

This is the model that we've just presented and it was kind of a lie but it was a useful lie at the time.

So, records don't exist by themselves just as objects in your application don't exist by themselves.

There's going to be a natural grouping of objects within your application.

Similarly we want a way to express this grouping in CloudKit.

Fundamentally we're trying via CloudKit to take as much of your knowledge about your object graph and reflect that up to the server.

So the way that we group records is via something that we call a record zone.

There can be multiple records within a record zone and there can be multiple record zones within a database.

Every database has a default record zone.

Some databases support additional custom record zones.

Record zones are the default granularity at which you're going to do atomic commits and change tracking.

If either of those sound interesting to you, I invite you back on Thursday for the advanced talk.

We're going to go over a whole bunch of that.

So, those were record zones.

Let's talk about record identifiers.

Record identifiers, let's get the code up there, are a tuple.

They represent both a client provided record name and also the zone in which that record name exists.

So what are their characteristics?

Number one they are created by the client.

You get to specify the ID, the record name of the record, but because we are coupling this record name, which is scoped per record zone, along with a reference to the owning record zone, they become a fully normalized representation of the record.

It's the full path to it.

We think that it's going to be fairly common for you to try and bridge an external dataset into CloudKit.

If you're doing so and if your external dataset has a unique key, using that unique key as the CKRecordID allows you to have a foreign key back into your external dataset.

Totally an approved usage.

Let's have a look at some code.

Here we are, you know, creating a record, you've seen that already, and we've got multiple initializers for CKRecord.

You can either choose to provide us a RecordID or you can choose not to.

If you choose not to provide a RecordID, we're going to assign a random UUID to the record.

Also note here that when I created my RecordID I chose not to give it a zone.

Throughout CloudKit's API if you choose not to give us a record zone, we're going to assume that you meant the default zone.

So here, I am creating a RecordID with the name well-known party that exists within the default zone.
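
A short Swift sketch of the two initializer paths described here. The record name `wellKnownParty` is an illustrative spelling of the name mentioned in the talk.

```swift
import CloudKit

// Client-chosen record name. No zone is given, so the ID lands in the default zone.
let wellKnownID = CKRecord.ID(recordName: "wellKnownParty")
let wellKnownParty = CKRecord(recordType: "Party", recordID: wellKnownID)

// No record ID supplied: the framework generates a record name for us.
let anonymousParty = CKRecord(recordType: "Party")

// A record ID is the pair (record name, zone).
print(wellKnownParty.recordID.recordName)
print(wellKnownParty.recordID.zoneID.zoneName)
print(anonymousParty.recordID.recordName)
```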

So those are RecordIDs.

Now let's talk about references.

Just as there's a natural grouping of records that we want to expose via a record zone, there's also going to be a natural relationship between objects.

For example, let's say that in addition to being able to write up parties I can assign different clowns to parties because what is the point of a party if you don't have a few clowns?

So, I want some way of representing the object relationship that I have between parties and clowns up to the server and the way we do that is via something that we call references.

Now you'll note here that in this contrived example parties own clowns.

That is we've got a parent/child relationship with the party as a parent and the clown as a child and that the reference goes from the child object, from the clown, up to its parent object.

We call that a back reference.

References are exposed in our API as the CKReference class.

They are a way of letting the server understand the relationship between records.

When the server understands the relationship between records it can do very interesting things such as cascade delete.

If the server notices that you've deleted a record and that record is the parent in a parent and child relationship, the server will automatically go ahead and cascade delete all the children of that item.

With any database of the scope and scale of CloudKit, dangling pointers are going to become a necessary and inconvenient truth of your use of CloudKit.

By the time you fetch a record and you read a reference, then you go and fetch the target of that reference, the target may not exist.

So it's important that your code is resilient to this.

Again, as I mentioned, we prefer back references.

It's not a requirement, but it's more efficient if references go from child objects to parent objects.

This is the tip of the iceberg in a very large topic called data modeling and if you come back on Thursday we're going to tell you a whole bunch about data modeling in CloudKit.

Let's have a look at some code.

Here we see creating a reference between two CKRecord instances that I have in memory, but it's not necessary that I have the target of a reference in memory.

I can make a reference that points at a RecordID.

This allows me to refer to a record that I've got, you know, reason to believe exists up on the server.

If I were to save this record and the target didn't exist, I'd actually be creating a dangling pointer and that's okay because the code you're going to write is going to support and be resilient in the face of dangling pointers.
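
As a hedged illustration in Swift: current SDKs spell the class `CKRecord.Reference` (it appears as `CKReference` in older SDKs). The `party` field name and the `Clown` record type are assumptions drawn from the example in the talk, and the `.deleteSelf` action is the option that asks the server to delete the child when its parent is deleted.

```swift
import CloudKit

let party = CKRecord(recordType: "Party")
let clown = CKRecord(recordType: "Clown")

// Back reference: the child (clown) points up at its parent (party).
// .deleteSelf requests cascade delete of the clown when the party goes away.
clown["party"] = CKRecord.Reference(record: party, action: .deleteSelf)

// The target does not have to be in memory; a record ID is enough.
// If no such record exists on the server, this becomes a dangling pointer.
let remoteID = CKRecord.ID(recordName: "wellKnownParty")
let juggler = CKRecord(recordType: "Clown")
juggler["party"] = CKRecord.Reference(recordID: remoteID, action: .none)
```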

So those are references now let's talk about assets.

Here we have our model again, just to refresh: container, database, record. Now let's take the idea that I want to write a party record up to the server, but let's say that I want to associate a large file with that record.

Let's say, for example, that we're going to have a get-together after the presentation and I'm going to read you a screenplay that I've been working on.

Now I understand that there are different characteristics between these different kinds of data.

I understand that the record about the party is structured, you know, it's got a summary, it's got a start date, it's got an end date and, you know, just innately I believe that I want the server to understand those bits about the record.

If you contrast that to the screenplay, well, the screenplay is just essentially a bag of bits.

I don't feel any real need to tell the server how my screenplay breaks down into amazing acts and, you know, dashing scenes, but I do know that I want to treat it as just an opaque bag.

So how do we solve this?

Well, we're going to solve this and we're going to solve the fact that data has different characteristics like this in two different ways.

Up on the server we're going to introduce a notion that we call bulk storage.

As you might expect, bulk storage is great for storing bulk data.

Similarly on the client you're going to tell CloudKit about the different characteristics of your data by treating some of it as a CKRecord and other bits of it as a CKAsset.

A CKAsset is the representation of this bag of bits.

Now when you ask CloudKit to save this record, the appropriate bits are going to go in the appropriate database.

Structured data in the public database, bulk data in bulk storage.

So assets.

Assets are exposed in our framework as the CKAsset class.

They represent large, unstructured data.

Because you don't necessarily want large unstructured data in memory, the way you communicate assets to and from CloudKit is via files on disk.

Assets are owned by records.

This gives us a nice tight coupling between a record and an asset.

What this allows the server to do is garbage collect assets.

Even though we're storing this data in two separate areas, when the server detects a delete for a record it can go ahead and clean up any assets that were owned by that record.

Lastly because we expect CKAsset to be large, opaque data, we go through some great pains to try and move that data to and from the server as efficiently as possible.

This is all sort of inside of CloudKit, but we're going to send only the minimal amount of bits that we can.

Let's have a look at some code.

Here I'm creating a CKAsset based on a file URL to my screenplay on disk and, just like any other class that can be a CKRecord value, I'm setting it on a CKRecord.
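
A small Swift sketch of the same idea. The file name, its contents and the `screenplay` field name are placeholders made up for the example; the point is only that an asset is handed to CloudKit as a file URL, not as in-memory data.

```swift
import CloudKit
import Foundation

// Write a stand-in screenplay to a temporary file, since assets are
// communicated to CloudKit as files on disk.
let fileURL = FileManager.default.temporaryDirectory
    .appendingPathComponent("screenplay.txt")
do {
    try "ACT ONE\n".write(to: fileURL, atomically: true, encoding: .utf8)
} catch {
    print("Could not write placeholder file: \(error)")
}

// The asset is set on the record like any other value.
let party = CKRecord(recordType: "Party")
party["screenplay"] = CKAsset(fileURL: fileURL)
```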

So these are the fundamental objects in CloudKit.

The first thing you talk to is a container.

Within a container are two different databases.

Databases contain records, records are wrapped and grouped within record zones.

You identify a record via a record identifier, records are related to one another via references, and large bulk data is transmitted to and from CloudKit via CKAssets.

All right you guys made it through the nouns now let's get into the verbs.

[Laughter] We offer two different APIs for using CloudKit.

We call them the operational API and the convenience API.

The operational API has every single bell and whistle you might care about.

In some cases, you're going to want to tweak every single bell and whistle to fit your application's model, but not every application really wants to do all of this tweaking and not every application has enough knowledge to set these bits correctly.

Sometimes you want to let the framework make some of these decisions on your behalf.

So, we offer the convenience API.

It's convenient, it's going to be what you want to start off playing with when you start looking at CloudKit.

For many uses of CloudKit it's all that you're going to need to touch.

So quickly we're going to go over how you save a record in the convenience API, how you fetch a record from the server via the convenience API and how you can take a fetched record, modify it and save it back up to the server.

Let's start off with saving a record.

Here I am creating a record.

You guys are now very well familiar with this.

When I want to save the record, I have to choose which database I want to save it into.

Here I'm going to save the data into the public database and how do I do that?

Well, I call the saveRecord:completionHandler: method.

Now I want you guys to note three separate things about this code right here.

First of all it's very simple.

You guys aren't providing a lot of options, you're delegating a lot of the bells and whistles like how important is this, what interface should I send this data over, you're delegating those choices off to CloudKit.

Second, it's asynchronous.

As Olivier mentioned, CloudKit does not have local persistence.

We are a transport technology.

We're going to transport your data up to the server and we'll store it on the server and we'll transport it back down to other clients.

So when you save a record via CloudKit, we are going to attempt to save that record directly to the server.

If it fails, we're going to tell you about that immediately or, you know, as quickly as we can.

Now we don't want to block threads and we don't want you to block the user.

So we don't want to make this a synchronous call.

So here we've got an asynchronous call.

Now the third thing I want you to note is that even though this is a very simple method we do provide an error as part of the callback.

Now if you've been to WWDC in the past or if you've watched any presentations, you've seen something that looks like this.

We've got an Apple developer up here and the Apple developer says you need to handle errors that return from our framework.

Now, I'm not calling them liars it's true you do need to handle errors returned from frameworks.

In many applications, it's the difference between a good and functioning application and a great application.

CloudKit is a little bit different.

CloudKit by its very nature is going to be talking over the network.

Networks are inherently lossy.

Phones like to fall off the network all the time.

So, in CloudKit, the difference between handling an error versus not handling an error is really the difference between a functional and a non-functional app.

Error handling has got to be one of the first things that you look at when you start using CloudKit.

I'm going to be a little bit glib throughout the slides here but every time you see a comment imagine that you're seeing just some really nice error handling.
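
To make the "really nice error handling" a little more concrete, here is one possible Swift sketch of a save through the convenience API, which Swift spells `save(_:completionHandler:)`. The retry policy (which error codes count as transient, three attempts, a five second fallback delay) is an assumption chosen for illustration and is not something the session prescribes.

```swift
import CloudKit
import Foundation

// Rough classification of errors that are usually worth retrying later.
// The exact set of codes is a judgment call for this sketch.
func isRetryable(_ error: Error) -> Bool {
    guard let ckError = error as? CKError else { return false }
    switch ckError.code {
    case .networkUnavailable, .networkFailure, .serviceUnavailable,
         .requestRateLimited, .zoneBusy:
        return true
    default:
        return false
    }
}

// Saves a record and retries a bounded number of times on transient errors.
func saveParty(_ party: CKRecord, to database: CKDatabase, attemptsLeft: Int = 3) {
    database.save(party) { savedRecord, error in
        guard let error = error else {
            print("Saved \(savedRecord?.recordID.recordName ?? "record")")
            return
        }
        if isRetryable(error), attemptsLeft > 1 {
            // Prefer the server's suggested delay when it provides one.
            let delay = (error as? CKError)?.retryAfterSeconds ?? 5
            DispatchQueue.global().asyncAfter(deadline: .now() + delay) {
                saveParty(party, to: database, attemptsLeft: attemptsLeft - 1)
            }
        } else {
            // Not transient, or out of attempts: surface it to the user.
            print("Save failed: \(error)")
        }
    }
}
```

Inside an app, a call might look like `saveParty(party, to: CKContainer.default().publicCloudDatabase)`.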

[Laughter] So let's talk about now that we've saved a record up to the server, how do we fetch a record back down from the server?

I'm going to start by deciding which database I want to fetch a record from.

I'm then going to construct a RecordID, the identifier of the record I care to fetch down.

Here I've gotten this name either via some side channel or something that's built into my application.

I then can ask the database to fetchRecordWithID:completionHandler:.

Again, asynchronous, simple, amazing error handling.

Once I fetched a record let's get that code back up here, I want you to note that the successful return value is an actual CKRecord instance and this is a live honest to God CKRecord.

Let's say I'm having so much fun at this party I pulled down that I want it to last a little bit longer.

I can take the end date off of it, bump it out by half an hour, and set that value back on the record.

Once I've done that I can take my CKRecord instance and just like one that I've created locally I can turn around and save it back up to the database again with amazing error handling.
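
The fetch, modify and save flow could be sketched in Swift roughly as follows. The `endDate` field name and the helper function names are assumptions for the example; the `.unknownItem` check shows one way to stay resilient when the record being asked for no longer exists.

```swift
import CloudKit
import Foundation

// Pushes the party's end date out by the given interval, if it has one.
func extendEndDate(of party: CKRecord, by interval: TimeInterval) {
    guard let endDate = party["endDate"] as? Date else { return }
    party["endDate"] = endDate.addingTimeInterval(interval)
}

// Fetches a party by name, extends it by half an hour, and saves it back.
func extendParty(named recordName: String, in database: CKDatabase) {
    let partyID = CKRecord.ID(recordName: recordName)
    database.fetch(withRecordID: partyID) { record, error in
        if let ckError = error as? CKError, ckError.code == .unknownItem {
            print("No such party on the server")
            return
        }
        guard let party = record, error == nil else {
            print("Fetch failed: \(String(describing: error))")
            return
        }
        // The fetched object is a live CKRecord: edit it and save it again.
        extendEndDate(of: party, by: 30 * 60)
        database.save(party) { _, saveError in
            if let saveError = saveError {
                print("Save failed: \(saveError)")
            }
        }
    }
}
```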

All right so that's the convenience API, the initial typical flow that you're going to go through when talking with CloudKit.

You're going to be saving records, fetching records and taking those records that you fetched modifying them and putting them back up to the server.

So now let's say my party application starts becoming really popular and it's grown.

My user base is no longer me and my friends but it's all of you.

Everyone is really excited about joining into the parties.

What are some of the problems I'm going to run into?

Well, let's assume that when I started out I was a relatively naive developer and because the developer on stage said CloudKit is all about taking your object graph and moving it to and from the cloud, that's exactly what I did.

I had the one to one mapping.

My objects went up to the Cloud and on every client I would fetch the entire cloud state and that would become my object graph.

What are some problems we're going to run into?

Well, at that point we've got big data and a very tiny phone.

The more popular my app becomes the more data on the database the less reasonable it is to have a cache of that entire data locally on my device.

So how are we going to solve this?

Let's think about what we want to do.

We want to keep the large data up in the Cloud.

The Cloud is very good at storing large datasets.

My client wants to view a slice of that data.

Because I'm writing an application for my users and my users have their own preferences, I want each client to be able to view a different slice of that data and each individual client might want to change its view of that data.

The way we solve this is via something that we call queries.

Clients use queries to focus their viewpoint so that they can see a small section of a large dataset that exists up on the cloud.

So what is a query?

As you might imagine, it's exposed in our API as the CKQuery class.

A query combines three different things.

It combines a record type, a predicate and optionally a sort descriptor.

If you've used NSPredicate in the past, you know that NSPredicate is very expressive.

CloudKit supports most of NSPredicate.

We document the parts that we do support, and if you hand us a predicate that we don't understand, we're going to throw an exception.

So you're going to learn pretty quickly which ones are and are not supported.

Let's have a look at some that are supported.

Here we see a predicate that would match records where name is equal to a value I had in memory.

Predicates allow you to use dynamic keys so that I don't have to know the key name at compile time.

We can do relative ordering comparisons as opposed to strict equalities.

We mentioned that location is an interesting aspect in the public database.

So you can query with location as a filter.

This is every location within 100 meters of where we're standing here in Moscone.

CloudKit supports a tokenization search.

So, what this predicate is going to do is it's going to tokenize that string, "after session", and it's going to come up with two different tokens, "after" and "session".

This predicate will match any record that has those two tokens as values.

These two tokens don't need to exist side by side, they don't even need to exist in the same key value pair, but so long as the record has the token after and the record has the token session the record will be a match.

Lastly, CloudKit supports compound predicates joined using the and operator.

Here we see a predicate that does that.
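The slides for these predicates are not reproduced in this transcript. What follows is a minimal Swift sketch of what each kind of predicate might look like, not the code shown on stage. The field names (name, attendeeCount, location), the sample values and the approximate Moscone coordinates are all assumptions made for illustration.

```swift
import Foundation
import CoreLocation

// Hypothetical fields on a "Party" record: name, attendeeCount, location.
let partyName = "WWDC Bash"

// Equality against a value held in memory.
let namePredicate = NSPredicate(format: "name = %@", partyName)

// Dynamic key: %K substitutes a key chosen at run time.
let dynamicKey = "name"
let dynamicKeyPredicate = NSPredicate(format: "%K = %@", dynamicKey, partyName)

// Relative ordering rather than strict equality.
let bigPartyPredicate = NSPredicate(format: "attendeeCount > %@", NSNumber(value: 50))

// Location filter: records whose location field is within 100 meters of a point.
// The coordinates are an approximation chosen for this sketch.
let moscone = CLLocation(latitude: 37.784, longitude: -122.401)
let nearbyPredicate = NSPredicate(
    format: "distanceToLocation:fromLocation:(location, %@) < %f", moscone, 100.0)

// Tokenized search: the string is split into the tokens "after" and "session".
let tokenPredicate = NSPredicate(format: "self contains %@", "after session")

// Compound predicate joined with AND.
let compoundPredicate = NSCompoundPredicate(
    andPredicateWithSubpredicates: [namePredicate, bigPartyPredicate])
```

Building these objects does not talk to the server; CloudKit only validates a predicate once it is attached to a query or subscription.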

Here we see the creation of a query and as we mentioned it's combining both a record type and a predicate.

How do I perform queries after I've created them?

Well, just like saving records performing a query is going to be a database specific operation.

So I'm going to choose the database on which I want to perform a query.

Even in the simple API and the convenience API we give you the ability to restrict these queries by record zones.

You see here that we're not choosing to pass in a record zone filter.

So this query is going to search across the entire public database.

Let's have a look at what happens in the completion handler.

First, of course, amazing error handling.

Second, if we don't have errors, let's have a look at the results.

You'll see here that the results are actually CKRecord instances.

These are live objects.

If I wanted to, I could pull data off of them, I could set data on them and I could even choose to save them back to the server.
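As a rough illustration of the steps just described (create the query, pick a database, pass no record zone, handle the error, then use the records), here is a Swift sketch. The "Party" record type, the name field and the helper names are assumptions. It uses the convenience call as it is spelled in later Swift SDKs, perform(_:inZoneWith:completionHandler:), which may not match the slide.

```swift
import CloudKit

// Builds a query from a record type and a predicate.
// "Party" and "name" are hypothetical schema names used only in this sketch.
func makePartyQuery(named name: String) -> CKQuery {
    let predicate = NSPredicate(format: "name = %@", name)
    return CKQuery(recordType: "Party", predicate: predicate)
}

// Runs the query against the public database of the given container.
func fetchParties(named name: String,
                  in container: CKContainer,
                  completion: @escaping ([CKRecord]) -> Void) {
    let query = makePartyQuery(named: name)
    let publicDatabase = container.publicCloudDatabase

    // Passing nil for the zone searches the whole database.
    publicDatabase.perform(query, inZoneWith: nil) { records, error in
        if let error = error {
            // Real code should inspect the error and decide whether to retry.
            print("Query failed: \(error)")
            completion([])
            return
        }
        // The results are live CKRecord instances that can be read,
        // modified and saved back to the server.
        completion(records ?? [])
    }
}
```

A caller would typically pass CKContainer.default() and update its UI from the completion closure, hopping to the main queue first.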

So a way to think about queries is that queries are polls and polls are great in some scenarios.

They're great for slicing through large datasets.

If your application wants to start up and, for example, show all the parties, the top 10 parties that are near me, a query is absolutely the way to go, but there's other things that you might be tempted to use a query for that are not perfect.

If you find yourself issuing the same query over and over again and getting back mostly the same result set, then you've got a large, mostly static dataset, and queries are bad for that use case for a few reasons.

They're bad for battery life.

You have to constantly wake the device up, poll, run the same query and get back more or less the same results.

They're bad for network traffic.

All those questions go to the server and if they're not pulling down new and interesting data, why do we bother?

They are also bad for the user experience.

By definition, you're only going to learn about new results on the period of how often you're polling.

Users nowadays have come to expect push.

So as opposed to using a client-generated query in this scenario, what you really want is for the server to be running the query on your behalf.

You want the server to be running the query on your behalf in the background and you want that to happen after every single record save whether it was you or somebody else that saved the record.

Lastly, of course, you want pushes when the results have changed.

Well, we've given you this and we call that subscriptions.

Yay.

[ Applause ]

So subscriptions are exposed in our API as the CKSubscription class.

They combine a record type, a predicate and push.

Push is delivered via the Apple Push Service.

If you've used APS in the past, you're largely familiar with this, but note CloudKit pushes are slightly augmented.

They contain CloudKit specific information about what caused the push to happen.

Let's have a look at an example.

Here we have a phone and that phone is interested in parties that are going to be happening in the future.

This phone when it sees that happening wants to be alerted with a push that says party time.

The phone is going to go ahead and save that up to iCloud and iCloud is going to, you know, shuffle it away with all the other subscriptions.

Now along comes the Mac.

The Mac creates a new record.

It's a record of type party and it's happening tonight, and because the Mac didn't choose to give us a RecordID we created that random UUID you see.

The Mac goes ahead and saves that to iCloud.

iCloud is then going to loop through all of the subscriptions that it knows about.

Eventually it's going to come across this one.

It's going to check and say, yeah, okay, this is a new party and, yeah, it's happening in the future.

So at that point it's going to create a push and it's going to take some information from the subscription.

Here it took the alert string party time.

It's going to take other information from the record itself.

Here we're pulling in the RecordID.

Now that I've constructed this augmented payload, I can send that augmented payload down to all clients that are registered and interested in it.

So let's look at some code.

How do I create a subscription?

Well, a subscription combines a record type and a predicate.

Because a subscription is also in charge of telling the server how you want to be alerted, we introduce a notion called CKNotificationInfo.

Here we're requesting that the server badges our icon, that it plays a particular sound pulled out of my resources, and that it shows an alert string based on a string in my localized strings file.

I can associate that notification info with a subscription and now I've created everything I need in my subscription.

Let's go ahead and save it to the server.

As you might imagine via the convenience API, it's simple, it's asynchronous and it's got great error handling.
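Here is a sketch of those steps in Swift, under stated assumptions: the "Party" record type, the date field, the subscription ID, the sound file name and the "PARTY_TIME" localization key are hypothetical. It uses the class names found in later SDKs (CKQuerySubscription and CKSubscription.NotificationInfo) rather than the CKSubscription and CKNotificationInfo spellings named in the session.

```swift
import CloudKit

// Builds a subscription for parties happening after "now".
// All schema names and resource names here are illustrative assumptions.
func makeUpcomingPartySubscription(now: Date = Date()) -> CKQuerySubscription {
    let predicate = NSPredicate(format: "date > %@", now as NSDate)
    let subscription = CKQuerySubscription(
        recordType: "Party",
        predicate: predicate,
        subscriptionID: "upcoming-parties",
        options: [.firesOnRecordCreation])

    // Tell the server how the user should be alerted.
    let info = CKSubscription.NotificationInfo()
    info.shouldBadge = true                      // badge the app icon
    info.soundName = "Party.aiff"                // sound from the app's resources
    info.alertLocalizationKey = "PARTY_TIME"     // key in Localizable.strings
    subscription.notificationInfo = info
    return subscription
}

// Saves the subscription through the convenience API.
func saveSubscription(_ subscription: CKSubscription,
                      to database: CKDatabase,
                      completion: @escaping (Bool) -> Void) {
    database.save(subscription) { saved, error in
        if let error = error {
            print("Could not save subscription: \(error)")
        }
        completion(saved != nil && error == nil)
    }
}
```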

Let's look at how you're going to be handling subscriptions, how you're going to handle pushes as they come in.

If you've used APS in the past, you're probably familiar with this code snippet.

This is your application delegate implementing the application:didReceiveRemoteNotification: method.

Now in most scenarios if you know the format of the push payload that's coming in, you would just then iterate through that dictionary pulling out the key value pairs that you care about.

However, because CloudKit was the one that generated this push, we ask that you let CloudKit do that parsing.

So the way you would do that is via CKNotification and that really long method I'm not going to name.

Once we've actually parsed out a CKNotification you can pull off APS level information from it and you can also pull off CloudKit level information from it.

Here we're taking the RecordID of the saved record that caused the push to happen.
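A minimal Swift sketch of that parsing step follows. The helper name is an assumption, and it is written as a free function so that it can be called from whichever application delegate callback receives the push payload. In current SDKs the CKNotification initializer is failable, which is why the sketch unwraps it.

```swift
import CloudKit

// Lets CloudKit parse a push payload and returns the ID of the record
// that caused the push, if this was a query (subscription) notification.
func recordID(fromPush userInfo: [AnyHashable: Any]) -> CKRecord.ID? {
    guard let notification = CKNotification(fromRemoteNotificationDictionary: userInfo) else {
        return nil   // not a CloudKit push
    }

    // APS-level information is available on the parsed notification.
    if let alert = notification.alertBody {
        print("Alert: \(alert)")
    }

    // CloudKit-level information lives on the concrete subclass.
    guard notification.notificationType == .query,
          let queryNotification = notification as? CKQueryNotification else {
        return nil
    }
    return queryNotification.recordID
}
```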

So between queries and subscriptions we have an answer to the big data, tiny phone problem.

You're going to leave your large data up in the cloud and you're going to use these two capabilities to give your users a quick view into that large dataset.

Now, I want to talk about CloudKit user accounts.

As Olivier mentioned, CloudKit is built on top of the iCloud account infrastructure.

So what does that mean?

I want to focus a little bit on how accounts are exposed to you explicitly throughout the API.

When you think about an account system, the first thing you think about is authentication and CloudKit supports authentication via the logged in iCloud user, but that's not, you know, that's sort of behind the scenes and you guys don't care about that, that's implicit.

Let's talk about the explicit things.

What do we give you because we're built on top of iCloud accounts?

We give you identity; a way of identifying the user.

We give you metadata; the ability to save and retrieve information about users.

We do all of this in a privacy-conscious manner and we don't want to disclose anything if the user hasn't agreed to it, and lastly we give users the ability to discover their friends that are using your application.

Let's dive into each one of these.

First of all we're going to talk about identity.

So, here's our model.

We've got our client, our application running on the client, and all of these different users and their private databases up in the container.

Your specific client is going to be linked to one and only one of those users.

This is related to the user that's logged in via iCloud locally on your device.

Because this is iCloud we've got a rich backing store of user information and because iCloud is the one that is hosting your container we can correlate users.

For example, here we see that the user, whose email address is c at iCloud dot com, is linked to your current client.

So given this setup how are we going to present an identity?

How are we going to let you know, your client, your application, know what user is logged in?

Well, naively you might think let's give them an email address.

We're not going to do that obviously.

That's private user identifiable information and we don't want to give that out.

So, instead what we do is on a container by container basis we come up with a random ID.

This is an identifier that is stable, so that your application, no matter what client it's running on, talking to this container will get the same identifier, but it's not identifying the user via any personal information.

So we feel confident giving you this identifier.

You can take this identifier and do with it what you will.

Note that different applications running on your phone because they're talking to different containers are going to get back different container scoped RecordIdentifiers for the same user.

This goes back to what we talked about in the beginning that we've got user encapsulation.

So user identity.

We expose user identity via API as a user RecordID.

It is a stable identifier for this user.

It will be the same for your application no matter where your application is running.

It's scoped to the container so 2 different applications are going to come up with different identifiers for the same user.

This is a feature.

Lastly this is an independent API.

This is a section of the CloudKit framework.

You can use this in collaboration with the database API or you can use this completely separately.

We've given you enough support that if you wanted to you could implement a login via iCloud flow in your application using the CloudKit framework.

Let's have a look at the code.

[ Applause ]

All right let's have a look at the code.

Because identity is a container-scoped notion and not a database-scoped notion, we go to our container to learn about our user.

Here we're asking our container to fetch the user RecordID.

Now, we may have to talk to the server to figure this out; for example, the first time you try to learn about a RecordID we have to go to the server to do that translation and come up with a container-scoped info.

So this is asynchronous and we have to do error handling.
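In Swift, the call described here might look like the following sketch. The wrapper function and its completion signature are assumptions added for illustration; fetchUserRecordID(completionHandler:) is the container method being described.

```swift
import CloudKit

// Asks the container for the container-scoped record ID of the
// currently logged-in iCloud user.
func fetchCurrentUserRecordID(from container: CKContainer,
                              completion: @escaping (CKRecord.ID?) -> Void) {
    container.fetchUserRecordID { userRecordID, error in
        if let error = error {
            // For example: no iCloud account on the device, or no network.
            print("Could not fetch user record ID: \(error)")
            completion(nil)
            return
        }
        // Stable for this user within this container, and free of
        // personal information.
        completion(userRecordID)
    }
}
```

An application implementing a "log in via iCloud" flow could use the record name of the returned ID as its own user key.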

That's user identity.

Let's talk about user metadata and quickly to recap the problem what we have here is we have a stable identifier and we've got a desire to set key value pairs based on that identifier.

I don't know about you but to me that sounds very much like what a record was built to do.

So we expose users as user records.

Looking inside a container within a database, we see there is one user record in the private database; that user record represents your user.

There are many user records inside the public database representing each one of the users of your application.

One of them will have a RecordID that matches your currently logged in iCloud user's RecordID.

So, user metadata.

Exposed via our framework as a user record.

There's one per database that represents your current user.

User records in the public database like any other default record are world readable.

They're treated mostly like an ordinary record with a record type that we exposed in the framework, CKRecordTypeUserRecord, but there are a couple of caveats.

First off these records are reserved by the system.

You do not create a user record.

Rather you fetch an existing one from the server.

What that means is that you can be assured, when you fetch a record for your current user, that it has not been spoofed.

It was, indeed, iCloud that created that record in the first place.

Secondly, we think that it doesn't make sense for you to be able to query the entire set of user records in the public database.

It doesn't really make sense to be able to say I want to look at all users whose first name begins with A.

It's a little bit too coarse-grained for something that we want to protect privacy around.

So we don't allow you to query user records.

Don't worry we're going to fix that in a couple of slides.

All right so what does it look like?

Here we have the same code that we saw earlier and we're fetching a user RecordID.

Once I have that CKRecordID I can go ahead and I can fetch that record from either database that I choose.

Here I'm choosing to fetch a record with that identifier from the public database.

Assuming that I don't get an error, I now have a live CKRecord that represents this user and I can treat it like I would any other CKRecord.

I can pull records, excuse me, I can pull key value pairs off of it, I can set key value pairs on it, and if I wanted to, I could go ahead and save it back to the server.
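The flow just described (fetch the user RecordID, fetch the matching record from the public database, read and set key value pairs, save) can be sketched in Swift as below. The "nickname" key and the function name are hypothetical; they are not part of the session.

```swift
import CloudKit

// Fetches the current user's record from the public database, sets a
// (hypothetical) "nickname" field on it and saves it back.
func setNickname(_ nickname: String,
                 in container: CKContainer,
                 completion: @escaping (CKRecord?) -> Void) {
    container.fetchUserRecordID { userRecordID, error in
        guard let userRecordID = userRecordID else {
            print("No user record ID: \(String(describing: error))")
            completion(nil)
            return
        }

        let publicDatabase = container.publicCloudDatabase
        publicDatabase.fetch(withRecordID: userRecordID) { userRecord, error in
            guard let userRecord = userRecord else {
                print("Could not fetch user record: \(String(describing: error))")
                completion(nil)
                return
            }

            // The user record behaves like any other CKRecord.
            let previous = userRecord.object(forKey: "nickname") as? String
            print("Previous nickname: \(previous ?? "none")")
            userRecord.setObject(nickname as NSString, forKey: "nickname")

            publicDatabase.save(userRecord) { savedRecord, error in
                if let error = error {
                    print("Could not save user record: \(error)")
                }
                completion(savedRecord)
            }
        }
    }
}
```

Because user records in the public database are world readable, anything stored this way should be information the user is happy to share.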

Now let's talk about privacy.

We care very much about our users' privacy.

Therefore, we disclose no information, no personally identifying user information about the current user by default.

Now we recognize that in some cases your application is going to want to have limited access to metadata about the user.

So, if you want that data, you can request that from CloudKit.

When you do, we're going to go to the user to make sure that they're okay with that.

Here we see an example of the party application requesting the ability for my user account to be discoverable within the application.

The user can either choose to allow or deny that.

Assuming that the user has acquiesced to this privacy request, we can go on to the next phase, which is discovery.
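A sketch of that permission request in Swift, assuming a wrapper function of our own naming. It relies on requestApplicationPermission with the userDiscoverability permission, which is the CloudKit call that triggers the prompt; recent SDKs appear to mark it as deprecated, so treat this as illustrative of the flow in the session.

```swift
import CloudKit

// Asks the user whether their account may be discoverable to other
// users of this application. Calls back with true only when granted.
func requestDiscoverability(in container: CKContainer,
                            completion: @escaping (Bool) -> Void) {
    container.requestApplicationPermission(.userDiscoverability) { status, error in
        if let error = error {
            print("Permission request failed: \(error)")
        }
        // Other statuses (denied, initial state, could not complete)
        // are all treated as "not discoverable" in this sketch.
        completion(status == .granted)
    }
}
```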

So let's talk a little bit about user discovery.

Here's our image.

We've got our client talking to a container and the container is backed by iCloud and iCloud has all this rich user information.

If we look a little bit more on the client side, we see there's actually two different processes involved here.

There's your client process and then there's the CloudKit process and it's the CloudKit process that's the one that's actually talking over the wire.

So let's examine what user discovery would look like if you want to discover information about a user given a RecordID.

You take that RecordID and you can send it off to CloudKit.

CloudKit is going to in turn send it up to the container.

Once it hits the container, we're going to ask the iCloud account info to exchange it for information about that user.

If that user has opted in to discoverability, we're going to get information back.

That information can traverse back to CloudKit and back via the process boundary over to your client, but we're not restricted to just RecordIDs.

If your user enters in an email address, we can do the same sort of dance.

This email address is sent from your client over to CloudKit and CloudKit is then going to hash it out a whole bunch of times so that we're not sending personal info off the client, and we'll send that up to the container.

That container exchanges it with iCloud and if the target of this discovery has opted in to discoverability, we're going to get a result sent back.

Those results can go back to CloudKit and can traverse back up to your client.

Now, I'd like to think that we are pretty good about naming things in CloudKit, but this one we sort of didn't really do well.

So we offer a different way of doing user discovery that we call the whole address book and it's a way for you to discover the whole address book.

The way this works is that your client is going to say I would like to discover the user RecordIDs and more information about every user that is friends with my currently logged in iCloud user.

You send that request over to the CloudKit process.

The CloudKit process is then going to pull in the user's address book.

We're going to take all the email addresses in that address book and we're going to hash them up and we're going to send, you know, a non-personally identifying version of that address book up to the container.

The container is going to send it off to iCloud and for those members of my address book that have opted into discoverability I'm going to learn information about them.

That information is going to come back to CloudKit and it's going to be sent over the process boundary to your client.

Now if you'll note at no point did your client in this little flow have access to the user's address book.

What this means is that we can give you this support without requiring that your user allows your application access to the address book, meaning that you don't have to have that blue alert, which is now the white alert, giving your application access to the address book.

You can leverage it without access to it.

[ Applause ]

So user discovery.

These are the three different kinds of inputs that we can have for user discovery.

You can start off with a user RecordID, an email address or request to view the entire address book.

What do you get back from user discovery?

Well, you'll get back a user RecordID.

In the latter two cases, that's new information.

You also get back the first and last name of this user.

Of course it bears repeating first and last name is personally identifying information so you're only going to get discovery results for users that have opted into discoverability.

Let's have a look at some code here.

Here we are asking our default container to discover all of my users that are part of my address book.

The response, again, is asynchronous with error handling. In the successful case we get back a CKDiscoveredUserInfo object, and from that user info I can pull a user RecordID and a first and last name.
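A hedged Swift sketch of address book discovery follows. It does not use the CKDiscoveredUserInfo class named in the session; it assumes the later SDK spelling, where discoverAllIdentities(completionHandler:) returns CKUserIdentity values, and that call itself appears to be deprecated in recent SDKs. The DiscoveredFriend type and the function name are hypothetical.

```swift
import CloudKit

// Illustrative value type for one discovered user.
struct DiscoveredFriend {
    let userRecordID: CKRecord.ID
    let firstName: String?
    let lastName: String?
}

// Discovers users from the address book who have opted in to
// discoverability. The app never sees the address book itself.
func discoverFriends(in container: CKContainer,
                     completion: @escaping ([DiscoveredFriend]) -> Void) {
    container.discoverAllIdentities { identities, error in
        if let error = error {
            print("Discovery failed: \(error)")
            completion([])
            return
        }
        let friends = (identities ?? []).compactMap { identity -> DiscoveredFriend? in
            // Only identities with a user record ID are useful here.
            guard let recordID = identity.userRecordID else { return nil }
            return DiscoveredFriend(
                userRecordID: recordID,
                firstName: identity.nameComponents?.givenName,
                lastName: identity.nameComponents?.familyName)
        }
        completion(friends)
    }
}
```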

So these are really the four tent poles of how we do user accounts.

We give you a stable identifier, we give you the ability to store and retrieve metadata about users, we protect users' privacy and we give your users the ability to discover their friends in your application.

Now, to tell you when it's appropriate to use CloudKit versus some of the other iCloud technologies that we already have exposed, I'd like to invite back up Olivier Bonnet.

[ Applause ]

Thank you, Paul.

So CloudKit is the new framework but it doesn't obsolete or deprecate any of the existing tools.

It's really just a new tool in your toolbox.

Let's look at all four tools you now have and at the different use cases where we think they make sense and are appropriate to use.

So, first, iCloud Key Value Store.

iCloud Key Value Store keeps small piles of data up to date between your app and the iCloud servers.

This is done asynchronously.

Your app doesn't really need to care about when and how this is done.

We think this is great for small amounts of data like application preferences or game state.

Conflict resolution is pretty simple, last writer wins.

So, that's iCloud Key Value Store.

iCloud Drive builds on top of the existing iCloud document APIs.

Doing so it provides full offline cache on OS X; all the files on the iCloud drive of the user are downloaded on the OS X.

It's completely unstructured and intimately tied to the file system.

You use the file coordination APIs to read and write data in your application's iCloud container on the file system, and the iCloud Drive daemon takes care of uploading and downloading those changes to and from the iCloud servers.

We think it's great for document-centric apps or apps that need to deal with existing file formats.

iCloud Core Data, built on top of iCloud Drive, replicates app-specific user data between all the user's devices.

It's great for keeping private structured data in sync but because it downloads all the data to all devices you're also constrained to the size of the smallest device in that case.

Enter CloudKit, the new kid on the block, and we think there are a number of interesting use cases where CloudKit makes sense and complements the existing technologies pretty well.

So first any public data.

If you need to give all the users of your app access to large datasets, CloudKit public databases are a pretty compelling tool.

CloudKit supports both structured and bulk data.

So you can use it to store large files on iCloud and we take care of downloading and uploading them to the iCloud servers.

As Paul described, CloudKit has good support for large datasets where your app will want to give a specific slice, a specific view of a large dataset, to the user at a given point in time.

CloudKit lets you use the existing accounts infrastructure: whether you need to identify the user or to let the user discover his friends using your app, CloudKit enables you to do that.

Last but not least, compared to the other three technologies, we think CloudKit is closer to the metal in some way.

In this case, when you're using CloudKit, your app is really in control of when the data is uploaded or downloaded from the server.

Your app is controlling the operations and that is also why you need to do this [inaudible] will do.

So, in summary, what have we covered today?

CloudKit gives you access to iCloud servers.

It supports both public and private data.

It supports both structured and bulk data.

You can use it for large files.

It leverages the existing iCloud account infrastructure which means that the over 400 million iCloud accounts out there are here for you to take advantage of.

Apple is building on it in a big way.

Both iCloud Drive and iCloud Photo Library were built from scratch on top of CloudKit.

We're super excited to see what you're going to build on top of this new framework.

So for more information Dave is our evangelist.

We have some awesome framework reference on developer.apple.com website as well as developer forums.

We have an advanced CloudKit session on Thursday and Jacob is going to tell you everything you want to know about that, Data Modeling and Advanced Record Manipulations.

Thank you very much for being here and thank you for your attention.

[ Applause ]
