Data You Can Trust 

Session 222 WWDC 2018

A lot can go wrong when loading data into your app. Whether you work directly with JSON and property lists, or with higher-level APIs such as NSCoding and Codable, learn how to defend your customers and secure your code against invalid or malicious data. Avoid fatal assumptions by validating payload structure, type information and domain correctness, to turn the data you work with into data you can trust.

[ Music ]

[ Applause ]

Good morning, everyone.

Thank you all for joining me, even before the coffee’s kicked in.

My name is Itai and I work on the foundation team.

In this session, I’d like to talk to you about how the data that flows through your app can affect it, and how you can better protect your customers by building trust in that data.

Let’s get started.

Apps don’t live in a vacuum.

In order for your apps to do something useful, they have to draw in data from external sources, like the disc, or the network, or your customers themselves.

And then do something meaningful with that data, and then present it to your customers.

In order for that data to be consumable, it has to come in some known format or structure.

What happens when it doesn’t?

Usually, this means the data is corrupted, or invalid in some way and can be ignored.

But sometimes, this data can invalidate assumptions that your app makes, and it can cause your app to misbehave, or maybe even crash.

This can be a bad experience for your customers who might have to wait for your app to get updated in the app store.

And it’s an even worse experience if it’s a crash on launch, because they can’t even use the app.

And, in the meantime, while you’re waiting, you’ll get a wave of one-star reviews.

It’s a bad experience for everyone.

This is something to be even more cognizant of, if you’re a framework author.

Because it’s not just one app that could possibly be affected, but maybe many apps.

Today, we’re going to be talking about trust.

And specifically, we’re going to be talking about how to build trust in data, by making sure of two things.

One, that the data that we’re going to be using, hasn’t been modified from underneath us, and two that it contains what we expect it to in the format and structure that we want.

So, we’ll do just that by taking a look at the lifecycle of our data, and what we can validate about that data at every stage in the lifecycle.

Then, we’ll see what sort of type-level validation we can apply with the NS Secure coding protocol.

And then apply those same concepts to codable types.

Let’s get started.

In order to talk about data, we’re going to want to build a mental model of the forms that data can take within our app.

At the most basic level, data makes its way into our app as a stream of bites.

There’s not much we can tell about this data at this stage without looking at it, but this we’ll call raw data.

Now, to get working with that data, we need to make sure it conforms to that known format or structure.

And in this case, each of those code points correspond into a UTF code point, and sorry, one moment, let me make that a little bit more readable.

It looks like this is JSON.

And so, once we’ve made sure that that data conforms to some format that we want to work with, we’ll call this formatted data.

Now, formatted data, on its own doesn’t mean much, until we create primitive values out of it, strings, and arrays, and dictionaries that we can then use as building blocks for further algorithms.

So, this we’ll call our primitive data.

Now, there’s building blocks we most often want to work with, not as just primitive values, but as our own model types.

So, once we do that, we’ll make use of this as we’re going to call, structure data.

Now, these forms of data in our apps, form an abstraction timeline.

Raw data is the least abstract data that we’re going to work with, and our own structured model types are the most abstract.

So, our goal for today is to take that data as far along the spectrum as we can.

Now, our apps can stop at any point, make use of the data, however, we see fit.

But we really want to work with our own model types wherever possible.

Now, the goal for today is to not just go as far along the abstraction spectrum as we can, but to build trust as we do so.

At every stage, the data is going to get more complicated, and there’s going to be more that we need to validate about it.

But once we do that, there’s also going to be more that we can trust about it.

Now, for our use case today, we’re not really going to be talking about formatted data.

Very often it’s just a stepping stone between raw and primitive data, and you don’t work with it directly.

For instance, given raw data, foundations JSON serialization will give you primitive data back.

You don’t see just the formatted date directly, and you won’t make use of it.

So, today we’ll just be talking about raw, primitive, and structured data.

Now, let’s start by talking about raw data.

Again, as we mentioned, raw data is just a stream of bytes that’s made its way into your app.

Until you inspect that data and you give it meaning, there’s not much you can do with it.

Now, we might care to know what we can take a look at about that data before we start interpreting it.

Is it even safe to do that?

One thing we can validate about this data before making use of it is its length.

Say your app expects to load a 1-kilobyte file from disc, but finds a 1-gigabyte file in disc.

Does it make sense to even load that data in the first place, and start reading it?

Almost certainly not.

Now, sometimes we might not be able to have length expectations about the data.

Maybe it’s external data we don’t own.

We don’t’ know how much data there could be.

But in some cases, we might also be able to verify a checksum, or a cryptographic signature that represents what the data might look like, even if we don’t know what’s inside.

Checksum is built by hashing all of the data.

And if any bit in the data changes, either due to a potential malicious third party, or just regular data corruption, bad blocks on disc, a bad network connection.

If any one of the bits flips in that data, it will invalidate the checksum, or the signature.

And we’ll know before even reading any of those bytes, that the data is incorrect, and you shouldn’t trust it.

Now, we also don’t always have a checksum.

Maybe it’s data we don’t own, where you can’t get that ahead of time.

So, at this stage, there isn’t much we can do with this data, besides read it and inspect it.

And so, once we do that, we can get primitive data out.

Now, as we’ve mentioned, we can take that raw data and pass it through, usually deserialize it, like foundations JSON serialization.

When we do that, we’ll get inert strings and dictionaries and arrays of numbers back out that we can use.

And if this process exceeds, we know two things about that data.

One, that the data was indeed in the correct format that we expected.

For instance, XML data won’t pass through JSON serialization.

And two, if we trust the deserializer, we know that the run-time objects we get back out are going to be valid.

Again, foundations JSON serialization will always give you strings and numbers and arrays that you can actually work with.

It’s individual values that we can trust.

But at this stage, we might wonder okay how can we make use of this data, or what can we trust about it, and what validation do we still need to do?

Well, we don’t actually know much about the contents of this data yet until we start looking at it.

And in fact, we might not know anything about the structure of the data until we start inspecting it.

If you’ve ever worked with dynamic deserialization in this way, you’ll know that there’s a lot of downcasting from anys.

There’s no upfront expectation what the data can be because it’s very generalized.

And so, we’ll want to check to see what the data contains and how we can work with it.

So, let’s motivate this with an example.

I’ve been working with an app lately called Sell My Old Junk, which allows me to sell my old junk to some friends and family.

And when one of them opens up my app, my app makes a request to my server.

Which requests a list of products that are currently available for sale to my friends and family.

When the server receives this request, it responses with JSON, that indicates here are the products available to sell.

Now, this is what an API response from my server might look like.

It’s an array of product listings, which have some interesting fields that you might care to look at.

For instance, each listing has a product ID.

A positive integer that uniquely identifies the product.

And in my case, these are sequential integer IDs.

Every listing also has a name and a description which are strings.

And there are a few other type fields here that we might care to look at.

For instance, there’s a field that’s a Boolean that indicates whether or not this listing has already been sold.

And there’s some internal structure here.

This list of tags, which are strings, which we might care to use.

There are also a few fields here, which come to us as strings, but really represent other forms of data that we might care to look at.

For instance, URLs and dates.

So, let’s make use of this data.

In my app, I can fetch that data from the network say with URL session, and wherever possible I’ll validate the length.

Maybe my server can produce a checksum that I can validate.

Or a cryptographic signature.

Once I’ve done that, I can take the data and pass it off JSON serialization.

If deserializing the data fails, JSON serialization will throw an error.

Will check and then catch and handle.

May display dialogue to my customers.

Again, in our purlins of the day, we’ve just taken raw data and carried along it the abstraction spectrum to primitive data.

And if anything went wrong, we can handle that failure.

So, now we need to make use of this data.

How can we consume it?

Well, JSON is an any variable that contains the actual values.

So, I can downcast it to the array of dictionaries that I expect it to be.

Now, this part of my app cares only about product listings that are related to music.

So, will filter out any products that don’t contain the music tag.

And here, I have that substructure.

That list of tags, which will downcast an array of strings and make use of it, right.

Whoops. Each of those forced downcasts, actually contains a hidden fatal error.

If either of those casts fails because the API changed or the data changed along the way before it made into my app, again due to data corruption or malicious changing, those downcasts will fail.

And when they do fail, they’ll abort, and they’ll crash my app.

And again, that’s a bad experience for my customers.

Let’s take a look at how this could happen.

So, here again is that sample API response.

And we’ll take a look at the list of tags here.

And say that second tag in there is modified.

Instead of a string, we have a number.

It’s maliciously changed by a third-party, or maybe again due to regular data corruption.

We can’t always tell.

Downcasting this list of tags will fail because they’re not strings and we never checked to make sure they work before we cast.

So, to avoid this, our main tenant for the day will always be validate first, execute later.

Instead of asserting that you know what the structure of the data is, check first.

Don’t blindly assume.

So, let’s see how we can do that.

So, here again is that first forced downcast.

And instead of forcibly downcasting these values, I can conditionally downcast.

This allows me to validate that the data actually contains what I want and if that cast fails, well I can handle that error instead of fatally erroring.

Now, similarly, later downcasting that list of strings, instead of forcibly downcasting, again I can conditionally downcast.

And in this case, instead of throwing an error, I can give a default value that allows execution to continue.

In this case, I’ll simply ignore any product listings that don’t have a valid list of tags.

I could throw an error, but in this case, I chose not to.

Now, type validation isn’t the only form of validation that you want to perform at this stage.

For instance, if that had been replaced by null, which is totally valid in JSON, I would’ve seen a similar crash.

In Swift strong static type system nullability is part of the type.

And indeed, you can’t downcast null to a string.

And so, again, this cast would have failed.

Now, even if all of these values are of the correct type and nullability, there’s other forms of validations that we should care about here.

For instance, I said that each product listing has a positive integer ID.

In my case, they’re all sequential integers.

Does it make sense for one of these IDs to be negative?

No, it doesn’t.

But even if it is always positive, does it make sense for it to be such a large positive integer value?

I’m not selling that many things.

So, no it doesn’t.

and in this case, this might be due to somebody trying to cause overflow in my app.

This is something you need to watch out for.

Now similar to range validation is length validation.

Again, every product listing has a description.

Does it make sense for that description to be empty?

Well, in my case, I know that anytime I upload a product listing, I’m always going to put a description in there.

So, in my case, no it doesn’t make sense for it to be empty.

But even if it’s not empty, does it make sense for it to be the full length and contents of “Romeo and Juliet?”

Also no, it doesn’t make sense.

Something here’s gone wrong and I need to look for that.

Now, there’s additional forms of validation that we really do care about here.

Even if all of these fields are right type, nullability, and fit within the range and length that we expect, their values and contents are also equally important.

Every product listing has a URL that I can send a customer to see more information about that product listing.

This comes to me as a string, but it actually has to contain a URL.

Does it make sense for it to be any arbitrary string?

No, and in my case, I want to make sure that it actually represents a URL.

But more importantly, just because it looks like a URL, doesn’t mean it will point to my domain.

And this is something I care very deeply about.

I really, really want to keep my customers safe.

I don’t want to possibly send them to a phishing domain, which could look like mine, but not really be my site.

So, this is something that I care about watching out for.

Now, lastly, even if each of these fields is valid on its own, sometimes the relationships between fields that matters.

For instance, each product listing has a date when it was created, and a date when it was last updated.

Each of these can be valid on their own, but does it make sense for the date that it was last updated to come before when it was created?

No. And in my case, this might not open a security vulnerability in my app.

But this is something that maybe you watch out for because it’s a good indicator that something’s gone wrong, and maybe I shouldn’t trust this data.

So, let’s take a look at how we can start doing that.

Here, I’ve started writing a function which will take one such listing and start validating all of the contents.

So, I’ll take a listing, and I’ll start pulling out the product ID.

And we’ve learned here not to forcibly downcast this ID to an Int, but conditionally downcast.

And if the cast fails the guard will fail and will throw an error.

Now I don’t want to stop there, I want to perform the range validation that ensures that product ID is also valid.

That it’s positive and not too large.

And again, if something goes wrong, I’ll throw an error.

Now, later on I might care to check out that URL.

And again, I’ll downcast it to a string, instead of forcibly downcast.

And here, I can check the link.

In this case, I know my server will never produce URLs that are too long.

So, if I find a really long URL, I’ll know that it’s invalid.

Once I’ve validated that, I can send it off to the URL type to perform that domain-specific validation to make sure it actually is a URL, and not just a garbage string.

And again, if anything goes wrong, I’ll throw an error.

But, again, I don’t want to stop there, once I have an actual URL, I want to make sure it points to my domain and not a phishing domain, so I’ll keep working with that.

Now, at this point, I can apply the same types of validation to the other fields here.

And again, if anything goes wrong, I’ll throw an error.

But once I have this function, I can apply it to all the product listings that I’ve loaded from the payload.

And again, I’ll stop executing if anything goes wrong.

Now, this is how we can work to validate primitive data.

But as we just saw, primitive data can be very general.

A string can be a string, but it can also be a date.

And it can also be a URL.

Sometimes we care to work with data with the semantics that matter to us.

We wanted to make sure that the host of that URL was our domain, and we can’t do that with just a regular string.

Similarly, a dictionary can represent a model like a listing here, or it can represent arbitrary customer data that we know nothing about.

Instead of performing the same validations everywhere to make sure all the fields that we care about are there, isn’t it nicer to work with our own model types, where that guarantee is always present?

It is. And in our case, we want to work with structured data wherever possible.

We can use primitive date as a building block to get there.

But we want to work with that form of data.

So, let’s take a look at how we can do that.

Elsewhere in my app, I have a purchase type, which does just this.

When a customer makes a purchase, I store that data to disc, so that later, when they open the app, even if they’re not connected to the network, they can view their purchase history.

Each purchase keeps track of the product listing it was associated with, and when the purchase was made, and a receipt.

I can save it to disc in this way using NS coding and NS key to archiver.

And I’ll archive it.

But as we saw, when we unarchive data, and we handle raw and primitive data, we want to validate it.

So, let’s do that by taking a look at how doing it with coder here could work.

If you’ve ever written in a note with coder, this might look familiar to you.

We’ll start by decoding the product listing.

And again, we’ve learned to conditionally downcast, instead of forcibly downcasting.

And if something goes wrong, well this is a fail-able initializer, we’ll simply return nil, right?

if decoding succeeds, I’ll assign this to my property, and I’ll keep going.

I’ll do the same thing with the purchase data, and again, conditionally downcast to a date.

If something goes wrong, I’ll fail, and so on.

And repeat this for each of the fields in my type.

When I want to save one of these purchases to the history, well I have a function, which does this.

It archives a purchase to binary data.

And then, I can save it out to disc, or shrill that data off into database or similar.

When I want to load that data back, well I can do the same thing.

I can get that raw data and then pass it along to KeyedUnarchiver, to get an object back out.

Now, as we’ve said, at every point here, the data is getting more complicated.

There might be more to validate about it before we can trust it, just as well.

So, you might wonder, okay, what’s the catch here?

What else is there left to validate?

And this downcast, here is a good hint.

Note that this downcast happens after we’ve unarchived an object.

How could this ever fail?

It’s a good indicator that something else might be going on here.

So, let’s take a look at that.

This is an abstract representation of what these model objects might look like in my archive.

Here we have all the fields that we cared about in coding.

And each of them contains their own structure, and substructure, and contents, and so on.

But, interestingly here this representation also contains the name of the class of this object.

Let’s take a look at how KeyedUnarchiver can make use of this information.

So, we have a decode call we’re currently making.

And this under the hood creates a KeyedUnarchiver, and decodes an object for the object key.

When you perform this in KeyedUnarchiver, KeyedUnarchiver finds that class name in the object in the archive and pulls it out.

And dynamically finds a class with that same name in your app.

It then allocates an instance of that class and then initializes it to allow it to decode its own contents.

Afterwards, it awakens the object to give it a chance to finalize its state.

Now, this works great for our objects.

But now, we might wonder what happens if the data is maliciously changed to contain an object of a class that we didn’t expect?

Well, this whole process that we just performed happened on a different type.

We just allocated, initialized and awoke an object of a class that we didn’t expect.

What kind of effect can this have in our app?

Well, as we saw before, the conditional downcast here, prevents us from accidentally using an object of this unexpected class.

We’re only going to work with objects of the types that we did expect.

The downcast fails, well we’ll fail.

But decoding one such object can have a lasting impact in our app.

Say that class has an alloc method, which changes global state.

Maybe it allocates a singleton or changes some global data.

Even though we throw the object away, if this fails, this can have a lasting impact in our app.

And can cause differing behavior somewhere else and an archive can be maliciously constructed in this way to cause this to happen in our apps.

So, how can we validate the data to prevent this from happening?

This is exactly where the NSSecureCoding protocol comes in.

NSSecureCoding is a protocol inheriting from NSCoding, whose goal is to prevent exactly this sort of attack.

By allowing you to pass in type information upfront, it prevents arbitrary code execution by validating the contents of the archive to make sure it only contains the types that you expect.

The hallmark introduction of NSSecureCoding were two alternative methods to decode object for key, which allow you to pass that type information upfront.

And using that type information, NSKeyedUnarchiver can keep you safe.

So, let’s take a look at the current decode object for key call that we have.

This top level decode.

Now, here, if instead we use the variant that allows us to pass in the class upfront, in this case, we want to decode a purchase, instead of performing this whole process and whatever is in the archive, you can first gate it on a class check.

Let’s examine this class check for a moment.

At every stage in decoding, if secure coding is on, NSKeyedUnarchiver maintains a list of classes, which are valid to decode.

When we make such a call, NSKeyedUnarchiver, takes the object that we used, this class, and constructs an allowed class list from it.

When we go ahead and decode an object from an archive, it’s class is first checked against the allowed class list.

And if it isn’t present, the decode call will be rejected.

Now, if the class of the object that we find in the archive is in the allowed class list, there’s a few checks that we need to perform.

Specifically, we’ll need to make sure that this class itself also conforms to NSSecureCoding.

If it doesn’t, we can’t be sure that it itself will be able to further securely decode its own contents.

And so, we can’t safely decode one of these objects.

In our case, the purchase class will.

And so, it’s safe to decode and keep track of it.

Now, there’s two other checks that are related to superclass-subclass relationships.

If you have two classes, one of which is a subclass of another, both conforming to NSCoding, and the superclass adopts NSSecureCoding conformance, the subclass will inherit that conformance.

Now, the subclass may never have had a chance to rewrite its init with coder to do the secure thing.

And so, we have an escape hatch here.

The support secure coding method, allows you to say, actually I don’t really support secure coding, and you can turn this off to say, I’m not ready yet.

Alternatively, if you still say yes, we have to make sure that you either inherited the full conformance to NSSecureCoding from the superclass, or that you overrode both of the methods to indicate yes I really do support secure coding.

There isn’t a mismatch here.

In our case, purchase meets both of these requirements and so we can safely decode it from the archive.

Now, we go ahead and decode a purchase, it itself decodes a listing.

And it can make use of the same type of call to indicate that it wants a listing.

When it does that NSKeyedUnarchiver uses that class to build a new allowed class list.

And everything is checked against this new version of the list.

We go ahead and decode an object, the same checks are performed and in this case a listing is still valid to decode.

But again, if we try to decode an object of an unexpected class that’s not in the list, it will be rejected.

Let’s take a look at what that rejection might look like.

And this is called decoding failure and there are a few other types of failures that we might care to look at.

So, when secure coding is on, we might be able to see secure coding violations in cases like this.

But there’s other forms of failure here, too.

For example, a type mismatch can happen if you try to decode an object and instead there’s a primitive value, like an integer in the archive at that location.

Or, you try to decode a primitive, like an integer, and instead we find an object or a primitive of an incompatible type like double.

These can cause decoding failures.

There’s another form of failure that can happen here.

And that’s archive corruption.

If the archive itself is too corrupted and doesn’t follow the expected format for NSKeyedUnarchiver, well we won’t be able to decode anything, and so you’ll get that same sort of failure.

Now, what happens on failure is decided by the decoding failure policy on the Unarchiver.

There’s two options here.

On failure, we can either raise an exception or store information about what happened and continue execution.

Raising exceptions is currently the default.

So, if we have a call, again this is our call from earlier, trying to decode a listing.

And we find an object of an unexpected class in the archive, under the hood this calls the failWithError method, on the Unarchiver and passes in an error that indicates what went wrong and where.

Now, under the hood, failWithError has a decision to make.

If the decoding failure policy is to raise exceptions, it will raise exceptions.

If you’re writing a Swift app, this is something you have to be mindful of.

Swift can’t catch Objective-C or C++ exceptions, and so this can lead to a fatal error in your app.

Now again, this can lead to a crash and a bad experience our customers.

If the decoding failure policy to set error and return, the error will be assigned to the Unarchiver’s error property.

And execution has to continue.

And in this case, because execution does continue, the decode call has to return something, and so it will return nil to indicate that nothing could be decoded.

And as mentioned, if you’re decoding a primitive type, here and we find an object or a primitive type that’s incompatible, the same series of steps has to happened.

And in this case, instead of returning nil, we’ll simply return 0.

Now failWithError is API on NSKeyedUnarchiver, and we urge you to use in your own code to indicate when failures happen and what went wrong.

Instead of simply returning nil, failWithError first to record that information.

If you do, there are a few things to keep in mind.

If the decoding failure policy is to set an error, and return, you have to keep in mind, that once an error is set on Unarchiver, it won’t later be reset.

And that’s because one decoding failure often leads to a cascade of decoding failures.

And we don’t want to lose sight of what originally went wrong.

Second, keep in mind that any given failWithError call, can either throw an exception or continue execution, so you have to keep both of those options in mind.

Especially if you’re working Objective-C.

Maybe you can catch the exception.

So, there’s things to handle there.

And lastly, keeping an eye out for nil, or 0 return values.

This could either happen because of a decoding failure, if the decoding failure policy is to set an error and return.

Or, the data could have just been missing.

Or, even encoded as nil or 0.

So, to disambiguate between these cases, check the error property on the Unarchiver.

All right, so that was a lot of information.

Let’s distill this down into a checklist to see how we can adopt NSSecureCoding on our own types.

We’ll start by converting all decode object for key calls to the variant which allows us to pass in that type information up front.

And then, if something goes wrong instead of just returning nil, let’s failWithError to indicate what happened.

And lastly, this is a great opportunity to audit for further points of failure, where we weren’t performing validation before.

So, let’s do just that.

So, here again is a decode call to decode a listing.

And you’ll notice that if we pass in that type information up front, the conditional downcast later can go away.

There’s a generic overload of this method, which when given the type information statically causes you to not have to conditionally downcast anymore.

You’ll always get an object of that class.

Now, again, as we said, instead of returning nil on failure, we want to fail meaningfully.

So, here we can fail with an error to indicate what went wrong and where.

And in this case, we can use one of the conveniences on CocoaError to return a meaningful error that has a good localized description for our customers and that indicates what went wrong.

We can always add a debug description for ourselves, to later log if we care to.

But this is the core.

We want to failWithError before returning nil.

And then, later on, again, we were decoding the purchase date.

And here, there is a great opportunity to add further validation where we weren’t before.

Here, if we can decode a date, well, I was simply storing the property.

But instead, I want to make sure that that date is valid.

For instance, a purchase couldn’t possibly had been made before my app went live on the app store.

So, this is something you could check.

And again, if something goes wrong here, we want to fail with a meaningful error.

In this case, it wasn’t that the data was missing, it was that the data was corrupted or invalid in some way.

And so, we’ll indicate just that.

Now, that we’ve gone ahead and done exactly this on our type, we can go ahead and claim that it supports secure coding.

And lastly, conform to the NSSecureCoding protocol instead to indicate to the runtime that this is what we intended.

It really does support secure coding.

And after that, well, congratulations.

We’ve earned our NSSecureCoding badge.

Physical badge is sold separately.

Now, we think it’s so important for you to earn your own NSSecureCoding badges and use them that this year, we’ve added new API and NSKeyedUnarchiver to make sure that NSSecureCoding is done wherever possible.

This new initializer and convenience methods, turns on secure coding by default and sets the default decoding failure policy to set error in return.

So, unless you change the decoding failure policy on your own, you don’t have to worry about exceptions in Swift.

And indeed, the old initializer and convenience methods are deprecated in favor of these versions.

So, we really want you to use them.

And similarly, we’ve introduced the same APIs on NSKeyedUnarchiver, to make it easier for you to turn SecureCoding on when archiving.

And this is equally important because it makes sure you can’t accidentally archive an object which doesn’t conform to NSSecureCoding.

And you wouldn’t later able to decode it.

And again, the old initializer and convenience methods are deprecated.

And so, this means that if you have old code that looks like this, well turn on SecureCoding when archiving.

And switch to the convenience methods that allow you to pass in that type information upfront.

This way, you can actually make use of your SecureCoding badge.

Make sure you’ve earned it.

Now, if you can’t yet support SecureCoding, because your types can’t conform, or they, themselves depend on types which don’t yet conform, that’s okay.

You can still use these methods.

On nCode, simply turn off the secure coding requirement.

And on decode, these conveniences always have SecureCoding enabled.

So, instead use the new initializer to create a KeyedUnarchiver, and then turn SecureCoding off, manually.

This also gives you the opportunity to reset the decoding failure policy back to the old default in case you need it.

Once you have such an Unarchiver, well you can decode as usual.

Now, if you’re working on a Swift app, NSSecureCoding isn’t the only way for you to convert your own model types to external representations and back.

Last year with Swift 4, we introduced the Codable protocols which make it easier to do just that.

And importantly, all the design decisions that made their way into NSSecureCoding were also present in Codable from day one.

Specifically, Codable never writes type information into archives.

So, there’s nothing to trust later on.

By requiring you to pass in static type information upfront about what you expect to decode, you can prevent exactly this sort of attack.

Now, Codable also comes with conveniences.

If you have a type whose fields are all themselves codable, well you’ll get a synthesized implementation of the init from and encode to requirements.

And the synthesized implementation gives you type and nullability checking for free.

But as we saw, most types require additional validation if they come from external sources, just like ours do.

So, we need to further validate those.

So, we can do that for our own types by overwriting in it with init from decoder.

And in this case, again, here’s that JSON response from earlier.

And I can trivially turn it into a Codable type by simply creating a type with all the same name fields.

Because all these fields are codable, I get that free implementation of init from and encode to.

But, I want to override init from to make sure I’m performing the same validation that I was before with the primitive values.

So, I can do that here, the same way.

Where my old code decoded the ID from the payload by downcasting to an int, I can do similarly here by decoding an int from the decoder.

If the type found in the payload is of a different type or is missing, this will throw an error that indicates what happened.

Now, more important than that is my own validation that I added in that validate method.

Here, I want to make sure I keep performing that same validation to make sure the ID is valid.

And here, I can use a convenience method to throw a meaningful error that indicates what happened.

Now, later in that validate function I might’ve pulled that creation date out as a string and then had to pass it along to a date formatter to get a valid date back out.

Here, because we can use JSON decoder, we can simply decode a date directly, we don’t have to worry about that type conversion.

We can use JSON decoders date decoding policy to indicate what sort of conversion can happen here.

And this is nice because this conversion became a one-liner.

And, the other decode call is also a one-liner, which makes it easy for me to focus on the types of validation that I care about.

Now, later on in the validate function, I might have also pulled out that substructure of tags out as an array of strings, and later had to map those strings to my own type, maybe for further validation later on.

Here, again with Codable, I have the benefit of tag conforming to Codable itself.

And so, I can just decode an array of tags directly.

This happens automatically.

And instead of focusing on the type conversion, allows me to focus on the validation that matters to me.

This lets me make sure that this data is data I can trust.

We talked about a lot today.

We started with raw data and carried it all away across the abstraction spectrum to our own model types.

And along the way we saw how to build trust into that data by validating a checksum, or the length on that raw data, we made sure that it was valid to work with an inspect, and even check whether it conforms to some format.

Once we’ve made sure that it conforms to a known format, we knew that that formatted data was valid to work with and pull primitive values out of.

By validating the contents and structure of those primitive values, we made sure that it was valid to work with them to turn them into our own model types.

By validating the semantics and relationships between those model types, we made sure that they were valid to work with and that they were data we could trust.

So, where to go from here?

Well, now that you know all of this, I encourage you to go and look at your code and do just this.

Validate data at every stage in the lifecycle of that data.

Check for types and nullability.

But more importantly, range, and length, and domain-specific applications.

If you have NSCoding types, audit those types and earn your NSSecureCoding badge.

And don’t just earn it, use it.

Turn on secure coding wherever you can.

And lastly, for new data types, adopt Codable where possible and make sure to perform that validation.

Make sure you’re only ever working with data you can trust.

For more information about Codable, specifically, see last year’s talk about What’s New in Foundation.

But if you have more questions about this topic or want to come to us for help on how to apply this in your own apps, come to our lab later today, we have a foundation lab, where we can help you do just that.

With that, I’d like to say thank you very much.

Have a great day.

[ Applause ]

Apple, Inc. AAPL
1 Infinite Loop Cupertino CA 95014 US