Introducing Apple File System

Session 701 WWDC 2016

The Apple File System (APFS) is the next-generation file system designed to scale from an Apple Watch to a Mac Pro. APFS is optimized for Flash/SSD storage, and engineered with encryption as a primary feature. Learn about APFS benefits versus HFS+ and how to make sure your file system code is compatible.

[ Music ]

All right.

Welcome. My name is Eric Tamura.

I'm accompanied by Dominic Giampaolo and we're going to tell you a little bit about Apple File System.

[ Applause ]

All right so this is going to be a little bit of our roadmap for our presentation today.

I'll tell you a little bit about introduction and motivation.

Why we decided to build this.

Some of the new features that we've added as part of Apple File System.

We'll do a short demo of some of the new features.

And then we'll wrap it up with some new APIs that you can use in your apps.

Okay, let's start it.

So as Sebastian mentioned, Apple File System is available in the WWDC build of macOS Sierra that you all got yesterday.

And it will be available as a developer preview technology in macOS Sierra once it finally ships this fall.

So what is Apple File System?

It's our next generation file system that we've been building for Apple products.

And you might care about this because it will run on watchOS, iOS, tvOS and macOS.

So as far as the intended audience here, we expect that some of you may be either new to the platforms, or you're a long time developer, but we intend to cover all of the high level of the new file system in enough detail so you can follow along.

So, one of the hallmarks of this product is that we wanted to scale from an Apple Watch all the way up to a Mac Pro.

We also wanted to take advantage of Flash and SSD storage, because nearly all of our products use SSDs.

And finally it's been built with encryption as a primary feature from the very beginning as we brought this idea to fruition.

So you might be wondering what about HFS+.

Well, we are currently shipping HFS+ as our primary file system.

But its original design is almost 30 years old at this point.

So how many of you would like us to ship HFS+ for another 30 years?

Great. So HFS+ was designed in an era where floppies and hard drives were state of the art, and the world has changed since then.

We now use SSDs, and other next-generation storage technologies are evolving as well.

The data structures in HFS+ were also relatively single-threaded, so our B-trees rely on a big lock in order to access or mutate them.

And the data structures are also relatively rigid.

And by this we mean things like the file record, or catalog record in HFS+, which is more or less equivalent to an inode in other file systems, is fixed. In order to add new fields, to expand the file system, to give it new features, we would have to incur the cost of a backwards-incompatible volume format change.

And by this we mean what happens if we add a new feature to HFS+ and take it all the way back to 10.5 and try to attach that same file system?

What will happen?

So if we're concerned about backwards compatibility as well as forwards compatibility we're starting to think about well maybe it makes sense to build something completely new altogether.

And so we thought about something new.

So we wanted something that was designed and tuned specifically for Apple products.

And other file systems serve other purposes and they do it well.

In particular filers or enterprise level storage servers have a lot of features that might not make sense on Apple products.

We typically use a single storage device on all of our products.

And we have a wide range of scale.

So on an Apple Watch, you have significantly less DRAM and storage than does a Mac Pro, which has tens of gigabytes of DRAM potentially, and terabytes of storage.

So we wanted something that's flexible and dynamic and that adapts to the platform on which it's running.

So other things we wanted to build, we wanted to add new and enhanced security capabilities.

So on iOS today we already ship a version of HFS+ that uses per-file encryption.

So every file is encrypted differently on storage from every other file on the file system.

We want to take that a step further and we'll dive into some of those features a little bit later on in the presentation.

And finally, we just wanted to add some new general file system features that have been requested and that we learned were really important for the future of our platforms.

So before we get into these new features, I just wanted to give you a brief view of the landscape of what storage software looks like at Apple.

And so in terms of file systems and storage, we talk about HFS+, but it's actually not just HFS+ in the little small bubble, there's actually many ancillary technologies that comprise our storage software.

So in the beginning, we had HFS standard, almost, you know over 30 years ago at this point.

We then added HFS+, some number of years later.

But we added crash protection to it.

So we gave it a journal, and a case-sensitive variant as well.

We also added Core Storage, which gives us full disk encryption as well as our Fusion Drive, which combines the speed of an SSD with the capacity of a hard drive.

And let's not forget all of the iOS specific variants.

We have an iOS specific variant of HFS+ as well as one that supports the per file encryption that we just discussed.

So our intention is that all of these technologies will get replaced by one thing, which is Apple File System.

So, let me tell you a little bit about some of the new features in Apple File System.

So, this is a brief view of what we've got on deck here, so we've got some improved file system fundamentals, HFS compatibility, Space Sharing, cloning files and directories, snapshots, and reverting to snapshots.

A feature we're calling fast directory sizing.

Atomic safe save primitives and encryption.

So you don't have to memorize all of these, we will go into greater detail of all of these in the upcoming slides.

So, first let's talk about some of the improved fundamentals.

So first it's been Flash and SSD optimized.

So all of our iOS devices, and a lot of our Macs, ship with SSDs, so we want to be as friendly to solid state drives as we know how.

It's also crash protected. APFS, or Apple File System, employs a novel copy-on-write metadata scheme, so every metadata write is written into a new location on stable storage.

We combine this with a transaction subsystem which ensures that if you lose power, if the machine panics, or anything bad happens, you'll either see a consistent view of what was on disk, or you won't see the change at all.
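
As a rough illustration of the idea just described (not APFS's actual on-disk design), here is a toy Python model of copy-on-write metadata combined with a transaction commit. The class name CowStore, its fields, and the crash_before_commit flag are all invented for this sketch. The point it shows: new metadata always lands in a fresh location, and a change only becomes visible when a single root pointer is switched.

```python
class CowStore:
    """Toy copy-on-write store: committed blocks are never overwritten in place."""

    def __init__(self):
        self.blocks = {}     # block address -> metadata payload (a dict)
        self.next_addr = 0   # next unused address
        self.root = None     # address of the last committed metadata

    def _write_new(self, payload):
        # Every write goes to a brand-new address.
        addr = self.next_addr
        self.next_addr += 1
        self.blocks[addr] = payload
        return addr

    def commit(self, changes, crash_before_commit=False):
        """Apply `changes` as one transaction."""
        current = dict(self.blocks[self.root]) if self.root is not None else {}
        current.update(changes)
        new_addr = self._write_new(current)   # old metadata is left untouched
        if crash_before_commit:
            return                            # simulated power loss: root unchanged
        self.root = new_addr                  # the single switch that publishes the change

    def read(self):
        return dict(self.blocks[self.root]) if self.root is not None else {}


store = CowStore()
store.commit({"a.txt": 1})
store.commit({"b.txt": 2}, crash_before_commit=True)
print(store.read())   # only the first change is visible
store.commit({"b.txt": 2})
print(store.read())   # now both are visible
```

After the simulated crash, a reader still sees the old, consistent state; the half-finished change simply never appears.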

We have modern 64-bit native fields, so the inode number has been expanded to 64-bits.

We have timestamps that are now 64-bits.

We support nanosecond time stamp granularity.

We also support sparse files for the first time on an Apple file system.

And all of our file and directory records that point at where the blocks actually live on disks have been expanded to 64-bits.

Our data structures are also extensible and allow for future growth.

So one thing that we talked about HFS+ is that its data structures are relatively rigid.

And in APFS, or Apple File System, the data structures that represent the core inode are now flexible.

So fields are either optional, or we may not have invented them yet.

So new fields that we may choose to add down the line will be correctly recognized as "not supported," or "I don't understand," if you attach that storage to today's version of macOS Sierra.

In this way we can add new features without fear of harming backwards compatibility.
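
One common way to get this kind of forward compatibility is a tagged, length-prefixed record, where a reader can skip over fields it does not recognize. The sketch below assumes such a layout purely for illustration; the tag numbers, field names, and byte format are hypothetical and are not the APFS on-disk format.

```python
import struct

# Hypothetical field tags that this (older) reader knows about.
KNOWN_FIELDS = {1: "size", 2: "mtime_ns"}


def encode_fields(fields):
    """Pack (tag, payload) pairs as: tag (u16), length (u16), payload bytes."""
    out = b""
    for tag, payload in fields:
        out += struct.pack("<HH", tag, len(payload)) + payload
    return out


def decode_fields(data):
    """Return (known fields, list of tags this reader does not understand)."""
    known, unsupported = {}, []
    offset = 0
    while offset < len(data):
        tag, length = struct.unpack_from("<HH", data, offset)
        offset += 4
        payload = data[offset:offset + length]
        offset += length                  # the length prefix lets us skip anything
        if tag in KNOWN_FIELDS:
            known[KNOWN_FIELDS[tag]] = int.from_bytes(payload, "little")
        else:
            unsupported.append(tag)       # recognized as "not supported", not an error
    return known, unsupported


# A record written by a hypothetical newer system with an extra field (tag 99).
record = encode_fields([
    (1, (4096).to_bytes(8, "little")),
    (99, b"future-feature"),
])
print(decode_fields(record))   # ({'size': 4096}, [99])
```

An absent field works the same way: a zero-byte file could simply omit the fields that describe where its blocks live.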

This also allows us to have optional fields.

So on some systems, having the mere presence of a file is enough to convey some information.

And so if you have a 0-byte file you don't necessarily need all the machinery that points at which blocks live on disk, because they're not needed.

So those fields are optional.

It's also been optimized for our Apple Software ecosystems.

So we wanted to add features and optimize the APIs that are extremely compelling for our platforms moving forward.

And we also have a low latency design.

And typically in file systems there is a tradeoff between latency and throughput.

And we've chosen to lean on the side of latency.

And we do this because we want your apps when a user clicks on them on the desktop, or they tap on them on your phone you want it to come up quickly and responsively and have very crisp animation.

And the reason for that is when you go down to the file system, we want to ensure that we get you the answers you need as quickly as possible.

And finally we have native encryption support built into the file system.

On HFS+, as we mentioned, we use per-file encryption, but the keys are stored on disk through extended attributes.

On Apple File System that's not the case these are now first class citizens, first class objects inside the file system.

So that's a bit about the fundamentals.

So HFS compatibility, if all of you have apps that run on HFS+ just fine, we intend for those to continue to run without any changes whatsoever on your side.

So Apple File System will support and replace HFS+ functionality.

And there's an asterisk there because there are three things that we will not support moving forward.

One of them is exchangedata, the other is searchfs, and the third is directory hard links for Time Machine.

But every other API and behavior will be supported just as it is on HFS+.

So now I want to tell you a little bit about Space Sharing which is one of the features that we've added into Apple File System.

So let's take a quick poll how many of you in the audience have a Mac or have used a Mac with more than one partition?

Okay, great.

We do as well, one of the things that we do internally, is we want a development version of OS on one partition, and we want you know the stable released version on another.

Or you might choose to have your home directory on one and other different data that you don't care about in another.

But one of the things that we've learned through our analytics that come back when users opt in to data collection and reporting their statistics to Apple machines is that most end users don't do this.

They just have the one partition.

And the reason they don't is because it's hard.

You have to know exactly how you're going to lay out your disk at the time you set it up, and changing it is relatively expensive.

Moreover, free space on one partition, as you know does not translate into available free space on another partition.

So we're solving this with a feature we're calling Space Sharing.

So let's take an example here, we'll work through this as we kind of explain the feature.

Let's say you're downloading the latest and greatest cat video you just got off the internet, from your friend over AirDrop.

And let's say that file grows, and it gets bigger, in fact so big that you've completely run out of space on the partition on which you're running.

Well there's not a whole lot you can do in this case.

If you're out of space, you're out of space.

One thing you could do though is completely destroy the partition immediately afterwards and then grow partition 1.

So let's look at that.

We can destroy partition 2, partition 1 grows, and now you have enough space to continue growing your cat video.

But this is also inflexible and presents a little bit of a problem if the file that you're downloading wasn't on partition 1, but it was in fact on partition 0.

So in this case, you're going to grow the file, it will get bigger.

And then even if you have free space, or content that you're willing to destroy on partition 2, we could destroy it, but then partition 0 couldn't grow, because it's not adjacent to any of the free space that we just made available.

So we think this is something that we can solve with Space Sharing.

So in Apple File System we've come up with this base concept we're calling the container, aptly named because it contains volumes, or individual file systems.

So, in this instance, Apple File System containers represent the lowest level of functionality.

And this is what encapsulates our block allocator, as well as our crash protection subsystem.

So let's say we have volume 0, which occupies some amount of space in the container. That volume can grow, or it can shrink.

But in all these cases the free space will dynamically resize to what's currently available at the time that you request it.

You can also create more than one volume in the container which will occupy incrementally more space.

And then if you wanted to grow partition or volume 0, at this point you could do so.

And now if you ask for how much free space is available on the system you will get the area that's in the green rectangle at the bottom.

So developers take note, this is something that's a little bit subtly different from how you may have computed free space in the past.

If you're using some paradigm like taking a total storage size, subtracting the used space to get the free space, that will no longer work, because other volumes on the container are also participating in Space Sharing.

Additionally, you can't necessarily add up all of the used space either.
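
To make the free-space pitfall concrete, here is a small Python model of a container whose volumes draw from one shared pool. The Container class and the numbers are invented for illustration. The last line uses Python's standard shutil.disk_usage, which asks the operating system for the free space instead of deriving it.

```python
import shutil


class Container:
    """Toy model: all volumes in a container draw from one shared pool."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = {}                     # volume name -> bytes used

    def add_volume(self, name):
        self.used[name] = 0

    def free_space(self):
        # Free space is a property of the container, not of any one volume.
        return self.capacity - sum(self.used.values())

    def grow(self, name, nbytes):
        if nbytes > self.free_space():
            raise OSError("container is out of space")
        self.used[name] += nbytes


c = Container(capacity=100)
c.add_volume("vol0")
c.add_volume("vol1")
c.grow("vol0", 30)
c.grow("vol1", 50)

naive = c.capacity - c.used["vol0"]   # "total minus my used space": 70, wrong
actual = c.free_space()               # 20, because vol1 shares the pool
print(naive, actual)

# In real code, ask the system rather than computing it yourself.
print(shutil.disk_usage(".").free)
```

The naive figure overstates what vol0 can actually grow by, which is exactly the mistake the subtraction paradigm makes under Space Sharing.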

So, next, I'm going to invite Dominic up and he will tell you about cloning files and directories.

[ Applause ]

Hi. Again my name is Dominic and I'm going to walk through a couple of the other higher level features that we have in APFS.

First, we're going to talk about cloning of files and directories.

So here we have a file, TOP SECRET APFS.key, that Eric has in his home directory.

And it has references to two blocks of data.

Now if Eric wanted to make an archive of this presentation as it existed at this point in time, he could copy the data by reading it all in, and writing it back out.

That has obvious costs in terms of CPU, power, and disk space usage.

Instead with APFS you can clone the file.

By cloning the file, you copy the references to the data instead of the actual data.

So it's obviously much faster and if it's a large file, you're not using twice the amount of space, you're using exactly the same amount of space plus a small incremental amount for the additional references to the data.

What a clone guarantees in the file system is that if a modification is made to either the original or the clone, the file system will write that data to a new location.

So the clone remains untouched.

So this is an important point to be aware of.

When you have clones, at the time of the clone, you're not using any additional space.

As you continue to make modifications, you will start to use more and more space.
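
The behavior described here can be modeled with reference-counted blocks. The following Python sketch is a toy, with invented class names (BlockStore, File), and makes no claim about how APFS implements clones internally. It shows a clone costing no extra data blocks at first, then using more space only as one side is modified.

```python
class BlockStore:
    """Toy block allocator with reference counts."""

    def __init__(self):
        self.data = {}      # block id -> payload
        self.refs = {}      # block id -> number of files referencing it
        self.next_id = 0

    def alloc(self, payload):
        bid = self.next_id
        self.next_id += 1
        self.data[bid] = payload
        self.refs[bid] = 1
        return bid

    def retain(self, bid):
        self.refs[bid] += 1

    def release(self, bid):
        self.refs[bid] -= 1
        if self.refs[bid] == 0:            # nobody refers to it any more
            del self.data[bid], self.refs[bid]

    def blocks_in_use(self):
        return len(self.data)


class File:
    def __init__(self, store, payloads=()):
        self.store = store
        self.blocks = [store.alloc(p) for p in payloads]

    def clone(self):
        """Copy the references to the data, not the data itself."""
        twin = File(self.store)
        twin.blocks = list(self.blocks)
        for bid in twin.blocks:
            self.store.retain(bid)
        return twin

    def write(self, index, payload):
        """Modified data goes to a new block; the other file keeps the old one."""
        old = self.blocks[index]
        self.blocks[index] = self.store.alloc(payload)
        self.store.release(old)

    def read(self):
        return [self.store.data[b] for b in self.blocks]


store = BlockStore()
original = File(store, [b"slide1", b"slide2"])
archive = original.clone()
print(store.blocks_in_use())    # 2: the clone added no data blocks
original.write(1, b"slide2-v2")
print(store.blocks_in_use())    # 3: only the modified block is new
print(archive.read())           # the clone is untouched
```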

In addition, because iOS and macOS support document and application bundles, APFS will also allow you to clone an entire directory hierarchy.

So a document bundle is a directory which contains a set of files inside of it.

APFS can clone that atomically as well.

Next let's talk about snapshots.

Here, we have another representation of a file system with two files in it.

BikeRacing and CoffeeOrigins.

BikeRacing has two blocks of data.

And CoffeeOrigins has one.

If we take a snapshot of the file system, we now have a separate, independently mountable, read-only copy of the file system that represents the state of the file system at the point in time that the snapshot was taken.

Much like with clones, if a write comes into the live file system, the file system will put that data in a new location preserving the integrity of the snapshot.

Likewise, if we were to delete CoffeeOrigins.key to try to free up some space, the file system can't reclaim those blocks because as you can see the snapshot continues to refer to those blocks.

This is an important consideration that developers need to be aware of when working with snapshots because when a file is deleted, if it was present at the time of a snapshot, the blocks aren't reclaimed.

So snapshots can cause you to use all of your disk space if you don't harvest them periodically.

We expect that developers will probably use snapshots for the purposes of having a stable read only copy from which to perform a backup, but we're looking for other feedback from developers on other uses that they might have for snapshots.

So please come see us in the lab at 12:30.

Let us know what you would like to do with snapshots.

Now let's talk about reverting to a snapshot.

And this is another feature that APFS supports.

So we have the same state of the file system here but we decide that well, we don't like this, we'd like to revert, essentially a global undo.

We want to go back to the point in time of the file system at the time the snapshot was taken.

So you can flag a file system to revert to the state of a snapshot.

And the next time it's mounted, the file system will rewind, essentially to the point it was at the time of the snapshot.

And then allow you to continue making changes from that point forward.

So, again, you can see that CoffeeOrigins.key came back and the change that was made to the other file has been discarded.

The snapshot continues to exist and you can revert as many times as you would like.
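
The snapshot and revert behavior from the last two sections can be sketched as a toy Python model. Everything here (the Volume class, its method names, the block bookkeeping) is hypothetical, and the model reverts immediately rather than at the next mount. It shows that blocks referenced by a snapshot are not reclaimed when a file is deleted, and that reverting brings the deleted file back and discards later changes.

```python
class Volume:
    """Toy volume: a live name map plus frozen snapshot name maps."""

    def __init__(self):
        self.live = {}          # file name -> tuple of block ids
        self.snapshots = {}     # snapshot name -> frozen copy of the name map
        self.blocks = {}        # block id -> payload
        self.next_id = 0

    def write_file(self, name, payloads):
        ids = []
        for payload in payloads:          # new data always goes to new blocks
            self.blocks[self.next_id] = payload
            ids.append(self.next_id)
            self.next_id += 1
        self.live[name] = tuple(ids)
        self._reclaim()

    def delete_file(self, name):
        del self.live[name]
        self._reclaim()

    def take_snapshot(self, name):
        self.snapshots[name] = dict(self.live)

    def delete_snapshot(self, name):
        del self.snapshots[name]
        self._reclaim()

    def revert_to(self, name):
        self.live = dict(self.snapshots[name])   # the snapshot itself is kept
        self._reclaim()

    def _reclaim(self):
        # A block is freed only when neither the live view nor any snapshot uses it.
        referenced = set()
        for table in [self.live, *self.snapshots.values()]:
            for ids in table.values():
                referenced.update(ids)
        for bid in list(self.blocks):
            if bid not in referenced:
                del self.blocks[bid]


vol = Volume()
vol.write_file("BikeRacing.key", [b"b1", b"b2"])
vol.write_file("CoffeeOrigins.key", [b"c1"])
vol.take_snapshot("snap")

vol.delete_file("CoffeeOrigins.key")
print(len(vol.blocks))    # still 3: the snapshot pins the deleted file's block

vol.write_file("BikeRacing.key", [b"b1", b"b2-edited"])
vol.revert_to("snap")
print(sorted(vol.live))   # CoffeeOrigins.key is back
print(len(vol.blocks))    # 3 again: the discarded edit's blocks were freed
```

Deleting the snapshot (delete_snapshot in this model) is what finally lets pinned blocks be reclaimed.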

All right.

Now let's talk about fast directory sizing.

This is an answer to the question how much space does a directory hierarchy use.

Now applications frequently need to compute this size for sizing an operation to provide progress to the user.

And the obvious way to do this is to open the directory hierarchy iterate all the contents recursively and look at the size of all the items to add the size up.

Of course users would really like to know that answer a little bit more quickly.

On this next slide, if you focus your attention on the left side of the screen, when the get info panel comes up, you'll see it says, calculating size.

And after a couple of seconds it populates with the size.

That's what we're looking to improve.

So, the file system could keep track of this.

Obviously you could store the size of the directory hierarchy with the directory itself, but that has one main issue: how do you safely update your parent, and its parents, on up the chain?

So we're just delving a little bit into file system internals, but when you have a lock on a child when you're modifying it, you can't also lock your parent, because that's a lock order violation.

File systems always lock from the parent to the child, never from the child to the parent.

And so if you start doing it the other way you have deadlock.

So APFS instead sidesteps the problem.

If the problem is storing the size with the directory, well let's store the size somewhere else.

So by storing the size separately, we can use atomic operations to update the size in a separate record that's maintained by the file system.

And we don't have any lock order violations.

This comes at a small incremental cost for the additional size records, but that's basically lost in the noise with the IO.
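
Here is a Python sketch contrasting the two approaches. walk_size is the "obvious way" using the standard os.walk. SizeRecords is a hypothetical, single-threaded toy that keeps per-directory totals in a separate table keyed by path, so a lookup is one dictionary read; a real file system would update such records with atomic operations, which this model does not attempt.

```python
import os
import posixpath
import tempfile


def walk_size(root):
    """The obvious way: visit every file under root and add up the sizes."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            total += os.lstat(os.path.join(dirpath, name)).st_size
    return total


class SizeRecords:
    """Toy model: hierarchy sizes live in their own table, apart from the directories."""

    def __init__(self):
        self.totals = {}   # absolute directory path -> bytes in the hierarchy below it

    def file_changed(self, path, delta):
        """Record that the file at absolute `path` grew (or shrank) by `delta` bytes."""
        directory = posixpath.dirname(path)
        while True:
            self.totals[directory] = self.totals.get(directory, 0) + delta
            if directory in ("/", ""):
                break
            directory = posixpath.dirname(directory)

    def size_of(self, directory):
        return self.totals.get(directory, 0)   # no traversal needed


# The walk touches every file...
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "italy"))
    with open(os.path.join(root, "italy", "a.jpg"), "wb") as f:
        f.write(b"x" * 1000)
    with open(os.path.join(root, "b.jpg"), "wb") as f:
        f.write(b"y" * 500)
    print(walk_size(root))            # 1500

# ...while the maintained records answer with a single lookup.
records = SizeRecords()
records.file_changed("/photos/italy/a.jpg", 1000)
records.file_changed("/photos/b.jpg", 500)
print(records.size_of("/photos"))     # 1500
```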

All right.

Next, we're going to talk about atomic safe-save primitives.

The first example is just a basic file.

This is how safe-save works for a regular file today.

So here I have MakeMoneyFast.key.

And I come up with some brilliant new scheme for making money fast.

And when the application saves that data, it's written to a temporary location off on the side.

When the application is happy that everything has been written and is safe out on disk, it will ask the file system to perform a rename.

Now renames of files have always been atomic.

The file system guarantees that it either happens completely, and it's safe, or it doesn't happen at all.

In addition, the file system will handle deleting the previous version of the document.
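
The single-file safe-save pattern described above can be sketched in Python with the standard library. The function name safe_save is invented for this example. os.replace maps to the rename operation, which on POSIX systems atomically swaps the new file into place and drops the previous version.

```python
import os
import tempfile


def safe_save(path, data):
    """Write data to a temporary file, then atomically rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same directory so the rename stays on one volume.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())      # make sure the bytes are on disk first
        os.replace(tmp_path, path)      # the atomic step: old version is replaced
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)         # clean up the temporary file on failure
        raise


with tempfile.TemporaryDirectory() as folder:
    doc = os.path.join(folder, "MakeMoneyFast.key")
    safe_save(doc, b"scheme v1")
    safe_save(doc, b"scheme v2")
    with open(doc, "rb") as f:
        print(f.read())                 # b'scheme v2'
    print(os.listdir(folder))           # only the document remains
```

At every instant a reader sees either the complete old version or the complete new one.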

So that's great for regular files, but what happens if you have a document bundle?

So here we have a document bundle ClutchConcertReview.rtfd, which is a directory that contains the assets of the document inside of it.

And what happens today is let's say I go see Clutch play and they play a really great show and I update my review, that change is written out, but now what commences is there's no way to do an atomic rename of a directory over top of another directory, because POSIX semantics don't allow that if the destination has something inside of it.

So, we begin playing a shell game.

First, the document is moved out of the way, the live document.

So at this point, if something were to go wrong and the application crashed, or the system lost power, the user's data is gone.

Then, the application moves the data into place, and last it has to handle deleting the previous version of the directory, the document bundle.

So this is not atomic and it's not safe.
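
The three-step shell game can be written out in Python to show where the unsafe window is. The function name save_bundle_non_atomic and the bundle contents are invented for this sketch. The example first shows that renaming a directory directly over a non-empty directory is refused, which is why the extra steps are needed.

```python
import os
import shutil
import tempfile


def save_bundle_non_atomic(live, new_version):
    """Replace the directory `live` with `new_version` in three separate steps."""
    backup = live + ".old"
    os.rename(live, backup)           # 1. move the live document out of the way
    # A crash at this point leaves nothing at `live`.
    os.rename(new_version, live)      # 2. move the new data into place
    shutil.rmtree(backup)             # 3. delete the previous version


with tempfile.TemporaryDirectory() as folder:
    live = os.path.join(folder, "ClutchConcertReview.rtfd")
    new = os.path.join(folder, "ClutchConcertReview.rtfd.tmp")
    for bundle, text in ((live, "old review"), (new, "new review")):
        os.mkdir(bundle)
        with open(os.path.join(bundle, "TXT.rtf"), "w") as f:
            f.write(text)

    try:
        os.rename(new, live)          # destination directory is not empty
    except OSError as err:
        print("direct rename refused:", err.strerror)

    save_bundle_non_atomic(live, new)
    with open(os.path.join(live, "TXT.rtf")) as f:
        print(f.read())               # new review
```

Each step is individually atomic, but the sequence as a whole is not.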

And this is something that has kind of bugged us for a very long time and we wanted to improve it.

With APFS we introduced a new system call, renamex_np (np for non-POSIX), which allows an atomic safe-save of the directory.

So now, when the application writes the data to its temporary location and asks to perform the rename operation, APFS will atomically handle the swap and deleting the previous version of the document.

So this is now atomic and safe.

Of course as a developer, you probably won't have to resort to this low-level of system call because it's already been adopted by Foundation for you.

So you just get the benefit of this improved behavior on APFS.

[ Applause ]

Next, I'm going to talk about encryption.

So as Eric mentioned, with HFS+, on the Mac we use a layer called Core Storage that sits beneath HFS, and provides full disk encryption among other things.

It's a rather sophisticated layer, and it does a lot of things.

On iOS, we have a different variant that stores encryption keys in extended attributes.

And those encryption keys work in conjunction with the accelerated AES hardware found on iOS devices to provide per file encryption.

It's a kind of complicated story with two rather different code bases.

And with APFS, we were looking to try and provide a more complete story across all of our products.

So APFS supports multiple levels of file system encryption.

But first, the easiest level we got this working day one, is no encryption.

All data and metadata is written in plain text to disk.

The next level is to have one key per volume.

So all sensitive metadata and data are encrypted with the same key.

This is essentially the equivalent of full disk encryption.

The most sophisticated level that we support is multi-key encryption.

Here, sensitive metadata is encrypted with a single key that's distinct from the per-file keys that are used in encrypting individual files.

In addition, because of how snapshots and clones work, APFS supports per extent encryption.

So each region of a file can be encrypted with its own key.

This is unique and no other file system out there supports anything like this.

In addition, this allows us to unify our encryption story across all of our platforms.

All right and with that I'll turn it back over to Eric.

[ Applause ]

Okay so now I'm going to show you a quick demo of Apple File System on macOS Sierra using the WWDC build.

So probably the easiest and fastest way to start experimenting with Apple File System is to use a disk image.

So we're going to do that first.

So, you can see on the command line here I've typed hdiutil create -fs APFS, which says create me a disk image of type APFS. We give it a size, and we're going to do a sparse bundle.

So it's going to warn you here, because this is an in-development project and we want you to be aware that you are using something that's not completed 100% yet.

So, at this point it will prompt me, I'll say yes.

And you've created the disk image, which if I attach.

You can examine that on the desktop, do a get info.

You can see that in fact the file system type is APFS.

So that's probably the easiest way if you want to just get something and try it out.

So next I want to show you some of the other, more advanced features that we've added.

I'll close that.

So here I have two thumb sticks.

These are just ordinary thumb sticks.

You can get them at any standard office supply store.

So I will plug in one.

One is formatted as HFS+ and the other is formatted as Apple File System.

So we'll also do a get info on both so you can watch the free space as it manipulates.

So at this point I have some demo photos of a trip to Italy.

And there's a good amount of storage in both of those directory hierarchies.

But first we're going to start by copying this latest copy of iTunes on the HFS volume, and then we'll do the same thing on APFS.

So start the copy, as that goes.

You can see the progress bar there, but APFS already finished.

Because it uses a clone under the covers.

So Finder has already adopted all of the new cloning behavior.

So if you do a copy in Finder, it will automatically clone for you behind the scenes.

And HFS still hasn't finished yet.

[ Applause ]

Okay, so I could do the same thing with the demo photos, which you can look in here, there's several photos, they're all several megabytes in size.

Pay attention to the free space up here, 3.35-gigabytes free.

So if I do a copy, it's actually going to do a clone for me and you notice the free space actually did not decrease at all.

So next, I'm going to show taking a snapshot.

So this uses a tool called SnapshotUtil, which will be available in the public beta once that releases to everybody.

Oh, sorry this needs to be run as root.

Okay so now I've created a snapshot, I can examine it with SnapshotUtil -s.

And you can see that it now knows about APFS Snap.

So I've created a mount point, already, before this session.

So I will mount this snapshot at this time.

And you can see that the snapshot showed up on the desktop and this contains a read only view of the file system as it exists at the time that I just took it, so.

[ Applause ]

So now in the APFS volume, I'm going to create a temporary file, hello I am a temporary file.

Save that, close it.

You can see that it showed up here in that APFS volume this is mounted read write.

But it's not there in the snapshot.

Correspondingly, I can also delete some of these demo photos.

I'll move them to the trash and then delete them.

The free space actually still does not increase, because now they're pinned by the presence of the snapshot.

So if I wanted to delete them I would also have to delete the snapshot.

Okay, so that's just a quick peek at Apple File System in action.

[ Applause ]

Okay, so let's talk about some of the new APIs that we've added to support Apple File System.

So first, is one that we probably expect that you're familiar with.

If you're using the Foundation or the FileManager these have both been Swift enhanced.

So if you use copyItem or replaceItem, they will adopt either the clone or the safe-save semantics that we just described, automatically. You don't have to do anything, so it just comes for free.

It automatically figures out if the file system you're using is HFS+ or Apple File System and will use the behavior only when it is appropriate.

If, however you decide that the Foundation or the FileManager doesn't provide exactly what you need, you can go a little bit lower and we have a library called libcopyfile.

And this supports the copying of deep hierarchies and this is what we used for a number of years before we had cloning.

So copyfile supports a new bit, called COPYFILE_CLONE.

It's equivalent to the 5 or 6-bits that are below it.

And we decided to make this opt-in because if you're using a specialized library like this you may not necessarily want your [inaudible] and extended attributes and everything else to be copied exactly as they are.

Whereas cloning will implicitly copy all of those things.

Again, this library will also automatically call clone for you if the backend file system supports it.

And if not it will continue to do what it's always done.

These are the new safe-save APIs, so renamex_np and renameatx_np are the new system calls to support the safe-save primitives.

These are available in the man pages in the version of macOS Sierra that you have, so if you'd like to take a peek at the man page, they're right there.

And these are the cloning APIs as well.

So clonefile and its variants support the cloning of files and directories.

So a word on compatibility: we expect that the easiest way to get access to an Apple File System image is to use hdiutil, as I showed you.

It's currently available on the command line only for macOS Sierra; as a developer preview technology, it has intentionally not been fully wired up into Disk Utility.

So the fastest way is to use something like hdiutil create -fs APFS, and you can get a disk image and attach it.

You can also use diskutil apfs to add a container, delete a container, add a volume, delete a volume.

Whatever lower-level manipulations you want to do to the container itself.

And finally, we also have an fsck that we've been working on as well.

So this will be able to validate the file system as well as perform repairs.

So that's also continuing to be in development.

So, some current limitations of Apple File System in macOS Sierra.

This will be supported on data volumes only.

We don't support booting from Apple File System right now.

Time Machine backups with Apple File System are not supported right now.

FileVault and Fusion Drive support is still forthcoming.

And currently the volume format is case sensitive only right now.

So if you're not sure whether your app requires case insensitivity, please give it a try: create a disk image or set it up on a partition on your Mac, try instantiating it and running your app from Apple File System, and let us know how it's doing.

Some other compatibility notes.

Apple File System cannot be shared over AFP.

So if you want to use file sharing, we recommend that you use SMB instead as the preferred file sharing mechanism moving forward.

OS X Yosemite or earlier will not recognize Apple File System volumes.

So please do not take an Apple File System instance all the way back to OS X Yosemite or earlier.

You will inevitably get a dialog that you don't want to respond to.

So, [laughter].

macOS Sierra will have a developer preview of Apple File System.

And it will be a developer preview technology once macOS Sierra ships this fall.

So now you might be wondering what's our roll-out plan.

How do all of you get access to Apple File System on your machines?

We'll talk about that.

So upgrading to Apple File System.

So you want these great new features that we've shown you, how do you get them?

Well one way we could do this is to require everybody, all users to back up their systems, save it away, make sure everything's completely safe and then erase the volume, erase the device, restore it put a new OS back on and then restore from backup.

A process which will take several hours, and hope that everything's exactly as it was after you've restored it.

Well, we're not doing that.

Instead, Apple will provide an in-place upgrade path from HFS+ to Apple File System.

[ Applause ]

In doing this, the user data will remain exactly where it is and we will write, or Apple will write the APFS metadata, brand new, into the HFS+ free space.

And we're doing this for crash protection.

This is a multi-second to multi-minute operation potentially.

And over that time, if the device loses power, panics, anything bad happens we want the data on the device to be safe and sound as if nothing had ever happened.

So the Apple File System Converter will try to be as atomic as possible.

It's not completely instantaneous, but as the operation is ongoing, if the device crashes, we intend for it to be completely as if nothing had ever happened.

So Apple File System will ship, by default on all devices in 2017.

[ Applause ]

So to summarize, Apple File System will be the default file system for all Apple products in 2017. It's ultra-modern, it's crash protected, it supports Space Sharing, and we support cloning and snapshots.

Enhanced data security features like the multikey encryption that we just discussed.

It's also been tuned and designed specifically for the Apple ecosystem in all of our devices.

So you can get more information about Apple File System at this URL behind me that will have a developer guide as well as some sample code so that you can see cloning of files and directories in action.

So some takeaways for all of you in the audience.

Apple File System is coming soon, 2017 will be here before we know it.

I want you to please test your apps against Apple File System with the macOS build that you got yesterday.

Try running your apps on Apple File System.

Please let us know how that process goes.

If you report any bugs, please report them through Bug Reporter through traditional means so that we can investigate.

We all want you to love it as much as we do.

So, some related sessions.

If you're interested in learning more about some of the security features on our platform, specifically iOS, we recommend that you can take a look at How iOS Security Really Works, which will be in this room today at 4 o'clock.

And with that, that's Apple File System, we can't wait to see how you're going to use it.

[ Applause ]
