Introducing HEIF and HEVC

Session 503 WWDC 2017

High Efficiency Image File Format (HEIF) and High Efficiency Video Coding (HEVC) are powerful new standards-based technologies for storing and delivering images and audiovisual media. Get introduced to these next generation space-saving codecs and their associated container formats. Learn how to work with them across Apple platforms and how you can take advantage of them in your own apps.

[ Crowd Sounds ]

Welcome everyone and thank you for coming to this session.

My name is Athar Shah.

And I'm a manager here at Apple in the core media software team.

And we at Apple are really excited to talk to you today about two new media technologies.

In particular, we're going to be talking about a new codec for video and image compression called HEVC and a file format for images that we're adopting called HEIF.

But before we get into the details, if you've downloaded the developer seed, the latest build, then on certain iOS devices you are already capturing and using these new technologies and file formats.

And if you pay close attention, you'll notice that these files are a lot smaller than they used to be.

Later on in the presentation, Gavin will talk to you about the specifics of our platforms and how we support these technologies in hardware and in software.

So we're going to be going over the landscape of media as it is today, why we need to change.

We'll talk about what are HEVC and HEIF and why we decided to adopt them here at Apple.

And then finally, we'll give an overview of how we've adopted these technologies within the Apple ecosystem and then also provide some guidance on how you can take advantage of these within your apps for your use cases.

So let's talk about media today.

The world is becoming more and more visual.

And both consumers and producers are generating more and more video and media related content.

Not only that, the content is taking new forms like high resolution 4K video, HDR video, wide color or wide gamut video and so on.

The nature of media is also changing with, you know, our personal favorite being Live Photos, but there's lots of content out there, short-form video and so on.

Bandwidth continues to be at a premium.

And certain applications and use cases like over-the-top video delivery and wireless networks place a premium on the amount of bandwidth.

And anything we can do to reduce the bandwidth requirements really helps out those use cases.

So we've been using H.264 and JPEG for a while.

There are limits to what these codecs can do in this evolving landscape with these new challenges.

And that leads us to HEVC.

We were looking for a next-generation codec that we could use both for movies and for photos or images.

And we were looking for something that was going to give us significant benefits over what exists today.

So having evaluated our options, we decided to select HEVC.

HEVC stands for High Efficiency Video Coding.

It is a state-of-the-art industry standard that was adopted and approved in 2013.

It was ratified by ISO as MPEG-H part 2 and by ITU as H.265.

We're going to be calling it and referring to it as HEVC.

And Apple is adopting it as its next-generation codec.

So we'll spend a couple of minutes talking about what makes HEVC such a great codec.

Now HEVC is similar to H.264 in that it is a codec that processes videos and frames in blocks.

And it uses temporal and spatial compression techniques to get the compression benefits.

Now H.264 has a notion of macro blocks, which are 16 x 16 size blocks that are used within the codec for processing.

HEVC introduces the notion of CTUs or coding tree units.

And these start down at 4 x 4 and go all the way up to 64 x 64.

And it's when you can use these larger sizes that you realize the greater compression benefit.

And this is especially true when you're dealing with high resolution videos and images.

You can really take advantage of this.

Similarly, H.264 had 4 x 4 and 8 x 8 DCT or discrete cosine transform, whereas HEVC not only uses a discrete cosine transform, it also uses a DST, discrete sine transform.

And similarly, with the coding block sizes for the transform blocks, it also goes up to 32 x 32.

To get better spatial compression, HEVC introduces additional directional modes.

While H.264 had up to nine, HEVC has up to 35.

A key part of being able to do a high degree of compression involves motion estimation or motion compensation.

And this is where you basically have a block in your current image that you're trying to predict from a previous image.

And what you can do is say, you know what?

This block is that same block from the past but moved over by a certain number of pixels.

Sometimes you don't always land in a pixel's boundary.

For example, you know, something could have moved five pixels over but it could have moved five-and-a-half pixels over or five-and-a-quarter pixels over.

When you need to do motion estimation at that half pixel or quarter pixel boundary, you need to be able to generate those pixels with accuracy and precision because those don't exist.

You just have the pixels at the boundaries.

And so as you can see, HEVC has advanced filtering which can be used to generate those sub-pel and quarter-pel pixels with much more accuracy and precision.

And that leads to better motion estimation and compensation and, hence, better compression.
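
As a rough illustration of generating a pixel that "doesn't exist", the sketch below estimates the half-pixel sample between two neighbors in a single row, once with a 6-tap filter of the kind H.264 uses and once with a longer 8-tap filter of the kind HEVC uses. The tap values are quoted from memory of the two standards and should be treated as assumptions to check against the specifications; the function name and the edge clamping are illustrative choices, not part of either codec.

```python
# Half-sample luma filter taps, recalled from the standards (assumption: verify
# against the H.264 and H.265 specifications before relying on them).
H264_HALF_PEL_TAPS = (1, -5, 20, 20, -5, 1)             # taps sum to 32
HEVC_HALF_PEL_TAPS = (-1, 4, -11, 40, 40, -11, 4, -1)   # taps sum to 64


def interpolate_half_pel(row, i, taps):
    """Estimate the sample halfway between row[i] and row[i + 1].

    The taps are centered on the gap between the two samples. Positions that
    fall outside the row are clamped to the nearest edge sample, which is a
    simplification chosen for this sketch.
    """
    n = len(taps)
    left = i - (n // 2 - 1)  # index of the first sample the filter touches
    total = 0
    for k, tap in enumerate(taps):
        j = min(max(left + k, 0), len(row) - 1)
        total += tap * row[j]
    norm = sum(taps)
    value = (total + norm // 2) // norm  # round to nearest integer
    return min(max(value, 0), 255)       # keep the result in the 8-bit range


ramp = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
print(interpolate_half_pel(ramp, 4, H264_HALF_PEL_TAPS))
print(interpolate_half_pel(ramp, 4, HEVC_HALF_PEL_TAPS))
```

On a smooth ramp both filters land on the midpoint; the longer filter matters on detailed content, where it draws on more neighboring pixels to reconstruct the in-between sample.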

Finally, if you're familiar with block-based codecs, we talked about H.264 having macro blocks, sometimes you can actually see in the coded video, artifacts around the edges of that block.

We call those blocking artifacts.

H.264 introduced a loop filter called the deblocking filter that helps get rid of a lot of those artifacts.

Now HEVC improves upon that filter, but it takes it a step further and introduces a second sequential step where we run a sample adaptive offset filter which gives us even better results.

So now these are some of the highlights that give us a better compression in HEVC.

And there is more.

So having said that, how much improvement are we talking about?

And this is why we're excited about introducing this next generation codec at Apple.

In the case of general video content, we're seeing an up to 40 percent compression improvement over H.264.

So you can generate the same quality video as you were doing previously but use 40 percent less bandwidth.

Or you could keep the same bandwidth and significantly improve the video quality.

And we've done quite a bit of subjective and objective testing to fine-tune the video to realize these benefits.

In some use cases, we're seeing an even greater benefit.

So we've really optimized our end-to-end pipeline with the iOS Camera Capture.

And in this use case, we're seeing an up to 2x improvement over H.264.

What this means is you're able to now store twice as many movies as you could previously with H.264.
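
To make the two figures concrete, here is a small worked calculation. It assumes the "up to" numbers are reached exactly, which real content will not always do, and the function names are made up for this illustration.

```python
def hevc_size(h264_size, saving):
    """Size of the HEVC version, given the H.264 size and a fractional saving."""
    if not 0.0 <= saving < 1.0:
        raise ValueError("saving must be in the range [0, 1)")
    return h264_size * (1.0 - saving)


def capacity_multiplier(saving):
    """How many times more media fits in the same storage at a given saving."""
    if not 0.0 <= saving < 1.0:
        raise ValueError("saving must be in the range [0, 1)")
    return 1.0 / (1.0 - saving)


# General video content: up to 40 percent smaller at the same quality.
print(hevc_size(100.0, 0.40))     # a 100 MB H.264 movie becomes about 60 MB
print(capacity_multiplier(0.40))  # about 1.67 times as many movies fit

# iOS camera capture: up to 2x, which is a 50 percent saving.
print(capacity_multiplier(0.50))  # twice as many movies fit
```

A 40 percent saving is therefore roughly a 1.67x capacity gain, while the 2x figure for camera capture corresponds to halving the file size.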

Let's talk a little bit about the specifics of what we are supporting.

Like H.264, HEVC also has the notion of profiles.

And in HEVC, we're supporting the Main profile, the Main Still Picture profile, and finally the Main 10 profile.

So with the Main 10 profile, we are now able to encode and decode video with 10-bit precision.

This means that you can represent more gray scales and more colors and increase the overall quality of the video end-to-end across our pipeline.

With HEVC, we are requiring the hvc1 codec type for playback.

This means that the parameter sets have to be stored in the decoder configuration record as opposed to within the samples or the payload itself.

So make sure when you are creating HEVC assets that they are the hvc1 codec type to enable playback within the Apple ecosystem.
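
One place the codec type shows up is in codecs strings, such as the ones used in streaming playlists, where the leading four characters name the sample entry. The sketch below only inspects such a string; it does not parse a movie file, and the example strings and the function name are illustrative assumptions.

```python
def hevc_sample_entry(codecs):
    """Return 'hvc1' or 'hev1' if the codecs string names an HEVC entry, else None.

    `codecs` is a comma-separated list such as "hvc1.2.4.L123.B0,mp4a.40.2".
    """
    for entry in codecs.split(","):
        fourcc = entry.strip().split(".", 1)[0]
        if fourcc in ("hvc1", "hev1"):
            return fourcc
    return None


# Illustrative strings: hvc1 keeps parameter sets in the decoder configuration
# record, while hev1 allows them in the samples themselves.
print(hevc_sample_entry("hvc1.2.4.L123.B0,mp4a.40.2") == "hvc1")
print(hevc_sample_entry("hev1.2.4.L123.B0") == "hvc1")
```

A check like this could serve as an early warning in an asset pipeline, flagging hev1 content before it reaches playback.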

Finally, we're lucky in that we can take advantage of the fact that HEVC naturally fits into the existing file formats.

So it fits nicely within the QuickTime movie file format as well as the ISO MPEG-4 file format.

So we picked a technology that is well supported within the industry and within Apple's ecosystem.

It has both hardware and software support.

And later on in the presentation, Gavin will go into the details about where and how that's supported.

It works already with the file formats that we've talked about.

So it's supporting the QuickTime movie file format and within the MPEG-4 file format.

And finally, it's an ideal codec for both movies as well as still images, allowing us to use it in place of H.264 and JPEG.

Now as I've mentioned, HEVC can use the existing movie file formats.

That's not the case for when we want to use it for photos or images.

So we needed to find a different file format that we could use for images that would allow us to use HEVC as the codec and that's where HEIF comes in.

So before we get into HEIF, let's talk a little bit about the requirements that we wanted satisfied when adopting a new file format for images.

We wanted support for state-of-the-art compression technology with HEVC being the primary consideration.

We wanted explicit support for alpha and depth channels as primary asset types.

We needed support for animation with animated GIF or JIF and Live Photo.

In addition to just still image support, we wanted support for image sequences, to be able to compress and represent those in a file format such as what happens when you're dealing with bursts of photos.

And finally, with images getting larger and larger in size, it was important for us to be able to process these efficiently with low latency and a reasonable amount of performance.

So when you've got a massive image, you want to be able to download it and start to preview it immediately.

And if you're interested in a particular smaller region in a much wider image, you want to be able to just decode and process the relevant tiles.

And so we were looking for something that would allow us to take advantage of this kind of a pipeline.

So with these requirements in mind, we looked at a few options and we selected HEIF.

HEIF stands for High Efficiency Image File Format.

And like HEVC, it is also an industry standard that was ratified by ISO in 2015.

It's based on the familiar ISO Base Media File Format.

It is extremely feature-rich.

And it supports both individual images and sequences.

There are many use cases in addition to that, that are supported by this file format which allows for future extensibility.

It typically uses HEVC for compression.

It has the option where you can use other encoders.

But at Apple, when we're generating HEIF assets, they will only be using the HEVC encoder.

So why do all this for images?

We saw the benefits for movies and videos.

And for images, we're also seeing tremendous gains over JPEG.

So now using HEVC within HEIF, you're able to capture and store twice as many images as you were able to previously with JPEG.

Let's talk a little bit about the different formats of HEIF that we are supporting.

So as I mentioned, from an encode or asset creation perspective, we're going to be using the HEVC encoder, which means we're going to be generating files with the .heic file extension.

So that's what we're doing from asset generation perspective.

So we'll be able to create and play back those files.

In terms of playback, we're also going to support decoding H.264 HEIF images and, more generally, if there's another codec that's supported within our system, we'll also be able to decode and display those HEIF files.

There is a great video that one of our colleagues [inaudible] has done which will provide more information into the HEIF file format.

And I think it'll be posted sometime later today or tomorrow.

So please be sure to check that out if you want more details about the HEIF file format, the different atom types, and how to parse the file and process it.

So now that we've talked about these new technologies, what they are and why we decided to select them at Apple, we'd like to talk to you about how we are using them within the Apple ecosystem and then also provide some guidance on what you can do with these fantastic technologies in your apps and use cases.

And to talk you through that, I'd like to hand things over to Gavin Thomson.

[ Applause ]

Thanks Athar.

So my name is Gavin Thomson.

I'm one of the engineering managers for the camera and photo organization at Apple.

OK. So we just learned about a lot of the great benefits of HEIF and HEVC.

But what are the ecosystem implications?

So at Apple, we've been working over the last year to adopt HEIF and HEVC into the Apple ecosystem.

So today I would like to talk about that adoption and share the changes that were made to help with that transition.

Our goal is to make the introduction of these new formats as transparent as possible.

So there are three topics that I would like to talk about.

First is creation: how and where we can create HEIF image and HEVC movie content.

There is access: how and where to access HEIF and HEVC movie content.

And finally transfer: so what strategies do you need to consider when you want to move HEIF or HEVC content off a capture or supported device?

So first I'd like to start with access.

So we've discussed the extensibility of the HEIF format for images.

What are some of the characteristics of the Apple generated HEIF images?

So as we heard, the file format is based on the ISO Base Media File Format.

So those of you who have looked at the internals of an MP4 or a QuickTime movie file will be very familiar with the internal structure, as they're based on the same standard.

So I want to reiterate here, HEIF is a container format.

Not too dissimilar to the QuickTime movie file format.

It's got many more options and a lot more flexibility than a JPEG file format.
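
Because the container shares its box structure with MP4 and QuickTime, the first box of a file usually identifies what it is. The sketch below reads a leading `ftyp` box from raw bytes and returns its brands. It assumes the box sits at the very start of the data with a plain 32-bit size, and the sample bytes in the usage lines are constructed for illustration rather than taken from a real capture.

```python
import struct


def read_ftyp(data):
    """Return (major_brand, compatible_brands) from a leading ftyp box, or None.

    Layout assumed: 4-byte big-endian size, 4-byte type 'ftyp', 4-byte major
    brand, 4-byte minor version, then zero or more 4-byte compatible brands.
    Boxes using the special size values 0 or 1 are not handled in this sketch.
    """
    if len(data) < 16:
        return None
    size, box_type = struct.unpack(">I4s", data[:8])
    if box_type != b"ftyp" or size < 16 or size > len(data):
        return None
    major = data[8:12].decode("ascii", "replace")
    # Bytes 12..16 hold the minor version; compatible brands follow it.
    brands = [
        data[i:i + 4].decode("ascii", "replace")
        for i in range(16, size - 3, 4)
    ]
    return major, brands


# Build a small ftyp box by hand to exercise the parser.
payload = b"heic" + b"\x00\x00\x00\x00" + b"mif1" + b"heic"
box = struct.pack(">I4s", 8 + len(payload), b"ftyp") + payload
print(read_ftyp(box))
```

Checking the brand rather than the filename is a more reliable way to recognize a HEIF file, since extensions can be changed or stripped in transit.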

Now, our image payload is encoded using HEVC.

And as we've learned, that provides great compression improvements over JPEG, up to 2X.

We also encode the image payload as 512 x 512 tiles.

And amongst other advantages, that provides great flexibility for fast incremental loading of high resolution content.
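
The benefit of tiling can be sketched with a little index arithmetic: given a region of interest, only the tiles that overlap it need to be decoded. The function below assumes a simple row-major grid of 512 x 512 tiles starting at the top-left corner, and the image dimensions in the usage lines are illustrative, not a statement about any particular device.

```python
def tiles_for_region(x, y, width, height, tile_size=512):
    """Return (row, col) indices of the tiles that overlap a pixel rectangle.

    (x, y) is the top-left corner of the region in pixels. Tiles are assumed
    to be laid out in a regular grid anchored at the image origin.
    """
    if width <= 0 or height <= 0:
        return []
    first_col = x // tile_size
    last_col = (x + width - 1) // tile_size
    first_row = y // tile_size
    last_row = (y + height - 1) // tile_size
    return [
        (row, col)
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    ]


# A whole 4032 x 3024 image spans an 8 x 6 grid of tiles.
print(len(tiles_for_region(0, 0, 4032, 3024)))
# A 100 x 100 crop that straddles a tile corner touches only four of them.
print(tiles_for_region(500, 500, 100, 100))
```

In this example a small crop needs 4 tiles instead of 48, which is where the fast incremental loading comes from.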

We also have a 320 x 240 embedded thumbnail, which is four times the resolution but only twice the size of our current 160 x 120 JPEG embedded thumbnail.

Why can we do this?

Because the thumbnail was also encoded using HEVC.

We still support an image EXIF metadata payload compatible with the payload that we capture in our JPEG format.

So HEIF with the HEVC encoded image will be identified in the file system with a new extension, .heic.

So some will note this is a breakaway from the DCF 8.3 filenaming convention.

So if you have any assumptions in your filename parsers around three-character extensions, note that we now have a default still capture format whose extension has four characters.
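
A parser that slices off the last three characters of a filename will misread the new extension. A minimal sketch of a length-agnostic alternative, using made-up filenames:

```python
import os


def extension_of(filename):
    """Return the lowercased extension without the dot, whatever its length."""
    return os.path.splitext(filename)[1].lstrip(".").lower()


# Hypothetical filenames for illustration.
print(extension_of("IMG_0001.HEIC"))   # splits at the last dot
print("IMG_0001.HEIC"[-3:].lower())    # the fixed-width assumption goes wrong
print(extension_of("IMG_0002.JPG"))
```

Splitting at the last dot handles three-character and four-character extensions alike, so the same code path works for JPEG, .mov, and .heic files.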

So where is HEIF decode supported?

It's available on all of our supported platforms that have macOS 10.13 or iOS 11 installed, but there's a variety of hardware and software support.

For iOS, we have hardware decode on the minimum config of the A9 chip.

An example, of which, is the iPhone 6S or the iPad Pro.

On macOS, we have hardware decode support on the sixth-generation Intel Core, which is the Skylake family of processors.

An example of that machine is the new MacBook with the Touch Bar, but we have software decode support on all of our supported iOS and macOS devices.

So where do we have HEIF image support?

ImageIO is Apple's lowest level image framework.

And it supports HEIF as a source for decode, incremental loading, metadata, and thumbnail extraction.

This is in line with the other supported image formats, using exactly the same APIs.

Core Image also supports HEIF as a source for real-time image manipulation.

And the PhotoKit APIs, which allow access to assets in the photo library, also support direct access to HEIF, but I'll talk about this framework in a little more detail in the coming slides.

Many of Apple's media-based applications will also natively work with HEIF.

Notable amongst those are Photos, Preview, and Quick Look, but there are also many others.

OK. Let's move to our movie format and talk a little bit about Apple captured HEVC movies.

So the changes here aren't as drastic as our newly supported image format.

We're still capturing using the QuickTime file format but with HEVC coded video frames.

As we've learned for Apple captured video, we're getting twice the compression that we're getting from H.264.

And, once again, just to call out again, we're using HEVC to encode both our images and our video formats.

We also support both an 8- and a 10-bit encoding.

So for those of you that really care about image quality or deep color, we have a non-real-time 10-bit software encoder on macOS to satisfy your needs.

You'll be happy to learn, with this new format, we have retained the three character .mov extensions, so you won't have to update any of your filename parsers for this media format.

OK. So where do we have decode support for HEVC movies?

So we have both 8- and 10-bit decode which is available on all our supported platforms with, once again, macOS 10.13 and iOS 11, but there's a variety of hardware and software support.

So in iOS, we have both 8- and 10-bit hardware decode on the minimum config of the A9 chip.

Once again, the iPhone 6S is an example there.

For macOS, we have 8-bit hardware decode in sixth-generation Intel Core processor which is the Skylake family of processors.

And we have 10-bit hardware decode support on the seventh-generation Intel Core processors or the Kaby Lake family of processors.

And we have both 8- and 10-bit software decode across all of our supported macOS and iOS platforms.

OK. So where do we support HEVC movies?

AVFoundation is the primary framework to manage movies.

And it supports play, create, and edit workflows for HEVC content.

Once again, PhotoKit will vend original HEVC movies.

WebKit will also playback HEVC movies but only on devices that have hardware acceleration and all macOS desktops.

There's also support for HLS streams using HEVC.

This represents a great opportunity to improve network throughput.

And, in fact, the session directly after this is going to talk about this initiative in much more detail.

Also, we have many Apple applications that will natively work with HEVC movies, you know, QuickTime Player, Quick Look, Photos, but also FaceTime.

This is a great example of HEVC usage to greatly improve network throughput.

So we have decode support for HEVC across all of our devices.

What about playback?

This is where the distinction between decodable and playable is very important and particularly so with HEVC.

This is unfamiliar territory for many of us.

We don't have hardware acceleration across all supported devices for our default capture format.

And this is a problem we haven't had to deal with for a long time as H.264 hardware decode is fairly ubiquitous at this point.

So all of our movie formats are decodable but on some software systems, there will be formats that are much slower than real-time and really only supported for export or transcode workflows.

So how do you make the determination that a format is suitable for playback on a given device?

So AVFoundation, through its API, supports the notion of "isPlayable."

And this indicates whether a device's video level supports the movie for playback.

If true, you should experience smooth playback without incurring any significant power or [inaudible] cost for videos of extended durations.

For example, even though Apple captured 4K30 is decodable across all of our supported systems, it's unlikely to be marked as playable on some of our older hardware like the iPhone 5S.

So this is a call out to developers.

It's really, really important, at this juncture, for you to be observing this playable state to ensure we provide the best possible user experience.

OK. For many developers, their first interaction with original HEIF and HEVC content will be through one of the public photo APIs.

PhotoKit is the widely used photos API on iOS.

And during this conference, there'll be some announcements about its availability on macOS.

So look to the what's new photo session tomorrow for more details on that, but it will vend original HEIF and HEVC movie content.

Also, the deprecated AssetsLibrary framework is still very popular in the development community.

And it will also vend HEIF and HEVC content.

On macOS, there is also the Media Library API.

For this release, it will only be vending transcoded representations of HEIF and HEVC as JPEG and H.264 at an equivalent resolution.

Because of the popularity of PhotoKit to access media on Apple platforms, I wanted to highlight some of the classes through which you can access HEIF and HEVC content.

When requesting images, you use the PHImageManager.

Through this class, you can request original HEIF images.

Also, when requesting video objects, you can use the PHImageManager to request HEVC movies.

For managing resources, we have the PHAssetResourceManager which allows you to manage all the discrete resources in the photo library.

Amongst those could be HEIF and HEVC content.

And for edit workflows, we have the PHContentEditingInput.

And it will support HEIF image or HEVC movies as input to an edit session.

So a point that I really want to stress here is that if you're already using Apple frameworks to manage media, the transition to HEIF or HEVC should be transparent.

On the other hand, if you [inaudible] image or video stack, you might need to revisit that integration and possibly consider adopting one of the appropriate Apple frameworks.

Amongst those are ImageIO for images, AVFoundation for videos, Core Image for video frame or image manipulation.

We have UIKit for presentation.

We have PhotoKit to access the resources within the photo library.

So usage of HEIF and HEVC through these frameworks will be transparent.

I'm going to make a second call out to the session that we have Friday at 11:00 a.m. working with HEIF and HEVC.

There are lots of great code examples of using HEIF and HEVC with these frameworks.

I highly recommend it.

OK. That was access.

Let's move on to creation.

So where and how can we create HEIF image and HEVC movie content?

So, as you can see, we currently only have HEIF encode support in hardware on iOS, with the minimum configuration being the A10 Fusion chip, an example of which is the iPhone 7 and the iPhone 7 Plus.

So the notable exception at this point in time is HEIF encode support on macOS.

How do we create HEIF images?

ImageIO supports HEIF as a destination.

So you could consider transcoding your JPEG resources to HEIF for great storage or network benefits.

Also, the AVFoundation capture APIs will support HEIF captured directly from the camera.

As Athar said, all the Apple camera modes will default to HEIF, with bursts being the only exception.

So if you have installed the seed builds on supported hardware and you're taking photos, you're capturing them in the HEIF format.

It's also worth reemphasizing that only the HEIC variation of HEIF is supported for encode through Apple frameworks.

So that's HEIF files with an HEVC encoded image.

Let's move to movie creation.

Here is the table showing where we have HEVC encode support.

So in iOS, we have 8-bit encode support with minimum configuration of the A10 Fusion chip, the iPhone 7 for example.

On macOS, we have 8-bit hardware encode on the sixth-generation Intel Core processor or the Skylake family of processors.

And also on macOS in software we have a 10-bit encode support.

How do we create HEVC movies?

So AVFoundation is a framework for movie creation.

And HEVC movies can be created through an AVFoundation export session.

So you could export H.264 to HEVC for great storage or network optimizations.

You can also capture HEVC movies directly through an AVFoundation capture session with a camera.

Also, all the current movie camera modes will default to HEVC encoded movies.

So, once again, if you've installed the seed builds and you're taking movies, you're capturing them as HEVC movies.

That was creation.

Next we come to transfer.

So what questions should we be considering when we want to transfer HEIF images or HEVC movies from a creation device or other supported devices?

When transferring HEIF or HEVC off a supported device, you don't have the same ecosystem decode support that JPEG or H.264 provides.

You might want to consider transcoding.

There are a few approaches you might want to consider.

The first and the simplest would be to always transcode.

Another option might be to support a capabilities exchange.

Let's start with looking at an example for workflow where you might always transcode.

In this example, you have your own social networking client that allows users to add HEIF or HEVC content to a timeline.

So with this architecture, there is really no opportunity to evaluate the capabilities of all the receiving devices and possibly no server transcode support.

So the option here is to always transcode.

So for this scenario, both supported and unsupported devices would receive the transcoded representation, for example a JPEG or H.264.

Another approach that we might want to consider is a capabilities exchange.

Let's take a look at an example of that workflow.

So here you might have an application that has adopted Apple's Multipeer Connectivity APIs.

If you're exchanging media with a supported device, you don't want to incur the cost of transcode and high network latency of always sending a transcoded JPEG or H.264.

So you could introduce your capabilities exchange in the initial handshake.

The sending device would evaluate the capabilities of the receiving device and decide whether to transcode or not.

So the hope, over time, is that as the support for these formats grows, the amount of times that we need to transcode decreases.

So this strategy is really suitable for both P2P and client/server architectures.
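
The sender's decision in a capabilities exchange can be sketched as a small function. Everything here is hypothetical: the format labels, the fallback table, and the idea that the receiver advertises a set of decodable formats during the handshake are assumptions made for illustration, not a description of any Apple protocol.

```python
# Hypothetical fallback table: which widely supported format to transcode to.
FALLBACKS = {"heic": "jpeg", "hevc": "h264"}


def choose_format(source_format, receiver_formats):
    """Decide what to send, given what the receiver says it can decode.

    Returns (format_to_send, needs_transcode).
    """
    if source_format in receiver_formats:
        return source_format, False  # send the original, no transcode cost
    fallback = FALLBACKS.get(source_format)
    if fallback is None:
        raise ValueError("no fallback known for " + source_format)
    return fallback, True


# A receiver that advertised HEIC support gets the original file.
print(choose_format("heic", {"jpeg", "heic"}))
# A receiver without HEVC support gets a transcoded H.264 movie.
print(choose_format("hevc", {"jpeg", "h264"}))
# With no capability information, this degrades to "always transcode".
print(choose_format("heic", set()))
```

Treating an empty capability set as "transcode" means the two strategies from this section share one code path, and the transcode branch is taken less often as more receivers advertise support.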

So how is Apple handling many similar workflows?

Here are a couple of examples.

So first, there's Mail.

It's not really possible to evaluate the capabilities of all the receiving clients.

And we don't have server transcode support so before sending HEIF or HEVC as a mail attachment, we always transcode.

Now for those developers that have share extensions, we'll also transcode before handing off a HEIF or HEVC.

This simplifies that integration for the time being anyway.

We've also adopted the capabilities exchange for a number of workflows.

Examples of those are P2P and AirDrop.

So with these integrations, we always evaluate the capabilities of the receiver before deciding whether to transcode or not.

So in summary, there are a few points with regards to HEIF and HEVC that I'd really like to highlight.

HEVC is Apple's next-generation codec, which we're going to use for both encoding images and videos and is providing up to two times the compression improvement for Apple captured content.

We're adopting HEIF as our image file format.

This image container provides us with a flexible format which we can use well into the future.

If you're using Apple frameworks within the Apple ecosystem, the transition to HEIF and HEVC should be mostly transparent.

But if you need to move that content outside of that ecosystem, you should consider your transcoding options to provide the best backwards compatibility for our users.

And finally, we really want developers to embrace HEIF and HEVC for creation and access workflows as we believe this will provide great benefits to not only developers but all of our customers.

So for more details on this particular session, you can go to the following website, but we also have a number of sessions and labs where you can learn more about HEIF and HEVC.

Just a few to highlight.

The session following this is advances in HTTP Live Streaming.

We also have working with HEIF and HEVC at 11:00 a.m. on Friday.

We have a great video in which you can learn more about the HEIF image format.

Thanks for your time today.

And we look forward to answering your questions at the labs and sessions for the rest of the week.

Thank you.

[ Applause ]
