Advanced Text Handling for iPhone OS

Session 110 WWDC 2010

Applications deal with large amounts of text in a wide variety of languages and formats. iOS 4 introduces many new features and enhancements to the ways you can use and present text. Learn the details of these additions including string handling, attributed strings, regular expressions, data detectors, spellchecking, custom fonts, displaying text with Core Text, and more.

Douglas Davidson: All right.

Good afternoon, everyone.

I'm Doug Davidson.

And along with my colleague, Julio Gonzalez, I'm here to talk to you about text handling.

So, just about every application has to deal with text in one form or another.

For many applications, it's enough to just use standard controls, text field, text view, web view, drag them in the Interface Builder, and you're done, that's fine.

This session though is for applications that want to go a little deeper, either through detailed analysis of the content of text or through custom presentation of it.

You know, we have a great system here for text and typography.

It's worth putting a little extra effort into it.

We're going to be concentrating on the APIs that are new in iOS 4.

You can refer back to the last year's sessions for more detailed information about some of the earlier APIs.

So, first of all, I'll be talking about some of the APIs that we have for analysis of text content, and Julio will be coming up and talking about fonts and text rendering and display.

Before we start, I want to take a brief moment to orient us as to where we are in the operating system.

So, up in the top, we have UIKit which is where those standard controls, the text fields and text views and so on, live.

Down at the bottom, we have CoreFoundation, which is the low level procedural interface to some of the basic objects in the system, things like the collections and strings, and so forth.

And in between, we have Foundation, which is the object-oriented interface to the collections and strings and a few other things.

And we also have Quartz, which is the low level 2D graphics framework.

And then above that, Core Text, which is our low level text layout framework.

Now, those of you who have experience with the desktop operating system may recognize some of these, because everything below UIKit here is essentially common to both the phone and the desktop operating systems.

So, you can pay attention to this for both.

Now, in my part, I'm going to be spending most of my time on the left-hand side of this, dealing with CoreFoundation, Foundation, and UIKit.

So, at the bottom, we have CoreFoundation, which, as I said, is the low-level procedural interface to arrays, and collections, and strings, and so forth.

Above that Foundation, where I'm going to be spending most of my time, is the object-oriented interface to that same functionality, but also it has a few other things with it.

And on top, there is UIKit.

So, first, I want to give a brief introduction to our string objects.

The basic object for handling text in our operating systems is NSString.

This is our fundamental string object.

And into it, we have poured everything we know about Unicode.

And remember that Apple has been working with Unicode from the very beginning.

So, you use NSString.

We do the Unicode heavy-lifting, so you don't have to.

Now, when I say NSString, I'm actually referring to a number of different things.

First, there is the NSString class itself, the NSString class in Foundation, which is also toll-free bridged to the corresponding CoreFoundation type, CFString.

When I say toll-free bridged, what I mean is they're essentially the same thing.

These are two different interfaces to the same underlying objects.

So, a CFString can be passed to NSString APIs and vice versa.

You can deal with them both on the same basis.

And now in addition, these are immutable strings.

We also have mutable versions.

So, at the Foundation level, there is NSMutableString.

And again, that's toll-free bridged to the CoreFoundation-type equivalent, the CFMutableString.

So, I'll be describing things you can do on NSString, but anything you can do on NSString you can also do to all these other objects.

Now, in addition to strings, we also have something that we call attributed strings.

And an attributed string is a string plus something else.

That "something else" is a set of dictionaries of attributes applied to various portions of the string.

In this example, the first word on the string has attributes.

The font is Times 48, the color is white.

The middle portion has a different font, same color and it is underlined.

The last word has a different font still, it's yellow.

Again, for attributed strings, we have the same sort of hierarchy, there's an NSAttributedString toll-free bridged to the CFAttributedString.

Those are the immutable versions.

There are also mutable versions: NSMutableAttributedString, toll-free bridged to CFMutableAttributedString.

What I want you to remember in particular about attributed strings is that every attributed string has a string.

You can get it with the string method.

You don't get a copy of it, you get a direct proxy to the underlying characters.

And also every MutableAttributedString has a mutable string, again, direct proxy to the underlying characters.

So, anything you can do with the string, you can also do with an attributed string by working on its underlying string.

And anything you can do with the mutable string, you can do with the MutableAttributedString by acting on its mutable string.
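As a rough illustration of that relationship, here is a minimal sketch in Swift (the slides for this session used Objective-C). The sample text and the "demoEmphasis" attribute key are made up for the example; only the string and mutableString accessors come from the talk.

```swift
import Foundation

// Build a mutable attributed string. The attribute key and value are
// arbitrary placeholders, not real text-system attributes.
let demoKey = NSAttributedString.Key("demoEmphasis")
let attributed = NSMutableAttributedString(string: "Hello world")
attributed.addAttribute(demoKey,
                        value: true,
                        range: NSRange(location: 0, length: 5))

// Any string API can be applied to the attributed string's `string`.
let plain = attributed.string

// Edits made through `mutableString` change the attributed string itself.
let replacements = attributed.mutableString.replaceOccurrences(
    of: "world",
    with: "WWDC",
    options: [],
    range: NSRange(location: 0, length: attributed.length))

let edited = attributed.string
let attributeSurvived = attributed.attribute(demoKey, at: 0, effectiveRange: nil) != nil
```

The attribute applied to the first word is left in place, since the edit only touched the characters after it.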

So, I'll be talking about working with NSStrings, but I actually mean working with any of these types of objects.

Now, what is in these NSString objects?

Formally and conceptually, an NSString is just a UTF-16 sequence.

But I really want to discourage people from thinking of strings as sequences of characters.

That's too low level.

What I want to encourage instead is that you think of the strings as being composed of sequences of meaningful elements, each of which is represented by a range in the string, a range of characters in the string.

So, what sort of meaningful elements do I mean?

Well, in Unicode, in general, a single character doesn't necessarily stand on its own.

You might, for example, have a base character followed by some number of combining accents.

And these combine together to form what we call technically a grapheme cluster which is what appears to the user as a character, but it is actually several characters.

It is a range in the underlying string.

Now, some people think that if they just use enough normalization, they won't have to deal with this sort of thing, but that's not true, and it can't be true in general.

The combinatorics just don't work out.

There are too many possible base characters, too many possible accents.

In general, you always have to consider the possibility that you're dealing with several characters in your grapheme cluster.

Also, it's not just accents.

There are languages that have other forms of combination.

In addition, because we're dealing with UTF-16, if you have a character that's outside the Basic Multilingual Plane, then you might have a surrogate pair, where it takes two UTF-16 units to form a single character, and that would also be a part of this cluster.

These things have to be dealt with together.

You can't separate the two halves of the surrogate pair.

You don't want to separate the accents from the base character.

So, this is the smallest independently meaningful unit to deal with.
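To make the "range, not character" idea concrete, here is a small Swift sketch. The sample string is an assumption chosen for illustration: a base letter followed by a combining accent, then a character outside the Basic Multilingual Plane. It uses NSString's rangeOfComposedCharacterSequence(at:), which was not named in the talk but is the long-standing Foundation call for asking which cluster a given UTF-16 index belongs to.

```swift
import Foundation

// "e" + U+0301 COMBINING ACUTE ACCENT, then U+1D11E MUSICAL SYMBOL G CLEF,
// which needs a surrogate pair in UTF-16.
let clusterText = NSString(string: "e\u{0301}\u{1D11E}")

// NSString counts UTF-16 units: 1 + 1 + 2.
let utf16Length = clusterText.length

// Ask for the cluster that contains a given UTF-16 index.
let firstCluster = clusterText.rangeOfComposedCharacterSequence(at: 0)

// Index 3 is the second half of the surrogate pair; the returned range
// covers both halves, so they are never split apart.
let secondCluster = clusterText.rangeOfComposedCharacterSequence(at: 3)
```

The user sees two "characters" here, but the string holds four UTF-16 units, which is why working in ranges is safer than indexing unit by unit.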

Now, these clusters then combine into larger units like words.

And words are the basic unit for many, many different kinds of string processing.

The words, in general, are not necessarily strings of letters separated by punctuation or white space.

In a number of different languages, there isn't going to be any white space to separate words.

Now, then these words combine into larger units such as paragraphs.

And paragraphs are particularly important because in Unicode, the Unicode algorithms generally don't go beyond paragraph boundaries.

So, the paragraph is sort of the largest basic unit for string processing.

Now, how can I deal with these?

Well, one of the big new features in iOS 4 is blocks.

And blocks are pretty much tailor-made for text processing.

Because what they allow you to do is to go through and apply a chunk of code to each piece of your text in turn.

And that's what this particular API here does.

This is enumerateSubstringsInRange:options:usingBlock:.

And you fill in the block with whatever code you want, and it gets called.

Here I have chosen the option NSStringEnumerationByWords, so it gets called for each word in the range of the string you apply it to.

And you get passed in the substring, that's the word, or you can choose not to if you don't want that.

You get the range of it in the overall string.

You get an enclosingRange, so for words that would include the inter-word spaces as well, if you need to know about those.

And what can you do with it?

Just about anything you want.

You can put any code you need to do to handle it in this block.

So, for example, if you wanted to count words, I've shown that here, you can declare a count variable.

You have to give it the __block attribute if you want to modify it within the block.

And you increment it as you go through in this block.

You also have the option of early exit.

That's what this Boolean pointer stop variable is for.

So, the trivial example I showed here is you count up through the first 100 words, after that, you set the stop variable to YES, and you exit.

And so, for example, if I'm going through this string by words, your block would be executed first for the first word, then for the second word, then for the third word, then for the fourth word, and so on.
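The word-counting example with early exit might look roughly like this in Swift. The sample sentence and the limit of 5 (instead of the 100 mentioned in the talk) are assumptions to keep the sketch short. In Swift a captured var can be modified directly, so there is no counterpart to the __block attribute.

```swift
import Foundation

let sampleText = NSString(string: "The quick brown fox jumps over the lazy dog")
let wordLimit = 5            // hypothetical limit; the talk used 100
var wordCount = 0
var seenWords: [String] = []

sampleText.enumerateSubstrings(
    in: NSRange(location: 0, length: sampleText.length),
    options: .byWords
) { substring, range, enclosingRange, stop in
    // `range` covers the word itself; `enclosingRange` also covers
    // adjacent inter-word spacing.
    wordCount += 1
    if let word = substring {
        seenWords.append(word)
    }
    // Early exit: setting the stop flag ends the enumeration.
    if wordCount >= wordLimit {
        stop.pointee = true
    }
}
```

Swapping .byWords for .byComposedCharacterSequences, .bySentences, .byLines, or .byParagraphs walks the same string at a different granularity.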

But it's not just words.

You can do this for many different basic types of elements.

You can do it for the grapheme clusters.

You can go through your string by those, by words, by sentences if you like, or by lines or by paragraphs.

Or you could even take several of these and nest them together, and so, go through your string by several different types, finer and finer granularity, and do whatever it is you need to do to these pieces of the text.

NSString has many other APIs that are very flexible in general and provide Unicode savvy processing for things like matching and searching and comparisons.

So, here's one of the big ones, the rangeOfString family of methods.

These are what you can use for matching a string within a larger string.

So, that's the example I've shown here.

I choose the NSAnchoredSearch option.

So, it will match only at the place I look.

I chose CaseInsensitiveSearch.

So, I'm looking for the string "resume", and it'll find it no matter what the case of the letters is.

I chose the DiacriticInsensitiveSearch option, which means it'll also find "résumé" with accents.

And the WidthInsensitiveSearch option which is important in Japanese context.

And if I do the same sort of thing without the AnchoredSearch option, then I'll go through and look for the first match of this anywhere in the range of the string that I specify.
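A hedged Swift sketch of those options follows. The two sample strings are invented for the example; the option names are the Swift spellings of the NSString search options named above.

```swift
import Foundation

let looseOptions: NSString.CompareOptions =
    [.caseInsensitive, .diacriticInsensitive, .widthInsensitive]

// Unanchored: find the first match anywhere in the string.
let letter = NSString(string: "My R\u{00E9}sum\u{00E9} is attached")
let foundAnywhere = letter.range(of: "resume", options: looseOptions)

// Anchored: match only at the start of the searched range.
let anchoredOptions = looseOptions.union(.anchored)
let anchoredMiss = letter.range(of: "resume", options: anchoredOptions)

let fileName = NSString(string: "R\u{00C9}SUM\u{00C9}.pdf")
let anchoredHit = fileName.range(of: "resume", options: anchoredOptions)
```

A failed search is reported as a range whose location is NSNotFound, which is what the anchored search on the first string returns.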

Now, these are very powerful APIs.

They've been on NSString for a long time.

So, for iOS 4, we thought, well, what can we do to extend these?

And well, maybe, how about adding regular expression support.

So, actually, the first thing we did [ Applause ]

The first thing we did was actually in 3.2, which was to add the NSRegularExpressionSearch option to these rangeOfString APIs, which allows you to treat the string you're trying to match as a regular expression pattern.

We're using the ICU Regular Expression syntax.

For those of you who may not have heard of it, ICU is the International Components for Unicode.

It's a very, very important open source library for dealing with Unicode.

Some of you actually might have already been using ICU directly for regular expression operations, but that's not very convenient.

This is a lot easier.

In addition, we've actually gone in and added hooks in ICU to make it easier for it to deal directly with NSStrings.

So, this is not only easier than using ICU directly, it's probably more efficient.

So, in this particular case, this is a very simple, sort of trivial example.

The pattern I've chosen here starts with a backslash b, which matches a word boundary.

The actual syntax is backslash b, but because here it appears in a string literal, you have to escape the backslash, so you have to have two of them.

That's followed by (i|o), which matches an I or an O, then (f|n), which matches an F or N, and then another word boundary.

So, this particular example matches two-letter words that start with I or O and end with F or N.

And I chose the CaseInsensitiveSearch option, so it will match these upper or lower case.

So, for example, in a string like this, the things that it might find are these.
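For reference, the pattern just described can be tried with a sketch like the following, in Swift. The sentence is a made-up sample; the pattern and the two options are the ones from the talk.

```swift
import Foundation

// In source code the backslash is doubled because the pattern sits in a
// string literal; the regular expression itself is \b(i|o)(f|n)\b.
let twoLetterPattern = "\\b(i|o)(f|n)\\b"

let sentence = NSString(string: "Turn it on if it is in reach")
let firstHit = sentence.range(of: twoLetterPattern,
                              options: [.regularExpression, .caseInsensitive])
let firstWord = sentence.substring(with: firstHit)
```

"it" and "is" are skipped because their second letter is neither F nor N, so the first match is "on".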

For iOS 4, we extended this by adding in search and replace functionality.

So, these are pre-existing APIs on NSString.

There are two here.

One takes an immutable string and returns a modified copy.

The other one takes a mutable string and mutates it in place.

And the previous versions of these just dealt with literal strings.

For iOS 4, we added support for the NSRegularExpressionSearch option, which means that when you pass it to these, the string you're looking for is treated as a regular expression pattern, but also, the string that you're replacing it with is treated as a template.

Now, there's one more thing I have to explain and that is that the things in parentheses in the pattern here are capture groups.

So, the first set of parentheses, the I or O is the first capture group.

In this case, it's the first letter of the word.

And the second set of parentheses is the second capture group, F or N.

So, that would be the second letter.

And then the template syntax is a sort of standard thing where $0 gets replaced by the whole match of the regular expression.

$1 gets replaced by the contents of the first capture group.

$2, the contents of the second capture group, and so on if you have more of them.

So here, $2$1 takes the contents of the second capture group and follows it with the contents of the first capture group.

So, those of you who are familiar with this have already guessed what this is going to do.

I didn't have room in the previous slide to specify case insensitive, but assume that I did.

What it would do to this is just reverse the letters in each of those two-letter words, very simple example.
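A sketch of that search-and-replace in Swift, covering both the immutable and the mutable variant. The input sentence is an assumption; the pattern and the $2$1 template are from the talk.

```swift
import Foundation

let swapPattern = "\\b(i|o)(f|n)\\b"
let swapOptions: NSString.CompareOptions = [.regularExpression, .caseInsensitive]

// Immutable variant: returns a modified copy.
let originalLine = NSString(string: "Is it on or off? If in doubt, no.")
let swappedCopy = originalLine.replacingOccurrences(
    of: swapPattern,
    with: "$2$1",                      // second capture group, then first
    options: swapOptions,
    range: NSRange(location: 0, length: originalLine.length))

// Mutable variant: edits in place and reports how many replacements it made.
let mutableLine = NSMutableString(string: "Is it on or off? If in doubt, no.")
let replacedCount = mutableLine.replaceOccurrences(
    of: swapPattern,
    with: "$2$1",
    options: swapOptions,
    range: NSRange(location: 0, length: mutableLine.length))
```

Only "on", "If", and "in" qualify; "off" fails the trailing word boundary and "no" starts with the wrong letter, so both are left alone.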

But now, you may be saying, this is all well and good, but that doesn't cover everything that I would want to do with regular expressions.

That's true.

We figured that you would want a complete regular expression support, and so we have given you that as well with the NSRegularExpression class.

Now, an NSRegularExpression object encapsulates a pattern plus various options.

So, in this example I've used the CaseInsensitive option again.

There is actually a long list of options.

You can look at the documentation and the headers, and it covers everything you would expect for this sort of thing, plus probably a few more.

And when you create a regular expression object, you control by creating it when the pattern gets compiled into its internal form.

These NSRegularExpression objects are immutable and thread safe.

You can create one at the beginning, and then just use it wherever you want.

Now, what can you do with it?

Well, the most general and the most flexible API on NSRegularExpression is guess what?

It's a block iteration.

enumerateMatchesInString:options:range:usingBlock: goes through that range of the string and calls the block on each match of the regular expression in that range.

So, you can do whatever it is you want to it there.

Your block gets passed an object of class NSTextCheckingResult, which describes everything you need to know about the match.

And then you can do whatever it is you want by putting your code, arbitrary code, in the block.

So, for example, if you were to apply it to this string, your block would be called first on the first match and the second and so on, and so on, and so on.

And what are these NSTextCheckingResult objects?

We use this in a number of different places to express different things that might have been located in a piece of text, and they are of different types.

Well, when you use them with regular expressions, they're all the same type, the regular expression type.

And they have various properties.

They all have a range property, it describes the overall range of the match.

The regular expression results also have the ranges of the various capture groups that's expressed by the rangeAtIndex method.

So rangeAtIndex: 0 is just the overall range of the match, rangeAtIndex: 1 is the range of the first capture group, rangeAtIndex: 2 is the second, and so on and so forth.

And that is what you need to understand how your regular expression has been matched.

So, as you go through, for each match, you can ask this text checking result what its overall range is, what the range of the first capture group is, the second capture group, and so on.

From there, it's easy to do whatever you want.

You can ask for substringWithRange: if you need to extract a substring, and so on and so forth.
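Put together, the block iteration and the capture-group ranges might be used along these lines. This is a sketch in Swift with an invented sample sentence; try! is used only because the pattern is a fixed literal known to be valid.

```swift
import Foundation

// Compile the pattern once; the resulting object can be reused freely.
let wordRegex = try! NSRegularExpression(pattern: "\\b(i|o)(f|n)\\b",
                                         options: [.caseInsensitive])

let passage = "If it is on, leave it in."
let nsPassage = NSString(string: passage)

var wholeMatches: [String] = []
var firstLetters: [String] = []
var secondLetters: [String] = []

wordRegex.enumerateMatches(
    in: passage,
    options: [],
    range: NSRange(location: 0, length: nsPassage.length)
) { result, flags, stop in
    guard let result = result else { return }
    // Index 0 is the whole match; 1 and 2 are the capture groups.
    wholeMatches.append(nsPassage.substring(with: result.range(at: 0)))
    firstLetters.append(nsPassage.substring(with: result.range(at: 1)))
    secondLetters.append(nsPassage.substring(with: result.range(at: 2)))
}
```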

Now, I said that the block iteration is the most general and flexible method on NSRegularExpression.

But of course, there are going to be cases when you don't need all that power and flexibility, so we have some simpler APIs for simpler cases.

If you want to get all the matches as an array, you can do that.

If you just want to count them and get the number, there's also a method for that.

If you want to get the first one, you can do that too, or if all you need is the range of the first one, we have that as well.

And at that point, we're sort of back down to the level of the initial NSString APIs that I showed you.

So for example, the equivalent of that using NSRegularExpression would be something like this: rangeOfFirstMatchInString:options:range:, which will just find you the overall range of the first match.

And of course, on NSRegularExpression, we also have, again, the search and replace methods.

Again, there are two of them.

One takes an immutable string and returns a modified copy; the other takes a mutable string and modifies it in place, just the same way as the previous APIs that I mentioned.
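The simpler convenience calls and the two replace methods could be exercised like this. Again this is a Swift sketch over an assumed sample sentence, using the same pattern as before.

```swift
import Foundation

let shortWordRegex = try! NSRegularExpression(pattern: "\\b(i|o)(f|n)\\b",
                                              options: [.caseInsensitive])
let note = "If it is on, leave it in."
let noteRange = NSRange(location: 0, length: note.utf16.count)

// All matches as an array, just the count, or just the first range.
let allMatches = shortWordRegex.matches(in: note, options: [], range: noteRange)
let matchCount = shortWordRegex.numberOfMatches(in: note, options: [], range: noteRange)
let firstMatchRange = shortWordRegex.rangeOfFirstMatch(in: note, options: [], range: noteRange)

// Replace into a new string...
let swappedNote = shortWordRegex.stringByReplacingMatches(
    in: note, options: [], range: noteRange, withTemplate: "$2$1")

// ...or modify a mutable string in place.
let mutableNote = NSMutableString(string: note)
let inPlaceCount = shortWordRegex.replaceMatches(
    in: mutableNote, options: [], range: noteRange, withTemplate: "$2$1")
```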

OK. So, that's regular expressions.

What else do we have?

How about data detectors?

You might be familiar with data detectors as the things that in your e-mail maybe find interesting things like dates and addresses and phone numbers, and give you options to do things with them.

But up until now, we haven't had API access to the underlying data detectors functionality.

Well, that's changed.

With iOS 4, we now have an NSDataDetector class.

And NSDataDetector is just a subclass of NSRegularExpression.

You don't create it with a regular expression pattern, instead, you just create it with a set of the data detector types that you want to have it detect.

So, in this example here, I chose the type, the link type which detects URLs, and the phone number type which detects phone numbers.

So, this data detector object will detect URLs and phone numbers.

So, for example, in this string, it would detect one phone number and one URL.

Now, you might be tempted to use a regular expression for this sort of thing, but data detectors know a wide variety of different formats.

For example, a wide variety of international phone number formats, and it's highly optimized.

So, you would not be able to match its performance or sophistication with a regular expression.

It's very good at what it does.

The types of things that it can detect: Dates, addresses, URLs, phone numbers.

The transit information type generally detects flight numbers, airline flight numbers.

And since it is just a subclass of NSRegularExpression, it uses all of the same APIs.

So, the same block iteration will, in this case, go through and call your block for each match that the data detector finds in the string.

For example, in this case, first your block will be called on the first thing, then on the second thing that it detects.

And what gets passed in, again, is an NSTextCheckingResult that describes the match.

But it's not the regular expression type of NSTextCheckingResult; instead, there's a special type of NSTextCheckingResult for each of the different things that data detectors can detect.

There's the date type, the address type, the phone number type, link type for URLs, and so on and so forth.

And each of these has properties appropriate to it.

So, for example, the date type has a date property.

The address type has a components property, that's a dictionary of the various pieces of the address.

The link type has a URL property.

The phone number type has a phone number property.

So, as I go through in my block, for each result, I can ask it what type are you.

In this case, if the type is the link type, then it should have a URL, I can do whatever I want with that.

If it's a phone number type, it should have a phone number.

I can do whatever with that.

In addition to other properties like the overall range.

And again, since this is just a subclass of NSRegularExpression, it has all the same convenience APIs for getting all the matches, the number of matches, the first match, or the range of the first match, in exactly the same way that I showed before for NSRegularExpression.
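A minimal Swift sketch of a detector for links and phone numbers follows. The message text, including the phone number and URL in it, is an assumption made up for the example, and what exactly gets detected depends on the system's data detectors.

```swift
import Foundation

// Create the detector from the checking types of interest, not a pattern.
let wantedTypes: NSTextCheckingResult.CheckingType = [.link, .phoneNumber]
let detector = try! NSDataDetector(types: wantedTypes.rawValue)

let message = "Call (408) 555-0123 or visit https://www.apple.com for details."
let messageRange = NSRange(location: 0, length: message.utf16.count)

var detectedURLs: [URL] = []
var detectedPhoneNumbers: [String] = []

// Same block iteration as NSRegularExpression, since this is a subclass.
detector.enumerateMatches(in: message, options: [], range: messageRange) { result, flags, stop in
    guard let result = result else { return }
    // Ask each result what type it is, then read the matching property.
    if result.resultType == .link, let url = result.url {
        detectedURLs.append(url)
    } else if result.resultType == .phoneNumber, let number = result.phoneNumber {
        detectedPhoneNumbers.append(number)
    }
}
```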

Now, I'm going to demo that, but before I do this, there's one more thing I want to mention and that is spellchecking, which you may have noticed as being available by default on standard text entry.

But we have also made it available programmatically starting in 3.2.

Now this actually is at the UIKit level with the UITextChecker class.

And what you'd do with one of these is go through a piece of text and find words that are likely to be misspelled in a particular language.

Once you've found one of those, you can ask for a set of potential replacements that might be what the user intended.

You can learn words, you can ignore words, forget words, all the basic functionality for text checking.

Now, I want to go over to the demo and briefly show some of this.

So, this is a very simple test application, and I've entered in that regular expression that I showed before.

And what I'm going to have this code do is highlight everything that it finds in this piece of text.

Oh, there it is.

All the 2-letter words that start with I or O and end with N or F, it's case insensitive.

This will also do the same sort of thing with NSDataDetector.

So, I can have it detect phone numbers and highlight those, or dates, or URLs if I like.

And that's very simple, but what I want to show you is how simple this really is.

So, here is the business piece of this test code.

This is the invocation of the enumerateMatchesInString block iterator for the regular expression.

And it's basically just two lines of code.

So, it enumerates the matches in the string.

Now, remember what I said earlier about attributed strings: this test app happens to deal with attributed strings, but I can apply the same APIs to an attributed string just by asking for its string.

Then I enumerate through it using this block.

And the content of the block is just one line that calls my custom view here to have that range highlighted.

There's another piece of code that does the same thing for the data detectors piece of this.

And you'll notice it's essentially just the same code because, again, NSDataDetector is a subclass of NSRegularExpression.

Again, just about two lines of code to go through and do something for each instance that the data detector finds in the string.

OK, let me go back to the slides.

And so, that is what we have to say about text processing, analyzing strings.

Let me bring Julio Gonzalez up to talk about fonts and display.

[ Applause ]

Julio Gonzalez: Thanks, Doug.

I'm Julio Gonzalez, Manager of the Type Team.

And I'm going to talk to you about drawing with Core Text.

First, I'll start with what the text architecture is especially with the drawing side, give you a bird's eye view of Core Text, what it does, capabilities, principles.

How it differs from the version that's present in Mac OS X, and hopefully, give you a nice demo of some of the things that I'll discuss during this presentation.

So again, this is the same slide that Doug showed you earlier with all the text components.

I want to focus on the layers that have to do with drawing.

We have Quartz, Core Text, and UIKit.

The lowest level is Quartz.

It's a rendering engine.

It's the same rendering engine on iOS and Mac OS X.

If you want to get access to any text support, you first need a font reference, that's the CGFontRef.

It's the most fundamental object for getting access to fonts.

And one thing I want you to note is that Quartz, even though you can use it to render text, truly isn't a text rendering engine.

It is a graphics engine.

It's a glyph rendering engine.

That's an important distinction.

Now, you may have gone into Quartz, looked at the CG APIs, and noticed that there are some APIs that claim to process text.

Well, they don't.

They know how to handle ASCII text, that's about all they know how to do.

They don't know how to handle any Unicode.

You won't be able to use any NSStrings with this.

Also, you don't have any access to font substitution which is something that you would expect a normal text engine to do.

So, now you ask, why would I use this?

Well, there are reasons.

If you know exactly what glyphs you're going to put on the screen and exactly where you're going to put them, then you use Quartz.

Well, that means that you have your own layout engine, otherwise, you have no business being here.

Now, at the opposite end is UIKit.

That's where 99.5 percent of developers are going to be.

You have a bunch of text, it deals with it beautifully.

If you want to draw some text, you have UIFont to specify the font, and that essentially boils down to a CGFontRef at some point.

Now, it is the place to be.

There are great classes available to you.

They are easy to use.

UILabels, TextViews, easily accessible from IB.

You also have the advantage that with this you have a lot of things built in for free.

You have text editing in the case of text views, you have copy and paste, you have voice-over support, you have spellchecking.

All of these are available to you at this level.

Now, you do lack some things at this level.

For example, you can't have multistyle text unless you go to a web view.

But it might be that the source that you're dealing with is not HTML, or the things that you want to do to render, HTML can't handle.

That's when you need to go a little bit lower down and that's when you use Core Text.

So, Core Text is the low-level Unicode layout engine in Mac OS X and iOS, OK.

It's been around for a while.

It's been there since Mac OS X 10.5.

And from the very beginning, it was built with speed in mind and for threading, which is very important to some of you.

And internally, it's divided into two subcomponents, the font and the layout components, and it provides a very, very rich set of APIs that wasn't available to you before.

You can get access to a lot of font features, and you get access to a lot of APIs that allow you to control what you get on the screen very precisely.

So, there are some differences between the two versions of Core Text, primarily that it's less capable on iOS, at least for now.

The first one is we have no font management.

And what I mean by that is you don't have the ability to register and unregister fonts programmatically or to enable or disable font faces in a font family.

Also, we don't have OpenType shaping support.

We do support OpenType.

In fact Hiragino, our Japanese font is an OpenType font.

However, we just don't support the font features or the layout capabilities of OpenType.

Also, we do not support vertical glyphs in iOS.

So, unfortunately, you won't be able to create vertical text views very easily using Core Text in iOS.

And finally, the font substitution mechanism is not as rich as it is in Mac OS X, but it is better than what you have currently available in UIKit or with WebViews, and I'll talk about that a little bit later.

So, let's start with the basics, the font system.

So, our primary object for referencing fonts is the CTFontRef, and you create that by specifying a postscript name.

That differs from a UIFont, where you typically use a font family name to create a font reference.

Also a CTFontRef is not toll-free bridged to a UIFont unlike on OS X where it's toll-free bridged to an NSFont.

Now, this API gives you different ways to get at fonts.

One of them is using FontDescriptors, and think of them as a search mechanism for fonts.

You're not very specific about the font you want, you can be a little bit more ambiguous.

You can create a FontDescriptor that says I want a font that is bold, that is monospaced, that supports German, for example, just to say something.

And you can then derive a font from this descriptor, and Core Text will find the best matching font for this descriptor.
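As a rough sketch of that descriptor-based lookup, here is a Swift example. The chosen traits and the 14-point size are assumptions for illustration, the language requirement mentioned in the talk is left out, and the font that comes back depends on what is installed on the system.

```swift
import CoreText
import Foundation

// Describe the font loosely: bold and monospaced, no specific name.
let wantedTraits: CTFontSymbolicTraits = [.traitBold, .traitMonoSpace]
let traitsDictionary: [CFString: Any] = [
    kCTFontSymbolicTrait: wantedTraits.rawValue
]
let descriptorAttributes: [CFString: Any] = [
    kCTFontTraitsAttribute: traitsDictionary
]
let descriptor = CTFontDescriptorCreateWithAttributes(descriptorAttributes as CFDictionary)

// Core Text picks the best match it can find for the descriptor.
let matchedFont = CTFontCreateWithFontDescriptor(descriptor, 14.0, nil)
let matchedSize = CTFontGetSize(matchedFont)
let matchedName = CTFontCopyPostScriptName(matchedFont) as String
```

Printing matchedName shows which installed font Core Text settled on; no particular result is guaranteed.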

Also, this is the way we recommend you store any font references in documents, both in iOS and Mac OS X.

And the reason is because if you have a FontDescriptor, you typically will want to create a very rich description about what the font is.

And if you go to another platform and the font that you wanted to use is not there, you have extra information there to make an educated guess as to what font could be substituted.

Another object that is available to you in the Core Text space is FontCollections.

These are lists of font descriptors.

And we use this in Mac OS X to create the font panel or to implement the font panel.

We don't have such a capability in iOS.

And if you created an app that wanted to provide a font picker, you might consider using FontCollections to implement that.

So, what are the benefits of this API from what you knew or had already?

Well, I don't know if you've noticed, but we've shipped a good set of new fonts with the iPad.

And some of these families of fonts have a lot of font faces.

And you may have noticed that you only have access to four of these font faces through UIKit.

You only have access to the regular, the bold, the bold italic, and the italic.

With Core Text, you'll have access to all those font faces, and there are some that are extended, or light.

So, all of these will be accessible.

The other thing is font substitution.

Like I said earlier, it's not as rich as it is in Mac OS X, but it is richer than what you have with WebViews or with UIKit.

Instead of just matching based on whether your font can handle a character that the other font didn't, it will match based on styles from the font or traits from the font.

So, if you were starting with serif font, it will match to another serif font.

If you had an italic face, it will try to get you an italic face.

But that's not all.

Also, you have the ability to tell Core Text what kind of fallback you want to take place.

You have control over that.

And to use that, you create what we call a cascade list, and you specify that cascade list in the font descriptor, and now you have full control over what the font fallback is going to be.

Another thing that the Core Text API provides is full access to typographic features in the font.

These are features that some high-end fonts have, and we ship some of them with the iPad.

And they are typically options that are hidden in the font that in Mac OS X you could make available by using the typography panel.

Some of these features are things like small caps, so you can draw all your lower case letters with small capitals.

You can use non-proportional numbers and some proportional numbers, different serifs and so forth.

All of these will be accessible to use with the Core Text API.
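One way to reach these features from code is to copy a font with a feature type and selector applied. The sketch below is a guess at a convenient wrapper, not the code from the session; the helper names are invented, and the numeric identifiers in the usage note come from the SFNTLayoutTypes.h header.

```swift
import CoreText
import Foundation

/// Returns a copy of `font` with one typographic feature switched on.
/// (Hypothetical helper; whether the feature has any effect depends on the font.)
func font(_ font: CTFont, enablingFeatureType type: Int, selector: Int) -> CTFont {
    let descriptor = CTFontDescriptorCreateCopyWithFeature(
        CTFontCopyFontDescriptor(font),
        NSNumber(value: type) as CFNumber,
        NSNumber(value: selector) as CFNumber)
    // Size 0.0 and a nil matrix keep the original size and transform.
    return CTFontCreateCopyWithAttributes(font, 0.0, nil, descriptor)
}

/// Lists the names of the feature types a font advertises (may be empty).
func featureNames(of font: CTFont) -> [String] {
    guard let features = CTFontCopyFeatures(font) as? [[String: Any]] else { return [] }
    return features.compactMap { $0[kCTFontFeatureTypeNameKey as String] as? String }
}
```

Assuming the font supports it, lowercase small caps would be requested with feature type 37 (`kLowerCaseType`) and selector 1 (`kLowerCaseSmallCapsSelector`); `featureNames(of:)` is a quick way to check what a given font actually offers.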

And finally, the Core Text API, with regard to fonts, is a very rich API.

It gives you full access to all the font metrics information.

It gives you access to character-to-glyph mapping.

In short, you should be able to create a very rich typographic experience by using these APIs.
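To make the metrics and glyph mapping access concrete, here is a small sketch. The `FontMetrics` struct and both function names are invented for the example; the `CTFontGet...` calls are the Core Text functions doing the work.

```swift
import CoreText
import Foundation

/// A hypothetical container for the three vertical metrics used most often.
struct FontMetrics {
    let ascent: CGFloat
    let descent: CGFloat
    let leading: CGFloat
}

func metrics(of font: CTFont) -> FontMetrics {
    return FontMetrics(ascent: CTFontGetAscent(font),
                       descent: CTFontGetDescent(font),
                       leading: CTFontGetLeading(font))
}

/// Maps each UTF-16 unit of `text` to a glyph.
/// Returns nil if the font is missing a glyph for any of the characters.
func glyphs(for text: String, in font: CTFont) -> [CGGlyph]? {
    let characters = Array(text.utf16)
    var glyphs = [CGGlyph](repeating: 0, count: characters.count)
    let mappedAll = CTFontGetGlyphsForCharacters(font, characters, &glyphs, characters.count)
    return mappedAll ? glyphs : nil
}
```

A nil result from `glyphs(for:in:)` is a hint that font substitution would be needed for that text.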

Now, let's move on to the other side of Core Text which is the drawing side, the layout side.

The first thing to note is that Core Text operates on a flipped coordinate system.

Typically, your views, the origin is at the bottom left, and it grows up and to the right.

Core Text does the opposite.

The origin is at the top left and it grows down and to the right.

Typically, that's the way text would flow, that's the reason we do it.

Now, the input for all drawing into Core Text is the attributed string.

It's the same string that Doug spoke about earlier.

And there are four objects in the layout API that you need to be familiar with.

The framesetter, the typesetter, the line, and the glyph run.

The framesetter is the highest level object.

The object of the framesetter is to create frames which are shapes of text.

And it fills those shapes with lines of text and applies any paragraph styles that you may have provided.

It internally creates a typesetter.

The typesetter is what does all the heavy-lifting in the layout engine.

It's what measures text.

It figures out where to suggest line breaks, figures out alignment, figures out which direction the line goes.

It's all done by this object.

Now, these two objects, like I said, are the heavy-lifters of the layout operation, and you want to hang on to these objects for dear life because that's where all your performance comes from.

Now, the next object is the line.

You can create one of these lines directly, very simply, by passing an attributed string.

And lines are just a collection of glyph runs.

And a glyph run is a series of glyphs that all share the same attributes.

So, let me show you the typical flow.

First, you start with your attributed string.

From there, you could create your framesetter.

The framesetter only takes a single input which is the attributed string.

It internally creates a typesetter.

Now, the only thing you need at this point is where you're going to draw it.

So, you require a shape, and that's a CGPath.

And note, the only shape Core Text supports at the moment is a rectangle, OK.

So, now, you have your path, you have your framesetter, you can create a frame.

And once you ask for the frame, the typesetter goes to town and starts generating all those lines, and the framesetter applies any paragraph styles that you need.

Now, at this point in time, you have full access to the frame, the lines, the glyph runs to make any adjustments that you want.

Or you can just draw it at this stage.

Now, let's assume that you had a multipage document with multiple columns, and not all your text fit into that single frame.

Well, the only thing you need to do now is specify another frame, another shape, another rectangle.

And once you have that, you use that same framesetter that you created before, which, you know, came at a heavy cost, and now you just draw into this frame, and it does the same operation.

And you keep repeating this process over and over.

I'll show you some of this in the demo later on.

So, what does this look like in code, or how do you go about it in code?

First thing you need is a font.

And here are three examples of how you create a font reference in Core Text.

The top API allows you to create a system UI font.

The middle API is what you typically use to get an arbitrary font, and you specify the PostScript name.

The last API creates a font from an existing font, and it does so by morphing the style traits in it.

In this case, we wanted to go from a bold face to an italic face.
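The slide itself is not part of the transcript, so the following is only a plausible reconstruction of the three calls described, written in Swift. The font name and sizes are placeholders.

```swift
import CoreText
import Foundation

// 1. A system UI font. A size of 0 asks for the default size for that UI role.
let systemFont = CTFontCreateUIFontForLanguage(.system, 0, nil)

// 2. An arbitrary font, looked up by PostScript name (placeholder name and size).
let boldFont = CTFontCreateWithName("Helvetica-Bold" as CFString, 24, nil)

// 3. A font derived from an existing one by changing its symbolic traits.
//    The mask names the traits being changed; the value says which of them end up on.
//    Here bold is switched off and italic is switched on.
let italicFont = CTFontCreateCopyWithSymbolicTraits(
    boldFont,
    0,      // 0 keeps the original size
    nil,    // no extra transform
    .traitItalic,
    [.traitBold, .traitItalic])
```

Both the first and the third call return optionals, because the system may have no font for the requested role or no face with the requested traits in that family.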

Once we have our font, we create an attributed string.

Now, note, it seems like a lot of code to create a single string, and this is not typically what you do.

You typically have a higher level object that creates this attributed string, which is what I do in my demo, but I'll show you how you create one from scratch.

First, you start with your base NSString.

I'm grabbing one from my localized file.

Then optionally, you want to apply some attributes to this string.

In this case, I want it to be blue and have a dotted underline, so I create a color object for blue and an object for the underline.

Now, I specify the dictionary of attributes.

And in this dictionary, I'm going to specify the font that I created in my previous slide, the color, and the underline.

So now I have my base string, I have my dictionary of attributes.

I create my attributed string, OK.
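The steps just described might look roughly like this in Swift. The function name is invented, and the text is passed in as a parameter rather than loaded from a localized strings file.

```swift
import CoreText
import CoreGraphics
import Foundation

/// Builds a blue string with a dotted underline in the given font.
/// (`makeBlueUnderlinedString` is a hypothetical helper.)
func makeBlueUnderlinedString(_ text: String, font: CTFont) -> NSAttributedString {
    // Opaque blue in device RGB.
    let blue = CGColor(colorSpace: CGColorSpaceCreateDeviceRGB(), components: [0, 0, 1, 1])!

    // Underline style and pattern are combined into a single number.
    let underline = CTUnderlineStyle.single.rawValue | CTUnderlineStyleModifiers.patternDot.rawValue

    let attributes: [NSAttributedString.Key: Any] = [
        NSAttributedString.Key(rawValue: kCTFontAttributeName as String): font,
        NSAttributedString.Key(rawValue: kCTForegroundColorAttributeName as String): blue,
        NSAttributedString.Key(rawValue: kCTUnderlineStyleAttributeName as String): NSNumber(value: underline)
    ]
    return NSAttributedString(string: text, attributes: attributes)
}
```

In an app, the `text` argument would typically come from something like `NSLocalizedString`, with a key of your own choosing.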

So, now we're ready to draw.

Like I said, Core Text operates on a flipped coordinate system, so the first thing that you do when you are on your view, ready to draw, or in a layer is to make sure that your context is set up appropriately.

Now, one thing I didn't mention that is good practice to do is set up your text matrix.

This text matrix might have been modified by somebody else before you got to it.

This is especially important in OS X, where we're typically dealing with Cocoa, and the Cocoa text system modifies the text matrix.

In iOS, it's not necessary, but it's a good practice to do.

Then the next two lines translate the CTM, which is what flips the coordinate system.
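A minimal sketch of that setup, assuming the caller passes in the height of the view or layer being drawn. The function name is made up for this example.

```swift
import CoreGraphics

/// Prepares a context so the view's and Core Text's coordinate systems agree.
/// (`prepareForCoreText` is a hypothetical helper; `height` is the view or layer height.)
func prepareForCoreText(_ context: CGContext, height: CGFloat) {
    // Reset the text matrix in case earlier drawing code changed it.
    context.textMatrix = .identity

    // Move the origin to the opposite vertical edge, then mirror the y axis.
    context.translateBy(x: 0, y: height)
    context.scaleBy(x: 1, y: -1)
}
```

Wrapping the drawing in `context.saveGState()` and `context.restoreGState()` keeps the flip from leaking into whatever draws next.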

So, now, we're ready to draw with Core Text.

So, to draw a simple line, the only thing you need to do is to create a line object, and you create the line object with CTLineCreateWithAttributedString.

And at this point, you're ready to draw it.

There are other APIs that let you manipulate the line before you draw it.

But otherwise, you just set your position where you want to draw it and go ahead and draw it.
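Drawing a single line could be sketched as follows, assuming the context has already been set up for Core Text. The wrapper name `drawLine` is invented.

```swift
import CoreText
import CoreGraphics
import Foundation

/// Draws one line of attributed text with its baseline origin at `origin`.
/// (`drawLine` is a hypothetical helper.)
func drawLine(_ string: NSAttributedString, at origin: CGPoint, in context: CGContext) {
    let line = CTLineCreateWithAttributedString(string as CFAttributedString)
    // The text position is where the line's baseline starts.
    context.textPosition = origin
    CTLineDraw(line, context)
}
```

The line is not wrapped: text that is wider than the available space simply runs on, which is why longer text goes through a framesetter instead.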

Now, what if we wanted to draw paragraphs, a longer set of text?

Well, it's not that much more difficult.

We start the same way, we set up our context.

And the first thing we do is we create our framesetter.

Once we have our framesetter, now we just need to define the shape that we're going to draw in, a rectangle.

We define it from the bounds of the view, whatever.

Now, we have a frame.

And once we have the frame, we can either manipulate it or draw it.

So, you'll see some of that in the demo later.
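The paragraph case might be sketched like this, again assuming a context that is already set up. The function name is invented; it returns the framesetter so the caller can keep it for later frames.

```swift
import CoreText
import CoreGraphics
import Foundation

/// Lays out `string` inside `rect` and draws whatever fits.
/// (`drawParagraphs` is a hypothetical helper.)
@discardableResult
func drawParagraphs(_ string: NSAttributedString, in rect: CGRect, context: CGContext) -> CTFramesetter {
    let framesetter = CTFramesetterCreateWithAttributedString(string as CFAttributedString)

    // The shape to fill: a rectangular path.
    let path = CGPath(rect: rect, transform: nil)

    // A range with length 0 means "start here and take as much as fits".
    let frame = CTFramesetterCreateFrame(framesetter, CFRange(location: 0, length: 0), path, nil)
    CTFrameDraw(frame, context)
    return framesetter
}
```

In a view's drawing code, `rect` would usually be derived from the view's bounds.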

So, now, there are some more objects that are available to you to customize the text.

The first one is the ParagraphStyle object.

And this is an object where you can specify a lot of style attributes.

You can specify things such as indentation, line height, justification, alignment.

You can specify TabStops, which is the next object, a conceptual object where you specify the tab stops the way you would in a ruler.

You can specify the tab stops in your ParagraphStyle.

And the ParagraphStyle is something you include in your NSAttributedString.

And later on, the framesetter will apply it to the frames as it's laying them out.
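Creating a paragraph style from Swift takes some pointer handling, because each setting refers to its value by address. The sketch below sets only alignment and first-line indent; the helper name is invented.

```swift
import CoreText
import Foundation

/// Builds a paragraph style with an alignment and a first-line indent.
/// (`makeParagraphStyle` is a hypothetical helper.)
func makeParagraphStyle(alignment: CTTextAlignment, firstLineIndent: CGFloat) -> CTParagraphStyle {
    var alignment = alignment
    var firstLineIndent = firstLineIndent
    // Each setting carries a pointer to its value, so the values must stay
    // alive until CTParagraphStyleCreate has copied them.
    return withUnsafePointer(to: &alignment) { alignmentPointer in
        withUnsafePointer(to: &firstLineIndent) { indentPointer in
            let settings = [
                CTParagraphStyleSetting(spec: .alignment,
                                        valueSize: MemoryLayout<CTTextAlignment>.size,
                                        value: alignmentPointer),
                CTParagraphStyleSetting(spec: .firstLineHeadIndent,
                                        valueSize: MemoryLayout<CGFloat>.size,
                                        value: indentPointer)
            ]
            return CTParagraphStyleCreate(settings, settings.count)
        }
    }
}
```

The resulting style goes into the attributed string under `kCTParagraphStyleAttributeName`, alongside the font and color attributes.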

Now, there are some other lower level objects that are powerful.

The GlyphInfo object is a pretty nice object.

It allows you to override the character to glyph mapping characteristics of a font.

So for example, you can say, Core Text, if you ever see the word "copyright", replace it with the copyright glyph inside of the font.

Also, you have RunDelegate.

The RunDelegate is only available in iPhone OS.

And what this allows you to do is to override the metrics for a single glyph.

And the way you use it at a low level is you can make the metrics for this glyph large thus creating a space in your run.

In this way, you can attach a picture or a movie or whatever in the middle of your document.

This is something similar to what the Cocoa text system does in Mac OS X.

So, what are the benefits of this layout API, this drawing API?

Well for one, as you've seen, by using attributed string, we have full access to multistyle text.

We also have access to all the font features in the font.

And more importantly, we have a good set of APIs that lets you control frames, lines, and so forth, to get that fine level of control that you might need in your application.

And finally, it's thread safe.

So, if you have a very large document that you need to display to your users, the last thing you want to do is lay out the entire document while the user waits.

So, you can be sure that you can use Core Text safely in threads.

So with that, let me move to a demo.

So, what I have here is a very simple page layout application, here is my project.

And the first thing I want you to note is this XML file.

Basically, I have a set of XML files here.

Let's see, I don't know if you can see that.

And this is what I use as input to my application.

And on the side, I have what the XML file looks like and it just describes every single thing in the document.

And from here, I can create an NSAttributedString very simply.

It also has features such as what the pages look like or how many columns I want and so forth.

Now, one thing I want you to note, something that's new in 3.2 is the ability to add fonts to your applications.

You can copy any set of fonts to your resources and they get copied into your bundle once the application is built.

However, they're not automatically available to you.

What you need to do is you need to modify the plist file.

Yeah, and let me blow that up.

So you can create this key, the fonts provided by the application, and then you list all the fonts that you want to be able to make accessible to your app.

Note, I included here about four font faces.

That's about right.

If you wanted to add more font faces, you just need to be a little bit careful.

If you add 20, 30 font faces, now you're eating into your launch time for your app.

Because at launch time the font subsystem will need to parse through all these fonts to figure out what's available and make it available to the rest of the system.

So, now let me run this app.

Hide all these.

OK, so again, a very simple page layout.

In this case for this particular sample, I decided to create some static frames.

Here is some text, a picture frame, free-flowing text.

And for those of you who have a keen eye on fonts, you will notice that it's using Futura Medium Condensed.

It's something that is not available to you through UIKit.

Let me switch to a different sample.

Set it to portrait mode, landscape mode.

And here, this sample shows that flow diagram that I showed you earlier in the slides.

I have two columns up and pages flow from one to the other.

And it is very easy to construct using those APIs.

I'll show you the code a little bit later.

Now, something else that you get access to is font and font features.

So let me set this text to Didot, and Didot has access to some of these typographic features.

So, look what it looks like right now, and I'm going to switch now to use small caps, and now you have access to small caps in Didot.

I'll use a more dramatic example here.

I'll switch to Zapfino, which is a font that's full of different features.

And I'll instead use some stylistic variants.

Here we go.

It's completely different.

So you can really create some really nice typographic applications using Core Text.

So, I want to show you what this code looks like, and I could go back to Keynote to show you that, but I have a text viewing application, so why don't I just show you here inline.

So, yeah. Nice.

[ Applause ]

So, the first thing to note is that this quote-unquote slide is being rendered using the font that I included in the bundle, not something available to the iPad.

So, let's get into the code.

So, it's a very simple piece of code.

I just defined the type of frames.

Some of them are fixed frames, others are text flow frames or picture frames.

But the next object that I create for my code is the view frame object, which is how I keep track of all my frames.

And please note that I'm keeping track of the framesetter.

Like I said, I hang on to this object for dear life because it's what does the heavy layout in the process.

Now, from there on, it's quite simple.

I have a document object that basically reads those XML files and keeps track of all the attributed strings, and also keeps track of pages and where frames are laid out.

So, not that we need to go through this line by line, but just keep in mind that I can get my frames from this object and store them away in my frame info object.

Then I could get the values.

Once I get the values for text as my attributed string, I can create my framesetter and stash it away for later drawing.

In the case where I'm not using static frames, where I'm just using free flow of text, as in the second example that I showed you, there were two columns up.

So in this case, my document says that I want two columns up, so I just divide the bounds of the view into two columns, figure out what the rects are, and keep track of those frames in the same object.
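Dividing the bounds into columns is plain rectangle arithmetic. A small sketch follows, with an invented function name and a `gutter` parameter that the talk does not mention.

```swift
import CoreGraphics

/// Splits `bounds` into equal-width columns, left to right.
/// (`columnRects` and its `gutter` spacing parameter are assumptions made for this example.)
func columnRects(in bounds: CGRect, columns: Int, gutter: CGFloat) -> [CGRect] {
    guard columns > 0 else { return [] }
    // Space left for text once the gaps between columns are removed.
    let totalGutter = gutter * CGFloat(columns - 1)
    let width = (bounds.width - totalGutter) / CGFloat(columns)
    return (0..<columns).map { index in
        CGRect(x: bounds.minX + CGFloat(index) * (width + gutter),
               y: bounds.minY,
               width: width,
               height: bounds.height)
    }
}
```

Each resulting rect can be turned into a `CGPath` and handed to the framesetter as the shape of one column.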

Then once I'm ready to draw, I have everything at my disposal.

I have the framesetter all ready.

I have my path or my CGPath so I can create a frame and draw it.

So in the case where I am flowing text from column to column, it's not that much more complex.

The only thing we need to figure out is how much text fits into each one of those columns.

And I keep track of that with this current index value.

I create my first frame, draw it.

Actually, I don't need to draw it before I measure it; I can measure it with this API.

CTFrameGetVisibleStringRange will tell me how much of the string fit into that frame.

Notice at the very end I update my current index, and pretend I'm in a for loop or in a while loop, and I just repeat the process here until I exhaust all the text.
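The loop being described could be sketched as below. This is not the demo's code: the function name and the idea of passing the columns in as an array of rects are assumptions, but the pattern of reusing one framesetter and advancing an index by the visible range is the one the talk describes.

```swift
import CoreText
import CoreGraphics
import Foundation

/// Flows text through `columns` in order, reusing one framesetter, until the
/// text or the columns run out. (`layOutFrames` is a hypothetical helper.)
func layOutFrames(with framesetter: CTFramesetter, textLength: Int, columns: [CGRect]) -> [CTFrame] {
    var frames: [CTFrame] = []
    var currentIndex = 0

    for rect in columns {
        if currentIndex >= textLength { break }   // all text has been placed

        let path = CGPath(rect: rect, transform: nil)
        // Length 0 means "start at currentIndex and take as much as fits".
        let frame = CTFramesetterCreateFrame(
            framesetter, CFRange(location: currentIndex, length: 0), path, nil)

        // Measure how much of the string this frame actually holds.
        let visible = CTFrameGetVisibleStringRange(frame)
        if visible.length == 0 { break }          // column too small to hold anything

        frames.append(frame)
        currentIndex += visible.length
    }
    return frames
}
```

Each returned frame can be drawn later with `CTFrameDraw`; measuring does not require drawing first.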

Let's see, I do have some time, and I will pray to the demo gods that this works, because I was sitting on the couch just doing this a minute ago.

Sorry, I didn't mean to do that.

But, the other thing I talked about was threading.

So here I have a very long document that I created.

Just notice how long it gets stuck.

It has a bunch of Arabic text.

It's got a bunch of Japanese text, Hebrew text, English text.

It's got 385 pages, and that's how long it took for Core Text to lay out all that information and display it.

Instead of doing this, I'll just switch back so you know that I'm not playing tricks on you.

I can do this by loading the frame asynchronously, OK.

So, I'll just turn that on and I'll switch to the long text and there it goes.

And to see that I'm not playing tricks, see it's at page 155, and I can keep going.

So, this is the kind of user experience you want to provide.

You want to take advantage of Core Text threading capabilities to get this done.
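One way to move layout off the main thread is with a dispatch queue, as sketched below. This is not how the demo is necessarily written: the function name, the single page rect, and the callback queue parameter are all assumptions.

```swift
import CoreText
import CoreGraphics
import Foundation

/// Lays out a whole document into page-sized frames on a background queue,
/// then delivers the frames on `callbackQueue`.
/// (`layOutInBackground` is a hypothetical helper; pass an immutable string.)
func layOutInBackground(_ string: NSAttributedString,
                        pageRect: CGRect,
                        callbackQueue: DispatchQueue = .main,
                        completion: @escaping ([CTFrame]) -> Void) {
    DispatchQueue.global(qos: .userInitiated).async {
        // The expensive object: created once, reused for every page.
        let framesetter = CTFramesetterCreateWithAttributedString(string as CFAttributedString)
        let path = CGPath(rect: pageRect, transform: nil)

        var pages: [CTFrame] = []
        var index = 0
        while index < string.length {
            let frame = CTFramesetterCreateFrame(
                framesetter, CFRange(location: index, length: 0), path, nil)
            let visible = CTFrameGetVisibleStringRange(frame)
            if visible.length == 0 { break }   // page too small to hold any text
            pages.append(frame)
            index += visible.length
        }
        callbackQueue.async { completion(pages) }
    }
}
```

A reader-style app would more likely hand pages over as they become available, so the first page can appear while the rest is still being laid out; the single completion call here just keeps the sketch short.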

So with that [ Applause ]

Oops, I forgot my... So, in short, Doug showed you a great set of APIs, and we highly encourage you to use these APIs, the NSString API especially, along with the block APIs that we have for them.

And also, this great new set of APIs for RegEx and data detection.

We've done a lot of the hard work.

There is no reason for you to try to go and re-implement this yourself.

On the Core Text side of things, it's slightly different.

It does provide a lot of power like you've seen, gives you a lot of flexibility but it does come at a cost and I want you to make sure that you understand what that cost is.

Once you go down to the Core Text level, you're giving up a lot.

You're giving up text editing, you're giving up spell checking, you're giving up VoiceOver support, you're giving up copy and paste.

So if your goal is to provide the user with, you know, a great text viewing experience, great.

But if you have to be able to edit, then this is something that you need to be aware of, because it's going to come back to bite you; but if you need it, it's well worth it.

Some related sessions, especially if you're new to iOS.

I recommend that you attend the Foundation sessions.

You'll get a pretty good understanding of the foundation of all the APIs that we use.

The Internationalization Data on Mac and iPhone session will help make your applications world-ready, so I recommend that a lot.

And if you're doing cross-platform development, I recommend you attend the Cocoa Text session as well to get some nice tips.

Any feedback, any suggestions, gripes that you have, please contact Bill Dudney, he's our Evangelist.

And documentation, the documentation up on the web is pretty robust and stable.

I recommend that you go there.

You can access it through
