Introducing Natural Language Framework 

Session 713 WWDC 2018

Natural Language is a redesigned framework designed to provide high-performance, on-device APIs for fundamental NLP tasks across all Apple platforms. Through the deep integration of the framework with Core ML and Create ML, you now have the ability to train custom NLP models to perform many different inferences and leverage the power of NLP in your apps. Join us for this exciting session to learn all the details and to see it in action.

[ Music ]

[ Applause ]

Hello and good afternoon everyone.

Welcome to our session on natural language processing.

I’m delighted to see so many of you here today, and I’m really excited to tell you about some of the new and cool features we’ve been working in the NLP space for you.

I’m Vivek, and I’ll be jointly presenting this session with my colleague, Doug Davidson.

Let’s get started.

Last year, we placed your app center stage and told you how you could harness the power of NLP to make your apps smarter and more intelligent.

We did this by walking you through the NLP APIs available in NSLinguisticTagger.

NSLinguisticTagger, as most of you are familiar with or have used at some point, is a class and foundation that provides the fundamental building blocks for NLP.

Everything from language identification to organization, part of speech tagging, and so on.

We achieve this in NSLinguisticTagger by seamlessly blending linguistics and machine learning behind the scenes.

So you, as a developer, can just focus on using these APIs and focus on your task.

All that’s great.

So what’s new in NLP for this year?

Well, we are delighted to announce that we have a brand-new framework for NLP called Natural Language.

Natural Language is now going to be a one-stop shop for doing all things NLP on device across all Apple platforms.

Natural Language has some really cool features, and let me talk about each of those.

First, it has a completely redesigned API surface, so it supports all the functionalities that NSLinguisticTagger used to and still does but with really, really Swift APIs.

But that’s not it.

We now have support for custom NLP models.

These are models that you can create using Create ML and deploy the model either using Code ML API or through Natural Language.

Everything that we support in Natural Language, all of the machine learning NLP is high performed.

It is optimized for Apple hardware and also for model size.

And finally, everything is completely private.

All of the machine learning in NLP that is powered in Natural Language is done on device to protect user’s privacy.

This is the very same technology that we use at Apple to bring NLP on device for our own features.

So let me talk about each of these features of Natural Language.

Let’s start with the Swift APIs.

As I mentioned, Natural Language supports all the fundamental building blocks that NSLinguisticTagger does but with significantly better and easier to use APIs.

In order to provide some of these APIs and go over them, I’m going to illustrate them with hypothetical apps.

So the first app that we have here is an app that you wrote, and as part of the app, you enable a social messaging or peer-to-peer messaging feature.

And an add-on feature in this app that you’ve created is the ability to show the right stickers.

So based on the content of the message, which in this case is “It’s getting late, I’m tired, we’ll pick it up tomorrow morning, good night.”

Your app shows the appropriate sticker.

You parsed this text, and you bring up the sticker.

The user can attach it and send it as a response.

So all of this is great.

This app has been doing really well.

You’ve been getting rave reviews.

But you also get feedback that your app is not multilingual.

So it so happens that users are bilingual these days.

They tend to communicate in several different languages, and your app, when it gets the message in Chinese, simply doesn’t know what to do with it.

So how can we use natural language to overcome this problem?

Well, we can do this with two simple calls to two different APIs.

The first is language identification.

With the new Natural Language framework, you start off by importing Natural Language.

You create an instance of NLLanguageRecognizer class.

You attach the string that you would like to process, and you simply call the dominant language API.

Now this will return the single best hypothesis in terms of language for the string.

So the output here is essentially simplified Chinese.

Now in Natural Language, we also support a new API.

There are instances where you would like to know the top-end hypothesis for a particular string.

So you’d like to know what are the top languages along with their associated probabilities.

So you can envision using this in several different applications where there’s a lot of multilinguality, and you want that leeway in terms of what could be the top hypothesis.

So you can do this with a new API called Language Hypotheses.

You can specify the maximum number of languages that you want, and what you get back is an object with the top-end languages and their associated probabilities.

Now in order to tokenize this Chinese text, you can again see the pattern is very similar.

You again import Natural Language.

You create an instance of the NLTokenizer, and in this particular instance, you specify the unit to be word because you want to tokenize the string into words.

You attach the string, and you simply call the tokens method on the string, on the object.

And what you get is an array of tokens here.

Now with this array of tokens, you can look up the particular token here is goodnight, and lo and behold, you have multilingual support in your app.

So your app can now support Chinese with simple calls to language identification and tokenization APIs.

Now let’s look at a different sort of an API.

I mean language identification and tokenization are good, but we would also like to use auto speech tagging, named entity recognition, and so on.

So let me illustrate how to use named entity recognition API again with the hypothetical app.

So here’s an app.

It’s a news recommendation app.

So as part of this app, your user has been going and reading a lot of things about the royal wedding, so a really curious user.

They want to find everything about the royal wedding.

So they’ve perused a lot of pages in your app, and then they go on to the search bar, and they type Harry.

And what they see is completely things that are not pertinent to what they’ve been looking for.

You get Harry Potter and so on and so forth.

What you’d like to see is Prince Harry, something related to the royal wedding.

So now you can overcome this issue in your app by using the name entity recognition API.

Again, as I mentioned, the syntax here is very familiar.

Those of you who have been used to using NSLinguisticTagger, they should look familiar and feel familiar, but it’s much easier to remember and to use.

You import Natural Language.

You now create an instance of NLTagger, and you specify the scheme type to be name type.

If you want part of speech tagging, then you would specify the scheme type to be lexical class.

You again specify the string that you want to process.

In this particular instance, you specify the language, so you set the language to be English.

So if you were not familiar or if you’re not sure about what the language is, Natural Language will automatically recognize the language using the language identification API under the hood.

And finally, you call the tags method on this object that you just created.

You specify the unit to be word and the scheme to be name type.

And what you get as an output is the person names here, Prince Harry and Meghan Markle, and the location to be Windsor.

Now if the user were to go back to the search bar, based on the contextual information of what the user has been browsing, you can significantly enhance the search experience in your app.

So there’s a lot more information about how to use these APIs.

You can go to the developer documentation and find more information.

What I’d like to emphasize here is NSLinguisticTagger is still supported, but the future of NLP is in Natural Language.

So we recommend that and encourage you to move to Natural Language so that you can get all the latest functionalities of NLP in this framework.

Now let’s shift gears and look at a situation where you have an idea for an app, or you need a functionality within your app that Natural Language does not support.

What do you do?

So you can certainly create something, which will be great, but what if we gave you the tools to make that much easier?

To talk about custom NLP models and how to build custom NLP models using Create ML and use the subsequent models, which are essentially Code ML models in Natural Language, I’m going to hand it over to Doug Davidson.

[ Applause ]

Thanks Vivek.

So I’m really excited about the new Natural Language framework, but the part I’m most excited about is support for portraying and using custom models.

And why is that?

I’d just like you to think for a second about your apps, maybe the apps you’ve written or the apps that you want to write, and think about how they could improve the user experience if they just knew a little more about the text that they deal with.

And then think for a second about how you analyze text.

So maybe you look at some examples of text, and you learn from them, and then you understand what’s going on, and then you can look at a piece of text, and at a glance, you can figure out something about what’s going on with it.

Well, if that’s the case, then there’s at least a reasonable chance that you can train a machine learning model to do that sort of analysis in your app automatically for you, giving it examples that it can train and learn from and produce a model that can do that analysis.

Now, there are many, many types of machine learning models for NLP, and there are many different ways of training it.

Probably many of you are already training machine learning models, but our task here has been to produce ways to make this sort of training really, really easy and to make it integrate really well with the Natural Language framework and APIs.

So with that in mind, we are supporting two types of models that we think support a broad range of functionality and that work well with our paradigm in NLTagger of applying labels to pieces of text.

So the first of model we’re supporting is a text classifier.

A text classifier takes a chunk of text, maybe it’s a sentence or a paragraph or an entire document, and applies a label to it.

Examples of this in our existing APIs are things like language identification, script identification.

The second type of model we support is a word tagger, and a word tagger takes a sentence considered as a sequence of words, and then applies a label to each word in a sentence in context, and examples of existing APIs are things like speech tagging and named entity recognition.

But those are sort of general purpose examples of these kinds of models.

You can do a lot more with them if you have a special purpose model for your specific application.

Let me give you some hypothetical examples.

So, for text classification, suppose you’re dealing with user reviews, and you want to know automatically whether a given review is a positive review or a negative review or somewhere in between.

Well, this is the sort of thing you could train a text classifier to do.

This is sentiment classification.

Or suppose you have articles or just article summaries or maybe even just article headlines, and you want to determine automatically what topic they belong to according to your favorite topic classification scheme.

This again is the sort of thing that you can train a text classifier to do for you.

Or going a little further, suppose you’re writing an automated travel agent, and when you get a request from a client, the first thing you want to know probably is what are they asking about?

Is it hotels or restaurants or flights or whatever else that you handle.

This is the sort of thing that you could train a text classifier to answer for you.

Going on to word tagging.

So we provide word taggers that do part of speech tagging for a number of different languages, but suppose you happen to need to do part of speech tagging for some language that we don’t happen to support quite yet.

Well, with custom model support, you could train a word tagger to do that for you.

Or named entity recognition.

So we provide built-in named entity recognition that recognizes names of people and places and organizations, but suppose you had some other kind of name that you were particularly interested in that we don’t happen to support right now.

So, like for example, product names.

So you could train your own custom named entity recognizer as a word tagger that would recognize names or other terms of whatever sort you are particularly interested in.

Even further, for your automated travel agent, once you know what the user is asking about, probably the next thing you want to know is what are the relevant terms in their request.

For example, if it’s a flight request.

Where do they want to go from and to?

So a word tagger can identify various kinds of terms in a sentence.

Or another application, if you need to take a sentence and divide it up into phrases, noun phrases, verb phrases, propositional phrases.

With the appropriate sort of labeling, you could train a word tagger to do this, and many, many other kinds of tasks can be phrased in terms of labeling, applying labels to portions of text, either words in sequence or chunks of text in the text classifier.

So these are supervised machine learning models, so there are two phases always involved.

The first phase is training, and the second phase is inference.

So training is what you do in part of your development process.

You take labeled training data, and you feed it into Create ML and produce a model.

Inference is then what happens in your app when you incorporate that model into your application at run time when it encounters some piece of data from the user, and then it analyzes that data and predicts the appropriate labels for it.

So let’s see how these phases work.

So let’s start with training, and training always starts with data.

You take your training data, and then you feed it in to in this case Create ML in a playground let us say or a script [inaudible] as you may have seen in the Create ML session.

Create ML calls the Natural Language framework under the hood to do the training, and what comes out is a core ML model that’s optimized for use on device.

So let’s look at what this data might look like.

So Create ML supports a number of different data formats.

Right here we’re showing our data in JSON because JSON makes things perfectly clear, and this is a piece of training data for a text classifier that’s a sediment classifier.

So each training example, like this one, consists of two parts, a chunk of text, and the appropriate label for it.

And so this for example is a positive sentence, so the label is positive, but you can pick whatever label set you want.

Now, then when you start using Create ML, the Create ML provides a very, very simple way to train models in just a few lines of code.

First line, we just load our training data from our JSON file.

So we give it a URL to the JSON file, create a Create ML data table from it.

Then in one line of code, create and train a text classifier from this data.

All you have to tell it is what the names of the fields are, text and label, and then once you have it, one line of code writes that model out to disk.

Now for training a Word Tagger, it’s very similar.

The data is just a little more complicated because each example is not a single piece of text.

It’s a sequence of tokens, and the labels are, again, a sequence of labels, the same number of labels, one label for each token.

So this, for example, is training data for a Word Tagger that does name identity recognition, and each word, each token, has a label, either none, it’s not a name, or org, it’s an organization name or prod, it’s product name, or a number of different other labels for whatever kinds of names you’re recognizing.

So each token has a label, and each sample consists of one sequence of tokens and their corresponding labels.

And then the Create ML to train this is almost identical.

You load the training data into a data table from the JSON.

Then you create and train a Word Tagger, in this case instead of a text classifier, and then you write it out to disk.

Now, there are a number of other options and APIs available in Create ML.

I encourage you to, if you haven’t already, take a look at the Create ML session, which happened yesterday, and Create ML documentation for more information on that.

Now, once you have your model, we then go to the inference part.

So you take your model, you drag it into your Xcode project.

Xcode compiles it and includes it in your applications resources, and then what do you do at run time?

Well, it’s a Core ML model.

You could use it like any other Core ML model, but the interesting thing is that these models are able to work well with the Natural Language APIs just like our built-in models that provide the existing functionality for NLP.

So what will happen is data comes in.

You pass it to Natural Language, which will use that model and do everything necessary to get all of the labels out and then pass back either a label, single label for a classifier or a sequence of labels for a tagger.

And so how do you do this in Natural Language API?

First thing you have to do is just locate that model in your application’s resources, and then you create an instance of a class in Natural Language called ML Model from it.

And then, well, the simplest thing you can do with it, at least for a classifier, is just pass it in a chunk of text and get a label out.

But the more interesting thing is that you can use these models with NLTagger in exactly the same way that you use our built-in models for an existing functionality.

So let me show you how that works.

In addition to the existing tag schemes that we have for things like named identity recognition part of speech tagging, you can create your own custom tag scheme, give it a name, and then you can create a tagger that includes any number of different tag schemes.

Your custom tag scheme or any of our built-in tag schemes or all of them, and then all you have to do is to tell the tagger to use your custom model for your custom tag scheme.

And then you just use it normally.

You attach a string to the tagger, and you can go through and look at the tags for whatever unit of text is appropriate for your particular model, and the tagger will automatically call the model as necessary to get the tags and return the tags to you and will do all the other things that NLTagger does automatically, like language identification, tokenization and so on and so forth.

So I want to show this to you in a simple hypothetical example.

And this hypothetical example is an app that users will use to store bookmarks to articles they may have run across and then might be intending to read later on.

But the problem with this application as it currently stands is that the list of bookmarks is just one long list with no organization to it.

Wouldn’t it be nice if we could automatically classify these articles and put them into some organization according to topic?

Well, we can train a classifier to do that for us.

And the other thing is that the articles, when we look at them, it’s a long stream of text.

Maybe we’d like to highlight some interesting things in those articles like for example names.

Well, we have provided built-in name identity recognition for names of people, places, and organizations, but maybe we also want to highlight names of products, so we could train a custom word tagger to identify those names for us.

So let me go over to the demo machine.

And so here’s our application as it stands before we apply any natural language processing.

As you can see, even just one long list of articles on the side and a big chunk of text for our article on the right.

Well let’s fix that.

So, let’s go into so the first part of training a model is data, and fortunately, I have some very hard-working colleagues at Apple who have collected for me some training data to train two models.

The first model is a text classifier that will classify articles according to topic.

So this is some of what the training data looks like.

Each training example is a chunk of text and the appropriate label by topic, entertainment, politics, sports, and so on and so forth.

And I also have some training data to train a word tagger that will recognize product names in sentences.

So this training data is pretty simple.

Each example consists of a sentence considered as a sequence of tokens and then a sequence of labels, and each label is either none, it’s not a product name, or prod, it is a product name.

So, let’s try to train with these.

So first thing I want to do is bring up a playground that I have using Create ML, and this playground will just load.

In this case, this is my product word tagger.

It’ll load the training data, create a word tagger from it, and write it out to disk.

So let me just fire that off, and let it start running.

It’s loaded the data, and so under the hood, we automatically handle all of the tokenization, the feature extraction.

We do the training.

This is a fairly small model, so it doesn’t take all that long to train, and I have it set to automatically write my model out to my desktop, and there it is.

All right.

So that’s one model.

Now I have another playground here that’s set up to train my text classifier.

As you can see, it looks very similar.

Load the training data, create a text classifier from it, and write it out to disk.

So I start that off.

And again, automatically natural language is loading all the data, tokenizing it, extracting features from it.

This one is a bit larger model.

It takes a couple minutes to train.

So let’s just let that go, and in the meantime, take a look at some of the code we have to use these at run time.

So I’ve written two very small classes to do what I need to do at run time.

The first one uses the text classifier by finding that model in my apps resources and creating an NL model for it.

And then when I run across an article, I just ask the model for a predicted label for that article, and that’s really all there is to it.

Slightly more code for use of my word tagger.

So as you saw before, I have a custom tag scheme for my product name recognition, and the only tag I’m really interested in is the product tag.

So I create a custom tag for that.

Again, I have to find the model in my bundles resources, create an NL model for it, and then create an NLTagger, and this NLTagger I’m specifying two schemes.

The first is the built-in name type scheme to do name identity recognition, and the second one is my custom product tag scheme, and they’ll both function in exactly the same way.

And then I just have to tell that tagger to use my custom model for my custom scheme.

Now if I supported multiple languages, I might have more than one model in here for this scheme.

And then what I’m going to do is highlight text in this article that is located, determined to be a name of one sort or another.

So I’m going to get a mutable attributed string, and I’m going to add some attributes to it.

So I’ll take this string of that mutable attributed string, attach that to my tagger, and then I’m going to do a couple of enumerations over tags.

The first one uses the built-in name-type scheme for name identity recognition of people, places, and organizations, and if I find something that’s tagged as a person or place or organization, then I’m going to add an attribute to the attributed string that will give it some color.

And then we can do exactly the same thing with our custom model.

We’re going to enumerate using our custom product tag scheme, and in that case, if we find something that’s labeled with our custom product tag, then I can add color to it in exactly the same way.

So you can use custom models with Natural Language API just in the same way that you use built-in models.

Now, let’s go back to our playground, and we see that the model training has finished, and in fact there are now two models showing up on my desktop.

So all I need to do is drag those into my application.

Let’s take this one and drag it right in.

Okay. And let’s take this one and drag it in, and Xcode will automatically compile these and include them in my application.

So all I have to do is build and run it.

And let’s hide that.

Here’s my new application, and you’ll notice that my list of articles is all neatly sorted automatically by topic, and if I go in and take a look at one of these articles, you’ll notice that names are highlighted in it, and you can see, using our built-in name identity recognition, we highlight names of people, places, and organizations, but if you look a little further, you can see that it has used our custom product tagger to highlight the names of products like iPad, MacBook, iPad mini, and so forth.

So this shows how easy it is to train your own custom models and to use them with the natural language APIs.

[ Applause ]

So now I’m going to turn things back over to Vivek to talk about some important considerations for training models.

[ Applause ]

Thank you, Doug, for telling us how to use these custom NLP models.

We are really excited to sort of have a very tight integration of natural language with Create ML and the Core ML [inaudible], and we hope that you do some really unbelievable things with this new API.

So I’d like to shift attention now again and talk about performance.

So as I mentioned before, Natural Language is available across all Apple platforms, and it also offers you what we call as standardized text processing.

So let’s take a moment again to understand what we mean by this.

Now if you were to look at a conventional machine learning pipeline that didn’t use Create ML, where would you start?

You would start with some amount of training data.

You would take this training data.

You would tokenize this data.

You’d probably extract some features.

This is really important for languages like Chinese and Japanese where tokenization is very important.

You would throw that into your favorite machine learning toolkit, and you’d get a machine learning model out of it.

Now in order to use that machine learning model on an Apple device, you’d have to convert that into a Core ML model.

So what would you do?

You would use a Core ML converter to do this.

This is sort of the training procedure in order to get from data to a model and deploy it on an Apple device.

Now, at inference time, what you do is you drop the model in your app, but that’s not it.

You also have to make sure that you write the code for tokenization and feature extraction that is consistent with what happened at training time.

It’s a lot of effort because you have to think about maximizing the fidelity of your model.

It’s absolutely important that the tokenization featured extraction is identical at both training and inference time.

But now with the use of Natural Language, you can completely obviate this.

So if you look at the sequence at training time, you have training data.

You just pass it to Create ML through the APIs that we’ve discussed so far.

Create ML calls Natural Language under the hood, which does the tokenization feature extraction, chooses the machine learning library, does all the work, and returns a model which is a Core ML model.

Now at inference time, what you do is you still drop this model in your app, but you don’t have to worry about tokenization feature extraction or anything else.

In fact, you don’t have to write a single line of code because Natural Language does all of that for you.

You just focus on your app and your task and simply drag and drop the model in.

The other aspect of Natural Language as I mentioned before is it’s optimized for Apple hardware and for model sizes.

So let’s look at this through a couple of examples.

So Doug talked about named entity recognition and chunking, and here are two different benchmarks.

So these are models that we built using an open source tool kit called CRF Suite, and through Natural Language.

The models were built from identical training data and tested on identical test data.

The same sort of features were used.

The accuracy obtained by both these models is the same.

But you look at the model sizes that Natural Language is able to generate.

It’s simply just about 1.4 megabytes of data size, model size for named entity recognition and 1.8 megabytes for chunking.

That saves you an enormous amount of space within your app to do other things.

In terms of machine learning algorithms, we support two different options.

We can specify this for text classification.

So for text classification, we have two different choices.

One is maxEnt, which is an abbreviation for Maximum Entropy.

In NLP, we call maxEnt is essentially a multinomial logistic regression model.

We just call it Maximum Entropy in NLP feed.

The other one is CRF, which is an abbreviation for Conditional Random Feed.

The choice of these two algorithms really depends upon your task.

So we encourage you to try both these options, build the models.

Now in terms of word tagging, that is one default option, which is a conditional random feed.

When you instantiate an ML word tagger, specify data to it, the default model that you get is a conditional random feed.

Now as I mentioned, the choice of these algorithms really depends on your task, but what I’d like to emphasize is sort of draw [inaudible] between your conventional development process.

So when you have an idea for an app, you go through a development cycle, right.

So you can think of machine learning to be a very similar sort of a work flow.

Where do you start, you start with data, and then you have data, you have to ask a couple of questions.

You have to validate your training data.

You have to make sure that there are no spurious examples in your data, and it’s not tainted.

Once you do that, you can inspect the number of training instances per class.

Let’s say that your training a sentiment classification model, and you have a thousand examples for positive sentiment, you have five examples for negative sentiment.

You can’t train a robust model that can determine or distinguish between those two classes.

You have to make sure that the training samples for each of those classes are reasonably balanced.

So once you do that with data, the next step is training.

As I mentioned before, our recommendation is that you run the different options that are available and figure out what is good, but how do you define what is good?

You have to evaluate the model in order to figure out what suits your application.

So the next step here in the work flow is evaluation.

Evaluation in convention [inaudible] for machine learning is that when you procure your training data, you split your data into training set, into a validation set, and into a test set, and you typically tune the parameters of the algorithm using the validation set, and you test it on the test set.

So we encourage you to do the same thing, apply the same sort of guidelines that have stood machine learning in good stead for a long time.

The other thing that we also encourage you to do is test on out-of-domain data.

What do I mean by this?

So when you have an idea for an app, you think of a certain type of data that is going to be ingested by your machine learning model.

Now let’s say you’re building an app for hotel reviews, and you want to classify hotel reviews into different sorts of ratings.

And the user throws a data that is completely out of domain.

Perhaps it’s something to do with a restaurant review or a movie review, is your model robust enough to handle it.

That’s a question that you ought to ask yourself.

And the final step is well in a conventional development workflow you write patches, you fix bugs, and you update your app.

How do you do that with machine learning?

Well, the way to do that or fix issues with machine learning is to find out where your models do not perform well, and you have to supplement it with the right sort of data.

By adding data and retraining your model, you can essentially get a new model out.

So it’s, as I mentioned, it’s very similar to sort of the development workflow, and they are very [inaudible].

So you can think of it as part of your fabric if you’re employing machine learning models as part of your app, you can just combine it with the word process itself.

The last thing I’d like to emphasize here is privacy.

So everything that you saw in this session, all of the machine learning and Natural Language processing happens completely on device.

So we at Apple take privacy really seriously, and this is a remarkable opportunity to use machine learning completely on device to protect user’s privacy.

So in that vein, Natural Language is another step towards privacy preserving machine learning but in this case apply to NLP.

So in summary, we talked about a new framework called Natural Language framework.

It’s tightly integrated with the Apple machine learning [inaudible].

You can now train models using Create ML and then use those models either with the Core ML APIs or with Natural Language.

The models that we generate using Natural Language and the APIs are highly performed and optimized on Apple hardware across all the platforms.

And finally, it supports privacy because all of the machine learning in NLP happens on user’s device.

So there’s more information here.

We have a Natural Language lab tomorrow, so we encourage you to try out these APIs and come talk to us and ask us questions about where you’d like enhancements or perhaps some sort of consultation with respect to your app.

We also have a machine learning get together, and there’s a subsequent [inaudible] Create ML lab that’s happening right now.

So you can continue coming and talking to us as part of that lab.

Thank you for your attention.


[ Applause ]

Apple, Inc. AAPL
1 Infinite Loop Cupertino CA 95014 US