Using iTunes and App Store Affiliate Tools and Technologies 

Session 133 WWDC 2010

The iTunes and App Store Affiliate Program lets you earn commissions by using a number of powerful tools to integrate iTunes and App Store product information and links into your apps and website. Learn how to leverage tools like Link Maker, the Enterprise Partner Feed, JSON Search API, RSS and others, to build rich user experiences using product meta-data to efficiently direct users to specific apps and discrete iTunes collections.

Mark Miller: Good afternoon, and welcome to the iTunes and App Store Affiliate Tools and Technologies session.

My name is Mark Miller.

I work for the iTunes Store on the Engineering team, and today we’ve got a lot of great stuff to talk about with the affiliate program, and the tools and services you can use with it.

So, this is our first time presenting at WWDC about the program, so we’ll give a quick overview, what it is, how it works, just so everybody’s caught up.

Following that, we’ll kind of go into some common use cases, mistakes people make when they first join the program, just to try and alleviate any errors.

And then, finally, we’ll dive down into some of the really cool tools that you can use to generate links and find content for your customers on the iTunes Store.

So, what is the affiliate program?

It is a great way for helping you promote your apps and other content on the Store.

Along the way, it allows you to earn money for referrals.

That money is called a bounty.

It’s a commission.

It’s based on the percentage of the sale that is made after a user clicks a link and makes a purchase.

The program also provides some metrics for you to judge the effectiveness of your marketing.

So, you can say this link on page 2 of my website is working really well, while this other link in my app isn’t working.

And as I alluded to, it works both in your apps and on your websites.

So, quick cast of characters, so we know who we’re talking about.

First of all, we’ve got the iTunes and App Store servers, then we have affiliate publishers.

That would be you in the audience, hopefully, at the end of this session.

And kind of bridging the gap between us, there’s the affiliate networks.

I’ll get into what they do in just a second.

And, finally, we have customers: your customers and customers on the iTunes Store.

So, the affiliate networks kind of fill in and provide some services for you, the publisher, that iTunes can’t.

And to do that, we partner with different organizations around the globe.

In the U.S., the affiliate network is LinkShare, while in Europe, the affiliate network is TradeDoubler.

In Japan, again, we go with LinkShare, and DGM covers Australia and New Zealand.

The affiliate networks, they will vary in their terms and conditions.

You do need to sign up with each, in turn.

The bounty percentage is going to vary a little bit, but, for the most part, it’s going to be five percent, and the bounty window, meaning the time allowed between a click and a purchase, is three days, worldwide.

So, how does this all work?

It is all about the links, the links that you create for your customers and that they click on.

Here’s a basic iTunes link to an app called Remote.

And we need to annotate that link with some affiliate data, so that you can get a bounty.

So, let’s go into that.

The first query parameter that’s really important is the partnerId.

The partnerId identifies the affiliate network, either LinkShare, or TradeDoubler, or DGM, what have you.

And the second piece of information is the affiliate token.

This is sort of an opaque string to the iTunes Store that identifies you to the affiliate network.

So, you’ll get an affiliate token from the network.

Some concrete examples here.

For LinkShare, the partnerId is 30, and the affiliate token name is siteID.

So, this is the actual query parameter that will be on the URL, and it’s called siteID.

You can see here an example.

It’s sort of a strange string, CBIMI.

For TradeDoubler, the partnerId is 2003, and the affiliate token name is tduid.

You’ll notice that this is all lower case.

These query parameters are case sensitive, so make sure you get them right when you’re constructing your URLs.

And DGM, finally, the partnerId is 1002, and the affiliate token name is affToken.
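
As a rough sketch of the values just listed, the partnerId and token parameter name for each network can be kept in one table and turned into a query fragment. The names `AFFILIATE_NETWORKS` and `affiliate_query` and the token `TOKEN123` are illustrative placeholders, not part of any Apple or network API.

```python
from urllib.parse import urlencode

# partnerId and token parameter name per network, as stated in the session.
AFFILIATE_NETWORKS = {
    "LinkShare": {"partnerId": 30, "token_param": "siteID"},
    "TradeDoubler": {"partnerId": 2003, "token_param": "tduid"},
    "DGM": {"partnerId": 1002, "token_param": "affToken"},
}


def affiliate_query(network: str, token: str) -> str:
    """Return the affiliate portion of a query string for one network."""
    info = AFFILIATE_NETWORKS[network]
    # Parameter names are case sensitive, so they are emitted exactly as listed.
    return urlencode({"partnerId": info["partnerId"], info["token_param"]: token})
```

For example, `affiliate_query("TradeDoubler", "TOKEN123")` produces a fragment that uses the all-lowercase `tduid` name.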

So, we saw some basic links right there that will allow you to get a bounty for a purchase made on the iTunes Store, but you can also do a more complicated link, such as this, that will allow you to get reports on clicks, impressions, and other information, via the affiliate network.

So, what this URL does is, when a user clicks on it, it goes through the affiliate network.

The affiliate network tracks that a little bit, redirects them to the iTunes Store, and they don’t see anything different.

So, what happens when your user clicks on an affiliate link?

Well, it’s pretty simple.

Here, we have a webpage, and they click on a link, and they get into the iTunes Store.

It’s the same thing as an unaffiliated link.

Under the covers, though, a little something different is happening.

So, over on the left, we’ve got Mobile Safari.

This is just kind of playing the role of either your app, or a desktop browser, or even a desktop app.

And then there’s iTunes, the client, the affiliate network, and the iTunes Store.

So, when the user clicks on an affiliated URL, the browser makes a request that goes to the affiliate network’s redirection server.

That network will record the click and send a redirect back to the browser.

The browser will then launch iTunes, which makes a similar affiliated request directly to the iTunes Store.

The iTunes Store will take the affiliate data off the URL, put it into a cookie, and send it back to iTunes.

So, at this point, we’re in the bounty window.

The user has clicked a URL, and he or she is browsing the Store.

Hopefully, within the next three days, she’ll make a purchase.

So, let’s do that.

The purchase is made.

The product is downloaded.

The customer’s happy.

iTunes Store reports the sale to the affiliate network.

After this, you’ll see the purchase happen, and eventually you’ll get credited with that bounty.
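
To make the bounty window concrete, here is a minimal sketch that checks whether a purchase falls within three days of the click and, if so, applies a five percent rate. The function name, the flat five percent rate, and the rounding are assumptions for illustration; the networks’ actual calculations may differ.

```python
from datetime import datetime, timedelta
from decimal import Decimal

BOUNTY_WINDOW = timedelta(days=3)   # three days, per the session
BOUNTY_RATE = Decimal("0.05")       # "for the most part" five percent


def bounty_for(click_time: datetime, purchase_time: datetime,
               sale_amount: Decimal) -> Decimal:
    """Return the estimated bounty, or 0.00 if the purchase is outside the window."""
    elapsed = purchase_time - click_time
    if timedelta(0) <= elapsed <= BOUNTY_WINDOW:
        return (sale_amount * BOUNTY_RATE).quantize(Decimal("0.01"))
    return Decimal("0.00")
```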

So, given that’s how it works, let’s go into some use case scenarios of where this might be applicable.

In your app, you can use it to promote similar apps or companion apps.

If you’ve got a game, and there’s another game that you think your users will like, go ahead and link to that game in your app, and make it an affiliated link.

Now, you don’t have to link only to content that you created.

It could be other links.

If you have an app that recommends other apps, or maybe an app that recommends TV shows that a user would like, make those links affiliated and earn the affiliate revenue that way.

Finally, you want to connect your customers with the content that they want.

If, at a certain point in your app, you know that your user is looking for some TV episode, or some music video, or some song or album, go ahead and give them that link, so that they have the option to get it right then.

On your website, the situation is very similar.

You might have a website that’s promoting your application to maybe someone who’s searching online.

Make those links affiliated, and then you can track how they’re doing, see what works to really get people buying your app.

You might link to similar or some sort of complementary app.

If you’ve got some productivity software that works really well with another piece of software, link to that.

It’s sort of a custom package that your customers will appreciate.

And, finally, you can link just about any content you have to the matching content on the iTunes Store.

So, if you run a streaming radio station, or maybe you have some sort of channel listing for TV shows, or a music recommendation engine, link that content to the iTunes Store and get the affiliate revenue, based on that.

These are some of the common pitfalls that people run into, when they first get started.

The number one thing is not checking the URL for the existence of a query string before modifying it.

So, here we have a URL that doesn’t have a question mark identifying the start of a query string, and this user has just appended &partnerId=30.

That’s not going to work, and it probably won’t even load a webpage.

Better is to check for the question mark and append the affiliate data that way.

The same holds true in reverse.

If there’s already a question mark, you need to use ampersand to separate your query parameters.

So, here’s a link with a question mark already on it, and we use ampersand to add the partnerId and the affiliate token, siteID in this case.
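
One way to avoid both pitfalls is to let a URL library decide between the question mark and the ampersand. This is a minimal sketch using Python’s standard `urllib.parse`; the function name and the example URLs in the test are placeholders.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_affiliate_params(url: str, partner_id: int,
                         token_name: str, token: str) -> str:
    """Append partnerId and the affiliate token, keeping any existing query."""
    parts = urlsplit(url)
    # Existing parameters (if any) are preserved in their original order.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query += [("partnerId", str(partner_id)), (token_name, token)]
    # urlunsplit inserts "?" only when there is a query to attach.
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Because the query string is rebuilt from its parts, the same call works whether or not the original link already carried parameters.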

This is another issue that might be a little unexpected.

If you have users in a particular storefront, you need to use the affiliate network that’s associated with that storefront.

So, if you have French users, you need to provide them with a TradeDoubler URL, because if they click, say, a LinkShare U.S. link and make a purchase, we can’t send that data to LinkShare from Europe.

So, it’s important that you create the right URL for your customers.
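
A small lookup can pick the network from the user’s storefront. The regions follow the session (U.S. and Japan on LinkShare, Europe on TradeDoubler, Australia and New Zealand on DGM); the specific two-letter codes and the European subset below are assumptions for illustration only.

```python
# Illustrative subset of European storefront codes; not a complete list.
EUROPEAN_STOREFRONTS = {"fr", "de", "gb", "it", "es"}


def network_for_storefront(country_code: str) -> str:
    """Return the affiliate network name to use for a storefront country code."""
    code = country_code.lower()
    if code in {"us", "jp"}:
        return "LinkShare"
    if code in {"au", "nz"}:
        return "DGM"
    if code in EUROPEAN_STOREFRONTS:
        return "TradeDoubler"
    raise ValueError(f"no affiliate network known for storefront {country_code!r}")
```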

And, finally, this is just sort of the simple generic advice.

Test the links.

If you’re rolling out a new feature, or you’re doing some sort of big push, make sure that when the URL is clicked on, it winds up at the right place, and your users will be happier.

Also, if it’s a really big thing, go ahead, click that link, and make a purchase.

You can purchase any cheap content.

It doesn’t have to be for your high-priced app.

The sale will bounty, regardless.

So, that’s kind of the overview.

Let’s go into some of the tools and services we provide.

You might have noticed that the URLs can be complicated, and so we really want to make that easier on you.

And for that, we have a tool called Link Maker.

Link Maker is the simplest way to generate an affiliated URL.

All you do is search for content and get an affiliated URL back.

But, obviously, that doesn’t really scale.

Once you’re generating more than 50 links or so, you might want to come up with something more dynamic, and for that you’ve got RSS.

RSS is the same service that feeds your newsreaders, and weblogs, and what have you.

But we use RSS to promote popular content on the Store.

We have feeds for new content, recently featured content, and the popular stuff.

Now, once you have RSS, you might find you need something more responsive to your users.

For that, we have a Search API, so you can actually generate searches against the iTunes Store over the web and get back, programmatically, the content that you or they are looking for.

It’s a great way to add some interactivity.

So then we might want to graduate to the next level of technicality, if you will.

The Search API is great, but it doesn’t scale well if you’re making a million queries an hour or whatever.

You might want to have all that data within your systems integrated so that you can do custom queries.

For that, we have Enterprise Partner Feeds.

It’s a bulk data download that you pull down over HTTP, integrate into your databases, and you can access that way.

And, finally, once we have this library of music or library of other content that you want to link to the iTunes Store, you might want to construct a custom playlist for your users, so that they can buy a mix of their favorite genre of music, and you can create that, because, presumably, you know what they’re interested in.

Web iMix is a great solution for this, because it gives them a customized playlist on the iTunes Store.

So, let’s take a look at Link Maker, see how it works and what it does.

Link Maker, as I said, is the simplest way to create links on the iTunes Store.

You put in a search term, and you get back affiliated search results.

You can access it at, and we just launched the new version an hour or so ago.

So, if you want to check it out, I encourage it.

Link Maker is a very simple tool.

You put in a search term, and specify some parameters, and you get back search results.

If you click the link on the right, you’ll get an overlay, which provides the affiliated URL, along with a custom HTML that you can drop into a webpage, if you need to.

And we offer three variations on the HTML, so you can get a text link, or a text link with a picture, or a badge, and that sort of thing.

And this is what that overlay looks like.

It gives you the encoded link, and, below it, the HTML, which you can copy and paste into a webpage or into your app, if you need to.

Obviously, though, that’s a one link at a time sort of tool, and we might want to have a whole bunch of links coming in, and that’s where RSS comes in.

RSS provides you a feed of dynamic, constantly updating content that is a great way to spice up any place where you feel like the content you have is too static.

You can put this on a website or in your app, and there are a number of feeds: popular content, new content, and recently added content, recently added being back-catalog stuff that only recently appeared on the iTunes Store. We also have featured content.

So, it’s the stuff that’s being promoted on the pages right now.

Here’s what an RSS link looks like.

For music, for example, we have a top albums feed and a top songs feed.

For apps, on the other hand, we have the top paid applications, and there’s actually a number of feeds, in this case, where you can get the top free applications or the top-grossing applications, as well.

All those are different feed URLs.

As far as content types, we cover a wide variety.

There’s podcasts, movies, music videos, audiobooks, TV shows.

There’s a lot there.

You can get the full list of the RSS content at

We have a little generator there that will allow you to specify some parameters and generate a feed URL for you to use.

So, we saw those basic URLs, but maybe we want to customize it a bit.

If we’re interested in the top albums in France, well, we just put in the French country code at the first element in the path, and that all works.

You might be more interested in the top albums in a particular genre; in this case, Blues.

Well, that’s easy, too.

We put in, as an option, genre=2, 2 being the genre ID of the Blues genre.

There’s a full list of all the genres available at the resource site that I’ll mention at the end of this talk.

We can also extend or contract the number of items in the RSS feed that we see at a given time.

That option is called limit, and you say limit=200, and you’ll get the top 200 albums in the Blues genre in France.

And, finally, we can get to the advanced options.

The standard feed format that you’re going to get is Atom, and that’s specified by the xml extension at the end of the URL.

We also offer JSON output.

This is JavaScript Object Notation, which you may be familiar with.

That’s available just by changing the extension to json.

If you want to learn more about Javascript Object Notation, you can check out

But, in essence, it’s an interchange format that’s very similar to the old style plists, if you’re familiar with those.

We also provide JavaScript callbacks, via the callback parameter, and this allows the JSON format to integrate with JavaScript on your websites.
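
The customizations above can be sketched as a small URL builder. The actual feed URLs are not reproduced in this transcript, so the host, the `rss` path segment, the feed name `topalbums`, and the option ordering below are assumptions; only the country code, `genre`, `limit`, and the xml/json choice come from the talk.

```python
RSS_BASE = "https://itunes.apple.com"  # assumed host, not taken from the session


def rss_feed_url(feed, country="us", genre=None, limit=None, fmt="xml"):
    """Build a feed URL: country first, then options, then the output format."""
    if fmt not in ("xml", "json"):
        raise ValueError("fmt must be 'xml' or 'json'")
    segments = [RSS_BASE, country, "rss", feed]
    if limit is not None:
        segments.append(f"limit={limit}")
    if genre is not None:
        segments.append(f"genre={genre}")
    segments.append(fmt)  # xml gives the Atom feed, json gives JSON output
    return "/".join(segments)
```

Under those assumptions, `rss_feed_url("topalbums", country="fr", genre=2, limit=200)` would describe the top 200 Blues albums in France.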

And so to give a demo of what you can do with RSS, we’ve got Joe Hwang, my colleague on the iTunes Store.

He’s going to give us a quick tour of a music blog that needs some pizzazz, if you will.

So, I’ll turn it over to Joe.

Joe Hwang: Thanks, Mark.

Nice to see all you guys here today.

[applause] Thank you.

So, in the beginning, Mark talked about affiliated links, and so how you can earn a bounty on links that are driven to the Store, and he also talked about RSS, which is a way that you can grab content from the Store.

So, let’s figure out a way to put those two things together and spruce up our little music blog here.

So, we have a music blog, and it’s kind of nice, and you can talk about the latest music thing that’s going on.

But let’s say we want to make this a little bit more dynamic.

So, we can do that by adding a splash banner on top of the top five albums in the iTunes Store today.

So, this is the same page.

We just added the splash banner on top, and these are the top six albums in iTunes Store right now.

And so there are a lot of advantages to this.

The first thing is that all of this data is dynamic.

You can just set up the code once, and it’s an RSS feed, so it’ll constantly update with all the latest stuff.

It also looks pretty cool.

You get all the cover art, and you get all the metadata that comes with it.

So, from this example, we’re pulling down the five most popular songs from each of these albums.

And, of course, these links are also affiliated.

So, if you take a look here, here’s the affiliated link.

This is the click tracking that Mark talked about in the beginning.

And if anyone clicks on one of these links, and gets driven to the Store, and then buys something, then you get a bounty, which is great.

So, let’s take a look at some of the code for this.

One thing to note is that it’s very simple.

It’s very easy to do.

So, here’s our script, where, first, we have the feed URL here.

And if you notice, it is slightly different from the feed URL that Mark gave a little bit earlier.

That’s because the demo was written before all the URLs were updated.

And we’re pulling down the top Alternative albums.

So, this is the genre ID here, and we’re grabbing the top albums.

Again, you can customize this however you like with whatever genre or whatever type of media type you want to do.

Here, we have the partnerId, which is, in this case, LinkShare.

PartnerId 30 is LinkShare, and the URL prefix for click tracking.

So, we put all that together, and we’re calling this method here, RSS.getsimplifiedfeed. What this method does, pretty much, is call the URL, get the JSON back, which is an easy way to do object notation in JavaScript, go through each of the rows, get the fields from each row, and then create some HTML markup to output what you saw earlier.

So, again, it’s really simple to do.

It’s dynamic.

It’ll always update with whatever is there, and you get the bounty.

So, yeah, just an example of how you can put those things together.

Mark Miller: Great.

Thanks, Joe.

[applause] Oh, and just a quick reminder, all the sample code that we’re demoing here today is available, and it’s attached to the session.

There was a little problem at first, so you might want to check again, if you’re interested.

So, RSS is a great way to get a feed of content, but maybe you want something more conversational, more interactive.

How do we search through the Store for something in particular?

That’s where the Search API comes into play.

The Search API provides search and metadata lookup on the web.

The responses are in the JSON format, and there’s two main actions.

There’s Search.

This is where you provide a search term of some sort.

And then there’s a Lookup.

A lookup is an ID-based lookup.

So, if you have iTunes IDs already from some other system or some other source, you can look up that content, using that ID, and get up-to-date pricing, availability information, and all sorts of metadata, as well.

So, here’s some of the data that you can get, if you do use the JSON Search API.

This is an example for an application, but you can see that the artist ID is available, along with the price, a URL where you can download it, the supported devices, and the description.

There’s a lot of information there, and it’s pretty rich.

So, how do we actually construct a Search URL to get this data?

The simplest possible way to do that is to just use the query parameter called “term,” but there’s a lot of modifiers available, so that you can dig pretty deep into the Store to find what you’re looking for.

First and easiest is “country.”

The country specifies which country you’re interested in, and if an item is not available in that country, then you won’t get the result.

You can also specify the “language.”

So, if we have a localization or a localized title in that language, you can specify that.

“Media” is an interesting one. It’s a broad term, but you can specify that you’re interested in music content, as opposed to application content, or maybe TV content.

All that can be handled with media.

If you want to get more specific, though, you can specify the “entity.”

The entity distinguishes, say, between an album and a song, or a TV episode and a TV season.

You can also specify the “attribute,” so this is the aspect of the entity that you’re searching on.

We’ll get to that in a minute with an example.

As in RSS, we also provide a “limit,” so you can specify how many results you want back.

And also, like RSS, we have a “callback,” so that allows you to use this from JavaScript on a remote website.

And the results come back as JSON dictionaries, and in the search case, they’re sorted by relevancy.

So, the top item is supposed to be the most relevant to your search.
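
Since results arrive as JSON dictionaries already sorted by relevancy, taking the most relevant items is just a slice. The sample payload and its field names (`resultCount`, `results`, `artistName`, `trackName`, `trackViewUrl`) are illustrative assumptions, hand-written rather than captured from the service.

```python
import json

# Hand-written stand-in for a search response; not real Store data.
SAMPLE_RESPONSE = """
{"resultCount": 2,
 "results": [
   {"artistName": "Example Artist", "trackName": "First Song",
    "trackViewUrl": "https://example.com/song/1"},
   {"artistName": "Example Artist", "trackName": "Second Song",
    "trackViewUrl": "https://example.com/song/2"}
 ]}
"""


def top_results(response_text: str, count: int = 1) -> list:
    """Return the first `count` result dictionaries, i.e. the most relevant ones."""
    payload = json.loads(response_text)
    return payload.get("results", [])[:count]
```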

So, let’s take a simple example.

We’re searching for Madonna.

Well, that URL is pretty simple.

You say, term=madonna, and you’re good to go, but say we’re actually interested in Madonna’s books.

She’s also an author.

So, we specify that the media we’re interested in is audiobook, and we want five results back, so we add limit=5.

Now maybe we roll back a little bit and say, “No, let’s get the music albums instead.”

For that, it’s media=music, and the entity=album.

A little more advanced, we might want songs named Madonna, not songs by Madonna.

In that case, we can use attribute=songTerm.

So, that means that the search will be applied to the song name, rather than the artist associated with the song.

So, this can be very powerful when you want to get very particular about the searches you’re doing.
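
The Madonna examples can be expressed as one helper that starts from `term` and adds whichever modifiers are wanted. The endpoint constant is an assumption (the session’s URLs are not reproduced in this transcript); the parameter names and values are the ones mentioned in the talk.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://itunes.apple.com/search"  # assumed endpoint


def search_url(term: str, **modifiers) -> str:
    """Build a search URL from a term plus optional modifiers such as media or limit."""
    params = {"term": term}
    # Skip modifiers explicitly passed as None so callers can forward optional values.
    params.update({k: v for k, v in modifiers.items() if v is not None})
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"
```

For instance, `search_url("madonna", attribute="songTerm")` corresponds to the search for songs named Madonna rather than songs by Madonna.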

So, that’s Search, in a nutshell, but if we already have the IDs that we’re looking for, and we just want to query some things about it, Lookup comes into play, and Lookup takes an iTunes ID in an “id” param.

And if you want to specify multiple IDs in a single request, you can do that with just a comma separator.

You can also use different IDs.

So, if you have the UPC for an item, you can use that, and if you have All Music Guide IDs, we also offer that mapping.

So, Lookup query parameters, mostly the same as Search: country, language, limit, callback.

All these are familiar.

“Entity” is the same, but we use it differently, and I’ll get to that in just a moment.

We also allow you to “sort.”

So, what is this for?

So, we might do a simple lookup for Madonna.

We already know her artist ID is 20044, so we put together a URL like this, and we get that single result back.

Now we might want to fetch Madonna’s top songs.

Well, how do we do that?

In that case, we specify the artist ID, but we include the entity as song.

What that will do is fetch the artist, and then look up the artist’s songs and return those, as well.

We also are applying a sort to this URL, and we’re saying rank it by popularity, so you’ll get the top five songs for Madonna with just this one query.

You can also change that.

If you’re not interested in the most popular songs, but maybe the most recent, we can give you the most recent songs with sort=recent.
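
A lookup URL follows the same pattern, with one or more IDs joined by commas in the `id` parameter. The endpoint constant is an assumption; `sort=recent` is the value quoted in the talk, while the value used for popularity ranking is not given here and is therefore left out.

```python
from urllib.parse import urlencode

LOOKUP_ENDPOINT = "https://itunes.apple.com/lookup"  # assumed endpoint


def lookup_url(ids, **options) -> str:
    """Build a lookup URL for one or more iTunes IDs plus optional parameters."""
    params = {"id": ",".join(str(i) for i in ids)}
    params.update(options)
    # safe="," keeps the comma separator readable instead of percent-encoding it.
    return f"{LOOKUP_ENDPOINT}?{urlencode(params, safe=',')}"
```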

This can also be very powerful for up-to-the-minute sort of data.

And to give us another demo of the Search API and what you can do with it, I’m going to turn it over back to Joe.

Thanks, Joe.

Joe Hwang: All right, so just to show off the Search API a little bit, here we have, again, our simple music blog, and you can spruce it up with a search field here.

And this is just going to be calling the WS search that Mark just talked about.

So, here, if we search for Madonna, here we can get these results back, and these are all coming directly from the Store.

And, again, we have beautiful cover art that you can format however you like.

All the metadata, that’s there, and, again, importantly, all of these links are affiliated links, so that people who click on these and buy things on the Store, you’ll get a bounty for that.

So, let’s take a look at the code again.

It’s very simple, very similar to before.

We have our partnerId and our URL prefix, and we’re calling here, do search.

So, in this JavaScript method, we’re setting up our query here with limits, and, you know, all the other delimiters that you want to add.

We’re asking for all tracks, so songs, and here’s the URL that we’re going to be sending a query to.

We get the JSON back from that query, and, once again, we’re going to go ahead and format all of the markup with the JSON objects that we received back.

So, once again, it’s very simple to do, and it looks great.

Mark Miller: Thanks, Joe.

Joe Hwang: All right.

Mark Miller: So, that’s a quick taste of what you can do.

Again, this is all provided in sample code, that you’re welcome to customize and use on your sites or in your apps.

But, again, the Search API might not scale well if you’re doing a very large number of queries, or if you’re just trying to get a better sense of what’s going on in the Store, and maybe you need to do more analysis.

For that, we’ve got the Enterprise Partner Feeds.

The Enterprise Partner Feed, or EPF, as it’s called, comes in two flavors.

There’s EPF Relational, which is a format that’s very well suited for relational databases like MySQL, Postgres, whatever.

It also comes in a Flat flavor.

This is all of the data for a given content type and country in a single file, and there’s some applications that you can use that for, as well.

So, let’s take a look at EPF Relational.

EPF Relational has a lot of very rich metadata about virtually all iTunes content.

It’s intended, as I said, to be imported into a relational database, so it comes in multiple files, each one representing a database table, and it’s aimed at organizations that need to run custom queries, so if you need to run some SQL, that gives you prices or whatever, all that data’s there.

It’s a great tool.

So, what data is available?

And the answer is most anything that you can see on the iTunes Store is going to be in EPF.

So, just anything that you can see on these pages is going to be in there.

For some more concrete example, we can take a look at apps, near and dear to all our hearts.

You’ve got the basics, the app name, developer, the price, and the URL at which you can access the app, but you can also get the app description, the recommended age, screenshot URLs, copyright information, that sort of thing, and you can also get device requirements and the popularity for an app. There’s a lot of rich data there.

So, how does it work?

EPF comes out every week as a full export. This is a nearly complete dump of the data that we have in the iTunes database onto disk and into files.

We also do a daily incremental export that’s relative to the full export.

The incrementals are data that’s new or that has changed since the last export.

So, if the full export arrives on Wednesday, the Thursday incremental will have all of the deltas relative to Wednesday’s full export.

The same applies to Friday, Saturday, and Sunday.

On Sunday, it will be a cumulative export of anything that’s changed since Wednesday.
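
Because each incremental is cumulative relative to the last full export, catching up should only require the full export plus the newest incremental after it. The sketch below encodes that reading of the schedule; the function name and the tuple layout are illustrative.

```python
from datetime import date


def exports_to_apply(full_date: date, incremental_dates: list) -> list:
    """Return the exports needed to be current: the full, then the latest incremental."""
    # Incrementals dated on or before the full export are already covered by it.
    later = [d for d in incremental_dates if d > full_date]
    plan = [("full", full_date)]
    if later:
        plan.append(("incremental", max(later)))
    return plan
```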

Now, where do you get these files?

You download them over HTTP.

There’s a server, which is listed on the Resources site, that you can download them from.

So, the structure of what you’re downloading: the data is organized into purpose-specific archives.

These are grouped into iTunes, which is just the basic metadata that you’d expect, app name, album name, that sort of thing.

There’s matching, which is great. It provides all the UPC, ISRC, ISAN.

Any sort of standard ID that we have, we export in the matching archive.

The pricing archive has prices across all countries, and there’s a new popularity archive, which gives you the popularity per genre of various bits of content.

Now, within the archives, there are files, and those represent the database tables.

So, here’s an example.

We have the artist table, and we also have the application table.

Now, artist is the term that we use to describe you, the app developer.

We’re all artists here, so, you, too, are an artist.

To link them, we have a join table.

This is just like your Database 101 class.

It’s the same sort of schema you’d expect.

So, we have the artist application join table.
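
To illustrate how the join table ties artists to applications, here is a small sketch using an in-memory SQLite database. The table and column names (`artist`, `application`, `artist_application`, `artist_id`, `application_id`, `name`, `title`) and the sample rows are hypothetical; the real EPF schema should be read from the files themselves.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory database with hypothetical EPF-like tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE artist (artist_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE application (application_id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE artist_application (artist_id INTEGER, application_id INTEGER);
        INSERT INTO artist VALUES (1, 'Example Developer');
        INSERT INTO application VALUES (10, 'Example App');
        INSERT INTO application VALUES (11, 'Another App');
        INSERT INTO artist_application VALUES (1, 10);
        INSERT INTO artist_application VALUES (1, 11);
    """)
    return conn


def apps_by_artist(conn: sqlite3.Connection, artist_name: str) -> list:
    """Return application titles for an artist by walking the join table."""
    rows = conn.execute(
        """
        SELECT application.title
        FROM artist
        JOIN artist_application
          ON artist_application.artist_id = artist.artist_id
        JOIN application
          ON application.application_id = artist_application.application_id
        WHERE artist.name = ?
        ORDER BY application.title
        """,
        (artist_name,),
    )
    return [row[0] for row in rows]
```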

Now, the file format. If you’re going to be processing these files, this is important.

EPF files are formatted by records and fields.

The field separator is ASCII character 1, and the record separator is ASCII character 2 followed by a newline.

This is a good match for readability within the file, and the delimiters are not actually printable, so they won’t interfere with the actual content.

We also have a comment record.

That’s just a record whose first character is hash.

We use comments to provide metadata about a particular table/file.

So, that includes the column names, some database type information, primary keys that you might want to set up, a lot of useful stuff there.
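
A minimal reader for this format might look like the following. It relies only on what is described here: fields separated by ASCII 1, records ended by ASCII 2 plus a newline, and comment records starting with a hash. The sample text, including its column names and values, is made up for illustration.

```python
FIELD_SEP = "\x01"      # ASCII character 1 between fields
RECORD_SEP = "\x02\n"   # ASCII character 2 plus newline after each record


def parse_epf(text: str):
    """Split EPF text into (comments, rows); comments have the leading '#' removed."""
    comments, rows = [], []
    for record in text.split(RECORD_SEP):
        if not record:
            continue  # the trailing separator leaves one empty chunk
        if record.startswith("#"):
            comments.append(record[1:])
        else:
            rows.append(record.split(FIELD_SEP))
    return comments, rows


# Illustrative sample only; real files carry their own metadata comments.
SAMPLE = (
    "#export_date\x01genre_id\x01parent_id\x01name\x02\n"
    "#primaryKey:genre_id\x02\n"
    "1275000000000\x012\x0134\x01Blues\x02\n"
)
```

With the sample above, `parse_epf(SAMPLE)[0][0].split(FIELD_SEP)` recovers the column names from the first comment record.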

So, that’s EPF Relational.

We can take a quick look at EPF Flat, as well.

EPF Flat is a simpler approach.

For a given country plus content type, there’s exactly one file that you have to download, and EPF Flat uses a tab-separated values format, so it’s compatible with spreadsheet applications.

So, if you need to select a portion of the file, paste it into a spreadsheet, and manipulate the data that way, you can.

It’s very easy.

Now, these files are very large.

There’s a large number of songs on the Store, so it’s not going to be simple just to open up the whole file in a spreadsheet, but you can do a subset.

There’s no schema, no relational tables, none of that sort of academic type stuff.

All of the data for a given object appears in a single row.

So, if you want to find some data about an app, you search for the name.

The price is going to be right there.

And EPF Flat is great for populating a non-relational data store.

So, if you’re using CouchDB or Memcached, all of these kind of popular NoSQL key-value stores, EPF Flat is a great candidate for that.
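
As a sketch of the key-value idea, a slice of a tab-separated flat file can be loaded into a dictionary keyed by one column. This assumes the first row holds column names, which the session does not state, and the column names in the test are hypothetical. A plain dict stands in for the external store.

```python
import csv
import io


def load_flat(tsv_text: str, key_column: str) -> dict:
    """Index tab-separated rows by one column, one dictionary per row."""
    reader = csv.DictReader(
        io.StringIO(tsv_text),
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # treat quote characters as ordinary data
    )
    return {row[key_column]: row for row in reader}
```

Since the full files are very large, this approach suits a subset of rows; a production import would stream rows into the store instead of holding them all in memory.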

So, that’s EPF Flat.

And, fortunately, for all this complexity, we’ve come up with a tool that helps you pull this data down and import it into your databases.

So, EPF Importer is a cross-platform Python app that imports EPF into a MySQL database.

It handles both Relational and Flat style data, and it’s self-configuring.

It actually reads those comments I mentioned and figures out what it needs to do.

On the command line, you can restrict the imports with a whitelist or blacklist, and it also supports resume, so it’s production ready.

If you need to run this in a high availability production environment and something happens to it, it’ll pick right up where it left off.

And it’s released as sample code.

It should be attached to the session, so I urge you to check it out.

It’s great stuff.

And to give you a quick demo of EPF and EPF Importer, we’ve got Rick Rubenstein, a colleague on the QA team.

He’s also the author of EPF Importer, so perfect for talking about it.

Rick Rubenstein: Thank you, Mark.

[applause] Thank you.

So, as Mark said, EPF Importer is a relatively simple Python tool, cross-platform, imports EPF files of any sort into a MySQL database.

Before we actually run EPF Importer itself and take a look at that, let’s take a quick look at what you’ll see if you download and unarchive one of the EPF Relational feeds.

So, here we have the iTunes feed, which contains all of the basic metadata on all different content types, artists, genres, etc.

There’s about 40 files in this particular download.

And even though this is everything on the iTunes Store, it’s not a huge amount of data.

It’s a few gigabytes, compressed, so not at all difficult to handle.

Let’s take a look at the same thing here.

So, again, here’s the list.

We have, for example, an artist table, a collection table, and various join tables, as Mark mentioned.

So, let’s take a quick look at the actual content of one of these downloaded files.

Let’s look at the genre file, which is relatively simple, and just take a look at the first eight rows of the genre file.

So, up here at the top, well, the first thing that you’ll notice is these funny-looking characters jumping out.

Those are the record and field delimiters that Mark mentioned, ASCII 1 and ASCII 2.

At the top here, we have a number of comment rows.

So, the first comment row, by the definition of the EPF spec, is always a list of column or record names for that particular exported file.

There’s always an export date, so you can make sure that you always have the most recent data, followed, in this case, by genre ID, parent ID, and name.

Primary key is genre ID.

Every genre ID in the genre feed will be unique.

DB types is the data types associated with each column or record, which you would use when creating your database to make sure that the record type matches the data type.

And, finally, the export mode, which in this case is full.

This is a full EPF Relational export, as opposed to an incremental export.

And EPF Importer knows how to automatically distinguish between the two and do the proper type of import, either replacing the entire content of the database table or updating it in place.
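The difference between the two import types can be sketched as follows. This is not EPF Importer's actual code: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the function name, the mode strings, and the sample genre rows are all assumptions.

```python
import sqlite3

def apply_export(conn, table, columns, primary_key, rows, export_mode):
    """Load rows into `table`, replacing or updating based on export_mode."""
    col_defs = ", ".join(
        f"{c} TEXT PRIMARY KEY" if c == primary_key else f"{c} TEXT"
        for c in columns
    )
    if export_mode == "FULL":
        # Full export: discard the old table and rebuild it from scratch.
        conn.execute(f"DROP TABLE IF EXISTS {table}")
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({col_defs})")
    placeholders = ", ".join("?" for _ in columns)
    # INSERT OR REPLACE overwrites an existing row with the same primary key,
    # which is what lets an incremental export update the table in place.
    conn.executemany(
        f"INSERT OR REPLACE INTO {table} VALUES ({placeholders})", rows
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
cols = ["genre_id", "parent_id", "name"]
# Hypothetical rows, for illustration only.
apply_export(conn, "genre", cols, "genre_id",
             [("1", "", "Music"), ("2", "1", "Jazz")], "FULL")
apply_export(conn, "genre", cols, "genre_id",
             [("2", "1", "Jazz (renamed)"), ("3", "1", "Pop")], "INCREMENTAL")
names = [r[0] for r in conn.execute("SELECT name FROM genre ORDER BY genre_id")]
print(names)
```

The primary key is what makes the incremental path safe: a row that already exists is overwritten rather than duplicated.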

And here we have just the first few rows of content of the genre feed.

So, we have the genre number, the parent genre, which, in this case, would be music, and Jazz, Latin, New Age, Pop, etc., what you would expect to find in here.
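A minimal Python sketch of reading a file shaped like the one just described is shown below. It assumes comment rows start with "#", that ASCII 1 separates fields and ASCII 2 ends each record, and that the header tags are spelled primaryKey, dbTypes, and exportMode. Those spellings, the types, and the sample values are assumptions, so check them against a real download.

```python
FIELD_SEP = "\x01"   # assumed: ASCII 1 separates fields
RECORD_SEP = "\x02"  # assumed: ASCII 2 ends each record

def parse_epf(text):
    """Return (metadata, rows) parsed from EPF-style text."""
    meta = {"columns": None, "primaryKey": [], "dbTypes": [], "exportMode": None}
    rows = []
    for record in text.split(RECORD_SEP):
        record = record.lstrip("\n")
        if not record:
            continue
        if record.startswith("#"):
            body = record[1:]
            if meta["columns"] is None:
                # The first comment row lists the column names.
                meta["columns"] = body.split(FIELD_SEP)
                continue
            tag, _, value = body.partition(":")
            if tag in ("primaryKey", "dbTypes"):
                meta[tag] = value.split(FIELD_SEP)
            elif tag == "exportMode":
                meta[tag] = value
        else:
            rows.append(dict(zip(meta["columns"], record.split(FIELD_SEP))))
    return meta, rows

# Hypothetical sample; the IDs, dates, and types are made up.
sample = (
    "#export_date\x01genre_id\x01parent_id\x01name\x02\n"
    "#primaryKey:genre_id\x02\n"
    "#dbTypes:BIGINT\x01INTEGER\x01INTEGER\x01VARCHAR(200)\x02\n"
    "#exportMode:FULL\x02\n"
    "1275000000000\x0134\x01\x01Music\x02\n"
    "1275000000000\x0111\x0134\x01Jazz\x02\n"
)
meta, rows = parse_epf(sample)
print(meta["columns"], meta["exportMode"], rows[1]["name"])
```

Everything needed to create the table (column names, types, primary key) comes from the comment rows, which is why an importer built this way can configure itself.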

So, let’s take a quick look at how this looks from the database side.

So, let’s switch over to a database visualizer here and connect to our EPF database, where we’ve already imported files from pretty much the entire EPF relational set of feeds.

So, here we have the list of all the different feeds over on the left, already imported.

Let’s take a quick look at the genre table that we were looking at.

So, here we have the structure of it, again, just four simple fields.

We have the primary key defined over here as genre ID, etc., and here’s what the content looks like from the database side.

So, as I said, the genre feed is a fairly simple one.

There are only a few columns.

It’s not very large.

Let’s take a look at application, which is a little more complex.

You’re going to see a good deal more structure here.

You’ve got maybe, oh, 16-20 different columns.

This is what the application feed looks like after being imported into the EPF database.

So, now, for demonstration purposes, we’re going to actually go ahead and delete this application table, and then we’re going to recreate it by doing an actual EPF import of the downloaded and unarchived application data feed.

Normally, you wouldn’t have to delete it from the database beforehand; EPF Importer would take care of that itself.

So, let’s switch back over to the command line here.

So, this is a typical command for running the EPF Importer tool, to do what we want.

So, it’s invoked just as you would invoke any ordinary command line script, EPFImporter.py.

-w indicates that we’re using a whitelist.

We don’t want to import here during the demo, all of the files that we downloaded.

We only want the application file.

So, that’s specified here.

You may notice the little caret and dollar sign.

The whitelist is defined as a set of regular expressions, for those of you familiar with regular expressions.

So, you can quite sophisticatedly filter out which files you would or wouldn’t want to import.

For those of you not familiar with regular expressions, don’t worry.

You need to learn very little of them in order to use EPF Importer.
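To show why the caret and dollar sign matter, here is a small Python sketch of whitelist and blacklist filtering with regular expressions. The function and the file names are hypothetical, and how EPF Importer itself applies its patterns is an assumption; the sketch uses `re.search`.

```python
import re

def select_files(names, whitelist=(), blacklist=()):
    """Keep names matching any whitelist pattern and no blacklist pattern."""
    allow = [re.compile(p) for p in whitelist]
    deny = [re.compile(p) for p in blacklist]
    selected = []
    for name in names:
        if allow and not any(p.search(name) for p in allow):
            continue  # a whitelist was given and this name is not on it
        if any(p.search(name) for p in deny):
            continue  # explicitly excluded
        selected.append(name)
    return selected

# Hypothetical file names, for illustration only.
files = ["application", "application_detail", "artist", "genre"]
anchored = select_files(files, whitelist=[r"^application$"])
unanchored = select_files(files, whitelist=[r"application"])
print(anchored)
print(unanchored)
```

With the anchors, only the file named exactly "application" is selected. Without them, the pattern also matches any file whose name merely contains that word.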

And, finally, here, we have simply the path to the directory containing the set of files from which we’re going to import application.

So, let’s go ahead and run this right now and begin an EPF import of the application feed.

So, it’s running now.

It logs a number of things.

EPF Importer is configured to log both to the console and also to a rotating set of log files.

Each time you do an import, it creates a new log file, very easily searchable.
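One way to get that logging behavior with the Python standard library is sketched below. It is an illustration and not EPF Importer's actual configuration: the logger name, file name, and backup count are assumptions.

```python
import logging
import logging.handlers
import os
import tempfile

def make_logger(log_path, backups=5):
    """Log to the console and to a file that rolls over on every run."""
    logger = logging.getLogger("epf_sketch")
    logger.setLevel(logging.INFO)
    for handler in list(logger.handlers):  # drop handlers from earlier runs
        logger.removeHandler(handler)
        handler.close()
    needs_rollover = os.path.exists(log_path) and os.path.getsize(log_path) > 0
    file_handler = logging.handlers.RotatingFileHandler(
        log_path, backupCount=backups
    )
    if needs_rollover:
        # Move the previous run's log aside (import.log -> import.log.1).
        file_handler.doRollover()
    formatter = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    for handler in (logging.StreamHandler(), file_handler):
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger

log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "import.log")
make_logger(log_path).info("first run")
make_logger(log_path).info("second run")
print(sorted(os.listdir(log_dir)))
```

Each run starts a fresh file, and older runs are kept under numbered suffixes up to the backup count.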

This is going to take just a little bit of time to import, so while we’re waiting for that to complete, let’s take a look just briefly at the code of EPF Importer itself.

EPF Importer is written in pure Python.

It’s not terribly long.

It’s a couple of thousand lines of code, total, and consists of three modules.

EPF Parser, this is the part of the code that knows how to parse the data from the EPF files.

It parses the metadata directly from those comment headers at the top, and knows how the table will need to be created, based on that metadata, applying the primary key constraints, etc.

EPF Ingester is the part of the code that communicates with the MySQL database.

It knows how to connect to it, and it also knows how to actually create and populate the tables in the correct manner, using the data that it retrieves from the parser.

So, we mentioned that this is released as sample code.

One obvious modification you would make is if you wanted to import into some other kind of database, rather than MySQL. We chose that simply because it’s freely available and well supported, but if you wanted to use, say, a Postgres database or an Oracle database, this would be the Python module that you would be modifying or replacing.

And, finally, EPF Importer, itself, is what you call from the command line.

It parses the command line arguments, applies constraints from a config file, which you can supply, etc.

So, the import is probably finished by now, so let’s go back to the terminal here.

There we go.

It’s finished.

Total import time for all directories about 42 seconds, not too shabby.

And now let’s take a look at what we’ve actually accomplished here.

Let’s do a quick refresh of our table list, and now application has returned, and here it is in all its glory.

Here we have about 210,000 rows.

This is actually not production data.

If it was, we’d have 225,000.

Anyway, so this has all been repopulated automatically.

All of the constraints have been applied, the primary key, everything else.

The table is recreated, and it required virtually no human configuration of the MySQL database.

So, once you have all of this data imported into your own personal database, what sorts of things can you do with it?

Well, you can, you know, do whatever you want, but, as a couple of concrete examples, we have a couple of queries prepared here.

So, the first query here would be to find the most popular apps in the U.S. Store, in the genre of productivity, that contain “keynote” in the description field.

And I won’t go into too much detail on the actual SQL.

Those of you who know SQL will see some familiar things, but you’re selecting a certain number of fields, performing a join on a different table from the table list, etc.

So, let’s go ahead and run this.

And it goes very quickly, and here we have 16 rows, which are all of the applications that meet this set of parameters.

So, let’s now do a slightly more sophisticated query.

We didn’t pull down all of the available columns from this, but one that’s not in the table that we were querying is price.

That exists in another table, so let’s take a look at a second query.

This will actually retrieve, essentially, the same set of data as before, but this time it will also retrieve the prices from the separate price table.

Let’s go ahead and run this, and there we have the list of the same apps, but with their retail price over there on the side.
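The shape of those two queries can be sketched in Python as below. The session does not show the actual SQL, so every table name, column name, and row here is hypothetical, and sqlite3 stands in for MySQL so the example runs on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data, for illustration only.
conn.executescript("""
CREATE TABLE application (application_id INTEGER PRIMARY KEY,
                          title TEXT, description TEXT);
CREATE TABLE popularity (application_id INTEGER, storefront TEXT,
                         genre TEXT, chart_rank INTEGER);
CREATE TABLE price (application_id INTEGER, storefront TEXT,
                    retail_price REAL);
INSERT INTO application VALUES
  (1, 'Slide Remote', 'Control Keynote from your phone'),
  (2, 'Task List', 'A simple to-do app'),
  (3, 'Deck Notes', 'Speaker notes for keynote talks');
INSERT INTO popularity VALUES
  (1, 'US', 'Productivity', 2),
  (2, 'US', 'Productivity', 1),
  (3, 'US', 'Productivity', 5);
INSERT INTO price VALUES (1, 'US', 0.99), (2, 'US', 0.0), (3, 'US', 1.99);
""")

# First query: popular U.S. productivity apps mentioning "keynote".
QUERY_POPULAR = """
SELECT a.title, p.chart_rank
FROM application AS a
JOIN popularity AS p ON p.application_id = a.application_id
WHERE p.storefront = 'US' AND p.genre = 'Productivity'
  AND a.description LIKE '%keynote%'
ORDER BY p.chart_rank
"""

# Second query: same filter, plus the price from a separate table.
QUERY_WITH_PRICE = """
SELECT a.title, p.chart_rank, pr.retail_price
FROM application AS a
JOIN popularity AS p ON p.application_id = a.application_id
JOIN price AS pr ON pr.application_id = a.application_id
                AND pr.storefront = p.storefront
WHERE p.storefront = 'US' AND p.genre = 'Productivity'
  AND a.description LIKE '%keynote%'
ORDER BY p.chart_rank
"""

popular = conn.execute(QUERY_POPULAR).fetchall()
with_price = conn.execute(QUERY_WITH_PRICE).fetchall()
print(popular)
print(with_price)
```

The second query differs from the first only by one extra join, which is the pattern for pulling in any field that lives in another table.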

So, that’s just scratching the surface of the kind of things you can do using EPF Importer and the EPF Relational or Flat feeds.

Back to you, Mark.


Mark Miller: Thanks, Rick.

[applause] So, yeah, a lot of stuff you can do with EPF and EPF Importer.

Again, sample code, check it out.

A lot of possibilities.

So, next on the list, we have Web iMix.

Web iMix is a way for you to dynamically generate playlists.

If you have a library of content, and you know your customers are interested in some subset, you could put that all together in a playlist just for them, and they can buy it with one click.

The important part here is that you can do it just by creating a URL.

It’s not like the regular iMix feature that you might be familiar with.

You don’t have to pre-upload or predefine anything, via the iTunes client.

It’s all done by the URL.

And here’s what it looks like in the iTunes client.

So, how do we do that?

Well, here’s an example URL, and it’s obviously pretty complicated, but let’s highlight some key features.

First of all, we’ve got a description.

The description is URL encoded, added to the URL, and it appears in the iTunes client.

We also have a title that titles the playlist and a list of IDs that specify what content we want in this playlist.

Now, you can add songs, and albums, and music videos, and movies, I believe, to this list.

Now, here’s the meat of how Web iMix works.

You need to create a special affiliate account with iTunes first, and there’ll be a link for you to do that, an email address, rather.

What that will get you is a WD ID that identifies you as a Web iMix affiliate, and you apply that to the URL you create and then sign the URL with a secret that we share, and attach that hash to the URL in the key parameter.
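The general pattern of building and signing such a URL can be sketched in Python. The session only says that the description is URL encoded, that the URL carries a title, a list of IDs, and the affiliate identifier, and that a hash made with the shared secret goes in the key parameter. The base URL, every parameter name other than key, and the use of HMAC-SHA256 are assumptions made purely for illustration; the real signing scheme comes with the affiliate account.

```python
import hashlib
import hmac
from urllib.parse import urlencode

def build_signed_playlist_url(base_url, title, description, item_ids,
                              affiliate_id, secret):
    """Build a playlist URL and append a signature in the `key` parameter."""
    params = [
        ("title", title),                              # hypothetical name
        ("description", description),                  # hypothetical name
        ("ids", ",".join(str(i) for i in item_ids)),   # hypothetical name
        ("affiliateId", affiliate_id),                 # hypothetical name
    ]
    unsigned = base_url + "?" + urlencode(params)  # urlencode escapes values
    # Assumed scheme: HMAC-SHA256 over the unsigned URL with the shared secret.
    signature = hmac.new(secret.encode("utf-8"),
                         unsigned.encode("utf-8"),
                         hashlib.sha256).hexdigest()
    return unsigned + "&" + urlencode([("key", signature)])

url = build_signed_playlist_url(
    "https://example.com/webimix",   # placeholder, not the real endpoint
    title="Road Trip",
    description="Songs for a road trip",
    item_ids=[101, 102],             # made-up content IDs
    affiliate_id="12345",            # made-up identifier
    secret="shared-secret",
)
print(url)
```

Because the signature covers the whole URL, changing the title or the list of IDs without the secret would produce a key that no longer matches.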

And that’s Web iMix.

It’s a simple feature, but you can do a lot of great stuff with it.

And, finally, we have a special feature that might be the most powerful tool in this whole arsenal, and that’s this Resources website.

We launched it about a few months ago, and it’s at

It is a great way for you to stay on top of new features, any sort of news, tips and tricks that we can come up with, FAQs.

There’s a lot of information there, so I urge you to check it out and bookmark it.

Add it to your RSS list.

Lots of information, and it’s updated frequently.

So, that’s our Tools and Services.

In summary, the affiliate program is a great way to earn bounties and promote your content.

To get started, check out Link Maker, great way to create a few links.

But if you find that doesn’t really scale to what your needs are, there’s other tools.

RSS lets you keep track of what’s hot on the iTunes Store.

With Search API, you can come up with queries to find what’s going on in a particular artist’s realm or something like that.

EPF for doing some heavyweight data processing, and there’s EPF Importer to help you get started with that.

And, finally, Web iMix for creating custom playlists.

And, again, this website is a great resource, and I really urge you to check it out.

If you’re interested in what’s available, the Resources website is a great way to start.

And if you haven’t yet, sign up with an affiliate network.

Get started in this process, and start earning some bounties.

As I said, third time, for documentation and news, affiliate resources.

And our evangelist for this session is Mark Malone.
