API/Hacking

Revision 1 as of 2008-07-31 17:21:33

Hacking the Launchpad web service

Canonical provides a Python client, launchpadlib, for reading and writing to Launchpad's web service. But there are many situations where you wouldn't use launchpadlib: if you're not a Python programmer, if you want to write an Ajax client that runs in a web browser, if launchpadlib is too heavyweight for what you want to do, or if you just want to understand what's going on between the client and the server.

This document describes the HTTP resources published through Launchpad's web service. It shows you how to read and write information about those resources by making HTTP requests. It assumes you have basic knowledge of how a web browser and web server interact.

Launchpad Resources

Every object in Launchpad--everything you might think of as having a separate identity--has its own URL in the Launchpad web service. You can bookmark this URL and pass it around. You can also manipulate the underlying Launchpad object by making HTTP requests to its URL. Everything means everything: big things like people, teams, bugs, bug tasks, and projects, all the way down to team memberships, bug watches, and the languages people speak.

(This is the plan, anyway. We're still working to expose all of Launchpad's objects through the web service, and many objects that are exposed don't yet publish much useful information.)

In the following sections I'll show you HTTP requests you can make and the responses you'll get back. I'll be skipping over the fact that you'll need to digitally sign those requests using a set of OAuth credentials, or they'll fail with response code 401 ("Unauthorized"). To see how to sign a request, see "Request Signing" below.

An entry: your user account

Let's consider one of the objects exposed by the web service: your own user account. The URL to your user account in Launchpad's web service is "http://api.launchpad.net/beta/~{your-user-name}": for instance, "http://api.launchpad.net/beta/~salgado". If you send a GET request to "http://api.launchpad.net/beta/+me" you'll be redirected to your user account.

(How are you supposed to know the URL to your user account without reading this document? That's a good question which we'll deal with later. But for now, note that the URL to your user account in the Launchpad web application looks like "http://www.launchpad.net/~salgado", and that "http://www.launchpad.net/+me" will redirect you to your user account on the website.)

GET

To find out about your user account you send an HTTP GET request to that URL:

   GET /beta/~your-user-name HTTP/1.1
   Host: api.launchpad.net

(Again, to actually make this request, you'll need to get an OAuth credential and digitally sign the request, as described below in "Request Signing.")

You'll get back a response document containing a JSON hash:

   200 OK
   Content-type: application/json

   {"languages_collection_link": "http:\/\/api.launchpad.net\/beta\/~your-user-name\/languages", "members_collection_link": "http:\/\/api.launchpad.net\/beta\/~your-user-name\/members", ... }

That looks ugly as a string. It looks better when you deserialize it from JSON into a native-language data structure. Here's what I get when I turn it into a Python dictionary.

{u'admins_collection_link': u'http://api.launchpad.net/beta/~your-user-name/admins',
 u'confirmed_email_addresses_collection_link': u'http://api.launchpad.net/beta/~your-user-name/confirmed_email_addresses',
 u'date_created': u'2005-06-06T08:59:51.619713+00:00',
 u'deactivated_members_collection_link': u'http://api.launchpad.net/beta/~your-user-name/deactivated_members',
 u'display_name': 'Your name here',
 u'expired_members_collection_link': u'http://api.launchpad.net/beta/~your-user-name/expired_members',
 u'hide_email_addresses': False,
 u'homepage_content': None,
 u'indirect_participations_collection_link': u'http://api.launchpad.net/beta/~your-user-name/indirect_participations',
 u'invited_members_collection_link': u'http://api.launchpad.net/beta/~your-user-name/invited_members',
 u'irc_nicknames_collection_link': u'http://api.launchpad.net/beta/~your-user-name/irc_nicknames',
 u'is_team': False,
 u'is_valid': False,
 u'jabber_ids_collection_link': u'http://api.launchpad.net/beta/~your-user-name/jabber_ids',
 u'karma': 0,
 u'languages_collection_link': u'http://api.launchpad.net/beta/~your-user-name/languages',
 u'latitude': None,
 u'longitude': None,
 u'mailing_list_auto_subscribe_policy': u'Ask me when I join a team',
 u'members_collection_link': u'http://api.launchpad.net/beta/~your-user-name/members',
 u'members_details_collection_link': u'http://api.launchpad.net/beta/~your-user-name/members_details',
 u'memberships_details_collection_link': u'http://api.launchpad.net/beta/~your-user-name/memberships_details',
 u'mugshot_link': u'http://api.launchpad.net/beta/~your-user-name/mugshot',
 u'name': u'your-user-name',
 u'open_membership_invitations_collection_link': u'http://api.launchpad.net/beta/~your-user-name/open_membership_invitations',
 u'participants_collection_link': u'http://api.launchpad.net/beta/~your-user-name/participants',
 u'participations_collection_link': u'http://api.launchpad.net/beta/~your-user-name/participations',
 u'preferred_email_address_link': u'http://api.launchpad.net/~your-username/+email/your.address@foo.com',
 u'proposed_members_collection_link': u'http://api.launchpad.net/beta/~your-user-name/proposed_members',
 u'resource_type_link': u'http://api.launchpad.net/beta/#person',
 u'self_link': u'http://api.launchpad.net/beta/~your-user-name',
 u'sub_teams_collection_link': u'http://api.launchpad.net/beta/~your-user-name/sub_teams',
 u'super_teams_collection_link': u'http://api.launchpad.net/beta/~your-user-name/super_teams',
 u'team_owner_link': None,
 u'time_zone': None,
 u'visibility': u'Public',
 u'wiki_names_collection_link': u'http://api.launchpad.net/beta/~your-user-name/wiki_names'}

That's a lot of information. You can consult the reference documentation (XXX) for more information on what each of the fields of this dictionary means. What's important is that there are three and only three kinds of fields:

  1. Atomic chunks of data. Examples here include 'date_created', 'display_name', and 'time_zone'. These may be of any JSON data type. Some of these can be modified: you can change your own 'display_name', but you can't change 'date_created'. (How do you know which fields can be modified? See "WADL Description" below.)
  2. Links to other resources. 'mugshot_link' points to your mugshot image. 'preferred_email_address_link' points to a resource that represents your preferred email address. Remember, every object in Launchpad has its own URL, even tiny objects like email addresses and languages. By Launchpad convention, all links between resources have field names that end in '_link'. Two of these links are especially important, and you'll find them present in every document of this sort.
    • 'self_link' is the URL to the resource itself. You can keep track of this URL and come back to it later to find this resource again. It's just like bookmarking a web page.
    • 'resource_type_link' is a link to a machine-readable description of this resource. This is how you can know which fields can be modified and which can't. In the larger sense, these descriptions are also how you can discover that the URL to your Launchpad user account is "http://api.launchpad.net/beta/~your-user-name". For more on this see "WADL Description" below.

  3. Links to collections of resources. A person in Launchpad can be associated with more than one email address, but only one of those can be the 'preferred' address at any one time. By Launchpad convention, all links to collections have field names that end in '_collection_link'. The 'preferred_email_address_link' field points to whatever address is currently preferred. The 'confirmed_email_addresses_collection_link' field points to a list containing all the addresses. For more on collections, see "The list of bugs" below.
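
Because the three kinds of fields follow a naming convention, a client can tell them apart mechanically. Here's a minimal sketch in Python. The helper name classify_fields and the shortened sample document are my own illustrations, not anything the web service defines.

```python
import json

def classify_fields(entry):
    """Sort an entry's field names into the three kinds by their suffix."""
    kinds = {"data": [], "links": [], "collection_links": []}
    for name in sorted(entry):
        # Check the longer suffix first: '_collection_link' also ends in '_link'.
        if name.endswith("_collection_link"):
            kinds["collection_links"].append(name)
        elif name.endswith("_link"):
            kinds["links"].append(name)
        else:
            kinds["data"].append(name)
    return kinds

# A shortened, hand-written stand-in for the JSON document shown earlier.
body = r'''{"display_name": "Your name here",
            "karma": 0,
            "self_link": "http:\/\/api.launchpad.net\/beta\/~your-user-name",
            "languages_collection_link": "http:\/\/api.launchpad.net\/beta\/~your-user-name\/languages"}'''

person = json.loads(body)   # the JSON parser turns "\/" back into "/"
kinds = classify_fields(person)
print(kinds)
```

The order of the two suffix checks matters: testing for '_link' first would file every collection link under plain links.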

Modifying resources

It's your user account; you should be able to change it through the web service. The simplest way to do this is to take the document you received from a GET request, modify it so that it says what you want, and send it back to the server with a PUT request.

Let's say I want to change my display name. The document I got in the previous section looks like this:

    {
       ...
      u'display_name': 'Your name here',
       ...
    }

Since I parsed that document into a data structure (call it 'person'), it's easy for me to change that data structure with code. Here's Python code that will work:

    person['display_name'] = 'New display name'

Then I can turn the data structure back into a JSON string. Now the string looks like this:

    {..., "display_name": "New display name", ...}

Now I can send the document back to the server with PUT:

    PUT /beta/~{your-user-name} HTTP/1.1
    Host: api.launchpad.net
    Content-type: application/json

    {..., "display_name": "New display name", ...}

The response should indicate that my changes were made:

    200 OK
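
The modify-and-PUT round trip can be sketched with Python's standard library. This is only an illustration: build_put_request is a hypothetical helper, the 'person' dictionary is a stand-in for a real GET response, and the request is built but never sent because it carries no OAuth signature.

```python
import json
import urllib.request

def build_put_request(url, document):
    """Build (but do not send) a PUT request carrying `document` as JSON."""
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )

# Stand-in for the dictionary parsed from an earlier GET response.
person = {"display_name": "Your name here", "karma": 0}
person["display_name"] = "New display name"

request = build_put_request("http://api.launchpad.net/beta/~your-user-name", person)
print(request.get_method(), request.full_url)
# Sending it would be urllib.request.urlopen(request), but an unsigned
# request is rejected with 401, so this sketch stops here.
```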

PATCH

The PUT technique is very convenient when you already have a document describing the resource you want to modify. If you don't have such a document, you don't have to create the whole thing. You can create a smaller document from scratch, and only mention the fields you want to change:

    {"display_name": "New display name"}

You can send this document to the server as part of a PATCH request:

    PATCH /beta/~{your-user-name} HTTP/1.1
    Host: api.launchpad.net
    Content-type: application/json

    {"display_name": "New display name"}

Again, the response should be simple:

    200 OK

Note that PATCH is not yet an official HTTP method. It's defined in an Internet-Draft (http://tools.ietf.org/html/draft-dusseault-http-patch-11). Because it's not standard, you might not be able to use it from your web client. In particular, you can't use PATCH if you're writing an Ajax application.

Some of a resource's fields are links to other resources: for instance, your preferred email address.

    print(person['preferred_email_address_link'])
    # http://api.launchpad.net/~your-username/+email/your.address@foo.com

Since the current value is expressed as a URL, you change the value by changing the URL. In this PATCH request I change my preferred_email_address_link so that it points to another of the 'email address' type resources associated with my user account.

    PATCH /beta/~{your-user-name} HTTP/1.1
    Host: api.launchpad.net
    Content-type: application/json

    {"preferred_email_address_link":
     "http://api.launchpad.net/~your-username/+email/another.address@bar.com"}

How did I find that link? Well, you can get a collection of all your confirmed email addresses by following the "confirmed_email_addresses_collection_link". (See "the list of bugs" below to learn what a collection looks like.) Each email address has its own permanent URL, accessible as its 'self_link' field. Stick one of those URLs in the document describing your user account, and you can make a request that changes which email address is your 'preferred' one. In Python code the URL change might look like this:

    person['preferred_email_address_link'] = new_email_address['self_link']
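
Putting those pieces together, here's a rough sketch of choosing a different address from the collection and building the PATCH document. The pick_other_address helper and both sample documents are hand-written assumptions; real email address entries carry more fields than 'self_link'.

```python
import json

def pick_other_address(collection, current_link):
    """Return the self_link of the first entry that is not the current one."""
    for entry in collection["entries"]:
        if entry["self_link"] != current_link:
            return entry["self_link"]
    return None

# Hand-written stand-ins for the person document and the collection of
# confirmed email addresses.
person = {"preferred_email_address_link":
          "http://api.launchpad.net/~your-username/+email/your.address@foo.com"}
confirmed = {"total_size": 2, "entries": [
    {"self_link": "http://api.launchpad.net/~your-username/+email/your.address@foo.com"},
    {"self_link": "http://api.launchpad.net/~your-username/+email/another.address@bar.com"},
]}

new_link = pick_other_address(confirmed, person["preferred_email_address_link"])
# Only the field being changed needs to appear in a PATCH document.
patch_body = json.dumps({"preferred_email_address_link": new_link})
print(patch_body)
```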

The WADL description (again, see below) tells you which links you're allowed to modify. This information is also in the reference documentation.

You can never change a link to a collection. The link to the collection of your confirmed email addresses will always be "http://api.launchpad.net/beta/~{your-user-name}/confirmed_email_addresses".

Error handling

If something goes wrong with your request, you'll probably get a response code of 400 ("Bad Request") instead of 200 ("OK"). The body of the response will tell you what was wrong with your request. For instance, if you try to send PUT or PATCH data in a format other than JSON...

    PATCH /beta/~{your-user-name} HTTP/1.1
    Host: api.launchpad.net

    display_name=New display name

You'll get this response:

    400 Bad Request
    Content-type: text/plain

    Entity-body was not a well-formed JSON document.
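
In client code it's usually convenient to turn an error response like this into an exception. Here's one possible sketch. WebServiceError and check_response are names I made up, and the function takes a status code and body directly so it doesn't depend on any particular HTTP library.

```python
class WebServiceError(Exception):
    """Raised when the web service reports a client error."""

    def __init__(self, status, message):
        super().__init__("%d: %s" % (status, message))
        self.status = status
        self.message = message

def check_response(status, body):
    """Return the body for a 2xx status; raise WebServiceError for 4xx."""
    if 200 <= status < 300:
        return body
    if 400 <= status < 500:
        # The response body explains what was wrong with the request.
        raise WebServiceError(status, body.strip())
    raise RuntimeError("Unexpected status %d" % status)

try:
    check_response(400, "Entity-body was not a well-formed JSON document.\n")
except WebServiceError as error:
    caught = error
    print(caught.status, caught.message)
```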

A collection: the list of bugs

The user account resource is a typical example of what we call an "entry resource": one that represents one specific thing. The other main sort of resource is what we call a "collection resource": one that acts as a container for a number of other resources.

As with entry resources, every collection resource has its own URL that you can bookmark or pass around. Send a GET request to a collection resource, and you'll get a JSON document describing the collection.

One interesting collection is the list of filed bugs. Its address is "http://api.launchpad.net/beta/bugs". Send a GET request to that URL and you'll get back a JSON document that looks like this:

    {
      u'total_size' : 252673,
      u'next_collection_link' :
        u'http://api.launchpad.net/beta/bugs?ws.start=75&ws.size=75',
      u'resource_type_link' : u'http://api.launchpad.net/beta/#bugs',
      u'entries' : [ ... ]
    }

All collection resources serve JSON documents that look like this, whether they're collections of bugs, people, bug tasks, email addresses, languages, or whatever. It's always a JSON hash with keys called 'total_size', 'resource_type_link', and 'entries'. The 'total_size' field is the number of items in the collection, and 'resource_type_link' is a link to a machine-readable description of the collection (see "WADL Description" below). The 'entries' field contains the actual entries.

Except of course it doesn't contain *all* the entries. Putting over 250,000 bugs in one document would be crazy. Launchpad's web service does the same thing as the Launchpad website: it sends you one page of bugs at a time, and includes links (where appropriate) to the next and previous pages. So the 'entries' field here is a JSON list containing 75 JSON hashes, each describing one bug. Each hash contains the same information as if you'd sent a GET request to that bug's 'self_link'.

If you need more than 75 bugs, you can send a GET request to the 'next_collection_link'. If you need some other number of bugs, or you want to start from item 20 in the list instead of the first item, you can manually vary the 'ws.start' and 'ws.size' parameters. Sending a GET request to http://api.launchpad.net/beta/bugs?ws.start=9&ws.size=3 would get you three bugs: the ones that would be accessible from "collection['entries'][9:12]" if you'd sent GET to http://api.launchpad.net/beta/bugs and retrieved the first 75.
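
Both techniques, building a page URL by hand and following 'next_collection_link', can be sketched in a few lines of Python. The function names are my own, the 'fetch' argument is a hypothetical callable that returns the parsed JSON for a URL, and the fake pages (including the 'id' field) are invented so the sketch runs without touching the network. It also assumes the last page simply omits 'next_collection_link'.

```python
from urllib.parse import urlencode

def page_url(collection_url, start, size):
    """Build the URL for one page of a collection."""
    return collection_url + "?" + urlencode({"ws.start": start, "ws.size": size})

def iter_entries(first_url, fetch):
    """Yield every entry, following 'next_collection_link' page by page.

    `fetch` is a caller-supplied function (hypothetical here) that takes a
    URL and returns the parsed JSON document for it.
    """
    url = first_url
    while url is not None:
        page = fetch(url)
        for entry in page["entries"]:
            yield entry
        # Assumed: the final page has no 'next_collection_link' key.
        url = page.get("next_collection_link")

# Two fake pages standing in for real responses.
fake_pages = {
    "http://api.launchpad.net/beta/bugs": {
        "total_size": 3,
        "entries": [{"id": 1}, {"id": 2}],
        "next_collection_link": "http://api.launchpad.net/beta/bugs?ws.start=2&ws.size=2",
    },
    "http://api.launchpad.net/beta/bugs?ws.start=2&ws.size=2": {
        "total_size": 3,
        "entries": [{"id": 3}],
    },
}

bug_ids = [bug["id"] for bug in
           iter_entries("http://api.launchpad.net/beta/bugs", fake_pages.__getitem__)]
print(page_url("http://api.launchpad.net/beta/bugs", 9, 3))
print(bug_ids)
```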

For consistency's sake, _all_ collection resources serve JSON hashes with 'total_size' and the rest, even collections which are very unlikely to have more than 75 entries, like someone's list of spoken languages.

Named operations

All entry resources support GET, PUT, and PATCH. All collection resources support GET. There are also custom operations available on specific resources--we call these "named operations" because they're identified by name rather than by one of the standard HTTP methods.

These operations are described in the reference documentation (and in the WADL file), and they're different for every kind of resource, so I won't cover them all here. What I will do is give a couple examples and talk about what all named operations have in common.

A named operation either modifies the Launchpad data set or it doesn't. If it's read-only, then you access it with HTTP GET. If it's a write operation, you need to access it with HTTP POST.

Read operations (GET)

The person search operation is a good example of a read operation. Launchpad exposes a list of people at http://api.launchpad.net/beta/people, but for most applications you don't want to page through the user accounts the way you would on the Launchpad person list. Usually you want to _filter_ that huge list to find specific people.

To invoke the person search operation you make a GET request to this URL:

    http://api.launchpad.net/beta/people?ws.op=find&text={text}

where "{text}" is the text you want to search for.

(Again, you can find out about this named operation by reading the reference documentation or the WADL definition of http://api.launchpad.net/beta/people. There's no secret here.)

The response to a read operation can be any JSON document, but it's usually a JSON hash that looks exactly like the JSON representation of a collection resource. It's got 'total_size', 'entries', possibly 'next_collection_link', and so on. So getting http://api.launchpad.net/beta/people?ws.op=find&text=foo gives you the same kind of document as getting http://api.launchpad.net/beta/people, but there'll be a lot less data to process.

In general, you invoke a named operation on a resource by tacking the query parameter "ws.op={operation name}" onto the resource's URL. In this case, the resource was the collection of people and the name of the operation was "find". It's just like calling a method in a programming language: the resource is the object and the operation is the method. Any arguments to the method are appended as additional query parameters.
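
The method-call analogy translates almost directly into code. Here's a small sketch; named_operation_url is a hypothetical helper of mine, and it does nothing more than build the URL.

```python
from urllib.parse import urlencode

def named_operation_url(resource_url, operation, **arguments):
    """Build the GET URL that invokes a read-only named operation."""
    params = {"ws.op": operation}
    params.update(arguments)
    # urlencode takes care of escaping spaces and other special characters.
    return resource_url + "?" + urlencode(params)

url = named_operation_url("http://api.launchpad.net/beta/people", "find", text="foo")
print(url)
```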

Write operations (POST)

Team creation is a good example of a write operation. Launchpad treats teams the same as people, so when you create a team you're adding to the list of people. To invoke the team creation operation you make a POST request to the list of people:

    POST /beta/people HTTP/1.1
    Host: api.launchpad.net
    Content-Type: application/x-www-form-urlencoded

    ws.op=newTeam&name={name}&display_name={display_name}

Where {name} is the name you want for the new team, and {display_name} is how you want the team to be described. It's the same as for a read operation, except all your query arguments go into the body of the POST instead of into the URL.

Like read operations, write operations can return any JSON document. Most often, they return nothing--only a status code of 200 ("OK") to show that the operation was carried out. But operations that create new Launchpad objects, like newTeam, do something different. If you manage to create a team you'll see a response that looks like this:

    201 Created
    Location: http://api.launchpad.net/beta/~{name}

That's your indication that the team was created, and that you can find the new team at http://api.launchpad.net/beta/~{name}. Now you can go over to the new team and make additional HTTP requests to customize it, add memberships, and so on. In general, Launchpad's web service gives you the URLs to newly minted resources, rather than making you guess them.
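
A write operation can be sketched the same way as a read operation, with the arguments moved into the request body. In this illustration build_named_post and created_url are hypothetical helpers, the team name is made up, and the request is built but not sent (it would need to be signed first).

```python
import urllib.request
from urllib.parse import urlencode

def build_named_post(resource_url, operation, **arguments):
    """Build (but do not send) the POST request for a write operation."""
    fields = {"ws.op": operation}
    fields.update(arguments)
    return urllib.request.Request(
        resource_url,
        data=urlencode(fields).encode("ascii"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )

def created_url(status, headers):
    """Return the Location of a newly created resource, or None."""
    if status == 201:
        return headers.get("Location")
    return None

request = build_named_post("http://api.launchpad.net/beta/people", "newTeam",
                           name="my-team", display_name="My Team")
print(request.data.decode("ascii"))
```

Reading the new URL from the Location header, rather than assembling it from the team name, follows the advice above about not guessing URLs.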

Request Signing

If you send the GET requests given in this document as is to api.launchpad.net, you'll get a response code of 401 ("Unauthorized"). Launchpad's web service only responds to requests that have been digitally signed with a particular Launchpad user's authorization key.

This doesn't have to be *your* key. You can have a simple script that uses your own Launchpad authorization key, but you can also run a website that gathers its users' authorization keys and makes requests to the web service on their behalf. This is safe because your authorization keys have nothing to do with your Launchpad password. They're a way of delegating a limited set of privileges to some other program. If a program proves untrustworthy, you only have to revoke its key.

The standard HTTP authentication mechanisms (Basic and Digest) aren't sophisticated enough to handle these cases. That's why Launchpad has adopted the OAuth standard (http://oauth.net) for authentication. It's a little more work to set up than just sending your Launchpad username and password to the web service, but it's more versatile and more secure.

Getting credentials

A user can set up an authorization token from within Launchpad, by visiting http://www.launchpad.net/~{name}/+oauth-tokens. It's reasonable to ask your users to set up a token before they use your program, and provide their Launchpad credentials in a config file or as command-line arguments to your script. But you can also create a new set of credentials from within your application.

The basic workflow is always the same, but the details are different if you're writing a standalone application, versus creating a website.

  0. Pick a consumer key

The consumer key identifies your application and it should be hard-coded somewhere in your code. Everyone who uses your application will send the same consumer key.

We recommend you choose the name of your program as the consumer key. Don't append the version number unless you want your users to get new application keys for every new version. For this example I'll use the consumer key 'myconsumerkey'.

  1. Get a request token

Getting your program's user to create a new credential for the program is a multi-step process. The request token is a unique string that Launchpad uses to keep track of your program between steps.

To obtain a request token, send a POST request to http://www.launchpad.net/+request-token. (Note: *not* api.launchpad.net!) This is the same kind of POST a web browser does when it submits a form. You should submit your consumer key, the signature method (the string "PLAINTEXT"), and the signature (the string "&", which becomes "%26" once encoded) in form-encoded format.

So the HTTP request might look like this:

    POST /+request-token HTTP/1.1
    Host: www.launchpad.net
    Content-type: application/x-www-form-urlencoded

    oauth_consumer_key=myconsumerkey&oauth_signature_method=PLAINTEXT&oauth_signature=%26

The response should look like this:

    200 OK

    oauth_token_secret=&oauth_token={request token}

oauth_token_secret will be empty; we don't use it. oauth_token will be a random-looking string of letters and digits. Save this for step 3.
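
Here's a sketch of both halves of this step: encoding the form fields and parsing the reply. The helper names and the sample token 'abc123' are made up, and the signature method field uses the name the OAuth 1.0 standard gives it, oauth_signature_method. Nothing is sent over the network.

```python
from urllib.parse import urlencode, parse_qs

def request_token_body(consumer_key):
    """Form-encode the fields for the +request-token POST."""
    return urlencode({
        "oauth_consumer_key": consumer_key,
        "oauth_signature_method": "PLAINTEXT",
        "oauth_signature": "&",   # urlencode turns this into %26
    })

def parse_token_response(body):
    """Parse a form-encoded token response into a plain dict."""
    # keep_blank_values=True so an empty oauth_token_secret is not dropped.
    parsed = parse_qs(body, keep_blank_values=True)
    return {name: values[0] for name, values in parsed.items()}

print(request_token_body("myconsumerkey"))
tokens = parse_token_response("oauth_token_secret=&oauth_token=abc123")
print(tokens)
```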

  2. Authenticate the user

Now the user needs to 1) log in to Launchpad, and 2) tell Launchpad how much authority they're delegating to your program. You need to get them to visit the following URL in their web browser:

    http://www.launchpad.net/+authorize-token?oauth_token={request token}

If your program is a website that your users visit, you can send them an HTTP redirect. Be sure to also specify the 'oauth_callback' field as a URL on your website.

    http://www.launchpad.net/+authorize-token?oauth_token={request token}&oauth_callback={URL to within your website}

Once the user delegates some of their privileges to your website Launchpad will redirect the user back to that URL, so that they can resume using your site.

If your program runs on the user's computer rather than through their web browser, you don't have to worry about redirecting back to your web page. But you do have to worry about opening the Launchpad page in their web browser in the first place. Take a look at the open_url_in_browser() function defined in launchpadlib's launchpad.py; it works well on most Linux systems. Just open up their web browser to the +authorize-token page.

If your program isn't running in the web browser, how are you supposed to know when the user is done with the +authorize-token page? There's no 'oauth_callback' equivalent that Launchpad can use to send a signal to your client-side program. What you need to do is have the _end-user_ tell you when they're done with +authorize-token.

The launchpadlib library prints an explanatory message after it opens +authorize-token in your web browser. It waits for the end-user to authorize access through their web browser, and then switch back to the launchpadlib window and hit return. If you're writing a GUI program, you can have the end-user click a button once they're done authorizing your program to talk to Launchpad on their behalf.

For an example of good interface design around these constraints, look at F-Spot's Flickr integration. The first time you export a photo to Flickr you need to click an "Authorize" button. This opens up a web browser to a page on Flickr. You log in to Flickr and authorize F-Spot to access the Flickr web service on your behalf. Then you go back to F-Spot and click a "Complete Authorization" button. After that point, F-Spot can talk to Flickr with your credentials.

(Flickr doesn't use OAuth, but its system has the same architecture as OAuth, so the user interface can work the same way.)
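
The two flavors of the +authorize-token URL can be built with one small function. This is a sketch under my own assumptions: authorize_url is a hypothetical helper, 'abc123' is a made-up request token, and example.com stands in for your website.

```python
from urllib.parse import urlencode

AUTHORIZE_URL = "http://www.launchpad.net/+authorize-token"

def authorize_url(request_token, callback=None):
    """Build the URL the end-user must visit to authorize the request token."""
    params = {"oauth_token": request_token}
    if callback is not None:
        # Websites pass a callback so Launchpad can send the user back.
        params["oauth_callback"] = callback
    return AUTHORIZE_URL + "?" + urlencode(params)

desktop_url = authorize_url("abc123")
website_url = authorize_url("abc123", callback="http://example.com/done")
print(desktop_url)
print(website_url)

# A client-side program could then do something like:
#   import webbrowser
#   webbrowser.open(desktop_url)
#   input("Press Enter once you have authorized the application...")
```

The commented-out lines use Python's standard webbrowser module as one way to open the page; they're left as comments so the sketch doesn't launch a browser when you run it.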

  3. Exchange the request token for an access token

You can't do much with the request token. It's only good for coordinating with Launchpad while the user decides whether or not to let you make web service requests in their name. Once the user has delegated some of their authority to you, you need to exchange the request token for a new token that has their permissions associated with it.

If you're writing a website, you'll know this happens when Launchpad redirects your user back to the URL you specified as 'oauth_callback'. If you're writing a client-side program, you'll know when your user clicks the "Complete Authorization" button or hits enter or whatever it was you told them to do when they were done on the Launchpad side.

Now you make a GET request to http://www.launchpad.net/+access-token. Provide the request token and the signature (again the string "&", encoded as "%26") in the query string.

So the URL you GET should look like this:

    http://www.launchpad.net/+access-token?oauth_token={request token}&oauth_signature=%26

Basically, you're looking up a record using the request token as a key. The record was created when the end-user told Launchpad it was okay to delegate their authorization to you.

You should get a response that looks like this:

    200 OK

    oauth_token={access token}&oauth_token_secret={access token secret}

Put those two pieces of information in some persistent storage. You'll need them as part of every request you make to Launchpad's web service.
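
As a rough sketch of what "persistent storage" might mean for a small script, here's one way to parse the response and keep the result in a JSON file. The function names, the file name, and the sample token values are all invented, and a real program would want to put the file somewhere private rather than in a temporary directory.

```python
import json
import os
import tempfile
from urllib.parse import parse_qs

def parse_access_token(body):
    """Pull the access token and its secret out of the response body."""
    fields = parse_qs(body, keep_blank_values=True)
    return {"oauth_token": fields["oauth_token"][0],
            "oauth_token_secret": fields["oauth_token_secret"][0]}

def save_credentials(path, credentials):
    """Write the credentials to `path` as JSON."""
    with open(path, "w") as handle:
        json.dump(credentials, handle)

def load_credentials(path):
    """Read back credentials written by save_credentials."""
    with open(path) as handle:
        return json.load(handle)

credentials = parse_access_token("oauth_token=tok123&oauth_token_secret=sec456")
# A temporary directory keeps this sketch from cluttering your home directory.
path = os.path.join(tempfile.mkdtemp(), "launchpad-credentials.json")
save_credentials(path, credentials)
restored = load_credentials(path)
print(restored == credentials)
```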

Using credentials

Now that you've got an access token and a secret for a particular Launchpad user, you won't have to go through that again for that user. But there's still the matter of digitally signing your requests with that token.

Unlike the process of getting credentials, which is pretty specific to Launchpad, the process of digitally signing a request is completely mechanical. The mechanics are covered in detail in the OAuth standard (http://oauth.net/core/1.0/), and there are also OAuth libraries in most popular programming languages that can sign an HTTP request given an access token and secret. So I'm not going to go into much detail on how to sign a request. It's a general problem and there are plenty of places to go if you need help, and lots of sample code to look at.
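
That said, the simplest signature method in the OAuth 1.0 standard, PLAINTEXT, is short enough to sketch. What follows assumes that the web service accepts PLAINTEXT for ordinary requests the way it does for the token requests above, and that the consumer secret is empty; neither is confirmed here, so treat it as an illustration of the standard rather than a recipe. The function name and sample values are made up.

```python
from urllib.parse import quote

def plaintext_authorization_header(consumer_key, token, token_secret,
                                   timestamp, nonce, consumer_secret=""):
    """Build an OAuth 1.0 Authorization header using the PLAINTEXT method."""
    # PLAINTEXT signature: encoded consumer secret, '&', encoded token secret.
    signature = quote(consumer_secret, safe="") + "&" + quote(token_secret, safe="")
    params = [
        ("oauth_consumer_key", consumer_key),
        ("oauth_token", token),
        ("oauth_signature_method", "PLAINTEXT"),
        ("oauth_signature", signature),
        ("oauth_timestamp", str(timestamp)),
        ("oauth_nonce", nonce),
        ("oauth_version", "1.0"),
    ]
    # Header values are percent-encoded (again) and wrapped in double quotes.
    return "OAuth " + ", ".join(
        '%s="%s"' % (name, quote(value, safe="")) for name, value in params)

header = plaintext_authorization_header(
    "myconsumerkey", "tok123", "sec456", timestamp=1217524893, nonce="abc")
print(header)
```

In a real client the timestamp would come from the clock and the nonce would be a fresh random string for every request; they're fixed here only to keep the output repeatable. Other signature methods, such as HMAC-SHA1, are more involved and are best left to an OAuth library.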

WADL Description

Throughout this document I've revealed seemingly secret information about the capabilities of various resources. It makes intuitive sense that you should send a GET to a resource's URL to find out more about it, but how are you supposed to know that you can also send a GET to that URL plus "?ws.op=find"? The HTTP standard says (more or less) that if you PUT a document to a resource that supports PUT, the server should try to apply your new document to the underlying dataset. But how are you supposed to know that you're allowed to modify a person's "latitude" but not their "karma"?

Most web service providers put this sort of information in a big prose document that you're supposed to read when you write a client. We do have that big document (the reference documentation), but we also have a machine-readable document that describes the quirks of this particular web service: a WADL document.

In fact, the reference documentation is just a human-readable transformation of the WADL document. The launchpadlib Python library is a thin wrapper on top of a WADL library. Almost every interesting aspect of the web service is described in this file. You can use it as a basis for your own tools that talk to Launchpad. It's analogous to the HTML forms you use to manipulate a web site.

To get the file, make a GET request to the root of the web service, and ask for a WADL representation:

    GET /beta/ HTTP/1.1
    Host: api.launchpad.net
    Accept: application/vd.sun.wadl+xml

Every entry resource has a 'resource_type_link' that's an index into this document. "http://api.launchpad.net/beta/#person", for instance, is a reference to the XML tag in this document with the ID "person". That's the tag describing the capabilities of a "person" resource, and it's the value of 'resource_type_link' in the JSON representation of every "person" resource.
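
Following a 'resource_type_link' into the WADL file is just a matter of splitting off the URL fragment and finding the tag with that ID. Here's a minimal sketch. The find_resource_type helper is hypothetical, and the sample XML is a tiny hand-written stand-in: the real WADL file is far larger, and its tag names and namespace may not match what's shown.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urldefrag

def find_resource_type(wadl_text, resource_type_link):
    """Return the element whose id matches the link's fragment, or None."""
    fragment = urldefrag(resource_type_link).fragment
    root = ET.fromstring(wadl_text)
    # Match on the id attribute only, so tag names and namespaces don't matter.
    for element in root.iter():
        if element.get("id") == fragment:
            return element
    return None

# A tiny hand-written stand-in; the real WADL file is far larger and its
# element names and namespace may differ from this sketch.
sample_wadl = """<application>
  <resource_type id="person"><doc>A person or team.</doc></resource_type>
  <resource_type id="bugs"><doc>The collection of bugs.</doc></resource_type>
</application>"""

element = find_resource_type(sample_wadl, "http://api.launchpad.net/beta/#person")
print(element.tag, element.get("id"))
```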

What's not defined in this file? Mainly, information about our URL structure. You've already seen that you can get a description of any person in Launchpad by sending GET to http://api.launchpad.net/beta/~{name} and plugging in the name. This is a useful shortcut that can often save you a few HTTP requests, but the WADL file doesn't say anything about that. It's possible to put this information into WADL; we just haven't implemented it yet.