API Category

Winners of the WhitePages.com Developer API Contest

Four months ago we launched the WhitePages.com Developer API to the general public. At the time, we would have been excited to see a hundred active developers using the API within the first year. I can proudly report that we have greatly exceeded our initial expectations.

Conversations have sprung up on other online forums, people have asked for the API to be integrated into their favorite applications, and we now have over five hundred active developers. One fifth of those have queried our API over a thousand times, and many of those have exceeded ten thousand queries. Several small search engines have integrated our API into their results, and two developers have launched applications in the Apple iPhone store.

In conjunction with the launch of the API we announced a contest for the best iPhone application and the best social networking application to utilize the API. I’d like to give my thanks to all those developers who entered the contest. We saw some really great entries, but alas, there can be only two winners of the MacBook Pros we are awarding.

Adam Leonard of Caffeinated Cocoa won first place for his iPhone application ‘People’. ‘People’ is an excellent straight implementation of our services on the iPhone and allows people to save or merge pertinent results directly with their phone’s address book.

Trevor Hall took first in the social networking category for his facebook.com application ‘Visual Directory’. ‘Visual Directory’ encourages a real-world connection with your friends by finding and mapping likely matches and providing contact information.

Coming in second were Philippe Furlan with an integration of the reverse_phone API into his ‘SearchQuest GPS’ iPhone application and Bryce Baril of Market Outsider for PartyBot.

You can see more about the contest at developer.whitepages.com or via this press release.

This post is also cross-posted to our main blog.

API Videos FTW

We’ve put together two videos introducing the WhitePages.com API - introducing some of the team, the signup process, and the sample apps. Two of the core members of the API team, Dan (the lead developer) and Bruce (by day our QA Manager, by night our API documenter/sample herder/etc.), narrate grand tours through the developer portal. But don’t let me spoil the lead for you: You’ll laugh! You’ll cry! It’s better than Cats!

More about the Whitepages Developer API

Now that you’ve read Scott’s big picture posting about the new WhitePages.com public API offering, let me tell you a little about the down and dirty of developing our new API. Our data covers 180 million people and provides approximately 80% coverage of the US. When the opportunity came across my desk to build the API that would allow us to share that data, I was elated.

Let me start by giving you an overview of how we deal with the hard problems of searching those 180 million listings in under a quarter of a second and delivering them to the front-end website.

Some people think we have ‘just a database’ or ‘it’s just a website’. But what we do is hard work. We have multiple data vendors, some onsite and some offsite via their own API calls, all with differing data formats and the merge issues that result. Our onsite data takes up 3 terabytes of storage (with indexes), is rebuilt monthly with no identifiers to tie records together, and handles billions of queries per year.

We use Oracle, Postgres, MySQL, and BerkeleyDB, depending on which has the strengths we need for any given job. We handle residential, nicknames, households, business listings and work number listings. Our data can be bizarre with fractional streets, decimal house numbers and misleading names like streets named “North”.

Yes, “it’s just a website” that happens to power 1300 affiliate sites, does over 100 million searches and has 34 million unique users per month. We have tiered, redundant systems with strict privacy controls that allow for non-published numbers and our own opt-out list.

All of this is built using Linux, Apache mod_perl, and our own special sauce, and it runs on just 60 boxes (16 run our backend code). Our work includes an internal search API that is strong on speed, comprehensive in its searches, and absolutely inappropriate to turn loose on the world (some of our return keys have bizarre names). What we needed was an extensible platform that would allow us to wrap our own API, make it palatable and easier to work with, and provide multiple output formats.

Back in October, Colin (one of our Architects) and I sat down to sketch out what this would look like. We decided that we would leverage our known strengths and use Apache mod_perl, a YAML file for config, Oracle for user preferences, and an on-disk cache of those preferences to ensure reliability. It would have to be extensible to allow for new search types and versions, and we would have to allow for small developers, large partners who could send millions of queries/day, and internal use. We considered SOAP but decided that a RESTful interface was easier for more people to interact with. We would provide an XSD for people to validate the XML against and JSON output for those who were doing JavaScript. We would only rev the version number when the output format changed; within a version, everyone would get additional data entries and data fields as they became available.
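To make the "on-disk cache of those preferences" idea concrete, here is a minimal sketch in Python (our real code is Perl, and this is not it). Everything named here is an assumption for illustration: the `PreferenceStore` class, the `fetch_from_db` callable standing in for an Oracle lookup, and the JSON cache file. The idea is simply that every successful read refreshes a local copy, and a failed read falls back to the last good copy.

```python
import json
import os
import tempfile


class PreferenceStore:
    """Hypothetical preference loader with an on-disk fallback cache."""

    def __init__(self, fetch_from_db, cache_path):
        # fetch_from_db: any callable returning a JSON-serializable dict.
        # It stands in for the real database query (an assumption here).
        self._fetch = fetch_from_db
        self._cache_path = cache_path

    def load(self):
        try:
            prefs = self._fetch()
        except Exception:
            # Primary store unreachable: serve the last good copy from disk.
            # If no cache exists yet, the FileNotFoundError propagates.
            with open(self._cache_path, encoding="utf-8") as fh:
                return json.load(fh)
        self._write_cache(prefs)
        return prefs

    def _write_cache(self, prefs):
        # Write to a temp file in the same directory, then rename over the
        # old cache so a reader never sees a half-written file.
        directory = os.path.dirname(self._cache_path) or "."
        fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(prefs, fh)
        os.replace(tmp_path, self._cache_path)
```

With something like this in place, a request handler can keep answering with slightly stale preferences during a database outage instead of failing outright.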

We looked at writing our own user management system, but decided the way to go was to partner with Mashery and leave that and the community site up to their infrastructure while we focused on building the actual API.

We build OO Perl here so our first order of business after sketching out the rough requirements was to determine what classes would need to be built. We would need an Apache response handler which would handle the overall logistics, something to clean up and validate input, a class to take the output from the search and process it, and an output transformation class that would take that output and deliver it in whatever output format was requested. All of these factory classes would need to be versionable to allow for changes within our internal API as well as updates to our XSD as we build more functionality into our public API.
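For readers who think better in code, the class breakdown above might look roughly like the following Python sketch. It is an illustration only, not our Perl: the class names, the internal key names (`ph_nbr`, `disp_nm`), the `phone` parameter, and the `(version, format)` registry are all made up for the example, and the `search` argument stands in for the internal search API.

```python
import json
from xml.etree import ElementTree as ET


class InputValidator:
    """Cleans up and validates raw query parameters."""
    required = ("phone",)  # hypothetical required parameter

    def clean(self, params):
        cleaned = {k: v.strip() for k, v in params.items() if v and v.strip()}
        missing = [k for k in self.required if k not in cleaned]
        if missing:
            raise ValueError("missing parameter(s): " + ", ".join(missing))
        return cleaned


class ResultProcessor:
    """Maps internal return keys to public names and drops the rest."""
    key_map = {"ph_nbr": "phone", "disp_nm": "name"}  # invented key names

    def process(self, rows):
        return [
            {self.key_map[k]: v for k, v in row.items() if k in self.key_map}
            for row in rows
        ]


class JsonTransformer:
    content_type = "application/json"

    def render(self, listings):
        return json.dumps({"listings": listings}, sort_keys=True)


class XmlTransformer:
    content_type = "text/xml"

    def render(self, listings):
        root = ET.Element("listings")
        for listing in listings:
            node = ET.SubElement(root, "listing")
            for key, value in sorted(listing.items()):
                ET.SubElement(node, key).text = value
        return ET.tostring(root, encoding="unicode")


# Versioned factory: each (version, format) pair picks a transformer class.
TRANSFORMERS = {
    ("1.0", "json"): JsonTransformer,
    ("1.0", "xml"): XmlTransformer,
}


def make_transformer(version, fmt):
    try:
        return TRANSFORMERS[(version, fmt)]()
    except KeyError:
        raise ValueError(f"unsupported version/format: {version}/{fmt}") from None


def handle_request(params, search, version="1.0", fmt="xml"):
    """Plays the role of the response handler: validate, search, process, render."""
    cleaned = InputValidator().clean(params)
    listings = ResultProcessor().process(search(cleaned))
    transformer = make_transformer(version, fmt)
    return transformer.content_type, transformer.render(listings)
```

Adding a new output format or a new version then means registering one more entry in the factory table, without touching the handler.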

Once we got the generic framework worked out and determined that we could leverage it to handle the Mashery integration as well, it allowed us to bring Ewa onboard (giving her a good view of development from the other side of the fence). Ewa has been with WhitePages QA for just over three years and she is my go-to person if I need any question answered about testing our internal API. She stepped up and took point for Mashery integration without missing a beat in addition to her duties doing end to end testing of our final product.

All through November and into December, Colin and I worked out the details and wrote the code, taking the blueprints and making them real. By January 10th we had a rough and ready working version showcasing our three main search types. It was just in time, as Hack Week was looming and many members of our engineering team were chomping at the bit to get their hands on this API. You can see some of the products of Hack Week showcased in our sample applications section over at developer.whitepages.com. Hack Week also allowed Zine, a new member of our QA team, to hit the ground running and start devising new and intricate ways to torture our poor code before we gave the final stamp of approval.

Hack Week was a lot of fun for me: I babysat the code as it was being really used for the first time and I found out for myself how easy it was to extend the framework to deliver data that isn’t accessed from our internal API. How easy? Well in two days I had two totally different methods built, both of them accessing raw data directly and serving it out. It’s always a pleasure to find out that your design decisions really do work out the way you plan.

So what have we been doing since then? Writing the Technical Documentation that you see at developer.whitepages.com/docs, fixing the bugs we’ve found during the QA testing phase, writing and testing the Mashery integration code, working with business to allow us to expose nearly all of our data to you, and generally wondering when the other shoe would drop. This project has gone way too smoothly and we really couldn’t have done it without the help of our full team. I was reflecting on the number of people who have had a hand in this process and while I won’t name them here, the number exceeds 30 and spans nearly every functional group within the 20% of the company that it represents.

It’s been a fun couple of months and I can’t wait to see what else we come up with for the API. I’ve got my list but I’m even more excited to see what other people will come up with in their own wish lists.

Cheers!

Dan Sabath, API lead dev

Introducing the WhitePages.com API

From the days in Alex’s dorm room in 1996, to incorporating in 2000, to launching its first mobile site in 2006, WhitePages.com has been about doing one thing well - fast, easy, and free access to nationwide contact information for the people you care about.

It may be hard to remember, but before the interwebs, it wasn’t just that looking up people locally required either thumbing-through-the-book or paying $0.25 to call 411; if you wanted to find somebody in another state, you had to call XXX-555-1212 and hope you guessed right. (There were fewer area codes then, of course. Also, we didn’t have fire.)

Offline, not much has changed, except that the calls now cost $1.79. The data’s basically the same. And until about a year ago, the data on WhitePages.com, 411.com, and our other sites was all basically the same as you could get when you dialed 411 or visited our competitors. In the last year, though, we’ve gone from the 90 million people listed in the hundreds of thousands of pages of phone books to 180 million people across the US.

We’re proud of what we’ve done, and we’re proud of what we’ve built at WhitePages.com to make this a sustainable, profitable business, with user experiences that people recommend. As my first year here comes to a close, I’ve seen the care and pride that went into building the technology, the sales and business development businesses, the customer experience, etc. over the last 10 years.

But it’s not enough.

We believe in more than just our websites and our brands; we believe that contact information is a core building block of the internet experience, that connecting to the people you care about is a fundamental human need, and that keeping core contact information behind pay or login walls so that people can’t manage their offline social network without using your online social network makes it harder for everyone - ordinary people and developers alike. This data is already out there, for free, but unless you browse to websites like ours, you don’t find it (and often pay for it). As Tim O’Reilly wrote in O’Reilly Radar:

What really needs to be done is not just to connect the various social networks that do exist in internet network-of-networks style, but also to social-network enable our real social network apps: our IM, our email, our phone. Where, I keep asking vendors, is the Web 2.0 address book?

We aren’t there yet, and we’re not going to get there on our own. We’re a small team (but hiring!), and we’re going to focus on our core - building a great data set (meaningfully beyond what we have today), scalable technology, and a great experience for our website customers. There are dozens of apps available now that would benefit from having access to our data to help people find and connect. And while our business is built on top of the advertisements we serve when people visit our websites, we know this information wants to be free – free of charge and free of walls.

So today, we’re making available virtually all of our data free as part of our new beta web services API. Our core search types - People Search, Reverse Phone, & Reverse Address - are now available for free to developers building applications for consumers. The data you see on our website is the data you’ll see in the API. Documentation, sample apps, and forums are all at the WhitePages.com Developer Portal.

We soft-launched the API on Thursday afternoon with a few subtle links, and since then have seen more than 100 signups, thousands of API calls, and one bug report.

We’ve partnered with Mashery to make the API easy to access. We’re running a contest for the best iPhone & best social networking application. We’re in the forums, and will be joining Mashery at their booth during the Web 2.0 Expo. We’re jumping in with both feet, and if we can build and support a developer community around the data and APIs, we plan to be here for quite a while.

The privacy rules that apply to our website apply to our API: WhitePages.com Privacy Central provides more information about our privacy approach and policy.

We’ll be adding a number of improvements through the year and beyond, and as we improve our core functionality, we’ll bring the API along. We have more ideas for expanding the API, and we’d love to hear yours in the forums.

Lastly, please join me in thanking the team: Dan, Bruce, Ewa, Zine, Josh, Snezana, & Sabra, for whom this has been a primary job for some time (and who pushed hard to make it available internally for Hack Week), and Sahni, Alison, Tiffany, Kyle, Sabrina, Travis, Mitch, Jason, Colin, Brian (who developed one of the sample apps during Hack Week), Jolene, Ben, and a number of others who invested time at the beginning, middle, end, or some combination to bring this to you. Look for their names in the forums!