Note: This is Part 1 of 5 in a series of posts where I document, step by step, the building of a social application using a cloud computing approach. Read the introduction here.

1. Why build it on the cloud?

The first thing I wanted to do was choose an application to build. I didn’t want it to be a very complex application with lots of logic and overhead. Plus we’re doing a major relaunch at work, so free time is not abundant these days. I’ve always been a David Letterman fan, and one of the most famous segments of the show is the Top Ten List. Basically, they pick a theme from current news and build a funny list around it, with the funniest entries at the top. Here’s a video example of one of these Top Ten lists.

I thought that this would be a great social tool to build: a crowdsourcing tool for Top Ten lists. But having people create ten items sounded like too much, so I decided to pare it down to five.

Thus I had my application: The Top5.

Let me illustrate how this application would be built using the regular approach (physical servers/databases/controllers).

The boss calls a meeting. He tells the developer, “We’ve decided to help Letterman with his Top Ten List, so I need an application that lets people create Top Ten lists online. Now, don’t forget to make it scalable, not like that slow application you built last month.”

The developer goes, “Pff. That’s easy. I just provision one of our web servers with Linux, Apache, MySQL and PHP, create a database with a user table and a list table, add some CSS and launch it. Give me three weeks.”

Five weeks later (there was a lot of back and forth with Letterman) the application is launched. The site is responding pretty fast with traffic levels around 25,000 users visiting per week.

But then, one of the lists gets to the homepage of Digg. The site starts to suffer as thousands of users come to check out the list.

The developer calmly responds, “We just need more servers.”

Three days later, the application is split across two web servers and two database servers (one is the master, which handles all writes to the database, and the other is a slave, which only serves reads).

The site gets better for a couple of weeks, until another list goes viral on Facebook.

The developer’s response? “More servers”.

Now bear in mind that each server needs to be purchased, installed in a datacenter, and replaced if things go wrong. Once you buy the servers and the traffic goes down, you are stuck with more servers than you need.

The logic will also need to be tuned: as more and more people input data, all that data no longer fits on one database server. So now you need to partition the data across clusters.

The list goes on and on.

Take a break and get some coffee. Now I’ll tell you what happens when you build with the cloud in mind.

In the development firm next door there’s another boss and another developer. They are tasked with the same project: building an application to allow people to create Top Ten lists. But this time the developer replies, “Sure. Let me create an instance on Amazon Elastic Compute Cloud (EC2).”

This developer then builds the application using Linux, Apache and PHP. But for the database, he uses Amazon’s SimpleDB, a hosted, non-relational data store that Amazon scales for you, so there is no database server of your own to outgrow.

Six weeks later (it does take a little more time to build apps for the cloud) they release the application live.

The difference from the other application is clear. When the Digg spike happens, the developer simply “clones” the first server six times in a matter of minutes. Once the traffic goes down, the developer simply “destroys” any unused servers and pays only for the hours the extra servers were live. There’s no infrastructure, datacenters or hardware to deal with.

As the site grows more popular, he adds more cloud servers, concentrating on making the application better, not on trying to scale it better.

The Top5 is such an application. Should it ever become very popular, I just have to “copy and paste” instances of the application to scale it. I might have to split the database across several SimpleDB domains if I get too much information, but the process shouldn’t be that hard (I could use the djb2 hash algorithm used by Adaptive Blue’s Glue app).
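To make the domain-splitting idea a bit more concrete, here is a minimal sketch in Python of how a djb2 hash could pick a domain for each list. This is not how Glue does it (I don't know their implementation), and everything besides the djb2 algorithm itself is an assumption for illustration: the `domain_for` helper, the `top5_lists_` prefix and the count of four domains are all hypothetical.

```python
def djb2(key: str) -> int:
    """Classic djb2 string hash: start at 5381, then hash = hash * 33 + byte.

    The result is masked to 32 bits so it behaves like the usual C version.
    """
    h = 5381
    for byte in key.encode("utf-8"):
        h = ((h << 5) + h + byte) & 0xFFFFFFFF  # (h << 5) + h == h * 33
    return h


def domain_for(list_id: str, domain_count: int = 4,
               prefix: str = "top5_lists_") -> str:
    """Map a list ID to one of `domain_count` domain names (hypothetical naming)."""
    if domain_count < 1:
        raise ValueError("domain_count must be at least 1")
    return f"{prefix}{djb2(list_id) % domain_count}"


if __name__ == "__main__":
    # The same ID always lands in the same domain, so reads know where to look.
    for list_id in ["worst-excuses", "best-pizza-toppings", "top-monday-songs"]:
        print(list_id, "->", domain_for(list_id))
```

One caveat with this simple modulo scheme: changing the number of domains later sends most IDs to a different domain, so existing data would have to be moved when the count changes.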

Knowing that I built an application that’s highly scalable just makes me breathe (and sleep) easier.

In the next part I’ll share with you how to start by deploying instances in the Amazon EC2 cloud.
