Enterprise Initiatives: Don't pull a Twitter

Architecture, startups, twitter, Web 2.0

There has been a lot of excitement around Web 2.0 technology lately, but it appears that we haven't learned much from our Web 1.0 days, also known as the dot-com days. These exciting new technologies allow companies to quickly produce new web sites and easily extend existing ones. The problem is that people forget that to be successful in the long term, you still need a reliable, well-designed architecture, and you still need some form of governance to ensure that the end product delivers a high quality of service.

What do I mean by Quality of Service (QoS)?
In terms of web sites and software, when I talk about QoS I am referring to:

Uptime
Performance
Service Level Agreements (SLAs)
Minimal defects
Scalability
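
To put a number on the uptime item, here is a minimal sketch in Python that turns an uptime target into a downtime budget. The function name `allowed_downtime_minutes` and the 30-day period are my own assumptions for illustration, not part of any particular SLA.

```python
def allowed_downtime_minutes(uptime_percent: float, period_days: int = 30) -> float:
    """Return the downtime budget, in minutes, left by an uptime target over a period."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - uptime_percent / 100)


if __name__ == "__main__":
    # Compare a few common uptime targets over a 30-day period.
    for target in (99.0, 99.9, 99.99):
        budget = allowed_downtime_minutes(target)
        print(f"{target}% uptime leaves about {budget:.1f} minutes of downtime")
```

Each extra "nine" cuts the budget by a factor of ten, which is why a site with daily outages cannot credibly promise even a modest SLA.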

Nowadays, a web startup can quickly produce a service on the Internet at low cost. They can simply take advantage of the LAMP stack (Linux, Apache, MySQL, PHP), leverage existing Mashups, and build rich Internet applications (RIAs) to produce a cool, eye-candy web site that consumers can access for free. Unfortunately, quick is becoming quick-n-dirty.

Balancing Agility with QoS
Startups are faced with a dilemma. They typically have a limited amount of time, money, and resources. They need to get a prototype up and running quickly to give them an opportunity to go after VC funds. The defining moment comes when they get the funding and then have a limited amount of time to produce a Beta version soon to be followed by a real production version. This is where many startups miss the boat. Many prioritize speed to market over sound architecture design. It is a tough decision to make.

Do I spend a lot of time and money now and risk missing the window of opportunity?
Can I worry about scalability after I see how well received the web site is?
Can I afford to pay an experienced architect now or can I wait until we have more money and a good stream of traffic?

Many startups can get away with postponing some of the critical design aspects and target a later release of software to address QoS. But others have not been so successful. Right now, everyone is fully aware of Twitter's Fail Whale, a symbol of incompetence and a colossal architectural blunder.

Twitter had quickly become a rising star on the web with millions of users. Around April, their traffic spiked to a level beyond what their underlying architecture could handle. Now Twitter is the joke of the town and a victim of daily outages and performance issues. What we found out is that behind the scenes the baby is ugly. Twitter recently responded to some questions from TechCrunch and revealed more information about their "architecture" than any sane person would offer. Here is a summary of what I learned:

One database for all writes (because "replication of MySQL is no easy task")
Limited scalability - 3 physical database machines "POWERING ALL OF TWITTER"
Human intervention required - "There's a lot of necessary handholding and tweaking"
They plan to grow operations rather than fix the handholding
Tightly coupled - massive traffic on one part affects all
Many design flaws - "Everything from faulty process, environment, configuration, and just plain load"
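
For contrast with the single write database above, here is a minimal sketch of one common way to spread database load: send reads to replicas and keep writes on a primary. This is not Twitter's design. The class name `ReadWriteRouter` and the server names are hypothetical, and a real setup would also have to deal with replication lag and failover.

```python
import itertools


class ReadWriteRouter:
    """Pick a database server for a SQL statement (illustrative only)."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Rotate through the replicas in order; None means "no replicas configured".
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql):
        words = sql.split(None, 1)
        verb = words[0].upper() if words else ""
        # Only plain SELECT statements are treated as reads; everything else
        # (INSERT, UPDATE, DELETE, DDL) goes to the primary.
        if verb == "SELECT" and self._replicas is not None:
            return next(self._replicas)
        return self.primary


if __name__ == "__main__":
    router = ReadWriteRouter("db-primary", ["db-replica-1", "db-replica-2"])
    for statement in ("SELECT * FROM updates", "INSERT INTO updates VALUES (1)"):
        print(statement, "->", router.route(statement))
```

Even a split this simple means a surge in reads no longer competes directly with writes, which is the kind of decoupling the list above shows was missing.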

Their solution is to hire top talent at top prices and go into firefighting mode to save this product. This is the typical result when a company with a successful product takes shortcuts in architecture and design. You can pay for it now or pay double later. Twitter is paying big now in salary, lost customers, and declining brand value. This is a worst-case scenario for a company that generates zero revenue and whose goal is to be purchased by a big spender.

But Twitter is not the only startup having issues. It is becoming common for me to experience crashes and outages in my daily routine of using various new web sites and services. There have been days when I have seen four sites down at the same time. It is getting so common that I started creating a collection of screenshots.

Technorati has been struggling recently with a surge in web traffic. Countless other startups have flashed their cute crash messages on my screen. To me, the joke is on them. It's no wonder that the corporate architects of the world (me included) get a little annoyed when the media and talking heads start touting Web 2.0 and Mashups as the silver bullet for enterprises. Many have said that we should skip SOA because it is hard and time consuming and instead go the WOA route. I think Web 2.0, Mashups, and WOA are all great technologies. But when they are applied to an architecture that does not scale, or that does not follow some form of governance to assure some level of QoS, you will likely need to design a cute outage web page to entertain your users as your team scrambles to bring the system back online.

Here is one that I have created for any new startup out there.
