Monday
Dec 29, 2008

100% on Amazon Web Services: Soocial.com - a lesson in porting your service to Amazon

Simone Brunozzi, technology evangelist for Amazon Web Services in Europe, describes how Soocial.com was fully ported to Amazon web services.

---------------- At this time of year I decided to dedicate some time to better understanding how our customers use AWS, so I spent some time online with Stefan Fountain and the nice guys at Soocial.com, a "one address book solution to contact management", and I would like to share with you some details of their IT infrastructure, which now runs 100% on Amazon Web Services!

In the last few months, they've been working hard to cope with tens of thousands of users and to get ready to easily scale to millions. To make this possible, they decided to move ALL their architecture to Amazon Web Services. Although they were quite happy with their previous hosting provider, Amazon proved to be the way to go.
-----------------

Read the rest of the article here.


Monday
Dec 29, 2008

Platform virtualization - top 25 providers (software, hardware, combined)

This article presents the companies whose products - mainly virtualization software and hardware - power most cloud computing hosting providers.

Read the entire article about Platform virtualization - top 25 providers (software, hardware, combined) at MyTestBox.com - web software reviews, news, tips & tricks.


Monday
Dec 29, 2008

Paper: Spamalytics: An Empirical Analysis of Spam Marketing Conversion

Under the philosophy that the best way to analyze spam is to become a spammer, this absolutely fascinating paper recounts how a team of UC Berkeley researchers went undercover to infiltrate a spam network. Part CSI, part Mission Impossible, and part MacGyver, the team hijacked the botnet so that their code was actually part of the dark network itself. Once inside, they figured out the architecture and protocols of the botnet and tallied how many sales the spam campaigns generated. Truly elegant work.

Two different spam campaigns were run on a Storm botnet network of 75,800 zombie computers. Storm is a peer-to-peer botnet that uses spam to creep its tentacles through the world wide computer network. One of the campaigns distributed viruses in order to recruit new bots into the network, normally accomplished by enticing people to download email attachments. An astonishing one in ten people downloaded the executable and ran it, which means we won't run out of zombies any time soon. The downloaded components include: backdoor/downloader, SMTP relay, e-mail address stealer, e-mail virus spreader, distributed denial of service (DDoS) attack tool, and an updated copy of the Storm Worm dropper. The second campaign sent pharmaceutical spam ("libido boosting herbal remedy") over the network.

Haven't you always wondered who clicks on spam and how much spammers could possibly make? In the study, only 28 sales resulted from 350 million spam e-mail messages sent over 26 days: a conversion rate of well under 0.00001% (a typical advertising campaign might see 2-3%). The average purchase price was about $100, for $2,731.88 in total revenue. The researchers estimate the total daily revenue attributable to Storm's pharmacy campaign is about $7,000, and that the botnet picks up between 3,500 and 8,500 new bots per day through its Trojan distribution system. And this is with only 1.5% of the entire network in use.
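The arithmetic is easy to check. Here's a quick back-of-the-envelope script using only the figures quoted above; it reproduces both the conversion rate and the paper's full-network daily revenue estimate:

```python
# Back-of-the-envelope check of the study's numbers (all figures from the paper).
emails_sent = 350_000_000   # pharmacy spam messages sent over the campaign
sales = 28                  # purchases that resulted
total_revenue = 2731.88     # USD
days = 26
network_fraction = 0.015    # the researchers drove only ~1.5% of the worker bots

conversion = sales / emails_sent
avg_purchase = total_revenue / sales
full_network_daily = (total_revenue / days) / network_fraction

print(f"conversion rate: {conversion:.2e}")           # ~8e-08, well under 0.00001%
print(f"average purchase: ${avg_purchase:.2f}")       # ~$97.57, about $100
print(f"full-network daily revenue: ${full_network_daily:,.0f}")  # ~$7,000
```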

So the spammers would take in about $3.5 million a year in total revenue from one product on one network. Imagine the take with multiple products and multiple networks. That's why we still have spam. And since the conversion rate is already so low, it seems spam will always be with us.

As fascinating as all the spamonomics are, the explanation of the botnet architecture is just as interesting. Storm uses a three-level self-organizing hierarchy:

  • worker bots - make requests for work and, upon receiving orders, send spam as requested. Workers pull work from higher layers.
  • proxy bots - act as coordinators between workers and master servers.
  • master servers - send commands to the workers and receive their status reports. There are a small number of master servers, hosted at "bullet-proof" hosting centers, and they are likely directly managed by the botmaster.

A host selects its worker or proxy role automatically. If a firewall doesn't prevent inbound communication, the infected host becomes a proxy; otherwise it becomes a worker. Since workers pull work from proxies, nothing ever needs to contact a worker directly. Proxies, on the other hand, are contacted directly by master servers, so their communication must be bidirectional.
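To make the pull model concrete, here is a minimal sketch of a worker's life (all names are hypothetical stand-ins, not Storm's actual code). Every exchange is outbound-initiated, which is why a firewalled host can still participate:

```python
# Hypothetical sketch of the worker pull model; not actual Storm code.
def worker_loop(proxy, send_spam):
    """A worker behind a firewall: it initiates every exchange itself."""
    while True:
        job = proxy.request_work()     # outbound request: pull, not push
        report = send_spam(job)        # run the assigned spam campaign
        proxy.report_status(report)    # outbound again; no inbound port needed
```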

Storm communicates using two separate protocols:
  • An encrypted version of the UDP-based Overnet protocol, used primarily as a directory service to find other nodes. Overnet is a peer-to-peer protocol that uses a distributed hash table mechanism to find peers.
  • A custom TCP-based protocol for masters sending command and control messages to proxies and workers. Command and control traffic to the worker bots is unencrypted, which makes a man-in-the-middle attack possible and is how the researchers carried out their caper.

According to Brandon Enright: when a peer wants to find content in the network, it computes (or is given) the hash of that content and then searches adjacent peers. Those peers respond with their adjacent peers that are closer. This is repeated until the searching peer gets close enough to the content that a node there can provide a search result. This is a complicated and interesting process that the Spamalytics paper, and some of the references at the end of this post, describe in a lot more detail.
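In code, that iterative search might look something like the sketch below. Assumptions: Overnet, like Kademlia, measures "closeness" as the XOR of two node IDs; `query_peer` stands in for the network round trip; all names are hypothetical:

```python
# Hypothetical sketch of the iterative DHT lookup described above.
def xor_distance(id_a: int, id_b: int) -> int:
    # Kademlia-style closeness: smaller XOR means closer in ID space.
    return id_a ^ id_b

def iterative_lookup(start_peers, target_hash, query_peer, k=3):
    """Keep asking the closest known peer for even closer peers until
    no progress is made; the survivors can serve the search result."""
    seen = set(start_peers)
    by_distance = lambda p: xor_distance(p.node_id, target_hash)
    candidates = sorted(seen, key=by_distance)
    while True:
        closest = candidates[0]
        new_peers = [p for p in query_peer(closest, target_hash) if p not in seen]
        if not new_peers:
            return candidates[:k]   # close enough: ask these peers for the content
        seen.update(new_peers)
        candidates = sorted(seen, key=by_distance)
```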

Storm harnesses a large, unreliable, constantly changing distributed system to do work. It's an architecture worth learning from, and we'll explore some of those lessons in a later post.

Related Articles

  • On the Spam Campaign Trail
  • Scaling Spam Eradication Using Purposeful Games: Die Spammer Die!
  • Can cloud computing smite down evil zombie botnet armies?
  • Inside the Storm: Protocols and Encryption of the Storm Botnet by Joe Stewart, GCIG Director of Malware Research, SecureWorks
  • Exposing Stormworm by Brandon Enright. A lot of excellent low level protocol details.
  • Storm Botnet
  • Global Guerrillas by John Robb - Networked tribes, systems disruption, and the emerging bazaar of violence. Resilient Communities, decentralized platforms, and self-organizing futures.
Sunday
Dec 28, 2008

How to Organize a Database Table’s Keys for Scalability

The key (no pun intended) to understanding how to organize your dataset's data is to think of each shard not as an individual database, but as one large singular database. Just as in a normal single-server database setup, where you have a unique key for each row within a table, each row key within each individual shard must be unique across the whole dataset partitioned over all shards. There are a few different ways we can accomplish uniqueness of row keys across a shard cluster. Each has its pros and cons, and the one chosen should be specific to the problems you're trying to solve; two common options are sketched below.
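For illustration only (the post itself doesn't prescribe a scheme), here are two of the usual approaches: coordination-free UUIDs, and a composite 64-bit key that embeds a hypothetical shard id in the high bits so the owning shard can be recovered from the key itself:

```python
import uuid

SHARD_BITS = 12           # hypothetical: allows up to 4096 shards
LOCAL_BITS = 64 - SHARD_BITS

def uuid_key() -> str:
    # Option 1: random 128-bit UUIDs are unique across all shards
    # with no coordination, at the cost of larger, unordered keys.
    return str(uuid.uuid4())

def composite_key(shard_id: int, local_id: int) -> int:
    # Option 2: shard id in the high bits, the shard's own
    # auto-increment counter in the low bits.
    return (shard_id << LOCAL_BITS) | local_id

def shard_of(key: int) -> int:
    # The owning shard falls out of the key itself.
    return key >> LOCAL_BITS

assert shard_of(composite_key(42, 12345)) == 42
```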


Monday
Dec 22, 2008

SLAs in the SaaS space

This may be a bit higher level than the general discussion here, but I think this is an important issue in how it relates to reliability and uptime. What kind of SLAs should we be expecting from SaaS services and platforms (e.g. AWS, Google App Engine, Google Premium Apps, salesforce.com, etc.)? To date, most SaaS services either have no SLAs or offer very weak penalties. What will it take to get these services to the point where they can offer the SLAs that users (and, more importantly, businesses) require? I presume most of the members here want to see more movement into the cloud and to SaaS services, and I'm thinking that until we see more substantial SLA guarantees, most businesses will continue to shy away as long as they can. Would love to hear what others think. Or am I totally off base?


Sunday
Dec 21, 2008

The I.H.S.D.F. Theorem: A Proposed Theorem for the Trade-offs in Horizontally Scalable Systems

Successful software design is all about trade-offs. In the typical (if there is such a thing) distributed system, recognizing the importance of trade-offs within the design of your architecture is integral to the success of your system. Despite this reality, I see developers, time and time again, choosing a particular solution based on an ill-placed belief in their solution as a "silver bullet" that conquers all, despite the inevitable occurrence of changing requirements. Regardless of the reasons behind this phenomenon, I'd like to outline a few of the methods I use to ensure that I'm making good scalable decisions without losing sight of the trade-offs that accompany them. I'd also like to compile (pun intended) the issues at hand by formulating a simple theorem that we can use to describe this oft-occurring situation.


Saturday
Dec 20, 2008

Second Life Architecture - The Grid

Update: Presentation: Second Life’s Architecture. Ian Wilkes, VP of Systems Engineering, describes the architecture used by the popular game Second Life. Ian presents what the architecture looked like at its debut and how it evolved over the years as users and features were added. Second Life is a 3-D virtual world created by its Residents. Virtual worlds are expected to become more and more popular on the internet, so their architecture might be of interest, especially with the appearance of open virtual worlds, or metaverses. What happens when video games meet Web 2.0? What happens is the metaverse.

Information Sources

Platform

  • MySQL
  • Apache
  • Squid
  • Python
  • C++
  • Mono
  • Debian

What's Inside?

The Stats

  • ~1M active users
  • ~95M user hours per quarter
  • ~70K peak concurrent users (40% annual growth)
  • ~12 Gbit/sec aggregate bandwidth (in 2007)

Staff (in 2006)

  • 70 FTE + 20 part time
"about 22 are programmers working on SL itself. At any one time probably 1/3 of the team is on infrastructure, 1/3 is on new features and 1/3 is on various maintenance tasks (bug fixes, general stability and speed improvements) or improvements to existing features. But it varies a lot."

Software

Client/Viewer
  • Open source client
  • Renders the virtual world
  • Handles user interaction
  • Handles locations of objects
  • Gets velocities and does simple physics to keep track of what is moving where
  • No collision detection
Simulator (Sim)
Each geographic area (a 256x256 meter region) in Second Life runs on a single instantiation of server software called a simulator, or "sim," and each sim runs on a separate core of a server. The simulator is the primary SL C++ server process, which runs on most servers. As the viewer moves through the world, it is handed off from one simulator to another.
  • Runs the Havok 4 physics engine
  • Runs at 45 frames/sec. If it can't keep up, it will attempt time dilation without reducing the frame rate (see the sketch after this list)
  • Handles storing object state, land parcel state, and terrain height-map state
  • Keeps track of where everything is and does collision detection
  • Sends locations of stuff to the viewer
  • Transmits image data in a prioritized queue
  • Sends updates to viewers only when needed (only when a collision occurs or something changes direction, velocity, etc.)
  • Runs Linden Scripting Language (LSL) scripts
  • Scripting has recently been upgraded to the much faster Mono scripting engine
  • Handles chat and instant messages
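Here is how time dilation might work in practice: a toy sketch under our own assumptions (Linden Lab hasn't published this code, and `sim.advance` is invented). Instead of dropping frames when physics falls behind, the sim advances simulated time by less than real time:

```python
# Toy sketch of time dilation (assumed behavior, not Linden Lab's code).
TARGET_FPS = 45
FRAME_BUDGET = 1.0 / TARGET_FPS   # seconds of real time per frame

def step_frame(sim, last_physics_cost):
    """Always produce a frame every FRAME_BUDGET seconds of real time,
    but advance the simulated world by less when physics is too slow."""
    dilation = min(1.0, FRAME_BUDGET / last_physics_cost)
    sim.advance(dt=FRAME_BUDGET * dilation)   # world time slows, fps holds
    return dilation   # 1.0 = real time; below 1.0 the sim is "dilated"
```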
Asset Server
  • One big clustered filesystem, ~100TB
  • Stores asset data such as textures

MySQL database
Second Life started with one database and has subsequently been forced into clustering. They use a ton of MySQL databases running on Debian machines to handle lots of centralized services. Rather than attempt to build the one impossibly large database – all hail the Central Database – or one impossibly large central cluster – all hail the Cluster – Linden Lab instead adopted a divide-and-conquer strategy based around data partitioning. The good thing is that UUIDs – 128-bit unique identifiers – are associated with most things in Second Life, so partitioning is generally doable.

Backbone
Linden Lab has converted much of their backend architecture away from custom C++/messaging into web services. Certain services have been moved off of MySQL – or cached (Squid) between the queries and MySQL. Presence, in particular Agent Presence (are you online, and where are you on the grid?), is a particularly tricky kind of query to partition, so there is now a Python service running on the SL grid called Backbone. It proved to be easier to scale, develop, and maintain than many of their older technologies, and as a result it plays an increasingly important role in the Second Life platform as Linden Lab migrates their legacy code to web services. Two main components of the backbone are open source:
  • Eventlet is a networking library written in Python. It achieves high scalability by using non-blocking I/O while at the same time retaining high programmer usability by using coroutines to make the non-blocking I/O operations appear blocking at the source code level.
  • Mulib is a REST web service framework built on top of Eventlet.

Hardware

  • 2000+ servers in 2007
  • ~6000 servers in early 2008
  • Plans to upgrade to ~10000 (?)
  • 4 sims per machine, for both class 4 and class 5
  • Used all-AMD for years, but are moving from the Opteron 270 to the Intel Xeon 5148
  • The upgrade to "class 5" servers doubled the RAM per machine from 2GB to 4GB and moved to faster SATA disks
  • Classes 1-4 are on 100Mb with 1Gb uplinks to the core. Class 5 is on pure 1Gb
Do you have more details?


Friday
Dec 19, 2008

Gigaspaces curbs latency outliers with Java Real Time

Today, most banks have migrated their internal software development from C/C++ to the Java language because of well-known advantages in development productivity (Java Platform), robustness and reliability (garbage collector), and platform independence (Java bytecode). They may even have gotten better throughput performance through the use of standard architectures and application servers (Java Enterprise Edition). Among the few banking applications that have not yet been able to benefit from the Java revolution are the latency-critical applications connected to the trading floor. Why? Because of the unpredictable pauses introduced by the garbage collector, which result in significant jitter (variance of execution time). In this post, Frederic Pariente, an engineering manager at Sun Microsystems, posted a summary of a case study on how the Sun Real Time JVM and GigaSpaces were used in the context of a customer proof-of-concept this summer to ensure guaranteed latency under 10 msec per message, with no code modification to the matching engine.


Friday
Dec 19, 2008

How to measure memory required for a user session

Hi, what practices are followed and what tools are used to measure the session memory requirement per user? Thanks, Unmesh


Thursday
Dec 18, 2008

Risk Analysis on the Cloud (Using Excel and GigaSpaces)

Every day brings news of either more failures of the financial systems or outright fraud, with the $50 billion Bernard Madoff Ponzi scheme being the latest, breaking all records. This post provides a technical overview of a solution that was implemented for one of the largest banks in China. The solution illustrates how one can use Excel as a front-end client while leveraging the cloud computing model, MapReduce, and other patterns to scale out risk calculations, as sketched below. I'm hoping that this type of approach will reduce the chances of this type of fraud happening in the future.
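As a rough illustration of the scale-out pattern (our own toy example; the real solution used GigaSpaces rather than Python's multiprocessing, and the position fields are invented): map a risk function over portfolio chunks in parallel, then reduce the partial results into a single exposure number:

```python
# Toy map/reduce risk calculation (illustrative assumptions throughout).
from concurrent.futures import ProcessPoolExecutor

def map_risk(positions):
    # Hypothetical per-chunk work: weight each position's exposure.
    return sum(p["quantity"] * p["price"] * p["risk_weight"] for p in positions)

def reduce_risk(partials):
    # Combine the partial exposures from all workers.
    return sum(partials)

if __name__ == "__main__":
    portfolio = [{"quantity": 100, "price": 10.0, "risk_weight": 0.2}] * 10_000
    chunks = [portfolio[i:i + 1_000] for i in range(0, len(portfolio), 1_000)]
    with ProcessPoolExecutor() as pool:
        total = reduce_risk(pool.map(map_risk, chunks))
    print(f"total exposure: {total:,.2f}")
```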
