Thursday
May 14, 2009

Who Has the Most Web Servers?

An interesting post on DataCenterKnowledge!

  • 1&1 Internet: 55,000 servers
  • Rackspace: 50,038 servers
  • The Planet: 48,500 servers
  • Akamai Technologies: 48,000 servers
  • OVH: 40,000 servers
  • SBC Communications: 29,193 servers
  • Verizon: 25,788 servers
  • Time Warner Cable: 24,817 servers
  • SoftLayer: 21,000 servers
  • AT&T: 20,268 servers
  • iWeb: 10,000 servers

How about Google, Microsoft, Amazon, eBay, Yahoo, GoDaddy, and Facebook? Check out the post on DataCenterKnowledge and of course here on highscalability.com!

 

Tuesday
May 12, 2009

P2P server technology?

Is there any type of server technology that lets visitors to a website become part of the server? With BitTorrent, users share some of their bandwidth. Would the same be possible with web servers, where a person goes to a website, then downloads and runs software that makes their internet connection, CPU, and hard disk part of the web server?

Click to read more ...

Tuesday
May 12, 2009

GemStone Unveils GemFire Enterprise 6.0

GemFire Enterprise is an in-memory distributed data management platform that pools memory (and CPU, network, and optionally local disk) across multiple processes to manage application objects and behavior. With the 6.0 release, GemFire has reached a stage of maturity in its evolution. GemStone touts this version as the true 'best of breed' distributed caching technology, solving scalability issues in all industries.

Click to read more ...

Monday
May 11, 2009

Facebook, Hadoop, and Hive

Facebook has the second largest installation of Hadoop (a software platform that lets one easily write and run applications that process vast amounts of data), Yahoo being the first.

Learn how they do it, and what the challenges are, on the DBMS2 blog, a blog for people who care about database and analytic technologies.

Friday
May 8, 2009

Publish/subscribe model does not scale?

Someone posted this on a wiki: "...For relatively small installations, pub/sub provides the opportunity for better scalability than traditional client-server, through parallel operation, message caching, tree-based or network-based routing, etc. However, as systems scale up to become datacenters with thousands of servers sharing the pub/sub infrastructure, this benefit is often lost; in fact, scalability for pub/sub products under high load in large deployments is very much a research challenge." Does anyone have something to say about scaling publish/subscribe models?
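
For readers new to the model, here is a minimal sketch of publish/subscribe, assuming a single in-process broker. The `Broker` class and the topic and subscriber names are hypothetical and do not come from any pub/sub product. It shows only the topic-based fan-out that the quote refers to, not the routing or caching.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[str], None]


class Broker:
    """Hypothetical in-process broker; real products route over a network."""

    def __init__(self) -> None:
        # topic name -> list of subscriber callbacks
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: str) -> int:
        """Deliver message to every subscriber of topic; return delivery count."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)


received = []
broker = Broker()
broker.subscribe("orders", lambda m: received.append(("billing", m)))
broker.subscribe("orders", lambda m: received.append(("shipping", m)))
delivered = broker.publish("orders", "order-42")
```

In this sketch every publish costs one delivery per subscriber, and the publisher never needs to know who the subscribers are. That per-message fan-out is the work a large deployment has to spread across many servers.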

Click to read more ...

Friday
May 8, 2009

Eight Best Practices for Building Scalable Systems

Wille Faler has created an excellent list of best practices for building scalable and high performance systems. Here's a short summary of his points:

  • Offload the database - Avoid hitting the database, and avoid opening transactions or connections unless you absolutely need to use them.
  • What a difference a cache makes - For read-heavy applications, caching is the easiest way to offload the database.
  • Cache as coarse-grained objects as possible - Coarse-grained objects save CPU and time by requiring fewer reads to assemble objects.
  • Don’t store transient state permanently - Is it really necessary to store your transient data in the database?
  • Location, Location - Put things close to where they are supposed to be delivered.
  • Constrain concurrent access to limited resources - It's quicker to let a single thread do the work and finish than to flood finite resources with 200 client threads.
  • Staged, asynchronous processing - Split a process into separate steps, mediated by queues and executed by a limited number of workers in each step.
  • Minimize network chatter - Avoid remote communication if you can, as it's slower and less reliable than local computation.
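
Two of the points above, constraining concurrent access and staged asynchronous processing, can be sketched together. This is an illustrative outline using Python's standard `queue` and `threading` modules, not code from Wille Faler's post. The stage functions (`str.strip`, `str.upper`), the queue sizes, and the worker counts are placeholder assumptions.

```python
import queue
import threading


def run_stage(work, inbox, outbox, workers):
    """Start a fixed number of worker threads that apply `work` to inbox items."""
    def loop():
        while True:
            item = inbox.get()
            if item is None:  # sentinel: this worker should stop
                break
            outbox.put(work(item))

    threads = [threading.Thread(target=loop) for _ in range(workers)]
    for t in threads:
        t.start()
    return threads


def run_pipeline(items):
    # Bounded queues between stages keep a fast producer from flooding a slow stage.
    clean_q = queue.Queue(maxsize=100)
    upper_q = queue.Queue(maxsize=100)
    done_q = queue.Queue()

    # Each stage gets a small, fixed pool instead of one thread per request.
    stage1 = run_stage(str.strip, clean_q, upper_q, workers=2)
    stage2 = run_stage(str.upper, upper_q, done_q, workers=2)

    for item in items:
        clean_q.put(item)

    # Shut down stage by stage: one sentinel per worker, then wait for them.
    for _ in stage1:
        clean_q.put(None)
    for t in stage1:
        t.join()
    for _ in stage2:
        upper_q.put(None)
    for t in stage2:
        t.join()

    results = []
    while not done_q.empty():
        results.append(done_q.get())
    return sorted(results)  # worker scheduling makes arrival order nondeterministic


results = run_pipeline(["  a ", " b", "c  "])
```

The number of workers per stage, rather than the number of incoming requests, caps how many threads touch each resource at once.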

    Click to read more ...

    Wednesday
    May 6, 2009

    DryadLINQ

    The goal of DryadLINQ is to make distributed computing on large compute clusters simple enough for ordinary programmers. DryadLINQ combines two important pieces of Microsoft technology: the Dryad distributed execution engine and the .NET Language Integrated Query (LINQ).

    Click to read more ...

    Wednesday
    May 6, 2009

    Dryad

    The Dryad Project is investigating programming models for writing parallel and distributed programs to scale from a small cluster to a large data-center.

    Click to read more ...

    Wednesday
    May 6, 2009

    Art of Distributed

    Part 1: Rethinking about distributed computing models

    I've been getting a lot of questions lately about distributed computing, especially distributed computing models and MapReduce, such as: What is MapReduce? Does MapReduce fit all situations? How does it compare with other technologies such as grid computing? And what is the best solution for our situation? So I decided to write an article about distributed computing in two parts. The first part covers the distributed computing models and the differences between them. The second part will discuss reliability and distributed storage systems. Download the article in PDF format. Download the article in MS Word format. I look forward to your comments and questions, and I will answer them in part two.
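
As a rough answer to "What is MapReduce?", here is a minimal single-process sketch of the map, shuffle, and reduce phases, using word counting as the example. It is not taken from the article. The function names are hypothetical, and a real framework such as Hadoop would run these phases across many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["the cat", "The dog"])
```

Each call to `map_phase` and `reduce_phase` depends only on its own input, which is why the model can be spread across machines.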

    Click to read more ...

    Wednesday
    May 6, 2009

    Guinness Book of World Records Anyone?

    We are planning to be the first company to do a one million user load test and are looking for someone willing to be the first to have been subjected to such a test! Is YOUR site scalable enough? How do you KNOW? http://capcalblog.blogspot.com. Randy Hayes CapCal

    Click to read more ...