Wednesday
Oct 03, 2007
Save on a Load Balancer By Using Client Side Load Balancing

In Client Side Load Balancing for Web 2.0 Applications, author Lei Zhu suggests a very interesting approach to load balancing: forget DNS round robin, toss your expensive load balancer, and make your clients do the load balancing for you. Each client maintains a list of candidate servers and cycles through them. All the details are explained in the article, but it's an intriguing idea, especially for the budget-conscious startup.
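A minimal sketch of the idea, with hypothetical hostnames and ignoring the same-origin workarounds the article spends much of its time on: keep a list of candidate servers, start at a random offset so load spreads across clients, and fall back to the next server on failure.

    // List of candidate servers the client rotates through.
    var servers = [
      "http://www1.example.com",
      "http://www2.example.com",
      "http://www3.example.com"
    ];
    // Random starting offset so different clients favor different servers.
    var start = Math.floor(Math.random() * servers.length);

    function request(path, onSuccess, onError, attempt) {
      attempt = attempt || 0;
      if (attempt >= servers.length) { onError("all servers failed"); return; }
      var base = servers[(start + attempt) % servers.length];
      var xhr = new XMLHttpRequest();
      xhr.open("GET", base + path, true);
      xhr.onreadystatechange = function () {
        if (xhr.readyState !== 4) return;
        if (xhr.status === 200) onSuccess(xhr.responseText);
        else request(path, onSuccess, onError, attempt + 1); // try the next server
      };
      xhr.send(null);
    }

    // Usage: request("/api/items", function (data) { /* render */ },
    //                              function (err) { /* give up */ });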
Reader Comments (14)
I'm going to have to say something like Wackamole (http://www.backhand.org/wackamole/) would be a much better option.
Good idea on paper, but not a very viable one in practice.
While the arguments for balancing requests are good, cross-site scripting tricks in a web browser are a very, very bad idea. Working around mechanisms that exist for security reasons will most likely leave you wide open to exploits the JavaScript developer hasn't thought about.
I do agree that a load balancer and DNS round-robin might not be the best solution, but replacing them with JavaScript hacks is really the wrong way to go.
This is a very limited solution; it oversimplifies the role and function of a proper load-balanced setup (just "randomly" selecting a server from the client side doesn't balance the load at all; that's load distribution, not load balancing). The fact is, load balancers do much, much more than just distribute load (especially dedicated ones like BigIP).
As the author points out, this could be used by the very budget-minded; don't use it if you're planning on high scalability and reliability, though.
Well said :)
Is this an intriguing idea? Client-side resolution of services has been tried several times by several people; it does not work and is prone to lots of problems. In fact, a startup hoping to get big one day should never even try this, since it is much more painful to grow with this approach :)
All right, I'll bite. I understand the cross-site scripting issues, but why doesn't it work otherwise? It's exactly what I would do when hashing to a shard in the datacenter. Think of your typical Web 2.0 app as a thickish client, and doing rotation in a nice client-side library makes good sense. Why take a hop to a DNS server to do the same thing? Sure, a load balancer usually does a lot more than balance load, but if that's the only capability you care about, why not shift the work to the client?
"Be careful" is all I have to say :)
I've been pushing this in our organization for 6 years and haven't been successful. Beyond the cross-site scripting issues one of the responders mentioned, it also creates problems with bookmarks, which was a bigger concern for us (and would be for any web-based organization).
The idea itself is not new, however. DNS- and SMTP-based mechanisms (round-robin A records, MX priorities) have been doing client-side load balancing for years, if not decades. But DNS has its own share of problems and is not reliable.
There is lots of software available to check server health and bring up backup servers so that all hosts stay active, but unless the client also understands the load distribution on the server end, you can very quickly overwhelm a single node with too much traffic. And if you have a large number of servers, you need the ability to move users from one node to another during maintenance. With the load-balancing logic at the client end, you lose a lot of flexibility.
In short, talking about it is easy, but implementing it is far more complicated than what I've described here. I personally believe that deploying a load balancer is very cheap compared to the work you would have to do to reinvent the wheel.
rkt
Hmmm... doesn't sound practical. Load balancers also provide another key element: transparent fail-over.
If I were to use the proposed solution in a cluster with 4 web servers, how is the failure of one communicated? Both to clients that have already obtained the list of servers to rotate through, and on the server side, so that future clients don't obtain a list with a bad server?
-Bill
> transparent fail-over
The bookmark problem changed my mind a bit, but I think the current-status problem is solvable through natural failure-and-retry mechanisms and the server communicating its state to clients on a regular basis. Load balancers can hand out a bad server too, since noticing that something is bad isn't instantaneous. Being on the client doesn't really change that.
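One sketch of the "server communicates state" part, with a made-up URL and format: the client periodically re-fetches the live server list from the primary domain, so a dead server drops out of rotation within one refresh interval.

    var servers = []; // consumed by the rotation logic

    function refreshServerList() {
      var xhr = new XMLHttpRequest();
      // Served by the primary domain, one live hostname per line.
      xhr.open("GET", "/servers.txt", true);
      xhr.onreadystatechange = function () {
        if (xhr.readyState === 4 && xhr.status === 200) {
          servers = xhr.responseText.split("\n");
        }
      };
      xhr.send(null);
    }

    refreshServerList();
    setInterval(refreshServerList, 60 * 1000); // pick up failures within a minute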
Kyrre-
I tend to agree with you, if for no other reason than that building this into a client is just a potential mess. And yes, I've seen it. In fact, LDAP applications at least tend to have a "failover" option for the server. That's nice, but I prefer to see this addressed at the network level.
--
Dustin Puryear
Author, "Best Practices for Managing Linux and UNIX Servers"
http://www.puryear-it.com/pubs/linux-unix-best-practices
I do not really see the bookmark issue. The key is loading the initial interface from a server that is sure to be around when the user comes back. From then on you can use any random server for the AJAX (or rather iframe) requests.
Of course, AJAX apps have their issues with bookmarking anyway. Many AJAX apps are not really bookmarkable at all, though the anchor-based solution is now pretty well understood, if not yet widely deployed.
> The key is loading the initial interface from a server that will be sure to be around when the user comes back.
That's a good point. I guess I was thinking you could only rely on your primary domain staying around. But there's no reason the other domains couldn't persist as well.
Sorry for posting twelve months late, but I found this post just moments ago.
Even if there are drawbacks, like bookmarking, XSS, etc., I do think this sounds like an extremely good idea for certain special situations, namely almost content-free web applications that do not need to be spidered or bookmarked (e.g. ticket reservation systems). Then there is no need to do the load balancing transparently.
There could be a simple redirect from the main site to a server, e.g. from www.ticketreservation.com to a17.ticketreservation.com. From there on, the user stays on a17. And a17 need not be a single server; it could itself be a cluster, which is how redundancy could be achieved.
The initial redirect server could itself be a cluster behind a load balancer. This way, really high scalability could be achieved, as the primary load balancer behind www is only hit once, at the beginning of each session.
Of course, the redirect to different URLs looks extremely ugly, but with pure web applications (which do not need to be spidered, deep-linked, or deep-bookmarked anyway), this should be only a cosmetic issue for the user.
The redirect cluster could poll the load of the application clusters and adapt the redirects accordingly, thus achieving true balancing and not mere load distribution, as in the sketch below.
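A sketch of what that redirect front end could look like (Node.js; the hostnames and load figures are hypothetical stand-ins for the polled values):

    var http = require("http");

    // In practice these figures would come from polling the application
    // clusters; they are hard-coded here for illustration.
    var clusters = [
      { host: "a17.ticketreservation.com", load: 0.42 },
      { host: "a18.ticketreservation.com", load: 0.35 },
      { host: "a19.ticketreservation.com", load: 0.77 }
    ];

    http.createServer(function (req, res) {
      // Send the client to the least-loaded cluster; after this one
      // redirect, the session never touches www again.
      var target = clusters.reduce(function (best, c) {
        return c.load < best.load ? c : best;
      });
      res.writeHead(302, { "Location": "http://" + target.host + req.url });
      res.end();
    }).listen(8080);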
How about using this as a CDN? Host your website on a standard server, but fetch static content like images from another host, and use JavaScript to load balance, detect failures, and so on.
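A sketch of that, again with hypothetical hostnames: the page comes from the main server, while each image picks one of several asset hosts and falls back to the next on error.

    var imageHosts = ["http://img1.example.com", "http://img2.example.com"];
    var offset = Math.floor(Math.random() * imageHosts.length);

    function loadImage(img, path, attempt) {
      attempt = attempt || 0;
      if (attempt >= imageHosts.length) return; // every host failed
      // Failure detection: if this host doesn't deliver, retry on the next one.
      img.onerror = function () { loadImage(img, path, attempt + 1); };
      img.src = imageHosts[(offset + attempt) % imageHosts.length] + path;
    }

    // Usage: loadImage(document.getElementById("logo"), "/img/logo.png");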