One important high availability principle is concurrency control. The idea is to allow only that much traffic through to your system which your system can handle successfully. For example: if your system is certified to handle a concurrency of 100 then the 101st request should either timeout, be asked to try later or wait until one of the previous 100 requests finish. The 101st request should not be allowed to negatively impact the experience of the other 100 users. Only the 101st request should be impacted. Read more here...