Entries in Strategy (358)

Wednesday
Sep 10, 2014

10 Common Server Setups For Your Web Application

If you need a good overview of different ways to set up your web service, then Mitchell Anicas has written a good article for you: 5 Common Server Setups For Your Web Application.

We've even included a few additional possibilities at no extra cost.

  1. Everything on One Server. Simple. Potential for poor performance because of resource contention. Not horizontally scalable. 
  2. Separate Database Server. There's an application server and a database server. Application and database don't share resources. Can independently vertically scale each component. Increases latency because the database is a network hop away.
  3. Load Balancer (Reverse Proxy). Distributes workload across multiple servers. Native horizontal scaling. Protection against DDoS attacks using rules. Adds complexity. Can be a performance bottleneck. Complicates issues like SSL termination and sticky sessions (see the sketch after this list).
  4. HTTP Accelerator (Caching Reverse Proxy). Caches web responses in memory so they can be served faster. Reduces CPU load on web server. Compression reduces bandwidth requirements. Requires tuning. A low cache-hit rate could reduce performance. 
  5. Master-Slave Database Replication. Can improve read and write performance. Adds a lot of complexity and failure modes.
  6. Load Balancer + Cache + Replication. Combines load balancing of the caching servers and the application servers with database replication. Nice explanation in the article.
  7. Database-as-a-Service (DBaaS). Let someone else run the database for you. Amazon RDS is one example, and there are hosted versions of many popular databases.
  8. Backend as a Service (BaaS). If you are writing a mobile application and you don't want to deal with the backend component then let someone else do it for you. Just concentrate on the mobile platform. That's hard enough. Parse and Firebase are popular examples, but there are many more.
  9. Platform as a Service (PaaS). Let someone else run most of your backend, but you get more flexibility than you have with BaaS to build your own application. Google App Engine, Heroku, and Salesforce are popular examples, but there are many more.
  10. Let Someone Else Do It. Do you really need servers at all? If you have a store, then a service like Etsy saves a lot of work for very little cost. Does someone already do what you need done? Can you leverage it?
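
To make setup #3 concrete, here is a toy round-robin load balancer in Python. It is a sketch only: real deployments use something like HAProxy or Nginx, the backend addresses are assumptions, and it forwards GETs without any of the health checking, SSL termination, or sticky-session handling mentioned above.

```python
# Toy round-robin load balancer (illustrating setup #3).
# Backend addresses are hypothetical; run app servers there first.
import itertools
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

BACKENDS = ["http://127.0.0.1:8081", "http://127.0.0.1:8082"]
pool = itertools.cycle(BACKENDS)  # rotate through backends in order

class LoadBalancer(BaseHTTPRequestHandler):
    def do_GET(self):
        backend = next(pool)  # pick the next backend in the rotation
        with urllib.request.urlopen(backend + self.path) as resp:
            body = resp.read()
        self.send_response(resp.status)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), LoadBalancer).serve_forever()
```
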
Wednesday
Sep 3, 2014

Strategy: Change the Problem

James T. Kirk's infamous gambit in Starfleet's impossible-to-win Kobayashi Maru test was to redefine the problem into a challenge he could beat.

Interestingly, an article titled Shifts In Algorithm Design says much the same gambit is the modern method of solving algorithmic problems.

In the past: 

I, Dick, recall the “good old days of theory.” When I first started working in theory—a sort of double meaning—I could only use deterministic methods. I needed to get the exact answer, no approximations. I had to solve the problem that I was given—no changing the problem.


In the good old days of theory, we got a problem, we worked on it, and sometimes we solved it. Nothing shifty, no changing the problem or modifying the goal. 

Today:

Click to read more ...

Wednesday
Jul 30, 2014

Preventing the Dogpile Effect - Problem and Solution

This is a guest repost by Przemek Sobstel, who believes that the dogpile effect is not covered enough, especially in the PHP world. Original article: Preventing dogpile effect.

The dogpile effect occurs when a cache expires and a website is hit by numerous requests at the same time. From my own experience working on big-traffic websites, this is what I consider the best solution. It was used successfully in the wild and it worked. Many people mention storing two redundant values, FRESH + STALE, but for big-traffic websites that approach was killing our network. We thought it worth sharing our solution and starting a discussion to share experiences.

Preventing Dogpiles
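
The author's full solution is behind the link below, but to ground the discussion, here is a sketch of one common guard against dogpiles: a short-lived lock so only one client recomputes an expired value while everyone else keeps serving the stale copy. The key names, TTLs, and the Redis-based approach are illustrative assumptions, not necessarily the article's exact scheme.

```python
# Sketch of a lock-based dogpile guard (illustrative, not the article's
# exact solution). Values are kept past their logical TTL so stale data
# can be served while a single winner of the lock recomputes.
import time
import redis

r = redis.Redis()

def get_with_dogpile_guard(key, ttl, recompute, grace=30):
    value = r.get(key)
    soft_expiry = r.get(key + ":soft")
    if value is not None and soft_expiry is not None and time.time() < float(soft_expiry):
        return value  # still logically fresh

    # Stale or missing: only the client that wins the lock recomputes.
    if r.set(key + ":lock", 1, nx=True, ex=10):
        value = recompute()  # assumed to return str/bytes in this sketch
        pipe = r.pipeline()
        pipe.set(key, value, ex=ttl + grace)  # hard TTL outlives the soft one
        pipe.set(key + ":soft", time.time() + ttl, ex=ttl + grace)
        pipe.delete(key + ":lock")
        pipe.execute()
        return value

    if value is not None:
        return value          # lost the race: serve the stale copy
    time.sleep(0.1)           # cold cache and lost the race: wait and retry
    return get_with_dogpile_guard(key, ttl, recompute, grace)
```
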

Click to read more ...

Monday
Jul 28, 2014

The Great Microservices vs Monolithic Apps Twitter Melee 

Once upon a time a great Twitter melee was fought for the coveted title of Consensus Best Way to Structure Systems. The competition was between Microservices and Monolithic Apps. 

Flying the logo of Microservices, from a distant cloud-covered land, is the Kingdom of Netflix, whose champion was Sir Adrian Cockcroft (who has pledged fealty to another). And for the Kingdom of ThoughtWorks we have Sir Sam Newman as champion.

Flying the logo of the Monolithic App is champion Sir John Allspaw, from the fair Kingdom of Etsy.

Knights from the Kingdom of Digital Ocean and several independent realms filled out the list.

To the winner goes a great prize: developer mindshare and the favor of that most fickle of ladies, Lady Luck.

May the best paradigm win.

The opening blow was wielded by the highly ranked Sir Cockcroft, a veteran of many tournaments:

Click to read more ...

Wednesday
Jul 16, 2014

10 Program Busting Caching Mistakes

While Ten Caching Mistakes that Break your App by Omar Al Zabir is a few years old, it is still a great source of advice on using caches, especially on the differences between using a local in-memory cache and a distributed cache.

Here are the top 10 mistakes (summarized):
  1. Relying on a default serializer. Default serializers can use a lot of CPU, especially for complex types. Give some thought to the best serialization and deserialization method for your language and environment.
  2. Storing large objects in a single cache item. Because of serialization and deserialization costs, under concurrent load, frequent access to large object graphs can kill your server's CPU. Instead, break up the larger graph into smaller subgraphs and cache them separately. Retrieve only the smallest unit you need.
  3. Using cache to share objects between threads. When writes are involved, race conditions develop if parts of a program access the same cached items simultaneously. Some sort of external locking mechanism is needed.
  4. Assuming items will be in cache immediately after storing them. Never assume an item will be in a cache, even after it was just written, because a cache can flush items when memory gets tight. Code should always check for a null return value from a cache (see the sketch after this list).
  5. Storing entire collection with nested objects. Storing an entire collection when you need to get a particular item results in poor performance because of the serialization overhead. Cache individual items separately so they can be retrieved separately. 
  6. Storing parent-child objects together and also separately. Sometimes an object will simultaneously be contained in two or more parent objects. To avoid storing the same object in two different places in the cache, store it on its own under its own key. The parent objects will then read the object when access is needed.
  7. Caching Configuration settings. Store configuration data in a static variable that is local to your process. Accessing cached data is expensive so you want to avoid that cost when possible.
  8. Caching live objects that have open handles to streams, files, the registry, or the network. Don't cache objects that have references to resources like files, streams, memory, etc. When the cached item is removed from the cache, those resources will not be released and system resources will leak.
  9. Storing the same item using multiple keys. It can be convenient to access an item by both a key and an index number. This can work when a cache is in-memory, because the cache holds a reference to the same object, which means changes to the object will be seen through both access paths. With a remote cache any updates won't be visible, so the copies will get out of sync.
  10. Not updating or deleting items in the cache after updating or deleting them in persistent storage. Items in a remote cache are stored as copies, so updating an object won't update the cache. The cache must specifically be updated for the changes to be seen by anyone else. With an in-memory cache, changes to an object will be seen by everyone. The same goes for deletion: deleting an object in persistent storage won't delete it from the cache. It's up to the program to make sure cached items are updated and deleted correctly.
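
Mistake #4 in particular is worth a picture. Below is a minimal cache-aside sketch in Python; the cache.get/cache.set signatures are assumptions standing in for whatever memcached or Redis client you use.

```python
# Defensive cache read (mistake #4): never assume a just-written item
# is still cached; the cache may have evicted it at any moment.
def get_user(cache, db, user_id):
    key = f"user:{user_id}"
    user = cache.get(key)
    if user is None:                     # miss: item may have been evicted
        user = db.load_user(user_id)     # fall back to the source of truth
        cache.set(key, user, ttl=300)    # repopulate; may be evicted again
    return user
```
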
Wednesday
Jun 25, 2014

The Secret of Scaling: You Can't Linearly Scale Effort with Capacity

The title is a paraphrase of something Raymond Blum, who leads a team of Site Reliability Engineers at Google, said in his talk How Google Backs Up the Internet. I thought it a powerful enough idea that it should be pulled out on its own:

Mr. Blum explained common backup strategies don't work for Google for a very googly-sounding reason: typically they scale effort with capacity.

If backing up twice as much data requires twice as much stuff to do it, where stuff is time, energy, space, etc., it won’t work, it doesn’t scale. 

You have to find efficiencies so that capacity can scale faster than the effort needed to support that capacity.

A different plan is needed when making the jump from backing up one exabyte to backing up two exabytes.
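
A toy model makes the point (the numbers are mine, not Google's): with linear effort, doubling capacity doubles the work; with effort that grows only logarithmically, doubling capacity adds a constant.

```python
# Illustrative only: linear vs. logarithmic effort as capacity doubles.
import math

def linear_effort(capacity_pb, staff_per_pb=0.5):
    return staff_per_pb * capacity_pb            # effort scales with capacity

def automated_effort(capacity_pb, base_staff=10):
    return base_staff * math.log2(capacity_pb)   # effort scales with log(capacity)

for capacity in (1_000, 2_000, 4_000):           # petabytes
    print(capacity, linear_effort(capacity), round(automated_effort(capacity), 1))
```
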

When you hear the idea of not scaling effort with capacity it sounds so obvious that it doesn't warrant much further thought. But it's actually a profound notion. Worthy of better treatment than I'm giving it here:

Click to read more ...

Monday
Jun 23, 2014

Performance at Scale: SSDs, Silver Bullets, and Serialization

This is a guest post by Aaron Sullivan, Director & Principal Engineer at Rackspace.

We all love a silver bullet. Over the last few years, if I were to split the outcomes that I see with Rackspace customers who start using SSDs, the majority of the outcomes fall under two scenarios. The first scenario is a silver bullet—adding SSDs creates near-miraculous performance improvements. The second scenario (the most common) is typically a case of the bullet being fired at the wrong target—the results fall well short of expectations.

With the second scenario, the file system, data stores, and processes frequently become destabilized. These demoralizing results, however, usually occur when customers are trying to speed up the wrong thing.

A common phenomenon at the heart of the disappointing SSD outcomes is serialization. Despite the fact that most servers have parallel processors (e.g. multicore, multi-socket), parallel memory systems (e.g. NUMA, multi-channel memory controllers), parallel storage systems (e.g. disk striping, NAND), and multithreaded software, transactions still must happen in a certain order. For some parts of your software and system design, processing goes step by step. Step 1. Then step 2. Then step 3. That’s serialization.

And just because some parts of your software or systems are inherently parallel doesn’t mean that those parts aren’t serialized behind other parts. Some systems may be capable of receiving and processing thousands of discrete requests simultaneously in one part, only to wait behind some other, serialized part. Software developers and systems architects have dealt with this in a variety of ways. Multi-tier web architecture was conceived, in part, to deal with this problem. More recently, database sharding also helps to address this problem. But making some parts of a system parallel doesn’t mean all parts are parallel. And some things, even after being explicitly enhanced (and marketed) for parallelism, still contain some elements of serialization.

How far back does this problem go? It has been with us in computing since the inception of parallel computing, going back at least as far as the 1960s(1). Over the last ten years, exceptional improvements have been made in parallel memory systems, distributed database and storage systems, multicore CPUs, GPUs, and so on. The improvements often follow after the introduction of a new innovation in hardware. So, with SSDs, we’re peering at the same basic problem through a new lens. And improvements haven’t just focused on improving the SSD, itself. Our whole conception of storage software stacks is changing, along with it. But, as you’ll see later, even if we made the whole storage stack thousands of times faster than it is today, serialization will still be a problem. We’re always finding ways to deal with the issue, but rarely can we make it go away.
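
That last claim is Amdahl's law in action, and a quick calculation shows it: with even a small serialized fraction, making the parallel part thousands of times faster barely moves the overall speedup. The 5% figure below is illustrative.

```python
# Amdahl's law: overall speedup is capped by the serialized fraction.
def amdahl_speedup(serial_fraction, parallel_speedup):
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / parallel_speedup)

for s in (1_000, 10_000, 1_000_000):
    print(f"{s}x faster parallel part, 5% serialized -> "
          f"{amdahl_speedup(0.05, s):.1f}x overall")   # converges on 20x
```
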

Parallelization and Serialization

Click to read more ...

Wednesday
May 21, 2014

9 Principles of High Performance Programs

Arvid Norberg on the libtorrent blog has put together an excellent list of principles of high performance programs, obviously derived from hard-won experience programming BitTorrent:

Two fundamental causes of performance problems:

  1. Memory Latency. A big performance problem on modern computers is the latency of SDRAM. The CPU sits idle waiting for a read from memory to come back (see the sketch after this list).
  2. Context Switching. When a CPU switches context "the memory it will access is most likely unrelated to the memory the previous context was accessing. This often results in significant eviction of the previous cache, and requires the switched-to context to load much of its data from RAM, which is slow."
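
Point #1 is easy to demonstrate. The sketch below (assuming numpy is available; sizes and the measured gap will vary by machine) sums the same number of elements twice, once contiguously and once with a large stride that defeats the CPU cache.

```python
# Rough memory-latency demo: strided access misses cache, contiguous doesn't.
import time
import numpy as np

a = np.ones(16_000_000, dtype=np.float64)  # ~128 MB

def timed_sum(view):
    start = time.perf_counter()
    view.sum()
    return time.perf_counter() - start

contiguous = timed_sum(a[:1_000_000])  # 1M adjacent elements
strided = timed_sum(a[::16])           # 1M elements, 128-byte stride
print(f"contiguous: {contiguous:.4f}s  strided: {strided:.4f}s")
```
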

Rules to help balance the forces of evil:

Click to read more ...

Wednesday
May 14, 2014

Google Says Cloud Prices Will Follow Moore’s Law: Are We All Renters Now?

After Google cut prices on their Google Cloud Platform, Amazon quickly followed with their own price cuts. Even more interesting is what the future holds for pricing. The near future looks great. After that? We'll see.

Adrian Cockcroft highlights that Google thinks prices should follow Moore’s law, which means we should expect prices to halve every 18-24 months.
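
Taken literally (these projections are mine, just running out the claim), a halving period of 18-24 months implies prices like this:

```python
# Project the Moore's-law pricing claim: price halves every 18-24 months.
def projected_price(price_now, months, halving_period_months):
    return price_now * 0.5 ** (months / halving_period_months)

for years in (1, 3, 5):
    fast = projected_price(100, years * 12, 18)  # 18-month halving
    slow = projected_price(100, years * 12, 24)  # 24-month halving
    print(f"year {years}: a $100/month bill falls to ${fast:.2f}-${slow:.2f}")
```
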

That's good news. Greater cost certainty means you can make much more aggressive build out plans. With the savings you can hire more people, handle more customers, and add those media-rich features you thought you couldn't afford. Design is directly related to costs.

Without Google competing with Amazon there's little doubt the price reduction curve would be much less favorable.

As a late cloud entrant Google is now in a customer acquisition phase, so they are willing to pay for customers, which means lower prices are an acceptable cost of doing business. Profit and high margins are not the objective. Getting market share is what is important.

Amazon, on the other hand, has been reaping the higher margins earned from recurring customers. So Google's entrance into the early product life cycle phase is making Amazon eat into their margins and is forcing down prices overall.

But there's a floor to how low prices can go. Alen Peacock, co-founder of Space Monkey, has an interesting graphic telling the story. This is Amazon's historical pricing for 1TB of storage in S3, graphed as a multiple of the historical pricing for 1TB of local hard disk:

Alen explains it this way:

Cloud prices do decrease over time, and have dropped significantly over the timespan shown in the graph, but this graph shows cloud storage prices as a multiple of hard disk prices. In other words, hard disk prices are dropping much faster than datacenter prices. This is because, right, datacenters have costs other than hard disks (power, cooling, bandwidth, building infrastructure, diesel backup generators, battery backup systems, fire suppression, staffing, etc). Most of those costs do not follow Moore's Law -- in fact energy costs are on a long trend upwards. So over time, the gap shown by the graph should continue to widen.


The economic advantage of building your own (but housed in datacenters) is there, but it isn't huge. There is also some long-term strategic advantage to building your own, e.g., GDrive dropped their price dramatically at will because Google owns their datacenters, but Dropbox couldn't do that without convincing Amazon to drop the price they pay for S3.

Costs other than hardware began dominating in datacenters several years ago, so Moore's Law-like effects are dampened. Energy and cooling costs do not follow Moore's Law, and those costs make up a huge component of the overall picture in datacenters. This is only going to get worse, barring some radical new energy production technology arriving on the scene.

What we're [Space Monkey] interested in, long term, is dropping the floor out from underneath all of these, and I think that only happens if you get out of the datacenter entirely.

As the cloud market is still growing there will still be a fight for market share. When growth slows and the market is divided between the major players, a collusive pricing phase will take over. Cloud customers are sticky customers. It's hard to move off a cloud. The need for higher margins to justify the cash flow drain during the customer acquisition phase will reverse the favorable trends we are seeing now.

Until then it seems the economics indicate we are in a rent, not a buy world.

Related Articles 

  • IaaS Series: Cloud Storage Pricing – How Low Can They Go? - "For now it seems we can assume we’ve not seen the last of the big price reductions."
  • The Cloud Is Not Green
  • Brad Stone: “Bill Miller, the chief investment officer at Legg Mason Capital Management and a major Amazon shareholder, asked Bezos at the time about the profitability prospects for AWS. Bezos predicted they would be good over the long term but said that he didn’t want to repeat “Steve Jobs’s mistake” of pricing the iPhone in a way that was so fantastically profitable that the smartphone market became a magnet for competition.” 
Monday
May 12, 2014

4 Architecture Issues When Scaling Web Applications: Bottlenecks, Database, CPU, IO

This is a guest repost by Venkatesh CM at Architecture Issues Scaling Web Applications.

In this post I will cover architecture issues that show up while scaling and performance-tuning large-scale web applications.

Let's start by defining a few terms to create a common understanding and vocabulary. Later on I will go through the different issues that pop up while scaling a web application, like:

  • Architecture bottlenecks
  • Scaling Database
  • CPU Bound Application
  • IO Bound Application

Determining the optimal thread pool size of a web application will be covered in the next post.
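
Until then, one widely cited rule of thumb (a placeholder sketch, not the author's forthcoming treatment) sizes a pool at cores * (1 + wait time / compute time), which captures the CPU-bound vs IO-bound distinction above:

```python
# Rule-of-thumb thread pool sizing: cores * (1 + wait/compute).
import os

def thread_pool_size(wait_ms, compute_ms, cores=None):
    cores = cores or os.cpu_count() or 1
    return round(cores * (1 + wait_ms / compute_ms))

print(thread_pool_size(wait_ms=1, compute_ms=99))   # CPU bound: ~1 thread per core
print(thread_pool_size(wait_ms=90, compute_ms=10))  # IO bound: ~10 threads per core
```
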

Performance

Click to read more ...
