Entries in CDN (17)

Wednesday
Jan132016

Live Video Streaming At Facebook Scale

With 1.49 billion monthly active users, operating at Facebook scale is far from trivial. Facebook's new live video streaming services present a fascinating use case for designing streaming service in global distribution and massive scale.

Click to read more ...

Monday
Jul232012

State of the CDN: More Traffic, Stable Prices, More Products, Profits - Not So Much

CDNs (content delivery networks) are the secret shadow super powers behind the web and Dan Rayburn at streamingmedia.com is the go to investigative reporter for quality information on CDNs. Every year Dan has a Content Delivery Summit on all things CDN and those videos are now available. Dan also gives a kind of state of the industry talk where he does something wonderful, he gives real numbers and prices. Dan really knows his stuff and is an excellent speaker, so watch the video, but here’s my gloss on the state of the CDN so far this year:

  • Massive growth. Large customers are expecting 126% growth in video traffic over last year; medium size customers are seeing 48% traffic growth, small sized customer are seeing 73.3% traffic growth.
  • More traffic != More profit...

Click to read more ...

Thursday
Aug182011

Paper: The Akamai Network - 61,000 servers, 1,000 networks, 70 countries 

Update: as of the end of Q2 2011, Akamai had 95,811 servers deployed globally.

Akamai is the CDN to the stars. It claims to deliver between 15 and 30 percent of all Web traffic, with major customers like Facebook, Twitter, Apple, and the US military. Traditionally quite secretive, we get a peek behind the curtain in this paper: The Akamai Network: A Platform for High-Performance Internet Applications by Erik Nygren, Ramesh Sitaraman, and Jennifer Sun. 

Abstract:
Comprising more than 61,000 servers located across nearly 1,000 networks in 70 countries worldwide, the Akamai platform delivers hundreds of billions of Internet interactions daily, helping thousands of enterprises boost the performance and reliability of their Internet applications. In this paper, we give an overview of the components and capabilities of this large-scale distributed computing platform, and offer some insight into its architecture, design principles, operation, and management. 

Delivering applications over the Internet is a bit like living in the Wild West, there are problems: Peering point congestion, Inefficient communications protocols, Inefficient  routing  protocols, Unreliable networks, Scalability, Application limitations and a slow rate of change adoption. A CDN is the White Hat trying to remove these obstacles for enterprise customers. They do this by creating a delivery network that is a virtual network over the existing Internet. The paper goes on to explain how they make this happen using edge networks and a sophisticated software infrastructure. With such a powerful underlying platform, Akamai is clearly Google-like in their ability to deliver products few others can hope to match.

Detailed and clearly written, it's well worth a read.

Click to read more ...

Monday
Jul122010

Creating Scalable Digital Libraries

Like many other media content providers, libraries and museums are increasingly moving their content onto the Web.  While the move itself is no easy process (with digitization, web development, and training costs), being able to successfully deliver content to a wide audience is an ongoing concern, particularly for large libraries.

Much of the concern is financial, as most libraries do not have the internal budget or outside investors that for-profit businesses enjoy.  Even large university libraries will face serious budget constraints that even other university departments, such as science and technology would not face.

Creating a scalable infrastructure and also distributing a large digital collection that can handle multiple requests, requires planning that many librarians have not even imagined.  They must stop thinking in terms of "one-item-per-customer" and start thinking in terms of numerous users accessing the same information simultaneously.

Click to read more ...

Thursday
Oct012009

Moving Beyond End-to-End Path Information to Optimize CDN Performance

You go through the expense of installing CDNs all over the globe to make sure users always have a node close by and you notice something curious and furious: clients still experience poor latencies. What's up with that? What do you do to find the problem? If you are Google you build a tool (WhyHigh) to figure out what's up. This paper is about the tool and the unexpected problem of high latencies on CDNs. The main problems they found: inefficient routing to nearby nodes and packet queuing. But more useful is the architecture of WhyHigh and how it goes about identifying bottle necks. And even more useful is the general belief in creating sophisticated tools to understand and improve your service. That's what professionals do. From the abstract:
Replicating content across a geographically distributed set of servers and redirecting clients to the closest server in terms of latency has emerged as a common paradigm for improving client performance. In this paper, we analyze latencies measured from servers in Google’s content distribution network (CDN) to clients all across the Internet to study the effectiveness of latency-based server selection. Our main result is that redirecting every client to the server with least latency does not suffice to optimize client latencies. First, even though most clients are served by a geographically nearby CDN node, a sizeable fraction of clients experience latencies several tens of milliseconds higher than other clients in the same region. Second, we find that queueing delays often override the benefits of a client interacting with a nearby server.
To help the administrators of Google’s CDN cope with these problems, we have built a system called WhyHigh. First, WhyHigh measures client latencies across all nodes in the CDN and correlates measurements to identify the prefixes affected by inflated latencies. Second, since clients in several thousand prefixes have poor latencies, WhyHigh prioritizes problems based on the impact that solving them would have, e.g., by identifying either an AS path common to several inflated prefixes or a CDN node where path inflation is widespread. Finally, WhyHigh diagnoses the causes for inflated latencies using active measurements such as traceroutes and pings, in combination with datasets such as BGP paths and flow records. Typical causes discovered include lack of peering, routing misconfigurations, and side-effects of traffic engineering. We have used WhyHigh to diagnose several instances of inflated latencies, and our efforts over the course of a year have significantly helped improve the performance offered to clients by Google’s CDN.

Related Articles

  • Product: Akamai
  • Tuesday
    Apr212009

    What CDN would you recommend?

    Update 10: The Value of CDNs by Mike Axelrod of Google. Google implements a distributed content cache from within large ISPs. This allows them to serve content from the edge of the network and save bandwidth on the ISPs backbone. Update 9: Just Jump: Start using Clouds and CDNs. Bob Buffone gives a really nice and practical tutorial of how to use CloudFront as your CDN. Update 8: Akamai’s Services Become Affordable for Anyone! Blazing Web Site Performance by Distribution Cloud. Distribution Cloud starts at $150 per month for access to the best content distribution network in the world and the leader of Content Distribution Networks. Update 7: Where Amazon’s Data Centers Are Located, Expanding the Cloud: Amazon CloudFront. Why Amazon's CDN Offering Is No Threat To Akamai, Limelight or CDN Pricing. Amazon has launched their CDN with "“low latency, high data transfer speeds, and no commitments.” The perfect relationship for many. The majority of the locations are in North America, but some are in Europe and Asia. Update 6: Amazon Launching New Content Delivery Network: No Threat To Major CDNs, Yet. All the Amazon will kill all other CDNs is a bit overblown. As usual Dan Rayburn sets us straight: The offering won't support streaming, live broadcasting, or provide many of the other products and services that video content owners need...the real story here is that Amazon is going to offer a high performance method of distributing content with low latency and high data transfer rates. Update 5: When It Comes To Content Delivery Networks, What Is The "Edge"?. Dan Rayburn is on edge about the misuse of the term edge: closest location to the user does not guarantee quality, often content is not delivered from the closest location, all content is not replicated at every "edge" location. Lots of other essential information. Update 4: David Cancel runs a great test to see if you should be Using Amazon S3 as a CDN?. Conclusion: "CacheFly performed the best but only slightly better than EdgeCast. The S3 option was the worst with the Nginx/DIY option performing just over 100 ms faster." Also take look at Part 2 - Cacheability? Update 3: Mr. Rayburn takes A Detailed Look At Akamai's Application Delivery Product . They create a "bi-nodal overlay network" where users and servers are always within 5 to 10 milliseconds of each other. Your data center hosted app can't compete. The problem is that people (that is, me) can understand the data center model. I don't yet understand how applications as a CDN will work. Update 2: Dan Rayburn starts an interesting series of articles on Highlights Of My Day In Cambridge With Akamai. Akamai is moving strong into the application distribution business. That would make an interesting cloud alternative.. Update: Streamingmedia links to new CDN DF Splash that specializes in instant-on TV-quality video streaming. A question was raised on the forum asking for a CDN recommendation. As usual there are no definitive answers, but here are three useful articles that may help your deliberations.

  • First, Tony Chang shows how to drive down response times using edge acceleration strategies.
  • Then Pingdom gives a nice overview and introduction to CDNs.
  • And last but not least, Dan Rayburn from StreamingMedia.com gives a master class in how much you should pay for your CDN, what you should be getting for your money, and how to find the right provider for your needs. Lots and lots of good stuff to learn, even if you didn't roll out of bed this morning pondering the deeper mysteries of content delivery networks and the Canadian dollar.

    Edge Acceleration Strategies: Akamai by Tony Chang

    The edge network is the "network physically closest to the end user and the 'origin' is where the application(s) is hosted." Tony talks about how you use CDNs to manage the user experience through meeting millisecond+ level SLAs using edge acceleration services. He does this in an interesting way. He follows a request through its life cycle and shows how to turn your caterpillar into a butterfly at each stage:
  • An edge DNS means a name server closest to the end user will serve the DNS request.
  • Static content is easily cached on the edge.
  • Dynamic content is moving to the edge using what Akamai calls Web Application Accelerators.
  • And something I've never heard of is to use your CDN to improve routing performance by up to 33%. The service bypasses BGP using its own more optimized route tables to decrease latency.

    Pingdom's A look at Content Delivery Networks, or “how to serve lots of content really fast”

    CDNs are the hidden powerhouse of the internet. The unsung mitochondria powering bits forward. Cost, convenience and performance are the reasons people turn to CDNs. A CDN does what you can't, it put lots of servers in lots of different places. Panther Express, for example, puts 800 servers in 22 different geographical locations. Since CDNs sell delivery capacity capacity planning is one of their big challenges. And Pingdom would like you to recognize the importance of monitoring for detecting and quickly reacting to problems :-) The future of CDNs lies in larger caches for HD video, better locality, and more integration with hosting providers.

    Video on Content Delivery Network Pricing, Costs for Outsourced Video Delivery by Dan Rayburn

    Also CDN Pricing Data: Average Cost Per GB Declines In Q4 Due To Startups. It's evident Dan really knows his stuff. His articles and presentations are highly educational for anyone interested in the complex and confusing CDN world. Dan sees hundreds of real-life customer-CDN vendor contracts a year and he reports on real prices averaged over all the contracts he has seen. One of the hardest things as a consumer is knowing what a good price is for your basket of goods and Dan gives you the edge, so to speak. What I learned:
  • You decide who is a CDN.There's no central agency setting a standard. Dan's minimal definition is a service delivering live video in the US and Europe.
  • CDN market has gone from 10 to 30 vendors. VCs are pumping hundreds of millions into the space.
  • CDN providers provide a wide variety of services: application caching, static caching, streaming video, progressive video, etc. Dan concentrates only on video delivery.
  • You can't say vendor A is better than vendor B. It depends on your needs.
  • When comparing vendors you need to do an "apples to apples" comparison. He really likes that phrase. You can't compare vendors, only like products between vendors.
  • Video serving is complex because there are few standards in the market. There are multiple platforms, multiple encoding standards, etc.
  • Customer's don't buy on price alone. Delivery of bits over a network is a commodity. Buy on SLA, customer service, product, format, support, geographic reach, and performance.
  • There appears to be no way to compare vendors on the performance of their network. There are too many variables in play to make an accurate comparison.He's quite adament about this. Performance could mean: SLA, customer service, upload content, buffering, etc. No way to measure performance network performance across networks. Static image performance is very different than streaming performance. People all over the globe are accessing your content so what is the "performance" of that?
  • A trend this year is demand for P2P pricing and services.
  • To price your video delivery you need to answer 4 questions: 1) How many hours? 2) How many users? 3) How long will they watch? 4) What encoding and what bit rate?
  • Price varies on product bundle. Vendors need to specialize so they can move themselves out of the commodity market. If you would pay 8 cents a gig for delivered video your price will be different if you add application and static caching.
  • Contracts are at 12 months. Maybe 2 years when bundling services.
  • P2P is not necessarily cheaper so compare. Pricing is very confusing.
  • You can sometimes get a lower price by using the vendor's player.
  • Flash streaming is more expensive because of licensing fees. The number varies because each vendor cuts their own licensing deals. Could be 20% more, or it could be double, depends on volume.
  • When signing a vendor think if you need global reach or is regional reach sufficient? Use a regional service provider if you need a CDN just once in a while. It's matter of picking based on your needs. You can often get a less expensive deal and get a quarterly commit versus a montly commit.
  • Storage costs have really fallen. High of $10/gig and low of 10 cents per gig.
  • Most CDNs will pull from your origin storage and cache, which reduces your storage cost.
  • CDNs don't want to get paid with promises of ad sharing.
  • Pick a CDN vendor that will take the time to educate you. They should ask you about your business first, they shouldn't talk about themselves first. He mentions this point a few times and it makes a lot of sense.
  • Consider a dual vendor strategy where you pick one vendor for video and another for application.
  • Quality in the industry is very high. People rarely complain about the network. Customers want better support and reporting. Poor reporting is the #1 complaint. Run away if a vendor wants to charge for reporting. Lots of good stuff.

    Related Articles

  • Highscalability CDN Tag Cloud
  • Edge Acceleration Strategies: Akamai by Tony Chang
  • A look at Content Delivery Networks, or “how to serve lots of content really fast”
  • Content Delivery Network Pricing, Costs for Outsourced Video Delivery
  • CDN Pricing Data: Average Cost Per GB Declines In Q4 Due To Startups
  • A Taxonomy and Survey of Content Delivery Networks
  • Content Delivery Networks (CDN) Research Directory

    Click to read more ...

  • Thursday
    Feb052009

    Beta testers wanted for ultra high-scalability/performance clustered object storage system designed for web content delivery

    DataDirect Networks (www.ddn.com) is searching for beta testers for our exciting new object-based clustered storage system. Does this sound like you? * Need to store millions to hundreds of billions of files * Want to use one big file system but can't because no single file system scales big enough * Running out of inodes * Have to constantly tweak file systems to perform better * Need to replicate content to more than one data center across geographies * Have thumbnail images or other small files that wreak havoc on your file and storage systems * Constantly tweaking and engineering around performance and scalability limits * No storage system delivers enough IOPS to serve your content * Spend time load balancing the storage environment * Want a single, simple way to manage all this data If this sounds like you, please contact me at jgoldstein@ddn.com. DataDirect Networks is a 10-year old, well-established storage systems company specializing in Extreme Storage environments. We've deployed both the largest and the fastest storage/file systems on the planet - currently running at over 250GB/s. Our upcoming product is going to change the way storage is deployed for scalable web content and we're seeking testers who can throw their most challenging problems at our new system. It's time for something better and we're going to deliver it.

    Click to read more ...

    Tuesday
    Sep092008

    Content Delivery Networks (CDN) – a comprehensive list of providers

    We build web applications…and there are plenty of them around. Now, if we hit the jackpot and our application becomes very popular, traffic goes up, and our servers are brought down by the hordes of people coming to our website. What do we do in that situation? Of course, I am not talking here about the kind of traffic Digg, Yahoo Buzz or other social media sites can bring to a website, which is temporary overnight traffic, or a website which uses cloud computing like Amazon EC2 service, MediaTemple Grid Service or Mosso Hosting Cloud service. I am talking about traffic that consistently increases over time as the service achieves success. Google.com, Yahoo.com, Myspace.com, Facebook.com, Plentyoffish.com, Linkedin.com, Youtube.com and others are examples of services which have constant high traffic. Knowing that users want speed from their applications, these services will always use a Content Delivery Network (CDN) to deliver that speed. What is a Content Delivery Network? A Content Delivery Network (CDN) is a collection of web servers distributed across multiple locations to deliver content more efficiently to users. The server selected for delivering content to a specific user is typically based on a measure of network proximity. For example, the server with the fewest network hops or the server with the quickest response time is chosen. This will help scaling a web application by taking a part of the load from the service servers. Read the entire article about Content Delivery Networks (CDN) list of providers at MyTestBox.com - web software reviews, news, tips & tricks.

    Click to read more ...

    Tuesday
    Jul222008

    Scaling Bumper Sticker: A 1 Billion Page Per Month Facebook RoR App  

    Several months ago I attended a Joyent presentation where the spokesman hinted that Joyent had the chops to support a one billion page per month Facebook Ruby on Rails application. Even under a few seconds of merciless grilling he would not give up the name of the application. Now we have the big reveal: it was LinkedIn's Bumper Sticker app. For those not currently sticking things on bumps, Bumper Sticker is quite surprisingly a viral media sharing application that allows users to express their individuality by sticking small virtual stickers on Facebook profiles. At the time I was quite curious how Joyent's cloud approach could be leveraged for this kind of app. Now that they've released a few details, we get to find out.

    Site: http://www.Facebook.com/apps/application.php?id=2427603417

    Information Sources

  • Video: Scaling to 1 Billion Page Views Per MonthVideo (very flashy)
  • Web Scalability Practices: Bumper Sticker on Rails by Ikai Lan and Jim Meyer from LinkedIn
  • 1 Billion Page Views a Month by David Young from Joyent
  • Ruby on Rails: scaling to 1 billion page views per month by Dennis Howlettby from Zdnet
  • Joyent's Grid Accelerators for Web Applications by Jason Hoffman from Joyent
  • On Grids, the Ambitions of Amazon and Joyent by Jason Hoffman from Joyent
  • Scaling Ruby on Rails to 1 Billion Page Views a Month by Joe Pruitt from DevCentral

    The Platform

  • MySQL
  • Nginx
  • Mongrel
  • CDN
  • Ruby on Rails (rapid prototype development approach)
  • Facebook
  • Joyent Accelerator - provides a highly scalable on-demand infrastructure for running web sites, including rich web applications written in Ruby on Rails, PHP, Python and Java. Joyent Accelerators are next-generation virtual computers that can grow and multiply (or shrink and consolidate) depending on the real world demands faced by your Web application. Accelerators are built on OpenSolaris, multi-core (8+), RAM-rich servers (32GB+ each) and vast amounts of NAS storage.
  • Masochism Plugin - provides an easy solution for Ruby on Rails applications to work in a replicated database environment. Connection proxy sends some database queries (those in a transaction, update statements, and ActiveRecord::Base#reload) to a master database, and the rest to the slave database.

    The Stats

  • 1 billion page views per month
  • 13.5 million installations
  • 1.5 million daily active users. Recruited 1 million users in first 46 days.
  • 20-27 million canvas page views a day
  • 13 web application servers running Nginx and Mongrel
  • 8 static asset servers serving over 3,500,000 stickers (migrating to a CDN)
  • 4 MySQL servers in a master/slave configuration using Masochism as a proxy to load balance database operations.
  • Cost is about $25K/month.

    The Architecture

  • Bumper Sticker was an experiment to see how fast the Light Engineering Development (LED) team at LinkedIn could build a Ruby on Rails Facebook application.
  • RoR was an easy an environment to prototype in, but they needed a production environment in which they could quickly develop, deploy, and scale. Joyent was selected.
  • Some Notes on Joyent:
    * Joyent is a scale on demand cloud. Allows customers to have a dynamic data center instead of being stuck using their own rigid infrastructure.
    * There's an API if you need one. The service is unmanaged, you get root on all your boxes.
    * They consider their infrastructure to be better and more open than Amazon. You get access to a high end load balancer and the capabilities of OpenSolaris (Dtrace, Zones, lower request processing overhead, sub 10 second reboot times).
    * Joyent's primary scalability principle is to organize apps around silos built from their powerful Accelerator blocks: put applications on different servers based on the quality of service you want to give them. For example, put static content on their own servers so the static content is always served fast and reliably. This allows you to prioritize based on what's important to you. You could, for example, prioritize the virality of your application by putting the Invite Friends functionality on their own servers, thus assuring the growth of your application through your viral functionality possibly at the expense of less important functionality.
    * Has three data centers in the US and are opening a fourth, none in Europe.
    * Considers their secret sauce to be their highly sophisticated administration system which allows a few people to easily manage a large infrastructure.
    * Has a peering relationship with Facebook. That means there are direct high-speed fiber links between Joyent’s data center in Emeryville and Facebook’s data center in San Francisco.

  • 80% of the content for Bumber Sticker is static. The Facebook API can directly render content at a specified memory location. Bumper Sticker was able to use the scripting feature of F5 BIG-IP load balancer to directly load static content by passing a pointer to the Facebook API.

    The Lessons

  • Rails scales exactly like any other app. Take into account all the components from the moment the request is received at the load balancer all the way down and all the way back again.
  • The development process is: put some measurements in place, find problems, fix problems, more people adopt and scale you out of your solution, and the cycle repeats. Sun's Dtrace feature makes it easy to instrument the stack to identify bottlenecks.
  • Rails scales as long as the development team using it understands that many of the bottlenecks are exactly those faced by developers on any other database-driven web platform.
  • Hit a disk spindle and you are screwed. Avoid going to the database or the file system. The more they avoided disk the fewer timeouts they experienced.
  • Convert anything dynamic into static content. Dynamic content is your enemy. Convert anything dynamic into static content so it can be removed from the disk path.
  • Push content to the edge. Move content as close to the client as possible. Move cache to the CDN. Reduce time going across the network.
  • Faster means more viral. On a viral system the better the performance the more people can play with your application. The more people who play with your system the more likely they are to pull more people in, which means the more the app will spread and go viral. Bumper Sticker has been successful at creating a community of fans who enjoy uploading and sharing their own stickers.

    Some issues:
  • Since most of the content is static and served by the load balancer, the impact of Rails in the system is not clear.
  • The functionality of Bumper Sticker is relatively simple. What would the impact be on scalability if other often requested features like search were added?

    Related Articles

  • Friends for Sale Architecture - A 300 Million Page View/Month Facebook RoR App
  • Monday
    May192008

    UK Based CDN

    Hi, I was wondering if I could borrow the collective minds of you all to draw up a list to the CDN's that you'd use/do use in the UK. If they're outside the UK but have decent support then also include. The service must be cheap and not require a huge setup fee, it's really only for a small time business; it shares video & high-res pics so mass cheap storage is a must and wondered whether you guys had any ideas, also costs? Mass storage isn't cheap in the UK compared to the states, for example, unless I go colo but as I say, it's a small setup but happens to require a fair bit of space. Would S3 be a good starting point? What is the service like? I hear mixed reviews about it. Many thanks, Jim

    Click to read more ...