Entries in hot links (450)

Thursday
Feb042010

Hot Scalability Links for February 4, 2010

Lots of cool stuff happening this week...

  1. Voldemort gets rebalancing. It's one thing to shard data to scale, it's a completely different level of functionality to manage those shards intelligently. Voldemort has stepped up by adding advanced rebalancing functionality: Dynamic addition of new nodes to the cluster; Deletion of nodes from cluster; Load balancing of data inside a cluster.
  2. Microsoft Finally Opens Azure for Business. Out of the blue Microsoft opens up their platform as a service service. Good to have more competition and we'll keep an eye out for experience reports.
  3. New details on LinkedIn architecture by Greg Linden. LinkedIn appears to only use caching minimally, preferring to spend their efforts and machine resources on making sure they can recompute computations quickly than on hiding poor performance behind caching layers.
  4. The end of SQL and relational databases?  by David Intersimone. For new projects, I believe, we have genuine non-relational alternatives on the table (pun intended).
  5. HipHop for PHP: Move Fast. When you make millions of widgets saving pennies per widget quickly adds up to real money. Facebook released HipHop, a PHP compiler, aimed at shaving off cycle of CPU and bytes of memory in production of their social widgets. 

Click to read more ...

Wednesday
Jan272010

Hot Scalability Links for January 28 2010

  1. Google's Research Areas of Interest: Building scalable, robust cluster applications. At Google we see distributed systems as a technology in its infancy, with huge gaps in the supporting research  that represent some of the most important problems in the space. Here are some examples: Resource sharing, Balancing cost, performance, and reliability, Self-maintaining systems. 
  2. Amazon SimpleDB: A Simple Way to Store Complex Data by Paul Tremblett. The most effective way I have found to understand SimpleDB is to think about it in terms of something else we all use and understand -- a spreadsheet.
  3. Rackspace Cloud Servers versus Amazon EC2: Performance Analysis. The Bitsource conducted a review of the two cloud computing platforms, Rackspace Cloud Servers and Amazon Elastic Compute Cloud (EC2), to get a general idea of overall system performance.
  4. Private Clouds Are Not The Future by Jame Hamilton. Private clouds are better than nothing but an investment in a private cloud is an investment in a temporary fix that will only slow the path to the final destination: shared clouds.
  5. What is the right way to measure scale? by Daniel Abadi. So which scales better? Is using the number of nodes a better proxy than size of data? Hadoop can “scale” to 3800 nodes. So far, all we know is that Greenplum can “scale” to 96 nodes. Can it handle more nodes?

Click to read more ...

Wednesday
Jan132010

10 Hot Scalability Links for January 13, 2010

  • Has Amazon EC2 become over subscribed? by Alan Williamson. Systemic problems hit AWS as users experience problems across Amazon's infrastructure. It seems the strange attractor of a cloud may be the same as for a shared hosting service.
  • Understanding Infrastructure 2.0 by James Urquhart. We need to take a systems view of our entire infrastructure, and build our automation around the end-to-end architecture of that system.
  • Hey You, Get Off of My Cloud: Exploring Information Leakage in Third-Party Compute Clouds. We show that it is possible to map the internal cloud infrastructure.
  • Hadoop World: Building Data Intensive Apps with Hadoop and EC2  by Pete Skomoroch. Dives into detail about how he built TrendingTopics.org using Hadoop and EC2.
  • A Crash Course in Modern Hardware by Cliff Click. Yes, your mind will hurt after watching this. And no, you probably don't know what your microprocessor is doing anymore.
  • Click to read more ...

    Monday
    Dec212009

    Hot Holiday Scalability Links for 2009

    Wednesday
    Nov112009

    Hot Scalability Links for Nov 11 2009  

    Friday
    Oct302009

    Hot Scalabilty Links for October 30 2009

    Thursday
    Oct152009

    Hot Scalability Links for Oct 15 2009 

    Update: Social networks in the database: using a graph database. Anders Nawroth puts graphs through their paces by representing, traversing, and performing other common social network operations using a graph database.

    Update: Deployment with Capistrano by Charles Max Wood. Simple step-by-step for using Capistrano for deployment.

    Log-structured file systems: There's one in every SSD by Valerie Aurora. SSDs have totally changed the performance characteristics of storage! Disks are dead! Long live flash!

    An Engineer's Guide to Bandwidth by DGentry. It's a rough world out there, and we need to to a better job of thinking about and testing under realistic network conditions.

    Analyzing air traffic performance with InfoBright and MonetDB by Vadim of the MySQL Performance Blog.

    Scalable Delivery of Stream Query Result by Zhou, Y ; Salehi, A ; Aberer, K. In this paper, we leverage Distributed Publish/Subscribe System (DPSS), a scalable data dissemination infrastructure, for efficient stream query result delivery.

    Thursday
    Sep172009

    Hot Links for 2009-9-17 

  • Save 25% on Hadoop Conference Tickets
    Apache Hadoop is a hot technology getting traction all over the enterprise and in the Web 2.0 world. Now, there's going to be a conference dedicated to learning more about Hadoop. It'll be Friday, October 2 at the Roosevelt Hotel in New York City.

    Hadoop World, as it's being called, will be the first Hadoop event on the east coast. Morning sessions feature talks by Amazon, Cloudera, Facebook, IBM, and Yahoo! Then it breaks out into three tracks: applications, development / administration, and extensions / ecosystems. In addition to the conference itself, there will also be 3 days of training prior to the event for those looking to go deeper. In addition to general sessions speakers, presenters include Hadoop project creator Doug Cutting, as well as experts on large-scale data from Intel, Rackspace, Softplayer, eHarmony, Supermicro, Impetus, Booz Allen Hamilton, Vertica, About.com, and other companies.

    Readers get a 25% discount if you register by Sept. 21: http://hadoop-world-nyc.eventbrite.com/?discount=hadoopworld_promotion_highscalability.

  • Essential storage tradeoff: Simple Reads vs. Simple Writes by Stephan Schmidt. Data in denormalized chunks is easy to read and complex to write.
  • Kickfire's approach to parallelism by DANIEL ABADI. Kickfire uses column-oriented storage and execution to address I/O bottlenecks and FPGA-based data-flow architecture to address processing and memory bottlenecks.
  • "Just in Time" Decompression in Analytic Databases by Michael Stonebraker. A DBMS that is optimized for compression through and through--especially with a query executor that features just in time decompression will not just reduce IO and storage overhead, but also offer better query performance with lower CPU resource utilization.
  • Reverse Proxy Performance – Varnish vs. Squid (Part 2) by Bryan Migliorisi. My results show that in raw cache hit performance, Varnish puts Squid to shame.
  • Building Scalable Databases: Denormalization, the NoSQL Movement and Digg by Dare Obasanjo. As a Web developer it's always a good idea to know what the current practices are in the industry even if they seem a bit too crazy to adopt…yet.
  • How To Make Life Suck Less (While Making Scalable Systems) by Bradford Stephens. Scalable doesn’t imply cheap or easy. Just cheaper and easier.
  • Some perspective to this DIY storage server mentioned at Storagemojo by by Joerg Moellenkamp. It's about making decision. Application and hardware has to be seen as one. When your application is capable to overcome the limitations and problems of such ultra-cheap storage
  • Friday
    Sep042009

    Hot Links for 2009-9-4 

  • A tour through hybrid column/row-oriented DBMS schemes by DANIEL ABADI. Approaches: PAX, Fractured Mirrors, and Fine-grained hybrids.
  • The Future of Database Clustering by ROBERT HODGES. Simple management and monitoring, Fast, flexible replication, Top-to-bottom data protection, Partition management, Cloud and virtualized operation, Transparent application access, Open source.
  • Some perspective to this DIY storage server mentioned at Storagemojo by Joerg Moellenkamp. Quality costs. Period.
  • Turn up the volume: API Scalability with Caching by Scott.
  • Disk I/O Bottlenecks by Ryan Thiessen. My first approach to diagnosing a performance problem is to start by trying to find the system’s bottleneck.
  • Patterns for Cloud Computing by Simon Guest. Using the Cloud for Scale, Using the Cloud for Multi-Tenancy, Using the Cloud for Compute, Using the Cloud for Storage, Using the Cloud for Communications
  • Server Processor Roadmaps Show Change in Direction By Michael J. Miller. What fascinates me is the big change in direction we're seeing on server chips...The focus seemed to be on putting more cores on a chip, something we're still seeing with these new 8-, 12-, and 16-core chips. But now a lot of focus seems to be going into increasing memory bandwidth and new cache architectures, as designers are addressing the memory issues that are often the bottleneck in a multicore system, as well as core-to-core communications.
  • Confronting the Data Center Crisis: A Cost - Benefit Analysis of the IBM
    Computing on Demand (CoD) Cloud Offering

  • Azul's Experiences With Hardware / Software Co-Design by Dr. Cliff Click. Owning whole stack allows progress, Some really hard HW problems “solved” in SW, GC is “solved” w/HW Read Barrier, Simple HTM can do Lock Elision, Huge count of simple cores really useful in production.
  • Java Memory Problems - Memory problems in Java applications are manifold und easily lead to performance and scalability problems. Especially in J EE applications with a high number of parallel users memory management must be a central part of the application architecture.
  • Noob question: how do you [Reddit] join on so much data?
  • Transactional Memory versus Locks -
    A Comparative Case Study
    by Victor Pankratius. TM alone is no silver bullet.
  • Looking at Redis by Peter Zaitsev. With Redis I got about 3 times more updates/sec – close to 100.000 updates/sec with about 1.5 core being used.


    The fantasy sponsor for this post are those little food kiosks outside Home Depot stores. I love their Fire Dogs. Hot and yummy. I bet most home improvement projects in America are inspired by cravings for one of these little beauties.
  • Wednesday
    Aug262009

    Hot Links for 2009-8-26

  • I'm Going To Scale My Foot Up Your Ass - Shut up about scalability, no one is using your app anyway.
  • Multi-Tenant Data Architecture - Microsoft's take on different approaches to multitenancy.
  • Cloud computing rides on spiraling Energy costs - A report by US researchers has shown the increasing cost of power and cooling in the data centre is a driver towards cloud computing.
  • Interview: Apple’s Gigantic New Data Center Hints at Cloud Computing - Companies building centers this big are getting into cloud computing. Running apps in the cloud requires massive infrastructure: Google-size infrastructure.
  • What Does Cloud Computing Actually Cost? An Analysis of the Top Vendors - Amazon is currently the lowest cost cloud computing option overall. At least for production applications that need more than 6.5 hours of CPU/day, otherwise GAE is technically cheaper because it's free until this usage level.
  • no:sql(east) - October 28–30, 2009, Atlanta, GA. Very cute page playing off of SQL syntax.

    New Products and Updates

  • Gear6 Web Cache Virtual Appliance - a feature complete virtual machine (VM) of the Gear6 Web Cache software. It includes all the functionality of the Gear6 Web Cache including simulating Gear6 high density RAM-flash architecture.
  • Seamlessly Extending the Data Center - Introducing Amazon Virtual Private Cloud (VPC) - We have developed Amazon VPC to allow our customers to seamlessly extend their IT infrastructure into the cloud while maintaining the levels of isolation required for their enterprise management tools to do their work.
  • NetApp reveals cloud computing plan, new Data OnTap OS - Our research shows users are very interested in scale-out technology," she said. "What's nice about it is as you add processor and storage resources, you get much higher storage utilization rates and the new scale-out system grows up to 14 petabytes, but it can still be managed in a single array.
  • The Big Cheese: Powerful Version Of Google Search Appliance Can Grow Exponentially.

    Updates to Articles on High Scalability

  • Streamy Explains CAP and HBase's Approach to CAP - We plan to employ inter-cluster replication, with each cluster located in a single DC. Remote replication will introduce some eventual consistency into the system, but each cluster will continue to be strongly consistent. Updated: How Google Serves Data from Multiple Datacenters.

    The fantasy sponsor for this post are those little food kiosks outside Home Depot stores. I love their Fire Dogs. Hot and yummy. I bet most home improvement projects in America are inspired by cravings for one of these little beauties.
  • Page 1 ... 41 42 43 44 45