Entries by HighScalability Team (1576)

Friday
Sep 30, 2011

Stuff The Internet Says On Scalability For September 30, 2011

You deserve a HighScalability today:

  • Tumblr > Wikipedia
  • Potent quotables:
    • @tokutek : Yelp generates close to 400 GB of compressed logs per day according to @petersirota of Amazon #Strataconf #BigData. More at From Under the Desk to the Cloud
    • @LHK_ITRG : Massive scalability: 80,000 users on a single AppSense server. I think that should do...
    • @solarce : OH: "Automation is a great way to distribute failure across the system" #surgecon
    • palominodb : #surgeconf - DataDog presenting on their "Data Mullet" All SQL in front, NoSQL party in the back. Classic.
    • Ryan Dahl : I hate almost all software
  • Software Design Glossary. Apparently Kent Beck didn't get the memo: only algorithms matter now; software engineering is dead. In case you don't feel that way, Kent wrote a short glossary of important software design concepts. Also, Screaming Architecture by Bob Martin.
If you deserve even more Stuff the Internet has to Say on Scalability, please click below...

Click to read more ...

Friday
Sep 30, 2011

Gone Fishin'

Well, not exactly Fishin'. I'll be on vacation starting today and I'll be back in mid-October. I won't be posting, so we'll all have a break. Disappointing, I know. If you've ever wanted to write an article for HighScalability, this would be a great time :-) I especially need help on writing Stuff the Internet Says on Scalability as I won't even be reading the Interwebs. Shock! Horror! So if the spirit moves you, please write something. My connectivity in South Africa is unknown, but I will check in and approve articles when I can. See you on down the road...

Wednesday
Sep 28, 2011

Pursue robust indefinite scalability with the Movable Feast Machine

And now for something completely different, brought to you by David Ackley and Daniel Cannon in their playfully thought-provoking paper: Pursue robust indefinite scalability, wherein they try to take a fresh look at computation, starting from scratch.

What is this strange thing called indefinite scalability? They sound like words that don't really go together:

Indefinite scalability is the property that the design can support open-ended computational growth without substantial re-engineering, in as strict a sense as can be managed. By comparison, many computer, algorithm, and network designs -- even those that address scalability -- are only finitely scalable because their scalability occurs within some finite space. For example, an in-core sorting algorithm for a 32 bit machine can only scale to billions of numbers before address space is exhausted and then that algorithm must be re-engineered.

Our idea is to expose indefinitely scalable computational power to programmers using reinvented and restricted—but still recognizable—concepts of sequential code, integer arithmetic, pointers, and user-defined classes and objects. Within the space of indefinitely scalable designs, consequently, we prioritize programmability and software engineering concerns well ahead of either theoretical parsimony or maximally efficient or flexible tile hardware design.
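To put a number on the 32-bit sorting example in the quote, here is a back-of-the-envelope sketch. The 4-byte element size and the idea that the whole address space is available for data are assumptions for illustration, not figures from the paper; a real process would get less.

```python
# Rough upper bound on how many numbers an in-core sort can hold
# on a machine with a flat, byte-addressed address space.
# Assumptions (not from the paper): 4-byte integers, and the entire
# 2**32-byte address space usable for data.

ADDRESS_BITS = 32
BYTES_PER_INT = 4  # assumed element size


def max_in_core_elements(address_bits: int, bytes_per_element: int) -> int:
    """Upper bound on elements that fit in the addressable space."""
    return (2 ** address_bits) // bytes_per_element


limit = max_in_core_elements(ADDRESS_BITS, BYTES_PER_INT)
print(f"{limit:,} integers")  # 1,073,741,824 integers
```

Past that ceiling no tuning helps: the algorithm has to be reworked (for example into an external or distributed sort), which is the kind of re-engineering the authors mean by "finitely scalable".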

It's easy to read indefinite as "infinite" here. So what would such a machine look like? Well, they've built the Movable Feast Machine as an implementation of their ideas:

Click to read more ...

Tuesday
Sep 27, 2011

Use Instance Caches to Save Money: Latency == $$$

In the post "Using memcache might be free, but it's costing you money with instance pricing! Use instance caches if possible", made on the Google App Engine group, Santiago Lema brings up an oldie-but-goodie idea that was once used to improve performance but is now used to save money:

  • Santiago's GAE application went from about $9 to about $177 per month. 
  • Memcache is slow enough that under higher loads extra instances are created by the scheduler to handle the load.
  • For static or semi-static data, a way around the cost of the extra instances is to keep a cache in the instance, so requests can be served out of local memory rather than going to memcache or the database. A simple hashtable makes a good in-memory cache.
  • This solution made his app affordable again by reducing the number of instances back to 1 (sometimes 2).
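The instance-cache idea in the list above can be sketched as a plain dictionary with a per-entry time-to-live. This is a minimal illustration, not Santiago's code: the `InstanceCache` class, the `slow_lookup` function, and the 60-second TTL are all hypothetical, and `slow_lookup` merely stands in for a memcache or datastore round trip.

```python
import time
from typing import Any, Callable, Dict, Tuple


class InstanceCache:
    """Per-process cache: a plain dict with an expiry time per entry."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str, load: Callable[[str], Any]) -> Any:
        """Return the cached value; call load(key) only on a miss or expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # fast path: served from local memory
        value = load(key)    # slow path: the remote cache or database
        self._entries[key] = (now + self._ttl, value)
        return value


# Hypothetical slow lookup standing in for a memcache/datastore call.
calls = []


def slow_lookup(key: str) -> str:
    calls.append(key)
    return f"value-for-{key}"


cache = InstanceCache(ttl_seconds=60.0)
first = cache.get("settings", slow_lookup)
second = cache.get("settings", slow_lookup)  # no second slow_lookup call
```

Each instance keeps its own copy, so a value can be stale for up to the TTL and instances may briefly disagree. That is why the approach suits static or semi-static data, as the post says, rather than data that must be consistent across instances.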

Where have we seen this before?

Click to read more ...

Tuesday
Sep 27, 2011

Sponsored Post: Grid Dynamics, aiCache, Rocketfuel, FreeAgent, Percona Live!, Box, New Relic, Surge, Tungsten, AppDynamics, Couchbase, CloudSigma, ManageEngine, Site24x7

Who's Hiring?

  • Grid Dynamics is hiring engineers to use the latest advances in grid and cloud computing to build scalable services. Please send your cover letter and résumé to jobs@griddynamics.com
  • Rocketfuel is hiring engineers to build ad-serving, bidding, modeling and data infrastructure built using a mix of proprietary and open-source technologies. Please apply here.
  • FreeAgent - Senior Platform Engineer. FreeAgent is one of the UK's largest and most successful online accounting web apps, and we're growing at an explosive rate.  
  • Everything is sexier in the cloud. Box is hiring operations engineers and infrastructure automation engineers to help us revolutionize the way businesses collaborate. Please apply here.

Fun and Informative Events

  • Curious about Couchbase Server 2.0? Register for a series of weekly 30-minute webinars. Couchbase has announced the CouchConf World Tour! Check it out at http://www.couchbase.com/couchconf-world-tour
  • Come one come all! Introducing Percona Live London! Join us for this two day intensive MySQL conference Oct 24th-25th. Save £40 on Percona's early bird and regular rate tickets with discount code: HiSc-PLUK
  • Surge 2011: The Scalability and Performance Conference. Surge is a chance to identify emerging trends and meet the architects behind established technologies. Early Bird Registration.

Cool Products and Services

For a longer description of each sponsor, please read more below...

Click to read more ...

Friday
Sep 23, 2011

Stuff The Internet Says On Scalability For September 23, 2011

I'd walk a mile for HighScalability:

To read more of the amazing, surprising, and illuminating Stuff the Internet says on Scalability, please click below...

Click to read more ...

Tuesday
Sep 20, 2011

HighScalability is old news. Step your scaling game way up... (NSFW cartoon)

Jeremy Raines tweeted a link to this cartoon, "my new filing technique is unstoppable", showing how scotch tape can be used to create a new super-database. Very funny in a Dilbert sort of way, but definitely NSFW...

Click to read more ...

Tuesday
Sep 20, 2011

Sponsored Post: Rocketfuel, FreeAgent, Percona Live!, Strata, Box, BetterWorks, New Relic, NoSQL Now!, Surge, Tungsten, AppDynamics, Couchbase, CloudSigma, ManageEngine, Site24x7

Who's Hiring?

  • Rocketfuel is hiring engineers to build ad-serving, bidding, modeling and data infrastructure built using a mix of proprietary and open-source technologies. Please apply here.
  • FreeAgent - Senior Platform Engineer. FreeAgent is one of the UK's largest and most successful online accounting web apps, and we're growing at an explosive rate.  
  • Everything is sexier in the cloud. Box is hiring operations engineers and infrastructure automation engineers to help us revolutionize the way businesses collaborate. Please apply here.
  • BetterWorks is hiring a PHP Software Engineer in Los Angeles to help make enterprise software be as beautiful and usable as an Apple product. Please apply here.  

Fun and Informative Events

  • Curious about Couchbase Server 2.0? Register for a series of weekly 30-minute webinars. Couchbase has announced the CouchConf World Tour! Check it out at http://www.couchbase.com/couchconf-world-tour
  • Strata New York, Sep 19-23, making data work. The data opportunity is exploding, and it's happening breathtakingly fast. Learn more here.
  • Come one come all! Introducing Percona Live London! Join us for this two day intensive MySQL conference Oct 24th-25th. Save £40 on Percona's early bird and regular rate tickets with discount code: HiSc-PLUK
  • NoSQL Now! is a new conference covering the dynamic field of NoSQL technologies. August 23-25 in San Jose. For more information please visit: http://www.NoSQLNow.com
  • Surge 2011: The Scalability and Performance Conference. Surge is a chance to identify emerging trends and meet the architects behind established technologies. Early Bird Registration.

Cool Products and Services

For a longer description of each sponsor, please read more below...

Click to read more ...

Monday
Sep 19, 2011

Big Iron Returns with BigMemory

This is a guest post by Greg Luck, Founder and CTO, Ehcache Terracotta Inc. Note: this article contains a bit too much of a product pitch, but the points are still generally valid and useful.

The legendary Moore’s Law, which states that the number of transistors that can be placed inexpensively on an integrated circuit doubles approximately every two years, has held true since 1965. It follows that integrated circuits will continue to get smaller, with chip fabrication currently at a minuscule 22nm process (1). Users of big iron hardware, or servers that are dense in terms of CPU power and memory capacity, benefit from this trend as their hardware becomes cheaper and more powerful over time. At some point soon, however, density limits imposed by quantum mechanics will preclude further density increases.

At the same time, low-cost commodity hardware influences enterprise architects to scale their applications horizontally, where processing is spread across clusters of low-cost commodity servers. These clusters present a new set of challenges for architects, such as tricky distributed computing problems, added complexity and increased management costs. However, users of big iron and commodity servers are on a collision course that’s just now becoming apparent.

Until recently, Moore’s Law resulted in faster CPUs, but physical constraints (heat dissipation, for example) and computer requirements now force manufacturers to place multiple cores on a single CPU die. Increases in memory, however, are unconstrained by this type of physical requirement. For instance, today you can purchase standard Von Neumann servers from Oracle, Dell and HP with up to 2TB of physical RAM and 64 cores. Servers with 32 cores and 512GB of RAM are certainly more typical, but it’s clear that today’s commodity servers are now “big iron” in their own right.

This trend raises the question: can you now avoid the complexity of scaling out over large clusters of computers, and instead do all your processing on a single, more powerful node? The answer, in theory, is “Yes”, until you take into account the problem with large amounts of memory in Java.

Click to read more ...

Friday
Sep 16, 2011

Stuff The Internet Says On Scalability For September 16, 2011

Between love and madness lies HighScalability:

  • Google now 10x better: MapReduce sorts 1 petabyte of data using 8000 computers in 33 minutes; 1 Billion on Social Networks; Tumblr at 10 Billion Posts; Twitter at 100 Million Users; Testing at Google Scale: 1800 builds, 120 million test suites, 60 million tests run daily.
  • From the Dash Memo on Google's Plan: Go is a very promising systems-programming language in the vein of C++. We fully hope and expect that Go becomes the standard back-end language at Google over the next few years. On GAE Go can load from a cold start in 100ms and the typical instance size is 4MB. Is it any wonder Go is a go? Should we expect to see Java and Python deprecated because Go is so much cheaper to run at scale?
  • Walmart uses Muppet labor to power their real-time social shopping systems: You can’t do MapReduce computing every time (with every Tweet). You’ll die. How do you do it in real-time? We built MapUpdate, or what we call Muppet. We could map a huge amount of data and handle a huge firehose with little latency across millions of entities… We can monitor 100 million (items) at scale. That could be products, stores, anything. It’s the equivalent of MapReduce for fast data.
  • Like humans, this AI software is always seeking relations. TextRunner produces facts by digesting 500 million web pages and billions of lines of text. Peter Norvig, director of research at Google: "The significance of TextRunner is that it is scalable because it is unsupervised. It can discover and learn millions of relations, not just one at a time. With TextRunner, there is no human in the loop: it just finds relations on its own."
For many more bon mots the internet has to say on scalability, please click below...

Click to read more ...