Wednesday
Dec 09, 2015

Free Red Book: Readings in Database Systems, 5th Edition

For the first time in ten years there has been an update to the classic Red Book, Readings in Database Systems, which offers "readers an opinionated take on both classic and cutting-edge research in the field of data management."

Editors Peter Bailis, Joseph M. Hellerstein, and Michael Stonebraker curated the papers and wrote pithy introductions. Unfortunately, links to the papers are not included, but a kindly wizard, Nindalf, gathered all the referenced papers together and put them in one place.

What's in it?

  • Preface 
  • Background introduced by Michael Stonebraker 
  • Traditional RDBMS Systems introduced by Michael Stonebraker 
  • Techniques Everyone Should Know introduced by Peter Bailis 
  • New DBMS Architectures introduced by Michael Stonebraker
  • Large-Scale Dataflow Engines introduced by Peter Bailis 
  • Weak Isolation and Distribution introduced by Peter Bailis 
  • Query Optimization introduced by Joe Hellerstein 
  • Interactive Analytics introduced by Joe Hellerstein 
  • Languages introduced by Joe Hellerstein 
  • Web Data introduced by Peter Bailis 
  • A Biased Take on a Moving Target: Complex Analytics by Michael Stonebraker 
  • A Biased Take on a Moving Target: Data Integration by Michael Stonebraker

Tuesday
Dec 08, 2015

Sponsored Post: StatusPage.io, Redis Labs, Jut.io, SignalFx, InMemory.Net, VividCortex, MemSQL, Scalyr, AiScaler, AppDynamics, ManageEngine, Site24x7

Who's Hiring?

  • Senior Devops Engineer - StatusPage.io is looking for a senior devops engineer to help us in making the internet more transparent around downtime. Your mission: help us create a fast, scalable infrastructure that can be deployed to quickly and reliably.

  • At Scalyr, we're analyzing multi-gigabyte server logs in a fraction of a second. That requires serious innovation in every part of the technology stack, from frontend to backend. Help us push the envelope on low-latency browser applications, high-speed data processing, and reliable distributed systems. Help extract meaningful data from live servers and present it to users in meaningful ways. At Scalyr, you’ll learn new things, and invent a few of your own. Learn more and apply.

  • UI Engineer - AppDynamics, founded in 2008 and led by proven innovators, is looking for a passionate UI Engineer to design, architect, and develop their user interface using the latest web and mobile technologies. Make the impossible possible and the hard easy. Apply here.

  • Software Engineer - Infrastructure & Big Data - AppDynamics, leader in next generation solutions for managing modern, distributed, and extremely complex applications residing in both the cloud and the data center, is looking for Software Engineers (All-Levels) to design and develop scalable software written in Java and MySQL for the backend component of software that manages application architectures. Apply here.

Fun and Informative Events

  • Your event could be here. How cool is that?

Cool Products and Services

  • Real-time correlation across your logs, metrics and events.  Jut.io just released its operations data hub into beta and we are already streaming in billions of log, metric and event data points each day. Using our streaming analytics platform, you can get real-time monitoring of your application performance, deep troubleshooting, and even product analytics. We allow you to easily aggregate logs and metrics by micro-service, calculate percentiles and moving window averages, forecast anomalies, and create interactive views for your whole organization. Try it for free, at any scale.

  • Turn chaotic logs and metrics into actionable data. Scalyr replaces all your tools for monitoring and analyzing logs and system metrics. Imagine being able to pinpoint and resolve operations issues without juggling multiple tools and tabs. Get visibility into your production systems: log aggregation, server metrics, monitoring, intelligent alerting, dashboards, and more. Trusted by companies like Codecademy and InsideSales. Learn more and get started with an easy 2-minute setup. Or see how Scalyr is different if you're looking for a Splunk alternative or Sumo Logic alternative.

  • SignalFx just launched an advanced monitoring platform for modern applications that's already processing tens of billions of data points per day. SignalFx lets you create custom analytics pipelines on metrics data collected from thousands or more sources to create meaningful aggregations--such as percentiles, moving averages and growth rates--within seconds of receiving data. Start a free 30-day trial!

  • InMemory.Net provides a .Net-native in-memory database for analysing large amounts of data. It runs natively on .Net, and provides native .Net, COM & ODBC APIs for integration. It also has an easy-to-use language for importing data, and supports standard SQL for querying data. http://InMemory.Net

  • VividCortex goes beyond monitoring and measures the system's work on your servers, providing unparalleled insight and query-level analysis. This unique approach ultimately enables your team to work more effectively, ship more often, and delight more customers.

  • MemSQL provides a distributed in-memory database for high value data. It's designed to handle extreme data ingest and store the data for real-time, streaming and historical analysis using SQL. MemSQL also cost effectively supports both application and ad-hoc queries concurrently across all data. Start a free 30 day trial here: http://www.memsql.com/

  • aiScaler, aiProtect, aiMobile Application Delivery Controller with integrated Dynamic Site Acceleration, Denial of Service Protection and Mobile Content Management. Also available on Amazon Web Services. Free instant trial, 2 hours of FREE deployment support, no sign-up required. http://aiscaler.com

  • ManageEngine Applications Manager: Monitor physical, virtual and Cloud Applications.

  • www.site24x7.com: Monitor End User Experience from a global monitoring network.

If any of these items interest you there's a full description of each sponsor below. Please click to read more...

Click to read more ...

Monday
Dec 07, 2015

The Serverless Start-up - Down with Servers!

This is a guest post by Marcel Panse and Sander Nagtegaal from Teletext.io.

In our early Peecho days, we wrote an article explaining how to build a really scalable architecture for next to nothing, using Amazon Web Services. Auto-scaling, merciless decoupling and even automated bidding on unused server capacity were the tricks we used back then to operate on a shoestring. Now, it is time to take it one step further.

We would like to introduce Teletext.io, also known as the serverless start-up - again, entirely built around AWS, but leveraging only the Amazon API Gateway, Lambda functions, DynamoDB, S3 and CloudFront.

The Virtues of Constraint

We like rules. At our previous start-up Peecho, product owners had to do fifty push-ups as payment for each user story that they wanted to add to an ongoing sprint. Now, at our current company myTomorrows, our developer dance-offs are legendary: during the daily stand-ups, you are only allowed to speak while dancing - leading to the most efficient meetings ever.

This way of thinking goes all the way into our product development. It may seem counter-intuitive at first, but constraints fuel creativity. For example, all our logo design is done with the technical diagramming tool OmniGraffle, so there is no way we could use hideous lens flares and such. Anyway - recently, we launched yet another initiative called Teletext.io. So, we needed a new restriction.

At Teletext.io, we are not allowed to use servers. Not even one.

It was a good choice. We will explain why.

Why Servers are Bad

Click to read more ...

Friday
Dec 04, 2015

Stuff The Internet Says On Scalability For December 4th, 2015

Hey, it's HighScalability time:


Change: Elliott, $800,000 in 1960, 8K RAM, 2kHz CPU vs Raspberry Pi Zero, $5, 1GHz, 512MB

 

If you like Stuff The Internet Says On Scalability then please consider supporting me on Patreon.

  • 434,000: square feet in Facebook's new office; $62.5 billion: Uber's valuation; 11: DigitalOcean datacenters; $4.45 billion: Black Friday online sales; 2MPH: speed news traveled in 1500; 95: percent of world covered by mobile broadband; 86%: items Amazon delivers that weigh less than five pounds.

  • Quotable Quotes:
    • Jeremy Hsu: Is anybody thinking about how we’ll have to code differently to accommodate the jump from a 1-exaflop supercomputer to 10 exaflops? There is not enough attention being paid to this issue.
    • @kml: “Process drives away talent” - @adrianco at #yow15
    • capkutay: Seems like a lot of the momentum behind containers is driven by the Silicon Valley investment community.
    • @taotetek: IoT is turning homes into datacenters with no system administrators and no security team.
    • @asymco: On Thursday and early Friday, mobile traffic accounted for nearly 60% of all online shopping traffic, and 40% of all online sales
    • Mobile App Developers are Suffering: It’s just too saturated. The barriers to adoption and therefore monetization are too high. It’s easier on the web.
    • Taleb: It is foolish to separate risk taking from the risk management of ruin.
    • Maxime Chevalier-Boisvert:  I believe dynamic languages are here to stay. They can be very nimble, in ways that statically typed languages might never be able to match. We’re at a point in time where static typing dominates mainstream thought in the programming world, but that doesn’t mean dynamic languages are dead.
    • @__edorian: "Can i have a static linked binary?" - "No that's stupid, it's slower and takes more space!" - "Can i have a docker image?" - "Sure!"
    • @grzegorz_dyk: When I see people talking about fine grained #microservices I am thinking: why not use actors? #akka #erlang
    • Henry Miller: When you can’t create you can work.
    • @ValaAfshar: For the first time ever, online media consumption is bigger than TV consumption. 
    • @matthewfellows: I learned today that Airbus code is reviewed by hand... in raw assembly code #yow15 @dius_au
    • Rich Hickey: Programmers know the benefits of everything and the tradeoffs of nothing
    • Robin Harris: Cheap storage is changing the world. Whether it is in the cloud, on a dash cam, or embedded in an app, cheap – as in inexpensive – storage is enabling new relationships between individuals, and with culture, power, and groups.
    • @sustrik: libmill shows 1400x performance improvement in c10k scenarios. Wow! I love low-hanging fruit.
    • @jmckenty: At Scale: Bigger than what you’ve got now.
    • John Cage: My notion of how to proceed in a society to bring change is not to protest the thing that is evil, but rather to let it die its own death.
    • @b6n: preemptively blog about how you scaled to support the million users you don't have yet.
    • @joeweinman: When will the FCC start addressing app neutrality?
    • @ufried: i have this post about data scalability always open in a tab, just to remind me of some essentials once in a while 

  • Personalization is getting more personal and more useful. Personalized Nutrition: Healthy foods are unique to individuals: Israeli research teams have demonstrated that there exists a high degree of variability in the responses of different individuals to identical meals...Using their set of amassed data, the researchers then went a step further, applying machine-learning algorithm to their cohort of 800 participants and developing an algorithm capable of predicting individualized PPGRs (postprandial (post-meal) glycemic responses). This intricate algorithm incorporates 137 features representing meal content, daily activity, blood parameters, CGM-derived features, questionnaires, and microbiome features.

  • Now that's putting concertina wire on the walled garden fence. WhatsApp is blocking links to a competing messenger app.

  • As programming is a creative act, perhaps the ultimate creative act, this advice applies to programmers too. Ira Glass: Nobody tells this to people who are beginners, I wish someone told me. All of us who do creative work, we get into it because we have good taste. But there is this gap. For the first couple years you make stuff, it’s just not that good. It’s trying to be good, it has potential, but it’s not. But your taste, the thing that got you into the game, is still killer. And your taste is why your work disappoints you. A lot of people never get past this phase, they quit. Most people I know who do interesting, creative work went through years of this. We know our work doesn’t have this special thing that we want it to have. We all go through this. And if you are just starting out or you are still in this phase, you gotta know its normal and the most important thing you can do is do a lot of work. Put yourself on a deadline so that every week you will finish one story. It is only by going through a volume of work that you will close that gap, and your work will be as good as your ambitions. And I took longer to figure out how to do this than anyone I’ve ever met. It’s gonna take awhile. It’s normal to take awhile. You’ve just gotta fight your way through.

  • So you want a revolution, what will be the cost? It’s a Trap: Emperor Palpatine’s Poison Pill: In this case study we found that the Rebel Alliance would need to prepare a bailout of at least 15%, and likely at least 20%, of GGP in order to mitigate the systemic risks and the sudden and catastrophic economic collapse. Without such funds at the ready, it likely the Galactic economy would enter an economic depression of astronomical proportions.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Click to read more ...

Tuesday
Dec 01, 2015

Deep Lessons from Google and eBay on Building Ecosystems of Microservices

When you look at large scale systems from Google, Twitter, eBay, and Amazon, their architecture has evolved into something similar: a set of polyglot microservices.

What does it look like when you are in the polyglot microservices end state? Randy Shoup, who worked in high level positions at both Google and eBay, has a very interesting talk exploring just that idea: Service Architectures at Scale: Lessons from Google and eBay.

What I really like about Randy's talk is how he is self-consciously trying to immerse you in the experience of something you probably have no experience of: creating, using, perpetuating, and protecting a large scale architecture.

In the Ecosystem of Services section of the talk Randy asks: What does it look like to have a large scale ecosystem of polyglot microservices? In the Operating Services at Scale section he asks: As a service provider what does it feel like to operate such a service? In the Building a Service section he asks: When you are a service owner what does it look like? And in the Service Anti-Patterns section he asks: What can go wrong?

A very powerful approach.

The highlight of the talk for me was the idea of aligning incentives, a consistent theme that crosscuts the entire endeavour. While never explicitly pulled out as a separate strategy, it's the motivation behind why you want small teams to develop small clean services, why a chargeback model for internal services is so powerful, how architecture can evolve without an architect, how clean design can evolve from a bottom-up process, and how standards can evolve without a central committee.

My takeaway is that deliberately aligning incentives is how you scale both a large, dynamic organization and a large, dynamic code base. Putting the right incentives in place nudges things into happening without explicit control, in much the same way that more work gets done in a distributed system when you remove locks, don't share state, communicate with messages, and parallelize everything.

Let's see how large scale systems are built in the modern era...

Polyglot Microservices are the End Game

Click to read more ...

Friday
Nov 27, 2015

Stuff The Internet Says On Scalability For November 27th, 2015

Hey, it's HighScalability time:


The most detailed picture of the Internet ever, as compiled by an illegal 420,000-node botnet.

  • $40 billion: P2P lending in China; 20%: amount of all US margin expansion accounted for by Apple since 2010; 11: years of Saturn photos; 117: number of different steering wheels offered for a VW Golf; 1Gbps: speed of a network using a lightbulb.

  • Quotable Quotes:
    • @jaksprats: If we could compile a subset of JavaScript to Lua, JS could run on Server(Node,js), Browser, Desktop, iOS, & Android.JS could run EVERYWHERE
    • @wilkieii: Tech: "Don't roll your own crypto if you aren't an expert" *replaces nutrition with Soylent, currency with bitcoin* *puts wifi in lightbulb*
    • @brianpeddle: The architecture of one human brain would require a zettabyte of capacity. Full simulation of a human brain by 2023.
    • MarshalBanana: That can still easily be the right choice. Complex algorithms trade asymptotic performance for setup cost and maintenance cost. Sometimes the tradeoff isn't worth it.
    • kevindeasi: There are so many things to know nowadays. Backend: Sql, NoSql, NewSql, etc. Middlware: Django, NodeJs, Spring, Groovy, RoR, Symfony, etc. Client: Angular, Ember, React, Jquery, etc. I haven't even mentioned hardware, security, servers/cloud, and api. Now you also need to know about theory, UI/UX, git, deploying servers, HTTP, scrum, software development process, testing.
    • Brian Chesky~ It was better to have 100 people who loved us vs. 1M people who liked us. All movements grow this way.
    • idlewords: All the advantages of a dedicated server without the hassle of saving tons of money.
    • jorangreef: Well, how would you handle massive traffic spikes? Through a combination of vertical and horizontal scaling? Through having excess capacity? Except that I would probably want to start with something fast and inexpensive to begin with.
    • @jaykreps: "The bigger the interface, the weaker the abstraction"--@rob_pike
    • Animats: That still irks me. The real problem is not tinygram prevention. It's ACK delays, and that stupid fixed timer. They both went into TCP around the same time, but independently. I did tinygram prevention (the Nagle algorithm) and Berkeley did delayed ACKs, both in the early 1980s. The combination of the two is awful.
    • @jaykreps: Distributed computing is the new normal: Mesos, K8s = dist'd processes; Cassandra, Kafka, etc = dist'd data; microservices = dist'd apps.
    • @bradfitz: OH: "Well you can add nodes to the cluster. They made that work well, but you can't remove them. It's the Hotel California of auto-scaling."

  • Creating Your Own EC2 Spot Market -- Part 2. Video encoding represents 70% of Netflix's computing needs. And Netflix has a daily peak of 12,000 unused instances. So they created their own spot market to improve encoding throughput by the equivalent of a 210% increase in encoding capacity. Using their update real-time approach they were able to perform an encoding job in 18 hours that they expected to take a few days. Great article with a lot of deep thinking on the topic.

  • Amen! We should come up with a catchy name for RAII so more languages support it because RAII is awesome and simplifies code!

  • Google as a cloud company instead of an ad company? It could happen: Google's Holzle Envisions Cloud Business Eclipsing Ads in 2020. Google announced Custom Machine Types so you can configure the number of virtual CPUs and the amount of RAM you want for your machine. I imagine this nifty feature is enabled by Google's advanced datacenter scheduling software, but it will take more than that to beat AWS and Azure. To take market share Google may need to instigate a price war. Though it looks like Google might make a lot of money charging back to Google.

  • Good explanation of what serverless computing is by Leonardo Federico: the phrase “serverless” doesn’t mean servers are no longer involved. It simply means that developers no longer have to think "that much" about them. Computing resources get used as services without having to manage around physical capacities or limits. Let's take for example AWS Lambda. "Lambda allows you to NOT think about servers. Which means you no longer have to deal with over/under capacity, deployments, scaling and fault tolerance, OS or language updates, metrics, and logging."
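To make the Lambda model above a little more concrete, here is a minimal sketch of what such a function might look like in Python. The function name `handler`, the `name` query parameter, and the response shape (the form commonly used behind an API Gateway proxy integration) are assumptions chosen for illustration; none of them come from the quoted article.

```python
import json


def handler(event, context):
    """Hypothetical Lambda entry point: no server setup, just a function.

    `event` is assumed to be an API Gateway proxy-style dict; `context`
    is the runtime context object, unused in this sketch.
    """
    # queryStringParameters may be missing or None when no query string is sent
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")

    # Return a dict the gateway can turn into an HTTP response
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}"}),
    }
```

The point of the sketch is what is absent: there is no listener, process manager, or capacity setting in the code, which is the "not thinking about servers" part of the quote.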

Tuesday
Nov 24, 2015

Sponsored Post: StatusPage.io, iStreamPlanet, Redis Labs, Jut.io, SignalFx, InMemory.Net, VividCortex, MemSQL, Scalyr, AiScaler, AppDynamics, ManageEngine, Site24x7

Who's Hiring?

  • As a Networking & Systems Software Engineer at iStreamPlanet you’ll be driving the design and implementation of a high-throughput video distribution system. Our cloud-based approach to video streaming requires terabytes of high-definition video routed throughout the world. You will work in a highly-collaborative, agile environment that thrives on success and eats big challenges for lunch. Please apply here.

  • As a Scalable Storage Software Engineer at iStreamPlanet you’ll be driving the design and implementation of numerous storage systems including software services, analytics and video archival. Our cloud-based approach to world-wide video streaming requires performant, scalable, and reliable storage and processing of data. You will work on small, collaborative teams to solve big problems, where you can see the impact of your work on the business. Please apply here.

Monday
Nov 23, 2015

How Wistia Handles Millions of Requests Per Hour and Processes Rich Video Analytics

This is a guest repost from Christophe Limpalair of his interview with Max Schnur, Web Developer at Wistia.

Wistia is video hosting for business. They offer video analytics like heatmaps, and they give you the ability to add calls to action, for example. I was really interested in learning how all the different components work and how they’re able to stream so much video content, so that’s what this episode focuses on.

What does Wistia’s stack look like?

As you will see, Wistia is made up of different parts. Here are some of the technologies powering these different parts:

What scale are you running at?

Click to read more ...

Friday
Nov 20, 2015

Stuff The Internet Says On Scalability For November 20th, 2015

Hey, it's HighScalability time:


100 years ago people saw this as our future. We will be so laughably wrong about the future.

  • $24 billion: amount telcos make selling data about you; $500,000: cost of iOS zero day exploit; 50%: a year's growth of internet users in India; 72: number of cores in Intel's new chip; 30,000: Docker containers started on 1,000 nodes; 1962: when the first Cathode Ray Tube entered interplanetary space; 2x: cognitive improvement with better indoor air quality; 1 million: Kubernetes requests per second.

  • Quotable Quotes:
    • Zuckerberg: One of our goals for the next five to 10 years is to basically get better than human level at all of the primary human senses: vision, hearing, language, general cognition. 
    • Sawyer Hollenshead: I decided to do what any sane programmer would do: Devise an overly complex solution on AWS for a seemingly simple problem.
    • Marvin Minsky: Big companies and bad ideas don't mix very well.
    • @mathiasverraes: Events != hooks. Hooks allow you to reach into a procedure, change its state. Events communicate state change. Hooks couple, events decouple
    • @neil_conway: Lamport, trolling distributed systems engineers since 1998. 
    • @timoreilly: “Silicon Valley is the QA department for the rest of the world. It’s where you test out new business models.” @jamescham #NextEconomy
    • Henry Miller: It is my belief that the immature artist seldom thrives in idyllic surroundings. What he seems to need, though I am the last to advocate it, is more first-hand experience of life—more bitter experience, in other words. In short, more struggle, more privation, more anguish, more disillusionment.
    • @mollysf: "We save north of 30% when we move apps to cloud. Not in infrastructure; in operating model." @cdrum #structureconf
    • Alex Rampell: This is the flaw with looking at Square and Stripe and calling them commodity players. They have the distribution. They have the engineering talent. They can build their own TiVo. It doesn’t mean they will, but their success hinges on their own product and engineering prowess, not on an improbable deal with an oligopoly or utility.
    • @csoghoian: The Michigan Supreme Court, 1922: Cars are tools for robbery, rape, murder, enabling silent approach + swift escape.
    • @tomk_: Developers are kingmakers, driving technology adoption. They choose MongoDB for cost, agility, dev productivity. @dittycheria #structureconf
    • Andrea “Andy” Cunningham: You have to always foster an environment where people can stand up against the orthodoxy, otherwise you will never create anything new.
    • @joeweinman: Jay Parikh at #structureconf on moving Instagram to Facebook: only needed 1 FB server for every 3 AWS servers
    • amirmc: The other unikernel projects (i.e. MirageOS and HaLVM), take a clean-slate approach which means application code also has to be in the same language (OCaml and Haskell, respectively). However, there's also ongoing work to make pieces of the different implementations play nicely together too (but it's early days).

  • After a tragedy you can always expect the immediate fear-inspired reframing of agendas. Snowden responsible for Paris...really?

  • High finance in low places. The Hidden Wealth of Nations: In 2003, less than a year before its initial public offering in August 2004, Google US transferred its search and advertisement technologies to “Google Holdings,” a subsidiary incorporated in Ireland, but which for Irish tax purposes is a resident of Bermuda.

  • The entertaining True Tales of Engineering Scaling. Started with Rails and Postgres. Traffic jumped. High memory workers on Heroku broke the bank. Can't afford the time to move to AWS. Lots of connection issues. More traffic. More problems. More solutions. An interesting story with many twists. The lesson: Building and, more importantly, shipping software is about the constant trade off of forward movement and present stability.

  • 5 Tips to Increase Node.js Application Performance: Implement a Reverse Proxy Server; Cache Static Files; Implement a Node.js Load Balancer; Proxy WebSocket Connections; Implement SSL/TLS and HTTP/2.

  • Docker adoption is not that easy: Uber took months to get up and running with Docker. How Docker Turbocharged Uber’s Deployments: Everything just changes a bit, we need to think about stuff differently...You really need to rethink all of the parts of your infrastructure...Uber recognizes that Docker removed team dependencies, offering more freedom because members were no longer tied to specific frameworks or specific versions. Framework and service owners are now able to experiment with new technologies and to manage their own environments.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Click to read more ...

Wednesday
Nov182015

Free Book: Practical Scalability Analysis with the Universal Scalability Law

If you are very comfortable with math and modeling, Dr. Neil Gunther's Universal Scalability Law is a powerful way of predicting system performance and whittling down those bottlenecks. If not, the USL can be hard to wrap your head around.

There's a free eBook for that. Performance and scalability expert Baron Schwartz, founder of VividCortex, has written a wonderful exploration of scalability truths using the USL as a lens: Practical Scalability Analysis with the Universal Scalability Law.

As a sample of what you'll learn, here are some of the key takeaways from the book:

  • Scalability is a formal concept that is best defined as a mathematical function.
  • Linear scalability means equal return on investment. Double down on workers and you’ll get twice as much work done; add twice as many nodes and you’ll increase the maximum capacity twofold. Linear scalability is oft claimed but seldom delivered.
  • Systems scale sublinearly because of contention, which adds queueing delay, and crosstalk, which inflates service times. The penalty for contention grows linearly and the crosstalk penalty grows quadratically. (An alternative to the crosstalk theory is that longer queues are more costly to manage.)
  • Contention causes throughput to asymptotically approach the reciprocal of the serialized fraction of the workload. If your workload is 5% serialized, you’ll never grow the effective speedup by more than 20-fold.
  • Crosstalk causes the system to regress. The harder you try to push systems with crosstalk, the more time they spend fighting amongst themselves.
  • To build scalable systems, avoid contention (serialization) and crosstalk (synchronization). The contention and crosstalk penalties degrade system scalability and performance much faster than you’d think. Even tiny amounts of serialization or pairwise data synchronization cause big losses in efficiency.
  • If you can’t avoid crosstalk, partition (shard) into smaller systems that will lose less efficiency by avoiding the explosion of service times at larger sizes.
  • To model systems with the USL, obtain measurements of throughput at various levels of load or size, and use regression to estimate the parameters to Equation 3.
  • To forecast scalability beyond what’s observable, be pessimistic and treat the USL as a best-case scenario that won’t really happen. Use Equation 4 to forecast the maximum possible throughput, but don’t forecast too far out. Use Equation 6 to forecast response time.
  • Use your judgment to predict limitations that the USL can’t see, such as saturation of network bandwidth or changes in the system’s model when all of the CPUs become busy.
  • Use the USL to explain why systems aren’t scaling well. Too much queueing? Too much crosstalk? Treat the USL as a pessimistic model and demand that your systems scale at least as well as it does.
  • If you see superlinear scaling, check your measurements and how you’ve set up the system under test. In most cases σ should be positive, not negative. Make sure you’re not varying the system’s dimensions relative to each other and creating apparent superlinear efficiencies that don’t really exist.
  • It’s fun to fantasize about models that might match observed system behavior more closely than the USL, but the USL arises analytically from how we know queueing systems work. Invented models might not have any basis in reality. Besides, the USL usually models systems extremely well up to the point of inflection, and modeling what happens beyond that isn’t as interesting as knowing why it happens.
  • Never trust a scatterplot with an arbitrary curve fit through it unless you know why that’s the right curve. Don’t confuse the USL, hockey stick charts from queueing theory, or other charts that just happen to have similar shapes. Know what shape various plots should exhibit, and suspect bad measurements or other mistakes if you don’t see them.
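
To make the takeaways above a little more concrete, here is a minimal Python sketch of the USL in its commonly cited form, C(N) = N / (1 + σ(N − 1) + κN(N − 1)), where σ is the contention coefficient and κ is the crosstalk coefficient. The function names, the κ symbol, and the synthetic numbers are assumptions made for illustration; they are not taken from the book, and no attempt is made to match the book's equation numbering. The fit uses a simple linearized least-squares approach, which is one possible way to do the regression and not necessarily the one the book recommends.

```python
import math


def usl_capacity(n, sigma, kappa):
    """Relative capacity C(N) under the USL.

    sigma: contention (serialization) coefficient.
    kappa: crosstalk coefficient (symbol assumed for this sketch).
    """
    return n / (1.0 + sigma * (n - 1) + kappa * n * (n - 1))


def usl_peak_size(sigma, kappa):
    """Size N at which C(N) peaks; unbounded when there is no crosstalk."""
    if kappa <= 0:
        return math.inf
    return math.sqrt((1.0 - sigma) / kappa)


def fit_usl(sizes, throughputs):
    """Estimate (sigma, kappa) by least squares on the linearized USL.

    Assumes the first sample was measured at N = 1 so the rest can be
    normalized against it. Linearized form:
        N / C(N) - 1 = sigma * (N - 1) + kappa * N * (N - 1)
    """
    if sizes[0] != 1:
        raise ValueError("first measurement must be at N = 1")
    base = throughputs[0]
    saa = sab = sbb = say = sby = 0.0
    for n, x in zip(sizes, throughputs):
        a = n - 1
        b = n * (n - 1)
        y = n / (x / base) - 1.0
        saa += a * a
        sab += a * b
        sbb += b * b
        say += a * y
        sby += b * y
    # Solve the 2x2 normal equations with Cramer's rule.
    det = saa * sbb - sab * sab
    if det == 0:
        raise ValueError("need at least two distinct sizes above N = 1")
    sigma = (say * sbb - sab * sby) / det
    kappa = (saa * sby - sab * say) / det
    return sigma, kappa


if __name__ == "__main__":
    # Synthetic measurements generated from the model itself, not real data.
    sizes = [1, 2, 4, 8, 16, 32]
    measured = [1000.0 * usl_capacity(n, 0.05, 0.001) for n in sizes]
    sigma, kappa = fit_usl(sizes, measured)
    print(sigma, kappa, usl_peak_size(sigma, kappa))
```

With κ set to zero the curve flattens toward 1/σ, which is the 20-fold ceiling for a 5% serialized workload mentioned above. With any positive κ the curve peaks and then declines, which is the regression the crosstalk bullet describes. Real measurements are noisy, so a fit like this is best treated as a rough estimate.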

Note, the link to the eBook requires entering some data, but it's free, well written, and useful, so it's probably worth it.

Related Articles