Entries in Strategy (358)

Wednesday
Jul 29, 2015

A Well Known But Forgotten Trick: Object Pooling

This is a guest repost by Alex Petrov. Find the original article here.

Most problems are quite straightforward to solve: when something is slow, you can either optimize it or parallelize it. When you hit a throughput barrier, you partition the workload across more workers. But when you face problems that involve Garbage Collection pauses, or simply hit the limits of the virtual machine you're working with, they get much harder to fix.

When you're working on top of a VM, you may face things that are simply out of your control, namely time drifts and latency. Fortunately, there are plenty of battle-tested solutions, though they require some understanding of how the JVM works.

If you can serve 10K requests per second while staying within certain performance parameters (memory and CPU), it doesn't automatically mean that you'll be able to scale linearly to 20K. If you're allocating too many objects on the heap, or wasting CPU cycles on something that can be avoided, you'll eventually hit the wall.

The simplest (yet underrated) way of saving on memory allocations is object pooling. Even though the concept sounds similar to just pooling objects and socket descriptors, there's a slight difference.

When we're talking about socket descriptors, we have a limited, rather small number of descriptors (tens, hundreds, or at most thousands) to go through. These resources are pooled because of their high initialization cost (establishing a connection, performing a handshake over the network, memory-mapping a file, or whatever else). In this article we'll talk about pooling larger numbers of short-lived objects that are not so expensive to initialize, to save allocation and deallocation costs and avoid memory fragmentation.
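To make the acquire/release cycle described above concrete, here is a minimal sketch. It is written in Python for brevity rather than for the JVM the article targets, and `Message`, `ObjectPool`, and `max_idle` are hypothetical names chosen for illustration, not taken from the original article. The sketch is not thread-safe.

```python
from collections import deque


class Message:
    """Hypothetical short-lived object: cheap to build, but built very often."""

    __slots__ = ("sender", "payload")

    def __init__(self):
        self.reset()

    def reset(self):
        # Clear state so a recycled instance never leaks data from its last use.
        self.sender = None
        self.payload = None


class ObjectPool:
    """Hands out recycled instances instead of allocating one per request.

    Illustrative only: single-threaded, no validation of released objects.
    """

    def __init__(self, factory, max_idle=1024):
        self._factory = factory
        self._max_idle = max_idle
        self._idle = deque()
        self.created = 0  # number of instances actually allocated

    def acquire(self):
        # Reuse an idle instance when one is available, otherwise allocate.
        if self._idle:
            return self._idle.pop()
        self.created += 1
        return self._factory()

    def release(self, obj):
        obj.reset()
        # Cap the idle list so a traffic burst does not pin memory forever.
        if len(self._idle) < self._max_idle:
            self._idle.append(obj)


pool = ObjectPool(Message)
for i in range(10_000):
    msg = pool.acquire()
    msg.sender, msg.payload = "client", i
    pool.release(msg)
```

In this loop every iteration reuses the same instance, so `pool.created` stays at 1 instead of growing with the number of requests. Whether that translates into fewer GC pauses depends on the runtime and workload, so it is worth measuring before adopting.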

Object Pooling


Wednesday
Jul 22, 2015

Architecting Backend for a Social Product

This article takes you through the key architectural decisions that make a social application a true next-generation social product. The proposed changes address the following attributes: a) availability, b) reliability, c) scalability, d) performance and flexibility towards extensions (not modifications).

Goals

a) Ensuring that the user’s content is easily discoverable and is always available.

b) Ensuring that the content pushed is relevant not only semantically but also from user’s device perspective.

c) Ensuring that the real time updates are generated, pushed and analyzed.

d) Keeping an eye towards saving the user’s resources as much as possible.

e) Irrespective of server load, user’s experience should remain intact.

f) Ensuring overall application security.

In summary, we face an amazing challenge: we must handle a vast sea of ever-expanding user-generated content, an increasing number of users, and a constant stream of new items, all while ensuring excellent performance. Given this challenge, it is imperative that we study certain key architectural elements which will influence the overall system design. Here are a few key decisions and the analysis behind them.

Data Storage


Wednesday
Jul 15, 2015

64 Network DO’s and DON’Ts for Game Engines. Part IIIa: Server-Side 

This article originally appeared on ITHare.com. It's one article from an excellent series of articles: Part I. Client Side; Part IIa. Protocols and APIs; Part IIb. Protocols and APIs; Part IIIb. Server-Side (deployment, optimizations, and testing); Part IV. Great TCP-vs-UDP Debate; Part V. UDP; Part VI. TCP.

In Part III of the article, we’ll discuss issues specific to the server side, as well as certain DO’s and DON’Ts related to system testing. Due to its size, Part III has been split, and in this Part IIIa we’ll concentrate on the issues related to Store-Process-and-Forward architecture.

18. DO consider Event-Driven programming model for Server Side too

As discussed above (see item #1 in Part I), event-driven programming is a must for the client side; in addition, it also comes in handy on the server side. Having multi-threaded logic is still a nightmare for the server-side [NoBugs2010], and keeping logic single-threaded simplifies development a lot. Whether to think that multi-threaded game logic is normal, and single-threaded logic is a big improvement, or to think that single-threaded game logic is normal, and multi-threaded logic is a nightmare – is up to you. What is clear is that if you can keep your game logic single-threaded – you’ll be very happy compared to the multi-threaded alternative.
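One way to picture single-threaded, event-driven server logic is a queue that many I/O threads write to and exactly one logic thread reads from. The sketch below is only an illustration under that assumption: `GameWorld`, `run_logic`, and `network_thread` are hypothetical names, and plain threads stand in for real socket readers. It is not taken from the original series.

```python
import queue
import threading


class GameWorld:
    """Hypothetical game state, touched by exactly one thread."""

    def __init__(self):
        self.scores = {}

    def handle(self, event):
        kind, player, value = event
        if kind == "score":
            self.scores[player] = self.scores.get(player, 0) + value


def run_logic(world, events):
    # The only thread that reads or writes `world`, so no locks are needed.
    while True:
        event = events.get()
        if event is None:  # sentinel: shut down the loop
            break
        world.handle(event)


def network_thread(events, player, n):
    # Stand-in for a socket reader: it only enqueues events, never touches state.
    for _ in range(n):
        events.put(("score", player, 1))


events = queue.Queue()
world = GameWorld()
logic = threading.Thread(target=run_logic, args=(world, events))
logic.start()

readers = [
    threading.Thread(target=network_thread, args=(events, player, 1000))
    for player in ("alice", "bob")
]
for reader in readers:
    reader.start()
for reader in readers:
    reader.join()

events.put(None)  # all events are queued; ask the logic thread to stop
logic.join()
print(world.scores)
```

Because all state changes happen inside `run_logic`, the game logic itself stays free of locks even though events arrive from several threads at once; the thread-safe queue is the only synchronization point.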

However, unlike the client-side where performance and scalability rarely pose problems, on the server side where you need to serve hundreds of thousands of players, they become really (or, if your project is successful, “really really”) important. I know two ways of handling performance/scalability for games, while keeping logic single-threaded.

18a. Off-loading


Monday
Jul 6, 2015

How Do We Explain the Unreasonable Effectiveness of IT? 

Joseph Campbell: As Schopenhauer says, when you look back on your life, it looks as though it were a plot, but when you are into it, it’s a mess: just one surprise after another. Then, later, you see it was perfect. So, I have a theory that if you are on your own path things are going to come to you. Since it’s your own path, and no one has ever been on it before, there’s no precedent, so everything that happens is a surprise and is timely.

Why is the IT industry so darn effective? Just think about these amazing advancements. A little over 30 years ago the Apple Mac went on sale. Benedict Evans estimates that by 2020, 80% of adults on earth will have a smartphone. Back when the Mac launched, applications were typically monoliths that ran on one computer. Now applications can deploy with the push of a button on cloud native architectures that exploit many thousands of CPUs using datacenter scale operating systems. And software used to be this strange specialized niche only nerds cared about or understood. Now software is in everything and is so ubiquitous it’s becoming nearly invisible. The examples could go on and on and on...and on.

These advances have evolved step-by-step over time, so we don’t even realize the full weight of the transformative changes we’ve experienced. What can account for such astonishingly rapid progress?

Stepping stones.

What the heck do stepping stones have to do with anything? Here’s a clue...do you remember the Connections TV Series by the incredible James Burke?

For an explanation we turn to Ken Stanley, computer scientist, artificial intelligence researcher, and Associate Professor at the University of Central Florida, who wrote a new book, Why Greatness Cannot Be Planned: The Myth of the Objective, with a fascinatingly counterintuitive premise:

The greatest achievements become less likely when they are made objectives. The best way to achieve greatness, the truest path to “blue sky” discovery or to fulfill boundless ambition, is to have no objective at all. To achieve our highest goals, we must be willing to abandon them. 

The Big Idea


Wednesday
Jun 3, 2015

What Does it Mean to Poke a Complex System?

A little bit of follow up...

In How Can We Build Better Complex Systems? Containers, Microservices, And Continuous Delivery I had a question about what Mary Poppendieck meant when she talked about poking complex systems.

InfoQ interviewed Mary & Tom Poppendieck and found out the answer:


Tuesday
Jun 2, 2015

Why You Don't Want to Aim for 100% Uptime According to Google's Urs Hölzle

Wait, you don't want 100% uptime? Who said such a crazy thing? Risk taker Urs Hölzle, senior VP for technical infrastructure, in Google's Infrastructure Chief Talks SDN:

Whenever you try something new, there are going to be problems with it....We were willing to take the risk to get the innovation. Our VP who runs our site reliability gave a great talk about not aiming for 100% uptime....The easiest way to make it be at 100% is to resist change, because change is when bad things happen. Looks great for your SLA, but it's bad for your business because you slow down innovation.... In the first year of running B4, [we asked] "Will we have an outage?" Realistically, yes there's a high chance because it was all new code. Are we going to be perfect? Probably not. You have to have a willingness to take a little risk.

Monday
Jun 1, 2015

Developing Products in the Style of Etsy

How should you go about structuring your project? We have two general paradigms that I'll characterize as flowing from the Etsy coaching tree, emphasizing the monolith, and from the Netflix coaching tree, emphasizing microservices. This is of course an oversimplification, but it's for instructional purposes only. For a broad comparison of the two approaches take a look at The Great Microservices Vs Monolithic Apps Twitter Melee.

This is not a good vs. evil sort of mythos. The Force is truly one. We simply have two valid and functional ways of looking at the world.

I think wdewind nails the heart of the difference:

The point of the article is that local optimization gives you this tiny boost in the beginning for a long term cost that eventually moves the organization in a direction of shipping less. It's not that innovative technologies are bad.

The mentioned article is Choose Boring Technology by Dan McKinley, in which Dan does a great job exploring Etsy style development with both insight and wisdom. 

Dan explores four different principles:


Wednesday
May 27, 2015

A Toolkit to Measure Basic System Performance and OS Jitter

Jean Dagenais published a great response on a mechanical-sympathy thread to Gil Tene's article, The Black Magic Of Systematically Reducing Linux OS Jitter. It's full of helpful tools for tracking down jitter problems. I apologize for the incomplete attribution. I did not find a web presence for Jean. 

To complement the great information I got on the “Systematic Way to Find Linux Jitter”, I have created a toolkit that I now use to evaluate current and future trading platforms.

In case this can be useful, I have listed these tools, as well as the URLs to get the source code and a description of their usage. I am learning a lot by reading the source code, and the blog entry associated.

This is far from an exhaustive list, as every week I find either a new problem area or a new tool that improves my understanding of this beautiful problem domain ;)

These tools are grouped into these categories: 

  1. CPU, Memory, Disk, Network
  2. X86, Linux, and Java time resolution
  3. Context Switches & Inter Thread Latency
  4. System Jitter
  5. Application Building Blocks: Disruptor, OpenHFT, Aeron & Workload Generator
  6. Application Performance Testing

Happy Benchmarking and Jitter Chasing!

1. CPU, Memory, Disk, Network


Wednesday
May 13, 2015

To see the Future of the Apple Watch Just Go to Disneyland

 

Removing friction. That’s what the Apple Watch is good at.

Many think watches are a category flop because they don’t have that obvious killer app. Like hot sauce, maybe a watch isn’t something you eat all by itself, but it gives whatever you sprinkle it on a little extra flavor?

Walk into your hotel, the system recognizes you, your room number pops up on your watch, you walk directly to your room and unlock it with your watch.

Walk into an airport, your flight displays on your watch along with directions to your terminal. To get on the plane you just flash your watch. On landing, walk to your rental car and unlock it with your watch.

A notification arrives that it’s time to leave for your meeting, traffic is bad, best get an early start.

While shopping you check with your partner if you need milk by talking directly through your watch. In the future you’ll just know if you need milk, but we’re not there yet.

You can do all these things with a phone. Google Now, for example. What the easy accessibility of the watch does in these scenarios is remove friction. It makes it natural for a complex backend system to talk to you about things it learns from you and your environment. Hiding in a pocket or a purse, a phone is too inconvenient and too general purpose. Your watch becomes a small custom viewport onto a much larger, more connected world.

After developing my own watch extension, using other extensions, and listening to a lot of discussion on the subject, it’s clear the form factor of a watch is very limiting and will always be limiting. You’ll never be able to do much UI-wise on a watch. Even the cleverest programmers can only do so much with so little screen real estate and low resource usage requirements. Instagram and Evernote simply aren’t the same on a watch.

But that’s OK. Every device has what it does well. It takes time for users and developers to explore a new device space.

What a watch does well is not so much enable new types of apps, but plug people into much larger and smarter systems. This is where the friction is removed.

Re-enchanting the World Disneyland Style


Monday
May 11, 2015

Designing for Scale - Three Principles and Three Practices from Tapad Engineering

This is a guest post by Toby Matejovsky, Director of Engineering at Tapad (@TapadEng).

Here at Tapad, scaling our technology strategically has been crucial to our immense growth. Over the last four years we’ve scaled our real-time bidding system to handle hundreds of thousands of queries per second. We’ve learned a number of lessons about scalability along that journey.

Here are a few concrete principles and practices we’ve distilled from those experiences:

  • Principle 1: Design for Many
  • Principle 2: Service-Oriented Architecture Beats Monolithic Application
  • Principle 3: Monitor Everything
  • Practice 1: Canary Deployments
  • Practice 2: Distributed Clock
  • Practice 3: Automate To Assist, Not To Control

Principle 1: Design for Many

