Using Node.js PayPal Doubles RPS, Lowers Latency, with Fewer Developers, but Where Do the Improvements Really Come From?
PayPal gives yet another glowing report of an app rewritten in node.js experiencing substantial performance improvements. PayPal rewrote their account overview page, one of the most trafficked apps on the website, which was previously written in King Java.
The benefits:
- Full-stack engineers. Using JavaScript on both the front-end and the back-end removed an artificial boundary between the browser and server, allowing engineers to code both.
- Built almost twice as fast with fewer people
- Written in 33% fewer lines of code
- Constructed with 40% fewer files
- Double the requests per second vs. the Java application.
- 35% decrease in the average response time for the same page.
A common pro-Java response is an argument like "clearly these people don't know how to program Java," or "rewriting an application usually makes it faster," or "the benchmark is faulty," and so on. Consider it noted. These are all potential factors.
Baron Schwartz from VividCortex has a different take, one based on math. Using the Universal Scalability Law (yes, there is such a thing), which models why a system’s performance degrades under higher concurrency, Baron performed an analysis of PayPal’s Node-vs-Java benchmarks. The power of the USL is that it can pinpoint which factors are to blame for non-linear scaling. The findings:
Notice that Java’s sigma (serialization) parameter is lower and its kappa (crosstalk) parameter is higher than Node.js, and the reverse is true for Node.js. This means that Java is bottlenecked less on serialization, and Node.js is bottlenecked less on coherency delays. This is exactly what one should expect from their architectures (multi-threaded versus single-threaded with event loop) and their blog post (“using a single core for the node.js application compared to five cores in Java”).
Curiously, Baron also found that both Java and Node should be performing much better than they do. We should be seeing many hundreds of pages a second instead of the 1.8 pages/sec for a single user in Java, and 3.3 in Node.js. Why? That's unclear and is a good topic for a much deeper dive into the stack.
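For readers who have not met the USL before, it is usually written as X(N) = λN / (1 + σ(N − 1) + κN(N − 1)), where N is concurrency, λ is the throughput of a single user, σ is the serialization penalty and κ is the crosstalk (coherency) penalty. Here is a minimal Python sketch of that formula. The function names and the two coefficient sets are illustrative assumptions, not Baron's fitted values; they only show how a low-σ/high-κ system and a high-σ/low-κ system bend differently as load grows.

```python
import math


def usl_throughput(n, lam, sigma, kappa):
    """Throughput predicted by the Universal Scalability Law at concurrency n.

    lam   -- throughput of a single user (X(1))
    sigma -- serialization (contention) coefficient
    kappa -- crosstalk (coherency) coefficient
    """
    return lam * n / (1 + sigma * (n - 1) + kappa * n * (n - 1))


def peak_concurrency(sigma, kappa):
    """Concurrency at which USL throughput is highest: sqrt((1 - sigma) / kappa)."""
    return math.sqrt((1 - sigma) / kappa)


if __name__ == "__main__":
    # Hypothetical coefficients chosen only to illustrate the two shapes.
    profiles = {
        "low sigma, high kappa": (0.02, 0.002),
        "high sigma, low kappa": (0.10, 0.0002),
    }
    for label, (sigma, kappa) in profiles.items():
        print(label, "peaks near N =", round(peak_concurrency(sigma, kappa)))
        for n in (1, 8, 32, 128):
            # lam is set to 1.0 so the output reads as relative capacity.
            print("  N =", n, "->", round(usl_throughput(n, 1.0, sigma, kappa), 2))
```

With σ and κ both zero the formula reduces to perfectly linear scaling, which is why a fitted USL curve shows how far a system falls short of its own single-user rate.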
Why might Node be faster than Java? Baron surmises:
My guess is that Node is encouraging good programmer practices in terms of scalability, and Java less so. In other words, programmers probably have to work less hard to avoid bad scalability bottlenecks in Node than in Java.
Lots more details in Baron's excellent post.
There are also some good comments on the post, reinforcing the idea that doing the more parallel thing may be just easier in Node.
From Michael Holroyd:
If Paypal’s application is hitting a high-latency API this makes a lot of sense. Node strongly pushes the developer toward a non-blocking asynchronous style of programming, which is difficult to achieve in Java without understanding multi-threaded programming and increasing complexity. I’ve found the same thing in our experience at Arqball while switching to node: many applications have become *much* faster when rewritten for node thanks simply to getting work done while waiting on “slow” events (db, disk, cpu). Of course this concurrency is also possible with other technologies (Ruby in our case), but in practice we would only multi-thread applications if performance became a bottleneck for us.
Robert Treat:
Yes, that’s been our experience; we do web development in PHP, Perl, Python, and Node.js, and because our typical clients are oriented toward highly scaled systems, Node.js matches up well with that model “by default”, so it’s become very popular amongst our devs
James Roper:
Your argument about Node encouraging good scalability practices versus Java I think is spot on – this was exactly my point, though I took it further to say that Java hinders good scalability practices
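The "getting work done while waiting on slow events" point from the comments can be sketched in a few lines. This is an illustration only, written in Python's asyncio rather than Node, and it says nothing about PayPal's actual code: `fetch`, the call names and the 0.1 second delays are hypothetical stand-ins for a slow backend call.

```python
import asyncio
import time


async def fetch(name, delay):
    # Stand-in for a slow call (database, remote API). Awaiting the sleep
    # hands control back to the event loop so other calls can proceed.
    await asyncio.sleep(delay)
    return name


async def sequential(calls):
    # Each call starts only after the previous one has finished.
    return [await fetch(name, delay) for name, delay in calls]


async def concurrent(calls):
    # All calls are started together; their waits overlap on one thread.
    return list(await asyncio.gather(*(fetch(name, delay) for name, delay in calls)))


def timed(coro_fn, calls):
    # Run one strategy to completion and report (results, elapsed seconds).
    start = time.perf_counter()
    results = asyncio.run(coro_fn(calls))
    return results, time.perf_counter() - start


# Hypothetical backend calls needed to render one page.
CALLS = [("profile", 0.1), ("balance", 0.1), ("activity", 0.1)]

if __name__ == "__main__":
    for strategy in (sequential, concurrent):
        results, elapsed = timed(strategy, CALLS)
        print(strategy.__name__, results, round(elapsed, 2), "seconds")
```

The sequential version takes roughly the sum of the delays, while the concurrent one takes roughly the longest single delay. An event-loop platform makes the second style the default; a thread-per-request stack can do the same thing, but the developer has to reach for it deliberately.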
So what have we learned? Node.js continues its march into the core. The lure of full-stack engineers using a naturally performant language running on a platform with a huge software base is hard to ignore. Using the Universal Scalability Law is an interesting way to look at system performance. Java and Node are slower than they should be. And it would be interesting to see whether a Java app rewritten in Akka would perform as well as one written in Node.
Reader Comments (12)
The improvements came from a fresh start: the rewrite.
Riksi has it mostly correct. I bet that if they converted the Node.js app - as it is written - over to Java, they'd see the same or even more performance gains.
I spent the last 6+ years as CTO over a decently large Java codebase and what we found was that the biggest gains of all were in writing front-end code in static HTML/JavaScript and making explicit, intentional, non-framework-guessed calls to the backend to pass or retrieve data.
Sounds silly if you put it like that, but that approach works pretty darn well. Any halfway reasonable backend system is really good at taking in a JSON request, doing some business-required stuff with it, returning a response, and interacting with a database or message queue.
Structuring the app into nice, simple standalone well-documented testable API calls - as many Node apps are - and not having the app do a whole lot of fancy page composition (which many Node apps do not do) gets you a long, long way. You need to do that for the mobile version, after all, and almost nobody runs at Twitter's scale and needs to back off that to save a millisecond here or there.
The advice is often not to rewrite because you'll suffer from second system syndrome. Your system will suck because it tries to do too much. So the "rewritten system is faster" logic runs counter to a good body of experience.
Nobody will ever believe that Node.js is 'faster' than Java. If one takes into account man-years spent to optimize JVM and dynamic code generation in Java and in Node.js ... Should we?
The #1 reason being "full-stack engineers" makes me question the outcome. On my team we are full-stack engineers that work in multiple languages. Breaking the barrier between frontend and backend is a social problem, not a technical one.
Compare node.js against tomcat/servlet apps or play framework or vert.x etc. Or compare java to javascript.
Please also keep in mind that Java is far more mature than Node at this point. Where will Node be when it achieves the same level of maturity as Java? Given the same years of optimization as Java, I am sure that Node will be just as competitive, if not more so. As an analogy, how can you compare an experienced programmer with one fresh out of college?
This doesn't jibe with the latest framework benchmark numbers from TechEmpower.
This is also meaningless without seeing the code. Based on the claimed reduction in files and code size, I'll bet that a lot of the performance benefits come from a simplification of the code. The problem with Java isn't with the language itself, but rather with the bloated Enterprise ecosystem that favors heavy reliance on reflection-laden frameworks like Spring and JPA. Node.js doesn't have any of those, so it's a sure bet that the code is simpler and has fewer layers. A better comparison would be to rewrite the original in Java, without using all the heavyweight frameworks. This would probably give results more in line with the framework benchmarks cited above.
I don't really trust these performance numbers to tell me much. Systems are so touchy that differences of less than a factor of two are specious, as they can frequently be put down to (or could have been achieved with) myriad ancillary changes. When you throw in that with either language the systems should be doing better by 100x, the whole argument loses gravity. node.js implementations may (or may not) generally be faster than comparable ones developed as a classic Java web stack, but this article doesn't show us that.
Um, this benchmark is INVALID:
"We should be seeing many hundreds of pages a second instead of the 1.8 pages/sec for a single user in Java, and 3.3 in Node.js. Why?"
Yeah... why??? 3.3 pages a second SUCKS ROCKS. The high-scale systems I've worked on serve 1000+ pages per second (and yes, with tons of app logic, database calls, etc.).
This whole thing is akin to racing a Porsche against a Corvette and noting that the Corvette won by going about 8 miles per hour instead of the Porsche's 5 miles per hour. Seeing how the two cars do with the parking brake on does not yield even the tiniest morsel of useful information. It doesn't "point to" or "indicate" or "give you a ballpark" or anything. It's totally useless.
All that said, I think others here have it right: you can scale in both Node and Java just fine. I doubt that you'll see a lot of difference in performance if you truly compare apples to apples, or certainly none related to the platform you chose. At the end of the day broader issues are going to be more decisive, like design decisions, how well you cache, database queries, coding practices, etc. If PP re-wrote the Node app they just implemented in Java they'd see another doubling of performance, if that was important to them.
To me the difference between Node and Java is strongly typed vs. loosely typed. I prefer the former, but that's just me. They both have an application object and multi-threading (which for instance PHP does not), so they both have all the tools you need to scale should you choose to use them and do so correctly, diligently and consistently.
Oh, and I don't buy the "you can program in the same language on both client and server" line. That sounds like a solution in search of a problem to me: I've never seen a "problem" with Java or PHP developers not having enough intelligence to figure out HTML/JS on the client side.
If you like the loosely-typed environment of JS then choose Node, and vice-versa for Java. Don't choose Node because it's "faster than Java" (or Java for the opposite reason) or because of the "unified language between client and server" non-advantage.
"I bet that if they converted the Node.js app - as it is written - over to Java, they'd see the same or even more performance gains."
Just to clarify, they redesigned and rewrote the app from the ground up in both Java and Node.js.
"Just to clarify, they redesigned and rewrote the app from the ground up in both Java and Node.js."
Just to further clarify, their Java implementation was using the Spring framework, whereas Node.js utilized no such heavyweight framework.
Either way, their engineering capability is questionable when they are boasting about serving just over 3 pages per second.