Stuff The Internet Says On Scalability For March 24th, 2017

Hey, it's HighScalability time:

 This is real and oh so eerie. A custom microscope takes a 33-hour time lapse of a tadpole egg dividing.

If you like this sort of Stuff then please support me on Patreon.

  • 40Gbit/s: indoor optical wireless networks; 15%: energy produced by wind in Europe; 5: new tasty particles; 2000: Qubits are easy; 30 minutes: flight time for electric helicopter; 42.9%: of heathen StackOverflowers prefer tabs;

  • Quotable Quotes:
    • @RichRogersIoT: "Did you know? The collective noun for a group of programmers is a merge-conflict." - @omervk
    • @tjholowaychuk: reviewed my dad's company AWS expenses, devs love over-provisioning, by like 90% too, guess that's where "serverless" cost savings come in
    • @karpathy: Nature is evolving ~7 billion ~10 PetaFLOP NI agents in parallel, and has been for ~10M+ years, in a very realistic simulator. Not fair.
    • @rbranson: This is funny, but legit. Production software tends to be ugly because production is ugly. The ugliness outpaces our ability to abstract it.
    • @joeweinman: @harrietgreen1 : Watson IoT center opened in Munich... $200 million dollar investment; 1000 engineers #ibminterconnect
    • David Gerard: This [IBM Blockchain Service] is bollocks all the way down.
    • digi_owl: Sometimes it seems that the diff between a CPU and a cluster is the suffix put on the latency times.
    • Scott Aaronson: I’m at an It from Qubit meeting at Stanford, where everyone is talking about how to map quantum theories of gravity to quantum circuits acting on finite sets of qubits, and the questions in quantum circuit complexity that are thereby raised.
    • Founder Collective: Firebase didn’t try to do everything at once. Instead, they focused on a few core problems and executed brilliantly. “We built a nice syntax with sugar on top,” says Tamplin. “We made real-time possible and delightful.” It is a reminder that entrepreneurs can rapidly add value to the ecosystem if they really focus.
    • Elizabeth Kolbert: Reason developed not to enable us to solve abstract, logical problems or even to help us draw conclusions from unfamiliar data; rather, it developed to resolve the problems posed by living in collaborative groups. 
    • Western Union: the ‘telephone’ has too many shortcomings to be seriously considered as a means of communication.
    • Arthur Doskow: being fair, being humane may cost money. And this is the real issue with many algorithms. In economists’ terms, the inhumanity associated with an algorithm could be referred to as an externality. 
    • Francis: The point is that even if GPUs will support lower precision data types exclusively for AI, ML and DNN, they will still carry the big overhead of the graphics pipeline, hence lower efficiency than an FPGA (in terms of FLOPS/WATT). The winner? Dedicated AI processors, e.g. Google TPU
    • James Glasnapp: When we move out of the physical space to a technological one, how is the concept of a “line” assessed by the customer who can’t actually see the line? 
    • Frank: On the other hand, if institutionalized slavery still existed, factories would be looking at around $7,500 in annual costs for housing, food and healthcare per “worker”.
    • Baron Schwartz: If anyone thought that NoSQL was just a flare-up and it’s died down now, they were wrong...In my opinion, three important areas where markets aren’t being satisfied by relational technologies are relational and SQL backwardness, time series, and streaming data. 
    • CJefferson: The problem is, people tell me that if I just learn Haskell, Idris, Clojure, CoffeeScript, Rust, C++17, C#, F#, Swift, D, Lua, Scala, Ruby, Python, Lisp, Scheme, Julia, Emacs Lisp, Vimscript, Smalltalk, Tcl, Verilog, Perl, Go... then I'll finally find 'programming nirvana'.
    • @spectatorindex: Scientists had to delete Urban Dictionary's data from the memory of IBM's Watson, because it was learning to swear in its answers.
    • Animats: [Homomorphically Encrypted Deep Learning] is a way for someone to run a trained network on their own machine without being able to extract the parameters of the network. That's DRM.
    • Dino Dai Zovi: Attackers will take the least cost path through an attack graph from their start node to their goal node.
    • @hshaban: JUST IN: Senate votes to repeal web privacy rules, allowing broadband providers to sell customer data w/o consent including browsing history
    • KBZX5000: The biggest problem you face, as a student, when taking a programming course at a University level, is that the commercially applicable part of it is very limited in scope.
      You tend to become decent at writing algorithms. A somewhat dubious skill, unless you are extremely gifted in mathematics and / or somehow have access to current or unique hardware IPs (IP as in Intellectual Property).
    • Brian Bailey: The increase in complexity of the power delivery network (PDN) is starting to outpace increases in functional complexity, adding to the already escalating costs of modern chips. With no signs of slowdown, designers have to ensure that overdesign and margining do not eat up all of the profit margin.
    • rbanffy: Those old enough will remember that the AS/400 (now called iSeries) mapped all storage to a single address space. You had no disk - you had just an address space that encompassed everything and an OS that dealt with that.
    • @disruptivedean: Biggest source of latency in mobile networks isn't milliseconds in core, it's months or years to get new cell sites / coverage installed
    • Greg Ferro: Why Is 40G Ethernet Obsolete? Short Answer: COST. The primary issue is that 40G Ethernet uses 4x10G signalling lanes. On UTP, 40G uses 4 pairs at 10G each. 
    • @adriaanm: "We chose Scala as the language because we wanted the latest features of Spark, as well as [...] types, closures, immutability [...]"
    • ajamesm: There's a difference between (A) locking (waiting, really) on access to a critical section (where you spinlock, yield your thread, etc.) and (B) locking the processor to safely execute a synchronization primitive (mutexes/semaphores). (A minimal sketch follows this list.)
    • @evan2645: "Chaos doesn't cause problems, it reveals them" - @nora_js #SREcon17Americas #SRECon17
    • chrissnell: We've been running large ES clusters here at Revinate for about four years now. I've found the sweet spot to be about 14-16 data nodes, plus three master-only nodes. Right now, we're running them under OpenStack on top of our own bare metal with SAS disks. It works well but I have been working on a plan to migrate them to live under Kubernetes like the rest of our infrastructure. I think the answer is to put them in StatefulSets with local hostPath volumes on SSD.
    • @beaucronin: Major recurring theme of deep learning twitter is how even those 100% dedicated to the field can't keep up with progress.
    • Chris McNab: VPN certificates and keys are often found within and lifted from email, ticketing, and chat services.
    • @bodil: And it took two hours where the Rust version has taken three days and I'm still not sure it works.
    • azirbel: One thing that's generalizable (though maybe obvious) is to explicitly define the SLAs for each microservice. There were a few weeks where we gave ourselves paging errors every time a smaller service had a deploy or went down due to unimportant errors.
    • bigzen: I'm worn out on articles dissing the performance of SQL databases without quoting any hard numbers and then proceeding to replace the systems (with no mention of the development cost) with the latest and greatest tech. I have nothing against Spark, but I find it very hard to believe that alarm code is now more readable than SQL. In fact, my experience is just the opposite.
    • jhgg: We are experimenting with webworkers to power a very complicated autocomplete and scoring system in our client. So far so good. We're able to keep the UI running at 60fps while we match, score and sort results in a web-worker.
    • DoubleGlazing: NoSQL doesn't reduce development effort. What you gain from not having to worry about modifying schemas and enforcing referential integrity, you lose from having to add more code to your app to check that a DB document has a certain value. In essence you are moving responsibility for data integrity away from the DB and in to your app, something I think is quite dangerous.
    • Const-me: Too bad many computer scientists who write books about those algorithms prefer to view RAM in an old-fashioned way, as fast and byte-addressable.
    • Azur: It always annoys me a bit when tardigrades are described as extremely hardy: they are not. It is ONLY in the desiccated, cryptobiotic form that they are resistant to adverse conditions.
    • rebootthesystem: Hardware engineers can design FPGA-based hardware optimized for ML. A second set of engineers then uses these boards/FPGA's just as they would GPU's. They write code in whatever language to use them as ML co-processors. This second group doesn't have to be composed of hardware engineers. Today someone using a GPU doesn't have to be a hardware engineer who knows how to design a GPU. Same thing.
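
    To make ajamesm's two senses of "locking" concrete, here is a minimal Python sketch (illustrative only: the function names are mine, and Python's Lock hides the hardware-level primitive):

        import threading

        lock = threading.Lock()

        def spin_acquire(lk):
            # (A) Waiting on a critical section by busy-polling: the
            # thread burns CPU until the lock frees up (a userspace
            # spinlock; time.sleep(0) in the loop would yield instead).
            while not lk.acquire(blocking=False):
                pass

        def blocking_acquire(lk):
            # (A) Waiting by blocking: the OS parks the thread until
            # the lock is released.
            lk.acquire()

        # (B) is invisible at this level: acquire() itself bottoms out
        # in an atomic read-modify-write instruction (e.g. compare-and-swap)
        # that the processor must execute indivisibly -- the second kind
        # of "locking" in the quote.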

  • There should be some sort of Metcalfe's law for events. Maybe: the value of a platform is proportional to the square of the number of scriptable events emitted by unconnected services in the system. CloudWatch Events Now Supports AWS Step Functions as a Target. @ben11kehoe: This is *really* useful: Automate your incident response processes with bulletproof state machines #aws
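
    A minimal sketch of that wiring in Python with boto3 (the rule name, event pattern, and ARNs are illustrative placeholders, not from the announcement):

        import boto3

        events = boto3.client("events")

        # Trigger on an operational event -- here, any EC2 instance
        # entering the "stopped" state (a hypothetical choice of trigger).
        events.put_rule(
            Name="incident-response-trigger",
            EventPattern=(
                '{"source": ["aws.ec2"],'
                ' "detail-type": ["EC2 Instance State-change Notification"],'
                ' "detail": {"state": ["stopped"]}}'
            ),
            State="ENABLED",
        )

        # Point the rule at the Step Functions state machine that encodes
        # the incident-response runbook; CloudWatch Events assumes the
        # given IAM role to start executions.
        events.put_targets(
            Rule="incident-response-trigger",
            Targets=[{
                "Id": "incident-response-sfn",
                "Arn": "arn:aws:states:us-east-1:123456789012:stateMachine:IncidentResponse",
                "RoleArn": "arn:aws:iam::123456789012:role/cwe-invoke-sfn",
            }],
        )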

  • Cute faux O'Reilly book cover. Solving Imaginary Scaling Issues.

  • Intel's Optane SSD is finally out; though not quite meeting its initial "this will change everything" promise, it still might change a lot of things. Intel's first Optane SSD: 375GB that you can also use as RAM. 10x DRAM latency. 1/1000 NAND latency. 2400MB/s read, 2000MB/s write. 30 full-drive writes per day. 2.5x better density. $4/GB (1/2 RAM cost). 1.5TB capacity. 500k mixed random IOPS. Great random write response. Targeted at power users with big files, like databases. NDAs are still in place so there's more to learn later. PCPerspective: comparing a server with 768GB of DRAM to one with 128GB of DRAM combined with a pair of P4800X's, 80% of the transactions per second were possible (with 1/6th of the DRAM). More impressive was that matrix multiplication of the data saw a 1.1x *increase* in performance. This seems impossible, as Optane is still slower than DRAM, but the key here was that in the case of the DRAM-only configuration, half of the database was hanging off of the 'wrong' CPU. foboz1: For anyone who thinks this is a solution looking for a problem, think about two things: Big Data and mobile/embedded. Big Data has an endless appetite for large quantities of memory and fast storage; 3D XPoint plays into the memory hierarchy nicely. At the extreme other end of the scale, it may be fast enough to obviate the need for having DRAM+NAND in some applications. raxx7: And 3D XPoint isn't free of limitations yet. RAM has 50-100 ns latency, 50 GB/s bandwidth (128 bit interface) and unlimited write endurance. If 3D XPoint NVDIMM can't deliver this, we'll still need to manage the difference between RAM and 3D XPoint NVDIMM. zogus: The real breakthrough will come, I think, when the OS and applications are re-written so that they no longer assume that a computer's memory consists of a small, fast RAM bank and a huge, slow persistent set of storage--a model that had held true since just about forever. VertexMaster: Given that DRAM is currently an order of magnitude faster (and several orders vs this real-world x-point product) I really have a hard time seeing where this fits in. sologoub: we built a system using Druid as the primary store of reporting data. The setup worked amazingly well with the size/cardinality of the data we had, but was constantly bottlenecked at paging segments in and out of RAM. Economically, we just couldn't justify a system with RAM big enough to hold the primary dataset...I don't have access to the original planning calculations anymore, but 375GB at $1520 would definitely have been a game changer in terms of performance/$, and I suspect it would be good enough to make the end user feel like the entire dataset was in memory.
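
    A back-of-envelope check on those numbers, as a Python sketch (the ~$8/GB DRAM figure is only implied by "1/2 RAM cost", not stated in the piece):

        optane_gb, optane_per_gb = 375, 4.00   # $4/GB, per the piece
        dram_per_gb = 8.00                     # assumption from "1/2 RAM cost"

        # Drive cost: 375GB x $4/GB -- close to the $1520 street price sologoub cites.
        print(optane_gb * optane_per_gb)       # 1500.0

        # PCPerspective's test rig: 768GB DRAM vs 128GB DRAM + 2x 375GB P4800X.
        all_dram = 768 * dram_per_gb
        hybrid = 128 * dram_per_gb + 2 * optane_gb * optane_per_gb
        print(hybrid / all_dram)               # ~0.65 of the memory cost for ~80% of the TPS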

Don't miss all that the Internet has to say on Scalability. Click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read, so please keep on reading)...

