bstg: Wonder if the best way to improve insights from #bigdata isn't better analytics, but a fundamental change in the way we capture it? #in
Apple has made their WWDC 2011 Videos available. Apple is normally as closed as a counter-insurgency cell, but their WWDC videos are always top notch.
James Hamilton hits on another big change in the database landscape, moving away from crazy enterprise pricing schemes to more sustainable and rational models.
Great discussion on Reddit of Eben Moglen's fascinating The alternate net we need, and how we can build it ourselves. Our net has been turned against us. How do we get it back? Without anonymity the human race will not be human anymore. We need smart routers that work for us.
The NoSQL Fad, says Alex Popescu, won't be countered by a relational database with relaxed semantics, as that would just be recreating NoSQL in the first place.
Spark, in-memory cluster computing that aims to make data analytics fast — both fast to run and fast to write.
In The State of Management Scalability at Stack Exchange, Kyle Brandt talks about scaling their ability to manage their Linux and Windows environment through automation. The idea is that if you have to do something more than once on multiple servers, automate it. The cool part is they have a chart of which parts of their current process don't meet this goal, and a plan for how to get there.
Disruptor - a Concurrent Programming Framework, is a general-purpose mechanism that solves a complex problem in concurrent programming in a way that maximizes performance.
When Watson needs to be fed it dines at chez Hadoop. A Hadoop backend is used to crunch through the documents to prep for the interactive Jeopardy matches. There is no other system flexible enough to allow for the flexible knowledge extraction that we need.
More videos for you. NDC 2011 Video Torrent, a torrent of all the NDC 2011 (Norwegian Developers Conference) videos, is now available. If that's not enough, here are videos from Jfokus 2011. Emil Eifrem talks NoSQL and there are talks on GWT, Scala, Java EE 6, and TDD.
Performance is a Feature says Jeff Atwood. To be fast: Follow Yahoo's Guidelines, Optimize for Anonymous Users, Make Performance a Point of Public Pride.
Intel takes wraps off 50-core supercomputing coprocessor plans reports Jon Stokes. It's the age-old general-purpose (slower, easier to use) vs. specialized (faster, harder to use) tradeoff, and Intel is betting that since Tesla has so far been the only real option there are plenty of potential users out there who are in the market for something less specialized.
Lift - a web framework built on Scala to create concise, secure, scalable, highly interactive web applications that provide a full set of layered abstractions on top of HTTP and HTML.
Windows Azure Storage Abstractions and their Scalability Targets. A single queue is targeted to be able to process up to 500 messages per second. The target throughput of a single blob is up to 60 MBytes/sec. The throughput target for a single partition is up to 500 entities per second.
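As a rough back-of-the-envelope sketch, the per-unit targets quoted above can be turned into a count of how many queues, blobs, or partitions a workload would need. The function name and the assumption that load spreads evenly across units are ours, not Azure's; real workloads with hot keys will need more headroom.

```python
import math

# Per-unit targets as quoted in the item above.
QUEUE_MSGS_PER_SEC = 500          # messages/sec for a single queue
BLOB_MBYTES_PER_SEC = 60          # MBytes/sec for a single blob
PARTITION_ENTITIES_PER_SEC = 500  # entities/sec for a single partition


def units_needed(required_rate, per_unit_target):
    """Smallest number of units whose combined target covers required_rate.

    Hypothetical helper: assumes load is spread evenly across units.
    """
    if required_rate <= 0:
        return 0
    return math.ceil(required_rate / per_unit_target)


# Example: a workload of 2,200 messages/sec needs at least this many queues.
queues = units_needed(2200, QUEUE_MSGS_PER_SEC)
```

The same call works for partitions (`PARTITION_ENTITIES_PER_SEC`) or blobs (`BLOB_MBYTES_PER_SEC`), as long as the required rate is in the matching unit.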
Jeremiah Peschka with a good overview of Resolving Conflicts in the Database. Some options: Manual intervention, Logging conflicts, Master write server, Last write wins, Write partitioning.
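To make one of those options concrete, here is a minimal sketch of "last write wins" combined with "logging conflicts". It is not taken from Peschka's article; the `Version` type, its fields, and the tie-break on `node_id` are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    """Hypothetical record version: a value plus when and where it was written."""
    value: str
    timestamp: float
    node_id: str


def last_write_wins(a, b, conflict_log=None):
    """Return the version with the later timestamp.

    Ties are broken on node_id so every replica picks the same winner.
    If a conflict_log list is given, the discarded version is appended to it
    so the lost write can be inspected later.
    """
    winner = max(a, b, key=lambda v: (v.timestamp, v.node_id))
    loser = b if winner is a else a
    if conflict_log is not None and loser != winner:
        conflict_log.append(loser)
    return winner


log = []
kept = last_write_wins(
    Version("alice@old.example", 100.0, "node-a"),
    Version("alice@new.example", 105.0, "node-b"),
    conflict_log=log,
)
```

The catch with this strategy is that it silently depends on clocks being comparable across nodes, which is why keeping a log of the discarded writes is a cheap safety net.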
An unspoken law of the Internet is that all of Google's infrastructure must be recreated outside Google in open source form. GoldenOrb is doing their part by creating an open source version of Pregel, used for massive-scale graph processing. If you are unsure what Pregel is or how to use it, Michael Nielsen has a very good article on Pregel that's worth a look.
Riak Pipe details shared. Pipe allows you to specify work in the form of a chain of function pairs. It will be used to supercharge Riak's MapReduce feature.
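The "chain of function pairs" idea can be sketched in a few lines. This is not Riak Pipe's API (Riak Pipe is written in Erlang); it is a single-process illustration that assumes each pair is a placement function, which decides where an input is handled, plus a work function, which processes it. All names here are hypothetical.

```python
from collections import defaultdict


def run_pipeline(stages, inputs, num_partitions=4):
    """Run inputs through a chain of (place, work) function pairs.

    place(item, num_partitions) -> index of the partition that handles item
    work(item)                  -> iterable of outputs fed to the next stage
    """
    items = list(inputs)
    for place, work in stages:
        # Route each item to the partition chosen by the placement function.
        partitions = defaultdict(list)
        for item in items:
            partitions[place(item, num_partitions)].append(item)
        # Process partitions one after another; a distributed system would
        # run them in parallel on different nodes.
        items = [
            out
            for index in sorted(partitions)
            for item in partitions[index]
            for out in work(item)
        ]
    return items


def by_length(item, num_partitions):
    """Toy placement function: pick a partition from the item's length."""
    return len(item) % num_partitions


def split_words(line):
    return line.split()


def upper(word):
    return [word.upper()]


result = run_pipeline(
    [(by_length, split_words), (by_length, upper)],
    ["riak pipe", "map reduce"],
)
```

Output order depends on how items were partitioned, so callers that care about order should sort or key the results themselves.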