Friday
Jun 02, 2017

Gone Fishin'

Well, not exactly Fishin', but I'll be on a month-long vacation starting today. I won't be posting (much) new content, so we'll all have a break. Disappointing, I know. Please use this time for quiet contemplation and other inappropriate activities. Au revoir!

Friday
May 26, 2017

Stuff The Internet Says On Scalability For May 26th, 2017

Hey, it's HighScalability time:

 

 

Sport imitating tech. Cloud Computing chases down Classic Empire to win...the Preakness. (Daily News)

If you like this sort of Stuff then please support me on Patreon.
  • 42%: increase in US wireless traffic since 2015; 44: age of Ethernet; $18.5m: low cost of Target data breach; 25 million: record set from Library of Congress; 98%: WannaCry infections on Windows 7; 100 terabytes: daily Pinterest logging; 2020: when Microsoft will have DNA storage in the cloud; 220 μm: size of microbots; 2 billion: lines of code in Google repository; 40%+: esports industry growth

  • Quotable Quotes:
    • @Werner: There is no compression algorithm for experience.
    • @colinmckerrache: We just crossed over 2m EVs on the road. So yeah, second million took just under 18 months. Next million in about 10 months.
    • @swardley: When discussing China, stop thinking cheap labour, communism & copying ... to understand changes, start thinking World's largest VC.
    • @JOTB17: "Cars generate more than 4Tb of data a day, humans are becoming irrelevant in data collection" 😳 @saleiva #JOTB17
    • Wojciech Kudla: that's why blacklisting workqueues from critical cpus should be on the jitter elimination check list. They can be affinitized just like irqs
    • @ryanhuber: Any sufficiently advanced attacker is indistinguishable from one of your developers.
    • @spolsky: "During peak traffic hours on weekdays, there are about 80 people per hour that need help getting out of Vim."
    • SrslyJosh: Basing anything on proof-of-work puts you in a perpetual race to control more compute than your adversaries.
    • gkoberger: So, in my mind, Mozilla won. It's a non-profit, and it forced us into an open web. We got the world they wanted. Maybe the world is a bit Chrome-heavy currently, but at least it's a standards-compliant world.
    • NoGravitas: The basic argument of this article seems to be that the real benefit of cryptocurrencies, other than their speculative value, is that they provide a way of enforcing artificial scarcity in the digital realm, where scarcity does not come naturally.
    • Renee DiResta: The trouble is that “high-frequency trading” is about as precise as “fake news.”
    • Silicon Valley: I mean, that Ken doll probably thinks traversing a binary search tree runs in the order of "n," instead of "log n." Idiot.
    • @__apf__: If you look at another engineer's work and think, "That's dumb. Why don't you just..." Take a breath. Find out why the problem is hard.
    • Too many quotes. Please click through to read the full article.

  • Failing Kubernetes pods by playing whack-a-mole is an awesome idea. Funner than a barrel of chaos monkeys. You just have to see the video.

  • There are times when specialized hardware absolutely destroys commodity hardware. TensorFlow Frontiers. The need for Google to create the TPU became urgent in 2013 when it was realized if all Android users spoke to their phone for just three minutes a day it might force Google to double its number of datacenters. That drove a crash program to develop the first TPU. The first-gen TPU was 15-30x faster than contemporary CPUs & GPUs, 30-80x more power-efficient, but it only worked for inference, not training. The second-gen TPU has up to 180 teraflops of floating point performance, 64 GB of ultra-high-bandwidth memory, works for both training and inference (simpler to use), and can be connected together using a 2-D toroidal mesh network (tackle largest problems). On one problem training time was reduced from 24 hours to 6 hours.

  • Another victim of stack ranking. T.J. Miller Is Leaving Silicon Valley.

  • The biggest everyday risk from the massive data surveillance panopticon carried out by private corporations is not storm troopers busting down your door, it's this: everything will start costing you more. Whenever an algorithm calculates it has leverage over you it will exploit that advantage to charge you more. A computer-mediated personalized world will anticipate your needs, but it will also invisibly shape them. What is being created is the ultimate Skinner Box. Uber Is Using AI to Charge People as Much as Possible for a Ride.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Click to read more ...

Tuesday
May 23, 2017

Sponsored Post: Etleap, Pier 1, Aerospike, Loupe, Clubhouse, Stream, Scalyr, VividCortex, MemSQL, InMemory.Net, Zohocorp

Who's Hiring? 

  • Pier 1 Imports is looking for an amazing Sr. Website Engineer to join our growing team! Our customer continues to evolve the way she prefers to shop, speak to, and engage with us at Pier 1 Imports, driving us to innovate more ways to surprise and delight her as a premier home and decor retailer. We are looking for a candidate to be another key member of a driven agile team. This person will inform and apply modern technical expertise to website performance, development, and design techniques for Pier.com. To apply please email cmwelsh@pier1.com. More details are available here.

  • Etleap is looking for Senior Data Engineers to build the next-generation ETL solution. Data analytics teams need solid infrastructure and great ETL tools to be successful. It shouldn't take a CS degree to use big data effectively, and abstracting away the difficult parts is our mission. We use Java extensively, and distributed systems experience is a big plus! See full job description and apply here.

  • Advertise your job here! 

Fun and Informative Events

  • DBTA Roundtable OnDemand Webinar: Leveraging Big Data with Hadoop, NoSQL and RDBMS. Watch this recent roundtable discussion hosted by DBTA to learn about key differences between Hadoop, NoSQL and RDBMS. Topics include primary use cases, selection criteria, when a hybrid approach will best fit your needs and best practices for managing, securing and integrating data across platforms. Brian Bulkowski, CTO and Co-founder of Aerospike, presented along with speakers from Cask Data and Splice Machine. View now.

  • Advertise your event here!

Cool Products and Services

  • A note for .NET developers: You know the pain of troubleshooting errors with limited time, limited information, and limited tools. Log management, exception tracking, and monitoring solutions can help, but many of them treat the .NET platform as an afterthought. You should learn about Loupe...Loupe is a .NET logging and monitoring solution made for the .NET platform from day one. It helps you find and fix problems fast by tracking performance metrics, capturing errors in your .NET software, identifying which errors are causing the greatest impact, and pinpointing root causes. Learn more and try it free today.

  • Etleap provides a SaaS ETL tool that makes it easy to create and operate a Redshift data warehouse at a small fraction of the typical time and cost. It combines the ability to do deep transformations on large data sets with self-service usability, and no coding is required. Sign up for a 30-day free trial.

  • InMemory.Net provides a .NET-native in-memory database for analysing large amounts of data. It runs natively on .NET, and provides native .NET, COM, and ODBC APIs for integration. It also has an easy-to-use language for importing data, and supports standard SQL for querying data. http://InMemory.Net

  • www.site24x7.com : Monitor End User Experience from a global monitoring network. 

  • Working on a software product? Clubhouse is a project management tool that helps software teams plan, build, and deploy their products with ease. Try it free today or learn why thousands of teams use Clubhouse as a Trello alternative or JIRA alternative.

  • Build, scale and personalize your news feeds and activity streams with getstream.io. Try the API now in this 5-minute interactive tutorial. Stream is free up to 3 million feed updates so it's easy to get started. Client libraries are available for Node, Ruby, Python, PHP, Go, Java and .NET. Stream is currently also hiring Devops and Python/Go developers in Amsterdam. More than 400 companies rely on Stream for their production feed infrastructure, including apps with 30 million users. With your help we'd like to add a few zeros to that number. Check out the job opening on AngelList.

  • Scalyr is a lightning-fast log management and operational data platform. It's a tool (actually, multiple tools) that your entire team will love. Get visibility into your production issues without juggling multiple tabs and different services -- all of your logs, server metrics and alerts are in your browser and at your fingertips. Loved and used by teams at Codecademy, ReturnPath, Grab, and InsideSales. Learn more today or see why Scalyr is a great alternative to Splunk.

  • VividCortex is a SaaS database monitoring product that provides the best way for organizations to improve their database performance, efficiency, and uptime. Currently supporting MySQL, PostgreSQL, Redis, MongoDB, and Amazon Aurora database types, it's a secure, cloud-hosted platform that eliminates businesses' most critical visibility gap. VividCortex uses patented algorithms to analyze and surface relevant insights, so users can proactively fix future performance problems before they impact customers.

  • MemSQL provides a distributed in-memory database for high-value data. It's designed to handle extreme data ingest and store the data for real-time, streaming and historical analysis using SQL. MemSQL also cost-effectively supports both application and ad-hoc queries concurrently across all data. Start a free 30-day trial here: http://www.memsql.com/

  • Advertise your product or service here!

If you are interested in a sponsored post for an event, job, or product, please contact us for more information.


Friday
May 19, 2017

Stuff The Internet Says On Scalability For May 19th, 2017

Hey, it's HighScalability time:

 

 

Who wouldn't want to tour the Garden of Mathematical Sciences with Plato as their guide?

If you like this sort of Stuff then please support me on Patreon.
  • 2 billion: Android users; 1,000: cloud TPUs freely available to researchers; 11.5 petaflops: in Google's machine learning pod; 86 billion: neurons in the human brain, not 100 billion; 1,300: Amazon's new warehouses across Europe; $1 trillion: China self-investment; 1/7th: California's portion of US GDP; more: repetition in songs; 99.999%: Spanner availability, strong consistency, good latency; 6: successful SpaceX launches in 4 months; 160TB: RAM in HPE computer; 40,000+ workers: private offices > open offices

  • Quotable Quotes:
    • Tim Bray: without exception, I observed that they [Personal computers, Unix, C, the Internet and Web, Java, REST, mobile, public cloud] were initially loaded in the back door by geeks, without asking permission, because they got shit done and helped people with their jobs. That’s not happening with blockchain. Not in the slightest. Which is why I don’t believe in it.
    • @swardley: Amazon continues to take industry after industry not because those companies lack engineering talent but executive talent.
    • @RichRogersIoT: "I bought my boss two copies of The Mythical Man Month so that he could read it twice as fast." - @rkoutnik
    • @GossiTheDog: Seeing ATMs and banks go down here suggests fundamental issues which flashing boxes can't fix. Design, architect a security model.
    • @stevesi: Is Google's TPU investment the biggest advantage ever or laying groundwork for being disrupted? Can Google out-innovate sum of industry?
    • Ryan Mac: Last year, Craigslist took in upwards of $690 million in revenue, most of which is net profit
    • @dberkholz: Capex vs opex budget for tools is a bigger deal than I'd fully appreciated. Welcome to the enterprise!
    • Vint Cerf: AI stands for artificial idiot. 
    • Douglas Hofstadter: In the end, we are self-perceiving, self-inventing, locked-in mirages that are little miracles of self-reference. 
    • cocktailpeanuts: I feel like the term "Serverless" has been hijacked to a point that it will soon become meaningless just like "AI", "IoT", etc. Basically "Serverless" in 2017 has become just a hype friendly marketing friendly way of saying "Saas".
    • @skupor: Over last 20 years, m&a exits for venture backed companies has gone from 60% to 90% of exits (was 20% in 1990)
    • bpicolo: C# with visual studio is, I think, the most productive environment I've come across in programming. It's ergonomically sound, straightforward, and the IDE protects me from all sorts of relevant errors. Steve mentioned Intellij is a bit slower than he'd hope typing sometimes. I totally agree with that. I think Visual Studio doesn't quite suffer from that.
    • @codepitbull: A good developer is like a werewolf: Afraid of silver bullets.
    • @sehnaoui: Coffee shop. People next to me are loud and rude. They just found the perfect name for their new business. I just bought the domain name.
    • David Robinson: Python and Javascript developers start and end the day a little later than C# users, and are a little less likely than C programmers to work in the evening.
    • Ben Thompson: The fatal flaw of software, beyond the various technical and strategic considerations I outlined above, is that for the first several decades of the industry software was sold for an up-front price, whether that be for a package or a license. The truth is that software — and thus security — is never finished; it makes no sense, then, that payment is a one-time event.
    • boulos: Spanner does things for you that MySQL et al. don't. Having an automagic Regional (and eventually Global if you'd like) database without dealing with sharding is worth $8k/year even to me. So even if it could fit on $10/month of hardware, I don't begrudge them for charging a service fee, rather than saying "This is how much cores, RAM, disk and flash this eats".
    • codedokode: One of the reasons why such attack was possible is poor security in Windows. Port 445 that was used in an attack is opened by a kernel driver (at least that is what netstat says on WinXP) that runs in ring 0. This driver is enabled by default even if the user doesn't need SMB server and it cannot be easily disabled.
    • @RichRogersIoT: Job interview:  Implement Large Hadron Collider on whiteboard / Actual job:  Jira bug-id #2342: Move login button 3 pixels to left
    • slackingoff2017: This is part of a worrying new trend. Increasingly you can't buy software anymore, only rent. Innovation is being kept from scrutiny hidden behind closed doors. The kind of thing patents were meant to prevent back when the system wasn't broken.
    • Scott Borg~ Engineers need to look at their products from the standpoint of the attacker, and consider how attacker would benefit from cyberattack and how to make undertaking that attack more expensive. It’s all about working to increase an attacker’s costs
    • @tottinge: "A code base isn't a thing we build, it's a place we live. We don't seek to finish it and move on, but to make it liveable"  @sarahmei
    • Sam Kroonenburg: We Believe …Don’t do the things that someone else can do. Do the things that only we can do. [re: Serverless]
    • Anush Mohandass: What you’re starting to see are different architectures for different workloads. There will be chips for image recognition, SQL, machine learning acceleration. 
    • Craig McLuckie: Given the current state-of-the-art, most users will achieve best day-to-day top line availability by just picking a single public cloud provider and running their app on one infrastructure.
    • watmough: Chromebooks work, and I am a big fan of them in education. I have a pretty good idea how hard our teachers work, and I'd hate to think of the Windows bullshit being imposed them, like it's imposed on me and my coworkers.
    • axilmar: It [React Native] is the future! But you need experience to make it work, and navigation/routing is still being worked out, and it is native, but it is Javascript, and it is crossplatform, but you need to be aware of the differences of the two platforms, and styling uses something that is like css but not entirely, you have to learn all the intricate details...
      Thank god software engineering "practices" are not used in other engineering disciplines...
    • Anton Howes: So without the British acceleration of innovation, the Industrial Revolution would likely have happened elsewhere within a few decades. France and the Low Countries and Switzerland and the United States were by the eighteenth century well on their way towards sustained modern economic growth. 
    • Dr. Suzana Herculano-Houzel~ evolution is not progress, all that evolution means is change over geological time, it's not for the better, it's not for the worst, it's just different. All it has to do with is generating diversity. We have ample evidence we are not descendents of reptiles, we are close cousins. We could not have a basic reptile brain to which something else was added. We know now that every reptile has a neo-cortex. There is not such thing as triune brain. There is no such thing as reptilian brain on top of which a new structure appeared only in mammals. We all have it. The brain is very much the same in its essence, the difference lies in the quantities. 
    • James Clear: The great mistake of Hurricane Katrina was that the levees and flood walls were not built with a proper “margin of safety.” The engineers miscalculated the strength of the soil the walls were built upon. As a result, the walls buckled and the surging waters poured over the top, eroding the soft soil and magnifying the problem. Within a few minutes, the entire system collapsed.
    • elvinyung: This "modern" Spanner feels very different from the one we saw in 2012 [1]. Some interesting takeaways:
      • There is a native SQL interface in Spanner, rather than relying on a separate upper-layer SQL layer, a la F1 [2]
      • Spanner is no longer on top of Bigtable! Instead, the storage engine seems to be a heavily modified Bigtable with a column-oriented file format
      • Data is resharded frequently and concurrently with other operations -- the shard layout is abstracted away from the query plan using the "distributed union" operator
      • Possible explanation for why Spanner doesn't support SQL DML writes: writes are required to be the last step of a transaction, and there is currently no support for reading uncommitted writes (this is in contrast to F1, which does support DML)
      • Spanner supports full-text search (!)

  • Cautionary tale number 1000 on depending on someone else's service. Firebase Costs Increased by 7,000%! Google changed something (billing for SSL overhead) and HomeAutomation's bill spiked. There was no warning. There were no tools to tell why. Support stopped replying. There's no one to call. The recommendation is to protect yourself from being trapped by a service from the very beginning. They've moved to Lambda/DynamoDB, which many point out is also a potential service trap. The Firebase Founder responded with an explanation, saying he was "embarrassed by the level of communication on our side." Good discussion on HackerNews and on reddit. Lots of people with similar stories, complaints about lack of support from Google, complaints about lack of transparency, and the usual advice to never rely on anything, ever.

  • Serverlessconf Austin '17 videos are now available (most of them anyway). 


Monday
May 15, 2017

Is Serverless the New Visual Basic?

With Serverless, hiring less experienced developers can work out better than hiring experienced cloud developers. That's an interesting point I haven't heard before, and it was made by Paul Johnston, CTO of movivo, in The ServerlessCast #6 - Event-Driven Design Thinking.

The thought process goes something like this...

An experienced cloud developer will probably think procedurally, in terms of transactional systems, frameworks, and big fat containers that do lots of work. 

That's not how a Serverless developer needs to think. A Serverless developer needs to think in terms of small functions that do one thing linked together by events; and they need to grok asynchronous and distributed thinking.

So the idea is you don't need typical developer skills. Paul finds people with sysadmin skills have the right stuff. Someone with a sysadmin background is more likely than a framework developer to understand the distributed thinking that goes with building an entire system of events.

Paul also makes the point that once a system has been built, experienced developers will get bored because Serverless systems don't require the same amount of maintenance.

For example, they had good success hiring a person with two years of vo-tech on-the-job training because that person didn't have the baggage of working with frameworks and servers and all of those kinds of things. That baggage gets in the way.

So hire younger, hungrier developers who don't have that experience behind them. 

Obviously "younger, hungrier" and "less experienced" also mean cheaper, not that there's anything wrong with that. Developers are hard to find.

We've seen this kind of thing before. Using Visual Basic, relatively inexperienced people built lots of systems that did real and important work for companies, because VB made it so easy to write a Windows program. Before VB, it was really difficult and time-consuming to write a Windows program, just as it's really difficult and time-consuming to write a cloud program today. Like VB, Serverless radically reduces the expertise needed to write a cloud program.

Though they got the job done, most of those VB programs were technical debt bombs. Over time as more and more functionality was bolted on they became hard to understand, hard to change, hard to test, and were poorly designed. Your classic Big Ball of Mud.

A lot of the problem was that VB made it easy to include business logic in event handlers, so there was no layering; the GUI was the orchestrator. This made VB programs hard to test. Serverless also has this problem. Inexperienced programmers also used a lot of global variables in VB programs, so there wasn't a clean separation of concerns. Coupling was high and cohesion was low. Serverless also has this problem: though obviously there are no global variables in the code, the database effectively becomes a store for global variables that can be accessed from any Serverless function.
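One common way to counter both problems is to keep the handler thin and pass shared state in explicitly. The Python sketch below is only an illustration under stated assumptions: the discount rule, the `handler` signature, and the dictionary standing in for a database are all hypothetical and do not correspond to any particular Serverless platform's API.

```python
from typing import Dict


def apply_discount(total_cents: int, loyalty_years: int) -> int:
    """Pure business rule (hypothetical): 5% off per loyalty year, capped at 20%.

    No event parsing and no database access, so it can be unit tested alone.
    """
    percent = min(loyalty_years * 5, 20)
    return total_cents - (total_cents * percent) // 100


def handler(event: Dict, store: Dict[str, int]) -> Dict:
    """Thin event handler: unpack the event, read state, delegate the logic.

    `store` is an assumed stand-in for a database client. Passing it in,
    rather than reaching for a shared global, keeps the dependency visible.
    """
    years = store.get(event["customer_id"], 0)
    return {"total_cents": apply_discount(event["total_cents"], years)}


# Usage: the caller decides which store the handler sees.
result = handler({"customer_id": "c1", "total_cents": 10000}, {"c1": 1})
```

Because the business rule lives outside the handler, it can be tested without simulating an event, and a test can hand the handler a small in-memory store instead of a live database.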

It will be interesting to see if Serverless can avoid VB's fate.

On HackerNews

Friday
May 12, 2017

Stuff The Internet Says On Scalability For May 12th, 2017

Hey, it's HighScalability time:

 

 

Earth's surface is covered with accidental hidden letters. Can you find them? (ABC: The Alphabet from the Sky)

 

If you like this sort of Stuff then please support me on Patreon.
  • 1 million: cord cutters in Q1; 500 billion: FINRA validations of stock trades every day on Lambda; 100k: messages sent per hour at Airbnb; 21.1 billion: transistors in GV100 GPU; 11,500: crashes to train a drone; 84,469: Backblaze hard drives; 8,000: questions per day asked on StackOverflow; 

  • Quotable Quotes:
    • Jonathan Taplin: Google Is as Close to a Natural Monopoly as the Bell System Was in 1956
    • Tom Goldenberg: more companies on the site [StackShare.io] use JavaScript on the back-end (6,000) than Python (4,100) or Java (3,900).
    • Andrew Shafer: The dark ages of the relational database and the Java middleware stack paused everything for a decade.
    • @Taytus: "We are early stage investors. Call me when you hit 1 million monthly active users"
    • @chrisjrn: "At this point I was drunk on Perl" @bradfitz #tweetsincontext #oscon
    • Bryan Cantrill: AWS is underwriting a war on big box retail. 
    • Paul Gilster: You’re reading that right — one-tenth of a milliwatt is enough to create error-free communications between the Sun and Alpha Centauri through two FOCAL antennas [gravitational lens].
    • Vadim Markovtsev: There is a productivity peak between 2 pm and 5 pm for all the languages, when the commit frequency is the highest. This is the industry’s golden time. Managers should never distract coders during this interval.
    • Patrick Tucker: The goal, one day, is a neural net that can learn instantaneously, continuously, and in real-time, by observing the brainwaves and eye movement of highly trained soldiers doing their jobs.
    • @alicegoldfuss: it is incredibly difficult to balance "don't burn out and become a statistic" with "get as far as you can fast so they can't take it away"
    • Jonathan Taplin: With the advent of YouTube and other streaming services, revenue for musicians has fallen 70%. If you had a song that had a million downloads on iTunes, you would get $900,000. On YouTube, you’d get $900.
    • David Robinson: In short, if we had to summarize the average story [after analyzing 100,000 stories] that humans tell, it would go something like Things get worse and worse until at the last minute they get better.
    • Confucius: He who cannot describe the problem will never find the solution to that problem
    • Peter Thiel: competition is for losers
    • Jason McGee~ Serverless adoption is moving 10x faster than Container adoption.
    • Max Ehrenfreund: On average, workers born in 1942 earned as much or more over their careers than workers born in any year since
    • Michael Elad: To put it bluntly, your grandchild is likely to have a robot spouse. And here is the punch line: much of the technology behind this bizarre future is likely to emerge from deep learning and its descendant fields.
    • aliostad: We just did a benchmarking for a PoC on DocumentDB side-by-side Cassandra. It does the job, I have not yet seen anything revolutionary. Cassandra benchmarks seemed better.
    • AWS Lambda Engineer: When you develop a Lambda function that uses SQS, SNS, Dynamo and other stuff in the cloud.. you can’t really debug it on your local. People just need to change their mindset
    • sbuttgereit: What looks compelling about the PostgreSQL offering as compared to AWS RDS is that it looks like you get a PostgreSQL cluster rather than a single database in a shared cluster.
    • Warren Toomey: Simulated hardware is infinitely easier to obtain, configure and diagnose than real hardware.
    • Kate Kaye: The mistake companies have made, he says, is to rely too much on targeted advertising, cutting too far back on broader advertising that builds brand awareness with people outside the existing customer base and eventually leads to new sales.
    • cbanek: I've had to work on mission critical projects with 100% code coverage (or people striving for it). The real tragedy isn't mentioned though - even if you do all the work, and cover every line in a test, unless you cover 100% of your underlying dependencies, and cover all your inputs, you're still not covering all the cases.
    • There's just too many quotes. Please read the full article to see them all.

  • Is bundling a race to the bottom for content creators? What's the future of game monetization?: the value of games seems to keep falling...The fact that we want everything free now because it costs less (not 'nothing', remember) to produce each additional unit is a fairly entitled view and, I suggest, it would lead to the destruction of the  games industry in the same way that it's gutted the music industry...The success of Spotify and Netflix's models in other industries worries me and we see a bit of a move in that direction with things like Humble Bundles...If we're not careful, we'll get to where there's no money to be made in games and only the most trite, generic, relatively low cost and mass-appealing titles (the Call of Duties and FIFAs) will be financially viable...it's worth noting that these titans are resorting to F2P to try and shore up their player numbers. Will we ever see subscription models in new games again?

  • A 10,000+ phone Chinese click farm looks a lot like Facebook's mobile device testing lab


Tuesday
May092017

Sponsored Post: Etleap, Pier 1, Aerospike, Loupe, Clubhouse, Stream, Scalyr, VividCortex, MemSQL, InMemory.Net, Zohocorp

Who's Hiring? 

  • Pier 1 Imports is looking for an amazing Sr. Website Engineer to join our growing team!  Our customer continues to evolve the way she prefers to shop, speak to, and engage with us at Pier 1 Imports.  Driving us to innovate more ways to surprise and delight her expectations as a Premier Home and Decor retailer.  We are looking for a candidate to be another key member of a driven agile team. This person will inform and apply modern technical expertise to website site performance, development and design techniques for Pier.com. To apply please email cmwelsh@pier1.com. More details are available here.

  • Etleap is looking for Senior Data Engineers to build the next-generation ETL solution. Data analytics teams need solid infrastructure and great ETL tools to be successful. It shouldn't take a CS degree to use big data effectively, and abstracting away the difficult parts is our mission. We use Java extensively, and distributed systems experience is a big plus! See full job description and apply here.

  • Advertise your job here! 

Fun and Informative Events

  • DBTA Roundtable OnDemand Webinar: Leveraging Big Data with Hadoop, NoSQL and RDBMS. Watch this recent roundtable discussion hosted by DBTA to learn about key differences between Hadoop, NoSQL and RDBMS. Topics include primary use cases, selection criteria, when a hybrid approach will best fit your needs and best practices for managing, securing and integrating data across platforms. Brian Bulkowski, CTO and Co-founder of Aerospike, presented along with speakers from Cask Data and Splice Machine. View now.

  • Advertise your event here!

Cool Products and Services

  • A note for .NET developers: You know the pain of troubleshooting errors with limited time, limited information, and limited tools. Log management, exception tracking, and monitoring solutions can help, but many of them treat the .NET platform as an afterthought. You should learn about Loupe...Loupe is a .NET logging and monitoring solution made for the .NET platform from day one. It helps you find and fix problems fast by tracking performance metrics, capturing errors in your .NET software, identifying which errors are causing the greatest impact, and pinpointing root causes. Learn more and try it free today.

  • Etleap provides a SaaS ETL tool that makes it easy to create and operate a Redshift data warehouse at a small fraction of the typical time and cost. It combines the ability to do deep transformations on large data sets with self-service usability, and no coding is required. Sign up for a 30-day free trial.

  • InMemory.Net provides a Dot Net native in memory database for analysing large amounts of data. It runs natively on .Net, and provides a native .Net, COM & ODBC apis for integration. It also has an easy to use language for importing data, and supports standard SQL for querying data. http://InMemory.Net

  • www.site24x7.com : Monitor End User Experience from a global monitoring network. 

  • Working on a software product? Clubhouse is a project management tool that helps software teams plan, build, and deploy their products with ease. Try it free today or learn why thousands of teams use Clubhouse as a Trello alternative or JIRA alternative.

  • Build, scale and personalize your news feeds and activity streams with getstream.io. Try the API now in this 5 minute interactive tutorial. Stream is free up to 3 million feed updates so it's easy to get started. Client libraries are available for Node, Ruby, Python, PHP, Go, Java and .NET. Stream is currently also hiring Devops and Python/Go developers in Amsterdam. More than 400 companies rely on Stream for their production feed infrastructure, including apps with 30 million users. With your help we'd like to add a few zeros to that number. Check out the job opening on AngelList.

  • Scalyr is a lightning-fast log management and operational data platform. It's a tool (actually, multiple tools) that your entire team will love. Get visibility into your production issues without juggling multiple tabs and different services -- all of your logs, server metrics and alerts are in your browser and at your fingertips. Loved and used by teams at Codecademy, ReturnPath, Grab, and InsideSales. Learn more today or see why Scalyr is a great alternative to Splunk.

  • VividCortex is a SaaS database monitoring product that provides the best way for organizations to improve their database performance, efficiency, and uptime. Currently supporting MySQL, PostgreSQL, Redis, MongoDB, and Amazon Aurora database types, it's a secure, cloud-hosted platform that eliminates businesses' most critical visibility gap. VividCortex uses patented algorithms to analyze and surface relevant insights, so users can proactively fix future performance problems before they impact customers.

  • MemSQL provides a distributed in-memory database for high value data. It's designed to handle extreme data ingest and store the data for real-time, streaming and historical analysis using SQL. MemSQL also cost effectively supports both application and ad-hoc queries concurrently across all data. Start a free 30 day trial here: http://www.memsql.com/

  • Advertise your product or service here!

If you are interested in a sponsored post for an event, job, or product, please contact us for more information.

Click to read more ...

Monday
May082017

Privacy: Bartering Data for Services

Data is the new currency. A phrase we’ve heard frequently in the wake of the story of Unroll.me selling user data to Uber.

Two keys to that story:

  • Users didn’t realize their data was being sold.
  • Free services can be considered a sophisticated form of phishing attack.

In both cases prevention requires user awareness. How do we get user awareness? Force meaningful disclosure. How do we force meaningful disclosure? Here’s an odd thought: use the tax system.

If data is the new currency then why isn’t exchanging data for use of a service a barter transaction? If a doctor exchanges medical services for chickens, for example, that is a taxable event at fair market value. It's a barter arrangement. A free service that sells user data is similarly bartering the service for data, otherwise said service would not be offered. 

How would it work?

  • Service providers send out 1099-Bs to users for the fair market value of the service. Fair market value could be determined from the price of a similar for-pay service, or as a percentage of the income generated from the data being sold.

  • The IRS treats barter transactions as income received. Users would need to pay income tax for the “free” services they use that sell their data.
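As a rough illustration of the arithmetic, here is a minimal sketch of the two valuation approaches mentioned above. Everything in it is an assumption for the sake of the example: the function names, the even per-user split of data income, the prices, and the tax rate are all hypothetical and do not reflect any actual IRS rule.

```python
from dataclasses import dataclass


@dataclass
class BarterEstimate:
    fair_market_value: float  # dollars the service would report for the year
    tax_owed: float           # user's income tax on that amount


def fmv_from_comparable(monthly_price: float, months_used: int) -> float:
    """Approach 1: price of a similar for-pay service over the period used."""
    return monthly_price * months_used


def fmv_from_data_income(data_income: float, user_count: int, share: float) -> float:
    """Approach 2: a share of data-sale income, split evenly per user (assumed)."""
    return data_income * share / user_count


def estimate(fmv: float, marginal_rate: float) -> BarterEstimate:
    """Apply a hypothetical marginal tax rate to the reported value."""
    return BarterEstimate(fmv, round(fmv * marginal_rate, 2))


# Hypothetical numbers: a comparable paid service costs $5/month.
comparable = estimate(fmv_from_comparable(5.0, 12), marginal_rate=0.22)

# Hypothetical numbers: $1M of data income, 500,000 users, 50% attributed.
from_income = estimate(
    fmv_from_data_income(1_000_000.0, 500_000, 0.5), marginal_rate=0.22
)
```

With these made-up inputs the two approaches give very different reported values ($60 versus $1 per user per year), which is why the choice of valuation method would matter.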

What would it accomplish?

  • Force disclosure by services. Businesses making money selling data would be forced to inform their users that they are doing so because it’s required for tax accounting.

  • Eyes Wide Open. Users would know for certain that the services they are using are selling their data. They could then determine if the relationship is worth the cost.

This would not prevent free service for data arrangements. There’s nothing wrong with exchanging data for a service, but everyone should enter such a transaction knowingly.

Friday
May052017

Stuff The Internet Says On Scalability For May 5th, 2017

Hey, it's HighScalability time:

 

 

GPUs and CPUs run hot hot hot. See them in action with thermal imaging. (Tested)

 

If you like this sort of Stuff then please support me on Patreon.
  • 25ms: SpaceX satellite latency; 17 million: tax returns received by IRS during week ending April 21; 1.94 billion: Facebook users; 1.2 billion: Lambda requests by Expedia / month; ~$91.5K: Capital One's yearly Serverless TCO; 1.2 billion: Facebook Messenger users; 215 petabytes: storage per gram of DNA; 1/2: households in US are Amazon Prime members; 50.8%: households in US that are mobile phone only; 80 billion: street view images; 3 million: open sourced Instacart orders; $175: RaaS (ransomware-as-a-service); 350,000+: Amazon employees; 

  • Quotable Quotes:
    • Paul Barnum: You can have a second computer when you've shown you know how to use the first one
    • @chrisalbon: 2007: “You are the product.”  2017: “You are the training data.”
    • shitloadofbooks: As an Ops guy, I preach Ansible + systemd all day everyday, but so many of our Devs (and Ops) have drunk the containerization Kool-aid.
    • roland-s: Like you, I'm sometimes unsure if this is the right choice. Maybe a monolithic server or traditional VMs + Puppet would be easier, simpler, better? In the end, I think Docker just fit with the way I conceptualized my problem so I went for it.
    • Venki Ramakrishnan: each experiment generates several terabytes of data, which is then massaged, analyzed, and reduced, and finally you get a structure. 
    • @dberkholz: A 19-line sample pulled in 190,000 lines of code in dependencies. Is that what you call a 10000x programmer? #ServerlessConf
    • @asymco: Apple Watch continues to struggle as unit sales more than doubled in six of top 10 markets
    • @pomeranian99: Memory leaks on missiles don't matter, so long as the missile explodes before too much leaks. A 1995 memo: 
    • Paul Johnston: Most of these vendors can cope with what you throw at them so just go for it and stop trying to keep your options open. That way lies madness and mediocrity for your solution (at present).
    • @BrewersStats: 0.3% of the largest breweries make 69.3% of the beer. Conversely, 76.5% of the smallest make 0.7% of the beer.
    • @howardlindzon: Apple is 12.3 billion away from being the first Trillion dollar company
    • @michael_adda: Completely agree with the #serverless async/sync argument "Concurrency within a flow? it needs to move into our infrastructure"
    • resident_ninja: making literally EVERYTHING a stored proc creates a very bad, tight coupling between the app and db, kills scalability, and increases the pain of app and website deployments 
    • Impact Lab: There are about 1,200 malls in America today. In a decade, there might be about 900. That’s not quite “the death of malls.” But it is decline, and it is inevitable.
    • Joel Frohlich: at that point in history, no other human being had ever experienced a focused beam of radiation at such high energy
    • Shazam: Whenever a user Shazams a song, our algorithm uses GPUs to search that database until it finds a match. This happens successfully over 20 million times per day.
    • Dmitri Zimine: You will rewrite your app, not to move to the other provider but by the progress of your cloud provider. They change existing services and introduce new ones
    • There's just too much. Read more by clicking through to the full article.

  • Filed under the coolest use of machine learning category. Algorithmic ‘Printed’ Fields Could Make Farms More Productive and Resilient: UK-based designer Benedikt Groß has created algorithmic models that enable him to plant various crops in complex patterns in a field. This improves ecological resilience and diversity through fascinating patterns that are best appreciated from above.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Click to read more ...

Wednesday
May032017

Homegrown master-master replication for a NoSQL database

Many of you may have already heard about the high performance of the Tarantool DBMS, about its rich toolset and certain features. Say, it has a really cool on-disk storage engine called Vinyl, and it knows how to work with JSON documents. However, most articles out there tend to overlook one crucial thing: usually, Tarantool is regarded simply as storage, whereas its killer feature is the possibility of writing code inside it, which makes working with your data extremely effective. If you’d like to know how igorcoding and I built a system almost entirely inside Tarantool, read on.

If you’ve ever used the Mail.Ru email service, you probably know that it allows collecting emails from other accounts. If the OAuth protocol is supported, we don’t need to ask a user for third-party service credentials to do that — we can use OAuth tokens instead. Besides, Mail.Ru Group has lots of projects that require authorization via third-party services and need users’ OAuth tokens to work with certain applications. That’s why we decided to build a service for storing and updating tokens.

I guess everybody knows what an OAuth token looks like. To refresh your memory, it’s a structure consisting of 3–4 fields:

Click to read more ...