There’s a war between two visions of how the ubiquitous AI-assisted future will be rendered: in the cloud or on the device. And as with any great drama, it helps the story along if we have two archetypal antagonists. On the cloud side we have Google. On the device side we have Apple. Who will win? Both? Neither? Or do we all win?
If you had asked me a week ago, I would have said the cloud would win. Definitely. If you read an article like Jeff Dean On Large-Scale Deep Learning At Google, you can’t help but be amazed at what Google is accomplishing. Impressive. Wide ranging. Smart. Systematic. Dominant.
Apple has been largely absent from the trend of sprinkling deep learning fairy dust on its products. This should not be all that surprising. Apple moves at its own pace. Apple doesn’t chase early adopters; it releases a technology when it’s a win for the mass consumer market.
There’s an idea that because Apple is so secretive, it might be hiding vast deep learning chops we don’t even know about yet. We, of course, have no way of knowing.
What may prove more true is that Apple is going about deep learning in a different way: differential privacy + powerful on-device processors + offline training with downloadable models + a commitment to really, really not knowing anything personal about you + the deep learning equivalent of perfect forward secrecy.
Since AlphaGo vs. Lee Se-dol, the modern version of John Henry’s fatal race against a steam hammer, has captivated the world, along with a generalized fear of an AI apocalypse, it seems like an excellent time to gloss Jeff’s talk. And if you think AlphaGo is good now, just wait until it reaches beta.
Jeff is referring, of course, to Google’s infamous motto: organize the world’s information and make it universally accessible and useful.
Historically we might associate ‘organizing’ with gathering, cleaning, storing, indexing, reporting, and searching data. All the stuff early Google mastered. With that mission accomplished, Google has moved on to the next challenge.
Now organizing means understanding.
Some highlights from the talk for me:
Real neural networks are composed of hundreds of millions of parameters. The skill Google has is in how to build and rapidly train these huge models on large, interesting datasets, apply them to real problems, and then quickly deploy the models into production across a wide variety of platforms (phones, sensors, clouds, etc.).
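To make the scale concrete, here’s a rough back-of-the-envelope sketch (my own, not from the talk) in TensorFlow/Keras. The layer widths are made up for illustration; the point is how quickly the weight matrices between wide, fully connected layers add up.

```python
# A hypothetical illustration of where "hundreds of millions of parameters"
# comes from: each Dense layer holds (inputs * units) weights plus biases.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(8192, activation="relu", input_shape=(10000,)),
    tf.keras.layers.Dense(8192, activation="relu"),
    tf.keras.layers.Dense(8192, activation="relu"),
    tf.keras.layers.Dense(1000, activation="softmax"),
])

# This modest four-layer stack already exceeds 200 million parameters.
print(model.count_params())
```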
The reason neural networks didn’t take off in the 90s was a lack of computational power and a lack of large interesting data sets. You can see how Google’s natural love of algorithms combined with their vast infrastructure and ever enlarging datasets created a perfect storm for AI at Google.
A critical difference between Google and other companies is that when they started the Google Brain project in 2011, they didn’t keep their research in the ivory tower of a separate research arm of the company. The project team worked closely with other teams like Android, Gmail, and Photos to actually improve those properties and solve hard problems. That’s rare and a good lesson for every company: apply research by working with your product people.
This idea is powerful: they’ve learned they can take a whole bunch of subsystems, some of which may be machine learned, and replace them with a much more general end-to-end machine learning piece. When you have lots of complicated subsystems, there’s usually a lot of complicated code to stitch them all together. It’s nice if you can replace all that with data and very simple algorithms.
Machine learning will only get better, faster. A paraphrased quote from Jeff: The machine learning community moves really, really fast. People publish a paper and within a week lots of research groups throughout the world have downloaded it, read it, dissected it, understood it, implemented extensions to it, and published their own extensions on arXiv.org. It’s different from a lot of other parts of computer science, where people would submit a paper, six months later a conference would decide whether to accept it, and then it would come out in the conference proceedings three months after that. By then it’s a year. Getting that time down from a year to a week is amazing.
Techniques can be combined in magical ways. The Translate team wrote an app that uses computer vision to recognize text in the viewfinder. It translates the text and then superimposes the translated text on the image itself. Another example is generating image captions, which combines image recognition with a Sequence-to-Sequence neural network. You can only imagine how all these modular components will be strung together in the future.
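To give a flavor of how those modular pieces snap together, here’s a minimal, hypothetical sketch of the captioning idea in TensorFlow/Keras: a pretrained CNN encodes the image into a vector, and a recurrent decoder, seeded with that vector, predicts the caption one word at a time. The vocabulary size, caption length, and layer widths are placeholders, and this is not the actual model described in the talk.

```python
# A minimal sketch of image captioning: CNN encoder + LSTM decoder.
# All sizes below are hypothetical placeholders.
import tensorflow as tf

VOCAB_SIZE = 5000   # assumed caption vocabulary
MAX_LEN = 20        # assumed maximum caption length
EMBED_DIM = 256

# Image encoder: a pretrained CNN with its classifier head removed.
cnn = tf.keras.applications.MobileNetV2(
    include_top=False, pooling="avg", input_shape=(224, 224, 3))
image_in = tf.keras.Input(shape=(224, 224, 3))
image_vec = tf.keras.layers.Dense(EMBED_DIM)(cnn(image_in))

# Caption decoder: an LSTM conditioned on the image vector via its initial state.
caption_in = tf.keras.Input(shape=(MAX_LEN,))
embedded = tf.keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM)(caption_in)
decoded = tf.keras.layers.LSTM(EMBED_DIM)(
    embedded, initial_state=[image_vec, image_vec])
next_word = tf.keras.layers.Dense(VOCAB_SIZE, activation="softmax")(decoded)

model = tf.keras.Model([image_in, caption_in], next_word)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```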
Models with impressive functionality are small enough to run on smartphones. For technology to disappear, intelligence must move to the edge. It can’t be dependent on a network umbilical cord connected to a remote cloud brain. Since TensorFlow models can run on a phone, that might just be possible.
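As a rough illustration of what “models running on a phone” looks like in practice, here’s a sketch using TensorFlow Lite (which arrived after this talk): a trained Keras model is converted into a compact file that the mobile runtime can execute entirely on-device. The tiny model and file name here are stand-ins of my own, not anything from the talk.

```python
# A minimal sketch of shipping a trained model to a phone with TensorFlow Lite.
# In practice you would convert a real trained model, not this toy stand-in.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(100,)),
    tf.keras.layers.Dense(10, activation="softmax"),
])

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # quantize to shrink the file
tflite_bytes = converter.convert()

# Bundle this file with the phone app; the TF Lite runtime executes it on-device,
# with no round trip to a cloud brain.
with open("model.tflite", "wb") as f:
    f.write(tflite_bytes)
```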
If you’re not considering how to use deep neural nets to solve your data understanding problems, you almost certainly should be. This line is taken directly from the talk, but its truth is abundantly clear after you watch hard problem after hard problem made tractable using deep neural nets.
Jeff always gives great talks and this one is no exception. It’s straightforward, interesting, in-depth, and relatively easy to understand. If you are trying to get a handle on deep learning or just want to see what Google is up to, it’s a must-see.
There’s not a lot of fluff in the talk. It’s packed. So I’m not sure how much value this article will add. If you just want to watch the video instead, I’ll understand.
As often happens with Google talks, there’s this feeling that we’ve only been invited into the lobby of Willy Wonka’s Chocolate Factory. In front of us is a locked door and we’re not invited in. What’s beyond that door must be full of wonders. But even Willy Wonka’s lobby is interesting.
So let’s learn what Jeff has to say about the future…it’s fascinating...