Second-gen Google Cloud TPUs take machine learning to the next level


Keeping scalability in mind, Google has included a custom high-speed network in each TPU, which allows it to build "TPU Pods" by assembling 64 units into an ML supercomputer with 11.5 petaflops of computational power. Google is adding support for training on its newly announced second-generation TPUs, known as Cloud TPUs. Google claims that each second-generation TPU can deliver up to 180 teraflops of performance.
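The pod figure follows directly from the per-chip figure. As a quick sanity check, here is a minimal Python sketch that multiplies the two numbers quoted above; the function and constant names are illustrative, not anything Google publishes.

```python
# Figures quoted in the article (peak numbers as claimed by Google).
TERAFLOPS_PER_TPU = 180  # one second-generation TPU
TPUS_PER_POD = 64        # units assembled into one TPU Pod


def pod_petaflops(tpus: int = TPUS_PER_POD,
                  teraflops_each: float = TERAFLOPS_PER_TPU) -> float:
    """Aggregate peak throughput in petaflops (1 petaflop = 1,000 teraflops)."""
    return tpus * teraflops_each / 1000


print(pod_petaflops())  # 11.52, which rounds to the quoted 11.5 petaflops
```

This is peak arithmetic only; it says nothing about how much of that throughput a real training job would sustain.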

Google has made another leap forward in the realm of machine learning hardware.

Only a week after Nvidia announced its new AI-focused Volta GPU architecture, Google aims to steal some of its thunder with a new, second-generation Tensor Processing Unit (TPU) that it calls a Cloud TPU. The addition will significantly bulk up the capabilities of Google Cloud and provide acceleration for machine learning workloads.

Google's first-generation TPUs, by contrast, don't use floating point at all; they use 8-bit integer approximations to floating point. The previous TPU could also only do inference - for instance, relying on Google Cloud to crunch numbers in real time to produce a result. The newest TPU can handle training as well: it can run software that leafs through hundreds of thousands of images or search terms and learns to organize pictures or suggest websites without explicit programming. Training is the workload that depends most heavily on massive compute power, and it has generally been done on GPUs. More recently, the technology has been applied to machine learning models used to improve Google Translate, Google Photos, and other software that can make novel use of new AI training techniques.

"One of our new large-scale translation models used to take a full day to train on 32 of the best commercially-available GPUs - now it trains to the same accuracy in an afternoon using just one eighth of a TPU pod", Dean and Holzle noted. One drawback, though, is that Google's TPUs currently support only TensorFlow and Google's tools. "But for most of our machine learning workloads, we see TPUs as something we'll be using more and more often".
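To make the "8-bit integer approximations to floating point" used by the first-generation TPUs more concrete, here is a minimal Python sketch of one common approach, symmetric linear quantization with a single shared scale factor. It is a generic illustration only: the article does not describe the scheme Google's hardware actually uses, and the function names and sample weights are made up for the example.

```python
def quantize(values, num_bits=8):
    """Map floats to signed integers using one shared scale factor."""
    qmax = 2 ** (num_bits - 1) - 1            # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs else 1.0
    # Round to the nearest integer step and clamp to the representable range.
    ints = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return ints, scale


def dequantize(ints, scale):
    """Recover approximate floats from the integer representation."""
    return [i * scale for i in ints]


weights = [0.5, -1.27, 0.031, 1.0]            # hypothetical model weights
ints, scale = quantize(weights)
approx = dequantize(ints, scale)

print(ints)    # [50, -127, 3, 100]
print(approx)  # close to the originals, within half a step (scale / 2)
```

The trade-off is visible in the third value: 0.031 is stored as 3 steps of roughly 0.01, so it comes back as about 0.03. That small loss of precision is often acceptable for inference, which is the only workload the first-generation chip handled.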

"Research and engineering teams at Google and elsewhere have made great progress scaling machine learning training using readily-available hardware".

Because this newer TPU is now capable of doing both inference and training, researchers can deploy more versatile AI experiments far faster than before - so long as the software is built using TensorFlow.

It's always hard to know how useful these comparisons are in practice, but they should at least give you a sense of the speed compared to GPUs, which are typically the most powerful chips used in machine learning operations today. The only price is that researchers will have to open source their research, which seems worth it to us.