I think what OP meant was ASICs specific to deep learning, like TPUs. However, as I see it, current frameworks are not mature enough to support GPUs and TPUs with the exact same code. Also, there are no standards, so every big org is going to build support for its own ASIC interfaces into the framework it manages. Is there an open source interface for deep learning ASICs?
Cambricon is going big in China, so it's not just Google and Apple. They claim to be 6 times faster than a GPU.
I am more interested in the potential of running video processing and voice models effortlessly on tiny devices, and also of training models offline or locally.
I think there is a good range of solutions (like vision recognition) that port well across AI chips.