(Image Source: http://tvmlang.org/)
- Scalable frameworks such as TensorFlow, MXNet, Caffe, and PyTorch are optimized for a narrow range of server-class GPUs.
- Deploying workloads to other platforms, such as mobile phones, IoT devices, and specialized accelerators (FPGAs, ASICs), requires laborious manual effort.
TVM is an end-to-end optimization stack that exposes:
- operator-level optimizations
---> to provide performance portability to deep learning workloads across diverse hardware back-ends.
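
As a rough illustration of what an operator-level optimization looks like, the sketch below computes the same matrix multiplication with two different loop structures: a naive triple loop and a tiled (blocked) version. It is plain Python and does not use TVM's API; the function names `matmul_naive` and `matmul_tiled` and the `tile` parameter are hypothetical, chosen only for this example. The point is that the computed result stays the same while the loop order changes, and the best loop order usually depends on the target hardware (for example, on cache size).

```python
def matmul_naive(a, b):
    """Reference matrix multiply: C[i][j] = sum over p of A[i][p] * B[p][j]."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for p in range(k):
                c[i][j] += a[i][p] * b[p][j]
    return c


def matmul_tiled(a, b, tile=2):
    """Same computation, with each loop split into blocks of size `tile`.

    Only the iteration order changes; blocking like this is a common
    way to improve data reuse in caches on real hardware.
    """
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    # Outer loops walk over tiles; inner loops walk inside one tile.
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            for p0 in range(0, k, tile):
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, m)):
                        acc = c[i][j]
                        for p in range(p0, min(p0 + tile, k)):
                            acc += a[i][p] * b[p][j]
                        c[i][j] = acc
    return c


if __name__ == "__main__":
    a = [[1, 2, 3], [4, 5, 6]]
    b = [[7, 8], [9, 10], [11, 12]]
    # Both loop structures produce the same matrix.
    print(matmul_naive(a, b) == matmul_tiled(a, b, tile=2))
```

Choosing such a loop structure by hand for every operator and every device is the kind of manual effort described above.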