In March, Nvidia introduced its GH100, the first GPU based on the new "Hopper" architecture, which is aimed at both HPC and AI workloads and, importantly for the latter, supports an eight-bit FP8 floating point processing format. Two months later, rival Intel popped out Gaudi2, the second generation of its AI training chip, which also sports an FP8 format.

The FP8 format is important for a number of reasons, not the least of which being that up until now there was a kind of split between AI inferencing, done at low precision in integer formats (usually INT8 but sometimes INT4), with AI training being done at FP16, FP32, or FP64 precision and HPC done at FP32 or FP64 precision. Nvidia and Intel both contend that FP8 can be used not just for inference, but for AI training in some cases, radically boosting the effective throughput of their accelerators. This is important because flipping back and forth between floating point and integer formats is a pain in the neck, and having everything just stay in floating point is a lot easier. Moreover, at some point in the future, if inference moves to 8-bit FP8 and possibly even 4-bit FP4 formats, valuable chip real estate dedicated to integer processing can be freed up and used for something else.

In a post-Moore's Law world, every transistor is sacred, every clock cycle is to be cherished. Companies are looking for more efficient ways to run AI jobs at a time when advances in processing speed are no longer coming as fast as they did in the past. Organizations need to figure out how to improve processing capabilities – particularly for training – using the power that is currently available. Lower precision data formats can help.

AI chip makers are seeing the advantages. In June, Graphcore released a 30-page study it conducted that not only showed the superior performance of low-precision floating point formats over similarly sized scaled integers but also the long-term benefits of reducing power consumption in training initiatives whose model sizes are growing rapidly.

"Low precision numerical formats can be a key component of large machine learning models that provide state of the art accuracy while reducing their environmental impact," the researchers wrote. "In particular, by using 8-bit floating point arithmetic the energy efficiency can be increased by up to 4× with respect to float-16 arithmetic and up to 16× with respect to float-32 arithmetic."

Now Graphcore is banging the drum to have the IEEE adopt the vendor's FP8 format designed for AI as the standard that everyone else can work off of. The company made its pitch this week, with Graphcore co-founder and chief technology officer Simon Knowles saying that the "advent of 8-bit floating point offers tremendous performance and efficiency benefits for AI compute."
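The article does not spell out the bit layout of Graphcore's proposed format, but as a rough illustration of why "staying in floating point" is simpler than hopping between float training and integer inference, here is a minimal Python sketch. It assumes an E4M3-style layout (1 sign bit, 4 exponent bits, 3 mantissa bits), one of the FP8 variants Nvidia supports in Hopper; Graphcore's exact proposal may differ. The helper names (decode_fp8, quantize_int8, dequantize_int8) are hypothetical and written only for this example.

```python
# Sketch only: an assumed E4M3-style FP8 layout, not the format under
# discussion at the IEEE. No inf/NaN or saturation handling.
SIGN_BITS, EXP_BITS, MANT_BITS = 1, 4, 3
EXP_BIAS = 2 ** (EXP_BITS - 1) - 1  # bias of 7 for a 4-bit exponent

def decode_fp8(byte: int) -> float:
    """Interpret one byte as a hypothetical E4M3 float."""
    sign = -1.0 if (byte >> 7) & 0x1 else 1.0
    exp = (byte >> MANT_BITS) & ((1 << EXP_BITS) - 1)
    mant = byte & ((1 << MANT_BITS) - 1)
    if exp == 0:  # subnormal: no implicit leading 1
        return sign * (mant / 2 ** MANT_BITS) * 2.0 ** (1 - EXP_BIAS)
    return sign * (1 + mant / 2 ** MANT_BITS) * 2.0 ** (exp - EXP_BIAS)

# By contrast, INT8 inference needs an explicit per-tensor (or per-layer)
# scale factor that has to be chosen and carried around out of band.
def quantize_int8(x: float, scale: float) -> int:
    return max(-128, min(127, round(x / scale)))

def dequantize_int8(q: int, scale: float) -> float:
    return q * scale

if __name__ == "__main__":
    w = 0.8125
    print(decode_fp8(0b0_0110_101))   # 0.8125: the value carries its own exponent
    scale = 0.01                      # INT8 needs this bookkeeping per tensor
    print(dequantize_int8(quantize_int8(w, scale), scale))  # 0.81, approximately
```

The point of the contrast: an FP8 number encodes its own dynamic range in the exponent bits, while an INT8 number is meaningless without the separately tracked scale, which is exactly the back-and-forth conversion overhead the article describes.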