NVIDIA has teamed up with Microsoft to create one of the world’s largest and most powerful language processors.
The model, which they ominously nicknamed “Megatron,” officially goes by the name Megatron-Turing Natural Language Generation model (or MT-NLG), writes The Register.
According to a blog post on the NVIDIA Developer website, Megatron is designed to “demonstrate unmatched accuracy” in these natural language generation tasks:
- Completion prediction (aka auto-completing sentences)
- Reading comprehension
- Common sense reasoning
- Word sense disambiguation
- Natural language inference
At 530 billion parameters, NVIDIA and Microsoft’s project is one of the largest language models ever built, dwarfed only by Google’s Switch Transformer demo, which boasts 1.6 trillion parameters.
This puts the MT-NLG in second place, and the third-place model, OpenAI’s GPT-3, is not even close at only 175 billion parameters.
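To put those figures side by side, here is a quick back-of-the-envelope sketch in Python. The parameter counts are the ones quoted above; the `PARAMS` table and the `ratio` helper are names made up for this illustration.

```python
# Parameter counts as quoted in the article (not independently verified).
PARAMS = {
    "Switch Transformer": 1.6e12,  # 1.6 trillion
    "MT-NLG": 530e9,               # 530 billion
    "GPT-3": 175e9,                # 175 billion
}


def ratio(a: str, b: str) -> float:
    """Return how many times larger model `a` is than model `b`."""
    return PARAMS[a] / PARAMS[b]


# Rank the models from most to fewest parameters.
ranked = sorted(PARAMS, key=PARAMS.get, reverse=True)

if __name__ == "__main__":
    for place, name in enumerate(ranked, start=1):
        print(f"{place}. {name}: {PARAMS[name] / 1e9:,.0f} billion parameters")
    print(f"MT-NLG vs GPT-3: {ratio('MT-NLG', 'GPT-3'):.2f}x")
    print(f"Switch Transformer vs MT-NLG: {ratio('Switch Transformer', 'MT-NLG'):.2f}x")
```

By these numbers, MT-NLG is roughly three times the size of GPT-3, and the Switch Transformer is roughly three times the size of MT-NLG.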
BEASTLY Hardware Requirements
To achieve this technical feat in the natural language processing (NLP) field, NVIDIA and Microsoft had to power Megatron with some insane hardware.
Just to train the model, NVIDIA had to use its Selene supercomputer. This machine is composed of 560 DGX A100 servers, each of them containing eight A100 GPUs with 80GB of VRAM.
That’s a total of 4,480 GPUs, all communicating with one another via NVLink and NVSwitch. The Selene supercomputer also uses an array of AMD EPYC 7742 processors.
As a result, Megatron can take on NLP tasks with barely any fine-tuning. The total cost for this kind of hardware is a cool $85 million.
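The hardware numbers above can be checked with some simple arithmetic. The sketch below uses the server, GPU, and VRAM figures from the article. The 2 bytes per parameter (16-bit weights) is an assumption made purely for illustration, not something the article states, and the estimate covers the weights alone, not the much larger memory needed during training.

```python
import math

# Figures quoted in the article.
SERVERS = 560
GPUS_PER_SERVER = 8
VRAM_PER_GPU_GB = 80
PARAM_COUNT = 530e9

# ASSUMPTION (not from the article): weights stored as 16-bit values.
BYTES_PER_PARAM = 2

total_gpus = SERVERS * GPUS_PER_SERVER        # 560 * 8 = 4,480
total_vram_gb = total_gpus * VRAM_PER_GPU_GB  # 4,480 * 80 = 358,400 GB

# Size of the raw weights alone, in decimal gigabytes.
weights_gb = PARAM_COUNT * BYTES_PER_PARAM / 1e9

# Fewest 80GB GPUs that could hold just the weights under that assumption.
min_gpus_for_weights = math.ceil(weights_gb / VRAM_PER_GPU_GB)

if __name__ == "__main__":
    print(f"Total GPUs: {total_gpus:,}")
    print(f"Aggregate VRAM: {total_vram_gb:,} GB")
    print(f"Weights at 2 bytes each: {weights_gb:,.0f} GB")
    print(f"GPUs needed just to hold the weights: {min_gpus_for_weights}")
```

Even under that generous assumption, the weights alone come to about a terabyte, far more than any single 80GB GPU can hold, which is why the model has to be spread across many GPUs.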
As with every other type of AI tech, however, NLP still faces a lot of challenges.
The biggest one is simply the sheer number of languages in the world. There are an estimated 6,500 languages in total, and some of the most widely spoken ones, such as Arabic, Spanish, Portuguese, and Hindi, are still proving troublesome for NLP, according to the NVIDIA Developer website.
It just goes to show that even with the most powerful hardware available, current AI technology still requires years of work to understand the intricacies of human languages, which is an extremely tall order on its own.
What’s The Purpose Of NVIDIA-Microsoft’s Megatron?
As mentioned above, the main goal of the MT-NLG is to carry out natural language processing (NLP) tasks.
For the uninitiated, this simply means that NVIDIA and Microsoft’s plan…