MAI-1 is Microsoft’s 500 billion parameter model. Should OpenAI worry?

Microsoft is training a large language model (LLM) called MAI-1 that could compete with models from Google, Anthropic, and even those from partner OpenAI.

The news comes from The Information, which reports that, for the first time since investing more than $10 billion in OpenAI, Microsoft is internally training a model large enough to compete with its rivals’ AIs, and with OpenAI’s own models.

Development of MAI-1 is reportedly being led by Mustafa Suleyman, hired in March to run the new internal organization called Microsoft AI. Suleyman co-founded DeepMind in 2010, remained at Google until 2022, and then founded the startup Inflection AI, which created the Pi “personal chatbot”.

Along with Suleyman, Microsoft hired some of the startup’s staff and paid $650 million to license its intellectual property. Indeed, some of MAI-1’s training data reportedly comes directly from Inflection AI, although two sources within Microsoft said that MAI-1 is a separate project from what the startup previously developed. Other data reportedly comes from additional sources; The Information also cites text generated by OpenAI’s GPT-4.

Is Microsoft Building a Rival to ChatGPT?

At the end of training, MAI-1 should have around 500 billion parameters, which would make Microsoft’s internal model a formidable opponent.

Although OpenAI has published no official figures, GPT-4 – the model that powers the paid version of ChatGPT and Microsoft’s Copilot – is believed to have 220 billion parameters in each of 8 separate sets of weights, for a total of roughly 1,760 billion parameters (as claimed by the well-known engineer George Hotz, who is not an OpenAI employee).

This type of architecture is called “mixture-of-experts”. The basic idea is to divide the model into several specialized subsets, or “experts”, each of which is trained to handle a specific subset of data or tasks.

Then, during inference, when the model receives a prompt from a user, a routing mechanism determines which expert (or group of experts) is most relevant for processing that specific input, so only a fraction of the total parameters is active on any given token.
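The routing idea can be illustrated with a deliberately toy sketch. Nothing below reflects MAI-1’s or GPT-4’s actual implementation; the router weights are random and each “expert” is just a stand-in function, but the control flow shows how a gate picks one expert per input:

```python
import math
import random

random.seed(0)  # make the toy weights reproducible

def softmax(xs):
    """Normalize raw scores into a probability distribution."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

class ToyMoELayer:
    """Toy mixture-of-experts layer: a router scores every expert,
    and only the top-scoring expert processes the input (top-1 routing)."""

    def __init__(self, num_experts, dim):
        # Router: one random score vector per expert (placeholder weights).
        self.router = [[random.uniform(-1, 1) for _ in range(dim)]
                       for _ in range(num_experts)]
        # Each "expert" here is just a distinct scaling function,
        # standing in for a full feed-forward sub-network.
        self.experts = [lambda x, k=k: [v * (k + 1) for v in x]
                        for k in range(num_experts)]

    def forward(self, x):
        # 1. The router computes a relevance score for each expert.
        scores = [sum(w * v for w, v in zip(row, x)) for row in self.router]
        probs = softmax(scores)
        # 2. Top-1 routing: only the most relevant expert runs, so the
        #    compute cost per token scales with one expert, not all eight.
        best = max(range(len(probs)), key=lambda i: probs[i])
        return best, self.experts[best](x)

layer = ToyMoELayer(num_experts=8, dim=4)
chosen, out = layer.forward([0.5, -0.2, 0.1, 0.9])
print(f"routed to expert {chosen}")
```

Real systems typically route each token independently, often to the top-2 experts, and train the router jointly with the experts; this sketch only shows the dispatch step.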

Microsoft has already released open-source models, such as the small Phi-3, but MAI-1 sits at a decidedly higher level. What MAI-1 will be used for has not yet been established, but it could be previewed at Microsoft’s next Build conference, to be held May 21-23, The Information reported.
