Introducing a language model larger than GPT-3 with a bold goal: freeing AI from the clutches of Big Tech.
What is BLOOM?
BLOOM, a large language model (LLM), promises similar performance to Silicon Valley’s leading models, but with a fundamentally different approach. BLOOM is open to the public, unlike the LLMs of tech giants, which are usually kept under wraps.
Unusually for an English-dominated field, BLOOM is also multilingual, unlike Google’s LaMDA and OpenAI’s GPT-3. Together, these features could open up access to a cutting-edge technology that is likely to have a profound impact on society.
Large language models (LLMs) are proving themselves capable of a wider range of tasks, such as essay writing, code generation, and language translation. They’re also capable of creating harmful content, and it’s hard to predict what they’ll be able to do in the future.
BLOOM is a demonstration that the most powerful AI models can be trained and released by the broader research community with accountability and in an actual open way, in contrast to the typical secrecy of industrial AI research labs.
BLOOM’s training co-leader Teven Le Scao
Creating and operating LLMs is prohibitively expensive. Training GPT-3 alone, for example, is estimated to have cost $27.6 million. Technology companies have a strong incentive to protect investments of that size, especially when those investments give them an advantage over their competitors.
As a result, it isn’t surprising that LLMs are rarely open-sourced — with notable exceptions, of course.
BLOOM was developed by the BigScience research project, which launched in the first half of 2021. The AI start-up Hugging Face is spearheading the project.
Large ML models have changed the world of AI research over the last two years but the huge compute cost necessary to train them resulted in very few teams actually having the ability to train and research them.
Thomas Wolf, BigScience co-lead and Hugging Face co-founder
BLOOM was created by a team of more than 1,000 researchers from more than 60 countries and over 250 institutions. The model was trained on a supercomputer in Paris, France.
We adopted a data-first approach to make sure the training corpus was aligned with our values. The multidisciplinary and international makeup of BigScience enabled us to critically reflect on every step of the process from multiple vantage points: ethical, legal, environmental, linguistic, and technical. That meant we were able to mitigate ethical concerns without compromising on performance or scale.
Christopher Akiki, BigScience researcher, Leipzig University
It’s hard to overstate how huge this thing is. With 176 billion parameters, BLOOM is larger than both OpenAI’s GPT-3 and Meta AI’s OPT. The model can generate text in 46 natural languages and dialects, as well as 13 programming languages. For many of those languages, it is the first language model with more than 100 billion parameters.
It’s also one of the most affordable options out there. According to BigScience, researchers can use BLOOM on a cloud provider for less than $40/hr.
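To put the size and price figures above in perspective, here is a rough back-of-the-envelope sketch. The bytes-per-parameter values are assumptions chosen for illustration (16-bit and 32-bit weights), not figures published by BigScience, and the hourly rate is simply the "less than $40/hr" upper bound quoted above.

```python
# Illustrative arithmetic only. Byte widths are assumptions, not BigScience figures.
PARAMS = 176_000_000_000  # parameter count quoted in the article


def weight_memory_gb(n_params: int, bytes_per_param: int) -> float:
    """Memory needed just to hold the weights, in gigabytes (10**9 bytes)."""
    return n_params * bytes_per_param / 1e9


def rental_cost_usd(hours: float, rate_per_hour: float = 40.0) -> float:
    """Upper-bound rental cost at the article's 'less than $40/hr' figure."""
    return hours * rate_per_hour


fp16_gb = weight_memory_gb(PARAMS, 2)  # assuming 16-bit weights
fp32_gb = weight_memory_gb(PARAMS, 4)  # assuming 32-bit weights
day_cost = rental_cost_usd(24)         # one full day of continuous use

print(f"16-bit weights: {fp16_gb:.0f} GB")
print(f"32-bit weights: {fp32_gb:.0f} GB")
print(f"24 hours at $40/hr: ${day_cost:.0f}")
```

Even under the smaller assumption, the weights alone run to hundreds of gigabytes, which is why a model of this size is normally served from several accelerators on a cloud machine rather than from a single consumer GPU.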
BLOOM’s architecture is similar to that of GPT-3: it is an autoregressive model trained to predict the next token. Unlike GPT-3, it has been trained on text in 46 languages as well as on source code.
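For readers unfamiliar with the term, the sketch below shows what autoregressive next-token prediction means in miniature. It is a toy, not BLOOM's implementation: the "model" is a hypothetical bigram counter that looks only at the previous word, whereas a model like BLOOM is a neural network that conditions on the whole preceding context. The function names and the tiny corpus are made up for illustration. What carries over is the loop: predict one token, append it to the text, and feed the longer text back in.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count, for each token, which tokens follow it in the training text."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def generate(counts, prompt, max_new_tokens):
    """Greedy autoregressive decoding: predict one token, append it, repeat."""
    output = list(prompt)
    for _ in range(max_new_tokens):
        followers = counts.get(output[-1])
        if not followers:  # no known continuation, so stop early
            break
        next_token = followers.most_common(1)[0][0]
        output.append(next_token)  # the prediction becomes part of the next input
    return output


# Hypothetical toy corpus, split on whitespace instead of a real tokenizer.
corpus = (
    "bloom predicts the next token then bloom predicts the next token again"
).split()
counts = train_bigram(corpus)

print(" ".join(generate(counts, ["bloom"], 4)))
```

This toy always picks the single most frequent continuation (greedy decoding). Real systems typically sample from the predicted distribution instead, which is one reason the same prompt can produce different outputs.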
The same dataset has been used to train several smaller models. The following variations of BLOOM are available:
- bloom (176B parameters)
This is only a fraction of what’s to come. As the workshop continues to experiment and tinker with the model, BLOOM’s capabilities will improve. BigScience has started working on making it as instructable as its earlier effort, T0++. In the future, the team plans to add more languages, compress the model into a more usable version with the same level of performance, and use it as a starting point for more complex architectures.