Popular Large Language Models

Here is a list of popular large language models; a couple of short usage sketches follow the list:
- BERT (Bidirectional Encoder Representations from Transformers): Developed by Google, BERT is a pre-trained encoder model trained with masked language modeling (plus next-sentence prediction) that achieved state-of-the-art results on a wide range of natural language processing benchmarks at its release.
- RoBERTa (Robustly Optimized BERT Pretraining Approach): Developed by Facebook AI, RoBERTa is a retrained variant of BERT that drops next-sentence prediction and trains longer, on more data, with larger batches, improving on BERT across many natural language processing benchmarks.
- Transformer-XL: Developed by researchers at Carnegie Mellon University and Google Brain, Transformer-XL extends the Transformer with segment-level recurrence and relative positional encoding, allowing it to model much longer contexts than a fixed-length attention window.
- GPT (Generative Pre-trained Transformer): Developed by OpenAI, GPT is a decoder-only model pre-trained on a large text corpus with a next-token prediction objective and then fine-tuned for a variety of natural language processing tasks.
- T5 (Text-to-Text Transfer Transformer): Developed by Google, T5 casts every task (translation, summarization, classification, and more) as a text-to-text problem, so a single pre-trained model can be fine-tuned for many natural language processing tasks.
- DistilBERT: Developed by Hugging Face, DistilBERT is a smaller, faster version of BERT produced by knowledge distillation; it has roughly 40% fewer parameters than BERT while retaining most of its accuracy on common benchmarks.
- ALBERT (A Lite BERT): Developed by Google Research, ALBERT reduces BERT's parameter count through cross-layer parameter sharing and a factorized embedding parameterization, while matching or exceeding BERT on several natural language processing benchmarks.
- XLNet: Developed by researchers at Carnegie Mellon University and Google Brain, XLNet is a generalized autoregressive pre-trained model that combines permutation language modeling with the Transformer-XL architecture, and it outperformed BERT on a number of benchmarks.
- Megatron-LM: Developed by NVIDIA, Megatron-LM is a framework and family of very large GPT-style models that uses model (tensor) parallelism to scale Transformer training to billions of parameters.
- LLaMA (Large Language Model Meta AI): Developed by Meta AI, LLaMA is a family of decoder-only models (7B to 65B parameters in the first release) trained on publicly available data and released to researchers as efficient open foundation models.
- DALL-E: Developed by OpenAI, DALL-E generates images from text descriptions; it is not a language model in the usual sense, but the first version used the same Transformer machinery, modeling text and image tokens autoregressively in a GPT-style architecture.
- BART (Bidirectional and Auto-Regressive Transformers): Developed by Facebook AI, BART is a denoising sequence-to-sequence model pre-trained by corrupting text and learning to reconstruct it, which makes it especially effective for generation tasks such as summarization.
- GPT-3 (Generative Pre-trained Transformer 3): Developed by OpenAI, GPT-3 is a 175-billion-parameter decoder-only model trained on a large text corpus; it can be adapted to many natural language processing tasks through few-shot prompting as well as fine-tuning.
- Longformer (Long-Document Transformer): Developed by the Allen Institute for AI, Longformer replaces full self-attention with a sparse sliding-window pattern plus global attention, so it can process documents thousands of tokens long.
Note: This is not an exhaustive list, and there are many other large language models available.
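To make the list concrete, here is a minimal sketch of probing an encoder-style model such as BERT (the same pattern works for RoBERTa, DistilBERT, and ALBERT). It assumes the Hugging Face `transformers` library and the public checkpoint name `bert-base-uncased`; both are assumptions about your environment, not part of the models themselves.

```python
# Minimal sketch: masked-token fill-in with a pre-trained encoder model.
# Assumes the Hugging Face `transformers` package is installed and that the
# checkpoint "bert-base-uncased" can be downloaded from the model hub.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# BERT's masked-language-modeling head predicts the hidden token.
for candidate in fill_mask("The capital of France is [MASK]."):
    print(candidate["token_str"], round(candidate["score"], 3))
```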
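The decoder-only models (the GPT family, LLaMA) and the text-to-text models (T5, BART) are typically used generatively. The sketch below uses the openly available GPT-2 and `t5-small` checkpoints as stand-ins for their larger relatives, again assuming the Hugging Face `transformers` library.

```python
# Minimal sketch: autoregressive generation and text-to-text generation.
# GPT-2 and t5-small stand in for larger models whose weights are not all
# openly downloadable.
from transformers import pipeline

# Decoder-only generation (GPT family): continue a prompt token by token.
generator = pipeline("text-generation", model="gpt2")
print(generator("Large language models are", max_new_tokens=30)[0]["generated_text"])

# Text-to-text (T5): the task is named in the input prefix.
t2t = pipeline("text2text-generation", model="t5-small")
print(t2t("translate English to German: The book is on the table.")[0]["generated_text"])
```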