MPT-7B: A Revolutionary Leap in Language Models
MosaicML has unveiled MPT-7B, a groundbreaking open-source foundation large language model (LLM) licensed for commercial use.
Until now, open models such as Meta's LLaMA were held back by non-commercial license restrictions.
Release note: Introducing MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs
Fine-Tuned Variants:
MPT-7B-StoryWriter-65k:
Fine-tuned with a 65k-token context window; its ALiBi position encoding lets it extrapolate beyond that at inference time, to as many as 84k tokens (see the loading sketch after this list).
Demonstrated by ingesting the full text of The Great Gatsby and writing an epilogue.
Fine-tuned on a dataset of fiction books.
Available for commercial use.
MPT-7B-Instruct:
Fine-tuned for following short-form instructions rather than merely continuing text.
Trained on a larger instruction dataset than Dolly: Databricks' Dolly-15k augmented with samples from Anthropic's Helpful & Harmless data.
Cleared for commercial applications.
MPT-7B-Chat:
A chatbot in the style of ChatGPT, fine-tuned for multi-turn dialogue.
Not permitted for commercial use, because some of its fine-tuning datasets carry non-commercial license terms.
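To make the variants concrete, here is a minimal sketch of loading MPT-7B-StoryWriter with an extended context window via the Hugging Face transformers library. The max_seq_len override follows the pattern documented on the model card; the dtype and the 83,968-token limit are illustrative values to verify against the card.

```python
# Minimal sketch: loading MPT-7B-StoryWriter with an extended context
# window. ALiBi position encoding lets the model handle sequences longer
# than the 65k tokens it was fine-tuned on. Values are illustrative;
# check the Hugging Face model card for current recommended settings.
import torch
import transformers

name = "mosaicml/mpt-7b-storywriter"

config = transformers.AutoConfig.from_pretrained(name, trust_remote_code=True)
config.max_seq_len = 83968  # allow (input + output) beyond the 65k training context

model = transformers.AutoModelForCausalLM.from_pretrained(
    name,
    config=config,
    torch_dtype=torch.bfloat16,  # ~14 GB of weights for a 7B model
    trust_remote_code=True,      # MPT ships custom modeling code
)

# MPT models reuse the EleutherAI GPT-NeoX-20B tokenizer
tokenizer = transformers.AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
```

Generation then goes through the standard model.generate API; note that attention-cache memory grows with sequence length, so very long contexts need correspondingly large GPUs.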
Key Takeaways:
A pivotal moment in LLM development: an open-source base model with no commercial restrictions.
Training a foundation model from scratch is expensive, but fine-tuning one is comparatively cheap.
Costs:
Training the base MPT-7B model cost roughly $200,000 and took about 9.5 days, per MosaicML.
Despite that outlay, MosaicML commendably open-sourced the model.
Fine-tuning is far cheaper: an instruction-tuned variant can cost under $50, and MosaicML reports spending about $37 (a bare-bones sketch of such a fine-tune follows this list).
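For a sense of what such a low-cost fine-tune involves: MosaicML trained with its own LLM Foundry stack, but the shape of the job can be sketched with the Hugging Face Trainer. The hyperparameters below are illustrative assumptions, not MosaicML's recipe; mosaicml/dolly_hhrlhf is the prompt/response dataset MosaicML published for MPT-7B-Instruct.

```python
# Bare-bones instruction fine-tune sketch (assumed hyperparameters, not
# MosaicML's actual recipe; they trained with their LLM Foundry stack).
import torch
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

# MPT reuses the GPT-NeoX-20B tokenizer, which has no pad token by default
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(
    "mosaicml/mpt-7b", torch_dtype=torch.bfloat16, trust_remote_code=True)

# Prompt/response pairs MosaicML released alongside MPT-7B-Instruct
dataset = load_dataset("mosaicml/dolly_hhrlhf", split="train")

def tokenize(example):
    # Concatenate instruction prompt and target response into one sequence
    return tokenizer(example["prompt"] + example["response"],
                     truncation=True, max_length=2048)

tokenized = dataset.map(tokenize, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="mpt-7b-instruct-ft",
        per_device_train_batch_size=1,   # full 7B fine-tuning needs several large GPUs
        gradient_accumulation_steps=16,
        num_train_epochs=1,
        learning_rate=1e-5,
        bf16=True,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

Whatever the exact setup, the headline point stands: a short fine-tuning run, not a $200,000 pre-training run, is what separates a base model from an instruction-tuned one.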
Benefits:
A powerful demonstration of the MosaicML platform's training capabilities.
Attractive for businesses that want smaller, specialized ChatGPT-style models running on their own servers.
Self-hosting also addresses data privacy concerns, since prompts and data never leave the company's infrastructure.