1. DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
A study of scaling laws for open-source language models, introducing DeepSeek LLM as a project with a long-term perspective.
Paper
Jan 6, 2024
2. DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
A Mixture-of-Experts architecture aimed at stronger expert specialization, using fine-grained expert segmentation and shared expert isolation.
Paper
Jan 11, 2024