DeepSeek Papers

1. DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

Description: Introduces the DeepSeek LLM family (7B and 67B base and chat models trained on 2T tokens) and revisits scaling laws to guide the long-term development of open-source language models.

2. DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models

Description: Proposes fine-grained expert segmentation and shared-expert isolation to push Mixture-of-Experts language models toward stronger expert specialization at comparable compute cost.
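The core idea behind expert routing in Mixture-of-Experts models can be illustrated with a toy top-k gated layer. This is a conceptual sketch only, not DeepSeekMoE's actual architecture (which adds fine-grained segmentation and shared experts); all names and sizes here are illustrative.

```python
import numpy as np

def moe_forward(x, experts, gate_w, top_k=2):
    """Toy top-k gated mixture-of-experts layer.

    A generic MoE sketch, not DeepSeekMoE's exact design: score all
    experts with a linear gate, keep the top-k, and combine their
    outputs weighted by a softmax over the selected gate logits.
    """
    logits = x @ gate_w                      # one gate logit per expert
    top = np.argsort(logits)[-top_k:]        # indices of the top-k experts
    weights = np.exp(logits[top] - logits[top].max())
    weights /= weights.sum()                 # softmax over selected experts only
    # Only the chosen experts are evaluated -- the source of MoE's
    # compute savings relative to a dense layer of equal parameter count.
    return sum(w * experts[i](x) for w, i in zip(weights, top))

# Illustrative usage with random linear "experts".
rng = np.random.default_rng(0)
d, n_experts = 8, 4
experts = [lambda x, W=rng.normal(size=(d, d)): x @ W for _ in range(n_experts)]
gate_w = rng.normal(size=(d, n_experts))
x = rng.normal(size=d)
y = moe_forward(x, experts, gate_w, top_k=2)
```

DeepSeekMoE's contribution is, roughly, splitting each expert into finer-grained units and isolating some as always-active shared experts, so the routed experts specialize more cleanly than in this vanilla top-k scheme.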

3. DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence

Description: Presents the DeepSeek-Coder series of open-source code models (1.3B to 33B parameters), trained from scratch on a large code corpus with fill-in-the-middle training and long-context support.

17. Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention

Description: Proposes NSA, a natively trainable sparse attention mechanism that combines coarse-grained token compression with fine-grained token selection, using a hardware-aligned design for efficient long-context training and inference.
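The block-selection idea underlying this style of sparse attention can be sketched for a single query: score contiguous key blocks coarsely, then attend only within the top-scoring blocks. This is a simplified illustration of block-sparse selection, not the NSA paper's exact algorithm; function and variable names are assumptions.

```python
import numpy as np

def block_sparse_attention(q, K, V, block_size=4, top_blocks=2):
    """Toy block-sparse attention for one query vector.

    Conceptual sketch: score each key block by the query's dot product
    with the block's mean key (a crude compression), keep the top-scoring
    blocks, and run ordinary softmax attention over only those keys.
    """
    n, d = K.shape
    n_blocks = n // block_size
    # Coarse pass: one representative (mean) key per block.
    block_means = K[: n_blocks * block_size].reshape(n_blocks, block_size, d).mean(axis=1)
    coarse = block_means @ q
    keep = np.argsort(coarse)[-top_blocks:]            # select top blocks
    idx = np.concatenate(
        [np.arange(b * block_size, (b + 1) * block_size) for b in keep]
    )
    # Fine pass: full attention restricted to the selected blocks.
    scores = K[idx] @ q / np.sqrt(d)
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ V[idx]

# Illustrative usage: 16 keys, attend over 2 blocks of 4.
rng = np.random.default_rng(1)
K = rng.normal(size=(16, 8))
V = rng.normal(size=(16, 8))
q = rng.normal(size=8)
out = block_sparse_attention(q, K, V)
```

Working in contiguous blocks rather than scattered tokens is what makes such schemes hardware-friendly: selected keys load as dense tiles, which maps well onto GPU memory access patterns.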

BibTeX


@article{deepseek2024papers,
  author    = {DeepSeek Research Team},
  title     = {DeepSeek Papers: Advancements in Language Models and Multimodal Understanding},
  journal   = {DeepSeek Publications},
  year      = {2024--2025},
}