DeepSeek has released a series of significant papers detailing advancements in large language models (LLMs). Each paper represents a step forward in making AI more capable, efficient, and accessible.
DeepSeek LLM, the foundational paper of the series, studies scaling laws and the trade-off between data volume and model size, laying the groundwork for the models that followed.
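To make that data-versus-model-size trade-off concrete, here is a minimal sketch using a common parametric loss form L(N, D) = E + A/N^alpha + B/D^beta and the rough rule that training compute C ≈ 6·N·D. The constants and the compute approximation are illustrative assumptions, not the formulation or fitted values reported in the paper.

```python
import numpy as np

# Illustrative scaling-law constants (NOT the paper's fitted values).
E, A, B, alpha, beta = 1.7, 400.0, 1200.0, 0.34, 0.28

def loss(N, D):
    """Predicted pretraining loss for N parameters and D training tokens."""
    return E + A / N**alpha + B / D**beta

def best_split(C):
    """Grid-search the model size N that minimizes loss under a fixed
    compute budget C, using the approximation C ~= 6 * N * D."""
    Ns = np.logspace(7, 12, 500)          # candidate model sizes
    Ds = C / (6 * Ns)                     # tokens implied by the budget
    losses = loss(Ns, Ds)
    i = int(np.argmin(losses))
    return Ns[i], Ds[i], losses[i]

N, D, L = best_split(C=1e23)              # e.g. a ~1e23 FLOP budget
print(f"N = {N:.2e} params, D = {D:.2e} tokens, predicted loss {L:.3f}")
```

The point of the exercise is that, for a fixed compute budget, making the model bigger forces training on fewer tokens, and the optimum sits where the two penalty terms balance.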
DeepSeek-V2 introduces a Mixture-of-Experts (MoE) architecture that improves performance while cutting training costs by roughly 42%.
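The core idea of an MoE layer is that a lightweight router sends each token to only a few expert networks, so most parameters stay idle on any given forward pass. The sketch below is a generic top-2 softmax router in NumPy, written as an assumption-laden illustration rather than DeepSeek-V2's actual layer; sizes and initialization are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 16, 8, 2

# Each "expert" here is just a small two-layer MLP: (W1, W2).
experts = [(rng.normal(size=(d_model, 4 * d_model)) * 0.02,
            rng.normal(size=(4 * d_model, d_model)) * 0.02)
           for _ in range(n_experts)]
router_w = rng.normal(size=(d_model, n_experts)) * 0.02

def moe_layer(x):
    """x: (n_tokens, d_model). Route each token to its top-k experts and
    combine the expert outputs, weighted by the normalized router scores."""
    logits = x @ router_w                          # (n_tokens, n_experts)
    probs = np.exp(logits - logits.max(-1, keepdims=True))
    probs /= probs.sum(-1, keepdims=True)
    top = np.argsort(-probs, axis=-1)[:, :top_k]   # chosen expert indices
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        gate = probs[t, top[t]]
        gate = gate / gate.sum()                   # renormalize over the top-k
        for g, e in zip(gate, top[t]):
            w1, w2 = experts[e]
            out[t] += g * (np.maximum(x[t] @ w1, 0) @ w2)
    return out

tokens = rng.normal(size=(4, d_model))
print(moe_layer(tokens).shape)   # (4, 16); only 2 of 8 experts run per token
```

Because only two of the eight experts execute per token, the layer holds many more parameters than it spends compute on, which is where the training-cost savings come from.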
DeepSeek-V3 scales this sparse MoE design to 671 billion total parameters, only a fraction of which are activated for any single token.
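Sparse MoE "size" is usually quoted as total parameters, while the per-token cost depends on the much smaller activated subset (V3's 671B total corresponds to roughly 37B activated per token). The arithmetic below uses made-up layer counts and expert sizes purely to show the bookkeeping; it is not V3's actual configuration.

```python
# Hypothetical MoE configuration -- NOT DeepSeek-V3's real one.
n_layers          = 60
attn_params       = 50e6        # attention params per layer
expert_params     = 40e6        # params in one routed expert MLP
n_routed_experts  = 256         # experts that exist in each MoE layer
n_active_experts  = 8           # experts the router picks per token
shared_experts    = 1           # always-on shared expert(s)

per_layer_total  = attn_params + expert_params * (n_routed_experts + shared_experts)
per_layer_active = attn_params + expert_params * (n_active_experts + shared_experts)

total_params  = n_layers * per_layer_total
active_params = n_layers * per_layer_active

print(f"total:     {total_params / 1e9:6.1f} B parameters stored")
print(f"activated: {active_params / 1e9:6.1f} B parameters used per token")
# The gap between the two numbers is what makes a 600B-class model feasible
# to train and serve: only the activated slice does work for each token.
```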
DeepSeek-R1 strengthens reasoning capabilities through large-scale reinforcement learning.
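A key ingredient the R1 work describes is rewarding outcomes that can be checked automatically, such as whether a final answer matches the reference and whether the reasoning is wrapped in the expected tags, rather than relying on a learned reward model. The sketch below shows only that reward-shaping step; the weights and exact checks are assumptions for illustration, not DeepSeek's code.

```python
import re

def reward(completion: str, reference_answer: str) -> float:
    """Rule-based reward: a format component plus an accuracy component.
    (Hypothetical weights; the tag convention mirrors the paper's reported
    use of explicit think/answer sections.)"""
    r = 0.0
    # Format reward: reasoning should be wrapped in <think> ... </think>.
    if re.search(r"<think>.*?</think>", completion, flags=re.DOTALL):
        r += 0.1
    # Accuracy reward: compare the final answer after the reasoning block.
    m = re.search(r"<answer>(.*?)</answer>", completion, flags=re.DOTALL)
    if m and m.group(1).strip() == reference_answer.strip():
        r += 1.0
    return r

sample = "<think>2 + 2 is 4 because ...</think><answer>4</answer>"
print(reward(sample, "4"))   # 1.1
```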
DeepSeekMath presents methods for improving mathematical reasoning in LLMs.
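One concrete technique from this math-focused work is GRPO (Group Relative Policy Optimization), which drops PPO's value network and instead scores each sampled solution relative to the other solutions drawn for the same question. The snippet below shows just that group-relative advantage computation; the surrounding sampling and policy-update loop is omitted.

```python
import numpy as np

def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: for a group of completions sampled for the
    same prompt, normalize each reward by the group mean and std instead
    of using a learned value function as the baseline."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + eps)

# Example: 6 sampled solutions to one problem, rewarded 1 if correct.
print(group_relative_advantages([1, 0, 0, 1, 0, 0]))
```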
DeepSeek-Prover focuses on formal theorem proving, using large-scale synthetic data for training.
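The prover line of work leans on the fact that formal proofs can be machine-checked, so synthetic training data can be filtered automatically: generate candidate proofs, keep only the ones a verifier accepts. The loop below is a schematic sketch; `generate_candidate_proof` and `checker_accepts` are hypothetical stand-ins for a language-model call and a formal verifier such as Lean.

```python
import random

def generate_candidate_proof(statement: str) -> str:
    """Hypothetical stand-in for sampling a proof attempt from a model."""
    return f"proof_attempt_for({statement})_{random.randint(0, 9)}"

def checker_accepts(statement: str, proof: str) -> bool:
    """Hypothetical stand-in for running a formal verifier (e.g. Lean)."""
    return proof.endswith(("0", "1", "2"))   # pretend ~30% of attempts check out

def build_synthetic_dataset(statements, attempts_per_statement=8):
    """Keep only machine-verified (statement, proof) pairs as training data."""
    dataset = []
    for s in statements:
        for _ in range(attempts_per_statement):
            p = generate_candidate_proof(s)
            if checker_accepts(s, p):
                dataset.append((s, p))
                break                          # one verified proof is enough
    return dataset

print(len(build_synthetic_dataset([f"thm_{i}" for i in range(100)])))
```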
DeepSeek-Coder details advances on code-related tasks, with an emphasis on open-source models and methodology.
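Code models in this space are typically reported with pass@k on benchmarks such as HumanEval: sample n completions per problem, count how many pass the unit tests, and estimate the chance that at least one of k samples would pass. Below is the standard unbiased estimator from the original HumanEval evaluation methodology, shown here as general background rather than anything DeepSeek-specific; the example numbers are invented.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: n samples drawn, c of them correct.
    Probability that at least one of k randomly chosen samples is correct."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 200 samples per problem, 37 of them passed the tests.
print(f"pass@1  = {pass_at_k(200, 37, 1):.3f}")   # 0.185
print(f"pass@10 = {pass_at_k(200, 37, 10):.3f}")
```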
DeepSeekMoE examines the Mixture-of-Experts approach itself, covering how the experts are integrated and the benefits the design brings; a small illustration of its fine-grained expert argument follows.
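One way this work motivates fine-grained experts is combinatorial: splitting each expert into m smaller ones (and activating m times as many) keeps the activated parameter budget roughly constant while greatly increasing the number of distinct expert combinations the router can form, and an always-on shared expert can absorb common knowledge so the routed experts specialize. The arithmetic below only illustrates that combination count with assumed numbers, not any specific model's configuration.

```python
from math import comb

# Assumed baseline config (illustrative): 16 coarse experts, 2 active per token.
n_experts, top_k = 16, 2

for m in (1, 2, 4):                    # split factor for fine-grained experts
    n, k = n_experts * m, top_k * m    # same activated parameter budget
    print(f"split x{m}: choose {k:2d} of {n:3d} experts "
          f"-> {comb(n, k):,} possible combinations")
# A shared expert that every token always uses sits outside this routing
# choice, so the routed experts are free to specialize.
```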