<html>
  <head>
    <meta charset="utf-8">
    <meta name="description" content="DeepSeek Papers: Advancing Open-Source Language Models">
    <meta name="keywords" content="DeepSeek, LLM, AI, Research">
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <title>DeepSeek Papers: Advancing Open-Source Language Models</title>
    <link href="https://fonts.googleapis.com/css?family=Google+Sans|Noto+Sans|Castoro" rel="stylesheet">
    <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/bulma/0.9.3/css/bulma.min.css">
    <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.0.0/css/all.min.css">
    <style>
      .publication-title {
        color: #363636;
      }
      .paper-card {
        margin-bottom: 2rem;
        transition: transform 0.2s;
      }
      .paper-card:hover {
        transform: translateY(-5px);
      }
      .coming-soon-badge {
        background-color: #3273dc;
        color: white;
        padding: 0.25rem 0.75rem;
        border-radius: 4px;
        font-size: 0.8rem;
        margin-left: 1rem;
      }
      .paper-description {
        color: #4a4a4a;
        margin-top: 0.5rem;
      }
      .release-date {
        color: #7a7a7a;
        font-size: 0.9rem;
      }
    </style>
  </head>
  <body>
    <section class="hero is-light">
      <div class="hero-body">
        <div class="container is-max-desktop">
          <div class="columns is-centered">
            <div class="column has-text-centered">
              <h1 class="title is-1 publication-title">DeepSeek Papers</h1>
              <h2 class="subtitle is-3">Advancing Open-Source Language Models</h2>
            </div>
          </div>
        </div>
      </div>
    </section>
    <section class="section">
      <div class="container is-max-desktop">
        <div class="content">
          <div class="columns is-centered">
            <div class="column is-10">
              <!-- DeepSeek LLM -->
              <div class="card paper-card">
                <div class="card-content">
                  <h3 class="title is-4">
                    DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
                    <span class="coming-soon-badge">Deep Dive Coming Soon</span>
                  </h3>
                  <p class="release-date">Released: November 29, 2023</p>
                  <p class="paper-description">
                    This foundational paper studies scaling laws and the trade-off between data and model size,
                    laying the groundwork for subsequent DeepSeek models.
                  </p>
                </div>
              </div>
              <!-- DeepSeek-V2 -->
              <div class="card paper-card">
                <div class="card-content">
                  <h3 class="title is-4">
                    DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
                    <span class="coming-soon-badge">Deep Dive Coming Soon</span>
                  </h3>
                  <p class="release-date">Released: May 2024</p>
                  <p class="paper-description">
                    Introduces a Mixture-of-Experts (MoE) model built on Multi-head Latent Attention (MLA) and the
                    DeepSeekMoE architecture. Compared with DeepSeek 67B, it reports stronger performance while
                    saving 42.5% of training costs.
                  </p>
                </div>
              </div>
              <!-- DeepSeek-V3 -->
              <div class="card paper-card">
                <div class="card-content">
                  <h3 class="title is-4">
                    DeepSeek-V3 Technical Report
                    <span class="coming-soon-badge">Deep Dive Coming Soon</span>
                  </h3>
                  <p class="release-date">Released: December 2024</p>
                  <p class="paper-description">
                    Describes how sparse MoE networks are scaled to 671 billion total parameters, of which
                    37 billion are activated per token, using FP8 mixed-precision training and
                    high-performance computing (HPC) co-design strategies.
                  </p>
                </div>
              </div>
              <!-- DeepSeek-R1 -->
              <div class="card paper-card">
                <div class="card-content">
                  <h3 class="title is-4">
                    DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
                    <span class="coming-soon-badge">Deep Dive Coming Soon</span>
                  </h3>
                  <p class="release-date">Released: January 20, 2025</p>
                  <p class="paper-description">
                    The R1 model builds on DeepSeek-V3-Base and uses large-scale reinforcement learning to
                    strengthen reasoning, with reported performance comparable to OpenAI's o1 on reasoning tasks.
                  </p>
                </div>
              </div>
              <!-- DeepSeekMath -->
              <div class="card paper-card">
                <div class="card-content">
                  <h3 class="title is-4">
                    DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
                    <span class="coming-soon-badge">Deep Dive Coming Soon</span>
                  </h3>
                  <p class="release-date">Released: February 2024</p>
                  <p class="paper-description">
                    This paper presents methods to improve mathematical reasoning in LLMs and introduces
                    Group Relative Policy Optimization (GRPO), a variant of PPO used in the reinforcement learning stage.
                  </p>
                </div>
              </div>
              <!-- DeepSeek-Prover -->
              <div class="card paper-card">
                <div class="card-content">
                  <h3 class="title is-4">
                    DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data
                    <span class="coming-soon-badge">Deep Dive Coming Soon</span>
                  </h3>
                  <p class="paper-description">
                    Focuses on improving formal theorem proving in language models by training on large-scale
                    synthetic Lean 4 proof data, with reported gains in automated mathematical reasoning.
                  </p>
                </div>
              </div>
              <!-- DeepSeek-Coder-V2 -->
              <div class="card paper-card">
                <div class="card-content">
                  <h3 class="title is-4">
                    DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence
                    <span class="coming-soon-badge">Deep Dive Coming Soon</span>
                  </h3>
                  <p class="paper-description">
                    This paper presents an open-source Mixture-of-Experts code language model that improves on
                    earlier DeepSeek coding models, with reported performance comparable to closed-source
                    models on code-related tasks.
                  </p>
                </div>
              </div>
              <!-- DeepSeekMoE -->
              <div class="card paper-card">
                <div class="card-content">
                  <h3 class="title is-4">
                    DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
                    <span class="coming-soon-badge">Deep Dive Coming Soon</span>
                  </h3>
                  <p class="paper-description">
                    Proposes the DeepSeekMoE architecture, which uses fine-grained expert segmentation and
                    shared expert isolation to improve expert specialization, scalability, and efficiency.
                  </p>
                </div>
              </div>
            </div>
          </div>
        </div>
      </div>
    </section>
    <footer class="footer">
      <div class="container">
        <div class="content has-text-centered">
          <p>
            This website is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">
            Creative Commons Attribution-ShareAlike 4.0 International License</a>.
          </p>
        </div>
      </div>
    </footer>
  </body>
</html>