<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="description" content="DeepSeek Papers: Advancing Open-Source Language Models">
<meta name="keywords" content="DeepSeek, LLM, AI, Research">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>DeepSeek Papers: Advancing Open-Source Language Models</title>
<link href="https://fonts.googleapis.com/css?family=Google+Sans|Noto+Sans|Castoro" rel="stylesheet">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/bulma/0.9.3/css/bulma.min.css">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.0.0/css/all.min.css">
<style>
.publication-title {
color: #363636;
}
.paper-card {
margin-bottom: 2rem;
transition: transform 0.2s;
}
.paper-card:hover {
transform: translateY(-5px);
}
.coming-soon-badge {
background-color: #3273dc;
color: white;
padding: 0.25rem 0.75rem;
border-radius: 4px;
font-size: 0.8rem;
margin-left: 1rem;
}
.paper-description {
color: #4a4a4a;
margin-top: 0.5rem;
}
.release-date {
color: #7a7a7a;
font-size: 0.9rem;
}
</style>
</head>
<body>
<section class="hero is-light">
<div class="hero-body">
<div class="container is-max-desktop">
<div class="columns is-centered">
<div class="column has-text-centered">
<h1 class="title is-1 publication-title">DeepSeek Papers</h1>
<h2 class="subtitle is-3">Advancing Open-Source Language Models</h2>
</div>
</div>
</div>
</div>
</section>
<section class="section">
<div class="container is-max-desktop">
<div class="content">
<div class="columns is-centered">
<div class="column is-10">
<!-- DeepSeek LLM -->
<div class="card paper-card">
<div class="card-content">
<h3 class="title is-4">
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
<span class="coming-soon-badge">Deep Dive Coming Soon</span>
</h3>
<p class="release-date">Released: November 29, 2023</p>
<p class="paper-description">
This foundational paper explores scaling laws and the trade-offs between data and model size,
establishing the groundwork for subsequent models.
</p>
</div>
</div>
<!-- DeepSeek-V2 -->
<div class="card paper-card">
<div class="card-content">
<h3 class="title is-4">
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
<span class="coming-soon-badge">Deep Dive Coming Soon</span>
</h3>
<p class="release-date">Released: May 2024</p>
<p class="paper-description">
Introduces a Mixture-of-Experts (MoE) architecture, enhancing performance while reducing
training costs by 42%. Emphasizes strong performance characteristics and efficiency improvements.
</p>
</div>
</div>
<!-- DeepSeek-V3 -->
<div class="card paper-card">
<div class="card-content">
<h3 class="title is-4">
DeepSeek-V3 Technical Report
<span class="coming-soon-badge">Deep Dive Coming Soon</span>
</h3>
<p class="release-date">Released: December 2024</p>
<p class="paper-description">
Discusses the scaling of sparse MoE networks to 671 billion parameters, utilizing mixed precision
training and high-performance computing (HPC) co-design strategies.
</p>
</div>
</div>
<!-- Add remaining papers following the same pattern -->
</div>
</div>
</div>
</div>
</section>
<footer class="footer">
<div class="container">
<div class="content has-text-centered">
<p>
This website is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">
Creative Commons Attribution-ShareAlike 4.0 International License</a>.
</p>
</div>
</div>
</footer>
</body>
</html>