|
--- |
|
license: mit |
|
library_name: transformers |
|
base_model: |
|
- deepseek-ai/DeepSeek-V3-0324 |
|
- deepseek-ai/DeepSeek-R1 |
|
pipeline_tag: text-generation |
|
--- |
|
# DeepSeek-R1T-Chimera |
|
|
|
<div align="center"> |
|
<img src="https://354918363417-runtime-assets.s3.eu-central-1.amazonaws.com/company_logo_light.svg" |
|
alt="TNG Logo" |
|
width="400" |
|
style="display: inline-block; vertical-align: middle;"/> |
|
</div> |
|
<br> |
|
<div align="center"> |
|
<a href="LICENSE" style="margin: 2px;"> |
|
<img alt="License" src="https://img.shields.io/badge/License-MIT-f5de53?&color=f5de53" style="display: inline-block; vertical-align: middle;"/> |
|
</a> |
|
</div> |
|
<br> |
|
<div align="center"> |
|
<a href="https://x.com/tngtech/status/1916284566127444468" style="margin: 2px;"> |
|
<img alt="Benchmarks" src="R1T-Chimera_Benchmarks_20250427_V1.jpg" style="display: inline-block; vertical-align: middle;"/> |
|
</a> |
|
</div> |
|
|
|
|
|
**Model merge of DeepSeek-R1 and DeepSeek-V3 (0324)** |
|
|
|
An open weights model combining the intelligence of R1 with the token efficiency of V3. |
|
|
|
[Announcement on X](https://x.com/tngtech/status/1916284566127444468) | [LinkedIn post](https://www.linkedin.com/posts/tng-technology-consulting_on-the-weekend-we-released-deepseek-r1t-chimera-activity-7323008947236290560-Cf2m) | [Try it on OpenRouter](https://openrouter.ai/tngtech/deepseek-r1t-chimera:free) |
|
|
|
|
|
## Model Details |
|
|
|
- **Architecture**: DeepSeek-MoE Transformer-based language model |
|
- **Combination Method**: Merged model weights from DeepSeek-R1 and DeepSeek-V3 (0324) |
|
- **Release Date**: 2025-04-27 |
|
|
|
## Use, Out-of-scope Use, Limitations, Risks, Recommendations et al |
|
Regarding R1T Chimera, we ask you to follow the careful guidelines that Microsoft has created for their "MAI-DS-R1" DeepSeek-based model. |
|
|
|
These guidelines are available [here on Hugging Face](https://huggingface.co/microsoft/MAI-DS-R1). |
|
|
|
## Contact |
|
|
|
- Email: [email protected] |
|
- X.com: @tngtech |
|
|
|
## Citation |
|
|
|
``` |
|
@misc{tng_technology_consulting_gmbh_2025, |
|
author = { TNG Technology Consulting GmbH }, |
|
title = { DeepSeek-R1T-Chimera }, |
|
year = 2025, |
|
month = {April}, |
|
url = { https://huggingface.co/tngtech/DeepSeek-R1T-Chimera }, |
|
doi = { 10.57967/hf/5330 }, |
|
publisher = { Hugging Face } |
|
} |
|
|
|
``` |