Hugging Face
51.6 TFLOPS
21
4
315
Sthenno
sthenno
30 followers · 30 following
https://github.com/neoheartbeats
neoheartbeats
AI & ML interests
To contact me:
[email protected]
Recent Activity
reacted to sometimesanotion's post with 👍 about 9 hours ago
The capabilities of the new Qwen 3 models are fascinating, and I am watching that space! My experience, however, is that context management is vastly more important with them. If you use a client with a typical session log with rolling compression, a Qwen 3 model will start to generate the same messages over and over. I don't think that detracts from them. They're optimized for a more advanced MCP environment. I honestly think the 8B is optimal for home use, given proper RAG/CAG. In typical session chats, Lamarck and Chocolatine are still my daily drivers. I worked hard to give Lamarck v0.7 a sprinkling of CoT from both DRT and Deepseek R1. While those models got surpassed on the leaderboards, in practice, I still really enjoy their output. My projects are focusing on application and context management, because that's where the payoff in improved quality is right now. But should there be a mix of finetunes to make just the right mix of - my recipes are standing by.
new activity about 10 hours ago
sthenno-com/miscii-14b-0218: Do I need to purchase an API to use it?
liked a model 5 days ago
shuttleai/shuttle-3.5
Organizations
sthenno's activity
liked a model 5 days ago
shuttleai/shuttle-3.5 • Text Generation • Updated 8 days ago • 150 • 42
liked a model 8 days ago
JetBrains/Mellum-4b-base • Text Generation • Updated 1 day ago • 2.2k • 297
liked a model 9 days ago
deepseek-ai/DeepSeek-Prover-V2-671B • Text Generation • Updated 9 days ago • 6.24k • 737
liked 4 models 10 days ago
Qwen/Qwen3-14B • Text Generation • Updated 10 days ago • 156k • 130
Qwen/Qwen3-32B • Text Generation • Updated 10 days ago • 170k • 285
Qwen/Qwen3-30B-A3B • Text Generation • Updated 9 days ago • 120k • 497
Qwen/Qwen3-235B-A22B • Text Generation • Updated 8 days ago • 58.6k • 746
liked a model 16 days ago
SteelStorage/Q2.5-MS-Mistoria-72b-v2 • Text Generation • Updated Nov 23, 2024 • 33 • 6
liked a model 24 days ago
THUDM/GLM-4-32B-0414 • Text Generation • Updated 8 days ago • 25.5k • 382
liked a dataset 26 days ago
m-a-p/COIG-P • Viewer • Updated 24 days ago • 1.01M • 735 • 17
liked 2 models 26 days ago
nvidia/Llama-3_3-Nemotron-Super-49B-v1 • Text Generation • Updated about 8 hours ago • 183k • 279
nvidia/Llama-3_1-Nemotron-Ultra-253B-v1 • Text Generation • Updated about 10 hours ago • 28.8k • 293
liked a model 28 days ago
deepcogito/cogito-v1-preview-qwen-32B • Text Generation • Updated about 1 month ago • 13.6k • 105
liked a model about 1 month ago
wanlige/QWQ-stock • Text Generation • Updated 11 days ago • 86 • 8
liked a dataset about 1 month ago
Conard/fortune-telling • Viewer • Updated Feb 17 • 207 • 2.73k • 138
liked 3 models about 1 month ago
VIDraft/Gemma-3-R1984-27B • Image-Text-to-Text • Updated 18 days ago • 161 • 56
Qwen/Qwen2.5-VL-32B-Instruct • Image-Text-to-Text • Updated 25 days ago • 382k • 353
Qwen/Qwen2.5-Omni-7B • Any-to-Any • Updated 9 days ago • 191k • 1.58k
liked a dataset about 1 month ago
sychonix/emotion • Viewer • Updated Mar 26 • 20k • 285 • 31
liked a dataset about 2 months ago
BytedTsinghua-SIA/DAPO-Math-17k • Viewer • Updated 21 days ago • 1.79M • 4.23k • 68