47 4 160

sometimesanotion PRO

sometimesanotion

https://ko-fi.com/sometimesanotion

AI & ML interests

Agentic LLM services, model merging, finetunes, distillation

Recent Activity

liked a model 5 days ago

kalomaze/Qwen3-16B-A3B

posted an update 6 days ago

The capabilities of the new Qwen 3 models are fascinating, and I am watching that space! My experience, however, is that context management is vastly more important with them. If you use a client with a typical session log with rolling compression, a Qwen 3 model will start to generate the same messages over and over. I don't think that detracts from them. They're optimized for a more advanced MCP environment. I honestly think the 8B is optimal for home use, given proper RAG/CAG. In typical session chats, Lamarck and Chocolatine are still my daily drives. I worked hard to give Lamarck v0.7 a sprinkling of CoT from both DRT and Deepseek R1. While those models got surpassed on the leaderboards, in practice, I still really enjoy their output. My projects are focusing on application and context management, because that's where the payoff in improved quality is right now. But should there be a mix of finetunes to make just the right mix of - my recipes are standing by.

liked a model 6 days ago

huihui-ai/Qwen3-14B-abliterated

View all activity

Organizations

Posts 6

Post

1668

The capabilities of the new Qwen 3 models are fascinating, and I am watching that space!

My experience, however, is that context management is vastly more important with them. If you use a client with a typical session log with rolling compression, a Qwen 3 model will start to generate the same messages over and over. I don't think that detracts from them. They're optimized for a more advanced MCP environment. I honestly think the 8B is optimal for home use, given proper RAG/CAG.

In typical session chats, Lamarck and Chocolatine are still my daily drives. I worked hard to give Lamarck v0.7 a sprinkling of CoT from both DRT and Deepseek R1. While those models got surpassed on the leaderboards, in practice, I still really enjoy their output.

My projects are focusing on application and context management, because that's where the payoff in improved quality is right now. But should there be a mix of finetunes to make just the right mix of - my recipes are standing by.

Post

4822

I'd like to draw your attention to a Lamarck-based experiment which uses Arcee AI's newly published arcee_fusion merge method for three out of its four merges. Yes, just four. This is a simple one, and its recipe is fully open:

sometimesanotion/Lamarck-14B-v0.7-Fusion

It unifies three branches, all of which feature models which bring Lamarck-14B-v0.7 and Qwenvergence-14B-v12-Prose together. One side features @jpacifico 's jpacifico/Chocolatine-2-14B-Instruct-v2.0.3 and the other features @suayptalha 's suayptalha/Lamarckvergence-14B paired with my models which were their merge ancestors.

A fusion merge - of a fusion merge and a SLERP of a fusion and older merge - should demonstrate the new merge method's behavior in interesting ways, especially in the first 1/4th of the model where the SLERP has less impact.

I welcome you to kick the tires and learn from it. It has prose quality near Qwenvergence v12's - as you'd expect.

Thank you, @mradermacher and @MaziyarPanahi , for the first-day quantizations! Your work helped get me started. https://huggingface.co/models?other=base_model:quantized:sometimesanotion/Lamarck-14B-v0.7-Fusion

View all Posts

Collections 1

models 18

datasets 0

None public yet

sometimesanotion PRO

AI & ML interests

Recent Activity

Organizations

Posts 6

Collections 1

sometimesanotion/Lamarck-14B-v0.7

sometimesanotion/Lamarck-14B-v0.6

jpacifico/Chocolatine-2-14B-Instruct-v2.0b3

sometimesanotion/Qwenvergence-14B-v13-Prose-DS

models 18

sometimesanotion/Qwenvergence-14B-v3-Prose

sometimesanotion/Lamarck-14B-v0.7

sometimesanotion/Lamarck-14B-v0.7-Fusion

sometimesanotion/Qwenvergence-14B-v13-Prose-DS

sometimesanotion/Qwenvergence-14B-v12-Prose-DS

sometimesanotion/Qwenvergence-14B-v11

sometimesanotion/Qwenvergence-14B-v12-Prose

sometimesanotion/LoRA-64-Chocolatine-2-14B-Instruct-v2.0b3

sometimesanotion/LoRA-32-Chocolatine-2-14B-Instruct-v2.0b3

sometimesanotion/Base-Qwenvergence

datasets 0

sometimesanotion PRO

AI & ML interests

Recent Activity

Organizations

Posts 6

Collections 1

models 18 Sort: Recently updated

datasets 0

models 18