daavoo

AI & ML interests

None yet

Organizations

Multi🤖Transformers · Mozilla.ai · Hugging Face Discord Community

daavoo's activity

reacted to shekkizh's post with 👀 3 days ago
🙋🏽‍♂️ Is your "multi-agent" system really multi-agentic? Or is it just a modular setup with a bunch of different prompts? 🤨

I’ve had this discussion way too often, so I finally wrote it all down. If you’re building with agents, you need to read this.

Here’s the TLDR:
✅ True multi-agent systems require:
• Persistent, private state per agent
• Memory that impacts future decisions
• Adaptation based on past experiences

❌ Just having modular components, function calls, or multiple LLMs doesn't cut it. That's not multi-agentic; it's just pipelining.

🤝 The magic is in evolving relationships, context retention, and behavioral shifts over time.
🧠 If your agents aren't learning from each other or changing based on past experience… you are missing the point (see the toy sketch below).
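
As a minimal illustration (my own toy Python sketch, not code from the linked post; the StatefulAgent name is made up), here is what persistent private state and memory-driven adaptation look like in miniature:

from dataclasses import dataclass, field

@dataclass
class StatefulAgent:
    # Persistent, private state: each agent instance owns its own memory.
    name: str
    memory: list = field(default_factory=list)

    def act(self, message: str) -> str:
        # Memory impacts future decisions: a repeated request is handled
        # differently from a first-time one (adaptation from experience).
        seen_before = message in self.memory
        self.memory.append(message)
        return "refine the earlier answer" if seen_before else "explore a new approach"

agent = StatefulAgent(name="researcher")
print(agent.act("find sources"))  # explore a new approach
print(agent.act("find sources"))  # refine the earlier answer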

What do you think? Curious what patterns you're experimenting with 🧐

👉 Full post: https://shekkizh.github.io/posts/2025/04/multi-agents/
reacted to danielhanchen's post with 🤗 3 days ago
🦥 Introducing Unsloth Dynamic v2.0 GGUFs!
Our v2.0 quants set new benchmarks on 5-shot MMLU and KL Divergence, meaning you can now run & fine-tune quantized LLMs while preserving as much accuracy as possible.

Llama 4: unsloth/Llama-4-Scout-17B-16E-Instruct-GGUF
DeepSeek-R1: unsloth/DeepSeek-R1-GGUF-UD
Gemma 3: unsloth/gemma-3-27b-it-GGUF

We made selective layer quantization much smarter. Instead of modifying only a subset of layers, we now dynamically quantize all layers, so each layer can use a different bit width. Our dynamic method can now be applied to all LLM architectures, not just MoEs.
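
To give a feel for the idea, here is a toy Python sketch of sensitivity-driven per-layer bit allocation. This is purely illustrative, not Unsloth's actual algorithm, and the sensitivity numbers are made up:

def assign_bits(layer_sensitivities, low=2, high=8):
    # Layers whose quantization error hurts accuracy more get more bits.
    lo, hi = min(layer_sensitivities), max(layer_sensitivities)
    span = (hi - lo) or 1.0
    return [round(low + (s - lo) / span * (high - low)) for s in layer_sensitivities]

# Made-up sensitivities for four layers:
print(assign_bits([0.1, 0.4, 0.9, 0.2]))  # -> [2, 4, 8, 3]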

Blog with Details: https://docs.unsloth.ai/basics/dynamic-v2.0

All our future GGUF uploads will leverage Dynamic v2.0 and our hand-curated 300K–1.5M token calibration dataset to improve conversational chat performance.

For accurate benchmarking, we built an evaluation framework to match the reported 5-shot MMLU scores of Llama 4 and Gemma 3. This allowed apples-to-apples comparisons between full-precision models and Dynamic v2.0, QAT, and standard iMatrix quants.
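
For reference, the KL divergence mentioned above measures how far the quantized model's next-token distribution drifts from full precision. A minimal PyTorch sketch of the metric (illustrative; not Unsloth's evaluation framework):

import torch
import torch.nn.functional as F

def mean_kl(fp_logits: torch.Tensor, q_logits: torch.Tensor) -> torch.Tensor:
    # KL(P_full || P_quant), averaged over tokens: lower means the quantized
    # model stays closer to the full-precision predictions.
    log_p = F.log_softmax(fp_logits, dim=-1)
    log_q = F.log_softmax(q_logits, dim=-1)
    return (log_p.exp() * (log_p - log_q)).sum(dim=-1).mean()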

Dynamic v2.0 aims to minimize the performance gap between full-precision models and their quantized counterparts.
posted an update 3 days ago
We have just released a new version of ⭐ https://github.com/mozilla-ai/any-agent ⭐ exposing an API to be used in async contexts:

import asyncio
from any_agent import AgentConfig, AnyAgent, TracingConfig
from any_agent.tools import search_web

async def main():
    # Build the manager agent (and its managed agent) asynchronously.
    agent = await AnyAgent.create_async(
        "openai",
        AgentConfig(
            model_id="gpt-4.1-mini",
            instructions="You are the main agent. Use the other available agents to find an answer",
        ),
        managed_agents=[
            AgentConfig(
                name="search_web_agent",
                description="An agent that can search the web",
                model_id="gpt-4.1-nano",
                tools=[search_web]
            )
        ],
        tracing=TracingConfig()
    )

    await agent.run_async("Which Agent Framework is the best??")

if __name__ == "__main__":
    asyncio.run(main())
posted an update 7 days ago
Another day, another release in
⭐ https://github.com/mozilla-ai/any-agent ⭐

You can now use MCP (Model Context Protocol) tools via SSE (Server-Sent Events):

from any_agent import AgentConfig, AnyAgent
from any_agent.config import MCPSseParams

agent = AnyAgent.create(
    "smolagents",
    AgentConfig(
        model_id="gpt-4o-mini",
        tools=[
            MCPSseParams(
                url="http://localhost:8000/sse"
            ),
        ]
    )
)
agent.run("What do MCP and SSE mean?")


See SuperGateway for an easy way to turn a stdio MCP server into an SSE server.
replied to their post 10 days ago

Currently, we only support two patterns that can be implemented (almost) consistently across frameworks:
a single agent, and multi-agent in the form of a "manager" plus "managed agents".

Don't hesitate to open an issue at https://github.com/mozilla-ai/any-agent/issues to discuss what other patterns would be useful.

reacted to Xenova's post with 🤗🚀🚀 11 days ago
Reasoning models like o3 and o4-mini are advancing faster than ever, but imagine what will be possible when they can run locally in your browser! 🤯

Well, with 🤗 Transformers.js, you can do just that! Here's Zyphra's new ZR1 model running at over 100 tokens/second on WebGPU! ⚡️

Giving models access to browser APIs (like File System, Screen Capture, and more) could unlock an entirely new class of web experiences that are personalized, interactive, and run locally in a secure, sandboxed environment.

For now, try out the demo! 👇
webml-community/Zyphra-ZR1-WebGPU
posted an update 12 days ago
New release in https://github.com/mozilla-ai/any-agent 🤖

You can now use "managed_agents" in langchain and llama_index as well, in addition to the other supported frameworks:

from any_agent import AgentConfig, AgentFramework, AnyAgent
from any_agent.tracing import setup_tracing

framework = AgentFramework("langchain")  # also works with AgentFramework("llama_index") and the rest of the frameworks
setup_tracing(framework)

agent = AnyAgent.create(
    framework,
    AgentConfig(
        model_id="gpt-4.1-mini",
        instructions="You are the main agent. Use the other available agents to find an answer",
    ),
    managed_agents=[
        AgentConfig(
            name="search_web_agent",
            description="An agent that can search the web",
            model_id="gpt-4.1-nano",
            tools=["any_agent.tools.search_web"]
        ),
        AgentConfig(
            name="visit_webpage_agent",
            description="An agent that can visit webpages",
            model_id="gpt-4.1-nano",
            tools=["any_agent.tools.visit_webpage"]
        )
    ]
)
agent.run("Which Agent Framework is the best??")
reacted to stefan-french's post with 😎 14 days ago
reacted to etemiz's post with 👀 17 days ago
It looks like the Llama 4 team gamed the LMArena benchmarks by making their Maverick model output emojis, longer responses, and ultra-high enthusiasm! Is that ethical or not? They could certainly have done a better job by working with teams like llama.cpp, just like the Qwen team did with Qwen 3 before releasing the model.

I started playing with LLMs in 2024, just before the release of Llama 3. I think Meta has contributed a lot to this field and is still contributing. Most LLM fine-tuning tools are based on their models, and the inference tool llama.cpp carries their name. Llama 4 is fast and maybe not the greatest in real performance, but it still deserves respect. My enthusiasm for Llama models is probably because they rank highest on my AHA Leaderboard:

https://sheet.zoho.com/sheet/open/mz41j09cc640a29ba47729fed784a263c1d08

It looks like they did a worse job this time compared to Llama 3.1, which has been on top for a while.

Ranking high on my leaderboard is not correlated with technological progress or parameter size. In fact, if LLM training is drifting away from human alignment thanks to synthetic datasets or something else (?), it could easily be inversely correlated with technological progress. There does seem to be a correlation with the location of the builders (West or East): Western models rank higher. This has become more visible as the leaderboard has progressed; in the past there was less correlation. And Europeans seem to land in the middle!

Whether you like positive vibes from AI or not, maybe we are getting closer to a time when humans can be gamed by an AI? What do you think?
posted an update 18 days ago
Wondering how the new Google Agent Development Kit (ADK) compares against other frameworks? 🤔 You can try it in any-agent 🚀

https://github.com/mozilla-ai/any-agent

from any_agent import AgentConfig, AgentFramework, AnyAgent

agent = AnyAgent.create(
    AgentFramework("google"),
    AgentConfig(
        model_id="gpt-4o-mini"
    )
)
agent.run("Which Agent Framework is the best??")

posted an update 20 days ago
After working on agent evaluation 🔍🤖 over the last few weeks, we started to accumulate code that makes it easier to try different agent frameworks. From that code, we have built and just released a small library called any-agent.


Give it a try and a ⭐: https://github.com/mozilla-ai/any-agent

from any_agent import AgentConfig, AgentFramework, AnyAgent

agent = AnyAgent.create(
    AgentFramework("smolagents"),  # or openai, langchain, llama_index
    AgentConfig(
        model_id="gpt-4o-mini"
    )
)
agent.run("Which Agent Framework is the best??")
reacted to stefan-french's post with 🔥🚀 about 1 month ago
reacted to sharpenb's post with 🔥 about 1 month ago
We open-sourced the pruna package, which can be installed with pip install pruna :) It allows you to easily compress and evaluate AI models, including transformers and diffusers.

- Github repo: https://github.com/PrunaAI/pruna
- Documentation: https://docs.pruna.ai/en/stable/index.html

With open-sourcing, people can now inspect and contribute to the code. Beyond the code, we provide a detailed README, tutorials, benchmarks, and documentation to make compression, evaluation, and saving/loading/serving of AI models transparent.
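
For a rough idea of the workflow, here is a minimal sketch following the patterns in the pruna docs. The model id and the "cacher" option are examples and may differ from the current API:

from diffusers import StableDiffusionPipeline
from pruna import SmashConfig, smash

# Load any supported model, e.g. a diffusers pipeline (example model id).
pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")

# Select compression algorithms via the config (option assumed from the docs).
smash_config = SmashConfig()
smash_config["cacher"] = "deepcache"

# smash() returns the compressed model, ready for inference or saving.
smashed_pipe = smash(model=pipe, smash_config=smash_config)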

Happy to share it with you and always interested in collecting your feedback :)
posted an update about 1 month ago
reacted to chansung's post with 👍 about 1 month ago
Gemma 3 Release in a nutshell
(function calling seems unsupported, even though the announcement said it was)
posted an update about 1 month ago
posted an update about 2 months ago