Hugging Face – Posts

Join the conversation

Join the community of Machine Learners and AI enthusiasts.

All HF Hub posts

danielhanchen

posted an update 2 days ago

Post

3201

🦥 Introducing Unsloth Dynamic v2.0 GGUFs!
Our v2.0 quants set new benchmarks on 5-shot MMLU and KL Divergence, meaning you can now run & fine-tune quantized LLMs while preserving as much accuracy as possible.

Llama 4: unsloth/Llama-4-Scout-17B-16E-Instruct-GGUF
DeepSeek-R1: unsloth/DeepSeek-R1-GGUF-UD
Gemma 3: unsloth/gemma-3-27b-it-GGUF

We made selective layer quantization much smarter. Instead of modifying only a subset of layers, we now dynamically quantize all layers so every layer has a different bit. Now, our dynamic method can be applied to all LLM architectures, not just MoE's.

Blog with Details: https://docs.unsloth.ai/basics/dynamic-v2.0

All our future GGUF uploads will leverage Dynamic 2.0 and our hand curated 300K–1.5M token calibration dataset to improve conversational chat performance.

For accurate benchmarking, we built an evaluation framework to match the reported 5-shot MMLU scores of Llama 4 and Gemma 3. This allowed apples-to-apples comparisons between full-precision vs. Dynamic v2.0, QAT and standard iMatrix quants.

Dynamic v2.0 aims to minimize the performance gap between full-precision models and their quantized counterparts.

julien-c

posted an update 2 days ago

Post

2876

BOOOOM: Today I'm dropping TINY AGENTS

the 50 lines of code Agent in Javascript 🔥

I spent the last few weeks working on this, so I hope you will like it.

I've been diving into MCP (Model Context Protocol) to understand what the hype was all about.

It is fairly simple, but still quite powerful: MCP is a standard API to expose sets of Tools that can be hooked to LLMs.

But while doing that, came my second realization:

Once you have a MCP Client, an Agent is literally just a while loop on top of it. 🤯

➡️ read it exclusively on the official HF blog: https://huggingface.co/blog/tiny-agents

1 reply

merterbak

posted an update 2 days ago

Post

2753

FlowReasoner is a new system that builds a custom set of small AI agents for every user question. Unlike search based methods it uses reasoning driven optimization with external execution feedback.

✅ First, it distills reasoning data using DeepSeek R1-671B to build multi agent systems. 🤖
✅ Then, reasoning data used for DeepSeek-R1-Distill-Qwen-7B via supervised fine tuning for basic reasoning skills. 💡
✅ Finally, RL with GRPO (optimizes by comparing response groups from queries/tasks) to improve reasoning.

FlowReasoner: Reinforcing Query-Level Meta-Agents (2504.15257)
Code: https://github.com/sail-sg/flowreasoner

DawnC

posted an update about 12 hours ago

Post

905

I'm excited to introduce VisionScout —an interactive vision tool that makes computer vision both accessible and powerful! 👀🔍

What can VisionScout do right now?
🖼️ Upload any image and detect 80 different object types using YOLOv8.
🔄 Instantly switch between Nano, Medium, and XLarge models depending on your speed vs. accuracy needs.
🎯 Filter specific classes (people, vehicles, animals, etc.) to focus only on what matters to you.
📊 View detailed statistics about detected objects, confidence levels, and spatial distribution.
🎨 Enjoy a clean, intuitive interface with responsive design and enhanced visualizations.

What's next?
I'm working on exciting updates:
- Support for more models
- Video processing and object tracking across frames
- Faster real-time detection
- Improved mobile responsiveness

The goal is to build a complete but user-friendly vision toolkit for both beginners and advanced users.

Try it yourself! 🚀
DawnC/VisionScout

I'd love to hear your feedback , what features would you find most useful? Any specific use cases you'd love to see supported?

Give it a try and let me know your thoughts in the comments! Stay tuned for future updates.

#ComputerVision #ObjectDetection #YOLO #MachineLearning #TechForLife

as-cle-bert

posted an update 2 days ago

Post

2396

Ever dreamt of ingesting into a vector DB that pile of CSVs, Word documents and presentations laying in some remote folders on your PC?🗂️
What if I told you that you can do it within three to six lines of code?🤯
Well, with my latest open-source project, 𝐢𝐧𝐠𝐞𝐬𝐭-𝐚𝐧𝐲𝐭𝐡𝐢𝐧𝐠 (https://github.com/AstraBert/ingest-anything), you can take all your non-PDF files, convert them to PDF, extract their text, chunk, embed and load them into a vector database, all in one go!🚀
How? It's pretty simple!
📁 The input files are converted into PDF by PdfItDown (https://github.com/AstraBert/PdfItDown)
📑 The PDF text is extracted using LlamaIndex readers
🦛 The text is chunked exploiting Chonkie
🧮 The chunks are embedded thanks to Sentence Transformers models
🗄️ The embeddings are loaded into a Qdrant vector database

And you're done!✅
Curious of trying it? Install it by running:

𝘱𝘪𝘱 𝘪𝘯𝘴𝘵𝘢𝘭𝘭 𝘪𝘯𝘨𝘦𝘴𝘵-𝘢𝘯𝘺𝘵𝘩𝘪𝘯𝘨

And you can start using it in your python scripts!🐍
Don't forget to star it on GitHub and let me know if you have any feedback! ➡️ https://github.com/AstraBert/ingest-anything

3 replies

MonsterMMORPG

posted an update about 14 hours ago

Post

879

ComfyUI 1-click Installers updated for latest official Torch 2.7 with CUDA 12.8 (RTX 5000 series support + older GPUs) - Automatically installs xFormers, Flash Attention, Sage Attention, Triton, DeepSpeed, insightface, accelerate, onnxruntime-gpu for Windows

1-click Installers zip file is here : https://www.patreon.com/posts/105023709

xFormers compiled by me for Windows (Python 3.10, 3.11 and 3.12) and Linux (Python 3.10 only)

Flash Attention compiled by me for Windows (Python 3.10, 3.11 and 3.12) and Linux (Python 3.10 only)

Sage Attention compiled by me for Linux (Python 3.10 only)

insightface compiled by me for Windows (Python 3.10, 3.11 and 3.12)

Pre-compiled Triton + Sage Attention + DeepSpeed installed for Windows

You must have pre-installed Python on your system manually

Tutorial for Python + CUDA 12.8 installation : https://youtu.be/DrhUHnYfwC0

ProCreations

posted an update about 8 hours ago

Post

476

Post of the Day

I’m fine-tuning Qwen 2.5-0.5B to be extremely good at math, using high-quality datasets and some smart training strategies.
The logs are looking really promising so far!

Expected release:
Tomorrow morning?
I’ll post as soon as it’s ready — stay tuned.

If you want faster updates or just wanna chat about it, come join my Discord:
https://discord.gg/EXsug2Ux29
(Heads up: we might ask a couple quick questions when you join — just making sure we keep the server safe.)

Also, check out one of the datasets we’re using:
ProCreations/SimpleMath

This project is also helping shape the future of IntellIte.
The insights and techniques we’re developing here — better dataset curation, fine-tuning tricks, and evaluation methods — will directly contribute to making IntellIte even sharper, faster, and more reliable, especially for math and reasoning tasks.

Big progress ahead. Can’t wait to share it with you all!

1 reply

prithivMLmods

posted an update 2 days ago

Post

1674

Bringing out style-intermixing adapters for Flux.Dev, including Aura Glow, Fallen Ink Art, Cardboard Paper Arts, Black & White Expressions, and Glitter Gem Touch. For more details, visit the model card of the LoRA. 🥳

╰┈➤Demo : prithivMLmods/FLUX-LoRA-DLC2 & prithivMLmods/FLUX-LoRA-DLC

╰┈➤ Adapters :
+ Aura Glow : strangerzonehf/2DAura-Flux
+ Fallen Ink Art : strangerzonehf/FallenArt-Flux
+ Black & White Expressions : strangerzonehf/BnW-Expressions-Flux
+ Glitter Gem Touch : strangerzonehf/Gem-Touch-LoRA-Flux
+ Cardboard Paper Arts v1 : strangerzonehf/Flux-Cardboard-Art-LoRA
+ Cardboard Paper Arts v2 : strangerzonehf/Cardboard-v2-Flux

╰┈➤ Pages :
- Repository Page :

strangerzonehf
- Collection : strangerzonehf/mixer-adp-042025-68095c365d9d1072c8d860be
- Flux Ultimate LoRA Collection : strangerzonehf/Flux-Ultimate-LoRA-Collection
- By prithivMLmods : @prithivMLmods

The best dimensions and inference settings for optimal results are as follows: A resolution of 1280 x 832 with a 3:2 aspect ratio is recommended for the best quality, while 1024 x 1024 with a 1:1 aspect ratio serves as the default option. For inference, the recommended number of steps ranges between 30 and 35 to achieve optimal output.

daavoo

posted an update 2 days ago

Post

1788

We have just released a new version of⭐https://github.com/mozilla-ai/any-agent ⭐exposing an API to be used in async contexts:

import asyncio
from any_agent import AgentConfig, AnyAgent, TracingConfig
from any_agent.tools import search_web

async def main():
    agent = await AnyAgent.create_async(
        "openai",
        AgentConfig(
            model_id="gpt-4.1-mini",
            instructions="You are the main agent. Use the other available agents to find an answer",
        ),
        managed_agents=[
            AgentConfig(
                name="search_web_agent",
                description="An agent that can search the web",
                model_id="gpt-4.1-nano",
                tools=[search_web]
            )
        ],
        tracing=TracingConfig()
    )

    await agent.run_async("Which Agent Framework is the best??")

if __name__ == "__main__":
    asyncio.run(main())

nicolay-r

posted an update 1 day ago

Post

925

🚀 Delighted to share a major milestone in adapting reasoning techniques for data collections augmentation!
Introducing bulk-chain 1.0.0 -- the first major release of a no-string API for adapting your LLM for Chain-of-Thought alike reasoning over records with large amount of parameters across large datasets.

⭐ Check it out: https://github.com/nicolay-r/bulk-chain

What’s new and why it matters:
📦 Fully no-string API for easy client deployment
🔥 Demos are now standalone projects:

Demos:
📺 bash / shell (dispatched): https://github.com/nicolay-r/bulk-chain-shell
📺 tksheet: https://github.com/nicolay-r/bulk-chain-tksheet-client

Using nlp-thirdgate to host the supported providers:
🌌 LLM providers: https://github.com/nicolay-r/nlp-thirdgate

Recently active users