Data Agents

Enterprise
community
Activity Feed

AI & ML interests

None defined yet.

Recent Activity

baptistecolleย  updated a model 10 days ago
data-agents/DataAgent-Llama-8B
baptistecolleย  published a model 10 days ago
data-agents/DataAgent-Llama-8B
baptistecolleย  published a model 10 days ago
data-agents/DataAgent-7B
View all activity

data-agents's activity

m-ricย 
posted an update 20 days ago
view post
Post
2682
New king of open VLMs: InternVL3 takes Qwen 2.5's crown! ๐Ÿ‘‘

InternVL have been a wildly successful series of model : and the latest iteration has just taken back their crown thanks to their superior, natively multimodal vision training pipeline.

โžก๏ธ Most of the vision language models (VLMs) these days are built like Frankenstein : take a good text-only Large Language Model (LLM) backbone, stitch a specific vision transformer (ViT) on top of it. Then the training is sequential ๐Ÿ”ข : 1. Freeze the LLM weights while you train the ViT only to work with the LLM part, then 2. Unfreeze all weights to train all weights in order to work together.

๐Ÿ’ซ The Shanghai Lab decided to challenge this paradigm and chose this approach that they call "native". For each of their model sizes, they still start from a good LLM (mostly Qwen-2.5 series, did I tell you I'm a huge fan of Qwen? โค๏ธ), and stitch the ViT, but they don't freeze anything : they train all weights together with interleaved text and image understanding data in a single pre-training phase ๐ŸŽจ.

They claim it results in more seamless interactions between modalities. And the results prove them right: they took the crown of top VLMs, at nearly all sizes, from their Qwen-2.5 parents. ๐Ÿ‘‘
  • 2 replies
ยท
thomwolfย 
posted an update 24 days ago
view post
Post
4650
If you've followed the progress of robotics in the past 18 months, you've likely noticed how robotics is increasingly becoming the next frontier that AI will unlock.

At Hugging Faceโ€”in robotics and across all AI fieldsโ€”we believe in a future where AI and robots are open-source, transparent, and affordable; community-built and safe; hackable and fun. We've had so much mutual understanding and passion working with the Pollen Robotics team over the past year that we decided to join forces!

You can already find our open-source humanoid robot platform Reachy 2 on the Pollen website and the Pollen community and people here on the hub at pollen-robotics

We're so excited to build and share more open-source robots with the world in the coming months!
  • 1 reply
ยท
m-ricย 
posted an update about 1 month ago
view post
Post
2318
๐Ÿš€ DeepSeek R1 moment has come for GUI agents: Rule-based Reinforcement Learning gives better results than SFT with 500x smaller datasets!

Traditionally (by which I mean "in the last few months"), GUI agents have been trained with supervised fine-tuning (SFT). This meant, collecting huge datasets of screen captures from people using computers, and using these to fine-tune your model. ๐Ÿ“š

๐Ÿ‘‰ But last week, a new paper introduced UI-R1, applying DeepSeek's R1-style rule-based reinforcement learning (RL) specifically to GUI action prediction tasks.
This is big news: with RL, maybe we could build good agents without the need for huge datasets.

UI-R1 uses a unified reward function that evaluates multiple responses from models, optimizing via policy algorithms like Group Relative Policy Optimization (GRPO).

Specifically, the reward function assesses:
๐ŸŽฏ Action type accuracy: Does the predicted action match the ground truth?
๐Ÿ“ Coordinate accuracy (specifically for clicks): Is the predicted click within the correct bounding box?
๐Ÿ“‘ Output format: Does the model clearly articulate both its reasoning and final action?

Using just 136 carefully selected mobile tasksโ€”compared to 76,000 tasks for larger models like OS-Atlasโ€”UI-R1 shows significant efficiency and improved performance:
๐Ÿ“ˆ Boosted action prediction accuracy from 76% to 89% on AndroidControl.
๐ŸŒ Outperformed larger, SFT-trained models (e.g., OS-Atlas-7B), demonstrating superior results with vastly fewer data points (136 tasks vs. 76K).
๐Ÿ” Enhanced adaptability and generalization, excelling even in out-of-domain scenarios.

The paper tests this RL-based method only in low-level GUI tasks. Could it generalize to more complex interactions? ๐Ÿง

Read the full paper here ๐Ÿ‘‰ UI-R1: Enhancing Action Prediction of GUI Agents by Reinforcement Learning (2503.21620)
thomwolfย 
posted an update about 1 month ago
view post
Post
3350
The new DeepSite space is really insane for vibe-coders
enzostvs/deepsite

With the wave of vibe-coding-optimized LLMs like the latest open-source DeepSeek model (version V3-0324), you can basically prompt out-of-the-box and create any app and game in one-shot.

It feels so powerful to me, no more complex framework or under-the-hood prompt engineering to have a working text-to-app tool.

AI is eating the world and *open-source* AI is eating AI itself!

PS: and even more meta is that the DeepSite app and DeepSeek model are both fully open-source code => time to start recursively improve?

PPS: you still need some inference hosting unless you're running the 600B param model at home, so check the very nice list of HF Inference Providers for this model: deepseek-ai/DeepSeek-V3-0324
  • 1 reply
ยท
freddyaboultonย 
posted an update about 1 month ago
view post
Post
1741
Ever wanted to share your AI creations with friends? โœจ

Screenshots are fine, but imagine letting others play with your ACTUAL model!

Introducing Gradio deep links ๐Ÿ”— - now you can share interactive AI apps, not just images.

Add a gr.DeepLinkButton to any app and get shareable URLs that let ANYONE experiment with your models.

m-ricย 
posted an update about 2 months ago
view post
Post
4947
smolagents now support vLLM! ๐Ÿฅณ

As one of the most popular local inference solutions, the community had been asking us to integrate vLLM: after a heavy refactoring of our LLM classes, we've just released smolagents 1.11.0, with a brand new VLLMModel class.

Go try it and tell us what you think!

https://github.com/huggingface/smolagents/blob/45b2c86857b7f7657daaa74e4d17d347e9e2c4a4/src/smolagents/models.py#L497
thomwolfย 
posted an update about 2 months ago
view post
Post
2872
We've kept pushing our Open-R1 project, an open initiative to replicate and extend the techniques behind DeepSeek-R1.

And even we were mind-blown by the results we got with this latest model we're releasing: โšก๏ธOlympicCoder ( open-r1/OlympicCoder-7B and open-r1/OlympicCoder-32B)

It's beating Claude 3.7 on (competitive) programming โ€“a domain Anthropic has been historically really strong atโ€“ and it's getting close to o1-mini/R1 on olympiad level coding with just 7B parameters!

And the best part is that we're open-sourcing all about its training dataset, the new IOI benchmark, and more in our Open-R1 progress report #3: https://huggingface.co/blog/open-r1/update-3

Datasets are are releasing:
- open-r1/codeforces
- open-r1/codeforces-cots
- open-r1/ioi
- open-r1/ioi-test-cases
- open-r1/ioi-sample-solutions
- open-r1/ioi-cots
- open-r1/ioi-2024-model-solutions
freddyaboultonย 
posted an update about 2 months ago
view post
Post
1952
Privacy matters when talking to AI! ๐Ÿ”‡

We've just added a microphone mute button to FastRTC in our latest update (v0.0.14). Now you control exactly what your LLM hears.

Plus lots more features in this release! Check them out:
https://github.com/freddyaboulton/fastrtc/releases/tag/0.0.14
lewtunย 
posted an update about 2 months ago
view post
Post
2525
Introducing OlympicCoder: a series of open reasoning models that can solve olympiad-level programming problems ๐Ÿง‘โ€๐Ÿ’ป

- 7B open-r1/OlympicCoder-7B
- 32B open-r1/OlympicCoder-32B

We find that OlympicCoder models outperform Claude 3.7 Sonnet, as well as others over 100x larger ๐Ÿ’ช

Together with the models, we are releasing:

๐Ÿ“ŠCodeForces-CoTs: new dataset of code problems from the most popular competitive coding platform, with R1 traces in C++ and Python open-r1/codeforces-cots

๐Ÿ† IOI'2024: a new benchmark of VERY hard programming problems where even frontier models struggle to match human performance open-r1/ioi

For links to the models and datasets, check out our latest progress report from Open R1: https://huggingface.co/blog/open-r1/update-3
  • 1 reply
ยท