Yacine Jernite's picture

Yacine Jernite

yjernite

AI & ML interests

Technical, community, and regulatory tools of AI governance @HuggingFace

Recent Activity

Organizations

Hugging Face's profile picture Society & Ethics's profile picture BigScience Workshop's profile picture GEM benchmark's profile picture BigScience Catalogue Data's profile picture BigScience Data's profile picture HF Task Exploration's profile picture HuggingFaceM4's profile picture BigCode's profile picture Stable Bias's profile picture Hugging Face H4's profile picture 🤗 H4 Community's profile picture BigCode Data's profile picture Stable Diffusion Bias Eval's profile picture Librarian Bots's profile picture Blog-explorers's profile picture Evaluating Social Impacts of Generative AI's profile picture llm-values's profile picture Bias Leaderboard Development's profile picture AI Energy Score's profile picture Journalists on Hugging Face's profile picture Social Post Explorers's profile picture Frugal AI Challenge's profile picture Open R1's profile picture Open Agents's profile picture Hugging Face AI & Society Team's profile picture

Posts 4

view post
Post
3117
Today in Privacy & AI Tooling - introducing a nifty new tool to examine where data goes in open-source apps on 🤗

HF Spaces have tons (100Ks!) of cool demos leveraging or examining AI systems - and because most of them are OSS we can see exactly how they handle user data 📚🔍

That requires actually reading the code though, which isn't always easy or quick! Good news: code LMs have gotten pretty good at automatic review, so we can offload some of the work - here I'm using Qwen/Qwen2.5-Coder-32B-Instruct to generate reports and it works pretty OK 🙌

The app works in three stages:
1. Download all code files
2. Use the Code LM to generate a detailed report pointing to code where data is transferred/(AI-)processed (screen 1)
3. Summarize the app's main functionality and data journeys (screen 2)
4. Build a Privacy TLDR with those inputs

It comes with a bunch of pre-reviewed apps/Spaces, great to see how many process data locally or through (private) HF endpoints 🤗

Note that this is a POC, lots of exciting work to do to make it more robust, so:
- try it: yjernite/space-privacy
- reach out to collab: yjernite/space-privacy

Articles 23

Article
12

Consent by Design: Approaches to User Data in Open AI Ecosystems