Nishith Jain

KingNish

AI & ML interests

AI is fun actually.

Recent Activity

liked a model 2 days ago
ICTNLP/LLaMA-Omni2-0.5B
liked a model 6 days ago
ibm-granite/granite-4.0-tiny-preview
liked a model 7 days ago
MiniMaxAI/MiniMax-Text-01

Organizations

Wikimedia
OpenGVLab
Blog-explorers
Multi🤖Transformers
The Collectionists
HelpingAI
ZeroGPU Explorers
Project Fluently
Poscye
INNOVA AI
Narra
Social Post Explorers
Cognitive Computations
Dev Mode Explorers
Stable Diffusion Community (Unofficial, Non-profit)
ONNX Community
Hugging Face Discord Community
Nerdy Face
None yet
Project R
Doge Face
Reasoning datasets competition

KingNish's activity

reacted to as-cle-bert's post with 🤗 11 days ago
Ever dreamt of ingesting into a vector DB that pile of CSVs, Word documents and presentations lying in some remote folders on your PC?🗂️
What if I told you that you can do it within three to six lines of code?🤯
Well, with my latest open-source project, ingest-anything (https://github.com/AstraBert/ingest-anything), you can take all your non-PDF files, convert them to PDF, extract their text, chunk, embed and load them into a vector database, all in one go!🚀
How? It's pretty simple!
📁 The input files are converted into PDF by PdfItDown (https://github.com/AstraBert/PdfItDown)
📑 The PDF text is extracted using LlamaIndex readers
🦛 The text is chunked with Chonkie
🧮 The chunks are embedded with Sentence Transformers models
🗄️ The embeddings are loaded into a Qdrant vector database

And you're done!✅
Curious to try it? Install it by running:

pip install ingest-anything

And you can start using it in your Python scripts!🐍
Don't forget to star it on GitHub and let me know if you have any feedback! ➡️ https://github.com/AstraBert/ingest-anything
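
The chunk, embed and load steps described in this post can be sketched in plain Python. This is only an illustration of the data flow, not ingest-anything's actual API: `chunk_text`, `toy_embed`, `InMemoryStore` and `ingest` are hypothetical stand-ins for Chonkie, a Sentence Transformers model, Qdrant and the library's entry point, and the hashing "embedding" carries no semantic meaning.

```python
import hashlib
import math
from dataclasses import dataclass, field


def chunk_text(text: str, max_words: int = 50) -> list[str]:
    """Split text into chunks of at most max_words words (stand-in for a real chunker)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def toy_embed(chunk: str, dim: int = 8) -> list[float]:
    """Hash each word into one of dim buckets and normalise (stand-in for an embedding model)."""
    vec = [0.0] * dim
    for word in chunk.lower().split():
        bucket = int(hashlib.sha256(word.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


@dataclass
class InMemoryStore:
    """Tiny in-memory vector store (stand-in for a vector database)."""
    records: list[tuple[list[float], str]] = field(default_factory=list)

    def upsert(self, vector: list[float], payload: str) -> None:
        self.records.append((vector, payload))

    def search(self, query_vector: list[float], top_k: int = 3) -> list[str]:
        # Vectors are unit length, so the dot product equals cosine similarity.
        scored = [
            (sum(a * b for a, b in zip(query_vector, vec)), payload)
            for vec, payload in self.records
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [payload for _, payload in scored[:top_k]]


def ingest(text: str, store: InMemoryStore, max_words: int = 50) -> int:
    """Chunk the text, embed every chunk and load it into the store; return the chunk count."""
    chunks = chunk_text(text, max_words)
    for chunk in chunks:
        store.upsert(toy_embed(chunk), chunk)
    return len(chunks)
```

In a real pipeline each stand-in would be swapped for the corresponding library, while the chunk, embed, upsert order stays the same.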
reacted to AdinaY's post with 🔥 14 days ago
MAGI-1 🪄 the autoregressive diffusion video model, released by Sand AI

sand-ai/MAGI-1

✨ 24B with Apache 2.0
✨ Strong temporal consistency
✨ Benchmark-topping performance
reacted to fdaudens's post with 🔥 14 days ago
reacted to albertvillanova's post with 🤗 15 days ago
smolagents v1.14.0 is out! 🚀
🔌 MCPClient: A sleek new client for connecting to remote MCP servers, making integrations more flexible and scalable.
🪨 Amazon Bedrock: Native support for Bedrock-hosted models.
SmolAgents is now more powerful, flexible, and enterprise-ready. 💼

Full release 👉 https://github.com/huggingface/smolagents/releases/tag/v1.14.0
#smolagents #LLM #AgenticAI
reacted to seawolf2357's post with 🔥 18 days ago
📚 Papers Leaderboard - See the Latest AI Research Trends at a Glance! ✨

Hello, AI research community! Today I'm introducing a new tool for exploring research papers. Papers Leaderboard is an open-source dashboard that makes it easy to find and filter the latest AI research papers.

Heartsync/Papers-Leaderboard

🌟 Key Features

Date Filtering: View only papers published within a specific timeframe (from May 5, 2023 to present)
Title Search: Quickly find papers containing your keywords of interest
Abstract Search: Explore paper content more deeply by searching for keywords within abstracts
Automatic Updates: The database is updated with the latest papers every hour

💡 How to Use It?

Select a start date and end date
Enter keywords you want to find in titles or abstracts
Adjust the maximum number of search results for abstract searches
Results are displayed neatly in table format
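
The filtering steps listed above could look roughly like this in code. It is a guess at the logic, not the dashboard's actual implementation: the `Paper` record and `filter_papers` function are hypothetical names, keyword matching is assumed to be a case-insensitive substring test, and the result cap is applied to every search rather than only abstract searches.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Paper:
    title: str
    abstract: str
    published: date


def filter_papers(
    papers: list[Paper],
    start: date,
    end: date,
    title_keyword: str = "",
    abstract_keyword: str = "",
    max_results: int = 50,
) -> list[Paper]:
    """Keep papers inside [start, end] whose title and abstract contain the given keywords."""
    title_kw = title_keyword.lower()
    abstract_kw = abstract_keyword.lower()
    matches = [
        p for p in papers
        if start <= p.published <= end
        and title_kw in p.title.lower()        # an empty keyword matches everything
        and abstract_kw in p.abstract.lower()
    ]
    # Newest papers first, then cap the number of rows shown in the table.
    matches.sort(key=lambda p: p.published, reverse=True)
    return matches[:max_results]
```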
reacted to stefan-french's post with 😎 22 days ago
reacted to as-cle-bert's post with 🔥 27 days ago
Llama-4 is out and I couldn't resist cooking something with it... So I came up with LlamaResearcher (https://llamaresearcher.com), your deep-research AI companion!🔎

The workflow behind LlamaResearcher is simple:
💬 You submit a query
🛡️ Your query is evaluated by the Llama 3 guard model, which classifies it as safe or unsafe
🧠 If your query is safe, it is routed to the Researcher Agent
⚙️ The Researcher Agent expands the query into three sub-queries, which are used to search the web
🌐 The web is searched for each of the sub-queries
📊 The retrieved information is evaluated for relevance to your original query
✍️ The Researcher Agent produces an essay based on the information it gathered, taking care to reference its sources
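
The workflow above can be sketched as a single function. This is an assumption-laden outline rather than LlamaResearcher's real code: `research` and its callable parameters (`is_safe`, `expand`, `search`, `is_relevant`, `write_essay`) are hypothetical placeholders for the guard model, the Researcher Agent's query expansion, the web search tool, the relevance check and the essay writer.

```python
from typing import Callable


def research(
    query: str,
    is_safe: Callable[[str], bool],
    expand: Callable[[str], list[str]],
    search: Callable[[str], list[str]],
    is_relevant: Callable[[str, str], bool],
    write_essay: Callable[[str, list[str]], str],
) -> str:
    """Run the guard, expand, search, filter and write steps in order."""
    # 1. The guard model decides whether the query may proceed.
    if not is_safe(query):
        return "Query rejected by the guard model."
    # 2. Expand the query into at most three sub-queries.
    sub_queries = expand(query)[:3]
    # 3. Search the web once per sub-query and pool the results.
    results = [hit for sub_query in sub_queries for hit in search(sub_query)]
    # 4. Keep only results judged relevant to the original query.
    relevant = [r for r in results if is_relevant(query, r)]
    # 5. Write the essay from the surviving sources.
    return write_essay(query, relevant)
```

Passing the steps in as callables keeps the control flow visible and lets each one be replaced by a model or API call.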

The agent itself is also built with easy-to-use and intuitive blocks:
🦙 LlamaIndex provides the agentic architecture and the integrations with the language models
⚡ Groq makes Llama-4 available with its lightning-fast inference
🔎 Linkup allows the agent to deep-search the web and provides sourced answers
💪 FastAPI does the heavy lifting, wrapping everything in an elegant API
⏱️ Redis is used for API rate limiting
🎨 Gradio creates a simple but powerful user interface
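
The post does not say how the rate limiting is implemented. One common pattern with Redis is a fixed-window counter: increment a per-client key and let it expire when the window ends. The sketch below mimics that pattern with an in-memory dictionary so it runs without a Redis server; `FixedWindowLimiter` and its parameters are hypothetical names, not part of LlamaResearcher.

```python
import time
from typing import Callable


class FixedWindowLimiter:
    """Allow at most `limit` requests per client within each window of `window_seconds`."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        # client_id -> (window_start, request_count); a Redis key with a TTL would play this role.
        self._counters: dict[str, tuple[float, int]] = {}

    def allow(self, client_id: str) -> bool:
        now = self.clock()
        start, count = self._counters.get(client_id, (now, 0))
        if now - start >= self.window:
            # The previous window has expired, so start a fresh one.
            start, count = now, 0
        if count >= self.limit:
            self._counters[client_id] = (start, count)
            return False
        self._counters[client_id] = (start, count + 1)
        return True
```

An API handler would call `allow(client_id)` before doing any work and return an HTTP 429 response when it yields `False`.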

Special mention also to Lovable, which helped me build the first draft of the landing page for LlamaResearcher!💖

If you're curious and you want to try LlamaResearcher, you can, completely free and without a subscription, for 30 days from now ➡️ https://llamaresearcher.com
And if you're like me, and you like getting your hands dirty with code and building stuff on your own machine, I have good news: this is all open-source, fully reproducible locally and Docker-ready🐋
Just go to the GitHub repo: https://github.com/AstraBert/llama-4-researcher and don't forget to star it, if you find it useful!⭐

As always, have fun and feel free to leave your feedback✨
reacted to merterbak's post with 👀 29 days ago
reacted to abidlabs's post with ❤️ about 1 month ago
JOURNEY TO 1 MILLION DEVELOPERS

5 years ago, we launched Gradio as a simple Python library to let researchers at Stanford easily demo computer vision models with a web interface.

Today, Gradio is used by >1 million developers each month to build and share AI web apps. This includes some of the most popular open-source projects of all time, like Automatic1111, Fooocus, Oobabooga's Text WebUI, Dall-E Mini, and LLaMA-Factory.

How did we get here? How did Gradio keep growing in the very crowded field of open-source Python libraries? I get this question a lot from folks who are building their own open-source libraries. This post distills some of the lessons that I have learned over the past few years:

1. Invest in good primitives, not high-level abstractions
2. Embed virality directly into your library
3. Focus on a (growing) niche
4. Your only roadmap should be rapid iteration
5. Maximize ways users can consume your library's outputs

1. Invest in good primitives, not high-level abstractions

When we first launched Gradio, we offered only one high-level class (gr.Interface), which created a complete web app from a single Python function. We quickly realized that developers wanted to create other kinds of apps (e.g. multi-step workflows, chatbots, streaming applications), but as we started listing out the apps users wanted to build, we realized what we needed to do:

Read the rest here: https://x.com/abidlabs/status/1907886
reacted to hexgrad's post with 👀 about 1 month ago
To Meta AI Research: I would like to fold ylacombe/expresso into the training mix of an Apache TTS model series. Can you relax the Expresso dataset license to CC-BY or more permissive?

Barring that, can I have an individual exception to train on the materials and distribute trained Apache models, without direct redistribution of the original files? Thanks!

CC (Expresso paper authors whose handles I could find on HF) @wnhsu @adavirro @bowenshi @itaigat @TalRemez @JadeCopet @hassid @felixkreuk @adiyoss @edupoux
reacted to clem's post with 🔥 about 1 month ago
Before 2020, most of the AI field was open and collaborative. For me, that was the key factor that accelerated scientific progress and made the impossible possible: just look at the "T" in ChatGPT, which comes from the Transformer architecture openly shared by Google.

Then came the myth that AI was too dangerous to share, and companies started optimizing for short-term revenue. That led many major AI labs and researchers to stop sharing and collaborating.

With OAI and sama now saying they're willing to share open weights again, we have a real chance to return to a golden age of AI progress and democratization, powered by openness and collaboration, in the US and around the world.

This is incredibly exciting. Let's go, open science and open-source AI!
reacted to burtenshaw's post with ❤️ about 2 months ago
The Hugging Face Agents Course now includes three major agent frameworks!

🔗 agents-course

This includes LlamaIndex, LangChain, and our very own smolagents. We've worked to integrate the three frameworks in distinctive ways so that learners can reflect on when and where to use each.

This also means that you can follow the course if you're already familiar with one of these frameworks, and soak up some of the fundamental knowledge in earlier units.

Hopefully, this makes the agents course open to as many people as possible.
reacted to chansung's post with ❤️ about 2 months ago
Mistral AI's Small 3.1 24B is not only free for commercial use but also the best model for single-GPU deployment.

I packed up all the information you need to know in a single picture. Hope this helps! :)
reacted to fdaudens's post with 🔥 about 2 months ago
🔊 Meet Orpheus: A breakthrough open-source TTS model that matches human-level speech with empathy & emotion.
- Available in 4 sizes (150M-3B parameters)
- Ultra-fast streaming
- Zero-shot voice cloning
- Apache 2.0 license

canopylabs/orpheus-tts-67d9ea3f6c05a941c06ad9d2
reacted to mlabonne's post with 🚀 about 2 months ago
✂️ Gemma 3 Abliterated

I noticed that Gemma 3 was much more resilient to refusal removal than other models like Qwen 2.5.

I experimented with different recipes and improved the abliteration technique I wrote about last year.

It's still experimental but the refusal rate is super low in my tests. Enjoy!

mlabonne/gemma-3-4b-it-abliterated
mlabonne/gemma-3-12b-it-abliterated
mlabonne/gemma-3-27b-it-abliterated

reacted to KaiChen1998's post with 🔥 about 2 months ago
📢 Our EMOVA paper has been accepted by CVPR 2025, and we are glad to release all resources, including code (training & inference), datasets (training & evaluation), and checkpoints (EMOVA-3B/7B/72B)!

🤗 EMOVA is a novel end-to-end omni-modal LLM that can see, hear and speak. Given omni-modal (i.e., textual, visual and speech) inputs, EMOVA can generate both textual and speech responses with vivid emotional controls by utilizing the speech decoder and a style controller.

✨ EMOVA Highlights
✅ State-of-the-art omni-modality: EMOVA achieves results comparable to SoTA on both vision-language and speech benchmarks simultaneously.
✅ Device adaptation: our codebase supports training/inference on both NVIDIA GPUs (e.g., A800 & H20) and Ascend NPUs (e.g., 910B3)!
✅ Modular design: we integrate multiple implementations of vision encoder, vision projector, and language model, even including the most recent DeepSeekMoE-tiny!

🔥 You are all welcome to try and star!
- Project page: https://emova-ollm.github.io/
- Github: https://github.com/emova-ollm/EMOVA
- Demo: Emova-ollm/EMOVA-demo
reacted to m-ric's post with 🔥🤗 about 2 months ago
smolagents now supports vLLM! 🥳

vLLM is one of the most popular local inference solutions, and the community had been asking us to integrate it: after a heavy refactoring of our LLM classes, we've just released smolagents 1.11.0, with a brand new VLLMModel class.

Go try it and tell us what you think!

https://github.com/huggingface/smolagents/blob/45b2c86857b7f7657daaa74e4d17d347e9e2c4a4/src/smolagents/models.py#L497
reacted to clem's post with 🤗 about 2 months ago
We just crossed 1,500,000 public models on Hugging Face (and 500k spaces, 330k datasets, 50k papers). One new repository is created every 15 seconds. Congratulations all!
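
As a quick sanity check on the rate quoted above, one repository every 15 seconds works out as follows. This is simple arithmetic that assumes the rate holds constant around the clock, which the post does not claim.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

seconds_per_repo = 15
repos_per_day = SECONDS_PER_DAY // seconds_per_repo  # 5,760 repositories per day
repos_per_year = repos_per_day * 365                 # 2,102,400 repositories per year

print(f"{repos_per_day:,} per day, {repos_per_year:,} per year")
```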
reacted to AdinaY's post with 🔥 about 2 months ago
Open Sora 2.0 is out 🔥
hpcai-tech/open-sora-20-67cfb7efa80a73999ccfc2d5
✨ 11B with Apache 2.0
✨ Low training cost - $200k
✨ Open weights, code and training workflow