Hi! Any plans to get this fully integrated into transformers?
Harpreet Sahota PRO
harpreetsahota
AI & ML interests
Deep learning, laguage models, prompt engineering, agents, multi-agent systems
Recent Activity
liked
a model
1 day ago
LLaVA-VL/llava_plus_v0_7b
liked
a model
1 day ago
ByteDance-Seed/UI-TARS-1.5-7B
liked
a model
3 days ago
Samsung/TinyClick
Organizations
harpreetsahota's activity
Post
2288
The Coachella of Computer Vision, CVPR, is right around the corner. In anticipation of the conference, I curated a dataset of the papers.
I'll have a technical blog post out tomorrow doing some analysis on the dataset, but I'm so hyped that I wanted to get it out to the community ASAP.
The dataset consists of the following fields:
- An image of the first page of the paper
-
-
-
-
-
-
-
-
Here's how I created the dataset ๐๐ผ
Generic code for building this dataset can be found [here](https://github.com/harpreetsahota204/CVPR-2024-Papers).
This dataset was built using the following steps:
- Scrape the CVPR 2024 website for accepted papers
- Use DuckDuckGo to search for a link to the paper's abstract on arXiv
- Use arXiv.py (python wrapper for the arXiv API) to extract the abstract and categories, and download the pdf for each paper
- Use pdf2image to save the image of paper's first page
- Use GPT-4o to extract keywords from the abstract
Voxel51/CVPR_2024_Papers
I'll have a technical blog post out tomorrow doing some analysis on the dataset, but I'm so hyped that I wanted to get it out to the community ASAP.
The dataset consists of the following fields:
- An image of the first page of the paper
-
title
: The title of the paper-
authors_list
: The list of authors-
abstract
: The abstract of the paper-
arxiv_link
: Link to the paper on arXiv-
other_link
: Link to the project page, if found-
category_name
: The primary category this paper according to [arXiv taxonomy](https://arxiv.org/category_taxonomy)-
all_categories
: All categories this paper falls into, according to arXiv taxonomy-
keywords
: Extracted using GPT-4oHere's how I created the dataset ๐๐ผ
Generic code for building this dataset can be found [here](https://github.com/harpreetsahota204/CVPR-2024-Papers).
This dataset was built using the following steps:
- Scrape the CVPR 2024 website for accepted papers
- Use DuckDuckGo to search for a link to the paper's abstract on arXiv
- Use arXiv.py (python wrapper for the arXiv API) to extract the abstract and categories, and download the pdf for each paper
- Use pdf2image to save the image of paper's first page
- Use GPT-4o to extract keywords from the abstract
Voxel51/CVPR_2024_Papers

posted
an
update
11 months ago
Post
2288
The Coachella of Computer Vision, CVPR, is right around the corner. In anticipation of the conference, I curated a dataset of the papers.
I'll have a technical blog post out tomorrow doing some analysis on the dataset, but I'm so hyped that I wanted to get it out to the community ASAP.
The dataset consists of the following fields:
- An image of the first page of the paper
-
-
-
-
-
-
-
-
Here's how I created the dataset ๐๐ผ
Generic code for building this dataset can be found [here](https://github.com/harpreetsahota204/CVPR-2024-Papers).
This dataset was built using the following steps:
- Scrape the CVPR 2024 website for accepted papers
- Use DuckDuckGo to search for a link to the paper's abstract on arXiv
- Use arXiv.py (python wrapper for the arXiv API) to extract the abstract and categories, and download the pdf for each paper
- Use pdf2image to save the image of paper's first page
- Use GPT-4o to extract keywords from the abstract
Voxel51/CVPR_2024_Papers
I'll have a technical blog post out tomorrow doing some analysis on the dataset, but I'm so hyped that I wanted to get it out to the community ASAP.
The dataset consists of the following fields:
- An image of the first page of the paper
-
title
: The title of the paper-
authors_list
: The list of authors-
abstract
: The abstract of the paper-
arxiv_link
: Link to the paper on arXiv-
other_link
: Link to the project page, if found-
category_name
: The primary category this paper according to [arXiv taxonomy](https://arxiv.org/category_taxonomy)-
all_categories
: All categories this paper falls into, according to arXiv taxonomy-
keywords
: Extracted using GPT-4oHere's how I created the dataset ๐๐ผ
Generic code for building this dataset can be found [here](https://github.com/harpreetsahota204/CVPR-2024-Papers).
This dataset was built using the following steps:
- Scrape the CVPR 2024 website for accepted papers
- Use DuckDuckGo to search for a link to the paper's abstract on arXiv
- Use arXiv.py (python wrapper for the arXiv API) to extract the abstract and categories, and download the pdf for each paper
- Use pdf2image to save the image of paper's first page
- Use GPT-4o to extract keywords from the abstract
Voxel51/CVPR_2024_Papers
Dope!

reacted to
jamarks's
post with ๐คฏ๐ค๐ฅ๐
about 1 year ago
Post
2237
FiftyOne Datasets <> Hugging Face Hub Integration!
As of yesterday's release of FiftyOne
You can now load Parquet datasets from the hub and have them converted directly into FiftyOne datasets. To load MNIST, for example:
You can also load FiftyOne datasets directly from the hub. Here's how you load the first 1000 samples from the VisDrone dataset:
And tying it all together, you can push your FiftyOne datasets directly to the hub:
Major thanks to @tomaarsen @davanstrien @severo @osanseviero and @julien-c for helping to make this happen!!!
Full documentation and details here: https://docs.voxel51.com/integrations/huggingface.html#huggingface-hub
As of yesterday's release of FiftyOne
0.23.8
, the FiftyOne open source library for dataset curation and visualization is now integrated with the Hugging Face Hub!You can now load Parquet datasets from the hub and have them converted directly into FiftyOne datasets. To load MNIST, for example:
pip install -U fiftyone
import fiftyone as fo
import fiftyone.utils.huggingface as fouh
dataset = fouh.load_from_hub(
"mnist",
format="ParquetFilesDataset",
classification_fields="label",
)
session = fo.launch_app(dataset)
You can also load FiftyOne datasets directly from the hub. Here's how you load the first 1000 samples from the VisDrone dataset:
import fiftyone as fo
import fiftyone.utils.huggingface as fouh
dataset = fouh.load_from_hub("jamarks/VisDrone2019-DET", max_samples=1000)
# Launch the App
session = fo.launch_app(dataset)
And tying it all together, you can push your FiftyOne datasets directly to the hub:
import fiftyone.zoo as foz
import fiftyone.utils.huggingface as fouh
dataset = foz.load_zoo_dataset("quickstart")
fouh.push_to_hub(dataset, "my-dataset")
Major thanks to @tomaarsen @davanstrien @severo @osanseviero and @julien-c for helping to make this happen!!!
Full documentation and details here: https://docs.voxel51.com/integrations/huggingface.html#huggingface-hub

reacted to
danielhanchen's
post with โค๏ธ
about 1 year ago
Post
Gemma QLoRA finetuning is now 2.4x faster and uses 58% less VRAM than FA2 through ๐ฆฅUnsloth! Had to rewrite our Cross Entropy Loss kernels to work on all vocab sizes, re-design our manual autograd engine to accept all activation functions, and more! I wrote all about our learnings in our blog post: https://unsloth.ai/blog/gemma.
Also have a Colab notebook with no OOMs, and has 2x faster inference for Gemma & how to merge and save to llama.cpp GGUF & vLLM: https://colab.research.google.com/drive/10NbwlsRChbma1v55m8LAPYG15uQv6HLo?usp=sharing
And uploaded 4bit pre-quantized versions for Gemma 2b and 7b: unsloth/gemma-7b-bnb-4bit unsloth/gemma-2b-bnb-4bit
Also have a Colab notebook with no OOMs, and has 2x faster inference for Gemma & how to merge and save to llama.cpp GGUF & vLLM: https://colab.research.google.com/drive/10NbwlsRChbma1v55m8LAPYG15uQv6HLo?usp=sharing
And uploaded 4bit pre-quantized versions for Gemma 2b and 7b: unsloth/gemma-7b-bnb-4bit unsloth/gemma-2b-bnb-4bit
from unsloth import FastLanguageModel
model, tokenzer = FastLanguageModel.from_pretrained("unsloth/gemma-7b")
model = FastLanguageModel.get_peft_model(model)
Post
google/gemma-7b-it is super good!
I wasn't convinced at first, but after vibe-checking it...I'm quite impressed.
I've got a notebook here, which is kind of a framework for vibe-checking LLMs.
In this notebook, I take Gemma for a spin on a variety of prompts:
โข [nonsensical tokens]( harpreetsahota/diverse-token-sampler
โข [conversation where I try to get some PII)( harpreetsahota/red-team-prompts-questions)
โข [summarization ability]( lighteval/summarization)
โข [instruction following]( harpreetsahota/Instruction-Following-Evaluation-for-Large-Language-Models
โข [chain of thought reasoning]( ssbuild/alaca_chain-of-thought)
I then used LangChain evaluators (GPT-4 as judge), and track everything in LangSmith. I made public links to the traces where you can inspect the runs.
I hope you find this helpful, and I am certainly open to feedback, criticisms, or ways to improve.
Cheers:
You can find the notebook here: https://colab.research.google.com/drive/1RHzg0FD46kKbiGfTdZw9Fo-DqWzajuoi?usp=sharing
I wasn't convinced at first, but after vibe-checking it...I'm quite impressed.
I've got a notebook here, which is kind of a framework for vibe-checking LLMs.
In this notebook, I take Gemma for a spin on a variety of prompts:
โข [nonsensical tokens]( harpreetsahota/diverse-token-sampler
โข [conversation where I try to get some PII)( harpreetsahota/red-team-prompts-questions)
โข [summarization ability]( lighteval/summarization)
โข [instruction following]( harpreetsahota/Instruction-Following-Evaluation-for-Large-Language-Models
โข [chain of thought reasoning]( ssbuild/alaca_chain-of-thought)
I then used LangChain evaluators (GPT-4 as judge), and track everything in LangSmith. I made public links to the traces where you can inspect the runs.
I hope you find this helpful, and I am certainly open to feedback, criticisms, or ways to improve.
Cheers:
You can find the notebook here: https://colab.research.google.com/drive/1RHzg0FD46kKbiGfTdZw9Fo-DqWzajuoi?usp=sharing

reacted to
merve's
post with ๐
about 1 year ago
Post
I've tried DoRA (https://arxiv.org/abs/2402.09353) with SDXL using PEFT, outputs are quite detailed ๐คฉ๐
as usual trained on lego dataset I compiled, I compared them with previously trained pivotal tuned model and the normal DreamBooth model before that ๐
Notebook by @linoyts https://colab.research.google.com/drive/134mt7bCMKtCYyYzETfEGKXT1J6J50ydT?usp=sharing
Integration to PEFT by @BenjaminB https://github.com/huggingface/peft/pull/1474 (more info in the PR)
as usual trained on lego dataset I compiled, I compared them with previously trained pivotal tuned model and the normal DreamBooth model before that ๐
Notebook by @linoyts https://colab.research.google.com/drive/134mt7bCMKtCYyYzETfEGKXT1J6J50ydT?usp=sharing
Integration to PEFT by @BenjaminB https://github.com/huggingface/peft/pull/1474 (more info in the PR)

reacted to
Wauplin's
post with ๐๐คโค๏ธ
about 1 year ago
Post
๐ Just released version 0.21.0 of the
Exciting updates include:
๐๏ธ Dataclasses everywhere for improved developer experience!
๐พ HfFileSystem optimizations!
๐งฉ
โจ
๐ Translated docs in Simplified Chinese and French!
๐ Breaking changes: simplified API for listing models and datasets!
Check out the full release notes for more details: Wauplin/huggingface_hub#4 ๐ค๐ป
huggingface_hub
Python library!Exciting updates include:
๐๏ธ Dataclasses everywhere for improved developer experience!
๐พ HfFileSystem optimizations!
๐งฉ
PyTorchHubMixin
now supports configs and safetensors!โจ
audio-to-audio
supported in the InferenceClient!๐ Translated docs in Simplified Chinese and French!
๐ Breaking changes: simplified API for listing models and datasets!
Check out the full release notes for more details: Wauplin/huggingface_hub#4 ๐ค๐ป

reacted to
clem's
post with โค๏ธ
about 1 year ago

posted
an
update
about 1 year ago
Post
google/gemma-7b-it is super good!
I wasn't convinced at first, but after vibe-checking it...I'm quite impressed.
I've got a notebook here, which is kind of a framework for vibe-checking LLMs.
In this notebook, I take Gemma for a spin on a variety of prompts:
โข [nonsensical tokens]( harpreetsahota/diverse-token-sampler
โข [conversation where I try to get some PII)( harpreetsahota/red-team-prompts-questions)
โข [summarization ability]( lighteval/summarization)
โข [instruction following]( harpreetsahota/Instruction-Following-Evaluation-for-Large-Language-Models
โข [chain of thought reasoning]( ssbuild/alaca_chain-of-thought)
I then used LangChain evaluators (GPT-4 as judge), and track everything in LangSmith. I made public links to the traces where you can inspect the runs.
I hope you find this helpful, and I am certainly open to feedback, criticisms, or ways to improve.
Cheers:
You can find the notebook here: https://colab.research.google.com/drive/1RHzg0FD46kKbiGfTdZw9Fo-DqWzajuoi?usp=sharing
I wasn't convinced at first, but after vibe-checking it...I'm quite impressed.
I've got a notebook here, which is kind of a framework for vibe-checking LLMs.
In this notebook, I take Gemma for a spin on a variety of prompts:
โข [nonsensical tokens]( harpreetsahota/diverse-token-sampler
โข [conversation where I try to get some PII)( harpreetsahota/red-team-prompts-questions)
โข [summarization ability]( lighteval/summarization)
โข [instruction following]( harpreetsahota/Instruction-Following-Evaluation-for-Large-Language-Models
โข [chain of thought reasoning]( ssbuild/alaca_chain-of-thought)
I then used LangChain evaluators (GPT-4 as judge), and track everything in LangSmith. I made public links to the traces where you can inspect the runs.
I hope you find this helpful, and I am certainly open to feedback, criticisms, or ways to improve.
Cheers:
You can find the notebook here: https://colab.research.google.com/drive/1RHzg0FD46kKbiGfTdZw9Fo-DqWzajuoi?usp=sharing

reacted to
philschmid's
post with โค๏ธ
over 1 year ago
Post
What's the best way to fine-tune open LLMs in 2024? Look no further! ๐ย I am excited to share โHow to Fine-Tune LLMs in 2024 with Hugging Faceโ using the latest research techniques, including Flash Attention, Q-LoRA, OpenAI dataset formats (messages), ChatML, Packing, all built with Hugging Face TRL. ๐
It is created for consumer-size GPUs (24GB) covering the full end-to-end lifecycle with:
๐กDefine and understand use cases for fine-tuning
๐ง๐ปโ๐ปย Setup of the development environment
๐งฎย Create and prepare dataset (OpenAI format)
๐๏ธโโ๏ธย Fine-tune LLM using TRL and the SFTTrainer
๐ฅย Test and evaluate the LLM
๐ย Deploy for production with TGI
๐ย https://www.philschmid.de/fine-tune-llms-in-2024-with-trl
Coming soon: Advanced Guides for multi-GPU/multi-Node full fine-tuning and alignment using DPO & KTO. ๐
It is created for consumer-size GPUs (24GB) covering the full end-to-end lifecycle with:
๐กDefine and understand use cases for fine-tuning
๐ง๐ปโ๐ปย Setup of the development environment
๐งฎย Create and prepare dataset (OpenAI format)
๐๏ธโโ๏ธย Fine-tune LLM using TRL and the SFTTrainer
๐ฅย Test and evaluate the LLM
๐ย Deploy for production with TGI
๐ย https://www.philschmid.de/fine-tune-llms-in-2024-with-trl
Coming soon: Advanced Guides for multi-GPU/multi-Node full fine-tuning and alignment using DPO & KTO. ๐

reacted to
abidlabs's
post with ๐คโค๏ธ
over 1 year ago
Post
๐๐ฆ๐๐ซ๐๐๐๐ ๐๐ฒ ๐๐ฎ๐ ๐ ๐ข๐ง๐ ๐
๐๐๐: ๐ญ๐ก๐ ๐๐ง๐ฌ๐ข๐๐ ๐๐ญ๐จ๐ซ๐ฒ ๐จ๐ ๐๐ฎ๐ซ ๐๐ญ๐๐ซ๐ญ๐ฎ๐ฉโ๐ฌ ๐๐๐ช๐ฎ๐ข๐ฌ๐ข๐ญ๐ข๐จ๐ง
In late 2021, our team of five engineers, scattered around the globe, signed the papers to shut down our startup, Gradio. For many founders, this would have been a moment of sadness or even bitter reflection.
But we were celebrating. We were getting acquired by Hugging Face!
We had been working very hard towards this acquisition, but for weeks, the acquisition had been blocked by a single investor. The more we pressed him, the more he buckled down, refusing to sign off on the acquisition. Until, unexpectedly, the investor conceded, allowing us to join Hugging Face.
For the first time since our acquisition, Iโm writing down the story in detail, hoping that it may shed some light into the obscure world of startup acquisitions and what decisions founders can make to improve their odds for a successful acquisition.
To understand how we got acquired by Hugging Face, you need to know why we started Gradio.
๐๐ง ๐๐๐๐ ๐๐ซ๐จ๐ฆ ๐ญ๐ก๐ ๐๐๐๐ซ๐ญ
Two years before the acquisition, in early 2019, I was working on a research project at Stanford. It was the third year of my PhD, and my labmates and I had trained a machine learning model that could predict patient biomarkers (such as whether patients had certain diseases or an implanted pacemaker) from an ultrasound image of their heart โ as well as a cardiologist.
Naturally, cardiologists were skeptical... read the rest of the story here: https://twitter.com/abidlabs/status/1745533306492588303
In late 2021, our team of five engineers, scattered around the globe, signed the papers to shut down our startup, Gradio. For many founders, this would have been a moment of sadness or even bitter reflection.
But we were celebrating. We were getting acquired by Hugging Face!
We had been working very hard towards this acquisition, but for weeks, the acquisition had been blocked by a single investor. The more we pressed him, the more he buckled down, refusing to sign off on the acquisition. Until, unexpectedly, the investor conceded, allowing us to join Hugging Face.
For the first time since our acquisition, Iโm writing down the story in detail, hoping that it may shed some light into the obscure world of startup acquisitions and what decisions founders can make to improve their odds for a successful acquisition.
To understand how we got acquired by Hugging Face, you need to know why we started Gradio.
๐๐ง ๐๐๐๐ ๐๐ซ๐จ๐ฆ ๐ญ๐ก๐ ๐๐๐๐ซ๐ญ
Two years before the acquisition, in early 2019, I was working on a research project at Stanford. It was the third year of my PhD, and my labmates and I had trained a machine learning model that could predict patient biomarkers (such as whether patients had certain diseases or an implanted pacemaker) from an ultrasound image of their heart โ as well as a cardiologist.
Naturally, cardiologists were skeptical... read the rest of the story here: https://twitter.com/abidlabs/status/1745533306492588303