
Major TOM community
AI & ML interests: Earth Observation Datasets
Major-TOM's activity

prithivMLmods posted an update 2 days ago
Here is the updated version with the 20,000+ entry sampled dataset for the Watermark Filter / Content Moderation models (Food25, Weather, Watermark, Marathi/Hindi Sign Language Detection), post-trained from the SigLIP2 patch16-224 base models, now with mixed aspect ratios for better performance and reduced misclassification. 🔥
Models :
➮ Watermark-Detection : prithivMLmods/Watermark-Detection-SigLIP2
⌨︎ Watermark Detection & Batch Image Processing Experimentals, Colab Notebook : https://colab.research.google.com/drive/1mlQrSsSjkGimUt0VyRi3SoWMv8OMyvw3?usp=drive_link
➮ Weather-Image-Classification : prithivMLmods/Weather-Image-Classification
➮ TurkishFoods-25 : prithivMLmods/TurkishFoods-25
➮ Marathi-Sign-Language-Detection : prithivMLmods/Marathi-Sign-Language-Detection
➮ Hindi-Sign-Language-Detection : prithivMLmods/Hindi-Sign-Language-Detection
Datasets :
Watermark : qwertyforce/scenery_watermarks
Weather : prithivMLmods/WeatherNet-05-18039
Turkish Foods 25 : yunusserhat/TurkishFoods-25
Marathi Sign Language : VinayHajare/Marathi-Sign-Language
Hindi Sign Language : Vedant3907/Hindi-Sign-Language-Dataset
Collection : prithivMLmods/content-filters-siglip2-vit-68197e3357d4de18fb3b4d2b
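For quick testing, here is a minimal usage sketch (untested) with the transformers image-classification pipeline; the image path is a placeholder and the label set comes from each checkpoint's config, so any of the models above can be swapped in:

```python
# Sketch: run one of the SigLIP2-based content-filter classifiers on a local image.
from transformers import pipeline

classifier = pipeline(
    "image-classification",
    model="prithivMLmods/Watermark-Detection-SigLIP2",  # or any checkpoint listed above
)

# "sample.jpg" is a placeholder path; a URL or PIL.Image also works.
for pred in classifier("sample.jpg"):
    print(f"{pred['label']}: {pred['score']:.3f}")
```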

prithivMLmods posted an update 5 days ago
The new versions of the Midjourney Mix adapters have been dropped in the Stranger Zone HF org. These adapters excel at studio-lighting portraits and painterly styles, trained in the style of strangerzonehf/Flux-Midjourney-Mix2-LoRA. They leverage 24-bit color synthetic images generated from Midjourney v6 to achieve high-quality image reproducibility, support adaptable aspect ratios, and use Flux.1 as the base model. 🥳
Models [ ⌗ ]
> Flux-Midjourney-Painterly-LoRA : strangerzonehf/Flux-Midjourney-Painterly-LoRA
> Flux-Midjourney-Studio-LoRA : strangerzonehf/Flux-Midjourney-Studio-LoRA
> Collection : strangerzonehf/midjourney-mix-3-ft-flux1-dev-68165d58a2a08025852d63f3
> Space : prithivMLmods/FLUX-LoRA-DLC2
Recommended dimensions and inference settings: a resolution of 1280 x 832 (3:2 aspect ratio) gives the best quality, while 1024 x 1024 (1:1) is the default option. For inference, 30 to 35 steps are recommended for optimal output.
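As a rough sketch (untested) of those settings in diffusers, assuming FLUX.1-dev as the base checkpoint (per the collection name) and a GPU with enough memory; the prompt is a placeholder:

```python
# Sketch: load a Midjourney-Mix LoRA on top of Flux.1 and render at the recommended size.
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",  # assumed base model
    torch_dtype=torch.bfloat16,
).to("cuda")
pipe.load_lora_weights("strangerzonehf/Flux-Midjourney-Painterly-LoRA")

image = pipe(
    "painterly portrait, studio lighting",  # placeholder prompt
    width=1280,
    height=832,                # 3:2 aspect ratio, as recommended above
    num_inference_steps=30,    # 30-35 steps recommended
    guidance_scale=3.5,
).images[0]
image.save("midjourney_mix.png")
```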
Post
Forget everything you know about transcription models - NVIDIA's parakeet-tdt-0.6b-v2 changed the game for me!
Just tested it with Steve Jobs' Stanford speech and was speechless (pun intended). The video isn’t sped up.
3 things that floored me:
- Transcription took just 10 seconds for a 15-min file
- Got a CSV with perfect timestamps, punctuation & capitalization
- Stunning accuracy (correctly captured "Reed College" and other specifics)
NVIDIA also released a demo where you can click any transcribed segment to play it instantly.
The improvement is significant: number 1 on the ASR Leaderboard, 6% error rate (best in class) with complete commercial freedom (cc-by-4.0 license).
Time to update those Whisper pipelines! H/t @Steveeeeeeen for the finding!
Model: nvidia/parakeet-tdt-0.6b-v2
Demo: nvidia/parakeet-tdt-0.6b-v2
ASR Leaderboard: hf-audio/open_asr_leaderboard
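To reproduce this locally, here is a minimal sketch (untested) using the NeMo toolkit the model is built on; the audio path is a placeholder, a 16 kHz mono WAV is assumed, and the timestamp layout follows the model card so it may vary across NeMo versions:

```python
# Sketch: transcribe a file with parakeet-tdt-0.6b-v2 and print segment timestamps.
import nemo.collections.asr as nemo_asr

asr_model = nemo_asr.models.ASRModel.from_pretrained(
    model_name="nvidia/parakeet-tdt-0.6b-v2"
)

# "speech.wav" is a placeholder for your own audio file.
output = asr_model.transcribe(["speech.wav"], timestamps=True)
print(output[0].text)
for seg in output[0].timestamp["segment"]:
    print(seg["start"], seg["end"], seg["segment"])
```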
Post

Check if there's one in your city here: LeRobot-worldwide-hackathon/worldwide-map
Post
The meta-llama org just crossed 40,000 followers on Hugging Face. Grateful for all their impact on the field, sharing the Llama weights openly and much more!
We need more of this from all other big tech to make the AI more open, collaborative and beneficial to all!

Post
PSA for anyone using Nymbo/Nymbo_Theme or Nymbo/Nymbo_Theme_5 in a Gradio space: both of these themes have been updated to fix some of the long-standing inconsistencies since the transition to Gradio v5. Textboxes are no longer bright green, and in-line code is readable now! Both themes are now visually identical across versions.
If your space is already using one of these themes, you just need to restart your space to get the latest version. No code changes needed.
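For anyone starting a new space, a minimal sketch (untested) of pointing a Gradio app at one of these Hub-hosted themes by passing its repo id as a string:

```python
# Sketch: use a theme hosted on the Hugging Face Hub in a Gradio Blocks app.
import gradio as gr

with gr.Blocks(theme="Nymbo/Nymbo_Theme") as demo:
    gr.Textbox(label="Input")                      # no longer bright green
    gr.Code(label="Snippet", language="python")    # in-line code stays readable

demo.launch()
```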
Post
I just gave my chatbots a massive upgrade: they can now generate audio from text, modify images — you name it. Here’s how:
The Gradio team shipped MCP support. That means you can plug any AI app built with it into Claude or Cursor using the Model Context Protocol (MCP) — think of it like a USB port for LLMs.
I put it to the test:
- Whipped up a quick text-to-speech app with Kokoro on HF (with an LLM riding shotgun, naturally)
- Added "mcp_server=True" in the code
- Connected it to Claude
Now I can generate audio from any text. The possibilities are next-level: you can potentially plug any of the 500K+ AI apps on Hugging Face to your favorite LLM.
Is this the new UI for AI?
- My tts app (feel free to use/duplicate it): fdaudens/kokoro-mcp
- Blog post: https://huggingface.co/blog/gradio-mcp
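Here is roughly what that looks like in code, a minimal sketch (untested) where the TTS function itself is a placeholder; the relevant part is the mcp_server flag, which needs a recent Gradio release (e.g. installed with `pip install "gradio[mcp]"`):

```python
# Sketch: any Gradio app can be exposed as an MCP server with one launch() flag.
import gradio as gr

def text_to_speech(text: str) -> str:
    """Placeholder: call a TTS model (e.g. Kokoro) and return the path to a WAV file."""
    raise NotImplementedError("plug your TTS model in here")

demo = gr.Interface(fn=text_to_speech, inputs="text", outputs="audio")

# mcp_server=True exposes the app's functions as MCP tools for clients like Claude or Cursor.
demo.launch(mcp_server=True)
```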
Post
Expansion of the Global and Dense Open Embeddings Dataset of Earth 🌍
We updated our previous embeddings release with three new models of the Major TOM embeddings dataset: MMEarth, DeCUR-S2, and DeCUR-S1, developed in collaboration with CloudFerro S.A., asterisk labs, and Φ-lab, European Space Agency (ESA). Together with @mikonvergence and Jędrzej S. Bojanowski, we extend the open-access collection of Copernicus embeddings built at global scale, providing dense coverage across the entire acquisition area of the Sentinel-1 and Sentinel-2 sensors.
Total embedding resources after the update:
- 51 TB of AI-embeddings generated from processed Sentinel data,
- over 40 billion embedding vectors,
- processing of 147 TB of raw satellite data,
- analysis covering more than 15 million Sentinel-1 and Sentinel-2 scenes and more than 16 trillion pixels.
This project delivers open and free vectorized expansions of Major TOM datasets available on CREODIAS and Hugging Face, setting a new standard for embedding releases and enabling lightweight, scalable ingestion of Earth Observation (EO) data for countless applications.
Datasets:
Major-TOM/Core-S2L2A-MMEarth
Major-TOM/Core-S2L1C-DeCUR
Major-TOM/Core-S1RTC-DeCUR
#EarthObservation #AI #CloudFerro #asterisklabs #ESA
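A minimal sketch (untested) of streaming a few records from one of these embedding datasets with the datasets library; the split name and column layout are assumptions, so check the dataset cards:

```python
# Sketch: stream embedding rows without downloading the full (multi-TB) dataset.
from datasets import load_dataset

ds = load_dataset("Major-TOM/Core-S2L2A-MMEarth", split="train", streaming=True)

for i, row in enumerate(ds):
    print(row.keys())  # inspect the available fields (grid cell, timestamp, embedding, ...)
    if i >= 2:
        break
```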
Post
Want to know which AI models are least likely to hallucinate — and how to keep yours from spiking hallucinations by 20%?
A new benchmark called Phare, by Giskard, tested leading models across multiple languages, revealing three key findings:
1️⃣ Popular models aren't necessarily factual. Some models ranking highest in user satisfaction benchmarks like LMArena are actually more prone to hallucination.
2️⃣ The way you ask matters - a lot. When users present claims confidently ("My teacher said..."), models are 15% less likely to correct misinformation vs. neutral framing ("I heard...").
3️⃣ Telling models to "be concise" can increase hallucination by up to 20%.
What's also cool is that the full dataset is public, so use it to test your own models or dive deeper into the results! H/t @davidberenstein1957 for the link.
- Study: https://www.giskard.ai/knowledge/good-answers-are-not-necessarily-factual-answers-an-analysis-of-hallucination-in-leading-llms
- Leaderboard: https://phare.giskard.ai/
- Dataset: giskardai/phare

prithivMLmods posted an update 9 days ago
Dropping downstream-task models with newly initialized parameters and weights for domain-specific image classification post-training, based on the SigLIP-2 models: Patch-16/224, Patch-16/256, and Patch-32/256. For more details, please refer to the respective model cards. 🤗
+ watermark detection : prithivMLmods/Watermark-Detection-SigLIP2
+ resisc45 : prithivMLmods/RESISC45-SigLIP2
+ pacs dg : prithivMLmods/PACS-DG-SigLIP2
+ 3d printed or not : prithivMLmods/3D-Printed-Or-Not-SigLIP2
+ formula or text : prithivMLmods/Formula-Text-Detection
Categorizing Un-Safe Content :
- explicit content patch16 256 : prithivMLmods/siglip2-x256-explicit-content
- explicit content patch32 256 : prithivMLmods/siglip2-x256p32-explicit-content
Collection :
> SigLIP2 Content Filters 042025 Final : https://huggingface.co/collections/prithivMLmods/siglip2-content-filters-04202-final-680fe4aa1a9d589bf2c915ff
> SigLIP2 : google/siglip2-67b5dcef38c175486e240107
> SigLIP2 Multilingual Vision-Language Encoders : https://arxiv.org/pdf/2502.14786
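A minimal sketch (untested) of calling one of these checkpoints directly with the Auto classes, assuming a standard image-classification head as described in the model cards; the image path is a placeholder:

```python
# Sketch: score an image with one of the SigLIP2-based content filters.
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForImageClassification

model_id = "prithivMLmods/siglip2-x256-explicit-content"
processor = AutoImageProcessor.from_pretrained(model_id)
model = AutoModelForImageClassification.from_pretrained(model_id)

image = Image.open("sample.jpg").convert("RGB")  # placeholder path
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    probs = model(**inputs).logits.softmax(dim=-1)[0]

for idx, p in enumerate(probs):
    print(f"{model.config.id2label[idx]}: {p:.3f}")
```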

blumenstiel authored 3 papers 10 days ago
Prithvi-EO-2.0: A Versatile Multi-Temporal Foundation Model for Earth Observation Applications (arXiv:2412.02732)
Beyond the Visible: Multispectral Vision-Language Learning for Earth Observation (arXiv:2503.15969)
TerraMind: Large-Scale Generative Multimodality for Earth Observation (arXiv:2504.11171)

mikonvergence updated a Space 11 days ago

mkluczek published a dataset 11 days ago

mkluczek updated a dataset 11 days ago

mkluczek updated a dataset 13 days ago

mikonvergence updated a dataset 13 days ago