amazon-sagemaker-community (Amazon SageMaker Community)

philschmid

posted an update 10 days ago

Gemini 2.5 Flash is here! We are excited to launch our first hybrid reasoning Gemini model. In 2.5 Flash, developers can turn thinking off.

**TL;DR:**
- 🧠 Controllable "Thinking" with a thinking budget of up to 24k tokens
- 🌌 1 million token multimodal input context for text, image, video, audio, and PDF
- 🛠️ Function calling, structured output, Google Search & code execution
- 🏦 $0.15 per 1M input tokens; $0.60 (thinking off) or $3.50 (thinking on) per 1M output tokens (thinking tokens are billed as output tokens)
- 💡 Knowledge cutoff of January 2025
- 🚀 Rate limits: free tier at 10 RPM and 500 req/day
- 🏅 Outperforms 2.0 Flash on every benchmark

Try it ⬇️
https://aistudio.google.com/prompts/new_chat?model=gemini-2.5-flash-preview-04-17
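
To make the pricing above concrete, here is a minimal Python sketch of a per-request cost estimate. The prices come from the post; the function name `estimate_cost` and its parameters are hypothetical, and it assumes that with thinking on, all output tokens (including thinking tokens) are billed at the higher rate.

```python
# USD per 1M tokens, as quoted in the post.
INPUT_PRICE = 0.15
OUTPUT_PRICE_THINKING_OFF = 0.60
OUTPUT_PRICE_THINKING_ON = 3.50


def estimate_cost(input_tokens, output_tokens, thinking_tokens=0, thinking=False):
    """Estimate the USD cost of one request (hypothetical helper).

    Assumption: when thinking is on, thinking tokens are added to the
    output tokens and the whole sum is billed at the thinking-on rate.
    """
    if not thinking and thinking_tokens:
        raise ValueError("thinking_tokens must be 0 when thinking is off")
    rate = OUTPUT_PRICE_THINKING_ON if thinking else OUTPUT_PRICE_THINKING_OFF
    billed_output = output_tokens + thinking_tokens
    return (input_tokens * INPUT_PRICE + billed_output * rate) / 1_000_000


if __name__ == "__main__":
    print(estimate_cost(10_000, 1_000))                         # thinking off
    print(estimate_cost(10_000, 1_000, 4_000, thinking=True))   # thinking on
```

Under these assumptions, the same prompt costs noticeably more with thinking on, both because the rate is higher and because the thinking tokens themselves are billed.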

DrishtiSharma

authored a paper 11 days ago

Robust and Fine-Grained Detection of AI Generated Texts

Paper • 2504.11952 • Published 12 days ago • 11

jeffboudier

posted an update 23 days ago

Llama 4 is out, and Scout is already on the Dell Enterprise Hub to deploy on Dell systems 👉 dell.huggingface.co

jeffboudier

posted an update 26 days ago

Enterprise orgs can now enable serverless Inference Providers for all members:
- includes $2 of free usage per org member (e.g. an Enterprise org with 1,000 members shares $2,000 of free credit each month)
- admins can set a monthly spend limit for the entire org
- works today with Together, fal, Novita, Cerebras and HF Inference

Here's the doc to bill Inference Providers usage to your org: https://huggingface.co/docs/inference-providers/pricing#organization-billing

birgermoell

authored a paper 26 days ago

Medical Reasoning in LLMs: An In-Depth Analysis of DeepSeek R1

Paper • 2504.00016 • Published Mar 27 • 1

philschmid

authored a paper about 1 month ago

Gemma 3 Technical Report

Paper • 2503.19786 • Published Mar 25 • 49

birgermoell

authored 2 papers about 1 month ago

The order in speech disorder: a scoping review of state of the art machine learning methods for clinical speech classification

Paper • 2503.04802 • Published Mar 3

Artificial Humans

Paper • 2503.16502 • Published Mar 12

philschmid

posted an update about 1 month ago

Gemini 2.5 Pro, thinking by default! We are excited to launch our best Gemini model yet for reasoning, multimodal, and coding! #1 on LMSYS, Humanity’s Last Exam, AIME, GPQA, and more!

TL;DR:
- 💻 Best Gemini coding model yet, particularly for web development (excels on LiveCodeBench).
- 🧠 Default "Thinking" with up to 64k tokens of output
- 🌌 1 million token multimodal input context for text, image, video, audio, and PDF
- 🛠️ Function calling, structured output, Google Search & code execution
- 🏆 #1 on LMArena & SOTA on AIME, GPQA, Humanity's Last Exam
- 💡 Knowledge cutoff of January 2025
- 🤗 Available for free as Experimental in AI Studio, the Gemini API & the Gemini app
- 🚀 Rate limits: free tier at 2 RPM and 50 req/day

Try it ⬇️

https://aistudio.google.com/?model=gemini-2.5-pro-exp-03-25

w11wo

authored a paper about 2 months ago

COMODO: Cross-Modal Video-to-IMU Distillation for Efficient Egocentric Human Activity Recognition

Paper • 2503.07259 • Published Mar 10

birgermoell

authored a paper about 2 months ago

Voice Cloning for Dysarthric Speech Synthesis: Addressing Data Scarcity in Speech-Language Pathology

Paper • 2503.01266 • Published Mar 3

birgermoell

authored 2 papers 2 months ago

Language Complexity Measurement as a Noisy Zero-Shot Proxy for Evaluating LLM Performance

Paper • 2502.11578 • Published Feb 17

Large Language Models and Mathematical Reasoning Failures

Paper • 2502.11574 • Published Feb 17 • 3

vumichien

authored a paper 3 months ago

Bridging the Data Provenance Gap Across Text, Speech and Video

Paper • 2412.17847 • Published Dec 19, 2024 • 9

adorkin

authored a paper 3 months ago

Prune or Retrain: Optimizing the Vocabulary of Multilingual Models for Estonian

Paper • 2501.02631 • Published Jan 5

MoritzLaurer

posted an update 3 months ago

Microsoft's rStar-Math paper claims that 🤏 ~7B models can match the math skills of o1 using clever train- and test-time techniques. You can now download their prompt templates from Hugging Face!

📏 The paper introduces rStar-Math, which claims to rival OpenAI o1's math reasoning capabilities by integrating Monte Carlo Tree Search (MCTS) with step-by-step verified reasoning trajectories.
🤖 A Process Preference Model (PPM) enables fine-grained evaluation of intermediate steps, improving training data quality.
🧪 The system underwent four rounds of self-evolution, progressively refining both the policy and reward models to tackle Olympiad-level math problems—without GPT-4-based data distillation.
💾 While we wait for the release of code and datasets, you can already download the prompts they used from the HF Hub!

Details and links here 👇
Prompt-templates docs: https://moritzlaurer.github.io/prompt_templates/
Templates on the hub: MoritzLaurer/rstar-math-prompts
Prompt-templates collection: MoritzLaurer/prompt-templates-6776aa0b0b8a923957920bb4
Paper: https://arxiv.org/pdf/2501.04519
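
For readers unfamiliar with MCTS, the selection step can be sketched in a few lines of Python. This is the generic UCT rule, not the paper's exact procedure: the `Node` class, the exploration constant `c=1.4`, and the use of plain floats as rewards are all assumptions for illustration (in rStar-Math, per the post, step quality would come from the PPM).

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """One reasoning step in the search tree (hypothetical structure)."""
    step: str
    visits: int = 0
    total_reward: float = 0.0
    children: list = field(default_factory=list)


def uct_score(child, parent_visits, c=1.4):
    """Generic UCT: mean reward plus an exploration bonus."""
    if child.visits == 0:
        return math.inf  # always try unvisited steps first
    exploit = child.total_reward / child.visits
    explore = c * math.sqrt(math.log(parent_visits) / child.visits)
    return exploit + explore


def select_child(node):
    """Pick the child step with the highest UCT score."""
    return max(node.children, key=lambda ch: uct_score(ch, node.visits))


if __name__ == "__main__":
    root = Node("problem", visits=10)
    root.children = [
        Node("factor the polynomial", visits=5, total_reward=4.0),
        Node("guess and check", visits=5, total_reward=1.0),
    ]
    print(select_child(root).step)
```

The exploration bonus shrinks as a step is visited more often, so well-scored steps are revisited while rarely tried steps still get a chance.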

MoritzLaurer

posted an update 4 months ago

FACTS is a great paper from @GoogleDeepMind on measuring the factuality of LLM outputs. You can now download their prompt templates from @huggingface to improve LLM-based fact-checking yourself!

📏 The paper introduces the FACTS Grounding benchmark for evaluating the factuality of LLM outputs.

🤖 Fact-checking is automated by an ensemble of LLM judges that verify if a response is fully grounded in a factual reference document.

🧪 The authors tested different prompt templates on held-out data to ensure that they generalize.

📚 It's highly educational to read these templates to learn how frontier labs design prompts and understand their limitations.

💾 You can now download and reuse these prompt templates via the prompt-templates library!

🔄 The library simplifies sharing prompt templates on the HF hub or locally via standardized YAML files. Let’s make LLM work more transparent and reproducible by sharing more templates like this!

Links 👇
- prompt-templates docs: https://moritzlaurer.github.io/prompt_templates/
- all templates on the HF Hub: MoritzLaurer/facts-grounding-prompts
- FACTS paper: https://storage.googleapis.com/deepmind-media/FACTS/FACTS_grounding_paper.pdf
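
The judge-ensemble idea can be illustrated with a small Python sketch. Everything here is a stand-in: the two toy judges use string matching instead of LLM calls, and averaging the verdicts is an assumed aggregation rule, not necessarily the one used in the FACTS paper.

```python
def _sentences(text):
    """Split text into non-empty, stripped sentences on periods."""
    return [s.strip() for s in text.split(".") if s.strip()]


def strict_judge(document, response):
    """Toy judge: every response sentence must appear in the document."""
    return all(s in document for s in _sentences(response))


def lenient_judge(document, response):
    """Toy judge: at least one response sentence appears in the document."""
    return any(s in document for s in _sentences(response))


def grounded_fraction(document, response, judges):
    """Fraction of judges that consider the response grounded.

    Assumption: verdicts are combined by simple averaging.
    """
    if not judges:
        raise ValueError("at least one judge is required")
    verdicts = [judge(document, response) for judge in judges]
    return sum(verdicts) / len(verdicts)


if __name__ == "__main__":
    doc = "The sky is blue. Grass is green."
    judges = [strict_judge, lenient_judge]
    print(grounded_fraction(doc, "The sky is blue. Grass is red.", judges))
```

In a real setup, each judge would be an LLM call built from one of the downloadable prompt templates, with the document and response filled in.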

MoritzLaurer

posted an update 4 months ago

The TRL v0.13 release is 🔥! My highlights are the new process reward trainer, for training models similar to o1, and tool call support:

🧠 Process reward trainer: Enables training of Process-supervised Reward Models (PRMs), which reward the quality of intermediate steps, promoting structured reasoning. Perfect for tasks like stepwise reasoning.

🔀 Model merging: A new callback leverages mergekit to merge models during training, improving performance by blending reference and policy models - optionally pushing merged models to the Hugging Face Hub.

🛠️ Tool call support: TRL preprocessing now supports tool integration, laying the groundwork for agent fine-tuning with examples like dynamic temperature fetching in prompts.

⚖️ Mixture of judges: The new AllTrueJudge combines decisions from multiple binary judges for more nuanced evaluation.

Read the release notes and other resources here 👇
Release: https://github.com/huggingface/trl/releases/tag/v0.13.0
Mergekit: https://github.com/arcee-ai/mergekit
Mixture of judges paper: The Perfect Blend: Redefining RLHF with Mixture of Judges (2409.20370)
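
The "all true" combination rule behind the mixture of judges can be sketched in plain Python. This is not TRL's code and does not call the library: the function name `all_true` is hypothetical, and treating `-1` as an "invalid" verdict that propagates to the result is an assumption.

```python
def all_true(decisions_per_judge):
    """Combine binary verdicts from several judges, one result per completion.

    decisions_per_judge: one list per judge, each holding 1 (pass),
    0 (fail), or -1 (invalid, an assumed convention) per completion.
    A completion passes only if every judge returns 1.
    """
    lengths = {len(d) for d in decisions_per_judge}
    if len(lengths) != 1:
        raise ValueError("every judge must score the same completions")
    combined = []
    for verdicts in zip(*decisions_per_judge):
        if -1 in verdicts:
            combined.append(-1)  # any invalid verdict invalidates the result
        elif all(v == 1 for v in verdicts):
            combined.append(1)
        else:
            combined.append(0)
    return combined


if __name__ == "__main__":
    safety = [1, 1, 0]
    helpfulness = [1, 0, 0]
    print(all_true([safety, helpfulness]))
```

Requiring unanimity makes the combined judge conservative: a completion is accepted only when it clears every individual criterion.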

MoritzLaurer

posted an update 4 months ago

OpenAI is losing money on the $200/month subscription 🤯. It's crazy how expensive it is to run the largest LLMs:

- ChatGPT Pro costs $200/month ($2,400/year) and is still unprofitable for OpenAI due to higher-than-expected usage.
- OpenAI reportedly expected losses of about $5 billion on revenue of $3.7 billion last year, with ChatGPT alone once costing an estimated $700,000 per day to operate. 💸🔥
- They build strong models and do great research. Whether this business model will work in the long run is one of the biggest questions in the AI economy today.

Source with the numbers 👇
https://techcrunch.com/2025/01/05/openai-is-losing-money-on-its-pricey-chatgpt-pro-plan-ceo-sam-altman-says/

jeffboudier

posted an update 4 months ago

NVIDIA just announced the Cosmos World Foundation Models, available on the Hub: nvidia/cosmos-6751e884dc10e013a0a0d8e6

Cosmos is a family of pre-trained models purpose-built for generating physics-aware videos and world states to advance physical AI development.
The release also includes tokenizers: nvidia/cosmos-tokenizer-672b93023add81b66a8ff8e6

Learn more in this great community article by @mingyuliutw and @PranjaliJoshi https://huggingface.co/blog/mingyuliutw/nvidia-cosmos

Team members 108