Lize Pirenne

Inversta

Pangasius

AI & ML interests

LLMs, RL

Organizations

None yet

Inversta's activity

upvoted 2 papers 2 days ago

ToolRL: Reward is All Tool Learning Needs

Paper • 2504.13958 • Published 22 days ago • 43

Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?

Paper • 2504.13837 • Published 20 days ago • 119

upvoted 2 papers 20 days ago

BitNet b1.58 2B4T Technical Report

Paper • 2504.12285 • Published 22 days ago • 70

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

Paper • 2504.10479 • Published 24 days ago • 255

upvoted 2 papers 23 days ago

GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation

Paper • 2504.08736 • Published 27 days ago • 47

Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model

Paper • 2504.08685 • Published 27 days ago • 123

upvoted a paper 24 days ago

Hogwild! Inference: Parallel LLM Generation via Concurrent Attention

Paper • 2504.06261 • Published 30 days ago • 107

updated a collection 24 days ago

closed_qa

Collection

11 items • Updated 24 days ago

upvoted 2 papers about 1 month ago

Recitation over Reasoning: How Cutting-Edge Language Models Can Fail on Elementary School-Level Reasoning Problems?

Paper • 2504.00509 • Published Apr 1 • 21

DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Paper • 2503.14476 • Published Mar 18 • 124

upvoted 6 papers about 2 months ago

RWKV-7 "Goose" with Expressive Dynamic State Evolution

Paper • 2503.14456 • Published Mar 18 • 147

Transformers without Normalization

Paper • 2503.10622 • Published Mar 13 • 162

Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models

Paper • 2503.09573 • Published Mar 12 • 71

EuroBERT: Scaling Multilingual Encoders for European Languages

Paper • 2503.05500 • Published Mar 7 • 78

Token-Efficient Long Video Understanding for Multimodal LLMs

Paper • 2503.04130 • Published Mar 6 • 94

Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs

Paper • 2503.01743 • Published Mar 3 • 87

upvoted 2 papers 2 months ago

Chain of Draft: Thinking Faster by Writing Less

Paper • 2502.18600 • Published Feb 25 • 48

Predictive Data Selection: The Data That Predicts Is the Data That Teaches

Paper • 2503.00808 • Published Mar 2 • 57

liked a Space 2 months ago

The Ultra-Scale Playbook 🌌

Space • 2.56k likes

The ultimate guide to training LLM on large GPU Clusters