Emanuele Vivoli

emanuelevivoli

https://emanuelevivoli.github.io

AI & ML interests

I work on Comics/Manga :)

Recent Activity

liked a dataset 3 days ago

rghermi/sf20k

emanuelevivoli's activity

upvoted 2 papers 3 days ago

Phi-4-reasoning Technical Report

Paper • 2504.21318 • Published 8 days ago • 34

Phi-4-Mini-Reasoning: Exploring the Limits of Small Reasoning Language Models in Math

Paper • 2504.21233 • Published 9 days ago • 37

upvoted 2 papers 21 days ago

AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction

Paper • 2504.01014 • Published Apr 1 • 68

GigaTok: Scaling Visual Tokenizers to 3 Billion Parameters for Autoregressive Image Generation

Paper • 2504.08736 • Published 27 days ago • 47

upvoted a paper 24 days ago

Kimi-VL Technical Report

Paper • 2504.07491 • Published 28 days ago • 125

upvoted a paper 30 days ago

SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published about 1 month ago • 180

upvoted 3 papers about 1 month ago

upvoted a paper about 2 months ago

Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers

Paper • 2503.11579 • Published Mar 14 • 20

upvoted a collection about 2 months ago

Comics Pick-A-Panel

Collection

Dataset, Models and Paper from ComicsPAP: understanding comic strips by picking the correct panel • 4 items • Updated Mar 14 • 3

upvoted 3 papers about 2 months ago

R1-Omni: Explainable Omni-Multimodal Emotion Recognition with Reinforcing Learning

Paper • 2503.05379 • Published Mar 7 • 37

R1-Zero's "Aha Moment" in Visual Reasoning on a 2B Non-SFT Model

Paper • 2503.05132 • Published Mar 7 • 58

Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models

Paper • 2503.06749 • Published Mar 9 • 29

upvoted a paper 2 months ago

Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs

Paper • 2503.01743 • Published Mar 3 • 87

upvoted 3 papers 3 months ago

SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features

Paper • 2502.14786 • Published Feb 20 • 144

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published Feb 19 • 186

LM2: Large Memory Models

Paper • 2502.06049 • Published Feb 9 • 30

upvoted an article 3 months ago

Open-R1: a fully open reproduction of DeepSeek-R1

Article • Jan 28 • 852

upvoted a paper 3 months ago

TokenVerse: Versatile Multi-concept Personalization in Token Modulation Space

Paper • 2501.12224 • Published Jan 21 • 48