USER2 Collection Universal Sentence Encoder for Russian based on RuModernBERT, with support for context lengths up to 8,192 tokens and Matryoshka representation learning • 2 items • Updated 20 days ago • 4
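The Matryoshka representation learning mentioned in this collection trains embeddings so that a prefix of the dimensions is itself a usable (smaller) embedding: you truncate to the first k dimensions and re-normalize. A minimal sketch of that truncation step, with toy values standing in for real model output (actual models emit hundreds of dimensions):

```python
import numpy as np

def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` dimensions of a Matryoshka-style
    embedding and re-normalize to unit length, so cosine
    similarity remains meaningful at the reduced size."""
    v = np.asarray(embedding, dtype=np.float64)[:dim]
    return v / np.linalg.norm(v)

# Toy full-size embedding; real Matryoshka models are trained so
# this prefix preserves most of the semantic signal.
full = np.array([0.5, -0.3, 0.8, 0.1])
small = truncate_and_normalize(full, 2)
```

In the Sentence Transformers library this is exposed declaratively (e.g. a `truncate_dim` argument at model load time), so you rarely need to do it by hand; the sketch above just makes the operation explicit.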
Article Training and Finetuning Reranker Models with Sentence Transformers v4 Mar 26 • 125
When Less is Enough: Adaptive Token Reduction for Efficient Image Representation Paper • 2503.16660 • Published Mar 20 • 73
Feature-Level Insights into Artificial Text Detection with Sparse Autoencoders Paper • 2503.03601 • Published Mar 5 • 232
Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss Paper • 2410.17243 • Published Oct 22, 2024 • 95
SigLIP 2: Multilingual Vision-Language Encoders with Improved Semantic Understanding, Localization, and Dense Features Paper • 2502.14786 • Published Feb 20 • 144
Rank1: Test-Time Compute for Reranking in Information Retrieval Paper • 2502.18418 • Published Feb 25 • 26
GHOST 2.0: generative high-fidelity one shot transfer of heads Paper • 2502.18417 • Published Feb 25 • 67
LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers Paper • 2502.15007 • Published Feb 20 • 175
How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM? Paper • 2502.14502 • Published Feb 20 • 91
Article Train 400x faster Static Embedding Models with Sentence Transformers Jan 15 • 177
The Differences Between Direct Alignment Algorithms are a Blur Paper • 2502.01237 • Published Feb 3 • 115
NanoBEIR 🍺 Collection A collection of smaller versions of BEIR datasets with 50 queries and up to 10K documents each. • 13 items • Updated Sep 11, 2024 • 14
Towards General Text Embeddings with Multi-stage Contrastive Learning Paper • 2308.03281 • Published Aug 7, 2023 • 2