Interpretability XAI - a JM-Brun Collection

Updated Mar 4

ReAGent: Towards A Model-agnostic Feature Attribution Method for Generative Language Models

Paper • 2402.00794 • Published Feb 1, 2024

Rethinking Interpretability in the Era of Large Language Models

Paper • 2402.01761 • Published Jan 30, 2024

Analyze Feature Flow to Enhance Interpretation and Steering in Language Models

Paper • 2502.03032 • Published Feb 5, 2025

Tell me why: Visual foundation models as self-explainable classifiers

Paper • 2502.19577 • Published Feb 26, 2025