94 17 20

Michael Goin

mgoin

mgoin_
mgoin

AI & ML interests

LLM inference optimization, compression, quantization, pruning, distillation

Recent Activity

new activity 2 days ago

RedHatAI/Qwen2.5-VL-72B-Instruct-quantized.w8a8:用vllm serve启动不了

updated a Space 6 days ago

RedHatAI/README

updated a model 10 days ago

nm-testing/gemma-3-27b-it-FP8-dynamic

View all activity

Organizations

mgoin's activity

New activity in RedHatAI/Qwen2.5-VL-72B-Instruct-quantized.w8a8 2 days ago

用vllm serve启动不了

#2 opened about 1 month ago by

VenomEY

New activity in RedHatAI/Qwen2.5-VL-72B-Instruct-FP8-Dynamic 13 days ago

Fix processor_class to match upstream

#4 opened 17 days ago by

zifeitong

New activity in RedHatAI/Qwen2.5-VL-3B-Instruct-FP8-Dynamic 16 days ago

Remove image_processor_type

#1 opened about 1 month ago by

pooya-davoodi-parasail

New activity in nm-testing/Llama-3_1-Nemotron-Ultra-253B-v1-FP8-dynamic 17 days ago

OSError: nm-testing/Llama-3_1-Nemotron-Ultra-253B-v1-FP8-dynamic does not appear to have a file named decilm.py

#2 opened 19 days ago by

TheDrummer

New activity in nm-testing/Llama-3_1-Nemotron-Ultra-253B-v1-FP8-dynamic 20 days ago

how to deploy this model without internet connection

#1 opened 22 days ago by

superahn

New activity in RedHatAI/Llama-4-Scout-17B-16E-Instruct-FP8-dynamic 27 days ago

Why not FP8 with static and per-tensor quantization?

#2 opened 28 days ago by

wanzhenchn

New activity in mistralai/Mistral-Small-3.1-24B-Instruct-2503 about 1 month ago

Address discrepancies in the languages supported by the Mistral Small 3.1 2503

#54 opened about 1 month ago by

fpaupier

New activity in RedHatAI/Mistral-Small-3.1-24B-Instruct-2503-FP8-dynamic about 1 month ago

Please update the chat template

#1 opened about 1 month ago by

stelterlab

New activity in mistralai/Mistral-Small-3.1-24B-Instruct-2503 about 1 month ago

FP8 Dynamic/W8A16 Quants Please

#44 opened about 1 month ago by

rjmehta

Problem hosting the model using vllm

#45 opened about 1 month ago by

ShaoServient

New activity in RedHatAI/Qwen2.5-VL-72B-Instruct-quantized.w8a8 2 months ago

Remove image_processor_type

#1 opened 2 months ago by

pooya-davoodi-parasail

New activity in RedHatAI/Qwen2.5-VL-7B-Instruct-quantized.w8a8 2 months ago

Remove image_processor_type

#1 opened 2 months ago by

pooya-davoodi-parasail

New activity in RedHatAI/Qwen2.5-VL-72B-Instruct-FP8-Dynamic 2 months ago

Remove image_processor_type

#2 opened 2 months ago by

pooya-davoodi-parasail

New activity in RedHatAI/Qwen2.5-VL-7B-Instruct-FP8-Dynamic 2 months ago

Use Qwen2VLImageProcessor for image_processor_type

#2 opened 3 months ago by

pooya-davoodi-parasail

Use Qwen2VLImageProcessor for image_processor_type

#3 opened 2 months ago by

pooya-davoodi-parasail

New activity in cognitivecomputations/DeepSeek-R1-AWQ 3 months ago

when i use vllm v0.7.2 to deploy r1 awq, i got empty content

#10 opened 3 months ago by

bupalinyu

MLA is not supported with moe_wna16 quantization. Disabling MLA.

#7 opened 3 months ago by

AMOSE

New activity in RedHatAI/gemma-2-9b-it-FP8 3 months ago

AttributeError: 'Gemma2Config' object has no attribute 'interleaved_sliding_window' Traceback (most recent call last):

#3 opened 3 months ago by

samos123

New activity in RedHatAI/granite-3.1-8b-instruct-FP8-dynamic 3 months ago

compressed-tensors MLA support requires fp8 activations and weights in group 'group_0',

#1 opened 3 months ago by

samos123

New activity in RedHatAI/Meta-Llama-3-8B-Instruct-FP8-KV 3 months ago

How to load this model?

#1 opened 10 months ago by

Frz614