Prokhorov
Maverick17
Recent Activity
new activity
9 days ago
Qwen/Qwen3-30B-A3B: Waiting for the Qwen3-VL
new activity
about 1 month ago
unsloth/Llama-4-Scout-17B-16E-Instruct-unsloth-dynamic-bnb-4bit: OOM on 2xH100
Maverick17's activity
Waiting for the Qwen3-VL
6
#8 opened 9 days ago
by
Maverick17

OOM on 2xH100
7
#3 opened about 1 month ago
by
Maverick17

Dataset and hyperparameters for training
2
#3 opened 2 months ago
by
Maverick17

Null byte error
7
#2 opened 3 months ago
by
Maverick17

How is the Agent supposed to work?
7
#2 opened 5 months ago
by
Maverick17

Text -> Point -> Segmentation
4
#30 opened 6 months ago
by
Maverick17

Agent Loop
6
#6 opened 5 months ago
by
Maverick17

How to finetune using DPO?
7
#31 opened 6 months ago
by
Maverick17

How should I extract attention maps? Can you provide a specific example?
5
#33 opened 6 months ago
by
whopeople
Any plans on when vllm will be supported?
8
#26 opened 7 months ago
by
karlyukang
Using truncation in idefics3 processor
1
#1 opened 6 months ago
by
jainarchit76
Expected all tensors to be on the same device, but found at least two devices, cuda:2 and cuda:0!
3
#6 opened 10 months ago
by
Maverick17

Token indices sequence length is longer than the specified maximum sequence length for this model (4645 > 2048)
4
#5 opened about 1 year ago
by
Maverick17

Usage
3
#1 opened almost 2 years ago
by
Maverick17

Usage
3
#1 opened almost 2 years ago
by
Maverick17
