Junyang Lin

JustinLin610

https://justinlin610.github.io

AI & ML interests

Pretraining, NLP, CV, etc.

Recent Activity

new activity 8 days ago

Qwen/Qwen3-235B-A22B:fix: use tp 8 for SGLang

published a model 10 days ago

Qwen/Qwen3-235B-A22B-FP8

published a model 10 days ago

Qwen/Qwen3-235B-A22B

JustinLin610's activity

New activity in Qwen/Qwen3-235B-A22B 8 days ago

fix: use tp 8 for SGLang

#1 opened 10 days ago by

New activity in Qwen/QwQ-32B 2 months ago

Complex reasoning gets stuck in an infinite loop

#21 opened 2 months ago by

New activity in Qwen/Qwen2.5-Math-7B-Instruct 7 months ago

Independent evaluation results

#1 opened 7 months ago by

New activity in Qwen/Qwen2-VL-7B-Instruct 8 months ago

Have you deleted your GitHub page?

#10 opened 8 months ago by

New activity in Qwen/Qwen2-72B-Instruct 10 months ago

32B

#13 opened 11 months ago by

The sample code could not run...

#16 opened 10 months ago by

New activity in Qwen/CodeQwen1.5-7B-Chat 12 months ago

fine-tuning

#16 opened about 1 year ago by

Maybe a silly question...

#18 opened about 1 year ago by

This model is Awesome

#20 opened 12 months ago by areumtecnologia

New activity in Qwen/Qwen1.5-110B-Chat-AWQ about 1 year ago

Update tokenizer_config.json

#3 opened about 1 year ago by

New activity in Qwen/Qwen1.5-MoE-A2.7B-Chat about 1 year ago

How does this version's 28G GPU memory consumption compare with the 14B model?

#7 opened about 1 year ago by

New activity in Qwen/CodeQwen1.5-7B-Chat about 1 year ago

Fine tuning this model with Proprietary Code

#6 opened about 1 year ago by

What are the differences between this and Qwen/CodeQwen1.5-7B?

#5 opened about 1 year ago by

New activity in Qwen/Qwen1.5-7B-Chat about 1 year ago

Adding Evaluation Results

#14 opened about 1 year ago by leaderboard-pr-bot

Is qwen1.5-7b-chat much faster at inference than qwen1.5-7b?

#9 opened about 1 year ago by

New activity in Qwen/Qwen1.5-0.5B about 1 year ago

tie_word_embeddings=true ?

#6 opened about 1 year ago by

New activity in Qwen/Qwen1.5-72B-Chat about 1 year ago

Why does the 72B model have a different vocab size compared with other models?

#1 opened about 1 year ago by

New activity in Qwen/CodeQwen1.5-7B-Chat-GGUF about 1 year ago

Using llama.cpp server, responses always end with <|im_end|>

#2 opened about 1 year ago by

New activity in Qwen/CodeQwen1.5-7B-Chat about 1 year ago

The llm output is incomplete

#11 opened about 1 year ago by

GGUF models

#1 opened about 1 year ago by