What are the estimates based on?
The estimates seem to be off for a lot of these models, especially the smaller ones. I've run many of these models on both iOS and Android devices, and if some of them took 2% of the battery on each run, there's absolutely no way I would have had a phone with any battery charge left at NeurIPS. Is the repo for the Space available anywhere?
Good comment! Energy consumption is highly correlated with the hardware used. For Qwen 2.5 7B, the estimate is a real measurement taken on the GPU (an NVIDIA L4) through NVML. For the other models it's just an estimate assuming similar hardware, so improvements to the estimates can absolutely be made. The repo used for deploying Qwen 2.5 7B is here: https://github.com/JulienDelavande/text-generation-inference. It is based on the TGI repo for deploying LLMs at scale.
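For anyone curious about the measurement side: NVML reports instantaneous power draw rather than energy, so energy has to be integrated over the duration of a generation. A minimal sketch of that integration step (the `energy_joules` helper and the sample format are illustrative, not taken from the linked repo; in a real setup each sample would come from `pynvml.nvmlDeviceGetPowerUsage`):

```python
def energy_joules(samples):
    """Trapezoidal integration of (timestamp_s, power_w) samples into joules.

    NVML gives power in milliwatts at a point in time; sampling it at a
    fixed interval during inference and integrating yields the energy
    consumed by that run.
    """
    total = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        total += (t1 - t0) * (p0 + p1) / 2.0
    return total


# Example: a constant 100 W draw sampled once per second for 10 s.
samples = [(t, 100.0) for t in range(11)]
joules = energy_joules(samples)
print(joules)           # 1000.0 J
print(joules / 3600.0)  # ~0.278 Wh
```

Per-request energy like this is what lets the Space attach a number to each generation; extrapolating the same figure to a phone SoC is where the estimates get rough, since mobile hardware draws far less power than an L4.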