---
title: Mixture Of Experts
emoji: 📚
colorFrom: yellow
colorTo: purple
sdk: gradio
sdk_version: 5.19.0
app_file: app.py
pinned: false
license: mit
models:
- rhymes-ai/Aria-Chat
short_description: Hugging Face Space with Gradio Interface
---

[License: MIT](https://opensource.org/licenses/MIT)
[Python](https://www.python.org/downloads)
[Code style: black](https://github.com/psf/black)

---

# Mixture of Experts

Welcome to **Mixture of Experts**, a Hugging Face Space for chatting with a multimodal conversational model through a Gradio interface. The Space uses the Aria-Chat model (`rhymes-ai/Aria-Chat`), which is tuned for open-ended, multi-round dialogs over text and image inputs.

## Key Features

- **Multimodal Interaction:** Combine text and image inputs in a single conversation.
- **Advanced Conversational Abilities:** Aria-Chat is fine-tuned to produce coherent, context-aware responses.
- **Optimized Performance:** Tuned for reliable long-form output, reducing common failure modes such as incomplete markdown or endless lists.
- **Multilingual Support:** Handles multiple languages, including Chinese, Spanish, French, and Japanese.

## Quick Start

### Installation

To run the Space locally or integrate the model into your own workflow, install the following dependencies:

```bash
pip install transformers==4.45.0 accelerate==0.34.1 sentencepiece==0.2.0 torchvision requests torch Pillow
pip install flash-attn --no-build-isolation

# Optionally, for improved inference performance:
pip install grouped_gemm==0.1.6
```
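
Because several of the packages above are pinned to exact versions, it can help to confirm what is actually installed before loading the model. The following is a minimal sketch using only the standard library; the `check_pins` helper and the `PINNED` table are hypothetical names introduced for illustration, with the version numbers copied from the install command above.

```python
from importlib import metadata

# Pins copied from the install command above (hypothetical helper table).
PINNED = {
    "transformers": "4.45.0",
    "accelerate": "0.34.1",
    "sentencepiece": "0.2.0",
}


def check_pins(pins, get_version=metadata.version):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    # Prints nothing when every pinned package matches.
    for name, (expected, installed) in check_pins(PINNED).items():
        print(f"{name}: expected {expected}, found {installed or 'not installed'}")
```

An empty result means the pinned packages match; anything printed points to a package worth reinstalling.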

### Usage

The snippet below shows how to query the Aria-Chat model with one image and a text prompt. Adapt it to suit your integration needs:

```python
import requests
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id_or_path = "rhymes-ai/Aria-Chat"

# Load the model in bfloat16 and let accelerate place it on the available devices
model = AutoModelForCausalLM.from_pretrained(
    model_id_or_path,
    device_map="auto",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)

processor = AutoProcessor.from_pretrained(
    model_id_or_path,
    trust_remote_code=True,
)

# Example image input
image_url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/cat.png"
image = Image.open(requests.get(image_url, stream=True).raw)

# Prepare a conversation message: one image placeholder followed by the text prompt
messages = [
    {
        "role": "user",
        "content": [
            {"text": None, "type": "image"},
            {"text": "What is the image?", "type": "text"},
        ],
    }
]

# Format text input with chat template
text = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=text, images=image, return_tensors="pt")
inputs["pixel_values"] = inputs["pixel_values"].to(model.dtype)
inputs = {k: v.to(model.device) for k, v in inputs.items()}

# Generate the response
with torch.inference_mode(), torch.autocast(device_type="cuda", dtype=torch.bfloat16):
    output = model.generate(
        **inputs,
        max_new_tokens=500,
        stop_strings=["<|im_end|>"],
        tokenizer=processor.tokenizer,
        do_sample=True,
        temperature=0.9,
    )
    # Drop the prompt tokens so only the newly generated reply is decoded
    output_ids = output[0][inputs["input_ids"].shape[1]:]
    result = processor.decode(output_ids, skip_special_tokens=True)

print(result)
```
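
Since Aria-Chat is described here as handling multi-round dialogs, a follow-up turn can be sketched by appending the model's reply and the next user prompt to the message list before re-applying the chat template. The `append_turn` helper below is a hypothetical illustration, not part of the Aria codebase. It assumes that assistant turns use the same `{"text": ..., "type": "text"}` content entries as user turns, and the reply string is a placeholder rather than real model output.

```python
def append_turn(messages, assistant_reply, next_user_text):
    """Return a new message list extended with the reply and a follow-up question."""
    return messages + [
        {"role": "assistant", "content": [{"text": assistant_reply, "type": "text"}]},
        {"role": "user", "content": [{"text": next_user_text, "type": "text"}]},
    ]


# First turn, in the same format as the usage snippet above.
history = [
    {
        "role": "user",
        "content": [
            {"text": None, "type": "image"},
            {"text": "What is the image?", "type": "text"},
        ],
    }
]

# "(previous model reply)" stands in for the decoded `result` of the first turn.
history = append_turn(history, "(previous model reply)", "What color is it?")
print([m["role"] for m in history])
```

The extended list would then go through `processor.apply_chat_template` again, with the same image passed to the processor, because the image placeholder from the first turn is still part of the conversation.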

### Running the Space with Gradio

The Space uses Gradio for its interactive web interface. Once the dependencies are installed, run the Space's `app.py` (the `app_file` declared in the Space metadata) to:

- Interact in real time with the multimodal capabilities of Aria-Chat.
- Test various inputs, including images and text, for a dynamic conversational experience.
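
The Space's own `app.py` is not reproduced in this README, so the following is only a rough sketch of how a multimodal Gradio chat could be wired to the message format used in the usage snippet. The names `to_aria_messages`, `respond`, and `build_demo` are hypothetical. The sketch assumes Gradio's `ChatInterface` in multimodal mode, where each incoming message is a dict with `"text"` and `"files"` keys, and it replaces the actual model call with a placeholder reply.

```python
def to_aria_messages(message):
    """Convert a Gradio multimodal message into an Aria-Chat style message list."""
    files = message.get("files") or []
    # One image placeholder per uploaded file, followed by the text prompt.
    content = [{"text": None, "type": "image"} for _ in files]
    content.append({"text": message.get("text") or "", "type": "text"})
    return [{"role": "user", "content": content}]


def respond(message, history):
    """Chat callback. Placeholder only: a real app would run the processor and model here."""
    aria_messages = to_aria_messages(message)
    n_images = len(aria_messages[0]["content"]) - 1
    prompt = message.get("text") or ""
    return f"Received {n_images} image(s) and the prompt: {prompt!r}"


def build_demo():
    """Build the chat UI. Gradio is imported lazily so the helpers work without it."""
    import gradio as gr

    return gr.ChatInterface(fn=respond, multimodal=True, type="messages")
```

Calling `build_demo().launch()` at the end of `app.py` would start the interface. This sketch ignores `history`, so each message is treated as a single-turn request.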

## Advanced Usage

For more complex use cases:

- **Fine-tuning:** Check out our linked codebase for guidance on fine-tuning Aria-Chat on your custom datasets.
- **vLLM Inference:** Serve the model with vLLM to improve latency and throughput.

## Credits & Citation

If you find this work useful, please consider citing the Aria-Chat model:

```bibtex
@article{aria,
  title={Aria: An Open Multimodal Native Mixture-of-Experts Model},
  author={Dongxu Li and Yudong Liu and Haoning Wu and Yue Wang and Zhiqi Shen and Bowen Qu and Xinyao Niu and Guoyin Wang and Bei Chen and Junnan Li},
  year={2024},
  journal={arXiv preprint arXiv:2410.05993},
}
```

## License

This project is licensed under the Apache-2.0 License.

Happy chatting and expert mixing! If you encounter any issues or have suggestions, feel free to open an issue or contribute to the repository.

An example chatbot using [Gradio](https://gradio.app), [`huggingface_hub`](https://huggingface.co/docs/huggingface_hub/v0.22.2/en/index), and the [Hugging Face Inference API](https://huggingface.co/docs/api-inference/index).