whackthejacker committed on Feb 27

Commit

82433a0

verified ·

1 Parent(s): 9ea0df0

Update README.md

Files changed (1)

README.md +143 -1

@@ -16,7 +16,149 @@ models:
 [![Python 3.9+](https://img.shields.io/badge/python-%3E%3D3.9-blue.svg)](https://www.python.org/downloads)
 [![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
 ---
+# Mixture of Experts
+Welcome to **Mixture of Experts**, a Hugging Face Space built with Gradio for chatting with a multimodal conversational model. This Space uses the Aria-Chat model, which is designed for open-ended, multi-round dialogs with text and image inputs.
+## Key Features
+- **Multimodal Interaction:** Seamlessly integrate text and image inputs for rich, conversational experiences.
+- **Advanced Conversational Abilities:** Benefit from Aria-Chat’s fine-tuned performance in generating coherent and context-aware responses.
+- **Optimized Performance:** Designed for reliable long-form outputs, reducing common pitfalls such as incomplete Markdown or endless lists.
+- **Multilingual Support:** Optimized to handle multiple languages including Chinese, Spanish, French, and Japanese.
+## Quick Start
+### Installation
+To run the Space locally or to integrate into your workflow, ensure you have the following dependencies installed:
+```bash
+pip install transformers==4.45.0 accelerate==0.34.1 sentencepiece==0.2.0 torchvision requests torch Pillow
+pip install flash-attn --no-build-isolation
+# Optionally, for improved inference performance:
+pip install grouped_gemm==0.1.6
+```
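A quick way to confirm that a local environment matches the pins above is to compare them against the installed versions. The sketch below is illustrative only: `PINNED` and `check_pins` are hypothetical names that are not part of this Space, and it assumes the installed distribution names match the names passed to `pip install`.

```python
from importlib import metadata

# Pins copied from the install commands above; grouped_gemm is optional.
PINNED = {
    "transformers": "4.45.0",
    "accelerate": "0.34.1",
    "sentencepiece": "0.2.0",
    "grouped_gemm": "0.1.6",
}


def check_pins(pinned=PINNED, get_version=metadata.version):
    """Return {package: (expected, found)} for each missing or mismatched package.

    `found` is None when the package is not installed.
    """
    problems = {}
    for name, expected in pinned.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != expected:
            problems[name] = (expected, found)
    return problems
```

Calling `check_pins()` with no arguments inspects the current environment; an empty dictionary means every pinned package is installed at the expected version.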
+### Usage
+Below is a simple code snippet demonstrating how to interact with the Aria-Chat model. Customize it further to suit your integration needs:
+```python
+import requests
+import torch
+from PIL import Image
+from transformers import AutoModelForCausalLM, AutoProcessor
+model_id_or_path = "rhymes-ai/Aria-Chat"
+model = AutoModelForCausalLM.from_pretrained(
+    model_id_or_path,
+    device_map="auto",
+    torch_dtype=torch.bfloat16,
+    trust_remote_code=True
+)
+processor = AutoProcessor.from_pretrained(
+    model_id_or_path,
+    trust_remote_code=True
+)
+# Example image input
+image_url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/cat.png"
+image = Image.open(requests.get(image_url, stream=True).raw)
+# Prepare a conversation message
+messages = [
+    {
+        "role": "user",
+        "content": [
+            {"text": None, "type": "image"},
+            {"text": "What is the image?", "type": "text"},
+        ],
+    }
+]
+# Format text input with chat template
+text = processor.apply_chat_template(messages, add_generation_prompt=True)
+inputs = processor(text=text, images=image, return_tensors="pt")
+inputs["pixel_values"] = inputs["pixel_values"].to(model.dtype)
+inputs = {k: v.to(model.device) for k, v in inputs.items()}
+# Generate the response
+with torch.inference_mode(), torch.autocast(device_type="cuda", dtype=torch.bfloat16):
+    output = model.generate(
+        **inputs,
+        max_new_tokens=500,
+        stop_strings=["<|im_end|>"],
+        tokenizer=processor.tokenizer,
+        do_sample=True,
+        temperature=0.9,
+    )
+    output_ids = output[0][inputs["input_ids"].shape[1]:]
+    result = processor.decode(output_ids, skip_special_tokens=True)
+print(result)
+```
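The `messages` structure in the snippet above can be built programmatically when prompts come from user input. The helper below is a minimal sketch: `build_messages` is a hypothetical name, and it assumes that several images are represented by repeating the image entry and that earlier turns can be passed through unchanged. Whether the Aria-Chat chat template accepts text-only or multi-image turns is not confirmed here.

```python
def build_messages(prompt, num_images=1, history=None):
    """Build a chat `messages` list in the format used by the snippet above.

    Each image is a placeholder entry; the actual image objects still have to
    be passed to the processor separately, in the same order.
    """
    content = [{"text": None, "type": "image"} for _ in range(num_images)]
    content.append({"text": prompt, "type": "text"})
    messages = list(history or [])  # copy so the caller's history is not mutated
    messages.append({"role": "user", "content": content})
    return messages
```

With the default arguments, `build_messages("What is the image?")` produces the same single-turn list as the hand-written example.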
+### Running the Space with Gradio
+This Space uses Gradio for its interactive web interface. Once the required dependencies are installed, launch the Space to:
+- Interact in real time with the multimodal capabilities of Aria-Chat.
+- Test various inputs including images and text for a dynamic conversational experience.
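As a rough illustration of how an image-plus-text interface can be wired up in Gradio, consider the sketch below. It is not this Space's actual app code: `make_respond`, `build_demo`, and `generate_fn` are hypothetical names, and `generate_fn(image, prompt)` stands in for a function that would run the processor and model steps from the Usage snippet.

```python
def make_respond(generate_fn):
    """Wrap a generation callable into a handler suitable for a Gradio interface.

    `generate_fn(image, prompt) -> str` is a hypothetical callable supplied by
    the caller; it is not defined by this Space.
    """
    def respond(image, prompt):
        prompt = (prompt or "").strip()
        if not prompt:
            return "Please enter a question."
        return generate_fn(image, prompt)

    return respond


def build_demo(generate_fn):
    """Create an image + text interface; Gradio is imported only when needed."""
    import gradio as gr

    return gr.Interface(
        fn=make_respond(generate_fn),
        inputs=[gr.Image(type="pil", label="Image"), gr.Textbox(label="Question")],
        outputs=gr.Textbox(label="Response"),
        title="Mixture of Experts",
    )


# To serve it locally with a stub generator (swap in a function that calls Aria-Chat):
# build_demo(lambda image, prompt: f"(stub) you asked: {prompt}").launch()
```

Keeping the input handling in `make_respond` separate from the interface construction makes the handler easy to test without starting a web server.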
+## Advanced Usage
+For more complex use cases:
+- Fine-tuning: Check out our linked codebase for guidance on fine-tuning Aria-Chat on your custom datasets.
+- vLLM Inference: Explore advanced inference options to optimize latency and throughput.
+## Credits & Citation
+If you find this work useful, please consider citing the Aria-Chat model:
+```bibtex
+@article{aria,
+  title={Aria: An Open Multimodal Native Mixture-of-Experts Model},
+  author={Dongxu Li and Yudong Liu and Haoning Wu and Yue Wang and Zhiqi Shen and Bowen Qu and Xinyao Niu and Guoyin Wang and Bei Chen and Junnan Li},
+  year={2024},
+  journal={arXiv preprint arXiv:2410.05993},
+}
+```
+## License
+This project is licensed under the Apache-2.0 License.
+Happy chatting and expert mixing! If you encounter any issues or have suggestions, feel free to open an issue or contribute to the repository.
 An example chatbot using [Gradio](https://gradio.app), [`huggingface_hub`](https://huggingface.co/docs/huggingface_hub/v0.22.2/en/index), and the [Hugging Face Inference API](https://huggingface.co/docs/api-inference/index).