Space: LuisMBA/multimodal_RAG_kaggle_based
Commit 993f74e (1 parent: c774d2a), committed by LuisMBA 27 days ago

Update README.md

README.md +38 -1

@@ -11,4 +11,41 @@ license: apache-2.0
 short_description: Multimodal RAG to augment english recipes searches
 ---
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+# Multimodal Retrieval System with FAISS
+This repository contains a prototype system for multimodal information retrieval using FAISS, capable of searching across text and images using vector similarity.
+## Structure
+- `notebook/` (or a `.ipynb` file): Logic that generates the vector indexes for both text and images.
+- `app.py`: Gradio-based interface for interacting with the system.
+- `search_ocean.py`: Core logic for performing FAISS-based similarity search using precomputed indexes.
+- `text_index.faiss`, `image_index.faiss`: The FAISS index files generated by the notebook (already included in the app).
+- `metadata_text.json`, `metadata_image.json`: Associated metadata for mapping index results back to source information.
+## What it does
+- Loads precomputed FAISS indexes (for text and image).
+- Performs retrieval based on a text or image query.
+- Returns the top-matching results, ranked by cosine similarity.
+## What it doesn't (yet) do
+- No generation step (e.g., using LLMs) is implemented in this app.
+- While the code for image retrieval is ready, image indexes must be built in the notebook beforehand.
+- There is **no context overlap** implemented when chunking the data for indexing. Each chunk is indexed independently, so context that spans a chunk boundary can be lost, which may reduce retrieval quality in some use cases.
+## Dependencies
+- `faiss-cpu`
+- `sentence-transformers`
+- `openai-clip`
+- `torch`
+- `torchvision`
+- `gradio`
+- `Pillow`
+## Notes
+- The app is designed to separate concerns between indexing (offline, notebook) and retrieval (live, Gradio app).
+- You can easily extend this to include LLM generation or contextual QA once relevant results are retrieved.
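
The retrieval step described in the README (rank stored vectors by cosine similarity, then map the hits back to metadata) can be sketched in plain Python. This is a minimal illustration, not the code in `search_ocean.py`: the function names `l2_normalize` and `top_k_cosine`, the toy vectors, and the `title` metadata field are all assumptions made for the example. It relies on the fact that cosine similarity equals the inner product of L2-normalized vectors, which is also a common way to get cosine ranking out of an inner-product FAISS index.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def top_k_cosine(query, vectors, metadata, k=3):
    """Return the k metadata entries whose vectors are most similar to query.

    Cosine similarity is computed as the dot product of unit-length vectors.
    `metadata[i]` is assumed to describe `vectors[i]` (hypothetical layout).
    """
    q = l2_normalize(query)
    scored = []
    for idx, vec in enumerate(vectors):
        v = l2_normalize(vec)
        score = sum(a * b for a, b in zip(q, v))
        scored.append((score, idx))
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [{"score": s, **metadata[i]} for s, i in scored[:k]]


# Toy data standing in for real embeddings and metadata (illustrative only).
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
metadata = [{"title": "doc_a"}, {"title": "doc_b"}, {"title": "doc_c"}]

results = top_k_cosine([2.0, 0.1], vectors, metadata, k=2)
for hit in results:
    print(hit["title"], round(hit["score"], 3))
```

A brute-force loop like this is only practical for small collections; the point of a FAISS index is to run the same ranking efficiently over many vectors.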