


Chat With Documents 🤖📄

Welcome to the Chat with Documents app! 🚀 This Streamlit app allows you to upload PDF and PPT files, extract their content, store the extracted text in a vector store, and interact with it using natural language queries! 🤖💬

Built with LangChain, OpenAI, Streamlit, and Astra DB, this project uses large language models (LLMs) to let you ask questions about your documents in plain language. 🧠


🚀 Features

  • PDF & PPT Extraction: Upload PDF and PowerPoint files to extract text! 📄➡️📝
  • Vector Store: Automatically stores extracted text in a Cassandra vector store (Astra DB). 🔍📚
  • Ask Anything: Ask questions about the document, and get answers powered by OpenAI! 🤖❓

🛠️ Tech Stack

  • Streamlit: Frontend framework to interact with the app.
  • LangChain: For seamless document processing and querying.
  • OpenAI: For LLM integration to provide intelligent responses.
  • Astra DB: Database for storing and managing vectorized text data.
  • Python Libraries: PyPDF2, python-pptx, cassio, and more.

🌍 Deployment

This project is designed to be deployed on Hugging Face Spaces. Just upload your code, and it will run in the cloud! 🌩️

Make sure to configure Secrets in your Hugging Face Space so that sensitive API keys are stored securely! 🔒
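
Spaces exposes each Secret to the running app as an environment variable, so the app can read its keys at startup. A minimal sketch, assuming hypothetical secret names (`OPENAI_API_KEY`, `ASTRA_DB_APPLICATION_TOKEN`, `ASTRA_DB_ID`); use whatever names your Space actually defines:

```python
import os

# Hypothetical secret names for illustration; match them to your Space's Secrets.
REQUIRED_KEYS = ("OPENAI_API_KEY", "ASTRA_DB_APPLICATION_TOKEN", "ASTRA_DB_ID")


def load_secrets(env=None, required=REQUIRED_KEYS):
    """Return the required secrets, failing early if any is missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing secrets: " + ", ".join(missing))
    return {name: env[name] for name in required}
```

Calling `load_secrets()` once at startup surfaces a missing key as a clear error message instead of a failure deep inside an API call.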


💡 How It Works

  • Upload a PDF or PPT file using the file uploader. 📤
  • The app will extract text from the file using PyPDF2 (for PDFs) or python-pptx (for PPTs). 📄➡️📝
  • The extracted text is split into manageable chunks using LangChain's CharacterTextSplitter. ✂️
  • The chunks are then embedded with OpenAI embeddings and stored as vectors in the Cassandra vector store (Astra DB). 🔄
  • Ask any query about the content of your document, and the app will respond using the power of OpenAI! 🤖💬
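
The extraction and chunking steps can be sketched roughly as follows. This is an illustration, not the app's actual code: the function names are hypothetical, the sketch handles `.pptx` files (the format python-pptx reads), and `chunk_text` is a plain-Python stand-in for LangChain's CharacterTextSplitter that cuts fixed-size windows with overlap rather than splitting on a separator.

```python
import io
from pathlib import Path


def extract_pdf_text(data: bytes) -> str:
    """Concatenate the text of every page in a PDF."""
    from PyPDF2 import PdfReader  # third-party: pip install PyPDF2

    reader = PdfReader(io.BytesIO(data))
    # extract_text() may return nothing for image-only pages, hence the `or ""`.
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def extract_pptx_text(data: bytes) -> str:
    """Collect the text of every shape that has a text frame, slide by slide."""
    from pptx import Presentation  # third-party: pip install python-pptx

    presentation = Presentation(io.BytesIO(data))
    lines = []
    for slide in presentation.slides:
        for shape in slide.shapes:
            if shape.has_text_frame:
                lines.append(shape.text_frame.text)
    return "\n".join(lines)


def extract_text(filename: str, data: bytes) -> str:
    """Dispatch on the file extension of the uploaded file."""
    suffix = Path(filename).suffix.lower()
    if suffix == ".pdf":
        return extract_pdf_text(data)
    if suffix == ".pptx":
        return extract_pptx_text(data)
    raise ValueError(f"Unsupported file type: {suffix or filename}")


def chunk_text(text: str, chunk_size: int = 800, overlap: int = 200) -> list:
    """Split text into windows of chunk_size characters sharing `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    if not text:
        return []
    step = chunk_size - overlap
    # Stop once the remaining tail is already covered by the previous window.
    last_start = max(len(text) - overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start, step)]
```

With a Streamlit upload, `extract_text(uploaded.name, uploaded.getvalue())` would produce the text to pass to `chunk_text`. The overlap keeps a sentence that straddles a chunk boundary visible in both neighbouring chunks.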

🎯 Why Use This?

  • Make documents interactive: Easily explore the content of your documents by asking questions.
  • Quick retrieval: With the text stored in a vector store, you can query the content efficiently.
  • Secure API keys: API keys are securely managed using environment variables and Hugging Face Spaces Secrets. 🔑💼
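
To make the "quick retrieval" point concrete: a vector store answers a query by comparing the query's embedding with the stored chunk embeddings and returning the closest ones. The sketch below shows that idea with cosine similarity over tiny made-up vectors. It is a conceptual illustration only; real embeddings have many more dimensions, and Astra DB runs this search inside the database rather than with a Python scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vector, store, k=2):
    """Return the texts of the k stored chunks most similar to the query vector."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# Toy (text, vector) pairs; the numbers are invented for illustration.
store = [
    ("Revenue grew in Q3.", [0.9, 0.1, 0.0]),
    ("The team hired two engineers.", [0.1, 0.9, 0.1]),
    ("Q3 costs were flat.", [0.7, 0.2, 0.3]),
]

print(top_k([1.0, 0.0, 0.0], store, k=2))
# ['Revenue grew in Q3.', 'Q3 costs were flat.']
```

Only the best-matching chunks are then passed to the LLM as context, which keeps prompts short even for long documents.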

🤝 Contributing

Feel free to fork this repo and submit issues or pull requests for any bugs or improvements. Contributions are welcome! 🙌


🧑‍💻 License

This project is licensed under the MIT License - see the LICENSE file for details.


📝 Note

Remember to add your API keys and check the environment variables! If you're using Hugging Face Spaces, ensure your keys are added to the Secrets section. 🔐


✨ Enjoy the App! ✨

Now, go ahead and chat with your documents! 😄

