---
title: Event Data Extraction
emoji: 🌐
colorFrom: pink
colorTo: blue
sdk: streamlit
sdk_version: 1.42.2
app_file: app.py
pinned: false
python_version: 3.10.0
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# Event Data Extraction

A testing and demo application for extracting event data from websites.

## Repository overview

```txt
/pages/
│   └── Streamlit pages for the UI
│
/src/
├── configuration/
│   └── Streamlit-specific configuration files
│
├── crawler/
│   └── Scripts for crawling and collecting event data from websites
│
├── persistence/
│   └── Database connections and query logic
│
├── utils/
│   └── Helper functions and preprocessing utilities
│
├── nlp/
│   ├── experimental/
│   │   └── Various NLP tools and technologies under evaluation
│   │
│   └── playground/
│       └── NLP scripts used within the Streamlit app (Pages: Playground, Pipeline, Testing)
```

## Run locally

**Python Version**: 3.10

1. Install the requirements from `requirements.txt`.
2. Create a Hugging Face access token on the Hugging Face platform.
3. Request the values for any environment variables you are missing.
4. **Create a `.env` file** in the root directory with the following environment variables (⚠️ **Do NOT commit this file!**)

   ```env
   # MongoDB
   MONGO_HOST=...
   MONGO_USERNAME=...
   MONGO_PASSWORD=...

   # Google Maps API
   GOOGLE_MAPS_API_KEY=...

   # OpenAI API
   OPENAI_API_KEY=...

   # Hugging Face Inference API
   INFERENCE_API_TOKEN=...

   # Hugging Face Spaces (access token)
   HUGGING_FACE_SPACES_TOKEN=...

   # Google Cloud Platform API
   GOOGLE_API_KEY=...
   ```

5. Start the Streamlit app, which opens in the browser:

   ```bash
   streamlit run app.py
   ```
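
Before starting the app, it can help to confirm that every variable from the `.env` template in step 4 is actually set. The following is a minimal sketch of such a check, not part of this repository: `REQUIRED_ENV_VARS` and `missing_env_vars` are hypothetical names, and the list simply mirrors the template. It inspects only the process environment, so it assumes the variables have already been exported or loaded from `.env` by whatever mechanism the app uses.

```python
import os
from typing import Mapping

# Variable names copied from the .env template in this README.
REQUIRED_ENV_VARS = (
    "MONGO_HOST",
    "MONGO_USERNAME",
    "MONGO_PASSWORD",
    "GOOGLE_MAPS_API_KEY",
    "OPENAI_API_KEY",
    "INFERENCE_API_TOKEN",
    "HUGGING_FACE_SPACES_TOKEN",
    "GOOGLE_API_KEY",
)


def missing_env_vars(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variable names that are unset or blank in env."""
    return [name for name in REQUIRED_ENV_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    # Hypothetical pre-flight check; run it before `streamlit run app.py`.
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Passing the environment as a parameter keeps the check easy to test with a plain dictionary instead of the real process environment.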