Spaces: biglam/README

make shorter

by davanstrien HF Staff - opened 1 day ago

base: refs/heads/main ← from: refs/pr/2

README.md +43 -38

@@ -1,67 +1,72 @@
 ---
-title: README
-emoji: 📚
-colorFrom: pink
-colorTo: gray
-sdk: static
-pinned: false
 ---
 # 📚 BigLAM: Machine Learning for Libraries, Archives, and Museums
-**BigLAM** is a community-driven effort to build an open ecosystem of machine learning models, datasets, and tools for **Libraries, Archives, and Museums (LAMs)**.
-We aim to make cultural heritage data more accessible and usable for machine learning by:
-- 🗃️ **Curating and sharing LAM datasets** with potential for ML applications, hosted openly on the [Hugging Face Hub](https://huggingface.co/biglam).
-- 🤖 **Training and releasing open-source models** tailored to LAM-relevant tasks, including classification, generation, and object detection.
 ---
-## ✨ Origins and Purpose
-BigLAM began as a [datasets hackathon](https://github.com/bigscience-workshop/lam) within the [BigScience 🌸](https://bigscience.huggingface.co/) project—an open scientific collaboration involving over 600 researchers from 50 countries and 250 institutions.
-Our initial goal was to make LAM data more discoverable and usable on the Hugging Face Hub. We're continuing this work with the broader aim of:
-- Helping LAM data reach new audiences.
-- Supporting researchers and practitioners working at the intersection of AI and cultural heritage.
-- Ensuring that machine learning datasets reflect the diversity and richness of human culture.
 ---
-## 📂 What You'll Find Here
-The [BigLAM organization on Hugging Face](https://huggingface.co/biglam) hosts:
-- 🧠 **Datasets** from and about libraries, archives, and museums, including image, text, and tabular formats.
-- ⚙️ **Models** fine-tuned for LAM tasks, such as:
-  - Art and historical image classification
-  - OCR and document understanding
   - Metadata quality assessment
-- 🧪 **Spaces and tools** for exploring datasets and running models interactively.
 ---
-## 🧩 Get Involved
-We welcome contributions and collaborations!
-You can:
-- Explore our [datasets and models](https://huggingface.co/biglam).
-- Join the conversation by opening a [New Discussion](https://huggingface.co/spaces/biglam/README/discussions/new) on the BigLAM space.
-- Submit datasets, models, or tools that support AI for cultural heritage.
-- Use our datasets in your own research or projects—and share what you build!
 ---
 ## 🌍 Why It Matters
-Cultural heritage data is too often underrepresented in machine learning. By making LAM data more visible and usable:
-- We support the responsible and inclusive development of AI.
-- We help cultural institutions explore new forms of access and interpretation.
-- We ensure that machine learning models learn from the full range of human knowledge—not just what's convenient to crawl.
-- We develop tools and approaches that are tailored to the specific formats, challenges, and goals of libraries, archives, and museums—supporting long-term reuse and alignment with professional practices.

 ---
+title: README
+emoji: 📚
+colorFrom: pink
+colorTo: gray
+sdk: static
+pinned: false
 ---
 # 📚 BigLAM: Machine Learning for Libraries, Archives, and Museums
+**BigLAM** is a community-driven initiative to build an open ecosystem of machine learning models, datasets, and tools for **Libraries, Archives, and Museums (LAMs)**.
+We aim to:
+- 🗃️ Share machine-learning-ready datasets from LAMs via the [Hugging Face Hub](https://huggingface.co/biglam)
+- 🤖 Train and release open-source models for LAM-relevant tasks
+- 🛠️ Develop tools and approaches tailored to LAM use cases
 ---
+<details>
+<summary><strong>✨ Background</strong></summary>
+BigLAM began as a [datasets hackathon](https://github.com/bigscience-workshop/lam) within the [BigScience 🌸](https://bigscience.huggingface.co/) project, a large-scale, open NLP collaboration.
+Our goal: make LAM datasets more discoverable and usable to support researchers, institutions, and ML practitioners working with cultural heritage data.
+</details>
 ---
+<details>
+<summary><strong>📂 What You'll Find</strong></summary>
+The [BigLAM organization](https://huggingface.co/biglam) hosts:
+- **Datasets**: image, text, and tabular data from and about libraries, archives, and museums
+- **Models**: fine-tuned for tasks like:
+  - Art/historical image classification
+  - Document layout analysis and OCR
   - Metadata quality assessment
+  - Named entity recognition in heritage texts
+- **Spaces**: tools for interactive exploration and demonstration
+</details>
 ---
+<details>
+<summary><strong>🧩 Get Involved</strong></summary>
+We welcome contributions! You can:
+- Use our [datasets and models](https://huggingface.co/biglam)
+- Join the discussion on [GitHub](https://github.com/bigscience-workshop/lam/discussions)
+- Contribute your own tools or data
+- Share your work using BigLAM resources
+</details>
 ---
 ## 🌍 Why It Matters
+Cultural heritage data is often underrepresented in machine learning. BigLAM helps address this by:
+- Supporting inclusive and responsible AI
+- Helping institutions experiment with ML for access, discovery, and preservation
+- Ensuring that ML systems reflect diverse human knowledge and expression
+- Developing tools and methods that work well with the unique formats, values, and needs of LAMs
+---
+*Empowering AI with the richness of human culture.*