davanstrien (HF Staff) committed
Commit 04f1306 · verified · 1 Parent(s): c583915

make shorter

Files changed (1):
  1. README.md +43 -38
README.md CHANGED
@@ -1,67 +1,72 @@
  ---
- title: README
- emoji: 📚
- colorFrom: pink
- colorTo: gray
- sdk: static
- pinned: false
  ---

  # 📚 BigLAM: Machine Learning for Libraries, Archives, and Museums

- **BigLAM** is a community-driven effort to build an open ecosystem of machine learning models, datasets, and tools for **Libraries, Archives, and Museums (LAMs)**.

- We aim to make cultural heritage data more accessible and usable for machine learning by:

- - 🗃️ **Curating and sharing LAM datasets** with potential for ML applications, hosted openly on the [Hugging Face Hub](https://huggingface.co/biglam).
- - 🤖 **Training and releasing open-source models** tailored to LAM-relevant tasks, including classification, generation, and object detection.

  ---

- ## ✨ Origins and Purpose

- BigLAM began as a [datasets hackathon](https://github.com/bigscience-workshop/lam) within the [BigScience 🌸](https://bigscience.huggingface.co/) project—an open scientific collaboration involving over 600 researchers from 50 countries and 250 institutions.

- Our initial goal was to make LAM data more discoverable and usable on the Hugging Face Hub. We're continuing this work with the broader aim of:
-
- - Helping LAM data reach new audiences.
- - Supporting researchers and practitioners working at the intersection of AI and cultural heritage.
- - Ensuring that machine learning datasets reflect the diversity and richness of human culture.

  ---

- ## 📂 What You'll Find Here
-
- The [BigLAM organization on Hugging Face](https://huggingface.co/biglam) hosts:

- - 🧠 **Datasets** from and about libraries, archives, and museums, including image, text, and tabular formats.
- - ⚙️ **Models** fine-tuned for LAM tasks, such as:

- Art and historical image classification
- OCR and document understanding
  - Metadata quality assessment
-
- - 🧪 **Spaces and tools** for exploring datasets and running models interactively.

  ---

- ## 🧩 Get Involved

- We welcome contributions and collaborations!
- You can:

- - Explore our [datasets and models](https://huggingface.co/biglam).
- - Join the conversation by opening a [New Discussion](https://huggingface.co/spaces/biglam/README/discussions/new) on the BigLAM space.
- - Submit datasets, models, or tools that support AI for cultural heritage.
- - Use our datasets in your own research or projects—and share what you build!

  ---

  ## 🌍 Why It Matters

- Cultural heritage data is too often underrepresented in machine learning. By making LAM data more visible and usable:

- - We support the responsible and inclusive development of AI.
- - We help cultural institutions explore new forms of access and interpretation.
- - We ensure that machine learning models learn from the full range of human knowledge—not just what's convenient to crawl.
- - We develop tools and approaches that are tailored to the specific formats, challenges, and goals of libraries, archives, and museums—supporting long-term reuse and alignment with professional practices.
 
  ---
+ title: README
+ emoji: 📚
+ colorFrom: pink
+ colorTo: gray
+ sdk: static
+ pinned: false
  ---

  # 📚 BigLAM: Machine Learning for Libraries, Archives, and Museums

+ **BigLAM** is a community-driven initiative to build an open ecosystem of machine learning models, datasets, and tools for **Libraries, Archives, and Museums (LAMs)**.

+ We aim to:

+ - 🗃️ Share machine-learning-ready datasets from LAMs via the [Hugging Face Hub](https://huggingface.co/biglam)
+ - 🤖 Train and release open-source models for LAM-relevant tasks
+ - 🛠️ Develop tools and approaches tailored to LAM use cases

  ---

+ <details>
+ <summary><strong>✨ Background</strong></summary>

+ BigLAM began as a [datasets hackathon](https://github.com/bigscience-workshop/lam) within the [BigScience 🌸](https://bigscience.huggingface.co/) project, a large-scale, open NLP collaboration.

+ Our goal: make LAM datasets more discoverable and usable to support researchers, institutions, and ML practitioners working with cultural heritage data.
+ </details>

  ---

+ <details>
+ <summary><strong>📂 What You'll Find</strong></summary>

+ The [BigLAM organization](https://huggingface.co/biglam) hosts:

+ - **Datasets**: image, text, and tabular data from and about libraries, archives, and museums
+ - **Models**: fine-tuned for tasks like:
+ - Art/historical image classification
+ - Document layout analysis and OCR
  - Metadata quality assessment
+ - Named entity recognition in heritage texts
+ - **Spaces**: tools for interactive exploration and demonstration
+ </details>

  ---

+ <details>
+ <summary><strong>🧩 Get Involved</strong></summary>

+ We welcome contributions! You can:

+ - Use our [datasets and models](https://huggingface.co/biglam)
+ - Join the discussion on [GitHub](https://github.com/bigscience-workshop/lam/discussions)
+ - Contribute your own tools or data
+ - Share your work using BigLAM resources
+ </details>

  ---

  ## 🌍 Why It Matters

+ Cultural heritage data is often underrepresented in machine learning. BigLAM helps address this by:
+
+ - Supporting inclusive and responsible AI
+ - Helping institutions experiment with ML for access, discovery, and preservation
+ - Ensuring that ML systems reflect diverse human knowledge and expression
+ - Developing tools and methods that work well with the unique formats, values, and needs of LAMs
+
+ ---

+ *Empowering AI with the richness of human culture.*