Spaces: chromadb/README

jeffreyhuber committed on Jul 3, 2023

Commit fc44b3f

1 Parent(s): 0c05b88

Update README.md

Files changed (1)

README.md

@@ -7,4 +7,39 @@ sdk: static
 pinned: false
 ---
-Edit this `README.md` markdown file to author your organization card 🔥
+## Chroma Datasets
+Making it easy to load data into Chroma since 2023
+```
+pip install chroma_datasets
+```
+### Current Datasets
+- State of the Union `from chroma_datasets import StateOfTheUnion`
+- Paul Graham Essay `from chroma_datasets import PaulGrahamEssay`
+- Glue `from chroma_datasets import Glue`
+- SciPy `from chroma_datasets import SciPy`
+`chroma_datasets` is generally backed by Hugging Face datasets, but this is not a requirement.
+### How to use
+The following will:
+1. Download the 2022 State of the Union
+2. Chunk it up for you
+3. Embed it using Chroma's default open-source embedding function
+4. Import it into Chroma
+```python
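+import chromadb
+from chroma_datasets import StateOfTheUnion
+from chroma_datasets.utils import import_into_chroma
+# Create a Chroma client
+chroma_client = chromadb.Client()
+# Download, chunk, embed, and import the dataset into a Chroma collection
+collection = import_into_chroma(chroma_client=chroma_client, dataset=StateOfTheUnion)
+# Query the collection with a text string and print the raw result
+result = collection.query(query_texts=["The United States of America"])
+print(result)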
+```
+Learn how to create and contribute a package at [chroma-core/chroma_datasets](https://github.com/chroma-core/chroma_datasets).
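
The README's example ends with `print(result)`, which dumps the raw query result. As a rough sketch of how that result might be read, the helper below assumes the result is a dict whose `"ids"`, `"documents"`, and `"distances"` entries each hold one inner list per query text, which is how `chromadb` query results are generally laid out. The `top_hits` name and the sample values are hypothetical and are not part of `chroma_datasets` or this commit.

```python
from typing import Any


def top_hits(result: dict[str, Any], query_index: int = 0) -> list[tuple[str, str, float]]:
    """Return (id, document, distance) triples for one query, closest first.

    Assumes `result` maps "ids", "documents", and "distances" to lists
    containing one inner list per query text (an assumption about the
    result layout, not something this commit specifies).
    """
    ids = result["ids"][query_index]
    documents = result["documents"][query_index]
    distances = result["distances"][query_index]
    hits = list(zip(ids, documents, distances))
    # Smaller distance means a closer match, so sort ascending.
    return sorted(hits, key=lambda hit: hit[2])


# Hand-written stand-in for a real query result (illustrative values only).
sample_result = {
    "ids": [["chunk-2", "chunk-0"]],
    "documents": [["Second sample chunk.", "First sample chunk."]],
    "distances": [[0.42, 0.17]],
}

for doc_id, document, distance in top_hits(sample_result):
    print(f"{distance:.2f}  {doc_id}  {document}")
```

With a live collection, `top_hits(result)` could be called on the `result` from the README example in place of `sample_result`, provided the result actually has the layout assumed here.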