---
title: README
emoji: 🚀
colorFrom: indigo
colorTo: pink
sdk: gradio
pinned: true
license: agpl-3.0
short_description: Accelerate Molecular Biology Research with Machine Learning
sdk_version: 5.25.2
---


# [MultiMolecule](https://multimolecule.danling.org)

> [!TIP]
> Accelerate Molecular Biology Research with Machine Learning

[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.15119050.svg)](https://doi.org/10.5281/zenodo.15119050)
[![License: AGPL v3](https://img.shields.io/badge/License-AGPL%20v3-blue.svg)](https://www.gnu.org/licenses/agpl-3.0)

## 🧬 Introduction

MultiMolecule is a framework that bridges molecular biology and machine learning. It offers machine learning tools specifically designed for biomolecular data (RNA, DNA, and protein).

MultiMolecule serves as a foundation for advancing research at the intersection of molecular biology and machine learning.

## 🚀 Features

### 📑 Resources

- **[Model Hub](https://huggingface.co/multimolecule)**: Models designed for biomolecular data.
- **[Dataset Hub](https://huggingface.co/multimolecule)**: Processed biomolecular datasets.

### 🛠️ Tools

- **[`pipelines`](https://multimolecule.danling.org/pipelines)**: End-to-end workflows for applying models.
- **[`runner`](https://multimolecule.danling.org/runner)**: Automatic Runner for training models.

### ⚙️ Infrastructure

- **[`data`](https://multimolecule.danling.org/data)**: Smart Dataset that automatically infers tasks, including their level (sequence, token, contact) and type (classification, regression).
- **[`tokenisers`](https://multimolecule.danling.org/tokenisers)**: Tokenizers for biomolecular sequences.
- **[`module`](https://multimolecule.danling.org/module)**: Neural network building blocks.

## 🔧 Installation

=== "Install the stable release from PyPI"

    ```bash
    pip install multimolecule
    ```

=== "Install the latest development version"

    ```bash
    pip install git+https://github.com/DLS5-Omics/multimolecule
    ```
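
After either install command, a quick check that the package is visible to the current Python environment can save some debugging. The snippet below is a minimal sketch that uses only the standard library; the helper name `installed_version` is hypothetical and not part of MultiMolecule, and it assumes the distribution name matches the `pip install` name above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str = "multimolecule"):
    """Return the installed version string of `package`, or None if it is absent.

    Hypothetical helper for illustration; it only reads installed package
    metadata and does not import the package itself.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("multimolecule is not installed in this environment")
    else:
        print(f"multimolecule {found} is installed")
```

If the check reports that the package is missing, the most common cause is that `pip` installed into a different interpreter than the one running the script.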

## 📜 Citation

If you use MultiMolecule in your research, please cite us as follows:

```bibtex
@software{chen_2024_12638419,
  author    = {Chen, Zhiyuan and Zhu, Sophia Y.},
  title     = {MultiMolecule},
  doi       = {10.5281/zenodo.12638419},
  publisher = {Zenodo},
  url       = {https://doi.org/10.5281/zenodo.12638419},
  year      = 2024,
  month     = may,
  day       = 4
}
```

## 📄 License

We believe openness is the foundation of research.

MultiMolecule is licensed under the [GNU Affero General Public License](https://multimolecule.danling.org/about/license).

For additional terms and clarifications, please refer to our [License FAQ](https://multimolecule.danling.org/about/license-faq).

Please join us in building an open research community.

`SPDX-License-Identifier: AGPL-3.0-or-later`