"""Wrapper around Bookend AI embedding models."""

import json
from typing import Any, List

import requests
from langchain_core.embeddings import Embeddings
from langchain_core.pydantic_v1 import BaseModel, Field

API_URL = "https://api.bookend.ai/"
DEFAULT_TASK = "embeddings"
PATH = "/models/predict"


class BookendEmbeddings(BaseModel, Embeddings):
    """Bookend AI sentence_transformers embedding models.

    Example:
        .. code-block:: python

            from langchain_community.embeddings import BookendEmbeddings

            bookend = BookendEmbeddings(
                domain={domain},
                api_token={api_token},
                model_id={model_id},
            )
            bookend.embed_documents([
                "Please put on these earmuffs because I can't hear you.",
                "Baby wipes are made of chocolate stardust.",
            ])
            bookend.embed_query(
                "She only paints with bold colors; she does not like pastels."
            )
    """

    domain: str
    """Request a domain at https://bookend.ai/ to use this embeddings module."""
    api_token: str
    """Request an API token at https://bookend.ai/ to use this embeddings module."""
    model_id: str
    """Embeddings model ID to use."""
    auth_header: dict = Field(default_factory=dict)
    """Authorization header, built from ``api_token`` in ``__init__``."""

    def __init__(self, **kwargs: Any):
        super().__init__(**kwargs)
        self.auth_header = {"Authorization": "Basic {}".format(self.api_token)}

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        """Embed documents using a Bookend deployed embeddings model.

        Args:
            texts: The list of texts to embed.

        Returns:
            List of embeddings, one for each text.
        """
        result = []
        # Build a fresh dict so the shared ``auth_header`` is not mutated.
        headers = {
            **self.auth_header,
            "Content-Type": "application/json; charset=utf-8",
        }
        params = {
            "model_id": self.model_id,
            "task": DEFAULT_TASK,
        }

        for text in texts:
            data = json.dumps(
                {"text": text, "question": None, "context": None, "instruction": None}
            )
            r = requests.post(
                API_URL + self.domain + PATH,
                headers=headers,
                params=params,
                data=data,
            )
            # Surface HTTP errors here rather than as a confusing JSON/KeyError.
            r.raise_for_status()
            result.append(r.json()[0]["data"])

        return result

    def embed_query(self, text: str) -> List[float]:
        """Embed a query using a Bookend deployed embeddings model.

        Args:
            text: The text to embed.

        Returns:
            Embeddings for the text.
        """
        return self.embed_documents([text])[0]