prdev committed on
Commit 579e358 · verified · 1 Parent(s): ab076e0

Update README.md

Files changed (1): README.md (+60, -0)
README.md CHANGED

language:
- en
---

# Query Generation with LoRA Finetuning

This project fine-tunes a language model using supervised fine-tuning (SFT) and LoRA adapters to generate queries from documents. The model was trained on the [`prdev/qtack-gq-embeddings-unsupervised`](https://huggingface.co/datasets/prdev/qtack-gq-embeddings-unsupervised) dataset using an A100 GPU.

## Overview

- **Objective:**
  The goal is to train a model that, given a document, generates a relevant query. Each training example is formatted with custom markers (a formatting sketch follows this list):
  - `<|document|>\n` precedes the document text.
  - `<|query|>\n` precedes the query text.
  - An EOS token is appended at the end to signal termination.

- **Text Chunking:**
  For optimal performance, **chunk your text** into smaller, coherent pieces before providing it to the model. Long documents can lead the model to focus on specific details rather than the overall context (a simple chunking sketch appears after the Quick Usage example below).

- **Training Setup:**
  The model is fine-tuned using the Unsloth framework with LoRA adapters, taking advantage of an A100 GPU for efficient training (a configuration sketch is shown below).
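
To make the training format concrete, here is a minimal sketch of how a single example could be assembled. The `format_example` helper and the sample field values are illustrative, not the repo's actual preprocessing code:

```python
# Minimal sketch of the training-example format described above.
# The helper and sample values are illustrative; the model's actual
# preprocessing pipeline is not included in this repo.

def format_example(document: str, query: str, eos_token: str) -> str:
    """Wrap a (document, query) pair in the custom markers, ending with EOS."""
    return "<|document|>\n" + document + "\n<|query|>\n" + query + eos_token

example = format_example(
    document="liberal arts. 1. the academic course of instruction at a college ...",
    query="what are the liberal arts?",
    eos_token="</s>",  # in practice, use tokenizer.eos_token for the chosen model
)
print(example)
```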
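
For reference, a typical Unsloth LoRA setup looks roughly like the sketch below. The base model name, sequence length, rank, and target modules are assumptions for illustration, not the exact values used for this checkpoint:

```python
from unsloth import FastLanguageModel

# Load a 4-bit base model (model_name and max_seq_length are assumptions).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters; r, lora_alpha, and target_modules are common
# Unsloth defaults, not necessarily what this model was trained with.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
```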

## Quick Usage

Below is an example code snippet to load the finetuned model and test it with a chunked document:

```python
from unsloth import FastLanguageModel
from transformers import TextStreamer

# Load the finetuned model and tokenizer from the Hugging Face Hub.
# Replace 'your_username/your_model_repo_name' with your actual model repository.
model, tokenizer = FastLanguageModel.from_pretrained(
    "your_username/your_model_repo_name",
    load_in_4bit=True,
)

# Enable faster inference if supported.
FastLanguageModel.for_inference(model)

# Example document chunk (ensure text is appropriately chunked).
document_chunk = (
    "liberal arts. 1. the academic course of instruction at a college intended to provide general knowledge "
    "and comprising the arts, humanities, natural sciences, and social sciences, as opposed to professional or technical subjects."
)

# Create the prompt using the custom markers the model was trained on.
prompt = "<|document|>\n" + document_chunk + "\n<|query|>\n"

# Tokenize the prompt and move it to the GPU.
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")

# Set up a TextStreamer to view token-by-token generation.
streamer = TextStreamer(tokenizer, skip_prompt=True)

# Generate a query from the document.
_ = model.generate(
    input_ids=inputs["input_ids"],
    attention_mask=inputs["attention_mask"],  # explicit mask avoids padding warnings
    streamer=streamer,
    do_sample=True,   # required for temperature/min_p to take effect
    max_new_tokens=100,
    temperature=0.7,
    min_p=0.1,
    eos_token_id=tokenizer.eos_token_id,  # ensures proper termination
)
```
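
Since chunking is recommended above, here is one minimal way to split a long document into overlapping word windows before querying the model. This helper is illustrative, not part of the released code; a sentence-aware splitter may work better in practice:

```python
# Illustrative word-window chunker; not part of this repo's code.
def chunk_text(text: str, max_words: int = 120, overlap: int = 20) -> list[str]:
    """Split text into overlapping windows of at most max_words words."""
    words = text.split()
    step = max_words - overlap
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), step)]

long_document = " ".join(["liberal arts curriculum overview"] * 100)  # stand-in text
for chunk in chunk_text(long_document):
    prompt = "<|document|>\n" + chunk + "\n<|query|>\n"
    # ...tokenize and generate a query per chunk, as shown above...
```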
  # Uploaded model
 
  - **Developed by:** prdev