Commit e1facd3 (verified) · ashimdahal committed · 1 parent: 5ee111e

Add/Update generated README.md

Files changed (1): README.md (+10 −8)
README.md CHANGED
@@ -6,8 +6,9 @@ tags:
 - generated-by-script
 - peft # Assume PEFT adapter unless explicitly a full model repo
 - image-captioning # Add more specific task tags if applicable
-base_model:
-- microsoft/git-base # Heuristic guess for decoder, VERIFY MANUALLY
+base_model: [] # <-- FIXED: Provide empty list as default to satisfy validator
+# - microsoft/git-base # Heuristic guess for processor, VERIFY MANUALLY
+# - microsoft/git-base # Heuristic guess for decoder, VERIFY MANUALLY
 ---

 # Model: ashimdahal/microsoft-git-base_microsoft-git-base
@@ -24,7 +25,7 @@ https://github.com/ashimdahal/captioning_image/blob/main

 **⚠️ Important:** The `base_model` tag in the metadata above is initially empty. The models listed here are *heuristic guesses* based on the training directory name (`microsoft-git-base_microsoft-git-base`). Please verify these against your training configuration and update the `base_model:` list in the YAML metadata block at the top of this README with the correct Hugging Face model identifiers.

-## How to Use (Example with PEFT)::: This is generated by script and not verified manually so proceed with caution
+## How to Use (Example with PEFT)

 ```python
 from transformers import AutoProcessor, AutoModelForVision2Seq, Blip2ForConditionalGeneration # Or other relevant classes
@@ -33,7 +34,7 @@ import torch

 # --- Configuration ---
 # 1. Specify the EXACT base model identifiers used during training
-# base_processor_id = "microsoft/git-base" # <-- Replace with correct HF ID
+base_processor_id = "microsoft/git-base" # <-- Replace with correct HF ID
 base_model_id = "microsoft/git-base" # <-- Replace with correct HF ID (e.g., Salesforce/blip2-opt-2.7b)

 # 2. Specify the PEFT adapter repository ID (this repo)
@@ -47,9 +48,11 @@ processor = AutoProcessor.from_pretrained(base_processor_id)
 base_model = Blip2ForConditionalGeneration.from_pretrained(
     base_model_id,
     torch_dtype=torch.float16 # Or torch.bfloat16 or float32, match training/inference needs
-)
+)
 # Or for other model types:
 base_model = AutoModelForVision2Seq.from_pretrained(base_model_id, torch_dtype=torch.float16)
+base_model = AutoModelForCausalLM
+......

 # --- Load PEFT Adapter ---
 # Load the adapter config and merge the adapter weights into the base model
@@ -60,16 +63,15 @@ model.eval() # Set model to evaluation mode
 # --- Inference Example ---
 device = "cuda" if torch.cuda.is_available() else "cpu"
 model.to(device)
-#
+
 image = ... # Load your image (e.g., using PIL)
 text = "a photo of" # Optional prompt start
-#
+
 inputs = processor(images=image, text=text, return_tensors="pt").to(device, torch.float16) # Match model dtype

 generated_ids = model.generate(**inputs, max_new_tokens=50)
 generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0].strip()
 print(f"Generated Caption: {{generated_text}}")
-
 ```

 *More model-specific documentation, evaluation results, and usage examples should be added here.*
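
The commit's metadata fix only inserts an empty `base_model: []` placeholder to satisfy the Hub validator. For reference, once the heuristic guess is verified against the training configuration, the front matter would list the confirmed identifier. A minimal sketch, assuming `microsoft/git-base` really was the base checkpoint used for both processor and decoder:

```yaml
---
tags:
- generated-by-script
- peft
- image-captioning
base_model:
- microsoft/git-base  # verified identifier replaces the empty-list placeholder
---
```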
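
The generated usage snippet in the README mixes several alternatives (`Blip2ForConditionalGeneration`, `AutoModelForVision2Seq`, an unfinished `AutoModelForCausalLM` line) and its `print(f"... {{generated_text}}")` uses doubled braces, which an f-string renders as the literal text `{generated_text}` rather than the caption. Below is a minimal, runnable sketch of the same workflow. It assumes the base checkpoint really is `microsoft/git-base` (a GIT model, loaded via `AutoModelForCausalLM` rather than a BLIP-2 class), that this repo hosts a standard PEFT adapter for it, and that `adapter_repo_id` matches this repository; verify all three before relying on it.

```python
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForCausalLM
from peft import PeftModel

# Assumed identifiers -- verify against the training configuration
base_model_id = "microsoft/git-base"
adapter_repo_id = "ashimdahal/microsoft-git-base_microsoft-git-base"  # this repo (assumed)

# Processor and base model (GIT checkpoints load through AutoModelForCausalLM)
processor = AutoProcessor.from_pretrained(base_model_id)
base_model = AutoModelForCausalLM.from_pretrained(base_model_id, torch_dtype=torch.float16)

# Attach the PEFT adapter weights on top of the frozen base model
model = PeftModel.from_pretrained(base_model, adapter_repo_id)
model.eval()

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

# Caption a single image
image = Image.open("example.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt").to(device, torch.float16)

generated_ids = model.generate(**inputs, max_new_tokens=50)
caption = processor.batch_decode(generated_ids, skip_special_tokens=True)[0].strip()
print(f"Generated Caption: {caption}")  # single braces, unlike the README's {{...}}
```

As in the README, a text prefix such as `"a photo of"` can optionally be passed to the processor alongside the image to prompt the start of the caption.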