[OnDeviceMedNotes/Struct_Med_Note_v01]

This is a merged model based on unsloth/Llama-3.2-1B-Instruct-unsloth-bnb-4bit, fine-tuned with PEFT (LoRA) using the Unsloth library to generate SOAP notes from provided patient information.

Fine-tuning produced the PEFT adapter weights hosted at OnDeviceMedNotes/SOAP, which were then merged into the base model weights using Unsloth's optimized saving capabilities.
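As a rough illustration of what merging an adapter involves: LoRA stores a low-rank update as two small matrices, A and B, and merging folds the scaled product into each adapted weight matrix, W' = W + (alpha / r) * (B @ A). The sketch below shows that arithmetic on plain Python lists. The function names and toy values are illustrative assumptions, not Unsloth's implementation, and the sketch ignores the dequantization step that a 4-bit base model would need before merging.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def merge_lora(w, lora_a, lora_b, alpha, r):
    """Return W + (alpha / r) * (B @ A), the merged weight matrix.

    w      : base weight, shape (out, in)
    lora_a : LoRA A matrix, shape (r, in)
    lora_b : LoRA B matrix, shape (out, r)
    """
    scale = alpha / r
    delta = matmul(lora_b, lora_a)  # low-rank update, shape (out, in)
    return [
        [w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
        for i in range(len(w))
    ]


# Toy example with rank r = 1 (hypothetical values)
base = [[1, 0], [0, 1]]
a = [[3, 4]]       # shape (1, 2)
b = [[1], [2]]     # shape (2, 1)
merged = merge_lora(base, a, b, alpha=2, r=1)
print(merged)
```

After merging, the adapter matrices are no longer needed at inference time, which is why the merged model can be loaded without the peft library.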

Model Details

Base Model: unsloth/Llama-3.2-1B-Instruct-unsloth-bnb-4bit (a Llama 3.2 Instruct model with roughly 1.2 billion parameters, optimized by Unsloth)
Fine-tuning Method: PEFT (Parameter-Efficient Fine-Tuning) using LoRA.
Training Framework: Unsloth library for accelerated fine-tuning and merging.
Task: Text Generation (specifically, generating structured SOAP notes).

Training Details

Dataset: starfishdata/playground_endocronology_notes_1500, a collection of 1,500 synthetic patient cases specifically formatted for SOAP note generation.

Intended Use

This model is intended to assist healthcare professionals or students in drafting SOAP notes based on provided patient subjective, objective, assessment, and plan information.

It is NOT a substitute for professional medical judgment. All generated content must be reviewed, verified, and edited by a qualified healthcare professional before use in a clinical setting.

How to cite

On Device Medical Notes. Struct_Med_Note_v01 (Revision 3ca1b5c). Hugging Face, 2025. URL: https://huggingface.co/OnDeviceMedNotes/Struct_Med_Note_v01 DOI: 10.57967/hf/5354

How to Use

Since this is a merged model, you can load it directly using the standard Hugging Face transformers library, with no separate adapter-loading step:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Specify the model identifier on Hugging Face Hub
model_id = "OnDeviceMedNotes/Struct_Med_Note_v01" 

# Load the merged model and tokenizer
print(f"Loading model from {model_id}...")
# Use appropriate dtype based on how you saved (e.g., float16, bfloat16)
# device_map='auto' helps load the model efficiently across available devices (GPU/CPU)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16, # Adjust dtype if needed (e.g., torch.bfloat16)
    device_map="auto",
    # Add other loading arguments if applicable (e.g., quantization args if you quantized AFTER merging)
)
print("Model loaded successfully.")

print("Loading tokenizer...")
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Set pad_token if it's None, often useful for generation
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
print("Tokenizer loaded successfully.")

# Determine the device
device = "cuda" if torch.cuda.is_available() else "cpu"
# model.to(device) # If not using device_map='auto'


# Define a system prompt (optional, but helpful for context)
#SYSTEM_PROMPT = """You are an expert medical scribe. Given a patient encounter transcript, create a medical SOAP note."""

# Prepare the user message: the formatting instructions followed by the transcript.
# The Llama 3 chat template is applied to this message further below.
input_text = """Convert the following medical transcript to a structured medical note:
Use these sections in this order: 
1. Presenting Illness
(Bullet point statements of the main problem and duration)
2. History of Presenting Illness
(Chronological narrative: symptom onset, progression, modifiers, associated factors)
3. Past Medical History
(List chronic illnesses and past medical diagnoses mentioned in the transcript. Do not include surgeries)
4. Surgical History
(List prior surgeries with year if known mentioned in the transcript)
5. Family History
(Relevant family history mentioned in the transcript)
6. Social History
(Occupation, tobacco/alcohol/drug use, exercise, living situation if mentioned in the transcript)
7. Allergy History
(Drug/food/environmental allergies + reactions - if mentioned in the transcript)
8. Medication History
(List medications the patient is already taking under  Medication History. Do not place any NEW or PROPOSED drugs in this section.) 
9. Dietary History
(“Not applicable” if unrelated, otherwise summarize diet pattern)
10. Review of Systems
(Head-to-toe α-ordered bullets; note positives and pertinent negatives- mentioned in the transcript)
11. Physical Exam Findings
Vital Signs (BP, HR, RR, Temp, SpO₂, HT, WT, BMI) - if mentioned in the transcript
(Structured by system: General, HEENT, CV, Resp, Abd, Neuro, MSK, Skin, Psych) - if mentioned in the transcript
12. Labs and Imaging
(labs, imaging results)
13. Assessment and Plan 
(List each diagnoses and treatment plan. No other information needed in this section)
Transcript: 
Good morning, Johnson. I understand you're here for follow-up of your hypertension.  How have you been feeling lately? Well, Doctor, I have been having some trouble sleeping lately.  
Are you wake up with a headache most morning? And I have noticed my vision gets a bit blurry  sometimes, especially when I wake up first. 
Have you been taking your blood pressure medication  since I've prescribed? Yes, I take lisinopril 10 milligrams every morning, but I'm not sure  it's working well. 
I've been checking my blood pressure at home and it's usually 150 over 95.  That's higher than we would like it to be. 
Have you made any changes to your diet or exercise  routine? I've been trying to eat less salt, but I haven't really increased my exercise. 
I get winded  pretty easily when I try to walk for more than a few minutes. Any chest pain or shortness of breath?  
No chest pain, but I feel shortness of breath sometimes, especially when I climb stairs.  Any swelling in your legs or feet? Yeah, I have noticed my ankles are a bit puffy by the end of the day.  
All right, let's do a physical exam. Your blood pressure today is 160 over 100, which is quite high.  
Your heart weights 88 beats per minute. Let me listen to your heart and lungs. I hear a slight  murmur in your heart and there are some crackles at the base of your lungs. 
Your angles do have some  edema. You know what? I like to do some tests. We'll do an EKG, chest x-ray and some blood work,  including electrolytes and kidney function. 
Based on these findings, you know, I think we need to  adjust your medications. I'm going to increase your lisinopril to 20 milligrams daily and add  a diuretic hydrochlorothiocide, 25 milligrams daily. 
This should help with your blood pressure and the  swelling. I wanted to start at low sodium diet and try to walk for 10 minutes a day. 
Gradually  increasing as you can tolerate. We will form up in two weeks to see you how you're doing. Do you  have any questions? Nope. I think I understand. Thank you, Dr.  Thank you
"""

# Format the prompt using the chat template (essential for instruct models)

messages = [
    {"role": "user", "content": input_text},
]
input_ids = tokenizer.apply_chat_template(
    messages,
    return_tensors="pt",
    add_generation_prompt=True # Adds the assistant turn prompt
).to(device)

print("\nGenerating SOAP note...")
with torch.no_grad(): # Disable gradient calculation for inference
    outputs = model.generate(
        input_ids,
        max_new_tokens=4096, # Adjust as needed for length of SOAP notes
        do_sample=True,      # Set to True for sampling, False for greedy
        temperature=0.1,     # Controls randomness (lower=less random)
        top_p=0.9,             
        num_return_sequences=1,
        eos_token_id=tokenizer.eos_token_id, # Stop generation at EOS token
        pad_token_id=tokenizer.pad_token_id, # Use pad_token or eos_token
        use_cache=True
    )

print("Generation complete.")

# Decode and display the generated text.
# outputs[0] holds the prompt tokens followed by the newly generated tokens,
# so slice off the prompt by token count and decode only the remainder.
prompt_length = input_ids.shape[-1]
decoded_generated_text = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True).strip()


print("\n--- Generated SOAP Note ---")
print(decoded_generated_text)
print("---------------------------")
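Because the example prompt asks for thirteen sections in a fixed order, a quick structural check on the output can catch truncated or reordered notes before a human review. The sketch below is one possible approach, not part of the model's tooling: `check_sections` and `EXPECTED_SECTIONS` are hypothetical names, and the heading normalization assumes the model writes each section title on its own line, optionally with numbering or markdown emphasis.

```python
import re

# Section titles copied from the example prompt, in the requested order.
EXPECTED_SECTIONS = [
    "Presenting Illness",
    "History of Presenting Illness",
    "Past Medical History",
    "Surgical History",
    "Family History",
    "Social History",
    "Allergy History",
    "Medication History",
    "Dietary History",
    "Review of Systems",
    "Physical Exam Findings",
    "Labs and Imaging",
    "Assessment and Plan",
]


def _normalize(line):
    """Strip list numbering, markdown markers, and trailing colons from a line."""
    line = re.sub(r"^[\s#*\-\d.)]+", "", line)
    line = re.sub(r"[\s:*]+$", "", line)
    return line.lower()


def check_sections(note, expected=EXPECTED_SECTIONS):
    """Return (missing_titles, in_order) for a generated note."""
    wanted = {title.lower(): title for title in expected}
    first_seen = {}
    for line_no, line in enumerate(note.splitlines()):
        key = _normalize(line)
        if key in wanted and wanted[key] not in first_seen:
            first_seen[wanted[key]] = line_no
    missing = [t for t in expected if t not in first_seen]
    positions = [first_seen[t] for t in expected if t in first_seen]
    return missing, positions == sorted(positions)
```

With the script above, `missing, in_order = check_sections(decoded_generated_text)` would report which headings are absent. A clean result only says the layout matches; it says nothing about whether the clinical content is correct.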

Limitations and Ethical Considerations

Accuracy: Language models can generate factually incorrect or misleading information ("hallucinate"). The generated SOAP notes should be treated as drafts requiring careful review and validation by a qualified medical professional. Do not rely on this model for medical diagnosis, treatment decisions, or clinical advice.
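One simple aid for that review is to flag numbers in the note (doses, vital signs) that never appear in the transcript. The sketch below is a heuristic under stated assumptions, not a validation method: `unsupported_numbers` is a hypothetical helper, it compares digit strings only, and it will raise false alarms when the transcript spells a number out ("two weeks") or when the note derives a value. It cannot detect a correct number attached to the wrong drug or finding.

```python
import re

_NUMBER = re.compile(r"\d+(?:\.\d+)?")
_LIST_PREFIX = re.compile(r"^\s*\d+[.)]\s+")  # e.g. "3. " at the start of a heading


def unsupported_numbers(transcript, note):
    """Return numbers found in the note but nowhere in the transcript."""
    source = set(_NUMBER.findall(transcript))
    found = set()
    for line in note.splitlines():
        # Drop section numbering so "1." in "1. Presenting Illness" is not flagged.
        line = _LIST_PREFIX.sub("", line)
        found.update(_NUMBER.findall(line))
    return sorted(found - source, key=float)


# Hypothetical miniature example
transcript = "I take lisinopril 10 milligrams. Your blood pressure today is 160 over 100."
note = (
    "1. Medication History\n"
    "- Lisinopril 10 mg daily\n"
    "2. Physical Exam Findings\n"
    "- BP 160/100, HR 72"
)
print(unsupported_numbers(transcript, note))
```

Any value this flags deserves a look, but an empty result does not mean the note is accurate.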

Bias: The model may inherit biases from its base model and the data it was fine-tuned on. This could potentially manifest in biased language or inappropriate suggestions.

Patient Data Privacy: Never use real, protected patient health information (PHI) as input to this model unless you have implemented appropriate safeguards and are compliant with relevant privacy regulations (e.g., HIPAA). The example input provided uses synthetic data.

Scope: The model's performance is limited to the types of cases and data it was trained on. It may perform poorly on novel or complex medical scenarios not represented in the training data.

Disclaimer

THE GENERATED CONTENT IS PROVIDED FOR INFORMATIONAL AND ASSISTIVE PURPOSES ONLY. IT DOES NOT CONSTITUTE MEDICAL ADVICE AND SHOULD NOT BE USED TO DIAGNOSE, TREAT, CURE, OR PREVENT ANY MEDICAL CONDITION. ALWAYS SEEK THE ADVICE OF A QUALIFIED HEALTHCARE PROFESSIONAL WITH ANY QUESTIONS REGARDING A MEDICAL CONDITION. THE DEVELOPERS OF THIS MODEL ARE NOT RESPONSIBLE FOR ANY OUTCOMES RESULTING FROM THE USE OF THIS MODEL.

Acknowledgements

This model is based on the excellent work by Meta AI on the Llama 3 family of models. Fine-tuning and merging were significantly accelerated and simplified using the Unsloth library. Thanks to the Hugging Face transformers and peft libraries.

License

This merged model inherits the license of the base model, the Llama 3.2 Community License. Please review the terms of this license before using the model.
