How to use?

  • We use Unsloth for faster inference and load the adapter:
from unsloth import FastLanguageModel
max_seq_length = 8192  # context window; FastApply prompts (code + update) can be long
dtype = None  # None lets Unsloth auto-detect (float16 or bfloat16)
load_in_4bit = True  # 4-bit quantization to reduce memory usage
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "patched-codes/Llama-3.2-1B-FastApply",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)
FastLanguageModel.for_inference(model) # Enable native 2x faster inference
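
If Unsloth is not available in your environment, the checkpoint can also be loaded with plain transformers. This is a minimal sketch, assuming the repository hosts standard merged weights; the BitsAndBytesConfig values are illustrative:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit = True,
    bnb_4bit_compute_dtype = torch.bfloat16,  # assumes an Ampere+ GPU; use torch.float16 otherwise
)
tokenizer = AutoTokenizer.from_pretrained("patched-codes/Llama-3.2-1B-FastApply")
model = AutoModelForCausalLM.from_pretrained(
    "patched-codes/Llama-3.2-1B-FastApply",
    quantization_config = bnb_config,
    device_map = "auto",
)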
  • The model takes the original code and an update snippet as input and generates the final updated code:
original_code = """import React from 'react';
import { Loader } from 'lucide-react';

interface ButtonProps {
  text: string;
  onClick?: () => void;
  loading?: boolean;
  disabled?: boolean;
  icon?: React.ReactNode;
}

const Button: React.FC<ButtonProps> = ({
  text,
  onClick,
  loading = false,
  disabled = false,
  icon
}) => (
  <button
    className="bg-blue-500 text-white p-2 rounded flex items-center gap-2"
    onClick={onClick}
    disabled={disabled || loading}
  >
    {loading ? <Loader className="animate-spin" /> : icon}
    {text}
  </button>
);

export default Button;
"""

update_snippet = """interface ButtonProps {
  variant?: 'primary' | 'secondary' | 'danger';
  size?: 'small' | 'medium' | 'large';
  // ... other props
}

const Button: React.FC<ButtonProps> = ({
  variant = 'primary',
  size = 'medium',
  // ... other props
}) => (
  <button
    className={`flex items-center gap-2 rounded ${
      size === 'small' ? 'p-1 text-sm' :
      size === 'large' ? 'p-3 text-lg' :
      'p-2 text-md'
    } ${
      variant === 'primary' ? 'bg-blue-500 text-white' :
      variant === 'secondary' ? 'bg-gray-500 text-white' :
      'bg-red-500 text-white'
    }`}
    // ... other attributes
  >
    // ... existing code ...
  </button>
);
"""
  • Prepare your input following the prompt structure:
input_text = f"""
Merge all changes from the <update> snippet into the <code> below.
- Preserve the code's structure, order, comments, and indentation exactly.
- Output only the updated code, enclosed within <updated-code> and </updated-code> tags.
- Do not include any additional text, explanations, placeholders, ellipses, or code fences.

<code>{original_code}</code>

<update>{update_snippet}</update>

Provide the complete updated code.
"""

messages = [
    {"role": "system", "content": "You are a coding assistant that helps merge code updates, ensuring every modification is fully integrated."},
    {"role": "user", "content": input_text.strip()},
]

inputs = tokenizer.apply_chat_template(
    messages,
    tokenize = True,
    add_generation_prompt = True, # Must add for generation
    return_tensors = "pt",
).to("cuda")

from transformers import TextStreamer
text_streamer = TextStreamer(tokenizer, skip_prompt = True)
output = model.generate(
    input_ids = inputs,
    streamer = text_streamer,
    max_new_tokens = 8192,
    use_cache = True,
    temperature = 1.5,
    min_p = 0.1,
)
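
If you prefer reproducible merges, greedy decoding is an optional variation (not part of the original recipe above):

output = model.generate(
    input_ids = inputs,
    streamer = text_streamer,
    max_new_tokens = 8192,
    use_cache = True,
    do_sample = False,  # greedy decoding: deterministic output
)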

response = tokenizer.decode(output[0][len(inputs[0]):])

updated_code = response.split("<updated-code>")[1].split("</updated-code>")[0]
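
The single split above assumes the model emitted both tags. A slightly more defensive extraction (a sketch using only the tag names defined in the prompt):

import re

match = re.search(r"<updated-code>(.*?)</updated-code>", response, re.DOTALL)
if match is None:
    raise ValueError("model output did not contain <updated-code> tags")
updated_code = match.group(1).strip()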

Uploaded model

  • Developed by: patched-codes
  • License: apache-2.0
  • Finetuned from model: unsloth/llama-3.2-1b-instruct-bnb-4bit

This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.
