---
base_model: unsloth/deepseek-r1-distill-llama-8b-unsloth-bnb-4bit
library_name: peft
license: mit
language:
- en
---

# Model Card for SQL Injection Classifier

This model classifies SQL queries as either normal (0) or potential SQL injection attacks (1).

## Model Details

### Model Description

This model is trained to identify SQL injection, a code injection technique in which an attacker causes arbitrary SQL to be executed as part of a database query. Given the text of a SQL query, the model predicts whether it is a normal query or contains malicious code indicative of an injection attack.

- **Developed by:** [More Information Needed]
- **Funded by [optional]:** [More Information Needed]
- **Shared by [optional]:** [More Information Needed]
- **Model type:** Fine-tuned Llama 8B model (DeepSeek-R1 distilled version)
- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model [optional]:** unsloth/deepseek-r1-distill-llama-8b-unsloth-bnb-4bit

### Model Sources [optional]

- **Repository:** [More Information Needed]
- **Paper [optional]:** [More Information Needed]
- **Demo [optional]:** [More Information Needed]

## Uses

### Colab Use

The code snippet below loads the SQL Injection Classifier and predicts whether a given SQL query is normal or an injection attack.
```python
# 1 = SQL injection query, 0 = normal SQL query
from unsloth import FastLanguageModel

# Load the model and tokenizer
model_name = "shukdevdatta123/sql_injection_classifier_DeepSeek_R1_fine_tuned_model"
hf_token = "your hf tokens"  # replace with your Hugging Face access token

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=model_name,
    load_in_4bit=True,
    token=hf_token,
)

# Switch the model to Unsloth's inference mode once, right after loading
FastLanguageModel.for_inference(model)


def predict_sql_injection(query):
    """Return the text the model generates after the classification marker."""
    prompt = (
        "### Instruction:\n"
        "Classify the following SQL query as normal (0) or an injection attack (1).\n\n"
        f"### Query:\n{query}\n\n"
        "### Classification:\n"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to("cuda")

    # Generate the classification
    outputs = model.generate(
        input_ids=inputs.input_ids,
        attention_mask=inputs.attention_mask,
        max_new_tokens=1000,
        use_cache=True,
    )
    prediction = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
    # Keep only the text that follows the final classification marker
    return prediction.split("### Classification:\n")[-1].strip()


# Example usage
test_query = "SELECT * FROM users WHERE id = '1' OR '1'='1' --"
result = predict_sql_injection(test_query)
print(f"Query: {test_query}\nPrediction: {result}")
```

### Downstream Use

This model can be integrated into applications that need SQL injection detection, such as web application firewalls, database query analyzers, and security auditing tools, where it can help flag potentially malicious queries.

### Out-of-Scope Use

This model should not be used for malicious purposes, such as testing vulnerabilities on unauthorized systems, or for making security decisions without human oversight. Its predictions should be interpreted with caution and supplemented with additional security measures.
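
Because `predict_sql_injection` returns free-form generated text (up to 1000 new tokens), downstream code usually needs to reduce it to a label before acting on it. The following is a minimal sketch of one way to do that, not part of the released model: `parse_label` is a hypothetical helper, and it assumes the generated text mentions the label as a standalone digit `0` or `1`. When the text is empty or mentions both digits, it returns `None` so the query can be routed to human review.

```python
import re
from typing import Optional


def parse_label(generated_text: str) -> Optional[int]:
    """Map generated text to 0 or 1, or None when the answer is unclear.

    Hypothetical helper: assumes the label appears as a standalone digit.
    """
    # Collect every standalone "0" or "1" (digits inside longer numbers are skipped)
    found = set(re.findall(r"\b[01]\b", generated_text))
    if len(found) == 1:
        return int(found.pop())
    # No label, or both labels mentioned: treat as ambiguous
    return None


if __name__ == "__main__":
    for text in ["1", "The query is normal (0).", "could be 0 or 1", ""]:
        print(repr(text), "->", parse_label(text))
```

Returning `None` rather than guessing keeps ambiguous outputs visible, which fits the recommendation above not to make security decisions without human oversight.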
## Bias, Risks, and Limitations

This model was trained on a dataset of SQL queries and may exhibit certain limitations:

- **Bias**: The model may generalize poorly to SQL injection types or database dialects that are not represented in the training set.
- **Risks**: False negatives can let SQL injection attacks go undetected, and false positives can flag normal queries as injections.
- **Limitations**: The model may not perform well on highly obfuscated attacks or on queries that exploit novel vulnerabilities not present in the training data.

### Recommendations

Users (both direct and downstream) should be aware of the risks of relying on the model in security-sensitive applications. Additional domain-specific testing and validation are recommended before deployment.

## How to Get Started with the Model (Colab Streamlit)

```python
# Cell 1: install dependencies
!pip install unsloth streamlit

# Cell 2: write the Streamlit app to app.py
# (%%writefile must be the first line of its own cell)
%%writefile app.py
# 1 = SQL injection query, 0 = normal SQL query
import streamlit as st
from unsloth import FastLanguageModel

model_name = "shukdevdatta123/sql_injection_classifier_DeepSeek_R1_fine_tuned_model"

# Streamlit UI for input
st.title("SQL Injection Classifier")
hf_token = st.text_input("Enter your Hugging Face Token", type="password")


@st.cache_resource
def load_model(token):
    """Load the model once and reuse it across Streamlit reruns."""
    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name=model_name,
        load_in_4bit=True,
        token=token,
    )
    # Switch the model to Unsloth's inference mode
    FastLanguageModel.for_inference(model)
    return model, tokenizer


if hf_token:
    try:
        model, tokenizer = load_model(hf_token)
    except Exception as e:
        st.error(f"Error loading model: {e}")
        st.stop()

    def predict_sql_injection(query):
        """Return the text the model generates after the classification marker."""
        prompt = (
            "### Instruction:\n"
            "Classify the following SQL query as normal (0) or an injection attack (1).\n\n"
            f"### Query:\n{query}\n\n"
            "### Classification:\n"
        )
        inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
        outputs = model.generate(
            input_ids=inputs.input_ids,
            attention_mask=inputs.attention_mask,
            max_new_tokens=1000,
            use_cache=True,
        )
        prediction = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
        return prediction.split("### Classification:\n")[-1].strip()

    # Input query from the user
    query = st.text_area("Enter an SQL query to test for injection", "")

    # Classify the query when the button is pressed
    if st.button("Classify SQL Injection"):
        if query:
            result = predict_sql_injection(query)
            st.write(f"Prediction: {result}")
        else:
            st.write("Please enter a SQL query first.")
else:
    st.write("Please enter your Hugging Face token to proceed.")

# Cell 3: launch the app and expose port 8501 through localtunnel
!streamlit run app.py & npx localtunnel --port 8501
```

## Training Details

### Training Data

The model was trained on a dataset of SQL queries containing both SQL injection examples and normal queries. Each query is labeled as either normal (0) or an injection (1).

### Training Procedure

The model was fine-tuned with PEFT (Parameter-Efficient Fine-Tuning), adapting a pre-trained Llama 8B model to the task of SQL injection detection.

#### Training Hyperparameters

- **Training regime:** Mixed precision (fp16).
- **Learning rate:** 2e-4.
- **Batch size:** 2 per device, with 4 gradient accumulation steps (effective batch size of 8 per device).
- **Max steps:** 200.

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

The evaluation was performed on a separate set of labeled SQL queries designed to test the model's ability to differentiate between normal queries and SQL injection attacks.

#### Metrics

- **Accuracy:** The fraction of queries the model classifies correctly.
- **Precision and Recall:** How well the model detects injection attacks (recall) while avoiding false positives (precision).

### Results

The model was evaluated based on the training loss across 200 steps.
Below is the training loss progression, logged every 10 steps:

| Step | Training Loss |
|------|---------------|
| 10   | 2.951600      |
| 20   | 1.572900      |
| 30   | 1.370200      |
| 40   | 1.081900      |
| 50   | 0.946200      |
| 60   | 1.028700      |
| 70   | 0.873700      |
| 80   | 0.793300      |
| 90   | 0.892700      |
| 100  | 0.863000      |
| 110  | 0.694700      |
| 120  | 0.685900      |
| 130  | 0.778400      |
| 140  | 0.748500      |
| 150  | 0.721600      |
| 160  | 0.714400      |
| 170  | 0.764900      |
| 180  | 0.750800      |
| 190  | 0.664200      |
| 200  | 0.700600      |

#### Summary

Training loss drops sharply over the first 100 steps, from 2.95 at step 10 to 0.86 at step 100, indicating good convergence during fine-tuning. After step 100 the loss is more stable, fluctuating between roughly 0.66 and 0.78, and it ends at 0.70 at step 200. This suggests the model adapted to the task of classifying SQL injections.

## Technical Specifications

### Model Architecture and Objective

The model is based on a fine-tuned Llama 8B architecture. It uses PEFT to reduce the number of trainable parameters during fine-tuning while still maintaining good performance.

### Compute Infrastructure

The model was trained on Google Colab, using mixed precision and gradient accumulation.

#### Hardware

NVIDIA T4 GPU (Google Colab)

#### Software

- **Libraries:** Hugging Face Transformers, unsloth, TRL, PyTorch.
- **Training Framework:** PEFT.

## Glossary

- **SQL Injection**: A type of attack in which malicious SQL statements are executed against an application's database.
- **PEFT**: Parameter-Efficient Fine-Tuning, a family of techniques that fine-tune large models by updating only a small number of parameters.
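
The reading of the loss table given in the Summary can be checked with a few lines of Python. This is a small sketch rather than part of the training code: the values are copied from the Results table, and the names `losses` and `summarize` and the split at step 100 are illustrative choices.

```python
from statistics import mean

# Training loss logged every 10 steps, copied from the Results table
losses = {
    10: 2.9516, 20: 1.5729, 30: 1.3702, 40: 1.0819, 50: 0.9462,
    60: 1.0287, 70: 0.8737, 80: 0.7933, 90: 0.8927, 100: 0.8630,
    110: 0.6947, 120: 0.6859, 130: 0.7784, 140: 0.7485, 150: 0.7216,
    160: 0.7144, 170: 0.7649, 180: 0.7508, 190: 0.6642, 200: 0.7006,
}


def summarize(losses, split_step=100):
    """Compare the losses up to split_step with the losses after it."""
    early = [v for step, v in losses.items() if step <= split_step]
    late = [v for step, v in losses.items() if step > split_step]
    return {
        "early_mean": mean(early),
        "late_mean": mean(late),
        "late_min": min(late),
        "late_max": max(late),
    }


summary = summarize(losses)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

A lower mean and a narrow range after step 100 match the description of a loss that has largely plateaued. Training loss alone does not give the accuracy, precision, or recall of the classifier.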
## Model Card Authors

[Shukdev Datta](https://www.linkedin.com/in/shukdev-datta-729767144/)

## Model Card Contact

- **Email**: shukdevdatta@gmail.com
- **GitHub**: [Click here to access the GitHub profile](https://github.com/shukdevtroy)
- **WhatsApp**: [Click here to chat](https://wa.me/+8801719296601)

### Framework versions

- PEFT 0.14.0