---
license: apache-2.0
datasets:
- armanc/scientific_papers
language:
- en
base_model:
- google/flan-t5-small
tags:
- summarization
- research-papers
- arxiv
- t5
---

# arxiv-summarization

This model is a fine-tuned version of [`google/flan-t5-small`](https://huggingface.co/google/flan-t5-small), trained on the arXiv subset of the `armanc/scientific_papers` dataset. It is optimized for **summarizing scientific abstracts**.

## Model Details

- **Base Model:** `google/flan-t5-small`
- **Training Data:** arXiv research papers (`article` → `abstract`)
- **Fine-Tuned Task:** Text summarization
- **Use Case:** Generate short summaries of long research papers
- **License:** Apache 2.0

## How to Use

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Load the fine-tuned checkpoint and its tokenizer from the Hugging Face Hub.
model = T5ForConditionalGeneration.from_pretrained("Talina06/arxiv-summarization")
tokenizer = T5Tokenizer.from_pretrained("Talina06/arxiv-summarization")

# Prefix the input with the task instruction.
text = "Summarize: Deep learning is being used to advance medical research, particularly in cancer detection."
inputs = tokenizer(text, return_tensors="pt")

# Generate with default settings and decode the first returned sequence.
summary_ids = model.generate(**inputs)
summary = tokenizer.decode(summary_ids[0], skip_special_tokens=True)

print("Generated Summary:", summary)
```

## Training Details

- **Training Data:** 100k+ arXiv research papers
- **Training Framework:** Hugging Face Transformers
- **Hyperparameters:**
  - Learning Rate: `5e-5`
  - Batch Size: `8`
  - Epochs: `10`
- **Hardware Used:** TPU & GPU

## Limitations

- ❌ May struggle with **very technical** papers (e.g., complex math formulas).

## Example Summaries

| **Original Abstract** | **Generated Summary** |
|----------------------|----------------------|
| "Deep learning has transformed many fields... We propose a new CNN for cancer detection..." | "A CNN model is proposed for cancer detection using deep learning." |
| "Quantum computing has shown potential for cryptographic applications..." | "Quantum computing can be used in cryptography." |
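
## Summarizing Longer Inputs (Sketch)

The usage example under "How to Use" passes one short string to the model. A full paper is usually longer than a small T5 model handles comfortably, so one possible workaround is to split the text into chunks, summarize each chunk, and join the results. The sketch below is illustrative only and is not how the model was trained or evaluated. The helper names (`chunk_words`, `summarize_long`, `make_model_summarizer`), the 300-word chunk size, the 512-token truncation limit, and the 64-token generation budget are all assumptions, not values from this card.

```python
from typing import Callable, List

PROMPT_PREFIX = "Summarize: "  # same prefix as the usage example in this card


def chunk_words(text: str, max_words: int = 300) -> List[str]:
    """Split text into consecutive chunks of at most max_words words."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def summarize_long(text: str, summarize_fn: Callable[[str], str], max_words: int = 300) -> str:
    """Summarize each chunk independently and join the partial summaries."""
    parts = [summarize_fn(PROMPT_PREFIX + chunk) for chunk in chunk_words(text, max_words)]
    return " ".join(p.strip() for p in parts if p.strip())


def make_model_summarizer(model_id: str = "Talina06/arxiv-summarization") -> Callable[[str], str]:
    """Wrap the checkpoint as a prompt -> summary callable (downloads weights)."""
    # Imported lazily so the chunking helpers work without transformers installed.
    from transformers import T5ForConditionalGeneration, T5Tokenizer

    model = T5ForConditionalGeneration.from_pretrained(model_id)
    tokenizer = T5Tokenizer.from_pretrained(model_id)

    def run(prompt: str) -> str:
        # Truncation limit and generation budget are assumed values, not from the card.
        inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=512)
        ids = model.generate(**inputs, max_new_tokens=64)
        return tokenizer.decode(ids[0], skip_special_tokens=True)

    return run
```

To try it with the model, call `summarize_long(paper_text, make_model_summarizer())`, where `paper_text` is a string you supply. Because each chunk is summarized on its own, the joined output can repeat itself or lose context that spans chunks, so treat it as a starting point.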