diff --git "a/train-mygameintro-gemma-nathmath-v1-9-demo.ipynb" "b/train-mygameintro-gemma-nathmath-v1-9-demo.ipynb" new file mode 100644--- /dev/null +++ "b/train-mygameintro-gemma-nathmath-v1-9-demo.ipynb" @@ -0,0 +1,2726 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "Wj4lhJCuMYcm" + }, + "source": [ + "# Training on Your Private Data - by NathMath @ bilibili" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "大家好,这里是Nathmath。我将用一个系列给你们讲解如何基于预训练好的底模进行大语言模型的私有数据微调。\n", + "> * 区别于部分UP主,我可能会“废话”很多。但是,“废话”才是你们学习的基础。因为我的“废话”是在讲解原理,让你们`能“鱼”也能“渔”`(钓鱼),意思是懂得原理,就可以不仅仅学会这一个微调,并且能够自己用在其他需要的地方,迁移学习。而不是仅仅学会我这一个东西,无法举一反三。\n", + "\n", + "> * 本系列视频特别推荐大家动手。以本期视频举例,很多同学还不会准备数据集,没事,请一定要拿我的数据先跑一遍,遍跑遍听我的讲解,理解每一步在做什么;我后面的视频会继续教你们怎么准备数据集(会的同学仅看本期就可以),以及怎么进行多轮对话训练、怎么进行思考训练、怎么进行其他模型的训练;当然,最基础的,建议大家自己`先照猫画虎把我的Notebook跑通`,然后再自己尝试自己的数据。\n", + "\n", + "> * 微调和训练是很难很难的内容。包括训练数据准备。在行内,有着“`数据处理80%,建模训练20%`”的行话,意思是数据处理所消耗的时间和精力占到整个机器学习的80%,其也决定了你模型的质量的80%,因为\"garbage in, garbage out\"(进去的是垃圾,出来的也是垃圾)。大家`一定不要灰心`,如果想学的话,踏踏实实学,有问题就问ChatGPT/DeepSeek,它能解决很多问题。\n", + "\n", + "> * 关于在线训练平台。UP个人推荐Kaggle。原因是`每周`有30小时的免费的T4(16G)x2的GPU使用,需要注册并完成手机号认证(认证时候中国手机记着加上86)。另外提醒,数据特别敏感的个人或者企业用户请自己花钱租用服务器。" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "IlrY86-MNfjf" + }, + "source": [ + "## 1. Prepare the Environment" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "0AhWNVC9U9B4", + "trusted": true + }, + "outputs": [], + "source": [ + "# Reference https://colab.research.google.com/drive/1Ys44kVvmeZtnICzWz0xgpRnrIOjZAuxp?usp=sharing#scrollTo=FqfebeAdT073\n", + "# 参考文献" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "8l_HaJMosoVY" + }, + "source": [ + "* Unsloth supports Llama, Mistral, Phi-3, Gemma, Yi, DeepSeek, Qwen, TinyLlama, Vicuna, Open Hermes etc\n", + "* Unsloth supports 16bit LoRA or 4bit QLoRA. Both 2x faster.\n", + "* With [PR 26037](https://github.com/huggingface/transformers/pull/26037), we support downloading 4bit models **4x faster**! [Our repo](https://huggingface.co/unsloth) has Llama, Mistral 4bit models." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "7FXuUqc9j1dw", + "trusted": true + }, + "outputs": [], + "source": [ + "# Modified Auther NathMath, open-sourced with Apache-2.0 Licence\n", + "# 修改作者:NathMath,以Apache-2.0 Licence许可证开源" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": { + "execution": { + "iopub.execute_input": "2025-04-06T06:20:48.721745Z", + "iopub.status.busy": "2025-04-06T06:20:48.721397Z", + "iopub.status.idle": "2025-04-06T06:20:48.726731Z", + "shell.execute_reply": "2025-04-06T06:20:48.726047Z", + "shell.execute_reply.started": "2025-04-06T06:20:48.721713Z" + }, + "trusted": true + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Completed\n" + ] + } + ], + "source": [ + "# Use Multi-GPUs if available\n", + "# 可行时使用双CPU,适用于Kaggle T4x2\n", + "\n", + "import os\n", + "os.environ[\"CUDA_VISIBLE_DEVICES\"] = \"0,1\"\n", + "print(\"Completed\")" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": { + "execution": { + "iopub.execute_input": "2025-04-06T06:20:50.278607Z", + "iopub.status.busy": "2025-04-06T06:20:50.278246Z", + "iopub.status.idle": "2025-04-06T06:23:59.485514Z", + "shell.execute_reply": "2025-04-06T06:23:59.484487Z", + "shell.execute_reply.started": "2025-04-06T06:20:50.278573Z" + }, + "trusted": true + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Collecting unsloth==2025.3.18\n", + " Downloading unsloth-2025.3.18-py3-none-any.whl.metadata (46 kB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m46.2/46.2 kB\u001b[0m \u001b[31m2.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[?25hCollecting unsloth_zoo>=2025.3.14 (from unsloth==2025.3.18)\n", + " Downloading unsloth_zoo-2025.3.17-py3-none-any.whl.metadata (8.0 kB)\n", + "Requirement already satisfied: torch>=2.4.0 in /usr/local/lib/python3.10/dist-packages (from unsloth==2025.3.18) (2.5.1+cu121)\n", + "Collecting xformers>=0.0.27.post2 (from unsloth==2025.3.18)\n", + " Downloading xformers-0.0.29.post3-cp310-cp310-manylinux_2_28_x86_64.whl.metadata (1.0 kB)\n", + "Collecting bitsandbytes (from unsloth==2025.3.18)\n", + " Downloading bitsandbytes-0.45.4-py3-none-manylinux_2_24_x86_64.whl.metadata (5.0 kB)\n", + "Collecting triton>=3.0.0 (from unsloth==2025.3.18)\n", + " Downloading triton-3.2.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (1.4 kB)\n", + "Requirement already satisfied: packaging in /usr/local/lib/python3.10/dist-packages (from unsloth==2025.3.18) (24.2)\n", + "Collecting tyro (from unsloth==2025.3.18)\n", + " Downloading tyro-0.9.18-py3-none-any.whl.metadata (9.2 kB)\n", + "Collecting transformers!=4.47.0,>=4.46.1 (from unsloth==2025.3.18)\n", + " Downloading transformers-4.51.0-py3-none-any.whl.metadata (38 kB)\n", + "Requirement already satisfied: datasets>=2.16.0 in /usr/local/lib/python3.10/dist-packages (from unsloth==2025.3.18) (3.3.1)\n", + "Requirement already satisfied: sentencepiece>=0.2.0 in /usr/local/lib/python3.10/dist-packages (from unsloth==2025.3.18) (0.2.0)\n", + "Requirement already satisfied: tqdm in /usr/local/lib/python3.10/dist-packages (from unsloth==2025.3.18) (4.67.1)\n", + "Requirement already satisfied: psutil in /usr/local/lib/python3.10/dist-packages (from unsloth==2025.3.18) (5.9.5)\n", + "Requirement already satisfied: wheel>=0.42.0 in /usr/local/lib/python3.10/dist-packages (from unsloth==2025.3.18) (0.45.1)\n", + "Requirement 
already satisfied: numpy in /usr/local/lib/python3.10/dist-packages (from unsloth==2025.3.18) (1.26.4)\n", + "Requirement already satisfied: accelerate>=0.34.1 in /usr/local/lib/python3.10/dist-packages (from unsloth==2025.3.18) (1.2.1)\n", + "Collecting trl!=0.15.0,!=0.9.0,!=0.9.1,!=0.9.2,!=0.9.3,<=0.15.2,>=0.7.9 (from unsloth==2025.3.18)\n", + " Downloading trl-0.15.2-py3-none-any.whl.metadata (11 kB)\n", + "Requirement already satisfied: peft!=0.11.0,>=0.7.1 in /usr/local/lib/python3.10/dist-packages (from unsloth==2025.3.18) (0.14.0)\n", + "Requirement already satisfied: protobuf<4.0.0 in /usr/local/lib/python3.10/dist-packages (from unsloth==2025.3.18) (3.20.3)\n", + "Requirement already satisfied: huggingface_hub in /usr/local/lib/python3.10/dist-packages (from unsloth==2025.3.18) (0.29.0)\n", + "Requirement already satisfied: hf_transfer in /usr/local/lib/python3.10/dist-packages (from unsloth==2025.3.18) (0.1.9)\n", + "Requirement already satisfied: diffusers in /usr/local/lib/python3.10/dist-packages (from unsloth==2025.3.18) (0.31.0)\n", + "Requirement already satisfied: torchvision in /usr/local/lib/python3.10/dist-packages (from unsloth==2025.3.18) (0.20.1+cu121)\n", + "Requirement already satisfied: pyyaml in /usr/local/lib/python3.10/dist-packages (from accelerate>=0.34.1->unsloth==2025.3.18) (6.0.2)\n", + "Requirement already satisfied: safetensors>=0.4.3 in /usr/local/lib/python3.10/dist-packages (from accelerate>=0.34.1->unsloth==2025.3.18) (0.4.5)\n", + "Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from datasets>=2.16.0->unsloth==2025.3.18) (3.17.0)\n", + "Requirement already satisfied: pyarrow>=15.0.0 in /usr/local/lib/python3.10/dist-packages (from datasets>=2.16.0->unsloth==2025.3.18) (19.0.1)\n", + "Requirement already satisfied: dill<0.3.9,>=0.3.0 in /usr/local/lib/python3.10/dist-packages (from datasets>=2.16.0->unsloth==2025.3.18) (0.3.8)\n", + "Requirement already satisfied: pandas in /usr/local/lib/python3.10/dist-packages (from datasets>=2.16.0->unsloth==2025.3.18) (2.2.3)\n", + "Requirement already satisfied: requests>=2.32.2 in /usr/local/lib/python3.10/dist-packages (from datasets>=2.16.0->unsloth==2025.3.18) (2.32.3)\n", + "Requirement already satisfied: xxhash in /usr/local/lib/python3.10/dist-packages (from datasets>=2.16.0->unsloth==2025.3.18) (3.5.0)\n", + "Requirement already satisfied: multiprocess<0.70.17 in /usr/local/lib/python3.10/dist-packages (from datasets>=2.16.0->unsloth==2025.3.18) (0.70.16)\n", + "Requirement already satisfied: fsspec<=2024.12.0,>=2023.1.0 in /usr/local/lib/python3.10/dist-packages (from fsspec[http]<=2024.12.0,>=2023.1.0->datasets>=2.16.0->unsloth==2025.3.18) (2024.12.0)\n", + "Requirement already satisfied: aiohttp in /usr/local/lib/python3.10/dist-packages (from datasets>=2.16.0->unsloth==2025.3.18) (3.11.12)\n", + "Requirement already satisfied: typing-extensions>=3.7.4.3 in /usr/local/lib/python3.10/dist-packages (from huggingface_hub->unsloth==2025.3.18) (4.12.2)\n", + "Requirement already satisfied: mkl_fft in /usr/local/lib/python3.10/dist-packages (from numpy->unsloth==2025.3.18) (1.3.8)\n", + "Requirement already satisfied: mkl_random in /usr/local/lib/python3.10/dist-packages (from numpy->unsloth==2025.3.18) (1.2.4)\n", + "Requirement already satisfied: mkl_umath in /usr/local/lib/python3.10/dist-packages (from numpy->unsloth==2025.3.18) (0.1.1)\n", + "Requirement already satisfied: mkl in /usr/local/lib/python3.10/dist-packages (from numpy->unsloth==2025.3.18) (2025.0.1)\n", 
+ "Requirement already satisfied: tbb4py in /usr/local/lib/python3.10/dist-packages (from numpy->unsloth==2025.3.18) (2022.0.0)\n", + "Requirement already satisfied: mkl-service in /usr/local/lib/python3.10/dist-packages (from numpy->unsloth==2025.3.18) (2.4.1)\n", + "Requirement already satisfied: networkx in /usr/local/lib/python3.10/dist-packages (from torch>=2.4.0->unsloth==2025.3.18) (3.4.2)\n", + "Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from torch>=2.4.0->unsloth==2025.3.18) (3.1.4)\n", + "Requirement already satisfied: sympy==1.13.1 in /usr/local/lib/python3.10/dist-packages (from torch>=2.4.0->unsloth==2025.3.18) (1.13.1)\n", + "Requirement already satisfied: mpmath<1.4,>=1.1.0 in /usr/local/lib/python3.10/dist-packages (from sympy==1.13.1->torch>=2.4.0->unsloth==2025.3.18) (1.3.0)\n", + "Collecting huggingface_hub (from unsloth==2025.3.18)\n", + " Downloading huggingface_hub-0.30.1-py3-none-any.whl.metadata (13 kB)\n", + "Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.10/dist-packages (from transformers!=4.47.0,>=4.46.1->unsloth==2025.3.18) (2024.11.6)\n", + "Requirement already satisfied: tokenizers<0.22,>=0.21 in /usr/local/lib/python3.10/dist-packages (from transformers!=4.47.0,>=4.46.1->unsloth==2025.3.18) (0.21.0)\n", + "Requirement already satisfied: rich in /usr/local/lib/python3.10/dist-packages (from trl!=0.15.0,!=0.9.0,!=0.9.1,!=0.9.2,!=0.9.3,<=0.15.2,>=0.7.9->unsloth==2025.3.18) (13.9.4)\n", + "Collecting cut_cross_entropy (from unsloth_zoo>=2025.3.14->unsloth==2025.3.18)\n", + " Downloading cut_cross_entropy-25.1.1-py3-none-any.whl.metadata (9.3 kB)\n", + "Requirement already satisfied: pillow in /usr/local/lib/python3.10/dist-packages (from unsloth_zoo>=2025.3.14->unsloth==2025.3.18) (11.0.0)\n", + "Collecting torch>=2.4.0 (from unsloth==2025.3.18)\n", + " Downloading torch-2.6.0-cp310-cp310-manylinux1_x86_64.whl.metadata (28 kB)\n", + "Collecting nvidia-cuda-nvrtc-cu12==12.4.127 (from torch>=2.4.0->unsloth==2025.3.18)\n", + " Downloading nvidia_cuda_nvrtc_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)\n", + "Collecting nvidia-cuda-runtime-cu12==12.4.127 (from torch>=2.4.0->unsloth==2025.3.18)\n", + " Downloading nvidia_cuda_runtime_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)\n", + "Collecting nvidia-cuda-cupti-cu12==12.4.127 (from torch>=2.4.0->unsloth==2025.3.18)\n", + " Downloading nvidia_cuda_cupti_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)\n", + "Collecting nvidia-cudnn-cu12==9.1.0.70 (from torch>=2.4.0->unsloth==2025.3.18)\n", + " Downloading nvidia_cudnn_cu12-9.1.0.70-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)\n", + "Collecting nvidia-cublas-cu12==12.4.5.8 (from torch>=2.4.0->unsloth==2025.3.18)\n", + " Downloading nvidia_cublas_cu12-12.4.5.8-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)\n", + "Collecting nvidia-cufft-cu12==11.2.1.3 (from torch>=2.4.0->unsloth==2025.3.18)\n", + " Downloading nvidia_cufft_cu12-11.2.1.3-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)\n", + "Collecting nvidia-curand-cu12==10.3.5.147 (from torch>=2.4.0->unsloth==2025.3.18)\n", + " Downloading nvidia_curand_cu12-10.3.5.147-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)\n", + "Collecting nvidia-cusolver-cu12==11.6.1.9 (from torch>=2.4.0->unsloth==2025.3.18)\n", + " Downloading nvidia_cusolver_cu12-11.6.1.9-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)\n", + "Collecting nvidia-cusparse-cu12==12.3.1.170 (from 
torch>=2.4.0->unsloth==2025.3.18)\n", + " Downloading nvidia_cusparse_cu12-12.3.1.170-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)\n", + "Collecting nvidia-cusparselt-cu12==0.6.2 (from torch>=2.4.0->unsloth==2025.3.18)\n", + " Downloading nvidia_cusparselt_cu12-0.6.2-py3-none-manylinux2014_x86_64.whl.metadata (6.8 kB)\n", + "Collecting nvidia-nccl-cu12==2.21.5 (from torch>=2.4.0->unsloth==2025.3.18)\n", + " Downloading nvidia_nccl_cu12-2.21.5-py3-none-manylinux2014_x86_64.whl.metadata (1.8 kB)\n", + "Collecting nvidia-nvtx-cu12==12.4.127 (from torch>=2.4.0->unsloth==2025.3.18)\n", + " Downloading nvidia_nvtx_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.7 kB)\n", + "Collecting nvidia-nvjitlink-cu12==12.4.127 (from torch>=2.4.0->unsloth==2025.3.18)\n", + " Downloading nvidia_nvjitlink_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)\n", + "Requirement already satisfied: importlib-metadata in /usr/local/lib/python3.10/dist-packages (from diffusers->unsloth==2025.3.18) (8.5.0)\n", + "INFO: pip is looking at multiple versions of torchvision to determine which version is compatible with other requirements. This could take a while.\n", + "Collecting torchvision (from unsloth==2025.3.18)\n", + " Downloading torchvision-0.21.0-cp310-cp310-manylinux1_x86_64.whl.metadata (6.1 kB)\n", + "Requirement already satisfied: docstring-parser>=0.15 in /usr/local/lib/python3.10/dist-packages (from tyro->unsloth==2025.3.18) (0.16)\n", + "Collecting shtab>=1.5.6 (from tyro->unsloth==2025.3.18)\n", + " Downloading shtab-1.7.1-py3-none-any.whl.metadata (7.3 kB)\n", + "Requirement already satisfied: typeguard>=4.0.0 in /usr/local/lib/python3.10/dist-packages (from tyro->unsloth==2025.3.18) (4.4.1)\n", + "Collecting typing-extensions>=3.7.4.3 (from huggingface_hub->unsloth==2025.3.18)\n", + " Downloading typing_extensions-4.13.1-py3-none-any.whl.metadata (3.0 kB)\n", + "Requirement already satisfied: aiohappyeyeballs>=2.3.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets>=2.16.0->unsloth==2025.3.18) (2.4.6)\n", + "Requirement already satisfied: aiosignal>=1.1.2 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets>=2.16.0->unsloth==2025.3.18) (1.3.2)\n", + "Requirement already satisfied: async-timeout<6.0,>=4.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets>=2.16.0->unsloth==2025.3.18) (5.0.1)\n", + "Requirement already satisfied: attrs>=17.3.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets>=2.16.0->unsloth==2025.3.18) (25.1.0)\n", + "Requirement already satisfied: frozenlist>=1.1.1 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets>=2.16.0->unsloth==2025.3.18) (1.5.0)\n", + "Requirement already satisfied: multidict<7.0,>=4.5 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets>=2.16.0->unsloth==2025.3.18) (6.1.0)\n", + "Requirement already satisfied: propcache>=0.2.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets>=2.16.0->unsloth==2025.3.18) (0.2.1)\n", + "Requirement already satisfied: yarl<2.0,>=1.17.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets>=2.16.0->unsloth==2025.3.18) (1.18.3)\n", + "Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests>=2.32.2->datasets>=2.16.0->unsloth==2025.3.18) (3.4.1)\n", + "Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests>=2.32.2->datasets>=2.16.0->unsloth==2025.3.18) (3.10)\n", + 
"Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests>=2.32.2->datasets>=2.16.0->unsloth==2025.3.18) (2.3.0)\n", + "Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests>=2.32.2->datasets>=2.16.0->unsloth==2025.3.18) (2025.1.31)\n", + "Requirement already satisfied: markdown-it-py>=2.2.0 in /usr/local/lib/python3.10/dist-packages (from rich->trl!=0.15.0,!=0.9.0,!=0.9.1,!=0.9.2,!=0.9.3,<=0.15.2,>=0.7.9->unsloth==2025.3.18) (3.0.0)\n", + "Requirement already satisfied: pygments<3.0.0,>=2.13.0 in /usr/local/lib/python3.10/dist-packages (from rich->trl!=0.15.0,!=0.9.0,!=0.9.1,!=0.9.2,!=0.9.3,<=0.15.2,>=0.7.9->unsloth==2025.3.18) (2.19.1)\n", + "Requirement already satisfied: zipp>=3.20 in /usr/local/lib/python3.10/dist-packages (from importlib-metadata->diffusers->unsloth==2025.3.18) (3.21.0)\n", + "Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from jinja2->torch>=2.4.0->unsloth==2025.3.18) (3.0.2)\n", + "Requirement already satisfied: intel-openmp>=2024 in /usr/local/lib/python3.10/dist-packages (from mkl->numpy->unsloth==2025.3.18) (2024.2.0)\n", + "Requirement already satisfied: tbb==2022.* in /usr/local/lib/python3.10/dist-packages (from mkl->numpy->unsloth==2025.3.18) (2022.0.0)\n", + "Requirement already satisfied: tcmlib==1.* in /usr/local/lib/python3.10/dist-packages (from tbb==2022.*->mkl->numpy->unsloth==2025.3.18) (1.2.0)\n", + "Requirement already satisfied: intel-cmplr-lib-rt in /usr/local/lib/python3.10/dist-packages (from mkl_umath->numpy->unsloth==2025.3.18) (2024.2.0)\n", + "Requirement already satisfied: python-dateutil>=2.8.2 in /usr/local/lib/python3.10/dist-packages (from pandas->datasets>=2.16.0->unsloth==2025.3.18) (2.9.0.post0)\n", + "Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.10/dist-packages (from pandas->datasets>=2.16.0->unsloth==2025.3.18) (2025.1)\n", + "Requirement already satisfied: tzdata>=2022.7 in /usr/local/lib/python3.10/dist-packages (from pandas->datasets>=2.16.0->unsloth==2025.3.18) (2025.1)\n", + "Requirement already satisfied: intel-cmplr-lib-ur==2024.2.0 in /usr/local/lib/python3.10/dist-packages (from intel-openmp>=2024->mkl->numpy->unsloth==2025.3.18) (2024.2.0)\n", + "Requirement already satisfied: mdurl~=0.1 in /usr/local/lib/python3.10/dist-packages (from markdown-it-py>=2.2.0->rich->trl!=0.15.0,!=0.9.0,!=0.9.1,!=0.9.2,!=0.9.3,<=0.15.2,>=0.7.9->unsloth==2025.3.18) (0.1.2)\n", + "Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/dist-packages (from python-dateutil>=2.8.2->pandas->datasets>=2.16.0->unsloth==2025.3.18) (1.17.0)\n", + "Downloading unsloth-2025.3.18-py3-none-any.whl (192 kB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m192.5/192.5 kB\u001b[0m \u001b[31m8.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[?25hDownloading transformers-4.51.0-py3-none-any.whl (10.4 MB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m10.4/10.4 MB\u001b[0m \u001b[31m85.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m:00:01\u001b[0m:01\u001b[0m\n", + "\u001b[?25hDownloading huggingface_hub-0.30.1-py3-none-any.whl (481 kB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m481.2/481.2 kB\u001b[0m \u001b[31m32.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[?25hDownloading 
triton-3.2.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (253.1 MB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m253.1/253.1 MB\u001b[0m \u001b[31m6.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m0:00:01\u001b[0m00:01\u001b[0m\n", + "\u001b[?25hDownloading trl-0.15.2-py3-none-any.whl (318 kB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m318.9/318.9 kB\u001b[0m \u001b[31m24.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[?25hDownloading unsloth_zoo-2025.3.17-py3-none-any.whl (127 kB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m127.8/127.8 kB\u001b[0m \u001b[31m9.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[?25hDownloading xformers-0.0.29.post3-cp310-cp310-manylinux_2_28_x86_64.whl (43.3 MB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m43.3/43.3 MB\u001b[0m \u001b[31m41.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m:00:01\u001b[0m00:01\u001b[0m\n", + "\u001b[?25hDownloading torch-2.6.0-cp310-cp310-manylinux1_x86_64.whl (766.7 MB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m766.7/766.7 MB\u001b[0m \u001b[31m2.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m0:00:01\u001b[0m00:01\u001b[0m\n", + "\u001b[?25hDownloading nvidia_cublas_cu12-12.4.5.8-py3-none-manylinux2014_x86_64.whl (363.4 MB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m363.4/363.4 MB\u001b[0m \u001b[31m1.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m0:00:01\u001b[0m00:01\u001b[0m\n", + "\u001b[?25hDownloading nvidia_cuda_cupti_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl (13.8 MB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m13.8/13.8 MB\u001b[0m \u001b[31m90.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m:00:01\u001b[0m00:01\u001b[0m\n", + "\u001b[?25hDownloading nvidia_cuda_nvrtc_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl (24.6 MB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m24.6/24.6 MB\u001b[0m \u001b[31m64.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m:00:01\u001b[0m00:01\u001b[0m\n", + "\u001b[?25hDownloading nvidia_cuda_runtime_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl (883 kB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m883.7/883.7 kB\u001b[0m \u001b[31m50.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[?25hDownloading nvidia_cudnn_cu12-9.1.0.70-py3-none-manylinux2014_x86_64.whl (664.8 MB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m664.8/664.8 MB\u001b[0m \u001b[31m2.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m0:00:01\u001b[0m00:01\u001b[0m\n", + "\u001b[?25hDownloading nvidia_cufft_cu12-11.2.1.3-py3-none-manylinux2014_x86_64.whl (211.5 MB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m211.5/211.5 MB\u001b[0m \u001b[31m2.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m0:00:01\u001b[0m00:01\u001b[0m\n", + "\u001b[?25hDownloading nvidia_curand_cu12-10.3.5.147-py3-none-manylinux2014_x86_64.whl (56.3 MB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m56.3/56.3 MB\u001b[0m \u001b[31m31.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m:00:01\u001b[0m00:01\u001b[0m\n", + "\u001b[?25hDownloading nvidia_cusolver_cu12-11.6.1.9-py3-none-manylinux2014_x86_64.whl (127.9 
MB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m127.9/127.9 MB\u001b[0m \u001b[31m13.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m:00:01\u001b[0m00:01\u001b[0m\n", + "\u001b[?25hDownloading nvidia_cusparse_cu12-12.3.1.170-py3-none-manylinux2014_x86_64.whl (207.5 MB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m207.5/207.5 MB\u001b[0m \u001b[31m8.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m0:00:01\u001b[0m00:01\u001b[0m\n", + "\u001b[?25hDownloading nvidia_cusparselt_cu12-0.6.2-py3-none-manylinux2014_x86_64.whl (150.1 MB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m150.1/150.1 MB\u001b[0m \u001b[31m11.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m:00:01\u001b[0m00:01\u001b[0m\n", + "\u001b[?25hDownloading nvidia_nccl_cu12-2.21.5-py3-none-manylinux2014_x86_64.whl (188.7 MB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m188.7/188.7 MB\u001b[0m \u001b[31m9.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m0:00:01\u001b[0m00:01\u001b[0m\n", + "\u001b[?25hDownloading nvidia_nvjitlink_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl (21.1 MB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m21.1/21.1 MB\u001b[0m \u001b[31m74.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m:00:01\u001b[0m00:01\u001b[0m\n", + "\u001b[?25hDownloading nvidia_nvtx_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl (99 kB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m99.1/99.1 kB\u001b[0m \u001b[31m7.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[?25hDownloading bitsandbytes-0.45.4-py3-none-manylinux_2_24_x86_64.whl (76.0 MB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m76.0/76.0 MB\u001b[0m \u001b[31m23.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m:00:01\u001b[0m00:01\u001b[0m\n", + "\u001b[?25hDownloading torchvision-0.21.0-cp310-cp310-manylinux1_x86_64.whl (7.2 MB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m7.2/7.2 MB\u001b[0m \u001b[31m100.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m00:01\u001b[0m\n", + "\u001b[?25hDownloading tyro-0.9.18-py3-none-any.whl (123 kB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m123.6/123.6 kB\u001b[0m \u001b[31m10.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[?25hDownloading shtab-1.7.1-py3-none-any.whl (14 kB)\n", + "Downloading typing_extensions-4.13.1-py3-none-any.whl (45 kB)\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m45.7/45.7 kB\u001b[0m \u001b[31m2.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[?25hDownloading cut_cross_entropy-25.1.1-py3-none-any.whl (22 kB)\n", + "Installing collected packages: triton, nvidia-cusparselt-cu12, typing-extensions, shtab, nvidia-nvtx-cu12, nvidia-nvjitlink-cu12, nvidia-nccl-cu12, nvidia-curand-cu12, nvidia-cufft-cu12, nvidia-cuda-runtime-cu12, nvidia-cuda-nvrtc-cu12, nvidia-cuda-cupti-cu12, nvidia-cublas-cu12, nvidia-cusparse-cu12, nvidia-cudnn-cu12, huggingface_hub, tyro, nvidia-cusolver-cu12, torch, cut_cross_entropy, transformers, trl, xformers, unsloth_zoo, torchvision, bitsandbytes, unsloth\n", + " Attempting uninstall: typing-extensions\n", + " Found existing installation: typing_extensions 4.12.2\n", + " Uninstalling typing_extensions-4.12.2:\n", + " Successfully uninstalled 
typing_extensions-4.12.2\n", + " Attempting uninstall: nvidia-nvjitlink-cu12\n", + " Found existing installation: nvidia-nvjitlink-cu12 12.6.85\n", + " Uninstalling nvidia-nvjitlink-cu12-12.6.85:\n", + " Successfully uninstalled nvidia-nvjitlink-cu12-12.6.85\n", + " Attempting uninstall: nvidia-nccl-cu12\n", + " Found existing installation: nvidia-nccl-cu12 2.23.4\n", + " Uninstalling nvidia-nccl-cu12-2.23.4:\n", + " Successfully uninstalled nvidia-nccl-cu12-2.23.4\n", + " Attempting uninstall: nvidia-curand-cu12\n", + " Found existing installation: nvidia-curand-cu12 10.3.7.77\n", + " Uninstalling nvidia-curand-cu12-10.3.7.77:\n", + " Successfully uninstalled nvidia-curand-cu12-10.3.7.77\n", + " Attempting uninstall: nvidia-cufft-cu12\n", + " Found existing installation: nvidia-cufft-cu12 11.3.0.4\n", + " Uninstalling nvidia-cufft-cu12-11.3.0.4:\n", + " Successfully uninstalled nvidia-cufft-cu12-11.3.0.4\n", + " Attempting uninstall: nvidia-cuda-runtime-cu12\n", + " Found existing installation: nvidia-cuda-runtime-cu12 12.6.77\n", + " Uninstalling nvidia-cuda-runtime-cu12-12.6.77:\n", + " Successfully uninstalled nvidia-cuda-runtime-cu12-12.6.77\n", + " Attempting uninstall: nvidia-cuda-cupti-cu12\n", + " Found existing installation: nvidia-cuda-cupti-cu12 12.6.80\n", + " Uninstalling nvidia-cuda-cupti-cu12-12.6.80:\n", + " Successfully uninstalled nvidia-cuda-cupti-cu12-12.6.80\n", + " Attempting uninstall: nvidia-cublas-cu12\n", + " Found existing installation: nvidia-cublas-cu12 12.6.4.1\n", + " Uninstalling nvidia-cublas-cu12-12.6.4.1:\n", + " Successfully uninstalled nvidia-cublas-cu12-12.6.4.1\n", + " Attempting uninstall: nvidia-cusparse-cu12\n", + " Found existing installation: nvidia-cusparse-cu12 12.5.4.2\n", + " Uninstalling nvidia-cusparse-cu12-12.5.4.2:\n", + " Successfully uninstalled nvidia-cusparse-cu12-12.5.4.2\n", + " Attempting uninstall: nvidia-cudnn-cu12\n", + " Found existing installation: nvidia-cudnn-cu12 9.6.0.74\n", + " Uninstalling nvidia-cudnn-cu12-9.6.0.74:\n", + " Successfully uninstalled nvidia-cudnn-cu12-9.6.0.74\n", + " Attempting uninstall: huggingface_hub\n", + " Found existing installation: huggingface-hub 0.29.0\n", + " Uninstalling huggingface-hub-0.29.0:\n", + " Successfully uninstalled huggingface-hub-0.29.0\n", + " Attempting uninstall: nvidia-cusolver-cu12\n", + " Found existing installation: nvidia-cusolver-cu12 11.7.1.2\n", + " Uninstalling nvidia-cusolver-cu12-11.7.1.2:\n", + " Successfully uninstalled nvidia-cusolver-cu12-11.7.1.2\n", + " Attempting uninstall: torch\n", + " Found existing installation: torch 2.5.1+cu121\n", + " Uninstalling torch-2.5.1+cu121:\n", + " Successfully uninstalled torch-2.5.1+cu121\n", + " Attempting uninstall: transformers\n", + " Found existing installation: transformers 4.47.0\n", + " Uninstalling transformers-4.47.0:\n", + " Successfully uninstalled transformers-4.47.0\n", + " Attempting uninstall: torchvision\n", + " Found existing installation: torchvision 0.20.1+cu121\n", + " Uninstalling torchvision-0.20.1+cu121:\n", + " Successfully uninstalled torchvision-0.20.1+cu121\n", + "\u001b[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. 
This behaviour is the source of the following dependency conflicts.\n", + "fastai 2.7.18 requires torch<2.6,>=1.10, but you have torch 2.6.0 which is incompatible.\n", + "langchain 0.3.12 requires async-timeout<5.0.0,>=4.0.0; python_version < \"3.11\", but you have async-timeout 5.0.1 which is incompatible.\n", + "pylibcugraph-cu12 24.10.0 requires pylibraft-cu12==24.10.*, but you have pylibraft-cu12 25.2.0 which is incompatible.\n", + "pylibcugraph-cu12 24.10.0 requires rmm-cu12==24.10.*, but you have rmm-cu12 25.2.0 which is incompatible.\n", + "tensorflow-decision-forests 1.10.0 requires tensorflow==2.17.0, but you have tensorflow 2.17.1 which is incompatible.\n", + "torchaudio 2.5.1+cu121 requires torch==2.5.1, but you have torch 2.6.0 which is incompatible.\u001b[0m\u001b[31m\n", + "\u001b[0mSuccessfully installed bitsandbytes-0.45.4 cut_cross_entropy-25.1.1 huggingface_hub-0.30.1 nvidia-cublas-cu12-12.4.5.8 nvidia-cuda-cupti-cu12-12.4.127 nvidia-cuda-nvrtc-cu12-12.4.127 nvidia-cuda-runtime-cu12-12.4.127 nvidia-cudnn-cu12-9.1.0.70 nvidia-cufft-cu12-11.2.1.3 nvidia-curand-cu12-10.3.5.147 nvidia-cusolver-cu12-11.6.1.9 nvidia-cusparse-cu12-12.3.1.170 nvidia-cusparselt-cu12-0.6.2 nvidia-nccl-cu12-2.21.5 nvidia-nvjitlink-cu12-12.4.127 nvidia-nvtx-cu12-12.4.127 shtab-1.7.1 torch-2.6.0 torchvision-0.21.0 transformers-4.51.0 triton-3.2.0 trl-0.15.2 typing-extensions-4.13.1 tyro-0.9.18 unsloth-2025.3.18 unsloth_zoo-2025.3.17 xformers-0.0.29.post3\n" + ] + } + ], + "source": [ + "# Install or import unsloth\n", + "# 安装或导入用于微调的unsloth库\n", + "!pip install unsloth==\"2025.3.18\"\n", + "\n", + "# It is slow; so be patient\n", + "# 这一步很慢请耐心等待" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "trusted": true + }, + "outputs": [], + "source": [ + "# DO NOT CARE BUG \"ERROR: pip's dependency resolver does not currently take into account\"\n", + "# 这个报错不用管:“ERROR: pip's dependency resolver does not currently take into account”" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": { + "execution": { + "iopub.execute_input": "2025-04-06T06:24:50.658179Z", + "iopub.status.busy": "2025-04-06T06:24:50.657834Z", + "iopub.status.idle": "2025-04-06T06:24:52.575296Z", + "shell.execute_reply": "2025-04-06T06:24:52.574234Z", + "shell.execute_reply.started": "2025-04-06T06:24:50.658142Z" + }, + "trusted": true + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "7 5\n" + ] + } + ], + "source": [ + "# Import torch backend\n", + "# 导入torch后端\n", + "import torch\n", + "\n", + "torch_version = torch.cuda.get_device_capability()\n", + "torch_major_v, torch_minor_v = torch_version\n", + "print(torch_major_v, torch_minor_v)\n", + "# The first version digit must be greater or equal to 7, or a bug will be raised\n", + "# 第一个数大版本必须为7或者以上,否则会提示CUDA运算版本不足bug\n", + "\n", + "# If an error is thrown here, then it means you DO NOT have a valid NVIDIA accelerator\n", + "# 如果这里报错,那么意味着你没有一个有效的NVIDIA显卡作为运算加速器,请选择T4x2而不是P100,P100会提示版本不足" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": { + "execution": { + "iopub.execute_input": "2025-04-06T06:24:54.811600Z", + "iopub.status.busy": "2025-04-06T06:24:54.811118Z", + "iopub.status.idle": "2025-04-06T06:24:59.661805Z", + "shell.execute_reply": "2025-04-06T06:24:59.660259Z", + "shell.execute_reply.started": "2025-04-06T06:24:54.811531Z" + }, + "trusted": true + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Requirement already satisfied: 
xformers in /usr/local/lib/python3.10/dist-packages (0.0.29.post3)\n", + "Requirement already satisfied: trl in /usr/local/lib/python3.10/dist-packages (0.15.2)\n", + "Requirement already satisfied: peft in /usr/local/lib/python3.10/dist-packages (0.14.0)\n", + "Requirement already satisfied: accelerate in /usr/local/lib/python3.10/dist-packages (1.2.1)\n", + "Requirement already satisfied: bitsandbytes in /usr/local/lib/python3.10/dist-packages (0.45.4)\n", + "Requirement already satisfied: numpy in /usr/local/lib/python3.10/dist-packages (from xformers) (1.26.4)\n", + "Requirement already satisfied: torch==2.6.0 in /usr/local/lib/python3.10/dist-packages (from xformers) (2.6.0)\n", + "Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from torch==2.6.0->xformers) (3.17.0)\n", + "Requirement already satisfied: typing-extensions>=4.10.0 in /usr/local/lib/python3.10/dist-packages (from torch==2.6.0->xformers) (4.13.1)\n", + "Requirement already satisfied: networkx in /usr/local/lib/python3.10/dist-packages (from torch==2.6.0->xformers) (3.4.2)\n", + "Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from torch==2.6.0->xformers) (3.1.4)\n", + "Requirement already satisfied: fsspec in /usr/local/lib/python3.10/dist-packages (from torch==2.6.0->xformers) (2024.12.0)\n", + "Requirement already satisfied: nvidia-cuda-nvrtc-cu12==12.4.127 in /usr/local/lib/python3.10/dist-packages (from torch==2.6.0->xformers) (12.4.127)\n", + "Requirement already satisfied: nvidia-cuda-runtime-cu12==12.4.127 in /usr/local/lib/python3.10/dist-packages (from torch==2.6.0->xformers) (12.4.127)\n", + "Requirement already satisfied: nvidia-cuda-cupti-cu12==12.4.127 in /usr/local/lib/python3.10/dist-packages (from torch==2.6.0->xformers) (12.4.127)\n", + "Requirement already satisfied: nvidia-cudnn-cu12==9.1.0.70 in /usr/local/lib/python3.10/dist-packages (from torch==2.6.0->xformers) (9.1.0.70)\n", + "Requirement already satisfied: nvidia-cublas-cu12==12.4.5.8 in /usr/local/lib/python3.10/dist-packages (from torch==2.6.0->xformers) (12.4.5.8)\n", + "Requirement already satisfied: nvidia-cufft-cu12==11.2.1.3 in /usr/local/lib/python3.10/dist-packages (from torch==2.6.0->xformers) (11.2.1.3)\n", + "Requirement already satisfied: nvidia-curand-cu12==10.3.5.147 in /usr/local/lib/python3.10/dist-packages (from torch==2.6.0->xformers) (10.3.5.147)\n", + "Requirement already satisfied: nvidia-cusolver-cu12==11.6.1.9 in /usr/local/lib/python3.10/dist-packages (from torch==2.6.0->xformers) (11.6.1.9)\n", + "Requirement already satisfied: nvidia-cusparse-cu12==12.3.1.170 in /usr/local/lib/python3.10/dist-packages (from torch==2.6.0->xformers) (12.3.1.170)\n", + "Requirement already satisfied: nvidia-cusparselt-cu12==0.6.2 in /usr/local/lib/python3.10/dist-packages (from torch==2.6.0->xformers) (0.6.2)\n", + "Requirement already satisfied: nvidia-nccl-cu12==2.21.5 in /usr/local/lib/python3.10/dist-packages (from torch==2.6.0->xformers) (2.21.5)\n", + "Requirement already satisfied: nvidia-nvtx-cu12==12.4.127 in /usr/local/lib/python3.10/dist-packages (from torch==2.6.0->xformers) (12.4.127)\n", + "Requirement already satisfied: nvidia-nvjitlink-cu12==12.4.127 in /usr/local/lib/python3.10/dist-packages (from torch==2.6.0->xformers) (12.4.127)\n", + "Requirement already satisfied: triton==3.2.0 in /usr/local/lib/python3.10/dist-packages (from torch==2.6.0->xformers) (3.2.0)\n", + "Requirement already satisfied: sympy==1.13.1 in 
/usr/local/lib/python3.10/dist-packages (from torch==2.6.0->xformers) (1.13.1)\n", + "Requirement already satisfied: mpmath<1.4,>=1.1.0 in /usr/local/lib/python3.10/dist-packages (from sympy==1.13.1->torch==2.6.0->xformers) (1.3.0)\n", + "Requirement already satisfied: datasets>=2.21.0 in /usr/local/lib/python3.10/dist-packages (from trl) (3.3.1)\n", + "Requirement already satisfied: rich in /usr/local/lib/python3.10/dist-packages (from trl) (13.9.4)\n", + "Requirement already satisfied: transformers>=4.46.0 in /usr/local/lib/python3.10/dist-packages (from trl) (4.51.0)\n", + "Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from peft) (24.2)\n", + "Requirement already satisfied: psutil in /usr/local/lib/python3.10/dist-packages (from peft) (5.9.5)\n", + "Requirement already satisfied: pyyaml in /usr/local/lib/python3.10/dist-packages (from peft) (6.0.2)\n", + "Requirement already satisfied: tqdm in /usr/local/lib/python3.10/dist-packages (from peft) (4.67.1)\n", + "Requirement already satisfied: safetensors in /usr/local/lib/python3.10/dist-packages (from peft) (0.4.5)\n", + "Requirement already satisfied: huggingface-hub>=0.25.0 in /usr/local/lib/python3.10/dist-packages (from peft) (0.30.1)\n", + "Requirement already satisfied: pyarrow>=15.0.0 in /usr/local/lib/python3.10/dist-packages (from datasets>=2.21.0->trl) (19.0.1)\n", + "Requirement already satisfied: dill<0.3.9,>=0.3.0 in /usr/local/lib/python3.10/dist-packages (from datasets>=2.21.0->trl) (0.3.8)\n", + "Requirement already satisfied: pandas in /usr/local/lib/python3.10/dist-packages (from datasets>=2.21.0->trl) (2.2.3)\n", + "Requirement already satisfied: requests>=2.32.2 in /usr/local/lib/python3.10/dist-packages (from datasets>=2.21.0->trl) (2.32.3)\n", + "Requirement already satisfied: xxhash in /usr/local/lib/python3.10/dist-packages (from datasets>=2.21.0->trl) (3.5.0)\n", + "Requirement already satisfied: multiprocess<0.70.17 in /usr/local/lib/python3.10/dist-packages (from datasets>=2.21.0->trl) (0.70.16)\n", + "Requirement already satisfied: aiohttp in /usr/local/lib/python3.10/dist-packages (from datasets>=2.21.0->trl) (3.11.12)\n", + "Requirement already satisfied: mkl_fft in /usr/local/lib/python3.10/dist-packages (from numpy->xformers) (1.3.8)\n", + "Requirement already satisfied: mkl_random in /usr/local/lib/python3.10/dist-packages (from numpy->xformers) (1.2.4)\n", + "Requirement already satisfied: mkl_umath in /usr/local/lib/python3.10/dist-packages (from numpy->xformers) (0.1.1)\n", + "Requirement already satisfied: mkl in /usr/local/lib/python3.10/dist-packages (from numpy->xformers) (2025.0.1)\n", + "Requirement already satisfied: tbb4py in /usr/local/lib/python3.10/dist-packages (from numpy->xformers) (2022.0.0)\n", + "Requirement already satisfied: mkl-service in /usr/local/lib/python3.10/dist-packages (from numpy->xformers) (2.4.1)\n", + "Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.10/dist-packages (from transformers>=4.46.0->trl) (2024.11.6)\n", + "Requirement already satisfied: tokenizers<0.22,>=0.21 in /usr/local/lib/python3.10/dist-packages (from transformers>=4.46.0->trl) (0.21.0)\n", + "Requirement already satisfied: markdown-it-py>=2.2.0 in /usr/local/lib/python3.10/dist-packages (from rich->trl) (3.0.0)\n", + "Requirement already satisfied: pygments<3.0.0,>=2.13.0 in /usr/local/lib/python3.10/dist-packages (from rich->trl) (2.19.1)\n", + "Requirement already satisfied: aiohappyeyeballs>=2.3.0 in 
/usr/local/lib/python3.10/dist-packages (from aiohttp->datasets>=2.21.0->trl) (2.4.6)\n", + "Requirement already satisfied: aiosignal>=1.1.2 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets>=2.21.0->trl) (1.3.2)\n", + "Requirement already satisfied: async-timeout<6.0,>=4.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets>=2.21.0->trl) (5.0.1)\n", + "Requirement already satisfied: attrs>=17.3.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets>=2.21.0->trl) (25.1.0)\n", + "Requirement already satisfied: frozenlist>=1.1.1 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets>=2.21.0->trl) (1.5.0)\n", + "Requirement already satisfied: multidict<7.0,>=4.5 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets>=2.21.0->trl) (6.1.0)\n", + "Requirement already satisfied: propcache>=0.2.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets>=2.21.0->trl) (0.2.1)\n", + "Requirement already satisfied: yarl<2.0,>=1.17.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets>=2.21.0->trl) (1.18.3)\n", + "Requirement already satisfied: mdurl~=0.1 in /usr/local/lib/python3.10/dist-packages (from markdown-it-py>=2.2.0->rich->trl) (0.1.2)\n", + "Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests>=2.32.2->datasets>=2.21.0->trl) (3.4.1)\n", + "Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests>=2.32.2->datasets>=2.21.0->trl) (3.10)\n", + "Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests>=2.32.2->datasets>=2.21.0->trl) (2.3.0)\n", + "Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests>=2.32.2->datasets>=2.21.0->trl) (2025.1.31)\n", + "Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from jinja2->torch==2.6.0->xformers) (3.0.2)\n", + "Requirement already satisfied: intel-openmp>=2024 in /usr/local/lib/python3.10/dist-packages (from mkl->numpy->xformers) (2024.2.0)\n", + "Requirement already satisfied: tbb==2022.* in /usr/local/lib/python3.10/dist-packages (from mkl->numpy->xformers) (2022.0.0)\n", + "Requirement already satisfied: tcmlib==1.* in /usr/local/lib/python3.10/dist-packages (from tbb==2022.*->mkl->numpy->xformers) (1.2.0)\n", + "Requirement already satisfied: intel-cmplr-lib-rt in /usr/local/lib/python3.10/dist-packages (from mkl_umath->numpy->xformers) (2024.2.0)\n", + "Requirement already satisfied: python-dateutil>=2.8.2 in /usr/local/lib/python3.10/dist-packages (from pandas->datasets>=2.21.0->trl) (2.9.0.post0)\n", + "Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.10/dist-packages (from pandas->datasets>=2.21.0->trl) (2025.1)\n", + "Requirement already satisfied: tzdata>=2022.7 in /usr/local/lib/python3.10/dist-packages (from pandas->datasets>=2.21.0->trl) (2025.1)\n", + "Requirement already satisfied: intel-cmplr-lib-ur==2024.2.0 in /usr/local/lib/python3.10/dist-packages (from intel-openmp>=2024->mkl->numpy->xformers) (2024.2.0)\n", + "Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/dist-packages (from python-dateutil>=2.8.2->pandas->datasets>=2.21.0->trl) (1.17.0)\n" + ] + } + ], + "source": [ + "# Install other dependences\n", + "# 安装其他依赖项\n", + "!pip install xformers trl peft accelerate bitsandbytes" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": { + 
"execution": { + "iopub.execute_input": "2025-04-06T06:25:02.000393Z", + "iopub.status.busy": "2025-04-06T06:25:02.000056Z", + "iopub.status.idle": "2025-04-06T06:25:30.193397Z", + "shell.execute_reply": "2025-04-06T06:25:30.192730Z", + "shell.execute_reply.started": "2025-04-06T06:25:02.000367Z" + }, + "id": "1IGsxSprNG63", + "trusted": true + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.\n", + "Unsloth: Failed to patch Gemma3ForConditionalGeneration.\n", + "🦥 Unsloth Zoo will now patch everything to make training faster!\n" + ] + } + ], + "source": [ + "# Import unsloth FastLanguageModel\n", + "# 导入FastLanguageModel\n", + "from unsloth import FastLanguageModel" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": { + "execution": { + "iopub.execute_input": "2025-04-06T06:26:07.827577Z", + "iopub.status.busy": "2025-04-06T06:26:07.827167Z", + "iopub.status.idle": "2025-04-06T06:26:07.833521Z", + "shell.execute_reply": "2025-04-06T06:26:07.832622Z", + "shell.execute_reply.started": "2025-04-06T06:26:07.827500Z" + }, + "trusted": true + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "GPU Number: 2\n", + "GPU 0: Tesla T4\n", + "GPU 1: Tesla T4\n" + ] + } + ], + "source": [ + "# See if both GPUs are activated\n", + "# 看看是否两个GPU都被激活了\n", + "\n", + "gpu_count = torch.cuda.device_count()\n", + "print(\"GPU Number:\", gpu_count)\n", + "for i in range(gpu_count):\n", + " print(f\"GPU {i}: {torch.cuda.get_device_name(i)}\")" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 387 + }, + "execution": { + "iopub.execute_input": "2025-04-06T06:26:14.381921Z", + "iopub.status.busy": "2025-04-06T06:26:14.381575Z", + "iopub.status.idle": "2025-04-06T06:26:14.385706Z", + "shell.execute_reply": "2025-04-06T06:26:14.384802Z", + "shell.execute_reply.started": "2025-04-06T06:26:14.381892Z" + }, + "id": "vrYjQLxTSFjN", + "outputId": "ce5ca1de-43d8-414b-b72f-8f811d7e42cf", + "trusted": true + }, + "outputs": [], + "source": [ + "# Import training utilities\n", + "# 导入其他训练工具\n", + "from trl import SFTTrainer\n", + "from transformers import TrainingArguments\n", + "from unsloth import is_bfloat16_supported" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": { + "execution": { + "iopub.execute_input": "2025-04-06T06:26:16.012361Z", + "iopub.status.busy": "2025-04-06T06:26:16.011859Z", + "iopub.status.idle": "2025-04-06T06:26:16.016410Z", + "shell.execute_reply": "2025-04-06T06:26:16.015528Z", + "shell.execute_reply.started": "2025-04-06T06:26:16.012320Z" + }, + "id": "GLUb83gYSxMW", + "trusted": true + }, + "outputs": [], + "source": [ + "# Import data science packeges\n", + "# 导入数据科学使用的包\n", + "import numpy as np\n", + "import pandas as pd" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "iaZJQxXascfv", + "trusted": true + }, + "outputs": [], + "source": [ + "# By Nathmath" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "YK_VnKgONnIe" + }, + "source": [ + "## 2. 
Configurate the underlying model" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": { + "execution": { + "iopub.execute_input": "2025-04-06T06:26:18.282317Z", + "iopub.status.busy": "2025-04-06T06:26:18.281995Z", + "iopub.status.idle": "2025-04-06T06:26:18.286172Z", + "shell.execute_reply": "2025-04-06T06:26:18.285298Z", + "shell.execute_reply.started": "2025-04-06T06:26:18.282290Z" + }, + "id": "Gm712pctXX3V", + "trusted": true + }, + "outputs": [], + "source": [ + "# HF token\n", + "# HF 的token,如果你需要把训练好的模型保存到hugging face时需要\n", + "_global_hf_token = \"\"" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": { + "execution": { + "iopub.execute_input": "2025-04-06T06:26:19.802259Z", + "iopub.status.busy": "2025-04-06T06:26:19.801970Z", + "iopub.status.idle": "2025-04-06T06:26:19.806029Z", + "shell.execute_reply": "2025-04-06T06:26:19.805249Z", + "shell.execute_reply.started": "2025-04-06T06:26:19.802237Z" + }, + "id": "PMXPujtPN0I1", + "trusted": true + }, + "outputs": [], + "source": [ + "# Model configuration\n", + "# 模型设定\n", + "_global_model_name = \"unsloth/gemma-2-9b-bnb-4bit\" # HF 模型识别名称\n", + "_global_model_max_seqlen = 2048 # 模型的最长输出tokens数,小说设置到8192,但显著增加训练时间\n", + "_global_model_dtype = None\n", + "_global_model_load_in_4bit = True\n" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "metadata": { + "execution": { + "iopub.execute_input": "2025-04-06T06:26:21.966709Z", + "iopub.status.busy": "2025-04-06T06:26:21.966362Z", + "iopub.status.idle": "2025-04-06T06:26:21.974985Z", + "shell.execute_reply": "2025-04-06T06:26:21.974092Z", + "shell.execute_reply.started": "2025-04-06T06:26:21.966678Z" + }, + "id": "epDSH2s4Sosn", + "trusted": true + }, + "outputs": [ + { + "data": { + "text/plain": [ + "1525629678" + ] + }, + "execution_count": 12, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Model training seed configuration\n", + "# 模型训练时的种子,随机生成一个,你也可以自己设定一个\n", + "_train_seed = int(np.random.rand() * 2 ** 32)\n", + "_train_seed" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 2.1. Load the base model into the environment" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "metadata": { + "execution": { + "iopub.execute_input": "2025-04-06T06:26:57.061090Z", + "iopub.status.busy": "2025-04-06T06:26:57.060754Z", + "iopub.status.idle": "2025-04-06T06:27:38.048294Z", + "shell.execute_reply": "2025-04-06T06:27:38.047599Z", + "shell.execute_reply.started": "2025-04-06T06:26:57.061060Z" + }, + "id": "HzK6KLZUSnml", + "trusted": true + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "==((====))== Unsloth 2025.3.18: Fast Gemma2 patching. Transformers: 4.51.0.\n", + " \\\\ /| Tesla T4. Num GPUs = 2. Max memory: 14.741 GB. Platform: Linux.\n", + "O^O/ \\_/ \\ Torch: 2.6.0+cu124. CUDA: 7.5. CUDA Toolkit: 12.4. Triton: 3.2.0\n", + "\\ / Bfloat16 = FALSE. FA [Xformers = 0.0.29.post3. 
FA2 = False]\n", + " \"-____-\" Free license: http://github.com/unslothai/unsloth\n", + "Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!\n" + ] + }, + { + "data": { + "application/vnd.jupyter.widget-view+json": { + "model_id": "6d2dee1095da450ab2962c47e0e28da8", + "version_major": 2, + "version_minor": 0 + }, + "text/plain": [ + "model.safetensors: 0%| | 0.00/6.13G [00:00'}" + ] + }, + "execution_count": 40, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# See what is it like again\n", + "# 看看格式化后的数据是否正确\n", + "dataset_format[495]" + ] + }, + { + "cell_type": "code", + "execution_count": 41, + "metadata": { + "execution": { + "iopub.execute_input": "2025-04-06T06:41:24.245307Z", + "iopub.status.busy": "2025-04-06T06:41:24.244985Z", + "iopub.status.idle": "2025-04-06T06:41:24.271344Z", + "shell.execute_reply": "2025-04-06T06:41:24.270389Z", + "shell.execute_reply.started": "2025-04-06T06:41:24.245278Z" + }, + "trusted": true + }, + "outputs": [ + { + "data": { + "text/plain": [ + "Dataset({\n", + " features: ['Unnamed: 0', 'system', 'user', 'assistant', 'formatted'],\n", + " num_rows: 28599\n", + "})" + ] + }, + "execution_count": 41, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# Shuffle the dataset for randomness\n", + "\n", + "shuffled_dataset = dataset_format.shuffle(seed = int(np.random.rand() * 2 ** 32))\n", + "shuffled_dataset" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "dJzvujwnR_r2" + }, + "source": [ + "## 4. Train the model" + ] + }, + { + "cell_type": "code", + "execution_count": 42, + "metadata": { + "execution": { + "iopub.execute_input": "2025-04-06T06:41:31.370177Z", + "iopub.status.busy": "2025-04-06T06:41:31.369847Z", + "iopub.status.idle": "2025-04-06T06:41:43.354349Z", + "shell.execute_reply": "2025-04-06T06:41:43.353632Z", + "shell.execute_reply.started": "2025-04-06T06:41:31.370149Z" + }, + "id": "Vmj5pSNdSDau", + "trusted": true + }, + "outputs": [ + { + "data": { + "application/vnd.jupyter.widget-view+json": { + "model_id": "af60e02c533e490ead132f8ecc539b8f", + "version_major": 2, + "version_minor": 0 + }, + "text/plain": [ + "Unsloth: Tokenizing [\"formatted\"] (num_proc=4): 0%| | 0/28599 [00:00\n", + " \n", + " \n", + " [ 3/476 01:58 < 15:38:02, 0.01 it/s, Epoch 0.00/1]\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
Step  Training Loss
1     2.093000

" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "## Officially train the model\n", + "## 正式开始训练你的模型,耗费时间(几十分钟到数小时)\n", + "trainer_stats = trainer.train()\n", + "print(\"Completed.\")\n", + "\n", + "# 什么时候停止?如果你的学习率太大,几十步loss损失函数就到0.8/0.6这个位置了,那么\n", + "# 哪怕训练数据没用完,停! 因为接下来的都是过拟合(最简单的理解就是只会照猫画虎,不会举一反三),训练loss不是越低越好。\n", + "# 此时,自己测试一下,如果满意,导出保存。\n", + "# 如果不满意,调低一点learning_rate,再从头训练。\n", + "# 此外,如果你们的数据集中有大量数据重复或者高度相似,也有可能很快过拟合,请考虑数据集的问题。\n", + "#\n", + "# 你们一般不会见到欠拟合的情况。我也不用多说。当然如果真遇到了,再跑一次这一行代码就可以。\n", + "# 个人建议最终停止的位置是loss函数稳定到0.8 ~ 1.4之间的某一个阶段,例如已经稳定了30steps。\n", + "# 如果你们愿意略微调整一下模型而严格防止过拟合,那么可以再1.4 ~ 1.8左右停止训练,或者增加weight_decay。" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "7QH2H_mCTkpu", + "outputId": "4fe81129-f0f6-45b1-870b-085493bfae30", + "trusted": true + }, + "outputs": [], + "source": [ + "# Show final training stats,code is provided by unsloth\n", + "# 训练的统计数据,代码由unsloth提供\n", + "used_memory = round(torch.cuda.max_memory_reserved() / 1024 / 1024 / 1024, 3)\n", + "used_memory_for_lora = round(used_memory - start_gpu_memory, 3)\n", + "used_percentage = round(used_memory /max_memory*100, 3)\n", + "lora_percentage = round(used_memory_for_lora/max_memory*100, 3)\n", + "print(f\"{trainer_stats.metrics['train_runtime']} seconds used for training.\")\n", + "print(f\"{round(trainer_stats.metrics['train_runtime']/60, 2)} minutes used for training.\")\n", + "print(f\"Peak reserved memory = {used_memory} GB.\")\n", + "print(f\"Peak reserved memory for training = {used_memory_for_lora} GB.\")\n", + "print(f\"Peak reserved memory % of max memory = {used_percentage} %.\")\n", + "print(f\"Peak reserved memory for training % of max memory = {lora_percentage} %.\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "3h_v1V-s1zuf" + }, + "source": [ + "## 6. 
Inference Test" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "trusted": true + }, + "outputs": [], + "source": [ + "# Reload the model, if you enter for the second time\n", + "# 重新加载模型,如果你保存了然后第二次进来\n", + "if False:\n", + " from unsloth import FastLanguageModel\n", + "\n", + " _global_model, _global_tokenizer = FastLanguageModel.from_pretrained(\n", + " model_name=\"my_gameintro_gemma9b\",\n", + " max_seq_length=_global_model_max_seqlen,\n", + " dtype = _global_model_dtype,\n", + " load_in_4bit = _global_model_load_in_4bit\n", + " )\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "trusted": true + }, + "outputs": [], + "source": [ + "# If you plan to infer rather than continue to train, call this\n", + "# 如果你希望推理而不是继续训练,调用如下这行代码,训练一定不要调\n", + "if False:\n", + " FastLanguageModel.for_inference(_global_model) # Enable native 2x faster inference" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "9ozwL6jWTtrG", + "outputId": "5ff93dec-d3a1-4f46-e4e6-64df5b9ecbf9", + "trusted": true + }, + "outputs": [], + "source": [ + "# Inference\n", + "# 推理测试\n", + "FastLanguageModel.for_inference(_global_model)\n", + "inputs = _global_tokenizer(\n", + "[\n", + " # Use Infer when doing inference\n", + " alpaca_prompt_infer.format(\n", + " \"你是一个游戏剧情规划师。请你根据我提供的游戏名和游戏特色规划剧情,写出一段引人入胜的游戏介绍。\", # system\n", + " \"请根据游戏名编写游戏介绍:【游戏名】:在终焉的世界里寻找盛开的花。\", # input\n", + " \"\", # output - 留空等待AI生成\n", + " )\n", + "], return_tensors = \"pt\").to(\"cuda\")\n", + "\n", + "outputs = _global_model.generate(**inputs, max_new_tokens = 256, use_cache = True)\n", + "_global_tokenizer.batch_decode(outputs)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "fIJ415USTvOe", + "outputId": "95b525e2-f2b3-4707-f023-85d57790e502", + "trusted": true + }, + "outputs": [], + "source": [ + "# Inference in stream mode\n", + "# 流式推理测试\n", + "FastLanguageModel.for_inference(_global_model)\n", + "inputs = _global_tokenizer(\n", + "[\n", + " # Use Infer when doing inference\n", + " alpaca_prompt_infer.format(\n", + " \"你是一个游戏剧情规划师。请你根据我提供的游戏名和游戏特色规划剧情,写出一段引人入胜的游戏介绍。\", # system\n", + " \"请根据游戏名和游戏特色编写游戏介绍:【游戏名】:风陇之歌 ~Tracking the footprints of time~,【游戏特色】:奇幻, 哲学, 冒险, 宗教, 神话, 白毛。\", # input\n", + " \"\", # output - 留空等待AI生成\n", + " )\n", + "], return_tensors = \"pt\").to(\"cuda\")\n", + "\n", + "from transformers import TextStreamer\n", + "text_streamer = TextStreamer(_global_tokenizer)\n", + "_ = _global_model.generate(**inputs, streamer = text_streamer, max_new_tokens = 2048)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "ttGAEGntVavo" + }, + "source": [ + "## 7. 
+ { + "cell_type": "markdown", + "metadata": { + "id": "ttGAEGntVavo" + }, + "source": [ + "## 7. Save the model" + ] + },
+ { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "_boIgrvbT30O", + "outputId": "3e0bd437-23c0-4bb9-9ecb-f7c0948b5ad3", + "trusted": true + }, + "outputs": [], + "source": [ + "# Save the model in the native huggingface format - locally\n", + "_global_model.save_pretrained(\"my_gameintro_gemma9b\")\n", + "_global_tokenizer.save_pretrained(\"my_gameintro_gemma9b\")\n", + "\n", + "# Save the merged model (includes the base model) - locally\n", + "_global_model.save_pretrained_merged(\"my_gameintro_gemma9b_merged\", _global_tokenizer, save_method = \"merged_16bit\")" + ] + },
+ { + "cell_type": "code", + "execution_count": null, + "metadata": { + "trusted": true + }, + "outputs": [], + "source": [ + "# Push the model in the native huggingface format to the HF Hub - costs time, leave enough runtime\n", + "_global_model.push_to_hub(\"DOFOFFICIAL/NathUI-Tutorial\", token = \"hf_...\")\n", + "_global_tokenizer.push_to_hub(\"DOFOFFICIAL/NathUI-Tutorial\", token = \"hf_...\")\n", + "\n", + "# Push the merged model (includes the base model) to the HF Hub - costs time, leave enough runtime\n", + "_global_model.push_to_hub_merged(\"DOFOFFICIAL/NathUI-Tutorial\", _global_tokenizer, save_method = \"merged_16bit\", token = \"hf_...\")" + ] + },
+ { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "tHIKsMlqVwB_", + "outputId": "c0631069-7724-4f1d-d7cc-24d12786192a", + "trusted": true + }, + "outputs": [], + "source": [ + "# Save as quantized GGUF models - costs time, leave enough runtime\n", + "\n", + "# Save / push as GGUF Q8_0\n", + "# _global_model.save_pretrained_gguf(\"my_gameintro_gemma9b_Q8_0\", _global_tokenizer, quantization_method = \"q8_0\")\n", + "_global_model.push_to_hub_gguf(\"DOFOFFICIAL/ThisIsTmp\", _global_tokenizer, quantization_method = \"q8_0\", token = \"hf_...\")\n", + "\n", + "# Save / push as GGUF Q4_K_M\n", + "# _global_model.save_pretrained_gguf(\"my_gameintro_gemma9b_Q4_K_M\", _global_tokenizer, quantization_method = \"q4_K_M\")\n", + "_global_model.push_to_hub_gguf(\"DOFOFFICIAL/ThisIsTmp\", _global_tokenizer, quantization_method = \"q4_K_M\", token = \"hf_...\")\n" + ] + },
+ { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "3AmGcJfh3B_o", + "trusted": true + }, + "outputs": [], + "source": [ + "# Modified by NathMath, open-sourced under the Apache-2.0 License" + ] + } + ], + "metadata": { + "colab": { + "provenance": [], + "toc_visible": true + }, + "kaggle": { + "accelerator": "nvidiaTeslaT4", + "dataSources": [ + { + "datasetId": 7061846, + "sourceId": 11293954, + "sourceType": "datasetVersion" + } + ], + "dockerImageVersionId": 30919, + "isGpuEnabled": true, + "isInternetEnabled": true, + "language": "python", + "sourceType": "notebook" + }, + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.12" + }, + "widgets": { + "application/vnd.jupyter.widget-state+json": { + "025ddae40b4541a684f1d76c5b289ab6": { + "model_module": "@jupyter-widgets/controls", + 
"model_module_version": "1.5.0", + "model_name": "FloatProgressModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_c345a572c5d4493d93423618f0a6f143", + "max": 32325930, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_813942f77ed3426a8a6261ac60542f48", + "value": 32325930 + } + }, + "141a311668984b2ebaca90db0c57d815": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "14b2db86f938477696e3e54b321caeed": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": "20px" + } + }, + "27b05fd0ecb34c10ae93eac9d4cdc1c5": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + 
"_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_c8177522a2e64f4eafef0d6cbc1f6cdd", + "placeholder": "​", + "style": "IPY_MODEL_29a4704b56cf4d4fa145fac101dbcb8d", + "value": "TrainGemma2.gameintro.queries.lf.csv: 100%" + } + }, + "27daab81e7b64087abae1b5e2a3c0c5d": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "29a4704b56cf4d4fa145fac101dbcb8d": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "39d9dd8bc753487f9a50c939b9ed38e5": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "ProgressStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "3a7e8f584e72412c8fa4fd8063630e2b": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "48641e4f73954da987f034a74c3b8b5e": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": 
null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "498c133792ea4cd38b6385be5f7f8c62": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_ec1882dbce7d46e28580b8cf2b4da19b", + "placeholder": "​", + "style": "IPY_MODEL_860d1a845d764ff6853ae56e6ac3afcc", + "value": " 28599/0 [00:00<00:00, 35222.06 examples/s]" + } + }, + "4a562f7c91b849599ff06f76da226c85": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "523070898a474b74bde2d365c69dddff": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "528f68abd3af4761b1b00b40599e0fbf": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HBoxModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_f84cfa74576a46e88caa9989161d08fc", + 
"IPY_MODEL_5d03f9e044f146b7b36b34cdcf21153a", + "IPY_MODEL_498c133792ea4cd38b6385be5f7f8c62" + ], + "layout": "IPY_MODEL_141a311668984b2ebaca90db0c57d815" + } + }, + "5d03f9e044f146b7b36b34cdcf21153a": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "FloatProgressModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_14b2db86f938477696e3e54b321caeed", + "max": 1, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_fdb857f832a54286ba504046a177a883", + "value": 1 + } + }, + "5f4a077a96ad446cb41241108eac2618": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "63175136102c4135acce3e700bf92cf8": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "729a4366db2d4203adb0fd2b0fb2dd4c": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "FloatProgressModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "FloatProgressModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "ProgressView", + "bar_style": "success", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_3a7e8f584e72412c8fa4fd8063630e2b", + "max": 28599, + "min": 0, + "orientation": "horizontal", + "style": "IPY_MODEL_39d9dd8bc753487f9a50c939b9ed38e5", + "value": 28599 + } + }, + "813942f77ed3426a8a6261ac60542f48": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "ProgressStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", 
+ "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + }, + "860d1a845d764ff6853ae56e6ac3afcc": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "96b6572ec6184bfcb655ba62329631ec": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_523070898a474b74bde2d365c69dddff", + "placeholder": "​", + "style": "IPY_MODEL_fced41bbaf3a496aa62080a6b88e1afc", + "value": " 28599/28599 [00:00<00:00, 50440.48 examples/s]" + } + }, + "9dd744f6393943f2959c27ac89e93ba1": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HBoxModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_27b05fd0ecb34c10ae93eac9d4cdc1c5", + "IPY_MODEL_025ddae40b4541a684f1d76c5b289ab6", + "IPY_MODEL_bd4b2fd97cb34f5f900e91969cc588c7" + ], + "layout": "IPY_MODEL_48641e4f73954da987f034a74c3b8b5e" + } + }, + "b43311c7e443439d9776c1cb51f07baf": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HBoxModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HBoxModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HBoxView", + "box_style": "", + "children": [ + "IPY_MODEL_c121083f404b494f881cab7d9ec122d6", + "IPY_MODEL_729a4366db2d4203adb0fd2b0fb2dd4c", + "IPY_MODEL_96b6572ec6184bfcb655ba62329631ec" + ], + "layout": "IPY_MODEL_ca83618081ae48ee88976c512c3f2022" + } + }, + "bd4b2fd97cb34f5f900e91969cc588c7": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_e35595cc50a24fcca6f812997c8c8dc5", + "placeholder": "​", + "style": "IPY_MODEL_63175136102c4135acce3e700bf92cf8", + "value": " 32.3M/32.3M [00:00<00:00, 34.5MB/s]" + } + }, + "c121083f404b494f881cab7d9ec122d6": { + "model_module": "@jupyter-widgets/controls", + 
"model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_5f4a077a96ad446cb41241108eac2618", + "placeholder": "​", + "style": "IPY_MODEL_4a562f7c91b849599ff06f76da226c85", + "value": "Map: 100%" + } + }, + "c345a572c5d4493d93423618f0a6f143": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "c8177522a2e64f4eafef0d6cbc1f6cdd": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "ca83618081ae48ee88976c512c3f2022": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + 
"bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "e04c52ea5941431f9d1dd12d2f41654a": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "e35595cc50a24fcca6f812997c8c8dc5": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "ec1882dbce7d46e28580b8cf2b4da19b": { + "model_module": "@jupyter-widgets/base", + "model_module_version": "1.2.0", + "model_name": "LayoutModel", + "state": { + "_model_module": "@jupyter-widgets/base", + "_model_module_version": "1.2.0", + "_model_name": "LayoutModel", + "_view_count": null, + "_view_module": 
"@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "LayoutView", + "align_content": null, + "align_items": null, + "align_self": null, + "border": null, + "bottom": null, + "display": null, + "flex": null, + "flex_flow": null, + "grid_area": null, + "grid_auto_columns": null, + "grid_auto_flow": null, + "grid_auto_rows": null, + "grid_column": null, + "grid_gap": null, + "grid_row": null, + "grid_template_areas": null, + "grid_template_columns": null, + "grid_template_rows": null, + "height": null, + "justify_content": null, + "justify_items": null, + "left": null, + "margin": null, + "max_height": null, + "max_width": null, + "min_height": null, + "min_width": null, + "object_fit": null, + "object_position": null, + "order": null, + "overflow": null, + "overflow_x": null, + "overflow_y": null, + "padding": null, + "right": null, + "top": null, + "visibility": null, + "width": null + } + }, + "f84cfa74576a46e88caa9989161d08fc": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "HTMLModel", + "state": { + "_dom_classes": [], + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "HTMLModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/controls", + "_view_module_version": "1.5.0", + "_view_name": "HTMLView", + "description": "", + "description_tooltip": null, + "layout": "IPY_MODEL_e04c52ea5941431f9d1dd12d2f41654a", + "placeholder": "​", + "style": "IPY_MODEL_27daab81e7b64087abae1b5e2a3c0c5d", + "value": "Generating train split: " + } + }, + "fced41bbaf3a496aa62080a6b88e1afc": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "DescriptionStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "DescriptionStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "description_width": "" + } + }, + "fdb857f832a54286ba504046a177a883": { + "model_module": "@jupyter-widgets/controls", + "model_module_version": "1.5.0", + "model_name": "ProgressStyleModel", + "state": { + "_model_module": "@jupyter-widgets/controls", + "_model_module_version": "1.5.0", + "_model_name": "ProgressStyleModel", + "_view_count": null, + "_view_module": "@jupyter-widgets/base", + "_view_module_version": "1.2.0", + "_view_name": "StyleView", + "bar_color": null, + "description_width": "" + } + } + } + } + }, + "nbformat": 4, + "nbformat_minor": 4 +}