# Add new example to fine tune llama-2 70b with lora #80
**New file** (47 lines): DeepSpeed ZeRO configuration

```json
{
  "fp16": {
    "enabled": false
  },
  "bf16": {
    "enabled": true
  },
  "optimizer": {
    "type": "AdamW",
    "params": {
      "lr": "auto",
      "betas": "auto",
      "eps": "auto",
      "weight_decay": "auto"
    }
  },
  "scheduler": {
    "type": "WarmupLR",
    "params": {
      "warmup_min_lr": "auto",
      "warmup_max_lr": "auto",
      "warmup_num_steps": "auto"
    }
  },
  "zero_optimization": {
    "stage": 2,
    "overlap_comm": true,
    "contiguous_gradients": true,
    "sub_group_size": 5e7,
    "reduce_bucket_size": "auto",
    "reduce_scatter": true,
    "offload_param": {
      "device": "cpu",
      "pin_memory": true
    },
    "offload_optimizer": {
      "device": "cpu",
      "pin_memory": true
    }
  },
  "gradient_accumulation_steps": "auto",
  "gradient_clipping": "auto",
  "steps_per_print": 50,
  "train_batch_size": "auto",
  "train_micro_batch_size_per_gpu": "auto",
  "wall_clock_breakdown": false
}
```
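The `"auto"` values in this config are placeholders that the Hugging Face `Trainer` fills in from its own `TrainingArguments` when the config path is passed via `TrainingArguments(deepspeed="ds_config.json")` (the file name here is an assumption). A small standard-library sketch that locates those placeholders in a trimmed copy of the config:

```python
import json

# Trimmed copy of the config above; "auto" fields are resolved by the
# HF Trainer from TrainingArguments at launch time.
config_text = '''
{
  "bf16": {"enabled": true},
  "optimizer": {"type": "AdamW", "params": {"lr": "auto", "weight_decay": "auto"}},
  "zero_optimization": {"stage": 2, "sub_group_size": 5e7},
  "train_batch_size": "auto"
}
'''

config = json.loads(config_text)

def find_auto(node, path=""):
    """Recursively collect the dotted paths of every "auto" placeholder."""
    if node == "auto":
        return [path]
    if isinstance(node, dict):
        return [p for key, value in node.items()
                for p in find_auto(value, f"{path}.{key}".lstrip("."))]
    return []

print(find_auto(config))
# → ['optimizer.params.lr', 'optimizer.params.weight_decay', 'train_batch_size']
```

Fields left as concrete values (like `"stage": 2`) are taken verbatim by DeepSpeed; only the `"auto"` entries are overridden by the Trainer.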
**New file** (74 lines): Databricks notebook

```python
# Databricks notebook source
# MAGIC %md
# MAGIC
# MAGIC # Fine tune llama-2-70b with LoRA and deepspeed on a single node
# MAGIC
# MAGIC [Llama 2](https://huggingface.co/meta-llama) is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. It is trained on 2T tokens and supports a context window of up to 4K tokens. [Llama-2-70b-hf](https://huggingface.co/meta-llama/Llama-2-70b-hf) is the 70B pretrained model, converted to the Hugging Face Transformers format.
```
> **Contributor** (nit, suggested change)
>
> **Author:** deepspeed is used for multi-GPU training with LoRA.
```python
# MAGIC
# MAGIC This notebook fine-tunes the [llama-2-70b-hf](https://huggingface.co/meta-llama/Llama-2-70b-hf) model on the [dolly_hhrlhf](https://huggingface.co/datasets/mosaicml/dolly_hhrlhf) dataset.
# MAGIC
# MAGIC Environment for this notebook:
# MAGIC - Runtime: 14.0 GPU ML Runtime
# MAGIC - Instance: `Standard_NC48ads_A100_v4` on Azure with 2 A100-80GB GPUs, or `p4d.24xlarge` on AWS with 8 A100-40GB GPUs
# MAGIC
# MAGIC Requirements:
# MAGIC - To get access to the model on Hugging Face, visit the [Meta website](https://ai.meta.com/resources/models-and-libraries/llama-downloads) and accept Meta's license terms and acceptable use policy before submitting the request form. Requests are typically processed in 1-2 days.
# MAGIC
```
```python
# COMMAND ----------

# MAGIC %md
# MAGIC Install the missing libraries

# COMMAND ----------

# MAGIC %pip install deepspeed==0.9.5 xformers
# MAGIC %pip install git+https://github.com/huggingface/peft.git
# MAGIC %pip install bitsandbytes==0.40.1 einops==0.6.1 trl==0.4.7
# MAGIC %pip install -U torch==2.0.1 accelerate==0.21.0 transformers==4.31.0

# COMMAND ----------

# Restart the Python process so the newly installed packages are picked up
dbutils.library.restartPython()

# COMMAND ----------

import os
os.environ["HF_HOME"] = "/local_disk0/hf"
os.environ["HF_DATASETS_CACHE"] = "/local_disk0/hf"
os.environ["TRANSFORMERS_CACHE"] = "/local_disk0/hf"

# COMMAND ----------

from huggingface_hub import notebook_login

# Login to Hugging Face to get access to the model
notebook_login()
```
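For non-interactive runs (e.g. scheduled jobs), `notebook_login()` can be replaced by supplying the token through the environment, which `huggingface_hub` reads (`HF_TOKEN` in newer versions, `HUGGING_FACE_HUB_TOKEN` historically). A sketch; the placeholder token and the secret-scope names in the comment are purely illustrative:

```python
import os

# In real use, fetch the token from a Databricks secret scope instead, e.g.
# dbutils.secrets.get("my_scope", "hf_token") -- scope and key names are assumptions.
token = os.environ.get("HF_TOKEN", "hf_xxx_placeholder")

# huggingface_hub picks this variable up instead of requiring an interactive login
os.environ["HUGGING_FACE_HUB_TOKEN"] = token
```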
```python
# COMMAND ----------

# MAGIC %md
# MAGIC ## Fine-tune the model with `deepspeed`
# MAGIC
# MAGIC The fine-tuning logic is in `scripts/fine_tune_deepspeed.py`. The dataset used for fine-tuning is the [databricks-dolly-15k](https://huggingface.co/datasets/databricks/databricks-dolly-15k) dataset.
```
```python
# COMMAND ----------

# MAGIC %sh
# MAGIC deepspeed \
# MAGIC --num_gpus 2 \
```
> **Contributor:** …
>
> **Author:** Good point. Let me remove it.
```python
# MAGIC scripts/fine_tune_lora.py \
# MAGIC --output_dir="/local_disk0/output"
```
> **Contributor:** Q: What is the difference between …
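For readers unfamiliar with the technique: LoRA freezes the base weight matrix `W` and learns a low-rank update `ΔW = B @ A` of rank `r`, so only a small fraction of parameters is trained. A tiny standard-library illustration of the update math (toy sizes; the actual script would use `torch` and `peft`):

```python
# Toy illustration of the LoRA update W' = W + (alpha / r) * B @ A,
# using plain lists so it runs anywhere.
def matmul(X, Y):
    """Naive matrix multiply over lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

d, r = 4, 1                          # model dim 4, LoRA rank 1
alpha = 2.0                          # LoRA scaling factor
W = [[1.0] * d for _ in range(d)]    # frozen base weight (d x d)
A = [[0.5] * d]                      # trainable (r x d)
B = [[1.0] for _ in range(d)]        # trainable (d x r); zero-initialized in practice

delta = matmul(B, A)                 # (d x d) low-rank update, only 2*d*r params
W_prime = [[w + (alpha / r) * dw for w, dw in zip(w_row, d_row)]
           for w_row, d_row in zip(W, delta)]

print(W_prime[0][0])  # → 2.0  (1.0 + 2.0 * (1.0 * 0.5))
```

For a 70B model with small `r`, the trainable `A` and `B` matrices amount to well under 1% of the base parameters, which is what makes single-node fine-tuning feasible here.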
```python
# COMMAND ----------

# MAGIC %md
# MAGIC Model checkpoint is saved at `/local_disk0/final_model`.

# COMMAND ----------

# MAGIC %sh
# MAGIC ls /local_disk0/final_model
```
> **Contributor:** Could you also add instructions or code for how to load this for inference?
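Loading a LoRA checkpoint for inference typically means loading the base model and attaching the adapter, roughly `AutoModelForCausalLM.from_pretrained(base_model)` followed by `PeftModel.from_pretrained(model, "/local_disk0/final_model")`. Since that requires GPU-sized dependencies, here is a minimal standard-library sketch that just verifies a directory has the layout `peft` writes (the expected file names are assumptions based on `peft`'s default output):

```python
import os
import tempfile

def looks_like_peft_checkpoint(path):
    """Return True if the directory contains a PEFT adapter config plus weights."""
    names = set(os.listdir(path))
    has_weights = bool(names & {"adapter_model.bin", "adapter_model.safetensors"})
    return "adapter_config.json" in names and has_weights

# Demo with a fake checkpoint layout in a temp directory
with tempfile.TemporaryDirectory() as d:
    for name in ("adapter_config.json", "adapter_model.safetensors"):
        open(os.path.join(d, name), "w").close()
    print(looks_like_peft_checkpoint(d))  # → True
```

A check like this can catch a truncated or misdirected save before spending time loading the 70B base model.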
> **Contributor:** Since 07 is used for the AI gateway, maybe use other indices.
>
> **Author:** Sure. Let's design a proper ordering afterwards.