databricks · lu-wang-dl · Oct 5, 2023 · Oct 5, 2023 · Oct 9, 2023 · Oct 9, 2023
@@ -30,7 +30,7 @@ table th:nth-of-type(4) {
 
 | Use case                               | Quality-optimized                                                                                                                                                                                                                                                                                                                                 | Balanced                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               | Speed-optimized                                                                                     |
 |----------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------|
-| Text generation following instructions | [MPT-30B-Instruct](llm-models/mpt/mpt-30b/) <br> <br> [Llama-2-70b-chat-hf](llm-models/llamav2/llamav2-70b)                                                                                                                                                                                                                                       | [MPT-7B-Instruct](llm-models/mpt/mpt-7b) <br> [MPT-7B-8k-Instruct](llm-models/mpt/mpt-7b-8k) <br> <br> [Llama-2-7b-chat-hf](llm-models/llamav2/llamav2-7b) <br> [Llama-2-13b-chat-hf](llm-models/llamav2/llamav2-13b)                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |                                                                                                     |
+| Text generation following instructions | [MPT-30B-Instruct](llm-models/mpt/mpt-30b/) <br> <br> [Llama-2-70b-chat-hf](llm-models/llamav2/llamav2-70b)                                                                                                                                                                                                                                       | [mistral-7b](llm-models/mistral/mistral-7b) <br><br> [MPT-7B-Instruct](llm-models/mpt/mpt-7b) <br> [MPT-7B-8k-Instruct](llm-models/mpt/mpt-7b-8k) <br> <br> [Llama-2-7b-chat-hf](llm-models/llamav2/llamav2-7b) <br> [Llama-2-13b-chat-hf](llm-models/llamav2/llamav2-13b)                                                                                                                                                                                                                                                                                                                                                                                                             |                                                                                                     |
 | Text embeddings (English only)         |                                                                                                                                                                                                                                                                                                                                                   | [bge-large-en(0.3B)](llm-models/embedding/bge/bge-large) <br> [e5-large-v2 (0.3B)](llm-models/embedding/e5-v2) <br> [instructor-xl (1.3B)](llm-models/embedding/instructor-xl)*                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        | [bge-base-en (0.1B)](llm-models/embedding/bge) <br> [e5-base-v2 (0.1B)](llm-models/embedding/e5-v2) |
 | Transcription (speech to text)         |                                                                                                                                                                                                                                                                                                                                                   | [whisper-large-v2](llm-models/transcription/whisper)(1.6B) <br> [whisper-medium](llm-models/transcription/whisper) (0.8B)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |                                                                                                     |
 | Image generation                       |                                                                                                                                                                                                                                                                                                                                                   | [stable-diffusion-xl](llm-models/image_generation/stable_diffusion)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |                                                                                                     |

@@ -0,0 +1,47 @@
+{
+    "fp16": {
+      "enabled": false
+    },
+    "bf16": {
+      "enabled": true
+    },
+    "optimizer": {
+      "type": "AdamW",
+      "params": {
+        "lr": "auto",
+        "betas": "auto",
+        "eps": "auto",
+        "weight_decay": "auto"
+      }
+    },
+    "scheduler": {
+      "type": "WarmupLR",
+      "params": {
+        "warmup_min_lr": "auto",
+        "warmup_max_lr": "auto",
+        "warmup_num_steps": "auto"
+      }
+    },
+    "zero_optimization": {
+      "stage": 2,
+      "overlap_comm": true,
+      "contiguous_gradients": true,
+      "sub_group_size": 5e7,
+      "reduce_bucket_size": "auto",
+      "reduce_scatter": true,
+      "offload_param": {
+        "device": "cpu",
+        "pin_memory": true
+      },
+      "offload_optimizer": {
+        "device": "cpu",
+        "pin_memory": true
+      }
+    },
+    "gradient_accumulation_steps": "auto",
+    "gradient_clipping": "auto",
+    "steps_per_print": 50,
+    "train_batch_size": "auto",
+    "train_micro_batch_size_per_gpu": "auto",
+    "wall_clock_breakdown": false
+}
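The config above leaves many values as `"auto"`. When a DeepSpeed config is passed through the Hugging Face `Trainer` integration, those placeholders are filled in from the training arguments; whether the training script here does that is an assumption, since the script is not shown in this diff. The sketch below lists every `"auto"` field in such a config. It also flags keys that the DeepSpeed documentation describes as ZeRO stage 3 options when the configured stage is lower. The helper names and the `STAGE3_ONLY_KEYS` list are illustrative, not part of DeepSpeed, and the list should be checked against the DeepSpeed version in use.

```python
import json

# Keys the DeepSpeed docs describe as ZeRO stage 3 options.
# Assumption: verify this list against your DeepSpeed version.
STAGE3_ONLY_KEYS = ("offload_param", "sub_group_size")


def find_auto_fields(node, prefix=""):
    """Return dotted paths of every value set to the string "auto"."""
    paths = []
    if isinstance(node, dict):
        for key, value in node.items():
            path = f"{prefix}.{key}" if prefix else key
            paths.extend(find_auto_fields(value, path))
    elif node == "auto":
        paths.append(prefix)
    return paths


def stage3_keys_below_stage3(config):
    """Return stage-3-only keys that appear while the ZeRO stage is below 3."""
    zero = config.get("zero_optimization", {})
    if zero.get("stage", 0) >= 3:
        return []
    return [key for key in STAGE3_ONLY_KEYS if key in zero]


# Trimmed copy of the config above, inlined so the sketch runs on its own.
SAMPLE = json.loads("""
{
  "optimizer": {"type": "AdamW", "params": {"lr": "auto", "weight_decay": "auto"}},
  "zero_optimization": {
    "stage": 2,
    "sub_group_size": 5e7,
    "reduce_bucket_size": "auto",
    "offload_param": {"device": "cpu", "pin_memory": true},
    "offload_optimizer": {"device": "cpu", "pin_memory": true}
  },
  "train_batch_size": "auto"
}
""")

auto_fields = find_auto_fields(SAMPLE)
misplaced = stage3_keys_below_stage3(SAMPLE)
print("auto fields:", auto_fields)
print("stage-3 keys under a lower stage:", misplaced)
```

A check like this can be run before launching a long job to confirm which values will be resolved at runtime and whether the ZeRO stage matches the offload settings.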
@@ -0,0 +1,74 @@
+# Databricks notebook source
+# MAGIC %md
+# MAGIC
+# MAGIC # Fine-tune Llama-2-70b with LoRA and DeepSpeed on a single node
+# MAGIC
+# MAGIC [Llama 2](https://huggingface.co/meta-llama) is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. The models are trained on 2T tokens and support a context window of up to 4K tokens. [Llama-2-70b-hf](https://huggingface.co/meta-llama/Llama-2-70b-hf) is the 70B pretrained model, converted to the Hugging Face Transformers format.
+# MAGIC
+# MAGIC This notebook fine-tunes the [llama-2-70b-hf](https://huggingface.co/meta-llama/Llama-2-70b-hf) model on the [dolly_hhrlhf](https://huggingface.co/datasets/mosaicml/dolly_hhrlhf) dataset.
+# MAGIC
+# MAGIC Environment for this notebook:
+# MAGIC - Runtime: 14.0 GPU ML Runtime
+# MAGIC - Instance: `Standard_NC48ads_A100_v4` on Azure with 2 A100-80GB GPUs, `p4d.24xlarge` on AWS with 8 A100-40GB GPUs
+# MAGIC
+# MAGIC Requirements:
+# MAGIC - To get access to the model on Hugging Face, visit the [Meta website](https://ai.meta.com/resources/models-and-libraries/llama-downloads) and accept the license terms and acceptable use policy before submitting the request form. Requests are processed in 1-2 days.
+# MAGIC
+
+# COMMAND ----------
+
+# MAGIC %md
+# MAGIC Install the missing libraries
+
+# COMMAND ----------
+
+# MAGIC %pip install deepspeed==0.9.5 xformers
+# MAGIC %pip install git+https://github.com/huggingface/peft.git
+# MAGIC %pip install bitsandbytes==0.40.1 einops==0.6.1 trl==0.4.7
+# MAGIC %pip install -U torch==2.0.1 accelerate==0.21.0 transformers==4.31.0
+# MAGIC dbutils.library.restartPython()
+
+# COMMAND ----------
+
+import os
+os.environ["HF_HOME"] = "/local_disk0/hf"
+os.environ["HF_DATASETS_CACHE"] = "/local_disk0/hf"
+os.environ["TRANSFORMERS_CACHE"] = "/local_disk0/hf"
+
+# COMMAND ----------
+
+from huggingface_hub import notebook_login
+
+# Log in to Hugging Face to get access to the model
+notebook_login()
+
+# COMMAND ----------
+
+# MAGIC %md
+# MAGIC ## Fine tune the model with `deepspeed`
+# MAGIC
+# MAGIC The fine-tuning logic is written in `scripts/fine_tune_lora.py`, the script launched in the next cell. The dataset used for fine-tuning is the [databricks-dolly-15k](https://huggingface.co/datasets/databricks/databricks-dolly-15k) dataset.
+# MAGIC
+# MAGIC
+
+# COMMAND ----------
+
+# MAGIC %sh
+# MAGIC deepspeed \
+# MAGIC --num_gpus 2 \
+# MAGIC scripts/fine_tune_lora.py \
+# MAGIC --output_dir="/local_disk0/output"
+
+# COMMAND ----------
+
+# MAGIC %md
+# MAGIC The model checkpoint is saved at `/local_disk0/final_model`.
+
+# COMMAND ----------
+
+# MAGIC %sh
+# MAGIC ls /local_disk0/final_model
+
+# COMMAND ----------
+
+
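The launch cell in the notebook hardcodes `--num_gpus 2`, which matches the Azure instance listed in the environment notes, while the AWS instance listed there has 8 GPUs. As a minimal sketch, the command can be assembled from a GPU count so the same notebook covers both instance types. The helper `build_deepspeed_command` is hypothetical. The script path, flag names, and output directory are copied from the launch cell, and no other flags are assumed.

```python
import shlex


def build_deepspeed_command(
    num_gpus,
    script="scripts/fine_tune_lora.py",
    output_dir="/local_disk0/output",
):
    """Build the argument list for the deepspeed launcher used in the notebook."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return [
        "deepspeed",
        "--num_gpus", str(num_gpus),
        script,
        f"--output_dir={output_dir}",
    ]


# 2 GPUs on the Azure instance, 8 GPUs on the AWS instance listed in the notebook.
azure_cmd = build_deepspeed_command(2)
aws_cmd = build_deepspeed_command(8)

# shlex.join renders the list as a shell-safe string for a %sh cell or subprocess call.
print(shlex.join(azure_cmd))
print(shlex.join(aws_cmd))
```

Keeping the command as a list makes it safe to pass to `subprocess.run` without shell quoting issues, and the GPU count could instead come from `torch.cuda.device_count()` on the driver node.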