Commit 73cac8b: Update README.md to clarify license
rtaori authored Mar 25, 2023 · 1 parent eb5b171
Showing 1 changed file: README.md (34 additions, 18 deletions)
<p align="center">
<a href="https://crfm.stanford.edu/alpaca/" target="_blank"><img src="assets/logo.png" alt="Stanford-Alpaca" style="width: 50%; min-width: 300px; display: block; margin: auto;"></a>
</p>

# Stanford Alpaca: An Instruction-following LLaMA Model

[![Code License](https://img.shields.io/badge/Code%20License-Apache_2.0-green.svg)](https://github.com/tatsu-lab/stanford_alpaca/blob/main/LICENSE)
[![Data License](https://img.shields.io/badge/Data%20License-CC%20By%20NC%204.0-red.svg)](https://github.com/tatsu-lab/stanford_alpaca/blob/main/DATA_LICENSE)
[![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/release/python-390/)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)

This is the repo for the Stanford Alpaca project, which aims to build and share an instruction-following LLaMA model. The repo contains:

- The [52K data](#data-release) used for fine-tuning the model.
- The code for [generating the data](#data-generation-process).
- The code for [fine-tuning the model](#fine-tuning).

Note: We thank the community for its feedback on Stanford Alpaca and for supporting our research. Our live demo is suspended until further notice.

**Usage and License Notices**: Alpaca is intended and licensed for research use only. The dataset is CC BY NC 4.0 (allowing only non-commercial use) and models trained using the dataset should not be used outside of research purposes.

## Overview

The current Alpaca model is fine-tuned from a 7B LLaMA model [1] on 52K instruction-following data generated by the techniques in the Self-Instruct [2] paper, with some modifications that we discuss in the next section.
Our initial release contains the data generation procedure, dataset, and training recipe.

**Please read our release [blog post](https://crfm.stanford.edu/2023/03/13/alpaca.html) for more details about the model, our discussion of the potential harm and limitations of Alpaca models, and our thought process for releasing a reproducible model.**


[1]: LLaMA: Open and Efficient Foundation Language Models. Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, Guillaume Lample. https://arxiv.org/abs/2302.13971v1

[2]: Self-Instruct: Aligning Language Model with Self Generated Instructions. Yizhong Wang, Yeganeh Kordi, Swaroop Mishra, Alisa Liu, Noah A. Smith, Daniel Khashabi, Hannaneh Hajishirzi. https://arxiv.org/abs/2212.10560


## Data Release

[`alpaca_data.json`](./alpaca_data.json) contains the 52K instruction-following data we used for fine-tuning the Alpaca model.
This JSON file is a list of dictionaries; each dictionary contains the following fields:

- `instruction`: `str`, describes the task the model should perform. Each of the 52K instructions is unique.
- `input`: `str`, optional context or input for the task. For example, when the instruction is "Summarize the following article", the input is the article. Around 40% of the examples have an input.
- `output`: `str`, the answer to the instruction as generated by `text-davinci-003`.
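
For illustration, here is a minimal sketch of loading the released file and inspecting one record (it assumes `alpaca_data.json` sits in the current directory):

```python
import json

# Load the released 52K instruction-following examples.
with open("alpaca_data.json") as f:
    examples = json.load(f)

print(len(examples))        # number of examples (52K)
print(sorted(examples[0]))  # ['input', 'instruction', 'output']
```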

We used the following prompts for fine-tuning the Alpaca model:

- for examples with a non-empty input field:

```
Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Input:
{input}

### Response:
```

- for examples with an empty input field:

```
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Response:
```

During inference (e.g., for the web demo), we use the user instruction with an empty input field (the second option).
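
A minimal sketch of how these templates can be applied to a record (the constant and helper names below are our own, not part of the training code):

```python
PROMPT_WITH_INPUT = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Input:
{input}

### Response:
"""

PROMPT_NO_INPUT = """Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Response:
"""


def format_prompt(example: dict) -> str:
    # Use the first template when `input` is non-empty, the second otherwise
    # (the second template is also the one used at inference time).
    if example.get("input", ""):
        return PROMPT_WITH_INPUT.format(**example)
    return PROMPT_NO_INPUT.format(**example)
```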

## Data Generation Process

We built on the data generation pipeline from [self-instruct](https://github.com/yizhongw/self-instruct) and made the following modifications:

- We used `text-davinci-003` to generate the instruction data instead of `davinci`.
- We wrote a new prompt (`prompt.txt`) that explicitly gave the requirements for instruction generation to `text-davinci-003`. Note: there is a slight error in the prompt we used, and future users should incorporate the edit in <https://github.com/tatsu-lab/stanford_alpaca/pull/24>
- We adopted much more aggressive batch decoding, i.e., generating 20 instructions at once, which significantly reduced the cost of data generation (a minimal sketch of this batched call follows this list).
- We simplified the data generation pipeline by discarding the difference between classification and non-classification instructions.
- We only generated a single instance for each instruction, instead of the 2 to 3 instances generated in self-instruct [2].
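
Below is a minimal sketch of that batched generation step. It assumes the legacy (pre-1.0) `openai` Python SDK; the prompt construction, decoding parameters, and function name are illustrative rather than the exact ones used in this repo's pipeline.

```python
import openai  # legacy (pre-1.0) SDK; reads OPENAI_API_KEY from the environment


def generate_instruction_batch(prompt_prefix: str, batch_size: int = 20) -> str:
    """Ask text-davinci-003 to continue a numbered list with `batch_size` new instructions.

    `prompt_prefix` is assumed to be the contents of prompt.txt followed by a few
    numbered seed instructions, so the model keeps extending the list.
    """
    response = openai.Completion.create(
        model="text-davinci-003",
        prompt=prompt_prefix,
        max_tokens=3072,              # illustrative value
        temperature=1.0,              # illustrative value
        stop=[f"{batch_size + 1}."],  # stop once the requested number of instructions is reached
    )
    return response["choices"][0]["text"]
```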

This produced an instruction-following dataset with 52K examples obtained at a much lower cost (less than $500).
In a preliminary study, we also find our 52K generated data to be much more diverse than the data released by [self-instruct](https://github.com/yizhongw/self-instruct/blob/main/data/seed_tasks.jsonl).
We plot the figure below (in the style of Figure 2 in the [self-instruct paper](https://arxiv.org/abs/2212.10560)) to demonstrate the diversity of our data.
The inner circle of the plot represents the root verb of the instructions, and the outer circle represents the direct objects.
[<img src="assets/parse_analysis.png" width="750" />](./assets/parse_analysis.png)

## Fine-tuning

We fine-tune our models using standard Hugging Face training code with the following hyperparameters:

| Hyperparameter | Value |
Since Hugging Face has not yet officially supported the LLaMA models, we fine-tuned LLaMA with Hugging Face's transformers library by installing it from a particular fork (i.e., this [PR](https://github.com/huggingface/transformers/pull/21955), which has yet to be merged).
The hash of the specific commit we installed was `68d640f7c368bcaaaecfc678f11908ebbd3d6176`.

To reproduce our fine-tuning runs for LLaMA, first install the requirements:

```bash
pip install -r requirements.txt
```

Then, install the particular fork of Hugging Face's transformers library.

Below is a command that fine-tunes LLaMA-7B with our dataset on a machine with 4 A100 80G GPUs in FSDP `full_shard` mode.
We were able to reproduce a model of similar quality to the one we hosted in our demo with the following command, using **Python 3.10**.
Replace `<your_random_port>` with a port of your own, `<your_path_to_hf_converted_llama_ckpt_and_tokenizer>` with the
path to your converted checkpoint and tokenizer (following instructions in the PR), and `<your_output_dir>` with where you want to store your outputs.

```bash
torchrun --nproc_per_node=4 --master_port=<your_random_port> train.py \
```

### Warning

`fsdp_transformer_layer_cls_to_wrap` must be set to the name of the specific decoder layer.
The LLaMA Hugging Face PR is not stable.
Earlier commits used the name `LLaMADecoderLayer` for their decoder layer (this is the name used by the commit our code is based on).
More recent commits use `LlamaDecoderLayer` (note the difference in capitalization).
Not setting `fsdp_transformer_layer_cls_to_wrap` to the correct name will lead to drastic slowdowns in training.
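
As a quick way to check which spelling your installed build uses, the sketch below lists the decoder-layer class names it exposes; it assumes the PR places the model code under `transformers.models.llama`, which may itself vary across commits.

```python
import importlib

# List decoder-layer class names exposed by the installed LLaMA port.
modeling = importlib.import_module("transformers.models.llama.modeling_llama")
print([name for name in dir(modeling) if name.lower() == "llamadecoderlayer"])
# e.g. ['LLaMADecoderLayer'] or ['LlamaDecoderLayer'], depending on the commit;
# pass the printed name to --fsdp_transformer_layer_cls_to_wrap.
```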

Note the given training script is meant to be simple and easy to use, and is not particularly optimized.
To run on more GPUs, you may prefer to turn down `gradient_accumulation_steps` to keep a global batch size of 128. Global batch size has not been tested for optimality.
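
As a sanity check on that arithmetic (the per-device batch size below is an assumption; use whatever your own launch command specifies):

```python
# global batch size = n_gpus * per_device_train_batch_size * gradient_accumulation_steps
n_gpus = 8                       # e.g. twice the 4-GPU setup above
per_device_train_batch_size = 4  # assumption: match this to your launch command
gradient_accumulation_steps = 128 // (n_gpus * per_device_train_batch_size)
print(gradient_accumulation_steps)  # 4 keeps the global batch size at 128
```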

### Authors

All grad students below contributed equally and the order is determined by random draw.

- [Rohan Taori](https://www.rohantaori.com/)
All advised by [Tatsunori B. Hashimoto](https://thashim.github.io/). Yann is also advised by Percy Liang and Xuechen is also advised by Carlos Guestrin.
### Citation

Please cite this repo if you use its data or code.

```
@misc{alpaca,
  author = {Rohan Taori and Ishaan Gulrajani and Tianyi Zhang and Yann Dubois and Xuechen Li and Carlos Guestrin and Percy Liang and Tatsunori B. Hashimoto},
  title = {Stanford Alpaca: An Instruction-following LLaMA Model},
  year = {2023},
  howpublished = {\url{https://github.com/tatsu-lab/stanford_alpaca}},
}
```
