Commit
Merge branch 'main' into eval-example
philschmid authored Sep 23, 2024
2 parents ecdadcb + 241791c commit 84e63ea
Showing 31 changed files with 589 additions and 68 deletions.
3 changes: 2 additions & 1 deletion .github/workflows/doc-build.yml
@@ -10,13 +10,14 @@ on:
- .github/workflows/doc-build.yml

jobs:
build:
uses: huggingface/doc-builder/.github/workflows/build_main_documentation.yml@main
with:
commit_sha: ${{ github.sha }}
package: Google-Cloud-Containers
package_name: google-cloud
additional_args: --not_python_module
pre_command: cd Google-Cloud-Containers && make docs
secrets:
token: ${{ secrets.HUGGINGFACE_PUSH }}
hf_token: ${{ secrets.HF_DOC_BUILD_PUSH }}
1 change: 1 addition & 0 deletions .github/workflows/doc-pr-build.yml
@@ -19,3 +19,4 @@ jobs:
package: Google-Cloud-Containers
package_name: google-cloud
additional_args: --not_python_module
pre_command: cd Google-Cloud-Containers && make docs
3 changes: 3 additions & 0 deletions .gitignore
@@ -161,3 +161,6 @@ cython_debug/

# .DS_Store files
.DS_Store

# Auto-generated docs
docs/source/examples/
33 changes: 33 additions & 0 deletions Makefile
@@ -0,0 +1,33 @@
.PHONY: docs clean serve help

docs: clean
@echo "Processing README.md files from examples/gke, examples/cloud-run, and examples/vertex-ai..."
@mkdir -p docs/source/examples
@echo "Converting Jupyter Notebooks to MDX..."
@doc-builder notebook-to-mdx examples/vertex-ai/notebooks/
@echo "Auto-generating example files for documentation..."
@python docs/scripts/auto-generate-examples.py
@echo "Cleaning up generated Markdown Notebook files..."
@find examples/vertex-ai/notebooks -name "vertex-notebook.md" -type f -delete
@echo "Generating YAML tree structure and appending to _toctree.yml..."
@python docs/scripts/auto-update-toctree.py
@echo "YAML tree structure appended to docs/source/_toctree.yml"
@echo "Documentation setup complete."

clean:
@echo "Cleaning up generated documentation..."
@rm -rf docs/source/examples
@awk '/^# GENERATED CONTENT DO NOT EDIT!/,/^# END GENERATED CONTENT/{next} {print}' docs/source/_toctree.yml > docs/source/_toctree.yml.tmp && mv docs/source/_toctree.yml.tmp docs/source/_toctree.yml
@echo "Cleaning up generated Markdown Notebook files (if any)..."
@find examples/vertex-ai/notebooks -name "vertex-notebook.md" -type f -delete
@echo "Cleanup complete."

serve:
@echo "Serving documentation via doc-builder"
doc-builder preview gcloud docs/source --not_python_module

help:
@echo "Usage:"
@echo " make docs - Auto-generate the examples for the docs"
@echo " make clean - Remove the auto-generated docs"
@echo " make help - Display this help message"
13 changes: 7 additions & 6 deletions README.md
@@ -42,15 +42,16 @@ The [`examples`](./examples) directory contains examples for using the container

### Training Examples

| Service | Example | Description |
| --------- | ------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------- |
| GKE | [trl-full-fine-tuning](./examples/gke/trl-full-fine-tuning) | Full SFT fine-tuning of Gemma 2B in a multi-GPU instance with TRL on GKE. |
| GKE | [trl-lora-fine-tuning](./examples/gke/trl-lora-fine-tuning) | LoRA SFT fine-tuning of Mistral 7B v0.3 in a single GPU instance with TRL on GKE. |
| Vertex AI | [trl-full-sft-fine-tuning-on-vertex-ai](./examples/vertex-ai/notebooks/trl-full-sft-fine-tuning-on-vertex-ai) | Full SFT fine-tuning of Mistral 7B v0.3 in a multi-GPU instance with TRL on Vertex AI. |
| Vertex AI | [trl-lora-sft-fine-tuning-on-vertex-ai](./examples/vertex-ai/notebooks/trl-lora-sft-fine-tuning-on-vertex-ai) | LoRA SFT fine-tuning of Mistral 7B v0.3 in a single GPU instance with TRL on Vertex AI. |
| Service | Example | Title |
| --------- | ------------------------------------------------------------------------------------------------------------------------------------------ | --------------------------------------------------------------------------- |
| GKE | [examples/gke/trl-full-fine-tuning](./examples/gke/trl-full-fine-tuning) | Fine-tune Gemma 2B with PyTorch Training DLC using SFT on GKE |
| GKE | [examples/gke/trl-lora-fine-tuning](./examples/gke/trl-lora-fine-tuning) | Fine-tune Mistral 7B v0.3 with PyTorch Training DLC using SFT + LoRA on GKE |
| Vertex AI | [examples/vertex-ai/notebooks/trl-full-sft-fine-tuning-on-vertex-ai](./examples/vertex-ai/notebooks/trl-full-sft-fine-tuning-on-vertex-ai) | Fine-tune Mistral 7B v0.3 with PyTorch Training DLC using SFT on Vertex AI |
| Vertex AI | [examples/vertex-ai/notebooks/trl-lora-sft-fine-tuning-on-vertex-ai](./examples/vertex-ai/notebooks/trl-lora-sft-fine-tuning-on-vertex-ai) | Fine-tune Gemma 2B with PyTorch Training DLC using SFT + LoRA on Vertex AI |

### Inference Examples

| Service | Example | Description |
| --------- | ------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------- |
| GKE | [tgi-deployment](./examples/gke/tgi-deployment) | Deploying Llama3 8B with Text Generation Inference (TGI) on GKE. |
8 changes: 4 additions & 4 deletions containers/tgi/README.md
@@ -16,7 +16,7 @@ Below you will find the instructions on how to run and test the TGI containers a

To run the Docker container on GPUs, you need to ensure that your hardware is supported (the NVIDIA drivers on your device need to be compatible with CUDA version 12.2 or higher) and install the NVIDIA Container Toolkit.

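As a quick sanity check before running the DLC, you can confirm that Docker can access the GPUs through the NVIDIA Container Toolkit; a minimal sketch, assuming a CUDA 12.2 base image is pullable:

```bash
# If this prints the `nvidia-smi` table, Docker can see the GPUs and the
# TGI DLC should be able to run on this machine
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
```
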
To find the supported models and hardware before running the TGI DLC, feel free to check [TGI's documentation](https://huggingface.co/docs/text-generation-inference/supported_models).
To find the supported models and hardware before running the TGI DLC, feel free to check [TGI Documentation](https://huggingface.co/docs/text-generation-inference/supported_models).

### Run

@@ -51,7 +51,7 @@ Which returns the following output containing the optimal configuration for depl
Then you are ready to run the container as follows:

```bash
docker run --gpus all -ti -p 8080:8080 \
docker run --gpus all -ti --shm-size 1g -p 8080:8080 \
-e MODEL_ID=google/gemma-7b-it \
-e NUM_SHARD=4 \
-e HF_TOKEN=$(cat ~/.cache/huggingface/token) \
```

@@ -85,7 +85,7 @@ curl 0.0.0.0:8080/v1/chat/completions \

Which will start streaming the completion tokens for the given messages until the stop sequences are generated.
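
Since the diff elides the full request above, here is a minimal sketch of such a streaming request, with the payload shape assumed from the OpenAI-compatible Messages API that TGI exposes:

```bash
curl 0.0.0.0:8080/v1/chat/completions \
    -X POST \
    -H 'Content-Type: application/json' \
    -d '{
        "model": "tgi",
        "messages": [
            {"role": "user", "content": "What is Deep Learning?"}
        ],
        "stream": true,
        "max_tokens": 128
    }'
```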

Alternatively, you can also use the `/generate` endpoint instead, which already expects the inputs to be formatted according to the tokenizer's requirements, which is more convenient when working with base models without a pre-defined chat template or whenever you want to use a custom chat template instead, and can be used as follows:
Alternatively, you can use the `/generate` endpoint instead, which expects the inputs to be already formatted according to the tokenizer requirements. This is more convenient when working with base models without a pre-defined chat template, or whenever you want to use a custom chat template, and can be used as follows:

```bash
curl 0.0.0.0:8080/generate \
```

@@ -108,7 +108,7 @@ curl 0.0.0.0:8080/generate \
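
The request body above is truncated in the diff; a minimal self-contained sketch of a `/generate` call, with the payload shape assumed from the TGI API reference, would look as follows:

```bash
curl 0.0.0.0:8080/generate \
    -X POST \
    -H 'Content-Type: application/json' \
    -d '{
        "inputs": "What is Deep Learning?",
        "parameters": {"max_new_tokens": 128}
    }'
```
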
> [!WARNING]
> Building the containers is not recommended, since they are already built by the Hugging Face and Google Cloud teams and provided openly; the recommended approach is to use the pre-built containers available in [Google Cloud's Artifact Registry](https://console.cloud.google.com/artifacts/docker/deeplearning-platform-release/us/gcr.io) instead.
In order to build TGI's Docker container, you will need an instance with at least 4 NVIDIA GPUs available with at least 24 GiB of VRAM each, since TGI needs to build and compile the kernels required for the optimized inference. Also note that the build process may take ~30 minutes to complete, depending on the instance's specifications.
In order to build TGI Docker container, you will need an instance with at least 4 NVIDIA GPUs available with at least 24 GiB of VRAM each, since TGI needs to build and compile the kernels required for the optimized inference. Also note that the build process may take ~30 minutes to complete, depending on the instance's specifications.

```bash
docker build -t us-docker.pkg.dev/deeplearning-platform-release/gcr.io/huggingface-text-generation-inference-cu121.2-2.ubuntu2204.py310 -f containers/tgi/gpu/2.2.0/Dockerfile .
```
103 changes: 103 additions & 0 deletions docs/scripts/auto-generate-examples.py
@@ -0,0 +1,103 @@
import os
import re


def process_readme_files():
print("Processing README.md files from examples/gke and examples/cloud-run...")
os.makedirs("docs/source/examples", exist_ok=True)

for dir in ["gke", "cloud-run", "vertex-ai/notebooks"]:
for root, _, files in os.walk(f"examples/{dir}"):
for file in files:
if file == "README.md" or file == "vertex-notebook.md":
process_file(root, file, dir)


def process_file(root, file, dir):
dir_name = dir.replace("/", "-")

file_path = os.path.join(root, file)
subdir = root.replace(f"examples/{dir}/", "")
base = os.path.basename(subdir)

if file_path == f"examples/{dir}/README.md":
target = f"docs/source/examples/{dir_name}-index.mdx"
else:
target = f"docs/source/examples/{dir_name}-{base}.mdx"

print(f"Processing {file_path} to {target}")
with open(file_path, "r") as f:
content = f.read()

# For Jupyter Notebooks, remove the HTML comment markers (`<!--` and `-->`) but keep the metadata
content = re.sub(r"<!-- (.*?) -->", r"\1", content, flags=re.DOTALL)

# Replace image and link paths
content = re.sub(
r"\(\./(imgs|assets)/([^)]*\.png)\)",
r"(https://raw.githubusercontent.com/huggingface/Google-Cloud-Containers/main/"
+ root
+ r"/\1/\2)",
content,
)
content = re.sub(
r"\(\.\./([^)]+)\)",
r"(https://github.com/huggingface/Google-Cloud-Containers/tree/main/examples/"
+ dir
+ r"/\1)",
content,
)
content = re.sub(
r"\(\.\/([^)]+)\)",
r"(https://github.com/huggingface/Google-Cloud-Containers/tree/main/"
+ root
+ r"/\1)",
content,
)

# Regular expression to match the specified blocks
pattern = r"> \[!(NOTE|WARNING)\]\n((?:> .*\n)+)"

def replacement(match):
block_type = match.group(1)
content = match.group(2)

# Remove '> ' from the beginning of each line and strip whitespace
lines = [
line.lstrip("> ").strip() for line in content.split("\n") if line.strip()
]

# Determine the Tip type
tip_type = " warning" if block_type == "WARNING" else ""

# Construct the new block
new_block = f"<Tip{tip_type}>\n\n"
new_block += "\n".join(lines)
new_block += "\n\n</Tip>\n"

return new_block

# Perform the transformation
content = re.sub(pattern, replacement, content, flags=re.MULTILINE)

# Remove blockquotes
content = re.sub(r"^(>[ ]*)+", "", content, flags=re.MULTILINE)

# Check for remaining relative paths
if re.search(r"\(\.\./|\(\./", content):
print("WARNING: Relative paths still exist in the processed file.")
print(
"The following lines contain relative paths, consider replacing those with GitHub URLs instead:"
)
for i, line in enumerate(content.split("\n"), 1):
if re.search(r"\(\.\./|\(\./", line):
print(f"{i}: {line}")
else:
print("No relative paths found in the processed file.")

with open(target, "w") as f:
f.write(content)


if __name__ == "__main__":
process_readme_files()
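
As a quick way to inspect what this script produces, here is a sketch that converts one example and diffs it against its source README, with the paths assumed from the README table above:

```bash
# Generate the MDX pages, then compare one output against its source to see
# the link rewriting and the NOTE/WARNING -> <Tip> conversion in action
python docs/scripts/auto-generate-examples.py
diff examples/gke/trl-full-fine-tuning/README.md \
    docs/source/examples/gke-trl-full-fine-tuning.mdx | head -40
```
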
97 changes: 97 additions & 0 deletions docs/scripts/auto-update-toctree.py
@@ -0,0 +1,97 @@
import glob
import os
import re
from pathlib import Path


def update_toctree_yaml():
output_file = "docs/source/_toctree.yml"
dirs = ["vertex-ai", "gke", "cloud-run"]

with open(output_file, "a") as f:
f.write("# GENERATED CONTENT DO NOT EDIT!\n")
f.write("- sections:\n")

for dir in dirs:
f.write(" - sections:\n")

# Find and sort files
files = sorted(glob.glob(f"docs/source/examples/{dir}-*.mdx"))
files = [file for file in files if not file.endswith(f"{dir}-index.mdx")]

# Dictionary to store files by type
files_by_type = {}

for file in files:
with open(file, "r+") as mdx_file:
content = mdx_file.read()
metadata_match = re.search(r"---(.*?)---", content, re.DOTALL)

metadata = {}
if metadata_match:
metadata_str = metadata_match.group(1)
metadata = dict(re.findall(r"(\w+):\s*(.+)", metadata_str))

# Remove the metadata block from the top of the content, assuming it is
# delimited by `---` lines (the delimiters are removed as well)
content = re.sub(
r"^---\s*\n.*?\n---\s*\n",
"",
content,
flags=re.DOTALL | re.MULTILINE,
)
content = content.strip()

mdx_file.seek(0)
mdx_file.write(content)
mdx_file.truncate()

if not all(key in metadata for key in ["title", "type"]):
print(f"WARNING: Metadata missing in {file}")
print("Ensure that the file contains the following metadata:")
print("title: <title>")
print("type: <type>")

# Remove the file from `docs/source/examples` if it doesn't contain metadata
print(
"Removing the file as it won't be included in the _toctree.yml"
)
os.remove(file)

continue

file_type = metadata["type"]
if file_type not in files_by_type:
files_by_type[file_type] = []
files_by_type[file_type].append((file, metadata))

for file_type, file_list in files_by_type.items():
f.write(" - sections:\n")
for file, metadata in file_list:
base = Path(file).stem
title = metadata["title"]
f.write(f" - local: examples/{base}\n")
f.write(f' title: "{title}"\n')
f.write(" isExpanded: false\n")
f.write(f" title: {file_type.capitalize()}\n")

f.write(" isExpanded: true\n")

if dir == "cloud-run":
f.write(f" local: examples/{dir}-index\n")
f.write(" title: Cloud Run\n")
elif dir == "vertex-ai":
f.write(" title: Vertex AI\n")
else:
f.write(f" local: examples/{dir}-index\n")
f.write(f" title: {dir.upper()}\n")

f.write(" # local: examples/index\n")
f.write(" title: Examples\n")
f.write("# END GENERATED CONTENT\n")


if __name__ == "__main__":
update_toctree_yaml()
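
To inspect exactly what this script appends, a minimal sketch using the marker comments it writes (the same markers the Makefile's `clean` target strips with `awk`):

```bash
# Regenerate the docs, then print only the generated, marker-delimited span
# that this script appended to the toctree
make docs
sed -n '/^# GENERATED CONTENT DO NOT EDIT!/,/^# END GENERATED CONTENT/p' \
    docs/source/_toctree.yml
```
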
14 changes: 12 additions & 2 deletions docs/source/_toctree.yml
@@ -1,4 +1,14 @@
- sections:
- local: index
title: Hugging Face on Google Cloud
- local: index
title: Hugging Face on Google Cloud
- local: features
title: Features & benefits
- local: resources
title: Other Resources
title: Getting Started
- sections:
- local: containers/introduction
title: Introduction
- local: containers/available
title: Available DLCs on Google Cloud
title: Deep Learning Containers (DLCs)