From 4fa7d02209fd8fce139b0db72b22bd14d5345739 Mon Sep 17 00:00:00 2001 From: vb Date: Mon, 20 Jan 2025 17:11:55 +0100 Subject: [PATCH] Created using Colab --- ...eek_r1_distill_qwen1_5B_transformers.ipynb | 270 ++++++++++++++++++ 1 file changed, 270 insertions(+) create mode 100644 deepseek_r1_distill_qwen1_5B_transformers.ipynb diff --git a/deepseek_r1_distill_qwen1_5B_transformers.ipynb b/deepseek_r1_distill_qwen1_5B_transformers.ipynb new file mode 100644 index 0000000..0fd98e5 --- /dev/null +++ b/deepseek_r1_distill_qwen1_5B_transformers.ipynb @@ -0,0 +1,270 @@ +{ + "nbformat": 4, + "nbformat_minor": 0, + "metadata": { + "colab": { + "provenance": [], + "gpuType": "T4", + "authorship_tag": "ABX9TyMaHXlZf4FF/2AbgPjQfxrR", + "include_colab_link": true + }, + "kernelspec": { + "name": "python3", + "display_name": "Python 3" + }, + "language_info": { + "name": "python" + }, + "accelerator": "GPU" + }, + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "view-in-github", + "colab_type": "text" + }, + "source": [ + "\"Open" + ] + }, + { + "cell_type": "markdown", + "source": [ + "# Run DeepSeek R1 Distill Qwen 1.5B in FREE Google Colab\n", + "\n", + "Powered by Transformers and DeepSeek! ❤️" + ], + "metadata": { + "id": "6dxqPSkyeDoO" + } + }, + { + "cell_type": "markdown", + "source": [ + "## Download the model checkpoint\n", + "\n", + "https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B" + ], + "metadata": { + "id": "4vwR5z2LeN8h" + } + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": { + "id": "uU4FwZWVbgdO" + }, + "outputs": [], + "source": [ + "from transformers import AutoModelForCausalLM, AutoTokenizer\n", + "\n", + "model_name = \"deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B\"\n", + "\n", + "model = AutoModelForCausalLM.from_pretrained(\n", + " model_name,).to(\"cuda\")\n", + "tokenizer = AutoTokenizer.from_pretrained(model_name)" + ] + }, + { + "cell_type": "markdown", + "source": [ + "## Provide a prompt & generation parameters" + ], + "metadata": { + "id": "TH2jLGUHeSty" + } + }, + { + "cell_type": "code", + "source": [ + "prompt = \"write an efficient alogirthm for sorting a 2 dimensional array\"\n", + "messages = [\n", + " {\"role\": \"system\", \"content\": \"You are an extremely focused and to the point assistant.\"},\n", + " {\"role\": \"user\", \"content\": prompt}\n", + "]\n", + "text = tokenizer.apply_chat_template(\n", + " messages,\n", + " tokenize=False,\n", + " add_generation_prompt=True\n", + ")\n", + "model_inputs = tokenizer([text], return_tensors=\"pt\").to(model.device)" + ], + "metadata": { + "id": "A122cuXIcAUu" + }, + "execution_count": 8, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "## Generate text" + ], + "metadata": { + "id": "ImQgJ188eZ3X" + } + }, + { + "cell_type": "code", + "source": [ + "generated_ids = model.generate(\n", + " **model_inputs,\n", + " max_new_tokens=2048\n", + ")\n", + "generated_ids = [\n", + " output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)\n", + "]" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "pGg5GtgCcDCM", + "outputId": "c6eeafd7-b1c3-4eaa-ad8e-f98bc77ca534" + }, + "execution_count": 9, + "outputs": [ + { + "output_type": "stream", + "name": "stderr", + "text": [ + "Setting `pad_token_id` to `eos_token_id`:151643 for open-end generation.\n" + ] + } + ] + }, + { + "cell_type": "markdown", + "source": [ + "## Decode response" + ], + "metadata": { + "id": "8aIGWhNwefE-" + } + }, + { + "cell_type": "code", + "source": [ + "response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]\n" + ], + "metadata": { + "id": "EfunHq9IcEH0" + }, + "execution_count": 11, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "## Voila, enjoy the response!" + ], + "metadata": { + "id": "RjUJoLlTehLA" + } + }, + { + "cell_type": "code", + "source": [ + "print(response)" + ], + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "_dQ4sBtoct_R", + "outputId": "8295138b-3d63-45d8-8090-f2e32eb718b0" + }, + "execution_count": 12, + "outputs": [ + { + "output_type": "stream", + "name": "stdout", + "text": [ + "\n", + "Okay, so I need to figure out how to write an efficient algorithm for sorting a 2-dimensional array. Hmm, let's start by understanding what exactly is being asked here. The user wants an algorithm that can sort a 2D array, but I'm not entirely sure if they mean a 2D array of numbers or something else. Maybe it's a list of lists, where each sublist is a row. I should clarify that in my response.\n", + "\n", + "Alright, assuming it's a 2D array where each element is a number, I need to think about the best sorting algorithms for this. I know that for a single list, the most efficient sorting algorithms are typically O(n log n), like merge sort or quicksort. But since this is a 2D array, I have to consider how to sort it efficiently.\n", + "\n", + "One approach is to sort each row individually. That would mean applying a sorting algorithm to each sublist. But if the rows are of different lengths, that could cause issues. Wait, in a 2D array, are all rows of the same length? I think in most cases, yes, but I should consider that possibility.\n", + "\n", + "Another idea is to sort the entire array as a 2D structure. That would involve comparing elements across rows and columns. For example, sorting based on the first element, then the second, and so on. This is similar to how you sort a list of tuples in Python using the default sort, which compares elements lexicographically.\n", + "\n", + "I should also think about the time complexity. If I sort each row individually, the time complexity would be O(m * n log n), where m is the number of rows and n is the average number of elements per row. If the rows are of varying lengths, this could be inefficient. On the other hand, sorting the entire array as a 2D structure would have a time complexity of O(m * n^2 log n), which is worse.\n", + "\n", + "So, which approach is better? If the rows are of similar lengths and the sorting is done element-wise, sorting each row individually might be more efficient. But if the rows are of different lengths, the entire array approach would be better.\n", + "\n", + "Wait, the user didn't specify whether the array is a list of lists or a single list. I should clarify that. If it's a single list, then the algorithm would be O(n log n). If it's a 2D array, then it depends on the structure.\n", + "\n", + "I think the user is referring to a 2D array, so I should proceed with that assumption. Therefore, the algorithm should be able to handle a 2D array and sort it efficiently. I'll outline the steps for both approaches: sorting each row individually and sorting the entire array as a 2D structure.\n", + "\n", + "I should also mention that the choice between the two depends on the specific requirements, like the size of the array and the desired time complexity. For most cases, sorting each row individually might be sufficient and easier to implement.\n", + "\n", + "Finally, I'll provide a code example for both methods to illustrate how they can be implemented in Python. This way, the user can choose the one that best fits their needs.\n", + "\n", + "\n", + "To sort a 2-dimensional array efficiently, you can choose between two approaches: sorting each row individually or sorting the entire array as a 2D structure. Here's how you can implement each method:\n", + "\n", + "### 1. Sort Each Row Individually\n", + "This approach involves applying a sorting algorithm to each sublist (row) of the 2D array. This is efficient if the rows are of similar lengths and the sorting is done element-wise.\n", + "\n", + "**Algorithm:**\n", + "1. For each row in the 2D array:\n", + " - Apply a sorting algorithm (e.g., quicksort, mergesort, or a built-in sort function) to the row.\n", + "2. Return the modified 2D array.\n", + "\n", + "**Python Code Example:**\n", + "```python\n", + "def sort_rows(arr):\n", + " if not arr:\n", + " return []\n", + " for row in arr:\n", + " row.sort()\n", + " return arr\n", + "\n", + "# Example usage:\n", + "arr = [[3, 1, 2], [4, 5, 6], [7, 8, 9]]\n", + "sorted_arr = sort_rows(arr)\n", + "print(sorted_arr)\n", + "```\n", + "\n", + "### 2. Sort the Entire 2D Array\n", + "This approach involves sorting the entire array as a 2D structure, which can be done lexicographically (element-wise comparison).\n", + "\n", + "**Algorithm:**\n", + "1. Sort the entire 2D array using a sorting algorithm that compares elements across rows and columns.\n", + "2. Return the sorted 2D array.\n", + "\n", + "**Python Code Example:**\n", + "```python\n", + "def sort_2d_array(arr):\n", + " return sorted(arr)\n", + "\n", + "# Example usage:\n", + "arr = [[3, 1, 2], [4, 5, 6], [7, 8, 9]]\n", + "sorted_arr = sort_2d_array(arr)\n", + "print(sorted_arr)\n", + "```\n", + "\n", + "### Choosing the Appropriate Method\n", + "- **Sorting Each Row Individually:** More efficient if rows are of similar lengths and the sorting is done element-wise.\n", + "- **Sorting the Entire 2D Array:** More efficient if the rows are of varying lengths and the entire array needs to be sorted lexicographically.\n", + "\n", + "### Conclusion\n", + "The choice between the two methods depends on the specific requirements of your use case. If rows are of similar lengths and element-wise sorting is sufficient, sorting each row individually is more efficient. If the rows are of varying lengths and a lexicographic sort is needed, sorting the entire array as a 2D structure is more appropriate.\n" + ] + } + ] + }, + { + "cell_type": "code", + "source": [], + "metadata": { + "id": "YQQXrzuBcvo-" + }, + "execution_count": null, + "outputs": [] + } + ] +} \ No newline at end of file