diff options
| author | leshe4ka46 <alex9102naid1@ya.ru> | 2025-10-18 12:25:53 +0300 |
|---|---|---|
| committer | leshe4ka46 <alex9102naid1@ya.ru> | 2025-10-18 12:25:53 +0300 |
| commit | 910a222fa60ce6ea0831f2956470b8a0b9f62670 (patch) | |
| tree | 1d6bbccafb667731ad127f93390761100fc11b53 /Fundamentals_of_Accelerated_Data_Science/1-04_interoperability.ipynb | |
| parent | 35b9040e4104b0e79bf243a2c9769c589f96e2c4 (diff) | |
nvidia2
Diffstat (limited to 'Fundamentals_of_Accelerated_Data_Science/1-04_interoperability.ipynb')
| -rw-r--r-- | Fundamentals_of_Accelerated_Data_Science/1-04_interoperability.ipynb | 1186 |
1 files changed, 1186 insertions, 0 deletions
diff --git a/Fundamentals_of_Accelerated_Data_Science/1-04_interoperability.ipynb b/Fundamentals_of_Accelerated_Data_Science/1-04_interoperability.ipynb new file mode 100644 index 0000000..cb725df --- /dev/null +++ b/Fundamentals_of_Accelerated_Data_Science/1-04_interoperability.ipynb @@ -0,0 +1,1186 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "b53a7b12-538d-4459-b82a-a35c8c417849", + "metadata": {}, + "source": [ + "<img src=\"./images/DLI_Header.png\" width=400/>" + ] + }, + { + "cell_type": "markdown", + "id": "ae497b71-bc43-471e-8970-88a1878e7cf9", + "metadata": {}, + "source": [ + "# Fundamentals of Accelerated Data Science # " + ] + }, + { + "cell_type": "markdown", + "id": "a149b6d1-1880-4a5d-9d71-f963d3097aa4", + "metadata": {}, + "source": [ + "## 04 - Interoperability of the GPU PyData Ecosystem ##\n", + "\n", + "**Table of Contents**\n", + "<br>\n", + "This notebook provides examples of how we can use cuDF and CuPy together to take advantage of CuPy array functionality (such as advanced linear algebra operations). This notebook covers the below sections: \n", + "1. [NumPy, SciPy, and CuPy](#NumPy,-SciPy,-and-CuPy)\n", + " * [cuDF vs. CuPy](#cuDF-vs.-CuPy)\n", + "2. [Working with CuPy](#Working-with-CuPy)\n", + "3. [Grid Converter](#Grid-Converter)\n", + " * [Lat/Long to OSGB Grid Converter with NumPy](#Lat/Long-to-OSGB-Grid-Converter-with-NumPy)\n", + " * [Lat/Long to OSGB Grid Converter with CuPy](#Lat/Long-to-OSGB-Grid-Converter-with-CuPy)\n", + " * [Exercise #1 - Adding Grid Coordinate Columns to Dataframe](#Exercise-#1---Adding-Grid-Coordinate-Columns-to-DataFrame)\n", + "4. [Boolean Array Indexing](#Boolean-Array-Indexing)" + ] + }, + { + "cell_type": "markdown", + "id": "3c0ac5b5-5cc1-4240-9375-297cf15c6fbc", + "metadata": {}, + "source": [ + "## NumPy, SciPy, and CuPy ##\n", + "Per it's own user guide, [NumPy](https://numpy.org/doc/stable/user/whatisnumpy.html) is the fundamental package for scientific computing in Python. It is a Python library that provides a **multidimensional array object**, various derived objects (such as masked arrays and matrices), and an assortment of routines for fast operations on arrays. These operations include mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and much more. While NumPy focuses on arrays, mathematical operations, and basic linear algebra, [SciPy](https://docs.scipy.org/doc/scipy-1.8.1/tutorial/general.html) builds on this foundation to provide additional functionality, especially in the domain of scientific computing and optimization. \n", + "\n", + "On the other hands, [CuPy](https://cupy.dev/) is an open-source array library for GPU-accelerated computing with Python. CuPy can be seen as a GPU-accelerated counterpart to NumPy, offering similar functionality and API with the added benefit of GPU acceleration for compatible workloads. While NumPy operates on CPU memory, CuPy primarily works with GPU memory, leveraging CUDA-enabled GPUs for computation. CuPy's interface is highly compatible with NumPy and SciPy. In most cases it can be used as a drop-in replacement. All we need to do is just replace `numpy` and `scipy` with `cupy` and `cupyx.scipy` in the Python code. This makes it easier for users familiar with NumPy to transition to GPU-accelerated computing. \n", + "\n", + "CuPy is designed to work seamlessly with other GPU-accelerated libraries in the RAPIDS ecosystem, similar to how NumPy works with pandas and other CPU-based libraries. By keeping data on the GPU throughout the workflow, we are able to reduce data transfer overhead between CPU and GPU memory. " + ] + }, + { + "cell_type": "markdown", + "id": "3e1483d0-c479-474a-a7fa-a31f4140268d", + "metadata": {}, + "source": [ + "### cuDF vs. CuPy ###\n", + "So far, the DataFrame we've worked with puts data in a structured, tabular format. This is useful when we need to perform DataFrame-like operations such as grouping, aggregating, filtering, and joining data. However, there might be use cases that requires working with multi-dimensional arrays or matrics, such as perfrming linear algebra operations or scientific computing tasks. For these instances, we would want to use libraries that are dedicated to performing these tasks, such as NumPy, SciPy, or CuPy. In other words, use cuDF when working with high-level (less abstract) data manipulation, and use CuPy when doing low-level numerical operations on multi-dimensional arrays. \n", + "\n", + "In practice, most data scientists work with both libraries, since most of their workflows involve DataFrame operations and array-based computations. For example, we might use cuDF for data loading and preprocessing, then convert to CuPy arrays for specific numerical computations, and convert back to cuDF for further analysis or output. cuDF and CuPy are designed to be interoperable, allowing us to easily convert between cuDF DataFrames/Series and CuPy arrays while keeping the data on the GPU. This enables us to create efficient workflows that take advntage of both libraries' strengths. " + ] + }, + { + "cell_type": "markdown", + "id": "2fb75cf7-2590-4638-8e92-6d90a8e34b63", + "metadata": {}, + "source": [ + "## Working with CuPy ##\n", + "There are several ways to use CuPy. From cuDF, the `DataFrame.values` property will return the CuPy representation of the data frame. Alternatively, we can also convert via the CUDA array interface using `DataFrame.to_cupy()`. In addition to these, we can also pass the Series to the `cupy.asarray()` function since cuDF Series exposes the CUDA array interface as the fastest approach. \n", + "\n", + "Below we demonstrate a **row-wise sum** on the DataFrame. cuDF’s support for row-wise operations isn’t mature, so we’d need to either transpose the DataFrame or write a UDF and explicitly calculate the sum across each row. Transposing could lead to hundreds of thousands of columns (which cuDF wouldn’t perform well with) depending on our data’s shape, and writing a UDF can be time intensive. By leveraging the interoperability of the GPU PyData ecosystem, this operation becomes very easy. " + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "4808029f-a5e6-4066-b125-84d63c3c6d43", + "metadata": {}, + "outputs": [], + "source": [ + "# DO NOT CHANGE THIS CELL\n", + "# import libraries\n", + "import cudf\n", + "import time" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "50653cc1-53a1-4141-b1fb-4ef3ef7994f4", + "metadata": { + "scrolled": true + }, + "outputs": [ + { + "data": { + "text/html": [ + "<div>\n", + "<style scoped>\n", + " .dataframe tbody tr th:only-of-type {\n", + " vertical-align: middle;\n", + " }\n", + "\n", + " .dataframe tbody tr th {\n", + " vertical-align: top;\n", + " }\n", + "\n", + " .dataframe thead th {\n", + " text-align: right;\n", + " }\n", + "</style>\n", + "<table border=\"1\" class=\"dataframe\">\n", + " <thead>\n", + " <tr style=\"text-align: right;\">\n", + " <th></th>\n", + " <th>a</th>\n", + " <th>b</th>\n", + " <th>c</th>\n", + " <th>d</th>\n", + " </tr>\n", + " </thead>\n", + " <tbody>\n", + " <tr>\n", + " <th>0</th>\n", + " <td>0</td>\n", + " <td>10</td>\n", + " <td>100</td>\n", + " <td>1000</td>\n", + " </tr>\n", + " <tr>\n", + " <th>1</th>\n", + " <td>1</td>\n", + " <td>11</td>\n", + " <td>101</td>\n", + " <td>1001</td>\n", + " </tr>\n", + " <tr>\n", + " <th>2</th>\n", + " <td>2</td>\n", + " <td>12</td>\n", + " <td>102</td>\n", + " <td>1002</td>\n", + " </tr>\n", + " <tr>\n", + " <th>3</th>\n", + " <td>3</td>\n", + " <td>13</td>\n", + " <td>103</td>\n", + " <td>1003</td>\n", + " </tr>\n", + " <tr>\n", + " <th>4</th>\n", + " <td>4</td>\n", + " <td>14</td>\n", + " <td>104</td>\n", + " <td>1004</td>\n", + " </tr>\n", + " </tbody>\n", + "</table>\n", + "</div>" + ], + "text/plain": [ + " a b c d\n", + "0 0 10 100 1000\n", + "1 1 11 101 1001\n", + "2 2 12 102 1002\n", + "3 3 13 103 1003\n", + "4 4 14 104 1004" + ] + }, + "execution_count": 2, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# DO NOT CHANGE THIS CELL\n", + "num_ele = 1000000\n", + "\n", + "df = cudf.DataFrame(\n", + " {\n", + " \"a\": range(num_ele),\n", + " \"b\": range(10, num_ele + 10),\n", + " \"c\": range(100, num_ele + 100),\n", + " \"d\": range(1000, num_ele + 1000)\n", + " }\n", + ")\n", + "\n", + "# preview\n", + "df.head()" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "b122f0a0-0354-43ed-9078-651d9761d6ed", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "0 1110\n", + "1 1114\n", + "2 1118\n", + "3 1122\n", + "4 1126\n", + " ... \n", + "999995 4001090\n", + "999996 4001094\n", + "999997 4001098\n", + "999998 4001102\n", + "999999 4001106\n", + "Length: 1000000, dtype: int64" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/plain": [ + "0.31916284561157227" + ] + }, + "execution_count": 3, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# DO NOT CHANGE THIS CELL\n", + "start=time.time()\n", + "display(df.sum(axis=1))\n", + "time.time()-start" + ] + }, + { + "cell_type": "markdown", + "id": "d49ebab0-bff0-4b7e-aabd-c46430799983", + "metadata": {}, + "source": [ + "The same operation runs faster with CuPy. " + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "9b618d3b-9b6d-423d-beb7-98334a8b3339", + "metadata": { + "scrolled": true + }, + "outputs": [ + { + "data": { + "text/plain": [ + "array([ 1110, 1114, 1118, ..., 4001098, 4001102, 4001106])" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/plain": [ + "0.21457505226135254" + ] + }, + "execution_count": 4, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# DO NOT CHANGE THIS CELL\n", + "arr=df.values\n", + "\n", + "start=time.time()\n", + "# alternative approach\n", + "# arr=df.to_cupy()\n", + "\n", + "display(arr.sum(axis=1))\n", + "time.time()-start" + ] + }, + { + "cell_type": "markdown", + "id": "ae656ac1-d4a2-499f-a904-6d1409c6ba79", + "metadata": {}, + "source": [ + "When using cuDF pandas, we can use the `.values` property as well as the `cupy.asarray()` function. \n", + "\n", + "**Note**: We can use the `.to_numpy()` method to convert cuDF DataFrames or Series to NumPy arrays. " + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "84e218f5-b4ee-4d49-baff-eac5f1f76b3f", + "metadata": { + "scrolled": true + }, + "outputs": [ + { + "data": { + "text/plain": [ + "{'status': 'ok', 'restart': True}" + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# DO NOT CHANGE THIS CELL\n", + "import IPython\n", + "app = IPython.Application.instance()\n", + "app.kernel.do_shutdown(True)" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "44200db8-d133-44c5-9f71-2796ec37df5c", + "metadata": {}, + "outputs": [], + "source": [ + "# DO NOT CHANGE THIS CELL\n", + "%load_ext cudf.pandas\n", + "import pandas as pd\n", + "\n", + "import numpy as np\n", + "import cupy as cp\n", + "import time" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "e8016e6d-829d-4117-a37c-bae82431626c", + "metadata": { + "scrolled": true + }, + "outputs": [ + { + "data": { + "text/html": [ + "<div>\n", + "<style scoped>\n", + " .dataframe tbody tr th:only-of-type {\n", + " vertical-align: middle;\n", + " }\n", + "\n", + " .dataframe tbody tr th {\n", + " vertical-align: top;\n", + " }\n", + "\n", + " .dataframe thead th {\n", + " text-align: right;\n", + " }\n", + "</style>\n", + "<table border=\"1\" class=\"dataframe\">\n", + " <thead>\n", + " <tr style=\"text-align: right;\">\n", + " <th></th>\n", + " <th>a</th>\n", + " <th>b</th>\n", + " <th>c</th>\n", + " <th>d</th>\n", + " </tr>\n", + " </thead>\n", + " <tbody>\n", + " <tr>\n", + " <th>0</th>\n", + " <td>0</td>\n", + " <td>10</td>\n", + " <td>100</td>\n", + " <td>1000</td>\n", + " </tr>\n", + " <tr>\n", + " <th>1</th>\n", + " <td>1</td>\n", + " <td>11</td>\n", + " <td>101</td>\n", + " <td>1001</td>\n", + " </tr>\n", + " <tr>\n", + " <th>2</th>\n", + " <td>2</td>\n", + " <td>12</td>\n", + " <td>102</td>\n", + " <td>1002</td>\n", + " </tr>\n", + " <tr>\n", + " <th>3</th>\n", + " <td>3</td>\n", + " <td>13</td>\n", + " <td>103</td>\n", + " <td>1003</td>\n", + " </tr>\n", + " <tr>\n", + " <th>4</th>\n", + " <td>4</td>\n", + " <td>14</td>\n", + " <td>104</td>\n", + " <td>1004</td>\n", + " </tr>\n", + " </tbody>\n", + "</table>\n", + "</div>" + ], + "text/plain": [ + " a b c d\n", + "0 0 10 100 1000\n", + "1 1 11 101 1001\n", + "2 2 12 102 1002\n", + "3 3 13 103 1003\n", + "4 4 14 104 1004" + ] + }, + "execution_count": 2, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# DO NOT CHANGE THIS CELL\n", + "num_ele = 1000000\n", + "\n", + "df = pd.DataFrame(\n", + " {\n", + " \"a\": range(num_ele),\n", + " \"b\": range(10, num_ele + 10),\n", + " \"c\": range(100, num_ele + 100),\n", + " \"d\": range(1000, num_ele + 1000)\n", + " }\n", + ")\n", + "\n", + "# preview\n", + "df.head()" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "0d314943-69cb-4fcd-a01a-5fe4734ff0bf", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "array([ 1110, 1114, 1118, ..., 4001098, 4001102, 4001106])" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/html": [ + "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"font-style: italic\"> </span>\n", + "<span style=\"font-style: italic\"> Total time elapsed: 0.732 seconds </span>\n", + "<span style=\"font-style: italic\"> </span>\n", + "<span style=\"font-style: italic\"> Stats </span>\n", + "<span style=\"font-style: italic\"> </span>\n", + "┏━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓\n", + "┃<span style=\"font-weight: bold\"> Line no. </span>┃<span style=\"font-weight: bold\"> Line </span>┃<span style=\"font-weight: bold\"> GPU TIME(s) </span>┃<span style=\"font-weight: bold\"> CPU TIME(s) </span>┃\n", + "┡━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩\n", + "│ 2 │ <span style=\"color: #f8f8f2; text-decoration-color: #f8f8f2; background-color: #272822\"> arr</span><span style=\"color: #ff4689; text-decoration-color: #ff4689; background-color: #272822\">=</span><span style=\"color: #f8f8f2; text-decoration-color: #f8f8f2; background-color: #272822\">df</span><span style=\"color: #ff4689; text-decoration-color: #ff4689; background-color: #272822\">.</span><span style=\"color: #f8f8f2; text-decoration-color: #f8f8f2; background-color: #272822\">values</span><span style=\"background-color: #272822\"> </span> │ 0.328914536 │ │\n", + "│ │ <span style=\"background-color: #272822\"> </span> │ │ │\n", + "│ 6 │ <span style=\"color: #f8f8f2; text-decoration-color: #f8f8f2; background-color: #272822\"> start</span><span style=\"color: #ff4689; text-decoration-color: #ff4689; background-color: #272822\">=</span><span style=\"color: #f8f8f2; text-decoration-color: #f8f8f2; background-color: #272822\">time</span><span style=\"color: #ff4689; text-decoration-color: #ff4689; background-color: #272822\">.</span><span style=\"color: #f8f8f2; text-decoration-color: #f8f8f2; background-color: #272822\">time()</span><span style=\"background-color: #272822\"> </span> │ │ │\n", + "│ │ <span style=\"background-color: #272822\"> </span> │ │ │\n", + "│ 8 │ <span style=\"color: #f8f8f2; text-decoration-color: #f8f8f2; background-color: #272822\"> display(arr</span><span style=\"color: #ff4689; text-decoration-color: #ff4689; background-color: #272822\">.</span><span style=\"color: #f8f8f2; text-decoration-color: #f8f8f2; background-color: #272822\">sum(axis</span><span style=\"color: #ff4689; text-decoration-color: #ff4689; background-color: #272822\">=</span><span style=\"color: #ae81ff; text-decoration-color: #ae81ff; background-color: #272822\">1</span><span style=\"color: #f8f8f2; text-decoration-color: #f8f8f2; background-color: #272822\">))</span> │ 0.011681609 │ │\n", + "│ │ <span style=\"background-color: #272822\"> </span> │ │ │\n", + "│ 10 │ <span style=\"color: #f8f8f2; text-decoration-color: #f8f8f2; background-color: #272822\"> time</span><span style=\"color: #ff4689; text-decoration-color: #ff4689; background-color: #272822\">.</span><span style=\"color: #f8f8f2; text-decoration-color: #f8f8f2; background-color: #272822\">time()</span><span style=\"color: #ff4689; text-decoration-color: #ff4689; background-color: #272822\">-</span><span style=\"color: #f8f8f2; text-decoration-color: #f8f8f2; background-color: #272822\">start</span><span style=\"background-color: #272822\"> </span> │ │ │\n", + "│ │ <span style=\"background-color: #272822\"> </span> │ │ │\n", + "└──────────┴──────────────────────────────┴─────────────┴─────────────┘\n", + "</pre>\n" + ], + "text/plain": [ + "\u001b[3m \u001b[0m\n", + "\u001b[3m Total time elapsed: 0.732 seconds \u001b[0m\n", + "\u001b[3m \u001b[0m\n", + "\u001b[3m Stats \u001b[0m\n", + "\u001b[3m \u001b[0m\n", + "┏━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓\n", + "┃\u001b[1m \u001b[0m\u001b[1mLine no.\u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1mLine \u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1mGPU TIME(s)\u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1mCPU TIME(s)\u001b[0m\u001b[1m \u001b[0m┃\n", + "┡━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩\n", + "│ 2 │ \u001b[38;2;248;248;242;48;2;39;40;34m \u001b[0m\u001b[38;2;248;248;242;48;2;39;40;34marr\u001b[0m\u001b[38;2;255;70;137;48;2;39;40;34m=\u001b[0m\u001b[38;2;248;248;242;48;2;39;40;34mdf\u001b[0m\u001b[38;2;255;70;137;48;2;39;40;34m.\u001b[0m\u001b[38;2;248;248;242;48;2;39;40;34mvalues\u001b[0m\u001b[48;2;39;40;34m \u001b[0m │ 0.328914536 │ │\n", + "│ │ \u001b[48;2;39;40;34m \u001b[0m │ │ │\n", + "│ 6 │ \u001b[38;2;248;248;242;48;2;39;40;34m \u001b[0m\u001b[38;2;248;248;242;48;2;39;40;34mstart\u001b[0m\u001b[38;2;255;70;137;48;2;39;40;34m=\u001b[0m\u001b[38;2;248;248;242;48;2;39;40;34mtime\u001b[0m\u001b[38;2;255;70;137;48;2;39;40;34m.\u001b[0m\u001b[38;2;248;248;242;48;2;39;40;34mtime\u001b[0m\u001b[38;2;248;248;242;48;2;39;40;34m(\u001b[0m\u001b[38;2;248;248;242;48;2;39;40;34m)\u001b[0m\u001b[48;2;39;40;34m \u001b[0m │ │ │\n", + "│ │ \u001b[48;2;39;40;34m \u001b[0m │ │ │\n", + "│ 8 │ \u001b[38;2;248;248;242;48;2;39;40;34m \u001b[0m\u001b[38;2;248;248;242;48;2;39;40;34mdisplay\u001b[0m\u001b[38;2;248;248;242;48;2;39;40;34m(\u001b[0m\u001b[38;2;248;248;242;48;2;39;40;34marr\u001b[0m\u001b[38;2;255;70;137;48;2;39;40;34m.\u001b[0m\u001b[38;2;248;248;242;48;2;39;40;34msum\u001b[0m\u001b[38;2;248;248;242;48;2;39;40;34m(\u001b[0m\u001b[38;2;248;248;242;48;2;39;40;34maxis\u001b[0m\u001b[38;2;255;70;137;48;2;39;40;34m=\u001b[0m\u001b[38;2;174;129;255;48;2;39;40;34m1\u001b[0m\u001b[38;2;248;248;242;48;2;39;40;34m)\u001b[0m\u001b[38;2;248;248;242;48;2;39;40;34m)\u001b[0m │ 0.011681609 │ │\n", + "│ │ \u001b[48;2;39;40;34m \u001b[0m │ │ │\n", + "│ 10 │ \u001b[38;2;248;248;242;48;2;39;40;34m \u001b[0m\u001b[38;2;248;248;242;48;2;39;40;34mtime\u001b[0m\u001b[38;2;255;70;137;48;2;39;40;34m.\u001b[0m\u001b[38;2;248;248;242;48;2;39;40;34mtime\u001b[0m\u001b[38;2;248;248;242;48;2;39;40;34m(\u001b[0m\u001b[38;2;248;248;242;48;2;39;40;34m)\u001b[0m\u001b[38;2;255;70;137;48;2;39;40;34m-\u001b[0m\u001b[38;2;248;248;242;48;2;39;40;34mstart\u001b[0m\u001b[48;2;39;40;34m \u001b[0m │ │ │\n", + "│ │ \u001b[48;2;39;40;34m \u001b[0m │ │ │\n", + "└──────────┴──────────────────────────────┴─────────────┴─────────────┘\n" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "%%cudf.pandas.line_profile\n", + "# DO NOT CHANGE THIS CELL\n", + "arr=df.values\n", + "# alternative approach\n", + "# arr=cp.asarray(df)\n", + "\n", + "start=time.time()\n", + "\n", + "display(arr.sum(axis=1))\n", + "\n", + "time.time()-start" + ] + }, + { + "cell_type": "markdown", + "id": "5a97fac8-b555-4ec1-a555-9b5ab79c2fe1", + "metadata": {}, + "source": [ + "Just like we can do with NumPy and pandas, we can weave cuDF and CuPy together in the same workflow while keeping the data entirely on the GPU. We’re able to seamlessly move between data structures in this ecosystem, giving us enormous flexibility without sacrificing speed. If we’re working with RAPIDS cuDF but need a more linear-algebra oriented function that exists in CuPy, we can leverage the interoperability of the GPU PyData ecosystem to use that function. \n", + "\n", + "To convert a CuPy array to a cuDF DataFrame or Series, we can use their respective constructors. " + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "9cea340d-c9c4-468d-8431-015492682db9", + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "<div>\n", + "<style scoped>\n", + " .dataframe tbody tr th:only-of-type {\n", + " vertical-align: middle;\n", + " }\n", + "\n", + " .dataframe tbody tr th {\n", + " vertical-align: top;\n", + " }\n", + "\n", + " .dataframe thead th {\n", + " text-align: right;\n", + " }\n", + "</style>\n", + "<table border=\"1\" class=\"dataframe\">\n", + " <thead>\n", + " <tr style=\"text-align: right;\">\n", + " <th></th>\n", + " <th>a</th>\n", + " <th>b</th>\n", + " <th>c</th>\n", + " <th>d</th>\n", + " <th>sum</th>\n", + " </tr>\n", + " </thead>\n", + " <tbody>\n", + " <tr>\n", + " <th>0</th>\n", + " <td>0</td>\n", + " <td>10</td>\n", + " <td>100</td>\n", + " <td>1000</td>\n", + " <td>1110</td>\n", + " </tr>\n", + " <tr>\n", + " <th>1</th>\n", + " <td>1</td>\n", + " <td>11</td>\n", + " <td>101</td>\n", + " <td>1001</td>\n", + " <td>1114</td>\n", + " </tr>\n", + " <tr>\n", + " <th>2</th>\n", + " <td>2</td>\n", + " <td>12</td>\n", + " <td>102</td>\n", + " <td>1002</td>\n", + " <td>1118</td>\n", + " </tr>\n", + " <tr>\n", + " <th>3</th>\n", + " <td>3</td>\n", + " <td>13</td>\n", + " <td>103</td>\n", + " <td>1003</td>\n", + " <td>1122</td>\n", + " </tr>\n", + " <tr>\n", + " <th>4</th>\n", + " <td>4</td>\n", + " <td>14</td>\n", + " <td>104</td>\n", + " <td>1004</td>\n", + " <td>1126</td>\n", + " </tr>\n", + " </tbody>\n", + "</table>\n", + "</div>" + ], + "text/plain": [ + " a b c d sum\n", + "0 0 10 100 1000 1110\n", + "1 1 11 101 1001 1114\n", + "2 2 12 102 1002 1118\n", + "3 3 13 103 1003 1122\n", + "4 4 14 104 1004 1126" + ] + }, + "execution_count": 4, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# DO NOT CHANGE THIS CELL\n", + "df['sum']=arr.sum(axis=1)\n", + "\n", + "df.head()" + ] + }, + { + "cell_type": "markdown", + "id": "2cfe3955-2849-451b-945a-c37955b91235", + "metadata": {}, + "source": [ + "## Grid Converter ##\n", + "Much of our data is provided with latitude and longitude coordinates, but for some of our tasks involving distance - identifying geographically dense clusters of infected people, locating the nearest hospital or clinic from a given person - it is convenient to have Cartesian grid coordinates instead. By using a region-specific map projection - in this case, the [Ordnance Survey Great Britain 1936](https://en.wikipedia.org/wiki/Ordnance_Survey_National_Grid) - we can compute local distances efficiently and with good accuracy." + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "8dc82f3c-f1cb-436b-becb-a97635ec5f6a", + "metadata": {}, + "outputs": [ + { + "data": { + "text/html": [ + "<div>\n", + "<style scoped>\n", + " .dataframe tbody tr th:only-of-type {\n", + " vertical-align: middle;\n", + " }\n", + "\n", + " .dataframe tbody tr th {\n", + " vertical-align: top;\n", + " }\n", + "\n", + " .dataframe thead th {\n", + " text-align: right;\n", + " }\n", + "</style>\n", + "<table border=\"1\" class=\"dataframe\">\n", + " <thead>\n", + " <tr style=\"text-align: right;\">\n", + " <th></th>\n", + " <th>age</th>\n", + " <th>sex</th>\n", + " <th>county</th>\n", + " <th>lat</th>\n", + " <th>long</th>\n", + " <th>name</th>\n", + " </tr>\n", + " </thead>\n", + " <tbody>\n", + " <tr>\n", + " <th>0</th>\n", + " <td>0</td>\n", + " <td>m</td>\n", + " <td>DARLINGTON</td>\n", + " <td>54.533638</td>\n", + " <td>-1.524400</td>\n", + " <td>FRANCIS</td>\n", + " </tr>\n", + " <tr>\n", + " <th>1</th>\n", + " <td>0</td>\n", + " <td>m</td>\n", + " <td>DARLINGTON</td>\n", + " <td>54.426254</td>\n", + " <td>-1.465314</td>\n", + " <td>EDWARD</td>\n", + " </tr>\n", + " <tr>\n", + " <th>2</th>\n", + " <td>0</td>\n", + " <td>m</td>\n", + " <td>DARLINGTON</td>\n", + " <td>54.555199</td>\n", + " <td>-1.496417</td>\n", + " <td>TEDDY</td>\n", + " </tr>\n", + " <tr>\n", + " <th>3</th>\n", + " <td>0</td>\n", + " <td>m</td>\n", + " <td>DARLINGTON</td>\n", + " <td>54.547909</td>\n", + " <td>-1.572342</td>\n", + " <td>ANGUS</td>\n", + " </tr>\n", + " <tr>\n", + " <th>4</th>\n", + " <td>0</td>\n", + " <td>m</td>\n", + " <td>DARLINGTON</td>\n", + " <td>54.477638</td>\n", + " <td>-1.605995</td>\n", + " <td>CHARLIE</td>\n", + " </tr>\n", + " </tbody>\n", + "</table>\n", + "</div>" + ], + "text/plain": [ + " age sex county lat long name\n", + "0 0 m DARLINGTON 54.533638 -1.524400 FRANCIS\n", + "1 0 m DARLINGTON 54.426254 -1.465314 EDWARD\n", + "2 0 m DARLINGTON 54.555199 -1.496417 TEDDY\n", + "3 0 m DARLINGTON 54.547909 -1.572342 ANGUS\n", + "4 0 m DARLINGTON 54.477638 -1.605995 CHARLIE" + ] + }, + "execution_count": 5, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "# DO NOT CHANGE THIS CELL\n", + "dtype_dict={\n", + " 'age': 'int8', \n", + " 'sex': 'category', \n", + " 'county': 'category', \n", + " 'lat': 'float32', \n", + " 'long': 'float32', \n", + " 'name': 'category'\n", + "}\n", + " \n", + "df=pd.read_csv('./data/uk_pop.csv', dtype=dtype_dict)\n", + "df.head()" + ] + }, + { + "cell_type": "markdown", + "id": "f25cfe6a-9f31-4a13-b06f-c85a95637a44", + "metadata": {}, + "source": [ + "### Lat/Long to OSGB Grid Converter with NumPy ###\n", + "To perform coordinate conversion, we will create a function `latlong2osgbgrid` which accepts latitude/longitude coordinates and converts them to [OSGB36 coordinates](https://en.wikipedia.org/wiki/Ordnance_Survey_National_Grid): \"northing\" and \"easting\" values representing the point's Cartesian coordinate distances from the southwest corner of the grid.\n", + "\n", + "Immediately below is `latlong2osgbgrid`, which relies heavily on NumPy:" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "3e298328-a11c-4cda-ae15-eb292fed3a5c", + "metadata": {}, + "outputs": [], + "source": [ + "# https://www.ordnancesurvey.co.uk/docs/support/guide-coordinate-systems-great-britain.pdf\n", + "\n", + "def latlong2osgbgrid(lat, long, input_degrees=True):\n", + " '''\n", + " Converts latitude and longitude (ellipsoidal) coordinates into northing and easting (grid) coordinates, using a Transverse Mercator projection.\n", + " \n", + " Inputs:\n", + " lat: latitude coordinate (north)\n", + " long: longitude coordinate (east)\n", + " input_degrees: if True (default), interprets the coordinates as degrees; otherwise, interprets coordinates as radians\n", + " \n", + " Output:\n", + " (northing, easting)\n", + " '''\n", + " \n", + " if input_degrees:\n", + " lat = lat * np.pi/180\n", + " long = long * np.pi/180\n", + "\n", + " a = 6377563.396\n", + " b = 6356256.909\n", + " e2 = (a**2 - b**2) / a**2\n", + "\n", + " N0 = -100000 # northing of true origin\n", + " E0 = 400000 # easting of true origin\n", + " F0 = .9996012717 # scale factor on central meridian\n", + " phi0 = 49 * np.pi / 180 # latitude of true origin\n", + " lambda0 = -2 * np.pi / 180 # longitude of true origin and central meridian\n", + " \n", + " sinlat = np.sin(lat)\n", + " coslat = np.cos(lat)\n", + " tanlat = np.tan(lat)\n", + " \n", + " latdiff = lat-phi0\n", + " longdiff = long-lambda0\n", + "\n", + " n = (a-b) / (a+b)\n", + " nu = a * F0 * (1 - e2 * sinlat ** 2) ** -.5\n", + " rho = a * F0 * (1 - e2) * (1 - e2 * sinlat ** 2) ** -1.5\n", + " eta2 = nu / rho - 1\n", + " M = b * F0 * ((1 + n + 5/4 * (n**2 + n**3)) * latdiff - \n", + " (3*(n+n**2) + 21/8 * n**3) * np.sin(latdiff) * np.cos(lat+phi0) +\n", + " 15/8 * (n**2 + n**3) * np.sin(2*(latdiff)) * np.cos(2*(lat+phi0)) - \n", + " 35/24 * n**3 * np.sin(3*(latdiff)) * np.cos(3*(lat+phi0)))\n", + " I = M + N0\n", + " II = nu/2 * sinlat * coslat\n", + " III = nu/24 * sinlat * coslat ** 3 * (5 - tanlat ** 2 + 9 * eta2)\n", + " IIIA = nu/720 * sinlat * coslat ** 5 * (61-58 * tanlat**2 + tanlat**4)\n", + " IV = nu * coslat\n", + " V = nu / 6 * coslat**3 * (nu/rho - np.tan(lat)**2)\n", + " VI = nu / 120 * coslat ** 5 * (5 - 18 * tanlat**2 + tanlat**4 + 14 * eta2 - 58 * tanlat**2 * eta2)\n", + "\n", + " northing = I + II * longdiff**2 + III * longdiff**4 + IIIA * longdiff**6\n", + " easting = E0 + IV * longdiff + V * longdiff**3 + VI * longdiff**5\n", + "\n", + " return(northing, easting)" + ] + }, + { + "cell_type": "markdown", + "id": "a4ddcbe5-7524-4976-a94d-ff6753da3ad8", + "metadata": {}, + "source": [ + "### Lat/Long to OSGB Grid Converter with CuPy ###\n", + "In the following `latlong2osgbgrid_cupy`, we simply swap `cp` in for `np`. While CuPy supports a wide variety of powerful GPU-accelerated tasks, this simple technique of being able to swap in CuPy calls for NumPy calls makes it an incredibly powerful tool to have at our disposal." + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "41ffa2a3-34ba-4664-9399-4b74cc28f623", + "metadata": {}, + "outputs": [], + "source": [ + "# https://www.ordnancesurvey.co.uk/docs/support/guide-coordinate-systems-great-britain.pdf\n", + "\n", + "def latlong2osgbgrid_cupy(lat, long, input_degrees=True):\n", + " '''\n", + " Converts latitude and longitude (ellipsoidal) coordinates into northing and easting (grid) coordinates, using a Transverse Mercator projection.\n", + " \n", + " Inputs:\n", + " lat: latitude coordinate (north)\n", + " long: longitude coordinate (east)\n", + " input_degrees: if True (default), interprets the coordinates as degrees; otherwise, interprets coordinates as radians\n", + " \n", + " Output:\n", + " (northing, easting)\n", + " '''\n", + " \n", + " if input_degrees:\n", + " lat = lat * cp.pi/180\n", + " long = long * cp.pi/180\n", + "\n", + " a = 6377563.396\n", + " b = 6356256.909\n", + " e2 = (a**2 - b**2) / a**2\n", + "\n", + " N0 = -100000 # northing of true origin\n", + " E0 = 400000 # easting of true origin\n", + " F0 = .9996012717 # scale factor on central meridian\n", + " phi0 = 49 * cp.pi / 180 # latitude of true origin\n", + " lambda0 = -2 * cp.pi / 180 # longitude of true origin and central meridian\n", + " \n", + " sinlat = cp.sin(lat)\n", + " coslat = cp.cos(lat)\n", + " tanlat = cp.tan(lat)\n", + " \n", + " latdiff = lat-phi0\n", + " longdiff = long-lambda0\n", + "\n", + " n = (a-b) / (a+b)\n", + " nu = a * F0 * (1 - e2 * sinlat ** 2) ** -.5\n", + " rho = a * F0 * (1 - e2) * (1 - e2 * sinlat ** 2) ** -1.5\n", + " eta2 = nu / rho - 1\n", + " M = b * F0 * ((1 + n + 5/4 * (n**2 + n**3)) * latdiff - \n", + " (3*(n+n**2) + 21/8 * n**3) * cp.sin(latdiff) * cp.cos(lat+phi0) +\n", + " 15/8 * (n**2 + n**3) * cp.sin(2*(latdiff)) * cp.cos(2*(lat+phi0)) - \n", + " 35/24 * n**3 * cp.sin(3*(latdiff)) * cp.cos(3*(lat+phi0)))\n", + " I = M + N0\n", + " II = nu/2 * sinlat * coslat\n", + " III = nu/24 * sinlat * coslat ** 3 * (5 - tanlat ** 2 + 9 * eta2)\n", + " IIIA = nu/720 * sinlat * coslat ** 5 * (61-58 * tanlat**2 + tanlat**4)\n", + " IV = nu * coslat\n", + " V = nu / 6 * coslat**3 * (nu/rho - cp.tan(lat)**2)\n", + " VI = nu / 120 * coslat ** 5 * (5 - 18 * tanlat**2 + tanlat**4 + 14 * eta2 - 58 * tanlat**2 * eta2)\n", + "\n", + " northing = I + II * longdiff**2 + III * longdiff**4 + IIIA * longdiff**6\n", + " easting = E0 + IV * longdiff + V * longdiff**3 + VI * longdiff**5\n", + "\n", + " return(northing, easting)" + ] + }, + { + "cell_type": "markdown", + "id": "8b9f1ea5-595d-4933-afb0-29fa75feea29", + "metadata": {}, + "source": [ + "Below we pass the latitude/longitude coordinates into the converter, which returns north and east values within the OSGB grid. " + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "id": "c35f24b6-0a70-4ad0-897b-fdbb9dee3cec", + "metadata": { + "scrolled": true + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "CPU times: user 482 ms, sys: 562 ms, total: 1.04 s\n", + "Wall time: 1.04 s\n" + ] + } + ], + "source": [ + "%%time\n", + "# DO NOT CHANGE THIS CELL\n", + "numpy_lat = np.asarray(df['lat'])\n", + "numpy_long = np.asarray(df['long'])" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "id": "edc2feb1-29a8-420c-80ab-62fff620edaa", + "metadata": { + "scrolled": true + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "CPU times: user 12.8 s, sys: 3.27 s, total: 16.1 s\n", + "Wall time: 16.1 s\n" + ] + } + ], + "source": [ + "%%time\n", + "# DO NOT CHANGE THIS CELL\n", + "n_numpy_array, e_numpy_array = latlong2osgbgrid(numpy_lat, numpy_long)" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "id": "5c520b09-41a5-4d59-ac24-1690fcc8bbcf", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "CPU times: user 662 μs, sys: 206 μs, total: 868 μs\n", + "Wall time: 881 μs\n" + ] + } + ], + "source": [ + "%%time\n", + "# DO NOT CHANGE THIS CELL\n", + "cumpy_lat = cp.asarray(df['lat'])\n", + "cumpy_long = cp.asarray(df['long'])" + ] + }, + { + "cell_type": "code", + "execution_count": 14, + "id": "46de3086-d8a9-4f0a-a593-3f68d2c4493e", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "CPU times: user 3.39 s, sys: 24.8 ms, total: 3.41 s\n", + "Wall time: 3.39 s\n" + ] + } + ], + "source": [ + "%%time\n", + "# DO NOT CHANGE THIS CELL\n", + "n_cumpy_array, e_cumpy_array = latlong2osgbgrid_cupy(cumpy_lat, cumpy_long)" + ] + }, + { + "cell_type": "markdown", + "id": "b3f10d6f-2044-4517-904e-c9bd46350f17", + "metadata": {}, + "source": [ + "### Exercise #1 - Adding Grid Coordinate Columns to DataFrame ###\n", + "Now we will utilize `latlong2osgbgrid_cupy` to add `northing` and `easting` columns to `df`. We start by converting the two columns we need, `lat` and `long`, to CuPy arrays with the `cp.asarray()` function. Because cuDF and CuPy interface directly via the `__cuda_array_interface__`, the conversion can happen in nanoseconds. \n", + "**Instructions**: <br>\n", + "* Execute the below cell to create CuPy arrays for the `lat` and `long` columns. \n", + "* Modify the `<FIXME>` only and execute the cell below to use `latlong2osgbgrid_cupy` with `cupy_lat` and `cupy_long`, followed by add them as the `northing` and `easting` columns with the dtype `float32`. " + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "id": "3febd9fb-fc44-41f6-9712-60902da18d22", + "metadata": { + "scrolled": true + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "CPU times: user 875 μs, sys: 0 ns, total: 875 μs\n", + "Wall time: 889 μs\n" + ] + } + ], + "source": [ + "%%time\n", + "# DO NOT CHANGE THIS CELL\n", + "cupy_lat = cp.asarray(df['lat'])\n", + "cupy_long = cp.asarray(df['long'])" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "da67fb99-fdab-49c4-9480-a9b5546dbc95", + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "%%time\n", + "n_cupy_array, e_cupy_array = <<<<FIXME>>>>\n", + "df['northing'] = <<<<FIXME>>>>\n", + "df['easting'] = <<<<FIXME>>>>\n", + "print(df.dtypes)\n", + "df.head()" + ] + }, + { + "cell_type": "raw", + "id": "4c65c727-18a4-47a6-9e43-acba85212834", + "metadata": { + "jupyter": { + "source_hidden": true + }, + "scrolled": true + }, + "source": [ + "\n", + "%%time\n", + "n_cupy_array, e_cupy_array = latlong2osgbgrid_cupy(cupy_lat, cupy_long)\n", + "df['northing'] = pd.Series(n_cupy_array).astype('float32')\n", + "df['easting'] = e_cupy_array.astype('float32')\n", + "print(df.dtypes)\n", + "df.head()" + ] + }, + { + "cell_type": "markdown", + "id": "82640e5c-ae3a-495b-bc48-df9b46f5d4b1", + "metadata": {}, + "source": [ + "Click ... for solution. " + ] + }, + { + "cell_type": "markdown", + "id": "d0e92cb7-c138-49ae-9aa4-e55f76235b98", + "metadata": {}, + "source": [ + "## Boolean Array Indexing ##\n", + "Below we use `np.logical_and` for element-wise boolean selection." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "61d7946d-f239-4c5b-9c38-2057aa88a085", + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "# DO NOT CHANGE THIS CELL\n", + "start=time.time()\n", + "display(df.loc[np.logical_and(df['name'].str.startswith('E'), df['name'].str.endswith('D'))].head())\n", + "print(f'Duration: {round(time.time()-start, 2)} seconds')" + ] + }, + { + "cell_type": "markdown", + "id": "3f768cdc-3412-4256-921e-ffccfdcd6058", + "metadata": {}, + "source": [ + "Below we use the CuPy for boolean selection. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "04ac5517-c69b-4355-81e0-1e94deddcb2e", + "metadata": { + "scrolled": true + }, + "outputs": [], + "source": [ + "%%cudf.pandas.line_profile\n", + "# DO NOT CHANGE THIS CELL\n", + "start=time.time()\n", + "display(df.loc[cp.logical_and(df['name'].str.startswith('E'), df['name'].str.endswith('D'))].head())\n", + "print(f'Duration: {round(time.time()-start, 2)} seconds')" + ] + }, + { + "cell_type": "markdown", + "id": "c2cab4f4-eff4-4b09-a68d-dc1c089bfa72", + "metadata": {}, + "source": [ + "**Note**: String array is not yet implemented in CuPy. " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "7321b4f6-61c6-4968-a77b-3d3ed4cde745", + "metadata": {}, + "outputs": [], + "source": [ + "# DO NOT CHANGE THIS CELL\n", + "import IPython\n", + "app = IPython.Application.instance()\n", + "app.kernel.do_shutdown(True)" + ] + }, + { + "cell_type": "markdown", + "id": "f9d91047-2067-42f7-89b5-e9915eefb1de", + "metadata": {}, + "source": [ + "**Well Done!** Let's move to the [next notebook](1-05_grouping.ipynb). " + ] + }, + { + "cell_type": "markdown", + "id": "ec10008d-0c52-4f35-ad4a-969a892c35b8", + "metadata": {}, + "source": [ + "<img src=\"./images/DLI_Header.png\" width=400/>" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.15" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} |
