path: root/Fundamentals_of_Accelerated_Data_Science
Diffstat (limited to 'Fundamentals_of_Accelerated_Data_Science')
-rw-r--r-- Fundamentals_of_Accelerated_Data_Science/1-08_cudf-polars.ipynb | 7
-rw-r--r-- Fundamentals_of_Accelerated_Data_Science/2-04_networkx_cugraph.ipynb | 19
-rw-r--r-- Fundamentals_of_Accelerated_Data_Science/3-04_logistic_regression.ipynb | 11
-rw-r--r-- Fundamentals_of_Accelerated_Data_Science/3-06_xgboost.ipynb | 4
-rw-r--r-- Fundamentals_of_Accelerated_Data_Science/3-07_triton.ipynb | 7
5 files changed, 38 insertions, 10 deletions
diff --git a/Fundamentals_of_Accelerated_Data_Science/1-08_cudf-polars.ipynb b/Fundamentals_of_Accelerated_Data_Science/1-08_cudf-polars.ipynb
index c0f1115..6f810c8 100644
--- a/Fundamentals_of_Accelerated_Data_Science/1-08_cudf-polars.ipynb
+++ b/Fundamentals_of_Accelerated_Data_Science/1-08_cudf-polars.ipynb
@@ -803,7 +803,7 @@
},
{
"cell_type": "code",
- "execution_count": 30,
+ "execution_count": null,
"id": "6f5883f3-6238-4f98-970b-f06adabfb50e",
"metadata": {},
"outputs": [
@@ -846,7 +846,10 @@
],
"source": [
"# Show optimized Graph\n",
- "lazy_result.show_graph(optimized=True)"
+ "lazy_result.show_graph(optimized=True)\n",
+ "\n",
+ "# pi - projection: compute and keep these columns\n",
+ "# sigma - selection: filter rows by a predicate"
]
},
{
diff --git a/Fundamentals_of_Accelerated_Data_Science/2-04_networkx_cugraph.ipynb b/Fundamentals_of_Accelerated_Data_Science/2-04_networkx_cugraph.ipynb
index 48764b4..e6f3382 100644
--- a/Fundamentals_of_Accelerated_Data_Science/2-04_networkx_cugraph.ipynb
+++ b/Fundamentals_of_Accelerated_Data_Science/2-04_networkx_cugraph.ipynb
@@ -62,6 +62,19 @@
]
},
{
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "7375dbf9",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# https://networkx.org/documentation/stable/reference/configs.html\n",
+ "\n",
+ "# nx.config.backend_priority = [\"cugraph\", \"..\"]\n",
+ "# env NETWORKX_BACKEND_PRIORITY=\"cugraph,..\""
+ ]
+ },
+ {
"cell_type": "markdown",
"id": "697ea4c9-b416-43d5-9d2c-28aa41ef2561",
"metadata": {},
@@ -291,12 +304,14 @@
"### Pagerank Centrality ###\n",
"Determines a node's importance based on the quantity and quality of links to it, similar to Google's original PageRank algorithm\n",
"\n",
- "PageRank’s main difference from EigenCentrality is that it accounts for link direction. Each node in a network is assigned a score based on its number of incoming links (its ‘indegree’). These links are also weighted depending on the relative score of its originating node."
+ "PageRank’s main difference from EigenCentrality is that it accounts for link direction. Each node in a network is assigned a score based on its number of incoming links (its ‘indegree’). These links are also weighted depending on the relative score of its originating node.\n",
+ "\n",
+ "Before iterating, every node starts with the same score, 1/n, where n is the number of nodes."
]
},
{
"cell_type": "code",
- "execution_count": 9,
+ "execution_count": null,
"id": "a17ee15b-8758-484b-82b9-a158187231c5",
"metadata": {},
"outputs": [],
diff --git a/Fundamentals_of_Accelerated_Data_Science/3-04_logistic_regression.ipynb b/Fundamentals_of_Accelerated_Data_Science/3-04_logistic_regression.ipynb
index 3890ea7..4e4b2ab 100644
--- a/Fundamentals_of_Accelerated_Data_Science/3-04_logistic_regression.ipynb
+++ b/Fundamentals_of_Accelerated_Data_Science/3-04_logistic_regression.ipynb
@@ -205,7 +205,9 @@
"## Logistic Regression ##\n",
"Logistic regression can be used to estimate the probability of an outcome as a function of some (assumed independent) inputs. In our case, we would like to estimate infection risk based on population members' age and sex.\n",
"\n",
- "Below we train a logistic regresion model. We first create a cuML logistic regression instance `logreg`. The `logreg.fit` method takes 2 arguments: the model's independent variables *X*, and the dependent variable *y*. Fit the `logreg` model using the `gdf` columns `age` and `sex` as *X* and the `infected` column as *y*."
+ "Below we train a logistic regression model. We first create a cuML logistic regression instance `logreg`. The `logreg.fit` method takes 2 arguments: the model's independent variables *X*, and the dependent variable *y*. Fit the `logreg` model using the `gdf` columns `age` and `sex` as *X* and the `infected` column as *y*.\n",
+ "\n",
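+ "The model maps a linear combination *z* of the inputs to a probability with the sigmoid function, $\\sigma(z) = 1/(1+e^{-z})$."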
]
},
{
@@ -637,6 +639,13 @@
]
},
{
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
+ "source": []
+ },
+ {
"cell_type": "markdown",
"metadata": {},
"source": [
diff --git a/Fundamentals_of_Accelerated_Data_Science/3-06_xgboost.ipynb b/Fundamentals_of_Accelerated_Data_Science/3-06_xgboost.ipynb
index 654a782..d3548f2 100644
--- a/Fundamentals_of_Accelerated_Data_Science/3-06_xgboost.ipynb
+++ b/Fundamentals_of_Accelerated_Data_Science/3-06_xgboost.ipynb
@@ -443,7 +443,7 @@
},
{
"cell_type": "code",
- "execution_count": 12,
+ "execution_count": null,
"metadata": {},
"outputs": [],
"source": [
@@ -453,8 +453,6 @@
" 'device': 'cuda',\n",
" 'tree_method': 'hist',\n",
" 'objective': 'binary:logistic',\n",
- " 'grow_policy': 'lossguide',\n",
- " 'eval_metric': 'logloss',\n",
" 'subsample': '0.8'\n",
"}"
]
diff --git a/Fundamentals_of_Accelerated_Data_Science/3-07_triton.ipynb b/Fundamentals_of_Accelerated_Data_Science/3-07_triton.ipynb
index 757b3fa..47d586c 100644
--- a/Fundamentals_of_Accelerated_Data_Science/3-07_triton.ipynb
+++ b/Fundamentals_of_Accelerated_Data_Science/3-07_triton.ipynb
@@ -120,11 +120,14 @@
},
{
"cell_type": "code",
- "execution_count": 3,
+ "execution_count": null,
"id": "61d898fb-a8d2-4d1c-a13f-2c4be6c18969",
"metadata": {},
"outputs": [],
"source": [
+ "# RAPIDS Forest Inference Library\n",
+ "# FIL returns class labels or probabilities\n",
+ "# mem / disk\n",
"config_text = f\"\"\"backend: \"fil\"\n",
"max_batch_size: 32768\n",
"input [ \n",
@@ -148,7 +151,7 @@
" value: {{ string_value: \"xgboost_json\" }}\n",
" }},\n",
" {{\n",
- " key: \"output_class\"\n",
+ " key: \"output_class\" \n",
" value: {{ string_value: \"false\" }}\n",
" }},\n",
" {{\n",