208 changes: 208 additions & 0 deletions example/jupyter/tutorial_dnn_iris.ipynb
@@ -0,0 +1,208 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Classify Iris Dataset Using DNNClassifer\n",
"\n",
"This tutorial demonstrates how to\n",
"1. train a DNNClassifer on iris dataset.\n",
"1. use trained DNNClassifer to predict iris class.\n",
"\n",
"## The Dataset\n",
"\n",
"The Iris data set contains four features and one label. The four features identify the botanical characteristics of individual Iris flowers. Each feature is stored as a single float number. The label indicates the class of individual Iris flowers. The label is stored as a integer and has possible value of 0, 1, 2.\n",
"\n",
"We have prepared the iris dataset in table `iris.train` and `iris.test`. We will be using them as training data and test data respectively.\n",
"\n",
"We can have a quick peek of the data by running the following standard SQL statements."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"+--------------+---------+------+-----+---------+-------+\n",
"| Field | Type | Null | Key | Default | Extra |\n",
"+--------------+---------+------+-----+---------+-------+\n",
"| sepal_length | float | YES | | None | |\n",
"| sepal_width | float | YES | | None | |\n",
"| petal_length | float | YES | | None | |\n",
"| petal_width | float | YES | | None | |\n",
"| class | int(11) | YES | | None | |\n",
"+--------------+---------+------+-----+---------+-------+"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sqlflow\n",
"describe iris.train;"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"+--------------+-------------+--------------+-------------+-------+\n",
"| sepal_length | sepal_width | petal_length | petal_width | class |\n",
"+--------------+-------------+--------------+-------------+-------+\n",
"| 6.4 | 2.8 | 5.6 | 2.2 | 2 |\n",
"| 5.0 | 2.3 | 3.3 | 1.0 | 1 |\n",
"| 4.9 | 2.5 | 4.5 | 1.7 | 2 |\n",
"| 4.9 | 3.1 | 1.5 | 0.1 | 0 |\n",
"| 5.7 | 3.8 | 1.7 | 0.3 | 0 |\n",
"+--------------+-------------+--------------+-------------+-------+"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sqlflow\n",
"select *\n",
"from iris.train\n",
"limit 5;"
]
},
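{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before training, it can also be handy to know how many examples the training table holds. The following is an ordinary SQL aggregate rather than SQLFlow-specific syntax; the exact count depends on how the demo data was loaded, so treat it as an optional sanity check."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%sqlflow\n",
"SELECT COUNT(*) AS train_examples\n",
"FROM iris.train;"
]
},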
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Train\n",
"\n",
"Let's train a DNNClassifier, which has two hidden layers where each layer has ten hidden units. This can be done by specifying the training clause for SQLFlow's extended syntax.\n",
"\n",
"```\n",
"TRAIN DNNClassifier\n",
"WITH\n",
" model.n_classes = 3,\n",
" model.hidden_units = [10, 20]\n",
"```\n",
"\n",
"To specify the training data, we use standard SQL statements like `SELECT * FROM iris.train`.\n",
"\n",
"We explicit specify which column is used for features and which column is used for the label by writing\n",
"\n",
"```\n",
"COLUMN sepal_length, sepal_width, petal_length, petal_width\n",
"LABEL class\n",
"```\n",
"\n",
"At the end of the training process, we save the trained DNN model into table `sqlflow_models.my_dnn_model` by writing `INTO sqlflow_models.my_dnn_model`.\n",
"\n",
"Putting it all together, we have our first SQLFlow training statement."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Evaluation result: {'accuracy': 0.4347826, 'average_loss': 1.6101575, 'loss': 1.6101575, 'global_step': 100}\n",
"\n",
"Done training\n",
"\n"
]
}
],
"source": [
"%%sqlflow\n",
"SELECT *\n",
"FROM iris.train\n",
"TRAIN DNNClassifier\n",
"WITH\n",
" model.n_classes = 3,\n",
" model.hidden_units = [10, 20],\n",
" train.epoch = 100\n",
"COLUMN sepal_length, sepal_width, petal_length, petal_width\n",
"LABEL class\n",
"INTO sqlflow_models.my_dnn_model;"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Predict\n",
"\n",
"SQLFlow also supports prediction out-of-the-box.\n",
"\n",
"To specify the prediction data, we use standard SQL statements like `SELECT * FROM iris.test`.\n",
"\n",
"Say we want the model, previously stored at `sqlflow_models.my_dnn_model`, to read the prediction data and write the predicted result into table `iris.predict` column `class`. We can write the following SQLFlow prediction statement."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%sqlflow\n",
"SELECT *\n",
"FROM iris.test\n",
"predict iris.predict.class\n",
"USING sqlflow_models.my_dnn_model;"
]
},
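{
"cell_type": "markdown",
"metadata": {},
"source": [
"The prediction statement above fills the `class` column of `iris.predict` with the predicted label for each row read from `iris.test`. As an optional sanity check (assuming the statement has finished), a standard SQL aggregate shows how the predicted classes are distributed before we inspect individual rows."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%sqlflow\n",
"SELECT class, COUNT(*) AS cnt\n",
"FROM iris.predict\n",
"GROUP BY class;"
]
},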
{
"cell_type": "markdown",
"metadata": {},
"source": [
"After the prediction, we can checkout the prediction result by"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%%sqlflow\n",
"SELECT *\n",
"FROM iris.predict\n",
"LIMIT 5;"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.9"
}
},
"nbformat": 4,
"nbformat_minor": 2
}