You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: doc/design/design_clustermodel.md
+3-3Lines changed: 3 additions & 3 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -22,13 +22,13 @@ The figure below demonstrates the overall workflow for cluster model training, w
22
22
23
23
In this scenario, we focus on the extraction of data patterns in unsupervised learning.
24
24
25
-
So, the user can use `TRAIN` keyword to training a model. The user can also specify the training hyper-parameters with the keyword `WITH` and determine whether to use pre-trained model by `USING`. The training and predicting syntax looks like:
25
+
So, the user can use `TO TRAIN` keyword to training a model. The user can also specify the training hyper-parameters with the keyword `WITH` and determine whether to use pre-trained model by `USING`. The training and predicting syntax looks like:
The basic idea of SQLFlow is to extend the SELECT statement of SQL to have the TRAIN and PREDICT clauses. For more discussion, please refer to the [syntax design](/doc/design/design_syntax.md). SQLFlow translates such "extended SQL statements" into submitter programs, which forward the part from SELECT to TRAIN or PREDICT, which we call the "standard part", to the SQL engine. SQLFlow also accepts the SELECT statement without TRAIN or PREDICT clauses and would forward such "standard statements" to the engine. It is noticeable that the "standard part" or "standard statements" are not standardized. For example, various engines use different syntax for `FULL OUTER JOIN`.
11
+
The basic idea of SQLFlow is to extend the SELECT statement of SQL to have the TO TRAIN and PREDICT clauses. For more discussion, please refer to the [syntax design](/doc/design/design_syntax.md). SQLFlow translates such "extended SQL statements" into submitter programs, which forward the part from SELECT to TO TRAIN or PREDICT, which we call the "standard part", to the SQL engine. SQLFlow also accepts the SELECT statement without TO TRAIN or PREDICT clauses and would forward such "standard statements" to the engine. It is noticeable that the "standard part" or "standard statements" are not standardized. For example, various engines use different syntax for `FULL OUTER JOIN`.
12
12
13
13
- Hive supports `FULL OUTER JOIN` directly.
14
14
- MySQL doesn't have `FULL OUTER JOIN`. However, a user can emulates `FULL OUTER JOIN` using `LEFT JOIN`, `UNION` and `RIGHT JOIN`.
Copy file name to clipboardExpand all lines: doc/design/design_elasticdl_on_sqlflow.md
+3-3Lines changed: 3 additions & 3 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -12,7 +12,7 @@ This is a design doc on integration with [ElasticDL](https://github.com/wangkuiy
12
12
SELECT
13
13
c1, c2, c3, c4, c5 as class
14
14
FROM training_data
15
-
TRAIN ElasticDLKerasClassifier
15
+
TO TRAIN ElasticDLKerasClassifier
16
16
WITH
17
17
optimizer ="optimizer",
18
18
loss ="loss",
@@ -47,7 +47,7 @@ Users can provide run-time configurations to ElasticDL job via additional parame
47
47
SELECT
48
48
c1, c2, c3, c4, c5 as class
49
49
FROM training_data
50
-
TRAIN ElasticDLKerasClassifier
50
+
TO TRAIN ElasticDLKerasClassifier
51
51
WITH
52
52
optimizer ="optimizer",
53
53
loss ="loss",
@@ -73,7 +73,7 @@ INTO trained_elasticdl_keras_classifier;
73
73
Steps:
74
74
75
75
1. Based on `SELECT ... FROM ...`, read ODPS table and write it to [RecordIO](https://github.com/wangkuiyi/recordio) files, including both features and labels. These files will be stored in [Kubernetes Persistent Volumes](https://kubernetes.io/docs/concepts/storage/persistent-volumes/). In the future, we will support reading ODPS table directly without having to convert it to RecordIO files.
76
-
2. Generate model definition file (e.g. [cifar10_functional_api.py](https://github.com/wangkuiyi/elasticdl/blob/develop/model_zoo/cifar10_functional_api/cifar10_functional_api.py)) that will be used in `TRAIN` clause, which includes:
76
+
2. Generate model definition file (e.g. [cifar10_functional_api.py](https://github.com/wangkuiyi/elasticdl/blob/develop/model_zoo/cifar10_functional_api/cifar10_functional_api.py)) that will be used in `TO TRAIN` clause, which includes:
77
77
78
78
- In model definition function e.g. `custom_model()`, we need to configure model input and output shapes correctly in `inputs = tf.keras.layers.Input(shape=<input_shape>)` (only when the model is defined using `tf.keras` functional APIs) and `outputs = tf.keras.layers.Dense(<num_classes>)`(based on `COLUMN ... LABEL ...`). For this MVP, users can provide `<input_shape>` and `<num_classes>` using `WITH` clause which will then get passed to the model constructor `custom_model(input_shape, num_classes)` via `--params` argument in ElasticDL high-level API. In the future, this will be inferred from the ODPS table.
79
79
- Pass additional parameters from `WITH` clause to `custom_model()`'s instantiation, such as `optimizer` and `loss`.
0 commit comments