
Commit 9adc16f

Updated documentation with some performance considerations (#262)
* [add] added mention to THREADS_PER_QUEUE feature
* [add] splitted configuration options
* [add] documented Intermediate tensors memory overhead when issuing AI.MODELRUN and AI.SCRIPTRUN
* [add] Document the overhead of AI.TENSORGET and AI.TENSORSET with the optional arg VALUES
* [add] extended configuration documentation per PR review
* Update configuring.md

Co-authored-by: Luca Antiga <[email protected]>
1 parent 7e6ec5f commit 9adc16f

3 files changed: +157 −30 lines changed

docs/commands.md

Lines changed: 22 additions & 30 deletions
@@ -1,35 +1,5 @@
 # RedisAI Commands
 
-## AI.CONFIG LOADBACKEND
-
-Load a DL/ML backend.
-
-By default, RedisAI starts with the ability to set and get tensor data, but setting and running models and scritps requires a computing backend to be loaded. This command allows to dynamically load a backend by specifying the backend identifier and the path to the backend library. Currently, once loaded, a backend cannot be unloaded, and there can be at most one backend per identifier loaded.
-
-```sql
-AI.CONFIG LOADBACKEND <backend_identifier> <location_of_backend_library>
-```
-
-* allowed backend identifiers are: TF (TensorFlow), TORCH (PyTorch), ONNX (ONNXRuntime).
-
-It is possible to specify backends at the command-line when starting `redis-server`, see example below.
-
-### AI.CONFIG LOADBACKEND Example
-
-> Load the TORCH backend
-
-```sql
-AI.CONFIG LOADBACKEND TORCH install/backend/redisai_torch/redisai_torch.so
-```
-
-> Load the TORCH backend at the command-line
-
-```bash
-redis-server --loadmodule install/redisai.so TORCH install/backend/redisai_torch/redisai_torch.so
-```
-
-This replaces the need for loading a backend using AI.CONFIG LOADBACKEND
-
 ## AI.TENSORSET
 
 Set a tensor.
@@ -61,6 +31,11 @@ Optional args:
AI.TENSORSET foo FLOAT 2 2 VALUES 1 2 3 4
```

!!! warning "Overhead of `AI.TENSORSET` with the optional arg VALUES"

    It is possible to set a tensor by specifying each individual value (`VALUES ...`) or the entire tensor content as a binary buffer (`BLOB ...`). You should always try to use the `BLOB` option, since it removes the overhead of parsing each individual value and does not require serialization/deserialization of the tensor, thus reducing the overall command latency and improving the maximum attainable performance of the model server.
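For illustration, a minimal sketch of the two options in Python, using the `redis-py` client and NumPy against a local RedisAI instance (key names are illustrative):

```python
import numpy as np
import redis

r = redis.Redis()  # assumes RedisAI is loaded on localhost:6379

tensor = np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float32)

# Preferred: ship the whole tensor as one binary buffer (BLOB).
r.execute_command("AI.TENSORSET", "foo", "FLOAT", 2, 2,
                  "BLOB", tensor.tobytes())

# Slower: every element is sent and parsed as a separate argument (VALUES).
r.execute_command("AI.TENSORSET", "bar", "FLOAT", 2, 2,
                  "VALUES", *tensor.flatten().tolist())
```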
---
## AI.TENSORGET

Get a tensor.
@@ -82,6 +57,11 @@ Get binary data for tensor at `foo`. Meta data is also returned.
AI.TENSORGET foo BLOB
```

!!! warning "Overhead of `AI.TENSORGET` with the optional arg VALUES"

    It is possible to receive a tensor as a list of individual values (`VALUES ...`) or as the entire tensor content in a binary buffer (`BLOB ...`). You should always try to use the `BLOB` option, since it removes the overhead of replying with each individual value and does not require serialization/deserialization of the tensor, thus reducing the overall command latency and improving the maximum attainable performance of the model server.
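Continuing the sketch above, the tensor can be fetched back as a single buffer and decoded client-side; the exact reply layout is an assumption based on the note that metadata is returned alongside the data:

```python
import numpy as np
import redis

r = redis.Redis()

# Preferred: one binary payload, decoded client-side.
reply = r.execute_command("AI.TENSORGET", "foo", "BLOB")
# Metadata is assumed to precede the raw buffer in the reply; if the
# reply is a bare buffer, use it directly.
blob = reply[-1] if isinstance(reply, list) else reply
values = np.frombuffer(blob, dtype=np.float32).reshape(2, 2)

# Slower: the server serializes and replies with one element per value.
as_values = r.execute_command("AI.TENSORGET", "foo", "VALUES")
```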
---
## AI.MODELSET

Set a model.
@@ -159,6 +139,13 @@ If needed, input tensors are copied to the device specified in `AI.MODELSET` bef
AI.MODELRUN resnet18 INPUTS image12 OUTPUTS label12
```

!!! warning "Intermediate tensors memory overhead when issuing `AI.MODELRUN` and `AI.SCRIPTRUN`"

    The execution of models and scripts will generate intermediate tensors that are not allocated by the Redis allocator, but by whatever allocator is used in the backends (which may act on main memory or GPU memory, depending on the device). They are therefore not limited by the Redis `maxmemory` setting.
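To observe this in practice, a minimal sketch that compares Redis' own memory accounting with the process RSS via `INFO memory` (using `redis-py`; backend allocations appear only in the RSS figure):

```python
import redis

r = redis.Redis()
mem = r.info("memory")

# used_memory counts only allocations made through the Redis allocator;
# backend-allocated intermediate tensors show up in the process RSS instead.
print("redis-accounted bytes:", mem["used_memory"])
print("process RSS bytes:    ", mem["used_memory_rss"])
```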
---
## AI.SCRIPTSET

Set a script.
@@ -235,6 +222,11 @@ If needed, input tensors are copied to the device specified in `AI.SCRIPTSET` be
AI.SCRIPTRUN addscript addtwo INPUTS a b OUTPUTS c
```

!!! warning "Intermediate tensors memory overhead when issuing `AI.MODELRUN` and `AI.SCRIPTRUN`"

    The execution of models and scripts will generate intermediate tensors that are not allocated by the Redis allocator, but by whatever allocator is used in the backends (which may act on main memory or GPU memory, depending on the device). They are therefore not limited by the Redis `maxmemory` setting.

---
## AI.INFO

Return information about runs of a `MODEL` or a `SCRIPT`.

docs/configuring.md

Lines changed: 134 additions & 0 deletions
@@ -0,0 +1,134 @@
# Configuration

RedisAI supports both run-time configuration options and options that must be specified when loading the module.

## Configuration Options During Loading

In general, configuration options are passed by appending arguments after the `--loadmodule` argument on the command line, after the `loadmodule` directive in a Redis config file, or after the `MODULE LOAD` command.

The module dynamic library `redisai.so` can be located in any path, provided that we specify the full path or a path relative to where the `redis-server` command is issued. The additional arguments are options passed to the module. Currently the supported options are:

- `BACKENDSPATH`: specify the default backends path used when loading a dynamic backend library.
- `TORCH`: specify the location of the PyTorch backend library, and dynamically load it. The location can be given either as an absolute path or relative to `<BACKENDSPATH>`. Using this option replaces the need for loading the PyTorch backend at run-time.
- `TF`: specify the location of the TensorFlow backend library, and dynamically load it. The location can be given either as an absolute path or relative to `<BACKENDSPATH>`. Using this option replaces the need for loading the TensorFlow backend at run-time.
- `TFLITE`: specify the location of the TensorFlow Lite backend library, and dynamically load it. The location can be given either as an absolute path or relative to `<BACKENDSPATH>`. Using this option replaces the need for loading the TensorFlow Lite backend at run-time.
- `ONNX`: specify the location of the ONNXRuntime backend library, and dynamically load it. The location can be given either as an absolute path or relative to `<BACKENDSPATH>`. Using this option replaces the need for loading the ONNXRuntime backend at run-time.
- `THREADS_PER_QUEUE`: specify a fixed number of worker threads per device, created up front. This option is described in detail in the [THREADS_PER_QUEUE](#threads_per_queue) section and can only be set when loading the module.

### Configuration Examples

In redis.conf:

```
loadmodule redisai.so OPT1 OPT2
```

From redis-cli:

```
127.0.0.1:6379> MODULE LOAD redisai.so OPT1 OPT2
```

From the command line, using a relative path:

```
$ redis-server --loadmodule ./redisai.so OPT1 OPT2
```

From the command line, using a full path:

```
$ redis-server --loadmodule /usr/lib/redis/modules/redisai.so OPT1 OPT2
```

### THREADS_PER_QUEUE

```
THREADS_PER_QUEUE {number}
```

Configures the main thread to create a fixed number of worker threads up front per device. This controls the maximum number of threads used for the parallel execution of independent operations.

This option can significantly improve run performance for simple models (models that require little computation effort), since there is usually room for extra computation on modern CPUs and hardware accelerators (GPUs, TPUs, etc.).

#### THREADS_PER_QUEUE Default

By default only one worker thread is used per device.

#### THREADS_PER_QUEUE Example

```
$ redis-server --loadmodule ./redisai.so THREADS_PER_QUEUE 4
```
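The extra worker threads only pay off when independent requests are in flight concurrently, for example from multiple client connections. A minimal sketch using `redis-py` (the model and tensor keys are illustrative and assumed to already exist):

```python
from concurrent.futures import ThreadPoolExecutor

import redis

def run_one(i: int):
    # One connection per worker so the runs can queue up concurrently.
    r = redis.Redis()
    return r.execute_command("AI.MODELRUN", "mymodel",
                             "INPUTS", f"in{i}", "OUTPUTS", f"out{i}")

# With THREADS_PER_QUEUE 4, up to four of these independent runs
# can execute in parallel on the same device.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(run_one, range(4)))
```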
---

## Setting Configuration Options At Run-Time

### AI.CONFIG BACKENDSPATH

Specify the default backends path to use when dynamically loading a backend.

```sql
AI.CONFIG BACKENDSPATH <default_location_of_backend_libraries>
```

#### AI.CONFIG BACKENDSPATH Example

```sql
AI.CONFIG BACKENDSPATH /usr/lib/redis/modules/redisai/backends
```

### AI.CONFIG LOADBACKEND

Load a DL/ML backend.

```sql
AI.CONFIG LOADBACKEND <backend_identifier> <location_of_backend_library>
```

RedisAI currently supports PyTorch (libtorch), TensorFlow (libtensorflow), TensorFlow Lite, and ONNXRuntime as backends.

Allowed backend identifiers are:

- `TF` (TensorFlow)
- `TFLITE` (TensorFlow Lite)
- `TORCH` (PyTorch)
- `ONNX` (ONNXRuntime)

By default, RedisAI starts with the ability to set and get tensor data, but setting and running models and scripts requires a computing backend to be loaded. This can be done during loading, as [explained above](#configuration-options-during-loading), or at run-time using the `AI.CONFIG` command.

This command allows a backend to be loaded dynamically by specifying the backend identifier and the path to the backend library. Currently, once loaded, a backend cannot be unloaded, and at most one backend can be loaded per identifier.

If you don't specify a backend at load time, RedisAI will look in the default location lazily, when a model for a given backend is first loaded. The default locations are relative to the `<BACKENDSPATH>` directory. If unspecified, RedisAI will look for:

- the ONNXRuntime dynamic library at `<BACKENDSPATH>/redisai_onnxruntime/redisai_onnxruntime.so`
- the TensorFlow dynamic library at `<BACKENDSPATH>/redisai_tensorflow/redisai_tensorflow.so`
- the TensorFlow Lite dynamic library at `<BACKENDSPATH>/redisai_tflite/redisai_tflite.so`
- the PyTorch dynamic library at `<BACKENDSPATH>/redisai_torch/redisai_torch.so`

Any library dependencies will be resolved automatically, and the directory layout above is the same on all platforms.

If the given path is relative, it is resolved relative to `<BACKENDSPATH>`.

#### AI.CONFIG LOADBACKEND Examples

Load the TORCH backend, relative to `BACKENDSPATH`:

```sql
AI.CONFIG LOADBACKEND TORCH redisai_torch/redisai_torch.so
```

Load the TORCH backend, specifying the full path:

```sql
AI.CONFIG LOADBACKEND TORCH /usr/lib/redis/modules/redisai/backends/redisai_torch/redisai_torch.so
```

mkdocs.yml

Lines changed: 1 addition & 0 deletions
@@ -27,6 +27,7 @@ pages:
- 'Quickstart': index.md
- 'Commands': commands.md
- 'Backend': dataandbackend.md
- 'Configuration': configuring.md

markdown_extensions:
- admonition
