diff --git a/doc/changelog.md b/doc/changelog.md index b2bf0152d5..bcca548854 100644 --- a/doc/changelog.md +++ b/doc/changelog.md @@ -14,7 +14,8 @@ To be released at some point in the future Description - Implement workaround for Tensorflow that allows RedisAI to build with GCC-14 -- Add instructions for installing SmartSim on PML's Scylla +- Add installation instructions for airgapped machines +- Add installation instructions for PML's Scylla - Fix typos in documentation Detailed Notes @@ -26,6 +27,11 @@ Detailed Notes Future versions of Tensorflow may fix this problem, but for now this seems to be the best workaround. ([SmartSim-PR738](https://github.com/CrayLabs/SmartSim/pull/738)) +- Update install notes and documentation for custom backends +- Update/reorganize the install instructions to include a split between advanced + install notes and instructions for specific platforms. Additionally, add + instructions for machines which do not have access to the internet. + ([SmartSim-PR749](https://github.com/CrayLabs/SmartSim/pull/749)) - PML's Scylla is still under development. The usual SmartSim build instructions do not apply because the GPU dependencies have yet to be installed at a system-wide level. Scylla has diff --git a/doc/contributing.rst b/doc/contributing.rst index a8a860045c..7e200078f1 100644 --- a/doc/contributing.rst +++ b/doc/contributing.rst @@ -1,3 +1,4 @@ +.. 
_contributing: ****************** Contributing Guide diff --git a/doc/index.rst b/doc/index.rst index 4c64712b23..f9e05a51c7 100644 --- a/doc/index.rst +++ b/doc/index.rst @@ -13,6 +13,7 @@ overview installation_instructions/basic installation_instructions/platform + installation_instructions/troubleshooting/troubleshooting contributing smartsim_zoo diff --git a/doc/installation_instructions/basic.rst b/doc/installation_instructions/basic.rst index 226ccb0854..d4dda3b688 100644 --- a/doc/installation_instructions/basic.rst +++ b/doc/installation_instructions/basic.rst @@ -4,7 +4,9 @@ Basic Installation ****************** -The following will show how to install both SmartSim and SmartRedis. +The following instructions guide you through installing SmartSim and SmartRedis. +SmartSim, despite being a Python library, has a second build step for Redis and +RedisAI. Please follow these instructions carefully. .. note:: @@ -30,30 +32,29 @@ The base prerequisites to install SmartSim and SmartRedis wtih CPU-only support .. note:: - GCC 9, 11-13 is recommended (here are known issues compiling with GCC 10). For - CUDA 11.8, GCC 9 or 11 must be used. - -.. warning:: - - Apple Clang 15 seems to have issues on MacOS with Apple Silicon. Please modify - your path to ensure that a version of GCC installed by brew has priority. Note - this seems to be hardcoded to `gcc` and `g++` in the Redis build so ensure that - `which gcc g++` do not point to Apple Clang. + We suggest using GCC to build Redis, RedisAI, and the ML backends. For specific + version requirements see the :ref:`Requirements ` section. + SmartRedis can be compiled with GCC, Intel, Cray, and Nvidia compilers. ML Library Support ================== -We currently support both Nvidia and AMD GPUs when using RedisAI for GPU inference. The support -for these GPUs often depends on the version of the CUDA or ROCm stack that is availble on your -machine. In _most_ cases, the versions backwards compatible. 
If you encounter problems, please -contact us and we can build the backend libraries for your desired version of CUDA and ROCm. +SmartSim supports using Nvidia and AMD GPUs when using RedisAI for GPU +inference. GPU support often depends on the version of the CUDA or ROCm stack +that is available on your machine. In _most_ cases, the versions of the ML +frameworks are backwards compatible. If you encounter problems, please contact +us at (smartsim at hpe dot com) and we can build the backend libraries for your +desired version of CUDA and/or ROCm. CPU backends are provided for Apple (both Intel and Apple Silicon) and Linux (x86_64). -Be sure to reference the table below to find which versions of the ML libraries are supported for -your particular platform. Additional, see :ref:`installation notes ` for helpful -information regarding various system types before installation. +Be sure to reference the table below to find which versions of the ML libraries +are supported for your particular platform. Additionally, see :ref:`Platform +Installation Guide ` for helpful information regarding +specific systems. + +.. _requirements: Linux ----- @@ -64,7 +65,7 @@ Linux Additional requirements: - * GCC <= 11 + * GCC <= 11 (except 10) * CUDA Toolkit 11.7 or 11.8 * cuDNN 8.9 @@ -86,6 +87,7 @@ Linux Additional requirements: + * GCC >= 11 * CUDA Toolkit 12 * cuDNN 8.9 @@ -287,8 +289,7 @@ combination. GPU builds can be troublesome due to the way that RedisAI and the ML-package backends look for the CUDA Toolkit and cuDNN libraries. Please see the - :ref:`Platform Installation Section ` section for guidance. - + :ref:`Install Troubleshooting ` section for guidance. .. _dragon_install: diff --git a/doc/installation_instructions/platform.rst b/doc/installation_instructions/platform.rst index c1eb51df1a..c2aca958fc 100644 --- a/doc/installation_instructions/platform.rst +++ b/doc/installation_instructions/platform.rst @@ -1,30 +1,21 @@ -.. _install-notes: +.. 
_platform-installation: -Installation on specific platforms -================================== +Platform Install Guide +====================== -The following describes installation details for various systems and platforms -that SmartSim may be used on. -.. include:: platform/generic.rst - -.. include:: platform/nonroot-linux.rst +HPC platforms often provide modules that enable users to avoid retrieving all +build dependencies themselves. Additionally, some machines require environment +variables and/or configuration settings that need to be set for optimal +performance. The machines below have vetted instructions. Please feel free to +contribute instructions for your own platform (see :ref:`Contributing Guide +`). .. include:: platform/frontier.rst - .. include:: platform/perlmutter.rst - +.. include:: platform/pml-scylla.rst .. include:: platform/cray.rst - -.. include:: platform/ncar-cheyenne.rst - .. include:: platform/olcf-summit.rst -.. include:: platform/pml-scylla.rst - .. _site_installation: - -.. include:: site-install.rst - - - +.. 
include:: site-install.rst \ No newline at end of file diff --git a/doc/installation_instructions/platform/cray.rst b/doc/installation_instructions/platform/cray.rst index 1a352abd99..6b763c0236 100644 --- a/doc/installation_instructions/platform/cray.rst +++ b/doc/installation_instructions/platform/cray.rst @@ -1,5 +1,5 @@ HPE Cray supercomputers -======================= +----------------------- On certain HPE Cray machines, the SmartSim dependencies have been installed system-wide though specific paths and names might vary (please contact the team diff --git a/doc/installation_instructions/platform/frontier.rst b/doc/installation_instructions/platform/frontier.rst index 9b05061fe1..06828bac9e 100644 --- a/doc/installation_instructions/platform/frontier.rst +++ b/doc/installation_instructions/platform/frontier.rst @@ -1,8 +1,8 @@ -OLCF Frontier -============= +Frontier (OLCF) +--------------- Known limitations ------------------ +^^^^^^^^^^^^^^^^^ We are continually working on getting all the features of SmartSim working on Frontier, however we do have some known limitations: @@ -23,7 +23,7 @@ Please raise an issue in the SmartSim Github or contact the developers if the ab issues are affecting your workflow or if you find any other problems. One-time Setup --------------- +^^^^^^^^^^^^^^ To install the SmartRedis and SmartSim python packages on Frontier, please follow these instructions, being sure to set the following variables @@ -87,7 +87,7 @@ The following output indicates a successful install: 16:26:35 login SmartSim[557020:MainThread] INFO Success! 
Post-installation ------------------ +^^^^^^^^^^^^^^^^^ Before running SmartSim, the environment should match the one used to build, and some variables should be set to optimize performance: @@ -109,7 +109,7 @@ build, and some variables should be set to optimize performance: mkdir -p ${MIOPEN_USER_DB_PATH} Binding DBs to Slingshot ------------------------- +^^^^^^^^^^^^^^^^^^^^^^^^ Each Frontier node has *four* NICs, which also means users need to bind DBs to *four* network interfaces, ``hsn0``, ``hsn1``, ``hsn2``, diff --git a/doc/installation_instructions/platform/ncar-cheyenne.rst b/doc/installation_instructions/platform/ncar-cheyenne.rst deleted file mode 100644 index aeb994e917..0000000000 --- a/doc/installation_instructions/platform/ncar-cheyenne.rst +++ /dev/null @@ -1,33 +0,0 @@ - -Cheyenne at NCAR -================ - -Since SmartSim does not currently support the Message Passing Toolkit (MPT), -Cheyenne users of SmartSim will need to utilize OpenMPI. - -The following module commands were utilized to run the examples: - -.. code-block:: bash - - $ module purge - $ module load ncarenv/1.3 gnu/8.3.0 ncarcompilers/0.5.0 netcdf/4.7.4 openmpi/4.0.5 - -With this environment loaded, users will need to build and install both SmartSim -and SmartRedis through pip. Usually we recommend users installing or loading -miniconda and using the pip that comes with that installation. - -.. code-block:: bash - - $ pip install smartsim - $ smart build --device cpu #(Since Cheyenne does not have GPUs) - -To make the SmartRedis library (C, C++, Fortran clients), follow these steps -with the same environment loaded. - -.. 
code-block:: bash - - # clone SmartRedis and build - $ git clone https://github.com/SmartRedis.git smartredis - $ cd smartredis - $ make lib - diff --git a/doc/installation_instructions/platform/nonroot-linux.rst b/doc/installation_instructions/platform/nonroot-linux.rst deleted file mode 100644 index 3070a871ae..0000000000 --- a/doc/installation_instructions/platform/nonroot-linux.rst +++ /dev/null @@ -1,18 +0,0 @@ -GPU dependencies (non-root) -=========================== - -The Nvidia installation instructions for CUDA Toolkit and cuDNN tend to be -tailored for users with root access. For those on HPC platforms where root -access is rare, manually downloading and installing these dependencies as -a user is possible. - -.. code-block:: bash - - wget https://developer.download.nvidia.com/compute/cuda/11.4.4/local_installers/cuda_11.4.4_470.82.01_linux.run - chmod +x cuda_11.4.4_470.82.01_linux.run - ./cuda_11.4.4_470.82.01_linux.run --toolkit --silent --toolkitpath=/path/to/install/location/ - -For cuDNN, follow `Nvidia's instructions -`_, -and copy the cuDNN libraries to the `lib64` directory at the CUDA Toolkit -location specified above. \ No newline at end of file diff --git a/doc/installation_instructions/platform/olcf-summit.rst b/doc/installation_instructions/platform/olcf-summit.rst index 07be24eec7..fbb4f9b6d2 100644 --- a/doc/installation_instructions/platform/olcf-summit.rst +++ b/doc/installation_instructions/platform/olcf-summit.rst @@ -1,6 +1,6 @@ -Summit at OLCF -============== +Summit (OLCF) +------------- Since SmartSim does not have a built PowerPC build, the build steps for an IBM system are slightly different than other systems. 
diff --git a/doc/installation_instructions/platform/perlmutter.rst b/doc/installation_instructions/platform/perlmutter.rst index 71f97a4dc9..7f1a0088c8 100644 --- a/doc/installation_instructions/platform/perlmutter.rst +++ b/doc/installation_instructions/platform/perlmutter.rst @@ -1,8 +1,8 @@ -NERSC Perlmutter -================ +Perlmutter (NERSC) +------------------ One-time Setup --------------- +^^^^^^^^^^^^^^ To install SmartSim on Perlmutter, follow these steps: @@ -53,7 +53,7 @@ The following output indicates a successful install: 16:26:35 login SmartSim[557020:MainThread] INFO Success! Post-installation ------------------ +^^^^^^^^^^^^^^^^^ After completing the above steps to install SmartSim in a conda environment, you can reload the conda environment by running the following commands: diff --git a/doc/installation_instructions/platform/pml-scylla.rst b/doc/installation_instructions/platform/pml-scylla.rst index c13f178213..8aa80c0e7f 100644 --- a/doc/installation_instructions/platform/pml-scylla.rst +++ b/doc/installation_instructions/platform/pml-scylla.rst @@ -1,12 +1,12 @@ -PML Scylla -========== +Scylla (PML) +------------ .. warning:: As of September 2024, the software stack on Scylla is still being finalized. Therefore, please consider these instructions as preliminary for now. One-time Setup --------------- +^^^^^^^^^^^^^^ To install SmartSim on Scylla, follow these steps: @@ -72,7 +72,7 @@ The following output indicates a successful install: 16:26:35 login SmartSim[557020:MainThread] INFO Success! 
Post-installation ------------------ +^^^^^^^^^^^^^^^^^ After completing the above steps to install SmartSim in a conda environment, you can reload the conda environment by running the following commands: diff --git a/doc/installation_instructions/site-install.rst b/doc/installation_instructions/site-install.rst index 53e0ff8bf0..ca4b3f0c60 100644 --- a/doc/installation_instructions/site-install.rst +++ b/doc/installation_instructions/site-install.rst @@ -12,4 +12,4 @@ from source with the following steps replacing ``COMPILER_VERSION`` and module use -a /lus/scratch/smartsim/local/modulefiles module load cudatoolkit/11.8 cudnn smartsim-deps/COMPILER_VERSION/SMARTSIM_VERSION pip install smartsim - smart build --skip-backends --device gpu [--onnx] + smart build --skip-backends --device gpu diff --git a/doc/installation_instructions/troubleshooting/cuda-dependencies.rst b/doc/installation_instructions/troubleshooting/cuda-dependencies.rst new file mode 100644 index 0000000000..cf6dcacc0f --- /dev/null +++ b/doc/installation_instructions/troubleshooting/cuda-dependencies.rst @@ -0,0 +1,111 @@ +Nvidia GPU Dependencies +----------------------- + +The Nvidia installation instructions for CUDA Toolkit and cuDNN tend to be +tailored for users with root access. For those on HPC platforms where root +access is rare, users can install Nvidia dependencies in user-space. Even on +machines where these dependencies are available, if environment variables are +not set, the ``smart build`` step may fail. This section details how to download +and install these dependencies and configure your build environment. + +.. note:: + + The Orchestrator must be launched in an environment with the cuDNN and CUDA + Toolkit libraries findable by the link loader (e.g. available in the + ``LD_LIBRARY_PATH`` environment variable). 
+ +Download and install +^^^^^^^^^^^^^^^^^^^^ + +**Step 1:** Find a location which is globally accessible and has sufficient +storage space (about 12GB) and set the following environment variables: + +.. code-block:: bash + + export CUDA_TOOLKIT_INSTALL_PATH=/path/to/install/location/cudatoolkit + export CUDNN_INSTALL_PATH=/path/to/install/location/cudnn + +**Step 2:** Download the CUDA Toolkit and install it: + +.. tabs:: + + .. group-tab:: CUDA 11 + + .. code-block:: bash + + wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda_11.8.0_520.61.05_linux.run + sh ./cuda_11.8.0_520.61.05_linux.run --toolkit --silent --toolkitpath=$CUDA_TOOLKIT_INSTALL_PATH + + .. group-tab:: CUDA 12 + + .. code-block:: bash + + wget https://developer.download.nvidia.com/compute/cuda/12.5.0/local_installers/cuda_12.5.0_555.42.02_linux.run + sh ./cuda_12.5.0_555.42.02_linux.run --toolkit --silent --toolkitpath=$CUDA_TOOLKIT_INSTALL_PATH + +**Step 3:** Download cuDNN: +For cuDNN, follow `Nvidia's instructions +`_ for +downloading cuDNN version 8.9 for either CUDA-11 or CUDA-12. + +**Step 4:** Untar the cuDNN archive: + +.. tabs:: + + .. group-tab:: CUDA 11 + + .. code-block:: bash + + mkdir -p $CUDNN_INSTALL_PATH + tar -xf cudnn-linux-x86_64-8.9.7.29_cuda11-archive.tar -C $CUDNN_INSTALL_PATH --strip-components 1 + + .. group-tab:: CUDA 12 + + .. code-block:: bash + + mkdir -p $CUDNN_INSTALL_PATH + tar -xf cudnn-linux-x86_64-8.9.7.29_cuda12-archive.tar -C $CUDNN_INSTALL_PATH --strip-components 1 + +Option 1: Environment Variables +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The following environment variables help the ``smart build`` step find and link in the +CUDA Toolkit and cuDNN libraries needed to build the ML backends. + +.. 
code-block:: bash + + # CUDA Toolkit variables + export CUDA_TOOLKIT_ROOT_DIR=$CUDA_TOOLKIT_INSTALL_PATH + export CUDA_NVCC_EXECUTABLE=$CUDA_TOOLKIT_ROOT_DIR/bin/nvcc + export CUDA_INCLUDE_DIRS=$CUDA_TOOLKIT_ROOT_DIR/include + + # cuDNN Variables + export CUDNN_LIBRARY=$CUDNN_INSTALL_PATH/lib/libcudnn.so + export CUDNN_INCLUDE_DIR=$CUDNN_INSTALL_PATH/include + +Option 2: Setup Modulefiles +^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Alternatively, these environment variables can be set up by using environment +modules. This is useful when the CUDA dependencies are intended to be shared +across users. + +**Step 1:** Download these two modulefiles to a directory of your choosing + +- :download:`CUDA Toolkit <./cudatoolkit>` +- :download:`cuDNN <./cudnn>` + +**Step 2:** Modify the files to set the ``cuda_home`` and ``CUDNN_ROOT`` +variables to match the installed locations for CUDA Toolkit and cuDNN. + +**Step 3:** In your ``.bashrc`` add the following line + +.. code-block:: + + module use /path/to/modulefile root + +**Step 4:** Activate the modulefiles + +.. 
code-block:: + + module load cudatoolkit cudnn \ No newline at end of file diff --git a/doc/installation_instructions/troubleshooting/cudatoolkit b/doc/installation_instructions/troubleshooting/cudatoolkit new file mode 100644 index 0000000000..7b8dd73068 --- /dev/null +++ b/doc/installation_instructions/troubleshooting/cudatoolkit @@ -0,0 +1,24 @@ +#%Module -*- tcl -*- ## +## modulefile + +proc ModulesHelp { } { + + puts stderr "\tAdds CUDA Toolkit to your environment," + +} + +module-whatis "CUDA Toolkit development libraries" + +conflict cudatoolkit + +set cuda_home path/to/cudatoolkit + +setenv CUDA_HOME $cuda_home +setenv CUDA_TOOLKIT_ROOT_DIR $cuda_home +set cuda_lib $cuda_home/lib64/ +setenv CUDA_LIBRARY $cuda_lib +prepend-path LD_LIBRARY_PATH $cuda_lib + +prepend-path PATH $cuda_home/bin +setenv CUDA_NVCC_EXECUTABLE $cuda_home/bin/nvcc +setenv CUDA_INCLUDE_DIRS $cuda_home/include \ No newline at end of file diff --git a/doc/installation_instructions/troubleshooting/cudnn b/doc/installation_instructions/troubleshooting/cudnn new file mode 100644 index 0000000000..ccf6a6f211 --- /dev/null +++ b/doc/installation_instructions/troubleshooting/cudnn @@ -0,0 +1,13 @@ +#%Module -*- tcl -*- ## +## modulefile +proc ModulesHelp { } { + +puts stderr "\tAdds CUDNN to your environment," + +} + +module-whatis "CUDNN development libraries" + +set CUDNN_ROOT /path/to/cudnn +set cudnn_lib $CUDNN_ROOT/lib +setenv CUDNN_INSTALL_PATH $CUDNN_ROOT \ No newline at end of file diff --git a/doc/installation_instructions/troubleshooting/custom_backends.rst b/doc/installation_instructions/troubleshooting/custom_backends.rst new file mode 100644 index 0000000000..98031cf5e8 --- /dev/null +++ b/doc/installation_instructions/troubleshooting/custom_backends.rst @@ -0,0 +1,41 @@ +Custom ML backends +------------------ + +The ML backends (Torch, ONNX Runtime, and Tensorflow) and their associated +python packages have different versions and indices that can be supported based +on the 
intended device (CPU, ROCm, CUDA-11, or CUDA-12). The officially +supported backends are stored in JSON files within the +``smartsim/_core/_install/configs/mlpackages`` directory. + +To customize the version of a backend and/or package, we recommend that you use +a configuration shipped with SmartSim as a template (for example the one at the +end of this section). Copy the file and update as needed. Afterwards, use +``smart build --config-dir`` to tell the build process to use custom +configuration(s). + +The following table describes the main fields needed to define a machine learning +backend used by RedisAI. + +.. list-table:: MLPackages fields + :widths: 15 60 + :header-rows: 1 + + * - Field Name + - Description + * - ``name`` + - The name of the C++ frontend to the ML package itself (e.g. libtorch) + * - ``version`` + - A string used to identify the version of the library. Note that this does not have + an effect on the build process itself, but is used to display information + * - ``pip_index`` + - The pip index from which to install the python packages associated with this ML package + * - ``lib_source`` + - The location of the archive which contains the ML backend. If this is a URL, the file + will be downloaded, otherwise if this is a local path, the archive will be copied to + the build library and extracted + * - ``rai_patches`` + - Patch RedisAI source code with modifications needed by this ML package + +As an example, the following file describes the ML frameworks for Linux on CUDA-12 devices: + +.. 
literalinclude:: ../../../smartsim/_core/_install/configs/mlpackages/LinuxX64CUDA12.json diff --git a/doc/installation_instructions/platform/generic.rst b/doc/installation_instructions/troubleshooting/generic.rst similarity index 53% rename from doc/installation_instructions/platform/generic.rst rename to doc/installation_instructions/troubleshooting/generic.rst index 6ead091028..f035c3bc47 100644 --- a/doc/installation_instructions/platform/generic.rst +++ b/doc/installation_instructions/troubleshooting/generic.rst @@ -1,5 +1,5 @@ Customizing environment variables -================================= +--------------------------------- Various environment variables can be used to control the compilers and dependencies for SmartSim. These are particularly important to set before the @@ -14,14 +14,8 @@ backends are compiled with the desired compilation environment. that this works as intended however, please be sure to set the correct environment for the simulation using the ``RunSettings``. -All of the following environment variables must be *exported* to ensure that -they are used throughout the entire build process. Additionally at runtime, the -environment in which the Orchestrator is launched must have the cuDNN and CUDA -Toolkit libraries findable by the link loader (e.g. available in the -``LD_LIBRARY_PATH`` environment variable). - Compiler environment --------------------- +^^^^^^^^^^^^^^^^^^^^ Unlike SmartRedis, we *strongly* encourage users to only use the GNU compiler chain to build the SmartSim dependencies. Notably, RedisAI has some coding @@ -30,22 +24,4 @@ compiler should be used (e.g. the Cray Programming Environment wrappers), the following environment variables will control the C and C++ compilers: - ``CC``: Path to the C compiler -- ``CXX``: Path the C++ compiler - -CUDA-related ------------- - -The following environment variables help the ``smart build`` step find and link in the -CUDA Toolkit and cuDNN libraries needed to build the ML backends. 
- -cuDNN: - -- ``CUDNN_LIBRARY``: Path to the cuDNN shared libraries (e.g. ``libcudnn.so``) are found -- ``CUDNN_INCLUDE_DIR``: Path to cuDNN header files (e.g. ``cudnn.h``) - -CUDA Toolkit: - -- ``CUDA_TOOLKIT_ROOT_DIR``: Path to the root directory of CUDA Toolkit -- ``CUDA_NVCC_EXECUTABLE``: Path to the ``nvcc`` compiler -- ``CUDA_INCLUDE_DIRS``: Path to the CUDA Toolkit headers - +- ``CXX``: Path to the C++ compiler \ No newline at end of file diff --git a/doc/installation_instructions/troubleshooting/offline.rst b/doc/installation_instructions/troubleshooting/offline.rst new file mode 100644 index 0000000000..87b19f1c3d --- /dev/null +++ b/doc/installation_instructions/troubleshooting/offline.rst @@ -0,0 +1,58 @@ +Airgapped Systems +----------------- + +SmartSim assumes that dependencies can be retrieved via the Internet. The +``smart build`` step can be bypassed by transferring the build artifacts from a +different machine. + +.. warning:: + + The Redis Source Available License (which licenses RedisAI) prohibits + distributing binaries to third parties. Thus, compiled binaries should not + be shared outside of your organization (see `RSAL v2 + `_). + + +The easiest way to accomplish this assumes that you have the following: +- A source machine connected to the Internet with SmartSim built (referred to as Machine A). +- A target machine not connected to the Internet (referred to as Machine B). + +.. warning:: + The build and compilation environments of Machines A and B must be compatible. + +**Step 1:** Note the path to SmartSim's ``core`` directory on Machine A + +.. code:: + + smart info + +**Step 2:** tar the ``bin`` and ``lib`` directories + +.. code:: + + tar -cf smartsim_build_artifacts.tar -C bin/ lib/ + +**Step 3:** Copy the tarball, SmartSim wheel, SmartRedis wheel, +SmartRedis libraries to Machine B (method will vary by machine) + +**Step 4:** Install SmartSim and SmartRedis on Machine B + +.. code:: + + pip install + +**Step 5:** Find the path to the core directory again with + +.. 
code:: + + smart info + +**Step 6:** Unpack the tarball to the core directory + +.. code:: + + tar -xf smartsim_build_artifacts.tar -C + +**Step 7:** Install the Python packages associated with the ML frameworks +(for the default versions reference +``smartsim/_core/_install/configs/mlpackages``) \ No newline at end of file diff --git a/doc/installation_instructions/troubleshooting/troubleshooting.rst b/doc/installation_instructions/troubleshooting/troubleshooting.rst new file mode 100644 index 0000000000..c9490637fb --- /dev/null +++ b/doc/installation_instructions/troubleshooting/troubleshooting.rst @@ -0,0 +1,16 @@ +.. _installation-troubleshooting: + +Installation Troubleshooting +============================ + +SmartSim has been installed on a variety of systems with different build +environments and toolchains. The following sections detail some common +situations and how to configure your build environment: + +.. include:: generic.rst + +.. include:: cuda-dependencies.rst + +.. include:: offline.rst + +.. 
include:: custom_backends.rst \ No newline at end of file diff --git a/smartsim/_core/_cli/build.py b/smartsim/_core/_cli/build.py index 5d094b72f4..7b6347ea40 100644 --- a/smartsim/_core/_cli/build.py +++ b/smartsim/_core/_cli/build.py @@ -306,7 +306,8 @@ def execute( logger.warning("Dragon installation failed") # REDIS/KeyDB - build_database(build_env, versions, keydb, verbose) + if not args.skip_database: + build_database(build_env, versions, keydb, verbose) if (CONFIG.lib_path / "redisai.so").exists(): logger.warning("RedisAI was previously built, run 'smart clean' to rebuild") @@ -368,6 +369,9 @@ def configure_parser(parser: argparse.ArgumentParser) -> None: action="store_true", help="Do not compile RedisAI and the backends", ) + parser.add_argument( + "--skip-database", action="store_true", help="Do not build the database" + ) parser.add_argument( "--skip-torch", action="store_true", diff --git a/smartsim/_core/_cli/info.py b/smartsim/_core/_cli/info.py index c08fcb1a35..21a426bafc 100644 --- a/smartsim/_core/_cli/info.py +++ b/smartsim/_core/_cli/info.py @@ -9,6 +9,7 @@ import smartsim._core._cli.utils as _utils import smartsim._core.utils.helpers as _helpers from smartsim._core._install.buildenv import BuildEnv as _BuildEnv +from smartsim._core.config import CONFIG _MISSING_DEP = _helpers.colorize("Not Installed", "red") @@ -29,6 +30,12 @@ def execute( end="\n\n", ) + print("SmartSim Paths") + path_table = [["core", str(CONFIG.dependency_path)]] + path_table.append(["bin", str(CONFIG.bin_path)]) + path_table.append(["lib", str(CONFIG.lib_path)]) + print(tabulate(path_table, tablefmt="fancy_outline"), end="\n\n") + print("Orchestrator Configuration:") db_path = _utils.get_db_path() db_table = [["Installed", _fmt_installed_db(db_path)]]
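
Review note: the `--skip-database` change in `smartsim/_core/_cli/build.py` is easiest to reason about in isolation. The following is a minimal sketch, not the SmartSim implementation: it reuses only the `add_argument` call from the diff and stands in for the real build functions with hypothetical step labels (`"database"`, `"redisai"`). The function `planned_steps` and the program name `smart-build-sketch` are likewise assumptions made for illustration.

```python
import argparse
from typing import List


def configure_parser(parser: argparse.ArgumentParser) -> None:
    # Same flag definition as in the diff; every other `smart build`
    # option is omitted from this sketch.
    parser.add_argument(
        "--skip-database", action="store_true", help="Do not build the database"
    )


def planned_steps(argv: List[str]) -> List[str]:
    """Return the build steps that would run for the given arguments.

    The step names are hypothetical labels, not SmartSim function names.
    """
    parser = argparse.ArgumentParser(prog="smart-build-sketch")
    configure_parser(parser)
    args = parser.parse_args(argv)

    steps: List[str] = []
    # argparse exposes "--skip-database" as the attribute "skip_database",
    # which is why the diff checks `args.skip_database`.
    if not args.skip_database:
        steps.append("database")
    # In this sketch the RedisAI step is not affected by the new flag.
    steps.append("redisai")
    return steps


if __name__ == "__main__":
    print(planned_steps([]))
    print(planned_steps(["--skip-database"]))
```

Because the flag uses `action="store_true"`, it defaults to `False`, so existing invocations of the command keep building the database unless the flag is passed explicitly.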