8 changes: 7 additions & 1 deletion doc/changelog.md
@@ -14,7 +14,8 @@ To be released at some point in the future
Description

- Implement workaround for Tensorflow that allows RedisAI to build with GCC-14
- Add instructions for installing SmartSim on PML's Scylla
- Add installation instructions for airgapped machines
- Add installation instructions for PML's Scylla
- Fix typos in documentation

Detailed Notes
@@ -26,6 +27,11 @@ Detailed Notes
Future versions of Tensorflow may fix this problem, but for now this seems to be
the best workaround.
([SmartSim-PR738](https://github.com/CrayLabs/SmartSim/pull/738))
- Update install notes and documentation for custom backends
- Update/reorganize the install instructions to include a split between advanced
install notes and instructions for specific platforms. Additionally, add
instructions for machines which do not have access to the internet.
([SmartSim-PR749](https://github.com/CrayLabs/SmartSim/pull/749))
- PML's Scylla is still under development. The usual SmartSim
build instructions do not apply because the GPU dependencies
have yet to be installed at a system-wide level. Scylla has
1 change: 1 addition & 0 deletions doc/contributing.rst
@@ -1,3 +1,4 @@
.. _contributing:

******************
Contributing Guide
1 change: 1 addition & 0 deletions doc/index.rst
@@ -13,6 +13,7 @@
overview
installation_instructions/basic
installation_instructions/platform
installation_instructions/troubleshooting/troubleshooting
contributing
smartsim_zoo

41 changes: 21 additions & 20 deletions doc/installation_instructions/basic.rst
@@ -4,7 +4,9 @@
Basic Installation
******************

The following will show how to install both SmartSim and SmartRedis.
The following instructions guide you through installing SmartSim and SmartRedis.
SmartSim, despite being a Python-library, has a second build step for Redis and
RedisAI. Please follow these instructions carefully.

.. note::

@@ -30,30 +32,29 @@ The base prerequisites to install SmartSim and SmartRedis wtih CPU-only support

.. note::

GCC 9, 11-13 is recommended (here are known issues compiling with GCC 10). For
CUDA 11.8, GCC 9 or 11 must be used.

.. warning::

Apple Clang 15 seems to have issues on MacOS with Apple Silicon. Please modify
your path to ensure that a version of GCC installed by brew has priority. Note
this seems to be hardcoded to `gcc` and `g++` in the Redis build so ensure that
`which gcc g++` do not point to Apple Clang.
We suggest using GCC to build Redis, RedisAI, and the ML backends. For specific
version requirements see the :ref:`Requirements <requirements>` section.
Review comment (Contributor):

There does not appear to be a Requirements section; the link instead takes you to the ML Library Support/Linux section. Should this link point to the ML Library Support section instead?


SmartRedis can be compiled with GCC, Intel, Cray, and Nvidia compilers.
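
The warning removed in this hunk asked macOS users to make sure that `gcc` and `g++` do not resolve to Apple Clang. A minimal sketch of that check follows. The helper name `classify_compiler_banner` is hypothetical (it is not part of SmartSim), and the sketch assumes that Apple Clang identifies itself with the word "clang" in its `--version` banner while GCC mentions "gcc", "g++", "GCC", or the Free Software Foundation.

```shell
# Hypothetical helper (not part of SmartSim): classify a compiler from the
# text of its `--version` banner.
classify_compiler_banner() {
    case "$1" in
        *clang*) echo "clang" ;;
        *gcc*|*g++*|*GCC*|*"Free Software Foundation"*) echo "gcc" ;;
        *) echo "unknown" ;;
    esac
}

# Report what `gcc` and `g++` resolve to on the current PATH.
for compiler in gcc g++; do
    if compiler_path="$(command -v "$compiler")"; then
        kind="$(classify_compiler_banner "$("$compiler" --version 2>&1)")"
        echo "$compiler -> $compiler_path ($kind)"
    else
        echo "$compiler not found on PATH"
    fi
done
```

If either command is reported as `clang`, the old warning's advice applies: adjust `PATH` so that a GCC installed by brew takes priority.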

ML Library Support
==================

We currently support both Nvidia and AMD GPUs when using RedisAI for GPU inference. The support
for these GPUs often depends on the version of the CUDA or ROCm stack that is availble on your
machine. In _most_ cases, the versions backwards compatible. If you encounter problems, please
contact us and we can build the backend libraries for your desired version of CUDA and ROCm.
SmartSim supports using Nvidia and AMD GPUs when using RedisAI for GPU
Review comment (Contributor):

"SmartSim supports using Nvidia and AMD GPUs when using RedisAI for GPU inference": what are your thoughts on rephrasing this to "SmartSim enables the use of Nvidia and AMD GPUs for GPU inference with RedisAI"?

inference. GPU support often depends on the version of the CUDA or ROCm stack
that is available on your machine. In _most_ cases, the versions of the ML
Review comment (Contributor):

It looks like _most_ is not rendering correctly on readthedocs and is showing up exactly as written. Try *most* instead.

frameworks are backwards compatible. If you encounter problems, please contact
us at (smartsim at hpe dot com) and we can build the backend libraries for your
Review comment (Contributor):

For telling readers how to contact us, what are your thoughts on pointing them to this area of the docs, so they have multiple options?

https://www.craylabs.org/docs/contributing.html#how-to-connect

Also, what are your thoughts on changing (smartsim at hpe dot com) to
`smartsim@hpe.com <mailto:smartsim@hpe.com>`_ instead?

desired version of CUDA and/or ROCm.

CPU backends are provided for Apple (both Intel and Apple Silicon) and Linux (x86_64).

Be sure to reference the table below to find which versions of the ML libraries are supported for
your particular platform. Additional, see :ref:`installation notes <install-notes>` for helpful
information regarding various system types before installation.
Be sure to reference the table below to find which versions of the ML libraries
are supported for your particular platform. Additionally, see :ref:`Platform
Installation Guide <platform-installation>` for helpful information regarding
specific systems.

.. _requirements:

Linux
-----
@@ -64,7 +65,7 @@ Linux

Review comment (Contributor):

I notice that in the Linux tabs there are additional requirements for CUDA 11 and CUDA 12 but not for ROCm or CPU. Just raising this in case!

Review comment (Contributor):

Also, why does ROCm 6 have N/A for two columns?

Review comment (Contributor):

CPU seems a little out of place. What are your thoughts on splitting this table into two:

  1. GPU Configurations
  2. CPU Configurations

Additional requirements:

* GCC <= 11
* GCC <= 11 (except 10)
* CUDA Toolkit 11.7 or 11.8
* cuDNN 8.9

@@ -86,6 +87,7 @@ Linux

Additional requirements:

* GCC >= 11
* CUDA Toolkit 12
* cuDNN 8.9
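
The GCC constraints listed in the two requirement lists above (GCC <= 11 except 10 for CUDA 11, GCC >= 11 for CUDA 12) can be checked before starting a build. This is a minimal sketch, not part of SmartSim: the function name `gcc_major_ok_for_cuda` is hypothetical, and it assumes that `gcc -dumpversion` reports the version of the compiler that the build will actually use.

```shell
# Hypothetical helper: succeed if the GCC major version satisfies the
# requirement listed for the given CUDA major version.
gcc_major_ok_for_cuda() {
    gcc_major="$1"
    cuda_major="$2"
    case "$cuda_major" in
        11) [ "$gcc_major" -le 11 ] && [ "$gcc_major" -ne 10 ] ;;  # GCC <= 11 (except 10)
        12) [ "$gcc_major" -ge 11 ] ;;                             # GCC >= 11
        *)  return 2 ;;                                            # no requirement listed here
    esac
}

# Check the compiler found on PATH, if any.
if command -v gcc >/dev/null 2>&1; then
    gcc_version="$(gcc -dumpversion)"
    gcc_major="${gcc_version%%.*}"   # keep only the major version
    for cuda_major in 11 12; do
        if gcc_major_ok_for_cuda "$gcc_major" "$cuda_major"; then
            echo "GCC $gcc_major: OK for CUDA $cuda_major"
        else
            echo "GCC $gcc_major: not listed for CUDA $cuda_major"
        fi
    done
fi
```

The check only covers the compiler version; the CUDA Toolkit and cuDNN versions in the lists still need to be verified separately.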

@@ -287,8 +289,7 @@ combination.

GPU builds can be troublesome due to the way that RedisAI and the ML-package
backends look for the CUDA Toolkit and cuDNN libraries. Please see the
:ref:`Platform Installation Section <install-notes>` section for guidance.

:ref:`Install Troubleshooting <installation-troubleshooting>` section for guidance.

.. _dragon_install:

31 changes: 11 additions & 20 deletions doc/installation_instructions/platform.rst
@@ -1,30 +1,21 @@
.. _install-notes:
.. _platform-installation:

Installation on specific platforms
==================================
Platform Install Guide
======================

The following describes installation details for various systems and platforms
that SmartSim may be used on.

.. include:: platform/generic.rst

.. include:: platform/nonroot-linux.rst
HPC platforms often provide modules that enable user to avoid retrieving all
build dependencies themselves. Additionally, some machines require environment
variables and/or configuration settings that need to be set for optimal
performance. The below machines have vetted instructions. Please feel free to
contribute instructions for your own platform (see :ref:`Contributing Guide
<contributing>`).

.. include:: platform/frontier.rst

.. include:: platform/perlmutter.rst

.. include:: platform/pml-scylla.rst
.. include:: platform/cray.rst

.. include:: platform/ncar-cheyenne.rst

.. include:: platform/olcf-summit.rst

.. include:: platform/pml-scylla.rst

.. _site_installation:

.. include:: site-install.rst



.. include:: site-install.rst
2 changes: 1 addition & 1 deletion doc/installation_instructions/platform/cray.rst
@@ -1,5 +1,5 @@
HPE Cray supercomputers
=======================
-----------------------

On certain HPE Cray machines, the SmartSim dependencies have been installed
system-wide though specific paths and names might vary (please contact the team
12 changes: 6 additions & 6 deletions doc/installation_instructions/platform/frontier.rst
@@ -1,8 +1,8 @@
OLCF Frontier
=============
Frontier (OLCF)
---------------

Known limitations
-----------------
^^^^^^^^^^^^^^^^^

We are continually working on getting all the features of SmartSim working on
Frontier, however we do have some known limitations:
@@ -23,7 +23,7 @@ Please raise an issue in the SmartSim Github or contact the developers if the ab
issues are affecting your workflow or if you find any other problems.

One-time Setup
--------------
^^^^^^^^^^^^^^

To install the SmartRedis and SmartSim python packages on Frontier, please follow
these instructions, being sure to set the following variables
@@ -87,7 +87,7 @@ The following output indicates a successful install:
16:26:35 login SmartSim[557020:MainThread] INFO Success!

Post-installation
-----------------
^^^^^^^^^^^^^^^^^

Before running SmartSim, the environment should match the one used to
build, and some variables should be set to optimize performance:
@@ -109,7 +109,7 @@ build, and some variables should be set to optimize performance:
mkdir -p ${MIOPEN_USER_DB_PATH}

Binding DBs to Slingshot
------------------------
^^^^^^^^^^^^^^^^^^^^^^^^

Each Frontier node has *four* NICs, which also means users need to bind
DBs to *four* network interfaces, ``hsn0``, ``hsn1``, ``hsn2``,
33 changes: 0 additions & 33 deletions doc/installation_instructions/platform/ncar-cheyenne.rst

This file was deleted.

18 changes: 0 additions & 18 deletions doc/installation_instructions/platform/nonroot-linux.rst

This file was deleted.

4 changes: 2 additions & 2 deletions doc/installation_instructions/platform/olcf-summit.rst
@@ -1,6 +1,6 @@

Summit at OLCF
==============
Summit (OLCF)
-------------

Since SmartSim does not have a built PowerPC build, the build steps for an IBM
system are slightly different than other systems.
8 changes: 4 additions & 4 deletions doc/installation_instructions/platform/perlmutter.rst
@@ -1,8 +1,8 @@
NERSC Perlmutter
================
Perlmutter (NERSC)
------------------

One-time Setup
--------------
^^^^^^^^^^^^^^

To install SmartSim on Perlmutter, follow these steps:

@@ -53,7 +53,7 @@ The following output indicates a successful install:
16:26:35 login SmartSim[557020:MainThread] INFO Success!

Post-installation
-----------------
^^^^^^^^^^^^^^^^^

After completing the above steps to install SmartSim in a conda environment, you
can reload the conda environment by running the following commands:
8 changes: 4 additions & 4 deletions doc/installation_instructions/platform/pml-scylla.rst
@@ -1,12 +1,12 @@
PML Scylla
==========
Scylla (PML)
------------

.. warning::
As of September 2024, the software stack on Scylla is still being finalized.
Therefore, please consider these instructions as preliminary for now.

One-time Setup
--------------
^^^^^^^^^^^^^^

To install SmartSim on Scylla, follow these steps:

@@ -72,7 +72,7 @@ The following output indicates a successful install:
16:26:35 login SmartSim[557020:MainThread] INFO Success!

Post-installation
-----------------
^^^^^^^^^^^^^^^^^

After completing the above steps to install SmartSim in a conda environment, you
can reload the conda environment by running the following commands:
2 changes: 1 addition & 1 deletion doc/installation_instructions/site-install.rst
@@ -12,4 +12,4 @@ from source with the following steps replacing ``COMPILER_VERSION`` and
module use -a /lus/scratch/smartsim/local/modulefiles
module load cudatoolkit/11.8 cudnn smartsim-deps/COMPILER_VERSION/SMARTSIM_VERSION
pip install smartsim
smart build --skip-backends --device gpu [--onnx]
smart build --skip-backends --device gpu