Commit f01cd56

Merge branch 'master' into ci/migrate-tpu
2 parents 19dd229 + 9978e71 commit f01cd56

94 files changed, with 2709 additions and 501 deletions


.azure/gpu-tests.yml renamed to .azure/gpu-tests-pytorch.yml

Lines changed: 8 additions & 2 deletions
@@ -14,7 +14,7 @@ trigger:
  - "refs/tags/*"
  paths:
  include:
- - ".azure/gpu-tests.yml"
+ - ".azure/gpu-tests-pytorch.yml"
  - "examples/run_ddp_examples.sh"
  - "examples/convert_from_pt_to_pl/**"
  - "examples/run_pl_examples.sh"
@@ -39,7 +39,7 @@ pr:
  - "release/*"
  paths:
  include:
- - ".azure/gpu-tests.yml"
+ - ".azure/gpu-tests-pytorch.yml"
  - "examples/run_ddp_examples.sh"
  - "examples/convert_from_pt_to_pl/**"
  - "examples/run_pl_examples.sh"
@@ -97,6 +97,7 @@ jobs:
  set -e
  python -c "fname = 'requirements/pytorch/strategies.txt' ; lines = [line for line in open(fname).readlines() if 'horovod' not in line] ; open(fname, 'w').writelines(lines)"
  python -c "fname = 'requirements/pytorch/strategies.txt' ; lines = [line for line in open(fname).readlines() if 'bagua' not in line] ; open(fname, 'w').writelines(lines)"
+ python -c "fname = 'requirements/pytorch/strategies.txt' ; lines = [line for line in open(fname).readlines() if 'colossalai' not in line] ; open(fname, 'w').writelines(lines)"

  PYTORCH_VERSION=$(python -c "import torch; print(torch.__version__.split('+')[0])")
  python ./requirements/pytorch/adjust-versions.py requirements/pytorch/base.txt ${PYTORCH_VERSION}
@@ -110,6 +111,11 @@ jobs:
  CUDA_VERSION_BAGUA=$(python -c "print([ver for ver in [116,113,111,102] if $CUDA_VERSION_MM >= ver][0])")
  pip install "bagua-cuda$CUDA_VERSION_BAGUA"

+ PYTORCH_VERSION_COLOSSALAI=$(python -c "import torch; print(torch.__version__.split('+')[0][:4])")
+ CUDA_VERSION_MM_COLOSSALAI=$(python -c "import torch ; print(''.join(map(str, torch.version.cuda)))")
+ CUDA_VERSION_COLOSSALAI=$(python -c "print([ver for ver in [11.3, 11.1] if $CUDA_VERSION_MM_COLOSSALAI >= ver][0])")
+ pip install "colossalai==0.1.10+torch${PYTORCH_VERSION_COLOSSALAI}cu${CUDA_VERSION_COLOSSALAI}" --find-links https://release.colossalai.org
+
  pip list
  env:
  PACKAGE_NAME: pytorch
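
Note on the ColossalAI install step: the added shell lines derive the wheel pin from the installed torch build. A minimal Python sketch of the same derivation follows, written as a pure function so it can be checked without torch installed. The function name `colossalai_pin` and its parameters are hypothetical; the `0.1.10` release and the `[11.3, 11.1]` candidate list are copied from the step above. Unlike the one-liner, which would fail with an `IndexError` when no candidate matches, the sketch raises a `ValueError` with a message.

```python
def colossalai_pin(torch_version: str, cuda_version: str, release: str = "0.1.10") -> str:
    """Build the version pin used in the pip install line (hypothetical helper)."""
    # Drop a local build suffix such as "+cu116", then keep the first four
    # characters, mirroring torch.__version__.split('+')[0][:4] in the script.
    torch_tag = torch_version.split("+")[0][:4]
    # torch.version.cuda is already a string such as "11.3"; compare it
    # numerically against the candidate CUDA builds, newest first.
    cuda = float(cuda_version)
    candidates = [11.3, 11.1]
    matches = [ver for ver in candidates if cuda >= ver]
    if not matches:
        raise ValueError(f"no candidate ColossalAI build for CUDA {cuda_version}")
    return f"colossalai=={release}+torch{torch_tag}cu{matches[0]}"


print(colossalai_pin("1.12.1+cu116", "11.6"))
```

The four-character slice assumes a two-digit minor version: for a version such as `1.9.0` it would keep the trailing dot (`1.9.`), so the derivation seems to rely on the pipeline running a torch 1.10 or newer build.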

.github/checkgroup.yml

Lines changed: 1 addition & 1 deletion
@@ -85,7 +85,7 @@ subprojects:
  - id: "pytorch_lightning: Azure GPU"
  paths:
- - ".azure/gpu-tests.yml"
+ - ".azure/gpu-tests-pytorch.yml"
  - "tests/tests_pytorch/run_standalone_*.sh"
  checks:
  - "pytorch-lightning (GPUs)"

.github/workflows/README.md

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@
  | Test PyTorch slow | .github/workflows/ci-pytorch-test-slow.yml | Run only slow tests. Slow tests usually need to spawn threads and cannot be speed up or simplified. | CPU |
  | pytorch-lightning (IPUs) | .azure-pipelines/ipu-tests.yml | Run only IPU-specific tests. | IPU |
  | pytorch-lightning (HPUs) | .azure-pipelines/hpu-tests.yml | Run only HPU-specific tests. | HPU |
- | pytorch-lightning (GPUs) | .azure-pipelines/gpu-tests.yml | Run all CPU and GPU-specific tests, standalone, and examples. Each standalone test needs to be run in separate processes to avoid unwanted interactions between test cases. | GPU |
+ | pytorch-lightning (GPUs) | .azure-pipelines/gpu-tests-pytorch.yml | Run all CPU and GPU-specific tests, standalone, and examples. Each standalone test needs to be run in separate processes to avoid unwanted interactions between test cases. | GPU |
  | PyTorchLightning.Benchmark | .azure-pipelines/gpu-benchmark.yml | Run speed/memory benchmarks for parity with pure PyTorch. | GPU |
  | test-on-tpus | .github/workflows/ci-pytorch-test-tpu.yml | Run only TPU-specific tests. | TPU |

.github/workflows/ci-pytorch-test-slow.yml

Lines changed: 1 addition & 1 deletion
@@ -29,7 +29,7 @@ jobs:
  fail-fast: false
  matrix:
  os: [ubuntu-20.04, windows-2022, macOS-11]
- # same config as '.azure-pipelines/gpu-tests.yml'
+ # same config as '.azure-pipelines/gpu-tests-pytorch.yml'
  python-version: ["3.7"]
  pytorch-version: ["1.11"]

.github/workflows/code-checks.yml

Lines changed: 5 additions & 3 deletions
@@ -18,7 +18,7 @@ jobs:
  - uses: actions/setup-python@v4
  with:
- python-version: 3.9
+ python-version: '3.10.6'

  # Note: This uses an internal pip API and may not always work
  # https://github.com/actions/cache/blob/master/examples.md#multiple-oss-in-a-workflow
@@ -35,8 +35,10 @@ jobs:
  pip install torch==1.12 --find-links https://download.pytorch.org/whl/cpu/torch_stable.html
  python ./requirements/pytorch/adjust-versions.py requirements/pytorch/extra.txt
  # todo: adjust requirements for both code-bases
- pip install -r requirements/pytorch/devel.txt --find-links https://download.pytorch.org/whl/cpu/torch_stable.html
+ pip install -r requirements/pytorch/devel.txt -r requirements/app/devel.txt -r requirements/lite/devel.txt --find-links https://download.pytorch.org/whl/cpu/torch_stable.html
  pip list

  - name: Check typing
- run: mypy
+ run: |
+ mkdir .mypy_cache
+ mypy

dockers/base-cuda/Dockerfile

Lines changed: 30 additions & 6 deletions
@@ -54,9 +54,10 @@ RUN \
  libopenmpi-dev \
  openmpi-bin \
  ssh \
+ ninja-build \
  libnccl2=$TO_INSTALL_NCCL \
  libnccl-dev=$TO_INSTALL_NCCL && \
- # Install python
+ # Install python
  add-apt-repository ppa:deadsnakes/ppa && \
  apt-get install -y \
  python${PYTHON_VERSION} \
@@ -65,7 +66,7 @@ RUN \
  && \
  update-alternatives --install /usr/bin/python${PYTHON_VERSION%%.*} python${PYTHON_VERSION%%.*} /usr/bin/python${PYTHON_VERSION} 1 && \
  update-alternatives --install /usr/bin/python python /usr/bin/python${PYTHON_VERSION} 1 && \
- # Cleaning
+ # Cleaning
  apt-get autoremove -y && \
  apt-get clean && \
  rm -rf /root/.cache && \
@@ -82,14 +83,15 @@ RUN \
  rm get-pip.py && \
  pip install -q fire && \
  # Disable cache \
- CUDA_VERSION_MM=$(python -c "print(''.join('$CUDA_VERSION'.split('.')[:2]))") && \
+ export CUDA_VERSION_MM=$(python -c "print(''.join('$CUDA_VERSION'.split('.')[:2]))") && \
  pip config set global.cache-dir false && \
  # set particular PyTorch version
  python ./requirements/pytorch/adjust-versions.py requirements/pytorch/base.txt ${PYTORCH_VERSION} && \
  python ./requirements/pytorch/adjust-versions.py requirements/pytorch/extra.txt ${PYTORCH_VERSION} && \
  python ./requirements/pytorch/adjust-versions.py requirements/pytorch/examples.txt ${PYTORCH_VERSION} && \
- # Install all requirements \
- pip install -r requirements/pytorch/devel.txt --no-cache-dir --find-links https://download.pytorch.org/whl/cu${CUDA_VERSION_MM}/torch_stable.html && \
+
+ # Install base requirements \
+ pip install -r requirements/pytorch/base.txt --no-cache-dir --find-links https://download.pytorch.org/whl/cu${CUDA_VERSION_MM}/torch_stable.html && \
  rm assistant.py

  ENV \
@@ -108,7 +110,7 @@ RUN \
  export HOROVOD_BUILD_CUDA_CC_LIST=${HOROVOD_BUILD_CUDA_CC_LIST//"."/""} && \
  echo $HOROVOD_BUILD_CUDA_CC_LIST && \
  cmake --version && \
- pip install --no-cache-dir -r ./requirements/pytorch/strategies.txt && \
+ pip install --no-cache-dir horovod && \
  horovodrun --check-build

  RUN \
@@ -136,6 +138,28 @@ RUN \
  if [[ "$CUDA_VERSION_MM" = "$CUDA_VERSION_BAGUA" ]]; then python -c "import bagua_core; bagua_core.install_deps()"; fi && \
  python -c "import bagua; print(bagua.__version__)"

+ RUN \
+ # install ColossalAI
+ SHOULD_INSTALL_COLOSSAL=$(python -c "import torch; print(1 if int(torch.__version__.split('.')[1]) > 9 else 0)") && \
+ if [[ "$SHOULD_INSTALL_COLOSSAL" = "1" ]]; then \
+ PYTORCH_VERSION_COLOSSALAI=$(python -c "import torch; print(torch.__version__.split('+')[0][:4])") ; \
+ CUDA_VERSION_MM_COLOSSALAI=$(python -c "import torch ; print(''.join(map(str, torch.version.cuda)))") ; \
+ CUDA_VERSION_COLOSSALAI=$(python -c "print([ver for ver in [11.3, 11.1] if $CUDA_VERSION_MM_COLOSSALAI >= ver][0])") ; \
+ pip install "colossalai==0.1.10+torch${PYTORCH_VERSION_COLOSSALAI}cu${CUDA_VERSION_COLOSSALAI}" --find-links https://release.colossalai.org ; \
+ python -c "import colossalai; print(colossalai.__version__)" ; \
+ fi
+
+ RUN \
+ # install rest of strategies
+ # remove colossalai from requirements since they are installed separately
+ SHOULD_INSTALL_COLOSSAL=$(python -c "import torch; print(1 if int(torch.__version__.split('.')[1]) > 9 else 0)") && \
+ if [[ "$SHOULD_INSTALL_COLOSSAL" = "0" ]]; then \
+ python -c "fname = 'requirements/pytorch/strategies.txt' ; lines = [line for line in open(fname).readlines() if 'colossalai' not in line] ; open(fname, 'w').writelines(lines)" ; \
+ fi && \
+ echo "$SHOULD_INSTALL_COLOSSAL" && \
+ cat requirements/pytorch/strategies.txt && \
+ pip install -r requirements/pytorch/devel.txt -r requirements/pytorch/strategies.txt --no-cache-dir --find-links https://download.pytorch.org/whl/cu${CUDA_VERSION_MM}/torch_stable.html
+
  COPY requirements/pytorch/check-avail-extras.py check-avail-extras.py
  COPY requirements/pytorch/check-avail-strategies.py check-avail-strategies.py
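
Note on the ColossalAI gate: both new `RUN` steps decide from the torch minor version whether ColossalAI is installed, and the second one strips `colossalai` from `strategies.txt` when it is not. A minimal Python sketch of that logic follows, assuming helper names (`should_install_colossal`, `drop_requirement`) that do not exist in the repository and made-up file contents.

```python
from typing import List


def should_install_colossal(torch_version: str) -> bool:
    """Mirror the Dockerfile gate: install only when the torch minor version exceeds 9."""
    return int(torch_version.split(".")[1]) > 9


def drop_requirement(lines: List[str], name: str) -> List[str]:
    """Return the requirement lines that do not mention `name`."""
    return [line for line in lines if name not in line]


# Illustrative file contents only; the real strategies.txt is not shown in this diff.
requirements = ["colossalai\n", "horovod\n", "bagua\n"]

for version in ("1.9.1", "1.12.1+cu116"):
    if should_install_colossal(version):
        kept = requirements
    else:
        kept = drop_requirement(requirements, "colossalai")
    print(version, [line.strip() for line in kept])
```

The gate inspects only the minor component, so it implicitly assumes a 1.x torch release; a version string such as `2.0.0` would have minor version 0 and be treated as too old.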

dockers/release/Dockerfile

Lines changed: 5 additions & 1 deletion
@@ -41,7 +41,11 @@ RUN \
  fi && \
  # otherwise there is collision with folder name ans pkg name on Pypi
  cd lightning && \
- pip install .["extra","loggers","strategies"] --no-cache-dir && \
+ SHOULD_INSTALL_COLOSSAL=$(python -c "import torch; print(1 if int(torch.__version__.split('.')[1]) > 9 else 0)") && \
+ if [[ "$SHOULD_INSTALL_COLOSSAL" = "0" ]]; then \
+ python -c "fname = 'requirements/pytorch/strategies.txt' ; lines = [line for line in open(fname).readlines() if 'colossalai' not in line] ; open(fname, 'w').writelines(lines)" ; \
+ fi && \
+ pip install .["extra","loggers","strategies"] --no-cache-dir --find-links https://release.colossalai.org && \
  cd .. && \
  rm -rf lightning

docs/source-app/glossary/secrets.rst

Lines changed: 35 additions & 62 deletions
@@ -4,55 +4,56 @@
  Encrypted Secrets
  #################

- Is your App using data or values (for example: API keys or access credentials) that you don't want to expose in your App code? If the answer is yes, you'll want to use Secrets. Secrets are encrypted values that are stored in the Lightning.ai database and are decrypted at runtime.
+ Encrypted Secrets allow you to pass private data to your apps, like API keys, access tokens, database passwords, or other credentials, in a secure way without exposing them in your code.
+ Secrets provide you with a secure way to store this data in a way that is accessible to Apps so that they can authenticate third-party services/solutions.

  .. tip::
  For non-sensitive configuration values, we recommend using :ref:`plain-text Environment Variables <environment_variables>`.

- ***************
- What did we do?
- ***************
+ ************
+ Add a secret
+ ************

- When a Lightning App (App) **runs in the cloud**, a Secret can be exposed to the App using environment variables.
- The value of the Secret is encrypted in the Lightning.ai database, and is only decrypted and accessible to
- LightningFlow (Flow) or LightningWork (Work) processes in the cloud (when you use the ``--cloud`` option running your App).
+ Add the secret to your profile on lightning.ai.
+ Log in to your lightning.ai account > **Profile** > **Secrets** tab > click the **+New** button.
+ Provide a name and value to your secret, for example, name could be "github_api_token".

- ----
-
- **********************
- What were we thinking?
- **********************
+ .. note::
+ Secret names must start with a letter and can only contain letters, numbers, dashes, and periods. The Secret names must comply with `RFC1123 naming conventions <https://www.rfc-editor.org/rfc/rfc1123>`_. The Secret value has no restrictions.

- Many Apps require access to private data like API keys, access tokens, database passwords, or other credentials. You need to protect this data.
- We developed this feature to provide you with a secure way to store this data in a way that is accessible to Apps so that they can authenticate third-party services/solutions.
+ .. raw:: html

- ----
+ <br />
+ <video id="background-video" autoplay loop muted controls poster="https://pl-flash-data.s3.amazonaws.com/assets_lightning/docs/images/storage/encrypted_secrets_login.png" width="100%">
+ <source src="https://pl-flash-data.s3.amazonaws.com/assets_lightning/docs/images/storage/encrypted_secrets_login.mp4" type="video/mp4" width="100%">
+ </video>
+ <br />
+ <br />

- *********************
- Use Encrypted Secrets
- *********************
+ ************
+ Use a secret
+ ************

- To use Encrypted Secrets:
+ 1. Add an environment variable to your app to read the secret. For example, add an "api_token" environment variable:

- #. Log in to your lightning.ai account, go to **Secrets**, and create the Secret (provide a name and value for the secret).
+ .. code:: python

- .. note:: Once you create a Secret, you can bind it to any of your Apps. You do not need to create a new Secret for each App if the Secret value is the same.
+ import os

- #. Prepare an environment variable to use with the Secret in your App.
+ component.connect(api_token=os.environ["api_token"])

- #. Use the following command to add the Secret to your App:
+ 2. Pass the secret to your app run with the following command:

  .. code:: bash

  lightning run app app.py --cloud --secret <environment-variable>=<secret-name>

- The environment variables are available in all Flows and Works, and can be accessed as follows:
+ In this example, the command would be:

- .. code:: python
+ .. code:: bash

- import os
+ lightning run app app.py --cloud --secret api_token=github_api_token

- print(os.environ["<environment-variable>"])

  The ``--secret`` option can be used for multiple Secrets, and alongside the ``--env`` option.
@@ -62,41 +63,13 @@ Here's an example:
  lightning run app app.py --cloud --env FOO=bar --secret MY_APP_SECRET=my-secret --secret ANOTHER_SECRET=another-secret

- ----
-
- Example
- ^^^^^^^

- The best way to show you how to use Encrypted Secrets is with an example.
-
- First, log in to your `lightning.ai account <https://lightning.ai/>`_ and create a Secret.
-
- .. raw:: html
-
- <br />
- <video id="background-video" autoplay loop muted controls poster="https://pl-flash-data.s3.amazonaws.com/assets_lightning/docs/images/storage/encrypted_secrets_login.png" width="100%">
- <source src="https://pl-flash-data.s3.amazonaws.com/assets_lightning/docs/images/storage/encrypted_secrets_login.mp4" type="video/mp4" width="100%">
- </video>
- <br />
- <br />
-
- .. note::
- Secret names must start with a letter and can only contain letters, numbers, dashes, and periods. The Secret names must comply with `RFC1123 naming conventions <https://www.rfc-editor.org/rfc/rfc1123>`_. The Secret value has no restrictions.
-
- After creating a Secret named ``my-secret`` with the value ``some-secret-value`` we'll bind it to the environment variable ``MY_APP_SECRET`` within our App. The binding is accomplished by using the ``--secret`` option when running the App from the Lightning CLI.
-
- The ``--secret``` option works similar to ``--env``, but instead of providing a value, you provide the name of the Secret that is replaced with with the value that you want to bind to the environment variable:
-
- .. code:: bash
-
- lightning run app app.py --cloud --secret MY_APP_SECRET=my-secret
-
- The environment variables are available in all Flows and Works, and can be accessed as follows:
-
- .. code:: python
-
- import os
+ ----

- print(os.environ["MY_APP_SECRET"])
+ ******************
+ How does this work
+ ******************

- This code prints out ``some-secret-value``.
+ When a Lightning App (App) **runs in the cloud**, a Secret can be exposed to the App using environment variables.
+ The value of the Secret is encrypted in the Lightning.ai database, and is only decrypted and accessible to
+ LightningFlow (Flow) or LightningWork (Work) processes in the cloud (when you use the ``--cloud`` option running your App).
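
Note on reading a bound Secret: the rewritten page binds a Secret to an environment variable with `--secret` and reads it through `os.environ`. A minimal sketch of a defensive read follows, assuming a hypothetical helper `read_secret` that is not part of the Lightning API. It only wraps the standard-library lookup so that a missing binding fails with a readable message instead of a bare `KeyError`.

```python
import os


def read_secret(name: str) -> str:
    """Return the value bound to `name`, or fail with a readable message (hypothetical helper)."""
    value = os.environ.get(name)
    if value is None:
        raise RuntimeError(
            f"Environment variable {name!r} is not set; "
            f"pass it with --secret {name}=<secret-name> when running in the cloud."
        )
    return value


# Local stand-in so the sketch runs outside the cloud; in a cloud run the value
# would come from the --secret binding instead.
os.environ.setdefault("api_token", "placeholder-value")
token = read_secret("api_token")
# Report only the length so the secret value itself is never printed.
print(f"api_token is set ({len(token)} characters)")
```

The sketch deliberately avoids printing the value, since logging a secret would defeat the purpose of storing it encrypted.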

docs/source-app/glossary/storage/drive.rst

Lines changed: 1 addition & 1 deletion
@@ -10,4 +10,4 @@ Drive Storage
  ----

- .. include:: ../../glossary/storage/drive_content.rst
+ .. include:: ../../glossary/storage/drive_content_old.rst

docs/source-app/glossary/storage/drive_content.rst

Lines changed: 1 addition & 0 deletions
@@ -1,3 +1,4 @@
+ :orphan:

  **************************
  What are Lightning Drives?
