---
title: Getting Started
weight: 10
aliases: /getting-started/
---

:toc:
:imagesdir: /images
:_content-type: ASSEMBLY
include::modules/comm-attributes.adoc[]

[id="installing-rag-llm-azure-pattern"]
== Installing the RAG-LLM GitOps Pattern on Microsoft Azure

.Prerequisites

* You are logged in to an existing Red Hat OpenShift cluster on Microsoft Azure with administrative privileges.
* Your Azure subscription has the GPU quota required to provision the compute resources for the vLLM inference service. The default VM size is `Standard_NC8as_T4_v3`, which requires at least 8 vCPUs.
* A Hugging Face token.
* A database server:
** Microsoft SQL Server: the default vector database for deploying the RAG-LLM pattern on Azure.
** (Optional) Local databases: you can instead deploy Redis, PostgreSQL (EDB), or Elasticsearch (ELASTIC) directly within your cluster. If you choose a local database, ensure that it is provisioned and accessible before deployment.

[IMPORTANT]
====
* To select your database type, edit the `overrides/values-Azure.yaml` file:
+
[source,yaml]
----
global:
  db:
    type: "MSSQL" # Options: MSSQL, AZURESQL, REDIS, EDB, ELASTIC
----

* When you choose local database instances such as Redis, PostgreSQL, or Elasticsearch, ensure that your cluster has sufficient resources available.
====

[id="overview-of-the-installation-workflow_{context}"]
== Overview of the installation workflow

To install the RAG-LLM GitOps Pattern on Microsoft Azure, complete the following setup and configuration tasks:

* xref:creating-huggingface-token_{context}[Create a Hugging Face token]
* xref:creating-secret-credentials_{context}[Create the required secrets]
* xref:provisioning-gpu-nodes_{context}[Provision GPU nodes]
* xref:deploy-rag-llm-azure-pattern_{context}[Deploy the RAG-LLM GitOps Pattern on Microsoft Azure]

[id="creating-huggingface-token_{context}"]
=== Creating a Hugging Face token

.Procedure

. To obtain a Hugging Face token, navigate to the link:https://huggingface.co/settings/tokens[Hugging Face] site.
. Log in to your account.
. Go to *Settings* -> *Access Tokens*.
. Create a new token with the appropriate permissions. Ensure that you accept the terms of the specific model you plan to use, as required by Hugging Face, for example `Mistral-7B-Instruct-v0.3-AWQ`.
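Later steps read the token from your secrets file, so it can help to keep it in an environment variable while you work through this guide. A minimal sketch; `HF_TOKEN` is an arbitrary variable name used only here, not one the pattern requires, and the placeholder value must be replaced with your real token:

```shell
# HF_TOKEN is a hypothetical variable name used only in this guide;
# replace the placeholder with the token you created above.
export HF_TOKEN="hf_your_huggingface_token"

# Hugging Face user access tokens start with the "hf_" prefix;
# a quick format sanity check:
case "$HF_TOKEN" in
  hf_*) echo "token format looks plausible" ;;
  *)    echo "unexpected token format" ;;
esac
```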

[id="creating-secret-credentials_{context}"]
=== Creating secret credentials

To securely store your sensitive credentials, create a YAML file named `~/values-secret-rag-llm-gitops.yaml`. This file is used during the pattern deployment; however, you must not commit it to your Git repository.

[source,yaml]
----
# ~/values-secret-rag-llm-gitops.yaml
# Replace placeholders with your actual credentials
version: "2.0"

secrets:
  - name: hfmodel
    fields:
      - name: hftoken <1>
        value: <hf_your_huggingface_token>
  - name: mssql
    fields:
      - name: sa-pass <2>
        value: <password_for_sa_user>
----
<1> Specify your Hugging Face token.
<2> Specify the system administrator password for the MS SQL Server instance.
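The listing above can be created directly from a terminal. The following sketch writes the same structure with the placeholder values, which you must replace before deploying, and restricts the file's permissions because it holds credentials:

```shell
# Write the secrets file described above; replace the placeholder values
# before deploying. The <1>/<2> callouts in the listing are documentation
# markers and are not part of the file itself.
cat > "$HOME/values-secret-rag-llm-gitops.yaml" <<'EOF'
version: "2.0"

secrets:
  - name: hfmodel
    fields:
      - name: hftoken
        value: hf_your_huggingface_token
  - name: mssql
    fields:
      - name: sa-pass
        value: password_for_sa_user
EOF

# The file contains credentials: make it readable by the current user only.
chmod 600 "$HOME/values-secret-rag-llm-gitops.yaml"
```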

[id="provisioning-gpu-nodes_{context}"]
=== Provisioning GPU nodes

The vLLM inference service requires dedicated GPU nodes with a specific taint. You can provision these nodes by using one of the following methods:

Automatic provisioning:: The pattern can automatically provision GPU-enabled `MachineSet` resources.
+
Run the following command to create a single `Standard_NC8as_T4_v3` GPU node:
+
[source,terminal]
----
./pattern.sh make create-gpu-machineset-azure
----

Customizable method:: For environments that require more granular control, you can manually create a `MachineSet` with the necessary GPU instance types and apply the required taint.
+
To control the GPU node specifics, provide additional parameters:
+
[source,terminal]
----
./pattern.sh make create-gpu-machineset-azure GPU_REPLICAS=3 OVERRIDE_ZONE=2 GPU_VM_SIZE=Standard_NC16as_T4_v3
----
+
where:
+
* `GPU_REPLICAS` is the number of GPU nodes to provision.
* `OVERRIDE_ZONE` (optional) is the availability zone in which to place the nodes.
* `GPU_VM_SIZE` is the Azure VM SKU for the GPU nodes.
+
The script automatically applies the required taint. The NVIDIA GPU Operator that is installed by the pattern manages the CUDA driver installation on the GPU nodes.
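If you create the `MachineSet` manually, the taint is the piece that the automated script would otherwise add for you. The following is a sketch of the relevant fragment, assuming a taint key of `nvidia.com/gpu`; confirm the exact key, value, and effect against a `MachineSet` generated by `./pattern.sh make create-gpu-machineset-azure` before relying on it:

```yaml
# Fragment of a manually created MachineSet (a sketch, not a complete resource).
# The taint key and value below are assumptions; verify them against a
# script-generated MachineSet for your cluster.
apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
spec:
  template:
    spec:
      taints:
        - key: nvidia.com/gpu
          value: "true"
          effect: NoSchedule
      providerSpec:
        value:
          vmSize: Standard_NC8as_T4_v3   # or another GPU-capable Azure SKU
```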

[id="deploy-rag-llm-azure-pattern_{context}"]
=== Deploying the RAG-LLM GitOps Pattern

To deploy the RAG-LLM GitOps Pattern to your Azure Red Hat OpenShift (ARO) cluster, run the following command:

[source,terminal]
----
./pattern.sh make install
----

This command initiates the GitOps-driven deployment process, which installs and configures all RAG-LLM components on your ARO cluster based on the provided values and secrets.
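After the install command returns, the pattern's Argo CD applications continue to sync in the background. One way to watch progress, assuming you are still logged in to the cluster with `oc` (the Argo CD `Application` resources are cluster-scoped queryable across namespaces; namespace locations vary by deployment):

```shell
# List the Argo CD applications created by the pattern across all
# namespaces; defined as a function so it is easy to rerun while the
# deployment converges. Requires an active `oc` login to the cluster.
check_pattern_apps() {
  oc get applications.argoproj.io -A
}
# check_pattern_apps   # rerun until all applications report Synced/Healthy
```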