|
| 1 | +--- |
| 2 | +title: Getting Started |
| 3 | +weight: 10 |
| 4 | +aliases: /getting-started/ |
| 5 | +--- |
| 6 | + |
| 7 | +:toc: |
| 8 | +:imagesdir: /images |
| 9 | +:_content-type: ASSEMBLY |
| 10 | +include::modules/comm-attributes.adoc[] |
| 11 | + |
| 12 | +[id="installing-rag-llm-azure-pattern"] |
| 13 | +== Installing the RAG-LLM GitOps Pattern on Microsoft Azure |
| 14 | + |
| 15 | +.Prerequisites |
| 16 | + |
| 17 | +* You are logged into an existing Azure Red Hat OpenShift (ARO) cluster with administrative privileges. |
| 18 | +* Your Azure subscription has enough GPU quota to provision the compute resources for the vLLM inference service. The default VM size is `Standard_NC8as_T4_v3`, which requires a quota of at least 8 vCPUs. |
| 19 | +* A Hugging Face token. |
| 20 | +* A database server: |
| 21 | + ** Azure SQL database server: the default vector database for deploying the RAG-LLM GitOps Pattern on Azure. |
| 22 | + ** (Optional) Local databases: you can instead deploy Redis, PostgreSQL (EDB), or Elasticsearch (ELASTIC) directly within your cluster. If you choose a local database, ensure that it is provisioned and accessible before deployment. |
| 23 | + |
| 24 | +[IMPORTANT] |
| 25 | +==== |
| 26 | +* To select your database type, edit the `overrides/values-Azure.yaml` file. |
| 27 | ++ |
| 28 | +[source,yaml] |
| 29 | +---- |
| 30 | +global: |
| 31 | + db: |
| 32 | + type: "AZURESQL" # Options: AZURESQL, REDIS, EDB, ELASTIC |
| 33 | +---- |
| 34 | + |
| 35 | + |
| 36 | +* When choosing local database instances such as Redis, PostgreSQL, or Elasticsearch, ensure that your cluster has sufficient resources available. |
| 37 | +==== |
| 38 | + |
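The GPU quota prerequisite can be checked from the Azure CLI before you start. The following is a minimal sketch, not part of the pattern: `show_gpu_quota` is a hypothetical helper name, and the `grep` filter assumes that the quota family for the default VM size contains the string `T4` in its name.

```shell
#!/bin/sh
# Hypothetical helper: list compute quota usage for a region and keep
# only the rows that mention T4 (assumed to match the default GPU VM family).
show_gpu_quota() {
  location="$1"   # Azure region of your ARO cluster, for example eastus
  az vm list-usage --location "$location" --output table | grep -i 'T4'
}
```

For example, run `show_gpu_quota eastus` after `az login` and compare the current value and limit against the 8 vCPUs that the default VM size needs.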
| 39 | +[id="overview-of-the-installation-workflow_{context}"] |
| 40 | +== Overview of the installation workflow |
| 41 | +To install the RAG-LLM GitOps Pattern on Microsoft Azure, complete the following setup and configuration tasks: |
| 42 | + |
| 43 | +* xref:creating-huggingface-token[Create a Hugging Face token] |
| 44 | +* xref:deploying-azure-sql-server[Deploy Azure SQL] |
| 45 | +* xref:creating-secret-credentials[Create required secrets] |
| 46 | +* xref:provisioning-gpu-nodes[Create GPU nodes] |
| 47 | +* xref:deploy-rag-llm-azure-pattern[Install the RAG-LLM GitOps Pattern on Microsoft Azure] |
| 48 | + |
| 49 | +[id="creating-huggingface-token_{context}"] |
| 50 | +=== Creating a Hugging Face token |
| 51 | +.Procedure |
| 52 | + |
| 53 | +. To obtain a Hugging Face token, navigate to the link:https://huggingface.co/settings/tokens[Hugging Face] site. |
| 54 | +. Log in to your account. |
| 55 | +. Go to your *Settings* -> *Access Tokens*. |
| 56 | +. Create a new token with appropriate permissions. Ensure that you accept the terms of the specific model that you plan to use, as required by Hugging Face, for example `Mistral-7B-Instruct-v0.3-AWQ`. |
| 57 | + |
| 58 | +[id="deploying-azure-sql-server_{context}"] |
| 59 | +=== Deploying Azure SQL Server |
| 60 | + |
| 61 | +.Procedure |
| 62 | + |
| 63 | +. Navigate to the Azure portal and create a new SQL Database server. |
| 64 | +. When prompted for authentication, select `Use SQL authentication`. |
| 65 | +. Record the generated *Server name*, *Server admin login*, and *Password*. You use these credentials later when you create the secrets. |
| 66 | +. On the *Networking* tab, ensure that *Allow Azure services and resources to access this server* is set to *Yes*. This setting allows your ARO cluster to connect to the database. |
| 67 | +. Click *Review + create*, and then click *Create*. |
| 68 | + |
| 69 | +Wait for the SQL Server deployment to complete and become active before proceeding. |
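If you prefer the command line to the portal, the same steps can be sketched with the Azure CLI. This is an illustration under assumptions rather than part of the pattern: `create_sql_server` is a hypothetical helper, and every argument is a placeholder that you supply.

```shell
#!/bin/sh
# Sketch only: CLI counterpart of the portal procedure above.
# Usage: create_sql_server <resource-group> <server-name> <region> <admin-user> <admin-password>
create_sql_server() {
  resource_group="$1"; server_name="$2"; location="$3"
  admin_user="$4"; admin_password="$5"

  # Create the logical SQL server with SQL authentication.
  az sql server create \
    --resource-group "$resource_group" \
    --name "$server_name" \
    --location "$location" \
    --admin-user "$admin_user" \
    --admin-password "$admin_password" || return 1

  # A firewall rule whose start and end address are both 0.0.0.0 corresponds
  # to "Allow Azure services and resources to access this server".
  az sql server firewall-rule create \
    --resource-group "$resource_group" \
    --server "$server_name" \
    --name AllowAllWindowsAzureIps \
    --start-ip-address 0.0.0.0 \
    --end-ip-address 0.0.0.0
}
```

Avoid typing the password literally on the command line where it can end up in shell history; reading it into a variable first is one option.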
| 70 | + |
| 71 | +[id="creating-secret-credentials_{context}"] |
| 72 | +=== Creating secret credentials |
| 73 | + |
| 74 | +To securely store your sensitive credentials, create a YAML file named `~/values-secret-rag-llm-gitops.yaml`. This file is used during the pattern deployment; however, you must not commit it to your Git repository. |
| 75 | + |
| 76 | +[IMPORTANT] |
| 77 | +==== |
| 78 | +If you are not using Azure SQL Server, omit the entire `azuresql` section. |
| 79 | +==== |
| 80 | + |
| 81 | +[source,yaml] |
| 82 | +---- |
| 83 | +# ~/values-secret-rag-llm-gitops.yaml |
| 84 | +# Replace placeholders with your actual credentials |
| 85 | +version: "2.0" |
| 86 | + |
| 87 | +secrets: |
| 88 | + - name: hfmodel |
| 89 | + fields: |
| 90 | + - name: hftoken <1> |
| 91 | + value: <hf_your_huggingface_token> |
| 92 | + - name: azuresql |
| 93 | + fields: |
| 94 | + - name: user <2> |
| 95 | + value: <adminuser> |
| 96 | + - name: password <3> |
| 97 | + value: <your_password> |
| 98 | + - name: server <4> |
| 99 | + value: <yourservername.database.windows.net> |
| 100 | +---- |
| 101 | +<1> Specify your Hugging Face token. |
| 102 | +<2> Specify the username for your Azure SQL server. |
| 103 | +<3> Specify the password for your Azure SQL server. |
| 104 | +<4> Specify the fully qualified domain name of your Azure SQL server. |
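Because the file holds credentials in plain text, it is reasonable to restrict its permissions to your own user. The following sketch shows one way to do that; `secure_secret_file` is a hypothetical helper and not something that the pattern provides.

```shell
#!/bin/sh
# Hypothetical helper: make the secrets file readable and writable
# by the current user only, and fail if the file does not exist.
secure_secret_file() {
  file="$1"
  if [ ! -f "$file" ]; then
    echo "secrets file not found: $file" >&2
    return 1
  fi
  chmod 600 "$file"
}
```

For example, run `secure_secret_file "$HOME/values-secret-rag-llm-gitops.yaml"` after you create the file.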
| 105 | + |
| 106 | +[id="provisioning-gpu-nodes_{context}"] |
| 107 | +=== Provisioning GPU nodes |
| 108 | + |
| 109 | +The vLLM inference service requires dedicated GPU nodes with a specific taint. You can provision these nodes by using one of the following methods: |
| 110 | + |
| 111 | +Automatic Provisioning:: The pattern includes capabilities to automatically provision GPU-enabled `MachineSet` resources. |
| 112 | ++ |
| 113 | +Run the following command to create a single Standard_NC8as_T4_v3 GPU node: |
| 114 | ++ |
| 115 | +[source,terminal] |
| 116 | +---- |
| 117 | +./pattern.sh make create-gpu-machineset-azure |
| 118 | +---- |
| 119 | + |
| 120 | +Customizable Method:: For environments that require more granular control, you can manually create a `MachineSet` with the necessary GPU instance types and apply the required taint. For more information about creating custom `MachineSet` resources for an ARO cluster, see link:https://docs.redhat.com/en/documentation/openshift_container_platform/4.19/html/machine_management/managing-compute-machines-with-the-machine-api#creating-machineset-azure[Creating a compute machine set on Azure]. |
| 121 | ++ |
| 122 | +To control GPU node specifics, provide additional parameters: |
| 123 | ++ |
| 124 | +[source,terminal] |
| 125 | +---- |
| 126 | +./pattern.sh make create-gpu-machineset-azure GPU_REPLICAS=3 OVERRIDE_ZONE=2 GPU_VM_SIZE=Standard_NC16as_T4_v3 |
| 127 | +---- |
| 128 | ++ |
| 129 | +where: |
| 130 | ++ |
| 131 | + - `GPU_REPLICAS` is the number of GPU nodes to provision. |
| 132 | ++ |
| 133 | + - (Optional) `OVERRIDE_ZONE` is the availability zone in which to create the GPU nodes. |
| 134 | ++ |
| 135 | + - `GPU_VM_SIZE` is the Azure VM SKU for the GPU nodes. |
| 136 | ++ |
| 137 | +The script automatically applies the required taint. The NVIDIA GPU Operator that is installed by the pattern manages the CUDA driver installation on GPU nodes. |
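After the machine set is created, you can confirm that the GPU nodes joined the cluster. This is a minimal sketch that assumes you are logged in with `oc`; `show_gpu_nodes` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: list machine sets, then list nodes with an extra
# column for the standard instance-type label so GPU VM sizes stand out.
show_gpu_nodes() {
  oc get machinesets.machine.openshift.io -n openshift-machine-api || return 1
  oc get nodes -L node.kubernetes.io/instance-type
}
```

New nodes can take several minutes to appear, so the node list might lag behind the machine set's desired replica count.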
| 138 | + |
| 139 | +[id="deploy-rag-llm-azure-pattern_{context}"] |
| 140 | +=== Deploying the RAG-LLM GitOps Pattern |
| 141 | + |
| 142 | +To deploy the RAG-LLM GitOps Pattern to your ARO cluster, run the following command: |
| 143 | + |
| 144 | +[source,terminal] |
| 145 | +---- |
| 146 | +./pattern.sh make install |
| 147 | +---- |
| 148 | + |
| 149 | +This command initiates the GitOps-driven deployment process, which installs and configures all RAG-LLM components on your ARO cluster based on the provided values and secrets. |
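To follow the rollout, you can list the Argo CD applications that the GitOps deployment creates. This sketch assumes that OpenShift GitOps is installed by the pattern and that its `Application` resources are visible to your user; `show_pattern_status` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: show the sync and health status of all
# Argo CD applications across namespaces.
show_pattern_status() {
  oc get applications.argoproj.io --all-namespaces
}
```

Rerun the helper periodically until the applications report as synced and healthy.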