This repository contains two major projects that work together to deploy and serve Large Language Models (LLMs) on AWS SageMaker.
The infrastructure component is a CDK-based project that creates and manages the AWS resources, including:
- Route 53: DNS management with SSL certificates
- ECS Cluster: Container orchestration for the Model API service
- DynamoDB: Storage for API keys
- SageMaker Endpoints: Hosting for deployed LLMs
- CloudWatch: Monitoring and logging
An Express.js application that serves as middleware between clients and LLMs deployed to SageMaker endpoints:
- OpenAI-Compatible API: Drop-in replacement for OpenAI API clients
- REST API: Endpoints for model inference, management, and configuration
- Request Transformation: Converts API requests to SageMaker format
- Response Processing: Processes and formats model responses
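The request-transformation step can be pictured with a small sketch. This is illustrative only: the actual payload shape depends on the model container serving the endpoint, and the field names below (`inputs`, `parameters`, `max_new_tokens`) are assumptions, not the service's real schema.

```python
# Hypothetical sketch of the OpenAI-to-SageMaker request mapping performed by
# the middleware. Field names are illustrative; the real payload depends on the
# container serving the endpoint.

def to_sagemaker_payload(openai_request: dict) -> dict:
    """Map an OpenAI chat-completion request to a generic inference payload."""
    return {
        "inputs": [
            {"role": m["role"], "content": m["content"]}
            for m in openai_request["messages"]
        ],
        "parameters": {
            "temperature": openai_request.get("temperature", 1.0),
            "top_p": openai_request.get("top_p", 1.0),
            "max_new_tokens": openai_request.get("max_tokens", 256),
        },
    }
```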
- infra/ - CDK-based infrastructure project that creates AWS resources including Route 53, DynamoDB, the ECS cluster for the Model API service, SageMaker endpoints, and related components.
- model-api-app/ - Express.js application that serves LLMs deployed to SageMaker endpoints. This REST API is OpenAI-compatible and handles the transformation between API requests and the SageMaker format.
- AWS CLI
- AWS CDK CLI
- Docker
- Node.js 20.x or later
- AWS Account and credentials configured
- Increased service quota for the SageMaker endpoint instance type (`ml.g6.12xlarge`)
- Prepare the environment variables:

```shell
cp .env.example .env
```
- Update the environment variables:

```shell
APP_NAME="llm-sagemaker"
HUGGINGFACE_TOKEN="your-huggingface-token"
VPC_ID="your-vpc-id"
SUBNET_TYPE="PRIVATE" # PUBLIC or PRIVATE
DOMAIN_NAME="your-domain-name.com"
```
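Before deploying, it can help to confirm that every variable the stack expects is present in `.env`. A minimal sketch (the variable list mirrors `.env.example` above; `check_env_vars` is a hypothetical helper, not part of the repository):

```shell
# Hypothetical helper: fail fast if the given env file is missing a required variable.
check_env_vars() {
  file="$1"
  for var in APP_NAME HUGGINGFACE_TOKEN VPC_ID SUBNET_TYPE DOMAIN_NAME; do
    grep -q "^${var}=" "$file" || { echo "missing: $var"; return 1; }
  done
  echo "ok"
}

# Usage: check_env_vars .env
```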
- Navigate to the infrastructure directory:

```shell
cd infra
```

- Install dependencies:

```shell
npm install
```

- Deploy to AWS:

```shell
npm run deploy:bootstrap
```
During deployment, an API key is generated and stored in the DynamoDB table; it is also printed to the console. Copy the API key and save it for later use.
The Swagger UI is available at https://genai.<your-domain-name>/api-docs/
The API is OpenAI-compatible, so you can use it as a drop-in replacement for OpenAI API clients. There are a few examples under the `examples/` folder. For example:
```shell
pip install openai
```
```python
import base64

from openai import OpenAI

client = OpenAI(
    base_url="https://genai.<your-domain-name>/v1",
    api_key="your-api-key",
)

if __name__ == "__main__":
    image_path = "../ingredients.png"
    model = "llama3-2-11b"
    instruction = "What are the ingredients in this image?"

    # Read the image and encode it as a base64 data URL.
    with open(image_path, "rb") as image_file:
        image_buffer = image_file.read()
    base64_image = base64.b64encode(image_buffer).decode("utf-8")

    completion = client.chat.completions.create(
        model=model,
        temperature=0,
        top_p=0.90,
        max_tokens=1024,
        messages=[
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": instruction},
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": f"data:image/png;base64,{base64_image}",
                            "detail": "auto",
                        },
                    },
                ],
            },
        ],
    )
    print(completion.choices[0].message.content)
    print(completion.usage)
```
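For plain text prompts the request body is simpler. A minimal sketch using the same model name and client configuration as the image example (the `build_chat_request` helper is illustrative, not part of the repository):

```python
# Illustrative helper: assemble the arguments for a text-only chat completion.

def build_chat_request(model: str, prompt: str, max_tokens: int = 1024) -> dict:
    """Return keyword arguments for client.chat.completions.create()."""
    return {
        "model": model,
        "temperature": 0,
        "top_p": 0.90,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }

# With the client configured as in the image example:
# completion = client.chat.completions.create(
#     **build_chat_request("llama3-2-11b", "Summarize the deployment steps.")
# )
# print(completion.choices[0].message.content)
```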
- Navigate to the model API directory:

```shell
cd model-api-app
```

- Install dependencies:

```shell
npm install
```

- Start the API server:

```shell
npm run dev
```
The Model API provides an OpenAI-compatible interface for interacting with LLMs deployed to SageMaker.
Full API documentation is available via Swagger UI at https://genai.<your-domain-name>/api-docs/