ScalarLM - Advanced LLM Platform with Clean vLLM Integration

License: Apache-2.0 | Python 3.8+

ScalarLM is a fully open source, integrated LLM inference and training platform built on top of vLLM, Hugging Face, and Megatron-LM.

📋 Core Dependencies

ScalarLM is built on top of these core components:

  • vLLM - High-performance LLM inference engine
  • Megatron-LM - Training harness, distribution strategy
  • PyTorch - Deep learning framework
  • Transformers - Model implementations and utilities
  • FastAPI - API server framework

🚀 Quick Start

Prerequisites

  • Python 3.8+
  • PyTorch 2.0+
  • vLLM (installed automatically as part of the setup below)
  • CUDA 11.8+ (optional but recommended, for GPU acceleration)

1. Installation

# Clone the repository
git clone https://github.com/scalarlm/scalarlm.git
cd scalarlm

# Start it
./scalarlm up
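
Once the server is up, you can sanity-check it from Python. This is a minimal sketch, assuming the OpenAI-compatible HTTP endpoint is exposed on localhost port 8000; adjust the host and port for your deployment:

# Quick smoke test against a local ScalarLM server.
# Assumption: the OpenAI-compatible endpoint is served on localhost:8000.
import requests

response = requests.get("http://localhost:8000/v1/models", timeout=10)
response.raise_for_status()

# List the model IDs the server reports as available.
for model in response.json().get("data", []):
    print("available model:", model.get("id"))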

📦 What's New in v1.0: Clean Architecture

ScalarLM has been completely redesigned with a clean architecture that eliminates the dependency-management issues of earlier releases (see the adapter sketch after the list below):

  • ✅ Zero coupling - vLLM has no knowledge of ScalarLM
  • ✅ External enhancement - ScalarLM adapters enhance vLLM models
  • ✅ Version independence - Use any vLLM version
  • ✅ Clean separation - Both systems evolve independently
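
The bullets above describe an adapter pattern: ScalarLM composes around vLLM's public API from the outside instead of patching vLLM internals. The sketch below illustrates the idea only; TokenformerAdapter is a hypothetical name, not ScalarLM's actual class, and the real adapters live in this repository rather than in vLLM:

# Illustrative only: an external adapter wrapping an unmodified vLLM model.
# vLLM never imports ScalarLM; ScalarLM layers behavior on top of vLLM's public API.
from vllm import LLM, SamplingParams


class TokenformerAdapter:
    """Hypothetical ScalarLM-side wrapper around a stock vLLM LLM instance."""

    def __init__(self, model_name: str):
        # The vLLM engine is created through its normal public constructor.
        self.engine = LLM(model=model_name)

    def generate(self, prompts, **sampling_kwargs):
        # ScalarLM-specific behavior (e.g. loading trained adapter weights)
        # would be added here, entirely outside vLLM's code base.
        params = SamplingParams(**sampling_kwargs)
        return self.engine.generate(prompts, params)


if __name__ == "__main__":
    adapter = TokenformerAdapter("meta-llama/Llama-2-7b-hf")
    outputs = adapter.generate(["Hello, world"], max_tokens=32)
    print(outputs[0].outputs[0].text)

Because the wrapper only touches vLLM's public interface, upgrading vLLM does not require any changes inside vLLM itself.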

πŸƒβ€β™‚οΈ Running ScalarLM

Quick Start with scalarlm CLI

# Start ScalarLM server (simplest way)
./scalarlm up

# View available commands
./scalarlm --help

Available CLI Commands

./scalarlm up              # Start ScalarLM server
./scalarlm benchmark       # Run performance benchmarks
./scalarlm llm-logs        # View LLM logs
./scalarlm llm-ls          # List available models
./scalarlm llm-plot        # Plot training metrics
./scalarlm llm-squeue      # View training queue status
./scalarlm test            # Run tests
./scalarlm build-image     # Build Docker image

🐳 Docker Support

Prebuilt Containers

Target             Container                             Latest Release
NVIDIA BLACKWELL   gdiamos/scalarlm-nvidia-12.0:latest   gdiamos/scalarlm-nvidia-12.0:v0.99
NVIDIA HOPPER      gdiamos/scalarlm-nvidia-8.0:latest    gdiamos/scalarlm-nvidia-8.0:v0.99
NVIDIA HOPPER      gdiamos/scalarlm-nvidia-8.6:latest    gdiamos/scalarlm-nvidia-8.6:v0.99
NVIDIA ADA         gdiamos/scalarlm-nvidia-7.5:latest    gdiamos/scalarlm-nvidia-7.5:v0.99
ARM                gdiamos/scalarlm-arm:latest           gdiamos/scalarlm-arm:v0.99
AMD                gdiamos/scalarlm-amd:latest           gdiamos/scalarlm-amd:v0.99
x86                gdiamos/scalarlm-cpu:latest           gdiamos/scalarlm-cpu:v0.99

Quick Docker Start

# Use the ./scalarlm up command with a hardware target
./scalarlm up cpu        # CPU version
./scalarlm up nvidia     # NVIDIA GPU version
./scalarlm up amd        # AMD GPU version

⚙️ Configuration

Environment Variables

# Core Settings
export SCALARLM_MODEL="meta-llama/Llama-2-7b-hf"  # Default model

# Performance Settings
export SCALARLM_GPU_MEMORY_UTILIZATION="0.9"     # GPU memory usage
export SCALARLM_MAX_MODEL_LENGTH="2048"          # Maximum model length

Configuration Files

ScalarLM looks for configuration in these locations (in order):

  1. /app/cray/cray-config.yaml - Local project config (in the container)

Example cray-config.yaml:

model: meta-llama/Llama-2-7b-hf
max_model_length: 2048

gpu_memory_utilization: 0.9
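
For reference, a config file like the one above can be combined with the SCALARLM_* environment variables shown earlier. The loader below is a minimal sketch, not ScalarLM's actual configuration code; it assumes PyYAML is installed and that environment variables take precedence over the file:

# Minimal sketch of merging cray-config.yaml with SCALARLM_* overrides.
# Illustrative only; ScalarLM's real loader and precedence rules may differ.
import os
import yaml

CONFIG_PATH = "/app/cray/cray-config.yaml"

def load_config(path: str = CONFIG_PATH) -> dict:
    # Built-in defaults matching the example configuration above.
    config = {
        "model": "meta-llama/Llama-2-7b-hf",
        "max_model_length": 2048,
        "gpu_memory_utilization": 0.9,
    }
    # Settings from the config file override the defaults.
    if os.path.exists(path):
        with open(path) as f:
            config.update(yaml.safe_load(f) or {})
    # Environment variables override the file.
    if "SCALARLM_MODEL" in os.environ:
        config["model"] = os.environ["SCALARLM_MODEL"]
    if "SCALARLM_MAX_MODEL_LENGTH" in os.environ:
        config["max_model_length"] = int(os.environ["SCALARLM_MAX_MODEL_LENGTH"])
    if "SCALARLM_GPU_MEMORY_UTILIZATION" in os.environ:
        config["gpu_memory_utilization"] = float(os.environ["SCALARLM_GPU_MEMORY_UTILIZATION"])
    return config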

📂 Project Structure

scalarlm/
├── tests/                           # Unit and integration tests
├── infra/                           # ScalarLM infrastructure
├── ml/                              # Training and ML components
├── deployment/                      # Deployment configurations
└── README.md                        # This file

📊 Features

Core Features

  • 🚀 High-performance inference via vLLM
  • 🎯 Advanced training with Megatron-LM integration
  • 🔌 OpenAI-compatible API for easy integration (see the client sketch after this list)
  • 📈 Distributed training capabilities
  • 🎛️ Tokenformer adapters for enhanced performance
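
Because the API is OpenAI-compatible, any standard OpenAI client should work against a running server. The snippet below is a minimal sketch using the official openai Python package; the base URL, port, and model name are assumptions, so adjust them for your deployment:

# Minimal OpenAI-compatible chat call against a local ScalarLM server.
# Assumptions: the server listens on localhost:8000 and serves this model.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="unused",  # local servers typically ignore the API key
)

completion = client.chat.completions.create(
    model="meta-llama/Llama-2-7b-hf",
    messages=[{"role": "user", "content": "Summarize what ScalarLM does in one sentence."}],
    max_tokens=64,
)
print(completion.choices[0].message.content)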

Clean Architecture Benefits

  • πŸ—οΈ Zero coupling between vLLM and ScalarLM
  • πŸ”„ Version independence - use any vLLM version
  • πŸ›‘οΈ Robust dependency management
  • πŸ”§ Easy maintenance and updates
  • πŸ“¦ Modern packaging with pyproject.toml

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Make your changes
  4. Run tests (make test integration-test)
  5. Commit your changes (git commit -m 'Add amazing feature')
  6. Push to the branch (git push origin feature/amazing-feature)
  7. Open a Pull Request

Development Guidelines

  • Follow the clean architecture principles
  • Maintain zero coupling between vLLM and ScalarLM
  • Add tests for new features
  • Update documentation as needed
  • Use the provided Makefile for development tasks

📚 Documentation

Getting Help

📄 License

ScalarLM is licensed under the CC-0 License. See LICENSE for details.

πŸ™ Acknowledgments

ScalarLM is inspired by the work of Seymour Roger Cray (1925-1996), "the father of supercomputing", who created the supercomputer industry and designed the fastest computers in the world for decades.

Ready to get started? Run ./scalarlm up to set up your development environment!
