Description
Epic: AI Extension Main Project Part
Overview
This epic focuses on implementing the main project interface layer for the AI extension, establishing the foundation for AI-powered SQL generation in atest. The implementation leverages atest's existing plugin architecture and provides a dual-layer gRPC interface that seamlessly integrates with the current system while enabling future AI plugin development.
Key Technical Focus: Main project proto extensions, HTTP API layer, and plugin communication infrastructure - NOT the AI plugin implementation itself.
Architecture Decisions
1. Dual Proto Architecture Pattern
- Decision: Implement AI interfaces in both `pkg/server/server.proto` (HTTP API) and `pkg/testing/remote/loader.proto` (plugin communication)
- Rationale: Maintains separation between the public REST API and internal plugin gRPC communication, following atest's established architectural patterns
 - Impact: Enables independent development of AI plugin while providing standardized API interface
 
2. Backward Compatible Message Extension
- Decision: Extend the existing `DataQuery`/`DataQueryResult` messages using field numbers 10+ for AI-specific fields
- Rationale: Preserves full compatibility with existing plugins and client code
 - Impact: Zero breaking changes to current functionality
 
3. Type-Based Query Routing
- Decision: Use `DataQuery.type = "ai"` to identify and route AI queries through existing infrastructure
- Rationale: Leverages the existing query routing mechanism without architectural changes
 - Impact: Minimal code changes in main project, maximum reuse of existing infrastructure
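The routing decision above can be sketched in Go. This is a minimal illustration, not atest's actual router: the `DataQuery` struct and handler names are hypothetical stand-ins for the generated proto types and real plugin addresses.

```go
package main

import "fmt"

// DataQuery mirrors only the fields relevant to routing; the real
// generated proto type has more fields (hypothetical struct).
type DataQuery struct {
	Type            string
	Sql             string
	NaturalLanguage string
}

// routeQuery dispatches a query by its type, reusing the existing
// type-based mechanism: "ai" goes to the AI plugin, everything else
// keeps its current store routing untouched.
func routeQuery(q *DataQuery) string {
	switch q.Type {
	case "ai":
		// Forwarded to the AI plugin over the loader gRPC channel.
		return "ai-plugin"
	default:
		// Existing behavior for all other store types.
		return "store-" + q.Type
	}
}

func main() {
	fmt.Println(routeQuery(&DataQuery{Type: "ai", NaturalLanguage: "list all users"}))
	fmt.Println(routeQuery(&DataQuery{Type: "mysql", Sql: "SELECT 1"}))
}
```

Because only the `"ai"` case is new, existing query types take exactly the code path they do today, which is what makes the change non-breaking.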
 
4. Plugin Lifecycle Integration
- Decision: AI plugin follows standard atest plugin lifecycle (ExtManager, health checks, socket communication)
 - Rationale: Consistency with existing plugin ecosystem, proven reliability
 - Impact: Standard deployment, monitoring, and management capabilities
 
5. Implementation Approach Selection
- Options Available:
- Basic Implementation: Core AI functionality with minimal proto extensions
 - Enhanced Implementation: Enterprise-grade features with comprehensive gRPC patterns based on context7 best practices
 
 - Decision: Start with Enhanced Implementation approach for production-ready foundation
 - Rationale: Context7 analysis revealed critical gRPC patterns that ensure long-term maintainability and scalability
 - Impact: Higher initial effort but provides robust foundation for future expansion
 
Detailed Technical Specifications
Proto File Extensions
pkg/server/server.proto (HTTP API Layer)
Key additions following Google API design patterns:
```proto
service Runner {
    // AI SQL Generation - follows RESTful action pattern
    rpc GenerateSQL(GenerateSQLRequest) returns (GenerateSQLResponse) {
        option (google.api.http) = {
            post: "/api/v1/ai/sql:generate"
            body: "*"
        };
    }

    rpc GetAICapabilities(google.protobuf.Empty) returns (AICapabilitiesResponse) {
        option (google.api.http) = {
            get: "/api/v1/ai/capabilities"
        };
    }
}

// Enhanced message definitions with proper field numbering
message GenerateSQLRequest {
    string natural_language = 1;
    DatabaseTarget database_target = 2;
    GenerationOptions options = 3;
    map<string, string> context = 4;
    reserved 5 to 10;  // Future expansion
}
```

pkg/testing/remote/loader.proto (Plugin Communication)
Extends existing patterns:
```proto
// Extension to existing DataQuery message
message DataQuery {
    // Existing fields 1-5 unchanged

    // AI extensions starting from field 10
    string natural_language = 10;
    string database_type = 11;
    bool explain_query = 12;
    map<string, string> ai_context = 13;
    reserved 14 to 20;
}

message DataQueryResult {
    // Existing fields 1-3 unchanged

    // AI result information
    AIProcessingInfo ai_info = 10;
}
```

Message Design Principles
Based on Protocol Buffers and gRPC best practices:
- Field Numbering: Use 10+ for extensions, reserve ranges for future growth
 - Enum Definitions: Include UNSPECIFIED = 0 as first value
 - Backward Compatibility: All new fields optional, no modification of existing fields
 - Error Handling: Standard gRPC status codes with detailed error messages
 - Streaming Support: Design messages to support future streaming capabilities
 
Technical Approach
Frontend Components
Scope: Main project UI integration points only
- AI Trigger Button: Right-bottom corner floating button integrated into existing atest UI framework
 - Plugin Health Indicator: Visual feedback for AI plugin availability status
 - Error Boundaries: Graceful degradation when AI plugin is unavailable
 - API Integration Layer: HTTP client interface to new AI endpoints
 
Backend Services
Core Focus: Interface and routing implementation
- HTTP-to-gRPC Gateway: Convert REST API calls to internal gRPC plugin calls
 - Plugin Discovery: Extend ExtManager to handle AI plugin registration and health monitoring
- Query Router Enhancement: Modify existing query routing to handle `type="ai"` requests
- Message Transformation: Convert between HTTP API formats and internal plugin formats
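The message transformation step can be sketched as a small converter from the public API request to the internal plugin message. The structs below are hypothetical stand-ins for the generated proto types; field names follow the proto sketches in this epic.

```go
package main

import "fmt"

// GenerateSQLRequest stands in for the HTTP API message
// (hypothetical plain struct for illustration).
type GenerateSQLRequest struct {
	NaturalLanguage string
	DatabaseType    string
	Context         map[string]string
}

// DataQuery stands in for the extended loader.proto message.
type DataQuery struct {
	Type            string
	NaturalLanguage string
	DatabaseType    string
	AiContext       map[string]string
}

// toDataQuery converts the public API request into the internal plugin
// message, setting type="ai" so the existing routing picks it up.
func toDataQuery(req *GenerateSQLRequest) (*DataQuery, error) {
	if req.NaturalLanguage == "" {
		return nil, fmt.Errorf("natural_language must not be empty")
	}
	return &DataQuery{
		Type:            "ai",
		NaturalLanguage: req.NaturalLanguage,
		DatabaseType:    req.DatabaseType,
		AiContext:       req.Context,
	}, nil
}

func main() {
	q, err := toDataQuery(&GenerateSQLRequest{NaturalLanguage: "count users", DatabaseType: "mysql"})
	if err != nil {
		panic(err)
	}
	fmt.Println(q.Type, q.NaturalLanguage)
}
```

Keeping the conversion in one place means validation (here, the empty-input check) happens before anything crosses the plugin boundary.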
 
Infrastructure
Main Project Extensions:
- Proto Code Generation: Update build pipeline to regenerate gRPC code after proto changes
 - API Documentation: Extend OpenAPI specs with new AI endpoints
 - Health Check Integration: Include AI plugin status in overall system health
 - Configuration Schema: Extend stores.yaml schema to support AI plugin configuration
 
Implementation Strategy
Phase 0: Proto Interface Foundation (CRITICAL PATH - Week 1)
BLOCKING: Must complete before any other work
- Design Review: Finalize proto message definitions based on enhanced design documents
 - Proto Implementation: Add AI-specific methods and messages to both proto files
 - Code Generation: Regenerate all gRPC client/server code
 - Interface Validation: Unit tests for new message types and services
 
Phase 1: HTTP API Layer (Week 2)
- Runner Service Extension: Implement `GenerateSQL`, `GetAICapabilities`, and related methods in the main server
- Request Validation: Input sanitization and validation for AI requests
 - Error Handling: Standard gRPC error codes and HTTP status mapping
 - API Documentation: Update OpenAPI specifications
 
Phase 2: Plugin Communication Bridge (Week 3)
- Query Router Updates: Extend existing query routing logic for AI type
 - Message Transformation: Convert between API and plugin message formats
 - Plugin Discovery: Extend ExtManager to discover and manage AI plugin
 - Health Integration: Include AI plugin in system health checks
 
Phase 3: UI Integration Points (Week 4)
- Trigger Button Component: Minimal Vue component for AI activation
 - Plugin Status Integration: UI indicators for AI plugin availability
 - Error States: User-friendly error messages and retry mechanisms
 - API Client: Frontend service for calling new AI endpoints
 
Risk Mitigation
- Plugin Unavailable: Graceful degradation with clear user messaging
 - Proto Changes: Comprehensive backward compatibility testing
 - Performance Impact: Minimal overhead through lazy loading and caching
 - Integration Issues: Extensive integration testing with mock AI plugin
 
Testing Approach
- Unit Tests: All new message types, validators, and transformers (>90% coverage)
 - Integration Tests: End-to-end API flow with mock AI plugin
 - Compatibility Tests: Ensure existing functionality remains unaffected
 - Load Tests: Verify performance impact is within acceptable limits (<5% overhead)
 
Task Breakdown Preview
High-level task categories (≤10 total tasks):
- Proto Interface Design: Finalize and implement dual proto extensions with AI-specific messages and services
 - HTTP API Implementation: Build Runner service AI methods with validation, error handling, and documentation
 - Plugin Communication Layer: Extend query routing and message transformation for AI plugin integration
 - ExtManager Enhancement: Add AI plugin discovery, health checks, and lifecycle management
 - Frontend Integration: Create AI trigger button, status indicators, and API client services
 - Configuration Schema: Extend stores.yaml and configuration validation for AI plugin settings
 - Health Check Integration: Include AI plugin status in system health monitoring
 - Testing Suite: Comprehensive unit, integration, and compatibility tests
 - Documentation Update: API docs, configuration guides, and integration specifications
 
Dependencies
External Dependencies
- Protocol Buffers Compiler: protoc with Go plugins for code generation
 - gRPC Libraries: Latest compatible versions for Go implementation
 - Vue.js Framework: Existing atest frontend framework for UI integration
 
stores.yaml Configuration Schema
Required configuration structure for AI plugin:
```yaml
stores:
  - name: "ai-assistant"
    type: "ai"
    url: "unix:///tmp/atest-store-ai.sock"
    properties:
      ai_provider: "openai"  # openai, claude, local
      api_key: "${AI_API_KEY}"  # Environment variable
      model: "gpt-4"  # Default model
      max_tokens: 4096
      temperature: 0.1
      timeout: 30s
      enable_sql_execution: true
      confidence_threshold: 0.7
      supported_databases:
        - mysql
        - postgresql
        - sqlite
      rate_limit:
        requests_per_minute: 60
        burst_size: 10
```

Security Requirements
- Input Validation: Sanitize natural language input to prevent prompt injection
 - SQL Injection Prevention: Validate generated SQL before execution
 - Rate Limiting: Implement per-user and global rate limits for AI requests
 - Data Privacy: Option to use local models for sensitive environments
 - Audit Logging: Log all AI queries and generated SQL for compliance
 - Access Control: Respect existing atest authentication and authorization
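The rate-limiting requirement can be sketched as a token bucket matching the `rate_limit` settings from stores.yaml (`requests_per_minute`, `burst_size`). This is a minimal stdlib-only illustration, not a prescription for the production limiter; the type and function names are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// bucket is a minimal token-bucket rate limiter.
type bucket struct {
	mu       sync.Mutex
	tokens   float64   // currently available tokens
	burst    float64   // maximum tokens (burst_size)
	perSec   float64   // refill rate derived from requests_per_minute
	lastFill time.Time // last refill timestamp
}

func newBucket(requestsPerMinute, burstSize int) *bucket {
	return &bucket{
		tokens:   float64(burstSize),
		burst:    float64(burstSize),
		perSec:   float64(requestsPerMinute) / 60.0,
		lastFill: time.Now(),
	}
}

// Allow reports whether one more AI request may proceed now.
func (b *bucket) Allow() bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	now := time.Now()
	// Refill tokens proportionally to elapsed time, capped at burst.
	b.tokens += now.Sub(b.lastFill).Seconds() * b.perSec
	if b.tokens > b.burst {
		b.tokens = b.burst
	}
	b.lastFill = now
	if b.tokens >= 1 {
		b.tokens--
		return true
	}
	return false
}

func main() {
	b := newBucket(60, 10)
	fmt.Println(b.Allow())
}
```

A per-user limit would keep one such bucket per authenticated identity, with a separate global bucket in front of the AI provider.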
 
Plugin Development Constraints
Binary Requirements
- Plugin Name: Must be exactly 
atest-store-ai(required for atest discovery) - Go Version: Go 1.19+ required for gRPC and protobuf compatibility
 - Socket Path: Unix socket at 
/tmp/atest-store-ai.sock - gRPC Registration: Must register as 
pb.RegisterLoaderServer(grpcServer, aiPluginServer) 
Communication Protocol
- Service Discovery: Plugin must be discoverable within 2 seconds of startup
- Health Check: Must implement both `Verify()` and `GetAICapabilities()` methods
- Error Handling Philosophy:
- Fail fast for critical configuration (missing AI API key)
 - Log and continue for optional features (extraction models)
 - Graceful degradation when external services unavailable
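The error-handling philosophy above can be sketched as a config validation routine: critical settings fail fast at startup, optional ones log and continue. The `AIConfig` struct and field names are hypothetical, not atest's actual configuration types.

```go
package main

import (
	"fmt"
	"log"
)

// AIConfig holds the subset of settings relevant to the example
// (hypothetical struct for illustration).
type AIConfig struct {
	APIKey          string // critical: the plugin cannot work without it
	ExtractionModel string // optional feature
}

// validateConfig applies the philosophy: fail fast on critical
// configuration, log and continue on optional features.
func validateConfig(c *AIConfig) error {
	if c.APIKey == "" {
		// Critical misconfiguration: refuse to start.
		return fmt.Errorf("missing AI API key: refusing to start")
	}
	if c.ExtractionModel == "" {
		// Optional feature: note it and keep going.
		log.Println("extraction model not configured; continuing with feature disabled")
	}
	return nil
}

func main() {
	if err := validateConfig(&AIConfig{APIKey: "sk-test"}); err != nil {
		log.Fatal(err)
	}
	fmt.Println("configuration ok")
}
```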
 
 
Configuration Integration
Environment variable support required:
```shell
AI_PROVIDER=openai|claude|local
OPENAI_API_KEY=${API_KEY}
AI_MODEL=gpt-4
AI_TIMEOUT=30s
```

Testing Requirements
- No Mock Services: "Do not use mock services for anything ever" - use real implementations
 - Test Coverage: >90% coverage for new code with comprehensive integration tests
 - Deployment Verification: Three-phase process (development → integration → deployment)
 - Socket Permissions: Verify Unix socket file permissions in deployment testing
 
Internal Dependencies
CRITICAL PATH:
- Architecture Review: Stakeholder approval of dual proto design approach
 - Proto Design Finalization: Complete message and service definitions from enhanced design docs
 - Build Pipeline: Access to modify proto generation and build processes
 - atest Core Stability: No breaking changes to existing plugin architecture during implementation
 
Team Dependencies
- Core atest Team: Proto review, build pipeline access, architecture guidance
 - Frontend Team: UI framework integration points and component patterns
 - DevOps Team: Build process modifications and deployment pipeline updates
 
Success Criteria (Technical)
Performance Benchmarks
- API Response Time: AI endpoints respond within 100ms (excluding AI processing time)
 - System Impact: <5% increase in memory usage and startup time
 - Plugin Discovery: AI plugin detected within 2 seconds of startup
 - Error Recovery: Failed AI calls degrade gracefully within 500ms
 
Quality Gates
- Test Coverage: >90% coverage for new code, >80% overall project coverage
 - Backward Compatibility: 100% existing functionality preserved
 - API Consistency: All new endpoints follow existing atest API patterns
 - Documentation: Complete API docs and integration guides
 
Acceptance Criteria
- Proto Extensions: Successfully generate Go code from extended proto files
 - API Integration: Frontend can successfully call AI endpoints and handle responses
 - Plugin Communication: Main project can discover, communicate with, and monitor AI plugin
 - Error Handling: Graceful handling of all failure modes (plugin down, invalid input, timeouts)
 - Health Monitoring: AI plugin status visible in system health dashboard
 
Estimated Effort
Overall Timeline: 4 weeks
Week 1: Proto foundation and code generation
Week 2: HTTP API layer implementation
Week 3: Plugin communication bridge
Week 4: UI integration and testing
Resource Requirements
- 1 Backend Developer: Go/gRPC expertise, familiar with atest architecture
 - 1 Frontend Developer: Vue.js, API integration experience
 - 0.5 DevOps Engineer: Build pipeline modifications
 - Architecture Review: 2-3 hours from atest core team
 
Critical Path Items
- Proto Interface Approval: Must complete before any implementation (Day 1-2)
 - Code Generation Pipeline: Required for all subsequent development (Day 3-4)
 - Plugin Communication Testing: Validates entire integration approach (Week 3)
 
Risk Buffer: Additional 1 week for integration testing and bug fixes
Technical Resources
Epic-Specific Documentation
- technical-specs.md: Complete interface specifications, message definitions, and implementation details
- plugin-development-guide.md: Comprehensive guide for AI plugin developers (for the future plugin epic)
External Dependencies
- Context7 Research: gRPC-Go and Protocol Buffers best practices analysis completed
 - Enhanced Design Patterns: Enterprise-grade gRPC patterns identified and integrated into specifications
 
Note: This epic establishes the main project foundation only. The actual AI plugin implementation with natural language processing, SQL generation, and AI model integration will be covered in a separate epic.
Stats
Total tasks: 8
Parallel tasks: 4 (tasks 005, 006, 007 can run concurrently after prerequisites)
Sequential tasks: 4 (critical path: 001 → 002 → 003 → 004, final: 008)
Estimated total effort: 104 hours (13 working days)