
Monitoring

Clement Tee edited this page Aug 20, 2025 · 14 revisions

The Twingate Kubernetes Access Gateway exposes a set of Prometheus metrics. You can scrape these metrics with Prometheus and visualize them in Grafana to monitor Gateway performance and usage patterns.

Configuration

Enabling Metrics Scraping

The Gateway exposes a Prometheus-compatible /metrics endpoint on port 9090. You can configure your Prometheus instance to scrape this endpoint.
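
If you run Prometheus without the Operator, a static scrape job is one way to collect from this endpoint. The snippet below is a minimal sketch, not a configuration shipped with the Gateway: the job name and the target hostname are hypothetical placeholders, and only the port (9090) and path (/metrics) come from this page.

```yaml
# prometheus.yml (fragment)
scrape_configs:
  - job_name: twingate-gateway          # hypothetical job name
    metrics_path: /metrics              # endpoint path documented above
    static_configs:
      - targets:
          # Hypothetical Service DNS name; replace with the address of your Gateway
          - twingate-gateway.twingate.svc.cluster.local:9090
```

Inside a cluster, Kubernetes service discovery (kubernetes_sd_configs) is usually a better fit than static targets, because Pod addresses change across restarts.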

If you use the Prometheus Operator, you can enable metrics scraping in one of two ways: define either a PodMonitor or a ServiceMonitor in the values.yaml file of your Gateway Helm chart:

Warning

Do not enable both PodMonitor and ServiceMonitor at the same time, or you might get duplicated metrics.

  1. PodMonitor
metrics:
  podMonitor:
    enabled: true
    # optional: add Prometheus label selectors so your Prometheus instance can discover this PodMonitor
    additionalLabels:
      release: prometheus
  2. ServiceMonitor
metrics:
  serviceMonitor:
    enabled: true
    # optional: add Prometheus label selectors so your Prometheus instance can discover this ServiceMonitor
    additionalLabels:
      release: prometheus

You can verify the health of the Gateway scrape targets on the Prometheus status page:

[Image: Prometheus status]

Enabling PrometheusRule Alerts

PrometheusRule is a Custom Resource Definition (CRD) provided by the Prometheus Operator. The Prometheus Operator must be installed before you can enable alerting rules.

Add the following configuration in your Gateway Helm chart values.yaml file:

metrics:
  prometheusRule:
    enabled: true
    rules:
      # High error rate alerts
      - alert: TwingateGatewayHighHTTPErrorRate
        expr: |
          max_over_time(
            (sum by (code, method, type) (rate(twingate_gateway_http_requests_total{code=~"5[0-9][0-9]"}[5m])) > bool 0.8)[2m:30s]
          ) > 0
        for: 2m
        labels:
          severity: critical
        annotations:
          description: Twingate Gateway HTTP {{ $labels.code }} error rate is above 0.8 requests/sec over last 2 minutes for method={{ $labels.method }}, type={{ $labels.type }}.

Tip

Example alert rules are provided as comments in the values.yaml. Uncomment and customize them as needed for your environment.

Enabling Grafana Dashboard

[Image: Grafana Dashboard]

To enable the Grafana dashboard, add the following configuration to your values.yaml file:

metrics:
  grafanaDashboard:
    enabled: true
    labels:
      grafana_dashboard: "1"

This will create a ConfigMap discoverable by Grafana.

Tip

If your Grafana and Gateway are deployed in different namespaces, ensure that Grafana can search for dashboards across namespaces.

For example, if you deployed Grafana using Helm, update sidecar.dashboards.searchNamespace in the Grafana values file to include the namespace where the Gateway is deployed.
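
As an illustration, the fragment below shows how this could look in the values file of the community Grafana Helm chart. It is a sketch under assumptions: it presumes the dashboard sidecar is used, and that its label settings match the grafana_dashboard: "1" label from the Gateway values above. Check the key names against the chart version you run.

```yaml
# Grafana Helm chart values (fragment)
sidecar:
  dashboards:
    enabled: true
    # Should match the label set under metrics.grafanaDashboard.labels in the Gateway chart
    label: grafana_dashboard
    labelValue: "1"
    # ALL searches every namespace; alternatively, name only the namespace
    # where the Gateway is deployed
    searchNamespace: ALL
```

Searching all namespaces normally requires cluster-wide read access to ConfigMaps for the sidecar, so restricting the search to the Gateway namespace can be the tighter choice.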

Metrics

The Gateway exposes the following metrics.

TCP Connection

These metrics are collected when the client opens a TCP connection to the Gateway.

| Name | Type | Description |
| --- | --- | --- |
| twingate_gateway_tcp_connection_duration_seconds | Histogram | Duration of client TCP connections in seconds |
| twingate_gateway_tcp_connections_total | Counter | Total number of client TCP connections processed |
| twingate_gateway_active_tcp_connections | Gauge | Number of currently active client TCP connections |

Twingate Client Authentication

These metrics are collected when the client attempts to authenticate to the Gateway via HTTP CONNECT.

| Name | Type | Description |
| --- | --- | --- |
| twingate_gateway_client_authentication_total | Counter | Total number of HTTP CONNECT authentication attempts |
| twingate_gateway_client_connection_duration_seconds | Histogram | Duration of HTTP CONNECT authentication attempt in seconds |

HTTP Requests

These metrics are collected for every incoming HTTP request handled by the Gateway. Each request is labelled with one of three types: HTTP, WebSocket, or SPDY.

| Name | Type | Description |
| --- | --- | --- |
| twingate_gateway_http_active_requests | Gauge | Number of currently active HTTP requests |
| twingate_gateway_http_request_duration_seconds | Histogram | Latencies of HTTP requests in seconds |
| twingate_gateway_http_request_size_bytes | Histogram | Size of incoming HTTP request in bytes |
| twingate_gateway_http_requests_total | Counter | Total number of HTTP requests processed |
| twingate_gateway_http_response_size_bytes | Histogram | Size of outgoing HTTP response in bytes |
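
As a sketch of how the latency histogram could be used, the rule below alerts on the 95th-percentile HTTP request latency. It follows the metrics.prometheusRule.rules layout shown earlier on this page. The alert name, the 1-second threshold, and the durations are hypothetical, and the rule assumes the duration histogram carries the same type label as twingate_gateway_http_requests_total; the _bucket suffix and le label are standard for Prometheus histograms.

```yaml
metrics:
  prometheusRule:
    enabled: true
    rules:
      # Hypothetical example: p95 HTTP latency above 1 second, per request type
      - alert: TwingateGatewayHighHTTPLatency
        expr: |
          histogram_quantile(
            0.95,
            sum by (le, type) (rate(twingate_gateway_http_request_duration_seconds_bucket[5m]))
          ) > 1
        for: 5m
        labels:
          severity: warning
        annotations:
          description: Twingate Gateway p95 HTTP request latency is above 1s for type={{ $labels.type }}.
```

Long-lived WebSocket and SPDY requests may report much higher durations than plain HTTP requests, so a per-type threshold, or a selector that restricts the rule to one type, may be more useful than a single global limit.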

Kubernetes API Server Requests

These metrics are collected when the Gateway sends API requests to the upstream server (the Kubernetes API server). Each request is labelled with one of three types: HTTP, WebSocket, or SPDY.

| Name | Type | Description |
| --- | --- | --- |
| twingate_gateway_api_server_active_requests | Gauge | Number of currently active requests from Gateway to API Server |
| twingate_gateway_api_server_request_duration_seconds | Histogram | Measures the initial HTTP request-response latency between Gateway and API Server in seconds. For HTTP streaming, WebSocket, and SPDY connections, this metric captures only the setup time and not the duration of the data transfer. |
| twingate_gateway_api_server_requests_total | Counter | Total number of requests from Gateway to API Server processed |

Recorded Sessions

| Name | Type | Description |
| --- | --- | --- |
| twingate_gateway_recorded_session_duration_seconds | Histogram | Duration of WebSocket session in seconds |

Go Runtime & Process Metrics

The Gateway exports standard Go runtime (prefixed go_...) and process (prefixed process_...) metrics. These depend on the platform and are not enumerated here.

Metrics Endpoint

These metrics are collected when Prometheus scrapes the Gateway’s /metrics endpoint.

| Name | Type | Description |
| --- | --- | --- |
| promhttp_metric_handler_errors_total | Counter | Total number of internal errors encountered by the promhttp handler |
| promhttp_metric_handler_requests_in_flight | Gauge | Current number of scrapes in progress |
| promhttp_metric_handler_requests_total | Counter | Total number of scrapes, partitioned by HTTP status code |
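
These handler metrics can also drive an alert when scrapes start failing internally. The rule below is a sketch in the same metrics.prometheusRule.rules layout used earlier on this page; the alert name, window, and severity are hypothetical. Because promhttp_* metrics are exported by any Go service that uses the standard Prometheus handler, you would likely add a job or namespace matcher (not shown, since it depends on your setup) to limit the rule to the Gateway.

```yaml
metrics:
  prometheusRule:
    enabled: true
    rules:
      # Hypothetical example: the /metrics handler reported internal errors in the last 10 minutes
      - alert: TwingateGatewayMetricsHandlerErrors
        expr: |
          sum by (instance) (increase(promhttp_metric_handler_errors_total[10m])) > 0
        for: 5m
        labels:
          severity: warning
        annotations:
          description: The metrics handler on {{ $labels.instance }} reported internal errors while serving scrapes.
```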