Describe the bug
The NGF 2.1.1 control plane breaks when an HTTPRoute references a SnippetsFilter with an invalid configuration. The only apparent way to recover is to uninstall and redeploy NGF. I reproduced this using the Helm installation on a Kubernetes 1.30.6 cluster.
To Reproduce
- Following https://github.com/f5devcentral/NGINX-Gateway-Fabric-Lab/blob/main/DEPLOYING.md, deploy NGF. The Helm command must enable snippetsFilters, so I used:
helm install ngf oci://ghcr.io/nginx/charts/nginx-gateway-fabric \
--set nginx.image.repository=private-registry.nginx.com/nginx-gateway-fabric/nginx-plus \
--set nginx.image.tag=2.1.1 \
--set nginx.plus=true \
--set serviceAccount.imagePullSecret=nginx-plus-registry-secret \
--set nginx.imagePullSecret=nginx-plus-registry-secret \
--set nginx.usage.secretName=nplus-license \
--set nginx.service.type=NodePort \
--set nginxGateway.snippetsFilters.enable=true \
-n nginx-gateway
- Apply the attached files 1.phpapp.yaml, 2.gateway.yaml, 3.snippetsfilter.yaml and 4.httproute.yaml. The SnippetsFilter manifest contains an invalid configuration such as fastcgi_pass invalid-fqdn:9000;
- After applying all manifests, we have:
$ kubectl apply -f .
configmap/phpinfo created
deployment.apps/php-fpm created
service/php-fpm created
gateway.gateway.networking.k8s.io/gateway created
snippetsfilter.gateway.nginx.org/fastcgi created
httproute.gateway.networking.k8s.io/php-fpm created
$ kubectl get pods
NAME                           READY   STATUS    RESTARTS   AGE
gateway-nginx-d6b4c56c-dk9x7   0/1     Running   0          6s
php-fpm-8b8b4cbdf-vzjf7        1/1     Running   0          7s
$ kubectl get gateway
NAME      CLASS   ADDRESS         PROGRAMMED   AGE
gateway   nginx   10.102.15.248   False        3m11s
- The NGF control plane pod logs:
{"level":"info","ts":"2025-09-25T10:29:29Z","logger":"provisioner","msg":"Creating/Updating nginx resources","namespace":"default","name":"gateway-nginx"}
{"level":"info","ts":"2025-09-25T10:29:29Z","logger":"eventHandler","msg":"NGINX configuration was successfully updated"}
{"level":"info","ts":"2025-09-25T10:29:29Z","logger":"provisioner","msg":"Creating/Updating nginx resources","namespace":"default","name":"gateway-nginx"}
{"level":"info","ts":"2025-09-25T10:29:29Z","logger":"provisioner","msg":"Creating/Updating nginx resources","namespace":"default","name":"gateway-nginx"}
{"level":"info","ts":"2025-09-25T10:29:29Z","logger":"provisioner","msg":"Creating/Updating nginx resources","namespace":"default","name":"gateway-nginx"}
{"level":"info","ts":"2025-09-25T10:29:29Z","logger":"provisioner","msg":"Creating/Updating nginx resources","namespace":"default","name":"gateway-nginx"}
{"level":"info","ts":"2025-09-25T10:29:29Z","logger":"provisioner","msg":"Creating/Updating nginx resources","namespace":"default","name":"gateway-nginx"}
{"level":"info","ts":"2025-09-25T10:29:30Z","logger":"provisioner","msg":"Creating/Updating nginx resources","namespace":"default","name":"gateway-nginx"}
{"level":"info","ts":"2025-09-25T10:29:30Z","logger":"eventHandler","msg":"NGINX configuration was successfully updated"}
{"level":"info","ts":"2025-09-25T10:29:30Z","logger":"eventHandler","msg":"NGINX configuration was successfully updated"}
{"level":"info","ts":"2025-09-25T10:29:30Z","logger":"eventHandler","msg":"NGINX configuration was successfully updated"}
{"level":"info","ts":"2025-09-25T10:29:31Z","logger":"eventHandler","msg":"NGINX configuration was successfully updated"}
{"level":"info","ts":"2025-09-25T10:29:36Z","logger":"nginxUpdater.commandService","msg":"Creating connection for nginx pod: gateway-nginx-d6b4c56c-dk9x7"}
{"level":"info","ts":"2025-09-25T10:29:37Z","logger":"nginxUpdater.commandService","msg":"Successfully connected to nginx agent gateway-nginx-d6b4c56c-dk9x7"}
{"level":"error","ts":"2025-09-25T10:29:42Z","logger":"nginxUpdater.commandService","msg":"error sending request to agent","error":"msg: Config apply failed, rolling back config; error: failed validating config NGINX config test failed exit status 1: 2025/09/25 10:29:37 [emerg] 34#34: host not found in upstream \"invalid-fqdn\" in /etc/nginx/includes/SnippetsFilter_http.server.location_default_fastcgi.conf:1\nnginx: [emerg] host not found in upstream \"invalid-fqdn\" in /etc/nginx/includes/SnippetsFilter_http.server.location_default_fastcgi.conf:1\nnginx: configuration file /etc/nginx/nginx.conf test failed\n\nmsg: ; error: failed to preform API action, NGINX Plus API is not configured\nmsg: Config apply failed, rollback successful; error: failed validating config NGINX config test failed exit status 1: 2025/09/25 10:29:37 [emerg] 34#34: host not found in upstream \"invalid-fqdn\" in /etc/nginx/includes/SnippetsFilter_http.server.location_default_fastcgi.conf:1\nnginx: [emerg] host not found in upstream \"invalid-fqdn\" in /etc/nginx/includes/SnippetsFilter_http.server.location_default_fastcgi.conf:1\nnginx: configuration file /etc/nginx/nginx.conf test failed\n\nmsg: ; error: failed to preform API action, NGINX Plus API is not configured\nmsg: ; error: failed to preform API action, NGINX Plus API is not configured\nmsg: ; error: failed to preform API action, NGINX Plus API is not configured\nmsg: ; error: failed to preform API action, NGINX Plus API is not configured\nmsg: ; error: failed to preform API action, NGINX Plus API is not configured\nmsg: ; error: failed to preform API action, NGINX Plus API is not configured\nmsg: ; error: failed to preform API action, NGINX Plus API is not configured\nmsg: ; error: failed to preform API action, NGINX Plus API is not configured\nmsg: ; error: failed to preform API action, NGINX Plus API is not configured","stacktrace":"github.com/nginx/nginx-gateway-fabric/v2/internal/controller/nginx/agent.(*commandService).logAndSendErrorStatus\n\t/home/runner/work/nginx-gateway-fabric/nginx-gateway-fabric/internal/controller/nginx/agent/command.go:365\ngithub.com/nginx/nginx-gateway-fabric/v2/internal/controller/nginx/agent.(*commandService).setInitialConfig\n\t/home/runner/work/nginx-gateway-fabric/nginx-gateway-fabric/internal/controller/nginx/agent/command.go:322\ngithub.com/nginx/nginx-gateway-fabric/v2/internal/controller/nginx/agent.(*commandService).Subscribe\n\t/home/runner/work/nginx-gateway-fabric/nginx-gateway-fabric/internal/controller/nginx/agent/command.go:149\ngithub.com/nginx/agent/v3/api/grpc/mpi/v1._CommandService_Subscribe_Handler\n\tpkg/mod/github.com/nginx/agent/[email protected]/api/grpc/mpi/v1/command_grpc.pb.go:233\ngithub.com/nginx/nginx-gateway-fabric/v2/internal/controller/nginx/agent/grpc/interceptor.(*ContextSetter).Stream.ContextSetter.Stream.func1\n\t/home/runner/work/nginx-gateway-fabric/nginx-gateway-fabric/internal/controller/nginx/agent/grpc/interceptor/interceptor.go:65\ngoogle.golang.org/grpc.(*Server).processStreamingRPC\n\tpkg/mod/google.golang.org/[email protected]/server.go:1728\ngoogle.golang.org/grpc.(*Server).handleStream\n\tpkg/mod/google.golang.org/[email protected]/server.go:1845\ngoogle.golang.org/grpc.(*Server).serveStreams.func2.1\n\tpkg/mod/google.golang.org/[email protected]/server.go:1061"}
{"level":"error","ts":"2025-09-25T10:29:42Z","logger":"eventHandler","msg":"Failed to update NGINX configuration","error":"msg: Config apply failed, rolling back config; error: failed validating config NGINX config test failed exit status 1: 2025/09/25 10:29:37 [emerg] 34#34: host not found in upstream \"invalid-fqdn\" in /etc/nginx/includes/SnippetsFilter_http.server.location_default_fastcgi.conf:1\nnginx: [emerg] host not found in upstream \"invalid-fqdn\" in /etc/nginx/includes/SnippetsFilter_http.server.location_default_fastcgi.conf:1\nnginx: configuration file /etc/nginx/nginx.conf test failed\n\nmsg: ; error: failed to preform API action, NGINX Plus API is not configured\nmsg: Config apply failed, rollback successful; error: failed validating config NGINX config test failed exit status 1: 2025/09/25 10:29:37 [emerg] 34#34: host not found in upstream \"invalid-fqdn\" in /etc/nginx/includes/SnippetsFilter_http.server.location_default_fastcgi.conf:1\nnginx: [emerg] host not found in upstream \"invalid-fqdn\" in /etc/nginx/includes/SnippetsFilter_http.server.location_default_fastcgi.conf:1\nnginx: configuration file /etc/nginx/nginx.conf test failed\n\nmsg: ; error: failed to preform API action, NGINX Plus API is not configured\nmsg: ; error: failed to preform API action, NGINX Plus API is not configured\nmsg: ; error: failed to preform API action, NGINX Plus API is not configured\nmsg: ; error: failed to preform API action, NGINX Plus API is not configured\nmsg: ; error: failed to preform API action, NGINX Plus API is not configured\nmsg: ; error: failed to preform API action, NGINX Plus API is not configured\nmsg: ; error: failed to preform API action, NGINX Plus API is not configured\nmsg: ; error: failed to preform API action, NGINX Plus API is not configured\nmsg: ; error: failed to preform API action, NGINX Plus API is not configured","stacktrace":"github.com/nginx/nginx-gateway-fabric/v2/internal/controller.(*eventHandlerImpl).waitForStatusUpdates\n\t/home/runner/work/nginx-gateway-fabric/nginx-gateway-fabric/internal/controller/handler.go:262"}
- Remove the application and NGF resources:
$ kubectl delete -f .
configmap "phpinfo" deleted from default namespace
deployment.apps "php-fpm" deleted from default namespace
service "php-fpm" deleted from default namespace
gateway.gateway.networking.k8s.io "gateway" deleted from default namespace
snippetsfilter.gateway.nginx.org "fastcgi" deleted from default namespace
httproute.gateway.networking.k8s.io "php-fpm" deleted from default namespace
- Deploying a working application with a known-good set of NGF manifests, such as https://github.com/f5devcentral/NGINX-Gateway-Fabric-Lab/tree/main/labs/1.basic-app, shows that the NGF control plane pod is stuck. Gateway objects are not provisioned correctly, and apparently the only way out is to uninstall NGF through its Helm chart and redeploy it:
NAME      CLASS   ADDRESS   PROGRAMMED   AGE
gateway   nginx             Unknown      31s
$ kubectl describe gateway gateway
Name:         gateway
Namespace:    default
Labels:       <none>
Annotations:  <none>
API Version:  gateway.networking.k8s.io/v1
Kind:         Gateway
Metadata:
  Creation Timestamp:  2025-09-25T10:34:43Z
  Generation:          1
  Resource Version:    64123979
  UID:                 2993bd59-0647-45a9-9aa2-abbb0b6fb197
Spec:
  Gateway Class Name:  nginx
  Listeners:
    Allowed Routes:
      Namespaces:
        From:  Same
    Hostname:  *.example.com
    Name:      http
    Port:      80
    Protocol:  HTTP
Status:
  Conditions:
    Last Transition Time:  1970-01-01T00:00:00Z
    Message:               Waiting for controller
    Reason:                Pending
    Status:                Unknown
    Type:                  Accepted
    Last Transition Time:  1970-01-01T00:00:00Z
    Message:               Waiting for controller
    Reason:                Pending
    Status:                Unknown
    Type:                  Programmed
Events:                    <none>
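The error entries in the control plane logs earlier in this report are hard to read because the "error" field repeats the same messages many times. As a reading aid only, here is a minimal Python sketch that collapses the duplicates in one JSON log line. It assumes the logs are JSON lines with an "error" string field, as in the excerpt; the function name summarize_error is hypothetical and not part of NGF.

```python
import json
from collections import Counter


def summarize_error(log_line: str) -> list[tuple[str, int]]:
    """Return the distinct messages in a log entry's "error" field with counts.

    Assumes one JSON object per line, as in the NGF control plane log excerpt.
    """
    entry = json.loads(log_line)
    # The error field is a newline-separated blob; drop empty fragments.
    parts = [p.strip() for p in entry.get("error", "").split("\n") if p.strip()]
    counts = Counter(parts)
    # dict.fromkeys keeps first-seen order while removing duplicates.
    return [(msg, counts[msg]) for msg in dict.fromkeys(parts)]


if __name__ == "__main__":
    # Shortened, illustrative entry; real entries come from the pod logs.
    sample = json.dumps({
        "level": "error",
        "error": "nginx: configuration file /etc/nginx/nginx.conf test failed\n\n"
                 "msg: ; error: NGINX Plus API is not configured\n"
                 "msg: ; error: NGINX Plus API is not configured",
    })
    for msg, n in summarize_error(sample):
        print(f"{n}x {msg}")
```

Feeding it the output of kubectl logs line by line, skipping lines without an "error" field, should leave only the distinct failure messages, which makes the "host not found in upstream" cause easier to spot.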
Expected behavior
Applying SnippetsFilter resources with invalid configurations shouldn't break the NGF control plane pod.
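Until that is the case, a client-side check before kubectl apply could catch this particular failure. The sketch below is an assumption-laden illustration, not how NGF validates snippets: it extracts fastcgi_pass targets from a snippet string and reports hosts that do not resolve, which is the condition behind the "host not found in upstream" error in the logs. The function names are hypothetical. The check should run somewhere with the same DNS view as the NGINX pod, and it would wrongly flag targets that are names of upstream blocks defined elsewhere.

```python
import re
import socket
from typing import Callable

# Matches "fastcgi_pass <target>;" and captures the target.
_FASTCGI_PASS = re.compile(r"\bfastcgi_pass\s+([^;\s]+)\s*;")


def _resolves(host: str) -> bool:
    """Return True if the host name resolves from where this script runs."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except socket.gaierror:
        return False


def unresolvable_upstreams(
    snippet: str, resolves: Callable[[str], bool] = _resolves
) -> list[str]:
    """List fastcgi_pass hosts in an NGINX snippet that fail to resolve."""
    bad = []
    for target in _FASTCGI_PASS.findall(snippet):
        # UNIX sockets and variable-based targets are not plain host names.
        if target.startswith("unix:") or "$" in target:
            continue
        if target.startswith("["):
            # Bracketed IPv6 literal, e.g. [::1]:9000
            host = target[1:].split("]", 1)[0]
        else:
            host = target.split(":", 1)[0]
        if not resolves(host):
            bad.append(host)
    return bad


if __name__ == "__main__":
    for host in unresolvable_upstreams("fastcgi_pass invalid-fqdn:9000;"):
        print(f"unresolvable fastcgi_pass host: {host}")
```

The resolver is passed in as a parameter so the check can be exercised without network access, for example with a stub that only knows the php-fpm service name.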
Your environment
- NGF 2.1.1
- "Vanilla" Kubernetes 1.30.6 running on Ubuntu on QEMU
- Tested (with the same outcome) exposing NGF in NodePort and LoadBalancer mode
1.phpapp.yaml
2.gateway.yaml
3.snippetsfilter.yaml
4.httproute.yaml