
Conversation

@AkihiroSuda (Member) commented Oct 1, 2025

@AkihiroSuda added this to the v2.0.0 milestone on Oct 1, 2025
@AkihiroSuda force-pushed the fix-3237 branch 2 times, most recently from 2e45090 to f74fbdf on October 1, 2025 10:42
@AkihiroSuda (Member, Author) commented:

Before (964fb30)

$ du -hs _output/
128M    _output/

$ ls -lh _output/bin/limactl _output/share/lima/lima-guestagent.Linux-*
-rwxr-xr-x@ 1 suda  staff    28M Oct  1 19:47 _output/bin/limactl*
-rw-r--r--@ 1 suda  staff    14M Oct  1 19:47 _output/share/lima/lima-guestagent.Linux-aarch64.gz
-rw-r--r--@ 1 suda  staff    15M Oct  1 19:47 _output/share/lima/lima-guestagent.Linux-armv7l.gz
-rw-r--r--@ 1 suda  staff    14M Oct  1 19:47 _output/share/lima/lima-guestagent.Linux-ppc64le.gz
-rw-r--r--@ 1 suda  staff    15M Oct  1 19:47 _output/share/lima/lima-guestagent.Linux-riscv64.gz
-rw-r--r--@ 1 suda  staff    16M Oct  1 19:47 _output/share/lima/lima-guestagent.Linux-s390x.gz
-rw-r--r--@ 1 suda  staff    16M Oct  1 19:47 _output/share/lima/lima-guestagent.Linux-x86_64.gz

After (f74fbdf)

$ du -hs _output/
 88M    _output/

$ ls -lh _output/bin/limactl _output/share/lima/lima-guestagent.Linux-*
-rwxr-xr-x@ 1 suda  staff    28M Oct  1 19:49 _output/bin/limactl*
-rw-r--r--@ 1 suda  staff   8.4M Oct  1 19:49 _output/share/lima/lima-guestagent.Linux-aarch64.gz
-rw-r--r--@ 1 suda  staff   8.8M Oct  1 19:49 _output/share/lima/lima-guestagent.Linux-armv7l.gz
-rw-r--r--@ 1 suda  staff   8.4M Oct  1 19:49 _output/share/lima/lima-guestagent.Linux-ppc64le.gz
-rw-r--r--@ 1 suda  staff   8.8M Oct  1 19:49 _output/share/lima/lima-guestagent.Linux-riscv64.gz
-rw-r--r--@ 1 suda  staff   9.2M Oct  1 19:49 _output/share/lima/lima-guestagent.Linux-s390x.gz
-rw-r--r--@ 1 suda  staff   9.3M Oct  1 19:49 _output/share/lima/lima-guestagent.Linux-x86_64.gz

@jandubois (Member) commented:

I have not looked at this PR at all yet, but wanted to mention a couple of things I discussed with @Nino-K as requirements for his port monitoring PR:

  • Keep retrying to connect to k8s indefinitely, as it will not be running yet by the time the guest agent starts.
  • When the connection breaks, keep trying to reconnect with a short delay indefinitely, as the user may have stopped and restarted k8s.

For this PR as well: the kubectl binary may not yet be available on the PATH when you first try to invoke it, so keep trying. The call may also fail because the port is open but the apiserver is not yet responding, or because the kubeconfig is missing, etc. The retry on broken connections should handle this automatically.

Because of the indefinite retries, the Kubernetes watcher should be opt-in (configurable in lima.yaml), so it only runs when the VM is known to run Kubernetes.
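
For illustration, the retry behavior described above amounts to something like the following sketch (watchServices, the logger, and the 3-second delay are assumptions for the example, not the actual implementation):

package watcher

import (
    "context"
    "errors"
    "log"
    "time"
)

// watchServices stands in for whatever connects to the apiserver and watches
// services; the real implementation would return when the connection cannot
// be established or breaks.
func watchServices(ctx context.Context) error {
    return errors.New("kubernetes is not reachable yet")
}

// watchForever retries indefinitely with a short delay, both before
// Kubernetes is up for the first time and after the connection breaks.
func watchForever(ctx context.Context) {
    for {
        if err := watchServices(ctx); err != nil {
            log.Printf("service watch failed, will retry: %v", err)
        }
        select {
        case <-ctx.Done():
            return
        case <-time.After(3 * time.Second):
        }
    }
}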

@AkihiroSuda marked this pull request as ready for review on October 3, 2025 06:53
@AkihiroSuda (Member, Author) commented:

> The retry on broken connections should handle this automatically.

Yes, this is retried.

> Because of the indefinite retries, the Kubernetes watcher should be opt-in

It hasn't been opt-in so far, and I don't think it needs to be, as the overhead of polling LookPath("kubectl") seems trivial.
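
For reference, the polling in question is roughly this shape (a sketch; the 10-second interval and function name are made up for illustration):

package watcher

import (
    "context"
    "os/exec"
    "time"
)

// waitForKubectl polls until kubectl shows up on the PATH, then returns its
// location. The cost per iteration is a handful of stat() calls.
func waitForKubectl(ctx context.Context) (string, error) {
    ticker := time.NewTicker(10 * time.Second)
    defer ticker.Stop()
    for {
        if path, err := exec.LookPath("kubectl"); err == nil {
            return path, nil
        }
        select {
        case <-ctx.Done():
            return "", ctx.Err()
        case <-ticker.C:
        }
    }
}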

The resource constraint should be set by the caller via
`$LIMACTL_CREATE_ARGS`.

Signed-off-by: Akihiro Suda <[email protected]>
Part of issue 3237

TODO: drop dependency on k8s.io/api

Signed-off-by: Akihiro Suda <[email protected]>
@nirs (Member) left a comment

Trimming the guest agent is nice; anything using client-go becomes huge quickly. But did you measure memory and CPU usage before and after this change?

With the new code we always keep a kubectl watch command running, which has similar CPU usage to what we had before in the guest agent, but now we format the JSON events and parse them back in the guest agent, and we keep all services in memory twice: once in kubectl (via the informer) and once in the guest agent.

set -x
limactl shell "$NAME" kubectl get nodes -o wide
limactl shell "$NAME" kubectl create deployment nginx --image="${nginx_image}"
limactl shell "$NAME" kubectl create service nodeport nginx --node-port=31080 --tcp=80:80
@nirs (Member) commented:

Why not use a YAML file with the deployment and service?

limactl shell "$NAME" kubectl get nodes -o wide
limactl shell "$NAME" kubectl create deployment nginx --image="${nginx_image}"
limactl shell "$NAME" kubectl create service nodeport nginx --node-port=31080 --tcp=80:80
timeout 3m bash -euxc "until curl -f --retry 30 --retry-connrefused http://127.0.0.1:31080; do sleep 3; done"
@nirs (Member) commented:

Checking the connection makes sense only after the deployment is available. I would do this:

kubectl apply -f nginx.yaml
kubectl rollout status deployment nginx --timeout 60s

At this point the service is typically not available, but it should be within a few seconds, so we can start checking every second.

Since we wait separately for the deployment, we don't need to wait 3 minutes for the connection; maybe 30 seconds is enough to detect broken port forwarding.

Notes for the curl command:

  • Using --fail will be more readable
  • Adding --silent will avoid unhelpful noise in the test logs

func (s *ServiceWatcher) readKubectlStream(r io.Reader) error {
scanner := bufio.NewScanner(r)
// increase buffer in case of large JSON objects
const maxBuf = 10 * 1024 * 1024
@nirs (Member) commented:

Don't we have unit constants in lima (e.g. KiB, MiB, GiB)? They would make the code more readable.
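
If lima does not define them yet, a small set of untyped constants would do (a sketch; they may already exist somewhere in the tree):

const (
    KiB = 1 << 10
    MiB = 1 << 20
    GiB = 1 << 30
)

// e.g. const maxBuf = 10 * MiB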

line := scanner.Bytes()
line = bytes.TrimSpace(line)
if len(line) == 0 {
continue
@nirs (Member) commented:

Do we expect empty lines in the JSON stream?

cache.WaitForCacheSync(ctx.Done(), serviceInformer.HasSynced)
go func() {
for i := 0; ; i++ {
if i > 0 {
@nirs (Member) commented:

Why do you skip the first iteration?

The code will be clearer if we separate the loop from the action performed for each iteration.
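
Assuming the i > 0 check only exists to delay retries but not the first attempt, putting the delay at the end of the loop body and moving the per-iteration work into its own method would express the same thing more directly (a sketch; runWatch and retryInterval are placeholders, not names from this PR):

func (s *ServiceWatcher) loop(ctx context.Context) {
    for {
        // runWatch performs one watch attempt and returns when it fails or ends.
        if err := s.runWatch(ctx); err != nil {
            logrus.WithError(err).Warn("kubectl watch exited, retrying")
        }
        // Delay only between attempts; the first attempt starts immediately.
        select {
        case <-ctx.Done():
            return
        case <-time.After(retryInterval):
        }
    }
}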


if kubeconfig != "" {
cmd.Env = append(os.Environ(), "KUBECONFIG="+kubeconfig)
}
@nirs (Member) commented:

It is simpler and more efficient to use --kubeconfig=
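
i.e. something along these lines (a sketch; the exact kubectl arguments are illustrative, not taken from the PR):

args := []string{"get", "services", "--all-namespaces", "--watch", "--output=json"}
if kubeconfig != "" {
    // Pass the path explicitly instead of mutating the environment.
    args = append(args, "--kubeconfig="+kubeconfig)
}
cmd := exec.CommandContext(ctx, "kubectl", args...)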

serviceInformer := informerFactory.Core().V1().Services().Informer()
informerFactory.Start(ctx.Done())
cache.WaitForCacheSync(ctx.Done(), serviceInformer.HasSynced)
go func() {
@nirs (Member) commented:

Moving the loop into its own function will make the code simpler and easier to follow:

func (s *ServiceWatcher) Start(ctx context.Context) {
    ...
    go s.loop(ctx)
}

func (s *ServiceWatcher) loop(ctx context.Context) {
    for {
        ...
    }
}

s.rwMutex.Lock()
switch ev.Type {
case watch.Added, watch.Modified:
s.services[key] = &svc
@nirs (Member) commented:

Do we need to keep all the services in memory? We can extract the relevant details here and minimize memory usage.
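
For example, a trimmed record could keep only what the port forwarder consumes (a hypothetical struct; the exact fields depend on what the forwarding code actually reads from the Service object):

// forwardedService keeps just the fields needed for port forwarding,
// instead of retaining the whole corev1.Service object.
type forwardedService struct {
    Namespace string
    Name      string
    ClusterIP string
    Ports     []forwardedPort
}

type forwardedPort struct {
    Port     int32
    NodePort int32
    Protocol string
}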
