Contributor

@isaac-mcfadyen isaac-mcfadyen commented Sep 21, 2025

Overview

This PR is an alternative to PR #16079. While that PR works to add subdirectory support to the server, it has a number of pitfalls that make it quite tricky to use in practice:

  • That PR proposes adding an env var, BASE_PATH, that updates paths in the web UI so that they work under a different subdirectory (e.g. when deploying at /server).
  • However, that env var is build-time only and has no effect at runtime. This makes it unusable with tools like llama-swap, and probably some others, that need the same server to work at many different paths.

Why it changed

The previous server used hash-based routing.

This worked because we could use relative paths in all cases and only the hash would change when moving between pages. Since the hash is a purely browser-side component of the URL the same file could be served at any path and it would work perfectly.

When we moved to SvelteKit we moved to absolute paths. This meant that, e.g., a full llama.cpp server at /server would try to hit /props (when it should be hitting /server/props) and fail.
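
As a rough illustration of the difference (a sketch, not code from this PR), the standard URL constructor shows how an absolute and a relative reference resolve when the page is served at /server/. The resolveFrom helper and the example.com host are hypothetical names used only for this example.

```typescript
// Hypothetical helper: resolve a reference the way a browser would
// when the page itself was loaded from `pageUrl`.
function resolveFrom(pageUrl: string, ref: string): string {
  return new URL(ref, pageUrl).href;
}

// Assumed deployment: the web UI is served at /server/ on example.com.
const page = "https://example.com/server/";

// Absolute path: ignores the subdirectory and hits the domain root.
const absolute = resolveFrom(page, "/props");

// Relative path: stays under the directory the page was served from.
const relative = resolveFrom(page, "./props");

console.log(absolute); // https://example.com/props
console.log(relative); // https://example.com/server/props
```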

Alternatives considered

  • We cannot enable SvelteKit's prerender option because routes like /chat/<id> will not work (they can't be prerendered).
  • We cannot enable SvelteKit's relative option without hash-based routing because it will not work with a single-file build (because the same file will be served for all paths, so the client-side router can't know what files it needs to fetch).

Proposed Solution

This PR enables SvelteKit's hash-based routing mode as well as the relative path mode. This mode is non-ideal for most apps since it's bad for SEO and disables all server-side rendering, but is perfect for our use-case.
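
A minimal sketch of what such a configuration might look like, assuming a SvelteKit version that supports the kit.router.type and kit.paths.relative options. The exact config in this PR may differ; a real config would also set an adapter and other options that are omitted here.

```typescript
// Sketch of the relevant part of a SvelteKit config. Assumption: the real
// svelte.config.js also configures an adapter and other options not shown.
const config = {
  kit: {
    paths: {
      // Emit relative asset and link paths instead of absolute ones.
      relative: true,
    },
    router: {
      // Route on the part of the URL after `#` instead of the pathname.
      type: "hash",
    },
  },
};

// In svelte.config.js this object would be the default export.
console.log(config.kit.router.type, config.kit.paths.relative);
```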

By enabling hash-based routing, we can serve the same index.html file for all paths, and the client-side router on SvelteKit will handle changing the page based on the path after the hash, such as /#/chat/conv-1757280717333.
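
To make the idea concrete, here is a small sketch of how a route can be read from the hash regardless of where the page is hosted. routeFromUrl is a hypothetical helper written for this example; it is not SvelteKit's actual router code.

```typescript
// Hypothetical helper: extract the client-side route from a full URL,
// the way a hash-based router conceptually does.
function routeFromUrl(href: string): string {
  const hash = new URL(href).hash; // e.g. "#/chat/conv-1757280717333"
  // An empty hash is treated as the root route (an assumption of this sketch).
  return hash.length > 1 ? hash.slice(1) : "/";
}

// The same route is recovered whether the app is served at / or /server/.
const atRoot = routeFromUrl("https://example.com/#/chat/conv-1757280717333");
const atSubdir = routeFromUrl("https://example.com/server/#/chat/conv-1757280717333");

console.log(atRoot);              // /chat/conv-1757280717333
console.log(atRoot === atSubdir); // true
```

Because the server never sees the part after the #, it can return the same index.html for every request.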

Caveats

A few things to keep in mind with this approach. All of these have been addressed in this PR:

  • All links to other pages within the SvelteKit project need to start with #. So a link to /chat/${id} would now be #/chat/${id}.
  • We no longer need a +layout.ts with export const csr = true. CSR is always enabled/forced with hash-based routing.
  • Any links to the server, such as /props, need to be changed to be relative.
  • When the server is hosted at a subdirectory, visiting a URL without a trailing slash, such as https://example.com/upstream/mistral3.2 (a real URL that might be used in llama-swap), causes a redirect loop.
    I'm still trying to track down the cause of this. Using a trailing slash fixes this for now (e.g. https://example.com/upstream/mistral3.2/) (edit: this is fixed in llama-swap by mostlygeek/llama-swap#321, "Upstream endpoint should always end with a /").
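
The link and path rules in this list can be sketched with the standard URL constructor. chatHref is a hypothetical helper, and the URLs are the llama-swap style examples from above. The sketch only shows how relative references resolve with and without a trailing slash; it does not claim to explain the redirect loop itself.

```typescript
// Hypothetical helper: build an in-app link for hash-based routing.
function chatHref(id: string): string {
  return `#/chat/${id}`;
}

const withSlash = "https://example.com/upstream/mistral3.2/";
const withoutSlash = "https://example.com/upstream/mistral3.2";

// A fragment-only link keeps the current path and only changes the hash.
const link = new URL(chatHref("conv-1"), withSlash).href;

// A relative server request depends on whether the page URL ends with a slash.
const propsWithSlash = new URL("./props", withSlash).href;
const propsWithoutSlash = new URL("./props", withoutSlash).href;

console.log(link);              // https://example.com/upstream/mistral3.2/#/chat/conv-1
console.log(propsWithSlash);    // https://example.com/upstream/mistral3.2/props
console.log(propsWithoutSlash); // https://example.com/upstream/props
```

Without the trailing slash, the last path segment is treated as a file name and dropped during resolution, so relative requests land one level too high.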

Testing

I tested this locally as much as I could. Everything seems to work well and I couldn't find any broken links manually.

I also tested this with llama-swap and it solves the related issue mostlygeek/llama-swap#306.

Collaborator

@allozaur allozaur left a comment

Hey @isaac-mcfadyen, thank you for this contribution. I've tested this locally on my end and the routing works fine for me.

@ngxson i think that this solution is a better alternative to #16079 due to arguments presented in the PR description and these comments:

So i'd say — let's ship it!

@allozaur allozaur added the server/webui and bugfix labels Sep 22, 2025
@allozaur allozaur self-assigned this Sep 22, 2025
@ngxson
Collaborator

ngxson commented Sep 23, 2025

Looks good to me, I think the hash router is a more portable solution overall.

Btw don't forget to revert the code for handling single page app in server.cpp

@isaac-mcfadyen
Contributor Author

Btw don't forget to revert the code for handling single page app in server.cpp

Thank you for calling that out! Removed the stale code.

@ServeurpersoCom
Collaborator

Tested OK on my side for a generic Apache2 httpd reverse proxy with, and without llama-swap.
Tested NOK for llama-swap /upstream reverse proxy (Error fetching server props: Error: Failed to fetch server props: 404) I need to double-check.

@isaac-mcfadyen
Contributor Author

Tested OK on my side for a generic Apache2 httpd reverse proxy with, and without llama-swap. Tested NOK for llama-swap /upstream reverse proxy (Error fetching server props: Error: Failed to fetch server props: 404) I need to double-check.

That's odd, I tested that exact use-case multiple times and it worked.

Can you confirm what HTTP request is actually failing from the browser DevTools?

Also make sure you completely rebuild llama-server after pulling this PR (the web UI is baked directly into the server binary so the whole thing needs to be rebuilt).

@ServeurpersoCom
Collaborator

Also make sure you completely rebuild llama-server after pulling this PR (the web UI is baked directly into the server binary so the whole thing needs to be rebuilt).

This is the "I need to double-check." :D because I serve my static version from my HTTP server and I had not updated the binaries on my llama-server instance :)

@ServeurpersoCom
Collaborator

[screenshot] All OK

@mostlygeek
Contributor

Tested this with the latest llama-swap and it works as expected.

Member

@ggerganov ggerganov left a comment

@isaac-mcfadyen Could you update to latest master? We'll merge after this.

@ServeurpersoCom
Collaborator

ServeurpersoCom commented Sep 26, 2025

There is a tiny oversight: the "llama.cpp" link in the side pane points to the root of the domain (or local IP).

(root|~/llama.cpp.pascal/tools/server/webui) git grep '"/"'
src/lib/components/app/chat/ChatSidebar/ChatSidebar.svelte:

I tried . or, better, ./ (more explicit).

"./" works for me.

@isaac-mcfadyen isaac-mcfadyen force-pushed the server-webui-hash-routing branch from 4ce4640 to cd9a0b7 Compare September 26, 2025 15:06
@isaac-mcfadyen
Contributor Author

@ServeurpersoCom Fixed that broken link, thanks!
@ggerganov Rebased onto master.

@ggerganov
Member

There is still a conflict to resolve.

@isaac-mcfadyen
Contributor Author

isaac-mcfadyen commented Sep 26, 2025

There is still a conflict to resolve.

Looks like Git is complaining because it doesn't know how to merge the (bundled/built) index.html file, is that right? In which case we probably want to just overwrite with the one from this branch since it has all of the changes? Not exactly sure how to do that since my local git doesn't show any conflict and GitHub won't let me open in the web editor.

@ggerganov
Member

Yes. And you are sure you regenerated the index.html.gz file after the rebase?

@isaac-mcfadyen isaac-mcfadyen force-pushed the server-webui-hash-routing branch from cd9a0b7 to 7bfdc50 Compare September 26, 2025 15:24
@isaac-mcfadyen
Contributor Author

Fixed! I accidentally forgot to sync my fork with the main repo before rebasing. I've done that now, so the conflicts should be resolved. I also rebuilt index.html.gz as part of my latest commit, so it should be up-to-date with the webui/ folder.

@ggerganov ggerganov merged commit e0539eb into ggml-org:master Sep 26, 2025
58 of 63 checks passed
@mostlygeek
Contributor

Thanks to everyone for landing this change! Much appreciation.

struct pushed a commit to struct/llama.cpp that referenced this pull request Sep 26, 2025
…gml-org#16157)

* Switched web UI to hash-based routing

* Added hash to missed goto function call

* Removed outdated SPA handling code

* Fixed broken sidebar home link