Commit ffe7e4b

Copilot and mikebarkmin authored
Add llms.txt generation for LLM-optimized documentation export (#1023)
* Initial plan
* Add llms.txt generation feature with documentation and changeset

  Co-authored-by: mikebarkmin <[email protected]>
* Address code review comments - improve type safety and optimize file system operations

  Co-authored-by: mikebarkmin <[email protected]>
* activate llms
* add more instructions to llms.txt

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: mikebarkmin <[email protected]>
Co-authored-by: Mike Barkmin <[email protected]>
1 parent 3c00541 commit ffe7e4b

8 files changed: +176 -7 lines changed

.changeset/llms-txt-generation.md

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+---
+"hyperbook": minor
+"@hyperbook/types": minor
+---
+
+Add llms.txt file generation feature. When the `llms` property is set to `true` in hyperbook.json, a `llms.txt` file will be generated during build that combines all markdown files in order. The file includes the book name and version in the header. Pages and sections with `hide: true` are automatically excluded from the generated file.
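The ordering and exclusion rules described in this changeset can be illustrated with a minimal sketch. The `Page` and `Section` shapes below are simplified, hypothetical stand-ins for the real types in `@hyperbook/types`, and `visibleNames` is an illustrative helper that is not part of the commit. It only tracks names, whereas the actual implementation also looks up each file by `href` before emitting anything.

```typescript
// Hypothetical, simplified shapes; the real types live in @hyperbook/types.
type Page = { name: string; hide?: boolean; isEmpty?: boolean };
type Section = Page & { pages?: Page[]; sections?: Section[] };

// Returns the names that would appear as headings, in output order:
// root-level pages first, then each section (own content, pages, subsections).
function visibleNames(pages: Page[], sections: Section[]): string[] {
  const names: string[] = [];

  const visitPage = (page: Page): void => {
    // Hidden or empty pages are skipped entirely.
    if (page.hide || page.isEmpty) return;
    names.push(page.name);
  };

  const visitSection = (section: Section): void => {
    // A hidden section hides its whole subtree.
    if (section.hide) return;
    // An empty section contributes no heading, but its children are still visited.
    if (!section.isEmpty) names.push(section.name);
    (section.pages ?? []).forEach((p) => visitPage(p));
    (section.sections ?? []).forEach((s) => visitSection(s));
  };

  pages.forEach((p) => visitPage(p));
  sections.forEach((s) => visitSection(s));
  return names;
}
```

Note that, under these assumptions, a page nested inside a hidden section is excluded even if the page itself does not set `hide`.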

packages/hyperbook/build.ts

Lines changed: 162 additions & 7 deletions
@@ -11,6 +11,9 @@ import {
   Link,
   Hyperproject,
   HyperbookContext,
+  HyperbookJson,
+  HyperbookPage,
+  HyperbookSection,
   Navigation,
 } from "@hyperbook/types";
 import lunr from "lunr";
@@ -19,6 +22,139 @@ import packageJson from "./package.json";
 
 export const ASSETS_FOLDER = "__hyperbook_assets";
 
+/**
+ * Generates an llms.txt file by combining all markdown files in order
+ */
+async function generateLlmsTxt(
+  root: string,
+  rootOut: string,
+  hyperbookJson: HyperbookJson,
+  pagesAndSections: Pick<Navigation, "pages" | "sections" | "glossary">,
+  version: string,
+): Promise<void> {
+  const lines: string[] = [];
+
+  // Add header with book name and version
+  lines.push(`<SYSTEM>${hyperbookJson.name} - Version ${version}</SYSTEM>`);
+  lines.push(""); // Empty line after header
+
+  // Get all book files once to avoid repeated file system operations
+  const allFiles = await vfile.listForFolder(root, "book");
+
+  // Helper function to recursively process sections and pages
+  const processSection = async (
+    section: HyperbookSection,
+    level: number = 0,
+  ): Promise<void> => {
+    // Skip if hidden
+    if (section.hide) {
+      return;
+    }
+
+    // Add section header if it has content
+    if (section.href && !section.isEmpty) {
+      const file = allFiles.find((f) => f.path.href === section.href);
+      if (file) {
+        // Add section name as a header
+        lines.push(`# ${section.name}`);
+        lines.push("");
+
+        // Get the markdown content without frontmatter
+        const content = file.markdown.content.trim();
+        if (content) {
+          lines.push(content);
+          lines.push(""); // Empty line after content
+        }
+      }
+    }
+
+    // Process nested pages
+    if (section.pages) {
+      for (const page of section.pages) {
+        await processPage(page);
+      }
+    }
+
+    // Process nested sections
+    if (section.sections) {
+      for (const subsection of section.sections) {
+        await processSection(subsection, level + 1);
+      }
+    }
+  };
+
+  const processPage = async (page: HyperbookPage): Promise<void> => {
+    // Skip if hidden or empty
+    if (page.hide || page.isEmpty) {
+      return;
+    }
+
+    if (page.href) {
+      const file = allFiles.find((f) => f.path.href === page.href);
+      if (file) {
+        // Add page name as a header
+        lines.push(`# ${page.name}`);
+        lines.push("");
+
+        // Get the markdown content without frontmatter
+        const content = file.markdown.content.trim();
+        if (content) {
+          lines.push(content);
+          lines.push(""); // Empty line after content
+        }
+      }
+    }
+  };
+
+  // Process root-level pages first
+  for (const page of pagesAndSections.pages) {
+    await processPage(page);
+  }
+
+  // Process sections
+  for (const section of pagesAndSections.sections) {
+    await processSection(section);
+  }
+
+  // Write the llms.txt file
+  lines.push(`
+When you are writing a hyperbook syntax you must use markdown plus the documented elements, also keep track of colons. Colons act like parentheses in programming languages. Every opening colon must have a closing colon. For example, in the syntax :bold text: the first colon opens the bold formatting and the second colon closes it. If there is a missing colon, it can lead to formatting errors or unexpected behavior in the rendered output. Always ensure that colons are properly paired to maintain the intended structure and appearance of your hyperbook content.
+
+Single colons are inline elements.
+
+Example: :t[Test]
+
+Double colons are block elements.
+
+Example: ::p5{src="sketch.js"}
+
+Triple colons are special elements that can contain other elements inside them.
+
+Example:
+
+:::::alert{info}
+
+::::tabs
+
+:::tab{title="JavaScript"}
+
+Hi
+
+:::
+
+
+::::
+
+:::::
+
+When you want to nest elements you need to increase the number of colons by one for each level of nesting. The outer level should have the most colons.
+
+Also you need to use unique ids when the element supports it.
+`);
+  const llmsTxtContent = lines.join("\n");
+  await fs.writeFile(path.join(rootOut, "llms.txt"), llmsTxtContent);
+}
+
 export async function runBuildProject(
   project: Hyperproject,
   rootProject: Hyperproject,
@@ -390,17 +526,17 @@ async function runBuild(
 
   if (!faviconExists && hyperbookJson.logo) {
     console.log(`${chalk.blue(`[${prefix}]`)} Generating favicons from logo.`);
-
+
     // Only generate if logo is a local file (not a URL)
     if (!hyperbookJson.logo.includes("://")) {
       let logoPath: string | null = null;
-
+
       // Resolve logo path by checking multiple locations
       if (hyperbookJson.logo.startsWith("/")) {
         // Absolute path starting with / - check book folder, then public folder
         const bookPath = path.join(root, "book", hyperbookJson.logo);
         const publicPath = path.join(root, "public", hyperbookJson.logo);
-
+
         try {
           await fs.access(bookPath);
           logoPath = bookPath;
@@ -417,7 +553,7 @@ async function runBuild(
         const rootPath = path.join(root, hyperbookJson.logo);
         const bookPath = path.join(root, "book", hyperbookJson.logo);
         const publicPath = path.join(root, "public", hyperbookJson.logo);
-
+
         try {
           await fs.access(rootPath);
           logoPath = rootPath;
@@ -435,11 +571,18 @@ async function runBuild(
           }
         }
       }
-
+
       if (logoPath) {
         try {
-          const { generateFavicons } = await import("./helpers/generate-favicons");
-          await generateFavicons(logoPath, rootOut, hyperbookJson, ASSETS_FOLDER);
+          const { generateFavicons } = await import(
+            "./helpers/generate-favicons"
+          );
+          await generateFavicons(
+            logoPath,
+            rootOut,
+            hyperbookJson,
+            ASSETS_FOLDER,
+          );
           console.log(
             `${chalk.green(`[${prefix}]`)} Favicons generated successfully.`,
           );
@@ -598,5 +741,17 @@ const SEARCH_DOCUMENTS = ${JSON.stringify(documents)};
     ),
   );
 
+  // Generate llms.txt if enabled
+  if (hyperbookJson.llms) {
+    console.log(`${chalk.blue(`[${prefix}]`)} Generating llms.txt`);
+    await generateLlmsTxt(
+      root,
+      rootOut,
+      hyperbookJson,
+      pagesAndSections,
+      packageJson.version,
+    );
+  }
+
   console.log(`${chalk.green(`[${prefix}]`)} Build success: ${rootOut}`);
 }
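
The instruction text appended to llms.txt in this file states that each nesting level adds one colon and that the outermost element carries the most. A minimal sketch of that rule follows. It assumes the innermost container uses three colons, as in the `alert` / `tabs` / `tab` example from the diff. `nestDirectives` is a hypothetical helper for illustration and is not part of Hyperbook.

```typescript
// Hypothetical helper: wraps `body` in container directives, outermost first.
// Assumes the innermost container uses three colons and each outer level adds one.
function nestDirectives(directives: string[], body: string): string {
  const depth = directives.length;
  // Opening fences: the outermost directive gets the most colons.
  const open = directives.map(
    (directive, i) => ":".repeat(3 + (depth - 1 - i)) + directive,
  );
  // Closing fences: the innermost closes first, so colon counts increase.
  const close = directives.map((_, i) => ":".repeat(3 + i));
  return [...open, body, ...close].join("\n");
}
```

Calling `nestDirectives(["alert{info}", "tabs", 'tab{title="JavaScript"}'], "Hi")` yields the same fence sequence as the example in the instruction text, without the blank lines between fences.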

packages/types/src/index.ts

Lines changed: 1 addition & 0 deletions
@@ -93,6 +93,7 @@ export type HyperbookJson = {
   search?: boolean;
   qrcode?: boolean;
   toc?: boolean;
+  llms?: boolean;
   author?: {
     name?: string;
     url?: string;

platforms/vscode/schemas/hyperbook.schema.json

Lines changed: 3 additions & 0 deletions
@@ -280,6 +280,9 @@
         },
         "type": "array"
       },
+      "llms": {
+        "type": "boolean"
+      },
       "logo": {
         "type": "string"
       },

website/de/book/configuration/book.md

Lines changed: 1 addition & 0 deletions
@@ -38,6 +38,7 @@ von Optionen, die du definieren kannst. Optionen mit einem "\*" müssen gesetzt
 | allowDangerousHtml | Erlaube HTML im Hyperbook. Dies kann zu Inkompatibilität in zukünftigen Versionen führen. |
 | qrcode | Zeigt ein Icon, um einen QR-Code zur aktuellen Seite anzuzeigen. |
 | toc | Zeige ein Inhaltsverzeichnis. Diese ist standardmäßig aktiviert für Seiten und deaktiviert für Begriffe im Glossar. |
+| llms | Wenn auf true gesetzt, wird eine llms.txt-Datei generiert, die alle Markdown-Dateien in Reihenfolge kombiniert. Die Datei enthält den Buchnamen und die Version im Header-Format. |
 | trailingSlash | Exportiert alle Datei in eigene Verzeichnisse und erzeugt nur index.html Dateien. |
 | importExport | Ermöglicht das Importieren und Exportieren des Zustands des Hyperbooks als Datei. Schaltflächen zum Importieren und Exportieren befinden sich am unteren Rand der Seite. |
 

website/de/hyperbook.json

Lines changed: 1 addition & 0 deletions
@@ -2,6 +2,7 @@
   "name": "Hyperbook Dokumenation",
   "qrcode": true,
   "search": true,
+  "llms": true,
   "importExport": true,
   "description": "Dokumentation für Hyperbook erstellt mit Hyperbook",
   "author": {

website/en/book/configuration/book.md

Lines changed: 1 addition & 0 deletions
@@ -37,6 +37,7 @@ can and part wise must set (indicated by a \*).
 | allowDangerousHtml | Allow HTML. This can lead to incompatibilities in future versions. |
 | qrcode | Shows an icon, which opens a qr code to the current page. |
 | toc | Show or hide a table of content for the page. This is on for pages and off for glossary entries by default |
+| llms | When set to true, generates an llms.txt file that combines all markdown files in order. The file includes the book name and version in a header format. |
 | trailingSlash | Outputs all files into ther own folders and produces only index.html files. |
 | importExport | Allows to import and export the state of the Hyperbook as a file. Buttons for importing and exporting will be at the bottom of the page. |
 
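
To make the "header format" in the `llms` row concrete, here is a minimal sketch of how the generated text is laid out, following the `lines.push` calls in `packages/hyperbook/build.ts` from this commit. `PageText` and `renderLlmsTxt` are hypothetical names for illustration. The sketch assumes the pages are already filtered and ordered, and it leaves out the trailing block of syntax instructions that the real function appends. In the diff, the version passed in is `packageJson.version` of the hyperbook package.

```typescript
// Hypothetical input shape: one entry per visible page, already in navigation order.
type PageText = { name: string; content: string };

// Lays out the header line followed by one "# name" heading per page.
function renderLlmsTxt(
  bookName: string,
  version: string,
  pages: PageText[],
): string {
  const lines: string[] = [
    `<SYSTEM>${bookName} - Version ${version}</SYSTEM>`,
    "", // empty line after the header
  ];
  for (const page of pages) {
    lines.push(`# ${page.name}`, "");
    const content = page.content.trim();
    if (content) {
      lines.push(content, ""); // empty line after the content
    }
  }
  return lines.join("\n");
}
```

A page whose content is only whitespace still gets its heading in this sketch, which mirrors the `if (content)` check in the diff.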

website/en/hyperbook.json

Lines changed: 1 addition & 0 deletions
@@ -1,6 +1,7 @@
 {
   "name": "Hyperbook Documentation",
   "qrcode": true,
+  "llms": true,
   "importExport": true,
   "search": true,
   "description": "Documentation for Hyperbook created with Hyperbook",
