Commit d231fac

Merge pull request #1485 from ethereum-optimism/simplify-metadata-logic
fix:simplify-metadata-logic
2 parents d2542ba + 60aa2d4 commit d231fac

9 files changed: +240 -1233 lines changed

.coderabbit.yaml

Lines changed: 3 additions & 4 deletions
@@ -33,11 +33,10 @@ reviews:
 ---
 ```
 3. If any required fields are missing or empty, comment:
-'This file appears to be missing required metadata. You can fix this by running:
+'This file appears to be missing required metadata. Please check keywords.config.yaml for valid options and add the required fields manually. You can validate your changes by running:
 ```bash
-pnpm metadata-batch-cli:dry "path/to/this/file.mdx"
-```
-Review the changes, then run without :dry to apply them.'
+pnpm validate-metadata
+```'
 - Use proper nouns in place of personal pronouns like 'We' and 'Our' to maintain consistency in communal documentation.
 - Avoid gender-specific language and use the imperative form.
 - Monitor capitalization for emphasis. Avoid using all caps, italics, or bold for emphasis.

dist/tsconfig.tsbuildinfo

Lines changed: 0 additions & 1 deletion
This file was deleted.

notes/metadata-update.md

Lines changed: 42 additions & 180 deletions
@@ -1,204 +1,66 @@
-# Metadata Management System
+# Metadata Validation System
 
-Quick guide on using our metadata management system for the OP Stack documentation.
+## Overview
 
-## What the System Does
+This system validates metadata in MDX documentation files against rules defined in `keywords.config.yaml`.
 
-* Validates and updates metadata in .mdx documentation files
-* Ensures consistent metadata across documentation
-* Generates a manifest of processed files
-* Supports dry run mode for previewing changes
-* Automatically detects content categories and types
+## Key Features
 
-## Using the Scripts
+- Validates required metadata fields
+- Ensures only valid keywords are used
+- Provides suggestions from keywords.config.yaml
+- Handles duplicate/imported pages with `is_imported_content` flag
+- No automatic fixes - content writers make manual updates
 
-1. Run a dry run to preview changes:
-```bash
-# Process all .mdx files in a directory and its subdirectories
-pnpm metadata-batch-cli:dry "pages/superchain/**/*.mdx"
+## Using the Validator
 
-# Process a specific file with verbose output
-pnpm metadata-batch-cli:verbose "pages/app-developers/example.mdx"
-
-# Process multiple directories
-pnpm metadata-batch-cli:dry "pages/app-developers/**/*.mdx" "pages/node-operators/**/*.mdx"
-```
-
-2. Apply the changes (remove :dry):
-```bash
-pnpm metadata-batch-cli "pages/app-developers/**/*.mdx"
-```
-
-### Important Note About File Patterns
-
-Use these patterns to match files:
-
-* `directory/**/*.mdx` - matches all .mdx files in a directory and all its subdirectories
-* `directory/*.mdx` - matches only .mdx files in the specific directory
-* `directory/subdirectory/**/*.mdx` - matches all .mdx files in a specific subdirectory tree
-* the quotes around the pattern are important to prevent shell expansion
-
-### Configuration Files
-
-1. **keywords.config.yaml**
-* Located in the project root
-* **Single source of truth** for all valid metadata values
-* Defines validation rules for metadata fields
-* Specifies required fields for different content types
-* Contains keyword mappings for category detection
-* Example configuration:
-```yaml
-metadata_rules:
-topic:
-required: true
-validation_rules:
-- pattern: "^[a-z0-9-]+$"
-description: "Must be lowercase with hyphens"
-personas:
-required: true
-multiple: true
-validation_rules:
-- enum:
-- app-developer
-- node-operator
-- chain-operator
-- protocol-developer
-- partner
-content_type:
-required: true
-validation_rules:
-- enum:
-- tutorial
-- guide
-- reference
-- landing-page
-- troubleshooting
-- notice
-categories:
-required: true
-multiple: true
-validation_rules:
-- enum:
-- protocol
-- security
-- governance
-- tokens
-- standard-bridge
-- interoperable-message-passing
-- devnets
-- infrastructure
+To check your changes before committing:
+```bash
+pnpm validate-metadata
 ```
 
-2. **metadata-types.ts**
-* Defines TypeScript interfaces for metadata
-* Imports and validates valid values from keywords.config.yaml
-* Provides type-safe exports for use throughout the system
-* Used for type checking and validation
-
-## What to Watch For
+This will validate any MDX files you've modified but haven't committed yet. Fix any validation errors before committing your changes.
 
-1. **Before Running**
-* Commit your current changes
-* Ensure you're in the docs root directory
-* Check that keywords.config.yaml exists and is properly configured
-* **Important**: All metadata values must be defined in keywords.config.yaml
+Note: Additional validation will run automatically when you create a PR.
 
-2. **After Running**
-* Review the categories assigned to each file
-* Check that topics and personas are correct
-* Verify any files marked for review
-* Make sure network types (mainnet/testnet) are correct for registry files
+## Metadata Requirements
 
-## Content Analysis
+All valid metadata values are defined in `keywords.config.yaml`. This file is the single source of truth for:
 
-The `metadata-analyzer.ts` script handles automatic content analysis and categorization.
+- Required fields
+- Valid personas
+- Valid content types
+- Valid categories
 
-### How It Works
+### Handling Duplicate Pages
 
-1. **Category Detection**
-* Analyzes file content and paths for relevant keywords
-* Detects appropriate categories based on context
-* Handles special cases for chain operators and node operators
-* Supports parent category inheritance for landing pages
+For pages that are imported or duplicated across sections:
+- Set `is_imported_content: 'true'` for duplicate/imported pages
+- Set `is_imported_content: 'false'` for original pages
+- This helps manage duplicate topics and enables proper filtering
 
-2. **Content Type Detection**
-* Identifies content type (guide, reference, tutorial, etc.)
-* Uses filename patterns and content signals
-* Considers component usage (e.g., <Cards>, <Steps>)
-* Scores content against multiple type indicators
+## When Validation Fails
 
-### Valid Categories
+1. Review the validation errors
+2. Check keywords.config.yaml for valid options
+3. Update your MDX frontmatter manually
+4. Run validation again to confirm fixes
 
-Categories are defined in `keywords.config.yaml`. Check this file for the current list of valid categories. The metadata validation system uses these definitions to ensure consistency across all documentation.
+## Example Valid Frontmatter
 
-Example of how categories are defined:
 ```yaml
-metadata_rules:
-categories:
-required: true
-multiple: true
-validation_rules:
-- enum:
-- protocol
-- security
-- governance
-# ... see keywords.config.yaml for complete list
-```
-
-### Example Analysis
-
-Input file with chain operator content:
-```yaml
----
-title: Genesis Creation
-description: Learn how to create a genesis file.
 ---
-```
-
-Detected categories:
-```yaml
+title: Example Page
+description: A clear description
+topic: example-topic
+personas:
+- app-developer
+content_type: guide
 categories:
 - protocol
-- devnets
-- governance
-- security
+- infrastructure
+is_imported_content: 'false' # Required: 'true' for imported/duplicate pages
+---
 ```
 
-### Special Cases
-
-* **Landing Pages**: Categories are determined by analyzing the child content they link to (typically via `<Cards>` components)
-* **Chain Operators**: Additional category detection for specific features
-* **Node Operators**: Special handling for node operation content
-* **Imported Content**: Skip category review flags for imported content
-
-## Implementation Files
-
-* `utils/metadata-manager.ts`: Main metadata management system
-* `utils/metadata-analyzer.ts`: Content analysis and categorization logic
-* `utils/metadata-batch-cli.ts`: CLI tool for batch updates
-* `utils/types/metadata-types.ts`: TypeScript type definitions
-* `keywords.config.yaml`: Validation rules and keyword mappings
-
-## Automated PR Checks
-
-The documentation repository includes automated checks for metadata completeness:
-
-1. **CircleCI Validation**
-* Automatically runs on all PRs
-* Checks metadata in modified .mdx files
-* Fails if required metadata is missing
-* Run locally with: `pnpm validate-pr-metadata`
-
-2. **CodeRabbit Review**
-* Reviews frontmatter in modified files
-* Checks for required fields based on content type
-* Suggests running metadata-batch-cli when metadata is incomplete
-
-### When to Run Metadata Updates
-
-* When CircleCI metadata validation fails
-* When CodeRabbit suggests missing metadata
-* When adding new documentation files
-* When specifically asked to update metadata
-
-Do not run the script on files that already have correct metadata, as this may overwrite manual customizations.
+Remember: All valid options are in keywords.config.yaml
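
Reviewer note: the new notes describe a validate-only flow (check required fields, check values against `keywords.config.yaml`, report errors, no automatic fixes). The source of `utils/metadata-validator.ts` is not part of this page, so the following is only a minimal sketch of that idea. The `Rules` shape, the `exampleRules` values, and the function name `validateFrontmatter` are all hypothetical; the enum values are copied from the configuration example removed in this diff, and the required-field list is an assumption.

```typescript
// Hypothetical shapes: the real keywords.config.yaml schema and the
// internals of metadata-validator.ts are not shown in this commit.
type Frontmatter = Record<string, unknown>;

interface Rules {
  required: string[];                // fields every page must define
  allowed: Record<string, string[]>; // field name -> valid values
}

// Illustrative rules only; values taken from the removed config example.
const exampleRules: Rules = {
  required: ["topic", "personas", "content_type", "categories", "is_imported_content"],
  allowed: {
    personas: ["app-developer", "node-operator", "chain-operator", "protocol-developer", "partner"],
    content_type: ["tutorial", "guide", "reference", "landing-page", "troubleshooting", "notice"],
    categories: ["protocol", "security", "governance", "tokens", "standard-bridge",
                 "interoperable-message-passing", "devnets", "infrastructure"],
    is_imported_content: ["true", "false"],
  },
};

// Returns a list of human-readable errors; an empty list means the
// frontmatter passed. Nothing is modified, matching "no automatic fixes".
function validateFrontmatter(fm: Frontmatter, rules: Rules): string[] {
  const errors: string[] = [];

  // 1. Required fields must be present and non-empty.
  for (const field of rules.required) {
    const value = fm[field];
    const empty =
      value === undefined ||
      value === null ||
      value === "" ||
      (Array.isArray(value) && value.length === 0);
    if (empty) errors.push(`Missing required field: ${field}`);
  }

  // 2. Fields with a fixed vocabulary may only use listed values.
  for (const field of Object.keys(rules.allowed)) {
    const value = fm[field];
    if (value === undefined || value === null || value === "") continue;
    const valid = rules.allowed[field];
    const values: unknown[] = Array.isArray(value) ? value : [value];
    for (const v of values) {
      if (valid.indexOf(String(v)) === -1) {
        errors.push(`Invalid ${field}: "${String(v)}". Valid options: ${valid.join(", ")}`);
      }
    }
  }

  return errors;
}
```

Listing the valid options in each error message mirrors the "Provides suggestions from keywords.config.yaml" feature described above, so a writer can correct the frontmatter by hand without opening the config file.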

package.json

Lines changed: 3 additions & 6 deletions
@@ -5,7 +5,7 @@
 "type": "module",
 "scripts": {
 "lint": "eslint . --ext mdx --max-warnings 0 && pnpm spellcheck:lint && pnpm check-breadcrumbs && pnpm check-redirects && pnpm validate-metadata && pnpm link-checker",
-"fix": "eslint . --ext mdx --fix && pnpm spellcheck:fix && pnpm fix-redirects && pnpm metadata-batch-cli && pnpm fix-links",
+"fix": "eslint . --ext mdx --fix && pnpm spellcheck:fix && pnpm fix-redirects && pnpm fix-links",
 "spellcheck:lint": "cspell lint \"**/*.mdx\"",
 "prepare": "husky",
 "spellcheck:fix": "cspell --words-only --unique \"**/*.mdx\" | sort --ignore-case | uniq > words.txt",
@@ -15,11 +15,8 @@
 "fix-redirects": "NODE_NO_WARNINGS=1 node --loader ts-node/esm utils/fix-redirects.ts",
 "link-checker": "NODE_NO_WARNINGS=1 node --loader ts-node/esm utils/link-checker.ts",
 "fix-links": "NODE_NO_WARNINGS=1 node --loader ts-node/esm utils/fix-broken-links.ts",
-"metadata-batch-cli": "NODE_NO_WARNINGS=1 node --loader ts-node/esm utils/metadata-batch-cli.ts",
-"metadata-batch-cli:dry": "NODE_NO_WARNINGS=1 node --loader ts-node/esm utils/metadata-batch-cli.ts --dry-run",
-"metadata-batch-cli:verbose": "pnpm metadata-batch-cli --verbose",
-"validate-metadata": "CHANGED_FILES=$(git diff --name-only HEAD) pnpm metadata-batch-cli:dry",
-"validate-pr-metadata": "NODE_NO_WARNINGS=1 node --loader ts-node/esm utils/metadata-manager.ts --pr",
+"validate-metadata": "CHANGED_FILES=$(git diff --name-only --cached) NODE_NO_WARNINGS=1 node --loader ts-node/esm utils/metadata-validator.ts",
+"validate-pr-metadata": "NODE_NO_WARNINGS=1 node --loader ts-node/esm utils/metadata-validator.ts --pr",
 "dev": "next dev",
 "build": "next build",
 "start": "next start",
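
Reviewer note: the new `validate-metadata` script passes the output of `git diff --name-only --cached` through the `CHANGED_FILES` environment variable. `--cached` lists staged changes only, whereas the previous `git diff --name-only HEAD` also included unstaged changes to tracked files, so files may need to be staged with `git add` before this script sees them. How `metadata-validator.ts` consumes the variable is not shown in this commit; the sketch below assumes it receives the raw newline-separated value and keeps only `.mdx` paths. The function name `changedMdxFiles` is hypothetical.

```typescript
// Hypothetical helper: turns the newline-separated output of
// `git diff --name-only` into a list of .mdx paths to validate.
// In the real script the input would presumably come from the
// CHANGED_FILES environment variable (an assumption, not shown in the diff).
function changedMdxFiles(changedFiles: string | undefined): string[] {
  if (!changedFiles) return []; // variable unset or no changes
  return changedFiles
    .split(/\r?\n/)                // one path per line
    .map((line) => line.trim())
    .filter((line) => line.length > 0 && /\.mdx$/.test(line));
}
```

Returning an empty list when the variable is unset or empty lets the caller exit successfully when there is nothing to validate.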
