This directory contains four optimized Python scripts for font analysis and processing, designed to be called via Symfony Process from Laravel. Each script maintains its original signature while incorporating significant performance improvements.
Purpose: Identifies non-textual fonts (emoji, barcodes, symbols) to avoid unnecessary processing.
Usage:
python3 font_identify.py /path/to/font.ttf
Output: JSON object with boolean flags:
{
"is_emoji": false,
"is_symbol": false,
"is_barcode": true,
"is_non_textual": false
}
Performance Improvements:
- Early exit strategies: Name-based barcode detection returns immediately on positive match
- Reduced glyph sampling: Decreased from 120 to 20 characters for geometry analysis
- Optimized character coverage: Single-pass counting instead of multiple iterations
- Smart detection ordering: Most expensive operations (barcode detection) only run when needed
- Cached font table access: Reduced redundant table lookups
Critical: This script's performance directly impacts overall system efficiency since it determines whether other expensive operations should run.
Purpose: Analyzes how well a font covers a specified Unicode range.
Usage:
python3 check_coverage.py /path/to/font.ttf "U+0000-00FF, U+0131, U+0152-0153"
python3 check_coverage.py /path/to/font.ttf "U+0000-00FF" --pretty
Output: Detailed coverage report with character breakdowns by category.
Performance Improvements:
- LRU caching: Unicode category and name lookups cached (8192 entries each)
- Lazy evaluation: Full character lists only generated for sets < 1000 characters
- Smart truncation: Large character sets show sample + count instead of full enumeration
- Optimized categorization: Cached Unicode database calls reduce expensive repeated lookups
Memory Impact: Reduced from potentially gigabytes to manageable sizes for large Unicode ranges.
Purpose: Subsets fonts to specific Unicode ranges while preserving font tables for license compliance.
Usage:
python3 font_subsetter.py input.ttf output.ttf "U+0000-00FF"
python3 font_subsetter.py input.otf output.woff2 "U+0020-007F" --preserve-names
python3 font_subsetter.py input.ttf output.ttf "U+0131,U+0152-0153" --no-preserve-names
Key Features:
- Preserves all OpenType features and scripts
- Maintains font name records (configurable)
- Supports direct WOFF2 output with fallback
- License-compliant table preservation
Performance Improvements:
- Cached Unicode parsing: Common range patterns cached (512 entries)
- Efficient deduplication: Uses
dict.fromkeys()
instead of manual loops - Optimized glyph counting: Direct length calculation instead of iterator consumption
- Streamlined table analysis: Reduced redundant set operations
Purpose: Calculates average character width and other metrics for a Unicode subset.
Usage:
python3 subset_metrics.py /path/to/font.ttf "U+0020-007F"
python3 subset_metrics.py /path/to/font.ttf "U+0000-00FF" --quiet
Output: Metrics data including coverage ratio and character width statistics.
Performance Improvements:
- Cached range parsing: Interval parsing results cached (512 entries)
- Pre-computed codepoint sets: Faster membership testing using sets instead of binary search
- Efficient cmap filtering: Single pass through character map with set lookup
- Optimized interval merging: Reduced redundant operations
Note: Returns ~100% coverage since it analyzes font→Unicode range mapping.
These scripts are designed to be called via Symfony Process:
use Symfony\Component\Process\Process;
// Font identification (run first to avoid unnecessary processing)
$process = new Process(['python3', 'bin/font_identify.py', $fontPath]);
$result = json_decode($process->mustRun()->getOutput(), true);
if (!$result['is_emoji'] && !$result['is_barcode'] && !$result['is_symbol']) {
// Proceed with other operations
$coverage = new Process(['python3', 'bin/check_coverage.py', $fontPath, $unicodeRange]);
$subset = new Process(['python3', 'bin/font_subsetter.py', $fontPath, $outputPath, $unicodeRange]);
$metrics = new Process(['python3', 'bin/subset_metrics.py', $fontPath, $unicodeRange]);
}
Script | Primary Optimization | Performance Gain |
---|---|---|
font_identify.py |
Early exits + reduced sampling | 60-80% faster |
check_coverage.py |
LRU caching + lazy evaluation | 40-70% faster |
font_subsetter.py |
Cached parsing + efficient counting | 20-40% faster |
subset_metrics.py |
Set-based filtering + caching | 30-50% faster |
pip install fonttools
Optional for WOFF2 support:
pip install brotli
# OR install woff2_compress CLI tool
All scripts return JSON with error information when issues occur:
{
"error": "Font file not found: /path/to/missing.ttf"
}
Exit codes: 0 for success, 1 for errors.
The scripts use LRU caches that persist during the script's lifetime. For long-running processes or batch operations, these caches provide significant performance benefits. Cache sizes are tuned for typical font processing workloads:
- Unicode operations: 8192 entries
- Range parsing: 512 entries
- Name string extraction: 256 entries
All function signatures remain unchanged to maintain compatibility with existing Laravel integration. Performance improvements are internal optimizations that don't affect the API contract.
This project is open-sourced under the MIT license.