A Node.js command-line tool that converts audio files (MP3/AAC) to text and performs comprehensive text analysis including summarization, keyword extraction, sentiment analysis, and more.
audio-text-analyzer/
├── src/
│   └── index.js          # Main application
├── data/input/                # Place your audio files here
├── data/output/               # Analysis reports & transcripts are saved here
├── .vscode/              # VSCode configuration
│   ├── launch.json       # Debug configurations
│   └── tasks.json        # Task configurations
├── package.json
├── .gitignore
└── README.md
- Node.js (v14 or higher)
- Python 3.7+ (required for Whisper)
- FFmpeg (for audio processing)
# macOS (Homebrew)
brew install ffmpeg
# Debian/Ubuntu
sudo apt update && sudo apt install -y ffmpeg
- Install Node.js dependencies:
# Using npm
npm install
# Or using yarn
yarn install
- Set up a Python virtual environment and install Whisper (CLI):
# Create virtual environment at project root (expected by src/index.js)
python3 -m venv venv
# Activate virtual environment
source venv/bin/activate   # Linux/macOS
# Install Whisper CLI
pip install openai-whisper
- For future runs, activate the virtual environment before using the tool:
source venv/bin/activate   # Linux/macOS
# Convert audio to text (auto or specified language)
node src/audioToText.js data/input/sample.mp3 -l en
# Save report and transcript
node src/audioToText.js data/input/sample.mp3 -o data/output/analysis.txt
# Convert SRT subtitles to plain text
node src/subtitleToText.js data/input/sample.srt
# Using npm scripts
npm run analyze              # Analyzes data/input/sample.mp3
npm run analyze:output       # Saves to data/output/analysis.txt
- -l, --language <lang>: Language code (en, it, fr, es, de, etc.); defaults to auto-detect
- -o, --output <file>: Save the full report to a file; the transcript is also saved separately to data/output/<input_basename>.txt
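The transcript location described above can be derived from the input file name. The following is a minimal sketch, assuming the basename is the input file name without its extension and that the output directory is data/output; the helper name `transcriptPathFor` is hypothetical and is not taken from the tool's source.

```javascript
const path = require('path');

// Hypothetical helper: maps an input audio path to the transcript path
// data/output/<input_basename>.txt described in the options above.
function transcriptPathFor(inputFile, outputDir = path.join('data', 'output')) {
  // Strip the directory and the extension (.mp3, .aac, ...) from the input path.
  const base = path.basename(inputFile, path.extname(inputFile));
  return path.join(outputDir, `${base}.txt`);
}

console.log(transcriptPathFor('data/input/sample.mp3'));
```

The path is built with `path.join` so the separator matches the host platform.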
The tool shows real-time progress during transcription:
Starting audio analysis...
Starting transcription...
Detected language: Italian
[0%] Transcribing audio...
[25%] Transcribing audio...
[50%] Transcribing audio...
[100%] Transcribing audio...
Analyzing text...
Generating report...
Analysis complete!
- Place your audio file in the data/input/ directory
- Use Ctrl+Shift+P → "Tasks: Run Task" → Select task
- Or use F5 to debug with the configured launch settings
Available tasks:
- Install Dependencies: Run npm install
- Run Audio Analyzer: Analyze sample file
- Run with Output: Analyze and save to output directory
- Speech-to-Text: Uses OpenAI Whisper CLI for accurate transcription with real-time progress
- Language Support: Auto-detect or manually specify language (Italian, French, English, Spanish, German, etc.)
- Summarization: Extractive summary of key sentences
- Keyword Extraction: Top 10 relevant keywords
- Sentiment Analysis: Overall sentiment with scoring
- Named Entity Recognition: People, places, organizations
- Topic Modeling: Top 5 topics based on word frequency
- Reading Statistics: Word count and estimated reading time
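To illustrate the frequency-based topic ranking and the reading statistics listed above, here is a small sketch. It is not the tool's implementation: the stop-word list, the 200 words-per-minute reading speed, and the function names `tokenize`, `topWords`, and `readingStats` are all assumptions made for the example.

```javascript
// Assumed reading speed; the tool's actual value is not documented here.
const WORDS_PER_MINUTE = 200;

// Small illustrative stop-word list (an assumption, not the tool's list).
const STOP_WORDS = new Set(['the', 'a', 'an', 'and', 'or', 'of', 'to', 'in', 'is', 'it', 'on', 'for']);

// Split text into lowercase word tokens (Unicode letters, digits, apostrophes).
function tokenize(text) {
  return text.toLowerCase().match(/[\p{L}\p{N}']+/gu) || [];
}

// Return the n most frequent non-stop-words as [word, count] pairs.
// Ties are broken alphabetically so the result is deterministic.
function topWords(text, n = 5) {
  const counts = new Map();
  for (const word of tokenize(text)) {
    if (STOP_WORDS.has(word)) continue;
    counts.set(word, (counts.get(word) || 0) + 1);
  }
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .slice(0, n);
}

// Word count plus estimated reading time, rounded up to whole minutes.
function readingStats(text) {
  const wordCount = tokenize(text).length;
  return { wordCount, minutes: Math.ceil(wordCount / WORDS_PER_MINUTE) };
}

const sample = 'The cat sat on the mat. The cat slept.';
console.log(topWords(sample, 2));
console.log(readingStats(sample));
```

Raw word frequency is a crude proxy for topics; it works reasonably for short transcripts but tends to favor common words unless the stop-word list matches the transcript's language.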
- Shows full report including transcript, analysis, and statistics
- Transcript file: Always saved to data/output/<input_basename>.txt
- Report file (when using the -o option): Complete analysis report at the specified path
- MP3
- AAC
- English (en)
- Italian (it)
- French (fr)
- Spanish (es)
- German (de)
- And many more supported by Whisper
- Ensure the Python venv is created at the project root so the Whisper CLI is available at venv/bin/whisper, as expected by the app.
- If Whisper or FFmpeg is not found, verify that the virtual environment is activated and that the dependencies are installed.
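The two checks above can be automated with a short preflight script. This is a sketch only, assuming a Linux/macOS venv layout (venv/bin/whisper) and that FFmpeg is invoked as `ffmpeg` from the PATH; the function `checkEnvironment` is hypothetical and not part of the tool.

```javascript
const fs = require('fs');
const path = require('path');
const { spawnSync } = require('child_process');

// Hypothetical preflight check, not part of the tool itself.
function checkEnvironment(projectRoot = process.cwd()) {
  // The app is described as expecting the Whisper CLI at venv/bin/whisper.
  const whisperPath = path.join(projectRoot, 'venv', 'bin', 'whisper');
  const whisperFound = fs.existsSync(whisperPath);

  // `ffmpeg -version` exits with status 0 when FFmpeg is on the PATH.
  // If the binary is missing, spawnSync sets result.error instead of throwing.
  const result = spawnSync('ffmpeg', ['-version'], { stdio: 'ignore' });
  const ffmpegFound = !result.error && result.status === 0;

  return { whisperPath, whisperFound, ffmpegFound };
}

console.log(checkEnvironment());
```

Run it from the project root; a `false` value for `whisperFound` suggests the venv is missing or was created elsewhere, and a `false` value for `ffmpegFound` suggests FFmpeg is not installed or not on the PATH.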