14 changes: 9 additions & 5 deletions .github/workflows/cicd.yml
@@ -35,16 +35,20 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
ruby-version: ['3.1', '3.2', '3.3', '3.4', 'jruby-10.0.1.0']
rails-version: ['rails-7.1', 'rails-7.2', 'rails-8.0']
ruby-version: ['3.1', '3.2', '3.3', '3.4', 'jruby-10.0.2.0']
rails-version: ['rails-7.1', 'rails-7.2', 'rails-8.0', 'rails-8.1']
exclude:
# Rails 8 requires Ruby 3.2+
- ruby-version: '3.1'
rails-version: 'rails-8.0'
- ruby-version: '3.1'
rails-version: 'rails-8.1'
# JRuby only supports up to 7.1 right now
- ruby-version: 'jruby-10.0.1.0'
- ruby-version: 'jruby-10.0.2.0'
rails-version: 'rails-8.1'
- ruby-version: 'jruby-10.0.2.0'
rails-version: 'rails-8.0'
- ruby-version: 'jruby-10.0.1.0'
- ruby-version: 'jruby-10.0.2.0'
rails-version: 'rails-7.2'

steps:
@@ -200,4 +204,4 @@ jobs:
fi
}
env:
GEM_HOST_API_KEY: "${{secrets.RUBYGEMS_AUTH_TOKEN}}"
GEM_HOST_API_KEY: "${{secrets.RUBYGEMS_AUTH_TOKEN}}"
1 change: 1 addition & 0 deletions .gitignore
@@ -48,6 +48,7 @@ build-iPhoneSimulator/
# for a library or gem, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
Gemfile.lock
gemfiles/*.lock
# .ruby-version
# .ruby-gemset

6 changes: 6 additions & 0 deletions Appraisals
@@ -17,3 +17,9 @@ appraise 'rails-8.0' do
gem 'rails', '~> 8.0.0'
end
end

appraise 'rails-8.1' do
group :development do
gem 'rails', '~> 8.1.0'
end
end
5 changes: 5 additions & 0 deletions README.md
@@ -74,6 +74,11 @@ RubyLLM.embed "Ruby is elegant and expressive"
RubyLLM.transcribe "meeting.wav"
```

```ruby
# Text to speech
RubyLLM.tts "Hello, welcome to RubyLLM!"
```

```ruby
# Moderate content for safety
RubyLLM.moderate "Check if this text is safe"
96 changes: 96 additions & 0 deletions docs/_core_features/text-to-speech.md
@@ -0,0 +1,96 @@
---
layout: default
title: Text to Speech
nav_order: 7
description: Convert text to speech
redirect_from:
- /guides/audio-transcription
- /guides/transcription
---

# {{ page.title }}
{: .d-inline-block .no_toc }

v1.9.0+
{: .label .label-green }

{{ page.description }}
{: .fs-6 .fw-300 }

## Table of contents
{: .no_toc .text-delta }

1. TOC
{:toc}

---

After reading this guide, you will know:

* How to generate speech from text.
* How to save audio files.
* How to select different voices.
* How to access raw audio data.
* Specifics of language support.

## Basic Text to Speech

Generate audio with the global `RubyLLM.tts` method:

```ruby
audio = RubyLLM.tts("Hello, welcome to RubyLLM!")

```

## Save Audio File
You can save the generated audio to a file.
If you are using OpenAI, the audio will be saved as an MP3 file.

```ruby
audio = RubyLLM.tts("This is a text to speech example.", provider: :openai, model: "gpt-4o-mini-tts")
audio.save("example.mp3")
```

If you are using Gemini, the audio will be saved as a raw PCM file.

```ruby
audio = RubyLLM.tts("This is a text to speech example.", provider: :gemini, model: "gemini-2.5-flash-preview-tts")
audio.save("example.pcm")
```

You can convert it to MP3 using ffmpeg:

```bash
ffmpeg -f s16le -ar 24000 -ac 1 -i example.pcm example.mp3
```
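
If ffmpeg is not available, the raw PCM can also be made playable by prepending a WAV header in plain Ruby. The following is a minimal sketch, not part of RubyLLM: `pcm_to_wav` is a hypothetical helper, and it assumes the same layout as the ffmpeg flags above (16-bit signed little-endian samples, mono, 24 kHz).

```ruby
# Hypothetical helper (not a RubyLLM API): wrap raw PCM samples in a
# minimal 44-byte RIFF/WAVE header so common players can open the file.
# Assumed layout: 16-bit signed little-endian, mono, 24 kHz.
def pcm_to_wav(pcm, sample_rate: 24_000, channels: 1, bits_per_sample: 16)
  pcm = pcm.b # treat the input as binary data
  block_align = channels * bits_per_sample / 8 # bytes per sample frame
  byte_rate = sample_rate * block_align        # bytes per second

  header = [
    "RIFF", 36 + pcm.bytesize, "WAVE",         # RIFF chunk descriptor
    "fmt ", 16, 1, channels, sample_rate,      # fmt chunk, format 1 = PCM
    byte_rate, block_align, bits_per_sample,
    "data", pcm.bytesize                       # data chunk header
  ].pack("a4Va4a4VvvVVvva4V")

  header + pcm
end
```

With the Gemini example above, `File.binwrite("example.wav", pcm_to_wav(audio.data))` should produce a playable WAV file, provided the returned data really uses the assumed sample format.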

### Select Voice
You can specify different voices. Supported voices for OpenAI
are alloy, ash, ballad, coral, echo, fable, onyx, nova, sage, shimmer, and verse.

For Gemini, see the list of [Gemini voices](https://ai.google.dev/gemini-api/docs/speech-generation#voices).

```ruby
# Using a specific voice
voice = "ash"
audio = RubyLLM.tts("Hello, this is the #{voice} voice.", voice: voice)
```

### Access Audio Data
You can access the raw audio data:

```ruby
audio = RubyLLM.tts("Accessing raw audio data.")
audio.data # => binary audio data (MP3 for OpenAI, PCM for Gemini)
```
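
Because the data is a plain byte string, you can inspect it directly. As an illustration, the sketch below estimates the playback length of raw PCM output from its size. `pcm_duration_seconds` is a hypothetical helper, not a RubyLLM method, and it assumes 16-bit mono samples at 24 kHz (the layout used in the ffmpeg command in this guide). It does not apply to MP3 data, which is compressed.

```ruby
# Hypothetical helper (not a RubyLLM API): estimate the duration of raw
# PCM audio from its byte size.
# Assumed layout: 2 bytes per sample, mono, 24 kHz.
def pcm_duration_seconds(data, sample_rate: 24_000, channels: 1, bytes_per_sample: 2)
  data.bytesize.to_f / (sample_rate * channels * bytes_per_sample)
end
```

For example, `pcm_duration_seconds(audio.data)` on a Gemini response would give the approximate length of the clip in seconds under those assumptions.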

### Language Support
OpenAI and Gemini detect the language automatically from the text provided.
Previously, Gemini let you specify the language manually.

## Next Steps

* [Chatting with AI Models]({% link _core_features/chat.md %}): Learn about conversational AI.
* [Image Generation]({% link _core_features/image-generation.md %}): Generate images from text.
* [Error Handling]({% link _advanced/error-handling.md %}): Master handling API errors.

5 changes: 5 additions & 0 deletions docs/index.md
@@ -138,6 +138,11 @@ RubyLLM.embed "Ruby is elegant and expressive"
RubyLLM.transcribe "meeting.wav"
```

```ruby
# Text to speech
RubyLLM.tts "Hello, welcome to RubyLLM!"
```

```ruby
# Moderate content for safety
RubyLLM.moderate "Check if this text is safe"