
Commit 735e36b

dansingerman and crmne authored
Added proper handling of streaming error responses across both Faraday V1 and V2 (#273)
## What this does

When used within our app, streaming error responses were throwing an error and not being properly handled:

```
worker | D, [2025-07-03T18:49:52.221013 #81269] DEBUG -- RubyLLM: Received chunk: event: error
worker | data: {"type":"error","error":{"details":null,"type":"overloaded_error","message":"Overloaded"} }
worker |
worker |
worker | 2025-07-03 18:49:52.233610 E [81269:sidekiq.default/processor chat_agent.rb:42] {jid: 7382519287f08cfa7cd1e4e4, queue: default} Rails -- Error in ChatAgent#send_with_streaming: NoMethodError - undefined method `merge' for nil:NilClass
worker |
worker |         error_response = env.merge(body: JSON.parse(error_data), status: status)
worker |                              ^^^^^^
worker | 2025-07-03 18:49:52.233852 E [81269:sidekiq.default/processor chat_agent.rb:43] {jid: 7382519287f08cfa7cd1e4e4, queue: default} Rails -- Backtrace: /Users/dansingerman/.rbenv/versions/3.1.6/lib/ruby/gems/3.1.0/gems/ruby_llm-1.3.1/lib/ruby_llm/streaming.rb:91:in `handle_error_chunk'
worker | /Users/dansingerman/.rbenv/versions/3.1.6/lib/ruby/gems/3.1.0/gems/ruby_llm-1.3.1/lib/ruby_llm/streaming.rb:62:in `process_stream_chunk'
worker | /Users/dansingerman/.rbenv/versions/3.1.6/lib/ruby/gems/3.1.0/gems/ruby_llm-1.3.1/lib/ruby_llm/streaming.rb:70:in `block in legacy_stream_processor'
worker | /Users/dansingerman/.rbenv/versions/3.1.6/lib/ruby/gems/3.1.0/gems/faraday-net_http-1.0.1/lib/faraday/adapter/net_http.rb:113:in `block in perform_request'
worker | /Users/dansingerman/.rbenv/versions/3.1.6/lib/ruby/gems/3.1.0/gems/net-protocol-0.2.2/lib/net/protocol.rb:535:in `call_block'
worker | /Users/dansingerman/.rbenv/versions/3.1.6/lib/ruby/gems/3.1.0/gems/net-protocol-0.2.2/lib/net/protocol.rb:526:in `<<'
worker | /Users/dansingerman/.rbenv/versions/3.1.6/lib/ruby/gems/3.1.0/gems/net-protocol-0.2.2/lib/net/protocol.rb
```

It looks like the [introduction of support for Faraday V1](#173) introduced this error, as the error handling relies on an `env` that is no longer passed. This should provide a fix for both V1 and V2.

One thing to note: I had to manually construct the VCR cassettes, and I'm not sure of a better way to test an intermittent error response. I have also only written the tests against `anthropic/claude-3-5-haiku-20241022`; it's possible other models with a different error format may still not be properly handled, but even in that case it won't error for the reasons fixed here.

## Type of change

- [x] Bug fix
- [ ] New feature
- [ ] Breaking change
- [ ] Documentation
- [ ] Performance improvement

## Scope check

- [x] I read the [Contributing Guide](https://github.com/crmne/ruby_llm/blob/main/CONTRIBUTING.md)
- [x] This aligns with RubyLLM's focus on **LLM communication**
- [x] This isn't application-specific logic that belongs in user code
- [x] This benefits most users, not just my specific use case

## Quality check

- [x] I ran `overcommit --install` and all hooks pass
- [x] I tested my changes thoroughly
- [x] I updated documentation if needed
- [x] I didn't modify auto-generated files manually (`models.json`, `aliases.json`)

## API changes

- [ ] Breaking change
- [ ] New public methods/classes
- [ ] Changed method signatures
- [x] No API changes

## Related issues

---------

Co-authored-by: Carmine Paolino <[email protected]>
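For illustration, a minimal sketch of what this fix means at the call site, assuming RubyLLM is configured with provider credentials; the model name, provider, and rescue strategy here are illustrative and not part of this change:

```ruby
require 'ruby_llm'

# Hypothetical caller: with this fix, a mid-stream provider error surfaces as a
# RubyLLM error class (here RubyLLM::OverloadedError for Anthropic's
# "overloaded_error") instead of crashing with
# "undefined method `merge' for nil:NilClass" under Faraday v1.
chat = RubyLLM.chat(model: 'claude-3-5-haiku-20241022', provider: :anthropic)

chunks = []
begin
  chat.ask('Count from 1 to 3') do |chunk|
    chunks << chunk
  end
rescue RubyLLM::OverloadedError => e
  warn "Provider overloaded, try again later: #{e.message}"
end
```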
1 parent a9a1446 commit 735e36b

File tree

5 files changed: +229 -5 lines changed

- lib/ruby_llm/providers/openai/streaming.rb
- lib/ruby_llm/streaming.rb
- spec/ruby_llm/chat_streaming_spec.rb
- spec/spec_helper.rb
- spec/support/streaming_error_helpers.rb

lib/ruby_llm/providers/openai/streaming.rb

Lines changed: 14 additions & 0 deletions
```diff
@@ -21,6 +21,20 @@ def build_chunk(data)
             output_tokens: data.dig('usage', 'completion_tokens')
           )
         end
+
+        def parse_streaming_error(data)
+          error_data = JSON.parse(data)
+          return unless error_data['error']
+
+          case error_data.dig('error', 'type')
+          when 'server_error'
+            [500, error_data['error']['message']]
+          when 'rate_limit_exceeded', 'insufficient_quota'
+            [429, error_data['error']['message']]
+          else
+            [400, error_data['error']['message']]
+          end
+        end
       end
     end
   end
```
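The new `parse_streaming_error` maps OpenAI-style streaming error payloads onto HTTP status codes for the error middleware. A standalone sketch of the same mapping, using a hand-written sample payload (the `status_for` helper name and the payload are illustrative, not RubyLLM API):

```ruby
require 'json'

# Hypothetical standalone version of the mapping above:
# 'server_error' => 500, 'rate_limit_exceeded' / 'insufficient_quota' => 429,
# anything else => 400, each paired with the provider's error message.
def status_for(raw)
  error = JSON.parse(raw)['error']
  return unless error

  case error['type']
  when 'server_error'
    [500, error['message']]
  when 'rate_limit_exceeded', 'insufficient_quota'
    [429, error['message']]
  else
    [400, error['message']]
  end
end

p status_for({ error: { type: 'rate_limit_exceeded', message: 'Slow down' } }.to_json)
# => [429, "Slow down"]
```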

lib/ruby_llm/streaming.rb

Lines changed: 23 additions & 5 deletions
```diff
@@ -55,13 +55,13 @@ def create_stream_processor(parser, buffer, &)
       end
     end
 
-    def process_stream_chunk(chunk, parser, _env, &)
+    def process_stream_chunk(chunk, parser, env, &)
      RubyLLM.logger.debug "Received chunk: #{chunk}"
 
      if error_chunk?(chunk)
-        handle_error_chunk(chunk, nil)
+        handle_error_chunk(chunk, env)
      else
-        yield handle_sse(chunk, parser, nil, &)
+        yield handle_sse(chunk, parser, env, &)
      end
    end
 
@@ -88,7 +88,16 @@ def error_chunk?(chunk)
     def handle_error_chunk(chunk, env)
       error_data = chunk.split("\n")[1].delete_prefix('data: ')
       status, _message = parse_streaming_error(error_data)
-      error_response = env.merge(body: JSON.parse(error_data), status: status)
+      parsed_data = JSON.parse(error_data)
+
+      # Create a response-like object that works for both Faraday v1 and v2
+      error_response = if env
+                         env.merge(body: parsed_data, status: status)
+                       else
+                         # For Faraday v1, create a simple object that responds to .status and .body
+                         Struct.new(:body, :status).new(parsed_data, status)
+                       end
+
       ErrorMiddleware.parse_error(provider: self, response: error_response)
     rescue JSON::ParserError => e
       RubyLLM.logger.debug "Failed to parse error chunk: #{e.message}"
@@ -122,7 +131,16 @@ def handle_data(data)
 
     def handle_error_event(data, env)
       status, _message = parse_streaming_error(data)
-      error_response = env.merge(body: JSON.parse(data), status: status)
+      parsed_data = JSON.parse(data)
+
+      # Create a response-like object that works for both Faraday v1 and v2
+      error_response = if env
+                         env.merge(body: parsed_data, status: status)
+                       else
+                         # For Faraday v1, create a simple object that responds to .status and .body
+                         Struct.new(:body, :status).new(parsed_data, status)
+                       end
+
       ErrorMiddleware.parse_error(provider: self, response: error_response)
     rescue JSON::ParserError => e
       RubyLLM.logger.debug "Failed to parse error event: #{e.message}"
```

spec/ruby_llm/chat_streaming_spec.rb

Lines changed: 80 additions & 0 deletions
```diff
@@ -4,6 +4,7 @@
 
 RSpec.describe RubyLLM::Chat do
   include_context 'with configured RubyLLM'
+  include StreamingErrorHelpers
 
   describe 'streaming responses' do
     CHAT_MODELS.each do |model_info|
@@ -47,4 +48,83 @@
       end
     end
   end
+
+  describe 'Error handling' do
+    CHAT_MODELS.each do |model_info|
+      model = model_info[:model]
+      provider = model_info[:provider]
+
+      context "with #{provider}/#{model}" do
+        let(:chat) { RubyLLM.chat(model: model, provider: provider) }
+
+        describe 'Faraday version 1' do # rubocop:disable RSpec/NestedGroups
+          before do
+            stub_const('Faraday::VERSION', '1.10.0')
+          end
+
+          it "#{provider}/#{model} supports handling streaming error chunks" do # rubocop:disable RSpec/ExampleLength
+            skip('Error handling not implemented yet') unless error_handling_supported?(provider)
+
+            stub_error_response(provider, :chunk)
+
+            chunks = []
+
+            expect do
+              chat.ask('Count from 1 to 3') do |chunk|
+                chunks << chunk
+              end
+            end.to raise_error(expected_error_for(provider))
+          end
+
+          it "#{provider}/#{model} supports handling streaming error events" do # rubocop:disable RSpec/ExampleLength
+            skip('Error handling not implemented yet') unless error_handling_supported?(provider)
+
+            stub_error_response(provider, :event)
+
+            chunks = []
+
+            expect do
+              chat.ask('Count from 1 to 3') do |chunk|
+                chunks << chunk
+              end
+            end.to raise_error(expected_error_for(provider))
+          end
+        end
+
+        describe 'Faraday version 2' do # rubocop:disable RSpec/NestedGroups
+          before do
+            stub_const('Faraday::VERSION', '2.0.0')
+          end
+
+          it "#{provider}/#{model} supports handling streaming error chunks" do # rubocop:disable RSpec/ExampleLength
+            skip('Error handling not implemented yet') unless error_handling_supported?(provider)
+
+            stub_error_response(provider, :chunk)
+
+            chunks = []
+
+            expect do
+              chat.ask('Count from 1 to 3') do |chunk|
+                chunks << chunk
+              end
+            end.to raise_error(expected_error_for(provider))
+          end
+
+          it "#{provider}/#{model} supports handling streaming error events" do # rubocop:disable RSpec/ExampleLength
+            skip('Error handling not implemented yet') unless error_handling_supported?(provider)
+
+            stub_error_response(provider, :event)
+
+            chunks = []
+
+            expect do
+              chat.ask('Count from 1 to 3') do |chunk|
+                chunks << chunk
+              end
+            end.to raise_error(expected_error_for(provider))
+          end
+        end
+      end
+    end
+  end
 end
```
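These specs rely on plain RSpec `stub_const` to exercise both Faraday major versions from a single suite: only the `Faraday::VERSION` constant is swapped for the duration of each example, and RSpec restores the real value afterwards. A minimal sketch of that mechanism (the describe block and expectation are illustrative, not part of the suite):

```ruby
require 'faraday'

RSpec.describe 'Faraday version stubbing' do
  it 'swaps Faraday::VERSION for a single example' do
    stub_const('Faraday::VERSION', '1.10.0')

    # Inside this example, code that branches on Faraday::VERSION (as the
    # streaming specs above depend on) sees a v1 version string; the original
    # constant comes back automatically when the example finishes.
    expect(Faraday::VERSION).to eq('1.10.0')
  end
end
```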

spec/spec_helper.rb

Lines changed: 1 addition & 0 deletions
```diff
@@ -42,6 +42,7 @@
 require 'fileutils'
 require 'ruby_llm'
 require 'webmock/rspec'
+require_relative 'support/streaming_error_helpers'
 
 # VCR Configuration
 VCR.configure do |config|
```
spec/support/streaming_error_helpers.rb

Lines changed: 111 additions & 0 deletions
```ruby
# frozen_string_literal: true

module StreamingErrorHelpers
  ERROR_HANDLING_CONFIGS = {
    anthropic: {
      url: 'https://api.anthropic.com/v1/messages',
      error_response: {
        type: 'error',
        error: {
          type: 'overloaded_error',
          message: 'Overloaded'
        }
      },
      chunk_status: 529,
      expected_error: RubyLLM::OverloadedError
    },
    openai: {
      url: 'https://api.openai.com/v1/chat/completions',
      error_response: {
        error: {
          message: 'The server is temporarily overloaded. Please try again later.',
          type: 'server_error',
          param: nil,
          code: nil
        }
      },
      chunk_status: 500,
      expected_error: RubyLLM::ServerError
    },
    gemini: {
      url: 'https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:streamGenerateContent?alt=sse',
      error_response: {
        error: {
          code: 529,
          message: 'Service overloaded - please try again later',
          status: 'RESOURCE_EXHAUSTED'
        }
      },
      chunk_status: 529,
      expected_error: RubyLLM::OverloadedError
    },
    deepseek: {
      url: 'https://api.deepseek.com/chat/completions',
      error_response: {
        error: {
          message: 'Service overloaded - please try again later',
          type: 'server_error',
          param: nil,
          code: nil
        }
      },
      chunk_status: 500,
      expected_error: RubyLLM::ServerError
    },
    openrouter: {
      url: 'https://openrouter.ai/api/v1/chat/completions',
      error_response: {
        error: {
          message: 'Service overloaded - please try again later',
          type: 'server_error',
          param: nil,
          code: nil
        }
      },
      chunk_status: 500,
      expected_error: RubyLLM::ServerError
    },
    ollama: {
      url: 'http://localhost:11434/v1/chat/completions',
      error_response: {
        error: {
          message: 'Service overloaded - please try again later',
          type: 'server_error',
          param: nil,
          code: nil
        }
      },
      chunk_status: 500,
      expected_error: RubyLLM::ServerError
    }
  }.freeze

  def error_handling_supported?(provider)
    ERROR_HANDLING_CONFIGS.key?(provider)
  end

  def expected_error_for(provider)
    ERROR_HANDLING_CONFIGS[provider][:expected_error]
  end

  def stub_error_response(provider, type)
    config = ERROR_HANDLING_CONFIGS[provider]
    return unless config

    body = case type
           when :chunk
             "#{config[:error_response].to_json}\n\n"
           when :event
             "event: error\ndata: #{config[:error_response].to_json}\n\n"
           end

    status = type == :chunk ? config[:chunk_status] : 200

    stub_request(:post, config[:url])
      .to_return(
        status: status,
        body: body,
        headers: { 'Content-Type' => 'text/event-stream' }
      )
  end
end
```
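For reference, the body that `stub_error_response(:anthropic, :event)` serves through WebMock is an SSE error event like the one in the original bug report. A small sketch that just prints it, with values copied from `ERROR_HANDLING_CONFIGS` above:

```ruby
require 'json'

error_response = {
  type: 'error',
  error: { type: 'overloaded_error', message: 'Overloaded' }
}

# :event stubs return HTTP 200 with an SSE "error" event in the body;
# :chunk stubs return the bare JSON with the provider's error status (529 here).
body = "event: error\ndata: #{error_response.to_json}\n\n"
puts body
# event: error
# data: {"type":"error","error":{"type":"overloaded_error","message":"Overloaded"}}
```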

0 commit comments
