memory leak in 2.7.6 (and net-http-persistent 3.0.0)? #528

@maia

I'm seeing a reproducible memory leak after updating mechanize from 2.7.5 to 2.7.6 and net-http-persistent from 2.9.4 to 3.0.0. Within a few hours of the update, worker memory usage, which had been stable at ~250MB for the past months, starts climbing by about 20MB per hour. These are the gems I updated:

$ bundle outdated --strict
...
Outdated gems included in the bundle:
  * mechanize (newest 2.7.6, installed 2.7.5) in groups "default"
  * net-http-persistent (newest 3.0.0, installed 2.9.4)
$ bundle update mechanize net-http-persistent

Here's how I use mechanize in my worker:

class UrlQuery

  USER_AGENT = 'Mac Safari'
  TIMEOUT    = 5.0 # seconds
  EXCEPTIONS = [
      Errno::ECONNREFUSED, Errno::ECONNRESET, Errno::EHOSTUNREACH, Errno::EINVAL,
      Errno::ENETUNREACH, Errno::ETIMEDOUT, Mechanize::Error,
      Mechanize::RedirectLimitReachedError, Mechanize::ResponseCodeError,
      Mechanize::UnauthorizedError, Net::HTTP::Persistent::Error, Net::HTTPFatalError,
      Net::HTTPInternalServerError, Net::HTTPMethodNotAllowed, Net::HTTPServerException,
      Net::HTTPServiceUnavailable, Net::OpenTimeout, Net::ReadTimeout,
      OpenSSL::SSL::SSLError, SocketError, Timeout::Error, URI::InvalidURIError
  ].freeze

  def initialize(url)
    @url = url
  end

  def call
    page           = agent.head(@url)
    uri            = page.uri.to_s
    content_type   = page.response['content-type']
    content_length = page.response['content-length'].to_i
    [uri, nil, content_type, content_length]
  rescue *EXCEPTIONS => e
    [nil, "#{e.class}: #{e.message}"]
  rescue StandardError => e
    Rollbar.error(e)
    [nil, "#{e.class}: #{e.message}"]
  end

  private

    def agent
      Mechanize.new do |agent|
        agent.user_agent_alias = USER_AGENT
        agent.open_timeout     = TIMEOUT
        agent.read_timeout     = TIMEOUT
        agent.verify_mode      = OpenSSL::SSL::VERIFY_PEER
      end
    end

end

All dependencies are updated to the most recent version:

domain_name (0.5.20180417)
http-cookie (1.0.3)
mime-types (3.1)
net-http-digest_auth (1.4.1)
net-http-persistent (3.0.0)
nokogiri (1.8.4)
ntlm-http (0.1.1)
webrobots (0.1.2)
connection_pool (2.2.2)

The app uses Rails 5.1.6 and Ruby 2.5.0 and runs on a Heroku dyno. The worker job runs every 10 minutes and each time parses a few dozen URLs.
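
To narrow down where the memory goes, one option is to count live objects before and after a batch of job runs. The following is a minimal sketch using only core Ruby (`GC.start`, `ObjectSpace.each_object`). The helper names `live_instances` and `instance_growth` are hypothetical and not part of mechanize or net-http-persistent. Which classes are worth watching is also an assumption; `Mechanize` and `Net::HTTP::Persistent` are only a first guess.

```ruby
# Hypothetical helper (not part of any gem): counts the live instances of
# each given class after forcing a full GC run.
def live_instances(*klasses)
  GC.start
  klasses.map { |klass| [klass, ObjectSpace.each_object(klass).count] }.to_h
end

# Runs the block `runs` times and returns, per class, how many more
# instances are alive afterwards than before.
def instance_growth(*klasses, runs: 10)
  before = live_instances(*klasses)
  runs.times { yield }
  after = live_instances(*klasses)
  after.map { |klass, count| [klass, count - before[klass]] }.to_h
end

# Example (assumes the UrlQuery class above and the mechanize gem are loaded;
# the URL is a placeholder):
#   p instance_growth(Mechanize, Net::HTTP::Persistent, runs: 50) {
#     UrlQuery.new('https://example.com').call
#   }
```

A count that grows roughly in step with `runs` would suggest those objects are being retained somewhere. A count that stays flat would point away from these classes, for example towards native memory or other object types.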
