Skip to content

Conversation

david-luna
Copy link
Member

@david-luna david-luna commented Aug 3, 2023

By making it async now the agent's users can a await it to make sure any OTel MeterProviders has finalised its shutdown process.

Closes #3222

Checklist

  • Implement code
  • Add tests
  • Update TypeScript typings
  • Update documentation
  • Add CHANGELOG.asciidoc entry
  • Commit message follows commit guidelines

@david-luna david-luna requested a review from trentm August 3, 2023 15:25
@github-actions github-actions bot added the agent-nodejs Make available for APM Agents project planning. label Aug 3, 2023
Copy link
Member Author

@david-luna david-luna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/test tav

lib/agent.js Outdated
this.logger.warn('failed to shutdown OTel MeterProvider:', reason);
});
try {
await this._otelMeterProvider.shutdown({ timeoutMillis: 1000 });
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't necessarily the job of this change, but here is a concern. This .shutdown call, effectively results in await'ing ElasticApmMetricExporter.flush() (

async shutdown() {
return this._agent.flush();
}
), which calls _agent.flush().

At the top of Agent._flush is:

  if (!this._apmClient) {
    // Log an *err* to provide a stack for the user.
    const err = new Error('cannot flush agent before it is started');
    this.logger.warn({ err }, err.message);
    if (cb) {
      process.nextTick(cb);
    }
    return;
  }

and earlier in Agent.destroy we've done this._apmClient = null. So now I am curious if we are getting this 'cannot flush agent before it is started' log warning when attempting to apm.destroy(). If not, then why not?

So, I think TODOs here are:

  1. Add a test for awaiting this apm.destroy().
  2. Understand why we are or are not getting that warning. Note that we will only run this this._otelMeterProvider.shutdown if we have instrumented @opentelemetry/sdk-metrics, so I'm guessing we just haven't tested this.
  3. Likely we want to move the await this._otelMeterProvider.shutdown(...) to the top of Agent.prototype.destroy().

@david-luna
Copy link
Member Author

david-luna commented Aug 3, 2023

We crossed notes. I don't recall seeing a warning on the test so I?m going to check out again and probably debug

Co-authored-by: Trent Mick <[email protected]>
@trentm
Copy link
Member

trentm commented Aug 4, 2023

but here is a concern.

So indeed this is an existing issue. With this code:

// foo.js
const apm = require('elastic-apm-node');
const otel = require('@opentelemetry/api');

const meter = otel.metrics.getMeter('my-meter');
const numReqs = meter.createCounter('num_requests', {
  description: 'number of HTTP requests',
});

let interval = setInterval(() => {
  numReqs.add(1);
}, 1000);

setTimeout(async () => {
  console.log('XXX destroy and exit');
  clearInterval(interval);
  await apm.destroy();
  console.log('XXX destroyed');
}, 5000);

Running that results in the warning about "cannot flush" that I mentioned above:

$ node -r elastic-apm-node/start.js foo.js | ecslog
...
XXX destroy and exit
XXX destroy: start
[2023-08-04T19:10:12.850Z]  WARN (elastic-apm-node): cannot flush agent before it is started
    error: {
        "type": "Error",
        "message": "cannot flush agent before it is started",
        "stack_trace":
            Error: cannot flush agent before it is started
                at Agent._flush (/Users/trentm/el/apm-agent-nodejs6/lib/agent.js:850:17)
                at /Users/trentm/el/apm-agent-nodejs6/lib/agent.js:822:12
                at new Promise (<anonymous>)
                at Agent.flush (/Users/trentm/el/apm-agent-nodejs6/lib/agent.js:821:12)
                at ElasticApmMetricExporter.shutdown (/Users/trentm/el/apm-agent-nodejs6/lib/opentelemetry-metrics/ElasticApmMetricExporter.js:160:24)
                at PeriodicExportingMetricReader.onShutdown (/Users/trentm/el/apm-agent-nodejs6/node_modules/@opentelemetry/sdk-metrics/build/src/export/PeriodicExportingMetricReader.js:102:30)
                at PeriodicExportingMetricReader.shutdown (/Users/trentm/el/apm-agent-nodejs6/node_modules/@opentelemetry/sdk-metrics/build/src/export/MetricReader.js:102:53)
                at MetricCollector.shutdown (/Users/trentm/el/apm-agent-nodejs6/node_modules/@opentelemetry/sdk-metrics/build/src/state/MetricCollector.js:53:34)
                at /Users/trentm/el/apm-agent-nodejs6/node_modules/@opentelemetry/sdk-metrics/build/src/MeterProvider.js:77:30
                at Array.map (<anonymous>)
                at MeterProvider.shutdown (/Users/trentm/el/apm-agent-nodejs6/node_modules/@opentelemetry/sdk-metrics/build/src/MeterProvider.js:76:62)
                at Agent.destroy (/Users/trentm/el/apm-agent-nodejs6/lib/agent.js:140:37)
                at Timeout._onTimeout (/Users/trentm/el/apm-agent-nodejs6/examples/opentelemetry-metrics/foo.js:22:13)
                at listOnTimeout (node:internal/timers:559:17)
                at processTimers (node:internal/timers:502:7)
    }
XXX destroyed

@trentm trentm marked this pull request as ready for review August 4, 2023 19:40
@trentm trentm merged commit d118152 into dev/4.x Aug 4, 2023
@trentm trentm deleted the dluna/3222-agent-destroy-async branch August 4, 2023 21:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

agent-nodejs Make available for APM Agents project planning.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants