Skip to content

Conversation

@drozdziak1
Copy link
Contributor

@drozdziak1 drozdziak1 commented Jul 26, 2023

Motivation

We've seen instances of massive error log volumes coming from the agent. This was made significantly worse by the multiline error backtrace output provided from the anyhow package. This change alters most error!() logs to only use the concise single-line error messages and separates backtrace printing to debug!() logs.

Additionally, the error locations tend to be hard to track, which is fixed by adding use_file_location to the logger constructor. This should automatically annotate each log message with file/line location information.

Finally, two simple warnings are fixed to clean up compilation log output.

Testing

  • tests continue to pass locally,
  • The location information is visible in agent logs (see below for example)
  • After manually reproducing, the ERRO logs became much shorter, with full context present in DEBG:
Jul 26 15:29:44.938 ERRO[src/agent/solana/exporter.rs:755:17] RPC request error: cluster version query failed: error sending request for url (http://api.pythnet.yth.network:8899/): error tryi
ng to connect: dns error: failed to lookup address information: Name or service not known, primary: true                                                                                       
Jul 26 15:29:44.938 DEBG[src/agent/solana/exporter.rs:756:17] error context, context: RPC request error: cluster version query failed: error sending request for url (http://api.pythnet.yth.ne
twork:8899/): error trying to connect: dns error: failed to lookup address information: Name or service not known                                                                              
                                                                                                                                                                                               
Caused by:                                                                                                                                                                                     
    RPC request error: cluster version query failed: error sending request for url (http://api.pythnet.yth.network:8899/): error trying to connect: dns error: failed to lookup address informa
tion: Name or service not known, primary: true                                                                                                                                                 
Jul 26 15:29:45.117 ERRO[src/agent/solana/oracle.rs:667:21] AsyncWrap error, primary: false                                                                                                    
Jul 26 15:29:45.117 DEBG[src/agent/solana/oracle.rs:668:21] error context, context: AsyncWrap error                                                                                            
                                                                                                                                                                                               
Caused by:                                                                                                                                                                                     
    channel closed, primary: false

@drozdziak1
Copy link
Contributor Author

drozdziak1 commented Aug 4, 2023

Changes since last review

  • Added --log-flavor, taking plain (default) or json - this chooses between regular log statements (good for terminals) and structured JSON logging (good for machine-readability. JSON support is provided with slog-bunyan
  • Added -L which enables file/line information in all log statements, under "loc"
  • Added a log line for printing working directory and agent version

Samples

json logging, -L enabled

{"msg":"Reading publish keypair returned an error. Waiting for a remote-loaded key before publishing.","v":0,"name":"pyth-agent","level":40,"time":"2023-08-04T12:55:31.04613299Z","hostname":"stanix","pid":349260,"primary":false,"loc":"src/agent/solana.rs:182","error":"No such file or directory (os error 2)","publish_keypair_path":"/path/to/keypair.json"}
{"msg":"starting api server","v":0,"name":"pyth-agent","level":30,"time":"2023-08-04T12:55:31.046396709Z","hostname":"stanix","pid":349260,"loc":"src/agent/pythd/api.rs:666","listen address":"127.0.0.1:8910"}
{"msg":"fetching all pyth account data","v":0,"name":"pyth-agent","level":30,"time":"2023-08-04T12:55:31.047453123Z","hostname":"stanix","pid":349260,"primary":false,"loc":"src/agent/solana/oracle.rs:402"}
{"msg":"AsyncWrap error","v":0,"name":"pyth-agent","level":50,"time":"2023-08-04T12:55:31.551660312Z","hostname":"stanix","pid":349260,"primary":false,"loc":"src/agent/solana/oracle.rs:667"}
{"msg":"fetched mapping accounts","v":0,"name":"pyth-agent","level":30,"time":"2023-08-04T12:55:31.847140684Z","hostname":"stanix","pid":349260,"primary":true,"loc":"src/agent/solana/oracle.rs:225","total":1,"new":"[AHtgzX45WTKfkPG53L6WYhGEXwQkN1BVknET3sVsLL8J]"}

Note on nested values

From what I could find in slog-* docs, nested values support is available, but not every logging formatter supports them (in our case slog-json does while slog-term doesn't). For this reason, values that in theory should be JSON arrays/objects are still printed as strings.

plain logging, -L enabled

Aug 04 12:17:06.409 WARN Reading publish keypair returned an error. Waiting for a remote-loaded
 key before publishing., error: No such file or directory (os error 2), publish_keypair_path: /
path/to/keypair.json, primary: false, loc: src/agent/solana.rs:182
Aug 04 12:17:06.410 INFO starting api server, listen address: 127.0.0.1:8910, loc: src/agent/py
thd/api.rs:666
Aug 04 12:17:06.410 INFO fetching all pyth account data, primary: false, loc: src/agent/solana/
oracle.rs:402
Aug 04 12:17:06.410 INFO fetching all pyth account data, primary: true, loc: src/agent/solana/o
racle.rs:402
Aug 04 12:17:06.837 ERRO AsyncWrap error, primary: false, loc: src/agent/solana/oracle.rs:667
Aug 04 12:17:07.118 INFO fetched mapping accounts, total: 1, new: [AHtgzX45WTKfkPG53L6WYhGEXwQk
N1BVknET3sVsLL8J], primary: true, loc: src/agent/solana/oracle.rs:225

Copy link
Collaborator

@ali-behjati ali-behjati left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also bump the version before release?

@drozdziak1 drozdziak1 merged commit f7729f8 into main Aug 7, 2023
@drozdziak1 drozdziak1 deleted the drozdziak1/shorter-error-logging branch August 7, 2023 12:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants