Re-try addition of configurable trace sampling strategy #709
|
Thanks! Will get around to this next week.
No idea about the dep stuff, but at a guess you don't need the constraint,
you just need to run `dep update` (or `upgrade`, I forget exact spelling).
…On Wed, 14 Feb 2018 at 19:21 Cody Boggs ***@***.***> wrote:
@jml <https://github.com/jml> - I'm not entirely sure that adding an
override to Gopkg.toml was the proper fix, so if I got it wrong, let me
know what the proper method is and I'll fix it up. :-)
Thanks!
------------------------------
You can view, comment on, or merge this pull request online at:
#709
Commit Summary
- enable configurable Jaeger sampling strategies (with defaults)
- add trace config opts to lite, and DRY up default trace sampling
logic
- DRY up some more tracing instantiation bits
- use new convenience function to construct tracer from env vars
- add forgotten vendor update
File Changes
- *M* Gopkg.toml
<https://github.com/weaveworks/cortex/pull/709/files#diff-0> (4)
- *M* cmd/distributor/main.go
<https://github.com/weaveworks/cortex/pull/709/files#diff-1> (4)
- *M* cmd/lite/main.go
<https://github.com/weaveworks/cortex/pull/709/files#diff-2> (4)
- *M* cmd/querier/main.go
<https://github.com/weaveworks/cortex/pull/709/files#diff-3> (4)
- *M* cmd/ruler/main.go
<https://github.com/weaveworks/cortex/pull/709/files#diff-4> (4)
- *M* vendor/github.com/weaveworks/common/Gopkg.lock
<https://github.com/weaveworks/cortex/pull/709/files#diff-5> (119)
- *M* vendor/github.com/weaveworks/common/Gopkg.toml
<https://github.com/weaveworks/cortex/pull/709/files#diff-6> (8)
- *M* vendor/github.com/weaveworks/common/logging/logging.go
<https://github.com/weaveworks/cortex/pull/709/files#diff-7> (7)
- *M* vendor/github.com/weaveworks/common/middleware/grpc_logging.go
<https://github.com/weaveworks/cortex/pull/709/files#diff-8> (2)
- *M* vendor/github.com/weaveworks/common/middleware/logging.go
<https://github.com/weaveworks/cortex/pull/709/files#diff-9> (2)
- *M* vendor/github.com/weaveworks/common/tracing/tracing.go
<https://github.com/weaveworks/cortex/pull/709/files#diff-10> (28)
Patch Links:
- https://github.com/weaveworks/cortex/pull/709.patch
- https://github.com/weaveworks/cortex/pull/709.diff
|
|
Gotcha. I didn't do the |
|
Just run |
|
Also you must check in |
|
Since there are a couple more updates brought in from Log gRPC request on error weaveworks/common#85 And now I'm a bit worried by weaveworks/common#85 - will it log the entire set of samples on a failed |
|
@bboreham, thanks for the pointers! I've removed the override, but the vendor update should still be in place. I'm not sure how to go about answering your question re: weaveworks/common#85, unfortunately. Any guidance there? |
@bboreham yes it will. It was added to help diagnose errors in other gRPC calls with smaller argument lists, not realising that it was being used here... |
|
@bboreham Do you object to us merging this as is? What do we need to do to get it ready to merge? |
|
I would suggest testing the error path and then deciding if that level of logging is acceptable. |
|
I notice #705 also contains the update to latest version of dep, so one of them is going to have to rebase after merge of the other. |
|
I'd be glad to wait for #705 @bboreham and @tomwilkie. The |
ed071d7 to 3dcffef
|
Has the error logging been tested yet? |
3dcffef to 303fba6
|
Finally just blew away my local branch and forced things to a sane state from latest master. @bboreham I'm not sure how to go about testing the error logging, short of letting it run for a while in our staging cluster and seeing what happens in the logs. Do you know of a quick / easy-ish way to cause this failure pattern so that I can give some quicker feedback? Thanks! |
|
@bboreham I think I may have inadvertently found an instance of what you expect to see in the logs with the recent change to common: Sure looks like a log of all samples in the failed push. Let me know if that's not what you were expecting and I'll dig a bit more. |
|
@bboreham, do you think it would be reasonable to explicitly truncate failed gRPC request logs past some reasonable-ish length? I'm thinking something like 512 bytes or some such, but I'm not firm on that number. I'm also open to other tricks that could retain the improved logging but avoid the pain of many-KB messages on a failed push. |
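For illustration, the truncation idea floated here could look roughly like the sketch below. It is not the code from weaveworks/common; the function name `truncateForLog` and the constant `maxLoggedRequestBytes` are made up for this example, and 512 is only the tentative figure mentioned in the comment.

```go
package main

import "fmt"

// maxLoggedRequestBytes is a hypothetical cap. The thread floats 512 bytes
// as a starting point without settling on it.
const maxLoggedRequestBytes = 512

// truncateForLog returns s unchanged when it fits within limit bytes.
// Otherwise it keeps the first limit bytes and appends a note saying how
// many bytes were dropped, so a reader can tell the message was cut.
// The cut is byte-based, so it may split a multi-byte UTF-8 character.
func truncateForLog(s string, limit int) string {
	if limit < 0 {
		limit = 0
	}
	if len(s) <= limit {
		return s
	}
	return fmt.Sprintf("%s... (%d bytes omitted)", s[:limit], len(s)-limit)
}

func main() {
	req := "series: up{job=\"node\"} samples: [1 2 3 4 5 6 7 8 9 10]"
	fmt.Println(truncateForLog(req, 24))                    // cut short
	fmt.Println(truncateForLog(req, maxLoggedRequestBytes)) // fits, printed whole
}
```

Reporting the number of omitted bytes keeps some signal about how large the failed push was, even though the samples themselves are dropped from the log.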
|
Truncating may be a workable compromise; however that dump you gave as an example doesn't look readable at any length. |
|
Definitely not readable at any length. I'll try to peruse the dumping code a bit next week once we've tackled a few other things, and see if we can at least make it readable for some preamble length. |
|
I abstain. Sorry! |
|
PR in common looks plausible; it's not clear what the answer to my |
|
Ah, oops, I forgot to address that part. The trouble I ran into there is that @csmarchbanks, do you have some time to crawl this code with me and see if there's a reasonable place to implement an interface or something for this? |
|
@bboreham, I had a quick clarifying question for you on the logging output issue. Is there a particular reason you'd like to see the string-ified version of the byte array that's being logged? I ask because the full series is being logged prior to the byte array, and the only additional information to be seen in the request body is the full set of samples and timestamps. String-ifying this would indeed shrink the output by a small bit (more accurately, by the number of bytes that make up the delta between each char in the series string and their corresponding ASCII value lengths, plus whitespace), but the rest of the output will be a big nested array of floats. If this is definitely needed, I can spend the time to figure out a way to do so (it's still not proving to be as trivial as I'd hoped), but I'd like to better understand that need first. :-) (I'm hoping we can get this particular PR unblocked sooner than later, as tracing is entirely non-viable in anything beyond a playground environment with zero load without this change. 😢) Thanks! |
|
I don't specially need to see the string; I just think it's dumb to print something we know is a string as the ascii codes. Happy to have that filed as a clean-up issue to do afterwards. |
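The difference being discussed is easy to reproduce with the standard `fmt` package. The sketch below is only an illustration of the formatting verbs, not the logging code under review; the helper names `asCodes` and `asText` are invented here, and real request bodies may not be printable text at all.

```go
package main

import "fmt"

// asCodes formats b the way %v does for a byte slice: as decimal byte values.
func asCodes(b []byte) string { return fmt.Sprintf("%v", b) }

// asText formats the same bytes as a quoted string. %q escapes any
// non-printable bytes, so the output stays on one line.
func asText(b []byte) string { return fmt.Sprintf("%q", b) }

func main() {
	body := []byte("up")
	fmt.Println(asCodes(body)) // [117 112]
	fmt.Println(asText(body))  // "up"
}
```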
|
Agreed that it's dumb to print a string as raw ascii codes. :-) I just think it's easier and (in this specific case) pretty reasonable to just omit that massive output entirely. I'll write up an issue to do the cleanup! |
|
I went back to weaveworks/common#89; I can't see where it is truncating at N bytes - it seems to be just omitting the request details completely. |
|
Yes, that's true. Maybe "truncate" is the wrong word on my part. I started down the path of a byte limit, but given that the data at the beginning of the ingester push requests is duplicate data from earlier in the log message, it seemed reasonable to add a configurable omission of the request body. |
|
Hmm... Maybe that is indeed too sledgehammer-y. I forget that ingesters take more than just I'll try the byte limiter again, and retain the configuration option for such. Stand by! |
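As a rough sketch of the two approaches side by side: an option that drops the request body entirely, and a byte limit that keeps a prefix. `ExcludeRequestInLog` is the field name that appears in this PR's diff; `MaxRequestLogBytes`, `logOptions`, and `describeFailure` are hypothetical names used only for this example and are not taken from weaveworks/common.

```go
package main

import (
	"errors"
	"fmt"
)

// errPushFailed is a stand-in error for the example.
var errPushFailed = errors.New("push failed")

// logOptions is a hypothetical options struct for failed-request logging.
type logOptions struct {
	ExcludeRequestInLog bool // omit the request body entirely
	MaxRequestLogBytes  int  // assumed knob: keep at most this many bytes; 0 means no limit
}

// describeFailure builds the log line for a failed call according to opts.
func describeFailure(opts logOptions, method, req string, err error) string {
	if opts.ExcludeRequestInLog {
		return fmt.Sprintf("method=%s err=%q", method, err)
	}
	if opts.MaxRequestLogBytes > 0 && len(req) > opts.MaxRequestLogBytes {
		req = req[:opts.MaxRequestLogBytes] + "..."
	}
	return fmt.Sprintf("method=%s err=%q request=%q", method, err, req)
}

func main() {
	req := "series and samples for a large push"
	fmt.Println(describeFailure(logOptions{ExcludeRequestInLog: true}, "Push", req, errPushFailed))
	fmt.Println(describeFailure(logOptions{MaxRequestLogBytes: 10}, "Push", req, errPushFailed))
}
```

Keeping both settings lets a caller with very large request bodies drop them outright, while other calls still log a bounded prefix for debugging.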
|
@bboreham, does this PR seem OK now with the update to weaveworks/common? |
jaegerAgentHost := os.Getenv("JAEGER_AGENT_HOST")
trace := tracing.New(jaegerAgentHost, "querier")
trace := tracing.NewFromEnv("querier")
defer trace.Close()
don't need two defers
Awww, weak copypasta skillz on my part. Thanks!
cmd/ingester/main.go
Outdated
GRPCMiddleware: []grpc.UnaryServerInterceptor{
	middleware.ServerUserHeaderInterceptor,
},
ExcludeRequestInLog: false,
I think you wanted this true
|
Whoo! Thanks a million @bboreham for your help (and patience)! |
EDIT: Please see #703 for description
@jml - I'm not entirely sure that adding an override to Gopkg.toml was the proper fix, so if I got it wrong, let me know what the proper method is and I'll fix it up. :-)
Thanks!