-
Notifications
You must be signed in to change notification settings - Fork 115
Make telemetry batch size configurable and add time-based flush #622
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase ( |
Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase ( |
Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase ( |
Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase ( |
Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase ( |
@@ -258,6 +261,7 @@ def test_factory_error_handling(self): | |||
session_id_hex=session_id, | |||
auth_provider=AccessTokenAuthProvider("token"), | |||
host_url="test-host.com", | |||
batch_size=TelemetryClientFactory.DEFAULT_BATCH_SIZE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do you need to pass the default value ? Inside the function add the default value, pass it only when users sends a new value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We set the default value of telemetry_batch_size
if required during extraction from kwargs in connection initialization, so I did not set a default value for batch_size
in the initialize_telemetry_client
function in the TelemetryClientFactory
class.
tests/unit/test_telemetry.py
Outdated
|
||
@patch("databricks.sql.client.Session") | ||
def test_connection_with_custom_batch_size(MockSession): | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't think this test is needed. You are just setting a value and checking if it is set
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed it
cls._initialized = True | ||
logger.debug( | ||
"TelemetryClientFactory initialized with thread pool (max_workers=10)" | ||
) | ||
|
||
@classmethod | ||
def _start_flush_thread(cls): | ||
"""Start the shared background thread for periodic flushing of all clients""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a thought is this background thread and its overhead required and is the timely refresh feature needed ? Like anyway we send the final telemtry logs while shutdown, and we can reduce the default batch size to have a near similar effect without the tbackground thread overload cc @vikrantpuppala
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a pattern followed in telemetry in the JDBC driver.
Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase ( |
Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase ( |
Signed-off-by: Sai Shree Pradhan <[email protected]>
f90cd99
to
4f88318
Compare
What type of PR is this?
Description
The flush timer is centralized in
TelemetryClientFactory
, single background thread to manage all connections. Keeping it inTelemetryClient
would mean creating a timer thread per connection.Used
threading.Thread
withthreading.Event
. Thethreading.Event
acts as a thread-safe shutdown signal, and itswait(timeout)
method allows the thread to wait for the next flush interval while remaining immediately responsive to a shutdown command.While
threading.Timer
could be used, it would create a new thread every flush interval as we need to create a timer after each execution.How is this tested?
Related Tickets & Documents
design doc
PECOBLR-654