Description
Type of issue
Missing information
Description
The Azure Storage SDK's `download_blob` method allows users to set the `max_concurrency` parameter to enable parallel downloads for blobs larger than 64 MB. By increasing `max_concurrency`, developers can potentially speed up blob downloads by using multiple connections simultaneously.
However, the underlying implementation of `download_blob` sends its requests through urllib3 (via the `requests` library), whose connection pool keeps at most 10 connections per host by default. When `max_concurrency` is set to a value higher than that pool size, urllib3 logs a warning:
Connection pool is full, discarding connection
This behaviour can lead to inefficiencies and confusion: connections that do not fit in the pool are discarded instead of being reused, and developers may assume `max_concurrency` controls the number of connections directly, without realising that the connection pool size needs to be adjusted accordingly.
Suggested Improvements:
1. Documentation Update: It would be helpful if the official documentation for `download_blob` clearly stated that increasing `max_concurrency` beyond the default connection pool size requires explicitly configuring the connection pool (e.g., through `requests.Session()` or a similar method).
2. Proactive Guidance: Adding a note or example in the documentation on how to configure the `BlobServiceClient` to adjust the pool size based on the intended `max_concurrency` would prevent potential issues and improve user experience.
This small clarification can prevent warnings and ensure that users get the expected performance when downloading large blobs with high concurrency.
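For illustration, here is a minimal sketch of the kind of example the documentation could include. It assumes the client is built with a custom `RequestsTransport` wrapping a `requests.Session` whose adapter has a larger `pool_maxsize`. The helper names `make_pooled_session` and `make_blob_service_client`, the account URL, and the container and blob names are hypothetical, and this is one possible approach rather than a confirmed SDK recommendation.

```python
import requests
from requests.adapters import HTTPAdapter


def make_pooled_session(pool_size: int) -> requests.Session:
    """Return a session whose per-host pool keeps up to `pool_size` connections."""
    session = requests.Session()
    # pool_maxsize is the number of connections kept per host; requests defaults to 10.
    adapter = HTTPAdapter(pool_maxsize=pool_size)
    session.mount("https://", adapter)
    session.mount("http://", adapter)
    return session


def make_blob_service_client(account_url: str, credential, max_concurrency: int):
    """Hypothetical helper: build a client whose pool matches max_concurrency."""
    # Azure imports are local so the session helper above works without the SDK installed.
    from azure.core.pipeline.transport import RequestsTransport
    from azure.storage.blob import BlobServiceClient

    session = make_pooled_session(max_concurrency)
    # session_owner=False: the transport uses the session but does not close it.
    transport = RequestsTransport(session=session, session_owner=False)
    return BlobServiceClient(account_url, credential=credential, transport=transport)


# Usage (placeholder values, assumed for illustration only):
# client = make_blob_service_client(
#     "https://<account>.blob.core.windows.net", credential, max_concurrency=32
# )
# blob = client.get_blob_client("my-container", "my-blob")
# data = blob.download_blob(max_concurrency=32).readall()
```

Sizing the pool to match `max_concurrency` lets each parallel download worker return its connection to the pool, which should avoid the "Connection pool is full" warning.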
Page URL
Content source URL
Document Version Independent Id
9ee6555a-aaca-243f-409e-1ac5881e3dbc
Article author
Metadata
- ID: 2a557056-1da5-6c2d-fcee-e4e246a7a221
- Service: azure-storage