fix: log source format in stream info #1268
Conversation
Rename the log source formats as below:
- OtelLogs -> otel-logs
- OtelMetrics -> otel-metrics
- OtelTraces -> otel-traces
- Pmeta -> pmeta
- Json -> json
- Kinesis -> kinesis

Added migration steps to perform the rename for existing streams.
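For reference, a minimal sketch of what the rename looks like at the serialization level, assuming the variants live on the `LogSource` enum in src/event/format/mod.rs as the review below indicates; the real enum may carry more variants and derives:

```rust
use serde::{Deserialize, Serialize};

// Hedged sketch: serde rename attributes that produce the new lowercase names.
#[derive(Debug, Clone, PartialEq, Serialize, Deserialize)]
pub enum LogSource {
    #[serde(rename = "kinesis")]
    Kinesis,
    #[serde(rename = "otel-logs")]
    OtelLogs,
    #[serde(rename = "otel-metrics")]
    OtelMetrics,
    #[serde(rename = "otel-traces")]
    OtelTraces,
    #[serde(rename = "pmeta")]
    Pmeta,
    #[serde(rename = "json")]
    Json,
}
```

With these attributes, serde_json::to_string(&LogSource::OtelLogs) produces "otel-logs" rather than "OtelLogs", which is why already-persisted stream metadata needs the migration steps described below.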
Walkthrough
The changes update the JSON serialization names for the log source format variants and add a migration step that renames the formats in existing stream metadata to match.
Sequence Diagram(s)

sequenceDiagram
participant M as migrate_stream_metadata
participant V as Version Migration Functions
participant R as rename_log_source_v6
participant S as Storage
M->>V: Run migration functions (v1_v4, v4_v5, v5_v6)
V-->>M: Return updated metadata
M->>R: Call rename_log_source_v6 on metadata
R-->>M: Return metadata with updated log_source_format
M->>S: Save the final metadata
sequenceDiagram
participant RS as rename_log_source_v6
participant MAP as FormatMapping
participant L as LogSource Element
RS->>MAP: Create mapping ("Kinesis" -> "kinesis", etc.)
RS->>RS: Check for log_source field in metadata
loop For each log_source entry
RS->>L: Inspect log_source_format field
alt Field value matches mapping
RS->>L: Update to new format key
else
RS->>L: No changes made
end
end
RS-->>Return: Return updated metadata
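To make the second diagram concrete, here is a hedged sketch of what rename_log_source_v6 could look like, reconstructed from the diagrams and the review notes below (a mapping table, a per-entry update, and unknown formats left untouched); the actual implementation in src/migration/stream_metadata_migration.rs may differ:

```rust
use std::collections::HashMap;
use serde_json::Value;

pub fn rename_log_source_v6(mut stream_metadata: Value) -> Value {
    // Mapping from the old variant names to the new serialized names.
    let mapping = HashMap::from([
        ("Kinesis", "kinesis"),
        ("OtelLogs", "otel-logs"),
        ("OtelMetrics", "otel-metrics"),
        ("OtelTraces", "otel-traces"),
        ("Pmeta", "pmeta"),
        ("Json", "json"),
    ]);
    // Walk the log_source array, if present, and rewrite each entry's format.
    if let Some(Value::Array(entries)) = stream_metadata
        .as_object_mut()
        .and_then(|map| map.get_mut("log_source"))
    {
        for entry in entries {
            if let Some(obj) = entry.as_object_mut() {
                if let Some(new_format) = obj
                    .get("log_source_format")
                    .and_then(Value::as_str)
                    .and_then(|old| mapping.get(old).copied())
                {
                    // Unknown formats fall through here untouched.
                    obj.insert(
                        "log_source_format".to_owned(),
                        Value::String(new_format.to_owned()),
                    );
                }
            }
        }
    }
    stream_metadata
}
```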
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- src/event/format/mod.rs (1 hunks)
- src/migration/mod.rs (3 hunks)
- src/migration/stream_metadata_migration.rs (3 hunks)
🧰 Additional context used
🧬 Code Definitions (1)
src/migration/mod.rs (2)
- src/migration/stream_metadata_migration.rs (2): rename_log_source_v6 (204-229), v4_v5 (154-181)
- src/migration/metadata_migration.rs (1): v4_v5 (137-162)
⏰ Context from checks skipped due to timeout of 90000ms (10)
- GitHub Check: Quest Smoke and Load Tests for Standalone deployments
- GitHub Check: Quest Smoke and Load Tests for Distributed deployments
- GitHub Check: coverage
- GitHub Check: Build Default x86_64-pc-windows-msvc
- GitHub Check: Build Default aarch64-apple-darwin
- GitHub Check: Build Default x86_64-apple-darwin
- GitHub Check: Build Default aarch64-unknown-linux-gnu
- GitHub Check: Build Default x86_64-unknown-linux-gnu
- GitHub Check: Build Kafka x86_64-unknown-linux-gnu
- GitHub Check: Build Kafka aarch64-apple-darwin
🔇 Additional comments (10)
src/migration/mod.rs (6)
249-250: Appropriate implementation of the log source format renaming in the migration flow.
The addition of the rename_log_source_v6 function call ensures that log source formats are standardized as part of the v1 migration path. This aligns with the PR objective of renaming log source formats consistently.
265-267: LGTM: Consistent application of the renaming function in v2 migration path.
The function call is correctly placed after the existing migration steps for v2 metadata.
281-283: LGTM: Format standardization properly integrated in v3 migration path.
Good placement of the function call after existing v3 migration steps.
291-293: LGTM: Consistent application of format renaming for v4 migration.
The function call maintains consistency with the pattern established in previous version migrations.
300-305: LGTM: Format standardization correctly applied to v5 metadata.
The changes follow the established pattern of applying the rename function after the version-specific migration.
306-312: LGTM: Implemented format standardization for unrecognized versions.
The default case now properly applies the format renaming to maintain consistency across all streams regardless of version.
src/migration/stream_metadata_migration.rs (3)
20-20: LGTM: Appropriate import added for HashMap usage.
The import of HashMap is necessary for the new rename_log_source_v6 function.
204-229: Well-structured migration function for standardizing log source formats.
This function properly implements the renaming of log source formats as specified in the PR objectives. The code:
- Creates a mapping between old and new format names
- Traverses the log_source array in the metadata
- Updates each log_source_format value according to the mapping
The implementation handles potential edge cases gracefully by checking if fields exist at each step.
299-305: LGTM: Comprehensive test for the new migration function.
The test effectively validates that all format names are properly transformed according to the mapping. It uses realistic test data that includes multiple log source entries with different formats.
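For illustration, a test along the lines the review describes might look like the following; the test name and data are hypothetical, reconstructed from this description rather than copied from the PR:

```rust
#[cfg(test)]
mod tests {
    use serde_json::json;

    #[test]
    fn rename_log_source_v6_maps_known_formats() {
        // Hypothetical metadata with multiple log source entries.
        let metadata = json!({
            "log_source": [
                { "log_source_format": "OtelLogs", "fields": [] },
                { "log_source_format": "Kinesis",  "fields": [] }
            ]
        });
        // Assumes the rename_log_source_v6 sketch given earlier on this page.
        let migrated = super::rename_log_source_v6(metadata);
        assert_eq!(migrated["log_source"][0]["log_source_format"], "otel-logs");
        assert_eq!(migrated["log_source"][1]["log_source_format"], "kinesis");
    }
}
```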
src/event/format/mod.rs (1)
62-63: LGTM: Added appropriate serialization names for Kinesis, OtelLogs, Pmeta, and Json variants.
The added #[serde(rename = "...")] attributes ensure these enum variants are properly serialized with the standardized names.
Also applies to: 66-67, 77-81
Actionable comments posted: 0
🧹 Nitpick comments (3)
src/migration/stream_metadata_migration.rs (3)
192-228: Consider reusing the new rename function to avoid code duplication.
The logic for mapping log-source strings and wrapping them as an array largely duplicates the new rename_log_source_v6 function. Unifying both can reduce repetition and potential maintenance overhead.
Below is an example approach to unify the logic (replace lines 192-228). This retains your single-string check, then leverages rename_log_source_v6 for the final mapping to ensure consistent transformations:

```diff
 if let Some(log_source_value) = stream_metadata_map.remove("log_source") {
     // If it's a string, convert it into an array of one object, else default to an array of one JSON.
-    if let Some(format_str) = log_source_value.as_str() {
-        let transformed_format = match format_str {
-            "Kinesis" => "kinesis",
-            "OtelLogs" => "otel-logs",
-            "OtelTraces" => "otel-traces",
-            "OtelMetrics" => "otel-metrics",
-            "Pmeta" => "pmeta",
-            "Json" => "json",
-            _ => "json",
-        };
-
-        let log_source_entry = json!({
-            "log_source_format": transformed_format,
-            "fields": []
-        });
-
-        stream_metadata_map.insert("log_source".to_owned(), json!([log_source_entry]));
-    } else {
-        let default_entry = json!({
-            "log_source_format": "json",
-            "fields": []
-        });
-        stream_metadata_map.insert("log_source".to_owned(), json!([default_entry]));
-    }
+    let mut array_entries = vec![];
+    if let Some(format_str) = log_source_value.as_str() {
+        array_entries.push(json!({"log_source_format": format_str, "fields": []}));
+    } else {
+        array_entries.push(json!({"log_source_format": "Json", "fields": []}));
+    }
+    // Insert the array for rename_log_source_v6 to process
+    stream_metadata_map.insert("log_source".to_owned(), Value::Array(array_entries));
 } else {
     let default_entry = json!({
         "log_source_format": "json",
         "fields": []
     });
     stream_metadata_map.insert("log_source".to_owned(), json!([default_entry]));
 }
+// After setting up the array-based log_source, call rename_log_source_v6 to unify transformations
+let updated_value = super::rename_log_source_v6(Value::Object(stream_metadata_map.clone()));
+*stream_metadata_map = updated_value.as_object().unwrap().clone();
```
230-255: Centralize the format mapping in a constant or helper function.
The format mapping here is identical to the match statements in v5_v6. Consider abstracting repetitive mappings into a single static or function for maintainability. This way, both v5_v6 and rename_log_source_v6 can use the same logic and remain in sync.
324-331: Expand test coverage for unknown formats.
The test covers standard conversions but does not confirm behavior for custom or unsupported formats. Including a test with a non-mapped format (e.g., "MyCustomLogs") ensures that rename_log_source_v6 behaves as intended when encountering unexpected values.
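A sketch of the suggested addition, assuming rename_log_source_v6 leaves unmapped values untouched as noted later in this review; "MyCustomLogs" is a hypothetical format:

```rust
#[test]
fn rename_log_source_v6_preserves_unknown_formats() {
    let metadata = serde_json::json!({
        "log_source": [
            { "log_source_format": "MyCustomLogs", "fields": [] }
        ]
    });
    let migrated = rename_log_source_v6(metadata);
    // The unmapped format should survive the migration unchanged.
    assert_eq!(
        migrated["log_source"][0]["log_source_format"],
        "MyCustomLogs"
    );
}
```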
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- src/migration/mod.rs (4 hunks)
- src/migration/stream_metadata_migration.rs (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- src/migration/mod.rs
⏰ Context from checks skipped due to timeout of 90000ms (10)
- GitHub Check: Quest Smoke and Load Tests for Distributed deployments
- GitHub Check: Build Default x86_64-pc-windows-msvc
- GitHub Check: Build Default aarch64-unknown-linux-gnu
- GitHub Check: Build Default aarch64-apple-darwin
- GitHub Check: Build Default x86_64-apple-darwin
- GitHub Check: Build Default x86_64-unknown-linux-gnu
- GitHub Check: Quest Smoke and Load Tests for Standalone deployments
- GitHub Check: Build Kafka x86_64-unknown-linux-gnu
- GitHub Check: Build Kafka aarch64-apple-darwin
- GitHub Check: coverage
🔇 Additional comments (2)
src/migration/stream_metadata_migration.rs (2)
20-20: No issues with this import.
This import is required for the HashMap usage in the new function.
25-25: Import addition looks good.
Adding INTERNAL_STREAM_NAME to the imports is consistent with your usage below.
Actionable comments posted: 0
🧹 Nitpick comments (2)
src/migration/stream_metadata_migration.rs (2)
216-226: Potential duplication in map_log_source_format.
This direct approach works well for immediate usage, but note that similar logic appears in create_format_mapping(). Consider consolidating them to reduce duplication and future maintenance effort.
256-266: Refactor suggestion to unify mapping logic.
This create_format_mapping() function repeats the same mapping as map_log_source_format(). Consider unifying them into a single mapping source or function to avoid duplication and reduce divergence risk.
Possible approach:

```diff
- fn map_log_source_format(format_str: &str) -> &str {
-     match format_str {
-         "Kinesis" => "kinesis",
-         "OtelLogs" => "otel-logs",
-         ...
-         _ => "json",
-     }
- }
-
- fn create_format_mapping() -> HashMap<&'static str, &'static str> {
-     HashMap::from([
-         ("Kinesis", "kinesis"),
-         ("OtelLogs", "otel-logs"),
-         ...
-     ])
- }
+ lazy_static! {
+     static ref FORMAT_MAPPING: HashMap<&'static str, &'static str> = {
+         HashMap::from([
+             ("Kinesis", "kinesis"),
+             ("OtelLogs", "otel-logs"),
+             // ...
+         ])
+     };
+ }
+
+ fn map_log_source_format(format_str: &str) -> &str {
+     FORMAT_MAPPING.get(format_str).unwrap_or(&"json")
+ }
```
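As a follow-up thought: if adding the lazy_static dependency is undesirable, a std-only sketch using OnceLock (stable since Rust 1.70) gives the same single mapping source; this is an illustration, not code from the PR:

```rust
use std::collections::HashMap;
use std::sync::OnceLock;

// Lazily initialized, shared mapping; built once on first access.
fn format_mapping() -> &'static HashMap<&'static str, &'static str> {
    static MAPPING: OnceLock<HashMap<&'static str, &'static str>> = OnceLock::new();
    MAPPING.get_or_init(|| {
        HashMap::from([
            ("Kinesis", "kinesis"),
            ("OtelLogs", "otel-logs"),
            ("OtelMetrics", "otel-metrics"),
            ("OtelTraces", "otel-traces"),
            ("Pmeta", "pmeta"),
            ("Json", "json"),
        ])
    })
}

fn map_log_source_format(format_str: &str) -> &'static str {
    format_mapping().get(format_str).copied().unwrap_or("json")
}
```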
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- src/migration/stream_metadata_migration.rs (4 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (8)
- GitHub Check: Build Default x86_64-apple-darwin
- GitHub Check: coverage
- GitHub Check: Build Default x86_64-pc-windows-msvc
- GitHub Check: Build Kafka aarch64-apple-darwin
- GitHub Check: Build Kafka x86_64-unknown-linux-gnu
- GitHub Check: Quest Smoke and Load Tests for Distributed deployments
- GitHub Check: Build Default x86_64-unknown-linux-gnu
- GitHub Check: Quest Smoke and Load Tests for Standalone deployments
🔇 Additional comments (7)
src/migration/stream_metadata_migration.rs (7)
20-20: No issues with the new import.
This import is needed to support the mappings in subsequent code.
25-25: Including the INTERNAL_STREAM_NAME reference.
Adding handlers::http::cluster::INTERNAL_STREAM_NAME ensures the function can correctly assign stream_type for internal streams. This seems aligned with the migration logic.
181-203: Validate unknown log sources defaulting to JSON.
The v5_v6 logic sets the schema version to v6 and ensures that any missing or unrecognized log_source is converted to a default JSON-based entry. If this fallback is intentional, the code looks fine. Ensure that future changes or expansions of supported formats remain consistent with this fallback approach.
204-214: transform_log_source fallback for unknown formats.
This function applies a known mapping for recognized strings and otherwise returns a default JSON entry. Confirm that silently discarding unknown format strings is the desired behavior, as the data will be overwritten by the default.
Would you like to verify that dropping unknown formats is correct, or should we preserve the unrecognized format string for debugging?
228-233: Default JSON log source.
Using an explicit default entry is a clean approach for unknown or missing data. It provides clarity about the fallback.
235-253: Check behavior for unmapped log_source formats.
Unlike v5_v6, rename_log_source_v6 will leave formats unchanged if no mapping exists. Verify whether this difference is intended:
- v5_v6 replaces unknown formats with "json".
- rename_log_source_v6 leaves unknown formats untouched.
If keeping unknown formats is desired, this is fine; otherwise, unify the behavior with v5_v6, e.g. as sketched below.
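One possible shape for that unification, sketched under the assumption that both call sites can share a single resolver; resolve_format is a hypothetical helper, not code from the PR:

```rust
use std::collections::HashMap;

// Resolves an old format name to its new name. `fallback_to_json` selects the
// v5_v6 behavior (unknown -> "json"); `false` selects the rename_log_source_v6
// behavior (unknown left untouched).
fn resolve_format<'a>(old: &'a str, fallback_to_json: bool) -> &'a str {
    let mapping: HashMap<&str, &str> = HashMap::from([
        ("Kinesis", "kinesis"),
        ("OtelLogs", "otel-logs"),
        ("OtelMetrics", "otel-metrics"),
        ("OtelTraces", "otel-traces"),
        ("Pmeta", "pmeta"),
        ("Json", "json"),
    ]);
    match mapping.get(old).copied() {
        Some(new) => new,
        None if fallback_to_json => "json",
        None => old,
    }
}
```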
336-341: Excellent test coverage for rename_log_source_v6.
This test ensures that multiple Otel-based formats are properly renamed. The approach is thorough and verifies the correct transformation, matching the PR objectives for standardizing log source formats.
Summary by CodeRabbit
New Features
- Standardized log source format names in stream info (otel-logs, otel-metrics, otel-traces, pmeta, json, kinesis), with migration of existing stream metadata to the new names.
Tests
- Added tests covering the log source format rename migration.