feat(ingest/sql-parsing): Add fallback parser for MSSQL stored procedures with control flow #15340
base: master
…ures with control flow
Implements fallback parsing for MSSQL/TSQL stored procedures containing TRY/CATCH blocks and other control flow syntax that sqlglot doesn't support. The parser extracts and parses individual DML statements separately, then aggregates the lineage results.
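The extract-then-aggregate approach described above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: `Lineage`, `DML_PREFIXES`, `aggregate_dml_lineage` and `toy_parse` are hypothetical names, and the toy parser only stands in for the real sqlglot-based one.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Iterable, Set


@dataclass
class Lineage:
    """Hypothetical container for the tables a statement reads and writes."""

    in_tables: Set[str] = field(default_factory=set)
    out_tables: Set[str] = field(default_factory=set)


# Assumption: only statements starting with these keywords carry lineage.
DML_PREFIXES = ("INSERT", "UPDATE", "DELETE", "MERGE", "SELECT")


def aggregate_dml_lineage(
    statements: Iterable[str], parse_one: Callable[[str], Lineage]
) -> Lineage:
    """Parse each DML statement on its own and union the per-statement results."""
    total = Lineage()
    for stmt in statements:
        text = stmt.strip()
        if not text.upper().startswith(DML_PREFIXES):
            continue  # control flow such as BEGIN TRY / END CATCH is skipped
        try:
            result = parse_one(text)
        except ValueError:
            continue  # one unparseable statement should not sink the others
        total.in_tables |= result.in_tables
        total.out_tables |= result.out_tables
    return total


def toy_parse(stmt: str) -> Lineage:
    """Toy stand-in for a real parser: handles INSERT INTO x SELECT ... FROM y only."""
    match = re.match(
        r"INSERT\s+INTO\s+(\S+)\s+SELECT\s+.*?\s+FROM\s+(\S+)",
        stmt,
        re.IGNORECASE | re.DOTALL,
    )
    if match is None:
        raise ValueError(f"unsupported statement: {stmt!r}")
    return Lineage(in_tables={match.group(2)}, out_tables={match.group(1)})
```

Passing the statements of a TRY/CATCH procedure body to `aggregate_dml_lineage(statements, toy_parse)` yields one combined `Lineage`, with the control flow lines and any statement the toy parser rejects left out.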
Use single TSQL_CONTROL_FLOW_KEYWORDS for both detection and filtering. Enables support for stored procedures with IF, WHILE, and other control flow.
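For illustration, one way a single keyword tuple could drive both detection and filtering is sketched below. The keyword values are an assumed subset (the PR's actual `TSQL_CONTROL_FLOW_KEYWORDS` may differ), and `has_control_flow` and `drop_control_flow` are hypothetical helper names. The matching is deliberately naive: it does not skip string literals or comments.

```python
import re
from typing import List

# Assumption: an illustrative subset, not the PR's actual list.
TSQL_CONTROL_FLOW_KEYWORDS = (
    "BEGIN TRY",
    "END TRY",
    "BEGIN CATCH",
    "END CATCH",
    "IF",
    "WHILE",
)

# Build one alternation from the shared tuple; multi-word keywords tolerate any whitespace.
_ALTERNATION = "|".join(
    r"\s+".join(re.escape(word) for word in keyword.split())
    for keyword in TSQL_CONTROL_FLOW_KEYWORDS
)
_ANYWHERE = re.compile(rf"\b(?:{_ALTERNATION})\b", re.IGNORECASE)
_AT_START = re.compile(rf"^\s*(?:{_ALTERNATION})\b", re.IGNORECASE)


def has_control_flow(sql: str) -> bool:
    """Detection: does the SQL text contain any control flow keyword?"""
    return _ANYWHERE.search(sql) is not None


def drop_control_flow(statements: List[str]) -> List[str]:
    """Filtering: keep only statements that do not start with a control flow keyword."""
    return [stmt for stmt in statements if not _AT_START.match(stmt)]
```

Deriving both patterns from the same tuple means a keyword added for detection is automatically filtered as well, which is presumably the motivation for the consolidation.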
    This is necessary because sqlglot doesn't support TSQL control flow syntax like
    TRY/CATCH blocks, which causes the entire procedure to be unparseable.
    """
    from datahub.sql_parsing.split_statements import split_statements
Can we move this import to the top of the file?
    """
    from datahub.sql_parsing.split_statements import split_statements

    logger.info(
This should be logged at debug level.
Am I right? I'm wondering whether, from the user's perspective, it makes sense to show this at info level.
    return create


def _is_stored_procedure_with_unsupported_syntax(
This is MSSQL-specific syntax, right? Can we check the dialect as well?
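A dialect guard of the kind requested here might look like the sketch below. All names (`TSQL_DIALECT_NAMES`, `needs_fallback`) and both regular expressions are assumptions made for illustration; the PR's actual `_is_stored_procedure_with_unsupported_syntax` signature is not shown in this thread.

```python
import re
from typing import Optional

# Assumption: dialect names under which TSQL control flow detection should apply.
TSQL_DIALECT_NAMES = frozenset({"tsql", "mssql"})

_CREATE_PROC = re.compile(
    r"\bCREATE\s+(?:OR\s+ALTER\s+)?PROC(?:EDURE)?\b", re.IGNORECASE
)
_TRY_CATCH = re.compile(r"\bBEGIN\s+(?:TRY|CATCH)\b", re.IGNORECASE)


def needs_fallback(sql: str, dialect: Optional[str]) -> bool:
    """Return True only for TSQL stored procedures that contain TRY/CATCH blocks."""
    # Check the dialect first so other platforms never reach the TSQL-specific regexes.
    if (dialect or "").lower() not in TSQL_DIALECT_NAMES:
        return False
    return bool(_CREATE_PROC.search(sql)) and bool(_TRY_CATCH.search(sql))
```

With this ordering, the same procedure text is routed to the fallback for `"tsql"` but left on the normal path for `"postgres"` or `"mysql"`.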
… stored procedure parser
- Add dialect check to ensure TSQL control flow detection only applies to MSSQL/TSQL
- Fix infinite recursion bug by adding _disable_fallback_parser flag to prevent recursive fallback parser calls
- Add test coverage for non-MSSQL dialects (PostgreSQL, MySQL)

This addresses a code review comment and resolves a critical recursion issue where split_statements could return chunks still matching the CREATE PROCEDURE pattern, causing infinite loops.
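The recursion guard described in this commit can be illustrated with a toy model. Only the `_disable_fallback_parser` flag name comes from the commit message; `parse_sql`, `toy_split` and the pattern below are hypothetical, and the splitter deliberately reproduces the worst case of returning a chunk that still looks like a procedure.

```python
import re
from typing import List

_CREATE_PROC = re.compile(r"\bCREATE\s+PROCEDURE\b", re.IGNORECASE)


def toy_split(sql: str) -> List[str]:
    """Worst-case splitter: hands back the input as a single, unchanged chunk."""
    return [sql]


def parse_sql(sql: str, _disable_fallback_parser: bool = False) -> List[str]:
    """Toy parser: returns the chunks that reached the 'normal' parsing path."""
    if _CREATE_PROC.search(sql) and not _disable_fallback_parser:
        results: List[str] = []
        for chunk in toy_split(sql):
            # The flag stops a chunk that still matches CREATE PROCEDURE
            # from re-entering the fallback path, so recursion depth is capped.
            results.extend(parse_sql(chunk, _disable_fallback_parser=True))
        return results
    return [sql.strip()]  # stand-in for regular parsing
```

Without the flag, `parse_sql` would call itself on the same text indefinitely whenever the splitter returns a chunk that still matches the pattern; with it, each chunk is parsed exactly once by the regular path.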