
Conversation


@askumar27 askumar27 commented Nov 18, 2025

📋 Summary

Quote database identifiers in Fivetran connector to align with Snowflake connector pattern and ensure consistent case-sensitive handling across all identifiers. Includes backward compatibility for valid unquoted identifiers to support existing Snowflake databases and schemas.

🎯 Motivation

This change addresses a regression identified in the postmortem: database names were not quoted, while schema names have been quoted since v1.2.0.7. The inconsistency created confusion and potential issues for users with case-sensitive or special-character database names, and the earlier schema-quoting change also introduced a backward incompatibility for Snowflake configurations that relied on unquoted identifiers.

Root Cause:

Why This Matters:

  • Consistency: Aligns Fivetran connector with the Snowflake connector's pattern of always quoting identifiers
  • Special Characters: Enables proper handling of database names with hyphens, spaces, dots, and other special characters
  • Case Sensitivity: Ensures database names preserve exact case, matching the behavior already established for schema names
  • Cross-Platform Support: Required for proper SQL transpilation to BigQuery and Databricks destinations
  • Backward Compatibility: Automatically handles existing Snowflake databases created with unquoted identifiers

Related Context:

🏗️ Architecture/Design Notes

Identifier Handling Pattern:

  • Follows Snowflake's quoted identifier convention (double quotes)
  • All identifiers (database and schema) are now consistently quoted
  • Quote escaping: embedded double quotes are doubled (`""`) so database/schema names containing quotes are handled safely (see the sketch after this list)
  • SQL queries are written in Snowflake SQL and transpiled to target dialects at runtime
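
As a rough illustration of this quoting pattern, here is a minimal sketch; the helper name `quote_identifier` is an assumption for illustration, not the connector's actual code:

```python
def quote_identifier(name: str) -> str:
    # Wrap the identifier in double quotes, escaping any embedded
    # double quote by doubling it (Snowflake's escaping rule):
    #   my"db  ->  "my""db"
    return '"' + name.replace('"', '""') + '"'
```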

Backward Compatibility Strategy:

  • Valid unquoted identifiers are detected using _is_valid_unquoted_identifier():
    • Must start with letter (A-Z) or underscore (_)
    • Must contain only letters, numbers, and underscores
    • Cannot be already quoted (starts/ends with double quotes)
  • For Snowflake destinations, valid unquoted identifiers are automatically uppercased before quoting
  • This matches Snowflake's behavior: unquoted identifiers are auto-converted to uppercase
  • Ensures existing databases created without quotes continue to work seamlessly (a sketch of this logic follows the list)
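
The validation and normalization described above can be sketched roughly as follows. `_is_valid_unquoted_identifier` is the name used in the PR, but this exact implementation and the `normalize_snowflake_identifier` wrapper (which reuses the `quote_identifier` sketch above) are assumptions:

```python
import re

# Letters, digits, and underscores only; must not start with a digit.
_UNQUOTED_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

def _is_valid_unquoted_identifier(name: str) -> bool:
    # Names already wrapped in double quotes are not unquoted identifiers.
    if name.startswith('"') and name.endswith('"'):
        return False
    return bool(_UNQUOTED_RE.match(name))

def normalize_snowflake_identifier(name: str) -> str:
    # Snowflake folds unquoted identifiers to uppercase, so valid unquoted
    # names are uppercased before quoting; anything else is quoted as-is,
    # preserving case and special characters.
    if _is_valid_unquoted_identifier(name):
        name = name.upper()
    return quote_identifier(name)
```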

Alignment with Snowflake Connector:

  • Matches the pattern used in src/datahub/ingestion/source/snowflake/snowflake_query.py
  • Both connectors now use: `use database "{db_name}"` and `"{schema_name}".table_name` (see the usage sketch after this list)
  • Ensures consistent behavior across DataHub's Snowflake-related connectors
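
For concreteness, a usage sketch built on the hypothetical helpers above; the query strings illustrate the pattern rather than the connector's exact SQL:

```python
db = normalize_snowflake_identifier("fivetran_log")      # '"FIVETRAN_LOG"'
schema = normalize_snowflake_identifier("my-schema.v2")  # '"my-schema.v2"' (special chars: quoted as-is)

use_db_query = f"use database {db}"  # use database "FIVETRAN_LOG"
table_ref = f"{schema}.connector"    # "my-schema.v2".connector
```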

📊 Impact Assessment

Breaking Changes:

  • NO
  • Backward compatibility is automatically handled for valid unquoted identifiers
  • Users with standard identifiers (letters, numbers, underscores only) require no configuration changes
  • Users with special characters or case-sensitive requirements must ensure exact case matching; support for such identifiers is a new enhancement in this PR

Migration Impact:

  • No Action Required for standard identifiers (letters, numbers, underscores only):

    • Valid unquoted identifiers are automatically uppercased to match existing Snowflake objects
    • Example: fivetran_logs → automatically becomes "FIVETRAN_LOGS" (more worked cases below)
    • Existing configurations continue to work without changes
  • Recommended Best Practice: Maintain exact case in configuration for consistency:

    • If database is stored as MYDATABASE in Snowflake → use MYDATABASE in config
    • If database is stored as mydatabase in Snowflake → use mydatabase in config
    • This ensures consistency and makes configuration more explicit
  • Timeline: schema quoting has been in effect for all CLI versions since v1.2.0.7; this change extends the same quoting to database names
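
To make the mapping concrete, a few worked cases using the hypothetical `normalize_snowflake_identifier` sketch above:

```python
# Valid unquoted identifiers: uppercased, then quoted (no action required).
assert normalize_snowflake_identifier("fivetran_logs") == '"FIVETRAN_LOGS"'
assert normalize_snowflake_identifier("_staging") == '"_STAGING"'

# Special characters or embedded quotes: quoted as-is, case preserved.
assert normalize_snowflake_identifier("my-database") == '"my-database"'
assert normalize_snowflake_identifier('odd"name') == '"odd""name"'
```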

Risk Level: Low

  • Backward compatibility automatically handles existing configurations
  • Comprehensive test coverage (unit + integration) reduces risk of regressions
  • Change follows established pattern already used for schema names
  • Low technical risk - change is straightforward and well-tested

🔗 References

…se quoted identifiers for Snowflake compatibility
@github-actions github-actions bot added ingestion PR or Issue related to the ingestion of metadata docs Issues and Improvements to docs labels Nov 18, 2025
@datahub-cyborg datahub-cyborg bot added the needs-review Label for PRs that need review from a maintainer. label Nov 18, 2025

codecov bot commented Nov 18, 2025

❌ 1 Tests Failed:

| Tests completed | Failed | Passed | Skipped |
| --------------- | ------ | ------ | ------- |
| 5496            | 1      | 5495   | 38      |
View the top 1 failed test(s) by shortest run time
tests.integration.kafka-connect-confluent-cloud.test_kafka_connect_confluent_cloud::test_kafka_connect_confluent_cloud_ingest
Stack Traces | 60.8s run time
docker_compose_runner = <function docker_compose_runner.<locals>.run at 0x7fdd6b41f6a0>
pytestconfig = <_pytest.config.Config object at 0x7fde49cc9590>
test_resources_dir = PosixPath('.../tests/integration/kafka-connect-confluent-cloud')

    @pytest.fixture(scope="module")
    def confluent_cloud_mock_runner(
        docker_compose_runner, pytestconfig, test_resources_dir
    ):
        docker_compose_file = [str(test_resources_dir / "docker-compose.override.yml")]
    
>       with docker_compose_runner(
            docker_compose_file, "kafka-connect-confluent-cloud"
        ) as docker_services:

.../integration/kafka-connect-confluent-cloud/test_kafka_connect_confluent_cloud.py:45: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
.../hostedtoolcache/Python/3.11.14....../x64/lib/python3.11/contextlib.py:137: in __enter__
    return next(self.gen)
           ^^^^^^^^^^^^^^
.../datahub/testing/docker_utils.py:65: in run
    with pytest_docker.plugin.get_docker_services(
.../hostedtoolcache/Python/3.11.14....../x64/lib/python3.11/contextlib.py:137: in __enter__
    return next(self.gen)
           ^^^^^^^^^^^^^^
venv/lib/python3.11........./site-packages/pytest_docker/plugin.py:212: in get_docker_services
    docker_compose.execute(command)
venv/lib/python3.11........./site-packages/pytest_docker/plugin.py:140: in execute
    return execute(command, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

command = 'docker compose --parallel -1 -f ".../tests/integration/kafka-connect-confluent-cloud/docker-compose.override.yml" -p "pytest5034-kafka-connect-confluent-cloud" up --build --wait'
success_codes = (0,), ignore_stderr = False

    def execute(command: str, success_codes: Iterable[int] = (0,), ignore_stderr: bool = False) -> Union[bytes, Any]:
        """Run a shell command."""
        try:
            stderr_pipe = subprocess.DEVNULL if ignore_stderr else subprocess.STDOUT
            output = subprocess.check_output(command, stderr=stderr_pipe, shell=True)
            status = 0
        except subprocess.CalledProcessError as error:
            output = error.output or b""
            status = error.returncode
            command = error.cmd
    
        if status not in success_codes:
>           raise Exception(
                'Command {} returned {}: """{}""".'.format(command, status, output.decode("utf-8"))
            )
E           Exception: Command docker compose --parallel -1 -f ".../tests/integration/kafka-connect-confluent-cloud/docker-compose.override.yml" -p "pytest5034-kafka-connect-confluent-cloud" up --build --wait returned 1: """#1 [internal] load local bake definitions
E           #1 reading from stdin 578B done
E           #1 DONE 0.0s
E           
E           #2 [internal] load build definition from Dockerfile
E           #2 transferring dockerfile: 364B done
E           #2 DONE 0.0s
E           
E           #3 [internal] load metadata for docker.io/library/python:3.11-slim
E           #3 DONE 0.2s
E           
E           #4 [internal] load .dockerignore
E           #4 transferring context: 2B done
E           #4 DONE 0.0s
E           
E           #5 [internal] load build context
E           #5 transferring context: 18.80kB done
E           #5 DONE 0.0s
E           
E           #6 [1/4] FROM docker.io/library/python:3.11-slim@sha256:8ef21a26e7c342e978a68cf2d6b07627885930530064f572f432ea422a8c0907
E           #6 resolve docker.io/library/python:3.11-slim@sha256:8ef21a26e7c342e978a68cf2d6b07627885930530064f572f432ea422a8c0907 done
E           #6 sha256:c4116d4d7ea9320db352f6516001262753529edf1e20b2c6609a6b9a49cc6be4 1.75kB / 1.75kB done
E           #6 sha256:040af88f5bce9e9239b7e739e86304d26964d1d55ada56b9297a3d891e91634d 5.48kB / 5.48kB done
E           #6 sha256:0e4bc2bd6656e6e004e3c749af70e5650bac2258243eb0949dea51cb8b7863db 7.34MB / 29.78MB 0.1s
E           #6 sha256:22b63e76fde1e200371ed9f3cee91161d192063bcff65c9ab6bf63819810a974 1.29MB / 1.29MB 0.1s done
E           #6 sha256:b3dd773c329649f22e467ae63d1c612a039a0559dec99ffb9ada904ab5c60c55 6.29MB / 14.36MB 0.1s
E           #6 sha256:1771569cc1299abc9cc762fc4419523e721b11a3927ef968ae63ba0a4a88f2da 0B / 251B 0.1s
E           #6 sha256:8ef21a26e7c342e978a68cf2d6b07627885930530064f572f432ea422a8c0907 10.37kB / 10.37kB done
E           #6 sha256:0e4bc2bd6656e6e004e3c749af70e5650bac2258243eb0949dea51cb8b7863db 27.26MB / 29.78MB 0.3s
E           #6 sha256:b3dd773c329649f22e467ae63d1c612a039a0559dec99ffb9ada904ab5c60c55 14.36MB / 14.36MB 0.2s done
E           #6 sha256:1771569cc1299abc9cc762fc4419523e721b11a3927ef968ae63ba0a4a88f2da 251B / 251B 0.1s done
E           #6 sha256:0e4bc2bd6656e6e004e3c749af70e5650bac2258243eb0949dea51cb8b7863db 29.78MB / 29.78MB 0.3s done
E           #6 extracting sha256:0e4bc2bd6656e6e004e3c749af70e5650bac2258243eb0949dea51cb8b7863db 0.1s
E           #6 extracting sha256:0e4bc2bd6656e6e004e3c749af70e5650bac2258243eb0949dea51cb8b7863db 0.5s done
E           #6 extracting sha256:22b63e76fde1e200371ed9f3cee91161d192063bcff65c9ab6bf63819810a974 0.1s done
E           #6 extracting sha256:b3dd773c329649f22e467ae63d1c612a039a0559dec99ffb9ada904ab5c60c55
E           #6 extracting sha256:b3dd773c329649f22e467ae63d1c612a039a0559dec99ffb9ada904ab5c60c55 0.4s done
E           #6 extracting sha256:1771569cc1299abc9cc762fc4419523e721b11a3927ef968ae63ba0a4a88f2da
E           #6 extracting sha256:1771569cc1299abc9cc762fc4419523e721b11a3927ef968ae63ba0a4a88f2da done
E           #6 DONE 1.7s
E           
E           #7 [2/4] WORKDIR /app
E           #7 DONE 0.0s
E           
E           #8 [3/4] COPY confluent_cloud_mock_api.py ./
E           #8 DONE 0.0s
E           
E           #9 [4/4] RUN pip install --no-cache-dir flask
E           #9 1.613 Collecting flask
E           #9 1.627   Downloading flask-3.1.2-py3-none-any.whl.metadata (3.2 kB)
E           #9 1.639 Collecting blinker>=1.9.0 (from flask)
E           #9 1.643   Downloading blinker-1.9.0-py3-none-any.whl.metadata (1.6 kB)
E           #9 1.659 Collecting click>=8.1.3 (from flask)
E           #9 1.661   Downloading click-8.3.1-py3-none-any.whl.metadata (2.6 kB)
E           #9 1.670 Collecting itsdangerous>=2.2.0 (from flask)
E           #9 1.673   Downloading itsdangerous-2.2.0-py3-none-any.whl.metadata (1.9 kB)
E           #9 1.686 Collecting jinja2>=3.1.2 (from flask)
E           #9 1.688   Downloading jinja2-3.1.6-py3-none-any.whl.metadata (2.9 kB)
E           #9 1.727 Collecting markupsafe>=2.1.1 (from flask)
E           #9 1.730   Downloading markupsafe-3.0.3-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl.metadata (2.7 kB)
E           #9 1.750 Collecting werkzeug>=3.1.0 (from flask)
E           #9 1.753   Downloading werkzeug-3.1.3-py3-none-any.whl.metadata (3.7 kB)
E           #9 1.764 Downloading flask-3.1.2-py3-none-any.whl (103 kB)
E           #9 1.767    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 103.3/103.3 kB 53.3 MB/s eta 0:00:00
E           #9 1.770 Downloading blinker-1.9.0-py3-none-any.whl (8.5 kB)
E           #9 1.772 Downloading click-8.3.1-py3-none-any.whl (108 kB)
E           #9 1.773    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 108.3/108.3 kB 579.6 MB/s eta 0:00:00
E           #9 1.775 Downloading itsdangerous-2.2.0-py3-none-any.whl (16 kB)
E           #9 1.778 Downloading jinja2-3.1.6-py3-none-any.whl (134 kB)
E           #9 1.779    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 134.9/134.9 kB 607.9 MB/s eta 0:00:00
E           #9 1.781 Downloading markupsafe-3.0.3-cp311-cp311-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl (22 kB)
E           #9 1.784 Downloading werkzeug-3.1.3-py3-none-any.whl (224 kB)
E           #9 1.785    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 224.5/224.5 kB 657.7 MB/s eta 0:00:00
E           #9 1.820 Installing collected packages: markupsafe, itsdangerous, click, blinker, werkzeug, jinja2, flask
E           #9 2.066 Successfully installed blinker-1.9.0 click-8.3.1 flask-3.1.2 itsdangerous-2.2.0 jinja2-3.1.6 markupsafe-3.0.3 werkzeug-3.1.3
E           #9 2.067 WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
E           #9 2.114 
E           #9 2.114 [notice] A new release of pip is available: 24.0 -> 25.3
E           #9 2.114 [notice] To update, run: pip install --upgrade pip
E           #9 DONE 2.3s
E           
E           #10 exporting to image
E           #10 exporting layers
E           #10 exporting layers 0.5s done
E           #10 writing image sha256:0cc303943d4c48bd103d1aaff9a5a09541bf80c5ad2b9381fec888cf42f000f8 done
E           #10 naming to docker.io/library/pytest5034-kafka-connect-confluent-cloud-confluent-cloud-mock done
E           #10 DONE 0.5s
E           
E           #11 resolving provenance for metadata file
E           #11 DONE 0.0s
E            confluent-cloud-mock  Built
E            Network pytest5034-kafka-connect-confluent-cloud_default  Creating
E            Network pytest5034-kafka-connect-confluent-cloud_default  Created
E            Container test_confluent_cloud_mock  Creating
E            Container test_confluent_cloud_mock  Created
E            Container test_confluent_cloud_mock  Starting
E            Container test_confluent_cloud_mock  Started
E            Container test_confluent_cloud_mock  Waiting
E           container test_confluent_cloud_mock is unhealthy
E           """.

venv/lib/python3.11........./site-packages/pytest_docker/plugin.py:37: Exception


@datahub-cyborg datahub-cyborg bot added pending-submitter-merge and removed needs-review Label for PRs that need review from a maintainer. labels Nov 18, 2025
…ntifiers in Snowflake

- Added logic to handle unquoted database and schema names by converting them to uppercase quoted identifiers.
- Introduced a validation method for unquoted identifiers to ensure compliance with Snowflake's requirements.
- Updated tests to cover various cases for database and schema names, ensuring correct handling and transpilation for Snowflake, BigQuery, and Databricks.
…sensitivity in Snowflake

- Removed outdated information regarding unquoted identifiers and emphasized the importance of using quoted identifiers for database and schema names.
- Added details on backward compatibility for valid unquoted identifiers and recommended best practices for configuration.
- Clarified action required for users to ensure case sensitivity in their Fivetran source configurations.