Fivetran Ingestion doesn't work for BigQuery Destination due to failure in parsing BigQuery sql syntax #14210

@yingying-chen-cko

Description

Describe the bug

I specified the Fivetran log dataset together with the GCP project ID, and the ingestion failed because the sqlglot.parse_one call defaults the dialect to snowflake. See: https://github.com/datahub-project/datahub/blob/master/metadata-ingestion/src/datahub/ingestion/source/fivetran/fivetran_log_api.py#L88

To Reproduce
Steps to reproduce the behavior:

Set fivetran_log_config.bigquery_destination_config.dataset to a value that includes the GCP project ID the dataset is in, then trigger the ingestion:

source:
    type: fivetran
    config:
        history_sync_lookback_period: 1
        fivetran_log_config:
            destination_platform: bigquery
            bigquery_destination_config:
              dataset: bq-project-id.fivetran_logs
              credential:
                project_id: '${BQ_PROJECT_ID}'
                private_key_id: '${BQ_DATAHUB_SA_PRIVATE_KEY_ID}'
                private_key: '${BQ_DATAHUB_SA_PRIVATE_KEY}'
                client_email: '${BQ_DATAHUB_SA_CLIENT_EMAIL}'
                client_id: '${BQ_DATAHUB_SA_CLIENT_ID}'
        stateful_ingestion:
          enabled: true

pipeline_name: 'fivetran-default-ingestion'

Expected behavior
The ingestion should succeed. Instead, it fails with this error:

<class 'sqlglot.errors.ParseError'>: Invalid expression / Unexpected token. Line 9, Col: 10.
  er_id,
  connector_type_id,
  connector_name,
  paused,
  sync_frequency,
  destination_id
FROM -dxxxata-xxx-prod-xxx-.fivetran_logs.connector
WHERE
  _fivetran_deleted = FALS
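
For anyone who wants to check this outside of DataHub, here is a minimal sketch of how the parse failure could be reproduced with sqlglot alone. It assumes sqlglot is installed. The helper names (build_query, parses, compare_dialects) are made up for illustration and are not part of DataHub, and the query is a shortened stand-in for the one in the error above, not the exact query the Fivetran source runs.

```python
from typing import Dict


def build_query(dataset: str) -> str:
    """Build a shortened, hypothetical version of the connector query."""
    return (
        "SELECT connector_id, connector_name "
        f"FROM {dataset}.connector "
        "WHERE _fivetran_deleted = FALSE"
    )


def parses(sql: str, dialect: str) -> bool:
    """Return True if sqlglot can parse `sql` under the given dialect."""
    # Imported lazily so build_query stays usable without sqlglot installed.
    import sqlglot
    from sqlglot.errors import ParseError

    try:
        sqlglot.parse_one(sql, dialect=dialect)
        return True
    except ParseError:
        return False


def compare_dialects(dataset: str) -> Dict[str, bool]:
    """Try the same query under the Snowflake and BigQuery dialects."""
    sql = build_query(dataset)
    return {d: parses(sql, d) for d in ("snowflake", "bigquery")}
```

Calling compare_dialects("bq-project-id.fivetran_logs") should show whether the hyphenated project ID is only rejected when the query is parsed as Snowflake SQL, which is what the error above suggests.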
