Skip to content

Conversation

@mpatou
Copy link
Contributor

@mpatou mpatou commented Mar 2, 2024

Summary

On the columns that are part of the clause for a join, remove the schema
prefix by creating an alias based on the table and setting the table
attribute of the columns in the join clause to be alias and not the
table.
The name of the alias is picked to be the name of the table as well as
it works in plain SQL.

We only do that if:

  1. there is a schema prefix on the table (ie. commons or production)
  2. the "table" is not actually already an alias
  3. the "table" is not actually a subquery

Testing

I created unit tests for the visit_join function they are all passing.
I also created a wheel package and uploaded it in superset, I used to
have issues with queries when superset wanted to do a self join to limit
the number of series returned in a query.
It was complaining:

Relation name `commons` not found. If you are trying to access a nested
field within an object

With the fix the query is working fine.

@mpatou
Copy link
Contributor Author

mpatou commented Mar 2, 2024

I also tried the tests without the fix and I would get error like :

'"s1"."t1" JOIN "s2"."t2" ON "x" = "s2"."t2"."y"' != '"s1"."t1" JOIN "s2"."t2" ON "x" = "t2"."y"'

@mpatou mpatou requested review from dhruba, haneeshr and jhochmuth March 2, 2024 03:50
@mpatou mpatou force-pushed the fix_joins_schema branch from 981266f to 2d9772e Compare March 4, 2024 20:05
@mpatou mpatou changed the base branch from add_ci_github to main March 4, 2024 20:05
@mpatou mpatou force-pushed the fix_joins_schema branch from 2d9772e to 8530127 Compare March 4, 2024 20:16
@mpatou mpatou changed the base branch from main to redo_github_ci March 4, 2024 20:16
@mpatou mpatou force-pushed the fix_joins_schema branch from 8530127 to 5a0828c Compare March 4, 2024 20:25
@mpatou mpatou force-pushed the fix_joins_schema branch from 5a0828c to d40f65a Compare March 4, 2024 20:28
Summary

On the columns that are part of the clause for a join, remove the schema
prefix by creating an alias based on the table and setting the `table`
attribute of the columns in the join clause to be alias and not the
table.
The name of the alias is picked to be the name of the table as well as
it works in plain SQL.

We only do that if:

1. there is a schema prefix on the table (ie. `commons` or `production`)
2. the "table" is not actually already an alias
3. the "table" is not actually a subquery

Testing

I created unit tests for the visit_join function they are all passing.
I also created a wheel package and uploaded it in `superset`, I used to
have issues with queries when superset wanted to do a self join to limit
the number of series returned in a query.
It was complaining:
```
Relation name `commons` not found. If you are trying to access a nested
field within an object
```
With the fix the query is working fine.
@mpatou mpatou force-pushed the fix_joins_schema branch from d40f65a to c577823 Compare March 4, 2024 20:31
@mpatou mpatou changed the base branch from redo_github_ci to fix_formatting March 4, 2024 20:31
@mpatou mpatou deleted the branch fix_formatting March 5, 2024 22:55
@mpatou mpatou closed this Mar 5, 2024
@mpatou mpatou deleted the fix_joins_schema branch April 2, 2024 16:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants