Skip to content

Conversation

@bpintea
Copy link
Contributor

@bpintea bpintea commented Mar 17, 2021

This replaces the qualified JDBC array enum types with a single ARRAY
type, which is then further qualified within the JDBC column info.

The name of the type reported through the metadata primitives is still
the composition of the base type and "_array".

@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-ql (Team:QL)

@bpintea
Copy link
Contributor Author

bpintea commented Mar 17, 2021

This addresses #68966 (comment)

@bpintea bpintea changed the title SQL: Replace qualified array types with a generic one SQL: Replace JDBC qualified array types with a generic one Mar 17, 2021
@bpintea
Copy link
Contributor Author

bpintea commented Mar 18, 2021

@elasticmachine run elasticsearch-ci/bwc

Copy link
Contributor

@matriv matriv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@astefan astefan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@costin costin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.
Please add some tests with arrays or arrays to validate the name creation and extraction.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs double checking. The javadocs indicate if it's a user defined type, it should be a fully qualified name - that is, it should return a class type that potentially can be loaded by the app.
I wonder if in this case, we should return the SQL Array interface or just "ARRAY".

Copy link
Contributor Author

@bpintea bpintea Mar 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that is, it should return a class type that potentially can be loaded by the app.

Wondering if that's correct though: wouldn't you want to use getColumnClassName() for that (which indeed returns a FQ class name, ....ARRAY) instead?

The docs for this method read Returns: type name used by the database. (and then indeed If the column type is a user-defined type, then a fully-qualified type name is returned).
But my interpretation is that (1) it has to be the DB-used name and (2) that the fully-qualified attribute refers here to a hierarchy (which is a common understanding of this concept), which we lack.

Edit:
getColumnClassName() returned column's base type, which I believe is incorrect. I'll push and update.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getColumnClassName() returned column's base type, which I believe is incorrect. I'll push and update.

I believe it's correct for non-arrays, in case of an array the proper classname is required - int[] -> [I.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unchecked cast from object to list - shouldn't this fail?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's only a warning, but would be nice to add the @SuppressWarnings("unchecked")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cast failure is checked (by code), but yes, not sure why isn't a warning emitted by the compiler; editor labels the annotation as redundant, if added.

@bpintea
Copy link
Contributor Author

bpintea commented Mar 22, 2021

Please add some tests with arrays or arrays to validate the name creation and extraction.

I've added a test on the corrected naming of the column class name.
The column type name was tested already.

@bpintea bpintea requested a review from costin March 22, 2021 11:55
@bpintea bpintea force-pushed the feat/sql-multivalue branch from 5659f2c to 63ad874 Compare March 25, 2021 10:09
@bpintea bpintea force-pushed the feat/jdbc-array-type branch from edba5cd to 4d2ed42 Compare March 25, 2021 10:18
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getColumnClassName() returned column's base type, which I believe is incorrect. I'll push and update.

I believe it's correct for non-arrays, in case of an array the proper classname is required - int[] -> [I.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is incorrect since this method is used to create the value container on the java side.
int[] -> int[].class.getName() -> [I, int[][] -> int[][].class -> [[I, Integer[] -> Integer[].class.getName() -> [LInteger;, etc...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, that could also make sense and I could update it inline with these examples, but just for comparison, taking PostgreSQL's JDBC driver as a comparison, running the following query:
SELECT ARRAY[1, 2, 5], ARRAY['foo', 'bar'], 'foobar'
will return for the the following methods:

  • resultSet.getMetaData().getColumnName(i)
  • .getColumnTypeName(i)
  • .getColumnClassName(i)
  • .getColumnType(i)

the following results:

           array|           array|        ?column?|
           _int4|           _text|            text|
  java.sql.Array|  java.sql.Array|java.lang.String|
            2003|            2003|              12|

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's worth checking other drivers and potentially the JDBC spec to see whether there's consensus on whether the interface name should be returned or the concrete class.
Since there's a chance of nested arrays, returning java.sql.Array makes more sense.

Comment on lines 67 to 68
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the fact that it's a non public class is a give away that the return value is incorrect.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit hesitant here as I'm not sure if the docs aren't a bit conflicting: Returns the fully-qualified name of the Java class whose instances are manufactured if the method ResultSet.getObject is called to retrieve a value from the column.
So "org.elasticsearch.xpack.sql.jdbc.JdbcArray" is the one fitting this requirement.

But then also: ResultSet.getObject may return a subclass of the class returned by this method.
...so it might actually be a subclass and not those whose instances are manufactured.

Postgresql's driver obviously went for the superclass (from the above example).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue here is with visibility; I believe this is better explained in the JDBC spec.
In short, as a consumer of the driver I'm using the methods here to perform discovery, for example:

ResultSet.getObject(1, ClassLoader.loadClass(rsMetadata.getColumnClassName(1))

or expect that
ClassLoader.loadClass(rsMetadata.getColumnClassName()).isAssignable(resultSet.getObject(1).getClass())

The above are problematic since either I need to use the driver classloader to load the private class or the object return won't match the classname (as it's not part of its hierarchy).
Essentially whatever the results are, they should be visible and useful to the user - anything else is a bug.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would make sense, but I'm not sure if having the implementation private and being able to load it by it's fully qualified class name are compatible?

Based on the docs (ResultSet.getObject may return a subclass of the class returned by this method., i.e. getColumnClassName()), it would seem that returning the interface name for the class name might be the solution.

The JDBC spec doesn't cover this detail.
From what I can tell, some implementations make use of the ARRAY interface - like Solr's Calcite and Presto JDBC - while others seem to use the implementation - like Teradata and MySQL.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would make sense, but I'm not sure if having the implementation private and being able to load it by it's fully qualified class name are compatible?

There's no value in returning a class that is private since one cannot use the class object directly (MyPrivate.class). Return a public parent (interface or otherwise) instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

@costin costin removed the request for review from palesz April 4, 2021 14:33
Copy link
Member

@costin costin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@bpintea
Copy link
Contributor Author

bpintea commented May 20, 2021

@elasticmachine run elasticsearch-ci/bwc

@bpintea bpintea force-pushed the feat/sql-multivalue branch from 63ad874 to 2aa1365 Compare May 20, 2021 20:27
bpintea added 5 commits May 20, 2021 23:15
This adds support for the xDBC and CLI clients.
Based on those, the CsvJdbc-based QA tests are also integrated.
This replaces the qualified JDBC array enum types with a single ARRAY
type, which is then further qualified within the JDBC column info.

The name of the type reported through the metadata primitives is still
the composition of the base type and "_array".
This needs to return the FQ class name of the object instance as
returned by ResultSet#getObject().
Return java.sql.Array for array types in
ResultSetMetaData#getColumnClassName().
Remove two functions duplicated while rebasing.
@bpintea bpintea force-pushed the feat/jdbc-array-type branch from 9c57cb5 to 162b074 Compare May 20, 2021 21:32
@bpintea bpintea merged commit feb9b9e into elastic:feat/sql-multivalue May 20, 2021
@bpintea bpintea deleted the feat/jdbc-array-type branch May 20, 2021 21:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants