-
Notifications
You must be signed in to change notification settings - Fork 25.6k
SQL: Add xDBC and CLI support. QA CSV specs #68966
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SQL: Add xDBC and CLI support. QA CSV specs #68966
Conversation
This adds support for the xDBC and CLI clients. Based on those, the CsvJdbc-based QA tests are also integrated.
Adds a REST test that generates all possible path combination of a JSON object, all generating values to be retrieved with ARRAY().
…/multivalue-clients
|
@elasticmachine run elasticsearch-ci/2 |
1 similar comment
|
@elasticmachine run elasticsearch-ci/2 |
…/multivalue-clients
|
Pinging @elastic/es-ql (Team:QL) |
matriv
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!! Had a first pass and left some mostly minor comments.
| public void testTwoQueriesExecuted() { | ||
| Metrics metrics = new Metrics(); | ||
| Verifier verifier = new Verifier(metrics); | ||
| Verifier verifier = new Verifier(metrics, SqlTestUtils.TEST_CFG.version()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
static import?
| EsIndex test = new EsIndex("test", mapping); | ||
| IndexResolution getIndexResult = IndexResolution.valid(test); | ||
| analyzer = new Analyzer(SqlTestUtils.TEST_CFG, new SqlFunctionRegistry(), getIndexResult, new Verifier(new Metrics())); | ||
| analyzer = new Analyzer(SqlTestUtils.TEST_CFG, new SqlFunctionRegistry(), getIndexResult, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
static import of TEST_CFG?
| EsIndex test = new EsIndex("test", mapping); | ||
| getIndexResult = IndexResolution.valid(test); | ||
| analyzer = new Analyzer(SqlTestUtils.TEST_CFG, functionRegistry, getIndexResult, new Verifier(new Metrics())); | ||
| analyzer = new Analyzer(SqlTestUtils.TEST_CFG, functionRegistry, getIndexResult, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
| EsIndex test = new EsIndex("test", mapping); | ||
| Analyzer analyzer = new Analyzer(SqlTestUtils.TEST_CFG, new FunctionRegistry(), IndexResolution.valid(test), | ||
| new Verifier(new Metrics())); | ||
| new Verifier(new Metrics(), SqlTestUtils.TEST_CFG.version())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
| } | ||
| } | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: new lines
|
|
||
| private String error(IndexResolution getIndexResult, String sql) { | ||
| Analyzer analyzer = new Analyzer(TEST_CFG, new SqlFunctionRegistry(), getIndexResult, new Verifier(new Metrics())); | ||
| Analyzer analyzer = new Analyzer(TEST_CFG, new SqlFunctionRegistry(), getIndexResult, new Verifier(new Metrics(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
|
|
||
| private LogicalPlan accept(IndexResolution resolution, String sql) { | ||
| Analyzer analyzer = new Analyzer(TEST_CFG, new SqlFunctionRegistry(), resolution, new Verifier(new Metrics())); | ||
| Analyzer analyzer = new Analyzer(TEST_CFG, new SqlFunctionRegistry(), resolution, new Verifier(new Metrics(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed hopefully everywhere (except here, where it was already statically imported and used)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
| } | ||
| } | ||
|
|
||
| private static List<String> splitCsvLine(String line) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you checked if org.relique.jdbc.csv (which we use already) has this functionality?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They do have the functionality, but don't export it.
| } | ||
|
|
||
| static boolean isArray(EsType type) { | ||
| return type.getName().endsWith("_ARRAY"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't we have a flag on the EsType for that instead of checking the name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, thanks, fixed to check against Types.ARRAY.
| throw new SQLFeatureNotSupportedException("Array not supported"); | ||
| EsType type = columnType(columnIndex); | ||
| if (isArray(type) == false) { | ||
| throw new SQLException("Cannot get an array for the column [" + columnIndex + "] of type [" + type.getName() + "]"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe Cannot get column ... of type ... as array?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better, fixed, thx.
|
Can we have some test with prepared statement where an array is set to a query param? (Of course param currently can only be in the SELECT clause). If it's not in the scope of this PR to have this, then let's have a simple test that tests the Exception thrown. |
- static imports - exception message rephrasing - better array type checking.
Array parameters are indeed out of scope atp (just like literals or constructors).
Not sure how to go about this:
|
I meant to have a test for |
astefan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a first round of reviews, after going through roughly half of the code changes.
| if (index < 0 || index > Integer.MAX_VALUE) { | ||
| throw new SQLException("Index value [" + index + "] out of range [0, " + Integer.MAX_VALUE + "]"); | ||
| } | ||
| int index0 = (int) index - 1; // 0-based index |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
index0 initialization could be moved after the count < 0 check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved.
| if (count < 0) { | ||
| throw new SQLException("Illegal negative count [" + count + "]"); | ||
| } | ||
| int available = index0 < values.size() ? values.size() - index0 : 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you double check the logic here, because I'm not sure it's correct. From what I can see index can be 0. If it is 0 and there are no values in the array values this means the Arrays.copyOfRange will act on an array starting at index -1 until -1 + Math.min(0, 1) and the call to copyOfRange will throw an exception for the negative to index. I think index should be checked at the start of this method with the value 1 (getArray javadoc mentions index the array index of the first element to retrieve;the first element is at index 1).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I can see index can be 0.
Right, it could, but should not, as it is 1-based index.
I think index should be checked at the start of this method with the value 1
Correct. The out of range exception you exemplified below was triggered due to the incorrect initial check, index < 0, which is now corrected to < 1 (I also assumed it's 0-based... as an array, so wrote the check, then confronted with the docs, but missed to fix the check :-/).
| : type; | ||
| } | ||
|
|
||
| static EsType arrayOf(EsType type) throws SQLException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this method here, since it's only used in TypeConverterTests? Wouldn't have a better place in that test class? It is because ENUM_NAME_TO_TYPE is local to TypeUtils?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is because ENUM_NAME_TO_TYPE is local to TypeUtils?
Yes.
(TypeUtils#isString() is btw also only used once, in JdbcPreparedStatement, with no TypeUtils dependencies.)
| static final List<EsType> ARRAY_TYPES = Arrays.stream(EsType.values()).filter(TypeUtils::isArray).collect(Collectors.toList()); | ||
|
|
||
| public void testMetaData() throws Exception { | ||
| EsType arrayType = randomFrom(ARRAY_TYPES); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here I would have iterated through all of the array types and assert the checks. I don't think it's worth randomizing this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exhaustive-ised.
| Object[] contained = (Object[]) array.getArray(4, 3); | ||
| assertEquals(asList(3, 4, 5), asList(contained)); | ||
|
|
||
| Object[] overlapping = (Object[]) array.getArray(9, 3); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this test class my assumption about getArray with a 0 as index parameter is confirmed if I do Object[] bla = (Object[]) array.getArray(0, 1);. The call fails with ArrayIndexOutOfBoundsException.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, thanks, fixed.
|
|
||
| static <T> Set<T> randomSet(int minSetSize, int maxSetSize, Supplier<T> valueConstructor) { | ||
| final int size = randomIntBetween(minSetSize, maxSetSize); | ||
| Set<T> set = new HashSet<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not putting a size to the HashSet constructor, since it's already known. To help with this a bit, we have in CollectionUtils the method mapSize.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CollectionUtils is in ql, which isn't a dependency. Not sure if optimising a random-size-driven test would be worth, but still, I could add the method ("formula") here if you think it'd be better that way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not worth adding ql as a dependency for this.
- fix index bounds check and add tests for it; - imporove testing; - minor refactorings.
I see. Makes sense, I've added it. |
…/multivalue-clients
| List<Integer> expected = asList(ints); | ||
| expected.sort(Integer::compare); | ||
| List<?> actual = asList((Object[]) intArray.getArray()); | ||
| actual.sort(Comparator.comparing(x -> ((Integer) x))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| actual.sort(Comparator.comparing(x -> ((Integer) x))); | |
| actual.sort(Comparator.comparingInt(x -> (Integer) x)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Applied, thanks.
|
|
||
| strings.sort(String::compareTo); | ||
| actual = asList((Object[]) strArray.getArray()); | ||
| actual.sort(Comparator.comparing(x -> ((String) x))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| actual.sort(Comparator.comparing(x -> ((String) x))); | |
| actual.sort(Comparator.comparing(x -> (String) x)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
| builder.append("}").append("\n"); | ||
| }); | ||
|
|
||
| request = new Request("POST", "/" + index + "/_bulk?refresh=wait_for"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused about refresh=wait_for and the parameter below request.addParameter("refresh", "true"). Aren't these overriding one another?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding of the two is that refresh=true initiates a refresh right away, while refresh=wait_for will cause the response to only be given once the refresh is finished. I.e. refresh=true speeds up the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@astefan thanks for looking further into it. I've replaced ?refresh=wait_for with ?refresh=true (the last parameter being the one that applies) throughout the class.
| } | ||
| } | ||
| if (actualObject instanceof Point) { | ||
| // geo points are loaded form doc values where they are stored as long-encoded values leading |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, this statement is not true anymore. We don't use docvalues extraction. We used to, not anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've removed the comment (but kept the "with-delta" comparison, I guess the fields also returns them from docvalues?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the moment, no. Extraction from other sources other than _source is something for the future.
| @SuppressWarnings("unchecked") | ||
| public void testMultiValueDifferentPathsWithArraysAndNulls() throws IOException { | ||
| ExhaustiveMultiPathMapper<Long> mapper = new ExhaustiveMultiPathMapper<>(randomIntBetween(3, 10)); | ||
| // TODO AwaitsFix(bugUrl = "https://github.com/elastic/elasticsearch/issues/68979") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice find!
| } | ||
|
|
||
| Map<String, Object> map() { | ||
| return map == null ? null : new HashMap<>(map); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why new HashMap(map) and not returning the map itself?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To allow the caller to modify what's returned and still fetch the original if needed. (Flexibility not really used now, tho.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's not used, I'd say let's keep it simple for the moment and when we'll need it, we'll add the flexibility. My 2c
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the returns to the respective unmodifiable collections.
| if (val instanceof Map) { | ||
| @SuppressWarnings("unchecked") | ||
| Map<String, Object> innerMap = (Map<String, Object>) val; | ||
| if (randomNonNegativeByte() < 60) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this class needs some comments here and there to help understanding what is it actually doing. For example, what's with if (randomNonNegativeByte() < 60)? Is 60 just a random number you picked up or does it have a deeper meaning?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added some comments here and there.
In this case // assuming a linear distribution of the random non-negative byte values (0-255), [access this branch] roughly 1/4 of the time
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are some helper methods already existent that fit into the same ballpark idea. Maybe reuse those, if you think they fit the original intention in this test? rarely(), usually(), frequently().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I've switched to usually(), which boost the probability to over 50% from ~25%, but should not make a difference testing-wise.
| private List<T> values; | ||
|
|
||
| // the depth excludes the leaf | ||
| ExhaustiveMultiPathMapper(int depth) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not passing the supplier to the constructor as well and initialize all the internal variables - paths, map, values - in the constructor and from outside you only call the geter methods. The way it is now, the caller of this class needs to call the constructor and, then, call the generate method to populate all the internal variables. And I don't see why it should leave room for trial&error from the user of this class. Restricting everything to the constructor makes the user of the class just call the constructor (to have everything ready) and then get the internal variables values by calling geters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, I've refactored.
(The initial idea was to allow recalling one mapper instance with different value suppliers, but this is superfluous after all.)
- more comments - refactor testing class - remove unnecessary brackets.
…/multivalue-clients
| assertEquals("Conversion from type [byte[]] to [INTERVAL_DAY_TO_MINUTE] not supported", sqle.getMessage()); | ||
| } | ||
|
|
||
| public void testThrownExceptionWhenSettingArrayValue() throws SQLException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thx!
- refactor the way results are returned in mapper; - reuse existing probability functions. - remove redundant `refresh=wait_for`.
…/multivalue-clients
…/multivalue-clients
astefan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
matriv
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, Nice work!
…/multivalue-clients
- adjust test after upstream ticket has been addressed.
Now that 7.13.0 is available, update the release target to that version.
costin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let a few comments. Great work!
|
|
||
| import static org.elasticsearch.xpack.sql.jdbc.TypeUtils.baseType; | ||
|
|
||
| public class JdbcArray implements Array { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should have package visibility to avoid instantiation outside the package.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, thanks, fixed.
| GEO_SHAPE( ExtraTypes.GEOMETRY), | ||
| SHAPE( ExtraTypes.GEOMETRY); | ||
| SHAPE( ExtraTypes.GEOMETRY), | ||
| BOOLEAN_ARRAY( Types.ARRAY), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed with @bpintea to refactor this to use only a generic ARRAY type.
|
|
||
| @Override | ||
| public Object getArray(long index, int count, Map<String, Class<?>> map) throws SQLException { | ||
| if (map == null || map.isEmpty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small preference, it's better to do the validation and error checking upfront as oppose to last.
That is check if the map is non null and non-empty and throw the exception vs what the way it is right now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated it.
However, I believe we might need to do further updates to stay consistent.
| } | ||
|
|
||
| @Override | ||
| public ResultSet getResultSet() throws SQLException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A follow-up should implement this method - typically by returning a ResultSet that contains two columns - index and value.
A good deal of clients rely on this method since they're used to parsing ResultSet vs getArray().
See JdbcDatabaseMetaData#emptySet for example of creating a dummy ResultSet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ResultSet methods added in #69512.
| } | ||
|
|
||
| @Override | ||
| public void free() throws SQLException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once this method is called, all other methods should return an exception which is not the case.
There are no side-effects however this is the mandatory behavior described by the spec:
* After <code>free</code> has been called, any attempt to invoke a
* method other than <code>free</code> will result in a <code>SQLException</code>
* being thrown. If <code>free</code> is called multiple times, the subsequent
* calls to <code>free</code> are treated as a no-op.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, on another look, that phrasing is mandatory. Check added.
| } | ||
| case IP: | ||
| return v.toString(); | ||
| case BOOLEAN_ARRAY: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Needs to be improved
| private final EsType type; | ||
| private final List<?> values; | ||
|
|
||
| public JdbcArray(EsType type, List<?> values) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pass in the base type separately instead of determining it from the EsType
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to pass the base type.
| } | ||
| } | ||
|
|
||
| private static Object extractObject(ResultSet resultSet, int column, boolean fromEs, Class<?> expectedColumnClass) throws SQLException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A lot of work went in here.
| * Some of the paths are randomly multiplied: | ||
| * {"a": [{"b": {"c": 1}}}, {"b": {"c": 1}}}], ...} | ||
| */ | ||
| private static class ExhaustiveMultiPathMapper<T> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And here...
|
|
||
| public final class VersionCompatibilityChecks { | ||
|
|
||
| public static final SqlVersion INTRODUCING_ARRAY_TYPES = SqlVersion.fromId(V_7_13_0.id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we need to remove INTRODUCING_ since it will be the same for all new features. Then again, we need to differentiate from those that are being removed.
This adds support for the xDBC and CLI clients. Based on those, the CsvJdbc-based QA tests are also integrated.
This adds support for the xDBC and CLI clients. Based on those, the CsvJdbc-based QA tests are also integrated.
This adds support for the xDBC and CLI clients. Based on those, the CsvJdbc-based QA tests are also integrated.
This adds support for the xDBC and CLI clients. Based on those, the CsvJdbc-based QA tests are also integrated.
This adds support for the xDBC and CLI clients.
Based on these, the CsvJdbc-based QA tests are also added.