Merged
22 commits
093e066
Add custom FieldComparatorSource for BytesRef values returned by scripts
luigidellaquila Apr 15, 2022
71fa269
Add BytesRefProducer interface and make Version implement it
luigidellaquila Apr 19, 2022
4663b48
Add 'format' option to script sort
luigidellaquila Apr 19, 2022
c25793a
Code format
luigidellaquila Apr 19, 2022
2095555
Better NPE management (wip)
luigidellaquila Apr 19, 2022
4b14243
Remove 'format' option and use 'type' directly
luigidellaquila Apr 19, 2022
2270056
Fix test cases
luigidellaquila Apr 19, 2022
3585a06
Merge branch 'master' into poc/custom_bytesref_script_sort
elasticmachine Apr 19, 2022
1d0219a
Fix compile problem
luigidellaquila Apr 19, 2022
6ef4bcd
Merge remote-tracking branch 'luigidellaquila/poc/custom_bytesref_scr…
luigidellaquila Apr 19, 2022
69c5d93
Remove unused code
luigidellaquila Apr 19, 2022
a39eccd
Remove unused code
luigidellaquila Apr 19, 2022
ab7f3f2
Review retrieval of MappedFieldType for version
luigidellaquila Apr 19, 2022
439dc74
Optimise Version to use existing BytesRef
luigidellaquila Apr 20, 2022
87dfaad
Refactor ScriptSortBuilder and add more tests
luigidellaquila Apr 21, 2022
53203cc
Merge branch 'master' into poc/custom_bytesref_script_sort
elasticmachine Apr 21, 2022
f1f3cb8
Address review comments
luigidellaquila Apr 26, 2022
8a530ef
Merge remote-tracking branch 'luigidellaquila/poc/custom_bytesref_scr…
luigidellaquila Apr 26, 2022
0ecab77
Update docs/changelog/85990.yaml
luigidellaquila Apr 26, 2022
7fae61e
Address further review comments
luigidellaquila Apr 27, 2022
8e88f50
Merge remote-tracking branch 'luigidellaquila/poc/custom_bytesref_scr…
luigidellaquila Apr 27, 2022
1345751
Further fixes based on review comments
luigidellaquila Apr 28, 2022
7 changes: 7 additions & 0 deletions docs/changelog/85990.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
pr: 85990
summary: Allow to sort by script value using `SemVer` semantics
area: Infra/Scripting
type: bug
issues:
- 85989
- 82287
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
#
# Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
# or more contributor license agreements. Licensed under the Elastic License
# 2.0 and the Server Side Public License, v 1; you may not use this file except
# in compliance with, at your election, the Elastic License 2.0 or the Server
# Side Public License, v 1.
#

# The whitelist for the fields api
# The scripts must be whitelisted for painless to find the classes for the field API

class org.elasticsearch.script.BytesRefProducer @no_import {
}

class org.elasticsearch.script.BytesRefSortScript @no_import {
}
class org.elasticsearch.script.BytesRefSortScript$Factory @no_import {
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0 and the Server Side Public License, v 1; you may not use this file except
* in compliance with, at your election, the Elastic License 2.0 or the Server
* Side Public License, v 1.
*/

package org.elasticsearch.script;

import org.apache.lucene.util.BytesRef;

/**
* used by {@link org.elasticsearch.search.sort.ScriptSortBuilder} to refer to classes in x-pack
* (eg. org.elasticsearch.xpack.versionfield.Version) that need a custom FieldComparatorSource
*/
public interface BytesRefProducer {
Member

Any plans to use this interface on anything other than Version? Otherwise, maybe the instanceof check where this is used could simply test for Version instead. Or is this here so you don't need to reference Version from within the server module?

Contributor Author

The interface is there only to decouple this fix from x-pack; I'd be happy to remove it if we move Version to the server module.
Just for completeness, we could use this interface (and the same fix) for at least one other case, the IP field type, but the interface would still be overkill (an if/else would probably be more than enough).

Member

I see, that's what I guessed. No worries, I think if moving complicates things right now the interface is fine. Maybe this could benefit from a small class comment explaining why it's there, so the next person doesn't ask again.

Contributor

Can you add a line in the Javadoc about how this class is used?


BytesRef toBytesRef();
}
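
The decoupling discussed in the thread above can be sketched as follows. This is a minimal, self-contained illustration rather than the PR's code: `byte[]` stands in for Lucene's `BytesRef`, and `SortKeyProducer`, `PluginVersion`, and `CoreSortSupport` are hypothetical names. The core-side code checks only for the interface, so it never has to reference the plugin class.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for BytesRefProducer; byte[] replaces Lucene's BytesRef.
interface SortKeyProducer {
    byte[] toSortKey();
}

// Hypothetical plugin-side value class; the core code below never names it.
final class PluginVersion implements SortKeyProducer {
    private final String version;

    PluginVersion(String version) {
        this.version = version;
    }

    @Override
    public byte[] toSortKey() {
        // Placeholder encoding (plain UTF-8), assumed for illustration only.
        return version.getBytes(StandardCharsets.UTF_8);
    }
}

final class CoreSortSupport {
    // Core-side conversion of a script result: depends on the interface only.
    static byte[] toSortKey(Object scriptResult) {
        if (scriptResult == null) {
            return null;
        }
        if (scriptResult instanceof SortKeyProducer) {
            return ((SortKeyProducer) scriptResult).toSortKey();
        }
        // Fallback assumed for this sketch: UTF-8 bytes of the string form.
        return scriptResult.toString().getBytes(StandardCharsets.UTF_8);
    }
}
```

If the value class lived in the same module as the caller, a direct `instanceof` test on that class would do the same job, which is the trade-off raised in the review.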
Original file line number Diff line number Diff line change
@@ -0,0 +1,38 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License
* 2.0 and the Server Side Public License, v 1; you may not use this file except
* in compliance with, at your election, the Elastic License 2.0 or the Server
* Side Public License, v 1.
*/
package org.elasticsearch.script;

import java.io.IOException;
import java.util.Map;

public abstract class BytesRefSortScript extends AbstractSortScript {

public static final String[] PARAMETERS = {};

public static final ScriptContext<Factory> CONTEXT = new ScriptContext<>("bytesref_sort", Factory.class);

public BytesRefSortScript(Map<String, Object> params, DocReader docReader) {
super(params, docReader);
}

public abstract Object execute();

/**
* A factory to construct {@link BytesRefSortScript} instances.
*/
public interface LeafFactory {
BytesRefSortScript newInstance(DocReader reader) throws IOException;
}

/**
* A factory to construct stateful {@link BytesRefSortScript} factories for a specific index.
*/
public interface Factory extends ScriptFactory {
LeafFactory newFactory(Map<String, Object> params);
}
}
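
The two-level factory shape of `BytesRefSortScript` (a `Factory` that binds the params and a `LeafFactory` that creates one script instance per reader) can be sketched in miniature. Everything below is hypothetical and self-contained: `MiniSortScript` is not an Elasticsearch type, and a plain `String` segment name stands in for `DocReader`.

```java
import java.util.Map;

// Hypothetical miniature of the Factory -> LeafFactory -> script chain.
abstract class MiniSortScript {
    protected final Map<String, Object> params;
    protected final String segment; // stands in for the per-segment DocReader
    protected int doc = -1;

    MiniSortScript(Map<String, Object> params, String segment) {
        this.params = params;
        this.segment = segment;
    }

    void setDocument(int doc) {
        this.doc = doc;
    }

    abstract Object execute();

    // Creates one script instance per segment.
    interface LeafFactory {
        MiniSortScript newInstance(String segment);
    }

    // Binds the script params once and hands out leaf factories.
    interface Factory {
        LeafFactory newFactory(Map<String, Object> params);
    }
}

final class MiniSortScriptDemo {
    // A factory whose scripts return "<prefix><segment>-<doc>".
    static final MiniSortScript.Factory FACTORY = params -> segment -> new MiniSortScript(params, segment) {
        @Override
        Object execute() {
            return params.get("prefix") + segment + "-" + doc;
        }
    };

    static Object run(String segment, int doc) {
        MiniSortScript.LeafFactory leafFactory = FACTORY.newFactory(Map.of("prefix", "v"));
        MiniSortScript script = leafFactory.newInstance(segment);
        script.setDocument(doc);
        return script.execute();
    }
}
```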
Original file line number Diff line number Diff line change
@@ -48,6 +48,7 @@ public class ScriptModule {
ScoreScript.CONTEXT,
NumberSortScript.CONTEXT,
StringSortScript.CONTEXT,
BytesRefSortScript.CONTEXT,
TermsSetQueryScript.CONTEXT,
UpdateScript.CONTEXT,
BucketAggregationScript.CONTEXT,
Original file line number Diff line number Diff line change
@@ -29,10 +29,13 @@
import org.elasticsearch.index.fielddata.SortedNumericDoubleValues;
import org.elasticsearch.index.fielddata.fieldcomparator.BytesRefFieldComparatorSource;
import org.elasticsearch.index.fielddata.fieldcomparator.DoubleValuesComparatorSource;
import org.elasticsearch.index.mapper.MappedFieldType;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.query.QueryRewriteContext;
import org.elasticsearch.index.query.QueryShardException;
import org.elasticsearch.index.query.SearchExecutionContext;
import org.elasticsearch.script.BytesRefProducer;
import org.elasticsearch.script.BytesRefSortScript;
import org.elasticsearch.script.DocValuesDocReader;
import org.elasticsearch.script.NumberSortScript;
import org.elasticsearch.script.Script;
@@ -72,14 +75,16 @@ public class ScriptSortBuilder extends SortBuilder<ScriptSortBuilder> {

private NestedSortBuilder nestedSort;

+ private DocValueFormat scriptResultValueFormat = DocValueFormat.RAW;
+
/**
* Constructs a script sort builder with the given script.
*
* @param script
* The script to use.
* @param type
- * The type of the script, can be either {@link ScriptSortType#STRING} or
- * {@link ScriptSortType#NUMBER}
+ * The type of the script, can be {@link ScriptSortType#STRING},
+ * {@link ScriptSortType#NUMBER} or {@link ScriptSortType#VERSION}
*/
public ScriptSortBuilder(Script script, ScriptSortType type) {
Objects.requireNonNull(script, "script cannot be null");
@@ -246,9 +251,18 @@ public static ScriptSortBuilder fromXContent(XContentParser parser, String eleme

@Override
public SortFieldAndFormat build(SearchExecutionContext context) throws IOException {
+ if ("version".equals(this.type.toString())) {
+ try {
+ // TODO there must be a better way to get the field type...
+ MappedFieldType scriptFieldType = context.buildAnonymousFieldType(this.type.toString());
+ scriptResultValueFormat = scriptFieldType.docValueFormat(null, null);
+ } catch (Exception e) {
+ // "version" type is not available, fall back to RAW and sort as a string
+ }
+ }
return new SortFieldAndFormat(
new SortField("_script", fieldComparatorSource(context), order == SortOrder.DESC),
- DocValueFormat.RAW
+ scriptResultValueFormat == null ? DocValueFormat.RAW : scriptResultValueFormat
);
}

@@ -355,6 +369,64 @@ protected void setScorer(Scorable scorer) {
}
};
}
case VERSION -> {
final BytesRefSortScript.Factory factory = context.compile(script, BytesRefSortScript.CONTEXT);
final BytesRefSortScript.LeafFactory searchScript = factory.newFactory(script.getParams());
return new BytesRefFieldComparatorSource(null, null, valueMode, nested) {
BytesRefSortScript leafScript;

@Override
protected SortedBinaryDocValues getValues(LeafReaderContext context) throws IOException {
leafScript = searchScript.newInstance(new DocValuesDocReader(searchLookup, context));
final BinaryDocValues values = new AbstractBinaryDocValues() {

@Override
public boolean advanceExact(int doc) throws IOException {
leafScript.setDocument(doc);
return true;
}

@Override
public BytesRef binaryValue() {
Object result = leafScript.execute();
if (result == null) {
return null;
}
if (result instanceof BytesRefProducer) {
return ((BytesRefProducer) result).toBytesRef();
}

if (scriptResultValueFormat == null) {
throw new IllegalArgumentException("Invalid sort type: version");
}
return scriptResultValueFormat.parseBytesRef(result);
}
};
return FieldData.singleton(values);
}

@Override
protected void setScorer(Scorable scorer) {
leafScript.setScorer(scorer);
}

@Override
public BucketedSort newBucketedSort(
BigArrays bigArrays,
SortOrder sortOrder,
DocValueFormat format,
int bucketSize,
BucketedSort.ExtraData extra
) {
throw new IllegalArgumentException(
"error building sort for [_script]: "
+ "script sorting only supported on [numeric] scripts but was ["
+ type
+ "]"
);
}
};
}
default -> throw new QueryShardException(context, "custom script sort type [" + type + "] not supported");
}
}
@@ -394,7 +466,9 @@ public enum ScriptSortType implements Writeable {
/** script sort for a string value **/
STRING,
/** script sort for a numeric value **/
- NUMBER;
+ NUMBER,
+ /** script sort for a Version field value **/
+ VERSION;

@Override
public void writeTo(final StreamOutput out) throws IOException {
@@ -413,6 +487,7 @@ public static ScriptSortType fromString(final String str) {
return switch (str.toLowerCase(Locale.ROOT)) {
case ("string") -> ScriptSortType.STRING;
case ("number") -> ScriptSortType.NUMBER;
case ("version") -> ScriptSortType.VERSION;
default -> throw new IllegalArgumentException("Unknown ScriptSortType [" + str + "]");
};
}
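
Two details of this enum change are easy to miss: `VERSION` is appended after the existing constants so that `STRING` and `NUMBER` keep their ordinals (the test file notes that serialization relies on them), and `fromString` lower-cases with `Locale.ROOT` so parsing does not depend on the JVM's default locale. A minimal sketch, assuming a wire format that is just the ordinal; `SortKind`, `toWire`, and `fromWire` are hypothetical names, not Elasticsearch APIs:

```java
import java.util.Locale;

// Hypothetical enum mirroring the shape of ScriptSortType; not the Elasticsearch class.
enum SortKind {
    STRING,
    NUMBER,
    VERSION; // appended last so STRING and NUMBER keep ordinals 0 and 1

    // Assumed wire format for this sketch: the ordinal as an int.
    int toWire() {
        return ordinal();
    }

    static SortKind fromWire(int ordinal) {
        SortKind[] values = values();
        if (ordinal < 0 || ordinal >= values.length) {
            throw new IllegalArgumentException("Unknown SortKind ordinal [" + ordinal + "]");
        }
        return values[ordinal];
    }

    // Locale.ROOT keeps lower-casing independent of the default locale.
    static SortKind fromString(String str) {
        switch (str.toLowerCase(Locale.ROOT)) {
            case "string":
                return STRING;
            case "number":
                return NUMBER;
            case "version":
                return VERSION;
            default:
                throw new IllegalArgumentException("Unknown SortKind [" + str + "]");
        }
    }
}
```

Inserting the new constant anywhere but the end would shift the ordinals of the constants after it, so a reader on an older version would decode them as a different type.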
Original file line number Diff line number Diff line change
@@ -130,16 +130,21 @@ public void testScriptSortType() {
// we rely on these ordinals in serialization, so changing them breaks bwc.
assertEquals(0, ScriptSortType.STRING.ordinal());
assertEquals(1, ScriptSortType.NUMBER.ordinal());
assertEquals(2, ScriptSortType.VERSION.ordinal());

assertEquals("string", ScriptSortType.STRING.toString());
assertEquals("number", ScriptSortType.NUMBER.toString());
assertEquals("version", ScriptSortType.VERSION.toString());

assertEquals(ScriptSortType.STRING, ScriptSortType.fromString("string"));
assertEquals(ScriptSortType.STRING, ScriptSortType.fromString("String"));
assertEquals(ScriptSortType.STRING, ScriptSortType.fromString("STRING"));
assertEquals(ScriptSortType.NUMBER, ScriptSortType.fromString("number"));
assertEquals(ScriptSortType.NUMBER, ScriptSortType.fromString("Number"));
assertEquals(ScriptSortType.NUMBER, ScriptSortType.fromString("NUMBER"));
assertEquals(ScriptSortType.VERSION, ScriptSortType.fromString("version"));
assertEquals(ScriptSortType.VERSION, ScriptSortType.fromString("Version"));
assertEquals(ScriptSortType.VERSION, ScriptSortType.fromString("VERSION"));
}

public void testScriptSortTypeNull() {
@@ -301,6 +306,10 @@ public void testBuildCorrectComparatorType() throws IOException {
sortBuilder = new ScriptSortBuilder(mockScript(MOCK_SCRIPT_NAME), ScriptSortType.NUMBER);
sortField = sortBuilder.build(createMockSearchExecutionContext()).field;
assertThat(sortField.getComparatorSource(), instanceOf(DoubleValuesComparatorSource.class));

sortBuilder = new ScriptSortBuilder(mockScript(MOCK_SCRIPT_NAME), ScriptSortType.VERSION);
sortField = sortBuilder.build(createMockSearchExecutionContext()).field;
assertThat(sortField.getComparatorSource(), instanceOf(BytesRefFieldComparatorSource.class));
}

/**
Original file line number Diff line number Diff line change
@@ -145,6 +145,8 @@ public boolean needs_score() {
return context.factoryClazz.cast(factory);
} else if (context.instanceClazz.equals(StringSortScript.class)) {
return context.factoryClazz.cast(new MockStringSortScriptFactory(script));
} else if (context.instanceClazz.equals(BytesRefSortScript.class)) {
return context.factoryClazz.cast(new MockBytesRefSortScriptFactory(script));
} else if (context.instanceClazz.equals(IngestScript.class)) {
IngestScript.Factory factory = vars -> new IngestScript(vars) {
@Override
@@ -804,4 +806,30 @@ public String execute() {
};
}
}

class MockBytesRefSortScriptFactory implements BytesRefSortScript.Factory {
private final MockDeterministicScript script;

MockBytesRefSortScriptFactory(MockDeterministicScript script) {
this.script = script;
}

@Override
public boolean isResultDeterministic() {
return script.isResultDeterministic();
}

@Override
public BytesRefSortScript.LeafFactory newFactory(Map<String, Object> parameters) {
return docReader -> new BytesRefSortScript(parameters, docReader) {
@Override
public BytesRefProducer execute() {
Map<String, Object> vars = new HashMap<>(parameters);
vars.put("params", parameters);
vars.put("doc", getDoc());
return (BytesRefProducer) script.apply(vars);
}
};
}
}
}
Original file line number Diff line number Diff line change
@@ -7,21 +7,28 @@

package org.elasticsearch.xpack.versionfield;

- import org.elasticsearch.xcontent.ToXContent;
+ import org.apache.lucene.util.BytesRef;
+ import org.elasticsearch.script.BytesRefProducer;
+ import org.elasticsearch.xcontent.ToXContentFragment;
import org.elasticsearch.xcontent.XContentBuilder;

import java.io.IOException;

/**
* Script value class.
- * TODO(stu): implement {@code Comparable<Version>} based on {@code VersionEncoder#prefixDigitGroupsWithLength(String, BytesRefBuilder)}
- * See: https://github.com/elastic/elasticsearch/issues/82287
*/
- public class Version implements ToXContent {
+ public class Version implements ToXContentFragment, BytesRefProducer, Comparable<Version> {
protected String version;
+ protected BytesRef bytes;

public Version(String version) {
Contributor

I wonder if we should have another constructor that takes the encoded version; this String version is mostly for users.

Contributor Author

Do you mean a protected Version(String version, BytesRef ref) for internal use only, so that we don't have to recalculate the BytesRef? I guess a constructor that takes only a BytesRef would have the same problem, i.e. we would have to decode it to get the string value.

Member

From a quick look, I think we can have a ctor that takes only the BytesRef and internally decodes it to a String once, then use one for the xContent/toString part and the other for comparison. Version only seems to be created from VersionStringDocValuesField, where we can also get the bytes directly. In fact, we currently do the decoding step there once in getInternal; that could happen in the ctor instead.

Contributor

@stu-elastic stu-elastic Apr 26, 2022

We still need the Version(String) constructor, which is whitelisted for painless users. Users have to have the ability to provide a Version as a default value when fetching version fields, as in this test.

Contributor Author

Yes, I'm keeping both (see last commit)

this.version = version;
+ this.bytes = VersionEncoder.encodeVersion(version).bytesRef;
}
+
+ protected Version(BytesRef bytes) {
+ this.version = VersionEncoder.decodeVersion(bytes);
+ this.bytes = bytes;
+ }

@Override
@@ -35,7 +42,12 @@ public XContentBuilder toXContent(XContentBuilder builder, Params params) throws
}

@Override
- public boolean isFragment() {
- return false;
+ public BytesRef toBytesRef() {
+ return bytes;
}
+
+ @Override
+ public int compareTo(Version o) {
+ return toBytesRef().compareTo(o.toBytesRef());
+ }
}
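
Why `compareTo` works on the encoded bytes rather than on the version string: plain string order puts "1.10.0" before "1.2.0". The sketch below uses a deliberately simplified, hypothetical encoding in which each run of digits is prefixed by a letter for its length, in the spirit of the `prefixDigitGroupsWithLength` name mentioned in the removed TODO. It is not the actual `VersionEncoder`, and it ignores leading zeros, pre-release suffixes, and digit runs longer than 26 digits. `SimpleVersion` is a hypothetical class name, and `byte[]` stands in for `BytesRef`.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical value class; the encoding is illustrative only.
final class SimpleVersion implements Comparable<SimpleVersion> {
    private final String version;
    private final byte[] bytes; // encoded once, reused for every comparison

    SimpleVersion(String version) {
        this.version = version;
        this.bytes = encode(version);
    }

    private static boolean isAsciiDigit(char c) {
        return c >= '0' && c <= '9';
    }

    // Prefix each run of digits with a letter for its length ('a' = 1 digit, 'b' = 2, ...),
    // so numbers with fewer digits sort first under byte-wise comparison.
    static byte[] encode(String version) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < version.length()) {
            char c = version.charAt(i);
            if (isAsciiDigit(c)) {
                int start = i;
                while (i < version.length() && isAsciiDigit(version.charAt(i))) {
                    i++;
                }
                out.append((char) ('a' + (i - start) - 1)).append(version, start, i);
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString().getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public int compareTo(SimpleVersion o) {
        // Unsigned lexicographic comparison of the encoded forms.
        return Arrays.compareUnsigned(bytes, o.bytes);
    }

    @Override
    public String toString() {
        return version;
    }
}
```

With this encoding "1.2.0" becomes `a1.a2.a0` and "1.10.0" becomes `a1.b10.a0`, so the byte comparison orders 1.2.0 before 1.10.0, while `"1.10.0".compareTo("1.2.0")` on the raw strings is negative.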
Original file line number Diff line number Diff line change
@@ -11,6 +11,7 @@
import org.elasticsearch.painless.spi.Whitelist;
import org.elasticsearch.painless.spi.WhitelistLoader;
import org.elasticsearch.script.AggregationScript;
import org.elasticsearch.script.BytesRefSortScript;
import org.elasticsearch.script.FieldScript;
import org.elasticsearch.script.FilterScript;
import org.elasticsearch.script.NumberSortScript;
@@ -41,6 +42,7 @@ public Map<ScriptContext<?>, List<Whitelist>> getContextWhitelists() {
whitelist.put(FieldScript.CONTEXT, list);
whitelist.put(NumberSortScript.CONTEXT, list);
whitelist.put(StringSortScript.CONTEXT, list);
whitelist.put(BytesRefSortScript.CONTEXT, list);
return whitelist;
}
}