Skip to content

Conversation

@csoulios
Copy link
Contributor

@csoulios csoulios commented Jan 25, 2022

Since _tsid cannot be a multi-value field, this PR modifies the TimeSeriesIdFieldMapper so that _tsid is added as a SortedDocValuesField (instead of a SortedSetDocValuesField)

Relates to #80276

@csoulios csoulios added :Search Foundations/Mapping Index mappings, including merging and defining field types :StorageEngine/TSDB You know, for Metrics v8.1.0 labels Jan 25, 2022
@csoulios csoulios requested a review from romseygeek January 25, 2022 13:46

@Override
public SortField sortField(@Nullable Object missingValue, MultiValueMode sortMode, Nested nested, boolean reverse) {
SortField sortField = new SortField(getFieldName(), SortField.Type.STRING, reverse);
Copy link
Contributor Author

@csoulios csoulios Jan 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if setting this to SortField.Type.STRING by default is ok.

If I leave it to the default SortField.Type.CUSTOM method IndexSortConfig#validateIndexSortField fails

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type.STRING is correct here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type.CUSTOM means that it's expecting you to give it a custom FieldComparator, which we don't need. It is a horrible API from about 2004 and I keep meaning to hack it out and replace it with something less awful...

@csoulios csoulios requested a review from jpountz January 25, 2022 19:04
@csoulios csoulios marked this pull request as ready for review January 25, 2022 19:04
@elasticmachine elasticmachine added Team:Search Meta label for search team Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) labels Jan 25, 2022
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search (Team:Search)

@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-analytics-geo (Team:Analytics)

Copy link
Contributor

@romseygeek romseygeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, I left a couple of suggestions.


@Override
public SortField sortField(@Nullable Object missingValue, MultiValueMode sortMode, Nested nested, boolean reverse) {
SortField sortField = new SortField(getFieldName(), SortField.Type.STRING, reverse);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type.STRING is correct here.

SortField sortField = new SortField(getFieldName(), SortField.Type.STRING, reverse);
sortField.setMissingValue(
sortMissingLast(missingValue) ^ reverse ? SortedSetSortField.STRING_LAST : SortedSetSortField.STRING_FIRST
);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we expect anything to be missing a _tsid field? If not we can just ignore setMissingValue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In time series indices _tsid field must always be present. I only left it because SortedOrdinalsIndexFieldData may possibly be used for other fields too.

iterator = scorer.iterator();
docBase = context.docBase;
tsids = context.reader().getSortedSetDocValues(TimeSeriesIdFieldMapper.NAME);
tsids = context.reader().getSortedDocValues(TimeSeriesIdFieldMapper.NAME);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd use DocValues.getSorted(context.reader(), TimeSeriesIdFieldMapper.NAME) here as it will protect you from NPEs and also automatically convert between SortedSet and Sorted if we need it.

if (docId != DocIdSetIterator.NO_MORE_DOCS && (liveDocs == null || liveDocs.get(docId))) {
if (tsids.advanceExact(docId)) {
BytesRef tsid = tsids.lookupOrd(tsids.nextOrd());
BytesRef tsid = tsids.lookupOrd(tsids.ordValue());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can replace this with an ord comparison now that we know there's a single value, but let's do that in a follow-up.


@Override
public LeafOrdinalsFieldData load(LeafReaderContext context) {
return new SortedSetBytesLeafFieldData(context.reader(), getFieldName(), toScriptField);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth a comment here that SortedSetBytesLeafFieldData will happily load SortedDocValues as well, so there's no need for a single-valued implementation.


@Override
public SortField sortField(@Nullable Object missingValue, MultiValueMode sortMode, Nested nested, boolean reverse) {
SortField sortField = new SortField(getFieldName(), SortField.Type.STRING, reverse);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Type.CUSTOM means that it's expecting you to give it a custom FieldComparator, which we don't need. It is a horrible API from about 2004 and I keep meaning to hack it out and replace it with something less awful...

@csoulios csoulios requested a review from romseygeek January 26, 2022 12:06
Copy link
Contributor

@romseygeek romseygeek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@csoulios csoulios merged commit 40cab17 into elastic:master Jan 26, 2022
@csoulios csoulios deleted the tsid-dv-field branch January 26, 2022 16:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

>non-issue :Search Foundations/Mapping Index mappings, including merging and defining field types :StorageEngine/TSDB You know, for Metrics Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) Team:Search Meta label for search team v8.1.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants