@romseygeek
Contributor

This allows us to be more conservative about what needs to be loaded
when using the fields API, and opens up the possibility of avoiding
using stored fields or source altogether if we can use doc values to
fetch values.
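
As a rough illustration of the idea in the description, each fetcher can declare what it needs loaded, and the fetch phase can merge those declarations before touching stored fields or source. The `LoadSpec` record below is a hypothetical stand-in written for this note; its name, fields, and `merge` method are assumptions and are not taken from the PR.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical declaration of what a single fetcher needs loaded for each document. */
record LoadSpec(boolean needsSource, Set<String> storedFields) {

    /** Nothing to load, e.g. a fetcher that can be served from doc values alone. */
    static final LoadSpec NOTHING = new LoadSpec(false, Set.of());

    /** Only _source is needed. */
    static final LoadSpec SOURCE_ONLY = new LoadSpec(true, Set.of());

    /** Source is needed if either side needs it; stored field names are unioned. */
    LoadSpec merge(LoadSpec other) {
        Set<String> union = new HashSet<>(storedFields);
        union.addAll(other.storedFields);
        return new LoadSpec(needsSource || other.needsSource, Set.copyOf(union));
    }
}

class LoadSpecDemo {
    public static void main(String[] args) {
        LoadSpec docValuesOnly = LoadSpec.NOTHING;
        LoadSpec oneStoredField = new LoadSpec(false, Set.of("title"));
        LoadSpec merged = docValuesOnly.merge(oneStoredField);
        System.out.println(merged.needsSource());  // prints "false"
        System.out.println(merged.storedFields()); // prints "[title]"
    }
}
```

With a merged spec like this, a caller could skip loading source entirely when no fetcher asks for it.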

@romseygeek added the :Search/Search, >refactoring, and v8.8.0 labels on Mar 28, 2023
@romseygeek requested review from iverase and jdconrad on Mar 28, 2023 11:51
@romseygeek self-assigned this on Mar 28, 2023
@elasticsearchmachine added the Team:Search label on Mar 28, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search (Team:Search)

Contributor

@iverase iverase left a comment

I did a first pass and it looks good to me. I think there is a potential performance bug in PreloadedFieldLookupProvider that needs to be addressed.

public void populateFieldLookup(FieldLookup fieldLookup, int doc) throws IOException {
    String field = fieldLookup.fieldType().name();
    if (storedFields.containsKey(field)) {
        fieldLookup.setValues(storedFields.get(field));
Contributor

Should we return here if the stored field is preloaded? If not, we always load it again with the backupLoader.

Contributor Author

Oops, yes, good spot... I'll add some better testing here as well.
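
The issue raised in this thread can be sketched in isolation: if a field's values are already preloaded, the lookup should return them immediately and never reach the fallback loader. The class below is a simplified, hypothetical model written for this note (`PreloadedLookupSketch`, `valuesFor`, and the `BiFunction` fallback are assumptions); it is not the PR's `PreloadedFieldLookupProvider`.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

/** Hypothetical model of a lookup that prefers preloaded values over a fallback loader. */
class PreloadedLookupSketch {
    private final Map<String, List<Object>> preloaded = new HashMap<>();
    private final BiFunction<String, Integer, List<Object>> backupLoader;
    private int backupCalls = 0;

    PreloadedLookupSketch(BiFunction<String, Integer, List<Object>> backupLoader) {
        this.backupLoader = backupLoader;
    }

    void preload(String field, List<Object> values) {
        preloaded.put(field, values);
    }

    List<Object> valuesFor(String field, int doc) {
        List<Object> cached = preloaded.get(field);
        if (cached != null) {
            // Early return: without it, the fallback below would run for every field,
            // including the ones that were already preloaded.
            return cached;
        }
        backupCalls++;
        return backupLoader.apply(field, doc);
    }

    /** Number of times the fallback loader was used; handy for tests. */
    int backupCalls() {
        return backupCalls;
    }

    public static void main(String[] args) {
        PreloadedLookupSketch lookup =
            new PreloadedLookupSketch((field, doc) -> List.of("loaded:" + field));
        lookup.preload("title", List.of("Dune"));
        System.out.println(lookup.valuesFor("title", 0));  // prints "[Dune]"
        System.out.println(lookup.backupCalls());          // prints "0"
        System.out.println(lookup.valuesFor("author", 0)); // prints "[loaded:author]"
        System.out.println(lookup.backupCalls());          // prints "1"
    }
}
```

Counting fallback calls, as `backupCalls()` does here, is one way a test could catch a missing early return.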

PreloadedSourceProvider sourceProvider = new PreloadedSourceProvider();
context.getSearchExecutionContext().setSourceProvider(sourceProvider);
PreloadedFieldLookupProvider fieldLookupProvider = new PreloadedFieldLookupProvider();
context.getSearchExecutionContext().setLookupProviders(sourceProvider, ctx -> fieldLookupProvider);
Contributor

@iverase iverase Mar 28, 2023

Thinking out loud ... Is it really required to treat _source differently from any other (stored) field here, apart from historical reasons? I guess it is, but does it mean we can potentially preload source twice?

Contributor Author

_source might be synthesised rather than come from a stored field so it needs to be handled separately. It might be possible to explicitly ask for _source via _fields which would load things twice, and we can probably intercept that and make sure it doesn't happen. Will add more tests!

Contributor Author

We already have a check for _source in stored fields, which gets ignored.
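
For readers following along, a check of that kind might look roughly like the sketch below: `_source` is dropped from the requested stored field names so that it is only ever loaded through the dedicated source path. This is an assumption-laden illustration; `StoredFieldRequestSketch` and `storedFieldsToLoad` are hypothetical names, and the actual check in Elasticsearch may be implemented differently.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper that keeps _source out of the stored-field loading path. */
class StoredFieldRequestSketch {
    static final String SOURCE_FIELD = "_source";

    /** Returns the requested names in order, minus _source, which is handled separately. */
    static Set<String> storedFieldsToLoad(List<String> requested) {
        Set<String> toLoad = new LinkedHashSet<>();
        for (String field : requested) {
            if (!SOURCE_FIELD.equals(field)) {
                toLoad.add(field);
            }
        }
        return toLoad;
    }

    public static void main(String[] args) {
        System.out.println(storedFieldsToLoad(List.of("_source", "title"))); // prints "[title]"
    }
}
```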

@Override
public StoredFieldsSpec storedFieldsSpec() {
    // TODO can we get finer-grained information from the FieldFetcher for this?
    return StoredFieldsSpec.NEEDS_SOURCE;
Contributor

nice TODO removal

this.fieldChain = Collections.emptySet();
this.sourceProvider = sourceProvider;
this.fieldDataLookup = fieldDataLookup;
this.fieldLookupProvider = fieldLookupProvider;
Contributor

Is this used somewhere else?

  new LeafDocLookup(fieldTypeLookup, this::getForField, context),
  sourceProvider,
- new LeafStoredFieldsLookup(fieldTypeLookup, () -> context.reader().storedFields())
+ new LeafStoredFieldsLookup(fieldTypeLookup, LeafFieldLookupProvider.fromStoredFields().apply(context))
Contributor

I wonder if you meant to use fieldLookupProvider here?

@romseygeek
Contributor Author

@elasticmachine run elasticsearch-ci/bwc

Contributor

@iverase iverase left a comment

LGTM! I learnt quite a bit while reviewing it.


  private void addValue(Object value) {
-     destination.add(field.valueForDisplay(value));
+     destination.add(value);
Contributor

Just out of curiosity, was this call (#valueForDisplay) not necessary?

Contributor Author

It's moved into FieldLookup.setValues(), mainly because there are now two paths that can set it (here in the stored fields provider and also in the fetch phase pre-loaded implementation).
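
The design choice described in this reply, converting values at the single point where they are set so that every caller gets the same treatment, can be sketched as follows. `FieldLookupSketch` and its `Function`-based display conversion are hypothetical simplifications for this note, not the real `FieldLookup` class.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

/** Hypothetical lookup that applies the display conversion inside setValues(). */
class FieldLookupSketch {
    private final Function<Object, Object> valueForDisplay;
    private final List<Object> values = new ArrayList<>();

    FieldLookupSketch(Function<Object, Object> valueForDisplay) {
        this.valueForDisplay = valueForDisplay;
    }

    /** Every path that sets values goes through here, so none can skip the conversion. */
    void setValues(List<?> rawValues) {
        values.clear();
        for (Object raw : rawValues) {
            values.add(valueForDisplay.apply(raw));
        }
    }

    List<Object> getValues() {
        return Collections.unmodifiableList(values);
    }

    public static void main(String[] args) {
        FieldLookupSketch lookup = new FieldLookupSketch(v -> "display:" + v);
        lookup.setValues(List.of(1, 2)); // could equally be called from a preloaded path
        System.out.println(lookup.getValues()); // prints "[display:1, display:2]"
    }
}
```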

* Value fetcher that loads from doc values.
*/
// TODO rename this? It doesn't load from doc values, it loads from fielddata
// Which might be doc values, but might not be...
Contributor

Wow, so FormattedDocValues can actually come from a stored field?

@romseygeek
Contributor Author

@elasticmachine update branch

@romseygeek romseygeek merged commit 131da70 into elastic:main Mar 30, 2023
@romseygeek romseygeek deleted the value-fetcher/stored-fields-spec branch March 30, 2023 09:46