Skip to content

Conversation

@brusic
Copy link
Contributor

@brusic brusic commented Jan 24, 2018

PR based on discussion #25179

Although this PR runs as intended, the initial commit is more for discussion purposes. Beside, we know that no PR gets merged easily. :)

The current API only allows getting scripts one at a time based on the ID Get Script. The proposal is to add an endpoint to retrieve all scripts, with no filtering (example by lang) done on the server side.

The returned format will look something like

{
  "1": {
    "script": {
      "lang": "painless",
      "source": "_score * doc['myParent.weight'].value"
    }
  },
  "foo": {
    "script": {
      "lang": "painless",
      "source": "_score * 9.9"
    }
  }
}

rather than

{
   "_id": "calculate-score",
   "found": true,
   "script": {
      "lang": "painless",
      "source": "Math.log(_score * 2) + params.my_modifier"
   }
}

The format is identical to the result from a get templates request. In fact, my original use case was to retrieve all search templates, so consistency between get index/search templates is an added bonus.

In order not to introduce a breaking change, I have opted to create a new family of Action/Request/Response called RestGetStoredScripts. Note the 's' after StoredScript to differentiate from the existing RestGetStoredScript family. I find this kludgy and introduces a ton of boilerplate code. IMHO, the best course of action is for a breaking change for the existing Get Script API call and make it analogous to the Get Index Template call. Not my call to make and I rather not deal with having to deal with a breaking change. No clue where the functional dependencies are with the current API.

Issues:

  • For integration testing, I added a REST test to the lang-painless module since I need some script plugin to be loaded in order to put a script into the cluster. This change is language independent, so I am unclear with how to run a test
  • Need to check for nulls in ScriptService. Related to the issue above since the ScriptMetaData could potentially not be loaded.
  • Add cat API
  • Add documentation

@elasticmachine
Copy link
Collaborator

Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?

1 similar comment
@elasticmachine
Copy link
Collaborator

Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?

@rjernst
Copy link
Member

rjernst commented Jan 26, 2018

Thanks @brusic. I'm torn on the best path for this. I absolutely see the benefit of having a way to get all scripts in one request, but I am very uncomfortable adding new request/response/etc objects that differ with the existing single script api only by plurality. It looks like the template code has a mix of plurality: the rest action is singular (as is the rest path), but the transport classes are plural.

I think making this a breaking change (in 7.0) and making the plural versions what we use going forward is the way to go. In that case we can mark the singular as deprecated in 6.x and remove in a followup in master, as well as backporting the new classes in 6.x.

In regard to the singular vs multi format, we should probably update the singular format to be wrapped with the script name (similar to what template requests do). We could add a temporary flag (with a deprecation warning) that triggers the old vs new format, and again, remove in 7.0.

@rjernst
Copy link
Member

rjernst commented Jan 26, 2018

Also interested in @nik9000's thoughts on this.

@brusic
Copy link
Contributor Author

brusic commented Jan 26, 2018

Thanks for the quick follow up @rjernst.

I think you see now see the genesis of my question regarding REST testing on the mailing list. :)

My PR was meant to have a working version that outputted the proposed format. I also very much do not like adding a new family of classes (as noted in my original comment), but I did not consider the possibility of having both during a deprecation phase. Great suggestion.

On the subject of singular/plural and maintaining two similar hierarchies, the singular RestGetStoredScriptAction has a plural name and REST endpoint, just to make things difficult.

"/_scripts/{id}"
"get_stored_scripts_action"

In the new REST action, I used the name get_all_stored_scripts_action. Did not think it would be permanent, but if the two sets of classes is the proposed solution, I say we change the name in the deprecated class and use "get_stored_scripts_action" in the new one. The REST endpoint works as is.

For the singular version, a not found script will return an empty collection, rather than found: false.
For plural, should multiple names/wildcards be supported? Or maintain the singular get/{id} with the addition of get all?

@brusic
Copy link
Contributor Author

brusic commented Feb 9, 2018

Since it seems inevitable, I can go ahead and make it a proper breaking change since the change only affects things at the REST level, and not at the inter-code level. Is there an example that I can follow for deprecating code?

@rjernst
Copy link
Member

rjernst commented Feb 26, 2018

@brusic Sorry I lost track of this. There are many examples of code that has been deprecated, look for uses of the DeprecationLogger.

I'm going to mark this PR for discussion so we ca get consensus on what the final plurality should be.

@rjernst rjernst added discuss :Core/Infra/Scripting Scripting abstractions, Painless, and Mustache labels Feb 26, 2018
@rjernst
Copy link
Member

rjernst commented Mar 15, 2018

We discussed this in FixItFriday and came to the conclusion that we should stick with typical REST style resource naming, meaning using the singular form. This looks a little odd in the code because the request/response/action classes are singular there as well, but we already do this with most of these classes.

@brusic If you are still interested in working on this, I think you should first move the implementation of the iteration into RestGetStoredScriptAction and add /_script as a rest controller path. Then in followups we can add /_script/{id} to the other stored scripting actions, along with deprecation warnings when using the old paths, and the finally remove the plural ones in master.

@rjernst rjernst removed the discuss label Apr 6, 2018
@brusic
Copy link
Contributor Author

brusic commented Apr 23, 2018

Now it is I that must apologize. You responded during a busy time for myself, then I promptly forgot.

Just to get on the same page: what do you mean by "first move the implementation of the iteration", specifically the meaning of iteration.

From what I understand, roll the PoC logic into the existing classes, using the singular _script as the endpoint to get all scripts.

This PR should not address renaming the existing endpoint (if I understood correctly, I agree that it should be a separate PR). The endpoint and the resulting format remains the same. Non-consistent format, but I should be able to knock out a follow-up quickly. If the existing logic remains the same, I do not have to worry (too much) about breaking changes. 👍

@rjernst
Copy link
Member

rjernst commented Apr 24, 2018

what do you mean by "first move the implementation of the iteration", specifically the meaning of iteration.

I mean your code to iterate over all the scripts.

@rjernst
Copy link
Member

rjernst commented Dec 6, 2018

@brusic Is this something you would still like to work on?

@brusic
Copy link
Contributor Author

brusic commented Dec 7, 2018

Sorry if I never got back to this PR. I did write the code, but ran into some issues. Before I was able to ask a question, my motherboard fried, lost my hard drive (thanks Apple) and I did not have the heart to recreate everything.

Let me try to remember what the challenge. IIRC, the issue was the hybrid approach does not work. Since the point of the PR is to retrieve all scripts, the /_scripts must exist. In the GetIndexTemplate example, this is handled by using a GetIndexTemplatesResponse with all the values. In the StoredScript example, the GetStoredScriptResponse handles only value. You can pass around a dummy object or null to signify ALL, but it is kludgy. The best bet is to drop the previous format.

This writeup is coming from memory from months old code. I can revisit it this weekend, unless you are itching to get it running.

@rjernst
Copy link
Member

rjernst commented Feb 28, 2019

@brusic Sorry for not responding before (lost track during holidays). Your summary sounds correct to my recollection. Is this something you still want to do now that the holidays are over?

@brusic brusic force-pushed the get-all-scripts branch 2 times, most recently from 002fc92 to 6a6162c Compare July 11, 2019 22:44
@brusic
Copy link
Contributor Author

brusic commented Aug 12, 2019

Now I am the that must say sorry for the lack of response. Was busy after your last response, and you know how it then goes.

Updated the PR (in a new branch for now) to the latest master and was about to redo the changes that I originally lost. But first:

  • is this PR still wanted?
  • would prefer to make it a breaking change. REST endpoint is easy enough, but to support both singular/plural would require passing nulls to maintain state which I prefer not to do

@rjernst
Copy link
Member

rjernst commented Aug 12, 2019

is this PR still wanted?

Yes, this has been asked for now several times by various users.

would prefer to make it a breaking change

Yes, I think we want to make it breaking. To re-summarize our past discussions here and on the original issue, there are a couple concerns about backwards compatibility:

  • The rest path to get a single script is named pluraly. We want that to match the other rest paths naming, so that singular is used, even when we have multiple (typical REST resource naming convention). This new path can be implemented in the same RestGetStoredScriptAction that exists currently. Which path is requested should determine whether we error on multiple scripts requested, and using the old path should emit a deprecation warning.
  • The transport action along with the request/response classes will need to be modified to support passing multiple scripts. This should be done in such a way that transport client users in 7.x will not be broken (note that for 8.0 the transport client has been removed). We can do this by having the internals of GetStoredScriptRequest contain a list of ids (or some marker or flag for when all should be retrieved), but keeping the singular id() methods on the request for backcompat, which would assert/assume the list only contains one item. The response class can be modified similarly.

@brusic
Copy link
Contributor Author

brusic commented Aug 13, 2019

Yes, I think we want to make it breaking.

This should be done in such a way that transport client users in 7.x will not be broken

I think we have conflicting definitions of what the word 'breaking' means. :) The response is different, so supporting both formats would lead to some terrible spaghetti code. Have the code ready, except for supporting backwards compatibility. Will think about it more, but having two response formats is tricky.

@rjernst
Copy link
Member

rjernst commented Aug 13, 2019

Sorry for the confusion. We try to only make breaking changes in major releases. So by "make it breaking" I meant we would add the option for the new format (backported to 7.x), then remove the older format for 8.0 (in master).

By "response format" I assume you mean the json response in the rest action? I think this can be done straightforward. The toXContent method of GetStoredScriptResponse is what serializes to json. It takes Params, which is passed as the original RestRequest. So we can have a query param that determines whether the old or new format is used. At first this should default to the old format (which will be backported), and then a followup in master can remove the use of this flag.

@brusic
Copy link
Contributor Author

brusic commented Aug 14, 2019

I am perfectly fine with making this an 8.0 release and not have to worry about backwards compatibility. Besides a conditional in toXContent, fromXContent has to be duplicated as well, and every public interface must be maintained. What does the output look like if the old format is requested, but there is more than one script? What if the old format is requested, the param list indicates multiple scripts are selected, but only one is returned? Also believe the response should no longer be a StatusToXContentObject. Too many inconsistencies.

Forced backwards compatibility into the code for the most part, now just working on JSON parsing and tests. Is there a util to dump the outputs of a parser? Currently running tests, goodbye CPU. :(

@brusic brusic force-pushed the get-all-scripts branch 2 times, most recently from 46aa260 to a85567e Compare August 14, 2019 05:48
@brusic
Copy link
Contributor Author

brusic commented Aug 14, 2019

@elasticmachine run elasticsearch-ci/1

@brusic
Copy link
Contributor Author

brusic commented Aug 14, 2019

I am assuming non-employees cannot kickstart a test build (makes sense). Wanted to offload the full unit testing.

During the integration tests, I keep on getting errors like the following:
Elasticsearch exception [type=illegal_argument_exception, reason=request [/_script/calculate-score] contains unrecognized parameter: [id]]

Not sure what/where this id parameter is coming from. Not in the test code. I am guessing it is being validated by other mechanisms.

@rjernst
Copy link
Member

rjernst commented Aug 14, 2019

I am perfectly fine with making this an 8.0 release and not have to worry about backwards compatibility.

Typically we don't want to suddenly introduce changes like that in a major, but instead have some way to use the newer behavior in a minor release. By doing this, and adding the appropriate deprecations, we can remove the api in 8.0, vs having to wait until master is bumped to 9.0 to actually remove.

During the integration tests, I keep on getting errors like the following:
Elasticsearch exception [type=illegal_argument_exception, reason=request [/_script/calculate-score] contains unrecognized parameter: [id]]

This likely has to do with a known issue on how named parameters are handled in the rest parser. You can't have differently named parameters in the same position. In your change I see you have _script/{name} and in another handler _script/{id}. This is why I would handle the old and new in a single rest handler. The underlying RestController cannot handle both.

I am assuming non-employees cannot kickstart a test build (makes sense). Wanted to offload the full unit testing.

Yes we want to ensure code is not pushed and run in our CI without some basic checking by a human in the company. I'll mark this PR to be tested whenever you push to the branch.

@elasticmachine ok to test

@brusic
Copy link
Contributor Author

brusic commented Sep 3, 2019

Attempting to disable bwc did not work. Testing branches via ./gradlew check -Dbwc.remote=origin -Dbwc.refspec.7.x=backport.7.4 did not work. Same cryptic error message that pops up all the time.

> build-idea/testclusters/integTest-0/distro/plugins/discovery-azure-classic/NOTICE.txt[1303:1385]: Snippet missing a language. This is required by Elasticsearch's doc testing infrastructure so we be sure we don't accidentally forget to test a snippet.

Open issue is the REST format:
The current REST format is the existing one. The new format is used if selected via a the new_format=true (placeholder name) param, if more than one id is requested or if more the one id is returned. If a single wildcard is used and more than one items are returned, the new format is used, if not the old format. However, there is a "bug" in that the name returned for the old format is the wildcard. Can detect wildcards, but this code is getting unruly.

AFAIK, only the tests using the static FromXContent method. Since the old format is the default, this method needs to use the old format as well. There is no testing the new format. I added the static method for the new format, but it is unused.

For status, not found is false as long as one item is returned, regardless of the number requested. Not a fan, but this is how index templates works as well.

On that note, I implemented HEAD since it is used by the index tempaltes API. I like consistency.

@brusic
Copy link
Contributor Author

brusic commented Sep 12, 2019

I should have been explicitly clear in my previous response that I think the code is done.

The snippet error is due to me using both IntelliJ and the command line to run gradle, but gradle clean on the command line will not clean up previous builds created in Intellij. Once that was fixed, I was able to get a little further with tests. Full testing takes a long time and my (beefy) laptop's temperature makes me think I am mining bitcoin. :) I usually get a couple of hours in before hitting some flickering/flapping test. The oldEs090 tests require Java 8. Docker tests continuously time out. Increasing the volume size from 64GB to 80GB helped some. I already debugged integration test handling and the gradle build, please don't make me debug docker. :)

Running bwcTest against 7.x branches work.

Is there a good public AMI with all versions of java, and docker/gradle? Have yet to finish tests for

./gradlew check -Dbwc.remote=origin -Dbwc.refspec.7.x=get-all-scripts_7.4

Last commit disables bwc tests. All testing was done with the default of them being enabled.

@rjernst
Copy link
Member

rjernst commented Apr 2, 2020

@brusic I'm sorry for missing your updates last year. Do you want to bring this up to date with master and we can help you work through the test failures?

@brusic
Copy link
Contributor Author

brusic commented Apr 3, 2020

Sure thing, stressful times, but I'll get it done. I believe the failing tests were related to the BWC and the testing framework not being able to handle them.

@rjernst rjernst added the Team:Core/Infra Meta label for core/infra team label May 4, 2020
@brusic
Copy link
Contributor Author

brusic commented Jun 3, 2020

Resolved all conflicts which mainly dealt with the new Routing framework and fix any new usages of the "old" URL endpoint.

Went ahead and cleaned up some of the API, leaving only one deprecated method. Since there are no more binary clients, I think it is also safe to remove. GetStoredScriptResponse#getSource()

https://github.com/brusic/elasticsearch/blob/get-all-scripts/server/src/main/java/org/elasticsearch/action/admin/cluster/storedscripts/GetStoredScriptResponse.java#L120-L123

Testing is becoming an even bigger chore! :) How much memory would you allocate to docker on a 16GB laptop? Preferred EC2 image for testing?

@elasticsearchmachine elasticsearchmachine changed the base branch from master to main July 22, 2022 23:15
@thecoop
Copy link
Member

thecoop commented Nov 1, 2024

Given this has had no progress for several years, I'm going to close this. If there is interest in this feature again, please feel free to re-open and re-base this and we can take a look at it anew.

@thecoop thecoop closed this Nov 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Core/Infra/Scripting Scripting abstractions, Painless, and Mustache >enhancement Team:Core/Infra Meta label for core/infra team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants