Raised IOException on deleteBlob #18815
Conversation
Suggestions on how to write tests for these changes?
I don't think that a security exception should be re-thrown as an IOException.
A DirectoryNotEmptyException is an IOException; why is it being wrapped in an IOException?
See my question in my comment here. Couldn't either exception be considered "unable to delete the blob"?
I think it's a mistake to wrap a SecurityException in an IOException, and I think that the Javadocs need to be changed. And again, a DirectoryNotEmptyException is already an IOException; it doesn't need to be wrapped, and allowing it to bubble up does not violate what should be the contract for this method.
Regarding the DirectoryNotEmptyException, my bad; I misread your comment there. With regards to the SecurityException, I do see what you mean, though input from @abeyad would probably also be useful here since he raised the issue initially.
We should not catch the SecurityException at all. Let it propagate. We should not have even gotten to this point if the security manager did not give us access here, but in any case, it's not an exception we should handle at this level. It should just be propagated.
> We should not catch the SecurityException at all. Let it propagate.

Precisely.
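For illustration, a minimal sketch of the contract being argued for here (a hypothetical container class, not the actual FsBlobContainer code): neither exception is caught or wrapped.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class FsBlobContainerSketch {

    private final Path path;

    public FsBlobContainerSketch(Path path) {
        this.path = path;
    }

    public void deleteBlob(String blobName) throws IOException {
        // Files.delete already throws NoSuchFileException for a missing blob
        // and DirectoryNotEmptyException for a non-empty directory; both are
        // IOExceptions, so no wrapping is needed. A SecurityException from
        // the security manager is unchecked and simply propagates.
        Files.delete(path.resolve(blobName));
    }
}
```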
It explains the what, but not the why, and the why is really important.

@jasontedor: I understand that these exceptions are not …

@jasontedor: I think the issue that I am closing explains the importance. Do I need to repeat it in the PR?

@gfyoung Regarding writing tests, take a look at …

Yes, the leniency is the real issue.

Yes, the why needs to be in the commit message, which ends up as the default body for the initial PR comment.

@jasontedor: Regarding the leniency, sure thing. I suppose a more careful examination of the possible exceptions thrown needs to be done here. Regarding the commit body, fair enough; that can be added.

Feedback on how to properly write them would be great!
I don't think this file context is initialized properly. Look at HdfsRepository for how it is initialized, with the appropriate security checks. You will need to create a temp dir for this; see the createTempDir() method that ESBlobStoreContainerTestCase inherits.
Did you run this test? What did it say, if anything?
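A hedged sketch of what that initialization might look like, assuming a local FileContext pointed at the test's temp dir (the real HdfsRepository wiring also involves the security-manager checks mentioned above):

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileContext;
import org.elasticsearch.common.blobstore.BlobStore;
import org.elasticsearch.repositories.ESBlobStoreContainerTestCase;

public class HdfsBlobStoreContainerTests extends ESBlobStoreContainerTestCase {

    @Override
    protected BlobStore newBlobStore() throws IOException {
        // Back the FileContext with the local file system and a fresh temp
        // dir so the test never needs a running HDFS cluster. createTempDir()
        // is inherited via ESBlobStoreContainerTestCase.
        Configuration conf = new Configuration();
        FileContext fileContext = FileContext.getLocalFSFileContext(conf);
        return new HdfsBlobStore(fileContext,
                createTempDir().toAbsolutePath().toString(), 100);
    }
}
```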
It fails at the moment because of permissions. However, I was pretty sure I didn't properly initialize everything either.
Yeah, I figured... my feeling is these tests weren't written before because they require mocking of the S3/Azure/HDFS services (HDFS may actually work if you give it a createTempDir() and initialize it the same way HdfsRepository does). I will be interested to hear @imotov's and @tlrx's thoughts. The Google blob store repository has a mock storage client that mocks the behavior of what a real Google storage client would do, without actually connecting to an external service. Something similar would need to be done for S3 and Azure.
Yeah, I was looking for some sort of mock objects like the one for the Google blob store, but I could not find any such thing. I should probably see if I can use Mockito somehow.
Right, for example, in the case of S3, you would have to create a mock implementation for the AmazonS3 interface. It's a pretty lengthy interface, but you would only need to provide mock implementations for the methods that S3BlobStore uses, which are just a few. The rest of the methods could just return null or some dummy values.
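For example, a hedged sketch of such a mock client (it assumes the AWS SDK's AbstractAmazonS3 convenience base class is available; otherwise the same idea works by implementing AmazonS3 directly and returning null or dummy values from the unused methods):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

import com.amazonaws.services.s3.AbstractAmazonS3;
import com.amazonaws.services.s3.model.AmazonS3Exception;

public class MockAmazonS3 extends AbstractAmazonS3 {

    // In-memory "bucket": object key -> blob bytes.
    private final ConcurrentMap<String, byte[]> blobs = new ConcurrentHashMap<>();

    @Override
    public void deleteObject(String bucketName, String key) {
        // Mirror real S3 behaviour closely enough for the container tests:
        // deleting a missing key fails with a 404-style exception.
        if (blobs.remove(key) == null) {
            AmazonS3Exception e = new AmazonS3Exception("[" + key + "] does not exist");
            e.setStatusCode(404);
            throw e;
        }
    }

    // Only the handful of methods S3BlobStore actually calls (putObject,
    // getObject, listObjects, ...) need real overrides; everything else keeps
    // AbstractAmazonS3's default of throwing UnsupportedOperationException.
}
```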
Is there any update on whether or not we can use actual repositories for Azure and S3 for these tests? If that can be done, it will save the work of having to mock everything for both repositories.

We cannot. These tests need to be able to run from developer machines.

@jasontedor: Ah, I see. Hmm... then perhaps some more input from others on how to mock the Azure and S3 repositories?
Sometimes a fresh look at a problem is needed. I took a different approach to mocking S3-related components and found that I really just needed to mock the client, as @abeyad had mentioned at some point. Onto …

Having some issues with … An exception is thrown because when …
@gfyoung great news, let me know when it's ready and I can give it another review.

@abeyad: read my comment above; some assistance on the …

Not familiar with HDFS, but perhaps @jbaiera could help?

@abeyad Re HDFS: much of the Hadoop services' authentication is based around Hadoop's …
Added BlobContainer tests for Azure storage and caught a bug at the same time in which deleteBlob was not raising an IOException when the blobName did not exist.
Added BlobContainer tests for HDFS storage and caught a bug at the same time in which deleteBlob was not raising an IOException when the blobName did not exist.
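With that fixed behaviour in place, a shared test along these lines becomes possible in ESBlobStoreContainerTestCase (a hedged sketch; newBlobStore() is the test case's per-repository factory, and the BlobPath construction may differ by version):

```java
public void testDeleteNonExistingBlob() throws Exception {
    try (BlobStore store = newBlobStore()) {
        BlobContainer container = store.blobContainer(new BlobPath());
        try {
            container.deleteBlob("does-not-exist");
            fail("expected an IOException when deleting a nonexistent blob");
        } catch (IOException expected) {
            // deleteBlob must fail loudly instead of silently succeeding
        }
    }
}
```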
@abeyad: Change has been made. Ready to merge if there are no other concerns.
This reverts commit d24cc65 as it seems to be causing test failures.
@abeyad I reverted this commit as it was causing consistent test failures; here is an example: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+multijob-intake/942/console
IMHO reverting the entire commit was a bit of overkill. In such situations, I have seen maintainers simply disable tests temporarily with the expectation that the committer (in this case, me) would address them; almost never have I seen anyone shy away from the task. A closer look at the test failures suggests that we have a conflict of behaviours here. The two test failures are coming from …

Both test failures were triggered by the change from … In the first test, because the metadata file is deleted, when … Thus, the question is, what needs to change? Does …
Yes @gfyoung, I probably could have annotated the test with …
@javanna: no worries. Any thoughts on the rest of what I wrote?
A commit that causes test failures generally means a bug somewhere that was not thought through. Anyone could create a snapshot build off of master at any time; to the extent possible, we don't want master introducing new bugs from a recent PR, and failing tests are an indication of that. Hence, I believe @javanna made the right call. Once it's resolved, everything will be merged back in, so it's not an issue.
We should call … The line below it asserting the snapshot is deleted will probably not be true either; you'll have to check and change the assertion in that case as well.
What's the stack trace here? If corrupted files (which can happen) cause the entire snapshot to be undeletable, then I think we have a problem. If that's the case, we should open a separate issue indicating that snapshot/restore has to handle corrupted files more robustly, and put an … The thing is, if the …
@gfyoung Also, run …
What about following what … does? That might be a nicer solution for both test failures (and for users too, perhaps), as we could still clean up all the files that are delete-able but make the user aware of those failures.
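A self-contained sketch of that "delete what you can, report the rest" pattern (the elided reference above is unrecoverable, so this standalone version simply accumulates failures as suppressed exceptions):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public final class BestEffortDelete {

    private BestEffortDelete() {}

    public static void deleteAll(List<Path> paths) throws IOException {
        IOException failures = null;
        for (Path p : paths) {
            try {
                Files.delete(p);
            } catch (IOException e) {
                // Keep deleting the remaining files, but remember every
                // failure so the caller still learns about all of them.
                if (failures == null) {
                    failures = new IOException("could not delete all paths");
                }
                failures.addSuppressed(e);
            }
        }
        if (failures != null) {
            throw failures;
        }
    }
}
```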
Also, your proposed change for …
@gfyoung I'll pull down your branch and experiment.
@gfyoung I've created a branch here: https://github.com/abeyad/elasticsearch/tree/stricter-delete-blob. I'd like to issue a PR against your branch, but I don't have access to your elasticsearch fork. Before incorporating it, I'd like feedback from @imotov first on what I've done.
@gfyoung I haven't forgotten about this; I've discussed some changes with @imotov that should be done in order to avoid the issues revealed by the two broken tests. I'm working on those now and will hopefully have a PR for review in the next couple of days. Then both our work can be merged together. Can I request that you remove your latest commit to this branch and push? My commits will cause merge conflicts with the latest commit. Lastly, I still can't issue a PR against your branch; I believe you need to add me as one of your collaborators for elasticsearch (https://github.com/gfyoung/elasticsearch/settings/collaboration). Alternatively, we can set up a feature branch that incorporates both of our commits. This may be the best approach, so I'll set it up and let you know when it's there so you can push your commits to it.
@abeyad: I reverted that last commit on my branch; that was my attempt to patch the errors. Strange that you can't put up PRs against my fork (I haven't seen that issue before), but yes, do let me know once the feature branch is up!
@abeyad: any updates on the feature branch?
Title is self-explanatory. Closes #18530.