Skip to content

Conversation

@talevy
Copy link
Contributor

@talevy talevy commented Aug 13, 2018

These are some user-auth scenarios I think may be worth testing. For some reason, the tests hang on the read-only action. Still not sure why.

@talevy talevy added WIP :Data Management/ILM+SLM Index and Snapshot lifecycle management labels Aug 13, 2018
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra

@talevy talevy force-pushed the ilm-security-user-test branch from a3414eb to b58b30e Compare August 14, 2018 22:17
@talevy talevy removed the WIP label Aug 14, 2018
@talevy talevy requested review from colings86 and dakrone August 14, 2018 22:19
pollIntervalEntity.startObject("transient");
{
pollIntervalEntity.field(LifecycleSettings.LIFECYCLE_POLL_INTERVAL, "1s");
}pollIntervalEntity.endObject();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the newline is messed up here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying out a new formatting, I am missing a space there. but no like? I am happy to push it down a line

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of this style of formatting, I think it looks confusing and somewhat ugly since it does not fit in with the code style of the rest of the code. I don't mind the style we have elsewhere sometimes of indenting lines which are inside objects but since here the JSON we are building is quite straight forward I also think its easy to follow formatted normally

assertThat(indexExplain.get("managed"), equalTo(true));
assertThat(indexExplain.get("step"), equalTo("error"));
assertThat(indexExplain.get("failed_step"), equalTo("readonly"));
assertThat(indexExplain.get("step_info"), equalTo("permissionsss!"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wat?

Do we really assert that the explanation for the step is "permissionsss!" 🐍 ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

haha, yes!

This is technically WIP since the test does not pass... but I have no idea why. I was hoping to at least get early feedback on the types of tests... as to why things are stuck on readonly. still a mystery to me and I am debugging it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can help look into this tomorrow if you are still having trouble finding the cause of the failure

pollIntervalEntity.startObject("transient");
{
pollIntervalEntity.field(LifecycleSettings.LIFECYCLE_POLL_INTERVAL, "1s");
}pollIntervalEntity.endObject();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of this style of formatting, I think it looks confusing and somewhat ugly since it does not fit in with the code style of the rest of the code. I don't mind the style we have elsewhere sometimes of indenting lines which are inside objects but since here the JSON we are building is quite straight forward I also think its easy to follow formatted normally

import static org.elasticsearch.xpack.core.security.authc.support.UsernamePasswordToken.basicAuthHeaderValue;
import static org.hamcrest.Matchers.equalTo;

public class PermissionsIT extends ESRestTestCase {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a JavaDoc here explaining what this test class aims to test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

public void testCanManageIndexAndPolicyDifferentUsers() throws Exception {
String index = "ilm-00001";
createIndexAsAdmin(index, indexSettingsWithPolicy, "");
assertBusy(() -> assertFalse(indexExists(index)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this actually testing anything ILM related? I think we need to test that the policy actually progresses here to ensure the ILM side is doing the permissions correctly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the index is deleted by ILM?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

theoretically. plan to continue debugging the lack of progress in the system today

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a javadoc for this test to explain what its doing? At first glance it looks a bit confusing because it just creates an index and then checks it doesn't exist.

assertThat(indexExplain.get("managed"), equalTo(true));
assertThat(indexExplain.get("step"), equalTo("error"));
assertThat(indexExplain.get("failed_step"), equalTo("readonly"));
assertThat(indexExplain.get("step_info"), equalTo("permissionsss!"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can help look into this tomorrow if you are still having trouble finding the cause of the failure

@colings86
Copy link
Contributor

colings86 commented Aug 20, 2018

Took a while to track down but it seems that the problem with the policy not progressing is due to the update settings request in the read only step being dropped by the SecurityActionFilter. This is because the ClientHelper.INDEX_LIFECYCLE_ORIGIN is not present in the switch statement in AuthorizationUtils. Adding this makes the policy progress.

Additionally I think there is a bug in the SecurityActionFilter.apply() because if AuthorizationUtils.switchUserBasedOnActionOriginAndExecute() throws an IllegalArgumentException there is no catch surrounding it in SecurityActionFilter.apply() to then call listener.OnFailure() so the request gets silently dropped. We should probably fix this bug too but I'll leave it up to you whether we do this in this PR or in a separate change. /cc @jaymode

@jasontedor
Copy link
Member

I am not seeing what you're seeing @colings86. Current master looks like this:

try {
if (useSystemUser) {
securityContext.executeAsUser(SystemUser.INSTANCE, (original) -> {
try {
applyInternal(action, request, authenticatedListener);
} catch (IOException e) {
listener.onFailure(e);
}
}, Version.CURRENT);
} else if (AuthorizationUtils.shouldSetUserBasedOnActionOrigin(threadContext)) {
AuthorizationUtils.switchUserBasedOnActionOriginAndExecute(threadContext, securityContext, (original) -> {
try {
applyInternal(action, request, authenticatedListener);
} catch (IOException e) {
listener.onFailure(e);
}
});
} else {
try (ThreadContext.StoredContext ignore = threadContext.newStoredContext(true)) {
applyInternal(action, request, authenticatedListener);
}
}
} catch (Exception e) {
listener.onFailure(e);
}

The whole block is wrapped in a try-catch. And anyway, higher up we have this wrapped too, which is always our last resort against issues like this:

@Override
public void proceed(Task task, String actionName, Request request, ActionListener<Response> listener) {
int i = index.getAndIncrement();
try {
if (i < this.action.filters.length) {
this.action.filters[i].apply(task, actionName, request, listener, this);
} else if (i == this.action.filters.length) {
this.action.doExecute(task, request, listener);
} else {
listener.onFailure(new IllegalStateException("proceed was called too many times"));
}
} catch(Exception e) {
logger.trace("Error during transport action execution.", e);
listener.onFailure(e);
}
}
}

@colings86
Copy link
Contributor

colings86 commented Aug 20, 2018

@jasontedor hmm good point with the surrounding try-catches. I'm not sure why the request was getting dropped then but I added log lines to all the listener.onFailure() calls and did not see any output from them. AuthorizationUtils.switchUserBasedOnActionOriginAndExecute() was definitely throwing an IllegalArgumentException though so I think the fix to add ilm to the case statement is correct even if we don't need to fix the action filter itself?

@jasontedor
Copy link
Member

@colings86 I think that you're running with assertions enabled? That would lead to the assertion tripping, throwing an AssertionError which we do not catch, and that would lead to a hung listener? Are you sure that an IllegalArugmentException is being thrown? I don't even see how that method could throw one. I do see an IllegalStateException but that would not be thrown if it is indeed the assertion that it is tripping, which I think is what is happening here?

@colings86
Copy link
Contributor

@jasontedor ah yes you are right, since I'm running in a test from gradle ./gradlew :x-pack:plugin:ilm:qa:withsecurity:integtest it will have assertions enabled so you are right that it won't be caught. My certainty that the IllegalStateException was being thrown (i mixed up IllegalStateException and IllegalArgumentException was based on adding logging before this line and seeing it output in the logs but still forgetting that assertions would be on. The fix to add ClientHelper.INDEX_LIFECYCLE_ORIGIN to the switch statement I still think is the right fix though.

It is a bit unfortunately that the asserts don't really show up clearly in the logs though, this made this bug quite difficult to track down without digging into it. I wonder if the assertion is actually worth it here since if the IllegalStateException had been thrown it would have highlighted the bug quicker I think?

@jaymode
Copy link
Member

jaymode commented Aug 20, 2018

AuthorizationUtils.switchUserBasedOnActionOriginAndExecute() was definitely throwing an IllegalArgumentException though so I think the fix to add ilm to the case statement is correct even if we don't need to fix the action filter itself?

That is the right fix for getting the user added to the request.

@jasontedor
Copy link
Member

The fix to add ClientHelper.INDEX_LIFECYCLE_ORIGIN to the switch statement I still think is the right fix though.

I am not arguing against the fix, the fix is clearly correct. I am arguing that there is not any problem with SecurityActionFilter#apply.

@jasontedor
Copy link
Member

jasontedor commented Aug 20, 2018

I wonder if the assertion is actually worth it here since if the IllegalStateException had been thrown it would have highlighted the bug quicker I think?

The problem here is that the assertion error was getting caught by the JDK, set as the outcome on a future task, and lost. We need to ensure that errors thrown from triggered listeners do not get lost. I opened #32998.

@talevy
Copy link
Contributor Author

talevy commented Aug 20, 2018

thanks for the assistance here everyone. I think I caught up with changes necessary to make these tests fly!

that being said, the qa:with-security tests passed without the adding of the INDEX_LIFECYCLE_ORIGIN

Copy link
Contributor

@colings86 colings86 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a couple of comments but I think this is close now

String token = basicAuthHeaderValue("test_admin", new SecureString("x-pack-test-password".toCharArray()));
return Settings.builder()
.put(ThreadContext.PREFIX + ".Authorization", token)
.build();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this going to be safe? As I understand it, the rest test case uses the admin settings to reset the cluster to a clean state between tests, but here this admin user only has access to the ilm-* indices so I'm not sure if it will be able to clean up the the not-ilm index?

Copy link
Contributor Author

@talevy talevy Aug 21, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test_admin has full administrator authority. the client() is running as the ILM-specific, test_ilm, user. This is why wipeCluster() has not been throwing an exception and failing the tests when running as adminClient().

public void testCanManageIndexAndPolicyDifferentUsers() throws Exception {
String index = "ilm-00001";
createIndexAsAdmin(index, indexSettingsWithPolicy, "");
assertBusy(() -> assertFalse(indexExists(index)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a javadoc for this test to explain what its doing? At first glance it looks a bit confusing because it just creates an index and then checks it doesn't exist.

protected void masterOperation(Request request, ClusterState state, ActionListener<Response> listener) {
Map<String, String> filteredHeaders = threadPool.getThreadContext().getHeaders().entrySet().stream()
.filter(e -> ClientHelper.SECURITY_HEADER_FILTERS.contains(e.getKey()))
.collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be worth a comment explaining why this needs to be here and not in the task so someone doesn't move it into the task without realising it will break things?

@talevy talevy requested a review from colings86 August 21, 2018 13:42
Copy link
Contributor

@colings86 colings86 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@talevy talevy merged commit 6780ab9 into elastic:index-lifecycle Aug 21, 2018
@talevy talevy deleted the ilm-security-user-test branch August 21, 2018 19:27
talevy added a commit that referenced this pull request Aug 21, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Data Management/ILM+SLM Index and Snapshot lifecycle management

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants