Skip to content

Conversation

@squito
Copy link
Contributor

@squito squito commented Aug 10, 2018

What changes were proposed in this pull request?

(a) disabled rest submission server by default in standalone mode
(b) fails the standalone master if rest server enabled & authentication secret set
(c) fails the mesos cluster dispatcher if authentication secret set
(d) doc updates
(e) when submitting a standalone app, only try the rest submission first if spark.master.rest.enabled=true

otherwise you'd see a 10 second pause like
18/08/09 08:13:22 INFO RestSubmissionClient: Submitting a request to launch an application in spark://...
18/08/09 08:13:33 WARN RestSubmissionClient: Unable to connect to server spark://...

I also made sure the mesos cluster dispatcher failed with the secret enabled, though I had to do that on slightly different code as I don't have mesos native libs around.

How was this patch tested?

I ran the tests in the mesos module & in core for org.apache.spark.deploy.*

I ran a test on a cluster with standalone master to make sure I could still start with the right configs, and would fail the right way too.

@SparkQA
Copy link

SparkQA commented Aug 10, 2018

Test build #94572 has finished for PR 22071 at commit b4ca224.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Aug 10, 2018

Test build #4241 has started for PR 22071 at commit b4ca224.

@felixcheung
Copy link
Member

@tnachen

@SparkQA
Copy link

SparkQA commented Aug 12, 2018

Test build #4244 has finished for PR 22071 at commit b4ca224.

  • This patch fails Spark unit tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Aug 12, 2018

Test build #4245 has finished for PR 22071 at commit b4ca224.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

extends Logging {

{
val authKey = SecurityManager.SPARK_AUTH_SECRET_CONF
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might be better to place this in the MesosRestServer code, since it's not really about the framework (MesosClusterDispatcher) but the RestServer receiving requests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about this too, and originally wrote it that way ... but then I figured the way it works now, its really the same thing. If I put it in the MesosRestServer, it might seem like you could run the ClusterDispatcher without the RestServer somehow -- but maybe the exception itself is clear enough?

anyway, don't feel particularly strongly either way.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, my reasoning is that it could be harder for someone looking at the code to figure out why this is not allowed, since we don't really mention about the rest server which is really the one requiring security to be turned off. Another reason it will be beneficial to have the check in the MesosRestServer is that the MesosClusterDispatcher framework could technically be decoupled from the MesosRestServer and allow another way to receive requests, so to increase flexibility and avoid someone forgetting about why we put this here, my suggestion is to move the check closer to where it's being required will help maintain this a bit better.

Copy link
Contributor

@tgravescs tgravescs Aug 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally the way I read this, I agree with @squito and it should be here. From my understanding from reading the docs on running on mesos, the user doesn't start the rest server, they start the dispatcher. The dispatcher doesn't support any sort of authentication. The fact that it uses the rest server is an implementation detail the user doesn't necessarily care about. If you change the dispatcher to have another way then it should support authentication as well or have this same error. Someone adding another way other then rest server should be made aware of this, so it being here accomplishes that.

Perhaps just adding a comment in the code about the rest server would help to clarify to developers? Or please correct me if my understanding of running the dispatcher is wrong.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note if someone add another way to that supports authentication we just move it at that point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MesosClusterDispatcher framework could technically be decoupled from the MesosRestServer

I actually think that is a reason to keep it here. If somebody adds another way, then this check is still in place for the new way -- unless they consciously think about making it work, and then they'd move the check to a more appropriate spot.

I can add a comment in the code: // This doesn't support authentication because the RestSubmissionServer doesn't support it.

Copy link
Contributor

@tgravescs tgravescs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. +1

@SparkQA
Copy link

SparkQA commented Aug 14, 2018

Test build #94706 has finished for PR 22071 at commit 897b587.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tnachen
Copy link
Contributor

tnachen commented Aug 14, 2018

LGTM as well

Copy link
Contributor

@jerryshao jerryshao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@tgravescs
Copy link
Contributor

+1

@asfgit asfgit closed this in 1024875 Aug 14, 2018
@srowen
Copy link
Member

srowen commented Aug 14, 2018

Merged to master

@squito
Copy link
Contributor Author

squito commented Aug 14, 2018

any objections about putting this in prior branches as well?

@srowen
Copy link
Member

srowen commented Aug 14, 2018

Docs -- obviously OK to backport. I wasn't as sure about the behavior change. Normally we'd never do that for a maintenance branch if it can be helped. I could see an argument here, but did think we were just going to change the behavior for 2.4. 2.3.2 -- I could see.

@felixcheung
Copy link
Member

in this case maybe ok. perhaps just rel note this iff there's another 2.2.x or 2.1.x releases?

@akirillov
Copy link

akirillov commented Feb 21, 2019

Hi @squito, if I may, I have a few comments to share regarding this change and probably you can help me to understand it in better details.

From Spark Security Doc page it looks like spark.master.rest.enabled property belongs to Spark Standalone Master only. IMO, It makes sense, because it's the server side who is responsible for enabling and disabling REST endpoint according to the source code.

Now, from the comment in the code of SparkSubmit.scala it looks like REST-based gateway introduced in Spark 1.3 is the default way of communication with Spark Standalone Master with a fallback to legacy RPC.

And here's the problem: now spark.master.rest.enabled became a client-side property which defaults to false for both Mesos and Spark Standalone and breaks all spark-submit calls to Mesos Dispatcher if this property is not provided explicitly. Now users have to provide it every time for Mesos cluster mode while REST is the only mode supported by the dispatcher. Looks like a breaking change.

Another note is regarding spark.authenticate.secret assertion in MesosClusterDispatcher. Dispatcher will be unable to start if spark.authenticate.secret is set. It looks like not so great behaviour if it doesn't allow to run a program at all if some flag is set. Which also looks like a breaking change.

Suggestions:

  • it looks natural for spark.master.rest.enabled to be handled in SparkSubmit only when used with Spark Standalone Master. It can be completely ignored with Mesos Dispatcher because REST interface is the only endpoint it exposes
  • defaulting Spark Standalone Master's spark.master.rest.enabled to false switches the communication to 'legacy' RPC protocol which is marked as a fallback-only since Spark 1.3. It's not clear how this protocol is going to be supported in the future. It looks safer to document how to disable REST endpoint for security purposes instead while preserving REST communications by default
  • instead of throwing an assertion error - log a WARN message in Mesos Dispatcher if spark.authenticate.secret is provided in configuration

Please let me know what you think. Thanks.

dongjoon-hyun pushed a commit that referenced this pull request Apr 30, 2019
…2.9.8

## What changes were proposed in this pull request?

This reverts commit 6f394a2.

In general, we need to be very cautious about the Jackson upgrade in the patch releases, especially when this upgrade could break the existing behaviors of the external packages or data sources, and generate different results after the upgrade. The external packages and data sources need to change their source code to keep the original behaviors. The upgrade requires more discussions before releasing it, I think.

In the previous PR #22071, we turned off `spark.master.rest.enabled` by default and added the following claim in our security doc:
> The Rest Submission Server and the MesosClusterDispatcher do not support authentication.  You should ensure that all network access to the REST API & MesosClusterDispatcher (port 6066 and 7077 respectively by default) are restricted to hosts that are trusted to submit jobs.

We need to understand whether this Jackson CVE applies to Spark. Before officially releasing it, we need more inputs from all of you. Currently, I would suggest to revert this upgrade from the upcoming 2.4.3 release, which is trying to fix the accidental default Scala version changes in pre-built artifacts.

## How was this patch tested?

N/A

Closes #24493 from gatorsmile/revert24418.

Authored-by: gatorsmile <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
kai-chi pushed a commit to kai-chi/spark that referenced this pull request Jul 23, 2019
…2.9.8

## What changes were proposed in this pull request?

This reverts commit 6f394a2.

In general, we need to be very cautious about the Jackson upgrade in the patch releases, especially when this upgrade could break the existing behaviors of the external packages or data sources, and generate different results after the upgrade. The external packages and data sources need to change their source code to keep the original behaviors. The upgrade requires more discussions before releasing it, I think.

In the previous PR apache#22071, we turned off `spark.master.rest.enabled` by default and added the following claim in our security doc:
> The Rest Submission Server and the MesosClusterDispatcher do not support authentication.  You should ensure that all network access to the REST API & MesosClusterDispatcher (port 6066 and 7077 respectively by default) are restricted to hosts that are trusted to submit jobs.

We need to understand whether this Jackson CVE applies to Spark. Before officially releasing it, we need more inputs from all of you. Currently, I would suggest to revert this upgrade from the upcoming 2.4.3 release, which is trying to fix the accidental default Scala version changes in pre-built artifacts.

## How was this patch tested?

N/A

Closes apache#24493 from gatorsmile/revert24418.

Authored-by: gatorsmile <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
kai-chi pushed a commit to kai-chi/spark that referenced this pull request Jul 25, 2019
…2.9.8

## What changes were proposed in this pull request?

This reverts commit 6f394a2.

In general, we need to be very cautious about the Jackson upgrade in the patch releases, especially when this upgrade could break the existing behaviors of the external packages or data sources, and generate different results after the upgrade. The external packages and data sources need to change their source code to keep the original behaviors. The upgrade requires more discussions before releasing it, I think.

In the previous PR apache#22071, we turned off `spark.master.rest.enabled` by default and added the following claim in our security doc:
> The Rest Submission Server and the MesosClusterDispatcher do not support authentication.  You should ensure that all network access to the REST API & MesosClusterDispatcher (port 6066 and 7077 respectively by default) are restricted to hosts that are trusted to submit jobs.

We need to understand whether this Jackson CVE applies to Spark. Before officially releasing it, we need more inputs from all of you. Currently, I would suggest to revert this upgrade from the upcoming 2.4.3 release, which is trying to fix the accidental default Scala version changes in pre-built artifacts.

## How was this patch tested?

N/A

Closes apache#24493 from gatorsmile/revert24418.

Authored-by: gatorsmile <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
kai-chi pushed a commit to kai-chi/spark that referenced this pull request Aug 1, 2019
…2.9.8

## What changes were proposed in this pull request?

This reverts commit 6f394a2.

In general, we need to be very cautious about the Jackson upgrade in the patch releases, especially when this upgrade could break the existing behaviors of the external packages or data sources, and generate different results after the upgrade. The external packages and data sources need to change their source code to keep the original behaviors. The upgrade requires more discussions before releasing it, I think.

In the previous PR apache#22071, we turned off `spark.master.rest.enabled` by default and added the following claim in our security doc:
> The Rest Submission Server and the MesosClusterDispatcher do not support authentication.  You should ensure that all network access to the REST API & MesosClusterDispatcher (port 6066 and 7077 respectively by default) are restricted to hosts that are trusted to submit jobs.

We need to understand whether this Jackson CVE applies to Spark. Before officially releasing it, we need more inputs from all of you. Currently, I would suggest to revert this upgrade from the upcoming 2.4.3 release, which is trying to fix the accidental default Scala version changes in pre-built artifacts.

## How was this patch tested?

N/A

Closes apache#24493 from gatorsmile/revert24418.

Authored-by: gatorsmile <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
yoock pushed a commit to yoock/spark-apache that referenced this pull request Jan 14, 2020
…2.9.8

## What changes were proposed in this pull request?

This reverts commit 6f394a2.

In general, we need to be very cautious about the Jackson upgrade in the patch releases, especially when this upgrade could break the existing behaviors of the external packages or data sources, and generate different results after the upgrade. The external packages and data sources need to change their source code to keep the original behaviors. The upgrade requires more discussions before releasing it, I think.

In the previous PR apache/spark#22071, we turned off `spark.master.rest.enabled` by default and added the following claim in our security doc:
> The Rest Submission Server and the MesosClusterDispatcher do not support authentication.  You should ensure that all network access to the REST API & MesosClusterDispatcher (port 6066 and 7077 respectively by default) are restricted to hosts that are trusted to submit jobs.

We need to understand whether this Jackson CVE applies to Spark. Before officially releasing it, we need more inputs from all of you. Currently, I would suggest to revert this upgrade from the upcoming 2.4.3 release, which is trying to fix the accidental default Scala version changes in pre-built artifacts.

## How was this patch tested?

N/A

Closes #24493 from gatorsmile/revert24418.

Authored-by: gatorsmile <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants