Skip to content

Conversation

@hvanhovell
Copy link
Contributor

What changes were proposed in this pull request?

This PR introduces a shared RuntimeConfig interface.

Why are the changes needed?

We are creating a shared Scala Spark SQL interface for Classic and Connect.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Existing tests.

Was this patch authored or co-authored using generative AI tooling?

No.

@hvanhovell
Copy link
Contributor Author

Merging to master.

@asfgit asfgit closed this in 6fc176f Sep 17, 2024
* @since 3.4.0
*/
class RuntimeConfig private[sql] (client: SparkConnectClient) extends Logging {
class ConnectRuntimeConfig private[sql] (client: SparkConnectClient)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a breaking change at API level.

  • I'm wondering if this is okay, @hvanhovell ?
  • In addition, please fix @since 3.4.0 tag because ConnectRuntimeConfig only exists after 4.0.0.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This class is not part of the public API. It is in internal, which is not public API. Regarding the tag I am not sure, that class has existed since 3.4.0. It just got renamed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


// SPARK-49413: Create a shared RuntimeConfig interface.
ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.RuntimeConfig"),
ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.RuntimeConfig$"),
Copy link
Member

@dongjoon-hyun dongjoon-hyun Sep 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ya, this looks like a real breaking change indeed from Spark SQL part.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it is not. The org.apache.spark.sql.RuntimeConfig class is still there. It was just moved. MiMa - unfortunately - cannot deal with that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ya, got it. So, there is no breaking change effectively because it's in api module?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a comment here as a follow-up, please, @hvanhovell ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure I will create a follow-up.

session: SparkSession,
configurations: Seq[ConfigEntry[Boolean]]): SparkSession = {
val configsEnabled = configurations.filter(session.conf.get[Boolean])
val configsEnabled = configurations.filter(session.sessionState.conf.getConf[Boolean])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although this is not a breaking change, do we still need this change inevitably for some reasons?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If then, the developers seem to be annoyed by this change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this PR I use the abstract class in sql/api the main interface. I did not really wanted to add support for this because it is not user facing, and because it requires me to move even more stuff into sql/api.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which developers? Spark developers?

override def beforeAll(): Unit = {
super.beforeAll()
spark.conf.set(SQLConf.ORC_IMPLEMENTATION, "native")
spark.conf.set(SQLConf.ORC_IMPLEMENTATION.key, "native")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a required user-facing change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well the method that takes a sql conf entry is gone now... This was not a user facing feature to begin with.

@dongjoon-hyun
Copy link
Member

dongjoon-hyun commented Sep 17, 2024

Thank you for making this direction, @hvanhovell . In general it looks like achieving most goals, but the following changes looks like a big user (Spark app developers) facing glitch to me. If possible, can we make Apache Spark 4.0.0 migration more seamlessly by supporting back in the same way?

spark.conf.get
spark.conf.set

@dongjoon-hyun
Copy link
Member

If this is difficult, could you file a JIRA issue instead? Then, if it's required by someone, he might pick it up.

it requires me to move even more stuff into sql/api.

@hvanhovell
Copy link
Contributor Author

@dongjoon-hyun which developers are you talking about? I have fixed all the spark cases. Generally library developers should not really be using any of this, if they don't want their stuff to break.

@hvanhovell
Copy link
Contributor Author

If this is difficult, could you file a JIRA issue instead? Then, if it's required by someone, he might pick it up.

It is not difficult, it is just yet another thing that needs to be moved to common/utils.

asfgit pushed a commit that referenced this pull request Dec 4, 2024
…interface

### What changes were proposed in this pull request?
This PR adds support for ConfigEntry to the RuntimeConfig interface. This was removed in #47980.

### Why are the changes needed?
This functionality is used a lot by Spark libraries. Removing them caused friction, and adding them does not pollute the RuntimeConfig interface.

### Does this PR introduce _any_ user-facing change?
No. This is developer API.

### How was this patch tested?
I have added tests cases for Connect and Classic.

### Was this patch authored or co-authored using generative AI tooling?
No.

Closes #49062 from hvanhovell/SPARK-49709.

Authored-by: Herman van Hovell <[email protected]>
Signed-off-by: Herman van Hovell <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants