@vicennial (Contributor) commented Jan 20, 2023

What changes were proposed in this pull request?

Adds the following methods:

  • Dataset API methods
    • project
    • filter
    • limit
  • SparkSession
    • range (and its variations)

This PR also introduces Column and functions to support the above changes.

Why are the changes needed?

Incremental development of Spark Connect Scala Client.

Does this PR introduce any user-facing change?

Yes, users may now use the proposed API methods.
Example: val df = sparkSession.range(5).limit(3)

How was this patch tested?

Unit tests + simple E2E test.

@AmplabJenkins

Can one of the admins verify this patch?


private var server: Server = _
private var service: DummySparkConnectService = _
private val SERVER_PORT = 15250

Can we avoid hardcoding the port? Give the dummy server port 0 and let the server return the port it actually bound to, so the client can connect to that.

Contributor Author

Good idea, I've updated the logic with this approach.
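
For readers unfamiliar with the port-0 approach discussed here, a minimal sketch follows. It is not the code from this PR: it uses a plain `java.net.ServerSocket` as a stand-in for the dummy server, and the names `EphemeralPortExample`, `withServerOnFreePort`, and `canConnect` are hypothetical. Binding to port 0 asks the OS to pick any free port, and the server then reports the port it actually got so the client can connect to it.

```scala
import java.net.{InetAddress, ServerSocket, Socket}

// Illustrative only: a plain TCP listener stands in for the dummy server.
object EphemeralPortExample {

  /** Starts a listener on an OS-assigned port, hands the bound port to `body`, then shuts down. */
  def withServerOnFreePort[T](body: Int => T): T = {
    // Port 0 means "any free port"; the OS chooses one at bind time.
    val server = new ServerSocket(0, 50, InetAddress.getLoopbackAddress)
    try body(server.getLocalPort) // the port that was actually bound
    finally server.close()
  }

  /** Opens a client connection to the given local port and reports whether it connected. */
  def canConnect(port: Int): Boolean = {
    val client = new Socket(InetAddress.getLoopbackAddress, port)
    try client.isConnected
    finally client.close()
  }

  def main(args: Array[String]): Unit = {
    val connected = withServerOnFreePort { port =>
      println(s"server bound to port $port")
      canConnect(port)
    }
    println(s"client connected: $connected")
  }
}
```

If the `Server` in the snippet under review is grpc-java's `io.grpc.Server` (an assumption, since the import is not shown), the same idea would likely apply by building the server with port 0 and reading `getPort` after it starts. This avoids test failures when a fixed port such as 15250 is already in use on the build machine.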

@hvanhovell (Contributor) left a comment

LGTM
