Skip to content

Conversation

@gatorsmile
Copy link
Member

Feeling a little bit guilty to pick the one with the least functions. : )

One question to @yhuai : when users describe these functions, I think it might be nice to provide them multiple useful examples. For example, for the function sort_array, I am wondering how to put the following example in extended?

> val df = Seq((Array(2, 1, 3), Array("b", "c", "a"))).toDF("col1", "col2")
> df.select(sort_array($"col1", false), sort_array($"col2", true)).collect()
Array[org.apache.spark.sql.Row] = Array([WrappedArray(3, 2, 1),WrappedArray(a, b, c)])

So far, the example in Upper is not very clear when we adding the complex use cases. Could you show us more examples?

Thank you!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it is better to use _FUNC_(array, ascendingOrder) and mention that ascendingOrder is a boolean?

@yhuai
Copy link
Contributor

yhuai commented Dec 21, 2015

I am wondering if we only need to show the SQL usage? For DF version, we have API doc (maybe that is good enough for now).

@gatorsmile
Copy link
Member Author

I see. will add SQL cases. Thank you!

@SparkQA
Copy link

SparkQA commented Dec 21, 2015

Test build #48124 has finished for PR 10418 at commit e49b036.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Dec 21, 2015

Test build #48129 has finished for PR 10418 at commit afdcb1c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: i would use triple quotes for strings that contain quotes

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried to do it, but realized annotation argument needs to be a constant. A newline marker \n is an issue here. We are unable to call stripMargin after triple quotes. Thus, I called stripMargin when building ExpressionInfo. Do you like it?

@SparkQA
Copy link

SparkQA commented Dec 22, 2015

Test build #48177 has finished for PR 10418 at commit 99cb7d8.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Annotation argument needs to be a constant. Thus, I called stripMargin when building ExpressionInfo. I am not sure if this is a right way. If so, should we call stripMargin for df.usage() too?

If we do not care indenting, there exists another solution:

  extended = """> SELECT _FUNC_(array("b", "d", "c", "a"));
4""")

Or do you have a better way for adding a newline marker between triple quotes? Thank you! @marmbrus

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why would you want to add a new line character in a raw string?

It would be nice to add stripMargin to df.usage as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a new line character in a raw string does not work if we use the multi-line string. Thus, I added stripMargin here for avoiding weird indenting.

@gatorsmile
Copy link
Member Author

retest this please.

@SparkQA
Copy link

SparkQA commented Dec 22, 2015

Test build #48205 has finished for PR 10418 at commit 99cb7d8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will fail to compile in scala 2.11. Use a multiline string here, see: #10488

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, I will change all of them.

@SparkQA
Copy link

SparkQA commented Dec 29, 2015

Test build #48385 has finished for PR 10418 at commit 64f32a4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Copy link
Member Author

@yhuai Could you take a look at this fix? Thanks!

BTW, I added stripMargin in the object FunctionRegistry. This change might help the other related PRs.

@gatorsmile gatorsmile closed this Apr 19, 2016
asfgit pushed a commit that referenced this pull request Apr 19, 2016
…ection Functions

#### What changes were proposed in this pull request?
#12185 contains the original PR I submitted in #10418

However, it misses one of the extended example, a wrong description and a few typos for collection functions. This PR is fix all these issues.

#### How was this patch tested?
The existing test cases already cover it.

Author: gatorsmile <[email protected]>

Closes #12492 from gatorsmile/expressionUpdate.
@gatorsmile gatorsmile deleted the ExpDesc4C branch August 6, 2016 15:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants