@gatorsmile
Member

What changes were proposed in this pull request?

The session catalog caches some persistent functions in the FunctionRegistry, so the same function can be listed twice. Our Catalog API listFunctions does not handle this.

It would be better if the SessionCatalog API de-duplicated the records itself, instead of leaving that to each API caller. In FunctionRegistry, functions are identified by an unquoted string. This PR therefore parses that string with our parser interface and then de-duplicates the names.
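
The parse-then-de-duplicate idea described above can be sketched roughly as follows. This is a minimal, standalone illustration and not the code in this patch: `FuncId`, `parseName`, and `ListFunctionsSketch` are hypothetical stand-ins for `FunctionIdentifier`, the parser interface, and `SessionCatalog`, and the toy parser only splits on the first dot (it does not handle backquoted identifiers).

```scala
// Hypothetical stand-in for Spark's FunctionIdentifier; not the real class.
final case class FuncId(name: String, database: Option[String] = None)

object ListFunctionsSketch {
  // Toy replacement for the parser interface: splits "db.func" on the first dot.
  // Assumption: names never contain backquotes or embedded dots.
  def parseName(unquoted: String): FuncId = unquoted.split("\\.", 2) match {
    case Array(db, fn) => FuncId(fn, Some(db))
    case _             => FuncId(unquoted)
  }

  // dbFunctions: identifiers listed from the external catalog.
  // registryNames: unquoted names as stored in the function registry.
  def listFunctions(dbFunctions: Seq[FuncId], registryNames: Seq[String]): Seq[FuncId] = {
    val loadedFunctions = registryNames.map(parseName)
    // A cached persistent function appears in both lists; distinct keeps one copy
    // because both sides are now structurally equal identifiers.
    (dbFunctions ++ loadedFunctions).distinct
  }
}
```

Comparing structured identifiers rather than raw strings is what makes the `distinct` call meaningful here: the catalog side is already an identifier, so the registry side has to be parsed into the same shape first.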

How was this patch tested?

Added test cases.

FunctionIdentifier(f)
}
}
val functions = dbFunctions ++ loadedFunctions
Member Author

The name of a persistent function exists in both dbFunctions and loadedFunctions.

dbFunctions

@SparkQA
SparkQA commented Apr 16, 2017

Test build #75832 has finished for PR 17646 at commit 7e26e5a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Member Author

cc @cloud-fan @hvanhovell

@ditta95aR
7e26e5a

FunctionIdentifier(f, Some(dbName)) }
val loadedFunctions =
StringUtils.filterPattern(functionRegistry.listFunction(), pattern).map { f =>
// In functionRegistry, function names are stored as an unquoted format.
Contributor

Shall we use FunctionIdentifier as the key in functionRegistry?

Member Author

Yes. I will do it that way in a follow-up refactoring.

We can first fix the issue and then backport it to the previous branches.
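
For illustration only, a registry keyed by a structured identifier instead of an unquoted string might look roughly like the sketch below. This is not the planned refactoring: `FuncKey` and `KeyedRegistry` are hypothetical names, and the real registry's handling of case sensitivity, builders, and thread safety is left out.

```scala
import scala.collection.mutable

// Hypothetical stand-in for FunctionIdentifier; not the real class.
final case class FuncKey(name: String, database: Option[String] = None)

// Hypothetical registry keyed by identifier. B is whatever a builder would be.
final class KeyedRegistry[B] {
  private val builders = mutable.LinkedHashMap.empty[FuncKey, B]

  // Registering the same identifier twice overwrites the earlier entry.
  def register(key: FuncKey, builder: B): Unit = builders(key) = builder

  def lookup(key: FuncKey): Option[B] = builders.get(key)

  // Keys are already identifiers, so callers need no parsing step.
  def listFunctions(): Seq[FuncKey] = builders.keys.toSeq
}
```

With keys of this shape, listing functions would return identifiers directly, so the string-parsing step discussed in this thread would no longer be needed.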

@cloud-fan
Contributor

LGTM

asfgit pushed a commit that referenced this pull request Apr 17, 2017
…ing persistent functions

Author: Xiao Li <[email protected]>

Closes #17646 from gatorsmile/showFunctions.

(cherry picked from commit 01ff035)
Signed-off-by: Xiao Li <[email protected]>
@gatorsmile
Member Author

Thanks! Merging to master/2.1

@asfgit asfgit closed this in 01ff035 Apr 17, 2017
asfgit pushed a commit that referenced this pull request Apr 18, 2017
…functions after using persistent functions

Revert the changes from #17646 made in branch 2.1, because they break the build. The change needs the parser interface, but SessionCatalog in branch 2.1 does not have it.

Author: Xiao Li <[email protected]>

Closes #17661 from gatorsmile/compilationFix17646.
peter-toth pushed a commit to peter-toth/spark that referenced this pull request Oct 6, 2018
…ing persistent functions

Author: Xiao Li <[email protected]>

Closes apache#17646 from gatorsmile/showFunctions.