[BACKPORT-2.1][SPARK-19104][SQL] Lambda variables in ExternalMapToCatalyst should be global #18627
What changes were proposed in this pull request?
This PR is a backport of #18418 to Spark 2.1. SPARK-21391 reported this problem in Spark 2.1.
The issue happens in ExternalMapToCatalyst. For example, code that converts a Scala Map to the catalyst map format creates an ExternalMapToCatalyst.
The valueConverter in ExternalMapToCatalyst contains a CreateNamedStruct expression (named_struct) that creates a row from InnerData.name and InnerData.value, both of which are referenced through ExternalMapToCatalyst_value52.
Because ExternalMapToCatalyst_value52 is a local variable, it can no longer be accessed once CreateNamedStruct splits its expressions into individual functions.
How was this patch tested?
Added a new test to DatasetPrimitiveSuite.
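A reproduction along the following lines could serve as such a test. This is only a sketch under stated assumptions, not the test that was actually added: the description names only InnerData.name and InnerData.value, so the field types, the Data wrapper class, the object name, and the sample values are hypothetical. It needs Spark SQL on the classpath and round-trips a Map whose values are case classes through a Dataset, which is the conversion that ExternalMapToCatalyst performs.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical case classes: only the field names `name` and `value` come
// from the PR description; their types and the `Data` wrapper are assumed.
case class InnerData(name: String, value: Int)
case class Data(id: Int, param: Map[String, InnerData])

object ExternalMapToCatalystRepro {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("ExternalMapToCatalystRepro")
      .getOrCreate()
    import spark.implicits._

    // A Scala Map with struct-like values must be converted to the catalyst
    // map format when the Dataset is created.
    val data = Seq(Data(1, Map("key" -> InnerData("name", 123))))
    val ds = spark.createDataset(data)

    // Collecting runs the generated serializer and deserializer; the
    // round-tripped rows should equal the input.
    assert(ds.collect().toSeq == data)

    spark.stop()
  }
}
```

Whether this small input actually triggers the expression splitting depends on the size of the generated code, so a passing run on an unpatched build does not by itself rule out the bug.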