@panbingkun
Contributor

@panbingkun panbingkun commented Feb 3, 2023

What changes were proposed in this pull request?

As a subtask of SPARK-42050, this PR adds codegen support for HiveSimpleUDF.

Why are the changes needed?

Improve codegen coverage and performance.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Added new unit tests; GitHub Actions pass.

@github-actions github-actions bot added the SQL label Feb 3, 2023
@HyukjinKwon
Member

Hm, I think we had better figure out how to resolve this by using Invoke, so we don't have to implement the codegen logic manually. I was fine with #39555 as a one-time thing, but it seems there are more cases to fix.

@HyukjinKwon
Member

cc @cloud-fan @yaooqinn @dongjoon-hyun FYI

@dongjoon-hyun
Member

I agree with @HyukjinKwon's comment too.

@panbingkun
Contributor Author

panbingkun commented Feb 9, 2023

> Hm, I think we had better figure out how to resolve this by using Invoke, so we don't have to implement the codegen logic manually. I was fine with #39555 as a one-time thing, but it seems there are more cases to fix.

As a first step, I have submitted a new PR (#39949) to rewrite HiveGenericUDF with Invoke.
cc @HyukjinKwon @cloud-fan @yaooqinn @dongjoon-hyun @LuciferYang

@DonnyZone
Contributor

DonnyZone commented Mar 8, 2023

Any progress on this topic? We find that codegen combined with the expression-reuse feature (#29975) can greatly improve performance, because many users write duplicate Hive UDF calls in their queries.
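
To illustrate why reuse matters when the same UDF call appears more than once in a query, here is a toy sketch in plain Java. It is not Spark code and does not reflect how #29975 is implemented: the class `SubexprReuseSketch`, the stand-in `expensiveUdf`, and the string-keyed per-row cache are all hypothetical names chosen for illustration. It only shows the general idea that a deterministic expression appearing twice needs to be evaluated once per row.

```java
import java.util.HashMap;
import java.util.Map;

// Toy illustration only: all names here are hypothetical, not Spark APIs.
class SubexprReuseSketch {
    // Counts how many times the stand-in UDF body actually runs.
    static int udfCalls = 0;

    // Stand-in for a deterministic, expensive Hive UDF (logic is made up).
    static String expensiveUdf(String input) {
        udfCalls++;
        return input.trim().toUpperCase();
    }

    // Evaluates an expression for the current row, reusing a cached result
    // when the same expression key has already been evaluated for that row.
    static String evalWithReuse(String exprKey, String arg, Map<String, String> rowCache) {
        return rowCache.computeIfAbsent(exprKey, k -> expensiveUdf(arg));
    }

    public static void main(String[] args) {
        String col = "  spark ";
        // Models a projection such as: my_udf(col), length(my_udf(col))
        // where my_udf(col) occurs twice in the same row.
        Map<String, String> rowCache = new HashMap<>();
        String first = evalWithReuse("my_udf(col)", col, rowCache);
        int second = evalWithReuse("my_udf(col)", col, rowCache).length();
        System.out.println(first + " " + second + " calls=" + udfCalls);
        // prints "SPARK 5 calls=1"
    }
}
```

Without the cache, the stand-in UDF would run twice per row in this example; with it, the second occurrence is a map lookup. This only holds for deterministic expressions, which is an assumption of the sketch.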

@panbingkun panbingkun closed this May 14, 2023