@panbingkun
Contributor

@panbingkun panbingkun commented Dec 17, 2023

What changes were proposed in this pull request?

This PR upgrades Netty from 4.1.100.Final to 4.1.106.Final.

Why are the changes needed?

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Pass GA.

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the BUILD label Dec 17, 2023
Member

@dongjoon-hyun dongjoon-hyun left a comment


Could you check the CI failures?

@panbingkun panbingkun marked this pull request as draft December 18, 2023 01:41
@panbingkun
Contributor Author

Okay, let me check it. I'll temporarily change the PR to draft first.

@LuciferYang
Contributor

LuciferYang commented Dec 18, 2023

[info] - applyInPandasWithState should require StatefulOpClusteredDistribution from children - without initial state *** FAILED *** (2 seconds, 622 milliseconds)
[info]   org.apache.spark.sql.streaming.StreamingQueryException: [STREAM_FAILED] Query [id = 7231fcdb-98d1-4627-858e-55f020fb52bd, runId = 70165d3c-1a0e-4d20-a73f-1d4658698069] terminated with exception: Job aborted due to stage failure: Task 0 in stage 2.0 failed 1 times, most recent failure: Lost task 0.0 in stage 2.0 (TID 7) (localhost executor driver): java.lang.NoSuchFieldError: chunkSize
[info] 	at io.netty.buffer.PooledByteBufAllocatorL$InnerAllocator.<init>(PooledByteBufAllocatorL.java:153)
[info] 	at io.netty.buffer.PooledByteBufAllocatorL.<init>(PooledByteBufAllocatorL.java:49)
[info] 	at org.apache.arrow.memory.NettyAllocationManager.<clinit>(NettyAllocationManager.java:51)
[info] 	at org.apache.arrow.memory.DefaultAllocationManagerFactory.<clinit>(DefaultAllocationManagerFactory.java:26)
[info] 	at java.base/java.lang.Class.forName0(Native Method)
[info] 	at java.base/java.lang.Class.forName(Class.java:375)

java.lang.NoSuchFieldError: chunkSize ... It seems the Netty change has caused compatibility issues with Arrow again.
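
To make the failure easier to read, the trace above can be reduced to its frames. The following is a minimal Python sketch, not part of this PR: the helper names `parse_frames` and `first_frame_in` are hypothetical, and the regular expression assumes only the `at <class>.<method>(<source>)` frame layout seen in the log above.

```python
import re

# Matches "at [module/]fully.qualified.Class.method(Source.java:123)".
# The optional "module/" prefix covers frames such as "java.base/java.lang.Class...".
FRAME = re.compile(r"\bat (?:[\w.]+/)?([\w.$]+)\.(<?\w+>?)\(([^)]*)\)")


def parse_frames(log_text):
    """Return (class_name, method, source) tuples for every stack frame found."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME.search(line)
        if match:
            frames.append(match.groups())
    return frames


def first_frame_in(frames, package_prefix):
    """Return the first frame whose class lives under package_prefix, or None."""
    for frame in frames:
        if frame[0].startswith(package_prefix + "."):
            return frame
    return None


# Frames copied from the log quoted in this thread.
SAMPLE = """\
[info]   at io.netty.buffer.PooledByteBufAllocatorL$InnerAllocator.<init>(PooledByteBufAllocatorL.java:153)
[info]   at io.netty.buffer.PooledByteBufAllocatorL.<init>(PooledByteBufAllocatorL.java:49)
[info]   at org.apache.arrow.memory.NettyAllocationManager.<clinit>(NettyAllocationManager.java:51)
[info]   at org.apache.arrow.memory.DefaultAllocationManagerFactory.<clinit>(DefaultAllocationManagerFactory.java:26)
[info]   at java.base/java.lang.Class.forName0(Native Method)
[info]   at java.base/java.lang.Class.forName(Class.java:375)
"""

if __name__ == "__main__":
    for class_name, method, source in parse_frames(SAMPLE):
        print(f"{class_name}.{method} ({source})")
```

Here `first_frame_in(parse_frames(SAMPLE), "org.apache.arrow")` picks out `NettyAllocationManager.<clinit>`, the Arrow class whose static initializer appears in the trace directly below the `PooledByteBufAllocatorL` constructor frames.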

@panbingkun
Contributor Author

Yes, I am fixing this issue.

@panbingkun
Contributor Author

I opened a new PR to fix the arrow-memory-netty issue: apache/arrow#39266

@dongjoon-hyun
Member

Thank you for the info, @panbingkun .

@LuciferYang
Contributor

@panbingkun Please rebase this PR. I think the upgrade target can be 4.1.106, and we can reuse this ticket.

@dongjoon-hyun
Member

+1 for @LuciferYang 's comment.

@panbingkun
Contributor Author

Done.

@panbingkun panbingkun changed the title from "[SPARK-46432][BUILD] Upgrade Netty to 4.1.104.Final" to "[SPARK-46432][BUILD] Upgrade Netty to 4.1.106.Final" Jan 24, 2024
Member

@dongjoon-hyun dongjoon-hyun left a comment


Please regenerate the dependency file. It seems that there is a new entry like the following.

netty-transport-native-epoll/4.1.106.Final/linux-riscv64/netty-transport-native-epoll-4.1.106.Final-linux-riscv64.jar

@panbingkun
Contributor Author

netty-transport-native-epoll-4.1.106.Final-linux-riscv64

Yeah, I analyzed the dependency using the following command:

build/mvn -Phive-thriftserver -Pkubernetes -Pyarn -Phive -Pspark-ganglia-lgpl -Pkinesis-asl -Phadoop-cloud -Phadoop-3 dependency:tree -pl assembly -am

and found that it depends on io.netty:netty-transport-classes-kqueue:jar:4.1.106.Final.

So, let's add it.

@panbingkun panbingkun marked this pull request as ready for review January 25, 2024 01:42
@LuciferYang
Contributor

LuciferYang commented Jan 25, 2024

Can Spark run on a riscv64 machine? Can we exclude this dependency? If not, we still need to fix network-yarn/pom.xml:

<configuration>
<target>
<echo message="Shade netty native libraries to ${spark.shade.native.packageName}" />
<unzip src="${shuffle.jar}" dest="${project.build.directory}/exploded/" />
<move file="${project.build.directory}/exploded/META-INF/native/libnetty_transport_native_epoll_x86_64.so"
tofile="${project.build.directory}/exploded/META-INF/native/lib${spark.shade.native.packageName}_netty_transport_native_epoll_x86_64.so" />
<move file="${project.build.directory}/exploded/META-INF/native/libnetty_transport_native_kqueue_x86_64.jnilib"
tofile="${project.build.directory}/exploded/META-INF/native/lib${spark.shade.native.packageName}_netty_transport_native_kqueue_x86_64.jnilib" />
<move file="${project.build.directory}/exploded/META-INF/native/libnetty_transport_native_epoll_aarch_64.so"
tofile="${project.build.directory}/exploded/META-INF/native/lib${spark.shade.native.packageName}_netty_transport_native_epoll_aarch_64.so" />
<move file="${project.build.directory}/exploded/META-INF/native/libnetty_transport_native_kqueue_aarch_64.jnilib"

tofile="${project.build.directory}/exploded/META-INF/native/lib${spark.shade.native.packageName}_netty_transport_native_epoll_aarch_64.so" />
<move file="${project.build.directory}/exploded/META-INF/native/libnetty_transport_native_kqueue_aarch_64.jnilib"
tofile="${project.build.directory}/exploded/META-INF/native/lib${spark.shade.native.packageName}_netty_transport_native_kqueue_aarch_64.jnilib" />
<move file="${project.build.directory}/exploded/META-INF/native/libnetty_transport_native_epoll_riscv64.so"
Contributor Author


Only this entry has been added; the rest has just been re-indented, from 4 spaces to 2 spaces.
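
The `<move>` entries in network-yarn/pom.xml all follow one renaming pattern: the shade prefix is inserted right after the leading `lib`. As a rough illustration (a sketch only, not code from the PR), the pattern can be written out in Python. The function name `shaded_native_name` and the value `example_prefix` are hypothetical; the real value of `${spark.shade.native.packageName}` is not shown in this thread.

```python
def shaded_native_name(original, package_prefix):
    """Insert package_prefix after the leading 'lib' of a native library name.

    Mirrors the file -> tofile pattern of the <move> entries quoted above:
    lib<name> becomes lib<package_prefix>_<name>.
    """
    if not original.startswith("lib"):
        raise ValueError(f"expected a name starting with 'lib': {original!r}")
    return f"lib{package_prefix}_{original[len('lib'):]}"


# Native library file names that appear in the pom.xml snippets in this thread.
NATIVE_LIBS = [
    "libnetty_transport_native_epoll_x86_64.so",
    "libnetty_transport_native_kqueue_x86_64.jnilib",
    "libnetty_transport_native_epoll_aarch_64.so",
    "libnetty_transport_native_kqueue_aarch_64.jnilib",
    "libnetty_transport_native_epoll_riscv64.so",
]

if __name__ == "__main__":
    # "example_prefix" is a placeholder, not the project's actual setting.
    for name in NATIVE_LIBS:
        print(name, "->", shaded_native_name(name, "example_prefix"))
```

Under this pattern, a new native library such as the riscv64 epoll one needs its own `<move>` entry.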

@panbingkun
Contributor Author

Done.

@LuciferYang
Contributor

In fact, I am more inclined to exclude this dependency. We do not yet have CI to verify that Apache Spark works on RISC-V, so I personally think this architecture is not supported for the time being.

Member

@dongjoon-hyun dongjoon-hyun left a comment


+1, LGTM. I'm fine with the new entry. It doesn't mean Apache Spark claims support for any additional architecture. Could you reconsider your decision, @LuciferYang?

netty-transport-native-epoll/4.1.106.Final/linux-aarch_64/netty-transport-native-epoll-4.1.106.Final-linux-aarch_64.jar
netty-transport-native-epoll/4.1.106.Final/linux-riscv64/netty-transport-native-epoll-4.1.106.Final-linux-riscv64.jar
netty-transport-native-epoll/4.1.106.Final/linux-x86_64/netty-transport-native-epoll-4.1.106.Final-linux-x86_64.jar
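
To check which platform classifiers a dependency list contains, the entries above can be split on `/`. The following Python sketch assumes only the four-part `artifact/version/classifier/jar` layout visible in these three lines; the helper names are hypothetical and this is not the project's own tooling.

```python
from collections import defaultdict


def parse_manifest_entry(line):
    """Split 'artifact/version/classifier/jar-file' into a dict of its parts.

    Assumes the four-part, slash-separated layout of the entries quoted
    above; any other shape raises ValueError.
    """
    parts = line.strip().split("/")
    if len(parts) != 4:
        raise ValueError(f"unexpected entry: {line!r}")
    artifact, version, classifier, jar = parts
    return {"artifact": artifact, "version": version,
            "classifier": classifier, "jar": jar}


def classifiers_by_artifact(lines):
    """Map each artifact name to the sorted list of its classifiers."""
    grouped = defaultdict(set)
    for line in lines:
        if line.strip():  # skip blank lines
            entry = parse_manifest_entry(line)
            grouped[entry["artifact"]].add(entry["classifier"])
    return {artifact: sorted(found) for artifact, found in grouped.items()}


# The three entries quoted in the comment above.
ENTRIES = [
    "netty-transport-native-epoll/4.1.106.Final/linux-aarch_64/netty-transport-native-epoll-4.1.106.Final-linux-aarch_64.jar",
    "netty-transport-native-epoll/4.1.106.Final/linux-riscv64/netty-transport-native-epoll-4.1.106.Final-linux-riscv64.jar",
    "netty-transport-native-epoll/4.1.106.Final/linux-x86_64/netty-transport-native-epoll-4.1.106.Final-linux-x86_64.jar",
]

if __name__ == "__main__":
    print(classifiers_by_artifact(ENTRIES))
```

Comparing the result for the old and the regenerated dependency file would surface a newly added classifier such as `linux-riscv64`.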

Contributor

@LuciferYang LuciferYang left a comment


OK, let's add this new dependency.

@LuciferYang
Contributor

Merged into master for Spark 4.0. Thanks @panbingkun and @dongjoon-hyun ~

@dongjoon-hyun
Member

Thank you, @panbingkun and @LuciferYang !

szehon-ho pushed a commit to szehon-ho/spark that referenced this pull request Aug 7, 2024
What changes were proposed in this pull request?

This PR upgrades `Netty` from `4.1.100.Final` to `4.1.106.Final`.

Why are the changes needed?

- To bring the latest bug fixes:
  - Automatically close Http2StreamChannel when Http2FrameStreamException reaches end of ChannelPipeline ([apache#13651](netty/netty#13651))
  - Symbol not found: _netty_jni_util_JNI_OnLoad ([apache#13695](netty/netty#13728))

- 4.1.106.Final release note: https://netty.io/news/2024/01/19/4-1-106-Final.html
- 4.1.105.Final release note: https://netty.io/news/2024/01/16/4-1-105-Final.html
- 4.1.104.Final release note: https://netty.io/news/2023/12/15/4-1-104-Final.html
- 4.1.103.Final release note: https://netty.io/news/2023/12/13/4-1-103-Final.html
- 4.1.101.Final release note: https://netty.io/news/2023/11/09/4-1-101-Final.html

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Pass GA.

Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#44384 from panbingkun/SPARK-46432.

Lead-authored-by: panbingkun <[email protected]>
Co-authored-by: panbingkun <[email protected]>
Signed-off-by: yangjie01 <[email protected]>
3 participants