
[LTS 8.8 RT] CVE-2023-4206, CVE-2023-4207, CVE-2023-4208 #157



@pvts-mat pvts-mat commented Mar 8, 2025

[LTS 8.8 RT]
CVE-2023-4206 VULN-6647
CVE-2023-4207 VULN-6654
CVE-2023-4208 VULN-6661

Problem

This PR addresses three related CVEs that were once tracked under a single CVE-2023-4128. From https://lore.kernel.org/netdev/[email protected]/:

Three classifiers (cls_fw, cls_u32 and cls_route) always copy
tcf_result struct into the new instance of the filter on update.

This causes a problem when updating a filter bound to a class,
as tcf_unbind_filter() is always called on the old instance in the
success path, decreasing filter_cnt of the still referenced class
and allowing it to be deleted, leading to a use-after-free.

This patch set fixes this issue in all affected classifiers by no longer
copying the tcf_result struct from the old filter.

Each CVE is related to a different classifier:

CVE            File                   Option
CVE-2023-4206  net/sched/cls_route.c  CONFIG_NET_CLS_ROUTE4
CVE-2023-4207  net/sched/cls_fw.c     CONFIG_NET_CLS_FW
CVE-2023-4208  net/sched/cls_u32.c    CONFIG_NET_CLS_U32
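
The reference-counting problem described in the quoted cover letter can be illustrated with a toy model. This is a minimal sketch in Python, not kernel code: `TcClass`, `Filter`, `bind`, `unbind`, `change`, `dangling` and `demo` are hypothetical names, and only the case of an update that supplies no new class id is modelled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TcClass:
    """Toy stand-in for a traffic class and its filter reference count."""
    name: str
    filter_cnt: int = 0


@dataclass
class Filter:
    """Toy stand-in for a filter; `res` plays the role of tcf_result."""
    res: Optional[TcClass] = None


def bind(flt: Filter, cls: TcClass) -> None:
    """Point the filter at `cls`, counting the new reference and dropping any old one."""
    cls.filter_cnt += 1
    prev, flt.res = flt.res, cls
    if prev is not None:
        prev.filter_cnt -= 1


def unbind(flt: Filter) -> None:
    """Drop the filter's counted reference, if it holds one."""
    if flt.res is not None:
        flt.res.filter_cnt -= 1
        flt.res = None


def change(old: Filter, copy_result: bool) -> Filter:
    """Model an update of an existing filter that supplies no new class id."""
    new = Filter()
    if copy_result:
        new.res = old.res  # pre-fix behaviour: reference copied, count untouched
    unbind(old)            # success path: the old instance is always unbound
    return new


def dangling(flt: Filter) -> bool:
    """True if the filter references a class whose count claims it is unused."""
    return flt.res is not None and flt.res.filter_cnt == 0


def demo(copy_result: bool) -> bool:
    cls = TcClass("1:1")
    old = Filter()
    bind(old, cls)
    return dangling(change(old, copy_result))


print(demo(True), demo(False))  # prints "True False"
```

With the copy in place the new filter still points at a class whose count has dropped to zero, which is the state that allows the class to be deleted while referenced. Without the copy the new filter simply holds no class reference.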

Analysis and solution

Official fixes

The official fixes for each of the vulnerabilities are as follows:

CVE            Mainline fix  Backport to 4.19 (closest to 4.18 of Rocky LTS 8.8 RT)  Relation to mainline fix  Applicable to LTS 8.8 RT
CVE-2023-4206  b80b829       ad8f36f96696a7f1d191da66637c415959bab6d8                Same                      Yes
CVE-2023-4207  76e42ae       4f38dc8496d1991e2c055a0068dd98fb48affcc6                Same                      Yes
CVE-2023-4208  3044b16       4aae24015ecd70d824a953e2dc5b0ca2c4769243                Same                      Yes

Applicability

From the configuration standpoint each change is applicable to LTS 8.8 RT: all three affected classifiers are built as modules.

grep -e '\(CONFIG_NET_SCHED\|CONFIG_NET_CLS\|CONFIG_NET_CLS_ROUTE4\|CONFIG_NET_CLS_FW\|CONFIG_NET_CLS_U32\)\b' configs/kernel-rt-4.18.0-x86_64.config

CONFIG_NET_SCHED=y
CONFIG_NET_CLS=y
CONFIG_NET_CLS_ROUTE4=m
CONFIG_NET_CLS_FW=m
CONFIG_NET_CLS_U32=m
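
The same check can be scripted when it has to be repeated for several config files. The following is a minimal sketch; `option_values` and `REQUIRED` are hypothetical names, and the parsing assumes the usual `CONFIG_X=value` line format of kernel config files.

```python
import re

# Options relevant to the three classifiers, as listed in the grep above.
REQUIRED = (
    "CONFIG_NET_SCHED",
    "CONFIG_NET_CLS",
    "CONFIG_NET_CLS_ROUTE4",
    "CONFIG_NET_CLS_FW",
    "CONFIG_NET_CLS_U32",
)


def option_values(config_text, names=REQUIRED):
    """Map each option name to its value, or None when it is absent or not set."""
    values = dict.fromkeys(names)
    for line in config_text.splitlines():
        match = re.fullmatch(r"(CONFIG_\w+)=(.*)", line.strip())
        if match and match.group(1) in values:
            values[match.group(1)] = match.group(2)
    return values


def all_enabled(config_text, names=REQUIRED):
    """True when every option is built in ('y') or built as a module ('m')."""
    return all(v in ("y", "m") for v in option_values(config_text, names).values())
```

For the file used above this would be called as `all_enabled(open("configs/kernel-rt-4.18.0-x86_64.config").read())`.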

Analysis

For a discussion of why it is valid to fix the issue by simply skipping one field when copying a data structure, where a full copy might otherwise be expected, see the analysis in the LTS 8.6 RT pull request. It is not repeated here for LTS 8.8 RT.

Unrelated to the tcf_result issue, it may be worth considering the retirement of the tcindex filter in LTS 8.8 RT, as it was done in the mainline kernel for security reasons on 2023-02-16:

commit 8c710f75256bb3cf05ac7b1672c82b92c43f3d28
Author:     Jamal Hadi Salim <[email protected]>
AuthorDate: Tue Feb 14 08:49:14 2023 -0500
Commit:     Paolo Abeni <[email protected]>
CommitDate: Thu Feb 16 09:27:07 2023 +0100

    net/sched: Retire tcindex classifier
    
    The tcindex classifier has served us well for about a quarter of a century
    but has not been getting much TLC due to lack of known users. Most recently
    it has become easy prey to syzkaller. For this reason, we are retiring it.
    
    Signed-off-by: Jamal Hadi Salim <[email protected]>
    Acked-by: Jiri Pirko <[email protected]>
    Signed-off-by: Paolo Abeni <[email protected]>

(syzkaller is Google's kernel fuzzing framework.)

The retirement of tcindex from the mainline kernel is unfortunate for LTS 8.8 RT: the classifier remains here not only as a rich source of vulnerabilities, as the commit message suggests, but as a silent one, since kernel.org will issue neither CVEs nor patches for it in the future.

kABI check: omitted (unstable ABI of RT kernels)

Boot test: passed

boot-test.log

Kselftests: passed relative

Methodology

A mix of kernel-selftests-internal and source-compiled tests was used:

  • kernel-selftests-internal: bpf tests, except:
    • bpf:test_kmod.sh: takes a very long time to finish and always fails anyway,
    • bpf:test_progs: unstable, can crash the machine,
    • bpf:test_progs-no_alu32: unstable, can crash the machine.
  • source-compiled: all the rest.

Coverage (including tests skipped during execution)

android, bpf, breakpoints, capabilities, cgroup, core, cpu-hotplug, cpufreq, drivers/net/bonding, drivers/net/team, efivarfs, exec, filesystems, firmware, fpu, ftrace, futex, gpio, intel_pstate, ipc, kcmp, kvm, lib, livepatch, membarrier, memfd, memory-hotplug, mount, mqueue, net, net/forwarding, net/mptcp, netfilter, nsfs, proc, pstore, ptrace, rseq, rtc, sgx, sigaltstack, size, splice, static_keys, sync, sysctl, tc-testing, tdx, timens, timers, tpm2, user, vm, x86, zram

Reference

Two test runs were conducted on the reference kernel.
kselftests--mix--ciqlts8_8-rt--run1.log
kselftests--mix--ciqlts8_8-rt--run2.log

Patch

Two test runs were conducted on the patched kernel.
kselftests--mix--ciqlts8_8-rt-CVE-2023-4206.4207.4208--run1.log
kselftests--mix--ciqlts8_8-rt-CVE-2023-4206.4207.4208--run2.log

Comparison

ktests.xsh table --where "Summary = 'diff'" kselftests*.log

Column    File
--------  ---------------------------------------------------------------
Status0   kselftests--mix--ciqlts8_8-rt--run1.log
Status1   kselftests--mix--ciqlts8_8-rt--run2.log
Status2   kselftests--mix--ciqlts8_8-rt-CVE-2023-4206.4207.4208--run1.log
Status3   kselftests--mix--ciqlts8_8-rt-CVE-2023-4206.4207.4208--run2.log

TestCase                   Status0  Status1  Status2  Status3  Summary
bpf:test_tcpnotify_user    fail     pass     pass     pass     diff
net/mptcp:simult_flows.sh  pass     pass     pass     fail     diff
net:gro.sh                 fail     pass     pass     pass     diff
net:reuseport_addr_any.sh  fail     pass     fail     fail     diff
netfilter:nft_queue.sh     fail     fail     fail     pass     diff

Among the differing results, the tests bpf:test_tcpnotify_user, net/mptcp:simult_flows.sh, net:gro.sh and netfilter:nft_queue.sh were already known to give inconsistent results.

The net:reuseport_addr_any.sh test was previously known to always fail. It now shows inconsistent results within the reference batch itself, so it was added to the list of flappy tests.
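
As a cross-check that does not rely on the list of known flappy tests, the table can be scanned for rows where the reference runs agree with each other, the patched runs agree with each other, and the two groups differ. This is a different heuristic from the one used above, offered only as a sketch; `consistent_regressions` and `ROWS` are hypothetical names, and the data is copied from the table.

```python
def consistent_regressions(rows, n_reference):
    """Return tests whose reference runs all agree, whose patched runs all agree,
    and whose two groups disagree.

    rows maps a test name to its statuses, reference runs first.
    """
    found = []
    for test, statuses in rows.items():
        reference = set(statuses[:n_reference])
        patched = set(statuses[n_reference:])
        if len(reference) == 1 and len(patched) == 1 and reference != patched:
            found.append(test)
    return found


# Status0..Status3 from the comparison table; the first two are reference runs.
ROWS = {
    "bpf:test_tcpnotify_user": ["fail", "pass", "pass", "pass"],
    "net/mptcp:simult_flows.sh": ["pass", "pass", "pass", "fail"],
    "net:gro.sh": ["fail", "pass", "pass", "pass"],
    "net:reuseport_addr_any.sh": ["fail", "pass", "fail", "fail"],
    "netfilter:nft_queue.sh": ["fail", "fail", "fail", "pass"],
}

print(consistent_regressions(ROWS, n_reference=2))  # prints []
```

None of the five differing rows meets that stricter criterion, which is consistent with treating all of them as flappy.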

Kselftests (networking): passed relative

Methodology

In the general kselftests run all the net/forwarding tests fail (they really should be skipped) because of missing tool dependencies:

# selftests: net/forwarding: bridge_igmp.sh
# SKIP: jq not installed
not ok 1 selftests: net/forwarding: bridge_igmp.sh # exit=1
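
Failures of this kind can be told apart from real ones by looking for a SKIP line preceding the result line. A minimal sketch follows, assuming the log format shown above; `classify` and `RESULT` are hypothetical names and this is not how ktests.xsh itself works.

```python
import re

# Matches result lines such as "not ok 1 selftests: net/forwarding: bridge_igmp.sh # exit=1".
RESULT = re.compile(r"^(not ok|ok) \d+ selftests: (\S+): (\S+)")


def classify(lines):
    """Return (test, status, skip_reason) tuples for each result line.

    skip_reason is the text of a "# SKIP:" line seen since the previous
    result line, or None when there was none.
    """
    results = []
    skip_reason = None
    for line in lines:
        if line.startswith("# SKIP:"):
            skip_reason = line[len("# SKIP:"):].strip()
            continue
        match = RESULT.match(line)
        if match:
            status = "pass" if match.group(1) == "ok" else "fail"
            results.append((f"{match.group(2)}:{match.group(3)}", status, skip_reason))
            skip_reason = None
    return results


LOG = [
    "# selftests: net/forwarding: bridge_igmp.sh",
    "# SKIP: jq not installed",
    "not ok 1 selftests: net/forwarding: bridge_igmp.sh # exit=1",
]
print(classify(LOG))
# prints [('net/forwarding:bridge_igmp.sh', 'fail', 'jq not installed')]
```

A test reported as failed but carrying a skip reason points at the test environment rather than at the kernel under test.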

Because the patch deals with networking specifically, an additional batch of tests was carried out after resolving the missing test dependencies.

sudo make ARCH=$(uname -m) -C tools/testing/selftests TARGETS="net/forwarding" run_tests

The tools/testing/selftests/net/forwarding/forwarding.config file used was created directly from the tools/testing/selftests/net/forwarding/forwarding.config.sample.

Reference

Three test runs were conducted on the reference kernel.
kselftests-net-forwarding--src--ciqlts8_8-rt--run1.log
kselftests-net-forwarding--src--ciqlts8_8-rt--run2.log
kselftests-net-forwarding--src--ciqlts8_8-rt--run3.log

Patch

A single test run was conducted on the patched kernel.
kselftests-net-forwarding--src--ciqlts8_8-rt-CVE-2023-4206.4207.4208--run1.log

Comparison and discussion

Results for the reference and patched kernel are the same.

ktests.xsh table kselftests-net-forwarding*.log

Column    File
--------  ------------------------------------------------------------------------------
Status0   kselftests-net-forwarding--src--ciqlts8_8-rt--run1.log
Status1   kselftests-net-forwarding--src--ciqlts8_8-rt--run2.log
Status2   kselftests-net-forwarding--src--ciqlts8_8-rt--run3.log
Status3   kselftests-net-forwarding--src--ciqlts8_8-rt-CVE-2023-4206.4207.4208--run1.log

TestCase                                     Status0  Status1  Status2  Status3  Summary
net/forwarding:bridge_igmp.sh                fail     fail     fail     fail     same
net/forwarding:bridge_locked_port.sh         pass     pass     pass     pass     same
net/forwarding:bridge_port_isolation.sh      pass     pass     pass     pass     same
net/forwarding:bridge_sticky_fdb.sh          pass     pass     pass     pass     same
net/forwarding:bridge_vlan_aware.sh          fail     fail     fail     fail     same
net/forwarding:bridge_vlan_unaware.sh        pass     pass     pass     pass     same
net/forwarding:ethtool.sh                    fail     fail     fail     fail     same
net/forwarding:gre_multipath.sh              fail     fail     fail     fail     same
net/forwarding:ip6_forward_instats_vrf.sh    fail     fail     fail     fail     same
net/forwarding:ipip_flat_gre.sh              pass     pass     pass     pass     same
net/forwarding:ipip_flat_gre_key.sh          pass     pass     pass     pass     same
net/forwarding:ipip_flat_gre_keys.sh         pass     pass     pass     pass     same
net/forwarding:ipip_hier_gre.sh              pass     pass     pass     pass     same
net/forwarding:ipip_hier_gre_key.sh          pass     pass     pass     pass     same
net/forwarding:ipip_hier_gre_keys.sh         pass     pass     pass     pass     same
net/forwarding:loopback.sh                   skip     skip     skip     skip     same
net/forwarding:mirror_gre.sh                 fail     fail     fail     fail     same
net/forwarding:mirror_gre_bound.sh           pass     pass     pass     pass     same
net/forwarding:mirror_gre_bridge_1d.sh       pass     pass     pass     pass     same
net/forwarding:mirror_gre_bridge_1d_vlan.sh  fail     fail     fail     fail     same
net/forwarding:mirror_gre_bridge_1q.sh       fail     fail     fail     fail     same
net/forwarding:mirror_gre_bridge_1q_lag.sh   fail     fail     fail     fail     same
net/forwarding:mirror_gre_changes.sh         fail     fail     fail     fail     same
net/forwarding:mirror_gre_flower.sh          fail     fail     fail     fail     same
net/forwarding:mirror_gre_lag_lacp.sh        pass     pass     pass     pass     same
net/forwarding:mirror_gre_neigh.sh           pass     pass     pass     pass     same
net/forwarding:mirror_gre_nh.sh              pass     pass     pass     pass     same
net/forwarding:mirror_gre_vlan.sh            pass     pass     pass     pass     same
net/forwarding:mirror_gre_vlan_bridge_1q.sh  fail     fail     fail     fail     same
net/forwarding:mirror_vlan.sh                pass     pass     pass     pass     same
net/forwarding:router.sh                     skip     skip     skip     skip     same
net/forwarding:router_bridge.sh              pass     pass     pass     pass     same
net/forwarding:router_bridge_vlan.sh         pass     pass     pass     pass     same
net/forwarding:router_broadcast.sh           pass     pass     pass     pass     same
net/forwarding:router_multicast.sh           skip     skip     skip     skip     same
net/forwarding:router_multipath.sh           fail     fail     fail     fail     same
net/forwarding:router_vid_1.sh               pass     pass     pass     pass     same

The list of net/forwarding tests performed is not exhaustive (37 of 54). The net/forwarding:sch_ets.sh test, which runs right after net/forwarding:router_vid_1.sh, hangs the machine for more than 10 minutes, at which point the testing framework interrupts the test suite:

...
ok 37 selftests: net/forwarding: router_vid_1.sh
# selftests: net/forwarding: sch_ets.sh
# TEST: ping vlan 10                                                  [ OK ]
# TEST: ping vlan 11                                                  [ OK ]
# TEST: ping vlan 12                                                  [ OK ]
# Running in priomap mode
# Testing ets bands 3 strict 3, streams 0 1
# TEST: band 0                                                        [ OK ]
# INFO: Expected ratio >95% Measured ratio 100.00
# TEST: band 1                                                        [ OK ]
# INFO: Expected ratio <5% Measured ratio 0
# Testing ets bands 3 strict 3, streams 1 2
ERROR:root:Subprocess exceeded the maximum freeze time 600 s. Terminating
INFO:root:Finished tests. Cleaning up the machine
...
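
The "maximum freeze time" in the log is a limit on silence rather than on total run time. The testing framework used here is not shown, so the following is only a sketch of how such a limit can be implemented for a local subprocess; `run_with_freeze_limit` is a hypothetical name.

```python
import subprocess
import threading
import time


def run_with_freeze_limit(cmd, max_freeze_s):
    """Run cmd and kill it if it prints nothing for more than max_freeze_s seconds.

    Returns (output_lines, finished), where finished is False if the
    process had to be killed.
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    lines = []
    last_output = [time.monotonic()]  # one-element list so the thread can update it

    def reader():
        for line in proc.stdout:
            lines.append(line.rstrip("\n"))
            last_output[0] = time.monotonic()

    thread = threading.Thread(target=reader, daemon=True)
    thread.start()

    while proc.poll() is None:
        if time.monotonic() - last_output[0] > max_freeze_s:
            proc.kill()
            proc.wait()
            thread.join(timeout=1)
            return lines, False
        time.sleep(0.05)

    thread.join()  # collect any output still buffered in the pipe
    return lines, True
```

A test that keeps printing progress may run for as long as it needs, while one that goes quiet, like sch_ets.sh above, is terminated once the limit is exceeded.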

Fixing this hang was deferred to the work on another CVE for the sake of patching efficiency.

pvts-mat added 3 commits March 5, 2025 15:35
…e-after-free

jira VULN-6647
cve CVE-2023-4206
commit-author valis <[email protected]>
commit b80b829

When route4_change() is called on an existing filter, the whole
tcf_result struct is always copied into the new instance of the filter.

This causes a problem when updating a filter bound to a class,
as tcf_unbind_filter() is always called on the old instance in the
success path, decreasing filter_cnt of the still referenced class
and allowing it to be deleted, leading to a use-after-free.

Fix this by no longer copying the tcf_result struct from the old filter.

Fixes: 1109c00 ("net: sched: RCU cls_route")
	Reported-by: valis <[email protected]>
	Reported-by: Bing-Jhong Billy Jheng <[email protected]>
	Signed-off-by: valis <[email protected]>
	Signed-off-by: Jamal Hadi Salim <[email protected]>
	Reviewed-by: Victor Nogueira <[email protected]>
	Reviewed-by: Pedro Tammela <[email protected]>
	Reviewed-by: M A Ramdhan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit b80b829)
	Signed-off-by: Marcin Wcisło <[email protected]>
…fter-free

jira VULN-6654
cve CVE-2023-4207
commit-author valis <[email protected]>
commit 76e42ae

When fw_change() is called on an existing filter, the whole
tcf_result struct is always copied into the new instance of the filter.

This causes a problem when updating a filter bound to a class,
as tcf_unbind_filter() is always called on the old instance in the
success path, decreasing filter_cnt of the still referenced class
and allowing it to be deleted, leading to a use-after-free.

Fix this by no longer copying the tcf_result struct from the old filter.

Fixes: e35a8ee ("net: sched: fw use RCU")
	Reported-by: valis <[email protected]>
	Reported-by: Bing-Jhong Billy Jheng <[email protected]>
	Signed-off-by: valis <[email protected]>
	Signed-off-by: Jamal Hadi Salim <[email protected]>
	Reviewed-by: Victor Nogueira <[email protected]>
	Reviewed-by: Pedro Tammela <[email protected]>
	Reviewed-by: M A Ramdhan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 76e42ae)
	Signed-off-by: Marcin Wcisło <[email protected]>
…after-free

jira VULN-6661
cve CVE-2023-4208
commit-author valis <[email protected]>
commit 3044b16

When u32_change() is called on an existing filter, the whole
tcf_result struct is always copied into the new instance of the filter.

This causes a problem when updating a filter bound to a class,
as tcf_unbind_filter() is always called on the old instance in the
success path, decreasing filter_cnt of the still referenced class
and allowing it to be deleted, leading to a use-after-free.

Fix this by no longer copying the tcf_result struct from the old filter.

Fixes: de5df63 ("net: sched: cls_u32 changes to knode must appear atomic to readers")
	Reported-by: valis <[email protected]>
	Reported-by: M A Ramdhan <[email protected]>
	Signed-off-by: valis <[email protected]>
	Signed-off-by: Jamal Hadi Salim <[email protected]>
	Reviewed-by: Victor Nogueira <[email protected]>
	Reviewed-by: Pedro Tammela <[email protected]>
	Reviewed-by: M A Ramdhan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
	Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 3044b16)
	Signed-off-by: Marcin Wcisło <[email protected]>
@pvts-mat pvts-mat marked this pull request as ready for review March 12, 2025 22:11

@bmastbergen bmastbergen left a comment


🥌


@PlaidCat PlaidCat left a comment


:shipit:

@PlaidCat PlaidCat merged commit 580261a into ctrliq:ciqlts8_8-rt Mar 13, 2025
1 check passed