[CBR 7.9] net/sched: Retire tcindex classifier #411

Open
wants to merge 1 commit into
base: ciqcbr7_9

pvts-mat
Contributor

[CBR 7.9]
CVE-2023-1829
VULN-7630

Problem

https://nvd.nist.gov/vuln/detail/CVE-2023-1829

A use-after-free vulnerability in the Linux Kernel traffic control index filter (tcindex) can be exploited to achieve local privilege escalation. The tcindex_delete function which does not properly deactivate filters in case of a perfect hashes while deleting the underlying structure which can later lead to double freeing the structure. A local attacker user can use this vulnerability to elevate its privileges to root. We recommend upgrading past commit 8c710f7.

Commit 8c710f7, which the advisory recommends upgrading past, simply removes the tcindex traffic control filter.

Applicability: yes

The tcindex filter is enabled in configs/kernel-3.10.0-x86_64.config

CONFIG_NET_CLS_TCINDEX=m

The CVE doesn't follow the typical bug-patch scheme, with the 8c710f7 fix not indicating any "fixes" commit. The brief, official bug description

The tcindex_delete function which does not properly deactivate filters in case of a perfect hashes while deleting the underlying structure which can later lead to double freeing the structure.

along with the third party analysis https://starlabs.sg/blog/2023/06-breaking-the-code-exploiting-and-examining-cve-2023-1829-in-cls_tcindex-classifier-vulnerability/#vulnerability-analysis was used to assess the applicability.

Out of four functions playing their part in creating the use-after-free scenario mentioned in the article - tcindex_delete, tcindex_destroy_rexts_work, __tcindex_destroy_rexts, tcf_exts_destroy - two differ slightly in the CBR 7.9 version:

  • __tcindex_destroy_rexts

    • STAR Labs analysis:

      static void __tcindex_destroy_rexts(struct tcindex_filter_result *r)
      {
          tcf_exts_destroy(&r->exts);
          tcf_exts_put_net(&r->exts);
          tcindex_data_put(r->p);
      }
      
    • CBR 7.9:

      static void __tcindex_destroy_rexts(struct tcindex_filter_result *r)
      {
          tcf_exts_destroy(&r->exts);
          tcf_exts_put_net(&r->exts);
      }

  • tcf_exts_destroy

    • STAR Labs analysis:

      void tcf_exts_destroy(struct tcf_exts *exts)
      {
      #ifdef CONFIG_NET_CLS_ACT
          if (exts->actions) {
              tcf_action_destroy(exts->actions, TCA_ACT_UNBIND);
              printk("free exts->actions: %px\n", exts->actions);
              kfree(exts->actions);  // [3]
          }
          exts->nr_actions = 0;
      #endif
      }
      
    • CBR 7.9:

      void tcf_exts_destroy(struct tcf_exts *exts)
      {
      #ifdef CONFIG_NET_CLS_ACT
          LIST_HEAD(actions);
          ASSERT_RTNL();
          tcf_exts_to_list(exts, &actions);
          tcf_action_destroy(&actions, TCA_ACT_UNBIND);
          kfree(exts->actions);
          exts->nr_actions = 0;
      #endif
      }

However, the execution path leading to the error remains in place. Following the STAR Labs analysis:

In the case of imperfect hashes, we observe that the filter linked to the result r is eliminated from the specified hash table at [2]. However, when it comes to perfect hashes at [1], no actions are taken to delete or deactivate the filter. Due to the fact that f is never set in the case of imperfect hashes, the function tcindex_destroy_rexts_work() will be invoked

The tcindex_delete function is exactly the same in ciqcbr7_9, so this part applies directly. (The last "imperfect" was probably a typo, as f is never set in the case of perfect hashes, which are also the subject of the bug.)

Once the tcf_exts_destroy() function is called, the exts->actions will be freed at index [3]. However, it will not be deactivated from the filter, which means that the pointer can still be accessed by the destroy function. This situation creates a use-after-free chunk, referred to as a perfect hash filter.

Despite __tcindex_destroy_rexts and tcf_exts_destroy being different, the problematic tcindex_destroy_rexts_work → __tcindex_destroy_rexts → tcf_exts_destroy → kfree call chain remains in place.
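
The effect of this chain can be illustrated with a small userspace model. This is a sketch under simplifying assumptions, not kernel code: every `model_*` name is hypothetical, the `freed` counter stands in for calls to `kfree()`, a single teardown pass stands in for the later destroy mentioned in the quoted analysis, and the `deactivate` flag shows a hypothetical alternative that clears the pointer, which is not what 8c710f7 does.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the heap object behind exts->actions; counts "kfree" calls. */
struct model_actions {
    int freed;
};

/* Stand-in for struct tcf_exts: only the pointer that gets freed. */
struct model_exts {
    struct model_actions *actions;
};

/* Stand-in for one struct tcindex_filter_result slot of the perfect hash. */
struct model_result {
    struct model_exts exts;
    unsigned int classid; /* non-zero while a class is bound */
};

/* Models tcf_exts_destroy(): releases actions but leaves the pointer set. */
static void model_exts_destroy(struct model_exts *exts)
{
    if (exts->actions)
        exts->actions->freed++; /* the kernel calls kfree() here */
}

/* Models the perfect-hash branch of tcindex_delete(). */
static int model_delete_perfect(struct model_result *r, bool deactivate)
{
    if (!r->classid)
        return -1; /* -ENOENT in the kernel */
    r->classid = 0; /* models tcf_unbind_filter() */
    model_exts_destroy(&r->exts);
    if (deactivate)
        r->exts.actions = NULL; /* hypothetical fix, for comparison only */
    return 0;
}

/* Models a later teardown that walks every slot of the perfect array. */
static void model_destroy_all(struct model_result *perfect, size_t n)
{
    for (size_t i = 0; i < n; i++)
        model_exts_destroy(&perfect[i].exts);
}

/* Returns how many times the slot's actions were "freed". */
int model_run(bool deactivate)
{
    struct model_actions acts = { 0 };
    struct model_result perfect[1] = { { { &acts }, 1 } };

    model_delete_perfect(&perfect[0], deactivate);
    model_destroy_all(perfect, 1);
    return acts.freed;
}
```

In this model, `model_run(false)` returns 2, meaning the same object is released twice, while `model_run(true)` returns 1.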

Based on this, and on the bug replication given later in this PR, it was assessed that CVE-2023-1829 applies to CBR 7.9.

Solution

The upstream approach was followed: the bug is "fixed" by removing the cls_tcindex module, as done in 8c710f7. This solution was also used by Debian (https://lists.debian.org/debian-lts-announce/2023/05/msg00005.html):

"valis" reported two flaws in the cls_tcindex network traffic
classifier which could lead to a use-after-free. A local user can
exploit these for privilege escalation. This update removes
cls_tcindex entirely.

Some changes relative to the upstream 8c710f7 were introduced:

  • Changes not in ciqcbr7_9:
    • No changes to include/net/tc_wrapper.h as this file was introduced much later, in v6.2, and has no predecessor in ciqcbr7_9.
    • No changes to tools/testing/selftests/tc-testing/tc-tests/filters/tcindex.json as this file is part of tc testing which ciqcbr7_9 lacks entirely.
  • Changes not in the upstream:
    • Removal of CONFIG_NET_CLS_TCINDEX options from the files in configs/* - kernel.org doesn't keep config files under version control, unlike CIQ.

kABI check: passed

[root@ciqcbr-7-9 pvts]# python /mnt/code/kernel-dist-git-el-7.9/SOURCES/check-kabi -k /mnt/code/kernel-dist-git-el-7.9/SOURCES/Module.kabi_x86_64 -s /mnt/build_files/kernel-src-tree-ciqcbr7_9-CVE-2023-1281/Module.symvers
[root@ciqcbr-7-9 pvts]# echo $? 
0

Boot test: passed

boot-test.log

Kselftests: passed relative to the reference

Reference

kselftests–ciqcbr7_9–run1.log
kselftests–ciqcbr7_9–run2.log
kselftests–ciqcbr7_9–run3.log

Patch

kselftests–ciqcbr7_9-CVE-2023-1829–run1.log
kselftests–ciqcbr7_9-CVE-2023-1829–run2.log
kselftests–ciqcbr7_9-CVE-2023-1829–run3.log

Comparison

The results were compared manually with Meld. No differences indicating newly introduced malfunctions were found.

Specific tests: passed

Bug replication (reference)

A simple bash script to replicate the CVE-2023-1829 bug can be found at https://mpdesouza.com/blog/five-commands-to-crash-the-kernel/:

for i in {1..300}; do
	tc qdisc add dev lo root handle 1:0 htb default 30
	tc class add dev lo parent 1: classid 1:1 htb rate 256kbit

	# Create a tcindex filter setting shift and mask to certain values to
	# make tcindex use the perfect hash
	tc filter add dev lo parent 1: protocol ip prio 1  \
		handle 10 tcindex mask 0xFFFF shift 10 classid 1:1 action drop

	# delete the filter to trigger the UAF on tcindex_delete
	tc filter delete dev lo parent 1: prio 1 handle 10 protocol ip tcindex
	tc qdisc delete dev lo root handle 1:0 htb
done

The bug trigger isn't deterministic and the for loop is simply to increase the likelihood of a crash.

The author of the script reports the following output on his system:

bash rep.sh
+ for i in {1..300}
+ tc qdisc add dev lo root handle 1:0 htb default 30
+ tc class add dev lo parent 1: classid 1:1 htb rate 256kbit
+ tc filter add dev lo parent 1: protocol ip prio 1 handle 10 tcindex mask 0xFFFF shift 10 classid 1:1 action drop
[    7.066915] GACT probability NOT on
[    7.073193] tc (113) used greatest stack depth: 24352 bytes left
+ tc filter delete dev lo parent 1: prio 1 handle 10 protocol ip tcindex
+ tc qdisc delete dev lo root handle 1:0 htb
+ for i in {1..300}
+ tc qdisc add dev lo root handle 1:0 htb default 30
[    7.141508] ==================================================================
[    7.142150] BUG: KASAN: use-after-free in tcf_action_destroy+0x66/0xd0
[    7.142414] Read of size 8 at addr ffff8881140aba00 by task kworker/u8:0/9
[    7.142671]
[    7.142732] CPU: 0 PID: 9 Comm: kworker/u8:0 Not tainted 6.2.0 #2
[    7.142956] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[    7.143213] Workqueue: tc_filter_workqueue tcindex_destroy_rexts_work [cls_tcindex]
[    7.143926] Call Trace:
[    7.144059]  <TASK>
[    7.144162]  dump_stack_lvl+0x48/0x5f
[    7.144335]  print_report+0x184/0x4b1
[    7.144510]  ? __virt_addr_valid+0xdd/0x160
[    7.144683]  kasan_report+0xdd/0x120
[    7.144850]  ? tcf_action_destroy+0x66/0xd0
[    7.145065]  ? tcf_action_destroy+0x66/0xd0
[    7.145301]  tcf_action_destroy+0x66/0xd0
[    7.145515]  tcf_exts_destroy+0x2d/0x60
[    7.145710]  __tcindex_destroy_rexts+0x11/0xf0 [cls_tcindex]
[    7.145991]  tcindex_destroy_rexts_work+0x1b/0x30 [cls_tcindex]
[    7.146297]  process_one_work+0x57a/0xa40
[    7.146512]  ? __pfx_process_one_work+0x10/0x10
[    7.146737]  ? __pfx_do_raw_spin_lock+0x10/0x10
[    7.146971]  worker_thread+0x93/0x700
[    7.147158]  ? __pfx_worker_thread+0x10/0x10
[    7.147369]  kthread+0x159/0x190
[    7.147534]  ? __pfx_kthread+0x10/0x10
[    7.147725]  ret_from_fork+0x29/0x50
[    7.147909]  </TASK>
…

This kind of output requires KASAN (Kernel Address Sanitizer) to be enabled. Unfortunately, KASAN was merged upstream in v4.0 and is unavailable in CBR 7.9 (v3.10).

The bug, of course, occurs whether or not KASAN is enabled; it is just not as easily triggered or as clearly reported. Instead of explicit "use-after-free" messages, memory-related but otherwise seemingly unrelated kernel crashes are expected whenever the use-after-free corrupts memory that something else depends on. This is exactly what was observed when running the script multiple times.
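
Why such crashes can surface in unrelated code can be sketched with a toy allocator. The model below assumes a pool that hands the most recently freed slot back first (a simplification of how slab caches tend to behave) and a free routine that does not check ownership. All `toy_*` names and the owner tags are hypothetical, and nothing here claims to reproduce the exact corruption behind the two examples that follow.

```c
#define POOL_SLOTS 4

enum { OWNER_NONE = 0, OWNER_TC_ACTIONS = 1, OWNER_OTHER = 2 };

/* Toy fixed-size pool with a LIFO free list. */
struct toy_pool {
    int free_stack[POOL_SLOTS]; /* indices of free slots */
    int free_top;               /* number of entries in free_stack */
    int owner[POOL_SLOTS];      /* OWNER_NONE when the slot is free */
};

static void toy_pool_init(struct toy_pool *p)
{
    for (int i = 0; i < POOL_SLOTS; i++) {
        p->free_stack[i] = POOL_SLOTS - 1 - i; /* slot 0 is handed out first */
        p->owner[i] = OWNER_NONE;
    }
    p->free_top = POOL_SLOTS;
}

/* Pops the most recently freed slot; returns -1 when the pool is empty. */
static int toy_alloc(struct toy_pool *p, int owner)
{
    int slot;

    if (p->free_top == 0)
        return -1;
    slot = p->free_stack[--p->free_top];
    p->owner[slot] = owner;
    return slot;
}

/* Like kfree(): trusts the caller and does not check who owns the slot. */
static void toy_free(struct toy_pool *p, int slot)
{
    if (p->free_top >= POOL_SLOTS)
        return; /* keep the toy model itself memory-safe */
    p->owner[slot] = OWNER_NONE;
    p->free_stack[p->free_top++] = slot;
}

/* Returns 1 if an unrelated object ends up sharing its slot. */
int toy_stale_free_scenario(void)
{
    struct toy_pool pool;
    int actions, other, third;

    toy_pool_init(&pool);
    actions = toy_alloc(&pool, OWNER_TC_ACTIONS);
    toy_free(&pool, actions);                   /* first, legitimate free */
    other = toy_alloc(&pool, OWNER_OTHER);      /* reuses the same slot */
    toy_free(&pool, actions);                   /* stale second free */
    third = toy_alloc(&pool, OWNER_TC_ACTIONS); /* handed the slot again */
    return other == actions && third == other;
}
```

Here `toy_stale_free_scenario()` returns 1: the unrelated object and a later allocation share one slot, so the damage shows up wherever that unrelated object is used next rather than in the tc code.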

  • Example 1:

    [root@ciqcbr-7-9 pvts]# /mnt/code/CVE-2023-1829/CVE-2023-1829-repro.sh
    …
    + tc filter delete dev lo parent 1: prio 1 handle 10 protocol ip tcindex
    [ 2554.775511] Kernel panic - not syncing: CRED: put_cred_rcu() sees ffff8af52d8860c0 with usage -1
    [ 2554.775511] 
    [ 2554.779816] CPU: 6 PID: 0 Comm: swapper/6 Kdump: loaded Tainted: G        W      ------------   3.10.0-ciqcbr7_9 #3
    [ 2554.783367] Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-2.el9_5.1 04/01/2014
    [ 2554.785913] Call Trace:
    [ 2554.787234]  <IRQ>  [<ffffffffb07b1c2c>] dump_stack+0x19/0x1f
    [ 2554.789436]  [<ffffffffb07ab748>] panic+0xe8/0x21f
    [ 2554.791344]  [<ffffffffb00d2500>] put_cred_rcu+0xc0/0xc0
    [ 2554.793159]  [<ffffffffb0163228>] rcu_process_callbacks+0x1d8/0x580
    [ 2554.794929]  [<ffffffffb00a9595>] __do_softirq+0xf5/0x290
    [ 2554.796511]  [<ffffffffb07c8aac>] call_softirq+0x1c/0x30
    [ 2554.798079]  [<ffffffffb0030825>] do_softirq+0x65/0xa0
    [ 2554.799607]  [<ffffffffb00a9945>] irq_exit+0x115/0x120
    [ 2554.801136]  [<ffffffffb07ca058>] smp_apic_timer_interrupt+0x48/0x60
    [ 2554.802914]  [<ffffffffb07c63f2>] apic_timer_interrupt+0x172/0x180
    [ 2554.804662]  <EOI>  [<ffffffffb07b9c00>] ? __sched_text_end+0x4/0x4
    [ 2554.806460]  [<ffffffffb07b9e5b>] ? native_safe_halt+0xb/0x30
    [ 2554.808123]  [<ffffffffb07b9c1e>] default_idle+0x1e/0xd0
    [ 2554.809695]  [<ffffffffb0039570>] arch_cpu_idle+0x20/0xc0
    [ 2554.811292]  [<ffffffffb010829a>] cpu_startup_entry+0x14a/0x1e0
    [ 2554.812997]  [<ffffffffb005d6e7>] start_secondary+0x1f7/0x270
    [ 2554.814663]  [<ffffffffb00000d5>] start_cpu+0x5/0x14
    [    0.000000] Initializing cgroup subsys cpuset
    [    0.000000] Initializing cgroup subsys cpu
    [    0.000000] Initializing cgroup subsys cpuacct
    [    0.000000] Linux version 3.10.0-ciqcbr7_9 (pvts@ciqcbr-7-9) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-44) (GCC) ) #3 SMP Thu Jul 3 00:29:46 UTC 2025
    …
    
  • Example 2:

    [root@ciqcbr-7-9 pvts]# /mnt/code/CVE-2023-1829/CVE-2023-1829-repro.sh
    …
    + tc filter delete dev lo parent 1: prio 1 handle 10 protocol ip tcindex
    [ 1094.994231] BUG: unable to handle kernel NULL pointer dereference at           (null)
    [ 1094.995625] IP: [<ffffffffabac0dc7>] pwq_activate_delayed_work+0x27/0xb0
    [ 1094.996784] PGD 0 
    [ 1094.997122] Oops: 0000 [#1] SMP 
    [ 1094.997654] Modules linked in: act_gact cls_tcindex sch_htb rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache nfit libnvdimm sunrpc iosf_mbi kvm_intel kvm irqbypass crc32_pclmul iTCO_wdt ghash_clmulni_intel iTCO_vendor_support bochs_drm ttm drm_kms_helper aesni_intel syscopyarea sysfillrect lrw sysimgblt gf128mul glue_helper fb_sys_fops ablk_helper sg drm cryptd joydev pcspkr virtio_balloon virtio_rng i2c_i801 lpc_ich drm_panel_orientation_quirks ip_tables xfs libcrc32c sr_mod cdrom virtio_net net_failover virtio_console virtio_blk failover ahci libahci libata crct10dif_pclmul crct10dif_common crc32c_intel serio_raw virtio_pci virtio_ring virtio
    [ 1095.007608] CPU: 5 PID: 312 Comm: kworker/u22:4 Kdump: loaded Not tainted 3.10.0-ciqcbr7_9 #3
    [ 1095.008874] Hardware name: Red Hat KVM/RHEL, BIOS 1.16.3-2.el9_5.1 04/01/2014
    [ 1095.009944] task: ffff9b1999ce9080 ti: ffff9b1999d00000 task.ti: ffff9b1999d00000
    [ 1095.011057] RIP: 0010:[<ffffffffabac0dc7>]  [<ffffffffabac0dc7>] pwq_activate_delayed_work+0x27/0xb0
    [ 1095.012439] RSP: 0018:ffff9b1999d03dd8  EFLAGS: 00010046
    [ 1095.013229] RAX: 0000000fffffffe1 RBX: ffff9b199f3d5300 RCX: ffff9b1999d03fd8
    [ 1095.014291] RDX: 0000000fffffff00 RSI: 0000000000000000 RDI: ffff9b199dd04440
    [ 1095.015356] RBP: ffff9b1999d03df0 R08: ffff9b199851b780 R09: 000000018020001f
    [ 1095.016412] R10: 0000000000000001 R11: ffff9b199851b780 R12: ffff9b199dd04440
    [ 1095.017474] R13: 0000000000000000 R14: ffff9b199f3d5300 R15: 00000000000002c0
    [ 1095.018541] FS:  0000000000000000(0000) GS:ffff9b199f280000(0000) knlGS:0000000000000000
    [ 1095.019736] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 1095.020597] CR2: 00000000000000b8 CR3: 000000060c010000 CR4: 0000000000360fe0
    [ 1095.021659] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [ 1095.022715] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [ 1095.023771] Call Trace:
    [ 1095.024153]  [<ffffffffabac171c>] pwq_dec_nr_in_flight+0x6c/0xb0
    [ 1095.025047]  [<ffffffffabac338a>] process_one_work+0x21a/0x440
    [ 1095.025915]  [<ffffffffabac4436>] worker_thread+0x126/0x3c0
    [ 1095.026750]  [<ffffffffabac4310>] ? manage_workers.isra.26+0x2b0/0x2b0
    [ 1095.027718]  [<ffffffffabacb621>] kthread+0xd1/0xe0
    [ 1095.028449]  [<ffffffffabacb550>] ? insert_kthread_work+0x40/0x40
    [ 1095.029355]  [<ffffffffac1c51dd>] ret_from_fork_nospec_begin+0x7/0x21
    [ 1095.030304]  [<ffffffffabacb550>] ? insert_kthread_work+0x40/0x40
    [ 1095.031216] Code: 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 55 45 31 ed 41 54 49 89 fc 53 48 8b 07 48 89 c2 30 d2 a8 04 4c 0f 45 ea 0f 1f 44 00 00 <49> 8b 45 00 48 8d 70 20 48 3b 70 20 74 61 31 d2 4c 89 e7 e8 e1 
    [ 1095.035234] RIP  [<ffffffffabac0dc7>] pwq_activate_delayed_work+0x27/0xb0
    [ 1095.036263]  RSP <ffff9b1999d03dd8>
    [ 1095.036793] CR2: 0000000000000000
    [    0.000000] Initializing cgroup subsys cpuset
    [    0.000000] Initializing cgroup subsys cpu
    [    0.000000] Initializing cgroup subsys cpuacct
    [    0.000000] Linux version 3.10.0-ciqcbr7_9 (pvts@ciqcbr-7-9) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-44) (GCC) ) #3 SMP Thu Jul 3 00:29:46 UTC 2025
    …
    

Patch

As expected, the bug-replicating script doesn't even run on the patched kernel:

[root@ciqcbr-7-9 pvts]# /mnt/code/CVE-2023-1829/CVE-2023-1829-repro.sh
+ for i in '{1..300}'
+ tc qdisc add dev lo root handle 1:0 htb default 30
+ tc class add dev lo parent 1: classid 1:1 htb rate 256kbit
+ tc filter add dev lo parent 1: protocol ip prio 1 handle 10 tcindex mask 0xFFFF shift 10 classid 1:1 action drop
RTNETLINK answers: No such file or directory
We have an error talking to the kernel

The cls_tcindex module is missing:

[root@ciqcbr-7-9 pvts]# modprobe cls_tcindex
modprobe: FATAL: Module cls_tcindex not found.

jira VULN-7630
cve CVE-2023-1829
commit-author Jamal Hadi Salim <[email protected]>
commit 8c710f7
upstream-diff Changes not in `ciqcbr7_9':
    1. No changes to `include/net/tc_wrapper.h' as this file was introduced
       much later, in v6.2, and has no predecessor in `ciqcbr7_9'.
    2. No changes to
       `tools/testing/selftests/tc-testing/tc-tests/filters/tcindex.json' as
       this file is part of `tc' testing which `ciqcbr7_9' lacks entirely.
    Changes not in the upstream:
    1. Removal of `CONFIG_NET_CLS_TCINDEX' options from the files in
       `configs/*' - the upstream doesn't keep config files under version
       control unlike Rocky.

The tcindex classifier has served us well for about a quarter of a century
but has not been getting much TLC due to lack of known users. Most recently
it has become easy prey to syzkaller. For this reason, we are retiring it.

	Signed-off-by: Jamal Hadi Salim <[email protected]>
	Acked-by: Jiri Pirko <[email protected]>
	Signed-off-by: Paolo Abeni <[email protected]>
(cherry picked from commit 8c710f7)
	Signed-off-by: Marcin Wcisło <[email protected]>