[LTS 9.2] netfilter: CVE-2022-{48641, 49917, 49918}, CVE-2023-{3777, 4015, 4244, 5197, 52620, 52923, 52924, 53549, 53635, 7192} #668
base: ciqlts9_2
Conversation
@pvts-mat I know this is still in draft, but I wanted to upload the output from our upstream bug and CVE checker script. Most of the
As you mentioned in the PR summary and as we've discussed in Slack, this PR already seems large enough, and pulling in the bugfixes above would only make it bigger. I'm OK with kicking the bugfixes down the road to a future PR.
Thanks, that's exactly why I put it out here instead of drafting it locally: there's a lot to look into, and I wanted to get the discussion rolling. Let me address the logs; I'll start with
Here's the detailed breakdown:
Yep, that one is wrong in the commit. I have it correct in the PR, so it must have been a cherry-picking mistake.
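One way to make such slips easier to audit is `git cherry-pick -x`, which records the source SHA in the resulting message. A minimal sketch in a throwaway repository (all file and commit names here are hypothetical):

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo
echo base > f.txt; git add f.txt; git commit -qm "base"
base=$(git rev-parse HEAD)
echo fix >> f.txt; git commit -qam "netfilter: hypothetical fix"
fix=$(git rev-parse HEAD)
# Branch off the base commit, as an LTS branch would:
git checkout -q -b lts "$base"
# -x appends "(cherry picked from commit <full sha>)" to the message:
git cherry-pick -x "$fix" >/dev/null
git log -1 --format=%B
```

The recorded full SHA lets a checker script verify that the picked commit really is the one the message claims.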
Force-pushed 2919917 to 520390b (compare)
All the fixes reported are addressed in the
I'll explain the Rows
Then there are the 6da5f1b and 8d1c918 fixes for 73db1b8 mentioned in the script's log. I guess the SHA codes are truncated to fewer than 7 characters in the searches, because the "Fixes" commit referenced there isn't actually 73db1b8, but 73db1b5. Funnily enough it's also
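The ambiguity is easy to demonstrate with plain shell; both SHAs below are hypothetical stand-ins sharing the same 6-character prefix:

```shell
# "Fixes:" reference as reported by the script (7 chars):
fixes_ref="73db1b8"
# Hypothetical full SHA of the commit the search actually hit (73db1b5...):
actual="73db1b52ad4c6a0cbee4fb5c09e2b1de"
# Truncating the reference to 6 characters:
prefix6=$(printf '%.6s' "$fixes_ref")   # -> "73db1b"
case "$actual" in
  "$prefix6"*) echo "6-char prefix matches the wrong commit" ;;
esac
case "$actual" in
  "$fixes_ref"*) echo "7-char prefix matches" ;;
  *)             echo "7-char prefix does not match" ;;
esac
```

So a checker that greps with a 6-character prefix will happily resolve 73db1b8 to a 73db1b5… commit; keeping at least the full 7-character abbreviation avoids this particular collision.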
With respect to this: it is in our kernel but not the upstream kernel, yet it is upstream to CIQ because it's in the CentOS Stream tree that we maintain a copy of in our tree. Maybe we need to add some additional logic to check whether it's in our tree. We have some examples in commits for the CIQ-6.12.y branch where we pull from
These two need to be reordered.
Order should be:
Hmm, it seems unlikely that 2cdaa3e(
Indeed the 47d4b36(
Perhaps
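For the record, swapping two adjacent commits doesn't require an interactive editor session. A sketch in a throwaway repository (commit names are placeholders; assumes GNU sed for the in-place edit of the rebase todo list):

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo
# Three commits: "first", "second", "third" (HEAD is "third"):
for m in first second third; do
  echo "$m" > "$m.txt"; git add "$m.txt"; git commit -qm "$m"
done
# GIT_SEQUENCE_EDITOR rewrites the todo list; this sed swaps its
# first two "pick" lines, i.e. reorders the two most recent commits:
GIT_SEQUENCE_EDITOR="sed -i '1{h;d};2{G}'" git rebase -q -i HEAD~2
git log --format=%s -2
```

After the rebase the history is first, third, second, with "second" now on top.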
This file is wild; at this point it all seems the same, but the diffs between our tree and upstream make this super hard to read.
No, I'm wrong, I got mixed up; I missed this part of a9e9413
Let me think about it and get back to you on the second part; it's been several hours of looking through this.
- Fixed relocation failure observed on [1]
- Fixed spurious (and massive) added or deleted code blocks appearing in the output for rejected hunks, encountered with [1]
- Fixed the N value passed to --fuzzy not being honored correctly, causing more fuzzing to occur than desired
- Fixed too few context lines in the output, which hurt readability; now there are more context lines in the output

[1] #668
Apologies for the delay. Yes, despite it being a clean cherry-pick, since we're missing 0fabd2aa199faeb8754aee94658f2c48ccb2c8c3 (kernel-mainline) it's probably worth mentioning, because the code never moved in the first place. We're also not referencing that code anywhere. So yes, let's add the
Force-pushed 520390b to b413376 (compare)
jira VULN-66488 cve CVE-2022-49918 commit-author Zhengchao Shao <[email protected]> commit 3d00c6a

During the initialization of ip_vs_conn_net_init(), if file ip_vs_conn or ip_vs_conn_sync fails to be created, the initialization is successful by default. Therefore, the ip_vs_conn or ip_vs_conn_sync file doesn't be found during the remove. The following is the stack information:

name 'ip_vs_conn_sync'
WARNING: CPU: 3 PID: 9 at fs/proc/generic.c:712 remove_proc_entry+0x389/0x460
Modules linked in:
Workqueue: netns cleanup_net
RIP: 0010:remove_proc_entry+0x389/0x460
Call Trace:
<TASK>
__ip_vs_cleanup_batch+0x7d/0x120
ops_exit_list+0x125/0x170
cleanup_net+0x4ea/0xb00
process_one_work+0x9bf/0x1710
worker_thread+0x665/0x1080
kthread+0x2e4/0x3a0
ret_from_fork+0x1f/0x30
</TASK>

Fixes: 61b1ab4 ("IPVS: netns, add basic init per netns.")
Signed-off-by: Zhengchao Shao <[email protected]>
Acked-by: Julian Anastasov <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
(cherry picked from commit 3d00c6a)
Signed-off-by: Marcin Wcisło <[email protected]>

jira VULN-66461 cve CVE-2022-49917 commit-author Zhengchao Shao <[email protected]> commit 5663ed6

During the initialization of ip_vs_app_net_init(), if file ip_vs_app fails to be created, the initialization is successful by default. Therefore, the ip_vs_app file doesn't be found during the remove in ip_vs_app_net_cleanup(). It will cause WRNING. The following is the stack information:

name 'ip_vs_app'
WARNING: CPU: 1 PID: 9 at fs/proc/generic.c:712 remove_proc_entry+0x389/0x460
Modules linked in:
Workqueue: netns cleanup_net
RIP: 0010:remove_proc_entry+0x389/0x460
Call Trace:
<TASK>
ops_exit_list+0x125/0x170
cleanup_net+0x4ea/0xb00
process_one_work+0x9bf/0x1710
worker_thread+0x665/0x1080
kthread+0x2e4/0x3a0
ret_from_fork+0x1f/0x30
</TASK>

Fixes: 457c4cb ("[NET]: Make /proc/net per network namespace")
Signed-off-by: Zhengchao Shao <[email protected]>
Acked-by: Julian Anastasov <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
(cherry picked from commit 5663ed6)
Signed-off-by: Marcin Wcisło <[email protected]>

…ries
jira VULN-157585 cve CVE-2023-53549 commit-author Jozsef Kadlecsik <[email protected]> commit 5e29dc3

When adding/deleting large number of elements in one step in ipset, it can take a reasonable amount of time and can result in soft lockup errors. The patch 5f7b51b ("netfilter: ipset: Limit the maximal range of consecutive elements to add/delete") tried to fix it by limiting the max elements to process at all. However it was not enough, it is still possible that we get hung tasks. Lowering the limit is not reasonable, so the approach in this patch is as follows: rely on the method used at resizing sets and save the state when we reach a smaller internal batch limit, unlock/lock and proceed from the saved state. Thus we can avoid long continuous tasks and at the same time removed the limit to add/delete large number of elements in one step.

The nfnl mutex is held during the whole operation which prevents one to issue other ipset commands in parallel.

Fixes: 5f7b51b ("netfilter: ipset: Limit the maximal range of consecutive elements to add/delete")
Reported-by: [email protected]
Signed-off-by: Jozsef Kadlecsik <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
(cherry picked from commit 5e29dc3)
Signed-off-by: Marcin Wcisło <[email protected]>

…sion
jira VULN-430 cve-pre CVE-2023-4244 commit-author Florian Westphal <[email protected]> commit 2b272bb

When using a xfrm interface in a bridged setup (the outgoing device is bridged), the incoming packets in the xfrm interface are only tracked in the outgoing direction.

$ brctl show
bridge name	interfaces
br_eth1		eth1

$ conntrack -L
tcp 115 SYN_SENT src=192... dst=192... [UNREPLIED] ...

If br_netfilter is enabled, the first (encrypted) packet is received on eth1, conntrack hooks are called from br_netfilter emulation which allocates nf_bridge info for this skb. If the packet is for local machine, skb gets passed up the ip stack. The skb passes through ip prerouting a second time. br_netfilter ip_sabotage_in supresses the re-invocation of the hooks.

After this, skb gets decrypted in xfrm layer and appears in network stack a second time (after decryption). Then, ip_sabotage_in is called again and suppresses netfilter hook invocation, even though the bridge layer never called them for the plaintext incarnation of the packet.

Free the bridge info after the first suppression to avoid this. I was unable to figure out where the regression comes from, as far as i can see br_netfilter always had this problem; i did not expect that skb is looped again with different headers.

Fixes: c4b0e77 ("netfilter: avoid using skb->nf_bridge directly")
Reported-and-tested-by: Wolfgang Nothdurft <[email protected]>
Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
(cherry picked from commit 2b272bb)
Signed-off-by: Marcin Wcisło <[email protected]>

jira VULN-430 cve-pre CVE-2023-4244 commit-author Florian Westphal <[email protected]> commit 94623f5

Recent attempt to ensure PREROUTING hook is executed again when a decrypted ipsec packet received on a bridge passes through the network stack a second time broke the physdev match in INPUT hook.

We can't discard the nf_bridge info struct from sabotage_in hook, as this is needed by the physdev match. Keep the struct around and handle this with another conditional instead.

Fixes: 2b272bb ("netfilter: br_netfilter: disable sabotage_in hook after first suppression")
Reported-and-tested-by: Farid BENAMROUCHE <[email protected]>
Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
(cherry picked from commit 94623f5)
Signed-off-by: Marcin Wcisło <[email protected]>

jira VULN-430 cve-pre CVE-2023-4244 commit-author Florian Westphal <[email protected]> commit e58a171

We are not allowed to return an error at this point. Looking at the code it looks like ret is always 0 at this point, but its not.

t = find_table_lock(net, repl->name, &ret, &ebt_mutex);

... this can return a valid table, with ret != 0. This bug causes update of table->private with the new blob, but then frees the blob right away in the caller.

Syzbot report:

BUG: KASAN: vmalloc-out-of-bounds in __ebt_unregister_table+0xc00/0xcd0 net/bridge/netfilter/ebtables.c:1168
Read of size 4 at addr ffffc90005425000 by task kworker/u4:4/74
Workqueue: netns cleanup_net
Call Trace:
kasan_report+0xbf/0x1f0 mm/kasan/report.c:517
__ebt_unregister_table+0xc00/0xcd0 net/bridge/netfilter/ebtables.c:1168
ebt_unregister_table+0x35/0x40 net/bridge/netfilter/ebtables.c:1372
ops_exit_list+0xb0/0x170 net/core/net_namespace.c:169
cleanup_net+0x4ee/0xb10 net/core/net_namespace.c:613
...

ip(6)tables appears to be ok (ret should be 0 at this point) but make this more obvious.

Fixes: c58dd2d ("netfilter: Can't fail and free after table replacement")
Reported-by: [email protected]
Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
(cherry picked from commit e58a171)
Signed-off-by: Marcin Wcisło <[email protected]>

jira VULN-34024 cve CVE-2022-48641 commit-author Florian Westphal <[email protected]> commit 62ce44c

The bug fix was incomplete, it "replaced" crash with a memory leak. The old code had an assignment to "ret" embedded into the conditional, restore this.

Fixes: 7997eff ("netfilter: ebtables: reject blobs that don't provide all entry points")
Reported-and-tested-by: [email protected]
Signed-off-by: Florian Westphal <[email protected]>
(cherry picked from commit 62ce44c)
Signed-off-by: Marcin Wcisło <[email protected]>

jira VULN-430 cve-pre CVE-2023-4244 commit-author Florian Westphal <[email protected]> commit 4a02426

The xtables packet traverser performs an unconditional local_bh_disable(), but the nf_tables evaluation loop does not. Functions that are called from either xtables or nftables must assume that they can be called in process context.

inet_twsk_deschedule_put() assumes that no softirq interrupt can occur. If tproxy is used from nf_tables its possible that we'll deadlock trying to aquire a lock already held in process context.

Add a small helper that takes care of this and use it.

Link: https://lore.kernel.org/netfilter-devel/[email protected]/
Fixes: 4ed8eb6 ("netfilter: nf_tables: Add native tproxy support")
Reported-and-tested-by: Major Dávid <[email protected]>
Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
(cherry picked from commit 4a02426)
Signed-off-by: Marcin Wcisło <[email protected]>

jira VULN-430 cve-pre CVE-2023-4244 commit-author Pablo Neira Ayuso <[email protected]> commit 33c7aba
upstream-diff Actually picked the 9.4 backport c085531 which applies cleanly without context conflicts

Elements with an end interval flag set on do not store extensions. The global set definition is currently setting on the timeout and stateful expression for end interval elements. This leads to skipping end interval elements from the set->ops->walk() path as the expired check bogusly reports true.

Moreover, do not set up stateful expressions for elements with end interval flag set on since this is never used.

Fixes: 6503842 ("netfilter: nf_tables: allow to specify stateful expression in set definition")
Fixes: 8d8540c ("netfilter: nft_set_rbtree: add timeout support")
Signed-off-by: Pablo Neira Ayuso <[email protected]>
(cherry picked from commit 33c7aba)
Signed-off-by: Marcin Wcisło <[email protected]>

…tion
jira VULN-430 cve-pre CVE-2023-4244 commit-author Pablo Neira Ayuso <[email protected]> commit c9e6978

...instead of a tree descent, which became overly complicated in an attempt to cover cases where expired or inactive elements would affect comparisons with the new element being inserted.

Further, it turned out that it's probably impossible to cover all those cases, as inactive nodes might entirely hide subtrees consisting of a complete interval plus a node that makes the current insertion not overlap.

To speed up the overlap check, descent the tree to find a greater element that is closer to the key value to insert. Then walk down the node list for overlap detection. Starting the overlap check from rb_first() unconditionally is slow, it takes 10 times longer due to the full linear traversal of the list.

Moreover, perform garbage collection of expired elements when walking down the node list to avoid bogus overlap reports.

For the insertion operation itself, this essentially reverts back to the implementation before commit 7c84d41 ("netfilter: nft_set_rbtree: Detect partial overlaps on insertion"), except that cases of complete overlap are already handled in the overlap detection phase itself, which slightly simplifies the loop to find the insertion point.

Based on initial patch from Stefano Brivio, including text from the original patch description too.

Fixes: 7c84d41 ("netfilter: nft_set_rbtree: Detect partial overlaps on insertion")
Reviewed-by: Stefano Brivio <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
(cherry picked from commit c9e6978)
Signed-off-by: Marcin Wcisło <[email protected]>

…collection
jira VULN-430 cve-pre CVE-2023-4244 commit-author Pablo Neira Ayuso <[email protected]> commit 5d235d6

Skip interference with an ongoing transaction, do not perform garbage collection on inactive elements. Reset annotated previous end interval if the expired element is marked as busy (control plane removed the element right before expiration).

Fixes: 8d8540c ("netfilter: nft_set_rbtree: add timeout support")
Reviewed-by: Stefano Brivio <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
(cherry picked from commit 5d235d6)
Signed-off-by: Marcin Wcisło <[email protected]>

…once
jira VULN-430 cve CVE-2023-4244 commit-author Florian Westphal <[email protected]> commit c9bd265

nft -f - <<EOF
add table ip t
add table ip t { flags dormant; }
add chain ip t c { type filter hook input priority 0; }
add table ip t
EOF

Triggers a splat from nf core on next table delete because we lose track of right hook register state:

WARNING: CPU: 2 PID: 1597 at net/netfilter/core.c:501 __nf_unregister_net_hook
RIP: 0010:__nf_unregister_net_hook+0x41b/0x570
nf_unregister_net_hook+0xb4/0xf0
__nf_tables_unregister_hook+0x160/0x1d0
[..]

The above should have table in *active* state, but in fact no hooks were registered.

Reject on/off/on games rather than attempting to fix this.

Fixes: 179d9ba ("netfilter: nf_tables: fix table flag updates")
Reported-by: "Lee, Cherie-Anne" <[email protected]>
Cc: Bing-Jhong Billy Jheng <[email protected]>
Cc: [email protected]
Signed-off-by: Florian Westphal <[email protected]>
(cherry picked from commit c9bd265)
Signed-off-by: Marcin Wcisło <[email protected]>

jira VULN-598 cve CVE-2023-52581 commit-author Florian Westphal <[email protected]> commit cf5000a

When more than 255 elements expired we're supposed to switch to a new gc container structure. This never happens: u8 type will wrap before reaching the boundary and nft_trans_gc_space() always returns true.

This means we recycle the initial gc container structure and lose track of the elements that came before.

While at it, don't deref 'gc' after we've passed it to call_rcu.

Fixes: 5f68718 ("netfilter: nf_tables: GC transaction API to avoid race with control plane")
Reported-by: Pablo Neira Ayuso <[email protected]>
Signed-off-by: Florian Westphal <[email protected]>
(cherry picked from commit cf5000a)
Signed-off-by: Marcin Wcisło <[email protected]>

jira VULN-430 cve CVE-2023-4244 commit-author Florian Westphal <[email protected]> commit 0873882

nft_rbtree_gc_elem() walks back and removes the end interval element that comes before the expired element.

There is a small chance that we've cached this element as 'rbe_ge'. If this happens, we hold and test a pointer that has been queued for freeing.

It also causes spurious insertion failures:

$ cat test-testcases-sets-0044interval_overlap_0.1/testout.log
Error: Could not process rule: File exists
add element t s { 0 - 2 }
^^^^^^

Failed to insert 0 - 2 given:

table ip t {
	set s {
		type inet_service
		flags interval,timeout
		timeout 2s
		gc-interval 2s
	}
}

The set (rbtree) is empty. The 'failure' doesn't happen on next attempt.

Reason is that when we try to insert, the tree may hold an expired element that collides with the range we're adding. While we do evict/erase this element, we can trip over this check:

if (rbe_ge && nft_rbtree_interval_end(rbe_ge) && nft_rbtree_interval_end(new))
	return -ENOTEMPTY;

rbe_ge was erased by the synchronous gc, we should not have done this check. Next attempt won't find it, so retry results in successful insertion. Restart in-kernel to avoid such spurious errors.

Such restart are rare, unless userspace intentionally adds very large numbers of elements with very short timeouts while setting a huge gc interval. Even in this case, this cannot loop forever, on each retry an existing element has been removed.

As the caller is holding the transaction mutex, its impossible for a second entity to add more expiring elements to the tree.

After this it also becomes feasible to remove the async gc worker and perform all garbage collection from the commit path.

Fixes: c9e6978 ("netfilter: nft_set_rbtree: Switch to node list walk for overlap detection")
Signed-off-by: Florian Westphal <[email protected]>
(cherry picked from commit 0873882)
Signed-off-by: Marcin Wcisło <[email protected]>

jira VULN-430 cve CVE-2023-4244 commit-author Florian Westphal <[email protected]> commit d2fd2e4
upstream-diff There is no upstream commit to diff with. Picked directly from RH's branch. See message below for details.

JIRA: https://issues.redhat.com/browse/RHEL-1720
JIRA: https://issues.redhat.com/browse/RHEL-1721
Upstream Status: RHEL only

RHEL only. Proposed upstream but was rejected. I don't think we can force a rebase of nftables userland in RHEL <= 9.4. Even if we can do this, we would still need this change for z-stream. This change SHOULD NOT be forwarded into versions later than RHEL 9.4. For those releases nftables userspace should be updated to release 1.0.7 or later instead.

nftables versions prior to commit 3975430b12d9 ("src: expand table command before evaluation"), i.e. 1.0.6 and earlier, will handle the following snippet in the wrong order:

table ip t {
	chain c {
		jump { counter; }
	}
}

1. create the table, chain c and an anon chain.
2. append a rule to chain c to jump to the anon chain.
3. append the rule(s) (here: "counter") to the anon chain.

(step 3 should be before 2).

With below commit, this is now rejected by the kernel. Reason is that the 'jump {' rule added to chain c adds an explicit binding (dependency), i.e. the kernel will automatically remove the anon chain when userspace later asks to delete the 'jump {' rule from chain c.

This caused crashes in the kernel in case of a errors further down in the same transaction. The abort part has to unroll all pending changes, including the request to add the rule 'jump {'. As its already bound, all the rules added to it get deleted as well. Because we tolerated late-add-after-bind, the transaction log also contains the NEWRULE requests (here "counter"), so those were deleted again.

Instead of rejecting newrule-to-bound-chain, allow it iff the anon chain is new in this transaction and we are appending. Mark the newrule transaction as already_bound so abort path skips them.

Fixes: 0ebc106 ("netfilter: nf_tables: disallow rule addition to bound chain via NFTA_RULE_CHAIN_ID")
Reported-by: Timo Sigurdsson <[email protected]>
Closes: https://lore.kernel.org/netfilter-devel/[email protected]/
Signed-off-by: Florian Westphal <[email protected]>
Signed-off-by: Florian Westphal <[email protected]>
(cherry picked from commit d2fd2e4)
Signed-off-by: Marcin Wcisło <[email protected]>

…lush
jira VULN-430 cve CVE-2023-4244 commit-author Pablo Neira Ayuso <[email protected]> commit 26cec9d

Use the element object that is already offered instead.

Signed-off-by: Pablo Neira Ayuso <[email protected]>
(cherry picked from commit 26cec9d)
Signed-off-by: Marcin Wcisło <[email protected]>

jira VULN-430 cve CVE-2023-4244 commit-author Pablo Neira Ayuso <[email protected]> commit 6509a2e

.flush is always successful since this results from iterating over the set elements to toggle mark the element as inactive in the next generation.

Signed-off-by: Pablo Neira Ayuso <[email protected]>
(cherry picked from commit 6509a2e)
Signed-off-by: Marcin Wcisło <[email protected]>

jira VULN-430 cve CVE-2023-4244 commit-author Pablo Neira Ayuso <[email protected]> commit 9dad402
upstream-diff Context conflict with the cve fix 5d4bb57 (wrong application order).

Add placeholder structure and place it at the beginning of each struct nft_*_elem for each existing set backend, instead of exposing elements as void type to the frontend which defeats compiler type checks. Use this pointer to this new type to replace void *.

This patch updates the following set backend API to use this new struct nft_elem_priv placeholder structure:

- update
- deactivate
- flush
- get

as well as the following helper functions:

- nft_set_elem_ext()
- nft_set_elem_init()
- nft_set_elem_destroy()
- nf_tables_set_elem_destroy()

This patch adds nft_elem_priv_cast() to cast struct nft_elem_priv to native element representation from the corresponding set backend. BUILD_BUG_ON() makes sure this .priv placeholder is always at the top of the opaque set element representation.

Suggested-by: Florian Westphal <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
(cherry picked from commit 9dad402)
Signed-off-by: Marcin Wcisło <[email protected]>
Force-pushed b413376 to 740e770 (compare)
@PlaidCat So how about the "tagging policy"? Do we keep the CVE tags as they are? (I hope not.) The issue is how to cve-tag commits which do not address any specific CVE and which are neither bugfixes for any specific CVE nor prerequisites for one (or for which establishing that would be impractical and wouldn't even reflect the workflow).
In this sort of case feel free to change those to basically what we did here:
Is there more? Let me know and I'll process this.
I can make one, but I'll need all the merges you're basing this off of, and I can create a ticket.
But that's not a subsystem sync. A subsystem sync has a specific codebase endpoint to achieve as a goal. Here no specific codebase revision was achieved, nor was that ever the goal, so both the version number (whatever is chosen) and the implied methodology would be misleading. I was careful not to do the actual sync, which would be over 3x the volume of this merge. This is just a backport, albeit a big one: "wholesale backporting", if you like. I don't think it has a precedent in Rocky patching, or at least I didn't see one in the public PRs.
Please see the comment #668 (comment).
In the case of this specific PR it's quite clear: the effort was driven by the CVE-2023-4244 fix, which has the associated jira ticket. I was asking for a more general rule covering less clear-cut cases, like the next
Maybe something like: a reference to the CentOS Stream 9 merge commit that we are emulating. Just spitballing.
Commit is not a clean cherry pick due to missing
Similar for
Those are the commits no.
Notice the values in the
Also from the "Backporting strategy":
It should perhaps be added that, in the case of clean picks from mainline, the commits were equivalent to those in 9.4, so source-code-wise it made no difference whether the pick was from mainline or 9.4 (so we're not doing some wild
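That equivalence claim is checkable mechanically with `git patch-id`, which hashes only the diff content and ignores SHAs and commit messages. A minimal sketch in a throwaway repository (the "mainline" and "backport" commits are synthetic stand-ins):

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo
echo base > f.txt; git add f.txt; git commit -qm "base"
base=$(git rev-parse HEAD)
# Same change committed twice, on two divergent lines of history:
echo fix >> f.txt; git commit -qam "mainline version"
mainline=$(git rev-parse HEAD)
git checkout -q "$base"
echo fix >> f.txt; git commit -qam "9.4 backport version"
backport=$(git rev-parse HEAD)
# patch-id --stable prints "<patch-id> <commit-id>"; compare field 1:
id_m=$(git show "$mainline" | git patch-id --stable | cut -d' ' -f1)
id_b=$(git show "$backport" | git patch-id --stable | cut -d' ' -f1)
[ "$id_m" = "$id_b" ] && echo "diff content identical"
```

Equal patch IDs confirm that picking from mainline or from 9.4 produced the same code change.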
I like this one. I would add this line even to those commits which do have
I meant you should update the commit message to reflect that they are not clean cherry-picks. At the moment we don't have a way to differentiate between multiple sources, so even if it's a clean cherry-pick from, let's say, 9.4, an upstream-diff mention should be added. You did so for the other cherry-picks.
Ok, so the issue is the lack of
cve CVE-2023-6111 commit-author Pablo Neira Ayuso <[email protected]> commit 93995bf
upstream-diff Used the cleanly applying 9.4 backport 54e39cc

The expired catchall element is not deactivated and removed from GC sync path. This path holds mutex so just call nft_setelem_data_deactivate() and nft_setelem_catchall_remove() before queueing the GC work.

Fixes: 4a9e12e ("netfilter: nft_set_pipapo: call nft_trans_gc_queue_sync() in catchall GC")
Reported-by: lonial con <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
(cherry picked from commit 93995bf)
Signed-off-by: Marcin Wcisło <[email protected]>

jira VULN-430 cve CVE-2023-4244 commit-author Pablo Neira Ayuso <[email protected]> commit 8837ba3
upstream-diff Used the cleanly applying 9.4 backport 45aad05

list_for_each_entry_safe() does not work for the async case which runs under RCU, therefore, split GC logic for catchall in two functions instead, one for each of the sync and async GC variants. The catchall sync GC variant never sees a _DEAD bit set on ever, thus, this handling is removed in such case, moreover, allocate GC sync batch via GFP_KERNEL.

Fixes: 93995bf ("netfilter: nf_tables: remove catchall element in GC sync path")
Reported-by: Florian Westphal <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
(cherry picked from commit 8837ba3)
Signed-off-by: Marcin Wcisło <[email protected]>

…_entry()
jira VULN-430 cve-bf CVE-2023-4244 commit-author Ilya Leoshkevich <[email protected]> commit 837723b

bpf_nf testcase fails on s390x: bpf_skb_ct_lookup() cannot find the entry that was added by bpf_ct_insert_entry() within the same BPF function.

The reason is that this entry is deleted by nf_ct_gc_expired(). The CT timeout starts ticking after the CT confirmation; therefore nf_conn.timeout is initially set to the timeout value, and __nf_conntrack_confirm() sets it to the deadline value.

bpf_ct_insert_entry() sets IPS_CONFIRMED_BIT, but does not adjust the timeout, making its value meaningless and causing false positives.

Fix the problem by making bpf_ct_insert_entry() adjust the timeout, like __nf_conntrack_confirm().

Fixes: 2cdaa3e ("netfilter: conntrack: restore IPS_CONFIRMED out of nf_conntrack_hash_check_insert()")
Signed-off-by: Ilya Leoshkevich <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Cc: Florian Westphal <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
(cherry picked from commit 837723b)
Signed-off-by: Marcin Wcisło <[email protected]>

jira VULN-34700 cve CVE-2023-42756 commit-author Jozsef Kadlecsik <[email protected]> commit 7433b6d

Kyle Zeng reported that there is a race between IPSET_CMD_ADD and IPSET_CMD_SWAP in netfilter/ip_set, which can lead to the invocation of `__ip_set_put` on a wrong `set`, triggering the `BUG_ON(set->ref == 0);` check in it.

The race is caused by using the wrong reference counter, i.e. the ref counter instead of ref_netlink.

Fixes: 24e2278 ("netfilter: ipset: Add schedule point in call_ad().")
Reported-by: Kyle Zeng <[email protected]>
Closes: https://lore.kernel.org/netfilter-devel/ZPZqetxOmH+w%2Fmyc@westworld/#r
Tested-by: Kyle Zeng <[email protected]>
Signed-off-by: Jozsef Kadlecsik <[email protected]>
Signed-off-by: Florian Westphal <[email protected]>
(cherry picked from commit 7433b6d)
Signed-off-by: Marcin Wcisło <[email protected]>

jira VULN-8184 cve CVE-2024-26581 commit-author Pablo Neira Ayuso <[email protected]> commit 60c0c23

rbtree lazy gc on insert might collect an end interval element that has been just added in this transactions, skip end interval elements that are not yet active.

Fixes: f718863 ("netfilter: nft_set_rbtree: fix overlap expiration walk")
Cc: [email protected]
Reported-by: lonial con <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
(cherry picked from commit 60c0c23)
Signed-off-by: Marcin Wcisło <[email protected]>

…ith timeout
jira VULN-836 cve CVE-2024-26643 commit-author Pablo Neira Ayuso <[email protected]> commit 552705a

While the rhashtable set gc runs asynchronously, a race allows it to collect elements from anonymous sets with timeouts while it is being released from the commit path.

Mingi Cho originally reported this issue in a different path in 6.1.x with a pipapo set with low timeouts which is not possible upstream since 7395dfa ("netfilter: nf_tables: use timestamp to check for set element timeout").

Fix this by setting on the dead flag for anonymous sets to skip async gc in this case.

According to 08e4c8c ("netfilter: nf_tables: mark newset as dead on transaction abort"), Florian plans to accelerate abort path by releasing objects via workqueue, therefore, this sets on the dead flag for abort path too.

Cc: [email protected]
Fixes: 5f68718 ("netfilter: nf_tables: GC transaction API to avoid race with control plane")
Reported-by: Mingi Cho <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
(cherry picked from commit 552705a)
Signed-off-by: Marcin Wcisło <[email protected]>

jira VULN-4906 cve-pre CVE-2024-26925 commit-author Pablo Neira Ayuso <[email protected]> commit a45e688

Unlike early commit path stage which triggers a call to abort, an explicit release of the batch is required on abort, otherwise mutex is released and commit_list remains in place.

Add WARN_ON_ONCE to ensure commit_list is empty from the abort path before releasing the mutex.

After this patch, commit_list is always assumed to be empty before grabbing the mutex, therefore 03c1f1e ("netfilter: Cleanup nft_net->module_list from nf_tables_exit_net()") only needs to release the pending modules for registration.

Cc: [email protected]
Fixes: c0391b6 ("netfilter: nf_tables: missing validation from the abort path")
Signed-off-by: Pablo Neira Ayuso <[email protected]>
(cherry picked from commit a45e688)
Signed-off-by: Marcin Wcisło <[email protected]>

jira VULN-4906 cve CVE-2024-26925 commit-author Pablo Neira Ayuso <[email protected]> commit 0d459e2

The commit mutex should not be released during the critical section between nft_gc_seq_begin() and nft_gc_seq_end(), otherwise, async GC worker could collect expired objects and get the released commit lock within the same GC sequence.

nf_tables_module_autoload() temporarily releases the mutex to load module dependencies, then it goes back to replay the transaction again. Move it at the end of the abort phase after nft_gc_seq_end() is called.

Cc: [email protected]
Fixes: 7203443 ("netfilter: nf_tables: GC transaction race with abort path")
Reported-by: Kuan-Ting Chen <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
(cherry picked from commit 0d459e2)
Signed-off-by: Marcin Wcisło <[email protected]>
Actually, yea I like the idea of putting that in all of the commits. What do others think? |
I'm ok with this |
I ran interdiff on this PR and attached the output here: netfilter-interdiff.txt
Running interdiff for backport 995d7c47d49fd and upstream e6d57e9ff0aec ("netfilter: conntrack: fix rmmod double-free race")
only in patch2:
unchanged:
--- a/net/netfilter/nf_conntrack_bpf.c
+++ b/net/netfilter/nf_conntrack_bpf.c
@@ -381,7 +381,6 @@ struct nf_conn *bpf_ct_insert_entry(struct nf_conn___init *nfct_i)
struct nf_conn *nfct = (struct nf_conn *)nfct_i;
int err;
- nfct->status |= IPS_CONFIRMED;
err = nf_conntrack_hash_check_insert(nfct);
if (err < 0) {
nf_conntrack_free(nfct);
DONE interdiff for backport 995d7c47d49fd and upstream e6d57e9ff0aec ("netfilter: conntrack: fix rmmod double-free race")

Interdiff says that the upstream patch has that hunk while the backport does not.
struct nf_conn *bpf_ct_insert_entry(struct nf_conn___init *nfct_i)
{
struct nf_conn *nfct = (struct nf_conn *)nfct_i;
int err;
if (!nf_ct_is_confirmed(nfct))
nfct->timeout += nfct_time_stamp;
nfct->status |= IPS_CONFIRMED;
err = nf_conntrack_hash_check_insert(nfct);
if (err < 0) {
nf_conntrack_free(nfct);
return NULL;
}
return nfct;
}

EDIT: Oh, I just noticed the upstream-diff message and the followup
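For reference, the hunk-level comparison quoted above can be reproduced with `interdiff` from patchutils; the sketch below approximates it with plain `diff -u` on two small hypothetical patch files (their contents are illustrative stand-ins, not the actual backport and upstream patches):

```shell
# Sketch: compare a backport patch against the upstream patch, the way
# interdiff does, approximated here with diff -u on two hypothetical
# patch files (contents are illustrative only).
tmp=$(mktemp -d)

cat > "$tmp/backport.patch" <<'EOF'
--- a/net/netfilter/nf_conntrack_bpf.c
+++ b/net/netfilter/nf_conntrack_bpf.c
@@ -1,3 +1,4 @@
+    if (!nf_ct_is_confirmed(nfct))
+        nfct->timeout += nfct_time_stamp;
     err = nf_conntrack_hash_check_insert(nfct);
EOF

cat > "$tmp/upstream.patch" <<'EOF'
--- a/net/netfilter/nf_conntrack_bpf.c
+++ b/net/netfilter/nf_conntrack_bpf.c
@@ -1,3 +1,5 @@
+    if (!nf_ct_is_confirmed(nfct))
+        nfct->timeout += nfct_time_stamp;
+    nfct->status |= IPS_CONFIRMED;
     err = nf_conntrack_hash_check_insert(nfct);
EOF

# A line present only in the upstream patch shows up as an addition
# here, mirroring interdiff's "only in patch2" output above.
out=$(diff -u "$tmp/backport.patch" "$tmp/upstream.patch" || true)
printf '%s\n' "$out"
rm -rf "$tmp"
```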
[LTS 9.2]
Fixed:
CVE-2022-48641 VULN-34024
CVE-2022-49917 VULN-66461
CVE-2022-49918 VULN-66488
CVE-2023-3777 VULN-6585
CVE-2023-4015 VULN-6613
CVE-2023-4244 VULN-430
CVE-2023-5197 VULN-34732
CVE-2023-52620 VULN-645
CVE-2023-52923 VULN-158865
CVE-2023-52924 VULN-158864
CVE-2023-53549 VULN-157585
CVE-2023-53635 VULN-158282
CVE-2023-7192 VULN-6817
Introduced:
CVE-2024-27012
CVE-2024-27017
CVE-2024-42109
CVE-2024-0193 VULN-6825
Introduced & fixed:
CVE-2023-42756 VULN-34700
CVE-2023-52433 VULN-158866
CVE-2023-52581 VULN-598
CVE-2023-52925 VULN-158863
CVE-2023-53304 VULN-155012
CVE-2023-53566 VULN-157645
CVE-2023-6111
CVE-2024-26581 VULN-8184
CVE-2024-26643 VULN-836
CVE-2024-26925 VULN-4906
Background
This PR was driven by CVE-2023-4244. The official upstream fix is given in the merge commit a2dd023. The merged branch of 5 commits had to be ported in whole. Although the commits fixed the CVE-2023-4244 problem, they introduced multiple new ones. Additionally, the LTS 9.2 netfilter codebase was not suited for the backport.

The CVE-2023-4244 bug was already addressed in LTS 9.4 as part of the CentOS 9 "netlink api rebase" merge request. It was decided to port that merge request in whole, as it had already dealt with a similar unsuitability of the fix to the netfilter codebase of CentOS 9 at that time, and with the subsequent fixes. From the CentOS MR:

Some of the involved commits required prerequisites contained in other, similar branches, ported in whole as well. Coupled with a few "loose" bugfixes the solution grew to almost 100 commits, although without any custom adaptations (clean cherry-picks).

The ported commits brought in the code modified by the CVE-2024-26581 fix, which had been put on hold before because that code was missing from the ciqlts9_2 codebase. The fix for CVE-2024-26581 was therefore included in this PR as well. All other CVEs were solved as a byproduct of backporting the CentOS 9 branches and some of their loose bugfixes.

Solution strategy
Backporting strategy
Strategy for backporting CVE fixes, their prerequisites, and their bug fixes:
ciqlts9_4 branch. For all commits obtained that way the kernel-mainline SHA was nevertheless used in the commit property, and the upstream-diff comment was included with a reference to the ciqlts9_4 SHA actually used for the backport. The picks from 9.4 were only used if the picks from mainline generated conflicts. One commit (d2fd2e4) didn't even have an upstream equivalent.

ciqlts9_2 in next PRs.

There were some exceptions to these rules, each of them addressed briefly in the Comment column of the Commits table.
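The pick order described above (mainline first, fall back to the 9.4 backport only on conflict) can be sketched as a small git workflow; the repository, branch names and commit below are throwaway stand-ins, not the actual trees:

```shell
# Hypothetical sketch of the pick order: try the mainline commit first,
# and only if it conflicts, abort and pick the 9.4 backport instead.
# Demonstrated on a throwaway repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git checkout -q -b main
git -c user.name=t -c user.email=t@t commit -q --allow-empty -m base

# "mainline" branch carries the original fix
git checkout -q -b mainline
echo fix > code.c
git add code.c
git -c user.name=t -c user.email=t@t commit -q -m 'netfilter: example fix'
mainline_sha=$(git rev-parse HEAD)

# target branch plays the role of ciqlts9_2
git checkout -q main
git checkout -q -b ciqlts9_2

if git -c user.name=t -c user.email=t@t cherry-pick "$mainline_sha" >/dev/null 2>&1; then
  result="clean ch-p mainline"
else
  git cherry-pick --abort
  # here one would cherry-pick the ciqlts9_4 backport SHA instead
  result="fall back to 9.4 pick"
fi
echo "$result"
```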
Tagging policy
Because of the "full branches" rule many commits incorporated in the PR contain fixes to bugs which don't have any specific CVE assigned. While some of them might have been identified as bugfixes for commits which do have a specific CVE assigned (easy), or as their prerequisites (difficult), it was decided not to delve into these details and to simplify the tagging.
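As a purely hypothetical illustration of the tag format used in the commit messages of this PR (the Jira id, CVE and `<sha>` placements below are made up):

```
jira VULN-430 cve-pre CVE-2023-4244 commit <sha>   (prerequisite, before the official fix)
jira VULN-430 cve CVE-2023-4244 commit <sha>       (official fix and later commits of branch 1)
jira VULN-430 cve-bf CVE-2023-4244 commit <sha>    (later bugfix for commits in branches 1, 2, 3)
```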
cve and jira properties.

cve-pre CVE-2023-4244: if they appear before the commit netfilter: nf_tables: GC transaction API to avoid race with control plane (official CVE-2023-4244 fix), or
cve CVE-2023-4244: for all the commits after, as long as they still belong to branch 1 (see Branches), or
cve-bf CVE-2023-4244: for all the commits after branch 1 if they fix any of the commits in branches 1, 2, 3.

Conflicts
With the help of the LTS 9.4 backports the solution for LTS 9.2 was obtained without any meaningful conflicts. The only ones occurred when backporting 9dad402 and 2413893, and they stemmed from the 5d4bb57 fix present in ciqlts9_2, which disturbed the chronology of changes. The conflicts might have been avoided entirely if that fix had been reverted before applying the commits in this PR and re-applied after, but that would result in an odd history. The conflicts were solved manually instead, preserving the meaning of the changes.

Solution details
Tables
The internal structure of the solution can best be described by the Commits table. It's augmented with the CVEs, Branches and Commit titles tables, providing some details about the entities referenced in the Commits table.

The Commits and CVEs tables were aimed to fully enclose the PR, which means:

Commits contains all commits from the PR (but not only them),
Commits contains all "Fixes" commits of every commit in the PR,
Commits contains all "Fixed by" commits of every commit in the PR,
CVEs contains all CVEs associated with every commit in the PR,
Commits contains all commits associated with each CVE from the CVEs table (which may fall outside of the PR).

In this way a full picture of all solved and introduced problems can be painted.
Fields
Some cell values have universal meaning in all tables:
-: The value was considered / checked and it either doesn't apply, the value is missing, or the associated list is empty, depending on column context.

Commits
Columns:
kernel-mainline branch
ciqlts9_4 branch
ciqlts9_2 branch
Id from the previous column.
1, 2, …, 7: CentOS 9 merge branches. For more information see the Branches table. Branches 1, 2, 3 are shown in full (all their commits included), others - not necessarily.
0: The commit is part of CentOS 9 history but isn't a part of any merged branch.
-: The commit is not even present in 9.4 history, so talking about a "CentOS 9" branch is meaningless.
H: Special "branch" id to identify commits belonging to ciqlts9_2 history.
-: Commit not part of the PR.
Clean ch-p mainline: The commit was cherry-picked cleanly directly from the kernel-mainline branch.
Clean ch-p 9.4: The commit was cherry-picked cleanly from the ciqlts9_4 backported version.
Conflict mainline: The cherry-pick of the kernel-mainline commit resulted in conflicts, and likewise the ciqlts9_4 version. See Conflicts for details about these cases.
(F): Fixing.
(I): Introducing.
(?): Not established whether the commit was fixing or introducing this CVE.

The sequence preserves the ordering of commits in this PR, and of commits in branches 1, 2, 3 (from the most recent). Within the B:- and B:H groups the commits are ordered by the commit date of the kernel-mainline version (from the most recent).

CVE-2023-4563 (F), CVE-2023-52924 (F), CVE-2023-52925 (I)
CVE-2023-4563 (I), CVE-2023-52923 (I), CVE-2023-52924 (I), CVE-2023-6817 (?), CVE-2024-26924 (I), CVE-2024-57947 (I), CVE-2025-38162 (I)
CVE-2023-4563 (I), CVE-2023-52923 (I), CVE-2023-52924 (I)
CVE-2023-4563 (I), CVE-2023-52923 (I), CVE-2023-52924 (I)
The following table provides the link between
the Id column in the Commits table and the commit's title, for identification purposes.

CVEs
Columns:
Branches
Columns:
B in the Commits table
Commits table.

kABI check: passed
Boot test: passed
boot-test.log
Testing
Testing focused on running the selftests from the netfilter test suite under different debugging conditions.

The tests set
The netfilter subsystem has its dedicated netfilter selftests suite. The table below summarizes all netfilter selftests available in ciqlts9_2 and their evaluation status.

Functional tests: passed
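For context, the netfilter selftests are normally driven through the kselftest make target from a kernel source tree; a guarded sketch (the tree location and the KSRC variable are assumptions):

```shell
# Sketch: run the netfilter kselftests from a kernel source tree.
# The tree location is an assumption (override with KSRC); the helper
# degrades to a message when the selftests are not present.
run_netfilter_selftests() {
  src=${1:-.}
  if [ ! -d "$src/tools/testing/selftests/netfilter" ]; then
    echo "netfilter selftests not found under $src"
    return 0
  fi
  # requires a configured tree and usually root privileges
  make -C "$src/tools/testing/selftests" TARGETS=netfilter run_tests
}

run_netfilter_selftests "${KSRC:-/usr/src/linux}"
```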
Reference
kselftests–ciqlts9_2–run1.log
kselftests–ciqlts9_2–run2.log
kselftests–ciqlts9_2–run3.log
Patch
kselftests–ciqlts9_2-CVE-batch-10–run1.log
kselftests–ciqlts9_2-CVE-batch-10–run2.log
kselftests–ciqlts9_2-CVE-batch-10–run3.log
kselftests–ciqlts9_2-CVE-batch-10–run4.log
kselftests–ciqlts9_2-CVE-batch-10–run5.log
kselftests–ciqlts9_2-CVE-batch-10–run6.log
Comparison
KFENCE tests: passed
All tests in Functional tests were run on kernel with default (enabled) settings for KFENCE:
In case of any KFENCE errors a message like
would occur in the dmesg logs. No such messages were found.
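The check can be sketched as a small log-scan helper; "BUG: KFENCE" is the standard first line of a KFENCE error report, while the log file path is an assumption:

```shell
# Sketch: count KFENCE error reports in a saved kernel log. KFENCE
# reports start with lines like "BUG: KFENCE: out-of-bounds read ...".
# The log source is a parameter so the helper works on dmesg output
# redirected to a file.
kfence_errors() {
  grep -c 'BUG: KFENCE' "$1" || true   # grep -c prints 0 on no match
}

# typical use (path hypothetical):
#   dmesg > /tmp/dmesg.log
#   kfence_errors /tmp/dmesg.log       # expect 0 on a clean run
```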
KASAN tests: passed
The patched kernel was compiled with the following options changed:
The test set from Functional testing was run 50 times with KASAN debug messages switched on. No KASAN logs appeared in the kernel's console.
kasan-console.log
To ensure that KASAN was set up properly, the KUnit tests for KASAN were run.
kasan-self-test.log
Kmemleak tests: passed
The patched kernel was compiled with the following options changed:
The test set from Functional testing was run 50 times. A kmemleak scan was triggered at the end and the logs were read.
No memory leaks were found.
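The scan-and-read step uses kmemleak's debugfs interface; a guarded sketch that degrades gracefully when kmemleak is unavailable:

```shell
# Sketch: trigger a kmemleak scan and read back any reports via the
# kmemleak debugfs file, guarded so it degrades gracefully when
# debugfs/kmemleak is not available (e.g. in a container).
kmemleak_scan() {
  k=/sys/kernel/debug/kmemleak
  if [ ! -w "$k" ]; then
    echo "kmemleak not available"
    return 0
  fi
  echo scan > "$k"   # start a memory scan
  sleep 5            # give the scanner time to finish
  cat "$k"           # empty output means no leaks reported
}
```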
KCSAN tests: passed (relative to reference)
The patched kernel ciqlts9_2-CVE-batch-10 and the reference kernel ciqlts9_2 were compiled with KCSAN enabled:

The test set from Functional testing was run a couple of times on the patched kernel, with some KCSAN bugs reported:
kcsan-patch.log
Likewise, the same set of tests was run on the reference kernel:
kcsan-reference.log
The data race errors reported can be summarized as follows, where
x means the error occurred and - that it did not.

No data race was encountered in the patched kernel that was not also encountered in the reference kernel. Additionally, judging by the stack traces, none of the races detected seemed to be associated with the netfilter subsystem.
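The comparison above can be produced mechanically from the saved console logs; a sketch grouping KCSAN reports by their first line ("BUG: KCSAN: data-race in A / B" is the standard report header, the log path is hypothetical):

```shell
# Sketch: summarize KCSAN reports in a saved console log by counting
# occurrences of each distinct report header line.
kcsan_summary() {
  grep 'BUG: KCSAN: data-race' "$1" | sort | uniq -c | sort -rn
}

# typical use (path hypothetical):
#   kcsan_summary kcsan-patch.log
#   kcsan_summary kcsan-reference.log
```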