Skip to content

Commit 0b2de89

Browse files
committed
use uniform permission checks for all mount propagation changes
jira LE-4311 cve CVE-2025-38498 Rebuild_History Non-Buildable kernel-5.14.0-570.46.1.el9_6 commit-author Al Viro <[email protected]> commit cffd044 Empty-Commit: Cherry-Pick Conflicts during history rebuild. Will be included in final tarball splat. Ref for failed cherry-pick at: ciq/ciq_backports/kernel-5.14.0-570.46.1.el9_6/cffd0441.failed do_change_type() and do_set_group() are operating on different aspects of the same thing - propagation graph. The latter asks for mounts involved to be mounted in namespace(s) the caller has CAP_SYS_ADMIN for. The former is a mess - originally it didn't even check that mount *is* mounted. That got fixed, but the resulting check turns out to be too strict for userland - in effect, we check that mount is in our namespace, having already checked that we have CAP_SYS_ADMIN there. What we really need (in both cases) is * only touch mounts that are mounted. That's a must-have constraint - data corruption happens if it get violated. * don't allow to mess with a namespace unless you already have enough permissions to do so (i.e. CAP_SYS_ADMIN in its userns). That's an equivalent of what do_set_group() does; let's extract that into a helper (may_change_propagation()) and use it in both do_set_group() and do_change_type(). Fixes: 12f147d "do_change_type(): refuse to operate on unmounted/not ours mounts" Acked-by: Andrei Vagin <[email protected]> Reviewed-by: Pavel Tikhomirov <[email protected]> Tested-by: Pavel Tikhomirov <[email protected]> Reviewed-by: Christian Brauner <[email protected]> Signed-off-by: Al Viro <[email protected]> (cherry picked from commit cffd044) Signed-off-by: Jonathan Maple <[email protected]> # Conflicts: # fs/namespace.c
1 parent 808d2aa commit 0b2de89

File tree

1 file changed

+328
-0
lines changed

1 file changed

+328
-0
lines changed
Lines changed: 328 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,328 @@
1+
use uniform permission checks for all mount propagation changes
2+
3+
jira LE-4311
4+
cve CVE-2025-38498
5+
Rebuild_History Non-Buildable kernel-5.14.0-570.46.1.el9_6
6+
commit-author Al Viro <[email protected]>
7+
commit cffd0441872e7f6b1fce5e78fb1c99187a291330
8+
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
9+
Will be included in final tarball splat. Ref for failed cherry-pick at:
10+
ciq/ciq_backports/kernel-5.14.0-570.46.1.el9_6/cffd0441.failed
11+
12+
do_change_type() and do_set_group() are operating on different
13+
aspects of the same thing - propagation graph. The latter
14+
asks for mounts involved to be mounted in namespace(s) the caller
15+
has CAP_SYS_ADMIN for. The former is a mess - originally it
16+
didn't even check that mount *is* mounted. That got fixed,
17+
but the resulting check turns out to be too strict for userland -
18+
in effect, we check that mount is in our namespace, having already
19+
checked that we have CAP_SYS_ADMIN there.
20+
21+
What we really need (in both cases) is
22+
* only touch mounts that are mounted. That's a must-have
23+
constraint - data corruption happens if it get violated.
24+
* don't allow to mess with a namespace unless you already
25+
have enough permissions to do so (i.e. CAP_SYS_ADMIN in its userns).
26+
27+
That's an equivalent of what do_set_group() does; let's extract that
28+
into a helper (may_change_propagation()) and use it in both
29+
do_set_group() and do_change_type().
30+
31+
Fixes: 12f147ddd6de "do_change_type(): refuse to operate on unmounted/not ours mounts"
32+
Acked-by: Andrei Vagin <[email protected]>
33+
Reviewed-by: Pavel Tikhomirov <[email protected]>
34+
Tested-by: Pavel Tikhomirov <[email protected]>
35+
Reviewed-by: Christian Brauner <[email protected]>
36+
Signed-off-by: Al Viro <[email protected]>
37+
(cherry picked from commit cffd0441872e7f6b1fce5e78fb1c99187a291330)
38+
Signed-off-by: Jonathan Maple <[email protected]>
39+
40+
# Conflicts:
41+
# fs/namespace.c
42+
diff --cc fs/namespace.c
43+
index 4ec9c03ab924,88db58061919..000000000000
44+
--- a/fs/namespace.c
45+
+++ b/fs/namespace.c
46+
@@@ -2278,9 -2856,22 +2278,22 @@@ static int graft_tree(struct mount *mnt
47+
d_is_dir(mnt->mnt.mnt_root))
48+
return -ENOTDIR;
49+
50+
- return attach_recursive_mnt(mnt, p, mp);
51+
+ return attach_recursive_mnt(mnt, p, mp, false);
52+
}
53+
54+
+ static int may_change_propagation(const struct mount *m)
55+
+ {
56+
+ struct mnt_namespace *ns = m->mnt_ns;
57+
+
58+
+ // it must be mounted in some namespace
59+
+ if (IS_ERR_OR_NULL(ns)) // is_mounted()
60+
+ return -EINVAL;
61+
+ // and the caller must be admin in userns of that namespace
62+
+ if (!ns_capable(ns->user_ns, CAP_SYS_ADMIN))
63+
+ return -EPERM;
64+
+ return 0;
65+
+ }
66+
+
67+
/*
68+
* Sanity check the flags to change_mnt_propagation.
69+
*/
70+
@@@ -2671,29 -3347,238 +2684,121 @@@ static inline int tree_contains_unbinda
71+
return 0;
72+
}
73+
74+
++<<<<<<< HEAD
75+
++=======
76+
+ static int do_set_group(struct path *from_path, struct path *to_path)
77+
+ {
78+
+ struct mount *from, *to;
79+
+ int err;
80+
+
81+
+ from = real_mount(from_path->mnt);
82+
+ to = real_mount(to_path->mnt);
83+
+
84+
+ namespace_lock();
85+
+
86+
+ err = may_change_propagation(from);
87+
+ if (err)
88+
+ goto out;
89+
+ err = may_change_propagation(to);
90+
+ if (err)
91+
+ goto out;
92+
+
93+
+ err = -EINVAL;
94+
+ /* To and From paths should be mount roots */
95+
+ if (!path_mounted(from_path))
96+
+ goto out;
97+
+ if (!path_mounted(to_path))
98+
+ goto out;
99+
+
100+
+ /* Setting sharing groups is only allowed across same superblock */
101+
+ if (from->mnt.mnt_sb != to->mnt.mnt_sb)
102+
+ goto out;
103+
+
104+
+ /* From mount root should be wider than To mount root */
105+
+ if (!is_subdir(to->mnt.mnt_root, from->mnt.mnt_root))
106+
+ goto out;
107+
+
108+
+ /* From mount should not have locked children in place of To's root */
109+
+ if (__has_locked_children(from, to->mnt.mnt_root))
110+
+ goto out;
111+
+
112+
+ /* Setting sharing groups is only allowed on private mounts */
113+
+ if (IS_MNT_SHARED(to) || IS_MNT_SLAVE(to))
114+
+ goto out;
115+
+
116+
+ /* From should not be private */
117+
+ if (!IS_MNT_SHARED(from) && !IS_MNT_SLAVE(from))
118+
+ goto out;
119+
+
120+
+ if (IS_MNT_SLAVE(from)) {
121+
+ hlist_add_behind(&to->mnt_slave, &from->mnt_slave);
122+
+ to->mnt_master = from->mnt_master;
123+
+ }
124+
+
125+
+ if (IS_MNT_SHARED(from)) {
126+
+ to->mnt_group_id = from->mnt_group_id;
127+
+ list_add(&to->mnt_share, &from->mnt_share);
128+
+ set_mnt_shared(to);
129+
+ }
130+
+
131+
+ err = 0;
132+
+ out:
133+
+ namespace_unlock();
134+
+ return err;
135+
+ }
136+
+
137+
+ /**
138+
+ * path_overmounted - check if path is overmounted
139+
+ * @path: path to check
140+
+ *
141+
+ * Check if path is overmounted, i.e., if there's a mount on top of
142+
+ * @path->mnt with @path->dentry as mountpoint.
143+
+ *
144+
+ * Context: namespace_sem must be held at least shared.
145+
+ * MUST NOT be called under lock_mount_hash() (there one should just
146+
+ * call __lookup_mnt() and check if it returns NULL).
147+
+ * Return: If path is overmounted true is returned, false if not.
148+
+ */
149+
+ static inline bool path_overmounted(const struct path *path)
150+
+ {
151+
+ unsigned seq = read_seqbegin(&mount_lock);
152+
+ bool no_child;
153+
+
154+
+ rcu_read_lock();
155+
+ no_child = !__lookup_mnt(path->mnt, path->dentry);
156+
+ rcu_read_unlock();
157+
+ if (need_seqretry(&mount_lock, seq)) {
158+
+ read_seqlock_excl(&mount_lock);
159+
+ no_child = !__lookup_mnt(path->mnt, path->dentry);
160+
+ read_sequnlock_excl(&mount_lock);
161+
+ }
162+
+ return unlikely(!no_child);
163+
+ }
164+
+
165+
++>>>>>>> cffd0441872e (use uniform permission checks for all mount propagation changes)
166+
/*
167+
- * Check if there is a possibly empty chain of descent from p1 to p2.
168+
- * Locks: namespace_sem (shared) or mount_lock (read_seqlock_excl).
169+
- */
170+
-static bool mount_is_ancestor(const struct mount *p1, const struct mount *p2)
171+
-{
172+
- while (p2 != p1 && mnt_has_parent(p2))
173+
- p2 = p2->mnt_parent;
174+
- return p2 == p1;
175+
-}
176+
-
177+
-/**
178+
- * can_move_mount_beneath - check that we can mount beneath the top mount
179+
- * @from: mount to mount beneath
180+
- * @to: mount under which to mount
181+
- * @mp: mountpoint of @to
182+
- *
183+
- * - Make sure that @to->dentry is actually the root of a mount under
184+
- * which we can mount another mount.
185+
- * - Make sure that nothing can be mounted beneath the caller's current
186+
- * root or the rootfs of the namespace.
187+
- * - Make sure that the caller can unmount the topmost mount ensuring
188+
- * that the caller could reveal the underlying mountpoint.
189+
- * - Ensure that nothing has been mounted on top of @from before we
190+
- * grabbed @namespace_sem to avoid creating pointless shadow mounts.
191+
- * - Prevent mounting beneath a mount if the propagation relationship
192+
- * between the source mount, parent mount, and top mount would lead to
193+
- * nonsensical mount trees.
194+
- *
195+
- * Context: This function expects namespace_lock() to be held.
196+
- * Return: On success 0, and on error a negative error code is returned.
197+
- */
198+
-static int can_move_mount_beneath(const struct path *from,
199+
- const struct path *to,
200+
- const struct mountpoint *mp)
201+
-{
202+
- struct mount *mnt_from = real_mount(from->mnt),
203+
- *mnt_to = real_mount(to->mnt),
204+
- *parent_mnt_to = mnt_to->mnt_parent;
205+
-
206+
- if (!mnt_has_parent(mnt_to))
207+
- return -EINVAL;
208+
-
209+
- if (!path_mounted(to))
210+
- return -EINVAL;
211+
-
212+
- if (IS_MNT_LOCKED(mnt_to))
213+
- return -EINVAL;
214+
-
215+
- /* Avoid creating shadow mounts during mount propagation. */
216+
- if (path_overmounted(from))
217+
- return -EINVAL;
218+
-
219+
- /*
220+
- * Mounting beneath the rootfs only makes sense when the
221+
- * semantics of pivot_root(".", ".") are used.
222+
- */
223+
- if (&mnt_to->mnt == current->fs->root.mnt)
224+
- return -EINVAL;
225+
- if (parent_mnt_to == current->nsproxy->mnt_ns->root)
226+
- return -EINVAL;
227+
-
228+
- if (mount_is_ancestor(mnt_to, mnt_from))
229+
- return -EINVAL;
230+
-
231+
- /*
232+
- * If the parent mount propagates to the child mount this would
233+
- * mean mounting @mnt_from on @mnt_to->mnt_parent and then
234+
- * propagating a copy @c of @mnt_from on top of @mnt_to. This
235+
- * defeats the whole purpose of mounting beneath another mount.
236+
- */
237+
- if (propagation_would_overmount(parent_mnt_to, mnt_to, mp))
238+
- return -EINVAL;
239+
-
240+
- /*
241+
- * If @mnt_to->mnt_parent propagates to @mnt_from this would
242+
- * mean propagating a copy @c of @mnt_from on top of @mnt_from.
243+
- * Afterwards @mnt_from would be mounted on top of
244+
- * @mnt_to->mnt_parent and @mnt_to would be unmounted from
245+
- * @mnt->mnt_parent and remounted on @mnt_from. But since @c is
246+
- * already mounted on @mnt_from, @mnt_to would ultimately be
247+
- * remounted on top of @c. Afterwards, @mnt_from would be
248+
- * covered by a copy @c of @mnt_from and @c would be covered by
249+
- * @mnt_from itself. This defeats the whole purpose of mounting
250+
- * @mnt_from beneath @mnt_to.
251+
- */
252+
- if (check_mnt(mnt_from) &&
253+
- propagation_would_overmount(parent_mnt_to, mnt_from, mp))
254+
- return -EINVAL;
255+
-
256+
- return 0;
257+
-}
258+
-
259+
-/* may_use_mount() - check if a mount tree can be used
260+
- * @mnt: vfsmount to be used
261+
- *
262+
- * This helper checks if the caller may use the mount tree starting
263+
- * from @path->mnt. The caller may use the mount tree under the
264+
- * following circumstances:
265+
- *
266+
- * (1) The caller is located in the mount namespace of the mount tree.
267+
- * This also implies that the mount does not belong to an anonymous
268+
- * mount namespace.
269+
- * (2) The caller is trying to use a mount tree that belongs to an
270+
- * anonymous mount namespace.
271+
- *
272+
- * For that to be safe, this helper enforces that the origin mount
273+
- * namespace the anonymous mount namespace was created from is the
274+
- * same as the caller's mount namespace by comparing the sequence
275+
- * numbers.
276+
- *
277+
- * The ownership of a non-anonymous mount namespace such as the
278+
- * caller's cannot change.
279+
- * => We know that the caller's mount namespace is stable.
280+
- *
281+
- * If the origin sequence number of the anonymous mount namespace is
282+
- * the same as the sequence number of the caller's mount namespace.
283+
- * => The owning namespaces are the same.
284+
- *
285+
- * ==> The earlier capability check on the owning namespace of the
286+
- * caller's mount namespace ensures that the caller has the
287+
- * ability to use the mount tree.
288+
- *
289+
- * Returns true if the mount tree can be used, false otherwise.
290+
+ * Check that there aren't references to earlier/same mount namespaces in the
291+
+ * specified subtree. Such references can act as pins for mount namespaces
292+
+ * that aren't checked by the mount-cycle checking code, thereby allowing
293+
+ * cycles to be made.
294+
*/
295+
-static inline bool may_use_mount(struct mount *mnt)
296+
+static bool check_for_nsfs_mounts(struct mount *subtree)
297+
{
298+
- if (check_mnt(mnt))
299+
- return true;
300+
+ struct mount *p;
301+
+ bool ret = false;
302+
303+
- /*
304+
- * Make sure that noone unmounted the target path or somehow
305+
- * managed to get their hands on something purely kernel
306+
- * internal.
307+
- */
308+
- if (!is_mounted(&mnt->mnt))
309+
- return false;
310+
+ lock_mount_hash();
311+
+ for (p = subtree; p; p = next_mnt(p, subtree))
312+
+ if (mnt_ns_loop(p->mnt.mnt_root))
313+
+ goto out;
314+
315+
- return check_anonymous_mnt(mnt);
316+
+ ret = true;
317+
+out:
318+
+ unlock_mount_hash();
319+
+ return ret;
320+
}
321+
322+
-static int do_move_mount(struct path *old_path,
323+
- struct path *new_path, enum mnt_tree_flags_t flags)
324+
+static int do_move_mount(struct path *old_path, struct path *new_path)
325+
{
326+
struct mnt_namespace *ns;
327+
struct mount *p;
328+
* Unmerged path fs/namespace.c

0 commit comments

Comments
 (0)