Commit 8913970

xzpeter authored and torvalds committed
mm/userfaultfd: selftests: fix memory corruption with thp enabled
In RHEL's gating selftests we've encountered memory corruption in the uffd
event test even with the upstream kernel:

        # ./userfaultfd anon 128 4
        nr_pages: 32768, nr_pages_per_cpu: 32768
        bounces: 3, mode: rnd racing read, userfaults: 6240 missing (6240) 14729 wp (14729)
        bounces: 2, mode: racing read, userfaults: 1444 missing (1444) 28877 wp (28877)
        bounces: 1, mode: rnd read, userfaults: 6055 missing (6055) 14699 wp (14699)
        bounces: 0, mode: read, userfaults: 82 missing (82) 25196 wp (25196)
        testing uffd-wp with pagemap (pgsize=4096): done
        testing uffd-wp with pagemap (pgsize=2097152): done
        testing events (fork, remap, remove): ERROR: nr 32427 memory corruption 0 1 (errno=0, line=963)
        ERROR: faulting process failed (errno=0, line=1117)

It can be easily reproduced when THP is globally enabled, which is the
default for RHEL.

It's also known as a side effect of commit 0db282b ("selftest: use mmap
instead of posix_memalign to allocate memory", 2021-07-23), which is IMHO
right in itself in using mmap() to make sure the addresses will be untagged
even on arm.

The problem is that for each test we allocate buffers using two
allocate_area() calls. We assumed these two buffers won't affect each
other. However, they could, because mmap() could have placed the two
buffers next to each other with the same VMA flags, so they got merged
into one VMA.

It won't be a big problem if THP is not enabled, but when THP is
aggressively enabled, initializing the src buffer could accidentally set
up part of the dest buffer too, when there's a shared THP that overlaps
the two regions. Then some of the dest buffer can't be trapped by
userfaultfd missing mode, which causes the memory corruption described
above.

To fix it, do release_pages() after initializing the src buffer.

Since the previous two release_pages() calls are after
uffd_test_ctx_clear(), which will unmap all the buffers anyway (which is
stronger than releasing pages, as unmap() also tears down pgtables), drop
them as they shouldn't really be doing anything useful.

We can mark the Fixes tag upon 0db282b as it's reported to only happen
there. However, the real "Fixes" IMHO should be 8ba6e86, as before that
commit we'd always do explicit release_pages() before registration of
uffd, and 8ba6e86 changed that logic by adding an extra unmap/map, and we
didn't release the pages at the right place. Meanwhile I don't have a
solid clue anyway on whether posix_memalign() could always avoid
triggering this bug, hence it's safer to attach this fix to commit
8ba6e86.

Link: https://lkml.kernel.org/r/[email protected]
Fixes: 8ba6e86 ("userfaultfd/selftests: reinitialize test context in each test")
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1994931
Signed-off-by: Peter Xu <[email protected]>
Reported-by: Li Wang <[email protected]>
Tested-by: Li Wang <[email protected]>
Reviewed-by: Axel Rasmussen <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Nadav Amit <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
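
The overlap condition described in the message can be checked with plain address arithmetic. The sketch below is illustrative only and is not part of the selftest: the helper names are hypothetical, and it assumes two equally sized areas and a fixed huge page size (2 MiB, matching the pgsize=2097152 in the log above). It only answers whether an aligned huge-page slot contains bytes of both areas. Whether the kernel actually installs a THP there also depends on VMA merging and the THP settings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: index of the aligned huge-page slot holding addr. */
static uint64_t hpage_slot(uint64_t addr, uint64_t hpage_size)
{
	return addr / hpage_size;
}

/*
 * Hypothetical helper: true if at least one aligned huge-page slot
 * contains bytes of both [src, src + len) and [dst, dst + len).
 * The two areas are assumed not to overlap each other.
 */
static bool areas_share_hpage_slot(uint64_t src, uint64_t dst,
				   uint64_t len, uint64_t hpage_size)
{
	assert(len > 0 && hpage_size > 0);

	uint64_t src_first = hpage_slot(src, hpage_size);
	uint64_t src_last = hpage_slot(src + len - 1, hpage_size);
	uint64_t dst_first = hpage_slot(dst, hpage_size);
	uint64_t dst_last = hpage_slot(dst + len - 1, hpage_size);

	/* The two slot ranges intersect iff each starts before the other ends. */
	return src_first <= dst_last && dst_first <= src_last;
}
```

For example, with two adjacent 128 MiB areas (32768 pages of 4096 bytes, as in the failing run) whose boundary sits 1 MiB into a 2 MiB slot, such as dst = 0x7f0000100000 and src = 0x7f0008100000, the helper returns true. With a 2 MiB aligned boundary (dst = 0x7f0000000000, src = 0x7f0008000000) it returns false. The addresses are made up for illustration.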
1 parent 519d819 commit 8913970

1 file changed, +20 -3 lines changed

tools/testing/selftests/vm/userfaultfd.c

Lines changed: 20 additions & 3 deletions
@@ -414,9 +414,6 @@ static void uffd_test_ctx_init_ext(uint64_t *features)
 	uffd_test_ops->allocate_area((void **)&area_src);
 	uffd_test_ops->allocate_area((void **)&area_dst);
 
-	uffd_test_ops->release_pages(area_src);
-	uffd_test_ops->release_pages(area_dst);
-
 	userfaultfd_open(features);
 
 	count_verify = malloc(nr_pages * sizeof(unsigned long long));
@@ -437,6 +434,26 @@ static void uffd_test_ctx_init_ext(uint64_t *features)
 		*(area_count(area_src, nr) + 1) = 1;
 	}
 
+	/*
+	 * After initialization of area_src, we must explicitly release pages
+	 * for area_dst to make sure it's fully empty. Otherwise we could have
+	 * some area_dst pages be errornously initialized with zero pages,
+	 * hence we could hit memory corruption later in the test.
+	 *
+	 * One example is when THP is globally enabled, above allocate_area()
+	 * calls could have the two areas merged into a single VMA (as they
+	 * will have the same VMA flags so they're mergeable). When we
+	 * initialize the area_src above, it's possible that some part of
+	 * area_dst could have been faulted in via one huge THP that will be
+	 * shared between area_src and area_dst. It could cause some of the
+	 * area_dst won't be trapped by missing userfaults.
+	 *
+	 * This release_pages() will guarantee even if that happened, we'll
+	 * proactively split the thp and drop any accidentally initialized
+	 * pages within area_dst.
+	 */
+	uffd_test_ops->release_pages(area_dst);
+
 	pipefd = malloc(sizeof(int) * nr_cpus * 2);
 	if (!pipefd)
 		err("pipefd");
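
The effect the new release_pages(area_dst) call relies on can be sketched in isolation. The helpers below are hypothetical stand-ins, not the selftest's own code. They assume that, for private anonymous memory, releasing pages amounts to madvise(MADV_DONTNEED), which drops the backing pages while keeping the mapping. A cheap way to observe this without userfaultfd is that previously written bytes read back as zero afterwards.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical stand-in for allocate_area(): a private anonymous mapping. */
static void *alloc_anon(size_t len)
{
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	return p == MAP_FAILED ? NULL : p;
}

/*
 * Hypothetical stand-in for release_pages() on anonymous memory
 * (assumption: it boils down to MADV_DONTNEED). The mapping stays
 * valid, and the next access faults in fresh pages.
 */
static int release_anon(void *area, size_t len)
{
	/* madvise() requires a page-aligned start address. */
	assert((uintptr_t)area % (uintptr_t)sysconf(_SC_PAGESIZE) == 0);

	return madvise(area, len, MADV_DONTNEED);
}

/* True if every byte in [p, p + len) reads as zero. */
static bool all_zero(const unsigned char *p, size_t len)
{
	for (size_t i = 0; i < len; i++)
		if (p[i])
			return false;
	return true;
}
```

A caller would map an area with alloc_anon(), dirty it with memset() to mimic pages that were faulted in by accident, call release_anon(), and then see all_zero() return true. In the selftest the point is not the zero contents but that the dropped range is empty again, so accesses to it raise missing-mode userfaults.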
