
Commit 63895d2

hygoniakpm00 authored and committed
mm/zswap: fix inconsistency when zswap_store_page() fails
Commit b7c0ccd ("mm: zswap: support large folios in zswap_store()") skips
charging any zswap entries when it failed to zswap the entire folio. However,
when some base pages are zswapped but it failed to zswap the entire folio,
the zswap operation is rolled back. When freeing zswap entries for those
pages, zswap_entry_free() uncharges the zswap entries that were not
previously charged, causing zswap charging to become inconsistent.

This inconsistency triggers two warnings with the following steps:

  # On a machine with 64GiB of RAM and 36GiB of zswap
  $ stress-ng --bigheap 2 # wait until the OOM-killer kills stress-ng
  $ sudo reboot

The two warnings are:

  in mm/memcontrol.c:163, function obj_cgroup_release():
    WARN_ON_ONCE(nr_bytes & (PAGE_SIZE - 1));

  in mm/page_counter.c:60, function page_counter_cancel():
    if (WARN_ONCE(new < 0, "page_counter underflow: %ld nr_pages=%lu\n",
        new, nr_pages))

zswap_stored_pages also becomes inconsistent in the same way.

As suggested by Kanchana, increment zswap_stored_pages and charge zswap
entries within zswap_store_page() when it succeeds. This way,
zswap_entry_free() will decrement the counter and uncharge the entries
when it failed to zswap the entire folio.

While this could potentially be optimized by batching objcg charging and
incrementing the counter, let's focus on fixing the bug this time and
leave the optimization for later after some evaluation.

After resolving the inconsistency, the warnings disappear.
[[email protected]: refactor zswap_store_page()]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Fixes: b7c0ccd ("mm: zswap: support large folios in zswap_store()")
Co-developed-by: Kanchana P Sridhar <[email protected]>
Signed-off-by: Kanchana P Sridhar <[email protected]>
Signed-off-by: Hyeonggon Yoo <[email protected]>
Acked-by: Yosry Ahmed <[email protected]>
Acked-by: Nhat Pham <[email protected]>
Cc: Chengming Zhou <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
1 parent f4b7826 commit 63895d2


mm/zswap.c

Lines changed: 16 additions & 19 deletions
@@ -1445,9 +1445,9 @@ static void shrink_worker(struct work_struct *w)
 * main API
 **********************************/

-static ssize_t zswap_store_page(struct page *page,
-				struct obj_cgroup *objcg,
-				struct zswap_pool *pool)
+static bool zswap_store_page(struct page *page,
+			     struct obj_cgroup *objcg,
+			     struct zswap_pool *pool)
 {
 	swp_entry_t page_swpentry = page_swap_entry(page);
 	struct zswap_entry *entry, *old;
@@ -1456,7 +1456,7 @@ static ssize_t zswap_store_page(struct page *page,
 	entry = zswap_entry_cache_alloc(GFP_KERNEL, page_to_nid(page));
 	if (!entry) {
 		zswap_reject_kmemcache_fail++;
-		return -EINVAL;
+		return false;
 	}

 	if (!zswap_compress(page, entry, pool))
@@ -1483,13 +1483,17 @@ static ssize_t zswap_store_page(struct page *page,

 	/*
 	 * The entry is successfully compressed and stored in the tree, there is
-	 * no further possibility of failure. Grab refs to the pool and objcg.
-	 * These refs will be dropped by zswap_entry_free() when the entry is
-	 * removed from the tree.
+	 * no further possibility of failure. Grab refs to the pool and objcg,
+	 * charge zswap memory, and increment zswap_stored_pages.
+	 * The opposite actions will be performed by zswap_entry_free()
+	 * when the entry is removed from the tree.
 	 */
 	zswap_pool_get(pool);
-	if (objcg)
+	if (objcg) {
 		obj_cgroup_get(objcg);
+		obj_cgroup_charge_zswap(objcg, entry->length);
+	}
+	atomic_long_inc(&zswap_stored_pages);

 	/*
 	 * We finish initializing the entry while it's already in xarray.
@@ -1510,13 +1514,13 @@ static ssize_t zswap_store_page(struct page *page,
 		zswap_lru_add(&zswap_list_lru, entry);
 	}

-	return entry->length;
+	return true;

 store_failed:
 	zpool_free(pool->zpool, entry->handle);
 compress_failed:
 	zswap_entry_cache_free(entry);
-	return -EINVAL;
+	return false;
 }

 bool zswap_store(struct folio *folio)
@@ -1526,7 +1530,6 @@ bool zswap_store(struct folio *folio)
 	struct obj_cgroup *objcg = NULL;
 	struct mem_cgroup *memcg = NULL;
 	struct zswap_pool *pool;
-	size_t compressed_bytes = 0;
 	bool ret = false;
 	long index;

@@ -1564,20 +1567,14 @@ bool zswap_store(struct folio *folio)

 	for (index = 0; index < nr_pages; ++index) {
 		struct page *page = folio_page(folio, index);
-		ssize_t bytes;

-		bytes = zswap_store_page(page, objcg, pool);
-		if (bytes < 0)
+		if (!zswap_store_page(page, objcg, pool))
 			goto put_pool;
-		compressed_bytes += bytes;
 	}

-	if (objcg) {
-		obj_cgroup_charge_zswap(objcg, compressed_bytes);
+	if (objcg)
 		count_objcg_events(objcg, ZSWPOUT, nr_pages);
-	}

-	atomic_long_add(nr_pages, &zswap_stored_pages);
 	count_vm_events(ZSWPOUT, nr_pages);

 	ret = true;
