From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
patches@lists.linux.dev, Yosry Ahmed <yosryahmed@google.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Sam Sun <samsun1006219@gmail.com>, Barry Song <baohua@kernel.org>,
Chengming Zhou <chengming.zhou@linux.dev>,
Kanchana P Sridhar <kanchana.p.sridhar@intel.com>,
Nhat Pham <nphamcs@gmail.com>, Vitaly Wool <vitalywool@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>
Subject: [PATCH 6.12 15/40] mm: zswap: properly synchronize freeing resources during CPU hotunplug
Date: Thu, 30 Jan 2025 14:59:15 +0100 [thread overview]
Message-ID: <20250130133500.322711653@linuxfoundation.org> (raw)
In-Reply-To: <20250130133459.700273275@linuxfoundation.org>
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yosry Ahmed <yosryahmed@google.com>
commit 12dcb0ef540629a281533f9dedc1b6b8e14cfb65 upstream.
In zswap_compress() and zswap_decompress(), the per-CPU acomp_ctx of the
current CPU at the beginning of the operation is retrieved and used
throughout. However, since neither preemption nor migration are disabled,
it is possible that the operation continues on a different CPU.
If the original CPU is hotunplugged while the acomp_ctx is still in use,
we run into a UAF bug as some of the resources attached to the acomp_ctx
are freed during hotunplug in zswap_cpu_comp_dead() (i.e.
acomp_ctx.buffer, acomp_ctx.req, or acomp_ctx.acomp).
The problem was introduced in commit 1ec3b5fe6eec ("mm/zswap: move to use
crypto_acomp API for hardware acceleration") when the switch to the
crypto_acomp API was made. Prior to that, the per-CPU crypto_comp was
retrieved using get_cpu_ptr() which disables preemption and makes sure the
CPU cannot go away from under us. Preemption cannot be disabled with the
crypto_acomp API as a sleepable context is needed.
Use the acomp_ctx.mutex to synchronize CPU hotplug callbacks allocating
and freeing resources with compression/decompression paths. Make sure
that acomp_ctx.req is NULL when the resources are freed. In the
compression/decompression paths, check if acomp_ctx.req is NULL after
acquiring the mutex (meaning the CPU was offlined) and retry on the new
CPU.
The initialization of acomp_ctx.mutex is moved from the CPU hotplug
callback to the pool initialization where it belongs (where the mutex is
allocated). In addition to adding clarity, this makes sure that CPU
hotplug cannot reinitialize a mutex that is already locked by
compression/decompression.
Previously a fix was attempted by holding cpus_read_lock() [1]. This
would have caused a potential deadlock as it is possible for code already
holding the lock to fall into reclaim and enter zswap (causing a
deadlock). A fix was also attempted using SRCU for synchronization, but
Johannes pointed out that synchronize_srcu() cannot be used in CPU hotplug
notifiers [2].
Alternative fixes that were considered/attempted and could have worked:
- Refcounting the per-CPU acomp_ctx. This involves complexity in
handling the race between the refcount dropping to zero in
zswap_[de]compress() and the refcount being re-initialized when the
CPU is onlined.
- Disabling migration before getting the per-CPU acomp_ctx [3], but
that's discouraged and is a much bigger hammer than needed, and could
result in subtle performance issues.
[1]https://lkml.kernel.org/20241219212437.2714151-1-yosryahmed@google.com/
[2]https://lkml.kernel.org/20250107074724.1756696-2-yosryahmed@google.com/
[3]https://lkml.kernel.org/20250107222236.2715883-2-yosryahmed@google.com/
[yosryahmed@google.com: remove comment]
Link: https://lkml.kernel.org/r/CAJD7tkaxS1wjn+swugt8QCvQ-rVF5RZnjxwPGX17k8x9zSManA@mail.gmail.com
Link: https://lkml.kernel.org/r/20250108222441.3622031-1-yosryahmed@google.com
Fixes: 1ec3b5fe6eec ("mm/zswap: move to use crypto_acomp API for hardware acceleration")
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Reported-by: Johannes Weiner <hannes@cmpxchg.org>
Closes: https://lore.kernel.org/lkml/20241113213007.GB1564047@cmpxchg.org/
Reported-by: Sam Sun <samsun1006219@gmail.com>
Closes: https://lore.kernel.org/lkml/CAEkJfYMtSdM5HceNsXUDf5haghD5+o2e7Qv4OcuruL4tPg6OaQ@mail.gmail.com/
Cc: Barry Song <baohua@kernel.org>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Kanchana P Sridhar <kanchana.p.sridhar@intel.com>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Vitaly Wool <vitalywool@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
mm/zswap.c | 58 ++++++++++++++++++++++++++++++++++++++++++++--------------
1 file changed, 44 insertions(+), 14 deletions(-)
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -251,7 +251,7 @@ static struct zswap_pool *zswap_pool_cre
struct zswap_pool *pool;
char name[38]; /* 'zswap' + 32 char (max) num + \0 */
gfp_t gfp = __GFP_NORETRY | __GFP_NOWARN | __GFP_KSWAPD_RECLAIM;
- int ret;
+ int ret, cpu;
if (!zswap_has_pool) {
/* if either are unset, pool initialization failed, and we
@@ -285,6 +285,9 @@ static struct zswap_pool *zswap_pool_cre
goto error;
}
+ for_each_possible_cpu(cpu)
+ mutex_init(&per_cpu_ptr(pool->acomp_ctx, cpu)->mutex);
+
ret = cpuhp_state_add_instance(CPUHP_MM_ZSWP_POOL_PREPARE,
&pool->node);
if (ret)
@@ -816,11 +819,12 @@ static int zswap_cpu_comp_prepare(unsign
struct acomp_req *req;
int ret;
- mutex_init(&acomp_ctx->mutex);
-
+ mutex_lock(&acomp_ctx->mutex);
acomp_ctx->buffer = kmalloc_node(PAGE_SIZE * 2, GFP_KERNEL, cpu_to_node(cpu));
- if (!acomp_ctx->buffer)
- return -ENOMEM;
+ if (!acomp_ctx->buffer) {
+ ret = -ENOMEM;
+ goto buffer_fail;
+ }
acomp = crypto_alloc_acomp_node(pool->tfm_name, 0, 0, cpu_to_node(cpu));
if (IS_ERR(acomp)) {
@@ -850,12 +854,15 @@ static int zswap_cpu_comp_prepare(unsign
acomp_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
crypto_req_done, &acomp_ctx->wait);
+ mutex_unlock(&acomp_ctx->mutex);
return 0;
req_fail:
crypto_free_acomp(acomp_ctx->acomp);
acomp_fail:
kfree(acomp_ctx->buffer);
+buffer_fail:
+ mutex_unlock(&acomp_ctx->mutex);
return ret;
}
@@ -864,17 +871,45 @@ static int zswap_cpu_comp_dead(unsigned
struct zswap_pool *pool = hlist_entry(node, struct zswap_pool, node);
struct crypto_acomp_ctx *acomp_ctx = per_cpu_ptr(pool->acomp_ctx, cpu);
+ mutex_lock(&acomp_ctx->mutex);
if (!IS_ERR_OR_NULL(acomp_ctx)) {
if (!IS_ERR_OR_NULL(acomp_ctx->req))
acomp_request_free(acomp_ctx->req);
+ acomp_ctx->req = NULL;
if (!IS_ERR_OR_NULL(acomp_ctx->acomp))
crypto_free_acomp(acomp_ctx->acomp);
kfree(acomp_ctx->buffer);
}
+ mutex_unlock(&acomp_ctx->mutex);
return 0;
}
+static struct crypto_acomp_ctx *acomp_ctx_get_cpu_lock(struct zswap_pool *pool)
+{
+ struct crypto_acomp_ctx *acomp_ctx;
+
+ for (;;) {
+ acomp_ctx = raw_cpu_ptr(pool->acomp_ctx);
+ mutex_lock(&acomp_ctx->mutex);
+ if (likely(acomp_ctx->req))
+ return acomp_ctx;
+ /*
+ * It is possible that we were migrated to a different CPU after
+ * getting the per-CPU ctx but before the mutex was acquired. If
+ * the old CPU got offlined, zswap_cpu_comp_dead() could have
+ * already freed ctx->req (among other things) and set it to
+ * NULL. Just try again on the new CPU that we ended up on.
+ */
+ mutex_unlock(&acomp_ctx->mutex);
+ }
+}
+
+static void acomp_ctx_put_unlock(struct crypto_acomp_ctx *acomp_ctx)
+{
+ mutex_unlock(&acomp_ctx->mutex);
+}
+
static bool zswap_compress(struct folio *folio, struct zswap_entry *entry)
{
struct crypto_acomp_ctx *acomp_ctx;
@@ -887,10 +922,7 @@ static bool zswap_compress(struct folio
gfp_t gfp;
u8 *dst;
- acomp_ctx = raw_cpu_ptr(entry->pool->acomp_ctx);
-
- mutex_lock(&acomp_ctx->mutex);
-
+ acomp_ctx = acomp_ctx_get_cpu_lock(entry->pool);
dst = acomp_ctx->buffer;
sg_init_table(&input, 1);
sg_set_folio(&input, folio, PAGE_SIZE, 0);
@@ -943,7 +975,7 @@ unlock:
else if (alloc_ret)
zswap_reject_alloc_fail++;
- mutex_unlock(&acomp_ctx->mutex);
+ acomp_ctx_put_unlock(acomp_ctx);
return comp_ret == 0 && alloc_ret == 0;
}
@@ -954,9 +986,7 @@ static void zswap_decompress(struct zswa
struct crypto_acomp_ctx *acomp_ctx;
u8 *src;
- acomp_ctx = raw_cpu_ptr(entry->pool->acomp_ctx);
- mutex_lock(&acomp_ctx->mutex);
-
+ acomp_ctx = acomp_ctx_get_cpu_lock(entry->pool);
src = zpool_map_handle(zpool, entry->handle, ZPOOL_MM_RO);
/*
* If zpool_map_handle is atomic, we cannot reliably utilize its mapped buffer
@@ -980,10 +1010,10 @@ static void zswap_decompress(struct zswa
acomp_request_set_params(acomp_ctx->req, &input, &output, entry->length, PAGE_SIZE);
BUG_ON(crypto_wait_req(crypto_acomp_decompress(acomp_ctx->req), &acomp_ctx->wait));
BUG_ON(acomp_ctx->req->dlen != PAGE_SIZE);
- mutex_unlock(&acomp_ctx->mutex);
if (src != acomp_ctx->buffer)
zpool_unmap_handle(zpool, entry->handle);
+ acomp_ctx_put_unlock(acomp_ctx);
}
/*********************************
next prev parent reply other threads:[~2025-01-30 14:00 UTC|newest]
Thread overview: 41+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-01-30 13:59 [PATCH 6.12 00/40] 6.12.12-rc1 review Greg Kroah-Hartman
2025-01-30 13:59 ` [PATCH 6.12 01/40] ASoC: wm8994: Add depends on MFD core Greg Kroah-Hartman
2025-01-30 13:59 ` [PATCH 6.12 02/40] ASoC: codecs: es8316: Fix HW rate calculation for 48Mhz MCLK Greg Kroah-Hartman
2025-01-30 13:59 ` [PATCH 6.12 03/40] ASoC: samsung: Add missing selects for MFD_WM8994 Greg Kroah-Hartman
2025-01-30 13:59 ` [PATCH 6.12 04/40] seccomp: Stub for !CONFIG_SECCOMP Greg Kroah-Hartman
2025-01-30 13:59 ` [PATCH 6.12 05/40] ASoC: cs42l43: Add codec force suspend/resume ops Greg Kroah-Hartman
2025-01-30 13:59 ` [PATCH 6.12 06/40] scsi: iscsi: Fix redundant response for ISCSI_UEVENT_GET_HOST_STATS request Greg Kroah-Hartman
2025-01-30 13:59 ` [PATCH 6.12 07/40] drm/amd/display: Use HW lock mgr for PSR1 Greg Kroah-Hartman
2025-01-30 13:59 ` [PATCH 6.12 08/40] drm/amd/display: Initialize denominator defaults to 1 Greg Kroah-Hartman
2025-01-30 13:59 ` [PATCH 6.12 09/40] of/unittest: Add test that of_address_to_resource() fails on non-translatable address Greg Kroah-Hartman
2025-01-30 13:59 ` [PATCH 6.12 10/40] ALSA: hda/realtek: Fix volume adjustment issue on Lenovo ThinkBook 16P Gen5 Greg Kroah-Hartman
2025-01-30 13:59 ` [PATCH 6.12 11/40] drm/connector: hdmi: Validate supported_formats matches ycbcr_420_allowed Greg Kroah-Hartman
2025-01-30 13:59 ` [PATCH 6.12 12/40] irqchip/sunxi-nmi: Add missing SKIP_WAKE flag Greg Kroah-Hartman
2025-01-30 13:59 ` [PATCH 6.12 13/40] hwmon: (drivetemp) Set scsi command timeout to 10s Greg Kroah-Hartman
2025-01-30 13:59 ` [PATCH 6.12 14/40] ASoC: samsung: Add missing depends on I2C Greg Kroah-Hartman
2025-01-30 13:59 ` Greg Kroah-Hartman [this message]
2025-01-30 13:59 ` [PATCH 6.12 16/40] mm: zswap: move allocations during CPU init outside the lock Greg Kroah-Hartman
2025-01-30 13:59 ` [PATCH 6.12 17/40] gfs2: Truncate address space when flipping GFS2_DIF_JDATA flag Greg Kroah-Hartman
2025-01-30 13:59 ` [PATCH 6.12 18/40] libfs: Return ENOSPC when the directory offset range is exhausted Greg Kroah-Hartman
2025-01-30 13:59 ` [PATCH 6.12 19/40] Revert "libfs: Add simple_offset_empty()" Greg Kroah-Hartman
2025-01-30 13:59 ` [PATCH 6.12 20/40] Revert "libfs: fix infinite directory reads for offset dir" Greg Kroah-Hartman
2025-01-30 13:59 ` [PATCH 6.12 21/40] libfs: Replace simple_offset end-of-directory detection Greg Kroah-Hartman
2025-01-30 13:59 ` [PATCH 6.12 22/40] libfs: Use d_children list to iterate simple_offset directories Greg Kroah-Hartman
2025-01-30 13:59 ` [PATCH 6.12 23/40] smb: client: handle lack of EA support in smb2_query_path_info() Greg Kroah-Hartman
2025-01-30 13:59 ` [PATCH 6.12 24/40] net: sched: fix ets qdisc OOB Indexing Greg Kroah-Hartman
2025-01-30 13:59 ` [PATCH 6.12 25/40] Revert "HID: multitouch: Add support for lenovo Y9000P Touchpad" Greg Kroah-Hartman
2025-01-30 13:59 ` [PATCH 6.12 26/40] cachestat: fix page cache statistics permission checking Greg Kroah-Hartman
2025-01-30 13:59 ` [PATCH 6.12 27/40] vfio/platform: check the bounds of read/write syscalls Greg Kroah-Hartman
2025-01-30 13:59 ` [PATCH 6.12 28/40] scsi: storvsc: Ratelimit warning logs to prevent VM denial of service Greg Kroah-Hartman
2025-01-30 13:59 ` [PATCH 6.12 29/40] USB: serial: quatech2: fix null-ptr-deref in qt2_process_read_urb() Greg Kroah-Hartman
2025-01-30 13:59 ` [PATCH 6.12 30/40] Revert "usb: gadget: u_serial: Disable ep before setting port to null to fix the crash caused by port being null" Greg Kroah-Hartman
2025-01-30 13:59 ` [PATCH 6.12 31/40] ALSA: usb-audio: Add delay quirk for USB Audio Device Greg Kroah-Hartman
2025-01-30 13:59 ` [PATCH 6.12 32/40] wifi: rtl8xxxu: add more missing rtl8192cu USB IDs Greg Kroah-Hartman
2025-01-30 13:59 ` [PATCH 6.12 33/40] HID: wacom: Initialize brightness of LED trigger Greg Kroah-Hartman
2025-01-30 13:59 ` [PATCH 6.12 34/40] Input: xpad - add support for Nacon Pro Compact Greg Kroah-Hartman
2025-01-30 13:59 ` [PATCH 6.12 35/40] Input: atkbd - map F23 key to support default copilot shortcut Greg Kroah-Hartman
2025-01-30 13:59 ` [PATCH 6.12 36/40] Input: xpad - add unofficial Xbox 360 wireless receiver clone Greg Kroah-Hartman
2025-01-30 13:59 ` [PATCH 6.12 37/40] Input: xpad - add QH Electronics VID/PID Greg Kroah-Hartman
2025-01-30 13:59 ` [PATCH 6.12 38/40] Input: xpad - improve name of 8BitDo controller 2dc8:3106 Greg Kroah-Hartman
2025-01-30 13:59 ` [PATCH 6.12 39/40] Input: xpad - add support for Nacon Evol-X Xbox One Controller Greg Kroah-Hartman
2025-01-30 13:59 ` [PATCH 6.12 40/40] Input: xpad - add support for wooting two he (arm) Greg Kroah-Hartman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250130133500.322711653@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=akpm@linux-foundation.org \
--cc=baohua@kernel.org \
--cc=chengming.zhou@linux.dev \
--cc=hannes@cmpxchg.org \
--cc=kanchana.p.sridhar@intel.com \
--cc=nphamcs@gmail.com \
--cc=patches@lists.linux.dev \
--cc=samsun1006219@gmail.com \
--cc=stable@vger.kernel.org \
--cc=vitalywool@gmail.com \
--cc=yosryahmed@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.