All of lore.kernel.org
 help / color / mirror / Atom feed
From: Minchan Kim <minchan@kernel.org>
To: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Richard Chang <richardycc@google.com>,
	Jens Axboe <axboe@kernel.dk>,
	Andrew Morton <akpm@linux-foundation.org>,
	bgeffon@google.com, liumartin@google.com,
	linux-kernel@vger.kernel.org, linux-block@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: [PATCH] zram: fix use-after-free in zram_writeback_endio
Date: Thu, 7 May 2026 15:56:52 -0700	[thread overview]
Message-ID: <af0YtJOLGvO-LJow@google.com> (raw)
In-Reply-To: <afw2919RiZje9xzq@google.com>

On Thu, May 07, 2026 at 06:40:37PM +0900, Sergey Senozhatsky wrote:
> On (26/05/05 09:37), Minchan Kim wrote:
> > > @@ -966,9 +966,8 @@ static void zram_writeback_endio(struct bio *bio)
> > >
> > >  	spin_lock_irqsave(&wb_ctl->done_lock, flags);
> > >  	list_add(&req->entry, &wb_ctl->done_reqs);
> > > -	spin_unlock_irqrestore(&wb_ctl->done_lock, flags);
> > > -
> > >  	wake_up(&wb_ctl->done_wait);
> > > +	spin_unlock_irqrestore(&wb_ctl->done_lock, flags);
> > >  }
> > >
> >
> > I agree this will fix the issue, but using a lock to extend the lifetime of
> > an object to avoid a UAF is not a good pattern. Object lifetime shared between
> > process and interrupt contexts should be managed explicitly using refcount.
> 
> ->num_inflight is a ref-counter, basically.  The problem is that
> completion is a two-step process, only one part of each is synchronized
> with the writeback context.  I honestly don't want to have two ref-counts:
> one for requests pending zram completion and one for active endio contexts.
> Maybe we can repurpose num_inflight instead.

If it can make the code much clearer and simpler, I have no objection.

> 
> > Furthermore, keeping wake_up() outside the critical section minimizes
> > interrupt-disabled latency
> 
> So I considered that, but isn't endio already called from IRQ context?
> Just asking.  We wakeup only one waiter (writeback task), so it's not
> that bad CPU-cycles wise.  Do you think it's really a concern?

I don't think it will have any measurable impact; I was just pointing out
a theoretical one.

> 
> wake_up() under spin-lock solves the problem of a unsynchronized
> two-stages endio process.
> 
> > and avoids nesting spinlocks (done_lock -> done_wait.lock), reducing
> > the risk of future lockdep issues, just in case.
> 
> I considered lockdep as well but ruled it out as impossible scenario,
> nesting here is strictly uni-directional, we never call into zram from
> the scheduler.  Just saying.

Sure. I just prefer to avoid adding more lock dependencies without a strong
justification, to prevent potential locking issues in the future.

> 
> > It definitely will add more overhead for the submission/completion paths to deal
> > with the refcount, but I think we should go that way at the cost of runtime.
> 
> Dunno, something like below maybe?
> 
> ---
>  drivers/block/zram/zram_drv.c | 14 ++++++++------
>  1 file changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index ce2e1c79fc75..27fe50d666d7 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -967,7 +967,7 @@ static int zram_writeback_complete(struct zram *zram, struct zram_wb_req *req)
>  static void zram_writeback_endio(struct bio *bio)
>  {
>  	struct zram_wb_req *req = container_of(bio, struct zram_wb_req, bio);
> -	struct zram_wb_ctl *wb_ctl = bio->bi_private;
> +	struct zram_wb_ctl *wb_ctl = READ_ONCE(bio->bi_private);
>  	unsigned long flags;
>  
>  	spin_lock_irqsave(&wb_ctl->done_lock, flags);
> @@ -975,6 +975,7 @@ static void zram_writeback_endio(struct bio *bio)
>  	spin_unlock_irqrestore(&wb_ctl->done_lock, flags);
>  
>  	wake_up(&wb_ctl->done_wait);
> +	atomic_dec(&wb_ctl->num_inflight);
>  }
>  
>  static void zram_submit_wb_request(struct zram *zram,
> @@ -998,7 +999,7 @@ static int zram_complete_done_reqs(struct zram *zram,
>  	unsigned long flags;
>  	int ret = 0, err;
>  
> -	while (atomic_read(&wb_ctl->num_inflight) > 0) {
> +	for (;;) {
>  		spin_lock_irqsave(&wb_ctl->done_lock, flags);
>  		req = list_first_entry_or_null(&wb_ctl->done_reqs,
>  					       struct zram_wb_req, entry);
> @@ -1006,7 +1007,6 @@ static int zram_complete_done_reqs(struct zram *zram,
>  			list_del(&req->entry);
>  		spin_unlock_irqrestore(&wb_ctl->done_lock, flags);
>  
> -		/* ->num_inflight > 0 doesn't mean we have done requests */
>  		if (!req)
>  			break;
>  
> @@ -1014,7 +1014,6 @@ static int zram_complete_done_reqs(struct zram *zram,
>  		if (err)
>  			ret = err;
>  
> -		atomic_dec(&wb_ctl->num_inflight);
>  		release_pp_slot(zram, req->pps);
>  		req->pps = NULL;
>  
> @@ -1129,8 +1128,11 @@ static int zram_writeback_slots(struct zram *zram,
>  	if (req)
>  		release_wb_req(req);
>  
> -	while (atomic_read(&wb_ctl->num_inflight) > 0) {
> -		wait_event(wb_ctl->done_wait, !list_empty(&wb_ctl->done_reqs));
> +	while (atomic_read(&wb_ctl->num_inflight) ||
> +	       !list_empty(&wb_ctl->done_reqs)) {
> +		wait_event_timeout(wb_ctl->done_wait,
> +				   !list_empty(&wb_ctl->done_reqs),
> +				   HZ);
>  		err = zram_complete_done_reqs(zram, wb_ctl);
>  		if (err)
>  			ret = err;

I understand why you used a timeout here, but I still don't think it's a good
idea since the user could wait for up to a second unnecessarily during the
race.

What I prefer is simple and explicit lifetime management for wb_ctl using
refcount. It directly addresses the core issue (UAF of wb_ctl) in a standard,
robust way without needing workarounds like timeouts. The runtime overhead
of kref will be negligible.

Something like this:

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index a324ede6206d..28ab4a24e77f 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -33,6 +33,7 @@
 #include <linux/cpuhotplug.h>
 #include <linux/part_stat.h>
 #include <linux/kernel_read_file.h>
+#include <linux/kref.h>
 
 #include "zram_drv.h"
 
@@ -504,6 +505,7 @@ struct zram_wb_ctl {
 	wait_queue_head_t done_wait;
 	spinlock_t done_lock;
 	atomic_t num_inflight;
+	struct kref kref;
 };
 
 struct zram_wb_req {
@@ -829,11 +831,8 @@ static void release_wb_req(struct zram_wb_req *req)
 	kfree(req);
 }
 
-static void release_wb_ctl(struct zram_wb_ctl *wb_ctl)
+static void __release_wb_ctl(struct zram_wb_ctl *wb_ctl)
 {
-	if (!wb_ctl)
-		return;
-
 	/* We should never have inflight requests at this point */
 	WARN_ON(atomic_read(&wb_ctl->num_inflight));
 	WARN_ON(!list_empty(&wb_ctl->done_reqs));
@@ -850,6 +849,18 @@ static void release_wb_ctl(struct zram_wb_ctl *wb_ctl)
 	kfree(wb_ctl);
 }
 
+static void release_wb_ctl_kref(struct kref *kref)
+{
+	struct zram_wb_ctl *wb_ctl = container_of(kref, struct zram_wb_ctl, kref);
+
+	__release_wb_ctl(wb_ctl);
+}
+
+static void release_wb_ctl(struct zram_wb_ctl *wb_ctl)
+{
+	kref_put(&wb_ctl->kref, release_wb_ctl_kref);
+}
+
 static struct zram_wb_ctl *init_wb_ctl(struct zram *zram)
 {
 	struct zram_wb_ctl *wb_ctl;
@@ -864,6 +875,7 @@ static struct zram_wb_ctl *init_wb_ctl(struct zram *zram)
 	atomic_set(&wb_ctl->num_inflight, 0);
 	init_waitqueue_head(&wb_ctl->done_wait);
 	spin_lock_init(&wb_ctl->done_lock);
+	kref_init(&wb_ctl->kref);
 
 	for (i = 0; i < zram->wb_batch_size; i++) {
 		struct zram_wb_req *req;
@@ -985,6 +997,7 @@ static void zram_writeback_endio(struct bio *bio)
 	spin_unlock_irqrestore(&wb_ctl->done_lock, flags);
 
 	wake_up(&wb_ctl->done_wait);
+	kref_put(&wb_ctl->kref, release_wb_ctl_kref);
 }
 
 static void zram_submit_wb_request(struct zram *zram,
@@ -996,6 +1009,7 @@ static void zram_submit_wb_request(struct zram *zram,
 	 * so that we don't over-submit.
 	 */
 	zram_account_writeback_submit(zram);
+	kref_get(&wb_ctl->kref);
 	atomic_inc(&wb_ctl->num_inflight);
 	req->bio.bi_private = wb_ctl;
 	submit_bio(&req->bio);
@@ -1276,8 +1290,8 @@ static ssize_t writeback_store(struct device *dev,
 
 	wb_ctl = init_wb_ctl(zram);
 	if (!wb_ctl) {
-		ret = -ENOMEM;
-		goto out;
+		release_pp_ctl(zram, pp_ctl);
+		return -ENOMEM;
 	}
 
 	args = skip_spaces(buf);


  reply	other threads:[~2026-05-07 22:56 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-04 12:32 [PATCH] zram: fix use-after-free in zram_writeback_endio Richard Chang
2026-05-05  3:25 ` Sergey Senozhatsky
2026-05-05 16:37 ` Minchan Kim
2026-05-07  9:40   ` Sergey Senozhatsky
2026-05-07 22:56     ` Minchan Kim [this message]
2026-05-07 23:38       ` Minchan Kim
2026-05-08  2:40       ` Sergey Senozhatsky
2026-05-08  8:49         ` [PATCH v2] " Richard Chang
2026-05-08 21:16           ` Minchan Kim
2026-05-09  2:18           ` Sergey Senozhatsky
2026-05-12  7:49             ` [PATCH v3] " Richard Chang
2026-05-13 14:02               ` [PATCH] " wang wei
2026-05-14 22:02                 ` Minchan Kim
2026-05-15  8:23                   ` wang wei

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=af0YtJOLGvO-LJow@google.com \
    --to=minchan@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=axboe@kernel.dk \
    --cc=bgeffon@google.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=liumartin@google.com \
    --cc=richardycc@google.com \
    --cc=senozhatsky@chromium.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.