public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] zram: fix use-after-free in zram_writeback_endio
@ 2026-05-04 12:32 Richard Chang
  2026-05-05  3:25 ` Sergey Senozhatsky
  2026-05-05 16:37 ` Minchan Kim
  0 siblings, 2 replies; 3+ messages in thread
From: Richard Chang @ 2026-05-04 12:32 UTC (permalink / raw)
  To: Minchan Kim, Sergey Senozhatsky, Jens Axboe, Andrew Morton
  Cc: bgeffon, liumartin, linux-kernel, linux-block, linux-mm,
	Richard Chang

A crash was observed in zram_writeback_endio due to a NULL pointer
dereference in wake_up. The root cause is a race condition between the
bio completion handler (zram_writeback_endio) and the writeback task.

In zram_writeback_endio, wake_up() is called on &wb_ctl->done_wait after
releasing wb_ctl->done_lock. This creates a race window where the
writeback task can see num_inflight become 0, return, and free wb_ctl
before zram_writeback_endio calls wake_up().

CPU 0 (zram_writeback_endio)       CPU 1 (zram_complete_done_reqs)
============================       ============================
spin_lock(&wb_ctl->done_lock);
list_add(&req->entry, &wb_ctl->done_reqs);
spin_unlock(&wb_ctl->done_lock);
                                   while (&wb_ctl->num_inflight) > 0)
                                   spin_lock(&wb_ctl->done_lock);
                                   list_del(&req->entry);
                                   spin_unlock(&wb_ctl->done_lock);
				   // num_inflight becomes 0
                                   atomic_dec(&wb_ctl->num_inflight);
                                   returns to writeback_store();
				   // frees wb_ctl
                                   release_wb_ctl(wb_ctl);

// UAF crash!
wake_up(&wb_ctl->done_wait);

Fix this by moving wake_up() inside the done_lock critical section.
This ensures that zram_complete_done_reqs cannot consume the request
and decrement num_inflight until zram_writeback_endio has finished
calling wake_up() and released the lock.

Fixes: f405066a1f0d ("zram: introduce writeback bio batching")
Signed-off-by: Richard Chang <richardycc@google.com>
---
 drivers/block/zram/zram_drv.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index aebc710f0d6a..a457fdf564f8 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -966,9 +966,8 @@ static void zram_writeback_endio(struct bio *bio)
 
 	spin_lock_irqsave(&wb_ctl->done_lock, flags);
 	list_add(&req->entry, &wb_ctl->done_reqs);
-	spin_unlock_irqrestore(&wb_ctl->done_lock, flags);
-
 	wake_up(&wb_ctl->done_wait);
+	spin_unlock_irqrestore(&wb_ctl->done_lock, flags);
 }
 
 static void zram_submit_wb_request(struct zram *zram,
-- 
2.54.0.545.g6539524ca2-goog


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] zram: fix use-after-free in zram_writeback_endio
  2026-05-04 12:32 [PATCH] zram: fix use-after-free in zram_writeback_endio Richard Chang
@ 2026-05-05  3:25 ` Sergey Senozhatsky
  2026-05-05 16:37 ` Minchan Kim
  1 sibling, 0 replies; 3+ messages in thread
From: Sergey Senozhatsky @ 2026-05-05  3:25 UTC (permalink / raw)
  To: Andrew Morton, Richard Chang
  Cc: Minchan Kim, Sergey Senozhatsky, Jens Axboe, bgeffon, liumartin,
	linux-kernel, linux-block, linux-mm

On (26/05/04 12:32), Richard Chang wrote:
> A crash was observed in zram_writeback_endio due to a NULL pointer
> dereference in wake_up. The root cause is a race condition between the
> bio completion handler (zram_writeback_endio) and the writeback task.
> 
> In zram_writeback_endio, wake_up() is called on &wb_ctl->done_wait after
> releasing wb_ctl->done_lock. This creates a race window where the
> writeback task can see num_inflight become 0, return, and free wb_ctl
> before zram_writeback_endio calls wake_up().
> 
> CPU 0 (zram_writeback_endio)       CPU 1 (zram_complete_done_reqs)
> ============================       ============================
> spin_lock(&wb_ctl->done_lock);
> list_add(&req->entry, &wb_ctl->done_reqs);
> spin_unlock(&wb_ctl->done_lock);
>                                    while (&wb_ctl->num_inflight) > 0)
>                                    spin_lock(&wb_ctl->done_lock);
>                                    list_del(&req->entry);
>                                    spin_unlock(&wb_ctl->done_lock);
> 				   // num_inflight becomes 0
>                                    atomic_dec(&wb_ctl->num_inflight);
>                                    returns to writeback_store();
> 				   // frees wb_ctl
>                                    release_wb_ctl(wb_ctl);
> 
> // UAF crash!
> wake_up(&wb_ctl->done_wait);
> 
> Fix this by moving wake_up() inside the done_lock critical section.
> This ensures that zram_complete_done_reqs cannot consume the request
> and decrement num_inflight until zram_writeback_endio has finished
> calling wake_up() and released the lock.
> 
> Fixes: f405066a1f0d ("zram: introduce writeback bio batching")
> Signed-off-by: Richard Chang <richardycc@google.com>

Acked-by: Sergey Senozhatsky <senozhatsky@chromium.org>

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] zram: fix use-after-free in zram_writeback_endio
  2026-05-04 12:32 [PATCH] zram: fix use-after-free in zram_writeback_endio Richard Chang
  2026-05-05  3:25 ` Sergey Senozhatsky
@ 2026-05-05 16:37 ` Minchan Kim
  1 sibling, 0 replies; 3+ messages in thread
From: Minchan Kim @ 2026-05-05 16:37 UTC (permalink / raw)
  To: Richard Chang
  Cc: Sergey Senozhatsky, Jens Axboe, Andrew Morton, bgeffon, liumartin,
	linux-kernel, linux-block, linux-mm

On Mon, May 04, 2026 at 12:32:30PM +0000, Richard Chang wrote:
> A crash was observed in zram_writeback_endio due to a NULL pointer
> dereference in wake_up. The root cause is a race condition between the
> bio completion handler (zram_writeback_endio) and the writeback task.
> 
> In zram_writeback_endio, wake_up() is called on &wb_ctl->done_wait after
> releasing wb_ctl->done_lock. This creates a race window where the
> writeback task can see num_inflight become 0, return, and free wb_ctl
> before zram_writeback_endio calls wake_up().
> 
> CPU 0 (zram_writeback_endio)       CPU 1 (zram_complete_done_reqs)
> ============================       ============================
> spin_lock(&wb_ctl->done_lock);
> list_add(&req->entry, &wb_ctl->done_reqs);
> spin_unlock(&wb_ctl->done_lock);
>                                    while (&wb_ctl->num_inflight) > 0)
>                                    spin_lock(&wb_ctl->done_lock);
>                                    list_del(&req->entry);
>                                    spin_unlock(&wb_ctl->done_lock);
> 				   // num_inflight becomes 0
>                                    atomic_dec(&wb_ctl->num_inflight);
>                                    returns to writeback_store();
> 				   // frees wb_ctl
>                                    release_wb_ctl(wb_ctl);
> 
> // UAF crash!
> wake_up(&wb_ctl->done_wait);
> 
> Fix this by moving wake_up() inside the done_lock critical section.
> This ensures that zram_complete_done_reqs cannot consume the request
> and decrement num_inflight until zram_writeback_endio has finished
> calling wake_up() and released the lock.
> 
> Fixes: f405066a1f0d ("zram: introduce writeback bio batching")
> Signed-off-by: Richard Chang <richardycc@google.com>
> ---
>  drivers/block/zram/zram_drv.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index aebc710f0d6a..a457fdf564f8 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -966,9 +966,8 @@ static void zram_writeback_endio(struct bio *bio)
>  
>  	spin_lock_irqsave(&wb_ctl->done_lock, flags);
>  	list_add(&req->entry, &wb_ctl->done_reqs);
> -	spin_unlock_irqrestore(&wb_ctl->done_lock, flags);
> -
>  	wake_up(&wb_ctl->done_wait);
> +	spin_unlock_irqrestore(&wb_ctl->done_lock, flags);
>  }
>  

I agree this will fix the issue, but using a lock to extend the lifetime of
an object to avoid a UAF is not a good pattern. Object lifetime shared between
process and interrupt contexts should be managed explicitly using refcount.

Furthermore, keeping wake_up() outside the critical section minimizes
interrupt-disabled latency and avoids nesting spinlocks
(done_lock -> done_wait.lock), reducing the risk of future lockdep
issues, just in case.

It definitely will add more overhead for the submission/completion paths to deal
with the refcount, but I think we should go that way at the cost of runtime.

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2026-05-05 16:38 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-04 12:32 [PATCH] zram: fix use-after-free in zram_writeback_endio Richard Chang
2026-05-05  3:25 ` Sergey Senozhatsky
2026-05-05 16:37 ` Minchan Kim

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox