All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ming Lei <tom.leiming@gmail.com>
To: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: Bart Van Assche <bvanassche@acm.org>,
	Jens Axboe <axboe@kernel.dk>, Christoph Hellwig <hch@lst.de>,
	Damien Le Moal <dlemoal@kernel.org>,
	linux-block <linux-block@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH v2] loop: Fix NULL pointer dereference by synchronizing lo_release and loop_queue_rq
Date: Wed, 20 May 2026 11:06:51 +0800	[thread overview]
Message-ID: <ag0lS_CbKO9R5CV8@fedora> (raw)
In-Reply-To: <d43125ff-cc66-49b7-b16d-1b2650c68c23@I-love.SAKURA.ne.jp>

On Tue, May 19, 2026 at 06:27:11PM +0900, Tetsuo Handa wrote:
> On 2026/05/19 9:40, Andrew Morton wrote:
> > AI review asked a couple of questions:
> > 	https://sashiko.dev/#/patchset/9b2032d6-3f36-4d2b-8128-985c08a4fa37@I-love.SAKURA.ne.jp
> 
> To: gemini/gemini-3.1-pro-preview
> 
> Thank you for your valuable feedback. Your point about asynchronous I/O completing after drain_workqueue()
> and potentially causing a UAF at file_inode() from kiocb_end_write() from lo_rw_aio_do_completion() is correct.
> The drain_workqueue() alone does not wait for in-flight AIOs that have already returned -EIOCBQUEUED. However,
> I'm not convinced that use of blk_mq_freeze_queue() inside __loop_clr_fd() where disk->open_mutex was already
> held by bdev_release() is absolutely deadlock-free.
> 
> 1. VFS and Block Layer Lock Contention:
>    __loop_clr_fd() is exclusively invoked from the lo_release() path during the final close of the device.
>    At this stage, the block layer is holding disk->open_mutex. If we call blk_mq_freeze_queue() here, it will
>    synchronously wait for all in-flight AIOs to complete. However, the completion paths of those in-flight AIOs
>    (or subsequent metadata processing in the underlying filesystem) may attempt to acquire resources or execute
>    code paths that depend on the very same device state or open/close status. This creates a circular dependency,
>    leading to an unrecoverable hang.
> 
> 2. Memory Reclaim Deadlock:
>    blk_mq_freeze_queue() blocks until the queue's usage counter drops to zero. If an in-flight AIO requires memory
>    allocation for metadata updates upon completion, and the system is under heavy memory pressure, it can trigger
>    direct memory reclaim. If the reclaim path attempts to sync other buffers or interact with the frozen loop
>    device/queue, a circular deadlock occurs.
> 
> Therefore, I would like to choose SRCU-based synchronization instead of blk_mq_freeze_queue().
> 
> * Locking: We call srcu_read_lock(&loop_io_srcu) only for asynchronous paths (cmd->use_aio) immediately
>   before submitting the I/O to the underlying filesystem in lo_rw_aio().
> 
> * Unlocking: The reader lock is released via srcu_read_unlock() at the very end of the AIO completion handler
>   (lo_rw_aio_do_completion()).
> 
> * Synchronization: We place synchronize_srcu(&loop_io_srcu) immediately after drain_workqueue() in __loop_clr_fd().
> 
> I think that this guarantees that __loop_clr_fd() safely blocks until all pending AIO callbacks are 100% completed,
> fully eliminating the UAF risk and ensuring the safety of the subsequent mapping_set_gfp_mask() and fput(), while
> remaining entirely deadlock-free.
> 
> What do you think about this approach?
> 
>  drivers/block/loop.c | 21 +++++++++++++++++++++
>  1 file changed, 21 insertions(+)
> 
> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> index 0000913f7efc..7c3961f3cbc9 100644
> --- a/drivers/block/loop.c
> +++ b/drivers/block/loop.c
> @@ -80,6 +80,7 @@ struct loop_cmd {
>  	struct list_head list_entry;
>  	bool use_aio; /* use AIO interface to handle I/O */
>  	atomic_t ref; /* only for aio */
> +	int srcu_idx;
>  	long ret;
>  	struct kiocb iocb;
>  	struct bio_vec *bvec;
> @@ -93,6 +94,7 @@ struct loop_cmd {
>  static DEFINE_IDR(loop_index_idr);
>  static DEFINE_MUTEX(loop_ctl_mutex);
>  static DEFINE_MUTEX(loop_validate_mutex);
> +DEFINE_SRCU(loop_io_srcu);
>  
>  /**
>   * loop_global_lock_killable() - take locks for safe loop_validate_file() test
> @@ -327,6 +329,8 @@ static void lo_rw_aio_do_completion(struct loop_cmd *cmd)
>  		kiocb_end_write(&cmd->iocb);
>  	if (likely(!blk_should_fake_timeout(rq->q)))
>  		blk_mq_complete_request(rq);
> +	if (cmd->use_aio)
> +		srcu_read_unlock(&loop_io_srcu, cmd->srcu_idx);
>  }
>  
>  static void lo_rw_aio_complete(struct kiocb *iocb, long ret)
> @@ -392,6 +396,7 @@ static int lo_rw_aio(struct loop_device *lo, struct loop_cmd *cmd,
>  	if (cmd->use_aio) {
>  		cmd->iocb.ki_complete = lo_rw_aio_complete;
>  		cmd->iocb.ki_flags = IOCB_DIRECT;
> +		cmd->srcu_idx = srcu_read_lock(&loop_io_srcu);
>  	} else {
>  		cmd->iocb.ki_complete = NULL;
>  		cmd->iocb.ki_flags = 0;
> @@ -1118,6 +1123,22 @@ static void __loop_clr_fd(struct loop_device *lo)
>  	struct file *filp;
>  	gfp_t gfp = lo->old_gfp_mask;
>  
> +	/*
> +	 * Now that loop_queue_rq() sees lo->lo_state != Lo_bound,
> +	 * wait for already started loop_queue_rq() to complete.
> +	 */
> +	synchronize_rcu();
> +	/*
> +	 * Now that no more works are scheduled by loop_queue_rq(),
> +	 * wait for already scheduled works to complete.
> +	 */
> +	drain_workqueue(lo->workqueue);
> +	/*
> +	 * Now that no more AIO requests are scheduled by lo_rw_aio(),
> +	 * wait for already started AIO to complete.
> +	 */
> +	synchronize_srcu(&loop_io_srcu);

The IO after close(loop) should be from writeback. rcu/sruc isn't necessary,
please see the patch posted in another thread:

https://lore.kernel.org/linux-block/agxJdUf1b0JSDAux@fedora/

in which the check on lo->lo_state is moved to loop_handle_cmd(), meantime
drain_workqueue() is added for draining in-flight workers.

Thanks,
Ming

  reply	other threads:[~2026-05-20  3:06 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-18  0:02 [syzbot] [block?] general protection fault in lo_rw_aio syzbot
2026-04-21 11:05 ` Tetsuo Handa
2026-05-11 11:43   ` [PATCH] loop: Fix NULL pointer dereference by synchronizing lo_release and loop_queue_rq Tetsuo Handa
2026-05-11 15:58     ` Bart Van Assche
2026-05-11 17:43       ` Tetsuo Handa
2026-05-12 11:46         ` Tetsuo Handa
2026-05-15  1:38           ` [PATCH v2] " Tetsuo Handa
2026-05-19  0:40             ` Andrew Morton
2026-05-19  9:27               ` Tetsuo Handa
2026-05-20  3:06                 ` Ming Lei [this message]
2026-05-20  6:36                   ` Tetsuo Handa
2026-05-20  7:49                     ` Ming Lei
2026-05-20  8:20                       ` Tetsuo Handa
2026-05-20  8:54                         ` Ming Lei
2026-05-25  3:40                           ` [PATCH v3] loop: Fix NULL pointer dereference in lo_rw_aio() Tetsuo Handa
2026-05-25 15:19                             ` Ming Lei
2026-05-26  0:25                               ` Tetsuo Handa
2026-05-27  1:20                                 ` Ming Lei
2026-05-27  1:35                                   ` Tetsuo Handa
2026-05-27  3:00                                     ` Ming Lei
2026-05-27 11:29                                       ` Tetsuo Handa
2026-05-27 18:11                                         ` Damien Le Moal
2026-05-28  8:38                                           ` Christoph Hellwig
2026-05-28 10:16                                             ` Qu Wenruo
2026-06-01 14:40                                               ` Christoph Hellwig
2026-06-01 16:29                                                 ` Brian Foster
2026-06-01 22:27                                                   ` Qu Wenruo
2026-06-01 15:29                                               ` Ming Lei
2026-06-01 21:51                                                 ` Hillf Danton
2026-06-01 22:14                                                   ` Ming Lei
2026-06-01 23:17                                                     ` Hillf Danton
2026-06-01 23:36                                                       ` Ming Lei
2026-06-02  2:02                                                         ` Hillf Danton
2026-05-28  5:43                                       ` Hillf Danton
2026-05-28 23:00                                         ` Hillf Danton
2026-05-29  0:14                                           ` Tetsuo Handa
2026-05-29  7:04                                             ` Hillf Danton
2026-05-29 22:05                                               ` Hillf Danton
2026-05-30 23:57                                                 ` Tetsuo Handa

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ag0lS_CbKO9R5CV8@fedora \
    --to=tom.leiming@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=axboe@kernel.dk \
    --cc=bvanassche@acm.org \
    --cc=dlemoal@kernel.org \
    --cc=hch@lst.de \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=penguin-kernel@i-love.sakura.ne.jp \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.