All of lore.kernel.org
 help / color / mirror / Atom feed
From: Greg KH <gregkh@linuxfoundation.org>
To: Zhang Wensheng <zhangwensheng@huaweicloud.com>
Cc: axboe@kernel.dk, linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org, bpf@vger.kernel.org,
	yukuai3@huawei.com
Subject: Re: [PATCH -next] [RFC] block: fix null-deref in percpu_ref_put
Date: Fri, 29 Jul 2022 13:16:54 +0200	[thread overview]
Message-ID: <YuPBpn0jIEy65T6P@kroah.com> (raw)
In-Reply-To: <20220729105036.2202791-1-zhangwensheng@huaweicloud.com>

On Fri, Jul 29, 2022 at 06:50:36PM +0800, Zhang Wensheng wrote:
> From: Zhang Wensheng <zhangwensheng5@huawei.com>
> 
> A problem was find in stable 5.10 and the root cause of it like below.

5.10 is very old, is this still an issue in Linus's tree?

> 
> In the use of q_usage_counter of request_queue, blk_cleanup_queue using
> "wait_event(q->mq_freeze_wq, percpu_ref_is_zero(&q->q_usage_counter))"
> to wait q_usage_counter becoming zero. however, if the q_usage_counter
> becoming zero quickly, and percpu_ref_exit will execute and ref->data
> will be freed, maybe another process will cause a null-defef problem
> like below:
> 
> 	CPU0                             CPU1
> blk_cleanup_queue
>  blk_freeze_queue
>   blk_mq_freeze_queue_wait
> 				scsi_end_request
> 				 percpu_ref_get
> 				 ...
> 				 percpu_ref_put
> 				  atomic_long_sub_and_test
>   percpu_ref_exit
>    ref->data -> NULL
>    				   ref->data->release(ref) -> null-deref
> 
> Fix it by setting flag(QUEUE_FLAG_USAGE_COUNT_SYNC) to add synchronization
> mechanism, when ref->data->release is called, the flag will be setted,
> and the "wait_event" in blk_mq_freeze_queue_wait must wait flag becoming
> true as well, which will limit percpu_ref_exit to execute ahead of time.
> 
> Although the problem was not reproduced in mainline, it may also has
> problem when the passthrough IO which will go directly to
> blk_cleanup_queue and cause the problem as well.
> 
> Signed-off-by: Zhang Wensheng <zhangwensheng5@huawei.com>

As the documentation said, this is not how you mark things for stable
backports.

> ---
>  block/blk-core.c       | 4 +++-
>  block/blk-mq.c         | 7 +++++++
>  include/linux/blk-mq.h | 1 +
>  include/linux/blkdev.h | 2 ++
>  4 files changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 27fb1357ad4b..4b73f46e62ec 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -312,7 +312,8 @@ void blk_cleanup_queue(struct request_queue *q)
>  	 * prevent that blk_mq_run_hw_queues() accesses the hardware queues
>  	 * after draining finished.
>  	 */
> -	blk_freeze_queue(q);
> +	blk_freeze_queue_start(q);
> +	blk_mq_freeze_queue_wait_sync(q);
>  
>  	blk_queue_flag_set(QUEUE_FLAG_DEAD, q);
>  
> @@ -403,6 +404,7 @@ static void blk_queue_usage_counter_release(struct percpu_ref *ref)
>  	struct request_queue *q =
>  		container_of(ref, struct request_queue, q_usage_counter);
>  
> +	blk_queue_flag_set(QUEUE_FLAG_USAGE_COUNT_SYNC, q);
>  	wake_up_all(&q->mq_freeze_wq);
>  }
>  
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 93d9d60980fb..44e764257511 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -165,6 +165,7 @@ void blk_freeze_queue_start(struct request_queue *q)
>  {
>  	mutex_lock(&q->mq_freeze_lock);
>  	if (++q->mq_freeze_depth == 1) {
> +		blk_queue_flag_clear(QUEUE_FLAG_USAGE_COUNT_SYNC, q);
>  		percpu_ref_kill(&q->q_usage_counter);
>  		mutex_unlock(&q->mq_freeze_lock);
>  		if (queue_is_mq(q))
> @@ -175,6 +176,12 @@ void blk_freeze_queue_start(struct request_queue *q)
>  }
>  EXPORT_SYMBOL_GPL(blk_freeze_queue_start);
>  
> +void blk_mq_freeze_queue_wait_sync(struct request_queue *q)
> +{
> +	wait_event(q->mq_freeze_wq, percpu_ref_is_zero(&q->q_usage_counter) &&
> +			test_bit(QUEUE_FLAG_USAGE_COUNT_SYNC, &q->queue_flags));

No timeout ever?


> +}
> +
>  void blk_mq_freeze_queue_wait(struct request_queue *q)
>  {
>  	wait_event(q->mq_freeze_wq, percpu_ref_is_zero(&q->q_usage_counter));
> diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
> index e2d9daf7e8dd..50fd56f85b31 100644
> --- a/include/linux/blk-mq.h
> +++ b/include/linux/blk-mq.h
> @@ -868,6 +868,7 @@ void blk_mq_freeze_queue(struct request_queue *q);
>  void blk_mq_unfreeze_queue(struct request_queue *q);
>  void blk_freeze_queue_start(struct request_queue *q);
>  void blk_mq_freeze_queue_wait(struct request_queue *q);
> +void blk_mq_freeze_queue_wait_sync(struct request_queue *q);
>  int blk_mq_freeze_queue_wait_timeout(struct request_queue *q,
>  				     unsigned long timeout);
>  
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 2f7b43444c5f..93ed8b166d66 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -575,6 +575,8 @@ struct request_queue {
>  #define QUEUE_FLAG_HCTX_ACTIVE	28	/* at least one blk-mq hctx is active */
>  #define QUEUE_FLAG_NOWAIT       29	/* device supports NOWAIT */
>  #define QUEUE_FLAG_SQ_SCHED     30	/* single queue style io dispatch */
> +/* sync for q_usage_counter */
> +#define QUEUE_FLAG_USAGE_COUNT_SYNC    31

Why not put the comment a the end of the line like everything else in
this list?

And why not use tabs?

thanks,

greg k-h

  reply	other threads:[~2022-07-29 11:17 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-29 10:50 [PATCH -next] [RFC] block: fix null-deref in percpu_ref_put Zhang Wensheng
2022-07-29 11:16 ` Greg KH [this message]
2022-07-29 13:58 ` Ming Lei
2022-07-30  2:15   ` zhangwensheng (E)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YuPBpn0jIEy65T6P@kroah.com \
    --to=gregkh@linuxfoundation.org \
    --cc=axboe@kernel.dk \
    --cc=bpf@vger.kernel.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=yukuai3@huawei.com \
    --cc=zhangwensheng@huaweicloud.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.