All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kashyap Desai <kashyap.desai@broadcom.com>
To: Ming Lei <ming.lei@redhat.com>, luojiaxing <luojiaxing@huawei.com>
Cc: Jens Axboe <axboe@kernel.dk>,
	linux-block@vger.kernel.org, linux-scsi@vger.kernel.org,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	John Garry <john.garry@huawei.com>
Subject: RE: [PATCH] blk-mq: avoid to iterate over stale request
Date: Wed, 15 Dec 2021 00:11:13 +0530	[thread overview]
Message-ID: <0d8666c9983158a4954f30f6b429e797@mail.gmail.com> (raw)
In-Reply-To: <20210906065003.439019-1-ming.lei@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 3139 bytes --]

+ John Garry

> blk-mq can't run allocating driver tag and updating ->rqs[tag]
atomically,
> meantime blk-mq doesn't clear ->rqs[tag] after the driver tag is
released.
>
> So there is chance to iterating over one stale request just after the
tag is
> allocated and before updating ->rqs[tag].
>
> scsi_host_busy_iter() calls scsi_host_check_in_flight() to count scsi
in-flight
> requests after scsi host is blocked, so no new scsi command can be
marked as
> SCMD_STATE_INFLIGHT. However, driver tag allocation still can be run by
blk-
> mq core. One request is marked as SCMD_STATE_INFLIGHT, but this request
> may have been kept in another slot of ->rqs[], meantime the slot can be
> allocated out but ->rqs[] isn't updated yet. Then this in-flight request
is
> counted twice as SCMD_STATE_INFLIGHT. This way causes trouble in
handling
> scsi error.

Hi Ming,

We found similar issue on RHEL8.5 (kernel  does not have this patch in
discussion.). Issue reproduced on 5.15 kernel as well.
I understood this commit will fix specific race condition and avoid
reading incorrect host_busy value.
As per commit message - That incorrect host_busy will be just transient.
If we read after some delay, correct host_busy count will be available.
Right ?

In my case (I am using shared host tag enabled driver), it is not race
condition issue but stale rqs[] entries create permanent incorrect count
of host_busy.
Example - There are two pending IOs. This IOs are timed out. Bitmap of
pending IO is tag#5 (actually belongs to hctx0), tag#10 (actually belongs
to hctx1).  Note  - This is a shared bit map.
If hctx0 has same address of the request at 5th and 10th index, we will
count total 2 inflight commands instead of 1 from hctx0 context + From
hctx1 context, we will count 1 inflight command = Total is 3.
Even though we read after some delay, host_busy will be incorrect. We
expect host_busy = 2 but it will return 3.

This patch fix my issue explained above for shared host-tag case.  I am
confused reading the commit message. You may not have intentionally fix
the issue as I explained but indirectly it fixes my issue. Am I correct ?

What was an issue reported by Luojiaxiang ? I am interested to know if
issue reported by Luojiaxiang had shared host tagset enabled ?

Kashyap

>
> Fixes the issue by not iterating over stale request.
>
> Cc: linux-scsi@vger.kernel.org
> Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
> Reported-by: luojiaxing <luojiaxing@huawei.com>
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>  block/blk-mq-tag.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c index
> 86f87346232a..ff5caeb82542 100644
> --- a/block/blk-mq-tag.c
> +++ b/block/blk-mq-tag.c
> @@ -208,7 +208,7 @@ static struct request
> *blk_mq_find_and_get_req(struct blk_mq_tags *tags,
>
>  	spin_lock_irqsave(&tags->lock, flags);
>  	rq = tags->rqs[bitnr];
> -	if (!rq || !refcount_inc_not_zero(&rq->ref))
> +	if (!rq || rq->tag != bitnr || !refcount_inc_not_zero(&rq->ref))
>  		rq = NULL;
>  	spin_unlock_irqrestore(&tags->lock, flags);
>  	return rq;
> --
> 2.31.1

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4212 bytes --]

  parent reply	other threads:[~2021-12-14 18:41 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-06  6:50 [PATCH] blk-mq: avoid to iterate over stale request Ming Lei
2021-09-06 22:44 ` Bart Van Assche
2021-09-07  1:14   ` Ming Lei
2021-09-13  1:26 ` Ming Lei
2021-09-13  1:31   ` Jens Axboe
2021-12-14 18:41 ` Kashyap Desai [this message]
2021-12-15  3:36   ` Ming Lei
2021-12-15  7:30     ` Kashyap Desai
2021-12-15  8:02       ` Ming Lei
2021-12-15  8:45         ` Kashyap Desai
2021-12-15 12:56           ` Ming Lei
2021-12-16 11:55             ` Kashyap Desai
2021-12-16 12:50               ` Ming Lei

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0d8666c9983158a4954f30f6b429e797@mail.gmail.com \
    --to=kashyap.desai@broadcom.com \
    --cc=axboe@kernel.dk \
    --cc=john.garry@huawei.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=luojiaxing@huawei.com \
    --cc=martin.petersen@oracle.com \
    --cc=ming.lei@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.