From: Ming Lei <ming.lei@redhat.com>
To: John Garry <john.garry@huawei.com>
Cc: axboe@kernel.dk, linux-kernel@vger.kernel.org,
linux-block@vger.kernel.org, hch@lst.de,
Bart Van Assche <bvanassche@acm.org>
Subject: Re: [PATCH] blk-mq: Properly init bios from blk_mq_alloc_request_hctx()
Date: Tue, 25 Oct 2022 08:34:47 +0800 [thread overview]
Message-ID: <Y1cvJ4/uwUScAQq4@T590> (raw)
In-Reply-To: <360c78dc-65ce-362f-389d-075f2259ce5b@huawei.com>
On Mon, Oct 24, 2022 at 05:56:15PM +0100, John Garry wrote:
> On 24/10/2022 14:27, Ming Lei wrote:
> > > > > - rq->bio = rq->biotail = NULL;
> > > > This patch looks not good, why do you switch to initialize the three fields
> > > > twice in fast path?
> > > Can you please show me how these are initialized twice?
> > blk_mq_bio_to_request() is one which setup these fields, then you add
> > another one in blk_mq_rq_ctx_init().
>
> ok, understood.
>
> >
> > > If there is a real concern with this then we go with my original idea, which
> > > was to copy the init method of blk_mq_alloc_request() (in
> > > blk_mq_alloc_request_hctx())
> > >
> > > > BTW, we know blk_mq_alloc_request_hctx() has big trouble, so please
> > > > avoid to extend it to other use cases.
> > > Yeah, I know this,
> > Did you know the exact issue on nvme-tcp, nvme-rdma or nvme-fc maybe
> > with blk_mq_alloc_request_hctx()?
>
> I thought that the original issue was an OoO bounds issue, fixed in
> 14dc7a18. Now there is still some issue in the following link, which is
> still unresolved as I understand:
> https://lore.kernel.org/linux-block/5bd886f1-a7c6-b765-da29-777be0328bc2@grimberg.me/#t
>
> But I think that 14dc7a18 may still leave undesirable scenario:
> - all cpus in HW queue cpumask may go offline after cpu_online_mask read in
> blk_mq_alloc_request_hctx() and before we get the driver tag and set
> rq->hctx
Yeah.
>
> >
> > > but sometimes we just need to allocate for a specific HW
> > > queue...
> > >
> > > For my usecase of interest, it should not impact if the cpumask of the HW
> > > queue goes offline after selecting the cpu in blk_mq_alloc_request_hctx(),
> > > so any race is ok ... I think.
> > >
> > > However it should be still possible to make blk_mq_alloc_request_hctx() more
> > > robust. How about using something like work_on_cpu_safe() to allocate and
> > > execute the request with blk_mq_alloc_request() on a cpu associated with the
> > > HW queue, such that we know the cpu is online and stays online until we
> > > execute it? Or also extent to work_on_cpumask_safe() variant, so that we
> > > don't need to try all cpus in the mask (to see if online)?
> > But all cpus on this hctx->cpumask could become offline.
>
> If all hctx->cpumask are offline then we should not allocate a request and
> this is acceptable. Maybe I am missing your point.
As you saw, this API has the above problem too, but any one of CPUs
may become online later, maybe just during blk_mq_alloc_request_hctx(),
and it is easy to cause inconsistence.
You didn't share your use case, but for nvme connection request, if it
is 1:1 mapping, if any one of CPU becomes offline, the controller
initialization could be failed, that isn't good from user viewpoint at
all.
Thanks,
Ming
next prev parent reply other threads:[~2022-10-25 1:17 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-10-22 16:07 [PATCH] blk-mq: Properly init bios from blk_mq_alloc_request_hctx() John Garry
2022-10-23 13:12 ` Ming Lei
2022-10-24 10:56 ` John Garry
2022-10-24 13:27 ` Ming Lei
2022-10-24 16:56 ` John Garry
2022-10-25 0:34 ` Ming Lei [this message]
2022-10-25 7:40 ` John Garry
2022-10-25 9:00 ` Ming Lei
2022-10-25 9:08 ` John Garry
2022-10-25 9:16 ` Ming Lei
2022-10-25 9:32 ` John Garry
2022-10-25 11:21 ` Ming Lei
2022-10-25 11:36 ` John Garry
2022-10-25 12:33 ` Christoph Hellwig
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Y1cvJ4/uwUScAQq4@T590 \
--to=ming.lei@redhat.com \
--cc=axboe@kernel.dk \
--cc=bvanassche@acm.org \
--cc=hch@lst.de \
--cc=john.garry@huawei.com \
--cc=linux-block@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.