From: Ming Lei <ming.lei@redhat.com>
To: Bart Van Assche <bvanassche@acm.org>
Cc: Jens Axboe <axboe@kernel.dk>,
linux-block@vger.kernel.org, Christoph Hellwig <hch@lst.de>,
Daniel Wagner <dwagner@suse.de>,
Khazhismel Kumykov <khazhy@google.com>,
Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>,
"Martin K . Petersen" <martin.petersen@oracle.com>,
Hannes Reinecke <hare@suse.de>,
Johannes Thumshirn <johannes.thumshirn@wdc.com>,
John Garry <john.garry@huawei.com>,
linux-scsi@vger.kernel.org
Subject: Re: [PATCH v7 3/5] blk-mq: Fix races between iterating over requests and freeing requests
Date: Sun, 25 Apr 2021 08:09:17 +0800 [thread overview]
Message-ID: <YISzLal7Ur7jyuiy@T590> (raw)
In-Reply-To: <28607d75-042f-7a6a-f5d0-2ee03754917e@acm.org>
On Fri, Apr 23, 2021 at 10:52:43AM -0700, Bart Van Assche wrote:
> On 4/22/21 8:52 PM, Ming Lei wrote:
> > For example, scsi aacraid normal completion vs. reset together with elevator
> > switch, aacraid is one single queue HBA, and the request will be completed
> > via IPI or softirq asynchronously, that said request isn't really completed
> > after blk_mq_complete_request() returns.
> >
> > 1) interrupt comes, and request A is completed via blk_mq_complete_request()
> > from aacraid's interrupt handler via ->scsi_done()
> >
> > 2) _aac_reset_adapter() comes because of reset event which can be
> > triggered by sysfs store or whatever, irq is drained in
> > _aac_reset_adpter(), so blk_mq_complete_request(request A) from aacraid irq
> > context is done, but request A is just scheduled to be completed via IPI
> > or softirq asynchronously, not really done yet.
> >
> > 3) scsi_host_complete_all_commands() is called from _aac_reset_adapter() for
> > failing all pending requests. request A is still visible in
> > scsi_host_complete_all_commands, because its tag isn't freed yet. But the
> > tag & request A can be completed & freed exactly after scsi_host_complete_all_commands()
> > reads ->rqs[bitnr] in bt_tags_iter(), which calls complete_all_cmds_iter()
> > -> .scsi_done() -> blk_mq_complete_request(), and same request A is scheduled via
> > IPI or softirq, and request A is addded in ipi or softirq list.
> >
> > 4) meantime request A is freed from normal completion triggered by interrupt, one
> > pending elevator switch can move on since request A drops the last reference; and
> > bt_tags_iter() returns from reset path, so blk_mq_wait_for_tag_iter() can return
> > too, then the whole scheduler request pool is freed now.
> >
> > 5) request A in ipi/softirq list scheduled from _aac_reset_adapter is read , UAF
> > is triggered.
> >
> > It is supposed that driver covers normal completion vs. error handling, but wrt.
> > remove completion, not sure driver is capable of covering that.
>
> Hi Ming,
>
> I agree that the scenario above can happen and also that a fix is
> needed. However, I do not agree that this should be fixed by modifying
> the tag iteration functions. I see scsi_host_complete_all_commands() as
> a workaround for broken storage controllers - storage controllers that
> do not have a facility for terminating all pending commands. NVMe
> controllers can be told to terminate all pending I/O commands by e.g.
> deleting all I/O submission queues. Many SCSI controllers also provide a
> facility for aborting all pending commands. For SCSI controllers that do
> not provide such a facility I propose to fix races like the race
> described above in the SCSI LLD instead of modifying the block layer tag
> iteration functions.
Hi Bart,
Terminating all pending commands can't avoid the issue wrt. request UAF,
so far blk_mq_tagset_wait_completed_request() is used for making sure
that all pending requests are really aborted.
However, blk_mq_wait_for_tag_iter() still may return before
blk_mq_wait_for_tag_iter() is done because blk_mq_wait_for_tag_iter()
supposes all request reference is just done inside bt_tags_iter(),
especially .iter_rwsem and read rcu lock is added in bt_tags_iter().
What I really meant is that .iter_rwsem/read rcu added or blk_mq_wait_for_tag_iter
can't do what is expected.
Thanks,
Ming
next prev parent reply other threads:[~2021-04-25 0:09 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20210421000235.2028-1-bvanassche@acm.org>
[not found] ` <20210421000235.2028-4-bvanassche@acm.org>
[not found] ` <YIDqa6YkNoD5OiKN@T590>
[not found] ` <b717ffc0-a434-738f-9c63-32901bd164b2@acm.org>
[not found] ` <YIEiElb9wxReV/oL@T590>
[not found] ` <32a121b7-2444-ac19-420d-4961f2a18129@acm.org>
2021-04-23 3:52 ` [PATCH v7 3/5] blk-mq: Fix races between iterating over requests and freeing requests Ming Lei
2021-04-23 17:52 ` Bart Van Assche
2021-04-25 0:09 ` Ming Lei [this message]
2021-04-25 21:01 ` Bart Van Assche
2021-04-26 0:55 ` Ming Lei
2021-04-26 16:29 ` Bart Van Assche
2021-04-27 0:11 ` Ming Lei
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YISzLal7Ur7jyuiy@T590 \
--to=ming.lei@redhat.com \
--cc=axboe@kernel.dk \
--cc=bvanassche@acm.org \
--cc=dwagner@suse.de \
--cc=hare@suse.de \
--cc=hch@lst.de \
--cc=johannes.thumshirn@wdc.com \
--cc=john.garry@huawei.com \
--cc=khazhy@google.com \
--cc=linux-block@vger.kernel.org \
--cc=linux-scsi@vger.kernel.org \
--cc=martin.petersen@oracle.com \
--cc=shinichiro.kawasaki@wdc.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox