* Re: [PATCH] block: bio-integrity: fix potential null-ptr-deref in bio_integrity_free
[not found] ` <66619EB6.4040002@huaweicloud.com>
@ 2024-06-06 14:30 ` Ming Lei
2024-06-06 14:52 ` Christoph Hellwig
0 siblings, 1 reply; 5+ messages in thread
From: Ming Lei @ 2024-06-06 14:30 UTC (permalink / raw)
To: yebin
Cc: Christoph Hellwig, axboe, linux-block, linux-kernel, Ye Bin,
Zhang Yi, Ewan D. Milne, linux-nvme
On Thu, Jun 06, 2024 at 07:34:14PM +0800, yebin wrote:
>
>
> On 2024/6/6 14:44, Christoph Hellwig wrote:
> > What kernel is this on? As of Linux 6.9 we are now always freezing
> v4.18
> > the queue while updating the logical_block_size in the nvme driver,
> > so there should be no inflight I/O while it is changing.
> >
> The root cause of the problem is that there is no concurrency protection
> between
> issuing DIO checks in __ blkdev direct IO simple() and updating logical
> block sizes ,
> resulting in the block layer being able to see DIOs that are not aligned
> with logical
> blocks.
Yeah, that is one area queue freezing can't cover logical block size
change, but I'd suggest to put the logical bs check into submit_bio() or
slow path of __bio_queue_enter() at least.
BTW, Yi has one reproducer, and slab is corrupted just like this report
when running 'nvme format' & IO on partitions.
I am not sure if this kind of change can avoid the issue completely, anyway
Yi and I can test it and see if the kind of change works.
My concern is that nvme format is started without draining IO, and
IO can be submitted to hardware when nvme FW is handling formatting.
I am not sure if nvme FW can deal with this situation correctly.
Ewan suggested to run 'nvme format' with exclusive nvme disk open, which
needs nvme-cli change.
Thanks,
Ming
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] block: bio-integrity: fix potential null-ptr-deref in bio_integrity_free
2024-06-06 14:30 ` [PATCH] block: bio-integrity: fix potential null-ptr-deref in bio_integrity_free Ming Lei
@ 2024-06-06 14:52 ` Christoph Hellwig
2024-06-06 15:48 ` Keith Busch
2024-06-06 23:46 ` Ming Lei
0 siblings, 2 replies; 5+ messages in thread
From: Christoph Hellwig @ 2024-06-06 14:52 UTC (permalink / raw)
To: Ming Lei
Cc: yebin, Christoph Hellwig, axboe, linux-block, linux-kernel,
Ye Bin, Zhang Yi, Ewan D. Milne, linux-nvme
On Thu, Jun 06, 2024 at 10:30:06PM +0800, Ming Lei wrote:
> Yeah, that is one area queue freezing can't cover logical block size
> change, but I'd suggest to put the logical bs check into submit_bio() or
> slow path of __bio_queue_enter() at least.
We really need an alignment check in submit_bio anyway, so doing it
under the freeze protection would also help with this.
> My concern is that nvme format is started without draining IO, and
> IO can be submitted to hardware when nvme FW is handling formatting.
> I am not sure if nvme FW can deal with this situation correctly.
> Ewan suggested to run 'nvme format' with exclusive nvme disk open, which
> needs nvme-cli change.
.. and doesn't protect against someone using a different tool anyway.
That beeing said, nvme_passthru_start actually freezes all queues
based on the commands supported an affects log, and
nvme_init_known_nvm_effects should force this even for controllers
not supporting the log or reporting bogus information. So in general
the queue should be frozen during the actual format.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] block: bio-integrity: fix potential null-ptr-deref in bio_integrity_free
2024-06-06 14:52 ` Christoph Hellwig
@ 2024-06-06 15:48 ` Keith Busch
2024-06-06 23:46 ` Ming Lei
1 sibling, 0 replies; 5+ messages in thread
From: Keith Busch @ 2024-06-06 15:48 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Ming Lei, yebin, axboe, linux-block, linux-kernel, Ye Bin,
Zhang Yi, Ewan D. Milne, linux-nvme
On Thu, Jun 06, 2024 at 07:52:51AM -0700, Christoph Hellwig wrote:
> On Thu, Jun 06, 2024 at 10:30:06PM +0800, Ming Lei wrote:
> > Yeah, that is one area queue freezing can't cover logical block size
> > change, but I'd suggest to put the logical bs check into submit_bio() or
> > slow path of __bio_queue_enter() at least.
>
> We really need an alignment check in submit_bio anyway, so doing it
> under the freeze protection would also help with this.
>
> > My concern is that nvme format is started without draining IO, and
> > IO can be submitted to hardware when nvme FW is handling formatting.
> > I am not sure if nvme FW can deal with this situation correctly.
> > Ewan suggested to run 'nvme format' with exclusive nvme disk open, which
> > needs nvme-cli change.
>
> .. and doesn't protect against someone using a different tool anyway.
It also doesn't work if format is sent from a different host. If you
have a shared namespace, anyone could change the format without
coordinating with anyone else.
That's just an unfixable gap with the current spec. The driver tries to
react to the Namespace Notice AEN for this, but that can be too late or
not even enabled.
> That beeing said, nvme_passthru_start actually freezes all queues
> based on the commands supported an affects log, and
> nvme_init_known_nvm_effects should force this even for controllers
> not supporting the log or reporting bogus information. So in general
> the queue should be frozen during the actual format.
Just for single host consideration, the our current nvme freeze does
ensure IO is drained before dispatching the format, and should work for
most format changes.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] block: bio-integrity: fix potential null-ptr-deref in bio_integrity_free
2024-06-06 14:52 ` Christoph Hellwig
2024-06-06 15:48 ` Keith Busch
@ 2024-06-06 23:46 ` Ming Lei
1 sibling, 0 replies; 5+ messages in thread
From: Ming Lei @ 2024-06-06 23:46 UTC (permalink / raw)
To: Christoph Hellwig
Cc: yebin, axboe, linux-block, linux-kernel, Ye Bin, Zhang Yi,
Ewan D. Milne, linux-nvme
On Thu, Jun 06, 2024 at 07:52:51AM -0700, Christoph Hellwig wrote:
> On Thu, Jun 06, 2024 at 10:30:06PM +0800, Ming Lei wrote:
> > Yeah, that is one area queue freezing can't cover logical block size
> > change, but I'd suggest to put the logical bs check into submit_bio() or
> > slow path of __bio_queue_enter() at least.
>
> We really need an alignment check in submit_bio anyway, so doing it
> under the freeze protection would also help with this.
>
> > My concern is that nvme format is started without draining IO, and
> > IO can be submitted to hardware when nvme FW is handling formatting.
> > I am not sure if nvme FW can deal with this situation correctly.
> > Ewan suggested to run 'nvme format' with exclusive nvme disk open, which
> > needs nvme-cli change.
>
> .. and doesn't protect against someone using a different tool anyway.
>
> That beeing said, nvme_passthru_start actually freezes all queues
> based on the commands supported an affects log, and
> nvme_init_known_nvm_effects should force this even for controllers
> not supporting the log or reporting bogus information. So in general
> the queue should be frozen during the actual format.
That is something I missed, thanks for sharing the story.
Thanks,
Ming
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] block: bio-integrity: fix potential null-ptr-deref in bio_integrity_free
[not found] <20240606062655.2185006-1-yebin@huaweicloud.com>
[not found] ` <ZmFatW3BEzTPgR7S@infradead.org>
@ 2024-06-11 9:15 ` Markus Elfring
1 sibling, 0 replies; 5+ messages in thread
From: Markus Elfring @ 2024-06-11 9:15 UTC (permalink / raw)
To: Ye Bin, linux-block, linux-nvme, Jens Axboe
Cc: LKML, Christoph Hellwig, Ming Lei
…
> The root cause of this issue is the concurrency between the write process
> and the block size update process. However, this concurrency does not exist
> in actual production environments. To solve above issue, Verify if the
> segments of BIO are aligned with integrity intervals.
* Please improve the change description with an imperative wording.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?h=v6.10-rc2#n94
* Would you like to add the tag “Fixes” accordingly?
* How do you think about to use the summary phrase “Avoid null pointer dereference
in bio_integrity_free()”?
Regards,
Markus
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2024-06-11 9:20 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20240606062655.2185006-1-yebin@huaweicloud.com>
[not found] ` <ZmFatW3BEzTPgR7S@infradead.org>
[not found] ` <66619EB6.4040002@huaweicloud.com>
2024-06-06 14:30 ` [PATCH] block: bio-integrity: fix potential null-ptr-deref in bio_integrity_free Ming Lei
2024-06-06 14:52 ` Christoph Hellwig
2024-06-06 15:48 ` Keith Busch
2024-06-06 23:46 ` Ming Lei
2024-06-11 9:15 ` Markus Elfring
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox