linux-block.vger.kernel.org archive mirror
* [bug report] nvme4: inconsistent AWUPF, controller not added (0/7).
@ 2025-07-02 11:13 Yi Zhang
  2025-07-02 16:33 ` alan.adamson
  0 siblings, 1 reply; 6+ messages in thread
From: Yi Zhang @ 2025-07-02 11:13 UTC (permalink / raw)
  To: linux-block, open list:NVM EXPRESS DRIVER; +Cc: Christoph Hellwig, John Garry

Hi Christoph

I found this failure on one Samsung NVMe disk[1] with the latest
linux-block/for-next. Here is the reproducer and dmesg log.
Please help check it and let me know if you need any info/test. Thanks.

[1]
SAMSUNG MZQL2960HCJR-00A07 (PM9A3)
[2]
+ nvme format -l1 -f /dev/nvme4n1
Success formatting namespace:1
+ nvme reset /dev/nvme4
Reset: Network dropped connection on reset

dmesg:
[  751.872864] nvme nvme4: rescanning namespaces.
[  752.177475] nvme nvme4: resetting controller
[  752.221030] nvme nvme4: inconsistent AWUPF, controller not added (0/7).
[  752.227653] nvme nvme4: Disabling device after reset failure: -22

-- 
Best Regards,
  Yi Zhang



* Re: [bug report] nvme4: inconsistent AWUPF, controller not added (0/7).
  2025-07-02 11:13 [bug report] nvme4: inconsistent AWUPF, controller not added (0/7) Yi Zhang
@ 2025-07-02 16:33 ` alan.adamson
  2025-07-03  8:03   ` Christoph Hellwig
  0 siblings, 1 reply; 6+ messages in thread
From: alan.adamson @ 2025-07-02 16:33 UTC (permalink / raw)
  To: Yi Zhang, linux-block, open list:NVM EXPRESS DRIVER
  Cc: Christoph Hellwig, John Garry


On 7/2/25 4:13 AM, Yi Zhang wrote:
> Hi Christoph
>
> I found this failure on one Samsung NVMe disk[1] with the latest
> linux-block/for-next. Here is the reproducer and dmesg log.
> Please help check it and let me know if you need any info/test. Thanks.
>
> [1]
> SAMSUNG MZQL2960HCJR-00A07 (PM9A3)
> [2]
> + nvme format -l1 -f /dev/nvme4n1
> Success formatting namespace:1
> + nvme reset /dev/nvme4
> Reset: Network dropped connection on reset
>
> dmesg:
> [  751.872864] nvme nvme4: rescanning namespaces.
> [  752.177475] nvme nvme4: resetting controller
> [  752.221030] nvme nvme4: inconsistent AWUPF, controller not added (0/7).
> [  752.227653] nvme nvme4: Disabling device after reset failure: -22
>
Looks like the device isn't reporting AWUPF after the format/reset.

Can you try:

nvme id-ctrl /dev/nvme4 | grep awupf
nvme id-ns  /dev/nvme4n1 | grep nawupf
nvme format -l1 -f /dev/nvme4n1
nvme id-ctrl /dev/nvme4 | grep awupf
nvme id-ns  /dev/nvme4n1 | grep nawupf
nvme reset /dev/nvme4
nvme id-ctrl /dev/nvme4 | grep awupf
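If it helps when scripting the sequence above, here is a minimal sketch of a helper that pulls out just the number. `field_value` is a hypothetical name, not part of nvme-cli, and it assumes the plain-text `name : value` layout that `nvme id-ctrl` and `nvme id-ns` print. It matches the whole field name, so `awupf` does not also match `nawupf`.

```shell
#!/bin/sh
# Hypothetical helper (not part of nvme-cli): read `nvme id-ctrl` or
# `nvme id-ns` text on stdin and print the value of one named field.
# Assumes lines of the form "name   : value".
field_value() {
    awk -v name="$1" -F ':' '
        # Strip blanks from the field name so it can be compared exactly.
        { key = $1; gsub(/[ \t]/, "", key) }
        key == name { val = $2; gsub(/[ \t]/, "", val); print val; exit }
    '
}
```

Possible usage, with the device path taken from the report: `nvme id-ctrl /dev/nvme4 | field_value awupf`.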


Alan



* Re: [bug report] nvme4: inconsistent AWUPF, controller not added (0/7).
  2025-07-02 16:33 ` alan.adamson
@ 2025-07-03  8:03   ` Christoph Hellwig
  2025-07-03 17:47     ` Yi Zhang
  0 siblings, 1 reply; 6+ messages in thread
From: Christoph Hellwig @ 2025-07-03  8:03 UTC (permalink / raw)
  To: alan.adamson
  Cc: Yi Zhang, linux-block, open list:NVM EXPRESS DRIVER,
	Christoph Hellwig, John Garry

On Wed, Jul 02, 2025 at 09:33:32AM -0700, alan.adamson@oracle.com wrote:
> Looks like the device isn't reporting AWUPF after the format/reset.

The other option would be that the format changed the value.

The mess NVMe created with the totally un-thought-out atomics is
beyond belief :(

I wonder if we should just back out the whole thing and wait for the
working group to come up with something that can actually safely work.



* Re: [bug report] nvme4: inconsistent AWUPF, controller not added (0/7).
  2025-07-03  8:03   ` Christoph Hellwig
@ 2025-07-03 17:47     ` Yi Zhang
  2025-07-04  2:17       ` Ming Lei
  0 siblings, 1 reply; 6+ messages in thread
From: Yi Zhang @ 2025-07-03 17:47 UTC (permalink / raw)
  To: Christoph Hellwig, alan.adamson
  Cc: linux-block, open list:NVM EXPRESS DRIVER, John Garry, Ming Lei,
	Maurizio Lombardi

On Thu, Jul 3, 2025 at 4:04 PM Christoph Hellwig <hch@lst.de> wrote:
>
> On Wed, Jul 02, 2025 at 09:33:32AM -0700, alan.adamson@oracle.com wrote:
> > Looks like the device isn't reporting AWUPF after the format/reset.
>
> The other option would be that the format changed the value.
>
> The mess NVMe created with the totally un-thought-out atomics is
> beyond belief :(
>
> I wonder if we should just back out the whole thing and wait for the
> working group to come up with something that can actually safely work.
>

Yeah, the format operation changes the awupf value.
Here are the logs of a passing [1] and a failing [2] reset:
[1]
+ nvme format -l0 -f /dev/nvme3n1
Success formatting namespace:1
+ nvme id-ctrl /dev/nvme3
+ grep awupf
awupf     : 7
+ grep nawupf
+ nvme id-ns /dev/nvme3n1
nawupf  : 7
+ nvme format -l1 -f /dev/nvme3n1
Success formatting namespace:1
+ nvme id-ctrl /dev/nvme3
+ grep awupf
awupf     : 0
+ nvme id-ns /dev/nvme3n1
+ grep nawupf
nawupf  : 0
+ nvme reset /dev/nvme3
+ nvme id-ctrl /dev/nvme3
+ grep awupf
awupf     : 0

[2]
+ nvme format -l0 -f /dev/nvme5n1
Success formatting namespace:1
+ nvme id-ctrl /dev/nvme5
+ grep awupf
awupf     : 7
+ nvme id-ns /dev/nvme5n1
+ grep nawupf
nawupf  : 7
+ nvme format -l1 -f /dev/nvme5n1
Success formatting namespace:1
+ nvme id-ctrl /dev/nvme5
+ grep awupf
awupf     : 0
+ nvme id-ns /dev/nvme5n1
+ grep nawupf
nawupf  : 0
+ nvme reset /dev/nvme5
Reset: Network dropped connection on reset
# dmesg | tail -5
[  597.973393] nvme nvme5: rescanning namespaces.
[  598.292285] nvme nvme5: rescanning namespaces.
[  598.584937] nvme nvme5: resetting controller
[  598.626440] nvme nvme5: inconsistent AWUPF, controller not added (0/7).
[  598.633064] nvme nvme5: Disabling device after reset failure: -22
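One way to read the "(0/7)" in the failing log is "0 reported now, 7 seen before". The sketch below models that reading only. It is an assumption about what the driver compares, not the kernel code; `check_awupf` is a hypothetical name, and the return value 22 merely stands in for the -22 in the dmesg line. It does not attempt to explain why the same sequence passed on nvme3.

```shell
#!/bin/sh
# Hypothetical model of the consistency check (NOT the kernel source):
# compare the AWUPF value reported after the reset with the value that
# was remembered earlier, and fail if they differ.
check_awupf() {
    new=$1      # value reported after the reset (assumed first number)
    cached=$2   # value remembered from before (assumed second number)
    if [ "$new" -ne "$cached" ]; then
        echo "inconsistent AWUPF, controller not added ($new/$cached)."
        return 22   # stands in for the -22 shown in dmesg
    fi
    echo "AWUPF consistent ($new)"
}

# Values from the failing log: 7 after the -l0 format, 0 after -l1.
check_awupf 0 7 || echo "reset would fail with -$?"
```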


-- 
Best Regards,
  Yi Zhang



* Re: [bug report] nvme4: inconsistent AWUPF, controller not added (0/7).
  2025-07-03 17:47     ` Yi Zhang
@ 2025-07-04  2:17       ` Ming Lei
  2025-07-07  5:37         ` Christoph Hellwig
  0 siblings, 1 reply; 6+ messages in thread
From: Ming Lei @ 2025-07-04  2:17 UTC (permalink / raw)
  To: Yi Zhang
  Cc: Christoph Hellwig, alan.adamson, linux-block,
	open list:NVM EXPRESS DRIVER, John Garry, Maurizio Lombardi

On Fri, Jul 04, 2025 at 01:47:50AM +0800, Yi Zhang wrote:
> On Thu, Jul 3, 2025 at 4:04 PM Christoph Hellwig <hch@lst.de> wrote:
> >
> > On Wed, Jul 02, 2025 at 09:33:32AM -0700, alan.adamson@oracle.com wrote:
> > > Looks like the device isn't reporting AWUPF after the format/reset.
> >
> > The other option would be that the format changed the value.
> >
> > The mess NVMe created with the totally un-thought-out atomics is
> > beyond belief :(
> >
> > I wonder if we should just back out the whole thing and wait for the
> > working group to come up with something that can actually safely work.
> >
> 
> Yeah, the format operation changes the awupf value.
> Here are the logs of a passing [1] and a failing [2] reset:
> [1]
> + nvme format -l0 -f /dev/nvme3n1
> Success formatting namespace:1
> + nvme id-ctrl /dev/nvme3
> + grep awupf
> awupf     : 7
> + grep nawupf
> + nvme id-ns /dev/nvme3n1
> nawupf  : 7
> + nvme format -l1 -f /dev/nvme3n1
> Success formatting namespace:1
> + nvme id-ctrl /dev/nvme3
> + grep awupf
> awupf     : 0
> + nvme id-ns /dev/nvme3n1
> + grep nawupf
> nawupf  : 0
> + nvme reset /dev/nvme3
> + nvme id-ctrl /dev/nvme3
> + grep awupf
> awupf     : 0
> 
> [2]
> + nvme format -l0 -f /dev/nvme5n1
> Success formatting namespace:1
> + nvme id-ctrl /dev/nvme5
> + grep awupf
> awupf     : 7
> + nvme id-ns /dev/nvme5n1
> + grep nawupf
> nawupf  : 7
> + nvme format -l1 -f /dev/nvme5n1
> Success formatting namespace:1
> + nvme id-ctrl /dev/nvme5
> + grep awupf
> awupf     : 0

Per the NVMe spec, AWUPF is expressed in logical blocks, and 'nvme format'
changes the logical block size. So can the AWUPF value returned by the
`Identify command` change because the controller implements a fixed-length
atomic write size in bytes (512 * 8 = 4096 * 1)?
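The arithmetic can be sketched as below. AWUPF is a 0's based count of logical blocks, so the size in bytes is (AWUPF + 1) * logical block size. `awupf_bytes` is a hypothetical helper name, and the sketch assumes that -l0 selects 512-byte blocks and -l1 selects 4096-byte blocks on this drive, as the numbers above imply.

```shell
#!/bin/sh
# AWUPF is a 0's based number of logical blocks, so the atomic write
# size in bytes is (AWUPF + 1) * logical block size.
awupf_bytes() {
    awupf=$1      # raw AWUPF value from Identify Controller
    lba_size=$2   # logical block size in bytes
    echo $(( (awupf + 1) * lba_size ))
}

awupf_bytes 7 512    # awupf = 7 with 512-byte blocks (assumed -l0)
awupf_bytes 0 4096   # awupf = 0 with 4096-byte blocks (assumed -l1)
```

Both calls give the same byte count, which would fit a controller that keeps a fixed atomic write size in bytes and rescales the reported value when the block size changes.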


Thanks,
Ming



* Re: [bug report] nvme4: inconsistent AWUPF, controller not added (0/7).
  2025-07-04  2:17       ` Ming Lei
@ 2025-07-07  5:37         ` Christoph Hellwig
  0 siblings, 0 replies; 6+ messages in thread
From: Christoph Hellwig @ 2025-07-07  5:37 UTC (permalink / raw)
  To: Ming Lei
  Cc: Yi Zhang, Christoph Hellwig, alan.adamson, linux-block,
	open list:NVM EXPRESS DRIVER, John Garry, Maurizio Lombardi

On Fri, Jul 04, 2025 at 10:17:38AM +0800, Ming Lei wrote:
> Per the NVMe spec, AWUPF is expressed in logical blocks, and 'nvme format'
> changes the logical block size. So can the AWUPF value returned by the
> `Identify command` change because the controller implements a fixed-length
> atomic write size in bytes (512 * 8 = 4096 * 1)?
> 

Yes.  And that's an issue because NVMe doesn't have a controller-level
concept of a logical block size; the logical block size is per-namespace.

Or in other words, AWUPF is a mess.
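
To illustrate the problem: one controller-level AWUPF value maps to different byte sizes depending on which namespace's block size is applied to it. The sketch below is only arithmetic under the 0's based definition of AWUPF; `awupf_per_namespace` is a hypothetical name, and the two block sizes are example values, not a configuration taken from the report.

```shell
#!/bin/sh
# Print the atomic write size in bytes that a single AWUPF value implies
# for each namespace logical block size passed after it.
awupf_per_namespace() {
    awupf=$1
    shift
    for lba_size in "$@"; do
        echo "lba=$lba_size bytes=$(( (awupf + 1) * lba_size ))"
    done
}

# One controller value, two example namespaces with different block sizes.
awupf_per_namespace 7 512 4096
```

With the same AWUPF, the 512-byte namespace and the 4096-byte namespace end up with different byte sizes, so the controller-level field alone does not pin down one atomic write size.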


