public inbox for linux-block@vger.kernel.org
From: Damien Le Moal <dlemoal@kernel.org>
To: Ming Lei <ming.lei@redhat.com>, linux-block@vger.kernel.org
Cc: Yi Zhang <yi.zhang@redhat.com>,
	John Meneghini <jmeneghi@redhat.com>,
	linux-nvme@lists.infradead.org, hch@lst.de,
	Keith Busch <kbusch@kernel.org>
Subject: Re: [Report] blk-zoned/ZNS: non_power_of_2 of zone->len]
Date: Fri, 12 Jan 2024 12:05:45 +0900
Message-ID: <20503cd0-3a99-45bb-8374-40296a3cb92a@kernel.org>
In-Reply-To: <ZaCSOH7L+Nm6PvcN@fedora>

On 1/12/24 10:13, Ming Lei wrote:
> Hello Damien and Guys,
> 
> Yi reported that the following failure:
> 
> Oct 18 15:24:15 localhost kernel: nvme nvme4: invalid zone size:196608 for namespace:1
> Oct 18 15:24:33 localhost smartd[2303]: Device: /dev/nvme4, opened
> Oct 18 15:24:33 localhost smartd[2303]: Device: /dev/nvme4, NETAPPX4022S173A4T0NTZ, S/N:S66NNE0T800169, FW:MVP40B7B, 4.09 TB
> 
> Looks current blk-zoned requires zone->len to be power_of_2() since
> commit:
> 
> 6c6b35491422 ("block: set the zone size in blk_revalidate_disk_zones atomically")
> 
> And the original power_of_2() requirement is from the following commit
> for ZBC and ZAC.
> 
> d9dd73087a8b ("block: Enhance blk_revalidate_disk_zones()")
> 
> Meantime block layer does support non-power_of_2 chunk sectors limit.

That is not true. It does. See blk_stack_limits(), which has:

	/* Set non-power-of-2 compatible chunk_sectors boundary */
	if (b->chunk_sectors)
		t->chunk_sectors = gcd(t->chunk_sectors, b->chunk_sectors);

and the absence of any check on the value of chunk_sectors in
blk_queue_chunk_sectors().
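
To illustrate why the gcd() approach works for any chunk size, here is a
minimal userspace sketch. It is not the kernel code: gcd_u32() and
stack_chunk_sectors() are hypothetical names chosen for this example, and the
helper only mimics the quoted snippet under the assumption that a value of 0
means "no chunk_sectors limit".

```c
#include <stdint.h>

/* Plain Euclidean gcd; returns the nonzero argument if the other is 0. */
static uint32_t gcd_u32(uint32_t a, uint32_t b)
{
	while (b != 0) {
		uint32_t r = a % b;

		a = b;
		b = r;
	}
	return a;
}

/*
 * Hypothetical stand-in for the quoted stacking logic: combine the top
 * device limit with the bottom device limit. No power-of-2 assumption is
 * made anywhere, so any boundary value can be stacked.
 */
static uint32_t stack_chunk_sectors(uint32_t top, uint32_t bottom)
{
	if (bottom)
		top = gcd_u32(top, bottom);
	return top;
}
```

With these helpers, stacking 196608 (3 * 2^16) on top of 524288 (2^19) gives
65536, a boundary that divides both values.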

> The question is if there is such hard requirement for ZNS, and I can't see
> any such words in NVMe Zoned Namespace Command Set Specification.

No, there are no requirements in ZNS for the zone size to be a power of 2 number
of sectors/LBAs. The same is also true for ZBC and ZAC (SCSI and ATA) SMR HDDs.
The requirement for the zone size to be a power of 2 number of sectors is
entirely in the kernel. The reason is that zoned block device support started
with SMR HDDs, which all had a zone size of 256 MB (and still do), and no user
ever wanted anything other than that. So everything was coded with this
requirement, as that allowed many nice things like bit-shift/mask arithmetic for
conversions between zone number and sectors, etc. (and that of course is very
efficient).
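
As a rough illustration of that arithmetic, the sketch below contrasts the
shift/mask form with the division/modulo form that a non-power-of-2 zone size
would need. This is standalone userspace C, not kernel code: all function
names are hypothetical, and zone sizes are assumed to be expressed in 512-byte
sectors (so a 256 MB zone is 524288 sectors).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if n is a nonzero power of 2. */
static bool is_pow2(uint64_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* log2 of a power-of-2 value, i.e. the shift that replaces a division. */
static unsigned int log2_pow2(uint64_t n)
{
	unsigned int shift = 0;

	assert(is_pow2(n));
	while ((n >> shift) != 1)
		shift++;
	return shift;
}

/* Power-of-2 zone size: zone number is a shift, offset is a mask. */
static uint64_t zone_no_pow2(uint64_t sector, uint64_t zone_sectors)
{
	return sector >> log2_pow2(zone_sectors);
}

static uint64_t zone_offset_pow2(uint64_t sector, uint64_t zone_sectors)
{
	assert(is_pow2(zone_sectors));
	return sector & (zone_sectors - 1);
}

/* Arbitrary zone size: a 64-bit division and modulo are needed instead. */
static uint64_t zone_no_generic(uint64_t sector, uint64_t zone_sectors)
{
	return sector / zone_sectors;
}

static uint64_t zone_offset_generic(uint64_t sector, uint64_t zone_sectors)
{
	return sector % zone_sectors;
}
```

For the 196608 value from the log above, is_pow2() returns false, so only the
generic helpers apply. In real code the shift would be computed once per
device rather than on every conversion.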

> So is it one NVMe firmware issue? or blk-zoned problem with too strict(power_of_2)
> requirement on zone->len?

It is the latter. There was a session at LSF/MM last year about this. I recall
that the conclusion was that unless there is strong user demand for
non-power-of-2 zone sizes, we are not going to do anything about it, because
allowing non-power-of-2 zone sizes has some serious consequences all over the
place, including in FSes that natively support zoned devices. So relaxing that
requirement is not trivial.


-- 
Damien Le Moal
Western Digital Research

