public inbox for linux-nvme@lists.infradead.org
From: Damien Le Moal <damien.lemoal@opensource.wdc.com>
To: "Javier González" <javier@javigon.com>,
	"Matias Bjørling" <Matias.Bjorling@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>,
	Luis Chamberlain <mcgrof@kernel.org>,
	Keith Busch <kbusch@kernel.org>,
	Pankaj Raghav <p.raghav@samsung.com>,
	Adam Manzanares <a.manzanares@samsung.com>,
	"jiangbo.365@bytedance.com" <jiangbo.365@bytedance.com>,
	kanchan Joshi <joshi.k@samsung.com>, Jens Axboe <axboe@kernel.dk>,
	Sagi Grimberg <sagi@grimberg.me>,
	Pankaj Raghav <pankydev8@gmail.com>,
	Kanchan Joshi <joshiiitr@gmail.com>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>
Subject: Re: [PATCH 0/6] power_of_2 emulation support for NVMe ZNS devices
Date: Wed, 16 Mar 2022 09:00:27 +0900
Message-ID: <fcb4f608-970c-56d3-fe3d-b344fab8baf7@opensource.wdc.com>
In-Reply-To: <20220315130501.q7fjpqzutadadfu3@ArmHalley.localdomain>

On 3/15/22 22:05, Javier González wrote:
>>> The main constraint for (1) PO2 is removed in the block layer, we
>>> have (2) Linux hosts stating that unmapped LBAs are a problem,
>>> and we have (3) HW supporting size=capacity.
>>> 
>>> I would be happy to hear what else you would like to see for this
>>> to be of use to the kernel community.
>> 
>> (Added numbers to your paragraph above)
>> 
>> 1. The sysfs chunksize attribute was "misused" to also represent
>> zone size. What has changed is that RAID controllers now can use a
>> NPO2 chunk size. This wasn't meant to naturally extend to zones,
>> which as shown in the current posted patchset, is a lot more work.
> 
> True. But this was the main constraint for PO2.

And as I said, users asked for it.

>> 2. Bo mentioned that the software already manages holes. It took a
>> bit of time to get right, but now it works. Thus, the software in
>> question is already capable of working with holes. Thus, fixing
>> this, would present itself as a minor optimization overall. I'm not
>> convinced the work to do this in the kernel is proportional to the
>> change it'll make to the applications.
> 
> I will let Bo respond to this himself.
> 
>> 3. I'm happy to hear that. However, I'd like to reiterate the
>> point that the PO2 requirement has been known for years. That
>> there's a drive doing NPO2 zones is great, but a decision was made
>> by the SSD implementors to not support the Linux kernel given its
>> current implementation.
> 
> Zoned devices have been supported for years in SMR, and I think this
> is a strong argument. However, ZNS is still very new and customers
> have several requirements. I do not believe that an HDD stack should
> have such an impact in NVMe.
> 
> Also, we will see new interfaces adding support for zoned devices in
> the future.
> 
> We should think about the future and not the past.

Backward compatibility? We must not break userspace...

>> 
>> All that said - if there are people willing to do the work and it
>> doesn't have a negative impact on performance, code quality,
>> maintenance complexity, etc. then there isn't anything saying
>> support can't be added - but it does seem like it’s a lot of work,
>> for little overall benefits to applications and the host users.
> 
> Exactly.
> 
> Patches in the block layer are trivial. This is running in
> production loads without issues. I have tried to highlight the
> benefits previously and I believe you understand them.

The block layer is not the issue here. We all understand that one is easy.
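
To make concrete why the block layer side is the easy part, here is a
minimal sketch of the arithmetic involved. It is plain Python with
hypothetical helper names, not kernel code, and the zone sizes are
made-up values in 512-byte sectors. With a power-of-2 zone size the
zone index and the offset within the zone are a shift and a mask; for
an arbitrary zone size they become a division and a remainder.

```python
def is_power_of_2(n: int) -> bool:
    """True if n is a positive power of two."""
    return n > 0 and (n & (n - 1)) == 0


def zone_no_po2(sector: int, zone_sectors: int) -> int:
    """Zone index using a shift; only valid for power-of-2 zone sizes."""
    if not is_power_of_2(zone_sectors):
        raise ValueError("zone size is not a power of 2")
    # log2 of a power of two is its bit length minus one.
    return sector >> (zone_sectors.bit_length() - 1)


def zone_offset_po2(sector: int, zone_sectors: int) -> int:
    """Offset within the zone using a mask; power-of-2 zone sizes only."""
    if not is_power_of_2(zone_sectors):
        raise ValueError("zone size is not a power of 2")
    return sector & (zone_sectors - 1)


def zone_no_any(sector: int, zone_sectors: int) -> int:
    """Zone index using a division; valid for any zone size."""
    return sector // zone_sectors


def zone_offset_any(sector: int, zone_sectors: int) -> int:
    """Offset within the zone using a remainder; valid for any zone size."""
    return sector % zone_sectors


PO2_ZONE = 524288   # 2^19 sectors (assumed example value)
NPO2_ZONE = 393216  # 3 * 2^17 sectors (assumed example value)

# Both forms agree whenever the zone size is a power of 2.
s = 3 * PO2_ZONE + 5
assert zone_no_po2(s, PO2_ZONE) == zone_no_any(s, PO2_ZONE) == 3
assert zone_offset_po2(s, PO2_ZONE) == zone_offset_any(s, PO2_ZONE) == 5
```

Swapping the shift for a division is mechanical. The open question is
everything above the block layer that was written assuming the shift.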

> Support for ZoneFS seems easy too. We have an early POC for btrfs and
> it seems it can be done. We sign up for these 2.

zonefs can trivially support non-power-of-2 zone sizes, but as zonefs
creates a discrete view of the device capacity with its one-file-per-zone
interface, an application's accesses to a zone are necessarily limited
to that zone, as they should be. With zonefs, pow2 and nonpow2 devices
will show the *same* interface to the application. A non-power-of-2 zone
size then has absolutely no benefit at all.
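
A small model of that point, as a sketch only: plain Python with
hypothetical names and made-up zone geometry, not zonefs code. It
assumes each sequential zone file has a maximum size equal to the zone
capacity and that the application addresses data by (file, offset).
Under those assumptions, a device with a pow2 zone size larger than
its capacity and a device with size equal to capacity give the
application the same set of files. Only the device sectors behind them
differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Zone:
    start: int     # first sector of the zone on the device
    size: int      # zone size in sectors (address space stride)
    capacity: int  # writable sectors in the zone (capacity <= size)


def make_zones(nr_zones: int, size: int, capacity: int) -> List[Zone]:
    """Build a device layout with identical zones laid out back to back."""
    return [Zone(i * size, size, capacity) for i in range(nr_zones)]


def file_view(zones: List[Zone]) -> List[int]:
    """What the application sees: one file per zone, max size = capacity."""
    return [z.capacity for z in zones]


def file_to_sector(zones: List[Zone], zone_idx: int, offset: int) -> int:
    """Map a (file, offset) access to a device sector, confined to the zone."""
    z = zones[zone_idx]
    if not 0 <= offset < z.capacity:
        raise ValueError("access beyond the zone file")
    return z.start + offset


# Assumed geometries: same capacity, different zone size.
pow2_dev = make_zones(4, size=524288, capacity=393216)     # size > capacity
nonpow2_dev = make_zones(4, size=393216, capacity=393216)  # size == capacity

# The per-file view is identical on both devices.
assert file_view(pow2_dev) == file_view(nonpow2_dev)

# The same (file, offset) lands on different device sectors, but the
# application never sees that difference.
assert file_to_sector(pow2_dev, 1, 10) == 524298
assert file_to_sector(nonpow2_dev, 1, 10) == 393226
```

The unmapped range between capacity and size on the pow2 device never
appears in the file view, so the application has no holes to manage in
either case.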

> As for F2FS and dm-zoned, I do not think these are targets at the 
> moment. If this is the path we follow, these will bail out at mkfs
> time.

And what makes you think that this is acceptable? What guarantees do
you have that this will not be a problem for users out there?



-- 
Damien Le Moal
Western Digital Research

