From: David Sterba <dsterba@suse.cz>
To: Johannes Thumshirn <Johannes.Thumshirn@wdc.com>
Cc: "Javier González" <javier@javigon.com>,
"Christoph Hellwig" <hch@lst.de>,
"Matias Bjørling" <Matias.Bjorling@wdc.com>,
"Damien Le Moal" <damien.lemoal@opensource.wdc.com>,
"Luis Chamberlain" <mcgrof@kernel.org>,
"Keith Busch" <kbusch@kernel.org>,
"Pankaj Raghav" <p.raghav@samsung.com>,
"Adam Manzanares" <a.manzanares@samsung.com>,
"jiangbo.365@bytedance.com" <jiangbo.365@bytedance.com>,
"kanchan Joshi" <joshi.k@samsung.com>,
"Jens Axboe" <axboe@kernel.dk>,
"Sagi Grimberg" <sagi@grimberg.me>,
"Pankaj Raghav" <pankydev8@gmail.com>,
"Kanchan Joshi" <joshiiitr@gmail.com>,
"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>,
"linux-btrfs @ vger . kernel . org" <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH 0/6] power_of_2 emulation support for NVMe ZNS devices
Date: Tue, 15 Mar 2022 15:27:40 +0100
Message-ID: <20220315142740.GU12643@twin.jikos.cz>
In-Reply-To: <PH0PR04MB74167377D7D86C60C290DAB29B109@PH0PR04MB7416.namprd04.prod.outlook.com>
On Tue, Mar 15, 2022 at 02:14:23PM +0000, Johannes Thumshirn wrote:
> On 15/03/2022 14:52, Javier González wrote:
> > On 15.03.2022 14:30, Christoph Hellwig wrote:
> >> On Tue, Mar 15, 2022 at 02:26:11PM +0100, Javier González wrote:
> >>> but we do not see a usage for ZNS in F2FS, as it is a mobile
> >>> file-system. As other interfaces arrive, this work will become natural.
> >>>
> >>> ZoneFS and btrfs are good targets for ZNS and these we can do. I would
> >>> still do the work in phases to make sure we have enough early feedback
> >>> from the community.
> >>>
> >>> Since this thread has been very active, I will wait some time for
> >>> Christoph and others to catch up before we start sending code.
> >>
> >> Can someone summarize where we stand? Between the lack of quoting
> >> from hell and overly long lines from corporate mail clients I've
> >> mostly stopped reading this thread because it takes too much effort
> >> to actually extract the information.
> >
> > Let me give it a try:
> >
> > - PO2 emulation in NVMe is a no-go. Drop this.
> >
> > - The arguments against supporting NPO2 are:
> > - It makes ZNS depart from an SMR assumption of PO2 zone sizes. This
> > can create confusion for users of both SMR and ZNS.
> >
> > - Existing applications assume PO2 zone sizes, and probably do
> > optimizations for these. These applications, if they want to use
> > ZNS, will have to change the calculations.
> >
> > - There is a fear of performance regressions.
> >
> > - It adds more work to you and other maintainers
> >
> > - The arguments in favour of NPO2 are:
> > - Unmapped LBAs create holes that applications need to deal with.
> > This affects mapping and performance due to splits. Bo explained
> > this in a thread from Bytedance's perspective. I explained in an
> > answer to Matias how we are not letting zones transition to
> > offline in order to simplify the host stack. Not sure if this is
> > something we want to bring to NVMe.
> >
> > - As ZNS adds more features and other protocols add support for
> > zoned devices we will have more use-cases for the zoned block
> > device. We will have to deal with this fragmentation at some
> > point.
> >
> > - This is used in production workloads in Linux hosts. I would
> > advocate for this not being off-tree as it will be a headache for
> > all in the future.
> >
> > - If you agree that removing PO2 is an option, we can do the following:
> > - Remove the constraint in the block layer and add ZoneFS support
> > in a first patch.
> >
> > - Add btrfs support in a later patch
>
> (+ linux-btrfs )
>
> Please also make sure to support btrfs and not only throw some patches
> over the fence. Zoned device support in btrfs is complex enough and has
> quite some special casing vs regular btrfs, which we're working on getting
> rid of. So having a non-power-of-2 zone size would also mean having NPO2
> block-groups (and thus block-groups not aligned to the stripe size).
>
> Just thinking of this and knowing I need to support it gives me a
> headache.
PO2 is really easy to work with and I guess allocation on the physical
device could also benefit from that; I'm still puzzled why NPO2 is
even proposed.

We can possibly hide the calculations behind some API, so I hope in the
end it should be bearable. The size of block groups is flexible; we only
want some reasonable alignment.
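To illustrate what hiding the calculations behind an API could look like,
here is a minimal userspace sketch. The names zone_geom, zone_geom_init(),
zone_no(), zone_offset() and zone_start() are hypothetical and do not
correspond to any existing kernel or btrfs interface. The sketch assumes a
uniform, nonzero zone size given in sectors. PO2 sizes take the shift/mask
path; anything else falls back to division and modulo.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical zone geometry descriptor; not an existing kernel structure. */
struct zone_geom {
	uint64_t zone_sectors;	/* zone size in sectors, assumed nonzero */
	bool is_po2;		/* true when zone_sectors is a power of two */
	unsigned int shift;	/* log2(zone_sectors), valid only if is_po2 */
};

static inline struct zone_geom zone_geom_init(uint64_t zone_sectors)
{
	struct zone_geom g = { .zone_sectors = zone_sectors };

	/* A nonzero value is a power of two iff exactly one bit is set. */
	g.is_po2 = zone_sectors && !(zone_sectors & (zone_sectors - 1));
	if (g.is_po2) {
		while ((UINT64_C(1) << g.shift) < zone_sectors)
			g.shift++;
	}
	return g;
}

/* Index of the zone containing @sector. */
static inline uint64_t zone_no(const struct zone_geom *g, uint64_t sector)
{
	return g->is_po2 ? sector >> g->shift : sector / g->zone_sectors;
}

/* Offset of @sector from the start of its zone. */
static inline uint64_t zone_offset(const struct zone_geom *g, uint64_t sector)
{
	return g->is_po2 ? sector & (g->zone_sectors - 1)
			 : sector % g->zone_sectors;
}

/* First sector of the zone containing @sector. */
static inline uint64_t zone_start(const struct zone_geom *g, uint64_t sector)
{
	return sector - zone_offset(g, sector);
}
```

Callers would only ever go through the helpers, so whether the device uses
PO2 or NPO2 zones stays a detail of the geometry setup. The cost of the
NPO2 path is a 64-bit division per lookup instead of a shift.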
> Also please consult the rest of the btrfs developers for thoughts on this.
> After all btrfs has full zoned support (including ZNS, not saying it's
> perfect) and is also the default FS for at least two Linux distributions.
I haven't read the whole thread yet; my impression is that some hardware
is deliberately breaking existing assumptions about zoned devices and in
turn breaking btrfs support. I hope I'm wrong on that or at least that
it's possible to work around it.