From: Dave Chinner <david@fromorbit.com>
To: Luis Chamberlain <mcgrof@kernel.org>
Cc: linux-block@vger.kernel.org, linux-fsdevel@vger.kernel.org,
lsf-pc@lists.linux-foundation.org,
"Matias Bjørling" <Matias.Bjorling@wdc.com>,
"Javier González" <javier.gonz@samsung.com>,
"Damien Le Moal" <Damien.LeMoal@wdc.com>,
"Bart Van Assche" <bvanassche@acm.org>,
"Adam Manzanares" <a.manzanares@samsung.com>,
"Keith Busch" <Keith.Busch@wdc.com>,
"Johannes Thumshirn" <Johannes.Thumshirn@wdc.com>,
"Naohiro Aota" <Naohiro.Aota@wdc.com>,
"Pankaj Raghav" <pankydev8@gmail.com>,
"Kanchan Joshi" <joshi.k@samsung.com>,
"Nitesh Shetty" <nj.shetty@samsung.com>
Subject: Re: [LSF/MM/BPF BoF] BoF for Zoned Storage
Date: Sat, 5 Mar 2022 09:42:57 +1100
Message-ID: <20220304224257.GN3927073@dread.disaster.area>
In-Reply-To: <YiKOQM+HMZXnArKT@bombadil.infradead.org>
On Fri, Mar 04, 2022 at 02:10:08PM -0800, Luis Chamberlain wrote:
> On Fri, Mar 04, 2022 at 11:10:22AM +1100, Dave Chinner wrote:
> > On Wed, Mar 02, 2022 at 04:56:54PM -0800, Luis Chamberlain wrote:
> > > Thinking proactively about LSFMM, regarding just Zone storage..
> > >
> > > I'd like to propose a BoF for Zoned Storage. The point of it is
> > > to address the existing pain points we have and take advantage of
> > > having folks in the room we can likely settle on things faster which
> > > otherwise would take years.
> > >
> > > I'll throw at least one topic out:
> > >
> > > * Raw access for zone append for microbenchmarks:
> > > - are we really happy with the status quo?
> > > - if not what outlets do we have?
> > >
> > > I think the nvme passthrough stuff deserves its own shared
> > > discussion though and should not make it part of the BoF.
> >
> > Reading through the discussion on this thread, perhaps this session
> > should be used to educate application developers about how to use
> > ZoneFS so they never need to manage low level details of zone
> > storage such as enumerating zones, controlling write pointers
> > safely for concurrent IO, performing zone resets, etc.
>
> I'm not even sure users are really aware that given cap can be different
> than zone size and btrfs uses zone size to compute size, the size is a
> flat out lie.

Sorry, I don't get how what btrfs does with zone management has
anything to do with using zonefs to get direct, raw IO access to
individual zones. Direct IO on open zone fds is likely more efficient
than doing IO through the standard LBA based block device because
zonefs uses iomap_dio_rw(), so it only needs to do one mapping
operation per IO instead of one per page in the IO. Nor does it have
to manage buffer heads or other "generic blockdev" functionality that
direct IO access to zoned storage doesn't require.

So whatever you're complaining about that btrfs lies about, does or
doesn't do is irrelevant - zonefs was written with the express
purpose of getting user applications away from needing to directly
manage zone storage. So if you have special zone IO management
requirements, work out how they can be supported by zonefs - we
don't need yet another special purpose direct hardware access API
for zone storage when we already have a solid solution to the
problem.

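For reference, a minimal sketch of what raw zone access through zonefs
can look like from a shell. It assumes a zonefs volume is already
mounted somewhere (the mount point you pass in is up to you), and it
relies on the zonefs convention that the size of a sequential zone file
reflects the zone's write pointer position. The helper names are made
up for illustration; they are not part of any tool.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative assumptions, not an existing API.
# On zonefs each sequential zone is a file under <mnt>/seq/, and the file
# size of such a file tracks the zone write pointer.

# Print "<zone file> <write pointer offset in bytes>" for each seq zone file.
zone_report() {
    mnt=$1
    for f in "$mnt"/seq/*; do
        [ -e "$f" ] || continue
        printf '%s %s\n' "${f##*/}" "$(stat -c %s "$f")"
    done
}

# Append one 4 KiB block at the write pointer of a zone file using direct IO.
# O_APPEND lets the filesystem pick the offset, so the caller never has to
# track the write pointer itself.
zone_append_block() {
    dd if=/dev/zero of="$1" bs=4096 count=1 \
        conv=notrunc oflag=direct,append 2>/dev/null
}

# Reset a zone: truncating a sequential zone file to 0 resets the zone.
zone_reset() {
    truncate -s 0 "$1"
}
```

With a volume mounted at, say, /mnt, usage would be along the lines of
`zone_append_block /mnt/seq/0; zone_report /mnt; zone_reset /mnt/seq/0`.
There is no zone enumeration, no write pointer bookkeeping and no
explicit reset command in the application.
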
> modprobe null_blk nr_devices=0
> mkdir /sys/kernel/config/nullb/nullb0
> echo 0 > /sys/kernel/config/nullb/nullb0/completion_nsec
> echo 0 > /sys/kernel/config/nullb/nullb0/irqmode
> echo 2 > /sys/kernel/config/nullb/nullb0/queue_mode
> echo 1024 > /sys/kernel/config/nullb/nullb0/hw_queue_depth
> echo 1 > /sys/kernel/config/nullb/nullb0/memory_backed
> echo 1 > /sys/kernel/config/nullb/nullb0/zoned
>
> echo 128 > /sys/kernel/config/nullb/nullb0/zone_size
> # 6 zones are implied, we are saying 768 for the full storage size..
> # but...
> echo 768 > /sys/kernel/config/nullb/nullb0/size
>
> # If we force capacity to be way less than the zone sizes, btrfs still
> # uses the zone size to do its data / metadata size computation...
> echo 32 > /sys/kernel/config/nullb/nullb0/zone_capacity

Then that's just a btrfs zone support bug where it has used the
wrong information to size its zones. Why not just send a patch to
fix it?

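To put numbers on the mismatch in the quoted configuration, here is a
small sketch of the arithmetic. It assumes the null_blk configfs
attributes size, zone_size and zone_capacity are all in MB; the
function names are made up for illustration.

```shell
#!/bin/sh
# Values taken from the quoted null_blk setup, all in MB.
size=768
zone_size=128
zone_capacity=32

# Number of zones on the device: total size divided by zone size.
zone_count() { echo $(( $1 / $2 )); }

# Space implied by sizing each zone by zone_size (the address space).
total_by_zone_size_mb() { echo $(( ($1 / $2) * $2 )); }

# Space that can actually be written: each zone holds only zone_capacity.
usable_mb() { echo $(( ($1 / $2) * $3 )); }

echo "zones:             $(zone_count "$size" "$zone_size")"
echo "size by zone_size: $(total_by_zone_size_mb "$size" "$zone_size") MB"
echo "size by capacity:  $(usable_mb "$size" "$zone_size" "$zone_capacity") MB"
```

That is 6 zones, 768 MB if you size by zone_size, and 192 MB of space
that can actually be written, so anything sized from zone_size
overstates the writable space by 4x in this setup.
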
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com