From: Josef Bacik <josef@toxicpanda.com>
To: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Cc: kreijack@inwind.it, David Sterba <dsterba@suse.cz>,
Sinnamohideen Shafeeq <shafeeqs@panasas.com>,
Paul Jones <paul@pauljones.id.au>,
"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: [RFC][V8][PATCH 0/5] btrfs: allocation_hint mode
Date: Tue, 14 Dec 2021 15:34:24 -0500
Message-ID: <Ybj/0ITsCQTBLkQF@localhost.localdomain>
In-Reply-To: <Ybj40IuxdaAy75Ue@hungrycats.org>
On Tue, Dec 14, 2021 at 03:04:32PM -0500, Zygo Blaxell wrote:
> On Tue, Dec 14, 2021 at 08:03:45PM +0100, Goffredo Baroncelli wrote:
> > On 12/13/21 23:49, Zygo Blaxell wrote:
> > > On Mon, Dec 13, 2021 at 04:15:14PM -0500, Josef Bacik wrote:
> > > > On Mon, Dec 13, 2021 at 08:54:24PM +0100, Goffredo Baroncelli wrote:
> > > > > Gentle ping :-)
> > > > >
> > > > > > Are any of the main developers interested in supporting this patch?
> > > > > >
> > > > > > I am open to improving it if required.
> > > > >
> > > >
> > > > Sorry I missed this go by. I like the interface, we don't have a use for
> > > > device->type yet, so this fits nicely.
> > > >
> > > > I don't see the btrfs-progs patches in my inbox, and these don't apply, so
> > > > you'll definitely need to refresh for a proper review, but looking at these
> > > > patches they seem sane enough, and I like the interface. I'd like to hear
> > > > Zygo's opinion as well.
> > >
> > > I've been running earlier versions with modifications since summer 2020,
> > > and this version mostly unmodified (rebase changes only) since it was
> > > posted. It seems to work, even in corner cases like converting balances,
> > > replacing drives, and running out of space. The "running out of space"
> > > > experience on btrfs is weird at the best of times, and these patches
> > > add some more new special cases, but it doesn't behave in ways that
> > > would surprise a sysadmin familiar with how btrfs chunk allocation works.
> > >
> > > One major piece that's missing is adjusting the statvfs (aka df)
> > > available blocks field so that it doesn't include unallocated space on
> > > any metadata-only devices. Right now all the unallocated space on
> > > metadata-only devices is counted as free even though it's impossible to
> > > put a data block there, so anything that is triggered automatically
> > > on "f_bavail < some_threshold" will be confused.
> > >
> > > I don't think that piece has to block the rest of the patch series--if
> > > you're not using the feature, df gives the right number (or at least the
> > > same number it gave before), and if you are using the feature, you can
> > > subtract the unavailable data space until a later patch comes along to
> > > fix it.
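
For anyone scripting around this in the meantime, the subtraction described
above could look roughly like the sketch below. It is only a sketch under
assumptions: metadata_only_unallocated is a value the caller has to obtain
some other way (it is not something statvfs reports), the function names are
made up for illustration, and the arithmetic ignores how RAID profiles map
unallocated space to usable space.

```python
import os

def adjust_bavail(reported_bytes, metadata_only_unallocated):
    """Subtract space that can never hold data; never go below zero."""
    return max(0, reported_bytes - metadata_only_unallocated)

def data_bytes_available(path, metadata_only_unallocated):
    """Estimate the bytes actually usable for data under `path`.

    metadata_only_unallocated: unallocated bytes on metadata-only devices.
    This is an assumed input, supplied by the caller.
    """
    st = os.statvfs(path)
    # f_bavail is counted in units of f_frsize.
    reported = st.f_bavail * st.f_frsize
    return adjust_bavail(reported, metadata_only_unallocated)
```

A threshold check would then compare against data_bytes_available() instead
of the raw f_bavail value.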
> > >
> > > I like
> > >
> > > echo data_only > /sys/fs/btrfs/$uuid/devinfo/3/type
> >
> > Just to be clear, for now you can only pass a numeric value to "type", not
> > text as in your example.
> >
> > However, I want to put another option on the table: do not expose the whole
> > "type" field, only the "allocation policy". We can add a new sysfs field
> > called "allocation policy" that internally changes the dev_item->type field.
> >
> > It is not only a "cosmetic" change. If we want to change the allocation
> > policy, the correct way today is:
> > - read the type field
> > - change the "allocation policy" bits
> > - write the type field
> >
> > which is race-prone.
> >
> > For now it is not a problem, because type contains only the allocation bits.
> > But in the future, when the type field contains further properties, this
> > could be a problem.
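
To make the race concrete, here is a small simulation of two userspace
writers doing read-modify-write on the whole field, next to a narrow
interface that only ever takes the policy. The bit values are hypothetical
and chosen purely for illustration; they are not the layout used by the
patches.

```python
# Hypothetical bit layout, for illustration only.
POLICY_MASK = 0x3      # assumed: low bits hold the allocation policy
OTHER_PROPERTY = 0x4   # assumed: some future, unrelated property bit

def with_policy(type_value, policy):
    """Return type_value with only the policy bits replaced."""
    return (type_value & ~POLICY_MASK) | (policy & POLICY_MASK)

# Two writers doing read-modify-write on the whole field.
dev_type = 0
a_seen = dev_type                    # writer A reads
b_seen = dev_type                    # writer B reads the same value
dev_type = b_seen | OTHER_PROPERTY   # B writes its property bit
dev_type = with_policy(a_seen, 1)    # A writes back its stale copy
lost_update = dev_type               # B's bit has been overwritten

# Narrow interface: the caller passes only the policy, and the merge
# happens in one place against the current value.
dev_type = 0
dev_type = dev_type | OTHER_PROPERTY
dev_type = with_policy(dev_type, 1)
narrow_result = dev_type             # both the property and the policy survive
```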
>
> Yeah, keep the interface very narrow, don't hand out access to random bits.
>
> If the kernel supports additional bits, it should support additional
> sysfs filenames to go with them. Or it could put all the supported
> options in the sysfs field, like block IO schedulers do, so you could
> find this in the file by reading it:
>
> [prefer_data] prefer_metadata metadata_only data_only
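
A reader for that bracketed format is short. The sketch below assumes the
file content is a single whitespace-separated line with the active choice in
square brackets, as in the example above; parse_choices is a made-up helper
name, not an existing API.

```python
def parse_choices(text):
    """Split '[a] b c' into (active_choice, all_choices).

    active_choice is None if nothing is bracketed.
    """
    active = None
    options = []
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]   # strip the brackets
            active = token
        options.append(token)
    return active, options
```

This also gives a tool a way to discover which policy names the running
kernel accepts before it tries to write one.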
>
> > > more than patching btrfs-progs so I can use
> > >
> > > btrfs prop set /dev/... allocation_hint data_only
> > >
> > > but I admit that might be because I'm weird.
> >
> > I prefer the echo approach too; however, it is not very ergonomic in
> > conjunction with sudo...
>
> For /proc/sys/* we have the 'sysctl' tool, so you can write 'sysctl
> vm.drop_caches=1' or 'sudo sysctl vm.drop_caches=1'. For some reason
> we don't have this for sysfs (or maybe it's just Debian...?) so we have
> to write things like 'echo foo | sudo tee /sys/fs/...'.
>
> Of course btrfs-progs could always open the
> /sys/fs/btrfs/.../allocation_policy file and write to it. But if we're
> modifying btrfs-progs then we could use the ioctl interface anyway.
>
> I don't have a strong preference for either sysfs or ioctl, nor am I
> opposed to simply implementing both. I'll let someone who does have
> such a preference make their case.

I think echoing a name into sysfs is better than bits for sure. However, I also
want the ability to set the device properties offline via a btrfs-progs command,
so I can set up the storage and then mount the file system. I want:

1) The sysfs interface, so you can change things on the fly. The setting stays
   persistent of course, so the way it works is perfect.

2) A btrfs-progs command that sets it on offline devices. If you point it at a
   live mounted fs, it can simply use the sysfs interface to do it live.
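
Roughly, the decision the tool would make could be sketched like this. It is
only an illustration: the path layout is copied from the echo example earlier
in the thread, the function names are hypothetical, and the "offline" branch
stands in for whatever btrfs-progs would do to write the device item directly.

```python
import os

SYSFS_ROOT = "/sys/fs/btrfs"

def type_attr_path(fs_uuid, devid, root=SYSFS_ROOT):
    """Build the per-device sysfs path used in the thread's example."""
    return os.path.join(root, fs_uuid, "devinfo", str(devid), "type")

def pick_method(fs_uuid, devid, root=SYSFS_ROOT):
    """Return 'sysfs' if the attribute exists (fs is mounted), else 'offline'."""
    if os.path.exists(type_attr_path(fs_uuid, devid, root)):
        return "sysfs"
    return "offline"
```

The root parameter is there only so the logic can be exercised against a
scratch directory instead of the real /sys.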

Does this seem reasonable? Thanks,

Josef