Re: [RFC PATCH 0/14] xfs: Towards thin provisioning aware filesystems

linux-xfs.vger.kernel.org archive mirror

From: Amir Goldstein <amir73il@gmail.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Brian Foster <bfoster@redhat.com>, linux-xfs <linux-xfs@vger.kernel.org>
Subject: Re: [RFC PATCH 0/14] xfs: Towards thin provisioning aware filesystems
Date: Mon, 6 Nov 2017 11:48:05 +0200
Message-ID: <CAOQ4uxjJHyQ4-hUxcmRSP8V3xxZGN3MGm+8hKCRQFf2pRCfFiQ@mail.gmail.com>
In-Reply-To: <20171106011634.GC5858@dastard>

On Mon, Nov 6, 2017 at 3:16 AM, Dave Chinner <david@fromorbit.com> wrote:
...

> Not to mention other application developers that might
> be using the existing geometry and growfs ioctls - shrink will now

Acknowledging that such "other applications" may exist in the wild
makes it even harder to claim that allowing usable_dblocks to be changed
through the existing API is not going to cause pain for users...

> just work on existing binaries without them having to do
> anything....
>
>> Don't you see that this is the right thing to do w.r.t. API design?
>
> No, I don't, because you're trying to solve a problem that, quite
> simply, doesn't exist.
>

It is *very* possible that you are right, but you have not proven that
the problem does not exist. You have proven that the problem
does not exist w.r.t. old xfs_growfs -D <size>, and you correctly
claimed that the problem with old xfs_growfs -m <imaxpct> is an
implementation bug in the RFC patches.

Let me give an example that will demonstrate my concern.

One of our older NAS products, still deployed at many customers,
has an LVM-based volume manager and an ext3 file system.
When a user changes the size of a volume via the Web UI, lower-level
commands resize the LVM volume and then run resize2fs to max size.
Because "resize2fs to max size" is not an atomic operation and
because this is a "for dummies" product, there is a post-mount script
that runs resize2fs unconditionally after boot in order to recover
from a "half-resize".

So in this product, the LVM volume size is treated as an "intent log"
for the file system grow operation.
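
To make the "intent log" pattern concrete, here is a minimal sketch as a
toy Python model. The names Volume, FileSystem and post_mount_autogrow are
hypothetical and only for illustration; the real product runs LVM and
resize2fs commands rather than mutating objects like this.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    size_blocks: int  # size of the LVM volume, in fs blocks


@dataclass
class FileSystem:
    size_blocks: int  # size the file system currently believes it has


def post_mount_autogrow(vol: Volume, fs: FileSystem) -> bool:
    """Grow the fs to the volume size on every boot (hypothetical model).

    Returns True if a grow was actually needed, i.e. an earlier resize
    was interrupted between the volume step and the file system step.
    """
    if fs.size_blocks < vol.size_blocks:
        fs.size_blocks = vol.size_blocks
        return True
    return False


# Simulate a "half-resize": the volume was grown, the fs resize never ran.
vol = Volume(size_blocks=2000)
fs = FileSystem(size_blocks=1000)
recovered = post_mount_autogrow(vol, fs)
```

The script never records what the user asked for; the volume size alone
tells it how large the file system is supposed to be.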

I find it hard to believe that this practice is so novel that nobody
else has ever used it, including with an xfs file system over LVM
and xfs_growfs -d.

Now imagine you upgrade such a system to a kernel that supports
"thinspace" and to new xfsprogs, create thin file systems, and then
downgrade the system to a kernel that still supports "thinspace" but
with xfsprogs that do not (or even with a proprietary system component
that uses the XFS_IOC_FSGROWDATA ioctl to perform the "auto-grow").

The result will be that all the thin file systems "auto-grow"
to the thick size of the volume.
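
The scenario can be sketched with a toy Python model. This is not the
kernel code: ThinFs, growdata_existing_api and legacy_autogrow are
hypothetical names, and the assumption that a larger size passed through
the existing interface grows usable_dblocks reflects my reading of the
RFC behavior, not a confirmed implementation detail.

```python
from dataclasses import dataclass

EINVAL = 22


@dataclass
class ThinFs:
    dblocks: int         # size of the underlying volume (thick size)
    usable_dblocks: int  # space exposed to users (thin size)


def growdata_existing_api(fs: ThinFs, new_dblocks: int) -> int:
    """Toy model of a grow request through the existing interface.

    Assumption: any size above usable_dblocks is accepted as a grow of
    usable space, up to the size of the volume.
    """
    if new_dblocks > fs.dblocks:
        return -EINVAL  # cannot grow past the volume in this model
    if new_dblocks > fs.usable_dblocks:
        fs.usable_dblocks = new_dblocks
    return 0


def legacy_autogrow(fs: ThinFs, volume_blocks: int) -> int:
    """Thin-unaware boot script: always grow to the size of the volume."""
    return growdata_existing_api(fs, volume_blocks)


# A thin fs deliberately limited to 100 blocks on a 1000-block volume.
fs = ThinFs(dblocks=1000, usable_dblocks=100)
ret = legacy_autogrow(fs, volume_blocks=1000)
```

In this model the call succeeds and usable_dblocks ends up equal to
dblocks, so the thin limit set by the administrator is silently lost.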

So the way I see it, my proposal to explicitly require the
XFS_IOC_FSGROWDATA V1 API for any change to usable_dblocks
that is not coupled with the same change to dblocks is meant to resolve
userspace/kernel compatibility issues.

And I fail to see how that requirement makes it hard to maintain
userspace/kernel compatibility:
- xfs_growfs needs to check for the "thinspace" flag and, if it exists, use the V1 API
- an old kernel can't mount a "thinspace" fs, so it can never see the V1 API
  unless from a buggy program, which will get -EINVAL
- old xfs_growfs will keep failing to shrink even a thin fs
- old xfs_growfs will succeed in growing, except (*) for a thin fs that
  was previously shrunk
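
The rules in the list above can be sketched as a toy Python model of a
new kernel plus old and new userspace. Everything here is hypothetical
(Fs, growdata, new_xfs_growfs, the api_version argument); in particular,
returning -EINVAL for the refused cases and letting V1 change only
usable_dblocks within dblocks are simplifying assumptions, not the
proposed implementation.

```python
from dataclasses import dataclass

EINVAL = 22


@dataclass
class Fs:
    thinspace: bool
    dblocks: int         # size of the underlying volume
    usable_dblocks: int  # space exposed to users


def growdata(fs: Fs, new_size: int, api_version: int = 0) -> int:
    """Toy model: usable_dblocks may diverge from dblocks only via V1."""
    if api_version == 0:
        # Existing API: only coupled changes are allowed.
        if fs.usable_dblocks != fs.dblocks:
            return -EINVAL  # thin fs that was previously shrunk
        if new_size < fs.dblocks:
            return -EINVAL  # shrink is never available to old callers
        fs.dblocks = fs.usable_dblocks = new_size
        return 0
    if api_version == 1:
        if not fs.thinspace or new_size > fs.dblocks:
            return -EINVAL
        fs.usable_dblocks = new_size  # thin grow or shrink
        return 0
    return -EINVAL


def new_xfs_growfs(fs: Fs, new_size: int) -> int:
    """New userspace: pick the V1 API when the thinspace flag is set."""
    return growdata(fs, new_size, api_version=1 if fs.thinspace else 0)


thin = Fs(thinspace=True, dblocks=1000, usable_dblocks=100)
old_tool = growdata(thin, 1000, api_version=0)  # old auto-grow is refused
new_tool = new_xfs_growfs(thin, 500)            # explicit thin grow works

thick = Fs(thinspace=False, dblocks=1000, usable_dblocks=1000)
thick_grow = growdata(thick, 2000)              # old grow keeps working
thick_shrink = growdata(thick, 500)             # old shrink keeps failing
```

With this gating, a thin-unaware caller can no longer undo a thin shrink
by accident, while plain grows through the old API behave as before.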

(*) That exception relates to the example I described above,
and we seem not to be in agreement about the desired behavior.

IIUC, you like the fact that old xfs_growfs can grow a thin and shrunk fs,
where I see trouble lurking in this behavior.

So we can agree to disagree on the desired behavior, but for the
record, this and only this point is the API design flaw I am talking
about.

There may be complexities w.r.t. maintaining userspace/kernel compatibility
with the proposed solution. I trust you on this because you have far
more experience than I do with maintaining the historic baggage of wrongly
designed APIs.

If no one else is concerned about the old xfs_growfs -d use case and no one
else shares my opinion about the desired behavior in that use case, then
I withdraw my claims.

Thanks,
Amir.
