linux-xfs.vger.kernel.org archive mirror
From: Mike Fleetwood <mike.fleetwood@googlemail.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Tarik Ceylan <Tarik.Ceylan@ruhr-uni-bochum.de>,
	linux-xfs@vger.kernel.org, Eric Sandeen <sandeen@sandeen.net>
Subject: Re: How to reliably measure fs usage with reflinks enabled?
Date: Fri, 18 May 2018 15:43:13 +0100
Message-ID: <CAMU1PDg-aOE-gakEtktq8nDRWXcsdorcx0a25_uQ_MFM1nUTpQ@mail.gmail.com>
In-Reply-To: <20180516001342.GK23861@dastard>

(Sorry for the late reply; work commitments.)

On 16 May 2018 at 01:13, Dave Chinner <david@fromorbit.com> wrote:
> On Tue, May 15, 2018 at 02:52:30PM +0100, Mike Fleetwood wrote:
>> On 15 May 2018 at 02:29, Dave Chinner <david@fromorbit.com> wrote:
>> > So the reflink code reserved ~7GB of space in the filesystem (less
>> > than 1%) for its own reflink-related metadata if it ever needs it.
>> > It hasn't used it yet but we need to make sure that it's available
>> > when the filesystem is near ENOSPC. Hence it's considered used space
>> > because users cannot store user data in that space.
>> >
>> > The change I plan to make is to reduce the user reported filesystem
>> > size rather than account for it as used space. IOWs, you'd see a
>> > filesystem size of 889G instead of 896G, but have only 8.8GB used.
>> > It means exactly the same thing and will behave exactly the same
>> > way, it's just a different space accounting technique....
>>
>> I'm one of the authors of GParted and it uses the reported file system
>> size [1] and compares it to the block device size to see if the file
>> system fills the partition or not and whether to show unallocated space
>> to the user and advise them to grow the file system to fill the block
>> device [2].  As such we prefer that the reported size of the file system
>> match the highest offset that the file system can write to in the block
>> device.
>
> I think that's a narrow, use case specific assumption. There is
> absolutely no guarantee that the filesystem on a device fills the
> entire device or that the filesystem space reported by df/statvfs
> accurately reflects the size of the underlying block device.
>
> Filesystems are moving towards a virtualised world where space usage
> and capacity is kept separate from the capacity of the underlying
> storage provider. That's a solid direction we are moving with xfs:
>
> https://www.spinics.net/lists/linux-xfs/msg12216.html
>
> so we can support subvolumes:
>
> https://www.youtube.com/watch?v=wG8FUvSGROw
>
> via a virtual block address space that remaps the filesystem space
> accounting away from the underlying physical block device:
>
> https://lwn.net/SubscriberLink/753650/32230c15f3453808/
>
> This will completely break any assumption that the filesystem size
> is related to the underlying storage device(s).
>
> GParted deals very firmly with a specific aspect of disk based
> storage - managing partitions on a physical block device.
> Filesystems need to move beyond physical block devices - sanely
> supporting sparse virtual block devices has been on everyone's
> enterprise filesystem wish list for years.

Agreed, GParted is a tool for simple storage setups with current
full fat block devices and file systems.  As such, enterprise users with
multiple levels in their storage stack are not its target audience.

> GParted doesn't have to support these new features - it can simply
> turn them off for filesystems it creates on physical disk
> partitions, but we're doing stuff to support the storage models
> needed for container hosting, virtualisation, efficient backups and
> cloning, etc. If that means we have to break assumptions that legacy
> infrastructure make to support those new features, then so be it....
>
> <snip>
>
>> [2] For full disclosure, because tools for various FSs under-report
>>     their file system size, there is a heuristic that there must be at
>>     least 2% difference before the unallocated space and grow file system
>>     recommendation is generated, so under-reporting the FS size by less
>>     than 1% wouldn't actually be an issue for us.
>
> So, an ext3 example on a small root filesystem:
>
> $ grep sda1 /proc/partitions
>    8        1    9984366 sda1
> $ df -k /
> Filesystem     1K-blocks    Used Available Use% Mounted on
> /dev/root        9696448 8615892    581340  94% /
> $
>
> Just under 3% difference between fs reported size and the block
> device size, and obviously GParted has been fine with this sort of
> discrepancy on ext3 for the past 15+ years. IIRC the XFS metadata
> reservations max out at around 3% of total filesystem space, so
> GParted should be just fine with us hiding them by reducing total
> filesystem size...
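
For reference, the arithmetic behind the "just under 3%" figure quoted
above can be sketched as below.  This is only an illustration in Python:
the function name size_gap_percent is made up for this example, and it
assumes both /proc/partitions and df -k report sizes in 1 KiB units.

```python
def size_gap_percent(device_kib: int, fs_kib: int) -> float:
    """Return how much smaller the reported fs size is than the
    block device, as a percentage of the device size."""
    if device_kib <= 0:
        raise ValueError("device size must be positive")
    return (device_kib - fs_kib) / device_kib * 100.0


# Figures taken from the quoted ext3 example (sda1 and df -k /).
gap = size_gap_percent(device_kib=9984366, fs_kib=9696448)
print(f"{gap:.2f}%")  # prints "2.88%"
```

So the statvfs-reported size in that example is about 2.88% smaller than
the partition.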

(I assume you are aware, but for completeness ...)
By default ext2/4 kernel code subtracts some overhead blocks from the
statvfs reported f_blocks figure.  This is documented in mount(8)
against the bsddf/minixdf options.

So after checking, GParted was modified to use the dumpe2fs command to
read the superblock to get the file system size for mounted ext* file
systems too.

https://marc.info/?l=linux-ext4&m=134706477618732&w=2
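
To illustrate the idea (this is not GParted's actual code, just a rough
Python sketch): the file system size can be derived from the
"Block count:" and "Block size:" fields that dumpe2fs -h prints from the
superblock.  The helper names and the sample numbers below are
hypothetical, and running dumpe2fs normally needs read access to the
device.

```python
import re
import subprocess


def read_superblock_text(device: str) -> str:
    """Run `dumpe2fs -h DEVICE` and return its stdout (superblock summary)."""
    result = subprocess.run(
        ["dumpe2fs", "-h", device],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def ext_fs_size_bytes(dumpe2fs_text: str) -> int:
    """Compute fs size in bytes as block count * block size."""
    count = re.search(r"^Block count:\s+(\d+)\s*$", dumpe2fs_text, re.M)
    size = re.search(r"^Block size:\s+(\d+)\s*$", dumpe2fs_text, re.M)
    if count is None or size is None:
        raise ValueError("Block count/Block size not found in output")
    return int(count.group(1)) * int(size.group(1))


# Made-up sample in the shape of dumpe2fs -h output (not from a real disk).
SAMPLE = """\
Filesystem volume name:   <none>
Block count:              2441216
Block size:               4096
"""

print(ext_fs_size_bytes(SAMPLE))
```

The point of reading the superblock is that the result does not depend
on the bsddf/minixdf overhead subtraction applied to statvfs f_blocks.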

I see that xfs_db doesn't allow reading the superblock of mounted XFS
file systems.  So for the case of a mounted XFS on a full fat block device,
I guess I'll wait and see how much overhead is subtracted from the
statvfs f_blocks figure and make sure GParted accounts for that.
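
Roughly, the check I have in mind looks like the Python sketch below.
It is a simplified illustration, not GParted's implementation: the
function names are invented here, the 2% threshold is the heuristic from
[2], and the example treats the 896G figure from earlier in the thread
as the block device size, which is an assumption.

```python
import os


def statvfs_size_bytes(mount_point: str) -> int:
    """File system size as reported by statvfs: f_blocks * f_frsize."""
    st = os.statvfs(mount_point)
    return st.f_blocks * st.f_frsize


def should_suggest_grow(device_bytes: int, fs_bytes: int,
                        threshold_percent: int = 2) -> bool:
    """True if the fs is smaller than the device by at least
    threshold_percent of the device size (integer arithmetic)."""
    if device_bytes <= 0:
        raise ValueError("device size must be positive")
    gap = device_bytes - fs_bytes
    return gap * 100 >= threshold_percent * device_bytes


GIB = 2 ** 30
# Assumed case: 896G device, fs size reported as 889G after the change.
print(should_suggest_grow(896 * GIB, 889 * GIB))
```

With those numbers the gap is 7/896, under 1%, so no grow recommendation
would be shown.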

>> Just providing an app author's point of view.
>
> *nod*.
>
> We're aware that we need to let existing apps continue to work on
> existing formats and features. But we need to break from the old
> ways to do what people are asking us to do, so we're not going to
> lock ourselves in. If we're not breaking old things and making
> people unhappy, then we're not making sufficient progress.

Mike

Thread overview: 12+ messages
2018-05-14 20:02 How to reliably measure fs usage with reflinks enabled? Tarik Ceylan
2018-05-14 22:02 ` Eric Sandeen
2018-05-14 22:57   ` Dave Chinner
2018-05-14 23:37     ` Tarik Ceylan
2018-05-15  1:29       ` Dave Chinner
2018-05-15 13:52         ` Mike Fleetwood
2018-05-16  0:13           ` Dave Chinner
2018-05-18 14:43             ` Mike Fleetwood [this message]
2018-05-18 14:56               ` Eric Sandeen
2018-05-19  8:36                 ` Mike Fleetwood
2018-05-18 14:58         ` Darrick J. Wong
2018-05-20  0:10           ` Dave Chinner
