public inbox for linux-xfs@vger.kernel.org
From: Dave Chinner <david@fromorbit.com>
To: Chris Dunlop <chris@onthe.net.au>
Cc: linux-xfs@vger.kernel.org
Subject: Re: Extreme fragmentation ho!
Date: Thu, 31 Dec 2020 09:03:11 +1100
Message-ID: <20201230220311.GB164134@dread.disaster.area>
In-Reply-To: <20201230062836.GA2695485@onthe.net.au>

On Wed, Dec 30, 2020 at 05:28:36PM +1100, Chris Dunlop wrote:
> On Tue, Dec 29, 2020 at 09:06:22AM +1100, Dave Chinner wrote:
> > On Tue, Dec 22, 2020 at 08:54:53AM +1100, Chris Dunlop wrote:
> > > The file is sitting on XFS on LV on a raid6 comprising 6 x 5400 RPM HDD:
> > 
> > ... probably not that unreasonable for pretty much the slowest
> > storage configuration you can possibly come up with for small,
> > metadata write intensive workloads.
> 
> [ Chris grimaces and glances over at the 8+3 erasure-encoded ceph rbd
> sitting like a pitch drop experiment in the corner. ]

I would have thought that should be able to do more than the 20 IOPS
the raid6 above will do on random 4kB writes.... :)

> Speaking of slow storage and metadata write intensive workloads, what's the
> reason reflinks with a realtime device aren't supported? That was one
> approach I wanted to try, to get the metadata ops running on a small fast
> storage with the bulk data sitting on big slow bulk storage. But:
> 
> # mkfs.xfs -m reflink=1 -d rtinherit=1 -r rtdev=/dev/fast /dev/slow
> reflink not supported with realtime devices

Yup, the realtime device is a pure data device, so all its metadata
is held externally to the device (i.e. it is held in the "data
device", not the RT device). IOWs, it's a completely separate
filesystem implementation within XFS, and so requires independent
functional extensions to support reflink + rmap...

> My naive thought was a reflink was probably "just" a block range referenced
> from multiple places, and probably a refcount somewhere. It seems like it
> should be possible to have the range, references and refcount sitting on the
> fast storage pointing to the actual data blocks on the slow storage.

Yes, it is possible, but the current reflink implementation is based
on allocation group internal structures (rmap is the same), and the
realtime device doesn't have these. Hence there are new metadata
structures that need to be added (refcount btrees rooted in inodes,
not fixed location AG headers) and a bunch of new supporting code to
be written. Darrick has largely done this already; it's just a
problem of review bandwidth and validation time:

https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux.git/log/?h=realtime-reflink-extsize

(which also includes realtime rmap support, a whole new internal
metadata inode directory to index all the new inode btrees for the
rt device, etc)
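
For anyone following along, the "block range referenced from multiple
places, plus a refcount" idea can be sketched as a toy model. This is
only an illustration in Python: the SharedBlocks class and its method
names are made up for this example, and it is not the XFS on-disk
format or the refcount btree code in the branch above.

```python
from collections import Counter


class SharedBlocks:
    """Toy per-block refcount tracker (hypothetical, not XFS's format)."""

    def __init__(self):
        # block number -> number of files referencing that block
        self.refcount = Counter()

    def allocate(self, start, length):
        """A file takes first ownership of blocks [start, start + length)."""
        for blk in range(start, start + length):
            if self.refcount[blk]:
                raise ValueError(f"block {blk} is already allocated")
            self.refcount[blk] = 1

    def reflink(self, start, length):
        """Another file references an existing range; no data is copied."""
        for blk in range(start, start + length):
            if not self.refcount[blk]:
                raise ValueError(f"block {blk} is not allocated")
            self.refcount[blk] += 1

    def release(self, start, length):
        """Drop one reference per block; return the blocks that became free."""
        freed = []
        for blk in range(start, start + length):
            if not self.refcount[blk]:
                raise ValueError(f"block {blk} is not allocated")
            self.refcount[blk] -= 1
            if self.refcount[blk] == 0:
                del self.refcount[blk]
                freed.append(blk)
        return freed

    def extents(self):
        """Merge adjacent blocks with equal refcounts into
        (start, length, refcount) records."""
        records = []
        for blk in sorted(self.refcount):
            count = self.refcount[blk]
            if (records
                    and records[-1][0] + records[-1][1] == blk
                    and records[-1][2] == count):
                start, length, _ = records[-1]
                records[-1] = (start, length + 1, count)
            else:
                records.append((blk, 1, count))
        return records
```

Allocating blocks 100-107 and then reflinking 102-105 leaves three
records, with only the middle one shared. Releasing the original
file's reference frees just the blocks that nobody else points at.
The bookkeeping is small; as I read it, the hard part described above
is where that index lives when there are no allocation groups to hold
it.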

It's a pretty large chunk of shiny new code....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

Thread overview: 5+ messages
2020-12-21 21:54 Extreme fragmentation ho! Chris Dunlop
2020-12-22 13:03 ` Brian Foster
2020-12-28 22:06 ` Dave Chinner
2020-12-30  6:28   ` Chris Dunlop
2020-12-30 22:03     ` Dave Chinner [this message]
