linux-fsdevel.vger.kernel.org archive mirror
From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: "Kani, Toshi" <toshi.kani@hpe.com>
Cc: "david@fromorbit.com" <david@fromorbit.com>,
	"ross.zwisler@linux.intel.com" <ross.zwisler@linux.intel.com>,
	"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>
Subject: Re: DAX 2MB mappings for XFS
Date: Fri, 12 Jan 2018 15:52:55 -0800
Message-ID: <20180112235255.GB5597@magnolia>
In-Reply-To: <1515801655.16384.57.camel@hpe.com>

On Fri, Jan 12, 2018 at 11:15:00PM +0000, Kani, Toshi wrote:
> On Sat, 2018-01-13 at 09:27 +1100, Dave Chinner wrote:
> > On Fri, Jan 12, 2018 at 09:38:22PM +0000, Kani, Toshi wrote:
> > > On Sat, 2018-01-13 at 08:19 +1100, Dave Chinner wrote:
> > >  :
> > > > IOWs, what you are seeing is trying to do a very large allocation on
> > > > a very small (8GB) XFS filesystem.  It's rare someone asks to
> > > > allocate >25% of the filesystem space in one allocation, so it's not
> > > > surprising it triggers ENOSPC-like algorithms because it doesn't fit
> > > > into a single AG....
> > > > 
> > > > We can probably look to optimise this, but I'm not sure if we can
> > > > easily differentiate this case (i.e. allocation request larger than
> > > > > contiguous free space) from the same situation near ENOSPC when we
> > > > really do have to trim to fit...
> > > > 
> > > > Remember: stripe unit allocation alignment is a hint in XFS that we
> > > > can and do ignore when necessary - it's not a binding rule.
> > > 
> > > Thanks for the clarification!  Can XFS allocate smaller extents so that
> > > each extent will fit to an AG?
> > 
> > I've already answered that question:
> > 
> > 	I'm not sure if we can easily differentiate this case (i.e.
> > > 	allocation request larger than contiguous free space) from
> > 	the same situation near ENOSPC when we really do have to
> > 	trim to fit...
> 
> Right.  I was thinking of limiting the extent size (e.g. to a half or a
> quarter of the AG size) regardless of the ENOSPC condition, but it may
> be the same thing.
> 
> > > ext4 creates multiple smaller extents for the same request.
> > 
> > Yes, because it has much, much smaller block groups so "allocation >
> > max extent size (128MB)" is a common path.
> > 
> > It's not a common path on XFS - filesystems (and hence AGs) are
> > typically orders of magnitude larger than the maximum extent size
> > (8GB) so the problem only shows up when we're near ENOSPC. XFS is
> > really not optimised for tiny filesystems, and when it comes to pmem
> > we were led to believe we'd have multiple terabytes of pmem in
> > systems by now, not still be stuck with 8GB NVDIMMS. Hence we've
> > spent very little time worrying about such issues because we
> > weren't aiming to support such small capacities for very long...
> 
> I see.  Yes, there will be multiple terabytes of capacity, but it will
> also be possible to divide it into multiple smaller namespaces.  So users
> may continue to have relatively small namespaces for their use cases.  If
> a user allocates a namespace that is just big enough to host several
> active files, it may hit this issue regardless of their size.

I am curious, why not just give XFS all the space and let it manage the space?

--D

> Thanks,
> -Toshi
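
For reference, the extent-size cap Toshi suggests above can be sketched as
plain arithmetic. This is only an illustration, not XFS allocator code: the
function name split_request, the AG count of 4 (inferred from the ">25% of
the filesystem" remark for the 8GB case), the 3GB request size, and the 2MB
alignment (taken from the thread subject) are all assumptions.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB


def split_request(request_bytes, ag_bytes, cap_divisor=2, align=2 * MIB):
    """Split one allocation request into extents capped at ag_bytes / cap_divisor.

    Hypothetical helper: the cap is rounded down to a multiple of `align`
    so every extent except possibly the last stays 2MB-sized-aligned.
    """
    if request_bytes <= 0:
        raise ValueError("request must be positive")
    # Largest extent we allow, rounded down to the alignment unit.
    cap = (ag_bytes // cap_divisor) // align * align
    if cap == 0:
        raise ValueError("AG too small for the requested alignment")

    extents = []
    remaining = request_bytes
    while remaining > 0:
        length = min(cap, remaining)
        extents.append(length)
        remaining -= length
    return extents


# Assumed layout: 8GB filesystem split into 4 AGs of 2GB each.
fs_bytes = 8 * GIB
ag_count = 4
ag_bytes = fs_bytes // ag_count

# A 3GB request does not fit in one 2GB AG, but each capped piece does.
request = 3 * GIB
extents = split_request(request, ag_bytes)
print([e // MIB for e in extents])
```

With a cap of half an AG, every piece fits inside a single AG, at the cost
of more extents per file. It does not address Dave's point that the
allocator cannot easily tell this case apart from a real near-ENOSPC trim.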

