Date: Fri, 12 Jan 2018 15:52:55 -0800
From: "Darrick J. Wong"
To: "Kani, Toshi"
Cc: "david@fromorbit.com", "ross.zwisler@linux.intel.com", "linux-nvdimm@lists.01.org", "linux-fsdevel@vger.kernel.org"
Subject: Re: DAX 2MB mappings for XFS
Message-ID: <20180112235255.GB5597@magnolia>
References: <1515788779.16384.29.camel@hpe.com> <20180112211915.GF27323@dastard> <1515795857.16384.34.camel@hpe.com> <20180112222750.GG27323@dastard> <1515801655.16384.57.camel@hpe.com>
In-Reply-To: <1515801655.16384.57.camel@hpe.com>

On Fri, Jan 12, 2018 at 11:15:00PM +0000, Kani, Toshi wrote:
> On Sat, 2018-01-13 at 09:27 +1100, Dave Chinner wrote:
> > On Fri, Jan 12, 2018 at 09:38:22PM +0000, Kani, Toshi wrote:
> > > On Sat, 2018-01-13 at 08:19 +1100, Dave Chinner wrote:
> > >  :
> > > > IOWs, what you are seeing is trying to do a very large allocation on
> > > > a very small (8GB) XFS filesystem. It's rare someone asks to
> > > > allocate >25% of the filesystem space in one allocation, so it's not
> > > > surprising it triggers ENOSPC-like algorithms because it doesn't fit
> > > > into a single AG....
> > > >
> > > > We can probably look to optimise this, but I'm not sure if we can
> > > > easily differentiate this case (i.e. allocation request larger than
> > > > contiguous free space) from the same situation near ENOSPC when we
> > > > really do have to trim to fit...
> > > >
> > > > Remember: stripe unit allocation alignment is a hint in XFS that we
> > > > can and do ignore when necessary - it's not a binding rule.
> > >
> > > Thanks for the clarification!  Can XFS allocate smaller extents so that
> > > each extent will fit to an AG?
> >
> > I've already answered that question:
> >
> > 	I'm not sure if we can easily differentiate this case (i.e.
> > 	allocation request larger than contiguous free space) from
> > 	the same situation near ENOSPC when we really do have to
> > 	trim to fit...
>
> Right.  I was thinking to limit the extent size (i.e. a half or quarter
> of AG size) regardless of the ENOSPC condition, but it may be the same
> thing.
>
> > > ext4 creates multiple smaller extents for the same request.
> >
> > Yes, because it has much, much smaller block groups so
> > "allocation > max extent size (128MB)" is a common path.
> >
> > It's not a common path on XFS - filesystems (and hence AGs) are
> > typically orders of magnitude larger than the maximum extent size
> > (8GB) so the problem only shows up when we're near ENOSPC. XFS is
> > really not optimised for tiny filesystems, and when it comes to pmem
> > we were led to believe we'd have multiple terabytes of pmem in
> > systems by now, not still be stuck with 8GB NVDIMMS. Hence we've
> > spent very little time worrying about such issues because we
> > weren't aiming to support such small capacities for very long...
>
> I see.  Yes, there will be multiple terabytes capacity, but it will also
> allow to divide it into multiple smaller namespaces.  So, user may
> continue to have relatively smaller namespaces for their use cases.  If
> user allocates a namespace that is just big enough to host several
> active files, it may hit this issue regardless of their size.

I am curious, why not just give XFS all the space and let it manage the
space?

--D

> Thanks,
> -Toshi
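
For readers following the thread, the arithmetic behind the "doesn't fit into a single AG" point can be sketched roughly as below. This is an illustration only and not XFS code. The AG count of 4 is an assumption inferred from the ">25% of the filesystem space" remark, the 3 GiB request is a made-up example, and fits_in_one_ag() and split_request() are hypothetical helper names. The split mirrors the suggestion quoted above of capping each extent at a half of the AG size, kept 2 MiB aligned.

```python
GIB = 1 << 30
MIB = 1 << 20


def fits_in_one_ag(request, fs_size, ag_count):
    """Return True if the request is no larger than one allocation group.

    Assumes equally sized AGs; this is a best case that ignores metadata
    and existing allocations inside the AG.
    """
    return request <= fs_size // ag_count


def split_request(request, max_extent, align=2 * MIB):
    """Split a request into chunks of at most max_extent bytes.

    The cap is rounded down to a multiple of `align`. The final chunk
    carries whatever remains and may be smaller than the cap.
    """
    cap = (max_extent // align) * align
    if cap == 0:
        raise ValueError("max_extent is smaller than the alignment")
    chunks = []
    remaining = request
    while remaining > 0:
        chunk = min(cap, remaining)
        chunks.append(chunk)
        remaining -= chunk
    return chunks


if __name__ == "__main__":
    fs_size = 8 * GIB      # filesystem size mentioned in the thread
    ag_count = 4           # assumption, see the note above
    ag_size = fs_size // ag_count
    request = 3 * GIB      # hypothetical request larger than one AG

    print(fits_in_one_ag(request, fs_size, ag_count))
    print([c // MIB for c in split_request(request, ag_size // 2)])
```

With these assumed numbers the request does not fit in one 2 GiB AG, while capping extents at half an AG yields three 1 GiB pieces. Whether the real allocator can tell this case apart from a genuine near-ENOSPC condition is the open question raised in the thread, and the sketch does not address it.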