* [PATCH REPOST 0/2] xfs: rely on minleft instead of total for bmbt res
@ 2019-09-12 14:32 Brian Foster
2019-09-12 14:32 ` [PATCH REPOST 1/2] xfs: drop minlen before tossing alignment on bmap allocs Brian Foster
` (3 more replies)
0 siblings, 4 replies; 17+ messages in thread
From: Brian Foster @ 2019-09-12 14:32 UTC (permalink / raw)
To: linux-xfs; +Cc: Carlos Maiolino
Hi all,
This is a repost of a couple patches I posted a few months ago[1]. There
are no changes other than a rebase to for-next. Any thoughts on these? I
think Carlos had also run into some related generic/223 failures fairly
recently...
Carlos,
Any chance you could give these a try?
Brian
[1] https://lore.kernel.org/linux-xfs/20190501140504.16435-1-bfoster@redhat.com/
Brian Foster (2):
xfs: drop minlen before tossing alignment on bmap allocs
xfs: don't set bmapi total block req where minleft is sufficient
fs/xfs/libxfs/xfs_bmap.c | 13 +++++++++----
fs/xfs/xfs_bmap_util.c | 4 ++--
fs/xfs/xfs_dquot.c | 4 ++--
fs/xfs/xfs_iomap.c | 4 ++--
fs/xfs/xfs_reflink.c | 4 ++--
fs/xfs/xfs_rtalloc.c | 3 +--
6 files changed, 18 insertions(+), 14 deletions(-)
--
2.20.1
^ permalink raw reply [flat|nested] 17+ messages in thread* [PATCH REPOST 1/2] xfs: drop minlen before tossing alignment on bmap allocs 2019-09-12 14:32 [PATCH REPOST 0/2] xfs: rely on minleft instead of total for bmbt res Brian Foster @ 2019-09-12 14:32 ` Brian Foster 2019-09-12 22:35 ` Dave Chinner 2019-09-18 21:41 ` Darrick J. Wong 2019-09-12 14:32 ` [PATCH REPOST 2/2] xfs: don't set bmapi total block req where minleft is sufficient Brian Foster ` (2 subsequent siblings) 3 siblings, 2 replies; 17+ messages in thread From: Brian Foster @ 2019-09-12 14:32 UTC (permalink / raw) To: linux-xfs; +Cc: Carlos Maiolino The bmap block allocation code issues a sequence of retries to perform an optimal allocation, gradually loosening constraints as allocations fail. For example, the first attempt might begin at a particular bno, with maxlen == minlen and alignment incorporated. As allocations fail, the parameters fall back to different modes, drop alignment requirements and reduce the minlen and total block requirements. For large extent allocations with an args.total value that exceeds the allocation length (i.e., non-delalloc), the total value tends to dominate despite these fallbacks. For example, an aligned extent allocation request of tens to hundreds of MB that cannot be satisfied from a particular AG will not succeed after dropping alignment or minlen because xfs_alloc_space_available() never selects an AG that can't satisfy args.total. The retry sequence eventually reduces total and ultimately succeeds if a minlen extent is available somewhere, but the first several retries are effectively pointless in this scenario. Beyond simply being inefficient, another side effect of this behavior is that we drop alignment requirements too aggressively. Consider a 1GB fallocate on a 15GB fs with 16 AGs and 128k stripe unit: # xfs_io -c "falloc 0 1g" /mnt/file # <xfstests>/src/t_stripealign /mnt/file 32 /mnt/file: Start block 347176 not multiple of sunit 32 Despite the filesystem being completely empty, the fact that the allocation request cannot be satisifed from a single AG means the allocation doesn't succeed until xfs_bmap_btalloc() drops total from the original value based on maxlen. This occurs after we've dropped minlen and alignment (unnecessarily). As a step towards addressing this problem, insert a new retry in the bmap allocation sequence to drop minlen (from maxlen) before tossing alignment. This should still result in as large of an extent as possible as the block allocator prioritizes extent size in all but exact allocation modes. By itself, this does not change the behavior of the command above because the preallocation code still specifies total based on maxlen. Instead, this facilitates preservation of alignment once extra reservation is separated from the extent length portion of the total block requirement. Signed-off-by: Brian Foster <bfoster@redhat.com> --- fs/xfs/libxfs/xfs_bmap.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c index 054b4ce30033..eaa965920a03 100644 --- a/fs/xfs/libxfs/xfs_bmap.c +++ b/fs/xfs/libxfs/xfs_bmap.c @@ -3573,6 +3573,14 @@ xfs_bmap_btalloc( if ((error = xfs_alloc_vextent(&args))) return error; } + if (args.fsbno == NULLFSBLOCK && nullfb && + args.minlen > ap->minlen) { + args.minlen = ap->minlen; + args.fsbno = ap->blkno; + error = xfs_alloc_vextent(&args); + if (error) + return error; + } if (isaligned && args.fsbno == NULLFSBLOCK) { /* * allocation failed, so turn off alignment and @@ -3584,9 +3592,7 @@ xfs_bmap_btalloc( if ((error = xfs_alloc_vextent(&args))) return error; } - if (args.fsbno == NULLFSBLOCK && nullfb && - args.minlen > ap->minlen) { - args.minlen = ap->minlen; + if (args.fsbno == NULLFSBLOCK && nullfb) { args.type = XFS_ALLOCTYPE_START_BNO; args.fsbno = ap->blkno; if ((error = xfs_alloc_vextent(&args))) -- 2.20.1 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH REPOST 1/2] xfs: drop minlen before tossing alignment on bmap allocs 2019-09-12 14:32 ` [PATCH REPOST 1/2] xfs: drop minlen before tossing alignment on bmap allocs Brian Foster @ 2019-09-12 22:35 ` Dave Chinner 2019-09-12 22:46 ` Dave Chinner ` (2 more replies) 2019-09-18 21:41 ` Darrick J. Wong 1 sibling, 3 replies; 17+ messages in thread From: Dave Chinner @ 2019-09-12 22:35 UTC (permalink / raw) To: Brian Foster; +Cc: linux-xfs, Carlos Maiolino On Thu, Sep 12, 2019 at 10:32:22AM -0400, Brian Foster wrote: > The bmap block allocation code issues a sequence of retries to > perform an optimal allocation, gradually loosening constraints as > allocations fail. For example, the first attempt might begin at a > particular bno, with maxlen == minlen and alignment incorporated. As > allocations fail, the parameters fall back to different modes, drop > alignment requirements and reduce the minlen and total block > requirements. > > For large extent allocations with an args.total value that exceeds > the allocation length (i.e., non-delalloc), the total value tends to > dominate despite these fallbacks. For example, an aligned extent > allocation request of tens to hundreds of MB that cannot be > satisfied from a particular AG will not succeed after dropping > alignment or minlen because xfs_alloc_space_available() never > selects an AG that can't satisfy args.total. The retry sequence > eventually reduces total and ultimately succeeds if a minlen extent > is available somewhere, but the first several retries are > effectively pointless in this scenario. > > Beyond simply being inefficient, another side effect of this > behavior is that we drop alignment requirements too aggressively. > Consider a 1GB fallocate on a 15GB fs with 16 AGs and 128k stripe > unit: > > # xfs_io -c "falloc 0 1g" /mnt/file > # <xfstests>/src/t_stripealign /mnt/file 32 > /mnt/file: Start block 347176 not multiple of sunit 32 Ok, so what Carlos and I found last night was an issue with the the agresv code leading to the maximum free extent calculated by xfs_alloc_longest_free_extent() being longer than the largest allowable extent allocation (mp->m_ag_max_usable) resulting in the situation where blen > args->maxlen, and so in the case of initial allocation here, we never run this: /* * Adjust for alignment */ if (blen > args.alignment && blen <= args.maxlen) args.minlen = blen - args.alignment; args.minalignslop = 0; this is how we end up with args.minlen = args.maxlen and the initial allocation failing. The issue is the way mp->m_ag_max_usable is calculated versus how the pag->pag_meta_resv.ar_reserved value is set up for the finobt. That is, "ask" = max tree size, and "used" = 1 because we have a root block allocated. that code does: mp->m_ag_max_unused -= ask; ... pag->pag_meta_resv.ar_reserved = ask - used That means when we calculate the longest extent in the AG, we do: longest = pag->pagf_longest - min_needed - resv(NONE) = pag->pagf_longest - min_needed - pag->pag_meta_resv.ar_reserved whilst mp->m_ag_max_usable is calculated as usable = agf_length - AG header blocks - AGFL - resv(ask) When the AG is empty, this ends up with pag->pagf_longest = agf_length - AG header blocks and min_needed = AGFL blocks and resv(ask) = pag->pag_meta_resv.ar_reserved + 1 and so: longest = usable + 1 And that's how we get blen = maxlen + 1, and that's why alignment is being dropped for the initial allocation in this "allocate full AG" corner case. > Despite the filesystem being completely empty, the fact that the > allocation request cannot be satisifed from a single AG means the > allocation doesn't succeed until xfs_bmap_btalloc() drops total from > the original value based on maxlen. This occurs after we've dropped > minlen and alignment (unnecessarily). Right, we'll continue to fail until minlen is reduced appropriately. But that's not an issue in the fallback algorithms, that's a problem with the initial conditions not being set up correctly. > As a step towards addressing this problem, insert a new retry in the > bmap allocation sequence to drop minlen (from maxlen) before tossing > alignment. This should still result in as large of an extent as > possible as the block allocator prioritizes extent size in all but > exact allocation modes. By itself, this does not change the behavior > of the command above because the preallocation code still specifies > total based on maxlen. Instead, this facilitates preservation of > alignment once extra reservation is separated from the extent length > portion of the total block requirement. AFAICT this is not necessary. The prototypoe patch I wrote last night while working through this with Carlos is attached below. I updated with a variant of your patch 2 to demonstrate that it does actually solve the problem of full AG allocation failing to be aligned. Cheers, Dave. -- Dave Chinner david@fromorbit.com xfs: cap longest free extent to maximum allocatable From: Dave Chinner <dchinner@redhat.com> Cap longest extent to the largest we can allocate based on limits calculated at mount time. Dynamic state (such as finobt blocks) can result in the longest free extent exceeding the size we can allocate, and that results in failure to align full AG allocations when the AG is empty. Result: xfs_io-4413 [003] 426.412459: xfs_alloc_vextent_loopfailed: dev 8:96 agno 0 agbno 32 minlen 243968 maxlen 244000 mod 0 prod 1 minleft 1 total 262148 alignment 32 minalignslop 0 len 0 type NEAR_BNO otype START_BNO wasdel 0 wasfromfl 0 resv 0 datatype 0x5 firstblock 0xffffffffffffffff minlen and maxlen are now separated by the alignment size, and allocation fails because args.total > free space in the AG. Update: add a hacked version of Brian's xfs_bmapi_write() args.total hack to actually allow this initial aligned allocation to succeed: $ sudo mkfs.xfs -f -d su=128k,sw=4,size=15g /dev/sdg meta-data=/dev/sdg isize=512 agcount=16, agsize=245728 blks = sectsz=4096 attr=2, projid32bit=1 = crc=1 finobt=1, sparse=1, rmapbt=0 = reflink=1 data = bsize=4096 blocks=3931648, imaxpct=25 = sunit=32 swidth=128 blks naming =version 2 bsize=4096 ascii-ci=0, ftype=1 log =internal log bsize=4096 blocks=2560, version=2 = sectsz=4096 sunit=1 blks, lazy-count=1 realtime =none extsz=4096 blocks=0, rtextents=0 $ sudo mount /dev/sdg /mnt/test $ sudo xfs_io -f -c "falloc 0 1g" -c "bmap -vvp" /mnt/test/file /mnt/test/file: EXT: FILE-OFFSET BLOCK-RANGE AG AG-OFFSET TOTAL FLAGS 0: [0..1951999]: 1966080..3918079 1 (256..1952255) 1952000 010001 1: [1952000..2097151]: 3931904..4077055 2 (256..145407) 145152 010011 FLAG Values: 0100000 Shared extent 0010000 Unwritten preallocated extent 0001000 Doesn't begin on stripe unit 0000100 Doesn't end on stripe unit 0000010 Doesn't begin on stripe width 0000001 Doesn't end on stripe width $ sudo ~/src/xfstests-dev/src/t_stripealign /mnt/test/file 32 /mnt/test/file: well-aligned $ Note that it skips AG 1 now because AG 0 is not the AG with the longest free space - it's got a root inode chunk in it so it has less space in it than the other AGs. Hence it can't do a maximally sized aligned allocation, while AG 1 can. Note the difference in total compared to the initial trace above. xfs_io-4398 [003] 207.663502: xfs_ag_resv_needed: dev 8:96 agno 0 resv 0 freeblks 245707 flcount 4 resv 0 ask 0 len 1713 xfs_io-4398 [003] 207.663502: xfs_alloc_vextent_loopfailed: dev 8:96 agno 0 agbno 32 minlen 243968 maxlen 244000 mod 0 prod 1 minleft 1 total 0 alignment 32 minalignslop 0 len 0 type NEAR_BNO otype START_BNO wasdel 0 wasfromfl 0 resv 0 datatype 0x5 firstblock 0xffffffffffffffff .... xfs_io-4398 [003] 207.663503: xfs_ag_resv_needed: dev 8:96 agno 1 resv 0 freeblks 245715 flcount 4 resv 0 ask 0 len 1713 ..... xfs_io-4398 [003] 207.666010: xfs_alloc_size_done: dev 8:96 agno 1 agbno 32 minlen 243968 maxlen 244000 mod 0 prod 1 minleft 1 total 0 alignment 32 minalignslop 0 len 244000 type THIS_AG otype START_BNO wasdel 0 wasfromfl 0 resv 0 datatype 0x5 firstblock 0xffffffffffffffff Signed-off-by: Dave Chinner <dchinner@redhat.com> --- fs/xfs/libxfs/xfs_alloc.c | 3 ++- fs/xfs/libxfs/xfs_bmap.c | 14 ++++++++++++++ 2 files changed, 16 insertions(+), 1 deletion(-) diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c index 533b04aaf6f6..9dead25d2e70 100644 --- a/fs/xfs/libxfs/xfs_alloc.c +++ b/fs/xfs/libxfs/xfs_alloc.c @@ -1989,7 +1989,8 @@ xfs_alloc_longest_free_extent( * reservations and AGFL rules in place, we can return this extent. */ if (pag->pagf_longest > delta) - return pag->pagf_longest - delta; + return min_t(xfs_extlen_t, pag->pag_mount->m_ag_max_usable, + pag->pagf_longest - delta); /* Otherwise, let the caller try for 1 block if there's space. */ return pag->pagf_flcount > 0 || pag->pagf_longest > 0; diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c index 054b4ce30033..b05683f649a6 100644 --- a/fs/xfs/libxfs/xfs_bmap.c +++ b/fs/xfs/libxfs/xfs_bmap.c @@ -4286,6 +4286,20 @@ xfs_bmapi_write( #endif whichfork = xfs_bmapi_whichfork(flags); + /* + * XXX: Hack! + * + * If the total blocks requested is larger than an AG, we can't allocate + * all the space atomically and within a single AG. This will be a + * "short" allocation. In this case, just ignore the total block count + * and rely on minleft calculations to ensure the allocation we do fits + * inside an AG properly. + * + * Based on a patch from Brian. + */ + if (bma.total > mp->m_ag_max_usable) + bma.total = 0; + ASSERT(*nmap >= 1); ASSERT(*nmap <= XFS_BMAP_MAX_NMAP); ASSERT(tp != NULL); ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH REPOST 1/2] xfs: drop minlen before tossing alignment on bmap allocs 2019-09-12 22:35 ` Dave Chinner @ 2019-09-12 22:46 ` Dave Chinner 2019-09-13 14:58 ` Brian Foster 2019-09-17 12:22 ` Carlos Maiolino 2 siblings, 0 replies; 17+ messages in thread From: Dave Chinner @ 2019-09-12 22:46 UTC (permalink / raw) To: Brian Foster; +Cc: linux-xfs, Carlos Maiolino On Fri, Sep 13, 2019 at 08:35:19AM +1000, Dave Chinner wrote: > On Thu, Sep 12, 2019 at 10:32:22AM -0400, Brian Foster wrote: > > The bmap block allocation code issues a sequence of retries to > > perform an optimal allocation, gradually loosening constraints as > > allocations fail. For example, the first attempt might begin at a > > particular bno, with maxlen == minlen and alignment incorporated. As > > allocations fail, the parameters fall back to different modes, drop > > alignment requirements and reduce the minlen and total block > > requirements. > > > > For large extent allocations with an args.total value that exceeds > > the allocation length (i.e., non-delalloc), the total value tends to > > dominate despite these fallbacks. For example, an aligned extent > > allocation request of tens to hundreds of MB that cannot be > > satisfied from a particular AG will not succeed after dropping > > alignment or minlen because xfs_alloc_space_available() never > > selects an AG that can't satisfy args.total. The retry sequence > > eventually reduces total and ultimately succeeds if a minlen extent > > is available somewhere, but the first several retries are > > effectively pointless in this scenario. > > > > Beyond simply being inefficient, another side effect of this > > behavior is that we drop alignment requirements too aggressively. > > Consider a 1GB fallocate on a 15GB fs with 16 AGs and 128k stripe > > unit: > > > > # xfs_io -c "falloc 0 1g" /mnt/file > > # <xfstests>/src/t_stripealign /mnt/file 32 > > /mnt/file: Start block 347176 not multiple of sunit 32 > > Ok, so what Carlos and I found last night was an issue with the > the agresv code leading to the maximum free extent calculated > by xfs_alloc_longest_free_extent() being longer than the largest > allowable extent allocation (mp->m_ag_max_usable) resulting in the > situation where blen > args->maxlen, and so in the case of initial > allocation here, we never run this: Just to make it clear: carlos did all the hard work of narrowing it down and isolating the accounting discrepancy in the allocation code. All I did was put 2 and 2 together - the agresv discrepancy - wrote a quick patch and did a trace to make sure I didn't get 5... Cheers, Dave. -- Dave Chinner david@fromorbit.com ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH REPOST 1/2] xfs: drop minlen before tossing alignment on bmap allocs 2019-09-12 22:35 ` Dave Chinner 2019-09-12 22:46 ` Dave Chinner @ 2019-09-13 14:58 ` Brian Foster 2019-09-14 22:00 ` Dave Chinner 2019-09-17 12:22 ` Carlos Maiolino 2 siblings, 1 reply; 17+ messages in thread From: Brian Foster @ 2019-09-13 14:58 UTC (permalink / raw) To: Dave Chinner; +Cc: linux-xfs, Carlos Maiolino On Fri, Sep 13, 2019 at 08:35:19AM +1000, Dave Chinner wrote: > On Thu, Sep 12, 2019 at 10:32:22AM -0400, Brian Foster wrote: > > The bmap block allocation code issues a sequence of retries to > > perform an optimal allocation, gradually loosening constraints as > > allocations fail. For example, the first attempt might begin at a > > particular bno, with maxlen == minlen and alignment incorporated. As > > allocations fail, the parameters fall back to different modes, drop > > alignment requirements and reduce the minlen and total block > > requirements. > > > > For large extent allocations with an args.total value that exceeds > > the allocation length (i.e., non-delalloc), the total value tends to > > dominate despite these fallbacks. For example, an aligned extent > > allocation request of tens to hundreds of MB that cannot be > > satisfied from a particular AG will not succeed after dropping > > alignment or minlen because xfs_alloc_space_available() never > > selects an AG that can't satisfy args.total. The retry sequence > > eventually reduces total and ultimately succeeds if a minlen extent > > is available somewhere, but the first several retries are > > effectively pointless in this scenario. > > > > Beyond simply being inefficient, another side effect of this > > behavior is that we drop alignment requirements too aggressively. > > Consider a 1GB fallocate on a 15GB fs with 16 AGs and 128k stripe > > unit: > > > > # xfs_io -c "falloc 0 1g" /mnt/file > > # <xfstests>/src/t_stripealign /mnt/file 32 > > /mnt/file: Start block 347176 not multiple of sunit 32 > > Ok, so what Carlos and I found last night was an issue with the > the agresv code leading to the maximum free extent calculated > by xfs_alloc_longest_free_extent() being longer than the largest > allowable extent allocation (mp->m_ag_max_usable) resulting in the > situation where blen > args->maxlen, and so in the case of initial > allocation here, we never run this: > > /* > * Adjust for alignment > */ > if (blen > args.alignment && blen <= args.maxlen) > args.minlen = blen - args.alignment; > args.minalignslop = 0; > Interesting, I think I missed this logic when I looked at this originally. It makes sense that we should be allowing this to handle the maxlen = minlen alignment case. > this is how we end up with args.minlen = args.maxlen and the > initial allocation failing. > > The issue is the way mp->m_ag_max_usable is calculated versus how > the pag->pag_meta_resv.ar_reserved value is set up for the finobt. > That is, "ask" = max tree size, and "used" = 1 because we have a > root block allocated. that code does: > Ok, I see that this behavior changes without perag reservations in the mix (i.e., disable finobt, reflink, etc.). The perag reservation calculation stuff always confuses me, however.. > mp->m_ag_max_unused -= ask; > ... > pag->pag_meta_resv.ar_reserved = ask - used > > That means when we calculate the longest extent in the AG, we do: > > longest = pag->pagf_longest - min_needed - resv(NONE) > = pag->pagf_longest - min_needed - pag->pag_meta_resv.ar_reserved > So here we use ar_reserved, which reflects outstanding and so far unused reservation for the metadata type. > whilst mp->m_ag_max_usable is calculated as > > usable = agf_length - AG header blocks - AGFL - resv(ask) > And here we apparently use the total reservation requirement, regardless of how much is used, because we're calculating the maximum amount ever available from the AG. > When the AG is empty, this ends up with > > pag->pagf_longest = agf_length - AG header blocks > and > min_needed = AGFL blocks > and > resv(ask) = pag->pag_meta_resv.ar_reserved + 1 > > and so: > longest = usable + 1 > > And that's how we get blen = maxlen + 1, and that's why alignment is > being dropped for the initial allocation in this "allocate full AG" > corner case. > And maxlen is capped to ->m_ag_max_usable, hence the delta between the two values. I think this makes sense. Thanks for the breakdown. > > Despite the filesystem being completely empty, the fact that the > > allocation request cannot be satisifed from a single AG means the > > allocation doesn't succeed until xfs_bmap_btalloc() drops total from > > the original value based on maxlen. This occurs after we've dropped > > minlen and alignment (unnecessarily). > > Right, we'll continue to fail until minlen is reduced appropriately. > But that's not an issue in the fallback algorithms, that's a problem > with the initial conditions not being set up correctly. > > > As a step towards addressing this problem, insert a new retry in the > > bmap allocation sequence to drop minlen (from maxlen) before tossing > > alignment. This should still result in as large of an extent as > > possible as the block allocator prioritizes extent size in all but > > exact allocation modes. By itself, this does not change the behavior > > of the command above because the preallocation code still specifies > > total based on maxlen. Instead, this facilitates preservation of > > alignment once extra reservation is separated from the extent length > > portion of the total block requirement. > > AFAICT this is not necessary. The prototypoe patch I wrote last > night while working through this with Carlos is attached below. I > updated with a variant of your patch 2 to demonstrate that it does > actually solve the problem of full AG allocation failing to be > aligned. > I agree that this addresses the reported issue, but I can reproduce other corner cases affected by the original patch that aren't affected by this one. For example, if the allocation request happens to be slightly less than blen but not enough to allow for alignment, minlen isn't dropped and we can run through the same allocation retry sequence that kills off alignment before success. This does raise an interesting question in that current allocation behavior allocates one extent out of whatever that largest free extent happened to be in the AG, regardless of whether it's aligned (because args.alignment drops to 1 before minlen). With the retry in this patch, we'd satisfy alignment but end up allocating two extents in the file after chopping alignment off the first free extent. I could see a reasonable argument for either. On one hand contiguity may be preferable to alignment. On the other, the difference between one or two extents in this case is basically the alignment size. Getting back an unaligned extent of say 200MB over an aligned extent 128k smaller might not be worth it, or at the very least unexpected to users who don't understand XFS geometry when there are multiples of that amount of free space still available in the broader fs. I don't feel too strongly about it either way, but figured it's worth noting.. ... > > xfs: cap longest free extent to maximum allocatable > > From: Dave Chinner <dchinner@redhat.com> > > Cap longest extent to the largest we can allocate based on limits > calculated at mount time. Dynamic state (such as finobt blocks) > can result in the longest free extent exceeding the size we can > allocate, and that results in failure to align full AG allocations > when the AG is empty. > ... > Signed-off-by: Dave Chinner <dchinner@redhat.com> > --- Are either of you guys planning to post this patch without the bmapi.total hack? Brian > fs/xfs/libxfs/xfs_alloc.c | 3 ++- > fs/xfs/libxfs/xfs_bmap.c | 14 ++++++++++++++ > 2 files changed, 16 insertions(+), 1 deletion(-) > > diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c > index 533b04aaf6f6..9dead25d2e70 100644 > --- a/fs/xfs/libxfs/xfs_alloc.c > +++ b/fs/xfs/libxfs/xfs_alloc.c > @@ -1989,7 +1989,8 @@ xfs_alloc_longest_free_extent( > * reservations and AGFL rules in place, we can return this extent. > */ > if (pag->pagf_longest > delta) > - return pag->pagf_longest - delta; > + return min_t(xfs_extlen_t, pag->pag_mount->m_ag_max_usable, > + pag->pagf_longest - delta); > > /* Otherwise, let the caller try for 1 block if there's space. */ > return pag->pagf_flcount > 0 || pag->pagf_longest > 0; > diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c > index 054b4ce30033..b05683f649a6 100644 > --- a/fs/xfs/libxfs/xfs_bmap.c > +++ b/fs/xfs/libxfs/xfs_bmap.c > @@ -4286,6 +4286,20 @@ xfs_bmapi_write( > #endif > whichfork = xfs_bmapi_whichfork(flags); > > + /* > + * XXX: Hack! > + * > + * If the total blocks requested is larger than an AG, we can't allocate > + * all the space atomically and within a single AG. This will be a > + * "short" allocation. In this case, just ignore the total block count > + * and rely on minleft calculations to ensure the allocation we do fits > + * inside an AG properly. > + * > + * Based on a patch from Brian. > + */ > + if (bma.total > mp->m_ag_max_usable) > + bma.total = 0; > + > ASSERT(*nmap >= 1); > ASSERT(*nmap <= XFS_BMAP_MAX_NMAP); > ASSERT(tp != NULL); ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH REPOST 1/2] xfs: drop minlen before tossing alignment on bmap allocs 2019-09-13 14:58 ` Brian Foster @ 2019-09-14 22:00 ` Dave Chinner 2019-09-15 13:09 ` Brian Foster 0 siblings, 1 reply; 17+ messages in thread From: Dave Chinner @ 2019-09-14 22:00 UTC (permalink / raw) To: Brian Foster; +Cc: linux-xfs, Carlos Maiolino On Fri, Sep 13, 2019 at 10:58:02AM -0400, Brian Foster wrote: > On Fri, Sep 13, 2019 at 08:35:19AM +1000, Dave Chinner wrote: > > On Thu, Sep 12, 2019 at 10:32:22AM -0400, Brian Foster wrote: > > > The bmap block allocation code issues a sequence of retries to > > > perform an optimal allocation, gradually loosening constraints as > > > allocations fail. For example, the first attempt might begin at a > > > particular bno, with maxlen == minlen and alignment incorporated. As > > > allocations fail, the parameters fall back to different modes, drop > > > alignment requirements and reduce the minlen and total block > > > requirements. > > > > > > For large extent allocations with an args.total value that exceeds > > > the allocation length (i.e., non-delalloc), the total value tends to > > > dominate despite these fallbacks. For example, an aligned extent > > > allocation request of tens to hundreds of MB that cannot be > > > satisfied from a particular AG will not succeed after dropping > > > alignment or minlen because xfs_alloc_space_available() never > > > selects an AG that can't satisfy args.total. The retry sequence > > > eventually reduces total and ultimately succeeds if a minlen extent > > > is available somewhere, but the first several retries are > > > effectively pointless in this scenario. > > > > > > Beyond simply being inefficient, another side effect of this > > > behavior is that we drop alignment requirements too aggressively. > > > Consider a 1GB fallocate on a 15GB fs with 16 AGs and 128k stripe > > > unit: > > > > > > # xfs_io -c "falloc 0 1g" /mnt/file > > > # <xfstests>/src/t_stripealign /mnt/file 32 > > > /mnt/file: Start block 347176 not multiple of sunit 32 > > > > Ok, so what Carlos and I found last night was an issue with the > > the agresv code leading to the maximum free extent calculated > > by xfs_alloc_longest_free_extent() being longer than the largest > > allowable extent allocation (mp->m_ag_max_usable) resulting in the > > situation where blen > args->maxlen, and so in the case of initial > > allocation here, we never run this: > > > > /* > > * Adjust for alignment > > */ > > if (blen > args.alignment && blen <= args.maxlen) > > args.minlen = blen - args.alignment; > > args.minalignslop = 0; > > .... > > > As a step towards addressing this problem, insert a new retry in the > > > bmap allocation sequence to drop minlen (from maxlen) before tossing > > > alignment. This should still result in as large of an extent as > > > possible as the block allocator prioritizes extent size in all but > > > exact allocation modes. By itself, this does not change the behavior > > > of the command above because the preallocation code still specifies > > > total based on maxlen. Instead, this facilitates preservation of > > > alignment once extra reservation is separated from the extent length > > > portion of the total block requirement. > > > > AFAICT this is not necessary. The prototypoe patch I wrote last > > night while working through this with Carlos is attached below. I > > updated with a variant of your patch 2 to demonstrate that it does > > actually solve the problem of full AG allocation failing to be > > aligned. > > > > I agree that this addresses the reported issue, but I can reproduce > other corner cases affected by the original patch that aren't affected > by this one. For example, if the allocation request happens to be > slightly less than blen but not enough to allow for alignment, minlen > isn't dropped and we can run through the same allocation retry sequence > that kills off alignment before success. But isn't that just another variation of the initial conditions (minlen/maxlen) not being set up correctly for alignment when the AG is empty? i.e. Take the above condition and change it like this: /* * Adjust for alignment */ - if (blen > args.alignment && blen <= args.maxlen) + if (blen > args.alignment && blen <= args.maxlen + args.alignment) args.minlen = blen - args.alignment; args.minalignslop = 0; and now we cover all the cases when blen covers an aligned maxlen allocation... Cheers, Dave. -- Dave Chinner david@fromorbit.com ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH REPOST 1/2] xfs: drop minlen before tossing alignment on bmap allocs 2019-09-14 22:00 ` Dave Chinner @ 2019-09-15 13:09 ` Brian Foster 2019-09-16 8:36 ` Carlos Maiolino 0 siblings, 1 reply; 17+ messages in thread From: Brian Foster @ 2019-09-15 13:09 UTC (permalink / raw) To: Dave Chinner; +Cc: linux-xfs, Carlos Maiolino On Sun, Sep 15, 2019 at 08:00:35AM +1000, Dave Chinner wrote: > On Fri, Sep 13, 2019 at 10:58:02AM -0400, Brian Foster wrote: > > On Fri, Sep 13, 2019 at 08:35:19AM +1000, Dave Chinner wrote: > > > On Thu, Sep 12, 2019 at 10:32:22AM -0400, Brian Foster wrote: > > > > The bmap block allocation code issues a sequence of retries to > > > > perform an optimal allocation, gradually loosening constraints as > > > > allocations fail. For example, the first attempt might begin at a > > > > particular bno, with maxlen == minlen and alignment incorporated. As > > > > allocations fail, the parameters fall back to different modes, drop > > > > alignment requirements and reduce the minlen and total block > > > > requirements. > > > > > > > > For large extent allocations with an args.total value that exceeds > > > > the allocation length (i.e., non-delalloc), the total value tends to > > > > dominate despite these fallbacks. For example, an aligned extent > > > > allocation request of tens to hundreds of MB that cannot be > > > > satisfied from a particular AG will not succeed after dropping > > > > alignment or minlen because xfs_alloc_space_available() never > > > > selects an AG that can't satisfy args.total. The retry sequence > > > > eventually reduces total and ultimately succeeds if a minlen extent > > > > is available somewhere, but the first several retries are > > > > effectively pointless in this scenario. > > > > > > > > Beyond simply being inefficient, another side effect of this > > > > behavior is that we drop alignment requirements too aggressively. > > > > Consider a 1GB fallocate on a 15GB fs with 16 AGs and 128k stripe > > > > unit: > > > > > > > > # xfs_io -c "falloc 0 1g" /mnt/file > > > > # <xfstests>/src/t_stripealign /mnt/file 32 > > > > /mnt/file: Start block 347176 not multiple of sunit 32 > > > > > > Ok, so what Carlos and I found last night was an issue with the > > > the agresv code leading to the maximum free extent calculated > > > by xfs_alloc_longest_free_extent() being longer than the largest > > > allowable extent allocation (mp->m_ag_max_usable) resulting in the > > > situation where blen > args->maxlen, and so in the case of initial > > > allocation here, we never run this: > > > > > > /* > > > * Adjust for alignment > > > */ > > > if (blen > args.alignment && blen <= args.maxlen) > > > args.minlen = blen - args.alignment; > > > args.minalignslop = 0; > > > > .... > > > > As a step towards addressing this problem, insert a new retry in the > > > > bmap allocation sequence to drop minlen (from maxlen) before tossing > > > > alignment. This should still result in as large of an extent as > > > > possible as the block allocator prioritizes extent size in all but > > > > exact allocation modes. By itself, this does not change the behavior > > > > of the command above because the preallocation code still specifies > > > > total based on maxlen. Instead, this facilitates preservation of > > > > alignment once extra reservation is separated from the extent length > > > > portion of the total block requirement. > > > > > > AFAICT this is not necessary. The prototypoe patch I wrote last > > > night while working through this with Carlos is attached below. I > > > updated with a variant of your patch 2 to demonstrate that it does > > > actually solve the problem of full AG allocation failing to be > > > aligned. > > > > > > > I agree that this addresses the reported issue, but I can reproduce > > other corner cases affected by the original patch that aren't affected > > by this one. For example, if the allocation request happens to be > > slightly less than blen but not enough to allow for alignment, minlen > > isn't dropped and we can run through the same allocation retry sequence > > that kills off alignment before success. > > But isn't that just another variation of the initial conditions > (minlen/maxlen) not being set up correctly for alignment when the AG > is empty? > Perhaps, though I don't think it's exclusive to an empty AG. > i.e. Take the above condition and change it like this: > > /* > * Adjust for alignment > */ > - if (blen > args.alignment && blen <= args.maxlen) > + if (blen > args.alignment && blen <= args.maxlen + args.alignment) > args.minlen = blen - args.alignment; > args.minalignslop = 0; > > and now we cover all the cases when blen covers an aligned maxlen > allocation... > Do we want to consider whether minlen goes to 1? Otherwise that looks reasonable to me. What I was trying to get at is just that we should consider whether there are any other corner cases (that we might care about) where this particular allocation might not behave as expected vs. just the example used in the original commit log. If somebody wants to send a finalized patch or two with these fixes along with the bma.total one (or I can tack it on in reply..?), I'll think about it further on review as well.. Brian > Cheers, > > Dave. > -- > Dave Chinner > david@fromorbit.com ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH REPOST 1/2] xfs: drop minlen before tossing alignment on bmap allocs 2019-09-15 13:09 ` Brian Foster @ 2019-09-16 8:36 ` Carlos Maiolino 0 siblings, 0 replies; 17+ messages in thread From: Carlos Maiolino @ 2019-09-16 8:36 UTC (permalink / raw) To: Brian Foster; +Cc: Dave Chinner, linux-xfs > > > > aligned. > > > > > > > > > > I agree that this addresses the reported issue, but I can reproduce > > > other corner cases affected by the original patch that aren't affected > > > by this one. For example, if the allocation request happens to be > > > slightly less than blen but not enough to allow for alignment, minlen > > > isn't dropped and we can run through the same allocation retry sequence > > > that kills off alignment before success. > > > > But isn't that just another variation of the initial conditions > > (minlen/maxlen) not being set up correctly for alignment when the AG > > is empty? > > > > Perhaps, though I don't think it's exclusive to an empty AG. > > > i.e. Take the above condition and change it like this: > > > > /* > > * Adjust for alignment > > */ > > - if (blen > args.alignment && blen <= args.maxlen) > > + if (blen > args.alignment && blen <= args.maxlen + args.alignment) > > args.minlen = blen - args.alignment; > > args.minalignslop = 0; > > > > and now we cover all the cases when blen covers an aligned maxlen > > allocation... > > > > Do we want to consider whether minlen goes to 1? Otherwise that looks > reasonable to me. What I was trying to get at is just that we should > consider whether there are any other corner cases (that we might care > about) where this particular allocation might not behave as expected vs. > just the example used in the original commit log. > > If somebody wants to send a finalized patch or two with these fixes > along with the bma.total one (or I can tack it on in reply..?), I'll > think about it further on review as well.. I didn't realize the conversation was already going on before I replied for the first time, my apologies for unnecessary emails. I've been working on some patches about this issue since I had this chat with Dave, but I do not have anything 'mature' yet, exactly because some of the issues you mentioned above, like the behavior not being exclusive to an empty AG, and the fact the generic/223 was still failing after the 'fix' has been applied (the single large fallocated file worked, but generic/223 no), so I was kind of chasing my own tails on Friday :D I'll get back to it today and see what I can do with fresh eyes. Thanks Dave and Brian. > > Brian > > > Cheers, > > > > Dave. > > -- > > Dave Chinner > > david@fromorbit.com -- Carlos ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH REPOST 1/2] xfs: drop minlen before tossing alignment on bmap allocs 2019-09-12 22:35 ` Dave Chinner 2019-09-12 22:46 ` Dave Chinner 2019-09-13 14:58 ` Brian Foster @ 2019-09-17 12:22 ` Carlos Maiolino 2 siblings, 0 replies; 17+ messages in thread From: Carlos Maiolino @ 2019-09-17 12:22 UTC (permalink / raw) To: Dave Chinner; +Cc: Brian Foster, linux-xfs Hi Dave. I've been playing a bit with it, and, based on our talk on IRC: > +++ b/fs/xfs/libxfs/xfs_bmap.c > @@ -4286,6 +4286,20 @@ xfs_bmapi_write( > #endif > whichfork = xfs_bmapi_whichfork(flags); > > + /* > + * XXX: Hack! > + * > + * If the total blocks requested is larger than an AG, we can't allocate > + * all the space atomically and within a single AG. This will be a > + * "short" allocation. In this case, just ignore the total block count > + * and rely on minleft calculations to ensure the allocation we do fits > + * inside an AG properly. > + * > + * Based on a patch from Brian. > + */ > + if (bma.total > mp->m_ag_max_usable) > + bma.total = 0; Instead zeroing bma.total here, can't we crop it to blen in xfs_bmap_btalloc()? I did some tests here and looks like the result is the same, although I'm not sure if it's a good approach. Cheers > + > ASSERT(*nmap >= 1); > ASSERT(*nmap <= XFS_BMAP_MAX_NMAP); > ASSERT(tp != NULL); -- Carlos ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH REPOST 1/2] xfs: drop minlen before tossing alignment on bmap allocs 2019-09-12 14:32 ` [PATCH REPOST 1/2] xfs: drop minlen before tossing alignment on bmap allocs Brian Foster 2019-09-12 22:35 ` Dave Chinner @ 2019-09-18 21:41 ` Darrick J. Wong 2019-09-19 11:49 ` Brian Foster 1 sibling, 1 reply; 17+ messages in thread From: Darrick J. Wong @ 2019-09-18 21:41 UTC (permalink / raw) To: Brian Foster; +Cc: linux-xfs, Carlos Maiolino On Thu, Sep 12, 2019 at 10:32:22AM -0400, Brian Foster wrote: > The bmap block allocation code issues a sequence of retries to > perform an optimal allocation, gradually loosening constraints as > allocations fail. For example, the first attempt might begin at a > particular bno, with maxlen == minlen and alignment incorporated. As > allocations fail, the parameters fall back to different modes, drop > alignment requirements and reduce the minlen and total block > requirements. > > For large extent allocations with an args.total value that exceeds > the allocation length (i.e., non-delalloc), the total value tends to > dominate despite these fallbacks. For example, an aligned extent > allocation request of tens to hundreds of MB that cannot be > satisfied from a particular AG will not succeed after dropping > alignment or minlen because xfs_alloc_space_available() never > selects an AG that can't satisfy args.total. The retry sequence "..that can satisfy args.total"? > eventually reduces total and ultimately succeeds if a minlen extent > is available somewhere, but the first several retries are > effectively pointless in this scenario. > > Beyond simply being inefficient, another side effect of this > behavior is that we drop alignment requirements too aggressively. > Consider a 1GB fallocate on a 15GB fs with 16 AGs and 128k stripe > unit: > > # xfs_io -c "falloc 0 1g" /mnt/file > # <xfstests>/src/t_stripealign /mnt/file 32 > /mnt/file: Start block 347176 not multiple of sunit 32 > > Despite the filesystem being completely empty, the fact that the > allocation request cannot be satisifed from a single AG means the > allocation doesn't succeed until xfs_bmap_btalloc() drops total from > the original value based on maxlen. This occurs after we've dropped > minlen and alignment (unnecessarily). > > As a step towards addressing this problem, insert a new retry in the > bmap allocation sequence to drop minlen (from maxlen) before tossing > alignment. This should still result in as large of an extent as > possible as the block allocator prioritizes extent size in all but > exact allocation modes. By itself, this does not change the behavior > of the command above because the preallocation code still specifies > total based on maxlen. Instead, this facilitates preservation of > alignment once extra reservation is separated from the extent length > portion of the total block requirement. > > Signed-off-by: Brian Foster <bfoster@redhat.com> > --- > fs/xfs/libxfs/xfs_bmap.c | 12 +++++++++--- > 1 file changed, 9 insertions(+), 3 deletions(-) > > diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c > index 054b4ce30033..eaa965920a03 100644 > --- a/fs/xfs/libxfs/xfs_bmap.c > +++ b/fs/xfs/libxfs/xfs_bmap.c > @@ -3573,6 +3573,14 @@ xfs_bmap_btalloc( > if ((error = xfs_alloc_vextent(&args))) > return error; > } > + if (args.fsbno == NULLFSBLOCK && nullfb && > + args.minlen > ap->minlen) { Maybe a comment here to point out that we're retrying the allocation with the minimum acceptable minlen? I say this mostly because (some of) the other clauses have a quick description of the constraints that are being fed to the allocation request, and it makes it easier to keep track of what's going on. > + args.minlen = ap->minlen; > + args.fsbno = ap->blkno; > + error = xfs_alloc_vextent(&args); > + if (error) > + return error; > + } > if (isaligned && args.fsbno == NULLFSBLOCK) { > /* > * allocation failed, so turn off alignment and > @@ -3584,9 +3592,7 @@ xfs_bmap_btalloc( > if ((error = xfs_alloc_vextent(&args))) > return error; > } > - if (args.fsbno == NULLFSBLOCK && nullfb && > - args.minlen > ap->minlen) { > - args.minlen = ap->minlen; > + if (args.fsbno == NULLFSBLOCK && nullfb) { > args.type = XFS_ALLOCTYPE_START_BNO; Particularly when we get here and I have to look pretty closely to figure out what this retry is now attempting to do. This one is trying the allocation again, but now with no alignment and the caller's minlen, right? --D > args.fsbno = ap->blkno; > if ((error = xfs_alloc_vextent(&args))) > -- > 2.20.1 > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH REPOST 1/2] xfs: drop minlen before tossing alignment on bmap allocs 2019-09-18 21:41 ` Darrick J. Wong @ 2019-09-19 11:49 ` Brian Foster 0 siblings, 0 replies; 17+ messages in thread From: Brian Foster @ 2019-09-19 11:49 UTC (permalink / raw) To: Darrick J. Wong; +Cc: linux-xfs, Carlos Maiolino On Wed, Sep 18, 2019 at 02:41:41PM -0700, Darrick J. Wong wrote: > On Thu, Sep 12, 2019 at 10:32:22AM -0400, Brian Foster wrote: > > The bmap block allocation code issues a sequence of retries to > > perform an optimal allocation, gradually loosening constraints as > > allocations fail. For example, the first attempt might begin at a > > particular bno, with maxlen == minlen and alignment incorporated. As > > allocations fail, the parameters fall back to different modes, drop > > alignment requirements and reduce the minlen and total block > > requirements. > > > > For large extent allocations with an args.total value that exceeds > > the allocation length (i.e., non-delalloc), the total value tends to > > dominate despite these fallbacks. For example, an aligned extent > > allocation request of tens to hundreds of MB that cannot be > > satisfied from a particular AG will not succeed after dropping > > alignment or minlen because xfs_alloc_space_available() never > > selects an AG that can't satisfy args.total. The retry sequence > > "..that can satisfy args.total"? > Heh. "can't" was intended there, but after reading it back it is poorly worded as a double negative. :P Basically it's just saying that args.total is what restricts AG selection in this corner case. > > eventually reduces total and ultimately succeeds if a minlen extent > > is available somewhere, but the first several retries are > > effectively pointless in this scenario. > > > > Beyond simply being inefficient, another side effect of this > > behavior is that we drop alignment requirements too aggressively. > > Consider a 1GB fallocate on a 15GB fs with 16 AGs and 128k stripe > > unit: > > > > # xfs_io -c "falloc 0 1g" /mnt/file > > # <xfstests>/src/t_stripealign /mnt/file 32 > > /mnt/file: Start block 347176 not multiple of sunit 32 > > > > Despite the filesystem being completely empty, the fact that the > > allocation request cannot be satisifed from a single AG means the > > allocation doesn't succeed until xfs_bmap_btalloc() drops total from > > the original value based on maxlen. This occurs after we've dropped > > minlen and alignment (unnecessarily). > > > > As a step towards addressing this problem, insert a new retry in the > > bmap allocation sequence to drop minlen (from maxlen) before tossing > > alignment. This should still result in as large of an extent as > > possible as the block allocator prioritizes extent size in all but > > exact allocation modes. By itself, this does not change the behavior > > of the command above because the preallocation code still specifies > > total based on maxlen. Instead, this facilitates preservation of > > alignment once extra reservation is separated from the extent length > > portion of the total block requirement. > > > > Signed-off-by: Brian Foster <bfoster@redhat.com> > > --- > > fs/xfs/libxfs/xfs_bmap.c | 12 +++++++++--- > > 1 file changed, 9 insertions(+), 3 deletions(-) > > > > diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c > > index 054b4ce30033..eaa965920a03 100644 > > --- a/fs/xfs/libxfs/xfs_bmap.c > > +++ b/fs/xfs/libxfs/xfs_bmap.c > > @@ -3573,6 +3573,14 @@ xfs_bmap_btalloc( > > if ((error = xfs_alloc_vextent(&args))) > > return error; > > } > > + if (args.fsbno == NULLFSBLOCK && nullfb && > > + args.minlen > ap->minlen) { > > Maybe a comment here to point out that we're retrying the allocation > with the minimum acceptable minlen? I say this mostly because (some of) > the other clauses have a quick description of the constraints that are > being fed to the allocation request, and it makes it easier to keep > track of what's going on. > Yeah.. > > + args.minlen = ap->minlen; > > + args.fsbno = ap->blkno; > > + error = xfs_alloc_vextent(&args); > > + if (error) > > + return error; > > + } > > if (isaligned && args.fsbno == NULLFSBLOCK) { > > /* > > * allocation failed, so turn off alignment and > > @@ -3584,9 +3592,7 @@ xfs_bmap_btalloc( > > if ((error = xfs_alloc_vextent(&args))) > > return error; > > } > > - if (args.fsbno == NULLFSBLOCK && nullfb && > > - args.minlen > ap->minlen) { > > - args.minlen = ap->minlen; > > + if (args.fsbno == NULLFSBLOCK && nullfb) { > > args.type = XFS_ALLOCTYPE_START_BNO; > > Particularly when we get here and I have to look pretty closely to > figure out what this retry is now attempting to do. This one is trying > the allocation again, but now with no alignment and the caller's minlen, > right? > Yeah, but this branch also (potentially) changes the allocation type. IIRC, this wasn't relevant to the corner case I was trying to address with this patch. I basically just wanted to get minlen dropped before tossing alignment and at the same time didn't want to screw around with the existing logic that was unrelated (hence the separation of the minlen update into a new retry as opposed to just reordering code). That said, the other subthread suggests this patch can be replaced with more localized fixes to bma.minlen alignment handling code. My original reproducer of this problem was based on some of the extent allocation rework bits that have been deferred (rework of the size allocation mode). I wasn't aware of the bma.minlen alignment logic at the time, so I may have misanalyzed the problem and my current development tree doesn't reproduce. I'll make the tweaks to this patch locally in the event that I run into the problem again when I get back to that work and if there still happens to be corner cases not covered by the minlen fixup patch that Carlos has posted.. Brian > --D > > > args.fsbno = ap->blkno; > > if ((error = xfs_alloc_vextent(&args))) > > -- > > 2.20.1 > > ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH REPOST 2/2] xfs: don't set bmapi total block req where minleft is sufficient 2019-09-12 14:32 [PATCH REPOST 0/2] xfs: rely on minleft instead of total for bmbt res Brian Foster 2019-09-12 14:32 ` [PATCH REPOST 1/2] xfs: drop minlen before tossing alignment on bmap allocs Brian Foster @ 2019-09-12 14:32 ` Brian Foster 2019-09-18 21:55 ` Darrick J. Wong 2019-09-16 8:18 ` [PATCH REPOST 0/2] xfs: rely on minleft instead of total for bmbt res Carlos Maiolino 2019-10-18 17:17 ` Darrick J. Wong 3 siblings, 1 reply; 17+ messages in thread From: Brian Foster @ 2019-09-12 14:32 UTC (permalink / raw) To: linux-xfs; +Cc: Carlos Maiolino xfs_bmapi_write() takes a total block requirement parameter that is passed down to the block allocation code and is used to specify the total block requirement of the associated transaction. This is used to try and select an AG that can not only satisfy the requested extent allocation, but can also accommodate subsequent allocations that might be required to complete the transaction. For example, additional bmbt block allocations may be required on insertion of the resulting extent to an inode data fork. While it's important for callers to calculate and reserve such extra blocks in the transaction, it is not necessary to pass the total value to xfs_bmapi_write() in all cases. The latter automatically sets minleft to ensure that sufficient free blocks remain after the allocation attempt to expand the format of the associated inode (i.e., such as extent to btree conversion, btree splits, etc). Therefore, any callers that pass a total block requirement of the bmap mapping length plus worst case bmbt expansion essentially specify the additional reservation requirement twice. These callers can pass a total of zero to rely on the bmapi minleft policy. Beyond being superfluous, the primary motivation for this change is that the total reservation logic in the bmbt code is dubious in scenarios where minlen < maxlen and a maxlen extent cannot be allocated (which is more common for data extent allocations where contiguity is not required). The total value is based on maxlen in the xfs_bmapi_write() caller. If the bmbt code falls back to an allocation between minlen and maxlen, that allocation will not succeed until total is reset to minlen, which essentially throws away any additional reservation included in total by the caller. In addition, the total value is not reset until after alignment is dropped, which means that such callers drop alignment far too aggressively than necessary. Update all callers of xfs_bmapi_write() that pass a total block value of the mapping length plus bmbt reservation to instead pass zero and rely on xfs_bmapi_minleft() to enforce the bmbt reservation requirement. This trades off slightly less conservative AG selection for the ability to preserve alignment in more scenarios. xfs_bmapi_write() callers that incorporate unrelated or additional reservations in total beyond what is already included in minleft must continue to use the former. Signed-off-by: Brian Foster <bfoster@redhat.com> --- fs/xfs/libxfs/xfs_bmap.c | 1 - fs/xfs/xfs_bmap_util.c | 4 ++-- fs/xfs/xfs_dquot.c | 4 ++-- fs/xfs/xfs_iomap.c | 4 ++-- fs/xfs/xfs_reflink.c | 4 ++-- fs/xfs/xfs_rtalloc.c | 3 +-- 6 files changed, 9 insertions(+), 11 deletions(-) diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c index eaa965920a03..c2f0afdf2f65 100644 --- a/fs/xfs/libxfs/xfs_bmap.c +++ b/fs/xfs/libxfs/xfs_bmap.c @@ -4505,7 +4505,6 @@ xfs_bmapi_convert_delalloc( bma.wasdel = true; bma.offset = bma.got.br_startoff; bma.length = max_t(xfs_filblks_t, bma.got.br_blockcount, MAXEXTLEN); - bma.total = XFS_EXTENTADD_SPACE_RES(ip->i_mount, XFS_DATA_FORK); bma.minleft = xfs_bmapi_minleft(tp, ip, whichfork); if (whichfork == XFS_COW_FORK) bma.flags = XFS_BMAPI_COWFORK | XFS_BMAPI_PREALLOC; diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c index 0910cb75b65d..079aedade656 100644 --- a/fs/xfs/xfs_bmap_util.c +++ b/fs/xfs/xfs_bmap_util.c @@ -962,8 +962,8 @@ xfs_alloc_file_space( xfs_trans_ijoin(tp, ip, 0); error = xfs_bmapi_write(tp, ip, startoffset_fsb, - allocatesize_fsb, alloc_type, resblks, - imapp, &nimaps); + allocatesize_fsb, alloc_type, 0, imapp, + &nimaps); if (error) goto error0; diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c index aeb95e7391c1..b924dbd63a7d 100644 --- a/fs/xfs/xfs_dquot.c +++ b/fs/xfs/xfs_dquot.c @@ -305,8 +305,8 @@ xfs_dquot_disk_alloc( /* Create the block mapping. */ xfs_trans_ijoin(tp, quotip, XFS_ILOCK_EXCL); error = xfs_bmapi_write(tp, quotip, dqp->q_fileoffset, - XFS_DQUOT_CLUSTER_SIZE_FSB, XFS_BMAPI_METADATA, - XFS_QM_DQALLOC_SPACE_RES(mp), &map, &nmaps); + XFS_DQUOT_CLUSTER_SIZE_FSB, XFS_BMAPI_METADATA, 0, &map, + &nmaps); if (error) return error; ASSERT(map.br_blockcount == XFS_DQUOT_CLUSTER_SIZE_FSB); diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c index f780e223b118..27f2030690e2 100644 --- a/fs/xfs/xfs_iomap.c +++ b/fs/xfs/xfs_iomap.c @@ -277,8 +277,8 @@ xfs_iomap_write_direct( * caller gave to us. */ nimaps = 1; - error = xfs_bmapi_write(tp, ip, offset_fsb, count_fsb, - bmapi_flags, resblks, imap, &nimaps); + error = xfs_bmapi_write(tp, ip, offset_fsb, count_fsb, bmapi_flags, 0, + imap, &nimaps); if (error) goto out_res_cancel; diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index 0f08153b4994..3aa56cac1a47 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -410,8 +410,8 @@ xfs_reflink_allocate_cow( /* Allocate the entire reservation as unwritten blocks. */ nimaps = 1; error = xfs_bmapi_write(tp, ip, imap->br_startoff, imap->br_blockcount, - XFS_BMAPI_COWFORK | XFS_BMAPI_PREALLOC, - resblks, imap, &nimaps); + XFS_BMAPI_COWFORK | XFS_BMAPI_PREALLOC, 0, imap, + &nimaps); if (error) goto out_unreserve; diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c index 4a48a8c75b4f..d42b5a2047e0 100644 --- a/fs/xfs/xfs_rtalloc.c +++ b/fs/xfs/xfs_rtalloc.c @@ -792,8 +792,7 @@ xfs_growfs_rt_alloc( */ nmap = 1; error = xfs_bmapi_write(tp, ip, oblocks, nblocks - oblocks, - XFS_BMAPI_METADATA, resblks, &map, - &nmap); + XFS_BMAPI_METADATA, 0, &map, &nmap); if (!error && nmap < 1) error = -ENOSPC; if (error) -- 2.20.1 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH REPOST 2/2] xfs: don't set bmapi total block req where minleft is sufficient 2019-09-12 14:32 ` [PATCH REPOST 2/2] xfs: don't set bmapi total block req where minleft is sufficient Brian Foster @ 2019-09-18 21:55 ` Darrick J. Wong 2019-09-19 11:55 ` Brian Foster 0 siblings, 1 reply; 17+ messages in thread From: Darrick J. Wong @ 2019-09-18 21:55 UTC (permalink / raw) To: Brian Foster; +Cc: linux-xfs, Carlos Maiolino On Thu, Sep 12, 2019 at 10:32:23AM -0400, Brian Foster wrote: > xfs_bmapi_write() takes a total block requirement parameter that is > passed down to the block allocation code and is used to specify the > total block requirement of the associated transaction. This is used > to try and select an AG that can not only satisfy the requested > extent allocation, but can also accommodate subsequent allocations > that might be required to complete the transaction. For example, > additional bmbt block allocations may be required on insertion of > the resulting extent to an inode data fork. > > While it's important for callers to calculate and reserve such extra > blocks in the transaction, it is not necessary to pass the total > value to xfs_bmapi_write() in all cases. The latter automatically > sets minleft to ensure that sufficient free blocks remain after the > allocation attempt to expand the format of the associated inode > (i.e., such as extent to btree conversion, btree splits, etc). > Therefore, any callers that pass a total block requirement of the > bmap mapping length plus worst case bmbt expansion essentially > specify the additional reservation requirement twice. These callers > can pass a total of zero to rely on the bmapi minleft policy. > > Beyond being superfluous, the primary motivation for this change is > that the total reservation logic in the bmbt code is dubious in > scenarios where minlen < maxlen and a maxlen extent cannot be > allocated (which is more common for data extent allocations where > contiguity is not required). The total value is based on maxlen in > the xfs_bmapi_write() caller. If the bmbt code falls back to an > allocation between minlen and maxlen, that allocation will not > succeed until total is reset to minlen, which essentially throws > away any additional reservation included in total by the caller. In Hm, are you talking about lowmode allocations and the "retry with fewer constraints" behavior in xfs_bmap_btalloc? > addition, the total value is not reset until after alignment is > dropped, which means that such callers drop alignment far too > aggressively than necessary. Does that need fixing? > Update all callers of xfs_bmapi_write() that pass a total block > value of the mapping length plus bmbt reservation to instead pass > zero and rely on xfs_bmapi_minleft() to enforce the bmbt reservation > requirement. This trades off slightly less conservative AG selection > for the ability to preserve alignment in more scenarios. > xfs_bmapi_write() callers that incorporate unrelated or additional > reservations in total beyond what is already included in minleft > must continue to use the former. Does doing this affect the outcome of where bmbt blocks get allocated with respect to whichever data extent allocation triggered the reshaping of the bmbt? I would imagine that it /could/ result in somewhat better allocation decisions? But that the primary outcome of these two patches is that a large fallocate on a filesystem with alignment hints and small AGs (relative to the fallocate request size) are more likely to spit out aligned extents? The code changes look ok, but at the same time I wonder if there's a bigger picture I'm missing? FWIW that might just be due to Dave and Carlos discussing something resulting in the "A small improvement in the allocation algorithm" series and just a gut feeling that better coordination (or maintainer soothing :P) is needed. --D > Signed-off-by: Brian Foster <bfoster@redhat.com> > --- > fs/xfs/libxfs/xfs_bmap.c | 1 - > fs/xfs/xfs_bmap_util.c | 4 ++-- > fs/xfs/xfs_dquot.c | 4 ++-- > fs/xfs/xfs_iomap.c | 4 ++-- > fs/xfs/xfs_reflink.c | 4 ++-- > fs/xfs/xfs_rtalloc.c | 3 +-- > 6 files changed, 9 insertions(+), 11 deletions(-) > > diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c > index eaa965920a03..c2f0afdf2f65 100644 > --- a/fs/xfs/libxfs/xfs_bmap.c > +++ b/fs/xfs/libxfs/xfs_bmap.c > @@ -4505,7 +4505,6 @@ xfs_bmapi_convert_delalloc( > bma.wasdel = true; > bma.offset = bma.got.br_startoff; > bma.length = max_t(xfs_filblks_t, bma.got.br_blockcount, MAXEXTLEN); > - bma.total = XFS_EXTENTADD_SPACE_RES(ip->i_mount, XFS_DATA_FORK); > bma.minleft = xfs_bmapi_minleft(tp, ip, whichfork); > if (whichfork == XFS_COW_FORK) > bma.flags = XFS_BMAPI_COWFORK | XFS_BMAPI_PREALLOC; > diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c > index 0910cb75b65d..079aedade656 100644 > --- a/fs/xfs/xfs_bmap_util.c > +++ b/fs/xfs/xfs_bmap_util.c > @@ -962,8 +962,8 @@ xfs_alloc_file_space( > xfs_trans_ijoin(tp, ip, 0); > > error = xfs_bmapi_write(tp, ip, startoffset_fsb, > - allocatesize_fsb, alloc_type, resblks, > - imapp, &nimaps); > + allocatesize_fsb, alloc_type, 0, imapp, > + &nimaps); > if (error) > goto error0; > > diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c > index aeb95e7391c1..b924dbd63a7d 100644 > --- a/fs/xfs/xfs_dquot.c > +++ b/fs/xfs/xfs_dquot.c > @@ -305,8 +305,8 @@ xfs_dquot_disk_alloc( > /* Create the block mapping. */ > xfs_trans_ijoin(tp, quotip, XFS_ILOCK_EXCL); > error = xfs_bmapi_write(tp, quotip, dqp->q_fileoffset, > - XFS_DQUOT_CLUSTER_SIZE_FSB, XFS_BMAPI_METADATA, > - XFS_QM_DQALLOC_SPACE_RES(mp), &map, &nmaps); > + XFS_DQUOT_CLUSTER_SIZE_FSB, XFS_BMAPI_METADATA, 0, &map, > + &nmaps); > if (error) > return error; > ASSERT(map.br_blockcount == XFS_DQUOT_CLUSTER_SIZE_FSB); > diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c > index f780e223b118..27f2030690e2 100644 > --- a/fs/xfs/xfs_iomap.c > +++ b/fs/xfs/xfs_iomap.c > @@ -277,8 +277,8 @@ xfs_iomap_write_direct( > * caller gave to us. > */ > nimaps = 1; > - error = xfs_bmapi_write(tp, ip, offset_fsb, count_fsb, > - bmapi_flags, resblks, imap, &nimaps); > + error = xfs_bmapi_write(tp, ip, offset_fsb, count_fsb, bmapi_flags, 0, > + imap, &nimaps); > if (error) > goto out_res_cancel; > > diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c > index 0f08153b4994..3aa56cac1a47 100644 > --- a/fs/xfs/xfs_reflink.c > +++ b/fs/xfs/xfs_reflink.c > @@ -410,8 +410,8 @@ xfs_reflink_allocate_cow( > /* Allocate the entire reservation as unwritten blocks. */ > nimaps = 1; > error = xfs_bmapi_write(tp, ip, imap->br_startoff, imap->br_blockcount, > - XFS_BMAPI_COWFORK | XFS_BMAPI_PREALLOC, > - resblks, imap, &nimaps); > + XFS_BMAPI_COWFORK | XFS_BMAPI_PREALLOC, 0, imap, > + &nimaps); > if (error) > goto out_unreserve; > > diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c > index 4a48a8c75b4f..d42b5a2047e0 100644 > --- a/fs/xfs/xfs_rtalloc.c > +++ b/fs/xfs/xfs_rtalloc.c > @@ -792,8 +792,7 @@ xfs_growfs_rt_alloc( > */ > nmap = 1; > error = xfs_bmapi_write(tp, ip, oblocks, nblocks - oblocks, > - XFS_BMAPI_METADATA, resblks, &map, > - &nmap); > + XFS_BMAPI_METADATA, 0, &map, &nmap); > if (!error && nmap < 1) > error = -ENOSPC; > if (error) > -- > 2.20.1 > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH REPOST 2/2] xfs: don't set bmapi total block req where minleft is sufficient 2019-09-18 21:55 ` Darrick J. Wong @ 2019-09-19 11:55 ` Brian Foster 0 siblings, 0 replies; 17+ messages in thread From: Brian Foster @ 2019-09-19 11:55 UTC (permalink / raw) To: Darrick J. Wong; +Cc: linux-xfs, Carlos Maiolino On Wed, Sep 18, 2019 at 02:55:16PM -0700, Darrick J. Wong wrote: > On Thu, Sep 12, 2019 at 10:32:23AM -0400, Brian Foster wrote: > > xfs_bmapi_write() takes a total block requirement parameter that is > > passed down to the block allocation code and is used to specify the > > total block requirement of the associated transaction. This is used > > to try and select an AG that can not only satisfy the requested > > extent allocation, but can also accommodate subsequent allocations > > that might be required to complete the transaction. For example, > > additional bmbt block allocations may be required on insertion of > > the resulting extent to an inode data fork. > > > > While it's important for callers to calculate and reserve such extra > > blocks in the transaction, it is not necessary to pass the total > > value to xfs_bmapi_write() in all cases. The latter automatically > > sets minleft to ensure that sufficient free blocks remain after the > > allocation attempt to expand the format of the associated inode > > (i.e., such as extent to btree conversion, btree splits, etc). > > Therefore, any callers that pass a total block requirement of the > > bmap mapping length plus worst case bmbt expansion essentially > > specify the additional reservation requirement twice. These callers > > can pass a total of zero to rely on the bmapi minleft policy. > > > > Beyond being superfluous, the primary motivation for this change is > > that the total reservation logic in the bmbt code is dubious in > > scenarios where minlen < maxlen and a maxlen extent cannot be > > allocated (which is more common for data extent allocations where > > contiguity is not required). The total value is based on maxlen in > > the xfs_bmapi_write() caller. If the bmbt code falls back to an > > allocation between minlen and maxlen, that allocation will not > > succeed until total is reset to minlen, which essentially throws > > away any additional reservation included in total by the caller. In > > Hm, are you talking about lowmode allocations and the "retry with fewer > constraints" behavior in xfs_bmap_btalloc? > This isn't related to low space mode. Consider a simple example of xfs_alloc_file_space() calling xfs_bmapi_write() with a total param of the maxlen + bmbt reservation. xfs_bmapi_write() assigns bma.total, this makes its way to args.total and the allocation code thus won't pick an AG with less space available than specified in args.total (see xfs_alloc_space_available()). If args.total is what prevents AG selection for a particular allocation, the allocation retries in xfs_bmap_btalloc() are going to fail until we get to the one that does: args.total = ap->minlen; ... which basically means we're now free to select an AG that satisfies minlen without any consideration for the "+ bmbt reservation" part that was added to bma.total in the first place. This is separate from the observation that the bmap code already assigns [bma|args].minleft (xfs_bmapi_minleft()) to a value that considers that additional bmbt blocks might be required for btree splits due to the extent allocation/mapping. With that, my understanding was that a bma.total of maxlen + bmbt res is unnecessary because the bmap code already takes the bmbt res into consideration itself. > > addition, the total value is not reset until after alignment is > > dropped, which means that such callers drop alignment far too > > aggressively than necessary. > > Does that need fixing? > That was the intent of the first patch. :) > > Update all callers of xfs_bmapi_write() that pass a total block > > value of the mapping length plus bmbt reservation to instead pass > > zero and rely on xfs_bmapi_minleft() to enforce the bmbt reservation > > requirement. This trades off slightly less conservative AG selection > > for the ability to preserve alignment in more scenarios. > > xfs_bmapi_write() callers that incorporate unrelated or additional > > reservations in total beyond what is already included in minleft > > must continue to use the former. > > Does doing this affect the outcome of where bmbt blocks get allocated > with respect to whichever data extent allocation triggered the reshaping > of the bmbt? I would imagine that it /could/ result in somewhat better > allocation decisions? But that the primary outcome of these two patches > is that a large fallocate on a filesystem with alignment hints and small > AGs (relative to the fallocate request size) are more likely to spit out > aligned extents? > Yeah, the intent is to try and honor alignment in more cases. I don't think this affects bmbt block allocation because of the minleft bits mentioned above. Rather, this just means that if we _did_ have some subset of AGs where bma.total (maxlen+bmbt res) could be satisfied, we're no longer restricting ourselves to those AGs over others where minlen+minleft+alignment might be possible. IOW, this takes into consideration the behavior change from the previous patch (or Carlos' variant thereof). > The code changes look ok, but at the same time I wonder if there's a > bigger picture I'm missing? FWIW that might just be due to Dave and > Carlos discussing something resulting in the "A small improvement in the > allocation algorithm" series and just a gut feeling that better > coordination (or maintainer soothing :P) is needed. > Yeah, I think what fell out of that ends up replacing the first patch. AFAICT, this patch is still necessary to prevent bma.total from getting in the way, though some discussion over Carlos' series is still in progress. Either way, it probably makes sense for us to work things out in that series first.. Brian > --D > > > Signed-off-by: Brian Foster <bfoster@redhat.com> > > --- > > fs/xfs/libxfs/xfs_bmap.c | 1 - > > fs/xfs/xfs_bmap_util.c | 4 ++-- > > fs/xfs/xfs_dquot.c | 4 ++-- > > fs/xfs/xfs_iomap.c | 4 ++-- > > fs/xfs/xfs_reflink.c | 4 ++-- > > fs/xfs/xfs_rtalloc.c | 3 +-- > > 6 files changed, 9 insertions(+), 11 deletions(-) > > > > diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c > > index eaa965920a03..c2f0afdf2f65 100644 > > --- a/fs/xfs/libxfs/xfs_bmap.c > > +++ b/fs/xfs/libxfs/xfs_bmap.c > > @@ -4505,7 +4505,6 @@ xfs_bmapi_convert_delalloc( > > bma.wasdel = true; > > bma.offset = bma.got.br_startoff; > > bma.length = max_t(xfs_filblks_t, bma.got.br_blockcount, MAXEXTLEN); > > - bma.total = XFS_EXTENTADD_SPACE_RES(ip->i_mount, XFS_DATA_FORK); > > bma.minleft = xfs_bmapi_minleft(tp, ip, whichfork); > > if (whichfork == XFS_COW_FORK) > > bma.flags = XFS_BMAPI_COWFORK | XFS_BMAPI_PREALLOC; > > diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c > > index 0910cb75b65d..079aedade656 100644 > > --- a/fs/xfs/xfs_bmap_util.c > > +++ b/fs/xfs/xfs_bmap_util.c > > @@ -962,8 +962,8 @@ xfs_alloc_file_space( > > xfs_trans_ijoin(tp, ip, 0); > > > > error = xfs_bmapi_write(tp, ip, startoffset_fsb, > > - allocatesize_fsb, alloc_type, resblks, > > - imapp, &nimaps); > > + allocatesize_fsb, alloc_type, 0, imapp, > > + &nimaps); > > if (error) > > goto error0; > > > > diff --git a/fs/xfs/xfs_dquot.c b/fs/xfs/xfs_dquot.c > > index aeb95e7391c1..b924dbd63a7d 100644 > > --- a/fs/xfs/xfs_dquot.c > > +++ b/fs/xfs/xfs_dquot.c > > @@ -305,8 +305,8 @@ xfs_dquot_disk_alloc( > > /* Create the block mapping. */ > > xfs_trans_ijoin(tp, quotip, XFS_ILOCK_EXCL); > > error = xfs_bmapi_write(tp, quotip, dqp->q_fileoffset, > > - XFS_DQUOT_CLUSTER_SIZE_FSB, XFS_BMAPI_METADATA, > > - XFS_QM_DQALLOC_SPACE_RES(mp), &map, &nmaps); > > + XFS_DQUOT_CLUSTER_SIZE_FSB, XFS_BMAPI_METADATA, 0, &map, > > + &nmaps); > > if (error) > > return error; > > ASSERT(map.br_blockcount == XFS_DQUOT_CLUSTER_SIZE_FSB); > > diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c > > index f780e223b118..27f2030690e2 100644 > > --- a/fs/xfs/xfs_iomap.c > > +++ b/fs/xfs/xfs_iomap.c > > @@ -277,8 +277,8 @@ xfs_iomap_write_direct( > > * caller gave to us. > > */ > > nimaps = 1; > > - error = xfs_bmapi_write(tp, ip, offset_fsb, count_fsb, > > - bmapi_flags, resblks, imap, &nimaps); > > + error = xfs_bmapi_write(tp, ip, offset_fsb, count_fsb, bmapi_flags, 0, > > + imap, &nimaps); > > if (error) > > goto out_res_cancel; > > > > diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c > > index 0f08153b4994..3aa56cac1a47 100644 > > --- a/fs/xfs/xfs_reflink.c > > +++ b/fs/xfs/xfs_reflink.c > > @@ -410,8 +410,8 @@ xfs_reflink_allocate_cow( > > /* Allocate the entire reservation as unwritten blocks. */ > > nimaps = 1; > > error = xfs_bmapi_write(tp, ip, imap->br_startoff, imap->br_blockcount, > > - XFS_BMAPI_COWFORK | XFS_BMAPI_PREALLOC, > > - resblks, imap, &nimaps); > > + XFS_BMAPI_COWFORK | XFS_BMAPI_PREALLOC, 0, imap, > > + &nimaps); > > if (error) > > goto out_unreserve; > > > > diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c > > index 4a48a8c75b4f..d42b5a2047e0 100644 > > --- a/fs/xfs/xfs_rtalloc.c > > +++ b/fs/xfs/xfs_rtalloc.c > > @@ -792,8 +792,7 @@ xfs_growfs_rt_alloc( > > */ > > nmap = 1; > > error = xfs_bmapi_write(tp, ip, oblocks, nblocks - oblocks, > > - XFS_BMAPI_METADATA, resblks, &map, > > - &nmap); > > + XFS_BMAPI_METADATA, 0, &map, &nmap); > > if (!error && nmap < 1) > > error = -ENOSPC; > > if (error) > > -- > > 2.20.1 > > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH REPOST 0/2] xfs: rely on minleft instead of total for bmbt res 2019-09-12 14:32 [PATCH REPOST 0/2] xfs: rely on minleft instead of total for bmbt res Brian Foster 2019-09-12 14:32 ` [PATCH REPOST 1/2] xfs: drop minlen before tossing alignment on bmap allocs Brian Foster 2019-09-12 14:32 ` [PATCH REPOST 2/2] xfs: don't set bmapi total block req where minleft is sufficient Brian Foster @ 2019-09-16 8:18 ` Carlos Maiolino 2019-10-18 17:17 ` Darrick J. Wong 3 siblings, 0 replies; 17+ messages in thread From: Carlos Maiolino @ 2019-09-16 8:18 UTC (permalink / raw) To: Brian Foster; +Cc: linux-xfs On Thu, Sep 12, 2019 at 10:32:21AM -0400, Brian Foster wrote: > Hi all, > > This is a repost of a couple patches I posted a few months ago[1]. There > are no changes other than a rebase to for-next. Any thoughts on these? I > think Carlos had also run into some related generic/223 failures fairly > recently... Hi Brian, Sure, I'll take a look at these patches now, my apologies, I only saw these patches now. > > Carlos, > > Any chance you could give these a try? > > Brian > > [1] https://lore.kernel.org/linux-xfs/20190501140504.16435-1-bfoster@redhat.com/ > > Brian Foster (2): > xfs: drop minlen before tossing alignment on bmap allocs > xfs: don't set bmapi total block req where minleft is sufficient > > fs/xfs/libxfs/xfs_bmap.c | 13 +++++++++---- > fs/xfs/xfs_bmap_util.c | 4 ++-- > fs/xfs/xfs_dquot.c | 4 ++-- > fs/xfs/xfs_iomap.c | 4 ++-- > fs/xfs/xfs_reflink.c | 4 ++-- > fs/xfs/xfs_rtalloc.c | 3 +-- > 6 files changed, 18 insertions(+), 14 deletions(-) > > -- > 2.20.1 > -- Carlos ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH REPOST 0/2] xfs: rely on minleft instead of total for bmbt res 2019-09-12 14:32 [PATCH REPOST 0/2] xfs: rely on minleft instead of total for bmbt res Brian Foster ` (2 preceding siblings ...) 2019-09-16 8:18 ` [PATCH REPOST 0/2] xfs: rely on minleft instead of total for bmbt res Carlos Maiolino @ 2019-10-18 17:17 ` Darrick J. Wong 2019-10-21 12:14 ` Brian Foster 3 siblings, 1 reply; 17+ messages in thread From: Darrick J. Wong @ 2019-10-18 17:17 UTC (permalink / raw) To: Brian Foster; +Cc: linux-xfs, Carlos Maiolino On Thu, Sep 12, 2019 at 10:32:21AM -0400, Brian Foster wrote: > Hi all, > > This is a repost of a couple patches I posted a few months ago[1]. There > are no changes other than a rebase to for-next. Any thoughts on these? I > think Carlos had also run into some related generic/223 failures fairly > recently... > > Carlos, > > Any chance you could give these a try? Any progress on this in the last month? I /think/ this is related to the unaligned allocations that Dan's complaining about this morning. --D > Brian > > [1] https://lore.kernel.org/linux-xfs/20190501140504.16435-1-bfoster@redhat.com/ > > Brian Foster (2): > xfs: drop minlen before tossing alignment on bmap allocs > xfs: don't set bmapi total block req where minleft is sufficient > > fs/xfs/libxfs/xfs_bmap.c | 13 +++++++++---- > fs/xfs/xfs_bmap_util.c | 4 ++-- > fs/xfs/xfs_dquot.c | 4 ++-- > fs/xfs/xfs_iomap.c | 4 ++-- > fs/xfs/xfs_reflink.c | 4 ++-- > fs/xfs/xfs_rtalloc.c | 3 +-- > 6 files changed, 18 insertions(+), 14 deletions(-) > > -- > 2.20.1 > ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH REPOST 0/2] xfs: rely on minleft instead of total for bmbt res 2019-10-18 17:17 ` Darrick J. Wong @ 2019-10-21 12:14 ` Brian Foster 0 siblings, 0 replies; 17+ messages in thread From: Brian Foster @ 2019-10-21 12:14 UTC (permalink / raw) To: Darrick J. Wong; +Cc: linux-xfs, Carlos Maiolino On Fri, Oct 18, 2019 at 10:17:20AM -0700, Darrick J. Wong wrote: > On Thu, Sep 12, 2019 at 10:32:21AM -0400, Brian Foster wrote: > > Hi all, > > > > This is a repost of a couple patches I posted a few months ago[1]. There > > are no changes other than a rebase to for-next. Any thoughts on these? I > > think Carlos had also run into some related generic/223 failures fairly > > recently... > > > > Carlos, > > > > Any chance you could give these a try? > > Any progress on this in the last month? I /think/ this is related to > the unaligned allocations that Dan's complaining about this morning. > Not that I'm aware of. It looks similar and IIRC all that is really needed here is a tweak to Carlos' and Dave's patch 1 that Carlos posted (which replaces patch 1 of this series) added with patch 2 of this series. I've just reposted a v2 series with that combination (including links/references to hopefully reduce confusion). Brian > --D > > > Brian > > > > [1] https://lore.kernel.org/linux-xfs/20190501140504.16435-1-bfoster@redhat.com/ > > > > Brian Foster (2): > > xfs: drop minlen before tossing alignment on bmap allocs > > xfs: don't set bmapi total block req where minleft is sufficient > > > > fs/xfs/libxfs/xfs_bmap.c | 13 +++++++++---- > > fs/xfs/xfs_bmap_util.c | 4 ++-- > > fs/xfs/xfs_dquot.c | 4 ++-- > > fs/xfs/xfs_iomap.c | 4 ++-- > > fs/xfs/xfs_reflink.c | 4 ++-- > > fs/xfs/xfs_rtalloc.c | 3 +-- > > 6 files changed, 18 insertions(+), 14 deletions(-) > > > > -- > > 2.20.1 > > ^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2019-10-21 12:14 UTC | newest] Thread overview: 17+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2019-09-12 14:32 [PATCH REPOST 0/2] xfs: rely on minleft instead of total for bmbt res Brian Foster 2019-09-12 14:32 ` [PATCH REPOST 1/2] xfs: drop minlen before tossing alignment on bmap allocs Brian Foster 2019-09-12 22:35 ` Dave Chinner 2019-09-12 22:46 ` Dave Chinner 2019-09-13 14:58 ` Brian Foster 2019-09-14 22:00 ` Dave Chinner 2019-09-15 13:09 ` Brian Foster 2019-09-16 8:36 ` Carlos Maiolino 2019-09-17 12:22 ` Carlos Maiolino 2019-09-18 21:41 ` Darrick J. Wong 2019-09-19 11:49 ` Brian Foster 2019-09-12 14:32 ` [PATCH REPOST 2/2] xfs: don't set bmapi total block req where minleft is sufficient Brian Foster 2019-09-18 21:55 ` Darrick J. Wong 2019-09-19 11:55 ` Brian Foster 2019-09-16 8:18 ` [PATCH REPOST 0/2] xfs: rely on minleft instead of total for bmbt res Carlos Maiolino 2019-10-18 17:17 ` Darrick J. Wong 2019-10-21 12:14 ` Brian Foster
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox