* handle nimaps=0 from xfs_bmapi_write @ 2023-10-09 10:30 Christoph Hellwig 2023-10-09 10:30 ` [PATCH] xfs: handle nimaps=0 from xfs_bmapi_write in xfs_alloc_file_space Christoph Hellwig 0 siblings, 1 reply; 4+ messages in thread From: Christoph Hellwig @ 2023-10-09 10:30 UTC (permalink / raw) To: linux-xfs Hi all, I've recently been playing around with re-enabling delayed allocations on the RT subvolume (for the rextentsize=1 case), and ran into an interesting bug where fsx unexpectedly returned -ENOSPC when testing this code on top of Darrick's rtgroups code. It turns out that this is because xfs_alloc_file_space does not retry when xfs_bmapi_write returns 0 with *nimaps = 0, which can happen if the allocator can't fill the entire space from the beginning of a delalloc extent to the start of the actually requested range in the call to xfs_bmapi_write. ^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH] xfs: handle nimaps=0 from xfs_bmapi_write in xfs_alloc_file_space 2023-10-09 10:30 handle nimaps=0 from xfs_bmapi_write Christoph Hellwig @ 2023-10-09 10:30 ` Christoph Hellwig 2023-10-09 16:27 ` Darrick J. Wong 0 siblings, 1 reply; 4+ messages in thread From: Christoph Hellwig @ 2023-10-09 10:30 UTC (permalink / raw) To: linux-xfs If xfs_bmapi_write finds a delalloc extent at the requested range, it tries to convert the entire delalloc extent to a real allocation. But if the allocator then can't find an AG with enough space to at least cover the start block of the requested range, xfs_bmapi_write will return 0 but leave *nimaps set to 0. In that case we simply need to keep looping with the same startoffset_fsb. Note that this could affect any caller of xfs_bmapi_write that covers an existing delayed allocation. As far as I can tell we do not have any other such caller, though - the regular writeback path uses xfs_bmapi_convert_delalloc to convert delayed allocations to real ones, and direct I/O invalidates the page cache first. Signed-off-by: Christoph Hellwig <hch@lst.de> --- fs/xfs/xfs_bmap_util.c | 25 ++++++++++++++----------- 1 file changed, 14 insertions(+), 11 deletions(-) diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c index d85580b101ad8a..556f57f757f33e 100644 --- a/fs/xfs/xfs_bmap_util.c +++ b/fs/xfs/xfs_bmap_util.c @@ -814,12 +814,10 @@ xfs_alloc_file_space( { xfs_mount_t *mp = ip->i_mount; xfs_off_t count; - xfs_filblks_t allocated_fsb; xfs_filblks_t allocatesize_fsb; xfs_extlen_t extsz, temp; xfs_fileoff_t startoffset_fsb; xfs_fileoff_t endoffset_fsb; - int nimaps; int rt; xfs_trans_t *tp; xfs_bmbt_irec_t imaps[1], *imapp; @@ -842,7 +840,6 @@ xfs_alloc_file_space( count = len; imapp = &imaps[0]; - nimaps = 1; startoffset_fsb = XFS_B_TO_FSBT(mp, offset); endoffset_fsb = XFS_B_TO_FSB(mp, offset + count); allocatesize_fsb = endoffset_fsb - startoffset_fsb; @@ -853,6 +850,7 @@ xfs_alloc_file_space( while (allocatesize_fsb && !error) { xfs_fileoff_t s, e; unsigned int dblocks, rblocks, resblks; + int nimaps = 1; /* * Determine space reservations for data/realtime. @@ -918,15 +916,20 @@ xfs_alloc_file_space( if (error) break; - allocated_fsb = imapp->br_blockcount; - - if (nimaps == 0) { - error = -ENOSPC; - break; + /* + * If xfs_bmapi_write finds a delalloc extent at the requested + * range, it tries to convert the entire delalloc extent to a + * real allocation. + * But if the allocator then can't find an AG with enough space + * to at least cover the start block of the requested range, + * xfs_bmapi_write will return 0 but leave *nimaps set to 0. + * In that case we simply need to keep looping with the same + * startoffset_fsb. + */ + if (nimaps) { + startoffset_fsb += imapp->br_blockcount; + allocatesize_fsb -= imapp->br_blockcount; } - - startoffset_fsb += allocated_fsb; - allocatesize_fsb -= allocated_fsb; } return error; -- 2.39.2 ^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] xfs: handle nimaps=0 from xfs_bmapi_write in xfs_alloc_file_space 2023-10-09 10:30 ` [PATCH] xfs: handle nimaps=0 from xfs_bmapi_write in xfs_alloc_file_space Christoph Hellwig @ 2023-10-09 16:27 ` Darrick J. Wong 2023-10-10 7:59 ` Christoph Hellwig 0 siblings, 1 reply; 4+ messages in thread From: Darrick J. Wong @ 2023-10-09 16:27 UTC (permalink / raw) To: Christoph Hellwig; +Cc: linux-xfs On Mon, Oct 09, 2023 at 12:30:20PM +0200, Christoph Hellwig wrote: > If xfs_bmapi_write finds a delalloc extent at the requested range, it > tries to convert the entire delalloc extent to a real allocation. > But if the allocator then can't find an AG with enough space to at least > cover the start block of the requested range, xfs_bmapi_write will return > 0 but leave *nimaps set to 0. > In that case we simply need to keep looping with the same startoffset_fsb. > > Note that this could affect any caller of xfs_bmapi_write that covers > an existing delayed allocation. As far as I can tell we do not have > any other such caller, though - the regular writeback path uses > xfs_bmapi_convert_delalloc to convert delayed allocations to real ones, > and direct I/O invalidates the page cache first. > > Signed-off-by: Christoph Hellwig <hch@lst.de> > --- > fs/xfs/xfs_bmap_util.c | 25 ++++++++++++++----------- > 1 file changed, 14 insertions(+), 11 deletions(-) > > diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c > index d85580b101ad8a..556f57f757f33e 100644 > --- a/fs/xfs/xfs_bmap_util.c > +++ b/fs/xfs/xfs_bmap_util.c > @@ -814,12 +814,10 @@ xfs_alloc_file_space( > { > xfs_mount_t *mp = ip->i_mount; > xfs_off_t count; > - xfs_filblks_t allocated_fsb; > xfs_filblks_t allocatesize_fsb; > xfs_extlen_t extsz, temp; > xfs_fileoff_t startoffset_fsb; > xfs_fileoff_t endoffset_fsb; > - int nimaps; > int rt; > xfs_trans_t *tp; > xfs_bmbt_irec_t imaps[1], *imapp; > @@ -842,7 +840,6 @@ xfs_alloc_file_space( > > count = len; > imapp = &imaps[0]; > - nimaps = 1; > startoffset_fsb = XFS_B_TO_FSBT(mp, offset); > endoffset_fsb = XFS_B_TO_FSB(mp, offset + count); > allocatesize_fsb = endoffset_fsb - startoffset_fsb; > @@ -853,6 +850,7 @@ xfs_alloc_file_space( > while (allocatesize_fsb && !error) { > xfs_fileoff_t s, e; > unsigned int dblocks, rblocks, resblks; > + int nimaps = 1; > > /* > * Determine space reservations for data/realtime. > @@ -918,15 +916,20 @@ xfs_alloc_file_space( > if (error) > break; > > - allocated_fsb = imapp->br_blockcount; > - > - if (nimaps == 0) { > - error = -ENOSPC; > - break; > + /* > + * If xfs_bmapi_write finds a delalloc extent at the requested > + * range, it tries to convert the entire delalloc extent to a > + * real allocation. > + * But if the allocator then can't find an AG with enough space > + * to at least cover the start block of the requested range, Hmm. Given that you said this was done in the context of delalloc on the realtime volume, I don't think there are AGs in play here? Unless the AG actually ran out of space allocating a bmbt block? My hunch here is that free space on the rt volume is fragmented, but there were still enough free rtextents to create a large delalloc reservation. Conversion of the reservation to an unwritten extent managed to map one free rtextent into the file, but not enough to convert the file mapping all the way to @startoffset_fsb. Hence the bmapi_write call succeeds, but returns @nmaps == 0. If that's true, I suggest changing the second sentence of the comment to read: "If the allocator cannot find a single free extent large enough to cover the start block of the requested range, xfs_bmapi_write will return 0 but leave *nimaps set to 0." I agree with the fix -- calling bmapi_write again with the same startoffset_fsb will return the mapping of the space that /did/ get allocated, which enables us to push @startoffset_fsb forward. --D > + * xfs_bmapi_write will return 0 but leave *nimaps set to 0. > + * In that case we simply need to keep looping with the same > + * startoffset_fsb. > + */ > + if (nimaps) { > + startoffset_fsb += imapp->br_blockcount; > + allocatesize_fsb -= imapp->br_blockcount; > } > - > - startoffset_fsb += allocated_fsb; > - allocatesize_fsb -= allocated_fsb; > } > > return error; > -- > 2.39.2 > ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] xfs: handle nimaps=0 from xfs_bmapi_write in xfs_alloc_file_space 2023-10-09 16:27 ` Darrick J. Wong @ 2023-10-10 7:59 ` Christoph Hellwig 0 siblings, 0 replies; 4+ messages in thread From: Christoph Hellwig @ 2023-10-10 7:59 UTC (permalink / raw) To: Darrick J. Wong; +Cc: Christoph Hellwig, linux-xfs On Mon, Oct 09, 2023 at 09:27:56AM -0700, Darrick J. Wong wrote: > > + /* > > + * If xfs_bmapi_write finds a delalloc extent at the requested > > + * range, it tries to convert the entire delalloc extent to a > > + * real allocation. > > + * But if the allocator then can't find an AG with enough space > > + * to at least cover the start block of the requested range, > > Hmm. Given that you said this was done in the context of delalloc on > the realtime volume, I don't think there are AGs in play here? Unless > the AG actually ran out of space allocating a bmbt block? Well, really rt group as it was using your rt group patches. Just using AG as the more generalized case here. > My hunch here is that free space on the rt volume is fragmented, but > there were still enough free rtextents to create a large delalloc > reservation. Conversion of the reservation to an unwritten extent > managed to map one free rtextent into the file, but not enough to > convert the file mapping all the way to @startoffset_fsb. Hence the > bmapi_write call succeeds, but returns @nmaps == 0. Yes. > If that's true, I suggest changing the second sentence of the comment to > read: > > "If the allocator cannot find a single free extent large enough to > cover the start block of the requested range, xfs_bmapi_write will > return 0 but leave *nimaps set to 0." That's probably better indeed. I'll wait a bit for more comment and will resend with that update. ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2023-10-10 7:59 UTC | newest] Thread overview: 4+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-10-09 10:30 handle nimaps=0 from xfs_bmapi_write Christoph Hellwig 2023-10-09 10:30 ` [PATCH] xfs: handle nimaps=0 from xfs_bmapi_write in xfs_alloc_file_space Christoph Hellwig 2023-10-09 16:27 ` Darrick J. Wong 2023-10-10 7:59 ` Christoph Hellwig
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).