* handle nimaps=0 from xfs_bmapi_write
@ 2023-10-09 10:30 Christoph Hellwig
2023-10-09 10:30 ` [PATCH] xfs: handle nimaps=0 from xfs_bmapi_write in xfs_alloc_file_space Christoph Hellwig
0 siblings, 1 reply; 4+ messages in thread
From: Christoph Hellwig @ 2023-10-09 10:30 UTC (permalink / raw)
To: linux-xfs
Hi all,
I've recently been playing around with re-enabling delayed
allocations on the RT subvolume (for the rextentsize=1 case), and ran
into an interesting bug where fsx unexpectedly returned -ENOSPC when
testing this code on top of Darrick's rtgroups code.
It turns out that this is because xfs_alloc_file_space does not retry
when xfs_bmapi_write returns 0 with *nimaps = 0, which can happen if the
allocator can't fill the entire space from the beginning of a delalloc
extent to the start of the actually requested range in the call to
xfs_bmapi_write.
^ permalink raw reply [flat|nested] 4+ messages in thread
* [PATCH] xfs: handle nimaps=0 from xfs_bmapi_write in xfs_alloc_file_space
2023-10-09 10:30 handle nimaps=0 from xfs_bmapi_write Christoph Hellwig
@ 2023-10-09 10:30 ` Christoph Hellwig
2023-10-09 16:27 ` Darrick J. Wong
0 siblings, 1 reply; 4+ messages in thread
From: Christoph Hellwig @ 2023-10-09 10:30 UTC (permalink / raw)
To: linux-xfs
If xfs_bmapi_write finds a delalloc extent at the requested range, it
tries to convert the entire delalloc extent to a real allocation.
But if the allocator then can't find an AG with enough space to at least
cover the start block of the requested range, xfs_bmapi_write will return
0 but leave *nimaps set to 0.
In that case we simply need to keep looping with the same startoffset_fsb.
Note that this could affect any caller of xfs_bmapi_write that covers
an existing delayed allocation. As far as I can tell we do not have
any other such caller, though - the regular writeback path uses
xfs_bmapi_convert_delalloc to convert delayed allocations to real ones,
and direct I/O invalidates the page cache first.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
fs/xfs/xfs_bmap_util.c | 25 ++++++++++++++-----------
1 file changed, 14 insertions(+), 11 deletions(-)
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index d85580b101ad8a..556f57f757f33e 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -814,12 +814,10 @@ xfs_alloc_file_space(
{
xfs_mount_t *mp = ip->i_mount;
xfs_off_t count;
- xfs_filblks_t allocated_fsb;
xfs_filblks_t allocatesize_fsb;
xfs_extlen_t extsz, temp;
xfs_fileoff_t startoffset_fsb;
xfs_fileoff_t endoffset_fsb;
- int nimaps;
int rt;
xfs_trans_t *tp;
xfs_bmbt_irec_t imaps[1], *imapp;
@@ -842,7 +840,6 @@ xfs_alloc_file_space(
count = len;
imapp = &imaps[0];
- nimaps = 1;
startoffset_fsb = XFS_B_TO_FSBT(mp, offset);
endoffset_fsb = XFS_B_TO_FSB(mp, offset + count);
allocatesize_fsb = endoffset_fsb - startoffset_fsb;
@@ -853,6 +850,7 @@ xfs_alloc_file_space(
while (allocatesize_fsb && !error) {
xfs_fileoff_t s, e;
unsigned int dblocks, rblocks, resblks;
+ int nimaps = 1;
/*
* Determine space reservations for data/realtime.
@@ -918,15 +916,20 @@ xfs_alloc_file_space(
if (error)
break;
- allocated_fsb = imapp->br_blockcount;
-
- if (nimaps == 0) {
- error = -ENOSPC;
- break;
+ /*
+ * If xfs_bmapi_write finds a delalloc extent at the requested
+ * range, it tries to convert the entire delalloc extent to a
+ * real allocation.
+ * But if the allocator then can't find an AG with enough space
+ * to at least cover the start block of the requested range,
+ * xfs_bmapi_write will return 0 but leave *nimaps set to 0.
+ * In that case we simply need to keep looping with the same
+ * startoffset_fsb.
+ */
+ if (nimaps) {
+ startoffset_fsb += imapp->br_blockcount;
+ allocatesize_fsb -= imapp->br_blockcount;
}
-
- startoffset_fsb += allocated_fsb;
- allocatesize_fsb -= allocated_fsb;
}
return error;
--
2.39.2
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] xfs: handle nimaps=0 from xfs_bmapi_write in xfs_alloc_file_space
2023-10-09 10:30 ` [PATCH] xfs: handle nimaps=0 from xfs_bmapi_write in xfs_alloc_file_space Christoph Hellwig
@ 2023-10-09 16:27 ` Darrick J. Wong
2023-10-10 7:59 ` Christoph Hellwig
0 siblings, 1 reply; 4+ messages in thread
From: Darrick J. Wong @ 2023-10-09 16:27 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: linux-xfs
On Mon, Oct 09, 2023 at 12:30:20PM +0200, Christoph Hellwig wrote:
> If xfs_bmapi_write finds a delalloc extent at the requested range, it
> tries to convert the entire delalloc extent to a real allocation.
> But if the allocator then can't find an AG with enough space to at least
> cover the start block of the requested range, xfs_bmapi_write will return
> 0 but leave *nimaps set to 0.
> In that case we simply need to keep looping with the same startoffset_fsb.
>
> Note that this could affect any caller of xfs_bmapi_write that covers
> an existing delayed allocation. As far as I can tell we do not have
> any other such caller, though - the regular writeback path uses
> xfs_bmapi_convert_delalloc to convert delayed allocations to real ones,
> and direct I/O invalidates the page cache first.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
> fs/xfs/xfs_bmap_util.c | 25 ++++++++++++++-----------
> 1 file changed, 14 insertions(+), 11 deletions(-)
>
> diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
> index d85580b101ad8a..556f57f757f33e 100644
> --- a/fs/xfs/xfs_bmap_util.c
> +++ b/fs/xfs/xfs_bmap_util.c
> @@ -814,12 +814,10 @@ xfs_alloc_file_space(
> {
> xfs_mount_t *mp = ip->i_mount;
> xfs_off_t count;
> - xfs_filblks_t allocated_fsb;
> xfs_filblks_t allocatesize_fsb;
> xfs_extlen_t extsz, temp;
> xfs_fileoff_t startoffset_fsb;
> xfs_fileoff_t endoffset_fsb;
> - int nimaps;
> int rt;
> xfs_trans_t *tp;
> xfs_bmbt_irec_t imaps[1], *imapp;
> @@ -842,7 +840,6 @@ xfs_alloc_file_space(
>
> count = len;
> imapp = &imaps[0];
> - nimaps = 1;
> startoffset_fsb = XFS_B_TO_FSBT(mp, offset);
> endoffset_fsb = XFS_B_TO_FSB(mp, offset + count);
> allocatesize_fsb = endoffset_fsb - startoffset_fsb;
> @@ -853,6 +850,7 @@ xfs_alloc_file_space(
> while (allocatesize_fsb && !error) {
> xfs_fileoff_t s, e;
> unsigned int dblocks, rblocks, resblks;
> + int nimaps = 1;
>
> /*
> * Determine space reservations for data/realtime.
> @@ -918,15 +916,20 @@ xfs_alloc_file_space(
> if (error)
> break;
>
> - allocated_fsb = imapp->br_blockcount;
> -
> - if (nimaps == 0) {
> - error = -ENOSPC;
> - break;
> + /*
> + * If xfs_bmapi_write finds a delalloc extent at the requested
> + * range, it tries to convert the entire delalloc extent to a
> + * real allocation.
> + * But if the allocator then can't find an AG with enough space
> + * to at least cover the start block of the requested range,
Hmm. Given that you said this was done in the context of delalloc on
the realtime volume, I don't think there are AGs in play here? Unless
the AG actually ran out of space allocating a bmbt block?
My hunch here is that free space on the rt volume is fragmented, but
there were still enough free rtextents to create a large delalloc
reservation. Conversion of the reservation to an unwritten extent
managed to map one free rtextent into the file, but not enough to
convert the file mapping all the way to @startoffset_fsb. Hence the
bmapi_write call succeeds, but returns @nmaps == 0.
If that's true, I suggest changing the second sentence of the comment to
read:
"If the allocator cannot find a single free extent large enough to
cover the start block of the requested range, xfs_bmapi_write will
return 0 but leave *nimaps set to 0."
I agree with the fix -- calling bmapi_write again with the same
startoffset_fsb will return the mapping of the space that /did/ get
allocated, which enables us to push @startoffset_fsb forward.
--D
> + * xfs_bmapi_write will return 0 but leave *nimaps set to 0.
> + * In that case we simply need to keep looping with the same
> + * startoffset_fsb.
> + */
> + if (nimaps) {
> + startoffset_fsb += imapp->br_blockcount;
> + allocatesize_fsb -= imapp->br_blockcount;
> }
> -
> - startoffset_fsb += allocated_fsb;
> - allocatesize_fsb -= allocated_fsb;
> }
>
> return error;
> --
> 2.39.2
>
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] xfs: handle nimaps=0 from xfs_bmapi_write in xfs_alloc_file_space
2023-10-09 16:27 ` Darrick J. Wong
@ 2023-10-10 7:59 ` Christoph Hellwig
0 siblings, 0 replies; 4+ messages in thread
From: Christoph Hellwig @ 2023-10-10 7:59 UTC (permalink / raw)
To: Darrick J. Wong; +Cc: Christoph Hellwig, linux-xfs
On Mon, Oct 09, 2023 at 09:27:56AM -0700, Darrick J. Wong wrote:
> > + /*
> > + * If xfs_bmapi_write finds a delalloc extent at the requested
> > + * range, it tries to convert the entire delalloc extent to a
> > + * real allocation.
> > + * But if the allocator then can't find an AG with enough space
> > + * to at least cover the start block of the requested range,
>
> Hmm. Given that you said this was done in the context of delalloc on
> the realtime volume, I don't think there are AGs in play here? Unless
> the AG actually ran out of space allocating a bmbt block?
Well, really rt group as it was using your rt group patches. Just
using AG as the more generalized case here.
> My hunch here is that free space on the rt volume is fragmented, but
> there were still enough free rtextents to create a large delalloc
> reservation. Conversion of the reservation to an unwritten extent
> managed to map one free rtextent into the file, but not enough to
> convert the file mapping all the way to @startoffset_fsb. Hence the
> bmapi_write call succeeds, but returns @nmaps == 0.
Yes.
> If that's true, I suggest changing the second sentence of the comment to
> read:
>
> "If the allocator cannot find a single free extent large enough to
> cover the start block of the requested range, xfs_bmapi_write will
> return 0 but leave *nimaps set to 0."
That's probably better indeed. I'll wait a bit for more comment
and will resend with that update.
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2023-10-10 7:59 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-10-09 10:30 handle nimaps=0 from xfs_bmapi_write Christoph Hellwig
2023-10-09 10:30 ` [PATCH] xfs: handle nimaps=0 from xfs_bmapi_write in xfs_alloc_file_space Christoph Hellwig
2023-10-09 16:27 ` Darrick J. Wong
2023-10-10 7:59 ` Christoph Hellwig
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).