linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: John Garry <john.g.garry@oracle.com>
To: Dave Chinner <david@fromorbit.com>
Cc: djwong@kernel.org, hch@lst.de, viro@zeniv.linux.org.uk,
	brauner@kernel.org, jack@suse.cz, chandan.babu@oracle.com,
	willy@infradead.org, axboe@kernel.dk, martin.petersen@oracle.com,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	tytso@mit.edu, jbongio@google.com, ojaswin@linux.ibm.com,
	ritesh.list@gmail.com, mcgrof@kernel.org, p.raghav@samsung.com,
	linux-xfs@vger.kernel.org, catherine.hoang@oracle.com,
	Dave Chinner <dchinner@redhat.com>
Subject: Re: [PATCH v3 07/21] fs: xfs: align args->minlen for forced allocation alignment
Date: Thu, 6 Jun 2024 17:22:19 +0100	[thread overview]
Message-ID: <bcc35a78-9446-48e4-b1ce-0f11972bd19d@oracle.com> (raw)
In-Reply-To: <ZmF3h2ObrJ5hNADp@dread.disaster.area>

On 06/06/2024 09:47, Dave Chinner wrote:
> On Wed, Jun 05, 2024 at 03:26:11PM +0100, John Garry wrote:
>> Hi Dave,
>>
>> I still think that there is a problem with this code or some other allocator
>> code which gives rise to unexpected -ENOSPC. I just highlight this code,
>> above, as I get an unexpected -ENOSPC failure here when the fs does have
>> many free (big enough) extents. I think that the problem may be elsewhere,
>> though.
>>
>> Initially we have a file like this:
>>
>>   EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL
>>     0: [0..127]:        62592..62719      0 (62592..62719)     128
>>     1: [128..895]:      hole                                   768
>>     2: [896..1023]:     63616..63743      0 (63616..63743)     128
>>     3: [1024..1151]:    64896..65023      0 (64896..65023)     128
>>     4: [1152..1279]:    65664..65791      0 (65664..65791)     128
>>     5: [1280..1407]:    68224..68351      0 (68224..68351)     128
>>     6: [1408..1535]:    76416..76543      0 (76416..76543)     128
>>     7: [1536..1791]:    62720..62975      0 (62720..62975)     256
>>     8: [1792..1919]:    60032..60159      0 (60032..60159)     128
>>     9: [1920..2047]:    63488..63615      0 (63488..63615)     128
>>    10: [2048..2303]:    63744..63999      0 (63744..63999)     256
>>
>> forcealign extsize is 16 4k fsb, so the layout looks ok.
>>
>> Then we truncate the file to 454 sectors (or 56.75 fsb). This gives:
>>
>> EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL
>>     0: [0..127]:        62592..62719      0 (62592..62719)     128
>>     1: [128..455]:      hole                                   328
>>
>> We have 57 fsb.
>>
>> Then I attempt to write from byte offset 232448 (454 sector) and a get a
>> write failure in xfs_bmap_select_minlen() returning -ENOSPC; at that point
>> the file looks like this:
> 
> So you are doing an unaligned write of some size at EOF and EOF is
> not aligned to the extsize?

Correct

> 
>>   EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET        TOTAL
>>     0: [0..127]:        62592..62719      0 (62592..62719)     128
>>     1: [128..447]:      hole                                   320
>>     2: [448..575]:      62720..62847      0 (62720..62847)     128
>>
>> That hole in ext #1 is 40 fsb, and not aligned with forcealign granularity.
>> This means that ext #2 is misaligned wrt forcealign granularity.
> 
> OK, so the command to produce this would be something like this?
> 
> # xfs_io -fd -c "truncate 0" \
> 	-c "chattr +<forcealign>" -c "extsize 64k" \
> 	-c "pwrite 0 64k -b 64k" -c "pwrite 448k 64k -b 64k" \
> 	-c "bmap -vvp" \
> 	-c "truncate 227k" \
> 	-c "bmap -vvp" \
> 	-c "pwrite 227k 64k -b 64k" \
> 	-c "bmap -vvp" \
> 	/mnt/scratch/testfile

No, unfortunately not. Well maybe not on a clean filesystem. In my 
stress test, something else is causing this. Probably heavy fragmentation.

> 
>> This is strange.
>>
>> I notice that we when allocate ext #2, xfs_bmap_btalloc() returns
>> ap->blkno=7840, length=16, offset=56. I would expect offset % 16 == 0, which
>> it is not.
> 
> IOWs, the allocation was not correctly rounded down to an aligned
> start offset.  What were the initial parameters passed to this
> allocation?

For xfs_bmap_btalloc() entry,

ap->offset=48, length=32, blkno=0, total=0, minlen=1, minleft=1, eof=1, 
wasdel=0, aeof=0, conv=0, datatype=5, flags=0x8

> i.e. why didn't it round the start offset down to 48?
> Answering that question will tell you where the bug is.

After xfs_bmap_compute_alignments() -> xfs_bmap_extsize_align(), 
ap->offset=48 - that seems ok.

Maybe the problem is in xfs_bmap_process_allocated_extent(). For the 
problematic case when calling that function:

args->fsbno=7840 args->len=16 ap->offset=48 orig_offset=56 orig_length=24

So, as the comment reads there, we could not satisfy the original length 
request, so we move up the position of the extent.

I assume that we just don't want to do that for forcealign, correct?

> 
> Of course, if the allocation start is rounded down to 48, then
> the length should be rounded up to 32 to cover the entire range we
> are writing new data to.
> 
>> In the following sub-io block zeroing, I note that we zero the front padding
>> from pos=196608 (or fsb 48 or sector 384) for len=35840, and back padding
>> from pos=263680 for len=64000 (upto sector 640 or fsb 80). That seems wrong,
>> as we are zeroing data in the ext #1 hole, right?
> 
> The sub block zeroing is doing exactly the right thing - it is
> demonstrating the exact range that the force aligned allocation
> should have covered.

Agreed

> 
>> Now the actual -ENOSPC comes from xfs_bmap_btalloc() -> ... ->
>> xfs_bmap_select_minlen() with initially blen=32 args->alignment=16
>> ap->minlen=1 args->maxlen=8. There xfs_bmap_btalloc() has ap->length=8
>> initially. This may be just a symptom.
> 
> Yeah, now the allocator is trying to fix up the mess that the first unaligned
> allocation created, and it's tripping over ENOSPC because it's not
> allowed to do sub-extent size hint allocations when forced alignment
> is enabled....
> 
>> I guess that there is something wrong in the block allocator for ext #2. Any
>> idea where to check?
> 
> Start with tracing exactly what range iomap is requesting be
> allocated, and then follow that through into the allocator to work
> out why the offset being passed to the allocation never gets rounded
> down to be aligned. There's a mistake in the logic somewhere that is
> failing to apply the start alignment to the allocation request (i.e.
> the bug will be in the allocation setup code path. i.e. somewhere
> in the xfs_bmapi_write -> xfs_bmap_btalloc path *before* we get to
> the xfs_alloc_vextent...() calls.
> 
As above, the problem seems in the processing fix-up.

Thanks,
John


  reply	other threads:[~2024-06-06 16:23 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-29 17:47 [PATCH v3 00/21] block atomic writes for XFS John Garry
2024-04-29 17:47 ` [PATCH v3 01/21] fs: Add generic_atomic_write_valid_size() John Garry
2024-04-29 17:47 ` [PATCH v3 02/21] xfs: only allow minlen allocations when near ENOSPC John Garry
2024-04-29 17:47 ` [PATCH v3 03/21] xfs: always tail align maxlen allocations John Garry
2024-04-29 17:47 ` [PATCH v3 04/21] xfs: simplify extent allocation alignment John Garry
2024-04-29 17:47 ` [PATCH v3 05/21] xfs: make EOF allocation simpler John Garry
2024-04-29 17:47 ` [PATCH v3 06/21] xfs: introduce forced allocation alignment John Garry
2024-04-29 17:47 ` [PATCH v3 07/21] fs: xfs: align args->minlen for " John Garry
2024-06-05 14:26   ` John Garry
2024-06-06  8:47     ` Dave Chinner
2024-06-06 16:22       ` John Garry [this message]
2024-06-07  6:04         ` John Garry
2024-04-29 17:47 ` [PATCH v3 08/21] xfs: Introduce FORCEALIGN inode flag John Garry
2024-04-30 23:22   ` Dave Chinner
2024-05-01 10:03     ` John Garry
2024-05-02  0:50       ` Dave Chinner
2024-05-02  7:56         ` John Garry
2024-06-12  2:10   ` Long Li
2024-06-12  6:55     ` John Garry
2024-06-12 15:43       ` Darrick J. Wong
2024-06-13  2:04         ` Long Li
2024-04-29 17:47 ` [PATCH v3 09/21] xfs: Do not free EOF blocks for forcealign John Garry
2024-04-30 22:54   ` Dave Chinner
2024-05-01  8:30     ` John Garry
2024-05-02  1:11       ` Dave Chinner
2024-05-02  8:55         ` John Garry
2024-04-29 17:47 ` [PATCH v3 10/21] xfs: Update xfs_is_falloc_aligned() mask " John Garry
2024-04-30 23:35   ` Dave Chinner
2024-05-01 10:48     ` John Garry
2024-05-01 23:45       ` Darrick J. Wong
2024-04-29 17:47 ` [PATCH RFC v3 11/21] xfs: Unmap blocks according to forcealign John Garry
2024-05-01  0:10   ` Dave Chinner
2024-05-01 10:54     ` John Garry
2024-06-06  9:50     ` John Garry
2024-04-29 17:47 ` [PATCH RFC v3 12/21] xfs: Only free full extents for forcealign John Garry
2024-05-01  0:53   ` Dave Chinner
2024-05-01 11:24     ` John Garry
2024-05-01 23:53     ` Darrick J. Wong
2024-05-02  3:12       ` Dave Chinner
2024-04-29 17:47 ` [PATCH v3 13/21] xfs: Enable file data forcealign feature John Garry
2024-04-29 17:47 ` [PATCH v3 14/21] iomap: Sub-extent zeroing John Garry
2024-05-01  1:07   ` Dave Chinner
2024-05-01 10:23     ` John Garry
2024-05-30 10:40     ` John Garry
2024-07-26 14:29     ` John Garry
2024-07-26 17:13       ` Christoph Hellwig
2024-07-29 17:02         ` John Garry
2024-08-22 20:35         ` Darrick J. Wong
2024-06-11  3:10   ` Long Li
2024-06-11  7:29     ` John Garry
2024-04-29 17:47 ` [PATCH v3 15/21] fs: xfs: " John Garry
2024-05-01  1:32   ` Dave Chinner
2024-05-01 11:36     ` John Garry
2024-05-02  1:26       ` Dave Chinner
2024-04-29 17:47 ` [PATCH v3 16/21] fs: Add FS_XFLAG_ATOMICWRITES flag John Garry
2024-04-29 17:47 ` [PATCH v3 17/21] iomap: Atomic write support John Garry
2024-05-01  1:47   ` Dave Chinner
2024-05-01 11:08     ` John Garry
2024-05-02  1:43       ` Dave Chinner
2024-05-02  9:12         ` John Garry
2024-04-29 17:47 ` [PATCH v3 18/21] xfs: Support FS_XFLAG_ATOMICWRITES for forcealign John Garry
2024-04-29 17:47 ` [PATCH v3 19/21] xfs: Support atomic write for statx John Garry
2024-04-29 17:47 ` [PATCH v3 20/21] xfs: Validate atomic writes John Garry
2024-04-29 17:47 ` [PATCH v3 21/21] xfs: Support setting FMODE_CAN_ATOMIC_WRITE John Garry

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bcc35a78-9446-48e4-b1ce-0f11972bd19d@oracle.com \
    --to=john.g.garry@oracle.com \
    --cc=axboe@kernel.dk \
    --cc=brauner@kernel.org \
    --cc=catherine.hoang@oracle.com \
    --cc=chandan.babu@oracle.com \
    --cc=david@fromorbit.com \
    --cc=dchinner@redhat.com \
    --cc=djwong@kernel.org \
    --cc=hch@lst.de \
    --cc=jack@suse.cz \
    --cc=jbongio@google.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=martin.petersen@oracle.com \
    --cc=mcgrof@kernel.org \
    --cc=ojaswin@linux.ibm.com \
    --cc=p.raghav@samsung.com \
    --cc=ritesh.list@gmail.com \
    --cc=tytso@mit.edu \
    --cc=viro@zeniv.linux.org.uk \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).