Re: Weird loop device behavior in 6.15-rc1?

From: "Darrick J. Wong" <djwong@kernel.org>
To: Christoph Hellwig <hch@infradead.org>
Cc: axboe@kernel.dk, dlemoal@kernel.org, linux-block@vger.kernel.org,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: Weird loop device behavior in 6.15-rc1?
Date: Tue, 8 Apr 2025 07:27:09 -0700
Message-ID: <20250408142709.GH6266@frogsfrogsfrogs>
In-Reply-To: <Z_TF0vYWljwlWxoY@infradead.org>

On Mon, Apr 07, 2025 at 11:44:34PM -0700, Christoph Hellwig wrote:
> On Mon, Apr 07, 2025 at 04:30:07PM -0700, Darrick J. Wong wrote:
> > Hey Christoph,
> > 
> > I have a ... weird test setup where loop devices have directio enabled
> > unconditionally on a system with 4k-lba disks, and now that I pulled
> > down 6.15-rc1, I see failures in xfs/259:
> 
> Hmm, this works just fine for me with a 4k LBA size NVMe setup on -rc1
> with latest xfsprogs and xfstests for-next.

Yeah, fstests works fine with loop in buffered mode. :)

I /think/ the (separate) problem is that prior to 6.15, the logical and
physical block sizes of the loop device would be set to 512b in
direct-io=on mode.  Now they're set to either the STATX_DIOALIGN size or
the underlying bdev's logical block size, which means 4k here.  mkfs.xfs
runs BLKSSZGET, compares that to the -b size= argument, and rejects the
format when blocksize < loop device logical block size.
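
A rough sketch of that comparison in shell, for illustration only; this
is not mkfs.xfs's actual code.  The helper name fs_blocksize_ok and the
device path in the usage comment are made up.  blockdev --getss is the
command-line way to issue BLKSSZGET.

```shell
#!/bin/sh

# Hypothetical helper: succeed only if the requested filesystem block size
# is at least the device's logical sector size.
# $1 = requested fs block size in bytes
# $2 = logical sector size in bytes (what BLKSSZGET reports)
fs_blocksize_ok() {
        if [ "$1" -lt "$2" ]; then
                echo "block size $1 < logical sector size $2" >&2
                return 1
        fi
        return 0
}

# Usage sketch (needs a real block device; /dev/loop0 is only an example):
#   lbs=$(blockdev --getss /dev/loop0)
#   fs_blocksize_ok 2048 "$lbs" || echo "-b size=2048 would be rejected"
```

With a 4k logical block size on the loop device, a 2k request fails this
check, whereas under the old 512b behavior it would have passed.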

I don't know whether the loop device should behave more like 512e
drives, where we advertise a (potentially slow) 512b LBA and a 4k
physical block size, or whether we should just stick with the way things
are right now because 512e mode sucks.  The first means I don't have to
patch fstests here; the second means I'd have to adjust _create_loop to
take a desired blocksize and try to set up the loopdev with that block
size, even if it means dropping dio mode.
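
The second option could look something like the sketch below.  This is
not the real fstests helper: the function name loop_opts_for_blocksize
is hypothetical, and it assumes the caller already knows the smallest
block size the backing file can do direct I/O with.  Only the losetup
options --sector-size and --direct-io are real.

```shell
#!/bin/sh

# Hypothetical helper: print losetup options for a desired loop block size.
# $1 = desired loop device block size in bytes
# $2 = smallest block size the backing file supports for direct I/O
#      (assumed known, e.g. the backing bdev's logical block size)
loop_opts_for_blocksize() {
        if [ "$1" -ge "$2" ]; then
                echo "--sector-size $1 --direct-io=on"
        else
                # Too small for direct I/O on this backing store: drop dio.
                echo "--sector-size $1 --direct-io=off"
        fi
}

# Usage sketch (requires root; the image path is only an example):
#   losetup -f --show $(loop_opts_for_blocksize 2048 4096) /path/to/image
```

Deciding the mode before calling losetup avoids depending on how the
kernel reacts to a block size that direct I/O can't honor.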

> > Then trying to format an XFS filesystem fails:
> 
> That on the other hand I can reproduce locally.
> 
> > I think there's a bug in the loop driver where changing
> > LO_FLAGS_DIRECT_IO doesn't actually try to change the O_DIRECT state of
> > the underlying lo->lo_backing_file->f_flags.  So I can try to set a 2k
> > block size on the loop dev, which turns off LO_FLAGS_DIRECT_IO but the
> > fd is still open O_DIRECT so the writes fail.  But this isn't a
> > regression in -rc1, so maybe this is the expected behavior?
> 
> This does look old, but also I would not call it expected.
> 
> > On 6.15-rc1, you actually /can/ change the sector size:
> 
> > But the backing file still has O_DIRECT on, so formatting fails:
> 
> Looks like fixing the silent failure to change the sector size exposed
> the bug where O_DIRECT is not cleared.
> 
> I'll cook up a patch to clear O_DIRECT.

Thanks!

> > Thoughts?
> > 
> > --D
> > 
> > (/me notes that xfs/801 is failing across the board, and I don't know
> > what changed about THPs in tmpfs but clearly something's corrupting
> > memory.)
> 
> That one always failed for me because it uses a sysfs-dump tool that
> simply doesn't seem to exist.

Ooops.  I meant to take that out before committing and left it in.
Maybe I should just paste a stupid version into xfs/801:

$ sysfs-dump /sys/block/sda/queue/
/sys/block/sda/queue//add_random = 0
/sys/block/sda/queue//chunk_sectors : 0
/sys/block/sda/queue//dax : 0
/sys/block/sda/queue//discard_granularity : 512
/sys/block/sda/queue//discard_max_bytes = 0
/sys/block/sda/queue//discard_max_hw_bytes : 0
/sys/block/sda/queue//discard_zeroes_data : 0
/sys/block/sda/queue//dma_alignment : 511
<etc>

Full version below.

--D

#!/bin/sh

# Dump a sysfs directory as a key: value stream.

WANT_NEWLINE=

print_help() {
        echo "Usage: $0 [-n] files..."
        exit 1
}

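dump() {
        # Only regular files are dumped; anything else is skipped.
        test -f "$1" || return
        # Separator encodes access: '?' unreadable, ':' readable,
        # '=' any write bit set in the file mode.
        SEP='?'
        test -r "$1" && SEP=':'
        stat -c '%A' "$1" | grep -q 'w' && SEP='='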
        if [ -n "${WANT_NEWLINE}" ]; then
                echo "$1 ${SEP}"
                cat "$1" 2> /dev/null
        else
                echo "$1 ${SEP} $(cat "$1" 2> /dev/null)"
        fi
}

for i in "$@"; do
        if [ "$i" = "--help" ]; then
                print_help
        fi
        # -n: print each value on its own line(s) below the key.
        if [ "$i" = "-n" ]; then
                WANT_NEWLINE=1
                continue
        fi
        if [ -d "$i" ]; then
                for x in "$i/"*; do
                        dump "$x"
                done
        else
                dump "$i"
        fi
done

exit 0
