linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Weird loop device behavior in 6.15-rc1?
@ 2025-04-07 23:30 Darrick J. Wong
  2025-04-08  6:44 ` Christoph Hellwig
  0 siblings, 1 reply; 3+ messages in thread
From: Darrick J. Wong @ 2025-04-07 23:30 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: axboe, dlemoal, linux-block, linux-fsdevel

Hey Christoph,

I have a ... weird test setup where loop devices have directio enabled
unconditionally on a system with 4k-lba disks, and now that I pulled
down 6.15-rc1, I see failures in xfs/259:

--- /run/fstests/bin/tests/xfs/259.out	2025-01-30 10:00:17.074275830 -0800
+++ /var/tmp/fstests/xfs/259.out.bad	2025-04-06 19:34:56.587315490 -0700
@@ -1,17 +1,428 @@
 QA output created by 259
 Trying to make (4TB - 4096B) long xfs, block size 4096
 Trying to make (4TB - 4096B) long xfs, block size 2048
+block size 2048 cannot be smaller than sector size 4096

I think bugs in the loop driver's O_DIRECT handling are responsible for
this.  I boiled it down to the key commands so that you don't have to
set up a bunch of hardware.

First, some setup:

# losetup -f --direct-io=on --sector-size 4096 --show /dev/sda
# mkfs.xfs -f /dev/sda
# mount /dev/sda /mnt

On 6.14 and 6.15-rc1, if I create the loop device with directio mode
and try to turn it off so that I can reduce the block size:

# truncate -s 30g /mnt/a
# losetup --direct-io=on -f --show /mnt/a
/dev/loop1
# losetup --list --raw /dev/loop1
NAME SIZELIMIT OFFSET AUTOCLEAR RO BACK-FILE DIO LOG-SEC
/dev/loop1 0 0 0 0 /mnt/a 1 4096

# losetup --sector-size 2048 /dev/loop1
# losetup --list --raw /dev/loop1
NAME SIZELIMIT OFFSET AUTOCLEAR RO BACK-FILE DIO LOG-SEC
/dev/loop1 0 0 0 0 /mnt/a 0 2048

# losetup --direct-io=off /dev/loop1
# losetup --list --raw /dev/loop1
NAME SIZELIMIT OFFSET AUTOCLEAR RO BACK-FILE DIO LOG-SEC
/dev/loop1 0 0 0 0 /mnt/a 0 2048

# losetup --sector-size 2048 /dev/loop1
# losetup --list --raw /dev/loop1
NAME SIZELIMIT OFFSET AUTOCLEAR RO BACK-FILE DIO LOG-SEC
/dev/loop1 0 0 0 0 /mnt/a 0 2048

(yes, that is a weird sequence)

Then trying to format an XFS filesystem fails:

# mkfs.xfs -f /dev/loop1 -b size=2k -K
meta-data=/dev/loop1             isize=512    agcount=4, agsize=393216 blks
         =                       sectsz=2048  attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=1
         =                       reflink=1    bigtime=1 inobtcount=1 nrext64=1
         =                       exchange=0   metadir=0
data     =                       bsize=2048   blocks=1572864, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1, parent=0
log      =internal log           bsize=2048   blocks=32768, version=2
         =                       sectsz=2048  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
         =                       rgcount=0    rgsize=0 extents
mkfs.xfs: pwrite failed: Input/output error

I think there's a bug in the loop driver where changing
LO_FLAGS_DIRECT_IO doesn't actually try to change the O_DIRECT state of
the underlying lo->lo_backing_file->f_flags.  So I can try to set a 2k
block size on the loop dev, which turns off LO_FLAGS_DIRECT_IO but the
fd is still open O_DIRECT so the writes fail.  But this isn't a
regression in -rc1, so maybe this is the expected behavior?

/me notes that going the opposite direction (turning directio on after
creation) fails:

# losetup --direct-io=on /dev/loop2
losetup: /dev/loop2: set direct io failed: Invalid argument

At least the loopdev stays in buffered mode and mkfs runs fine.

But now let's try passing in "0" to losetup --sector-size to set the
sector size to the minimum value.  On 6.14, this happens:

# losetup --list --raw /dev/loop1
NAME SIZELIMIT OFFSET AUTOCLEAR RO BACK-FILE DIO LOG-SEC
/dev/loop1 0 0 0 0 /mnt/a 1 4096

# losetup --sector-size 0 /dev/loop1
# losetup --list --raw /dev/loop1
NAME SIZELIMIT OFFSET AUTOCLEAR RO BACK-FILE DIO LOG-SEC
/dev/loop1 0 0 0 0 /mnt/a 1 4096

# losetup --direct-io=off /dev/loop1
# losetup --list --raw /dev/loop1
NAME SIZELIMIT OFFSET AUTOCLEAR RO BACK-FILE DIO LOG-SEC
/dev/loop1 0 0 0 0 /mnt/a 0 4096

# losetup --sector-size 0 /dev/loop1
# losetup --list --raw /dev/loop1
NAME SIZELIMIT OFFSET AUTOCLEAR RO BACK-FILE DIO LOG-SEC
/dev/loop1 0 0 0 0 /mnt/a 1 4096

Notice that the loopdev never changes block size, so mkfs fails:

# mkfs.xfs -f /dev/loop1 -b size=2k -K
block size 2048 cannot be smaller than sector size 4096

On 6.15-rc1, you actually /can/ change the sector size:

# losetup --list --raw /dev/loop1
NAME SIZELIMIT OFFSET AUTOCLEAR RO BACK-FILE DIO LOG-SEC
/dev/loop1 0 0 0 0 /mnt/a 1 4096
# losetup --sector-size 0 /dev/loop1
# losetup --list --raw /dev/loop1
NAME SIZELIMIT OFFSET AUTOCLEAR RO BACK-FILE DIO LOG-SEC
/dev/loop1 0 0 0 0 /mnt/a 1 4096
# losetup --direct-io=off /dev/loop1
# losetup --list --raw /dev/loop1
NAME SIZELIMIT OFFSET AUTOCLEAR RO BACK-FILE DIO LOG-SEC
/dev/loop1 0 0 0 0 /mnt/a 0 4096
# losetup --sector-size 0 /dev/loop1
# losetup --list --raw /dev/loop1
NAME SIZELIMIT OFFSET AUTOCLEAR RO BACK-FILE DIO LOG-SEC
/dev/loop1 0 0 0 0 /mnt/a 0 512

But the backing file still has O_DIRECT on, so formatting fails:

# mkfs.xfs -f /dev/loop1 -b size=2k -K
meta-data=/dev/loop1             isize=512    agcount=4, agsize=393216 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=1
         =                       reflink=1    bigtime=1 inobtcount=1 nrext64=1
         =                       exchange=0   metadir=0
data     =                       bsize=2048   blocks=1572864, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1, parent=0
log      =internal log           bsize=2048   blocks=32768, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
         =                       rgcount=0    rgsize=0 extents
mkfs.xfs: pwrite failed: Input/output error

So I /think/ what's going on here is that LOOP_SET_DIRECT_IO should be
trying to set/clear O_DIRECT on the backing file.

I chose to tag you because I think commit f4774e92aab85d ("loop: take
the file system minimum dio alignment into account") is what caused the
change in the block size setting behavior.  I also see similar messages
in xfs/078 and maybe xfs/432 if I turn on zoned=1 mode.

Though as I mentioned, I think the problems with the loop driver go
deeper than that commit. :/

Thoughts?

--D

(/me notes that xfs/801 is failing across the board, and I don't know
what changed about THPs in tmpfs but clearly something's corrupting
memory.)

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2025-04-08 14:27 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-07 23:30 Weird loop device behavior in 6.15-rc1? Darrick J. Wong
2025-04-08  6:44 ` Christoph Hellwig
2025-04-08 14:27   ` Darrick J. Wong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).