From: "Dr. David Alan Gilbert" <dave@treblig.org>
To: Keith Busch <kbusch@kernel.org>, zkabelac@redhat.com
Cc: Vjaceslavs Klimovs <vklimovs@gmail.com>,
	Thorsten Leemhuis <regressions@leemhuis.info>,
	trnka@scm.com, linux-block@vger.kernel.org,
	dm-devel@lists.linux.dev,
	Linux kernel regressions list <regressions@lists.linux.dev>
Subject: Re: Repeatable, raid1+O_DIRECT, hang/warn
Date: Tue, 16 Jun 2026 14:04:49 +0000
Message-ID: <ajFYAeItZkyZ9Imi@gallifrey>
In-Reply-To: <ajFK5NXkxd6jU5zu@gallifrey>

* Dr. David Alan Gilbert (dave@treblig.org) wrote:
> * Dr. David Alan Gilbert (dave@treblig.org) wrote:
> > * Keith Busch (kbusch@kernel.org) wrote:
> > > On Mon, Jun 15, 2026 at 04:16:12PM -0700, Vjaceslavs Klimovs wrote:
> > > > Your trace looks like what the two earlier reports hit: a read reaching
> > > > a leaf device with sectors > 0 but phys_seg 0 (an empty bio). One aside
> > > > that may help read the trace: blk_io_trace.error is a __u16, so the
> > > > bracketed values on your C lines are errnos as u16 (65514 = -EINVAL,
> > > > 65531 = -EIO).
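
As a quick sanity check of the arithmetic in the paragraph above, here is a minimal sketch that reinterprets a 16-bit unsigned value as a negative errno. The helper name decode_u16_errno is made up for illustration and is not part of blktrace or blkparse; it only assumes the field is a plain two's-complement u16, as described.

```python
import errno


def decode_u16_errno(value: int) -> str:
    """Interpret a 16-bit unsigned trace value as a (possibly negative) errno."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("expected a 16-bit unsigned value")
    # Reinterpret the u16 as a signed 16-bit integer (two's complement).
    signed = value - 0x10000 if value & 0x8000 else value
    if signed >= 0:
        return str(signed)
    # Map the magnitude back to its symbolic errno name, if Python knows it.
    return "-" + errno.errorcode.get(-signed, "unknown")


if __name__ == "__main__":
    for raw in (65514, 65531):
        print(raw, "=", decode_u16_errno(raw))
```

Values below 0x8000 are returned unchanged as decimal strings, so a zero error field still reads as "0".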
> > > > 
> > > > The WARN itself is new, the bad bio isn't. bio_add_page() only started
> > > > rejecting len == 0 in 643893647cac ("block: reject zero length in
> > > > bio_add_page()", v7.1-rc1); on 7.0.8 the same empty bio tripped
> > > > scsi_alloc_sgtables()'s !nr_segs instead, which matches what you saw.
> > > > That fits your "not a recent regression": the condition is older, v7.1
> > > > just made it loud.
> > > > 
> > > > For Tomas's and my reports (QEMU O_DIRECT to the LV block device) the
> > > > origin looks like 5ff3f74e145a ("block: simplify direct io validity
> > > > check", v6.18): blkdev_dio_invalid() now checks only aggregate
> > > > ki_pos | count alignment and dropped the per-segment
> > > > bdev_iter_is_aligned() walk, so a degenerate or misaligned O_DIRECT no
> > > > longer gets -EINVAL at the fops boundary. But your reproducer reads a
> > > > file, which goes through the filesystem O_DIRECT path and never calls
> > > > blkdev_dio_invalid(), and still makes the empty bio. So it isn't only
> > > > that one entry point.
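
To make the difference between the two checks concrete, here is a rough userspace model, not the kernel code. The function names, the (address, length) segment tuples and the dma_mask value are all assumptions for illustration; only the "ki_pos | count" aggregate test and the idea of a per-segment walk come from the description above.

```python
from typing import List, Tuple

Segment = Tuple[int, int]  # (buffer address, length in bytes), hypothetical layout


def aggregate_check_rejects(pos: int, segs: List[Segment], lbs: int) -> bool:
    """Model of the simplified check: only position and total count are tested."""
    count = sum(length for _, length in segs)
    return bool((pos | count) & (lbs - 1))


def per_segment_check_rejects(pos: int, segs: List[Segment],
                              lbs: int, dma_mask: int) -> bool:
    """Model of a per-segment walk: every address and length is tested."""
    if pos & (lbs - 1):
        return True
    return any((addr & dma_mask) or (length & (lbs - 1))
               for addr, length in segs)


if __name__ == "__main__":
    lbs = 512          # logical block size, assumed
    dma_mask = 511     # assumed DMA alignment mask, purely illustrative
    # Two 256-byte segments: the total is block aligned, each piece is not.
    segs = [(0x1000, 256), (0x2000, 256)]
    print("aggregate rejects:  ", aggregate_check_rejects(0, segs, lbs))
    print("per-segment rejects:", per_segment_check_rejects(0, segs, lbs, dma_mask))
```

In this model a request whose pieces are individually misaligned but whose total is block aligned passes the aggregate test and is only caught by the per-segment walk.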
> > > > 
> > > > dm-mirror then hangs because Keith's f7b24c7b41f2 only covers md
> > > > raid1/raid10; legacy dm-mirror (dm-raid1.c) has no equivalent and
> > > > rebuilds the empty read onto the other leg. Note the leg's status isn't
> > > > even consistent (your SATA path returns BLK_STS_IOERR, not
> > > > BLK_STS_INVAL), so copying that status check into dm-mirror probably
> > > > wouldn't catch every case.
> > > > 
> > > > For what it's worth, that points me toward rejecting the empty or
> > > > misaligned bio once, at submission, with -EINVAL, rather than teaching
> > > > each consumer to tolerate it. But you'll know the tradeoffs far better
> > > > than I do.
> > > > 
> > > > I have a small QEMU + LVM raid1/mirror setup that reproduces the
> > > > block-device variant and bisects to 5ff3f74e. Happy to run your file
> > > > reproducer with some instrumentation at the dm-mirror read entry
> > > > (bi_size vs bio_sectors vs bvec lengths) to see whether the bio is
> > > > already empty on arrival or built that way on the retry, and to test
> > > > any patch.
> > > 
> > > Thanks for following up here. I didn't initially see your follow-up
> > > until Thorsten linked it. I apologize for missing that, this feature is
> > > important so I don't want to see anything regress for it.
> > > 
> > > There is a known bug fix I think future tests should include:
> > > 
> > >   https://lore.kernel.org/linux-block/20260612223205.465913-1-kbusch@meta.com/
> > 
> > > This likely isn't the fix you're looking for, but including it rules out
> > > conditions that are not important here.
> > > 
> > > After that, can we try this suggestion and see if the hang goes away?
> > > 
> > >   https://lore.kernel.org/linux-block/ajBb8tK-0aJBpIgF@kbusch-mbp/
> > 
> > With just that one in, the machine survives - thanks!
> > 
> > It does give:
> > 
> > [  505.208354] device-mapper: raid1: Mirror read failed from 252:24. Trying alternative device.
> > [  505.239376] device-mapper: raid1: All sides of mirror have failed.
> > [  505.239389] device-mapper: raid1: Read failure on mirror device 252:25.  Failing I/O.
> > [  505.239394] device-mapper: raid1: Mirror read failed.
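
The 252:24 and 252:25 in those messages are major:minor device numbers. A small sketch for pulling them out of a log excerpt and pointing at the matching sysfs entry follows; the helper names are made up for illustration, and the only thing assumed about the system is that sysfs is mounted at /sys, where /sys/dev/block/MAJOR:MINOR is a link to the block device.

```python
import re
from pathlib import Path
from typing import List

DEV_RE = re.compile(r"\b(\d+):(\d+)\b")
TAG = "device-mapper: raid1:"


def mirror_devices(log_text: str) -> List[str]:
    """Return the unique major:minor pairs named in dm raid1 log lines, in order."""
    seen: List[str] = []
    for line in log_text.splitlines():
        if TAG not in line:
            continue
        # Only scan the message part, so the timestamp is never considered.
        message = line.split(TAG, 1)[1]
        for major, minor in DEV_RE.findall(message):
            dev = f"{major}:{minor}"
            if dev not in seen:
                seen.append(dev)
    return seen


def sysfs_path(dev: str) -> Path:
    """Path of the sysfs link for a major:minor pair (not checked for existence)."""
    return Path("/sys/dev/block") / dev


if __name__ == "__main__":
    sample = (
        "[  505.208354] device-mapper: raid1: Mirror read failed from 252:24. "
        "Trying alternative device.\n"
        "[  505.239389] device-mapper: raid1: Read failure on mirror device 252:25.  "
        "Failing I/O.\n"
    )
    for dev in mirror_devices(sample):
        print(dev, "->", sysfs_path(dev))
```

On the affected machine, resolving each printed path should name the dm device behind that mirror leg.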
> > 
> > Although as far as I can tell the RAID hasn't errored and is still in sync.
> > 
> > If I turn the test case into a write (just s/pread/pwrite/ ) - the machine
> > still survives but then it does lose raid sync, and the raid resync
> > seems to stick until I do a 'lvchange --refresh main/lvol0'
> > which recovers after having spat out a:
> > 
> > [  865.319527] Buffer I/O error on dev dm-26, logical block 262128, async page read
> > 
> > > I expect the original test case to still return an error (and I think it
> > > was designed to), but it shouldn't produce the warn or bug splats with a
> > > stuck uninterruptible task.
> > 
> > It's not clear to me if it was designed to fail or not; I've not had
> > a chance to rerun the original qemu block tests yet, and I don't know
> > if old kernels successfully used O_DIRECT in this case.
> > 
> > It still feels that my pwrite case above shouldn't cause a raid de-sync
> > (especially since a normal user can do it).
> 
> Just to follow up on that: if I use the modern lvm mode
> (lvcreate -m 1 -L 1G main /dev/sda2 /dev/sdb2) rather than
> the old mirror, with the same patch, then:
> 
>   a) I get no log errors with either read or write
>   b) read still gives EIO
>   c) write apparently succeeds ?!

One more confirmation: running qemu's 'make check' during the build passes
with no log errors (whether it skipped any tests due to its detection
code, I don't know).

Dave

-- 
 -----Open up your eyes, open up your mind, open up your code -------   
/ Dr. David Alan Gilbert    |       Running GNU/Linux       | Happy  \ 
\        dave @ treblig.org |                               | In Hex /
 \ _________________________|_____ http://www.treblig.org   |_______/

Thread overview: 19+ messages
2026-06-14 17:57 Repeatable, raid1+O_DIRECT, hang/warn Dr. David Alan Gilbert
2026-06-15 10:34 ` Thorsten Leemhuis
2026-06-15 12:50   ` Dr. David Alan Gilbert
2026-06-15 23:16     ` Vjaceslavs Klimovs
2026-06-16  0:06       ` Keith Busch
2026-06-16  1:25         ` Vjaceslavs Klimovs
2026-06-16 12:57         ` Dr. David Alan Gilbert
2026-06-16 13:08           ` Dr. David Alan Gilbert
2026-06-16 14:04             ` Dr. David Alan Gilbert [this message]
2026-06-16 14:19             ` Keith Busch
2026-06-15 13:07 ` Zdenek Kabelac
2026-06-15 13:20   ` Dr. David Alan Gilbert
2026-06-15 15:20 ` Keith Busch
2026-06-15 15:35   ` Keith Busch
2026-06-15 16:37     ` Dr. David Alan Gilbert
2026-06-15 17:19       ` Keith Busch
2026-06-15 17:42         ` Dr. David Alan Gilbert
2026-06-15 19:25           ` Keith Busch
2026-06-15 20:09             ` Keith Busch
