From: Mike Snitzer <snitzer@kernel.org>
To: Patrick Plenefisch <simonpatp@gmail.com>
Cc: Goffredo Baroncelli <kreijack@inwind.it>,
	linux-kernel@vger.kernel.org, Alasdair Kergon <agk@redhat.com>,
	Mikulas Patocka <mpatocka@redhat.com>, Chris Mason <clm@fb.com>,
	Josef Bacik <josef@toxicpanda.com>,
	David Sterba <dsterba@suse.com>,
	regressions@lists.linux.dev, dm-devel@lists.linux.dev,
	linux-btrfs@vger.kernel.org, ming.lei@redhat.com
Subject: Re: LVM-on-LVM: error while submitting device barriers
Date: Tue, 5 Mar 2024 12:45:13 -0500
Message-ID: <ZedaKUge-EBo4CuT@redhat.com>
In-Reply-To: <a783e5ed-db56-4100-956a-353170b1b7ed@inwind.it>

On Thu, Feb 29 2024 at  5:05P -0500,
Goffredo Baroncelli <kreijack@inwind.it> wrote:

> On 29/02/2024 21.22, Patrick Plenefisch wrote:
> > On Thu, Feb 29, 2024 at 2:56 PM Goffredo Baroncelli <kreijack@inwind.it> wrote:
> > > 
> > > > Your understanding is correct. The only thing that comes to my mind to
> > > > cause the problem is asymmetry of the SATA devices. I have one 8TB
> > > > device, plus a 1.5TB, 3TB, and 3TB drives. Doing math on the actual
> > > > extents, lowerVG/single spans (3TB+3TB), and
> > > > lowerVG/lvmPool/lvm/brokenDisk spans (3TB+1.5TB). Both obviously have
> > > > the other leg of raid1 on the 8TB drive, but my thought was that the
> > > > jump across the 1.5+3TB drive gap was at least "interesting"
> > > 
> > > 
> > > what about lowerVG/works ?
> > > 
> > 
> > That one is only on two disks, it doesn't span any gaps
> 
> Sorry, but re-reading the original email I found something that I missed before:
> 
> > BTRFS error (device dm-75): bdev /dev/mapper/lvm-brokenDisk errs: wr
> > 0, rd 0, flush 1, corrupt 0, gen 0
> > BTRFS warning (device dm-75): chunk 13631488 missing 1 devices, max
>                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > tolerance is 0 for writable mount
> > BTRFS: error (device dm-75) in write_all_supers:4379: errno=-5 IO
> > failure (errors while submitting device barriers.)
> 
> Looking at the code, it seems that if a FLUSH command fails, btrfs
> considers the disk to be missing. Then it cannot mount the device RW.
> 
> I would investigate with the LVM developers, if it properly passes
> the flush/barrier command through all the layers, when we have an
> lvm over lvm (raid1). The fact that the lvm is a raid1, is important because
> a flush command to be honored has to be honored by all the
> devices involved.

Hi Patrick,

Your initial report (start of this thread) mentioned that the
regression occurred with 5.19. The DM changes that landed during the
5.19 merge window refactored quite a bit of DM core's handling for bio
splitting (to simplify DM's newfound support for bio polling) -- Ming
Lei (now cc'd) and I wrote these changes:

e86f2b005a51 dm: simplify basic targets
bdb34759a0db dm: use bio_sectors in dm_aceept_partial_bio
b992b40dfcc1 dm: don't pass bio to __dm_start_io_acct and dm_end_io_acct
e6926ad0c988 dm: pass dm_io instance to dm_io_acct directly
d3de6d12694d dm: switch to bdev based IO accounting interfaces
7dd76d1feec7 dm: improve bio splitting and associated IO accounting
2e803cd99ba8 dm: don't grab target io reference in dm_zone_map_bio
0f14d60a023c dm: improve dm_io reference counting
ec211631ae24 dm: put all polled dm_io instances into a single list
9d20653fe84e dm: simplify bio-based IO accounting further
4edadf6dcb54 dm: improve abnormal bio processing

I'll have a closer look at these DM commits (especially relative to
flush bios and your stacked device usage).

The last commit (4edadf6dcb54) is marginally relevant (but likely most
easily reverted from v5.19-rc2, as a simple test to see if it is somehow
a problem... I doubt it is the cause, but it is worth a try).

(FYI, not relevant because it is specific to REQ_NOWAIT but figured I'd
 mention it, this commit earlier in the 5.19 DM changes was bogus:
 563a225c9fd2 dm: introduce dm_{get,put}_live_table_bio called from dm_submit_bio
 Jens fixed it with this stable@ commit:
 a9ce385344f9 dm: don't attempt to queue IO under RCU protection)

> > > However yes, I agree that the pair of disks involved may be the answer
> > > of the problem.
> > > 
> > > Could you show us the output of
> > > 
> > > $ sudo pvdisplay -m
> > > 
> > > 
> > 
> > I trimmed it, but kept the relevant bits (Free PE is thus not correct):
> > 
> > 
> >    --- Physical volume ---
> >    PV Name               /dev/lowerVG/lvmPool
> >    VG Name               lvm
> >    PV Size               <3.00 TiB / not usable 3.00 MiB
> >    Allocatable           yes
> >    PE Size               4.00 MiB
> >    Total PE              786431
> >    Free PE               82943
> >    Allocated PE          703488
> >    PV UUID               7p3LSU-EAHd-xUg0-r9vT-Gzkf-tYFV-mvlU1M
> > 
> >    --- Physical Segments ---
> >    Physical extent 0 to 159999:
> >      Logical volume      /dev/lvm/brokenDisk
> >      Logical extents     0 to 159999
> >    Physical extent 160000 to 339199:
> >      Logical volume      /dev/lvm/a
> >      Logical extents     0 to 179199
> >    Physical extent 339200 to 349439:
> >      Logical volume      /dev/lvm/brokenDisk
> >      Logical extents     160000 to 170239
> >    Physical extent 349440 to 351999:
> >      FREE
> >    Physical extent 352000 to 460026:
> >      Logical volume      /dev/lvm/brokenDisk
> >      Logical extents     416261 to 524287
> >    Physical extent 460027 to 540409:
> >      FREE
> >    Physical extent 540410 to 786430:
> >      Logical volume      /dev/lvm/brokenDisk
> >      Logical extents     170240 to 416260

Please provide the following from the guest that activates /dev/lvm/brokenDisk:

lsblk
dmsetup table

Please also provide the same from the host (just for completeness).

Also, I didn't see any kernel logs that show DM-specific errors.  I
doubt you'd have left any DM-specific errors out in your report.  So
is btrfs the canary here?  To be clear: You're only seeing btrfs
errors in the kernel log?

Mike
