From: Mike Snitzer <snitzer@kernel.org>
To: Patrick Plenefisch <simonpatp@gmail.com>
Cc: Goffredo Baroncelli <kreijack@inwind.it>,
	linux-kernel@vger.kernel.org, Alasdair Kergon <agk@redhat.com>,
	Mikulas Patocka <mpatocka@redhat.com>, Chris Mason <clm@fb.com>,
	Josef Bacik <josef@toxicpanda.com>,
	David Sterba <dsterba@suse.com>,
	regressions@lists.linux.dev, dm-devel@lists.linux.dev,
	linux-btrfs@vger.kernel.org, ming.lei@redhat.com
Subject: Re: LVM-on-LVM: error while submitting device barriers
Date: Tue, 5 Mar 2024 12:45:13 -0500
Message-ID: <ZedaKUge-EBo4CuT@redhat.com>
In-Reply-To: <a783e5ed-db56-4100-956a-353170b1b7ed@inwind.it>

On Thu, Feb 29 2024 at  5:05P -0500,
Goffredo Baroncelli <kreijack@inwind.it> wrote:

> On 29/02/2024 21.22, Patrick Plenefisch wrote:
> > On Thu, Feb 29, 2024 at 2:56 PM Goffredo Baroncelli <kreijack@inwind.it> wrote:
> > > 
> > > > Your understanding is correct. The only thing that comes to my mind to
> > > > cause the problem is asymmetry of the SATA devices. I have one 8TB
> > > > device, plus 1.5TB, 3TB, and 3TB drives. Doing math on the actual
> > > > extents, lowerVG/single spans (3TB+3TB), and
> > > > lowerVG/lvmPool/lvm/brokenDisk spans (3TB+1.5TB). Both obviously have
> > > > the other leg of raid1 on the 8TB drive, but my thought was that the
> > > > jump across the 1.5+3TB drive gap was at least "interesting"
> > > 
> > > 
> > > what about lowerVG/works ?
> > > 
> > 
> > That one is only on two disks, it doesn't span any gaps
> 
> Sorry, but re-reading the original email I found something that I missed before:
> 
> > BTRFS error (device dm-75): bdev /dev/mapper/lvm-brokenDisk errs: wr
> > 0, rd 0, flush 1, corrupt 0, gen 0
> > BTRFS warning (device dm-75): chunk 13631488 missing 1 devices, max
>                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > tolerance is 0 for writable mount
> > BTRFS: error (device dm-75) in write_all_supers:4379: errno=-5 IO
> > failure (errors while submitting device barriers.)
> 
> Looking at the code, it seems that if a FLUSH command fails, btrfs
> considers that the disk is missing. Then it cannot mount the device RW.
> 
> I would investigate with the LVM developers whether it properly passes
> the flush/barrier command through all the layers when we have
> LVM over LVM (raid1). The fact that the LVM is a raid1 is important because,
> for a flush command to be honored, it has to be honored by all the
> devices involved.

Hi Patrick,

Your initial report (start of this thread) mentioned that the
regression occurred with 5.19. The DM changes that landed during the
5.19 merge window refactored quite a bit of DM core's handling for bio
splitting (to simplify DM's newfound support for bio polling) -- Ming
Lei (now cc'd) and I wrote these changes:

e86f2b005a51 dm: simplify basic targets
bdb34759a0db dm: use bio_sectors in dm_aceept_partial_bio
b992b40dfcc1 dm: don't pass bio to __dm_start_io_acct and dm_end_io_acct
e6926ad0c988 dm: pass dm_io instance to dm_io_acct directly
d3de6d12694d dm: switch to bdev based IO accounting interfaces
7dd76d1feec7 dm: improve bio splitting and associated IO accounting
2e803cd99ba8 dm: don't grab target io reference in dm_zone_map_bio
0f14d60a023c dm: improve dm_io reference counting
ec211631ae24 dm: put all polled dm_io instances into a single list
9d20653fe84e dm: simplify bio-based IO accounting further
4edadf6dcb54 dm: improve abnormal bio processing

I'll have a closer look at these DM commits (especially relative to
flush bios and your stacked device usage).

The last commit (4edadf6dcb54) is marginally relevant (but likely most
easily reverted from v5.19-rc2, as a simple test to see if it is somehow
a problem... doubtful to be the cause but worth a try).

(FYI, not relevant because it is specific to REQ_NOWAIT but figured I'd
 mention it, this commit earlier in the 5.19 DM changes was bogus:
 563a225c9fd2 dm: introduce dm_{get,put}_live_table_bio called from dm_submit_bio
 Jens fixed it with this stable@ commit:
 a9ce385344f9 dm: don't attempt to queue IO under RCU protection)

> > > However yes, I agree that the pair of disks involved may be the answer
> > > of the problem.
> > > 
> > > Could you show us the output of
> > > 
> > > $ sudo pvdisplay -m
> > > 
> > > 
> > 
> > I trimmed it, but kept the relevant bits (Free PE is thus not correct):
> > 
> > 
> >    --- Physical volume ---
> >    PV Name               /dev/lowerVG/lvmPool
> >    VG Name               lvm
> >    PV Size               <3.00 TiB / not usable 3.00 MiB
> >    Allocatable           yes
> >    PE Size               4.00 MiB
> >    Total PE              786431
> >    Free PE               82943
> >    Allocated PE          703488
> >    PV UUID               7p3LSU-EAHd-xUg0-r9vT-Gzkf-tYFV-mvlU1M
> > 
> >    --- Physical Segments ---
> >    Physical extent 0 to 159999:
> >      Logical volume      /dev/lvm/brokenDisk
> >      Logical extents     0 to 159999
> >    Physical extent 160000 to 339199:
> >      Logical volume      /dev/lvm/a
> >      Logical extents     0 to 179199
> >    Physical extent 339200 to 349439:
> >      Logical volume      /dev/lvm/brokenDisk
> >      Logical extents     160000 to 170239
> >    Physical extent 349440 to 351999:
> >      FREE
> >    Physical extent 352000 to 460026:
> >      Logical volume      /dev/lvm/brokenDisk
> >      Logical extents     416261 to 524287
> >    Physical extent 460027 to 540409:
> >      FREE
> >    Physical extent 540410 to 786430:
> >      Logical volume      /dev/lvm/brokenDisk
> >      Logical extents     170240 to 416260

Please provide the following from the guest that activates /dev/lvm/brokenDisk:

lsblk
dmsetup table

Please also provide the same from the host (just for completeness).
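
If it helps to gather both outputs the same way on guest and host, a
minimal sketch along these lines could be used. The helper name collect
and its parameters are hypothetical; only the two commands (lsblk and
dmsetup table) come from the request above, and dmsetup will typically
need root.

```python
import shutil
import subprocess

# The two commands requested above; nothing else is collected.
COMMANDS = [["lsblk"], ["dmsetup", "table"]]


def collect(commands=COMMANDS, run=subprocess.run, which=shutil.which):
    """Run each command and return one text report with all outputs.

    A command that is not installed is noted in the report instead of
    raising, so the rest of the report is still produced.
    """
    report = []
    for argv in commands:
        header = "$ " + " ".join(argv)
        if which(argv[0]) is None:
            report.append(header + "\n(command not found)")
            continue
        # No check=True: a non-zero exit (e.g. missing privileges) is kept
        # in the report via stderr rather than aborting the collection.
        result = run(argv, capture_output=True, text=True)
        report.append((header + "\n" + result.stdout + result.stderr).rstrip())
    return "\n\n".join(report)
```

Calling print(collect()) once in the guest and once on the host gives two
reports that can be pasted into a reply as-is.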

Also, I didn't see any kernel logs that show DM-specific errors.  I
doubt you'd have left any DM-specific errors out in your report.  So
is btrfs the canary here?  To be clear: You're only seeing btrfs
errors in the kernel log?

Mike
