From: Josef Bacik <josef@toxicpanda.com>
To: Anand Jain <anand.jain@oracle.com>
Cc: linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH] btrfs: free device if we fail to open it
Date: Mon, 6 Dec 2021 14:16:24 -0500 [thread overview]
Message-ID: <Ya5hiA1MIOTtyFwZ@localhost.localdomain> (raw)
In-Reply-To: <3e1ca8f0-e817-6314-90eb-84b10e03ee77@oracle.com>
On Mon, Dec 06, 2021 at 10:32:07PM +0800, Anand Jain wrote:
>
> <snip>
>
>
> > > I got it. It shouldn't be difficult to reproduce and, I could reproduce.
> > > Without this patch.
> > >
> > >
> > > Below is a device with two different paths. dm and its mapper.
> > >
> > > ----------
> > > $ ls -bli /dev/mapper/vg-scratch1 /dev/dm-1
> > > 561 brw-rw---- 1 root disk 252, 1 Dec 3 12:13 /dev/dm-1
> > > 565 lrwxrwxrwx 1 root root 7 Dec 3 12:13 /dev/mapper/vg-scratch1 ->
> > > ../dm-1
> > > ----------
> > >
> > > Clean the fs_devices.
> > >
> > > ----------
> > > $ btrfs dev scan --forget
> > > ----------
> > >
> > > Use the mapper to do mkfs.btrfs.
> > >
> > > ----------
> > > $ mkfs.btrfs -fq /dev/mapper/vg-scratch0
> > > $ mount /dev/mapper/vg-scratch0 /btrfs
> > > ----------
> > >
> > > Crete raid1 again using mapper path.
> > >
> > > ----------
> > > $ mkfs.btrfs -U $uuid -fq -draid1 -mraid1 /dev/mapper/vg-scratch1
> > > /dev/mapper/vg-scratch2
> > > ----------
> > >
> > > Use dm path to add the device which belongs to another btrfs filesystem.
> > >
> > > ----------
> > > $ btrfs dev add -f /dev/dm-1 /btrfs
> > > ----------
> > >
> > > Now mount the above raid1 in degraded mode.
> > >
> > > ----------
> > > $ mount -o degraded /dev/mapper/vg-scratch2 /btrfs1
> > > ----------
> > >
> >
> > Ahhh nice, I couldn't figure out a way to trigger it manually. I wonder if we
> > can figure out a way to do this in xfstests without needing to have your
> > SCRATCH_DEV on lvm already?
>
> Yep. A dm linear on top of a raw device will help. I have a rough draft
> working. I could send it to xfstests if you want?
>
Yes please, that would be helpful.
> > > > Yeah I was a little fuzzy on this. I think *any* failure should mean that we
> > > > remove the device from the fs_devices tho right? So that we show we're missing
> > > > a device, since we can't actually access it? I'm actually asking, because I
> > > > think we can go either way, but to me I think any failure sure result in the
> > > > removal of the device so we can re-scan the correct one. Thanks,
> > > >
> > >
> > > It is difficult to generalize, I guess. For example, consider the transient
> > > errors during the boot-up and the errors due to slow to-ready devices or the
> > > system-related errors such as ENOMEM/EACCES, all these does not call for
> > > device-free. If we free the device for transient errors, any further attempt
> > > to mount will fail unless it is device-scan again.
> > >
> > > Here the bug is about btrfs_free_stale_devices() which failed to identify
> > > the same device when tricked by mixing the dm and mapper paths.
> > > Can I check with you if there is another way to fix this by checking the
> > > device major and min number or the serial number from the device inquiry
> > > page?
> >
>
> > I suppose I could just change it so that our verification proceses, like the
> > MAGIC or FSID checks, return ENODATA and we only do it in those cases. Does
> > that seem reasonable?
>
> The 'btrfs device add' calls btrfs_free_stale_devices(), however
> device_path_matched() fails to match the device by its path. So IMO, fix has
> to be in device_path_matched() but with a different parameter to match
> instead of device path.
>
> Here is another manifestation of the same problem.
>
> $ mkfs.btrfs -fq /dev/dm-0
> $ cat /proc/fs/btrfs/devlist | grep device:
> device: /dev/dm-0
>
> $ btrfs dev scan --forget /dev/dm-0
> ERROR: cannot unregister device '/dev/mapper/tsdb': No such file or
> directory
>
> Above, mkfs.btrfs does not sanitizes the input device path however, the
> forget command sanitizes the input device to /dev/mapper/tsdb and the
> device_path_matched() in btrfs_free_stale_devices() fails. So fix has to be
> in device_path_matched() and, why not replace strcmp() with compare major
> and minor device numbers?
I like that better actually, you mind wiring it up since it's your idea?
Thanks,
Josef
next prev parent reply other threads:[~2021-12-06 19:16 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-12-01 21:18 [PATCH] btrfs: free device if we fail to open it Josef Bacik
2021-12-02 7:09 ` Anand Jain
2021-12-02 16:02 ` Josef Bacik
2021-12-03 8:31 ` Anand Jain
2021-12-03 15:08 ` Josef Bacik
2021-12-06 14:32 ` Anand Jain
2021-12-06 19:16 ` Josef Bacik [this message]
2021-12-06 22:23 ` Anand Jain
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Ya5hiA1MIOTtyFwZ@localhost.localdomain \
--to=josef@toxicpanda.com \
--cc=anand.jain@oracle.com \
--cc=kernel-team@fb.com \
--cc=linux-btrfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox