From: Josef Bacik <josef@toxicpanda.com>
To: Anand Jain <anand.jain@oracle.com>
Cc: linux-btrfs@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH] btrfs: free device if we fail to open it
Date: Mon, 6 Dec 2021 14:16:24 -0500 [thread overview]
Message-ID: <Ya5hiA1MIOTtyFwZ@localhost.localdomain> (raw)
In-Reply-To: <3e1ca8f0-e817-6314-90eb-84b10e03ee77@oracle.com>
On Mon, Dec 06, 2021 at 10:32:07PM +0800, Anand Jain wrote:
>
> <snip>
>
>
> > > I got it. It shouldn't be difficult to reproduce and, I could reproduce.
> > > Without this patch.
> > >
> > >
> > > Below is a device with two different paths. dm and its mapper.
> > >
> > > ----------
> > > $ ls -bli /dev/mapper/vg-scratch1 /dev/dm-1
> > > 561 brw-rw---- 1 root disk 252, 1 Dec 3 12:13 /dev/dm-1
> > > 565 lrwxrwxrwx 1 root root 7 Dec 3 12:13 /dev/mapper/vg-scratch1 ->
> > > ../dm-1
> > > ----------
> > >
> > > Clean the fs_devices.
> > >
> > > ----------
> > > $ btrfs dev scan --forget
> > > ----------
> > >
> > > Use the mapper to do mkfs.btrfs.
> > >
> > > ----------
> > > $ mkfs.btrfs -fq /dev/mapper/vg-scratch0
> > > $ mount /dev/mapper/vg-scratch0 /btrfs
> > > ----------
> > >
> > > Crete raid1 again using mapper path.
> > >
> > > ----------
> > > $ mkfs.btrfs -U $uuid -fq -draid1 -mraid1 /dev/mapper/vg-scratch1
> > > /dev/mapper/vg-scratch2
> > > ----------
> > >
> > > Use dm path to add the device which belongs to another btrfs filesystem.
> > >
> > > ----------
> > > $ btrfs dev add -f /dev/dm-1 /btrfs
> > > ----------
> > >
> > > Now mount the above raid1 in degraded mode.
> > >
> > > ----------
> > > $ mount -o degraded /dev/mapper/vg-scratch2 /btrfs1
> > > ----------
> > >
> >
> > Ahhh nice, I couldn't figure out a way to trigger it manually. I wonder if we
> > can figure out a way to do this in xfstests without needing to have your
> > SCRATCH_DEV on lvm already?
>
> Yep. A dm linear on top of a raw device will help. I have a rough draft
> working. I could send it to xfstests if you want?
>
Yes please, that would be helpful.
> > > > Yeah I was a little fuzzy on this. I think *any* failure should mean that we
> > > > remove the device from the fs_devices tho right? So that we show we're missing
> > > > a device, since we can't actually access it? I'm actually asking, because I
> > > > think we can go either way, but to me I think any failure sure result in the
> > > > removal of the device so we can re-scan the correct one. Thanks,
> > > >
> > >
> > > It is difficult to generalize, I guess. For example, consider the transient
> > > errors during the boot-up and the errors due to slow to-ready devices or the
> > > system-related errors such as ENOMEM/EACCES, all these does not call for
> > > device-free. If we free the device for transient errors, any further attempt
> > > to mount will fail unless it is device-scan again.
> > >
> > > Here the bug is about btrfs_free_stale_devices() which failed to identify
> > > the same device when tricked by mixing the dm and mapper paths.
> > > Can I check with you if there is another way to fix this by checking the
> > > device major and min number or the serial number from the device inquiry
> > > page?
> >
>
> > I suppose I could just change it so that our verification proceses, like the
> > MAGIC or FSID checks, return ENODATA and we only do it in those cases. Does
> > that seem reasonable?
>
> The 'btrfs device add' calls btrfs_free_stale_devices(), however
> device_path_matched() fails to match the device by its path. So IMO, fix has
> to be in device_path_matched() but with a different parameter to match
> instead of device path.
>
> Here is another manifestation of the same problem.
>
> $ mkfs.btrfs -fq /dev/dm-0
> $ cat /proc/fs/btrfs/devlist | grep device:
> device: /dev/dm-0
>
> $ btrfs dev scan --forget /dev/dm-0
> ERROR: cannot unregister device '/dev/mapper/tsdb': No such file or
> directory
>
> Above, mkfs.btrfs does not sanitizes the input device path however, the
> forget command sanitizes the input device to /dev/mapper/tsdb and the
> device_path_matched() in btrfs_free_stale_devices() fails. So fix has to be
> in device_path_matched() and, why not replace strcmp() with compare major
> and minor device numbers?
I like that better actually, you mind wiring it up since it's your idea?
Thanks,
Josef
next prev parent reply other threads:[~2021-12-06 19:16 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-12-01 21:18 [PATCH] btrfs: free device if we fail to open it Josef Bacik
2021-12-02 7:09 ` Anand Jain
2021-12-02 16:02 ` Josef Bacik
2021-12-03 8:31 ` Anand Jain
2021-12-03 15:08 ` Josef Bacik
2021-12-06 14:32 ` Anand Jain
2021-12-06 19:16 ` Josef Bacik [this message]
2021-12-06 22:23 ` Anand Jain
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Ya5hiA1MIOTtyFwZ@localhost.localdomain \
--to=josef@toxicpanda.com \
--cc=anand.jain@oracle.com \
--cc=kernel-team@fb.com \
--cc=linux-btrfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.