From: Qu Wenruo <quwenruo@cn.fujitsu.com>
To: <kreijack@inwind.it>, <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH 1/5] Avoid to consider lvm snapshots when scanning devices.
Date: Tue, 9 Dec 2014 08:32:20 +0800 [thread overview]
Message-ID: <54864314.1000403@cn.fujitsu.com> (raw)
In-Reply-To: <5485BC81.5000906@gmail.com>
Hi Goffredo
On 12/08/2014 03:02 AM, Qu Wenruo wrote:
>> -------- Original Message --------
>> Subject: [PATCH 1/5] Avoid to consider lvm snapshots when scanning devices.
>> From: Goffredo Baroncelli <kreijack@gmail.com>
>> To: <linux-btrfs@vger.kernel.org>
>> Date: 2014年12月05日 02:39
>>> LVM snapshots create a problem to the btrfs devices management.
>>> BTRFS assumes that each device haw an unique 'device UUID'.
>>> A LVM snapshot breaks this assumption.
>>>
>>> This patch skips LVM snapshots during the device scan phase.
>>> If you need to consider a LVM snapshot you have to set the
>>> environmental variable BTRFS_SKIP_LVM_SNAPSHOT to "no".
>> IMHO, it is better only to skip LVM snapshot if and only if the snapshot contains a btrfs with
>> conflicting UUID.
> Hi Qu,
>
> Currently the "scan phase" in btrfs is done 1 device at time.
> (udev finds a new device and starts "btrfs dev scan <device>"),
> and I haven't changed that. This means that btrfs[-prog] doesn't
> know which devices are already registered, or not. And if even
> it would know this information, you have to consider the case
> where the snapshot appears before the real target. So
> btrfs[-prog] is not in position to perform this analysis [see
> below my other comment]
>
>> Since LVM is such a flexible block level volume management, it is possible that some one
>> did a snapshot of a btrfs fs, and then reformat the original one to other fs.
>> In that case, the LVM snapshot skip seems overkilled.
>>
>> Also, personally, I prefer there will be some option like -i to allow user to choose which device is
>> used when conflicting uuid is detected. This seems to be the best case and user can have the full control
>> on device scan. This also makes the environment variant not needed.
>>
>> LVM snapshot skip (only when conflicting) is better to be the fallback behavior if -i is not given.
> I understood your reasons, but I don't find any solution
> compatible with the current btrfs device registration model
> (asynchronously when the device appears).
>
> In another patch set, I proposed a mount.btrfs helper which is
> in position to perform these analysis and to pick the "right"
> device (even with the user suggestion).
>
> Today the lvm-snapshot and btrfs behave very poor: it is not
> predictable which device is pick (the original or the snapshot).
> These patch *avoid* most problems skipping the snapshots, which
> to me seems a reasonable default.
> For the other case the user is still able to mount any disks
> [combination] passing them directly via command line (
> mount /dev/sdX -o device=/dev/sdY,device=/dev/sdz... );
>
> Anyway I think for these kind of setup (btrfs on lvm-snapshot),
> passing the disks explicitly is the only solution; in fact your
> suggestion about the '-i' switch is not very different.
>
>> Thanks,
>> Qu
> BR
> G.Baroncelli
Your explains sounds quite reasonable, it's good enough before any
better solutions.
Thanks,
Qu
>>> To check if a device is a LVM snapshot, it is checked the
>>> 'udev' device property 'DM_UDEV_LOW_PRIORITY_FLAG' .
>>> If it is set to 1, the device has to be skipped.
>>>
>>> As conseguence, btrfs now depends by libudev.
>>>
>>> Programmatically you can control this behavior with the functions:
>>> - btrfs_scan_set_skip_lvm_snapshot(int new_value)
>>> - int btrfs_scan_get_skip_lvm_snapshot( )
>>>
>>> Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
>>> ---
>>> Makefile | 4 +--
>>> utils.c | 107 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>> utils.h | 9 +++++-
>>> 3 files changed, 117 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/Makefile b/Makefile
>>> index 4cae30c..9464361 100644
>>> --- a/Makefile
>>> +++ b/Makefile
>>> @@ -26,7 +26,7 @@ TESTS = fsck-tests.sh convert-tests.sh
>>> INSTALL = install
>>> prefix ?= /usr/local
>>> bindir = $(prefix)/bin
>>> -lib_LIBS = -luuid -lblkid -lm -lz -llzo2 -L.
>>> +lib_LIBS = -luuid -lblkid -lm -lz -ludev -llzo2 -L.
>>> libdir ?= $(prefix)/lib
>>> incdir = $(prefix)/include/btrfs
>>> LIBS = $(lib_LIBS) $(libs_static)
>>> @@ -99,7 +99,7 @@ lib_links = libbtrfs.so.0 libbtrfs.so
>>> headers = $(libbtrfs_headers)
>>> # make C=1 to enable sparse
>>> -check_defs := .cc-defines.h
>>> +check_defs := .cc-defines.h
>>> ifdef C
>>> #
>>> # We're trying to use sparse against glibc headers which go wild
>>> diff --git a/utils.c b/utils.c
>>> index 2a92416..9887f8b 100644
>>> --- a/utils.c
>>> +++ b/utils.c
>>> @@ -29,6 +29,7 @@
>>> #include <sys/types.h>
>>> #include <sys/stat.h>
>>> #include <uuid/uuid.h>
>>> +#include <libudev.h>
>>> #include <fcntl.h>
>>> #include <unistd.h>
>>> #include <mntent.h>
>>> @@ -52,6 +53,13 @@
>>> #define BLKDISCARD _IO(0x12,119)
>>> #endif
>>> +/*
>>> + * This variable controls if the lvm snapshot have to be skipped or not.
>>> + * Access this variable only via the btrfs_scan_[sg]et_skip_lvm_snapshot()
>>> + * functions
>>> + */
>>> +static int __scan_device_skip_lvm_snapshot = -1;
>>> +
>>> static int btrfs_scan_done = 0;
>>> static char argv0_buf[ARGV0_BUF_SIZE] = "btrfs";
>>> @@ -1593,6 +1601,9 @@ int btrfs_scan_block_devices(int run_ioctl)
>>> char fullpath[110];
>>> int scans = 0;
>>> int special;
>>> + int skip_snapshot;
>>> +
>>> + skip_snapshot = btrfs_scan_get_skip_lvm_snapshot();
>>> scan_again:
>>> proc_partitions = fopen("/proc/partitions","r");
>>> @@ -1642,6 +1653,9 @@ scan_again:
>>> continue;
>>> }
>>> + if (skip_snapshot && is_low_priority_device(fullpath))
>>> + continue;
>>> +
>>> fd = open(fullpath, O_RDONLY);
>>> if (fd < 0) {
>>> if (errno != ENOMEDIUM)
>>> @@ -2182,6 +2196,30 @@ int test_dev_for_mkfs(char *file, int force_overwrite, char *estr)
>>> return 0;
>>> }
>>> +int btrfs_scan_get_skip_lvm_snapshot( )
>>> +{
>>> + const char *value;
>>> +
>>> + if (__scan_device_skip_lvm_snapshot != -1 )
>>> + return __scan_device_skip_lvm_snapshot;
>>> +
>>> + value = getenv(BTRFS_SKIP_LVM_SNAPSHOT_ENV_NAME);
>>> + if (value && !strcasecmp(value, "NO"))
>>> + __scan_device_skip_lvm_snapshot = 0;
>>> + else if (value && !strcasecmp(value, "YES"))
>>> + __scan_device_skip_lvm_snapshot = 1;
>>> + else
>>> + __scan_device_skip_lvm_snapshot =
>>> + BTRFS_SKIP_LVM_SNAPSHOT_DEFAULT;
>>> +
>>> + return __scan_device_skip_lvm_snapshot;
>>> +}
>>> +
>>> +void btrfs_scan_set_skip_lvm_snapshot(int new_value)
>>> +{
>>> + __scan_device_skip_lvm_snapshot = !!new_value;
>>> +}
>>> +
>>> int btrfs_scan_lblkid()
>>> {
>>> int fd = -1;
>>> @@ -2192,6 +2230,9 @@ int btrfs_scan_lblkid()
>>> blkid_dev dev = NULL;
>>> blkid_cache cache = NULL;
>>> char path[PATH_MAX];
>>> + int skip_snapshot;
>>> +
>>> + skip_snapshot = btrfs_scan_get_skip_lvm_snapshot();
>>> if (btrfs_scan_done)
>>> return 0;
>>> @@ -2210,6 +2251,9 @@ int btrfs_scan_lblkid()
>>> /* if we are here its definitely a btrfs disk*/
>>> strncpy(path, blkid_dev_devname(dev), PATH_MAX);
>>> + if (skip_snapshot && is_low_priority_device(path))
>>> + continue;
>>> +
>>> fd = open(path, O_RDONLY);
>>> if (fd < 0) {
>>> printf("ERROR: could not open %s\n", path);
>>> @@ -2450,3 +2494,66 @@ int find_next_key(struct btrfs_path *path, struct btrfs_key *key)
>>> }
>>> return 1;
>>> }
>>> +
>>> +/*
>>> + * This function return 1 if the device (path) is consdered "LOW_PRIORITY" by
>>> + * LVM2 library. These device are typically the nsapshot.
>>> + * This function return < 0 in case of error; 0 otherwise.
>>> + */
>>> +int is_low_priority_device(const char *path)
>>> +{
>>> +
>>> + struct udev *udev=NULL;
>>> + struct udev_device *dev;
>>> + struct udev_enumerate *enumerate=NULL;
>>> + struct udev_list_entry *devices;
>>> + struct udev_list_entry *node, *list;
>>> + int ret=-1;
>>> + const char *value, *syspath;
>>> + char *rpath=NULL;
>>> +
>>> + rpath = realpath(path, NULL);
>>> + if (!rpath) {
>>> + fprintf(stderr, "ERROR: not enough memory\n");
>>> + ret=-2;
>>> + goto exit;
>>> + }
>>> +
>>> + /* Create the udev object */
>>> + udev = udev_new();
>>> + if (!udev) {
>>> + fprintf(stderr, "ERROR: Can't create udev\n");
>>> + ret=-1;
>>> + goto exit;
>>> + }
>>> +
>>> + enumerate = udev_enumerate_new(udev);
>>> + udev_enumerate_add_match_subsystem(enumerate, "block");
>>> + udev_enumerate_add_match_property(enumerate, "DEVNAME", rpath);
>>> + udev_enumerate_scan_devices(enumerate);
>>> +
>>> + devices = udev_enumerate_get_list_entry(enumerate);
>>> + syspath = udev_list_entry_get_name(devices);
>>> +
>>> + dev = udev_device_new_from_syspath(udev, syspath);
>>> +
>>> + list = udev_device_get_properties_list_entry (dev);
>>> +
>>> + ret = 0;
>>> + node = udev_list_entry_get_by_name(list, "DM_UDEV_LOW_PRIORITY_FLAG");
>>> + if (node) {
>>> + value = udev_list_entry_get_value(node);
>>> + if (value && !strcmp(value, "1"))
>>> + ret = 1;
>>> + }
>>> +
>>> +exit:
>>> + free(rpath);
>>> +
>>> + /* Free the enumerator object */
>>> + udev_enumerate_unref(enumerate);
>>> +
>>> + udev_unref(udev);
>>> +
>>> + return ret;
>>> +}
>>> diff --git a/utils.h b/utils.h
>>> index 289e86b..9855f09 100644
>>> --- a/utils.h
>>> +++ b/utils.h
>>> @@ -128,7 +128,7 @@ int csum_tree_block(struct btrfs_root *root, struct extent_buffer *buf,
>>> int verify);
>>> int ask_user(char *question);
>>> int lookup_ino_rootid(int fd, u64 *rootid);
>>> -int btrfs_scan_lblkid(void);
>>> +int btrfs_scan_lblkid();
>>> int get_btrfs_mount(const char *dev, char *mp, size_t mp_size);
>>> int find_mount_root(const char *path, char **mount_root);
>>> int get_device_info(int fd, u64 devid,
>>> @@ -161,4 +161,11 @@ static inline u64 btrfs_min_dev_size(u32 leafsize)
>>> int find_next_key(struct btrfs_path *path, struct btrfs_key *key);
>>> +#define BTRFS_SKIP_LVM_SNAPSHOT_DEFAULT 1
>>> +#define BTRFS_SKIP_LVM_SNAPSHOT_ENV_NAME "BTRFS_SKIP_LVM_SNAPSHOT"
>>> +
>>> +int is_low_priority_device(const char *path);
>>> +void btrfs_scan_set_skip_lvm_snapshot(int new_value);
>>> +int btrfs_scan_get_skip_lvm_snapshot( );
>>> +
>>> #endif
>>
>
next prev parent reply other threads:[~2014-12-09 0:32 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-12-04 18:39 [PATCH V2][BTRFS-PROGS] Don't use LVM snapshot device Goffredo Baroncelli
2014-12-04 18:39 ` [PATCH 1/5] Avoid to consider lvm snapshots when scanning devices Goffredo Baroncelli
2014-12-08 2:02 ` Qu Wenruo
2014-12-08 14:58 ` Goffredo Baroncelli
2014-12-09 0:32 ` Qu Wenruo [this message]
2014-12-09 10:27 ` David Sterba
2014-12-09 18:19 ` Goffredo Baroncelli
2014-12-04 18:39 ` [PATCH 2/5] 'btrfs device scan' skips lvm snapshots Goffredo Baroncelli
2014-12-04 18:39 ` [PATCH 3/5] Update 'btrfs device scan' man page Goffredo Baroncelli
2014-12-04 18:39 ` [PATCH 4/5] Add reference to BTRFS_SKIP_LVM_SNAPSHOT environment variable Goffredo Baroncelli
2014-12-04 18:39 ` [PATCH 5/5] Abort in case of device uuid conflict Goffredo Baroncelli
2014-12-05 7:26 ` [PATCH V2][BTRFS-PROGS] Don't use LVM snapshot device Duncan
2014-12-05 18:39 ` Goffredo Baroncelli
2014-12-08 15:30 ` Phillip Susi
2014-12-08 17:36 ` Goffredo Baroncelli
2014-12-08 18:17 ` Phillip Susi
2014-12-08 19:22 ` Robert White
2014-12-10 7:52 ` Anand Jain
2014-12-10 18:40 ` Goffredo Baroncelli
-- strict thread matches above, loose matches on Subject: below --
2014-12-04 18:24 [PATCH][BTRFS-PROGS] Skip " Goffredo Baroncelli
2014-12-04 18:24 ` [PATCH 1/5] Avoid to consider lvm snapshots when scanning devices Goffredo Baroncelli
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=54864314.1000403@cn.fujitsu.com \
--to=quwenruo@cn.fujitsu.com \
--cc=kreijack@inwind.it \
--cc=linux-btrfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.