* [RFC PATCH 0/5] btrfs-progs: snapshot diff function
From: Jeff Liu @ 2012-08-07 8:56 UTC (permalink / raw)
To: linux-btrfs@vger.kernel.org
Hello,
I wrote a prototype of a snapshot diff utility many months ago.
It was originally meant to analyze the differences between two snapshots that are
derived from the same subvolume/snapshot.
The upstream LXC userland tools now ship a dedicated template that creates new
containers using the btrfs subvolume/snapshot create function, so this patch set
might also be useful to someone stuck with a broken container guest (one that was
cloned from a healthy guest but no longer works after some configuration changes).
In that case, this feature can help investigate the root cause by listing the files
that have changed in the snapshot on which the container resides.
This patch set currently detects three kinds of change:
- new_file: files newly created in the destination snapshot.
- removed_file: files that still reside in the source subvolume/snapshot but have
been removed from the destination.
- updated_file: files that reside in both subvolumes/snapshots but may have changed.
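The three categories can be sketched as a small decision function. This is only an illustration under assumed names: `enum change_kind`, `struct file_stamp` and `classify_change()` are hypothetical and not part of the patch set. It mirrors the rule used in the series that a path present on both sides counts as updated only when its mtime or ctime differs.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical names, for illustration only; not part of the patch set. */
enum change_kind { UNCHANGED, NEW_FILE, REMOVED_FILE, UPDATED_FILE };

struct file_stamp {
	bool   present;		/* does the path exist in this snapshot? */
	time_t mtime_sec;	/* last data modification */
	time_t ctime_sec;	/* last inode change */
};

/* Classify one path from what was observed in source and destination. */
static enum change_kind classify_change(struct file_stamp src,
					struct file_stamp dst)
{
	if (!src.present && dst.present)
		return NEW_FILE;
	if (src.present && !dst.present)
		return REMOVED_FILE;
	if (src.present && dst.present &&
	    (src.mtime_sec != dst.mtime_sec ||
	     src.ctime_sec != dst.ctime_sec))
		return UPDATED_FILE;
	return UNCHANGED;
}
```

Timestamps alone cannot tell whether the content really differs, which is why the third category is described as "may have changed".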
Currently, the user can diff any two subvolumes/snapshots. If the destination
snapshot is not derived from the same subvolume/snapshot as the source, the results
will be surprising, so it would be better to add a pre-check for that if possible.
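One possible shape for such a pre-check is sketched here. It is purely hypothetical: it assumes each subvolume's own UUID and parent UUID have already been obtained by some means (the patch set does not do this), and `struct subvol_ident`, `uuid_is_zero()` and `snapshots_related()` are invented names.

```c
#include <stdbool.h>
#include <string.h>

#define UUID_LEN 16

/* Hypothetical identity record; the patch set has no such structure. */
struct subvol_ident {
	unsigned char uuid[UUID_LEN];		/* this subvolume/snapshot */
	unsigned char parent_uuid[UUID_LEN];	/* all zero if no parent */
};

static bool uuid_is_zero(const unsigned char *u)
{
	static const unsigned char zero[UUID_LEN];

	return memcmp(u, zero, UUID_LEN) == 0;
}

/*
 * Related if one was snapshotted directly from the other, or if both
 * share the same non-zero parent. Deeper ancestry is not checked.
 */
static bool snapshots_related(const struct subvol_ident *a,
			      const struct subvol_ident *b)
{
	if (!uuid_is_zero(a->parent_uuid) &&
	    memcmp(a->parent_uuid, b->uuid, UUID_LEN) == 0)
		return true;
	if (!uuid_is_zero(b->parent_uuid) &&
	    memcmp(b->parent_uuid, a->uuid, UUID_LEN) == 0)
		return true;
	return !uuid_is_zero(a->parent_uuid) &&
	       memcmp(a->parent_uuid, b->parent_uuid, UUID_LEN) == 0;
}
```

A check like this would let the tool warn or refuse before producing a diff between unrelated trees.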
Another issue:
- if we create new files or update existing files in the source snapshot,
they will be reported as REMOVED/UPDATED relative to the source from the destination
snapshot's point of view, so the results might look a bit strange.
A quick demo:
root@kdev:/btrfs# btrfs subvolume diff-snapshot one two
[REMOVED REGFILE] one/regfile_in_one objectid 264 transid 50
[REMOVED DIR] one/dir_02_at_one objectid 262 transid 36
[REMOVED REGFILE] one/dir_02_at_one/file_at_dir02_one objectid 263 transid 37
[REMOVED DIR] one/dir_at_one objectid 258 transid 29
[REMOVED REGFILE] one/dir_at_one/file_02_at_one_dir objectid 260 transid 32
[REMOVED REGFILE] one/dir_at_one/file_03_at_one_dir objectid 261 transid 35
[REMOVED REGFILE] one/dir_at_one/file_at_one_dir objectid 259 transid 30
[REMOVED REGFILE] one/file_at_one objectid 257 transid 26
[NEW REGFILE] two/regfile_in_two objectid 265 transid 50
[NEW DIR] two/dir_at_two objectid 262 transid 40
[NEW REGFILE] two/dir_at_two/file01_at_dir_of_two objectid 263 transid 41
[NEW SYMLINK] two/dir_at_two/passwd objectid 264 transid 42
[NEW REGFILE] two/file_02 objectid 258 transid 23
[NEW REGFILE] two/file_03 objectid 270 transid 68
[NEW REGFILE] two/file_04 objectid 275 transid 68
Any comments are appreciated!
Thanks,
-Jeff
Makefile | 2 +-
btrfs-list.c | 3 +-
cmds-subvolume.c | 90 +++++
diff-snapshot.c | 1026 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
diff-snapshot.h | 47 +++
5 files changed, 1165 insertions(+), 3 deletions(-)
* [RFC PATCH 1/5] btrfs-progs: make ino_resolve() shared
From: Jeff Liu @ 2012-08-07 8:57 UTC (permalink / raw)
To: linux-btrfs@vger.kernel.org
Make ino_resolve() non-static so that the snapshot diff module can call it.
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
---
btrfs-list.c | 3 +--
btrfs-list.h | 22 ++++++++++++++++++++++
2 files changed, 23 insertions(+), 2 deletions(-)
create mode 100644 btrfs-list.h
diff --git a/btrfs-list.c b/btrfs-list.c
index c53d016..75681e1 100644
--- a/btrfs-list.c
+++ b/btrfs-list.c
@@ -474,8 +474,7 @@ char *build_name(char *dirid, char *name)
* cache the results so we can avoid tree searches if a later call goes
* to the same directory or file name
*/
-static char *ino_resolve(int fd, u64 ino, u64 *cache_dirid, char **cache_name)
-
+char *ino_resolve(int fd, u64 ino, u64 *cache_dirid, char **cache_name)
{
u64 dirid;
char *dirname;
diff --git a/btrfs-list.h b/btrfs-list.h
new file mode 100644
index 0000000..70b2078
--- /dev/null
+++ b/btrfs-list.h
@@ -0,0 +1,22 @@
+/*
+ * Copyright (C) 2007 Oracle. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License v2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; if not, write to the
+ * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
+ * Boston, MA 02111-1307, USA.
+ */
+
+#ifndef __BTRFS_LIST_
+#define __BTRFS_LIST_
+char *ino_resolve(int fd, u64 ino, u64 *cache_dirid, char **cache_name);
+#endif
--
1.7.4.1
* [RFC PATCH 2/5] btrfs-progs: header file of snapshot diff
From: Jeff Liu @ 2012-08-07 8:57 UTC (permalink / raw)
To: linux-btrfs@vger.kernel.org
Maybe it would be better to put these #defines in the snapshot diff source file, since no
other module needs them.
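For reference, the flag tests in this header reduce to simple bitmask checks. The sketch below repeats the header's flag values (with the macro argument parenthesized) and adds a hypothetical helper, `shown_kinds()`, which is not part of the patch, to show that `SNAPSHOT_DIFF_LIST_ALL` alone enables every category.

```c
#define SNAPSHOT_DIFF_LIST_NEW_ITEM	(1 << 0)
#define SNAPSHOT_DIFF_LIST_REMOVED_ITEM	(1 << 1)
#define SNAPSHOT_DIFF_LIST_UPDATED_ITEM	(1 << 2)
#define SNAPSHOT_DIFF_LIST_ALL		(1 << 3)

#define SNAPSHOT_DIFF_SHOW_NEW_ITEM(flags) \
	(((flags) & SNAPSHOT_DIFF_LIST_NEW_ITEM) || \
	 ((flags) & SNAPSHOT_DIFF_LIST_ALL))

#define SNAPSHOT_DIFF_SHOW_REMOVED_ITEM(flags) \
	(((flags) & SNAPSHOT_DIFF_LIST_REMOVED_ITEM) || \
	 ((flags) & SNAPSHOT_DIFF_LIST_ALL))

#define SNAPSHOT_DIFF_SHOW_UPDATED_ITEM(flags) \
	(((flags) & SNAPSHOT_DIFF_LIST_UPDATED_ITEM) || \
	 ((flags) & SNAPSHOT_DIFF_LIST_ALL))

/* Hypothetical helper: count how many categories a flag word enables. */
static int shown_kinds(unsigned int flags)
{
	return !!SNAPSHOT_DIFF_SHOW_NEW_ITEM(flags) +
	       !!SNAPSHOT_DIFF_SHOW_REMOVED_ITEM(flags) +
	       !!SNAPSHOT_DIFF_SHOW_UPDATED_ITEM(flags);
}
```

Parenthesizing `flags` matters when a caller passes an expression such as `a | b`, because `&` binds more tightly than `|`.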
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
---
diff-snapshot.h | 47 +++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 47 insertions(+), 0 deletions(-)
create mode 100644 diff-snapshot.h
diff --git a/diff-snapshot.h b/diff-snapshot.h
new file mode 100644
index 0000000..0ba09da
--- /dev/null
+++ b/diff-snapshot.h
@@ -0,0 +1,47 @@
+/*
+ * Copyright (C) 2012 Oracle. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License v2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; if not, write to the
+ * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
+ * Boston, MA 02111-1307, USA.
+ */
+
+#ifndef __SNAPSHOT_DIFF_
+#define __SNAPSHOT_DIFF_
+
+#define SNAPSHOT_DIFF_LIST_NEW_ITEM (1 << 0)
+#define SNAPSHOT_DIFF_LIST_REMOVED_ITEM (1 << 1)
+#define SNAPSHOT_DIFF_LIST_UPDATED_ITEM (1 << 2)
+#define SNAPSHOT_DIFF_LIST_ALL (1 << 3)
+
+#define SNAPSHOT_DIFF_SHOW_NEW_ITEM(flags) \
+	(((flags) & SNAPSHOT_DIFF_LIST_NEW_ITEM) || \
+	 ((flags) & SNAPSHOT_DIFF_LIST_ALL))
+
+#define SNAPSHOT_DIFF_SHOW_REMOVED_ITEM(flags) \
+	(((flags) & SNAPSHOT_DIFF_LIST_REMOVED_ITEM) || \
+	 ((flags) & SNAPSHOT_DIFF_LIST_ALL))
+
+#define SNAPSHOT_DIFF_SHOW_UPDATED_ITEM(flags) \
+	(((flags) & SNAPSHOT_DIFF_LIST_UPDATED_ITEM) || \
+	 ((flags) & SNAPSHOT_DIFF_LIST_ALL))
+
+enum {
+ SNAPSHOT_DIFF_NEW_ITEM = 0,
+ SNAPSHOT_DIFF_REMOVED_ITEM,
+ SNAPSHOT_DIFF_UPDATED_ITEM,
+};
+
+int snapshot_diff(int src_fd, int dst_fd, const char *src_snapshot,
+ const char *dest_snapshot, unsigned int diff_flags);
+#endif
--
1.7.4.1
* [RFC PATCH 3/5] btrfs-progs: source file of snapshot diff
From: Jeff Liu @ 2012-08-07 8:57 UTC (permalink / raw)
To: linux-btrfs@vger.kernel.org
This patch adds the source file that implements the snapshot diff.
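As a rough orientation for the file that follows: it caches a batch of paths from each snapshot in a tree keyed by path, walks the source cache looking each path up on the destination side, then walks the destination cache. The sketch below shows that two-pass idea on plain sorted string arrays, with `bsearch()` standing in for the rb-tree lookups. `struct diff_counts`, `has_path()` and `count_diff()` are hypothetical names, and the mtime/ctime comparison is left out.

```c
#include <stdlib.h>
#include <string.h>

struct diff_counts { int removed, common, created; };

/* Compare two array slots, each holding a pointer to a path string. */
static int cmp_path(const void *a, const void *b)
{
	return strcmp(*(const char *const *)a, *(const char *const *)b);
}

static int has_path(const char **set, size_t n, const char *path)
{
	return bsearch(&path, set, n, sizeof(*set), cmp_path) != NULL;
}

/* Both arrays must already be sorted in strcmp() order. */
static struct diff_counts count_diff(const char **src, size_t nsrc,
				     const char **dst, size_t ndst)
{
	struct diff_counts c = { 0, 0, 0 };
	size_t i;

	/* pass 1: every source path is either common or removed */
	for (i = 0; i < nsrc; i++) {
		if (has_path(dst, ndst, src[i]))
			c.common++;
		else
			c.removed++;
	}

	/* pass 2: destination paths unknown to the source are new */
	for (i = 0; i < ndst; i++)
		if (!has_path(src, nsrc, dst[i]))
			c.created++;

	return c;
}
```

The real code additionally has to cope with items that fall into different scan batches on the two sides, which is what the "processed items" tree is for.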
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
---
diff-snapshot.c | 1026 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 1026 insertions(+), 0 deletions(-)
create mode 100644 diff-snapshot.c
diff --git a/diff-snapshot.c b/diff-snapshot.c
new file mode 100644
index 0000000..7b7f4c7
--- /dev/null
+++ b/diff-snapshot.c
@@ -0,0 +1,1026 @@
+/*
+ * Copyright (C) 2012 Oracle. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License v2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; if not, write to the
+ * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
+ * Boston, MA 02111-1307, USA.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdbool.h>
+#include <sys/types.h>
+#include <dirent.h>
+#include <sys/stat.h>
+#include <sys/ioctl.h>
+#include <uuid/uuid.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <assert.h>
+#include "ctree.h"
+#include "ioctl.h"
+#include "utils.h"
+#include "btrfs-list.h"
+#include "diff-snapshot.h"
+
+/*
+ * number of items to scan and cache from each snapshot per round.
+ * FIXME: it may be better to let the user specify this value according
+ * to the memory available. Caching more items per round improves overall
+ * performance because fewer ioctl(2) calls are needed to retrieve items.
+ * Alternatively, a routine could calculate this value from the available
+ * memory, which sounds more reasonable.
+ */
+static unsigned int nr_item_scan = 4096;
+
+struct snapshot_diff_info {
+ /* source snapshot scan info */
+ struct snapshot_scan_info *src_scan_info;
+
+ /* destination snapshot scan info */
+ struct snapshot_scan_info *dest_scan_info;
+};
+
+struct snapshot_scan_info {
+ /* name */
+ char *snapshot;
+
+	/* fd of the snapshot directory */
+ int fd;
+
+ /* snapshot tree id */
+ u64 tree_id;
+
+ /* the latest object id in last round of scan */
+ u64 last_objectid;
+
+	/* rb-tree caching the scanned items */
+ struct rb_root scanned_items;
+
+	/*
+	 * rb-tree caching the items that have already been
+	 * handled by a lookup on this snapshot, so that they are
+	 * not cached again. The memory allocated for an entry is
+	 * reclaimed once the item turns up in a later scan and
+	 * is skipped.
+	 */
+ struct rb_root processed_items;
+
+ /* no more items can be fetched from a snapshot */
+ int scan_done;
+};
+
+/* item info at cache */
+struct snapshot_scan_item {
+ struct rb_node si_node;
+ u64 transid;
+ u64 objectid;
+ char *path;
+ u8 type;
+};
+
+/*
+ * processed items are those that were not returned in the current
+ * round of scan but do reside on the snapshot.
+ */
+struct processed_item {
+ struct rb_node pi_node;
+ char *path;
+};
+
+static const char *decode_item_type(u8 type)
+{
+	const char *typestr = "UNKNOWN";
+
+ switch (type) {
+ case BTRFS_FT_DIR:
+ typestr = "DIR";
+ break;
+ case BTRFS_FT_REG_FILE:
+ typestr = "REGFILE";
+ break;
+ case BTRFS_FT_CHRDEV:
+ typestr = "CHRDEV";
+ break;
+ case BTRFS_FT_BLKDEV:
+ typestr = "BLKDEV";
+ break;
+ case BTRFS_FT_FIFO:
+ typestr = "FIFO";
+ break;
+ case BTRFS_FT_SOCK:
+ typestr = "SOCKET";
+ break;
+ case BTRFS_FT_SYMLINK:
+ typestr = "SYMLINK";
+ break;
+ case BTRFS_FT_XATTR:
+ typestr = "XATTR";
+ break;
+ default:
+ break;
+ }
+
+ return typestr;
+}
+
+/*
+ * we found an item on both snapshots, check if it was updated or not
+ * by comparing ctime && mtime.
+ */
+static inline int item_is_updated(const char *src_snapshot,
+ const char *dest_snapshot,
+ const char *path, u8 item_type,
+ int *updated)
+{
+ char src_full_path[BTRFS_PATH_NAME_MAX + 1];
+ char dest_full_path[BTRFS_PATH_NAME_MAX + 1];
+ struct stat st1;
+ struct stat st2;
+ int ret = 0;
+
+ ret = snprintf(src_full_path, sizeof(src_full_path), "%s/%s",
+ src_snapshot, path);
+ if (ret < 0) {
+ fprintf(stderr, "failed to build full path [%s/%s]\n",
+ src_snapshot, path);
+ goto out;
+ }
+
+ ret = snprintf(dest_full_path, sizeof(dest_full_path), "%s/%s",
+ dest_snapshot, path);
+ if (ret < 0) {
+ fprintf(stderr, "failed to build full path [%s/%s]\n",
+ dest_snapshot, path);
+ goto out;
+ }
+
+ if (item_type == BTRFS_FT_SYMLINK) {
+ if (lstat(src_full_path, &st1) < 0) {
+ fprintf(stderr, "lstat %s failed %s\n",
+ src_full_path, strerror(errno));
+ ret = -1;
+ goto out;
+ }
+
+ if (lstat(dest_full_path, &st2) < 0) {
+ fprintf(stderr, "lstat %s failed %s\n",
+ dest_full_path, strerror(errno));
+ ret = -1;
+ goto out;
+ }
+ } else {
+ if (stat(src_full_path, &st1) < 0) {
+ fprintf(stderr, "stat src path %s failed %s\n",
+ src_full_path, strerror(errno));
+ ret = -1;
+ goto out;
+ }
+
+ if (stat(dest_full_path, &st2) < 0) {
+ fprintf(stderr, "stat dest path %s failed %s\n",
+ dest_full_path, strerror(errno));
+ ret = -1;
+ goto out;
+ }
+ }
+
+	*updated = (st1.st_mtime != st2.st_mtime ||
+		    st1.st_ctime != st2.st_ctime);
+	ret = 0;	/* snprintf() left a positive length in ret */
+out:
+ return ret;
+}
+
+static const char *decode_item_state(unsigned int state)
+{
+	const char *s = "UNKNOWN";	/* never hand NULL to printf("%s") */
+
+ switch (state) {
+ case SNAPSHOT_DIFF_NEW_ITEM:
+ s = "NEW";
+ break;
+ case SNAPSHOT_DIFF_REMOVED_ITEM:
+ s = "REMOVED";
+ break;
+ case SNAPSHOT_DIFF_UPDATED_ITEM:
+ s = "UPDATED";
+ break;
+ default:
+ break;
+ }
+
+ return s;
+}
+
+static inline void print_item_diff_info(struct snapshot_scan_item *item,
+ const char *snapshot,
+ unsigned int state)
+{
+ const char *s = decode_item_state(state);
+ const char *type = decode_item_type(item->type);
+
+ printf("[%s %s] %s/%s objectid %llu transid %llu\n",
+ s, type, snapshot, item->path, item->objectid,
+ item->transid);
+}
+
+static inline int snapshot_diff_init(struct snapshot_diff_info *diff_info)
+
+{
+ diff_info->src_scan_info = malloc(sizeof(struct snapshot_scan_info));
+ if (!diff_info->src_scan_info)
+ return -ENOMEM;
+
+ diff_info->dest_scan_info = malloc(sizeof(struct snapshot_scan_info));
+	if (!diff_info->dest_scan_info) {
+		free(diff_info->src_scan_info);
+		return -ENOMEM;
+	}
+
+ return 0;
+}
+
+static inline int snapshot_item_scan_init(struct snapshot_scan_info *scan_info,
+ const char *snapshot, int fd,
+ u64 tree_id)
+{
+ scan_info->snapshot = strdup(snapshot);
+ if (!scan_info->snapshot)
+ return -ENOMEM;
+
+ scan_info->fd = fd;
+ scan_info->tree_id = tree_id;
+ scan_info->last_objectid = 256;
+ scan_info->scan_done = 0;
+ scan_info->scanned_items = RB_ROOT;
+	scan_info->processed_items = RB_ROOT;
+
+	return 0;
+}
+
+static u64 get_snapshot_tree_id(int fd)
+{
+ struct btrfs_ioctl_ino_lookup_args ino_args;
+ int ret;
+
+ memset(&ino_args, 0, sizeof(ino_args));
+ ino_args.objectid = BTRFS_FIRST_FREE_OBJECTID;
+
+ ret = ioctl(fd, BTRFS_IOC_INO_LOOKUP, &ino_args);
+ if (ret) {
+ fprintf(stderr,
+ "ERROR: Failed to lookup path for dirid %llu - %s\n",
+ (unsigned long long)BTRFS_FIRST_FREE_OBJECTID,
+ strerror(errno));
+ return 0;
+ }
+
+ return ino_args.treeid;
+}
+
+static int cache_processed_item(struct rb_root *root, const char *path)
+{
+ struct rb_node **p = &(root->rb_node);
+ struct rb_node *parent = NULL;
+ struct processed_item *item;
+
+ while (*p) {
+ int ret;
+ struct processed_item *this;
+ this = rb_entry(*p, struct processed_item, pi_node);
+ parent = *p;
+ ret = strcmp(this->path, path);
+ if (ret > 0)
+ p = &(*p)->rb_left;
+ else if (ret < 0)
+ p = &(*p)->rb_right;
+ else
+ /*
+ * FIXME: to handle this situation in a
+ * more reasonable way.
+ */
+ return 0;
+ }
+
+	item = calloc(1, sizeof(*item));
+	if (item)
+		item->path = strdup(path);
+	if (!item || !item->path) {
+		fprintf(stderr, "out of memory to cache processed item\n");
+		free(item);
+		return -ENOMEM;
+	}
+
+ rb_link_node(&item->pi_node, parent, p);
+ rb_insert_color(&item->pi_node, root);
+
+ return 0;
+}
+
+static struct processed_item *find_processed_item(struct rb_root *root,
+ const char *path)
+{
+	struct rb_node *node = root->rb_node;	/* search from the root, not the leftmost node */
+ struct processed_item *this;
+
+ while (node) {
+ int ret;
+ this = rb_entry(node, struct processed_item, pi_node);
+ ret = strcmp(this->path, path);
+ if (ret > 0)
+ node = node->rb_left;
+ else if (ret < 0)
+ node = node->rb_right;
+ else
+ return this;
+ }
+
+ return NULL;
+}
+
+static void remove_processed_item(struct rb_root *root, const char *path)
+{
+ struct processed_item *item;
+
+ item = find_processed_item(root, path);
+ if (item) {
+ rb_erase(&item->pi_node, root);
+ free(item->path);
+ free(item);
+ }
+}
+
+static int item_is_processed(struct rb_root *root, const char *path)
+{
+ return find_processed_item(root, path) ? 1 : 0;
+}
+
+static void free_processed_items(struct rb_root *root)
+{
+ struct processed_item *item;
+ struct rb_node *node;
+
+ while ((node = rb_first(root))) {
+ item = rb_entry(node, struct processed_item, pi_node);
+ rb_erase(&item->pi_node, root);
+ free(item->path);
+ free(item);
+ }
+}
+
+static void snapshot_item_scan_free(struct snapshot_scan_info *scan_info)
+{
+ struct rb_root *root = &scan_info->scanned_items;
+ struct snapshot_scan_item *item;
+ struct rb_node *node;
+
+ while ((node = rb_first(root))) {
+ item = rb_entry(node, struct snapshot_scan_item, si_node);
+ rb_erase(&item->si_node, root);
+ free(item->path);
+ free(item);
+ }
+}
+
+static inline void
+snapshot_diff_destroy(struct snapshot_diff_info *diff_info)
+{
+ struct snapshot_scan_info *src_scan_info = diff_info->src_scan_info;
+ struct snapshot_scan_info *dest_scan_info = diff_info->dest_scan_info;
+
+ free(src_scan_info->snapshot);
+ free(dest_scan_info->snapshot);
+
+ free_processed_items(&src_scan_info->processed_items);
+ free_processed_items(&dest_scan_info->processed_items);
+
+ snapshot_item_scan_free(diff_info->src_scan_info);
+ snapshot_item_scan_free(diff_info->dest_scan_info);
+
+ free(src_scan_info);
+ free(dest_scan_info);
+}
+
+static int add_item_to_cache(struct rb_root *root, const char *path, u8 type,
+ u64 objectid, u64 transid)
+{
+ struct rb_node **p = &(root->rb_node);
+ struct rb_node *parent = NULL;
+ struct snapshot_scan_item *item;
+
+ while (*p) {
+ struct snapshot_scan_item *this;
+ int ret;
+
+ this = rb_entry(*p, struct snapshot_scan_item, si_node);
+ parent = *p;
+ ret = strcmp(this->path, path);
+ if (ret > 0)
+ p = &(*p)->rb_left;
+ else if (ret < 0)
+ p = &(*p)->rb_right;
+ else
+ assert(0);
+ }
+
+ item = malloc(sizeof(struct snapshot_scan_item));
+ if (!item) {
+ fprintf(stderr, "out of memory while inserting new item\n");
+ return -ENOMEM;
+ }
+
+ item->path = strdup(path);
+	if (!item->path) {
+		fprintf(stderr, "out of memory while inserting new item\n");
+		free(item);
+		return -ENOMEM;
+	}
+
+ item->type = type;
+ item->objectid = objectid;
+ item->transid = transid;
+
+ rb_link_node(&item->si_node, parent, p);
+ rb_insert_color(&item->si_node, root);
+ return 0;
+}
+
+static int process_snapshot_item(struct snapshot_scan_info *scan_info,
+ struct btrfs_dir_item *item)
+{
+ struct rb_root *processed_items = &scan_info->processed_items;
+ struct rb_root *scanned_items = &scan_info->scanned_items;
+ char *cache_dir_name = NULL;
+ int fd = scan_info->fd;
+ char *path;
+	u64 cache_dirid = 0;
+ int ret = 0;
+
+ path = ino_resolve(fd, item->location.objectid,
+ &cache_dirid, &cache_dir_name);
+ if (!path) {
+ ret = -EIO;
+ goto out;
+ }
+
+ /*
+ * this item can be freely skipped as we have already processed
+ * it in previous business.
+ */
+ if (item_is_processed(processed_items, path)) {
+ remove_processed_item(processed_items, path);
+ goto out;
+ }
+
+ ret = add_item_to_cache(scanned_items, path, item->type,
+ item->location.objectid,
+ item->transid);
+
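+out:
+	/*
+	 * add_item_to_cache() keeps its own copy of the path. this assumes
+	 * ino_resolve() returns heap memory owned by the caller.
+	 */
+	free(path);
+	return ret;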
+}
+
+static int do_snapshot_scan(struct snapshot_scan_info *scan_info)
+{
+ struct btrfs_ioctl_search_args args;
+ struct btrfs_ioctl_search_key *sk = &args.key;
+ struct btrfs_ioctl_search_header *sh;
+ struct btrfs_dir_item *item;
+ struct btrfs_dir_item backup;
+ int fd = scan_info->fd;
+ int count = 0;
+ int ret;
+
+ memset(&backup, 0, sizeof(backup));
+ memset(&args, 0, sizeof(args));
+
+ sk->tree_id = scan_info->tree_id;
+ sk->min_objectid = scan_info->last_objectid;
+ sk->min_type = BTRFS_DIR_INDEX_KEY;
+ sk->max_type = BTRFS_DIR_INDEX_KEY;
+
+ /*
+ * set all the other params to the max, we'll take any objectid
+ * and any trans
+ */
+ sk->max_objectid = (u64)-1;
+ sk->max_offset = (u64)-1;
+ sk->max_transid = (u64)-1;
+
+ /* just a big number, doesn't matter much */
+ sk->nr_items = 4096;
+
+	do {
+		unsigned long off = 0;
+		int i;
+
+		/* nr_items is in/out: the ioctl overwrites it with the count found */
+		sk->nr_items = 4096;
+ ret = ioctl(fd, BTRFS_IOC_TREE_SEARCH, &args);
+ if (ret < 0) {
+ fprintf(stderr, "ERROR: can't perform the search %s\n",
+ strerror(errno));
+ return ret;
+ }
+
+		/* the ioctl returns the number of items it found in nr_items */
+		if (sk->nr_items == 0) {
+			scan_info->scan_done = 1;
+			break;
+		}
+
+		/*
+		 * for each item, pull the key out of the header and then
+		 * read the dir item it contains
+		 */
+ for (i = 0; i < sk->nr_items; i++) {
+ sh = (struct btrfs_ioctl_search_header *)(args.buf +
+ off);
+ off += sizeof(*sh);
+
+ /*
+ * just in case the item was too big, pass something
+ * other than garbage
+ */
+ if (sh->len == 0)
+ item = &backup;
+ else
+ item = (struct btrfs_dir_item *)(args.buf +
+ off);
+
+ if (sh->type == BTRFS_DIR_INDEX_KEY) {
+ ret = process_snapshot_item(scan_info, item);
+ if (ret < 0)
+ return ret;
+ ++count;
+ }
+
+ off += sh->len;
+
+			/*
+			 * record the mins in sk so we can make sure the
+			 * next search doesn't repeat this item
+			 */
+ sk->min_objectid = sh->objectid;
+ sk->min_offset = sh->offset;
+ sk->min_type = sh->type;
+ }
+
+ if (sk->min_offset < (u64)-1) {
+ sk->min_offset++;
+ } else if (sk->min_objectid < (u64)-1) {
+ sk->min_objectid++;
+ sk->min_offset = 0;
+ sk->min_type = 0;
+ } else {
+ scan_info->scan_done = 1;
+ break;
+ }
+ } while (count < nr_item_scan);
+
+ if (!scan_info->scan_done)
+ scan_info->last_objectid = sk->min_objectid;
+
+ return ret;
+}
+
+static int snapshot_item_scan_read(struct snapshot_scan_info *scan_info)
+{
+ return do_snapshot_scan(scan_info);
+}
+
+static struct snapshot_scan_item *find_item_on_cache(struct rb_root *root,
+ const char *path)
+{
+	struct rb_node *node = root->rb_node;	/* search from the root, not the leftmost node */
+ struct snapshot_scan_item *this;
+
+ while (node) {
+ int ret;
+ this = rb_entry(node, struct snapshot_scan_item, si_node);
+ ret = strcmp(this->path, path);
+ if (ret > 0)
+ node = node->rb_left;
+ else if (ret < 0)
+ node = node->rb_right;
+ else
+ return this;
+ }
+
+ return NULL;
+}
+
+static void remove_item_from_cache(struct rb_root *root, const char *path)
+{
+ struct snapshot_scan_item *item;
+
+ item = find_item_on_cache(root, path);
+ if (item) {
+ rb_erase(&item->si_node, root);
+ free(item->path);
+ free(item);
+ }
+}
+
+/* check if an item path does exist on dest snapshot or not */
+static inline int find_item_on_snapshot(const char *path, u8 type, int *found)
+{
+ int ret;
+
+ if (type != BTRFS_FT_SYMLINK) {
+ ret = access(path, F_OK);
+ if (ret == 0) {
+ *found = 1;
+ goto out;
+ }
+
+ if (errno == ENOENT) {
+ *found = 0;
+ ret = 0;
+ goto out;
+ }
+ fprintf(stderr, "failed to access %s as %s\n",
+ path, strerror(errno));
+ } else {
+ struct stat st;
+ ret = lstat(path, &st);
+ if (ret == 0) {
+ *found = 1;
+ goto out;
+ }
+
+ if (errno == ENOENT) {
+ *found = 0;
+ ret = 0;
+ goto out;
+ }
+		fprintf(stderr, "failed to lstat %s: %s\n",
+ path, strerror(errno));
+ }
+
+out:
+ return ret;
+}
+
+static int do_item_diff(struct snapshot_scan_item *item, u8 item_type,
+ unsigned int item_state, const char *src_snapshot,
+ const char *dest_snapshot, unsigned int flags)
+{
+ int updated;
+ int ret = 0;
+
+ switch (item_state) {
+ case SNAPSHOT_DIFF_NEW_ITEM:
+ if (SNAPSHOT_DIFF_SHOW_NEW_ITEM(flags))
+ print_item_diff_info(item, dest_snapshot, item_state);
+ break;
+ case SNAPSHOT_DIFF_REMOVED_ITEM:
+ if (SNAPSHOT_DIFF_SHOW_REMOVED_ITEM(flags))
+ print_item_diff_info(item, src_snapshot, item_state);
+ break;
+ case SNAPSHOT_DIFF_UPDATED_ITEM:
+ if (SNAPSHOT_DIFF_SHOW_UPDATED_ITEM(flags)) {
+ ret = item_is_updated(src_snapshot, dest_snapshot,
+ item->path, item_type, &updated);
+ if (ret < 0)
+ return ret;
+
+ if (updated) {
+ print_item_diff_info(item, dest_snapshot,
+ item_state);
+ }
+ }
+ break;
+ default:
+ assert(0);
+ }
+
+ return ret;
+}
+
+static inline int build_full_path(char full_path[BTRFS_PATH_NAME_MAX + 1],
+ const char *src_snapshot,
+ const char *path)
+{
+ size_t len = strlen(src_snapshot) + strlen(path) + 1;
+
+ if (snprintf(full_path, BTRFS_PATH_NAME_MAX + 1, "%s/%s",
+ src_snapshot, path) != len) {
+ fprintf(stderr, "failed to build full path %s/%s\n",
+ src_snapshot, path);
+ return -1;
+ }
+
+ return 0;
+}
+
+
+/*
+ * step through each scanned item in the source cache and look it up in the
+ * dest cache first. if the item is found there, compare its ctime and mtime
+ * against the source one and print the updated info if needed, then proceed
+ * to the next source item.
+ * if it is not found in the dest cache, it may still reside on the dest
+ * snapshot, so look it up there. if it does not exist, print the removed
+ * info if needed; otherwise, check whether it was updated.
+ */
+static int process_source_item_cache(struct snapshot_scan_info *src_scan_info,
+ struct snapshot_scan_info *dest_scan_info,
+ unsigned int flags)
+{
+ struct rb_root *src_scanned_items = &src_scan_info->scanned_items;
+ struct rb_root *dest_scanned_items = &dest_scan_info->scanned_items;
+ struct rb_root *processed_items = &dest_scan_info->processed_items;
+ const char *src_snapshot = src_scan_info->snapshot;
+ const char *dest_snapshot = dest_scan_info->snapshot;
+ struct rb_node *node;
+ u8 item_type;
+ int ret = 0;
+
+ for (node = rb_first(src_scanned_items); node; node = rb_next(node)) {
+ char full_path[BTRFS_PATH_NAME_MAX + 1];
+ struct snapshot_scan_item *src_item;
+ struct snapshot_scan_item *dest_item;
+ unsigned int item_state;
+ const char *path;
+ int found = 0;
+
+ src_item = rb_entry(node, struct snapshot_scan_item, si_node);
+ item_type = src_item->type;
+ path = src_item->path;
+
+ /* can it be found on dest cache? */
+ dest_item = find_item_on_cache(dest_scanned_items, path);
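+		if (dest_item) {
+			item_state = SNAPSHOT_DIFF_UPDATED_ITEM;
+			ret = do_item_diff(src_item, item_type, item_state,
+					   src_snapshot, dest_snapshot, flags);
+			if (ret < 0)
+				break;
+
+			/* handled here; drop it so the dest pass skips it */
+			remove_item_from_cache(dest_scanned_items, path);
+			continue;
+		}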
+
+ ret = build_full_path(full_path, dest_snapshot, path);
+ if (ret < 0)
+ break;
+
+		/*
+		 * the source item was not found in the dest cache. it may
+		 * still reside on the dest snapshot, so look it up there.
+		 */
+ ret = find_item_on_snapshot(full_path, item_type, &found);
+ if (ret < 0)
+ break;
+
+ item_state = found ? SNAPSHOT_DIFF_UPDATED_ITEM :
+ SNAPSHOT_DIFF_REMOVED_ITEM;
+ ret = do_item_diff(src_item, item_type, item_state,
+ src_snapshot, dest_snapshot, flags);
+ if (ret < 0)
+ break;
+
+		/*
+		 * we found the source item on the dest snapshot, but
+		 * it will be retrieved again when the dest snapshot
+		 * is scanned later. to avoid that, put it into the
+		 * dest processed tree so that it is skipped instead
+		 * of being cached again.
+		 */
+ if (found) {
+ ret = cache_processed_item(processed_items, path);
+ if (ret < 0)
+ break;
+ }
+ }
+
+ return ret;
+}
+
+/*
+ * now iterate over the items left in the dest cache and find out whether each
+ * resides on the source snapshot. note that these items cannot exist in the
+ * source cache, as that was already checked, so we only need to examine
+ * whether each exists on the source subvolume. if not, a newly created item
+ * was found; otherwise, check whether it was updated.
+ */
+static int process_dest_item_cache(struct snapshot_scan_info *src_scan_info,
+ struct snapshot_scan_info *dest_scan_info,
+				   unsigned int flags)
+{
+ struct rb_root *scanned_items = &dest_scan_info->scanned_items;
+ struct rb_root *processed_items = &src_scan_info->processed_items;
+ const char *src_snapshot = src_scan_info->snapshot;
+ const char *dest_snapshot = dest_scan_info->snapshot;
+ struct rb_node *node;
+ int ret = 0;
+
+ for (node = rb_first(scanned_items); node; node = rb_next(node)) {
+ char full_path[BTRFS_PATH_NAME_MAX + 1];
+ struct snapshot_scan_item *item;
+ unsigned int item_state;
+ const char *path;
+ u8 item_type;
+ int found = 0;
+
+ item = rb_entry(node, struct snapshot_scan_item, si_node);
+ path = item->path;
+
+ ret = build_full_path(full_path, src_snapshot, path);
+ if (ret < 0)
+ break;
+
+ item_type = item->type;
+ /* check if this item located at source snapshot or not */
+ ret = find_item_on_snapshot(full_path, item_type, &found);
+ if (ret < 0)
+ break;
+
+ item_state = found ? SNAPSHOT_DIFF_UPDATED_ITEM :
+ SNAPSHOT_DIFF_NEW_ITEM;
+ ret = do_item_diff(item, item_type, item_state, src_snapshot,
+ dest_snapshot, flags);
+ if (ret < 0)
+ break;
+
+		/*
+		 * we found this item on the source snapshot. put it into the
+		 * processed tree so we'll not cache and handle it again.
+		 */
+ if (found) {
+ ret = cache_processed_item(processed_items, path);
+ if (ret < 0)
+ break;
+ }
+ }
+
+ return ret;
+}
+
+/*
+ * so we have two caches holding the scanned items from both source and
+ * dest snapshot, perform real diff business now.
+ */
+static int do_diff_items(struct snapshot_scan_info *src_scan_info,
+ struct snapshot_scan_info *dest_scan_info,
+ unsigned int flags)
+{
+ int ret = 0;
+
+ ret = process_source_item_cache(src_scan_info, dest_scan_info, flags);
+ if (ret < 0)
+ return ret;
+
+ ret = process_dest_item_cache(src_scan_info, dest_scan_info, flags);
+ return ret;
+}
+
+int snapshot_diff(int src_fd, int dest_fd, const char *src_snapshot,
+ const char *dest_snapshot, unsigned int flags)
+{
+ struct snapshot_diff_info diff_info;
+ struct snapshot_scan_info *src_scan_info;
+ struct snapshot_scan_info *dest_scan_info;
+ u64 src_tree_id;
+ u64 dest_tree_id;
+ int diff_done = 0;
+ int ret;
+
+	/*
+	 * commit the buffer cache to disk so that new transactions
+	 * on both snapshots take effect immediately.
+	 */
+ sync();
+
+ src_tree_id = get_snapshot_tree_id(src_fd);
+ if (!src_tree_id)
+ return 1;
+
+ dest_tree_id = get_snapshot_tree_id(dest_fd);
+ if (!dest_tree_id)
+ return 1;
+
+ /*
+ * list all changes on the dest snapshot if no option
+ * was specified.
+ */
+ if (!flags) {
+ flags |= (SNAPSHOT_DIFF_LIST_NEW_ITEM |
+ SNAPSHOT_DIFF_LIST_REMOVED_ITEM |
+ SNAPSHOT_DIFF_LIST_UPDATED_ITEM);
+ }
+
+ ret = snapshot_diff_init(&diff_info);
+ if (ret < 0) {
+ fprintf(stderr, "snapshot diff init failed\n");
+ return ret;
+ }
+
+ src_scan_info = diff_info.src_scan_info;
+ dest_scan_info = diff_info.dest_scan_info;
+ ret = snapshot_item_scan_init(src_scan_info, src_snapshot, src_fd,
+ src_tree_id);
+ if (ret < 0) {
+ fprintf(stderr, "source snapshot scan init failed\n");
+ return ret;
+ }
+
+ ret = snapshot_item_scan_init(dest_scan_info, dest_snapshot, dest_fd,
+ dest_tree_id);
+ if (ret < 0) {
+ fprintf(stderr, "dest snapshot scan init failed\n");
+ return ret;
+ }
+
+ while (1) {
+ /*
+ * scan the specified number of items(4096) and cache
+ * them on rb-tree from both source and dest snapshot.
+ */
+ ret = snapshot_item_scan_read(src_scan_info);
+ if (ret)
+ goto out;
+
+ ret = snapshot_item_scan_read(dest_scan_info);
+ if (ret)
+ goto out;
+
+ /* diff the items cached in this round */
+ ret = do_diff_items(src_scan_info, dest_scan_info, flags);
+ if (ret)
+ goto out;
+
+ /*
+ * This round of the diff is done, so free the cached
+ * items.
+ */
+ snapshot_item_scan_free(src_scan_info);
+ snapshot_item_scan_free(dest_scan_info);
+
+ /*
+ * If no more items can be read from the source snapshot,
+ * leave the loop. If the dest scan is finished as well, we
+ * have gone through all items in both snapshots and the
+ * diff is done; otherwise we fall through to the extra
+ * checks below. While the source snapshot still has items,
+ * proceed to the next round of scan and diff.
+ */
+ if (src_scan_info->scan_done) {
+ if (dest_scan_info->scan_done)
+ diff_done = 1;
+ break;
+ }
+ }
+
+ /*
+ * At this point the diff may be complete because no more items can
+ * be fetched from either snapshot. We can also stop early if the
+ * source scan is done and the user does not want new items on the
+ * dest snapshot listed, or if the dest scan is done and the user
+ * does not want items that exist on the source snapshot but were
+ * removed from the destination listed.
+ */
+ if (diff_done ||
+ ((src_scan_info->scan_done &&
+ !SNAPSHOT_DIFF_SHOW_NEW_ITEM(flags)) ||
+ (dest_scan_info->scan_done &&
+ !(SNAPSHOT_DIFF_SHOW_REMOVED_ITEM(flags)))))
+ goto out;
+
+ /*
+ * Some items may remain on either the source or the destination
+ * snapshot; process them if the user asked for them.
+ */
+ if (!src_scan_info->scan_done &&
+ SNAPSHOT_DIFF_SHOW_REMOVED_ITEM(flags)) {
+ do {
+ ret = snapshot_item_scan_read(src_scan_info);
+ if (ret)
+ goto out;
+
+ ret = do_diff_items(src_scan_info, dest_scan_info,
+ flags);
+ if (ret)
+ goto out;
+ } while (!src_scan_info->scan_done);
+ }
+
+ if (!dest_scan_info->scan_done &&
+ SNAPSHOT_DIFF_SHOW_NEW_ITEM(flags)) {
+ do {
+ ret = snapshot_item_scan_read(dest_scan_info);
+ if (ret)
+ break;
+
+ ret = do_diff_items(src_scan_info, dest_scan_info,
+ flags);
+ if (ret)
+ break;
+ } while (!dest_scan_info->scan_done);
+ }
+
+out:
+ snapshot_diff_destroy(&diff_info);
+ return ret;
+}
--
1.7.4.1
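A minimal sketch of the idea behind do_diff_items(), assuming both item
lists are sorted by objectid: a merge walk reports items found only in the
source as removed, items found only in the destination as new, and items
present in both with differing transids as updated. All names below
(struct diff_item, enum diff_kind, diff_sorted_items, count_cb) are
hypothetical, and matching on objectid alone is a simplifying assumption;
the patch caches items in rb-trees and may match them differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cached item: only the fields the comparison needs. */
struct diff_item {
	uint64_t objectid;	/* inode number inside the snapshot */
	uint64_t transid;	/* transaction that last touched the inode */
};

enum diff_kind { DIFF_REMOVED, DIFF_NEW, DIFF_UPDATED };

typedef void (*diff_cb)(enum diff_kind kind, const struct diff_item *item,
			void *ctx);

/* Example callback context: counts each kind of change. */
struct diff_counts {
	size_t removed;
	size_t added;
	size_t updated;
};

static void count_cb(enum diff_kind kind, const struct diff_item *item,
		     void *ctx)
{
	struct diff_counts *c = ctx;

	(void)item;
	if (kind == DIFF_REMOVED)
		c->removed++;
	else if (kind == DIFF_NEW)
		c->added++;
	else
		c->updated++;
}

/*
 * Walk two arrays sorted by objectid and report every difference through
 * the callback. Returns the number of differences reported.
 */
static size_t diff_sorted_items(const struct diff_item *src, size_t nsrc,
				const struct diff_item *dst, size_t ndst,
				diff_cb cb, void *ctx)
{
	size_t i = 0, j = 0, count = 0;

	while (i < nsrc || j < ndst) {
		if (j == ndst ||
		    (i < nsrc && src[i].objectid < dst[j].objectid)) {
			/* present only in the source snapshot */
			cb(DIFF_REMOVED, &src[i++], ctx);
			count++;
		} else if (i == nsrc || dst[j].objectid < src[i].objectid) {
			/* present only in the destination snapshot */
			cb(DIFF_NEW, &dst[j++], ctx);
			count++;
		} else {
			/* present in both: compare the transaction ids */
			if (src[i].transid != dst[j].transid) {
				cb(DIFF_UPDATED, &dst[j], ctx);
				count++;
			}
			i++;
			j++;
		}
	}
	return count;
}
```

A callback keeps the walk independent of how results are printed or
filtered, so a flags check like the one in snapshot_diff() could live in
the callback instead of the walk itself.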
* [RFC PATCH 4/5] btrfs-progs: teach Makefile aware of the new comer
2012-08-07 8:56 [RFC PATCH 0/5] btrfs-progs: snapshot diff function Jeff Liu
` (2 preceding siblings ...)
2012-08-07 8:57 ` [RFC PATCH 3/5] btrfs-progs: souce " Jeff Liu
@ 2012-08-07 8:57 ` Jeff Liu
2012-08-07 8:58 ` [RFC PATCH 5/5] btrfs-progs: let this feature works at subvolume command group Jeff Liu
2012-08-24 13:09 ` [RFC PATCH 0/5] btrfs-progs: snapshot diff function Alex Lyakas
5 siblings, 0 replies; 10+ messages in thread
From: Jeff Liu @ 2012-08-07 8:57 UTC (permalink / raw)
To: linux-btrfs@vger.kernel.org
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
---
Makefile | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/Makefile b/Makefile
index 9694444..f8b517d 100644
--- a/Makefile
+++ b/Makefile
@@ -4,7 +4,7 @@ CFLAGS = -g -O0
objects = ctree.o disk-io.o radix-tree.o extent-tree.o print-tree.o \
root-tree.o dir-item.o file-item.o inode-item.o \
inode-map.o crc32c.o rbtree.o extent-cache.o extent_io.o \
- volumes.o utils.o btrfs-list.o btrfslabel.o repair.o
+ volumes.o utils.o btrfs-list.o btrfslabel.o repair.o diff-snapshot.o
cmds_objects = cmds-subvolume.o cmds-filesystem.o cmds-device.o cmds-scrub.o \
cmds-inspect.o cmds-balance.o
--
1.7.4.1
* [RFC PATCH 5/5] btrfs-progs: let this feature works at subvolume command group.
2012-08-07 8:56 [RFC PATCH 0/5] btrfs-progs: snapshot diff function Jeff Liu
` (3 preceding siblings ...)
2012-08-07 8:57 ` [RFC PATCH 4/5] btrfs-progs: teach Makefile aware of the new comer Jeff Liu
@ 2012-08-07 8:58 ` Jeff Liu
2012-08-24 13:09 ` [RFC PATCH 0/5] btrfs-progs: snapshot diff function Alex Lyakas
5 siblings, 0 replies; 10+ messages in thread
From: Jeff Liu @ 2012-08-07 8:58 UTC (permalink / raw)
To: linux-btrfs@vger.kernel.org
Make this feature available as `btrfs subvolume diff-snapshot [options] <src> <dest>`.
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
---
cmds-subvolume.c | 90 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 files changed, 90 insertions(+), 0 deletions(-)
diff --git a/cmds-subvolume.c b/cmds-subvolume.c
index 3508ce6..e7cc3cc 100644
--- a/cmds-subvolume.c
+++ b/cmds-subvolume.c
@@ -28,6 +28,7 @@
#include "ioctl.h"
#include "commands.h"
+#include "diff-snapshot.h"
/* btrfs-list.c */
int list_subvols(int fd, int print_parent, int get_default);
@@ -515,6 +516,94 @@ static int cmd_find_new(int argc, char **argv)
return 0;
}
+static const char * const cmd_snapshot_diff_usage[] = {
+ "btrfs subvolume diff-snapshot [options] <source> <dest>",
+ "List the differences between two snapshots",
+ "By default, list all changes in destination snapshot towards source",
+ "",
+ "-n list the new items in destination snapshot\n",
+ "-r list the removed items in destination snapshot\n",
+ "-u list the updated items in destination snapshot\n",
+ NULL
+};
+
+static int cmd_snapshot_diff(int argc, char **argv)
+{
+ int ret;
+ int src_fd;
+ int dst_fd;
+ char *src_snapshot;
+ char *dest_snapshot;
+ unsigned int flags = 0;
+
+ while (1) {
+ int c = getopt(argc, argv, "rnu");
+ if (c < 0)
+ break;
+ switch (c) {
+ case 'r':
+ flags |= SNAPSHOT_DIFF_LIST_REMOVED_ITEM;
+ break;
+ case 'n':
+ flags |= SNAPSHOT_DIFF_LIST_NEW_ITEM;
+ break;
+ case 'u':
+ flags |= SNAPSHOT_DIFF_LIST_UPDATED_ITEM;
+ break;
+ default:
+ fprintf(stderr, "ERROR: snapshot diff args invalid.\n"
+ " -r list removed items\n"
+ " -n list new items\n"
+ " -u list updated items\n");
+ return 1;
+ }
+ }
+
+ src_snapshot = argv[argc - 2];
+ dest_snapshot = argv[argc - 1];
+
+ ret = test_issubvolume(src_snapshot);
+ if (ret < 0) {
+ fprintf(stderr, "ERROR: error accessing '%s'\n",
+ src_snapshot);
+ return 12;
+ }
+ if (!ret) {
+ fprintf(stderr, "ERROR: '%s' is not a subvolume\n",
+ src_snapshot);
+ return 13;
+ }
+
+ ret = test_issubvolume(dest_snapshot);
+ if (ret < 0) {
+ fprintf(stderr, "ERROR: error accessing '%s'\n",
+ dest_snapshot);
+ return 12;
+ }
+ if (!ret) {
+ fprintf(stderr, "ERROR: '%s' is not a subvolume\n",
+ dest_snapshot);
+ return 13;
+ }
+
+ src_fd = open_file_or_dir(src_snapshot);
+ if (src_fd < 0) {
+ fprintf(stderr, "ERROR: can't access '%s'\n", src_snapshot);
+ return 12;
+ }
+
+ dst_fd = open_file_or_dir(dest_snapshot);
+ if (dst_fd < 0) {
+ fprintf(stderr, "ERROR: can't access '%s'\n", dest_snapshot);
+ return 12;
+ }
+
+ ret = snapshot_diff(src_fd, dst_fd, src_snapshot, dest_snapshot, flags);
+ if (ret)
+ return 19;
+ return 0;
+}
+
const struct cmd_group subvolume_cmd_group = {
subvolume_cmd_group_usage, NULL, {
{ "create", cmd_subvol_create, cmd_subvol_create_usage, NULL, 0 },
@@ -526,6 +615,7 @@ const struct cmd_group subvolume_cmd_group = {
{ "set-default", cmd_subvol_set_default,
cmd_subvol_set_default_usage, NULL, 0 },
{ "find-new", cmd_find_new, cmd_find_new_usage, NULL, 0 },
+ { "diff-snapshot", cmd_snapshot_diff, cmd_snapshot_diff_usage, NULL, 0 },
{ 0, 0, 0, 0, 0 }
}
};
--
1.7.4.1
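A minimal sketch of the option handling in cmd_snapshot_diff(), assuming
the three SNAPSHOT_DIFF_LIST_* flags are distinct bits. The bit values,
SNAPSHOT_DIFF_LIST_ALL, and parse_diff_flags() are assumptions for
illustration; the real definitions live in diff-snapshot.h from patch 2/5,
which is not shown here. The sketch also checks that exactly two
positional arguments remain after the options.

```c
#define _POSIX_C_SOURCE 200809L
#include <unistd.h>

/* Assumed bit values; the real ones are defined in diff-snapshot.h. */
#define SNAPSHOT_DIFF_LIST_NEW_ITEM	(1U << 0)
#define SNAPSHOT_DIFF_LIST_REMOVED_ITEM	(1U << 1)
#define SNAPSHOT_DIFF_LIST_UPDATED_ITEM	(1U << 2)

/* Hypothetical convenience mask: what "no option given" expands to. */
#define SNAPSHOT_DIFF_LIST_ALL	(SNAPSHOT_DIFF_LIST_NEW_ITEM | \
				 SNAPSHOT_DIFF_LIST_REMOVED_ITEM | \
				 SNAPSHOT_DIFF_LIST_UPDATED_ITEM)

/*
 * Parse -r/-n/-u into a bitmask. Returns 0 on success, or -1 on an
 * unknown option or when <source> <dest> are not both present.
 */
static int parse_diff_flags(int argc, char **argv, unsigned int *flags)
{
	int c;

	*flags = 0;
	optind = 1;	/* restart the scan so repeated calls work */

	while ((c = getopt(argc, argv, "rnu")) != -1) {
		switch (c) {
		case 'r':
			*flags |= SNAPSHOT_DIFF_LIST_REMOVED_ITEM;
			break;
		case 'n':
			*flags |= SNAPSHOT_DIFF_LIST_NEW_ITEM;
			break;
		case 'u':
			*flags |= SNAPSHOT_DIFF_LIST_UPDATED_ITEM;
			break;
		default:
			return -1;
		}
	}

	/* No option given: list every kind of change. */
	if (!*flags)
		*flags = SNAPSHOT_DIFF_LIST_ALL;

	/* Exactly two positional arguments must remain. */
	if (argc - optind != 2)
		return -1;

	return 0;
}
```

Checking the remaining argument count before use avoids reading
argv[argc - 2] when the command is invoked with fewer than two paths.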
* Re: [RFC PATCH 0/5] btrfs-progs: snapshot diff function
2012-08-07 8:56 [RFC PATCH 0/5] btrfs-progs: snapshot diff function Jeff Liu
` (4 preceding siblings ...)
2012-08-07 8:58 ` [RFC PATCH 5/5] btrfs-progs: let this feature works at subvolume command group Jeff Liu
@ 2012-08-24 13:09 ` Alex Lyakas
2012-08-24 14:15 ` Jeff Liu
5 siblings, 1 reply; 10+ messages in thread
From: Alex Lyakas @ 2012-08-24 13:09 UTC (permalink / raw)
To: jeff.liu; +Cc: linux-btrfs@vger.kernel.org
Hi Jeff,
how do you see this snapshot-diff functionality vs the send/receive
functionality that was recently added? I think that the binary stream
that the send code produces, can be interpreted by printing out text
messages, which will essentially give the same information (although
much more detailed) as your snapshot-diff tool.
Apologies if I somehow misunderstood what your snapshot-diff code does.
Thanks,
Alex.
* Re: [RFC PATCH 0/5] btrfs-progs: snapshot diff function
2012-08-24 13:09 ` [RFC PATCH 0/5] btrfs-progs: snapshot diff function Alex Lyakas
@ 2012-08-24 14:15 ` Jeff Liu
2012-08-25 6:23 ` Alexander Block
0 siblings, 1 reply; 10+ messages in thread
From: Jeff Liu @ 2012-08-24 14:15 UTC (permalink / raw)
To: Alex Lyakas; +Cc: linux-btrfs@vger.kernel.org
Hi Alex,
Thanks for taking a look.
On 08/24/2012 09:09 PM, Alex Lyakas wrote:
> Hi Jeff,
> how do you see this snapshot-diff functionality vs the send/receive
> functionality that was recently added? I think that the binary stream
> that the send code produces, can be interpreted by printing out text
> messages, which will essentially give the same information (although
> much more detailed) as your snapshot-diff tool.
send/receive had not been implemented yet when I was working on this feature (back at the end of last year).
It looks like there really is some duplicated effort.
As you said, the stream produced by send needs to be interpreted to show readable information. If the binary stream
becomes huge, that might make some silly users like me cry :).
Also, send/receive is mainly focused on backup IMHO; please correct me if I am missing something here.
The diff utility is designed to list any changes between two snapshots in a straightforward way through its command interface,
and it could also be improved to show the differences within a given time range.
But sure, send/receive is just awesome. If we can introduce an interpreter script that does the same thing and
makes the end user's life easier, that would be fine.
Thanks,
-Jeff
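
To illustrate what "readable info" means here, a minimal sketch of turning
one change record into a text line in the style of the demo output from
the cover letter. The function name, the enum, and the "UPDATED" label are
assumptions (the demo only shows NEW and REMOVED lines); this is not the
code from diff-snapshot.c.

```c
#include <stdio.h>

/* Hypothetical change kinds; the order matches kind_name[] below. */
enum change_kind { CHANGE_NEW, CHANGE_REMOVED, CHANGE_UPDATED };

/*
 * Format one record as "[KIND TYPE] path objectid N transid M".
 * Returns what snprintf() returns: the length the full line needs,
 * which is >= size if the output was truncated.
 */
static int format_diff_line(char *buf, size_t size, enum change_kind kind,
			    const char *type, const char *path,
			    unsigned long long objectid,
			    unsigned long long transid)
{
	static const char * const kind_name[] = {
		"NEW", "REMOVED", "UPDATED"
	};

	return snprintf(buf, size, "[%s %s] %s objectid %llu transid %llu",
			kind_name[kind], type, path, objectid, transid);
}
```

The same formatter could sit behind either data source, whether the
records come from the snapshot scan or from an interpreted send stream.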
* Re: [RFC PATCH 0/5] btrfs-progs: snapshot diff function
2012-08-24 14:15 ` Jeff Liu
@ 2012-08-25 6:23 ` Alexander Block
2012-08-26 12:26 ` Jie Liu
0 siblings, 1 reply; 10+ messages in thread
From: Alexander Block @ 2012-08-25 6:23 UTC (permalink / raw)
To: jeff.liu; +Cc: Alex Lyakas, linux-btrfs@vger.kernel.org
On Fri, Aug 24, 2012 at 4:15 PM, Jeff Liu <jeff.liu@oracle.com> wrote:
> Hi Alex,
>
> Thanks for taking a look.
>
> On 08/24/2012 09:09 PM, Alex Lyakas wrote:
>
>> Hi Jeff,
>> how do you see this snapshot-diff functionality vs the send/receive
>> functionality that was recently added? I think that the binary stream
>> that the send code produces, can be interpreted by printing out text
>> messages, which will essentially give the same information (although
>> much more detailed) as your snapshot-diff tool.
>
> send/receive had not been implemented yet when I was working on this feature (back at the end of last year).
> It looks like there really is some duplicated effort.
>
> As you said, the stream produced by send needs to be interpreted to show readable information. If the binary stream
> becomes huge, that might make some silly users like me cry :).
> Also, send/receive is mainly focused on backup IMHO; please correct me if I am missing something here.
>
> The diff utility is designed to list any changes between two snapshots in a straightforward way through its command interface,
> and it could also be improved to show the differences within a given time range.
>
> But sure, send/receive is just awesome. If we can introduce an interpreter script that does the same thing and
> makes the end user's life easier, that would be fine.
>
> Thanks,
> -Jeff
My idea was to introduce new "instructions" to the stream later which
could be activated using a flag in the ioctl structure. These
instructions would not be real instructions but diff statements. They
would contain the plain results as given by btrfs_compare_trees. So we
would have the information about which tree items were
added/removed/changed. As an alternative, this could be a new ioctl.
Greetings from Ko Tao :)
* Re: [RFC PATCH 0/5] btrfs-progs: snapshot diff function
2012-08-25 6:23 ` Alexander Block
@ 2012-08-26 12:26 ` Jie Liu
0 siblings, 0 replies; 10+ messages in thread
From: Jie Liu @ 2012-08-26 12:26 UTC (permalink / raw)
To: Alexander Block; +Cc: Alex Lyakas, linux-btrfs@vger.kernel.org
On 08/25/12 14:23, Alexander Block wrote:
> My idea was to introduce new "instructions" to the stream later which
> could be activated using a flag in the ioctl structure. These
> instructions would not be real instructions but diff statements. They
> would contain the plain results as given by btrfs_compare_trees. So we
> would have the information about which tree items were
> added/removed/changed. As an alternative, this could be a new ioctl.
Sounds interesting.
The performance cost of interpreting huge streams in userland could be
resolved this way; I am looking forward to seeing that happen. :)
Thanks,
-Jeff