[BTRFS-PROGS][PATCH] Add two new commands: 'btrfs insp physical-find' and 'btrfs insp physical-dump'

* [BTRFS-PROGS][PATCH] Add two new commands: 'btrfs insp physical-find' and 'btrfs insp physical-dump'
From: Goffredo Baroncelli @ 2016-07-24 11:03 UTC
  To: linux-btrfs; +Cc: dsterba, Chris Mason

Hi all,

the following patches add two new commands: 
1) btrfs inspect-internal physical-find
2) btrfs inspect-internal physical-dump

The aim of these two new commands is to locate (1) and dump (2) the stripe elements
stored on the disks. I developed these two new commands to simplify the
debugging of some RAID5 bugs (but this is another discussion).

An example of 'btrfs inspect-internal physical-find' is the following:

# btrfs inspect physical-find mnt/out.txt
mnt/out.txt: 0
        devid: 3 dev_name: /dev/loop2 offset: 61931520 type: DATA
        devid: 2 dev_name: /dev/loop1 offset: 61931520 type: OTHER
        devid: 1 dev_name: /dev/loop0 offset: 81854464 type: PARITY
        devid: 4 dev_name: /dev/loop3 offset: 61931520 type: PARITY

In the output above, DATA is the stripe element containing the data at the
requested offset. OTHER is the sibling stripe element: it contains data
belonging either to other files or to the same file at a different position.
The two PARITY stripe elements contain the RAID6 parity (P and Q).

An offset within the file can be passed as an optional second argument.

An example of 'btrfs inspect-internal physical-dump' is the following:

# btrfs insp physical-find mnt/out.txt 
mnt/out.txt: 0
devid: 5 dev_name: /dev/loop4 offset: 56819712 type: OTHER
devid: 4 dev_name: /dev/loop3 offset: 56819712 type: OTHER
devid: 3 dev_name: /dev/loop2 offset: 56819712 type: DATA
devid: 2 dev_name: /dev/loop1 offset: 56819712 type: PARITY
devid: 1 dev_name: /dev/loop0 offset: 76742656 type: PARITY

# btrfs insp physical-dump mnt/out.txt | xxd 
mnt/out.txt: 0
file: /dev/loop2 off=56819712
00000000: 6164 6161 6161 6161 6161 6161 6161 6161  adaaaaaaaaaaaaaa
00000010: 6161 6161 6161 6161 6161 6161 6161 6161  aaaaaaaaaaaaaaaa
00000020: 6161 6161 6161 6161 6161 6161 6161 6161  aaaaaaaaaaaaaaaa
00000030: 6161 6161 6161 6161 6161 6161 6161 6161  aaaaaaaaaaaaaaaa
00000040: 6161 6161 6161 6161 6161 6161 6161 6161  aaaaaaaaaaaaaaaa
[...]

In this case the content of the first 4k of the file is dumped. It
is also possible to pass an offset (in steps of 4k). Moreover,
it is possible to select what to dump: which copy (switch -c,
only for RAID1/RAID10/DUP); which parity (switch -p,
only for RAID5/RAID6); which stripe element other than the data one (switch -s,
only for RAID5/RAID6).

BR
G.Baroncelli


* [PATCH 1/5] Add some helper functions
From: Goffredo Baroncelli @ 2016-07-24 11:03 UTC
  To: linux-btrfs; +Cc: dsterba, Chris Mason, Goffredo Baroncelli

From: Goffredo Baroncelli <kreijack@inwind.it>

Add the following functions:
- int is_btrfs_fs(const char *path)
	returns 0 if path is on a btrfs filesystem, a negative value otherwise
- void check_root_or_exit()
	checks that the user is root (effective uid 0); otherwise it exits
	printing an error message
- void check_btrfs_or_exit(const char *path)
	checks that path is on a valid btrfs filesystem; otherwise it exits
	printing an error message

Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
---
 utils.c | 41 +++++++++++++++++++++++++++++++++++++++++
 utils.h | 14 ++++++++++++++
 2 files changed, 55 insertions(+)

diff --git a/utils.c b/utils.c
index 578fdb0..b99706c 100644
--- a/utils.c
+++ b/utils.c
@@ -4131,3 +4131,44 @@ unsigned int rand_range(unsigned int upper)
 	 */
 	return (unsigned int)(jrand48(rand_seed) % upper);
 }
+
+/*
+ * check if path is a btrfs filesystem
+ */
+int is_btrfs_fs(const char *path)
+{
+	struct statfs stfs;
+
+	if (statfs(path, &stfs) != 0) {
+		/* cannot access */
+		return -1;
+	}
+
+	if (stfs.f_type != BTRFS_SUPER_MAGIC) {
+		/* not a btrfs filesystem */
+		return -2;
+	}
+
+	return 0;
+}
+
+/*
+ * check if the user is root
+ */
+void check_root_or_exit()
+{
+    if (geteuid() == 0)
+        return;
+
+    error("You need to be root to execute this command");
+    exit(100);
+}
+
+void check_btrfs_or_exit(const char *path)
+{
+    if (!is_btrfs_fs(path))
+        return;
+
+    error("'%s' must be a valid btrfs filesystem", path);
+    exit(100);
+}
diff --git a/utils.h b/utils.h
index 98bfb34..0bd6ecb 100644
--- a/utils.h
+++ b/utils.h
@@ -399,4 +399,18 @@ unsigned int rand_range(unsigned int upper);
 /* Also allow setting the seed manually */
 void init_rand_seed(u64 seed);
 
+/* return 0 if path is a valid btrfs filesystem */
+int is_btrfs_fs(const char *path);
+
+/*
+ * check if the user has the root capability, otherwise it exits printing an
+ * error message
+ */
+void check_root_or_exit();
+/*
+ * check if path is a valid btrfs filesystem, otherwise it exits printing an
+ * error message
+ */
+void check_btrfs_or_exit(const char *path);
+
 #endif
-- 
2.8.1



* [PATCH 2/5] New btrfs command: "btrfs inspect physical-find"
From: Goffredo Baroncelli @ 2016-07-24 11:03 UTC
  To: linux-btrfs; +Cc: dsterba, Chris Mason, Goffredo Baroncelli

From: Goffredo Baroncelli <kreijack@inwind.it>

The aim of this new command is to show the physical placement on the disk
of a file.
Currently it handles all the profiles (single, dup, raid1/10/5/6).

The syntax is simple:

  btrfs inspect-internal physical-find <filename> [<offset>]

where:
  <filename> is the file to inspect
  <offset> is the offset of the file to inspect (default 0)

Some examples follow:

** Single

$ sudo mkfs.btrfs -f -d single -m single /dev/loop0
$ sudo mount /dev/loop0 mnt/
$ python -c "print 'ad'+'a'*65534+'bd'+'b'*65533" | sudo tee mnt/out.txt >/dev/null
$ sudo ../btrfs-progs/btrfs inspect physical-find mnt/out.txt
mnt/out.txt: 0
        devid: 1 dev_name: /dev/loop0 offset: 12582912 type: LINEAR
$ dd 2>/dev/null if=/dev/loop0 skip=12582912 bs=1 count=5; echo
adaaa

** Dup

The command shows both copies

$ sudo mkfs.btrfs -f -d dup -m dup /dev/loop0
$ sudo mount /dev/loop0 mnt/
$ python -c "print 'ad'+'a'*65534+'bd'+'b'*65533" | sudo tee mnt/out.txt >/dev/null
$ sudo ../btrfs-progs/btrfs inspect physical-find mnt/out.txt
mnt/out.txt: 0
        devid: 1 dev_name: /dev/loop0 offset: 71303168 type: DUP
        devid: 1 dev_name: /dev/loop0 offset: 104857600 type: DUP
$ dd 2>/dev/null if=/dev/loop0 skip=104857600 bs=1 count=5 ; echo
adaaa

** Raid1

The command shows both copies

$ sudo mkfs.btrfs -f -d raid1 -m raid1 /dev/loop0 /dev/loop1
$ sudo mount /dev/loop0 mnt/
$ python -c "print 'ad'+'a'*65534+'bd'+'b'*65533" | sudo tee mnt/out.txt >/dev/null
$ sudo ../btrfs-progs/btrfs inspect physical-find mnt/out.txt
mnt/out.txt: 0
        devid: 2 dev_name: /dev/loop1 offset: 61865984 type: RAID1
        devid: 1 dev_name: /dev/loop0 offset: 81788928 type: RAID1
$ dd 2>/dev/null if=/dev/loop0 skip=81788928 bs=1 count=5; echo
adaaa

** Raid10

The command shows both copies; if you set the offset to the next disk-stripe, you can see the next pair of disk-stripes

$ sudo mkfs.btrfs -f -d raid10 -m raid10 /dev/loop[0123]
$ sudo mount /dev/loop0 mnt/
$ python -c "print 'ad'+'a'*65534+'bd'+'b'*65533" | sudo tee mnt/out.txt >/dev/null
$ sudo ../btrfs-progs/btrfs inspect physical-find mnt/out.txt
mnt/out.txt: 0
        devid: 4 dev_name: /dev/loop3 offset: 61931520 type: RAID10
        devid: 3 dev_name: /dev/loop2 offset: 61931520 type: RAID10
$ dd 2>/dev/null if=/dev/loop2 skip=61931520 bs=1 count=5; echo
adaaa
$ sudo ../btrfs-progs/btrfs inspect physical-find mnt/out.txt 65536
mnt/out.txt: 65536
        devid: 2 dev_name: /dev/loop1 offset: 61931520 type: RAID10
        devid: 1 dev_name: /dev/loop0 offset: 81854464 type: RAID10
$ dd 2>/dev/null if=/dev/loop0 skip=81854464 bs=1 count=5; echo
bdbbb
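
The way the offset selects a mirrored pair can be sketched as follows. This is
a standalone illustration that mirrors the RAID10 arithmetic of this patch and
assumes sub_stripes == 2; struct raid10_loc and raid10_locate() are hypothetical
names, and chunk_off is the offset inside the chunk (not the file offset).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: the two stripe slots holding the mirrored data */
struct raid10_loc {
	int sidx[2];		/* indexes into the chunk's stripe array */
	uint64_t dev_off;	/* bytes to add to each stripe's start offset */
};

/* Mirror of the RAID10 arithmetic in the patch, assuming sub_stripes == 2 */
static struct raid10_loc raid10_locate(uint64_t chunk_off, uint64_t num_stripes,
				       uint64_t stripe_len)
{
	const uint64_t sub_stripes = 2;
	struct raid10_loc loc;
	uint64_t stripe_capacity, stripe_nr, stripe_start;
	uint64_t i;

	assert(stripe_len > 0 && num_stripes >= sub_stripes);

	/* data bytes held by one full row of disk-stripes */
	stripe_capacity = num_stripes * stripe_len / sub_stripes;
	stripe_nr = chunk_off / stripe_capacity;
	stripe_start = chunk_off % stripe_capacity;

	for (i = 0; i < sub_stripes; i++)
		loc.sidx[i] = (int)((i + stripe_start / stripe_len * sub_stripes)
				    % num_stripes);

	loc.dev_off = stripe_nr * stripe_len + chunk_off % stripe_len;
	return loc;
}
```

With four devices and a 64k stripe length, chunk offset 0 selects slots 0 and 1
and chunk offset 65536 selects slots 2 and 3, which corresponds to the switch
from one device pair to the other in the example above.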

** Raid5

Depending on the offset, you can see which disk-stripe is used.

$ sudo mkfs.btrfs -f -d raid5 -m raid5 /dev/loop[012]
$ sudo mount /dev/loop0 mnt/
$ python -c "print 'ad'+'a'*65534+'bd'+'b'*65533" | sudo tee mnt/out.txt >/dev/null
$ sudo ../btrfs-progs/btrfs inspect physical-find mnt/out.txt
mnt/out.txt: 0
        devid: 2 dev_name: /dev/loop1 offset: 61931520 type: DATA
        devid: 1 dev_name: /dev/loop0 offset: 81854464 type: OTHER
        devid: 3 dev_name: /dev/loop2 offset: 61931520 type: PARITY
$ sudo ../btrfs-progs/btrfs inspect physical-find mnt/out.txt 65536
mnt/out.txt: 65536
        devid: 2 dev_name: /dev/loop1 offset: 61931520 type: OTHER
        devid: 1 dev_name: /dev/loop0 offset: 81854464 type: DATA
        devid: 3 dev_name: /dev/loop2 offset: 61931520 type: PARITY
$ dd 2>/dev/null if=/dev/loop1 skip=61931520 bs=1 count=5; echo
adaaa
$ dd 2>/dev/null if=/dev/loop0 skip=81854464 bs=1 count=5; echo
bdbbb
$ dd 2>/dev/null if=/dev/loop2 skip=61931520 bs=1 count=5 | xxd
00000000: 0300 0303 03                             .....

The parity is computed as: parity=disk1^disk2. So "adaa" ^ "bdbb" == "\x03\x00\x03\x03"
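
The XOR relation can be checked with a few lines of C. This is a minimal sketch
of the RAID5 (P) parity only; xor_parity() is a hypothetical helper, not a
btrfs-progs function, and the RAID6 Q parity is computed differently and is not
covered here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * XOR 'ndata' data elements of 'len' bytes each into 'parity'.
 * For two elements this is exactly parity = disk1 ^ disk2.
 */
static void xor_parity(const unsigned char *const *data, size_t ndata,
		       size_t len, unsigned char *parity)
{
	size_t i, j;

	assert(ndata > 0);

	memset(parity, 0, len);
	for (i = 0; i < ndata; i++)
		for (j = 0; j < len; j++)
			parity[j] ^= data[i][j];
}
```

Feeding it the first five bytes of the two data elements ("adaaa" and "bdbbb")
gives 03 00 03 03 03, the same bytes that xxd shows for /dev/loop2 above.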

** Raid6
$ sudo mkfs.btrfs -f -mraid6 -draid6 /dev/loop[0-4]
$ sudo mount /dev/loop0 mnt/
$ python -c "print 'ad'+'a'*65534+'bd'+'b'*65533" | sudo tee mnt/out.txt >/dev/null
$ sudo ../btrfs-progs/btrfs inspect physical-find mnt/out.txt
mnt/out.txt: 0
        devid: 3 dev_name: /dev/loop2 offset: 61931520 type: DATA
        devid: 2 dev_name: /dev/loop1 offset: 61931520 type: OTHER
        devid: 1 dev_name: /dev/loop0 offset: 81854464 type: PARITY
        devid: 4 dev_name: /dev/loop3 offset: 61931520 type: PARITY

$ dd 2>/dev/null if=/dev/loop2 skip=61931520 bs=1 count=5 ; echo
adaaa
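
How a stripe element ends up labelled DATA, OTHER or PARITY can be sketched as
below. The code follows the RAID5/RAID6 branch of this patch (the last
'nparity' slots of a row are parity, and the row rotates by one slot per
stripe); enum slot_type and raid56_classify() are hypothetical names used only
for this illustration.

```c
#include <assert.h>
#include <stdint.h>

enum slot_type { SLOT_DATA, SLOT_OTHER, SLOT_PARITY };	/* hypothetical names */

/*
 * Classify slot 'sidx' (0 .. num_stripes-1) of the stripe row that contains
 * 'chunk_off'. '*stripe_index' receives the rotated index into the chunk's
 * stripe array, as computed in the patch.
 */
static enum slot_type raid56_classify(uint64_t chunk_off, uint64_t num_stripes,
				      uint64_t stripe_len, unsigned int nparity,
				      uint64_t sidx, uint64_t *stripe_index)
{
	uint64_t ndata, stripe_capacity, stripe_nr, stripe_start;

	assert(stripe_len > 0 && num_stripes > nparity && sidx < num_stripes);

	ndata = num_stripes - nparity;
	stripe_capacity = ndata * stripe_len;	/* data bytes per stripe row */
	stripe_nr = chunk_off / stripe_capacity;
	stripe_start = chunk_off % stripe_capacity;

	*stripe_index = (sidx + stripe_nr) % num_stripes;

	if (sidx >= ndata)
		return SLOT_PARITY;
	/* exactly one data slot of the row holds the requested offset */
	return (stripe_start / stripe_len == sidx) ? SLOT_DATA : SLOT_OTHER;
}
```

For RAID5 on three devices with a 64k stripe length, chunk offset 0 yields
DATA, OTHER, PARITY for slots 0, 1, 2, and chunk offset 65536 yields OTHER,
DATA, PARITY, in line with the two RAID5 listings shown earlier.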


Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
---
 cmds-inspect.c | 550 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 550 insertions(+)

diff --git a/cmds-inspect.c b/cmds-inspect.c
index dd7b9dd..dd0570b 100644
--- a/cmds-inspect.c
+++ b/cmds-inspect.c
@@ -22,6 +22,11 @@
 #include <errno.h>
 #include <getopt.h>
 #include <limits.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <linux/fs.h>
+#include <linux/fiemap.h>
 
 #include "kerncompat.h"
 #include "ioctl.h"
@@ -623,6 +628,549 @@ out:
 	return !!ret;
 }
 
+
+static const char * const cmd_inspect_physical_find_usage[] = {
+	"btrfs inspect-internal physical-find <path> [<off>]",
+	"Show the physical placement of a file data.",
+	"<path>   file to show",
+	"<off>    file offset to show; 0 if not specified",
+	"This command requires root privileges",
+	NULL
+};
+
+#define STRIPE_INFO_LINEAR		1
+#define STRIPE_INFO_DUP			2
+#define STRIPE_INFO_RAID0		3
+#define STRIPE_INFO_RAID1		4
+#define STRIPE_INFO_RAID10		5
+#define STRIPE_INFO_RAID56_DATA		6
+#define STRIPE_INFO_RAID56_OTHER	7
+#define STRIPE_INFO_RAID56_PARITY	8
+
+static const char * const stripe_info_descr[] = {
+	[STRIPE_INFO_LINEAR] = "LINEAR",
+	[STRIPE_INFO_DUP] = "DUP",
+	[STRIPE_INFO_RAID0] = "RAID0",
+	[STRIPE_INFO_RAID1] = "RAID1",
+	[STRIPE_INFO_RAID10] = "RAID10",
+	[STRIPE_INFO_RAID56_DATA] = "DATA",
+	[STRIPE_INFO_RAID56_OTHER] = "OTHER",
+	[STRIPE_INFO_RAID56_PARITY] = "PARITY",
+};
+
+struct stripe_info {
+	u64 devid;
+	const char *dname;
+	u64 phy_start;
+	int type;
+};
+
+static void add_stripe_info(struct stripe_info **list, int *count,
+	u64 devid, const char *dname, u64 phy_start, int type) {
+
+	if (*list == NULL)
+		*count = 0;
+
+	++*count;
+	*list = realloc(*list, sizeof(struct stripe_info) * *count);
+	/*
+	 * It is rude, but it should not happen for this kind of allocation...
+	 * ... and anyway when it happens, there are more severe problems
+	 * than this handling of "not enough memory"
+	 */
+	if (*list == NULL) {
+		error("Not enough memory: abort\n");
+		exit(100);
+	}
+
+	(*list)[*count-1].devid = devid;
+	(*list)[*count-1].dname = dname;
+	(*list)[*count-1].phy_start = phy_start;
+	(*list)[*count-1].type = type;
+}
+
+static void dump_stripes(int ndisks, struct btrfs_ioctl_dev_info_args *disks,
+			 struct btrfs_chunk *chunk, u64 logical_start,
+			 struct stripe_info **stripes_ret, int *stripes_count) {
+	struct btrfs_stripe *stripes;
+
+	stripes = &chunk->stripe;
+
+	if ((chunk->type & BTRFS_BLOCK_GROUP_PROFILE_MASK) == 0) {
+		/* LINEAR: each chunk has (should have) only one disk */
+		int j;
+		char *dname = "<NOT FOUND>";
+
+		assert(chunk->num_stripes == 1);
+
+		u64 phy_start = stripes[0].offset +
+			+logical_start;
+		for (j = 0 ; j < ndisks ; j++) {
+			if (stripes[0].devid == disks[j].devid) {
+				dname = (char *)disks[j].path;
+				break;
+			}
+		}
+
+		add_stripe_info(stripes_ret, stripes_count,
+			stripes[0].devid, dname, phy_start,
+			STRIPE_INFO_LINEAR);
+	} else if (chunk->type & BTRFS_BLOCK_GROUP_RAID0) {
+		/*
+		 * RAID0: each chunk is composed by more disks;
+		 * each stripe_len bytes are in a different disk:
+		 *
+		 *  file: ABC...NMOP....
+		 *
+		 *      disk1   disk2    disk3  .... disksN
+		 *
+		 *        A      B         C    ....    N
+		 *        M      O         P    ....
+		 *
+		 */
+		u64 disks_number = chunk->num_stripes;
+		u64 disk_stripe_size = chunk->stripe_len;
+		u64 stripe_capacity;
+		u64 stripe_nr;
+		u64 disk_stripe_start;
+		int sidx;
+		int j;
+		char *dname = "<NOT FOUND>";
+
+		stripe_capacity = disks_number * disk_stripe_size;
+		stripe_nr = logical_start / stripe_capacity;
+		disk_stripe_start = logical_start % disk_stripe_size;
+
+		sidx = (logical_start / disk_stripe_size) % disks_number;
+
+		u64 phy_start = stripes[sidx].offset +
+			stripe_nr * disk_stripe_size +
+			disk_stripe_start;
+
+		for (j = 0 ; j < ndisks ; j++) {
+			if (stripes[sidx].devid == disks[j].devid) {
+				dname = (char *)disks[j].path;
+				break;
+			}
+		}
+
+		add_stripe_info(stripes_ret, stripes_count,
+			stripes[sidx].devid, dname, phy_start,
+			STRIPE_INFO_RAID0);
+
+	} else if (chunk->type & BTRFS_BLOCK_GROUP_RAID1) {
+		/*
+		 * RAID1: each chunk is mirrored on more disks;
+		 * each disk holds a full copy of the data:
+		 *
+		 *  file: ABC...
+		 *
+		 *      disk1   disk2   disk3  ....
+		 *
+		 *        A       A
+		 *        B       B
+		 *        C       C
+		 *
+		 */
+		int sidx;
+
+		for (sidx = 0; sidx < chunk->num_stripes; sidx++) {
+			int j;
+			char *dname = "<NOT FOUND>";
+			u64 phy_start = stripes[sidx].offset +
+				+logical_start;
+
+			for (j = 0 ; j < ndisks ; j++) {
+				if (stripes[sidx].devid == disks[j].devid) {
+					dname = (char *)disks[j].path;
+					break;
+				}
+			}
+			add_stripe_info(stripes_ret, stripes_count,
+				stripes[sidx].devid, dname, phy_start,
+				STRIPE_INFO_RAID1);
+		}
+
+	} else if (chunk->type & BTRFS_BLOCK_GROUP_DUP) {
+		/*
+		 * DUP: each chunk has 'num_stripes' disk_stripe. Each
+		 * disk_stripe has its own copy of data
+		 *
+		 *  file: ABCD....
+		 *
+		 *      disk1   disk2   disk3
+		 *
+		 *        A
+		 *        B
+		 *        C
+		 *      [...]
+		 *        A
+		 *        B
+		 *        C
+		 *
+		 *
+		 * NOTE: the difference between DUP and RAID1 is that
+		 * in RAID1 each disk_stripe is in a different disk, in DUP
+		 * each disk chunk is in the same disk
+		 */
+		int sidx;
+
+		for (sidx = 0; sidx < chunk->num_stripes; sidx++) {
+			int j;
+			char *dname = "<NOT FOUND>";
+			u64 phy_start = stripes[sidx].offset +
+				+logical_start;
+
+			for (j = 0 ; j < ndisks ; j++) {
+				if (stripes[sidx].devid == disks[j].devid) {
+					dname = (char *)disks[j].path;
+					break;
+				}
+			}
+
+			add_stripe_info(stripes_ret, stripes_count,
+				stripes[sidx].devid, dname, phy_start,
+				STRIPE_INFO_DUP);
+		}
+	} else if (chunk->type & BTRFS_BLOCK_GROUP_RAID10) {
+		/*
+		 * RAID10: each chunk is composed by more disks;
+		 * each stripe_len bytes are in a different disk:
+		 *
+		 *  file: ABCD....
+		 *
+		 *      disk1   disk2   disk3   disk4
+		 *
+		 *        A      A         B      B
+		 *        C      C         D      D
+		 *
+		 *
+		 */
+		int i;
+		u64 disks_number = chunk->num_stripes;
+		u64 disk_stripe_size = chunk->stripe_len;
+		u64 stripe_capacity;
+		u64 stripe_nr;
+		u64 stripe_start;
+		u64 disk_stripe_start;
+
+		stripe_capacity = disks_number * disk_stripe_size / chunk->sub_stripes;
+		stripe_nr = logical_start / stripe_capacity;
+		stripe_start = logical_start % stripe_capacity;
+		disk_stripe_start = logical_start % disk_stripe_size;
+
+		for (i = 0; i < chunk->sub_stripes; i++) {
+			int j;
+			char *dname = "<NOT FOUND>";
+			int sidx = (i +
+				stripe_start/disk_stripe_size*chunk->sub_stripes) %
+				disks_number;
+
+			u64 phy_start = stripes[sidx].offset +
+				+stripe_nr*disk_stripe_size + disk_stripe_start;
+
+			for (j = 0 ; j < ndisks ; j++) {
+				if (stripes[sidx].devid == disks[j].devid) {
+					dname = (char *)disks[j].path;
+					break;
+				}
+			}
+
+			add_stripe_info(stripes_ret, stripes_count,
+				stripes[sidx].devid, dname, phy_start,
+				STRIPE_INFO_RAID10);
+
+		}
+	} else if (chunk->type & BTRFS_BLOCK_GROUP_RAID5 ||
+			chunk->type & BTRFS_BLOCK_GROUP_RAID6) {
+		/*
+		 * RAID5: each chunk is spread on a different disk; however one
+		 * disk is used for parity
+		 *
+		 *  file: ABCDEFGHIJK....
+		 *
+		 *      disk1  disk2  disk3  disk4  disk5
+		 *
+		 *        A      B      C      D      P
+		 *        P      D      E      F      G
+		 *        H      P      I      J      K
+		 *
+		 *   Note: P == parity
+		 *
+		 * RAID6: each chunk is spread on a different disk; however two
+		 * disks are used for parity
+		 *
+		 *  file: ABCDEFGHI...
+		 *
+		 *      disk1  disk2  disk3  disk4  disk5
+		 *
+		 *        A      B      C      P      Q
+		 *        Q      D      E      F      P
+		 *        P      Q      G      H      I
+		 *
+		 *   Note: P,Q == parity
+		 *
+		 */
+		int parities_nr = 1;
+		u64 disks_number = chunk->num_stripes;
+		u64 disk_stripe_size = chunk->stripe_len;
+		u64 stripe_capacity;
+		u64 stripe_nr;
+		u64 stripe_start;
+		u64 pos = 0;
+		u64 disk_stripe_start;
+		int sidx;
+
+		if (chunk->type & BTRFS_BLOCK_GROUP_RAID6)
+			parities_nr = 2;
+
+		stripe_capacity = (disks_number - parities_nr) *
+						disk_stripe_size;
+		stripe_nr = logical_start / stripe_capacity;
+		stripe_start = logical_start % stripe_capacity;
+		disk_stripe_start = logical_start % disk_stripe_size;
+
+		for (sidx = 0; sidx < disks_number ; sidx++) {
+			int j;
+			char *dname = "<NOT FOUND>";
+			u64 stripe_index = (sidx + stripe_nr) % disks_number;
+			u64 phy_start = stripes[stripe_index].offset + /* chunk start */
+				+ stripe_nr*disk_stripe_size +  /* stripe start */
+				+ disk_stripe_start;
+
+			for (j = 0 ; j < ndisks ; j++)
+				if (stripes[stripe_index].devid == disks[j].devid) {
+					dname = (char *)disks[j].path;
+					break;
+				}
+
+			if (sidx >= (disks_number - parities_nr)) {
+				add_stripe_info(stripes_ret, stripes_count,
+					stripes[stripe_index].devid, dname, phy_start,
+					STRIPE_INFO_RAID56_PARITY);
+				continue;
+			}
+
+			if (stripe_start >= pos && stripe_start < (pos+disk_stripe_size)) {
+				add_stripe_info(stripes_ret, stripes_count,
+					stripes[stripe_index].devid, dname, phy_start,
+					STRIPE_INFO_RAID56_DATA);
+			} else {
+				add_stripe_info(stripes_ret, stripes_count,
+					stripes[stripe_index].devid, dname, phy_start,
+					STRIPE_INFO_RAID56_OTHER);
+			}
+
+			pos += disk_stripe_size;
+		}
+		assert(pos == stripe_capacity);
+	} else {
+		error("Unknown chunk type = 0x%016llx\n", chunk->type);
+		return;
+	}
+
+}
+
+static int get_chunk_offset(int fd, u64 logical_start,
+	struct btrfs_chunk *chunk_ret, u64 *off_ret) {
+
+	struct btrfs_ioctl_search_args args;
+	struct btrfs_ioctl_search_key *sk = &args.key;
+	struct btrfs_ioctl_search_header sh;
+	unsigned long off = 0;
+	int i;
+
+	memset(&args, 0, sizeof(args));
+	sk->tree_id = BTRFS_CHUNK_TREE_OBJECTID;
+	sk->min_objectid = BTRFS_FIRST_CHUNK_TREE_OBJECTID;
+	sk->max_objectid = BTRFS_FIRST_CHUNK_TREE_OBJECTID;
+	sk->min_type = BTRFS_CHUNK_ITEM_KEY;
+	sk->max_type = BTRFS_CHUNK_ITEM_KEY;
+	sk->max_offset = (u64)-1;
+	sk->min_offset = 0;
+	sk->max_transid = (u64)-1;
+
+	while (1) {
+		int ret;
+
+		sk->nr_items = 1;
+		ret = ioctl(fd, BTRFS_IOC_TREE_SEARCH, &args);
+		if (ret < 0)
+			return -errno;
+
+		if (sk->nr_items == 0)
+			break;
+
+		off = 0;
+		for (i = 0; i < sk->nr_items; i++) {
+			struct btrfs_chunk *item;
+
+			memcpy(&sh, args.buf + off, sizeof(sh));
+			off += sizeof(sh);
+			item = (struct btrfs_chunk *)(args.buf + off);
+			off += sh.len;
+
+			if (logical_start >= sh.offset &&
+			    logical_start < sh.offset+item->length) {
+				memcpy(chunk_ret, item, sh.len);
+				*off_ret = logical_start-sh.offset;
+				return 0;
+			}
+
+			sk->min_objectid = sh.objectid;
+			sk->min_type = sh.type;
+			sk->min_offset = sh.offset;
+		}
+
+		if (sk->min_offset < (u64)-1)
+			sk->min_offset++;
+		else
+			break;
+	}
+
+	return 1; /* not found */
+}
+
+/*
+ * Inline extents are skipped because they do not take data space,
+ * delalloc and unknown are skipped because we do not know how much
+ * space they will use yet.
+ */
+#define SKIP_FLAGS	(FIEMAP_EXTENT_UNKNOWN|FIEMAP_EXTENT_DELALLOC| \
+			 FIEMAP_EXTENT_DATA_INLINE)
+static int cmd_inspect_physical_find(int argc, char **argv)
+{
+	int ret = 0;
+	u64 logical = 0ull;
+	int fd = -1;
+	int last = 0;
+	char buf[16384];
+	char *fname;
+	int found = 0;
+	struct fiemap *fiemap = (struct fiemap *)buf;
+	struct fiemap_extent *fm_ext;
+	const int count = (sizeof(buf) - sizeof(*fiemap)) /
+					sizeof(struct fiemap_extent);
+	struct btrfs_ioctl_dev_info_args *disks = NULL;
+	struct btrfs_ioctl_fs_info_args fi_args = {0};
+	char btrfs_chunk_data[4096];
+	struct btrfs_chunk *chunk_item = (struct btrfs_chunk *)&btrfs_chunk_data;
+	u64 chunk_offset = 0;
+	int minargc = 1;
+	struct stripe_info *stripes = NULL;
+	int stripes_count = 0;
+	int i;
+	int rc;
+
+	memset(fiemap, 0, sizeof(struct fiemap));
+
+	if (check_argc_min(argc - minargc, 1) || check_argc_max(argc - minargc, 2))
+		usage(cmd_inspect_physical_find_usage);
+
+	if (argc - minargc == 2)
+		logical = strtoull(argv[minargc+1], NULL, 0);
+	fname = argv[minargc];
+
+	check_root_or_exit();
+	check_btrfs_or_exit(fname);
+
+	printf("%s: %llu\n", fname, logical);
+
+	fd = open(fname, O_RDONLY);
+	if (fd < 0) {
+		error("Can't open '%s' for reading\n", fname);
+		ret = -errno;
+		goto out;
+	}
+
+	do {
+
+		int rc;
+		int j;
+
+		fiemap->fm_length = ~0ULL;
+		fiemap->fm_extent_count = count;
+		fiemap->fm_flags = FIEMAP_FLAG_SYNC;
+		rc = ioctl(fd, FS_IOC_FIEMAP, (unsigned long) fiemap);
+		if (rc < 0) {
+			error("Can't do ioctl()\n");
+			ret = -errno;
+			goto out;
+		}
+
+		for (j = 0; j < fiemap->fm_mapped_extents; j++) {
+			u32 flags;
+
+			fm_ext = &fiemap->fm_extents[j];
+			flags = fm_ext->fe_flags;
+
+			fiemap->fm_start = (fm_ext->fe_logical +
+					fm_ext->fe_length);
+
+			if (flags & FIEMAP_EXTENT_LAST)
+				last = 1;
+
+			if (flags & SKIP_FLAGS)
+				continue;
+
+			if (logical > fm_ext->fe_logical +
+			    fm_ext->fe_length)
+				continue;
+
+			found = 1;
+			break;
+		}
+	} while (last == 0 && found == 0);
+
+
+	if (!found) {
+		error("Can't find the extent: the file is too short, or the file is stored in a leaf.\n");
+		ret = 10;
+		goto out;
+	}
+
+	rc = get_fs_info(fname, &fi_args, &disks);
+	if (rc < 0) {
+		error("Cannot get info for the filesystem: may be it is not a btrfs filesystem ?\n");
+		ret = 12;
+		goto out;
+	}
+
+	rc = get_chunk_offset(fd,
+		fm_ext->fe_physical + logical - fm_ext->fe_logical,
+		chunk_item, &chunk_offset);
+	if (rc < 0) {
+		error("cannot perform the search: %s", strerror(-rc));
+		ret = 13;
+		goto out;
+	}
+	if (rc != 0) {
+		error("cannot find chunk\n");
+		ret = 14;
+		goto out;
+	}
+
+	dump_stripes(fi_args.num_devices, disks,
+		     chunk_item, chunk_offset,
+		     &stripes, &stripes_count);
+
+	for (i = 0 ; i < stripes_count ; i++) {
+		printf("devid: %llu dev_name: %s offset: %llu type: %s\n",
+			stripes[i].devid, stripes[i].dname,
+			stripes[i].phy_start,
+			stripe_info_descr[stripes[i].type]);
+	}
+
+out:
+	if (fd != -1)
+		close(fd);
+	if (disks != NULL)
+		free(disks);
+	if (stripes != NULL)
+		free(stripes);
+	return ret;
+}
+
 static const char inspect_cmd_group_info[] =
 "query various internal information";
 
@@ -644,6 +1192,8 @@ const struct cmd_group inspect_cmd_group = {
 				cmd_inspect_dump_super_usage, NULL, 0 },
 		{ "tree-stats", cmd_inspect_tree_stats,
 				cmd_inspect_tree_stats_usage, NULL, 0 },
+		{ "physical-find", cmd_inspect_physical_find,
+				cmd_inspect_physical_find_usage, NULL, 0 },
 		NULL_CMD_STRUCT
 	}
 };
-- 
2.8.1



* [PATCH 3/5] new command btrfs inspect physical-dump
From: Goffredo Baroncelli @ 2016-07-24 11:03 UTC
  To: linux-btrfs; +Cc: dsterba, Chris Mason, Goffredo Baroncelli

From: Goffredo Baroncelli <kreijack@inwind.it>

The aim of this command is to dump the disk content of a file, bypassing the
btrfs filesystem. This could help to test the btrfs filesystem.
The dump size is a page (4k), even if the file is shorter. It is possible
to set an offset for the file portion to read, but this offset must be a
multiple of 4k.

With the switch -c, it is possible to select which copy will be
dumped (RAID1/RAID10/DUP).
With the switch -p, it is possible to select which parity will
be dumped (RAID5/RAID6).
With the switch -s, it is possible to dump another element of the
stripe (RAID5/RAID6).
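
Reading one 4k page at a given device offset can be sketched as below. This is
a standalone illustration and not the dumpfile() function of this patch:
read_page_at() and DUMP_SIZE are hypothetical names, and the sketch uses
pread(2) and stops at end of file instead of retrying.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define DUMP_SIZE 4096

/*
 * Read up to DUMP_SIZE bytes from 'path' at byte offset 'off' into 'buf'.
 * Returns the number of bytes read (less than DUMP_SIZE only at end of
 * file or device), or -errno on failure.
 */
static ssize_t read_page_at(const char *path, off_t off, unsigned char *buf)
{
	ssize_t total = 0;
	int fd;

	assert(off % DUMP_SIZE == 0);	/* same 4k alignment rule as <off> */

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -errno;

	while (total < DUMP_SIZE) {
		ssize_t r = pread(fd, buf + total, DUMP_SIZE - total,
				  off + total);

		if (r < 0) {
			if (errno == EINTR)
				continue;	/* interrupted: retry */
			total = -errno;
			break;
		}
		if (r == 0)	/* end of file or device: stop, do not loop */
			break;
		total += r;
	}

	close(fd);
	return total;
}
```

Handling a zero-length read explicitly matters when the page lies at the very
end of a device or of a regular file used for testing, because otherwise the
loop would never terminate.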

# btrfs insp physical-dump /bin/ls 8192 | xxd
/bin/ls: 8192
file: /dev/sda3 off=16600629248
00000000: b0e2 6100 0000 0000 0700 0000 5200 0000  ..a.........R...
00000010: 0000 0000 0000 0000 b8e2 6100 0000 0000  ..........a.....
00000020: 0700 0000 5300 0000 0000 0000 0000 0000  ....S...........
00000030: c0e2 6100 0000 0000 0700 0000 5400 0000  ..a.........T...
[...]


Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
---
 cmds-inspect.c | 342 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 342 insertions(+)

diff --git a/cmds-inspect.c b/cmds-inspect.c
index dd0570b..48243ec 100644
--- a/cmds-inspect.c
+++ b/cmds-inspect.c
@@ -1171,6 +1171,346 @@ out:
 	return ret;
 }
 
+static const char * const cmd_inspect_physical_dump_usage[] = {
+	"btrfs inspect-internal physical-dump [-c <copynr>|-s <stripenr>|-p <paritynr>] <path> [<off>]",
+	"Dump the physical content of a file offset",
+	"<path>      file to dump",
+	"<off>       file offset to dump; 0 if not specified",
+	"<copynr>    number of copy to dump (for raid1,dup/raid10)",
+	"<paritynr>  number of parity to dump (for raid5/raid6)",
+	"<stripenr>  number of stripe element to dump (for raid5/raid6)",
+	"This command requires root privileges",
+	NULL
+};
+
+static int dumpfile(const char *fname, u64 off)
+{
+	int fd = -1;
+	int size = 4096;
+	char buf[size];
+	int r;
+	int e = 0;
+	off_t r1;
+
+	fprintf(stderr, "file: %s off=%llu\n", fname, off);
+
+	fd = open(fname, O_RDONLY|O_APPEND);
+	if (fd < 0) {
+		int e = errno;
+
+		error("cannot open file: '%s'\n", strerror(e));
+		return -e;
+	}
+
+	r1 = lseek(fd, off, SEEK_SET);
+	if (r1 == (off_t)-1) {
+		e = -errno;
+		error("cannot seek file: '%s'\n", strerror(-e));
+		goto out;
+	}
+
+	while (size) {
+		r = read(fd, buf, size);
+		if (r < 0) {
+			e = -errno;
+			error("cannot read file: '%s'\n", strerror(-e));
+			goto out;
+		}
+
+		size -= r;
+		r = fwrite(buf, r, 1, stdout);
+		if (r < 0) {
+			e = -errno;
+			error("cannot write: '%s'\n", strerror(-e));
+			goto out;
+		}
+
+	}
+
+out:
+	if (fd != -1)
+		close(fd);
+	return e;
+}
+
+static int cmd_inspect_physical_dump(int argc, char **argv)
+{
+	int ret = 0;
+	u64 logical = 0ull;
+	int fd;
+	int last = 0;
+	char buf[16384];
+	char *fname;
+	int found = 0;
+	struct fiemap *fiemap = (struct fiemap *)buf;
+	struct fiemap_extent *fm_ext = &fiemap->fm_extents[0];
+	const int count = (sizeof(buf) - sizeof(*fiemap)) /
+			sizeof(struct fiemap_extent);
+	u64 profile_type;
+	struct btrfs_ioctl_dev_info_args *disks = NULL;
+	struct btrfs_ioctl_fs_info_args fi_args = {0};
+	char btrfs_chunk_data[4096];
+	struct btrfs_chunk *chunk_item = (struct btrfs_chunk *)&btrfs_chunk_data;
+	u64 chunk_offset = 0;
+	struct stripe_info *stripes = NULL;
+	int stripes_count = 0;
+	int rc;
+	int copynr = 0;
+	int paritynr = -1;
+	int stripenr = -1;
+
+	optind = 1;
+	while (1) {
+		int c = getopt(argc, argv, "c:p:s:");
+
+		if (c < 0)
+			break;
+
+		switch (c) {
+		case 'c':
+			copynr = atoi(optarg);
+			break;
+		case 'p':
+			paritynr = atoi(optarg);
+			break;
+		case 's':
+			stripenr = atoi(optarg);
+			break;
+		default:
+			usage(cmd_inspect_physical_dump_usage);
+		}
+	}
+
+	if (check_argc_min(argc - optind, 1) ||
+	    check_argc_max(argc - optind, 3))
+		usage(cmd_inspect_physical_dump_usage);
+
+	if (argc - optind == 2)
+		logical = strtoull(argv[optind+1], NULL, 0);
+
+	if (logical % 4096) {
+		error("<off> must be multiple of 4096 !");
+		return 11;
+	}
+
+	fname = argv[optind];
+
+	check_root_or_exit();
+	check_btrfs_or_exit(fname);
+
+	fprintf(stderr, "%s: %llu\n", fname, logical);
+
+	fd = open(fname, O_RDONLY|O_DIRECT);
+	if (fd < 0) {
+		error("Can't open '%s' for reading.\n", fname);
+		ret = -errno;
+		goto out;
+	}
+
+	do {
+
+		int rc;
+		int j;
+
+		fiemap->fm_length = ~0ULL;
+		fiemap->fm_extent_count = count;
+		fiemap->fm_flags = FIEMAP_FLAG_SYNC;
+		rc = ioctl(fd, FS_IOC_FIEMAP, (unsigned long) fiemap);
+		if (rc < 0) {
+			error("Can't do ioctl()\n");
+			ret = -errno;
+			goto out;
+		}
+
+		for (j = 0; j < fiemap->fm_mapped_extents; j++) {
+			u32 flags;
+
+			fm_ext = &fiemap->fm_extents[j];
+			flags = fm_ext->fe_flags;
+
+			fiemap->fm_start = (fm_ext->fe_logical +
+				fm_ext->fe_length);
+
+			if (flags & FIEMAP_EXTENT_LAST)
+				last = 1;
+
+			if (flags & SKIP_FLAGS)
+				continue;
+
+			if (logical > fm_ext->fe_logical +
+			    fm_ext->fe_length)
+				continue;
+
+			found = 1;
+			break;
+		}
+	} while (last == 0 && found == 0);
+
+
+	if (!found) {
+		error("Can't find the extent: the file is too short, or the file is stored in a leaf.\n");
+		ret = 10;
+		goto out;
+	}
+
+	rc = get_fs_info(fname, &fi_args, &disks);
+	if (rc < 0) {
+		error("Cannot get info for the filesystem: may be it is not a btrfs filesystem ?\n");
+		ret = 12;
+		goto out;
+	}
+
+	rc = get_chunk_offset(fd,
+		fm_ext->fe_physical + logical - fm_ext->fe_logical,
+		chunk_item, &chunk_offset);
+	if (rc < 0) {
+		error("cannot perform the search: %s", strerror(-rc));
+		ret = 13;
+		goto out;
+	}
+	if (rc != 0) {
+		error("cannot find chunk\n");
+		ret = 14;
+		goto out;
+	}
+
+	dump_stripes(fi_args.num_devices, disks,
+		     chunk_item, chunk_offset,
+		     &stripes, &stripes_count);
+
+	profile_type = chunk_item->type & BTRFS_BLOCK_GROUP_PROFILE_MASK;
+	if (profile_type == 0 || profile_type & BTRFS_BLOCK_GROUP_RAID0) {
+
+		if (copynr != 0) {
+			error("-c <copynr> is not valid for profile '%s'\n",
+			      btrfs_group_profile_str(profile_type));
+			ret = 16;
+			goto out;
+		}
+		if (stripenr != -1) {
+			error("-s <stripenr> is not valid for profile '%s'\n",
+			      btrfs_group_profile_str(profile_type));
+			ret = 16;
+			goto out;
+		}
+		if (paritynr != -1) {
+			error("-p <paritynr> is not valid for profile '%s'\n",
+			      btrfs_group_profile_str(profile_type));
+			ret = 16;
+			goto out;
+		}
+
+		ret = dumpfile(stripes[0].dname, stripes[0].phy_start);
+
+	} else if (profile_type & BTRFS_BLOCK_GROUP_RAID1 ||
+			profile_type & BTRFS_BLOCK_GROUP_DUP ||
+			profile_type & BTRFS_BLOCK_GROUP_RAID10) {
+
+		if (stripenr != -1) {
+			error("-s <stripenr> is not valid for profile '%s'\n",
+			      btrfs_group_profile_str(profile_type));
+			ret = 16;
+			goto out;
+		}
+		if (paritynr != -1) {
+			error("-p <paritynr> is not valid for profile '%s'\n",
+			      btrfs_group_profile_str(profile_type));
+			ret = 16;
+			goto out;
+		}
+		if (copynr < 0 || copynr > 1) {
+			error("<copynr>=%d is not valid for profile '%s'\n",
+			      copynr, btrfs_group_profile_str(profile_type));
+			ret = 16;
+			goto out;
+		}
+
+		ret = dumpfile(stripes[copynr].dname, stripes[copynr].phy_start);
+
+	} else if (profile_type & BTRFS_BLOCK_GROUP_RAID5 ||
+		   profile_type & BTRFS_BLOCK_GROUP_RAID6) {
+
+		int maxparity = 0;
+		int stripeid = -1;
+
+		if (profile_type & BTRFS_BLOCK_GROUP_RAID6)
+			maxparity = 1;
+
+		if (copynr != 0) {
+			error("-c <copynr> is not valid for profile '%s'\n",
+			      btrfs_group_profile_str(profile_type));
+			ret = 16;
+			goto out;
+		}
+		if (paritynr != -1 && stripenr != -1) {
+			error("You cannot pass both -p <paritynr> and -s <stripenr> for profile '%s'\n",
+				btrfs_group_profile_str(profile_type));
+			ret = 16;
+			goto out;
+		}
+		if (paritynr < -1 || paritynr > maxparity) {
+			error("<paritynr>=%d is not valid for profile '%s'\n",
+				paritynr, btrfs_group_profile_str(profile_type));
+			ret = 16;
+			goto out;
+		}
+		if (stripenr < -1 || stripenr > (stripes_count - maxparity - 3)) {
+			error("<stripenr>=%d is not valid for profile '%s' [%d disks]\n",
+				stripenr, btrfs_group_profile_str(profile_type),
+				stripes_count);
+			ret = 16;
+			goto out;
+		}
+		if (stripenr == -1 && paritynr == -1) {
+			int i;
+
+			for (i = 0 ; i < stripes_count ; i++) {
+				if (stripes[i].type == STRIPE_INFO_RAID56_DATA) {
+					stripeid = i;
+					break;
+				}
+			}
+		} else if (paritynr != -1) {
+			int i;
+
+			for (i = 0 ; i < stripes_count ; i++) {
+				if (stripes[i].type == STRIPE_INFO_RAID56_PARITY)
+					--paritynr;
+				if (paritynr == -1) {
+					stripeid = i;
+					break;
+				}
+			}
+		} else {
+			int i;
+
+			for (i = 0 ; i < stripes_count ; i++) {
+				if (stripes[i].type == STRIPE_INFO_RAID56_OTHER)
+					--stripenr;
+				if (stripenr == -1) {
+					stripeid = i;
+					break;
+				}
+			}
+		}
+
+		assert(stripeid >= 0 && stripeid < stripes_count);
+
+		ret = dumpfile(stripes[stripeid].dname,
+			       stripes[stripeid].phy_start);
+
+	}
+
+out:
+	if (fd != -1)
+		close(fd);
+	if (disks != NULL)
+		free(disks);
+	if (stripes != NULL)
+		free(stripes);
+	return ret;
+}
+
 static const char inspect_cmd_group_info[] =
 "query various internal information";
 
@@ -1194,6 +1534,8 @@ const struct cmd_group inspect_cmd_group = {
 				cmd_inspect_tree_stats_usage, NULL, 0 },
 		{ "physical-find", cmd_inspect_physical_find,
 				cmd_inspect_physical_find_usage, NULL, 0 },
+		{ "physical-dump", cmd_inspect_physical_dump,
+				cmd_inspect_physical_dump_usage, NULL, 0 },
 		NULL_CMD_STRUCT
 	}
 };
-- 
2.8.1



* [PATCH 4/5] Add man page for command btrfs insp physical-find
  2016-07-24 11:03 [BTRFS-PROGS][PATCH] Add two new commands: 'btrfs insp physical-find' and 'btrfs insp physical-dump' Goffredo Baroncelli
                   ` (2 preceding siblings ...)
  2016-07-24 11:03 ` [PATCH 3/5] new command btrfs inspect physical-dump Goffredo Baroncelli
@ 2016-07-24 11:03 ` Goffredo Baroncelli
  2016-07-24 11:03 ` [PATCH 5/5] Add new command to man pages: btrfs insp physical-dump Goffredo Baroncelli
  2016-07-25  2:14 ` [BTRFS-PROGS][PATCH] Add two new commands: 'btrfs insp physical-find' and 'btrfs insp physical-dump' Qu Wenruo
  5 siblings, 0 replies; 9+ messages in thread
From: Goffredo Baroncelli @ 2016-07-24 11:03 UTC (permalink / raw)
  To: linux-btrfs; +Cc: dsterba, Chris Mason, Goffredo Baroncelli

From: Goffredo Baroncelli <kreijack@inwind.it>

Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
---
 Documentation/btrfs-inspect-internal.asciidoc | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/Documentation/btrfs-inspect-internal.asciidoc b/Documentation/btrfs-inspect-internal.asciidoc
index 74f6dea..35e2237 100644
--- a/Documentation/btrfs-inspect-internal.asciidoc
+++ b/Documentation/btrfs-inspect-internal.asciidoc
@@ -146,6 +146,11 @@ Print sizes and statistics of trees.
 -b::::
 Print raw numbers in bytes.
 
+*physical-find* <path> [<off>]::
+(needs root privileges)
++
+Show the placement of a given file (at offset 'off', default 0) on the disks.
+
 EXIT STATUS
 -----------
 *btrfs inspect-internal* returns a zero exit status if it succeeds. Non zero is
-- 
2.8.1



* [PATCH 5/5] Add new command to man pages: btrfs insp physical-dump
  2016-07-24 11:03 [BTRFS-PROGS][PATCH] Add two new commands: 'btrfs insp physical-find' and 'btrfs insp physical-dump' Goffredo Baroncelli
                   ` (3 preceding siblings ...)
  2016-07-24 11:03 ` [PATCH 4/5] Add man page for command btrfs insp physical-find Goffredo Baroncelli
@ 2016-07-24 11:03 ` Goffredo Baroncelli
  2016-07-25  2:14 ` [BTRFS-PROGS][PATCH] Add two new commands: 'btrfs insp physical-find' and 'btrfs insp physical-dump' Qu Wenruo
  5 siblings, 0 replies; 9+ messages in thread
From: Goffredo Baroncelli @ 2016-07-24 11:03 UTC (permalink / raw)
  To: linux-btrfs; +Cc: dsterba, Chris Mason, Goffredo Baroncelli

From: Goffredo Baroncelli <kreijack@inwind.it>

Signed-off-by: Goffredo Baroncelli <kreijack@inwind.it>
---
 Documentation/btrfs-inspect-internal.asciidoc | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/Documentation/btrfs-inspect-internal.asciidoc b/Documentation/btrfs-inspect-internal.asciidoc
index 35e2237..0497d08 100644
--- a/Documentation/btrfs-inspect-internal.asciidoc
+++ b/Documentation/btrfs-inspect-internal.asciidoc
@@ -151,6 +151,17 @@ Print raw numbers in bytes.
 +
 Show the placement of a given file (at offset 'off', default 0) on the disks.
 
+*physical-dump* [-c <copynr>|-s <stripenr>|-p <paritynr>] <path> [<off>]::
+(needs root privileges)
++
+Dump the disk content of a given file (at offset 'off', default 0).
+For RAID1/RAID10/DUP, 'copynr' selects which copy will be dumped. For
+RAID5/RAID6, 'paritynr' specifies which parity will be dumped. For
+RAID5/RAID6, 'stripenr' specifies which stripe element will be dumped.
++
+'off' must be a multiple of 4096. 4096 bytes are dumped, even if the file
+is shorter.
+
 EXIT STATUS
 -----------
 *btrfs inspect-internal* returns a zero exit status if it succeeds. Non zero is
-- 
2.8.1
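
To make the option rules in the two man page entries above concrete, here
is a minimal sketch in C of the checks they describe. It is not the code
from patch 3/5: the enum names and the helpers offset_ok, selectors_ok and
pick_element are assumptions made up for illustration. As in the patch, -1
stands for "option not given" for stripenr and paritynr, and copynr
defaults to 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for this sketch; not the btrfs-progs constants. */
enum profile { P_SINGLE, P_RAID0, P_RAID1, P_DUP, P_RAID10, P_RAID5, P_RAID6 };
enum elem_type { E_DATA, E_OTHER, E_PARITY };

/* <off> must be a multiple of the 4096-byte dump size. */
static bool offset_ok(uint64_t off)
{
	return off % 4096 == 0;
}

/* Check the -c/-s/-p values against the profile of the chunk. */
static bool selectors_ok(enum profile p, int nstripes,
			 int copynr, int stripenr, int paritynr)
{
	int nparity, nother;

	switch (p) {
	case P_SINGLE:
	case P_RAID0:
		return copynr == 0 && stripenr == -1 && paritynr == -1;
	case P_RAID1:
	case P_DUP:
	case P_RAID10:
		return stripenr == -1 && paritynr == -1 &&
		       copynr >= 0 && copynr <= 1;
	case P_RAID5:
	case P_RAID6:
		nparity = (p == P_RAID6) ? 2 : 1;
		nother = nstripes - nparity - 1;	/* one element is DATA */
		if (copynr != 0 || (stripenr != -1 && paritynr != -1))
			return false;
		return paritynr >= -1 && paritynr < nparity &&
		       stripenr >= -1 && stripenr < nother;
	}
	return false;
}

/* Index of the nth (0-based) element of the wanted type, or -1 if none. */
static int pick_element(const enum elem_type *t, int n,
			enum elem_type want, int nth)
{
	for (int i = 0; i < n; i++)
		if (t[i] == want && nth-- == 0)
			return i;
	return -1;
}
```

With the five-device RAID6 layout from the cover letter (OTHER, OTHER,
DATA, PARITY, PARITY), pick_element(t, 5, E_PARITY, 1) yields index 4,
which is the element on /dev/loop0.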



* Re: [BTRFS-PROGS][PATCH] Add two new commands: 'btrfs insp physical-find' and 'btrfs insp physical-dump'
  2016-07-24 11:03 [BTRFS-PROGS][PATCH] Add two new commands: 'btrfs insp physical-find' and 'btrfs insp physical-dump' Goffredo Baroncelli
                   ` (4 preceding siblings ...)
  2016-07-24 11:03 ` [PATCH 5/5] Add new command to man pages: btrfs insp physical-dump Goffredo Baroncelli
@ 2016-07-25  2:14 ` Qu Wenruo
  2016-07-25 17:14   ` Goffredo Baroncelli
  5 siblings, 1 reply; 9+ messages in thread
From: Qu Wenruo @ 2016-07-25  2:14 UTC (permalink / raw)
  To: Goffredo Baroncelli, linux-btrfs; +Cc: dsterba, Chris Mason

Hi Goffredo,

At 07/24/2016 07:03 PM, Goffredo Baroncelli wrote:
> Hi all,
>
> the following patches add two new commands:
> 1) btrfs inspect-internal physical-find
> 2) btrfs inspect-internal physical-dump
>
> The aim of these two new commands is to locate (1) and dump (2) the stripe elements
> stored on the disks. I developed these two new commands to simplify the
> debugging of some RAID5 bugs (but this is another discussion).

That's a pretty nice function.
However, the function seems to be a combination of fiemap (to resolve file
to logical) and resolve logical (to resolve logical to on-device
bytenr), with RAID5/6-specific flags.


Instead of introducing new functions that combine existing
features, would you mind expanding the current logical-resolve?

IMHO, it's not a good idea to cross two different logical
layers (the file<->logical mapping layer and the logical<->dev extent mapping
layer) in one function.
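
The first of the two layers mentioned above (file offset to address) can
be sketched on its own. The following is a minimal illustration in C, not
code from the patches: it assumes the extents have already been fetched
(for example with FS_IOC_FIEMAP, as patch 3/5 does), and struct extent and
file_off_to_address are simplified, hypothetical stand-ins for struct
fiemap_extent and the lookup loop. As far as I know, the value that FIEMAP
reports as fe_physical on btrfs is a btrfs logical address, which is why a
second, logical<->device step is still needed afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct fiemap_extent. */
struct extent {
	uint64_t file_off;	/* fe_logical: offset inside the file */
	uint64_t address;	/* fe_physical: on btrfs, a logical address */
	uint64_t len;		/* fe_length */
};

/*
 * Find the extent covering file offset 'off' and translate the offset.
 * An extent covers the half-open range [file_off, file_off + len).
 * Returns 0 and stores the result in *out, or -1 if 'off' falls in a
 * hole or past the last extent.
 */
static int file_off_to_address(const struct extent *ext, size_t n,
			       uint64_t off, uint64_t *out)
{
	for (size_t i = 0; i < n; i++) {
		if (off >= ext[i].file_off &&
		    off - ext[i].file_off < ext[i].len) {
			*out = ext[i].address + (off - ext[i].file_off);
			return 0;
		}
	}
	return -1;
}
```

Note that the end of an extent is exclusive: an offset equal to
file_off + len belongs to the next extent (or to a hole), not to this one.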

Thanks,
Qu




* Re: [BTRFS-PROGS][PATCH] Add two new commands: 'btrfs insp physical-find' and 'btrfs insp physical-dump'
  2016-07-25  2:14 ` [BTRFS-PROGS][PATCH] Add two new commands: 'btrfs insp physical-find' and 'btrfs insp physical-dump' Qu Wenruo
@ 2016-07-25 17:14   ` Goffredo Baroncelli
  2016-07-26  1:32     ` Qu Wenruo
  0 siblings, 1 reply; 9+ messages in thread
From: Goffredo Baroncelli @ 2016-07-25 17:14 UTC (permalink / raw)
  To: Qu Wenruo, linux-btrfs; +Cc: dsterba, Chris Mason

On 2016-07-25 04:14, Qu Wenruo wrote:
> Hi Goffredo,
> 
> At 07/24/2016 07:03 PM, Goffredo Baroncelli wrote:
>> Hi all,
>> 
>> the following patches add two new commands: 1) btrfs
>> inspect-internal physical-find 2) btrfs inspect-internal
>> physical-dump
>> 
>> The aim of these two new commands is to locate (1) and dump (2) the
>> stripe elements stored on the disks. I developed these two new
>> commands to simplify the debugging of some RAID5 bugs (but this is
>> another discussion).
> 
> That's a pretty nice function. However, the function seems to be a
> combination of fiemap (to resolve file to logical) and resolve logical
> (to resolve logical to on-device bytenr), with RAID5/6-specific
> flags.
> 
> 
> Instead of introducing new functions that combine existing
> features, would you mind expanding the current logical-resolve?

[I suppose that you are referring to ./btrfs-map-logical and not logical-resolve]

Before developing these two utils, I was unaware of logical-resolve.

Frankly speaking, I am skeptical about extending btrfs-map-logical, which is
outside the "btrfs" command family. I consider it legacy code, which
should not be extended, because it was born as a "quick and dirty" utility.

Finally, there is a big difference between my tools and btrfs-map-logical:
the former need a mounted filesystem, the latter doesn't.

> 
> IMHO, it's not a good idea to cross two different logical
> layers (the file<->logical mapping layer and the logical<->dev extent mapping
> layer) in one function.

Indeed, that was the goal. The use case is a tool to help find
bugs in the raid5 code. Before my tools, I used some hacks to
parse the output of filefrag and combine it with the btrfs-debug-tree info.
So I developed "btrfs insp physical-find". Then Chris asked me to take a
further step, so I made 'btrfs insp physical-dump'. I think that we definitely
need a tool like "btrfs insp physical-find/dump".

If you think that btrfs-map-logical has its use case, I can develop
physical-dump/find further to start from a "logical" address instead of
a file: it would be a lot simpler for me than extending btrfs-map-logical.
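
As an illustration of what starting from a "logical" address involves,
here is a minimal sketch in C of the logical<->device step for the
simplest striped case only. It is not code from the patches and it makes
several assumptions: the chunk covering the address is already known, the
layout is RAID0-style (no parity rotation, so RAID5/6 are deliberately not
handled), and the stripe length is 64KiB. struct chunk, struct location
and map_raid0 are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define STRIPE_LEN (64 * 1024ULL)	/* assumption: 64KiB stripe length */

/* Hypothetical, simplified view of a chunk. */
struct chunk {
	uint64_t start;		/* logical start address of the chunk */
	uint64_t length;	/* logical length of the chunk */
	int num_stripes;	/* number of devices the chunk is striped on */
};

struct location {
	int stripe_index;	/* which stripe (device) of the chunk */
	uint64_t stripe_off;	/* byte offset inside that device's stripe */
};

/*
 * Map a logical address to (stripe index, offset inside the stripe) for a
 * RAID0-style chunk. Returns -1 if the address is outside the chunk.
 */
static int map_raid0(const struct chunk *c, uint64_t logical,
		     struct location *loc)
{
	uint64_t off, stripe_nr;

	if (logical < c->start || logical - c->start >= c->length)
		return -1;

	off = logical - c->start;	/* offset inside the chunk */
	stripe_nr = off / STRIPE_LEN;	/* which 64KiB element overall */
	loc->stripe_index = (int)(stripe_nr % (uint64_t)c->num_stripes);
	loc->stripe_off = (stripe_nr / (uint64_t)c->num_stripes) * STRIPE_LEN +
			  off % STRIPE_LEN;
	return 0;
}
```

The on-disk byte offset would then be the start of that stripe on its
device plus stripe_off; the per-device start is what physical-find prints
in its "offset:" column.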



-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


* Re: [BTRFS-PROGS][PATCH] Add two new commands: 'btrfs insp physical-find' and 'btrfs insp physical-dump'
  2016-07-25 17:14   ` Goffredo Baroncelli
@ 2016-07-26  1:32     ` Qu Wenruo
  0 siblings, 0 replies; 9+ messages in thread
From: Qu Wenruo @ 2016-07-26  1:32 UTC (permalink / raw)
  To: kreijack, linux-btrfs; +Cc: dsterba, Chris Mason



At 07/26/2016 01:14 AM, Goffredo Baroncelli wrote:
> On 2016-07-25 04:14, Qu Wenruo wrote:
>> Hi Goffredo,
>>
>> At 07/24/2016 07:03 PM, Goffredo Baroncelli wrote:
>>> Hi all,
>>>
>>> the following patches add two new commands: 1) btrfs
>>> inspect-internal physical-find 2) btrfs inspect-internal
>>> physical-dump
>>>
>>> The aim of these two new commands is to locate (1) and dump (2) the
>>> stripe elements stored on the disks. I developed these two new
>>> commands to simplify the debugging of some RAID5 bugs (but this is
>>> another discussion).
>>
>> That's a pretty nice function. However, the function seems to be a
>> combination of fiemap (to resolve file to logical) and resolve logical
>> (to resolve logical to on-device bytenr), with RAID5/6-specific
>> flags.
>>
>>
>> Instead of introducing new functions that combine existing
>> features, would you mind expanding the current logical-resolve?
>
> [I suppose that you are referring to ./btrfs-map-logical and not logical-resolve]

Oh sorry, I misunderstood the recent trend of merging standalone tools
into btrfs.
>
> Before developing these two utils, I was unaware of logical-resolve.
>
> Frankly speaking, I am skeptical about extending btrfs-map-logical, which is
> outside the "btrfs" command family. I consider it legacy code, which
> should not be extended, because it was born as a "quick and dirty" utility.

IIRC, there is a trend to merge such tools into inspect-internal as
subcommands.

>
> Finally, there is a big difference between my tools and btrfs-map-logical:
> the former need a mounted filesystem, the latter doesn't.

Oh, I really missed the difference.
So your tools are based on the tree search ioctl, not the offline tree search.

In that case, your tools are definitely useful, as we don't have any
online tool for that purpose yet.
>
>>
>> IMHO, it's not a good idea to cross two different logical
>> layers (the file<->logical mapping layer and the logical<->dev extent mapping
>> layer) in one function.
>
> Indeed, that was the goal. The use case is a tool to help find
> bugs in the raid5 code. Before my tools, I used some hacks to
> parse the output of filefrag and combine it with the btrfs-debug-tree info.
> So I developed "btrfs insp physical-find". Then Chris asked me to take a
> further step, so I made 'btrfs insp physical-dump'. I think that we definitely
> need a tool like "btrfs insp physical-find/dump".
>
> If you think that btrfs-map-logical has its use case, I can develop
> physical-dump/find further to start from a "logical" address instead of
> a file: it would be a lot simpler for me than extending btrfs-map-logical.

Please do it; this would make it possible to inspect metadata extents too.

It would also make the code easier to use from other subcommands.

Thanks,
Qu





Thread overview: 9+ messages
2016-07-24 11:03 [BTRFS-PROGS][PATCH] Add two new commands: 'btrfs insp physical-find' and 'btrfs insp physical-dump' Goffredo Baroncelli
2016-07-24 11:03 ` [PATCH 1/5] Add some helper functions Goffredo Baroncelli
2016-07-24 11:03 ` [PATCH 2/5] New btrfs command: "btrfs inspect physical-find" Goffredo Baroncelli
2016-07-24 11:03 ` [PATCH 3/5] new command btrfs inspect physical-dump Goffredo Baroncelli
2016-07-24 11:03 ` [PATCH 4/5] Add man page for command btrfs insp physical-find Goffredo Baroncelli
2016-07-24 11:03 ` [PATCH 5/5] Add new command to man pages: btrfs insp physical-dump Goffredo Baroncelli
2016-07-25  2:14 ` [BTRFS-PROGS][PATCH] Add two new commands: 'btrfs insp physical-find' and 'btrfs insp physical-dump' Qu Wenruo
2016-07-25 17:14   ` Goffredo Baroncelli
2016-07-26  1:32     ` Qu Wenruo
