All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/4] btrfs-progs: fix the filename sanitization of btrfs-image
@ 2024-07-18 22:33 Qu Wenruo
  2024-07-18 22:33 ` [PATCH v2 1/4] btrfs-progs: add warning for -s option " Qu Wenruo
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Qu Wenruo @ 2024-07-18 22:33 UTC (permalink / raw)
  To: linux-btrfs

[CHANGELOG]
v2:
- Various grammar fixes in the command docs
- Use `_mktemp()` and `_log()` when possible
- Use a dedicated filename list, and grep to find an exact match for
  them.

There are several bugs in btrfs-image filename sanitization:

- Ensured kernel path resolution bug
  Since during path resolution btrfs uses hash to find the child inode,
  with garbage filled DIR_ITEMs, it's definitely unable to properly
  resolve the path.

  A warning is added to the man page by the first patch.

- Only the last item got properly sanitized
  All the remaining INODE_REF/DIR_INDEX/DIR_ITEM are not sanitized at
  all.

  This is fixed by the second patch.

- Sanitized filename contains non-ASCII chars
  This is fixed by the third patch.


Finally a new test case is introduced to verify the filename
sanitization behavior of btrfs-image.

Qu Wenruo (4):
  btrfs-progs: add warning for -s option of btrfs-image
  btrfs-progs: image: fix the bug that filename sanitization not working
  btrfs-progs: misc-tests: add a basic resume test using error injection
  btrfs-progs: misc-tests: add a test case for filename sanitization

 Documentation/btrfs-image.rst                 | 19 +++++---
 image/sanitize.c                              |  8 +++-
 tests/common                                  |  7 +++
 .../065-csum-conversion-inject/test.sh        | 45 +++++++++++++++++++
 tests/misc-tests/065-image-filename/test.sh   | 38 ++++++++++++++++
 5 files changed, 109 insertions(+), 8 deletions(-)
 create mode 100755 tests/misc-tests/065-csum-conversion-inject/test.sh
 create mode 100755 tests/misc-tests/065-image-filename/test.sh

--
2.45.2


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH v2 1/4] btrfs-progs: add warning for -s option of btrfs-image
  2024-07-18 22:33 [PATCH v2 0/4] btrfs-progs: fix the filename sanitization of btrfs-image Qu Wenruo
@ 2024-07-18 22:33 ` Qu Wenruo
  2024-07-18 22:33 ` [PATCH v2 2/4] btrfs-progs: image: fix the bug that filename sanitization not working Qu Wenruo
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Qu Wenruo @ 2024-07-18 22:33 UTC (permalink / raw)
  To: linux-btrfs

The filename sanitization is not recommended as it introduces mismatches
between DIR_ITEM and INODE_REF.

Even hash confliction mode (double "-s" option) is not ensured to always
find a conflict, and when failed to find a conflict, a mismatch would
happen.

And when a mismatch happens, the kernel will not resolve the path
correctly since kernel uses the hash from DIR_ITEM to lookup the child
inode.

So add a warning into the "-s" option of btrfs-image.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 Documentation/btrfs-image.rst | 19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/Documentation/btrfs-image.rst b/Documentation/btrfs-image.rst
index a63b0da273c5..ae6145f79c06 100644
--- a/Documentation/btrfs-image.rst
+++ b/Documentation/btrfs-image.rst
@@ -37,13 +37,18 @@ OPTIONS
         file system will not be able to be mounted.
 
 -s
-        Sanitize the file names when generating the image. One -s means just
-        generate random garbage, which means that the directory indexes won't match up
-        since the hashes won't match with the garbage filenames. Using *-s* will
-        calculate a collision for the filename so that the hashes match, and if it
-        can't calculate a collision then it will just generate garbage.  The collision
-        calculator is very time and CPU intensive so only use it if you are having
-        problems with your file system tree and need to have it mostly working.
+	Sanitize the file names when generating the image.
+	Not recommended as this would introduce new file name hash mismatches,
+	thus if your problem involves subvolume tress, it can even mask existing problems.
+	Furthermore kernels can not do proper path resolution due to the introduced
+	hash mismatches.
+
+	One *-s* means just generate random garbage, which means that the
+	directory hash won't match its file names.
+	Using two *-s* will calculate a collision for the file name so that the
+	hashes match, and if it can't calculate a collision then it will just
+	generate garbage.  The collision calculator is very time and CPU
+	intensive.
 
 -w
         Walk all the trees manually and copy any blocks that are referenced. Use this
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH v2 2/4] btrfs-progs: image: fix the bug that filename sanitization not working
  2024-07-18 22:33 [PATCH v2 0/4] btrfs-progs: fix the filename sanitization of btrfs-image Qu Wenruo
  2024-07-18 22:33 ` [PATCH v2 1/4] btrfs-progs: add warning for -s option " Qu Wenruo
@ 2024-07-18 22:33 ` Qu Wenruo
  2024-07-18 22:33 ` [PATCH v2 3/4] btrfs-progs: misc-tests: add a basic resume test using error injection Qu Wenruo
  2024-07-18 22:33 ` [PATCH v2 4/4] btrfs-progs: misc-tests: add a test case for filename sanitization Qu Wenruo
  3 siblings, 0 replies; 5+ messages in thread
From: Qu Wenruo @ 2024-07-18 22:33 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Andrea Gelmini

[BUG]
There is a bug report that image dump taken by "btrfs-image -s" doesn't
really sanitize the filenames:

 # truncates -s 1G source.raw
 # mkfs.btrfs -f source.raw
 # mount source.raw $mnt
 # touch $mnt/top_secret_filename
 # touch $mnt/secret_filename
 # umount $mnt
 # btrfs-image -s source.raw dump.img
 # string dump.img | grep filename
 top_secret_filename
 secret_filename
 top_secret_filename
 secret_filename
 top_secret_filename

[CAUSE]
Using above image to store the fs, and we got the following result in fs
tree:

	item 0 key (256 INODE_ITEM 0) itemoff 16123 itemsize 160
		generation 3 transid 7 size 68 nbytes 16384
		block group 0 mode 40755 links 1 uid 0 gid 0 rdev 0
		sequence 2 flags 0x0(none)
	item 1 key (256 INODE_REF 256) itemoff 16111 itemsize 12
		index 0 namelen 2 name: ..
	item 2 key (256 DIR_ITEM 439756795) itemoff 16062 itemsize 49
		location key (257 INODE_ITEM 0) type FILE
		transid 7 data_len 0 name_len 19
		name: top_secret_filename
	item 3 key (256 DIR_ITEM 693462946) itemoff 16017 itemsize 45
		location key (258 INODE_ITEM 0) type FILE
		transid 7 data_len 0 name_len 15
		name: secret_filename
	item 4 key (256 DIR_INDEX 2) itemoff 15968 itemsize 49
		location key (257 INODE_ITEM 0) type FILE
		transid 7 data_len 0 name_len 19
		name: top_secret_filename
	item 5 key (256 DIR_INDEX 3) itemoff 15923 itemsize 45
		location key (258 INODE_ITEM 0) type FILE
		transid 7 data_len 0 name_len 15
		name: secret_filename
	item 6 key (257 INODE_ITEM 0) itemoff 15763 itemsize 160
		generation 7 transid 7 size 0 nbytes 0
		block group 0 mode 100644 links 1 uid 0 gid 0 rdev 0
		sequence 1 flags 0x0(none)
	item 7 key (257 INODE_REF 256) itemoff 15734 itemsize 29
		index 2 namelen 19 name: top_secret_filename
	item 8 key (258 INODE_ITEM 0) itemoff 15574 itemsize 160
		generation 7 transid 7 size 0 nbytes 0
		block group 0 mode 100644 links 1 uid 0 gid 0 rdev 0
		sequence 1 flags 0x0(none)
	item 9 key (258 INODE_REF 256) itemoff 15549 itemsize 25
		index 3 namelen 15 name: 1���'�gc*&R

The result shows, only the last INODE_REF got sanitized, all the
remaining is not touched at all.

This is cauesd by how we sanitize the filenames:

 copy_buffer()
 |- memcpy(dst, src->data, src->len);
 |  This means we copy the whole eb into our buffer already.
 |
 |- zero_items()
    |- sanitize_name()
       |- eb = alloc_dummy_eb();
       |- memcpy(eb->data, src->data, src->len);
       |  This means we generate a dummy eb with the same contents of
       |  the source eb.
       |
       |- sanitize_dir_item();
       |  We override the dir item of the given item (specified by the
       |  slot number) inside our dummy eb.
       |
       |- memcpy(dst, eb->data, eb->lem);

The last one copy the dummy eb into our buffer, with only the slot
corrupted.

But when the whole work flow hits the next slot, we only corrupt the
next slot, but still copy the whole dummy eb back to buffer.

This means the previous slot would be overwritten by the old unsanitized
data.

Resulting only the last slot is corrupted.

[FIX]
Fix the bug by only copying back the corrupted item to the buffer.
So that other slots won't be overwritten by unsanitized data.

Reported-by: Andrea Gelmini <andrea.gelmini@gmail.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 image/sanitize.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/image/sanitize.c b/image/sanitize.c
index 084da22e401e..ef4f6e515633 100644
--- a/image/sanitize.c
+++ b/image/sanitize.c
@@ -449,6 +449,8 @@ void sanitize_name(enum sanitize_mode sanitize, struct rb_root *name_tree,
 		int slot)
 {
 	struct extent_buffer *eb;
+	u32 item_ptr_off = btrfs_item_ptr_offset(src, slot);
+	u32 item_ptr_size = btrfs_item_size(src, slot);
 
 	eb = alloc_dummy_eb(src->start, src->len);
 	if (!eb) {
@@ -476,7 +478,11 @@ void sanitize_name(enum sanitize_mode sanitize, struct rb_root *name_tree,
 		break;
 	}
 
-	memcpy(dst, eb->data, eb->len);
+	/*
+	 * Only overwrite the sanitized part, or we can overwrite previous
+	 * sanitized value with the old values from @src.
+	 */
+	memcpy(dst + item_ptr_off, eb->data + item_ptr_off, item_ptr_size);
 	free(eb);
 }
 
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH v2 3/4] btrfs-progs: misc-tests: add a basic resume test using error injection
  2024-07-18 22:33 [PATCH v2 0/4] btrfs-progs: fix the filename sanitization of btrfs-image Qu Wenruo
  2024-07-18 22:33 ` [PATCH v2 1/4] btrfs-progs: add warning for -s option " Qu Wenruo
  2024-07-18 22:33 ` [PATCH v2 2/4] btrfs-progs: image: fix the bug that filename sanitization not working Qu Wenruo
@ 2024-07-18 22:33 ` Qu Wenruo
  2024-07-18 22:33 ` [PATCH v2 4/4] btrfs-progs: misc-tests: add a test case for filename sanitization Qu Wenruo
  3 siblings, 0 replies; 5+ messages in thread
From: Qu Wenruo @ 2024-07-18 22:33 UTC (permalink / raw)
  To: linux-btrfs

The new test case does:

- Make sure the build has error injection support
  This is done by checking "btrfs --version" output.

- Inject error at the last commit transaction of new data csum
  generation

- Resume the csum conversion and make sure it works

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 tests/common                                  |  7 +++
 .../065-csum-conversion-inject/test.sh        | 45 +++++++++++++++++++
 2 files changed, 52 insertions(+)
 create mode 100755 tests/misc-tests/065-csum-conversion-inject/test.sh

diff --git a/tests/common b/tests/common
index 30857f153a22..4ff6addf20db 100644
--- a/tests/common
+++ b/tests/common
@@ -406,6 +406,13 @@ check_regular_build()
 	fi
 }
 
+check_injection()
+{
+	if ! _test_config "INJECT"; then
+		_not_run "This test requires error injection support (make D=1)"
+	fi
+}
+
 check_prereq()
 {
 	# Internal tools for testing, not shipped with the package
diff --git a/tests/misc-tests/065-csum-conversion-inject/test.sh b/tests/misc-tests/065-csum-conversion-inject/test.sh
new file mode 100755
index 000000000000..715349c4d403
--- /dev/null
+++ b/tests/misc-tests/065-csum-conversion-inject/test.sh
@@ -0,0 +1,45 @@
+#!/bin/bash
+# Verify the csum conversion can still resume after an interruption
+
+source "$TEST_TOP/common" || exit
+source "$TEST_TOP/common.convert" || exit
+
+check_experimental_build
+check_injection
+setup_root_helper
+prepare_test_dev
+
+test_resume_data_csum_generation()
+{
+	local new_csum="$1"
+	local tmp=$(_mktemp "csum-convert")
+
+	# Error at the end of the data csum generation.
+	export INJECT="0x4de02239"
+	run_mustfail_stdout "error injection not working" \
+		"$TOP/btrfstune" --csum "$new_csum" "$TEST_DEV" &> $tmp
+	cat "$tmp" >> "$RESULTS"
+	if ! grep -q "$INJECT" "$tmp"; then
+		rm -f -- "$tmp"
+		_fail "csum conversion failed to unexpected reason"
+	fi
+	rm -f -- "$tmp"
+	unset INJECT
+	run_check "$TOP/btrfstune" --csum "$new_csum" "$TEST_DEV"
+	run_check "$TOP/btrfs" check --check-data-csum "$TEST_DEV"
+}
+
+check_injection
+
+run_check_mkfs_test_dev --csum crc32c
+
+# We only mount the filesystem once to populate its contents, later one we
+# would never mount the fs (to reduce the dependency on kernel features).
+run_check_mount_test_dev
+populate_fs
+run_check_umount_test_dev
+
+test_resume_data_csum_generation xxhash
+test_resume_data_csum_generation blake2
+test_resume_data_csum_generation sha256
+test_resume_data_csum_generation crc32c
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH v2 4/4] btrfs-progs: misc-tests: add a test case for filename sanitization
  2024-07-18 22:33 [PATCH v2 0/4] btrfs-progs: fix the filename sanitization of btrfs-image Qu Wenruo
                   ` (2 preceding siblings ...)
  2024-07-18 22:33 ` [PATCH v2 3/4] btrfs-progs: misc-tests: add a basic resume test using error injection Qu Wenruo
@ 2024-07-18 22:33 ` Qu Wenruo
  3 siblings, 0 replies; 5+ messages in thread
From: Qu Wenruo @ 2024-07-18 22:33 UTC (permalink / raw)
  To: linux-btrfs

This test case checks:

- If a regular btrfs-image dump has the unsanitized filenames
- If a sanitized btrfs-image dump has filenames properly censored

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 tests/misc-tests/065-image-filename/test.sh | 38 +++++++++++++++++++++
 1 file changed, 38 insertions(+)
 create mode 100755 tests/misc-tests/065-image-filename/test.sh

diff --git a/tests/misc-tests/065-image-filename/test.sh b/tests/misc-tests/065-image-filename/test.sh
new file mode 100755
index 000000000000..0c53cc2f6f72
--- /dev/null
+++ b/tests/misc-tests/065-image-filename/test.sh
@@ -0,0 +1,38 @@
+#!/bin/bash
+# Verify "btrfs-image -s" sanitizes the filenames correctly
+
+source "$TEST_TOP/common" || exit
+source "$TEST_TOP/common.convert" || exit
+
+setup_root_helper
+prepare_test_dev
+
+declare -a filenames=("%%top_secret%%" "@@secret@@" "||confidential||")
+tmp=$(_mktemp "image-filename")
+
+run_check_mkfs_test_dev
+run_check_mount_test_dev
+for i in ${filenames[@]}; do
+	run_check $SUDO_HELPER touch "$TEST_MNT/$i"
+done
+run_check_umount_test_dev
+
+run_check "$TOP/btrfs-image" "$TEST_DEV" "$tmp"
+_log "strings found inside the regular dump:"
+strings "$tmp" >> "$RESULTS"
+for i in ${filenames[@]}; do
+	if ! grep -q "$i" "$tmp"; then
+		rm -f -- "$tmp"
+		_fail "regular dump sanitized the filenames"
+	fi
+done
+run_check "$TOP/btrfs-image" -s "$TEST_DEV" "$tmp"
+_log "strings found inside the sanitize dump:"
+strings "$tmp" >> "$RESULTS"
+for i in ${filenames[@]}; do
+	if grep -q "$i" "$tmp"; then
+		rm -f -- "$tmp"
+		_fail "filenames not properly sanitized"
+	fi
+done
+rm -f -- "$tmp"
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2024-07-18 22:33 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-07-18 22:33 [PATCH v2 0/4] btrfs-progs: fix the filename sanitization of btrfs-image Qu Wenruo
2024-07-18 22:33 ` [PATCH v2 1/4] btrfs-progs: add warning for -s option " Qu Wenruo
2024-07-18 22:33 ` [PATCH v2 2/4] btrfs-progs: image: fix the bug that filename sanitization not working Qu Wenruo
2024-07-18 22:33 ` [PATCH v2 3/4] btrfs-progs: misc-tests: add a basic resume test using error injection Qu Wenruo
2024-07-18 22:33 ` [PATCH v2 4/4] btrfs-progs: misc-tests: add a test case for filename sanitization Qu Wenruo

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.