* [PATCH v2 1/4] btrfs-progs: add warning for -s option of btrfs-image
2024-07-18 22:33 [PATCH v2 0/4] btrfs-progs: fix the filename sanitization of btrfs-image Qu Wenruo
@ 2024-07-18 22:33 ` Qu Wenruo
2024-07-18 22:33 ` [PATCH v2 2/4] btrfs-progs: image: fix the bug that filename sanitization not working Qu Wenruo
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: Qu Wenruo @ 2024-07-18 22:33 UTC (permalink / raw)
To: linux-btrfs
The filename sanitization is not recommended as it introduces mismatches
between DIR_ITEM and INODE_REF.
Even hash confliction mode (double "-s" option) is not ensured to always
find a conflict, and when failed to find a conflict, a mismatch would
happen.
And when a mismatch happens, the kernel will not resolve the path
correctly since kernel uses the hash from DIR_ITEM to lookup the child
inode.
So add a warning into the "-s" option of btrfs-image.
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
Documentation/btrfs-image.rst | 19 ++++++++++++-------
1 file changed, 12 insertions(+), 7 deletions(-)
diff --git a/Documentation/btrfs-image.rst b/Documentation/btrfs-image.rst
index a63b0da273c5..ae6145f79c06 100644
--- a/Documentation/btrfs-image.rst
+++ b/Documentation/btrfs-image.rst
@@ -37,13 +37,18 @@ OPTIONS
file system will not be able to be mounted.
-s
- Sanitize the file names when generating the image. One -s means just
- generate random garbage, which means that the directory indexes won't match up
- since the hashes won't match with the garbage filenames. Using *-s* will
- calculate a collision for the filename so that the hashes match, and if it
- can't calculate a collision then it will just generate garbage. The collision
- calculator is very time and CPU intensive so only use it if you are having
- problems with your file system tree and need to have it mostly working.
+ Sanitize the file names when generating the image.
+ Not recommended as this would introduce new file name hash mismatches,
+ thus if your problem involves subvolume tress, it can even mask existing problems.
+ Furthermore kernels can not do proper path resolution due to the introduced
+ hash mismatches.
+
+ One *-s* means just generate random garbage, which means that the
+ directory hash won't match its file names.
+ Using two *-s* will calculate a collision for the file name so that the
+ hashes match, and if it can't calculate a collision then it will just
+ generate garbage. The collision calculator is very time and CPU
+ intensive.
-w
Walk all the trees manually and copy any blocks that are referenced. Use this
--
2.45.2
^ permalink raw reply related [flat|nested] 5+ messages in thread* [PATCH v2 2/4] btrfs-progs: image: fix the bug that filename sanitization not working
2024-07-18 22:33 [PATCH v2 0/4] btrfs-progs: fix the filename sanitization of btrfs-image Qu Wenruo
2024-07-18 22:33 ` [PATCH v2 1/4] btrfs-progs: add warning for -s option " Qu Wenruo
@ 2024-07-18 22:33 ` Qu Wenruo
2024-07-18 22:33 ` [PATCH v2 3/4] btrfs-progs: misc-tests: add a basic resume test using error injection Qu Wenruo
2024-07-18 22:33 ` [PATCH v2 4/4] btrfs-progs: misc-tests: add a test case for filename sanitization Qu Wenruo
3 siblings, 0 replies; 5+ messages in thread
From: Qu Wenruo @ 2024-07-18 22:33 UTC (permalink / raw)
To: linux-btrfs; +Cc: Andrea Gelmini
[BUG]
There is a bug report that image dump taken by "btrfs-image -s" doesn't
really sanitize the filenames:
# truncates -s 1G source.raw
# mkfs.btrfs -f source.raw
# mount source.raw $mnt
# touch $mnt/top_secret_filename
# touch $mnt/secret_filename
# umount $mnt
# btrfs-image -s source.raw dump.img
# string dump.img | grep filename
top_secret_filename
secret_filename
top_secret_filename
secret_filename
top_secret_filename
[CAUSE]
Using above image to store the fs, and we got the following result in fs
tree:
item 0 key (256 INODE_ITEM 0) itemoff 16123 itemsize 160
generation 3 transid 7 size 68 nbytes 16384
block group 0 mode 40755 links 1 uid 0 gid 0 rdev 0
sequence 2 flags 0x0(none)
item 1 key (256 INODE_REF 256) itemoff 16111 itemsize 12
index 0 namelen 2 name: ..
item 2 key (256 DIR_ITEM 439756795) itemoff 16062 itemsize 49
location key (257 INODE_ITEM 0) type FILE
transid 7 data_len 0 name_len 19
name: top_secret_filename
item 3 key (256 DIR_ITEM 693462946) itemoff 16017 itemsize 45
location key (258 INODE_ITEM 0) type FILE
transid 7 data_len 0 name_len 15
name: secret_filename
item 4 key (256 DIR_INDEX 2) itemoff 15968 itemsize 49
location key (257 INODE_ITEM 0) type FILE
transid 7 data_len 0 name_len 19
name: top_secret_filename
item 5 key (256 DIR_INDEX 3) itemoff 15923 itemsize 45
location key (258 INODE_ITEM 0) type FILE
transid 7 data_len 0 name_len 15
name: secret_filename
item 6 key (257 INODE_ITEM 0) itemoff 15763 itemsize 160
generation 7 transid 7 size 0 nbytes 0
block group 0 mode 100644 links 1 uid 0 gid 0 rdev 0
sequence 1 flags 0x0(none)
item 7 key (257 INODE_REF 256) itemoff 15734 itemsize 29
index 2 namelen 19 name: top_secret_filename
item 8 key (258 INODE_ITEM 0) itemoff 15574 itemsize 160
generation 7 transid 7 size 0 nbytes 0
block group 0 mode 100644 links 1 uid 0 gid 0 rdev 0
sequence 1 flags 0x0(none)
item 9 key (258 INODE_REF 256) itemoff 15549 itemsize 25
index 3 namelen 15 name: 1���'�gc*&R
The result shows, only the last INODE_REF got sanitized, all the
remaining is not touched at all.
This is cauesd by how we sanitize the filenames:
copy_buffer()
|- memcpy(dst, src->data, src->len);
| This means we copy the whole eb into our buffer already.
|
|- zero_items()
|- sanitize_name()
|- eb = alloc_dummy_eb();
|- memcpy(eb->data, src->data, src->len);
| This means we generate a dummy eb with the same contents of
| the source eb.
|
|- sanitize_dir_item();
| We override the dir item of the given item (specified by the
| slot number) inside our dummy eb.
|
|- memcpy(dst, eb->data, eb->lem);
The last one copy the dummy eb into our buffer, with only the slot
corrupted.
But when the whole work flow hits the next slot, we only corrupt the
next slot, but still copy the whole dummy eb back to buffer.
This means the previous slot would be overwritten by the old unsanitized
data.
Resulting only the last slot is corrupted.
[FIX]
Fix the bug by only copying back the corrupted item to the buffer.
So that other slots won't be overwritten by unsanitized data.
Reported-by: Andrea Gelmini <andrea.gelmini@gmail.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
image/sanitize.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/image/sanitize.c b/image/sanitize.c
index 084da22e401e..ef4f6e515633 100644
--- a/image/sanitize.c
+++ b/image/sanitize.c
@@ -449,6 +449,8 @@ void sanitize_name(enum sanitize_mode sanitize, struct rb_root *name_tree,
int slot)
{
struct extent_buffer *eb;
+ u32 item_ptr_off = btrfs_item_ptr_offset(src, slot);
+ u32 item_ptr_size = btrfs_item_size(src, slot);
eb = alloc_dummy_eb(src->start, src->len);
if (!eb) {
@@ -476,7 +478,11 @@ void sanitize_name(enum sanitize_mode sanitize, struct rb_root *name_tree,
break;
}
- memcpy(dst, eb->data, eb->len);
+ /*
+ * Only overwrite the sanitized part, or we can overwrite previous
+ * sanitized value with the old values from @src.
+ */
+ memcpy(dst + item_ptr_off, eb->data + item_ptr_off, item_ptr_size);
free(eb);
}
--
2.45.2
^ permalink raw reply related [flat|nested] 5+ messages in thread* [PATCH v2 3/4] btrfs-progs: misc-tests: add a basic resume test using error injection
2024-07-18 22:33 [PATCH v2 0/4] btrfs-progs: fix the filename sanitization of btrfs-image Qu Wenruo
2024-07-18 22:33 ` [PATCH v2 1/4] btrfs-progs: add warning for -s option " Qu Wenruo
2024-07-18 22:33 ` [PATCH v2 2/4] btrfs-progs: image: fix the bug that filename sanitization not working Qu Wenruo
@ 2024-07-18 22:33 ` Qu Wenruo
2024-07-18 22:33 ` [PATCH v2 4/4] btrfs-progs: misc-tests: add a test case for filename sanitization Qu Wenruo
3 siblings, 0 replies; 5+ messages in thread
From: Qu Wenruo @ 2024-07-18 22:33 UTC (permalink / raw)
To: linux-btrfs
The new test case does:
- Make sure the build has error injection support
This is done by checking "btrfs --version" output.
- Inject error at the last commit transaction of new data csum
generation
- Resume the csum conversion and make sure it works
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
tests/common | 7 +++
.../065-csum-conversion-inject/test.sh | 45 +++++++++++++++++++
2 files changed, 52 insertions(+)
create mode 100755 tests/misc-tests/065-csum-conversion-inject/test.sh
diff --git a/tests/common b/tests/common
index 30857f153a22..4ff6addf20db 100644
--- a/tests/common
+++ b/tests/common
@@ -406,6 +406,13 @@ check_regular_build()
fi
}
+check_injection()
+{
+ if ! _test_config "INJECT"; then
+ _not_run "This test requires error injection support (make D=1)"
+ fi
+}
+
check_prereq()
{
# Internal tools for testing, not shipped with the package
diff --git a/tests/misc-tests/065-csum-conversion-inject/test.sh b/tests/misc-tests/065-csum-conversion-inject/test.sh
new file mode 100755
index 000000000000..715349c4d403
--- /dev/null
+++ b/tests/misc-tests/065-csum-conversion-inject/test.sh
@@ -0,0 +1,45 @@
+#!/bin/bash
+# Verify the csum conversion can still resume after an interruption
+
+source "$TEST_TOP/common" || exit
+source "$TEST_TOP/common.convert" || exit
+
+check_experimental_build
+check_injection
+setup_root_helper
+prepare_test_dev
+
+test_resume_data_csum_generation()
+{
+ local new_csum="$1"
+ local tmp=$(_mktemp "csum-convert")
+
+ # Error at the end of the data csum generation.
+ export INJECT="0x4de02239"
+ run_mustfail_stdout "error injection not working" \
+ "$TOP/btrfstune" --csum "$new_csum" "$TEST_DEV" &> $tmp
+ cat "$tmp" >> "$RESULTS"
+ if ! grep -q "$INJECT" "$tmp"; then
+ rm -f -- "$tmp"
+ _fail "csum conversion failed to unexpected reason"
+ fi
+ rm -f -- "$tmp"
+ unset INJECT
+ run_check "$TOP/btrfstune" --csum "$new_csum" "$TEST_DEV"
+ run_check "$TOP/btrfs" check --check-data-csum "$TEST_DEV"
+}
+
+check_injection
+
+run_check_mkfs_test_dev --csum crc32c
+
+# We only mount the filesystem once to populate its contents, later one we
+# would never mount the fs (to reduce the dependency on kernel features).
+run_check_mount_test_dev
+populate_fs
+run_check_umount_test_dev
+
+test_resume_data_csum_generation xxhash
+test_resume_data_csum_generation blake2
+test_resume_data_csum_generation sha256
+test_resume_data_csum_generation crc32c
--
2.45.2
^ permalink raw reply related [flat|nested] 5+ messages in thread* [PATCH v2 4/4] btrfs-progs: misc-tests: add a test case for filename sanitization
2024-07-18 22:33 [PATCH v2 0/4] btrfs-progs: fix the filename sanitization of btrfs-image Qu Wenruo
` (2 preceding siblings ...)
2024-07-18 22:33 ` [PATCH v2 3/4] btrfs-progs: misc-tests: add a basic resume test using error injection Qu Wenruo
@ 2024-07-18 22:33 ` Qu Wenruo
3 siblings, 0 replies; 5+ messages in thread
From: Qu Wenruo @ 2024-07-18 22:33 UTC (permalink / raw)
To: linux-btrfs
This test case checks:
- If a regular btrfs-image dump has the unsanitized filenames
- If a sanitized btrfs-image dump has filenames properly censored
Signed-off-by: Qu Wenruo <wqu@suse.com>
---
tests/misc-tests/065-image-filename/test.sh | 38 +++++++++++++++++++++
1 file changed, 38 insertions(+)
create mode 100755 tests/misc-tests/065-image-filename/test.sh
diff --git a/tests/misc-tests/065-image-filename/test.sh b/tests/misc-tests/065-image-filename/test.sh
new file mode 100755
index 000000000000..0c53cc2f6f72
--- /dev/null
+++ b/tests/misc-tests/065-image-filename/test.sh
@@ -0,0 +1,38 @@
+#!/bin/bash
+# Verify "btrfs-image -s" sanitizes the filenames correctly
+
+source "$TEST_TOP/common" || exit
+source "$TEST_TOP/common.convert" || exit
+
+setup_root_helper
+prepare_test_dev
+
+declare -a filenames=("%%top_secret%%" "@@secret@@" "||confidential||")
+tmp=$(_mktemp "image-filename")
+
+run_check_mkfs_test_dev
+run_check_mount_test_dev
+for i in ${filenames[@]}; do
+ run_check $SUDO_HELPER touch "$TEST_MNT/$i"
+done
+run_check_umount_test_dev
+
+run_check "$TOP/btrfs-image" "$TEST_DEV" "$tmp"
+_log "strings found inside the regular dump:"
+strings "$tmp" >> "$RESULTS"
+for i in ${filenames[@]}; do
+ if ! grep -q "$i" "$tmp"; then
+ rm -f -- "$tmp"
+ _fail "regular dump sanitized the filenames"
+ fi
+done
+run_check "$TOP/btrfs-image" -s "$TEST_DEV" "$tmp"
+_log "strings found inside the sanitize dump:"
+strings "$tmp" >> "$RESULTS"
+for i in ${filenames[@]}; do
+ if grep -q "$i" "$tmp"; then
+ rm -f -- "$tmp"
+ _fail "filenames not properly sanitized"
+ fi
+done
+rm -f -- "$tmp"
--
2.45.2
^ permalink raw reply related [flat|nested] 5+ messages in thread