* [PATCH U-BOOT 0/3] btrfs: fix and improve read code
@ 2023-04-18 1:17 Dominique Martinet
2023-04-18 1:17 ` [PATCH U-BOOT 1/3] btrfs: fix offset within btrfs_read_extent_reg() Dominique Martinet
` (2 more replies)
0 siblings, 3 replies; 18+ messages in thread
From: Dominique Martinet @ 2023-04-18 1:17 UTC (permalink / raw)
To: Marek Behún, Qu Wenruo; +Cc: linux-btrfs, u-boot, Dominique Martinet
I'm posting this as a series but this is really three independant
patches.
The first is a real bug that has made our board unable to boot as it was
corrupting the kernel and image CRC validation failed (yay for checksums
-- any plan of enabling validation of data checksums too? I see the code
is there for metadata already, so it's probably not missing that much,
but unfortunately as far as I understand it would only have seen that
the extent data was correct and still corrupted the final buffer as it's
just an offset error...)
The second patch is an opportunistic optimization as I noticed the code
behaved differently than before, and there doesn't seem to be a reason
the reads couldn't be coalesced.
The last patch is just something that looked odd and looks "obviously
better" but I'll leave that up to you.
Thanks!
Link: https://lore.kernel.org/linux-btrfs/20200525063257.46757-1-wqu@suse.com/ [1]
Signed-off-by: Dominique Martinet <dominique.martinet@atmark-techno.com>
---
Dominique Martinet (3):
btrfs: fix offset within btrfs_read_extent_reg()
btrfs: btrfs_file_read: allow opportunistic read until the end
btrfs: btfs_file_read: zero trailing data if no extent was found
fs/btrfs/inode.c | 32 ++++++++++++++++++--------------
1 file changed, 18 insertions(+), 14 deletions(-)
---
base-commit: 5db4972a5bbdbf9e3af48ffc9bc4fec73b7b6a79
change-id: 20230418-btrfs-extent-reads-e2df6e329ad4
Best regards,
--
Dominique Martinet | Asmadeus
^ permalink raw reply [flat|nested] 18+ messages in thread
* [PATCH U-BOOT 1/3] btrfs: fix offset within btrfs_read_extent_reg()
2023-04-18 1:17 [PATCH U-BOOT 0/3] btrfs: fix and improve read code Dominique Martinet
@ 2023-04-18 1:17 ` Dominique Martinet
2023-04-18 1:58 ` Qu Wenruo
2023-04-18 1:17 ` [PATCH U-BOOT 2/3] btrfs: btrfs_file_read: allow opportunistic read until the end Dominique Martinet
2023-04-18 1:17 ` [PATCH U-BOOT 3/3] btrfs: btfs_file_read: zero trailing data if no extent was found Dominique Martinet
2 siblings, 1 reply; 18+ messages in thread
From: Dominique Martinet @ 2023-04-18 1:17 UTC (permalink / raw)
To: Marek Behún, Qu Wenruo; +Cc: linux-btrfs, u-boot, Dominique Martinet
From: Dominique Martinet <dominique.martinet@atmark-techno.com>
btrfs_read_extent_reg correctly computed the extent offset in the
BTRFS_COMPRESS_NONE case, but did not account for the 'offset - key.offset'
part correctly in the compressed case, making the function read
incorrect data.
In the case I examined, the last 4k of a file was corrupted and
contained data from a few blocks prior:
btrfs_file_read()
-> btrfs_read_extent_reg
(aligned part loop from 0x40480000 to 0x40ba0000, 128KB at a time)
last read had 0x4000 bytes in extent, but aligned_end - cur limited
the read to 0x3000 (offset 0x720000)
-> read_and_truncate_page
-> btrfs_read_extent_reg
reading the last 0x1000 bytes from offset 0x73000 (end of the
previous 0x4000) into 0x40ba3000
here key.offset = 0x70000 so we need to use it to recover the
0x3000 offset.
Confirmed by checking data, before patch:
u-boot=> mmc load 1:1 $loadaddr boot/uImage
u-boot=> md 0x40ba0000
40ba0000: c0232ff8 c0c722cb 030e94be bf10000c ./#.."..........
u-boot=> md 0x40ba3000
40ba3000: c0232ff8 c0c722cb 030e94be bf10000c ./#.."..........
After patch (also confirmed with data read from linux):
u-boot=> md 0x40ba3000
40ba3000: 64cc9f03 81142256 6910000c 0c03483c ...dV".....i<H..
Note that the code previously (before commit e3427184f38a ("fs: btrfs:
Implement btrfs_file_read()")) did not split that read in two, so
this is a regression.
Fixes: a26a6bedafcf ("fs: btrfs: Introduce btrfs_read_extent_inline() and btrfs_read_extent_reg()")
Signed-off-by: Dominique Martinet <dominique.martinet@atmark-techno.com>
---
fs/btrfs/inode.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 40025662f250..3d6e39e6544d 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -443,6 +443,8 @@ int btrfs_read_extent_reg(struct btrfs_path *path,
IS_ALIGNED(len, fs_info->sectorsize));
ASSERT(offset >= key.offset &&
offset + len <= key.offset + extent_num_bytes);
+ /* offset is now offset within extent */
+ offset = btrfs_file_extent_offset(leaf, fi) + offset - key.offset;
/* Preallocated or hole , fill @dest with zero */
if (btrfs_file_extent_type(leaf, fi) == BTRFS_FILE_EXTENT_PREALLOC ||
@@ -454,9 +456,7 @@ int btrfs_read_extent_reg(struct btrfs_path *path,
if (btrfs_file_extent_compression(leaf, fi) == BTRFS_COMPRESS_NONE) {
u64 logical;
- logical = btrfs_file_extent_disk_bytenr(leaf, fi) +
- btrfs_file_extent_offset(leaf, fi) +
- offset - key.offset;
+ logical = btrfs_file_extent_disk_bytenr(leaf, fi) + offset;
read = len;
num_copies = btrfs_num_copies(fs_info, logical, len);
@@ -511,7 +511,7 @@ int btrfs_read_extent_reg(struct btrfs_path *path,
if (ret < dsize)
memset(dbuf + ret, 0, dsize - ret);
/* Then copy the needed part */
- memcpy(dest, dbuf + btrfs_file_extent_offset(leaf, fi), len);
+ memcpy(dest, dbuf + offset, len);
ret = len;
out:
free(cbuf);
--
2.39.2
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH U-BOOT 2/3] btrfs: btrfs_file_read: allow opportunistic read until the end
2023-04-18 1:17 [PATCH U-BOOT 0/3] btrfs: fix and improve read code Dominique Martinet
2023-04-18 1:17 ` [PATCH U-BOOT 1/3] btrfs: fix offset within btrfs_read_extent_reg() Dominique Martinet
@ 2023-04-18 1:17 ` Dominique Martinet
2023-04-18 2:02 ` Qu Wenruo
2023-04-18 1:17 ` [PATCH U-BOOT 3/3] btrfs: btfs_file_read: zero trailing data if no extent was found Dominique Martinet
2 siblings, 1 reply; 18+ messages in thread
From: Dominique Martinet @ 2023-04-18 1:17 UTC (permalink / raw)
To: Marek Behún, Qu Wenruo; +Cc: linux-btrfs, u-boot, Dominique Martinet
From: Dominique Martinet <dominique.martinet@atmark-techno.com>
btrfs_file_read main 'aligned read' loop would limit the last read to
the aligned end even if the data is present in the extent: there is no
reason not to read the data that we know will be presented correctly.
If that somehow fails cur only advances up to the aligned_end anyway and
the file tail is read through the old code unchanged; this could be
required if e.g. the end data is inlined.
Signed-off-by: Dominique Martinet <dominique.martinet@atmark-techno.com>
---
fs/btrfs/inode.c | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 3d6e39e6544d..efffec0f2e68 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -663,7 +663,8 @@ int btrfs_file_read(struct btrfs_root *root, u64 ino, u64 file_offset, u64 len,
struct btrfs_path path;
struct btrfs_key key;
u64 aligned_start = round_down(file_offset, fs_info->sectorsize);
- u64 aligned_end = round_down(file_offset + len, fs_info->sectorsize);
+ u64 end = file_offset + len;
+ u64 aligned_end = round_down(end, fs_info->sectorsize);
u64 next_offset;
u64 cur = aligned_start;
int ret = 0;
@@ -743,26 +744,26 @@ int btrfs_file_read(struct btrfs_root *root, u64 ino, u64 file_offset, u64 len,
extent_num_bytes = btrfs_file_extent_num_bytes(path.nodes[0],
fi);
ret = btrfs_read_extent_reg(&path, fi, cur,
- min(extent_num_bytes, aligned_end - cur),
+ min(extent_num_bytes, end - cur),
dest + cur - file_offset);
if (ret < 0)
goto out;
- cur += min(extent_num_bytes, aligned_end - cur);
+ cur += min(extent_num_bytes, end - cur);
}
/* Read the tailing unaligned part*/
- if (file_offset + len != aligned_end) {
+ if (file_offset + len != cur) {
btrfs_release_path(&path);
- ret = lookup_data_extent(root, &path, ino, aligned_end,
+ ret = lookup_data_extent(root, &path, ino, cur,
&next_offset);
/* <0 is error, >0 means no extent */
if (ret)
goto out;
fi = btrfs_item_ptr(path.nodes[0], path.slots[0],
struct btrfs_file_extent_item);
- ret = read_and_truncate_page(&path, fi, aligned_end,
- file_offset + len - aligned_end,
- dest + aligned_end - file_offset);
+ ret = read_and_truncate_page(&path, fi, cur,
+ file_offset + len - cur,
+ dest + cur - file_offset);
}
out:
btrfs_release_path(&path);
--
2.39.2
^ permalink raw reply related [flat|nested] 18+ messages in thread
* [PATCH U-BOOT 3/3] btrfs: btfs_file_read: zero trailing data if no extent was found
2023-04-18 1:17 [PATCH U-BOOT 0/3] btrfs: fix and improve read code Dominique Martinet
2023-04-18 1:17 ` [PATCH U-BOOT 1/3] btrfs: fix offset within btrfs_read_extent_reg() Dominique Martinet
2023-04-18 1:17 ` [PATCH U-BOOT 2/3] btrfs: btrfs_file_read: allow opportunistic read until the end Dominique Martinet
@ 2023-04-18 1:17 ` Dominique Martinet
2023-04-18 2:04 ` Qu Wenruo
2 siblings, 1 reply; 18+ messages in thread
From: Dominique Martinet @ 2023-04-18 1:17 UTC (permalink / raw)
To: Marek Behún, Qu Wenruo; +Cc: linux-btrfs, u-boot, Dominique Martinet
From: Dominique Martinet <dominique.martinet@atmark-techno.com>
btfs_file_read's truncate path has a comment noting '>0 means no extent'
and bailing out immediately, but the buffer has not been written so
probably needs zeroing out.
This is a theorical fix only and hasn't been tested on a file that
actually runs this code path.
Signed-off-by: Dominique Martinet <dominique.martinet@atmark-techno.com>
---
fs/btrfs/inode.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index efffec0f2e68..23c006c98c3b 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -756,9 +756,12 @@ int btrfs_file_read(struct btrfs_root *root, u64 ino, u64 file_offset, u64 len,
btrfs_release_path(&path);
ret = lookup_data_extent(root, &path, ino, cur,
&next_offset);
- /* <0 is error, >0 means no extent */
- if (ret)
+ /* <0 is error, >0 means no extent: zero end of buffer */
+ if (ret) {
+ if (ret > 0)
+ memset(dest + cur, 0, end - cur);
goto out;
+ }
fi = btrfs_item_ptr(path.nodes[0], path.slots[0],
struct btrfs_file_extent_item);
ret = read_and_truncate_page(&path, fi, cur,
--
2.39.2
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH U-BOOT 1/3] btrfs: fix offset within btrfs_read_extent_reg()
2023-04-18 1:17 ` [PATCH U-BOOT 1/3] btrfs: fix offset within btrfs_read_extent_reg() Dominique Martinet
@ 2023-04-18 1:58 ` Qu Wenruo
2023-04-18 2:05 ` Dominique Martinet
0 siblings, 1 reply; 18+ messages in thread
From: Qu Wenruo @ 2023-04-18 1:58 UTC (permalink / raw)
To: Dominique Martinet, Marek Behún, Qu Wenruo
Cc: linux-btrfs, u-boot, Dominique Martinet
On 2023/4/18 09:17, Dominique Martinet wrote:
> From: Dominique Martinet <dominique.martinet@atmark-techno.com>
The subject can be changed to "btrfs: fix offset when reading compressed
extents".
The original one is a little too generic.
>
> btrfs_read_extent_reg correctly computed the extent offset in the
> BTRFS_COMPRESS_NONE case, but did not account for the 'offset - key.offset'
> part correctly in the compressed case, making the function read
> incorrect data.
>
> In the case I examined, the last 4k of a file was corrupted and
> contained data from a few blocks prior:
> btrfs_file_read()
> -> btrfs_read_extent_reg
> (aligned part loop from 0x40480000 to 0x40ba0000, 128KB at a time)
> last read had 0x4000 bytes in extent, but aligned_end - cur limited
> the read to 0x3000 (offset 0x720000)
> -> read_and_truncate_page
> -> btrfs_read_extent_reg
> reading the last 0x1000 bytes from offset 0x73000 (end of the
> previous 0x4000) into 0x40ba3000
> here key.offset = 0x70000 so we need to use it to recover the
> 0x3000 offset.
You can use "btrfs ins dump-tree" to provide a much easier to read
on-disk data layout.
And you can also add a smaller reproducer.
>
> Confirmed by checking data, before patch:
> u-boot=> mmc load 1:1 $loadaddr boot/uImage
> u-boot=> md 0x40ba0000
> 40ba0000: c0232ff8 c0c722cb 030e94be bf10000c ./#.."..........
> u-boot=> md 0x40ba3000
> 40ba3000: c0232ff8 c0c722cb 030e94be bf10000c ./#.."..........
>
> After patch (also confirmed with data read from linux):
> u-boot=> md 0x40ba3000
> 40ba3000: 64cc9f03 81142256 6910000c 0c03483c ...dV".....i<H..
>
> Note that the code previously (before commit e3427184f38a ("fs: btrfs:
> Implement btrfs_file_read()")) did not split that read in two, so
> this is a regression.
>
> Fixes: a26a6bedafcf ("fs: btrfs: Introduce btrfs_read_extent_inline() and btrfs_read_extent_reg()")
> Signed-off-by: Dominique Martinet <dominique.martinet@atmark-techno.com>
> ---
> fs/btrfs/inode.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 40025662f250..3d6e39e6544d 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -443,6 +443,8 @@ int btrfs_read_extent_reg(struct btrfs_path *path,
> IS_ALIGNED(len, fs_info->sectorsize));
> ASSERT(offset >= key.offset &&
> offset + len <= key.offset + extent_num_bytes);
> + /* offset is now offset within extent */
> + offset = btrfs_file_extent_offset(leaf, fi) + offset - key.offset;
I prefer not to use the @offset, explained later.
>
> /* Preallocated or hole , fill @dest with zero */
> if (btrfs_file_extent_type(leaf, fi) == BTRFS_FILE_EXTENT_PREALLOC ||
> @@ -454,9 +456,7 @@ int btrfs_read_extent_reg(struct btrfs_path *path,
> if (btrfs_file_extent_compression(leaf, fi) == BTRFS_COMPRESS_NONE) {
> u64 logical;
>
> - logical = btrfs_file_extent_disk_bytenr(leaf, fi) +
> - btrfs_file_extent_offset(leaf, fi) +
> - offset - key.offset;
> + logical = btrfs_file_extent_disk_bytenr(leaf, fi) + offset;
This is touching non-compressed path, which is very weird as your commit
message said this part is correct.
I prefer the original one to show everything we need to take into
consideration.
Otherwise looks good to me.
Thanks,
Qu
> read = len;
>
> num_copies = btrfs_num_copies(fs_info, logical, len);
> @@ -511,7 +511,7 @@ int btrfs_read_extent_reg(struct btrfs_path *path,
> if (ret < dsize)
> memset(dbuf + ret, 0, dsize - ret);
> /* Then copy the needed part */
> - memcpy(dest, dbuf + btrfs_file_extent_offset(leaf, fi), len);
> + memcpy(dest, dbuf + offset, len);
> ret = len;
> out:
> free(cbuf);
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH U-BOOT 2/3] btrfs: btrfs_file_read: allow opportunistic read until the end
2023-04-18 1:17 ` [PATCH U-BOOT 2/3] btrfs: btrfs_file_read: allow opportunistic read until the end Dominique Martinet
@ 2023-04-18 2:02 ` Qu Wenruo
2023-04-18 2:41 ` Dominique Martinet
0 siblings, 1 reply; 18+ messages in thread
From: Qu Wenruo @ 2023-04-18 2:02 UTC (permalink / raw)
To: Dominique Martinet, Marek Behún, Qu Wenruo
Cc: linux-btrfs, u-boot, Dominique Martinet
On 2023/4/18 09:17, Dominique Martinet wrote:
> From: Dominique Martinet <dominique.martinet@atmark-techno.com>
>
> btrfs_file_read main 'aligned read' loop would limit the last read to
> the aligned end even if the data is present in the extent: there is no
> reason not to read the data that we know will be presented correctly.
>
> If that somehow fails cur only advances up to the aligned_end anyway and
> the file tail is read through the old code unchanged; this could be
> required if e.g. the end data is inlined.
>
> Signed-off-by: Dominique Martinet <dominique.martinet@atmark-techno.com>
> ---
> fs/btrfs/inode.c | 17 +++++++++--------
> 1 file changed, 9 insertions(+), 8 deletions(-)
>
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 3d6e39e6544d..efffec0f2e68 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -663,7 +663,8 @@ int btrfs_file_read(struct btrfs_root *root, u64 ino, u64 file_offset, u64 len,
> struct btrfs_path path;
> struct btrfs_key key;
> u64 aligned_start = round_down(file_offset, fs_info->sectorsize);
> - u64 aligned_end = round_down(file_offset + len, fs_info->sectorsize);
> + u64 end = file_offset + len;
> + u64 aligned_end = round_down(end, fs_info->sectorsize);
> u64 next_offset;
> u64 cur = aligned_start;
> int ret = 0;
> @@ -743,26 +744,26 @@ int btrfs_file_read(struct btrfs_root *root, u64 ino, u64 file_offset, u64 len,
> extent_num_bytes = btrfs_file_extent_num_bytes(path.nodes[0],
> fi);
> ret = btrfs_read_extent_reg(&path, fi, cur,
> - min(extent_num_bytes, aligned_end - cur),
> + min(extent_num_bytes, end - cur),
> dest + cur - file_offset);
> if (ret < 0)
> goto out;
> - cur += min(extent_num_bytes, aligned_end - cur);
> + cur += min(extent_num_bytes, end - cur);
> }
>
> /* Read the tailing unaligned part*/
Can we remove this part completely?
IIRC if we read until the target end, the unaligned end part can be
completely removed then.
Thanks,
Qu
> - if (file_offset + len != aligned_end) {
> + if (file_offset + len != cur) {
> btrfs_release_path(&path);
> - ret = lookup_data_extent(root, &path, ino, aligned_end,
> + ret = lookup_data_extent(root, &path, ino, cur,
> &next_offset);
> /* <0 is error, >0 means no extent */
> if (ret)
> goto out;
> fi = btrfs_item_ptr(path.nodes[0], path.slots[0],
> struct btrfs_file_extent_item);
> - ret = read_and_truncate_page(&path, fi, aligned_end,
> - file_offset + len - aligned_end,
> - dest + aligned_end - file_offset);
> + ret = read_and_truncate_page(&path, fi, cur,
> + file_offset + len - cur,
> + dest + cur - file_offset);
> }
> out:
> btrfs_release_path(&path);
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH U-BOOT 3/3] btrfs: btfs_file_read: zero trailing data if no extent was found
2023-04-18 1:17 ` [PATCH U-BOOT 3/3] btrfs: btfs_file_read: zero trailing data if no extent was found Dominique Martinet
@ 2023-04-18 2:04 ` Qu Wenruo
2023-04-18 2:43 ` Dominique Martinet
0 siblings, 1 reply; 18+ messages in thread
From: Qu Wenruo @ 2023-04-18 2:04 UTC (permalink / raw)
To: Dominique Martinet, Marek Behún, Qu Wenruo
Cc: linux-btrfs, u-boot, Dominique Martinet
On 2023/4/18 09:17, Dominique Martinet wrote:
> From: Dominique Martinet <dominique.martinet@atmark-techno.com>
>
> btfs_file_read's truncate path has a comment noting '>0 means no extent'
> and bailing out immediately, but the buffer has not been written so
> probably needs zeroing out.
>
> This is a theorical fix only and hasn't been tested on a file that
> actually runs this code path.
IIRC there is a memset() at the very beginning of btrfs_file_read() to
set the whole dest memory to zero.
This is to handle cases like NO_HOLE cases, which we can skip hole extents.
Thanks,
Qu
>
> Signed-off-by: Dominique Martinet <dominique.martinet@atmark-techno.com>
> ---
> fs/btrfs/inode.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index efffec0f2e68..23c006c98c3b 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -756,9 +756,12 @@ int btrfs_file_read(struct btrfs_root *root, u64 ino, u64 file_offset, u64 len,
> btrfs_release_path(&path);
> ret = lookup_data_extent(root, &path, ino, cur,
> &next_offset);
> - /* <0 is error, >0 means no extent */
> - if (ret)
> + /* <0 is error, >0 means no extent: zero end of buffer */
> + if (ret) {
> + if (ret > 0)
> + memset(dest + cur, 0, end - cur);
> goto out;
> + }
> fi = btrfs_item_ptr(path.nodes[0], path.slots[0],
> struct btrfs_file_extent_item);
> ret = read_and_truncate_page(&path, fi, cur,
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH U-BOOT 1/3] btrfs: fix offset within btrfs_read_extent_reg()
2023-04-18 1:58 ` Qu Wenruo
@ 2023-04-18 2:05 ` Dominique Martinet
2023-04-18 2:16 ` Qu Wenruo
2023-04-18 2:20 ` Qu Wenruo
0 siblings, 2 replies; 18+ messages in thread
From: Dominique Martinet @ 2023-04-18 2:05 UTC (permalink / raw)
To: Qu Wenruo
Cc: Dominique Martinet, Marek Behún, Qu Wenruo, linux-btrfs,
u-boot
Qu Wenruo wrote on Tue, Apr 18, 2023 at 09:58:37AM +0800:
> The subject can be changed to "btrfs: fix offset when reading compressed
> extents".
> The original one is a little too generic.
Ok.
> > btrfs_file_read()
> > -> btrfs_read_extent_reg
> > (aligned part loop from 0x40480000 to 0x40ba0000, 128KB at a time)
> > last read had 0x4000 bytes in extent, but aligned_end - cur limited
> > the read to 0x3000 (offset 0x720000)
> > -> read_and_truncate_page
> > -> btrfs_read_extent_reg
> > reading the last 0x1000 bytes from offset 0x73000 (end of the
> > previous 0x4000) into 0x40ba3000
> > here key.offset = 0x70000 so we need to use it to recover the
> > 0x3000 offset.
>
> You can use "btrfs ins dump-tree" to provide a much easier to read on-disk
> data layout.
key.offset is the offset within the read call, not the offset on disk.
The file on disk looks perfectly normal, the condition to trigger the
bug is to have a file which size is not sector-aligned and where the
last extent is bigger than a sector.
I had a look at dump-tree early on and couldn't actually find my file in
there, now the problem is understood it should be easy to make a
reproducer so I'll add this info and commands needed to reproduce (+ a
link to a fs image just in case) in v2
> > /* Preallocated or hole , fill @dest with zero */
> > if (btrfs_file_extent_type(leaf, fi) == BTRFS_FILE_EXTENT_PREALLOC ||
> > @@ -454,9 +456,7 @@ int btrfs_read_extent_reg(struct btrfs_path *path,
> > if (btrfs_file_extent_compression(leaf, fi) == BTRFS_COMPRESS_NONE) {
> > u64 logical;
> > - logical = btrfs_file_extent_disk_bytenr(leaf, fi) +
> > - btrfs_file_extent_offset(leaf, fi) +
> > - offset - key.offset;
> > + logical = btrfs_file_extent_disk_bytenr(leaf, fi) + offset;
>
> This is touching non-compressed path, which is very weird as your commit
> message said this part is correct.
my rationale is that since this was considered once but forgotten the
other time it'll be easy to add yet another code path that forgets it
later, but I guess it won't change much anyway -- I'll fix the patch
making it explicit again.
Thanks,
--
Dominique
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH U-BOOT 1/3] btrfs: fix offset within btrfs_read_extent_reg()
2023-04-18 2:05 ` Dominique Martinet
@ 2023-04-18 2:16 ` Qu Wenruo
2023-04-18 2:20 ` Qu Wenruo
1 sibling, 0 replies; 18+ messages in thread
From: Qu Wenruo @ 2023-04-18 2:16 UTC (permalink / raw)
To: Dominique Martinet
Cc: Dominique Martinet, Marek Behún, Qu Wenruo, linux-btrfs,
u-boot
On 2023/4/18 10:05, Dominique Martinet wrote:
> Qu Wenruo wrote on Tue, Apr 18, 2023 at 09:58:37AM +0800:
>> The subject can be changed to "btrfs: fix offset when reading compressed
>> extents".
>> The original one is a little too generic.
>
> Ok.
>
>>> btrfs_file_read()
>>> -> btrfs_read_extent_reg
>>> (aligned part loop from 0x40480000 to 0x40ba0000, 128KB at a time)
>>> last read had 0x4000 bytes in extent, but aligned_end - cur limited
>>> the read to 0x3000 (offset 0x720000)
>>> -> read_and_truncate_page
>>> -> btrfs_read_extent_reg
>>> reading the last 0x1000 bytes from offset 0x73000 (end of the
>>> previous 0x4000) into 0x40ba3000
>>> here key.offset = 0x70000 so we need to use it to recover the
>>> 0x3000 offset.
>>
>> You can use "btrfs ins dump-tree" to provide a much easier to read on-disk
>> data layout.
>
> key.offset is the offset within the read call, not the offset on disk.
> The file on disk looks perfectly normal, the condition to trigger the
> bug is to have a file which size is not sector-aligned and where the
> last extent is bigger than a sector.
Yes, I understand the runtime offset is involved.
But you're still trying to explain the situation with involved on-disk
file extent, right?
In that case it's way easier to go something like this:
We can have a compressed file extent like this:
item 6 key (257 EXTENT_DATA 0) itemoff 15816 itemsize 53
generation 7 type 1 (regular)
extent data disk byte 13631488 nr 4096
extent data offset 0 nr 32768 ram 32768
extent compression 1 (zlib)
Then if we try to read file range [4K, 8K) of inode 257 in Uboot, then
we got corrupted data.
At least that's the preferred way to explain in btrfs community (with
on-disk thus static part, then runtime part).
>
> I had a look at dump-tree early on and couldn't actually find my file in
> there, now the problem is understood it should be easy to make a
> reproducer so I'll add this info and commands needed to reproduce (+ a
> link to a fs image just in case) in v2
The reproducer is super easy.
# mkfs.btrfs -f $dev
# mount $dev -o compress $mnt
# xfs_io -f -c "pwrite -i /dev/urandom 0 16K" $mnt/file
# umount $mnt
Then in uboot
# load host 0 $addr file 0x1000 0x1000
# md $addr
And compare to regular xxd from kernel.
>
>>> /* Preallocated or hole , fill @dest with zero */
>>> if (btrfs_file_extent_type(leaf, fi) == BTRFS_FILE_EXTENT_PREALLOC ||
>>> @@ -454,9 +456,7 @@ int btrfs_read_extent_reg(struct btrfs_path *path,
>>> if (btrfs_file_extent_compression(leaf, fi) == BTRFS_COMPRESS_NONE) {
>>> u64 logical;
>>> - logical = btrfs_file_extent_disk_bytenr(leaf, fi) +
>>> - btrfs_file_extent_offset(leaf, fi) +
>>> - offset - key.offset;
>>> + logical = btrfs_file_extent_disk_bytenr(leaf, fi) + offset;
>>
>> This is touching non-compressed path, which is very weird as your commit
>> message said this part is correct.
>
> my rationale is that since this was considered once but forgotten the
> other time it'll be easy to add yet another code path that forgets it
> later, but I guess it won't change much anyway -- I'll fix the patch
> making it explicit again.
OK, that makes sense now.
Thanks,
Qu
>
>
> Thanks,
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH U-BOOT 1/3] btrfs: fix offset within btrfs_read_extent_reg()
2023-04-18 2:05 ` Dominique Martinet
2023-04-18 2:16 ` Qu Wenruo
@ 2023-04-18 2:20 ` Qu Wenruo
1 sibling, 0 replies; 18+ messages in thread
From: Qu Wenruo @ 2023-04-18 2:20 UTC (permalink / raw)
To: Dominique Martinet
Cc: Dominique Martinet, Marek Behún, Qu Wenruo, linux-btrfs,
u-boot
On 2023/4/18 10:05, Dominique Martinet wrote:
> Qu Wenruo wrote on Tue, Apr 18, 2023 at 09:58:37AM +0800:
>> The subject can be changed to "btrfs: fix offset when reading compressed
>> extents".
>> The original one is a little too generic.
>
> Ok.
>
>>> btrfs_file_read()
>>> -> btrfs_read_extent_reg
>>> (aligned part loop from 0x40480000 to 0x40ba0000, 128KB at a time)
>>> last read had 0x4000 bytes in extent, but aligned_end - cur limited
>>> the read to 0x3000 (offset 0x720000)
>>> -> read_and_truncate_page
>>> -> btrfs_read_extent_reg
>>> reading the last 0x1000 bytes from offset 0x73000 (end of the
>>> previous 0x4000) into 0x40ba3000
>>> here key.offset = 0x70000 so we need to use it to recover the
>>> 0x3000 offset.
>>
>> You can use "btrfs ins dump-tree" to provide a much easier to read on-disk
>> data layout.
>
> key.offset is the offset within the read call, not the offset on disk.
> The file on disk looks perfectly normal, the condition to trigger the
> bug is to have a file which size is not sector-aligned and where the
> last extent is bigger than a sector.
Forgot to mention, this bug does not only affect the mentioned case, it
affects all partial read of an extent.
Even if it's sector aligned.
>
> I had a look at dump-tree early on and couldn't actually find my file in
> there, now the problem is understood it should be easy to make a
> reproducer so I'll add this info and commands needed to reproduce (+ a
> link to a fs image just in case) in v2
>
>>> /* Preallocated or hole , fill @dest with zero */
>>> if (btrfs_file_extent_type(leaf, fi) == BTRFS_FILE_EXTENT_PREALLOC ||
>>> @@ -454,9 +456,7 @@ int btrfs_read_extent_reg(struct btrfs_path *path,
>>> if (btrfs_file_extent_compression(leaf, fi) == BTRFS_COMPRESS_NONE) {
>>> u64 logical;
>>> - logical = btrfs_file_extent_disk_bytenr(leaf, fi) +
>>> - btrfs_file_extent_offset(leaf, fi) +
>>> - offset - key.offset;
>>> + logical = btrfs_file_extent_disk_bytenr(leaf, fi) + offset;
>>
>> This is touching non-compressed path, which is very weird as your commit
>> message said this part is correct.
>
> my rationale is that since this was considered once but forgotten the
> other time it'll be easy to add yet another code path that forgets it
> later, but I guess it won't change much anyway -- I'll fix the patch
> making it explicit again.
>
>
> Thanks,
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH U-BOOT 2/3] btrfs: btrfs_file_read: allow opportunistic read until the end
2023-04-18 2:02 ` Qu Wenruo
@ 2023-04-18 2:41 ` Dominique Martinet
2023-04-18 2:53 ` Qu Wenruo
0 siblings, 1 reply; 18+ messages in thread
From: Dominique Martinet @ 2023-04-18 2:41 UTC (permalink / raw)
To: Qu Wenruo
Cc: Marek Behún, Qu Wenruo, linux-btrfs, u-boot,
Dominique Martinet
Qu Wenruo wrote on Tue, Apr 18, 2023 at 10:02:00AM +0800:
> > /* Read the tailing unaligned part*/
>
> Can we remove this part completely?
>
> IIRC if we read until the target end, the unaligned end part can be
> completely removed then.
The "Read the aligned part" loop stops at aligned_end:
> while (cur < aligned_end)
So it should be possible that the last aligned extent we consider does
not contain data until the end, e.g. an offset that ends with the
aligned end:
0 4096 4123
[extent1-----|extent2]
In this case the main loop will only read extent1 and we need the
"trailing unaligned part" if for extent2.
I have a feeling the loop could just be updated to go to the end
`while (cur < end)` as it doesn't seem to care about the end
alignment... Should I update v2 to do that instead?
This made me look at the "Read out the leading unaligned part" initial
part, and its check only looks at the sector size but it probably has
the same problem -- we want to make sure we read any leftover from a
previous extent e.g. this file:
0 4096 8192
[extent1------------------|extent2]
and reading from 4k with a sectorsize of 4k is probably bad will enter
the aligned main loop right away...
And I think that'll fail?...
Actually not quite sure what's expecting what to be aligned in there,
but I just tried some partial reads from non-zero offsets and my board
resets almost all the time so I guess I've found something else to dig
into.
This isn't a priority for me right now but I'll look a bit more when I
have more time if you haven't first.
--
Dominique Martinet | Asmadeus
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH U-BOOT 3/3] btrfs: btfs_file_read: zero trailing data if no extent was found
2023-04-18 2:04 ` Qu Wenruo
@ 2023-04-18 2:43 ` Dominique Martinet
0 siblings, 0 replies; 18+ messages in thread
From: Dominique Martinet @ 2023-04-18 2:43 UTC (permalink / raw)
To: Qu Wenruo
Cc: Marek Behún, Qu Wenruo, linux-btrfs, u-boot,
Dominique Martinet
Qu Wenruo wrote on Tue, Apr 18, 2023 at 10:04:56AM +0800:
> > This is a theorical fix only and hasn't been tested on a file that
> > actually runs this code path.
>
> IIRC there is a memset() at the very beginning of btrfs_file_read() to set
> the whole dest memory to zero.
Right, sorry.
I'll drop this and send a new patch that removes the duplicate memset
(at "The whole file is a hole" comment) instead as seeing multiple
memsets is what made me think it'd be necessary without thinking; that
can be done even if we rework the function a bit in later cleanups...
--
Dominique Martinet | Asmadeus
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH U-BOOT 2/3] btrfs: btrfs_file_read: allow opportunistic read until the end
2023-04-18 2:41 ` Dominique Martinet
@ 2023-04-18 2:53 ` Qu Wenruo
2023-04-18 3:07 ` Dominique Martinet
0 siblings, 1 reply; 18+ messages in thread
From: Qu Wenruo @ 2023-04-18 2:53 UTC (permalink / raw)
To: Dominique Martinet
Cc: Marek Behún, Qu Wenruo, linux-btrfs, u-boot,
Dominique Martinet
On 2023/4/18 10:41, Dominique Martinet wrote:
> Qu Wenruo wrote on Tue, Apr 18, 2023 at 10:02:00AM +0800:
>>> /* Read the tailing unaligned part*/
>>
>> Can we remove this part completely?
>>
>> IIRC if we read until the target end, the unaligned end part can be
>> completely removed then.
>
> The "Read the aligned part" loop stops at aligned_end:
>> while (cur < aligned_end)
>
> So it should be possible that the last aligned extent we consider does
> not contain data until the end, e.g. an offset that ends with the
> aligned end:
>
> 0 4096 4123
> [extent1-----|extent2]
>
> In this case the main loop will only read extent1 and we need the
> "trailing unaligned part" if for extent2.
>
> I have a feeling the loop could just be updated to go to the end
> `while (cur < end)` as it doesn't seem to care about the end
> alignment... Should I update v2 to do that instead?
Yeah, it would be very awesome if you can remove the tailing part
completely.
>
>
>
> This made me look at the "Read out the leading unaligned part" initial
> part, and its check only looks at the sector size but it probably has
> the same problem -- we want to make sure we read any leftover from a
> previous extent e.g. this file:
If you're talking about the same problem mentioned in patch 1, then yes,
any read in the middle of an extent would cause problems.
No matter if it's aligned or not, as btrfs would convert any unaligned
part into aligned read.
But if we fix the bug you mentioned, I see nothing can going wrong.
Unaligned read is converted into aligned one (then copy the target range
into the dest buf), and aligned part (including the tailing unaligned
part) would be properly handled then.
> 0 4096 8192
> [extent1------------------|extent2]
> and reading from 4k with a sectorsize of 4k is probably bad will enter
> the aligned main loop right away...
> And I think that'll fail?...
> Actually not quite sure what's expecting what to be aligned in there,
> but I just tried some partial reads from non-zero offsets and my board
> resets almost all the time so I guess I've found something else to dig
> into.
> This isn't a priority for me right now but I'll look a bit more when I
> have more time if you haven't first.
>
My initial plan is to merge the tailing part into the aligned loop, but
since you're already working in this part, feel free to do it.
And I'm very happy to provide any help on this.
Thanks,
Qu
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH U-BOOT 2/3] btrfs: btrfs_file_read: allow opportunistic read until the end
2023-04-18 2:53 ` Qu Wenruo
@ 2023-04-18 3:07 ` Dominique Martinet
2023-04-18 3:21 ` Qu Wenruo
0 siblings, 1 reply; 18+ messages in thread
From: Dominique Martinet @ 2023-04-18 3:07 UTC (permalink / raw)
To: Qu Wenruo
Cc: Marek Behún, Qu Wenruo, linux-btrfs, u-boot,
Dominique Martinet
Qu Wenruo wrote on Tue, Apr 18, 2023 at 10:53:41AM +0800:
> > I have a feeling the loop could just be updated to go to the end
> > `while (cur < end)` as it doesn't seem to care about the end
> > alignment... Should I update v2 to do that instead?
>
> Yeah, it would be very awesome if you can remove the tailing part
> completely.
Ok, will give it a try.
I'll want to test this a bit so might take a day or two as I have other
work to finish first.
> > This made me look at the "Read out the leading unaligned part" initial
> > part, and its check only looks at the sector size but it probably has
> > the same problem -- we want to make sure we read any leftover from a
> > previous extent e.g. this file:
>
> If you're talking about the same problem mentioned in patch 1, then yes, any
> read in the middle of an extent would cause problems.
No, was just thinking the leading part being a separate loop doesn't
seem to make sense either as the code shouldn't care about sector size
alignemnt but about full extents.
If the main loop handles everything correctly then the leading if can
also be removed along with "read_and_truncate_page" that would no longer
be used.
I'll give this a try as well and report back.
> No matter if it's aligned or not, as btrfs would convert any unaligned part
> into aligned read.
Yes I don't see where it would fail (except that my board does crash),
I guess that at this point I should spend some time building a qemu
u-boot and hooking up gdb will be faster..
> My initial plan is to merge the tailing part into the aligned loop, but
> since you're already working in this part, feel free to do it.
Yes, sure -- removing the if is easy, I'd just rather not make it fail
for someone else :)
--
Dominique Martinet | Asmadeus
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH U-BOOT 2/3] btrfs: btrfs_file_read: allow opportunistic read until the end
2023-04-18 3:07 ` Dominique Martinet
@ 2023-04-18 3:21 ` Qu Wenruo
2023-04-18 3:53 ` Dominique Martinet
0 siblings, 1 reply; 18+ messages in thread
From: Qu Wenruo @ 2023-04-18 3:21 UTC (permalink / raw)
To: Dominique Martinet
Cc: Marek Behún, Qu Wenruo, linux-btrfs, u-boot,
Dominique Martinet
On 2023/4/18 11:07, Dominique Martinet wrote:
> Qu Wenruo wrote on Tue, Apr 18, 2023 at 10:53:41AM +0800:
>>> I have a feeling the loop could just be updated to go to the end
>>> `while (cur < end)` as it doesn't seem to care about the end
>>> alignment... Should I update v2 to do that instead?
>>
>> Yeah, it would be very awesome if you can remove the tailing part
>> completely.
>
> Ok, will give it a try.
> I'll want to test this a bit so might take a day or two as I have other
> work to finish first.
>
>>> This made me look at the "Read out the leading unaligned part" initial
>>> part, and its check only looks at the sector size but it probably has
>>> the same problem -- we want to make sure we read any leftover from a
>>> previous extent e.g. this file:
>>
>> If you're talking about the same problem mentioned in patch 1, then yes, any
>> read in the middle of an extent would cause problems.
>
> No, was just thinking the leading part being a separate loop doesn't
> seem to make sense either as the code shouldn't care about sector size
> alignemnt but about full extents.
The main concern related to the leading unaligned part is, we need to
skip something unaligned from the beginning , while all other situations
never need to skip such case (they at most skip the tailing part).
Maybe we can do some extra calculation to handle it, but I'm not in a
hurry to address the leading part.
Another thing is, for all other fs-ish interface (kernel/fuse etc), they
all read/write in certain block size, then handle unaligned part from a
higher layer.
Thus I prefer to have most things handled in an aligned fashion, and
only handle the unaligned part specially.
But if you can find an elegant way to handle all cases, that would be
really awesome!
Thanks,
Qu
> If the main loop handles everything correctly then the leading if can
> also be removed along with "read_and_truncate_page" that would no longer
> be used.
>
> I'll give this a try as well and report back.
>
>> No matter if it's aligned or not, as btrfs would convert any unaligned part
>> into aligned read.
>
> Yes I don't see where it would fail (except that my board does crash),
> I guess that at this point I should spend some time building a qemu
> u-boot and hooking up gdb will be faster..
>
>> My initial plan is to merge the tailing part into the aligned loop, but
>> since you're already working in this part, feel free to do it.
>
> Yes, sure -- removing the if is easy, I'd just rather not make it fail
> for someone else :)
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH U-BOOT 2/3] btrfs: btrfs_file_read: allow opportunistic read until the end
2023-04-18 3:21 ` Qu Wenruo
@ 2023-04-18 3:53 ` Dominique Martinet
2023-04-18 5:17 ` Dominique Martinet
2023-04-18 7:15 ` Qu Wenruo
0 siblings, 2 replies; 18+ messages in thread
From: Dominique Martinet @ 2023-04-18 3:53 UTC (permalink / raw)
To: Qu Wenruo
Cc: Marek Behún, Qu Wenruo, linux-btrfs, u-boot,
Dominique Martinet
Qu Wenruo wrote on Tue, Apr 18, 2023 at 11:21:00AM +0800:
> > No, was just thinking the leading part being a separate loop doesn't
> > seem to make sense either as the code shouldn't care about sector size
> > alignemnt but about full extents.
>
> The main concern related to the leading unaligned part is, we need to skip
> something unaligned from the beginning , while all other situations never
> need to skip such case (they at most skip the tailing part).
Ok, there is one exception for inline extents apparently.. But I'm not
still not convinced the `aligned_start != file_offset` check is enough
for that either; I'd say it's unlikely but the inline part can be
compressed, so we could have a file which has > 4k (sectorsize) of
expanded data, so a read from the 4k offset would skip the special
handling and fail (reading the whole extent in dest)
Even if that's not possible, reading just the first 10 bytes of an
inline extent will be aligned and go through the main loop which just
reads the whole extent, so it'll need the same handling as the regular
btrfs_read_extent_reg handling at which point it might just as well
handle start offset too.
That aside taking the loop in order:
- lookup_data_extent doesn't care (used in heading/tail)
- skipping holes don't care as they explicitely put cursor at start of
next extent (or bail out if nothing next)
- inline needs fixing anyway as said above
- PREALLOC or nothing on disk also goes straight to next and is ok
- ah, I see what you meant now, we need to substract the current
position within the extent to extent_num_bytes...
That's also already a problem, though; to take the same example:
0 8k 16k
[extent1 | extent2 ... ]
reading from 4k onwards will try to read
min(extent_num_bytes, end-cur) = min(8k, 12k) = 8k
from the 4k offset which goes over the end of the extent.
That could actually be my resets from the previous mail.
So either the first check should just lookup the extent and check that
extent start matches current offset instead of checking for sectorsize
alignment, or we can just fix the loop and remove the first if.
--
Dominique Martinet | Asmadeus
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH U-BOOT 2/3] btrfs: btrfs_file_read: allow opportunistic read until the end
2023-04-18 3:53 ` Dominique Martinet
@ 2023-04-18 5:17 ` Dominique Martinet
2023-04-18 7:15 ` Qu Wenruo
1 sibling, 0 replies; 18+ messages in thread
From: Dominique Martinet @ 2023-04-18 5:17 UTC (permalink / raw)
To: Qu Wenruo
Cc: Marek Behún, Qu Wenruo, linux-btrfs, u-boot,
Dominique Martinet
Dominique Martinet wrote on Tue, Apr 18, 2023 at 12:53:35PM +0900:
> Ok, there is one exception for inline extents apparently.. But I'm not
> still not convinced the `aligned_start != file_offset` check is enough
> for that either; I'd say it's unlikely but the inline part can be
> compressed, so we could have a file which has > 4k (sectorsize) of
> expanded data, so a read from the 4k offset would skip the special
> handling and fail (reading the whole extent in dest)
(Wasn't able to create such a file, I assume that means the uncompressed
data must fits in a page -- if we deal with arm machines with 16k or 64k
page size that'll probably change things again but I'll just pretend
sector size will magically match in this case..)
> Even if that's not possible, reading just the first 10 bytes of an
> inline extent will be aligned and go through the main loop which just
> reads the whole extent, so it'll need the same handling as the regular
> btrfs_read_extent_reg handling at which point it might just as well
> handle start offset too.
Question regarding inline extent: the main loop goes straight out after
reading an inline extent, assuming it is always alone (or last) -- is
that correct?
By playing with `filefrag -v` I could sometimes see files that
temporarily have inline + a delalloc extent, but wasn't able to make a
file that kept the inline extent after appending more data, so I guess
it is also sane enough... Just making it continue and letting the loop
end seems just as simple now there is no trailing if, but if inline
extents cannot be mixed I'll be happy to keep it going out.
back to btrfs_read_extent_reg:
merging the tail back into the main loop breaks the first assert
(IS_ALIGNED(len, fs_info->sectorsize) in particular).
The old loop invocation made sure it was aligned and
read_and_truncate_page used to take care of calling it with a bigger
buffer when it was not.
I was only looking at the compressed path and that does not care about
'len' alignment because it makes an intermediate copy for decompression,
but the BTRFS_COMPRESS_NONE's read_extent_data might care?
I didn't see anything that actually requires alignment there (len in
particular should be ok, but even offset->logical seems to properly be
used as "being part of a range" in lookups so alignment doesn't actually
matter), but if this isn't tested I can understand wanting to be more
careful there.
Ok so this is rightfully less obvious than I had first assumed -- sorry
for rushing in. Let's do baby steps:
- I'll resend just the first patch shortly, it's a real fix and I'll be
using it right away.
- I'll double check 'len' doesn't need to be aligned in
btrfs_read_extent_reg and just rework the main loop to allow removing
the tail exception, that'll avoid a double read in many cases and I
think it's worth doing.
Might take a bit more time as I want to finish some other work, let's
say a few days.
- That leaves the two issues I brought up with the main loop:
* inline case ignoring end; this is minor enough but easy to fix
* btrfs_read_extent_reg() assuming start of extent in its length
agument; I believe it'll be easier to stop trying to set a upper limit
in the main loop and just have btrfs_read_extent_reg() do it itself, we
can use its return value to know how much was actually read.
(Probably just substract offset - key.offset to extent_num_bytes to
have a new cap, but I still don't understand btrfs_file_extent_offset()
in all of this as it's always 0 when I looked)
Will try to do that over the next few weeks, but if you want to look at
it feel free to do this before me.
- At this point it might be worth considering removing the initial check
as that also makes an extra small read in the unaligned case, but it's
not a bug and can wait.
What do you think?
--
Dominique Martinet | Asmadeus
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH U-BOOT 2/3] btrfs: btrfs_file_read: allow opportunistic read until the end
2023-04-18 3:53 ` Dominique Martinet
2023-04-18 5:17 ` Dominique Martinet
@ 2023-04-18 7:15 ` Qu Wenruo
1 sibling, 0 replies; 18+ messages in thread
From: Qu Wenruo @ 2023-04-18 7:15 UTC (permalink / raw)
To: Dominique Martinet
Cc: Marek Behún, Qu Wenruo, linux-btrfs, u-boot,
Dominique Martinet
On 2023/4/18 11:53, Dominique Martinet wrote:
> Qu Wenruo wrote on Tue, Apr 18, 2023 at 11:21:00AM +0800:
>>> No, was just thinking the leading part being a separate loop doesn't
>>> seem to make sense either as the code shouldn't care about sector size
>>> alignemnt but about full extents.
>>
>> The main concern related to the leading unaligned part is, we need to skip
>> something unaligned from the beginning , while all other situations never
>> need to skip such case (they at most skip the tailing part).
>
> Ok, there is one exception for inline extents apparently.. But I'm not
> still not convinced the `aligned_start != file_offset` check is enough
> for that either; I'd say it's unlikely but the inline part can be
> compressed, so we could have a file which has > 4k (sectorsize) of
> expanded data, so a read from the 4k offset would skip the special
> handling and fail (reading the whole extent in dest)
Btrfs inline has a limit to sectorsize.
That means, inlined compressed extent can at most be 4K sized (if 4K is
our sector size).
So that won't be a problem.
>
> Even if that's not possible, reading just the first 10 bytes of an
> inline extent will be aligned and go through the main loop which just
> reads the whole extent, so it'll need the same handling as the regular
> btrfs_read_extent_reg handling at which point it might just as well
> handle start offset too.
If we just read 10 bytes, the aligned part should handle it well.
My real concern is what if we read 10 bytes at offset 10 bytes.
If this can be handled in the same way of aligned read (and still be
reasonable readable), then it would be awesome.
>
>
> That aside taking the loop in order:
> - lookup_data_extent doesn't care (used in heading/tail)
> - skipping holes don't care as they explicitely put cursor at start of
> next extent (or bail out if nothing next)
> - inline needs fixing anyway as said above
> - PREALLOC or nothing on disk also goes straight to next and is ok
> - ah, I see what you meant now, we need to substract the current
> position within the extent to extent_num_bytes...
> That's also already a problem, though; to take the same example:
> 0 8k 16k
> [extent1 | extent2 ... ]
> reading from 4k onwards will try to read
> min(extent_num_bytes, end-cur) = min(8k, 12k) = 8k
> from the 4k offset which goes over the end of the extent.
That's indeed a problem.
As most of the Uboot fs drivers only care read the whole file, never
really utilize the ability to read part of the file, that path is not
properly tested.
(Some driver, IIRC ubifs?, doesn't even allow read with non-zero offset)
Thanks,
Qu
>
> That could actually be my resets from the previous mail.
>
>
> So either the first check should just lookup the extent and check that
> extent start matches current offset instead of checking for sectorsize
> alignment, or we can just fix the loop and remove the first if.
>
^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2023-04-18 14:39 UTC | newest]
Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-04-18 1:17 [PATCH U-BOOT 0/3] btrfs: fix and improve read code Dominique Martinet
2023-04-18 1:17 ` [PATCH U-BOOT 1/3] btrfs: fix offset within btrfs_read_extent_reg() Dominique Martinet
2023-04-18 1:58 ` Qu Wenruo
2023-04-18 2:05 ` Dominique Martinet
2023-04-18 2:16 ` Qu Wenruo
2023-04-18 2:20 ` Qu Wenruo
2023-04-18 1:17 ` [PATCH U-BOOT 2/3] btrfs: btrfs_file_read: allow opportunistic read until the end Dominique Martinet
2023-04-18 2:02 ` Qu Wenruo
2023-04-18 2:41 ` Dominique Martinet
2023-04-18 2:53 ` Qu Wenruo
2023-04-18 3:07 ` Dominique Martinet
2023-04-18 3:21 ` Qu Wenruo
2023-04-18 3:53 ` Dominique Martinet
2023-04-18 5:17 ` Dominique Martinet
2023-04-18 7:15 ` Qu Wenruo
2023-04-18 1:17 ` [PATCH U-BOOT 3/3] btrfs: btfs_file_read: zero trailing data if no extent was found Dominique Martinet
2023-04-18 2:04 ` Qu Wenruo
2023-04-18 2:43 ` Dominique Martinet
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox