* [PATCH 0/2] btrfs: more explaination on extent_map members @ 2024-04-02 6:23 Qu Wenruo 2024-04-02 6:23 ` [PATCH 1/2] btrfs: add extra comments " Qu Wenruo 2024-04-02 6:23 ` [PATCH 2/2] btrfs: simplify the inline extent map creation Qu Wenruo 0 siblings, 2 replies; 7+ messages in thread From: Qu Wenruo @ 2024-04-02 6:23 UTC (permalink / raw) To: linux-btrfs Btrfs uses extent_map to represent a in-memory file extent. There are severam members that are 1:1 mappe in on-disk file extent items and extent maps: - extent_map::start == key.offset - extent_map::len == file_extent_num_bytes - extent_map::ram_bytes == file_extent_ram_bytes But that's all, the remaining are pretty different: - Use block_start to indicate holes/inline extents Meanwhile btrfs on-disk file extent items go with a dedicated type for inline extents, and disk_bytenr 0 for holes. - Extra members for fsync optimization I'm still not 100% sure how mod_start and mod_len really works though. - Weird block_start/orig_block_len/orig_start In theory we can directly go with the same file_extent_disk_bytenr, file_extent_disk_num_bytes and file_extent_offset to calculate the remaining members (block_start/orig_start/orig_block_len/block_len). But for whatever reason, we didn't go that path and have a hell of weird and inconsistent calculation for them. I do not have the confidence to handle the mess yet, but as the first step, I would add comments for those members mostly according to btrfs_extent_item_to_extent_map(), and hopefully we can improve the situation in not-far-away future. Qu Wenruo (2): btrfs: add extra comments on extent_map members btrfs: simplify the inline extent map creation fs/btrfs/extent_map.h | 62 ++++++++++++++++++++++++++++++++++++++++++- fs/btrfs/file-item.c | 14 ++++++---- 2 files changed, 70 insertions(+), 6 deletions(-) -- 2.44.0 ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 1/2] btrfs: add extra comments on extent_map members 2024-04-02 6:23 [PATCH 0/2] btrfs: more explaination on extent_map members Qu Wenruo @ 2024-04-02 6:23 ` Qu Wenruo 2024-04-02 7:33 ` Andrea Gelmini 2024-04-02 15:45 ` Filipe Manana 2024-04-02 6:23 ` [PATCH 2/2] btrfs: simplify the inline extent map creation Qu Wenruo 1 sibling, 2 replies; 7+ messages in thread From: Qu Wenruo @ 2024-04-02 6:23 UTC (permalink / raw) To: linux-btrfs The extent_map structure is very critical to btrfs, as it is involved for both read and write paths. Unfortunately the structure is not properly explained, making it pretty hard to understand nor to do further improvement. This patch would add extra comments explaining the major numbers base on my code reading. Hopefully we can find more members to cleanup in the future. Signed-off-by: Qu Wenruo <wqu@suse.com> --- fs/btrfs/extent_map.h | 62 ++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 61 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/extent_map.h b/fs/btrfs/extent_map.h index c5a098c99cc6..30322defcd03 100644 --- a/fs/btrfs/extent_map.h +++ b/fs/btrfs/extent_map.h @@ -37,21 +37,81 @@ enum { }; /* + * This extent_map structure is an in-memory representation of file extents, + * it would represent all file extents (including holes, no matter if we have + * hole file extents). + * * Keep this structure as compact as possible, as we can have really large * amounts of allocated extent maps at any time. */ struct extent_map { struct rb_node rb_node; - /* all of these are in bytes */ + /* All of these are in bytes */ + + /* + * File offset of the file extent. matching key.offset of + * (INO EXTENT_DATA FILEPOS) key. + */ u64 start; + + /* + * Length of the file extent. + * For non-inlined file extents it's btrfs_file_extent_item::num_bytes. + * For inlined file extents it's sectorsize. (as there is no reliable + * btrfs_file_extent::num_bytes). + */ u64 len; + + /* + * The modified range start/length, these are in-memory-only + * members for fsync/logtree optimization. + */ u64 mod_start; u64 mod_len; + + /* + * The file offset of the original file extent before splitting. + * + * This is an in-memory-only member, mathcing + * em::start - btrfs_file_extent_item::offset for regular/preallocated + * extents. EXTENT_MAP_HOLE otherwise. + */ u64 orig_start; + + /* + * The full on-disk extent length, matching + * btrfs_file_extent_item::disk_num_bytes. + */ u64 orig_block_len; + + /* + * The decompressed size of the whole on-disk extent, matching + * btrfs_file_extent_item::ram_bytes. + * + * For non-compressed extents, this matches orig_block_len. + */ u64 ram_bytes; + + /* + * The on-disk logical bytenr for the file extent. + * + * For compressed extents it matches btrfs_file_extent_item::disk_bytenr. + * For uncompressed extents it matches + * btrfs_file_extent_item::disk_bytenr + btrfs_file_extent_item::offset + * + * For hole extents it is EXTENT_MAP_HOLE and for inline extents it is + * EXTENT_MAP_INLINE. + */ u64 block_start; + + /* + * The on-disk length for the file extent. + * + * For compressed extents it matches btrfs_file_extent_item::disk_num_bytes. + * For uncompressed extents it matches em::len. + * Otherwise -1 (aka doesn't make much sense). + */ u64 block_len; /* -- 2.44.0 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 1/2] btrfs: add extra comments on extent_map members 2024-04-02 6:23 ` [PATCH 1/2] btrfs: add extra comments " Qu Wenruo @ 2024-04-02 7:33 ` Andrea Gelmini 2024-04-02 8:25 ` Qu Wenruo 2024-04-02 15:45 ` Filipe Manana 1 sibling, 1 reply; 7+ messages in thread From: Andrea Gelmini @ 2024-04-02 7:33 UTC (permalink / raw) To: Qu Wenruo; +Cc: linux-btrfs Il giorno mar 2 apr 2024 alle ore 08:24 Qu Wenruo <wqu@suse.com> ha scritto: > > + * This is an in-memory-only member, matching Just a stupid fix about "mathcing". Thanks a lot Qu, Gelma ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 1/2] btrfs: add extra comments on extent_map members 2024-04-02 7:33 ` Andrea Gelmini @ 2024-04-02 8:25 ` Qu Wenruo 0 siblings, 0 replies; 7+ messages in thread From: Qu Wenruo @ 2024-04-02 8:25 UTC (permalink / raw) To: Andrea Gelmini; +Cc: linux-btrfs 在 2024/4/2 18:03, Andrea Gelmini 写道: > Il giorno mar 2 apr 2024 alle ore 08:24 Qu Wenruo <wqu@suse.com> ha scritto: >> >> + * This is an in-memory-only member, matching > > Just a stupid fix about "mathcing". Thanks for pointing out, I guess David is already giving up on my grammar... Thanks, Qu > > Thanks a lot Qu, > Gelma ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 1/2] btrfs: add extra comments on extent_map members 2024-04-02 6:23 ` [PATCH 1/2] btrfs: add extra comments " Qu Wenruo 2024-04-02 7:33 ` Andrea Gelmini @ 2024-04-02 15:45 ` Filipe Manana 1 sibling, 0 replies; 7+ messages in thread From: Filipe Manana @ 2024-04-02 15:45 UTC (permalink / raw) To: Qu Wenruo; +Cc: linux-btrfs On Tue, Apr 2, 2024 at 7:24 AM Qu Wenruo <wqu@suse.com> wrote: > > The extent_map structure is very critical to btrfs, as it is involved > for both read and write paths. > > Unfortunately the structure is not properly explained, making it pretty > hard to understand nor to do further improvement. > > This patch would add extra comments explaining the major numbers base on would add -> adds And by "numbers" I think you wanted to say "members"? base -> based > my code reading. > Hopefully we can find more members to cleanup in the future. > > Signed-off-by: Qu Wenruo <wqu@suse.com> > --- > fs/btrfs/extent_map.h | 62 ++++++++++++++++++++++++++++++++++++++++++- > 1 file changed, 61 insertions(+), 1 deletion(-) > > diff --git a/fs/btrfs/extent_map.h b/fs/btrfs/extent_map.h > index c5a098c99cc6..30322defcd03 100644 > --- a/fs/btrfs/extent_map.h > +++ b/fs/btrfs/extent_map.h > @@ -37,21 +37,81 @@ enum { > }; > > /* > + * This extent_map structure is an in-memory representation of file extents, > + * it would represent all file extents (including holes, no matter if we have > + * hole file extents). This can simply be: "This structure represents file extents and holes." Which also fixes some grammar (it would represent -> it represents) and the stuff between parentheses is confusing. > + * > * Keep this structure as compact as possible, as we can have really large > * amounts of allocated extent maps at any time. > */ > struct extent_map { > struct rb_node rb_node; > > - /* all of these are in bytes */ > + /* All of these are in bytes */ Please add punctuation as well. > + > + /* > + * File offset of the file extent. matching key.offset of > + * (INO EXTENT_DATA FILEPOS) key. > + */ "File offset matching the offset of a BTRFS_EXTENT_ITEM_KEY key." That's a lot more clear than using the shortened format from tree-dump. Also note that "matching" should start with a capital letter (beginning of sentence. > u64 start; > + > + /* > + * Length of the file extent. > + * For non-inlined file extents it's btrfs_file_extent_item::num_bytes. > + * For inlined file extents it's sectorsize. (as there is no reliable > + * btrfs_file_extent::num_bytes). Don't put whole sentences in parentheses after a punctuation mark. This can also be rephrased in a more clear way: "For inline extents it's sectorsize and btrfs_file_extent_item::num_bytes has data and not a valid length, because inline data starts at offsetof(struct btrfs_file_extent_item, disk_bytenr)." > + */ > u64 len; > + > + /* > + * The modified range start/length, these are in-memory-only > + * members for fsync/logtree optimization. > + */ These were initially used to avoid logging the same csum ranges multiple times when extent maps get merged. But from a quick look and experiment we don't actually need them in order to avoid that (we don't merge new modified extents). I'll send a patch to remove them and update the fsync logic. > u64 mod_start; > u64 mod_len; > + > + /* > + * The file offset of the original file extent before splitting. > + * > + * This is an in-memory-only member, mathcing in-memory-only -> in-memory only mathcing -> matching > + * em::start - btrfs_file_extent_item::offset for regular/preallocated Instead of em, which is only a typical variable name we use for extent maps, use 'extent_map' so that it's precise and leaves no room for confusion. > + * extents. EXTENT_MAP_HOLE otherwise. > + */ > u64 orig_start; > + > + /* > + * The full on-disk extent length, matching > + * btrfs_file_extent_item::disk_num_bytes. > + */ > u64 orig_block_len; > + > + /* > + * The decompressed size of the whole on-disk extent, matching > + * btrfs_file_extent_item::ram_bytes. > + * > + * For non-compressed extents, this matches orig_block_len. > + */ > u64 ram_bytes; > + > + /* > + * The on-disk logical bytenr for the file extent. > + * > + * For compressed extents it matches btrfs_file_extent_item::disk_bytenr. > + * For uncompressed extents it matches > + * btrfs_file_extent_item::disk_bytenr + btrfs_file_extent_item::offset > + * > + * For hole extents it is EXTENT_MAP_HOLE and for inline extents it is > + * EXTENT_MAP_INLINE. > + */ > u64 block_start; > + > + /* > + * The on-disk length for the file extent. > + * > + * For compressed extents it matches btrfs_file_extent_item::disk_num_bytes. > + * For uncompressed extents it matches em::len. Same as before, use 'extent_map::len' instead of 'em::len'. > + * Otherwise -1 (aka doesn't make much sense). That is cryptic... Should be something like: "If the extent map represents a hole, then it's -1 and shouldn't be used." Thanks. > + */ > u64 block_len; > > /* > -- > 2.44.0 > > ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 2/2] btrfs: simplify the inline extent map creation 2024-04-02 6:23 [PATCH 0/2] btrfs: more explaination on extent_map members Qu Wenruo 2024-04-02 6:23 ` [PATCH 1/2] btrfs: add extra comments " Qu Wenruo @ 2024-04-02 6:23 ` Qu Wenruo 2024-04-02 15:50 ` Filipe Manana 1 sibling, 1 reply; 7+ messages in thread From: Qu Wenruo @ 2024-04-02 6:23 UTC (permalink / raw) To: linux-btrfs With the tree-checker ensuring all inline file extents starts at file offset 0 and has a length no larger than sectorsize, we can simplify the calculation to assigned those fixes values directly. Signed-off-by: Qu Wenruo <wqu@suse.com> --- fs/btrfs/file-item.c | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-) diff --git a/fs/btrfs/file-item.c b/fs/btrfs/file-item.c index e58fb5347e65..de3ccee38572 100644 --- a/fs/btrfs/file-item.c +++ b/fs/btrfs/file-item.c @@ -1265,18 +1265,19 @@ void btrfs_extent_item_to_extent_map(struct btrfs_inode *inode, struct extent_buffer *leaf = path->nodes[0]; const int slot = path->slots[0]; struct btrfs_key key; - u64 extent_start, extent_end; + u64 extent_start; u64 bytenr; u8 type = btrfs_file_extent_type(leaf, fi); int compress_type = btrfs_file_extent_compression(leaf, fi); btrfs_item_key_to_cpu(leaf, &key, slot); extent_start = key.offset; - extent_end = btrfs_file_extent_end(path); em->ram_bytes = btrfs_file_extent_ram_bytes(leaf, fi); em->generation = btrfs_file_extent_generation(leaf, fi); if (type == BTRFS_FILE_EXTENT_REG || type == BTRFS_FILE_EXTENT_PREALLOC) { + u64 extent_end = btrfs_file_extent_end(path); + em->start = extent_start; em->len = extent_end - extent_start; em->orig_start = extent_start - @@ -1299,9 +1300,12 @@ void btrfs_extent_item_to_extent_map(struct btrfs_inode *inode, em->flags |= EXTENT_FLAG_PREALLOC; } } else if (type == BTRFS_FILE_EXTENT_INLINE) { + /* Tree-checker has ensured this. */ + ASSERT(extent_start == 0); + em->block_start = EXTENT_MAP_INLINE; - em->start = extent_start; - em->len = extent_end - extent_start; + em->start = 0; + em->len = fs_info->sectorsize; /* * Initialize orig_start and block_len with the same values * as in inode.c:btrfs_get_extent(). @@ -1336,7 +1340,7 @@ u64 btrfs_file_extent_end(const struct btrfs_path *path) if (btrfs_file_extent_type(leaf, fi) == BTRFS_FILE_EXTENT_INLINE) { end = btrfs_file_extent_ram_bytes(leaf, fi); - end = ALIGN(key.offset + end, leaf->fs_info->sectorsize); + end = leaf->fs_info->sectorsize; } else { end = key.offset + btrfs_file_extent_num_bytes(leaf, fi); } -- 2.44.0 ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] btrfs: simplify the inline extent map creation 2024-04-02 6:23 ` [PATCH 2/2] btrfs: simplify the inline extent map creation Qu Wenruo @ 2024-04-02 15:50 ` Filipe Manana 0 siblings, 0 replies; 7+ messages in thread From: Filipe Manana @ 2024-04-02 15:50 UTC (permalink / raw) To: Qu Wenruo; +Cc: linux-btrfs On Tue, Apr 2, 2024 at 7:24 AM Qu Wenruo <wqu@suse.com> wrote: > > With the tree-checker ensuring all inline file extents starts at file > offset 0 and has a length no larger than sectorsize, we can simplify the > calculation to assigned those fixes values directly. > > Signed-off-by: Qu Wenruo <wqu@suse.com> > --- > fs/btrfs/file-item.c | 14 +++++++++----- > 1 file changed, 9 insertions(+), 5 deletions(-) > > diff --git a/fs/btrfs/file-item.c b/fs/btrfs/file-item.c > index e58fb5347e65..de3ccee38572 100644 > --- a/fs/btrfs/file-item.c > +++ b/fs/btrfs/file-item.c > @@ -1265,18 +1265,19 @@ void btrfs_extent_item_to_extent_map(struct btrfs_inode *inode, > struct extent_buffer *leaf = path->nodes[0]; > const int slot = path->slots[0]; > struct btrfs_key key; > - u64 extent_start, extent_end; > + u64 extent_start; > u64 bytenr; > u8 type = btrfs_file_extent_type(leaf, fi); > int compress_type = btrfs_file_extent_compression(leaf, fi); > > btrfs_item_key_to_cpu(leaf, &key, slot); > extent_start = key.offset; > - extent_end = btrfs_file_extent_end(path); > em->ram_bytes = btrfs_file_extent_ram_bytes(leaf, fi); > em->generation = btrfs_file_extent_generation(leaf, fi); > if (type == BTRFS_FILE_EXTENT_REG || > type == BTRFS_FILE_EXTENT_PREALLOC) { > + u64 extent_end = btrfs_file_extent_end(path); > + This can be made const or better, since it's only used in one place now and fits within 80 characters, just use the expression directly instead of using a variable. > em->start = extent_start; > em->len = extent_end - extent_start; Which is here: em->len = btrfs_file_extent_end(path) - extent_start; > em->orig_start = extent_start - > @@ -1299,9 +1300,12 @@ void btrfs_extent_item_to_extent_map(struct btrfs_inode *inode, > em->flags |= EXTENT_FLAG_PREALLOC; > } > } else if (type == BTRFS_FILE_EXTENT_INLINE) { > + /* Tree-checker has ensured this. */ > + ASSERT(extent_start == 0); > + > em->block_start = EXTENT_MAP_INLINE; > - em->start = extent_start; > - em->len = extent_end - extent_start; > + em->start = 0; > + em->len = fs_info->sectorsize; > /* > * Initialize orig_start and block_len with the same values > * as in inode.c:btrfs_get_extent(). > @@ -1336,7 +1340,7 @@ u64 btrfs_file_extent_end(const struct btrfs_path *path) > > if (btrfs_file_extent_type(leaf, fi) == BTRFS_FILE_EXTENT_INLINE) { > end = btrfs_file_extent_ram_bytes(leaf, fi); > - end = ALIGN(key.offset + end, leaf->fs_info->sectorsize); > + end = leaf->fs_info->sectorsize; So both assignments to "end" should be removed, not just the second one, as the new one doesn't need the previous and makes it pointless. Thanks. > } else { > end = key.offset + btrfs_file_extent_num_bytes(leaf, fi); > } > -- > 2.44.0 > > ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2024-04-02 15:51 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-04-02 6:23 [PATCH 0/2] btrfs: more explaination on extent_map members Qu Wenruo 2024-04-02 6:23 ` [PATCH 1/2] btrfs: add extra comments " Qu Wenruo 2024-04-02 7:33 ` Andrea Gelmini 2024-04-02 8:25 ` Qu Wenruo 2024-04-02 15:45 ` Filipe Manana 2024-04-02 6:23 ` [PATCH 2/2] btrfs: simplify the inline extent map creation Qu Wenruo 2024-04-02 15:50 ` Filipe Manana
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox