From: Qu Wenruo <quwenruo.btrfs@gmx.com>
To: David Sterba <dsterba@suse.com>, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH 01/10] btrfs: use precalculated sectorsize_bits from fs_info
Date: Mon, 2 Nov 2020 22:18:38 +0800 [thread overview]
Message-ID: <c798fbcc-c7e9-fca6-992b-bd006d6a61b4@gmx.com> (raw)
In-Reply-To: <5d586f76-7cad-b7be-60d3-44c8d3b67623@gmx.com>
[-- Attachment #1.1: Type: text/plain, Size: 11049 bytes --]
On 2020/11/2 下午10:05, Qu Wenruo wrote:
>
>
> On 2020/10/29 下午10:27, David Sterba wrote:
>> We do a lot of calculations where we divide or multiply by sectorsize.
>> We also know and make sure that sectorsize is a power of two, so this
>> means all divisions can be turned to shifts and avoid eg. expensive
>> u64/u32 divisions.
>>
>> The type is u32 as it's more register friendly on x86_64 compared to u8
>> and the resulting assembly is smaller (movzbl vs movl).
>>
>> There's also superblock s_blocksize_bits but it's usually one more
>> pointer dereference farther than fs_info.
>>
>> Signed-off-by: David Sterba <dsterba@suse.com>
>> ---
>> fs/btrfs/ctree.h | 1 +
>> fs/btrfs/disk-io.c | 2 ++
>> fs/btrfs/extent-tree.c | 2 +-
>> fs/btrfs/file-item.c | 11 ++++++-----
>> fs/btrfs/free-space-tree.c | 12 +++++++-----
>> fs/btrfs/ordered-data.c | 3 +--
>> fs/btrfs/scrub.c | 12 ++++++------
>> fs/btrfs/tree-checker.c | 3 ++-
>> 8 files changed, 26 insertions(+), 20 deletions(-)
>>
>> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
>> index 8a83bce3225c..87c40cc5c42e 100644
>> --- a/fs/btrfs/ctree.h
>> +++ b/fs/btrfs/ctree.h
>> @@ -931,6 +931,7 @@ struct btrfs_fs_info {
>> /* Cached block sizes */
>> u32 nodesize;
>> u32 sectorsize;
>> + u32 sectorsize_bits;
>
> For the bit shift, it can alwasy be contained in one u8.
> Since one u32 is only to be at most 32 bits, it can be easily contained
> in u8 whose max value is 255.
>
> This should allow us to pack several u8 together to reduce some memory
> usage.
>
> Despite this, the series is pretty good.
>
> Thanks,
> Qu
>> u32 stripesize;
>>
>> /* Block groups and devices containing active swapfiles. */
>> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
>> index 601a7ab2adb4..4e67c122389c 100644
>> --- a/fs/btrfs/disk-io.c
>> +++ b/fs/btrfs/disk-io.c
>> @@ -2812,6 +2812,7 @@ void btrfs_init_fs_info(struct btrfs_fs_info *fs_info)
>> /* Usable values until the real ones are cached from the superblock */
>> fs_info->nodesize = 4096;
>> fs_info->sectorsize = 4096;
>> + fs_info->sectorsize_bits = ilog2(4096);
This may sounds like a nitpicking, but what about "ffs(4096) - 1"?
IMHO it should be a little more faster than ilog2, especially when we
have ensure all sector size is power of 2 already.
>> fs_info->stripesize = 4096;
>>
>> spin_lock_init(&fs_info->swapfile_pins_lock);
>> @@ -3076,6 +3077,7 @@ int __cold open_ctree(struct super_block *sb, struct btrfs_fs_devices *fs_device
>> /* Cache block sizes */
>> fs_info->nodesize = nodesize;
>> fs_info->sectorsize = sectorsize;
>> + fs_info->sectorsize_bits = ilog2(sectorsize);
Same here.
Thanks,
Qu
>> fs_info->stripesize = stripesize;
>>
>> /*
>> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
>> index 5fd60b13f4f8..29ac97248942 100644
>> --- a/fs/btrfs/extent-tree.c
>> +++ b/fs/btrfs/extent-tree.c
>> @@ -2145,7 +2145,7 @@ u64 btrfs_csum_bytes_to_leaves(struct btrfs_fs_info *fs_info, u64 csum_bytes)
>> csum_size = BTRFS_MAX_ITEM_SIZE(fs_info);
>> num_csums_per_leaf = div64_u64(csum_size,
>> (u64)btrfs_super_csum_size(fs_info->super_copy));
>> - num_csums = div64_u64(csum_bytes, fs_info->sectorsize);
>> + num_csums = csum_bytes >> fs_info->sectorsize_bits;
>> num_csums += num_csums_per_leaf - 1;
>> num_csums = div64_u64(num_csums, num_csums_per_leaf);
>> return num_csums;
>> diff --git a/fs/btrfs/file-item.c b/fs/btrfs/file-item.c
>> index 816f57d52fc9..d8cd467b4e0c 100644
>> --- a/fs/btrfs/file-item.c
>> +++ b/fs/btrfs/file-item.c
>> @@ -119,7 +119,7 @@ static inline u32 max_ordered_sum_bytes(struct btrfs_fs_info *fs_info,
>> {
>> u32 ncsums = (PAGE_SIZE - sizeof(struct btrfs_ordered_sum)) / csum_size;
>>
>> - return ncsums * fs_info->sectorsize;
>> + return ncsums << fs_info->sectorsize_bits;
>> }
>>
>> int btrfs_insert_file_extent(struct btrfs_trans_handle *trans,
>> @@ -369,7 +369,7 @@ blk_status_t btrfs_lookup_bio_sums(struct inode *inode, struct bio *bio,
>> * a single leaf so it will also fit inside a u32
>> */
>> diff = disk_bytenr - item_start_offset;
>> - diff = diff / fs_info->sectorsize;
>> + diff = diff >> fs_info->sectorsize_bits;
>> diff = diff * csum_size;
>> count = min_t(int, nblocks, (item_last_offset - disk_bytenr) >>
>> inode->i_sb->s_blocksize_bits);
>> @@ -465,7 +465,8 @@ int btrfs_lookup_csums_range(struct btrfs_root *root, u64 start, u64 end,
>> start = key.offset;
>>
>> size = btrfs_item_size_nr(leaf, path->slots[0]);
>> - csum_end = key.offset + (size / csum_size) * fs_info->sectorsize;
>> + csum_end = key.offset +
>> + ((size / csum_size) >> fs_info->sectorsize_bits);
>> if (csum_end <= start) {
>> path->slots[0]++;
>> continue;
>> @@ -606,7 +607,7 @@ blk_status_t btrfs_csum_one_bio(struct btrfs_inode *inode, struct bio *bio,
>>
>> data = kmap_atomic(bvec.bv_page);
>> crypto_shash_digest(shash, data + bvec.bv_offset
>> - + (i * fs_info->sectorsize),
>> + + (i << fs_info->sectorsize_bits),
>> fs_info->sectorsize,
>> sums->sums + index);
>> kunmap_atomic(data);
>> @@ -1020,7 +1021,7 @@ int btrfs_csum_file_blocks(struct btrfs_trans_handle *trans,
>>
>> index += ins_size;
>> ins_size /= csum_size;
>> - total_bytes += ins_size * fs_info->sectorsize;
>> + total_bytes += ins_size << fs_info->sectorsize_bits;
>>
>> btrfs_mark_buffer_dirty(path->nodes[0]);
>> if (total_bytes < sums->len) {
>> diff --git a/fs/btrfs/free-space-tree.c b/fs/btrfs/free-space-tree.c
>> index 6b9faf3b0e96..f09f62e245a0 100644
>> --- a/fs/btrfs/free-space-tree.c
>> +++ b/fs/btrfs/free-space-tree.c
>> @@ -416,16 +416,18 @@ int convert_free_space_to_extents(struct btrfs_trans_handle *trans,
>> btrfs_mark_buffer_dirty(leaf);
>> btrfs_release_path(path);
>>
>> - nrbits = div_u64(block_group->length, block_group->fs_info->sectorsize);
>> + nrbits = block_group->length >> block_group->fs_info->sectorsize_bits;
>> start_bit = find_next_bit_le(bitmap, nrbits, 0);
>>
>> while (start_bit < nrbits) {
>> end_bit = find_next_zero_bit_le(bitmap, nrbits, start_bit);
>> ASSERT(start_bit < end_bit);
>>
>> - key.objectid = start + start_bit * block_group->fs_info->sectorsize;
>> + key.objectid = start +
>> + (start_bit << block_group->fs_info->sectorsize_bits);
>> key.type = BTRFS_FREE_SPACE_EXTENT_KEY;
>> - key.offset = (end_bit - start_bit) * block_group->fs_info->sectorsize;
>> + key.offset = (end_bit - start_bit) <<
>> + block_group->fs_info->sectorsize_bits;
>>
>> ret = btrfs_insert_empty_item(trans, root, path, &key, 0);
>> if (ret)
>> @@ -540,8 +542,8 @@ static void free_space_set_bits(struct btrfs_block_group *block_group,
>> end = found_end;
>>
>> ptr = btrfs_item_ptr_offset(leaf, path->slots[0]);
>> - first = div_u64(*start - found_start, fs_info->sectorsize);
>> - last = div_u64(end - found_start, fs_info->sectorsize);
>> + first = (*start - found_start) >> fs_info->sectorsize_bits;
>> + last = (end - found_start) >> fs_info->sectorsize_bits;
>> if (bit)
>> extent_buffer_bitmap_set(leaf, ptr, first, last - first);
>> else
>> diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
>> index 87bac9ecdf4c..7b62dcc6cd98 100644
>> --- a/fs/btrfs/ordered-data.c
>> +++ b/fs/btrfs/ordered-data.c
>> @@ -868,7 +868,6 @@ int btrfs_find_ordered_sum(struct btrfs_inode *inode, u64 offset,
>> struct btrfs_ordered_inode_tree *tree = &inode->ordered_tree;
>> unsigned long num_sectors;
>> unsigned long i;
>> - u32 sectorsize = btrfs_inode_sectorsize(inode);
>> const u8 blocksize_bits = inode->vfs_inode.i_sb->s_blocksize_bits;
>> const u16 csum_size = btrfs_super_csum_size(fs_info->super_copy);
>> int index = 0;
>> @@ -890,7 +889,7 @@ int btrfs_find_ordered_sum(struct btrfs_inode *inode, u64 offset,
>> index += (int)num_sectors * csum_size;
>> if (index == len)
>> goto out;
>> - disk_bytenr += num_sectors * sectorsize;
>> + disk_bytenr += num_sectors << fs_info->sectorsize_bits;
>> }
>> }
>> out:
>> diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
>> index 54a4f34d4c1c..7babf670c8c2 100644
>> --- a/fs/btrfs/scrub.c
>> +++ b/fs/btrfs/scrub.c
>> @@ -2300,7 +2300,7 @@ static inline void __scrub_mark_bitmap(struct scrub_parity *sparity,
>> u64 offset;
>> u64 nsectors64;
>> u32 nsectors;
>> - int sectorsize = sparity->sctx->fs_info->sectorsize;
>> + u32 sectorsize_bits = sparity->sctx->fs_info->sectorsize_bits;
>>
>> if (len >= sparity->stripe_len) {
>> bitmap_set(bitmap, 0, sparity->nsectors);
>> @@ -2309,8 +2309,8 @@ static inline void __scrub_mark_bitmap(struct scrub_parity *sparity,
>>
>> start -= sparity->logic_start;
>> start = div64_u64_rem(start, sparity->stripe_len, &offset);
>> - offset = div_u64(offset, sectorsize);
>> - nsectors64 = div_u64(len, sectorsize);
>> + offset = offset >> sectorsize_bits;
>> + nsectors64 = len >> sectorsize_bits;
>>
>> ASSERT(nsectors64 < UINT_MAX);
>> nsectors = (u32)nsectors64;
>> @@ -2386,10 +2386,10 @@ static int scrub_find_csum(struct scrub_ctx *sctx, u64 logical, u8 *csum)
>> if (!sum)
>> return 0;
>>
>> - index = div_u64(logical - sum->bytenr, sctx->fs_info->sectorsize);
>> + index = (logical - sum->bytenr) >> sctx->fs_info->sectorsize_bits;
>> ASSERT(index < UINT_MAX);
>>
>> - num_sectors = sum->len / sctx->fs_info->sectorsize;
>> + num_sectors = sum->len >> sctx->fs_info->sectorsize_bits;
>> memcpy(csum, sum->sums + index * sctx->csum_size, sctx->csum_size);
>> if (index == num_sectors - 1) {
>> list_del(&sum->list);
>> @@ -2776,7 +2776,7 @@ static noinline_for_stack int scrub_raid56_parity(struct scrub_ctx *sctx,
>> int extent_mirror_num;
>> int stop_loop = 0;
>>
>> - nsectors = div_u64(map->stripe_len, fs_info->sectorsize);
>> + nsectors = map->stripe_len >> fs_info->sectorsize_bits;
>> bitmap_len = scrub_calc_parity_bitmap_len(nsectors);
>> sparity = kzalloc(sizeof(struct scrub_parity) + 2 * bitmap_len,
>> GFP_NOFS);
>> diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
>> index 8784b74f5232..c0e19917e59b 100644
>> --- a/fs/btrfs/tree-checker.c
>> +++ b/fs/btrfs/tree-checker.c
>> @@ -361,7 +361,8 @@ static int check_csum_item(struct extent_buffer *leaf, struct btrfs_key *key,
>> u32 prev_item_size;
>>
>> prev_item_size = btrfs_item_size_nr(leaf, slot - 1);
>> - prev_csum_end = (prev_item_size / csumsize) * sectorsize;
>> + prev_csum_end = (prev_item_size / csumsize);
>> + prev_csum_end <<= fs_info->sectorsize_bits;
>> prev_csum_end += prev_key->offset;
>> if (prev_csum_end > key->offset) {
>> generic_err(leaf, slot - 1,
>>
>
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
next prev parent reply other threads:[~2020-11-02 14:20 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-10-29 14:27 [PATCH 00/10] Sectorsize, csum_size lifted to fs_info David Sterba
2020-10-29 14:27 ` [PATCH 01/10] btrfs: use precalculated sectorsize_bits from fs_info David Sterba
2020-11-02 14:05 ` Qu Wenruo
2020-11-02 14:18 ` Qu Wenruo [this message]
2020-11-02 14:31 ` Johannes Thumshirn
2020-11-02 15:08 ` David Sterba
2020-11-02 15:12 ` Johannes Thumshirn
2020-11-02 15:15 ` David Sterba
2020-11-03 9:31 ` David Sterba
2020-10-29 14:27 ` [PATCH 02/10] btrfs: replace div_u64 by shift in free_space_bitmap_size David Sterba
2020-11-02 14:07 ` Qu Wenruo
2020-10-29 14:27 ` [PATCH 03/10] btrfs: replace s_blocksize_bits with fs_info::sectorsize_bits David Sterba
2020-11-02 14:23 ` Qu Wenruo
2020-11-02 15:18 ` David Sterba
2020-10-29 14:27 ` [PATCH 04/10] btrfs: store precalculated csum_size in fs_info David Sterba
2020-10-29 14:27 ` [PATCH 05/10] btrfs: precalculate checksums per leaf once David Sterba
2020-11-02 14:27 ` Qu Wenruo
2020-11-02 15:24 ` David Sterba
2020-11-02 16:00 ` David Sterba
2020-10-29 14:27 ` [PATCH 06/10] btrfs: use cached value of fs_info::csum_size everywhere David Sterba
2020-11-02 14:28 ` Qu Wenruo
2020-10-29 14:27 ` [PATCH 07/10] btrfs: switch cached fs_info::csum_size from u16 to u32 David Sterba
2020-10-29 14:27 ` [PATCH 08/10] btrfs: remove unnecessary local variables for checksum size David Sterba
2020-10-29 14:43 ` Johannes Thumshirn
2020-10-29 14:27 ` [PATCH 09/10] btrfs: check integrity: remove local copy of csum_size David Sterba
2020-10-29 14:44 ` Johannes Thumshirn
2020-10-29 14:27 ` [PATCH 10/10] btrfs: scrub: remove local copy of csum_size from context David Sterba
2020-10-29 14:45 ` Johannes Thumshirn
2020-10-29 14:54 ` David Sterba
2020-10-29 15:01 ` Johannes Thumshirn
2020-10-29 14:50 ` [PATCH 00/10] Sectorsize, csum_size lifted to fs_info Johannes Thumshirn
2020-10-29 16:25 ` David Sterba
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=c798fbcc-c7e9-fca6-992b-bd006d6a61b4@gmx.com \
--to=quwenruo.btrfs@gmx.com \
--cc=dsterba@suse.com \
--cc=linux-btrfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox