From: David Sterba <dsterba@suse.cz>
To: Boris Burkov <boris@bur.io>
Cc: David Sterba <dsterba@suse.com>, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH 7/7] btrfs: accessors: factor out split memcpy with two sources
Date: Thu, 3 Jul 2025 22:57:10 +0200 [thread overview]
Message-ID: <20250703205710.GX31241@suse.cz> (raw)
In-Reply-To: <20250702185337.GC2308047@zen.localdomain>
On Wed, Jul 02, 2025 at 11:53:37AM -0700, Boris Burkov wrote:
> On Tue, Jul 01, 2025 at 07:23:54PM +0200, David Sterba wrote:
> > The case of a reading the bytes from 2 folios needs two memcpy()s, the
> > compiler does not emit calls but two inline loops.
> >
> > Factoring out the code makes some improvement (stack, code) and in the
> > future will provide an optimized implementation as well. (The analogical
> > version with two destinations is not done as it increases stack usage
> > but can be done if needed.)
>
> Is there some fundamental reason for this, or does it just happen to be
> so? Sort of interesting either way. Does it make you worry that this
> stuff will regress randomly in the future?
My explanation for that is that it's in the compiler optimization black
box, the function is inline and when evaluated in the context of the
caller it just came out worse.
This is the patch:
--- a/fs/btrfs/accessors.c
+++ b/fs/btrfs/accessors.c
@@ -29,6 +29,14 @@ static __always_inline void memcpy_split_src(char *dest, const char *src1,
memcpy(dest + len1, src2, total - len1);
}
+static __always_inline void memcpy_split_dest(char *dest1, const char *src,
+ char *dest2, const size_t len1,
+ const size_t total)
+{
+ memcpy(dest1, src, len1);
+ memcpy(dest2, src + len1, total - len1);
+}
+
/*
* Macro templates that define helpers to read/write extent buffer data of a
* given size, that are also used via ctree.h for access to item members by
@@ -105,9 +113,9 @@ void btrfs_set_##bits(const struct extent_buffer *eb, void *ptr, \
kaddr = folio_address(eb->folios[idx + 1]); \
*kaddr = lebytes[1]; \
} else { \
- memcpy(kaddr, lebytes, part); \
- kaddr = folio_address(eb->folios[idx + 1]); \
- memcpy(kaddr, lebytes + part, sizeof(u##bits) - part); \
+ memcpy_split_dest(kaddr, lebytes, \
+ folio_address(eb->folios[idx + 1]), \
+ part, sizeof(u##bits)); \
} \
}
---
And the evaluation:
btrfs_set_64 +8 (24 -> 32)
btrfs_set_32 +8 (24 -> 32)
text data bss dec hex filename
1454157 115665 16088 1585910 1832f6 pre/btrfs.ko
1454173 115665 16088 1585926 183306 post/btrfs.ko
DELTA: +16
I excluded the patch because of the worse stack and code diff, because
the other patches were all an improvement. This one was preparatory for
the fancy optimization I have so it can be placed to a helper instead of
the macro. Given the remaining time before code freeze it would not
leave enough time for testing and it might also be a generic helper in
the end.
next prev parent reply other threads:[~2025-07-03 20:57 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-07-01 17:23 [PATCH 0/7] Set/get accessor speedups David Sterba
2025-07-01 17:23 ` [PATCH 1/7] btrfs: accessors: simplify folio bounds checks David Sterba
2025-07-01 17:23 ` [PATCH 2/7] btrfs: accessors: use type sizeof constants directly David Sterba
2025-07-02 17:58 ` Boris Burkov
2025-07-03 21:20 ` David Sterba
2025-07-07 14:25 ` David Sterba
2025-07-01 17:23 ` [PATCH 3/7] btrfs: accessors: inline eb bounds check and factor out the error report David Sterba
2025-07-02 18:00 ` Boris Burkov
2025-07-03 21:16 ` David Sterba
2025-07-01 17:23 ` [PATCH 4/7] btrfs: accessors: compile-time fast path for u8 David Sterba
2025-07-01 17:23 ` [PATCH 5/7] btrfs: accessors: compile-time fast path for u16 David Sterba
2025-07-01 17:23 ` [PATCH 6/7] btrfs: accessors: set target address at initialization David Sterba
2025-07-01 17:23 ` [PATCH 7/7] btrfs: accessors: factor out split memcpy with two sources David Sterba
2025-07-02 18:53 ` Boris Burkov
2025-07-03 20:57 ` David Sterba [this message]
2025-07-02 18:54 ` [PATCH 0/7] Set/get accessor speedups Boris Burkov
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250703205710.GX31241@suse.cz \
--to=dsterba@suse.cz \
--cc=boris@bur.io \
--cc=dsterba@suse.com \
--cc=linux-btrfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox