From: Gabriel Krisman Bertazi <krisman@suse.de>
To: Eric Dumazet <edumazet@google.com>
Cc: willemdebruijn.kernel@gmail.com, davem@davemloft.net,
dsahern@kernel.org, kuba@kernel.org, pabeni@redhat.com,
kuniyu@google.com, horms@kernel.org, netdev@vger.kernel.org
Subject: Re: [PATCH] udp: Force compute_score to always inline
Date: Thu, 09 Apr 2026 18:50:53 -0400
Message-ID: <87v7dzoiia.fsf@mailhost.krisman.be>
In-Reply-To: <CANn89iKQhLOdtn-_viyDN8ytjJtR-4p0gteXL6gGSHoUYZp5Hw@mail.gmail.com> (Eric Dumazet's message of "Thu, 9 Apr 2026 15:36:15 -0700")

Eric Dumazet <edumazet@google.com> writes:

> On Thu, Apr 9, 2026 at 3:16 PM Gabriel Krisman Bertazi <krisman@suse.de> wrote:
>
>>
>> Back in 2024 I reported a 7-12% regression on an iperf3 UDP loopback
>> throughput test that we traced to the extra overhead of calling
>> compute_score in two places, introduced by commit f0ea27e7bfe1 ("udp:
>> re-score reuseport groups when connected sockets are present"). At the
>> time, I pointed out the overhead was caused by the multiple calls,
>> associated with cpu-specific mitigations, and merged commit
>> 50aee97d1511 ("udp: Avoid call to compute_score on multiple sites") to
>> jump back explicitly, to force the rescore call in a single place.
>>
>> Recently though, we got another regression report against a newer distro
>> version, which a team colleague traced back to the same root-cause.
>> Turns out that once we updated to gcc-13, the compiler got smart enough
>> to unroll the loop, undoing my previous mitigation. Let's bite the
>> bullet and __always_inline compute_score on both ipv4 and ipv6 to
>> prevent gcc from de-optimizing it again in the future. These functions
>> are only called in two places each, udpX_lib_lookup1 and
>> udpX_lib_lookup2, so the extra size shouldn't be a problem and it is hot
>> enough to be very visible in profiles. In fact, with gcc-13, forcing
>> the inline will prevent gcc from unrolling the fix from commit
>> 50aee97d1511, so we don't end up increasing udpX_lib_lookup2 at all.
>>
>> I haven't re-collected the results myself, as I don't have access to the
>> machine at the moment. But the same colleague reported a 4.67%
>> improvement with this patch in the loopback benchmark, solving the
>> regression report within noise margins.
>
> You could include scripts/bloat-o-meter results, so that we can sense
> the cost of such a change.
>
> $ scripts/bloat-o-meter -t vmlinux.old vmlinux.new
> add/remove: 0/2 grow/shrink: 6/1 up/down: 622/-410 (212)
> Function                     old     new   delta
> __udp6_lib_lookup            797    1007    +210
> __udp4_lib_lookup            838     984    +146
> udp6_lib_lookup2             404     536    +132
> udp4_lib_lookup2             396     498    +102
> udpv6_rcv                   3018    3034     +16
> udp_init_sock                244     260     +16
> bpf_iter_udp_batch           953     937     -16
> __pfx_compute_score           32       -     -32
> compute_score                362       -    -362
> Total: Before=30269687, After=30269899, chg +0.00%
>
> No change for clang.
>
> Reviewed-by: Eric Dumazet <edumazet@google.com>

Apologies, I wasn't aware of that tool. I did some calculations by hand
and found something like 200 bytes extra in udp6_lib_lookup2.

For gcc-13:

scripts/bloat-o-meter vmlinux vmlinux-inline
add/remove: 0/2 grow/shrink: 4/0 up/down: 616/-416 (200)
Function                     old     new   delta
udp6_lib_lookup2             762     949    +187
__udp6_lib_lookup            810     975    +165
udp4_lib_lookup2             757     906    +149
__udp4_lib_lookup            871     986    +115
__pfx_compute_score           32       -     -32
compute_score                384       -    -384
Total: Before=35011784, After=35011984, chg +0.00%
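
For anyone following along without the source open, here is a rough
user-space sketch of the pattern under discussion: a small scoring helper
forced inline into two callers, so it disappears as a standalone symbol
(as compute_score does in the tables above) and the callers grow instead.
This is not the kernel code. The sketch_* names, the struct layout and the
scoring rules are all hypothetical, and the macro only mirrors how the
kernel's __always_inline is commonly defined for gcc and clang.

```c
#include <stddef.h>

/* Assumption: user-space stand-in for the kernel's __always_inline. */
#define sketch_always_inline inline __attribute__((__always_inline__))

/* Hypothetical socket record; not the kernel's struct sock. */
struct sketch_sock {
	int bound_port;
	int connected_peer;	/* 0 means "not connected" */
};

/*
 * Hypothetical scoring helper. Forcing it inline means each caller gets
 * its own copy of the body instead of paying for a function call.
 */
static sketch_always_inline int
sketch_score(const struct sketch_sock *sk, int port, int peer)
{
	int score;

	if (sk->bound_port != port)
		return -1;
	score = 1;
	if (sk->connected_peer) {
		if (sk->connected_peer != peer)
			return -1;
		score += 4;	/* connected match beats a wildcard match */
	}
	return score;
}

/* First call site: walk the table and keep the best-scoring entry. */
static const struct sketch_sock *
sketch_lookup(const struct sketch_sock *tbl, size_t n, int port, int peer)
{
	const struct sketch_sock *best = NULL;
	int best_score = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		int score = sketch_score(&tbl[i], port, peer);

		if (score > best_score) {
			best = &tbl[i];
			best_score = score;
		}
	}
	return best;
}

/* Second call site: check whether one given entry matches at all. */
static int
sketch_matches(const struct sketch_sock *sk, int port, int peer)
{
	return sketch_score(sk, port, peer) > 0;
}
```

Building something like this with and without the attribute and comparing
the two objects with bloat-o-meter should show the same kind of trade-off
as above: the helper's symbol goes away and its callers get larger.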
--
Gabriel Krisman Bertazi