From: Ingo Molnar <mingo@kernel.org>
To: Uros Bizjak <ubizjak@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
"H. Peter Anvin" <hpa@zytor.com>,
linux-kernel@vger.kernel.org, Juergen Gross <jgross@suse.com>,
Stefano Stabellini <sstabellini@kernel.org>,
"Ahmed S . Darwish" <darwi@linutronix.de>,
Andrew Cooper <andrew.cooper3@citrix.com>,
John Ogness <john.ogness@linutronix.de>,
Peter Zijlstra <peterz@infradead.org>,
Borislav Petkov <bp@alien8.de>,
Thomas Gleixner <tglx@linutronix.de>,
Josh Poimboeuf <jpoimboe@redhat.com>
Subject: Re: [PATCH] compiler/gcc: Make asm() templates asm __inline__() by default
Date: Wed, 19 Mar 2025 23:34:47 +0100 [thread overview]
Message-ID: <Z9tGh0Fa96gACmpQ@gmail.com> (raw)
In-Reply-To: <CAFULd4aFUun7+1izxbDM8nTEEta5PApysyTGsn1hjvQND=2UtQ@mail.gmail.com>
* Uros Bizjak <ubizjak@gmail.com> wrote:
> On Tue, Mar 18, 2025 at 9:11 PM Ingo Molnar <mingo@kernel.org> wrote:
>
> > #ifdef CONFIG_CC_HAS_ASM_INLINE
> > # define asm_inline __asm__ __inline
> > # define asm(...) asm_inline(__VA_ARGS__)
> > #else
> > # define asm_inline asm
> > #endif
> >
> > And I fixed up the places where this isn't syntactically correct:
> >
> > 35 files changed, 82 insertions(+), 79 deletions(-)
> >
> > I haven't looked at code generation much yet, but text size changes are
> > minimal:
> >
> > text data bss dec hex filename
> > 29429076 7931870 1401196 38762142 24f769e vmlinux.before
> > 29429631 7931870 1401200 38762701 24f78cd vmlinux.after
> >
> > Which is promising, assuming I haven't messed up anywhere.
>
> Please use bloat-o-meter, it is more precise.
Here's the bloat-o-meter output between vanilla and patched vmlinux:
add/remove: 6/20 grow/shrink: 43/13 up/down: 4245/-2812 (1433)
Function old new delta
__ia32_sys_pidfd_send_signal 21 818 +797
__x64_sys_pidfd_send_signal 22 818 +796
nl80211_send_wiphy 11189 11867 +678
icl_update_topdown_event 473 691 +218
intel_joiner_adjust_timings.isra - 145 +145
deactivate_locked_super 148 249 +101
tcp_v4_send_synack 301 389 +88
kill_anon_super 53 137 +84
xa_destroy 291 371 +80
__xa_set_mark 135 209 +74
ip_fraglist_prepare 204 269 +65
store_hwp_dynamic_boost 136 200 +64
csum_partial 239 302 +63
ip_fraglist_init 155 217 +62
store_max_perf_pct 252 311 +59
ip_frag_next 377 434 +57
ip_do_fragment 1644 1701 +57
__ip_local_out 283 340 +57
store_min_perf_pct 273 329 +56
__udp4_lib_rcv 2917 2970 +53
freeze_super 1124 1175 +51
super_wake - 47 +47
ieee80211_parse_tx_radiotap 1267 1311 +44
ic_bootp_recv 1333 1371 +38
vlv_compute_watermarks 2267 2299 +32
intel_atomic_commit_tail 5423 5455 +32
_g4x_compute_pipe_wm 397 429 +32
__ieee80211_xmit_fast 2611 2643 +32
intel_format_info_is_yuv_semiplanar 43 72 +29
skl_main_to_aux_plane 113 136 +23
ip_auto_config 4029 4050 +21
__memcpy_flushcache 369 386 +17
validate_beacon_head 240 256 +16
tcp_v4_rcv 4850 4866 +16
intel_enable_transcoder 1443 1459 +16
intel_cx0pll_state_verify 1859 1875 +16
intel_cx0_phy_check_hdmi_link_rate 181 197 +16
inet_gro_receive 569 585 +16
__pfx_super_wake - 16 +16
__pfx_intel_joiner_adjust_timings.isra - 16 +16
__pfx_intel_cx0_read.constprop - 16 +16
intel_fb_is_gen12_ccs_aux_plane.isra 95 107 +12
_intel_modeset_primary_pipes 79 91 +12
intel_cx0_read.constprop - 10 +10
intel_fb_is_ccs_aux_plane 110 117 +7
intel_crtc_readout_derived_state 513 516 +3
wq_update_node_max_active 538 540 +2
intel_set_cdclk_pre_plane_update 838 840 +2
intel_sseu_subslice_total 67 68 +1
intel_crtc_compute_config 890 889 -1
_intel_modeset_secondary_pipes 50 47 -3
intel_mtl_pll_enable 7472 7460 -12
bxt_set_cdclk 1328 1316 -12
vfs_get_tree 205 189 -16
intel_fb_modifier_to_tiling 186 170 -16
__pfx_nl80211_send_iftype_data 16 - -16
__pfx_lane_mask_to_lane 16 - -16
__pfx_kill_super_notify.part 16 - -16
__pfx_ip_fast_csum 16 - -16
__pfx_intel_pstate_update_policies 16 - -16
__pfx_intel_joiner_adjust_timings 16 - -16
__pfx_format_is_yuv_semiplanar.part.isra 16 - -16
__pfx_cdclk_divider 16 - -16
__pfx___icl_update_topdown_event 16 - -16
__pfx___do_sys_pidfd_send_signal 16 - -16
intel_c20_sram_read.constprop 173 149 -24
xas_split_alloc 302 270 -32
nl80211_parse_sched_scan 3311 3279 -32
kill_litter_super 70 31 -39
format_is_yuv_semiplanar.part.isra 41 - -41
lane_mask_to_lane 44 - -44
ip_fast_csum 48 - -48
intel_cx0pll_readout_hw_state 1085 1037 -48
intel_pstate_update_policies 66 - -66
cdclk_divider 69 - -69
kill_super_notify.part 111 - -111
__icl_update_topdown_event 112 - -112
intel_joiner_adjust_timings 140 - -140
intel_fill_fb_info 2998 2813 -185
intel_tile_width_bytes 747 531 -216
nl80211_send_iftype_data 574 - -574
__do_sys_pidfd_send_signal 811 - -811
Total: Before=22547058, After=22548491, chg +0.01%
A lot fewer functions are affected than I expected from such a
large-scope change.
> Actually, functions with the most impact (x86 locking functions and
> __arch_hweight) were recently converted to asm_inline, so besides
> __untagged_addr, the remaining have very little impact, if at all
> (c.f. amd_clear_divider() ). There is also no need to convert asm()
> without directives inside.
I did the test with Linus-vanilla (81e4f8d68c66) to maximize the
potential effect, which doesn't have those changes yet.
See tip:WIP.x86/core.
> My proposal would be to convert the remaining few cases (the remaining
> asms involving ALTERNATIVE and exceptions) "by hand" to asm_inline()
> and stick a rule in checkpatch to use asm_inline() in the code
> involving asm(), like we have the rule with asm volatile.
>
> I don't think redefining an important C keyword is a good approach, it
> obfuscates its meaning too much. And as has been shown by Ingo's
> experiment, there is a substantial effort to fix false positives.
> Instead of fixing these, we can trivially convert the remaining cases
> to asm_volatile() as well, without obfuscating asm(). Checkpatch can
> take care of future cases.
This would work for me too. The cross-arch impact and churn seems
substantial.
Thanks,
Ingo
next prev parent reply other threads:[~2025-03-19 22:34 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-03-17 22:18 [PATCH 0/5] x86/cpu: Introduce <asm/cpuid/types.h> and <asm/cpuid/api.h> and clean them up Ingo Molnar
2025-03-17 22:18 ` [PATCH 1/5] x86/cpuid: Refactor <asm/cpuid.h> Ingo Molnar
2025-03-18 12:00 ` [tip: x86/cpu] " tip-bot2 for Ahmed S. Darwish
2025-03-19 11:03 ` [tip: x86/core] " tip-bot2 for Ahmed S. Darwish
2025-03-17 22:18 ` [PATCH 2/5] x86/cpuid: Clean up <asm/cpuid/types.h> Ingo Molnar
2025-03-18 12:00 ` [tip: x86/cpu] " tip-bot2 for Ingo Molnar
2025-03-19 11:03 ` [tip: x86/core] " tip-bot2 for Ingo Molnar
2025-03-17 22:18 ` [PATCH 3/5] x86/cpuid: Clean up <asm/cpuid/api.h> Ingo Molnar
2025-03-18 12:00 ` [tip: x86/cpu] " tip-bot2 for Ingo Molnar
2025-03-19 11:03 ` [tip: x86/core] " tip-bot2 for Ingo Molnar
2025-03-17 22:18 ` [PATCH 4/5] x86/cpuid: Standardize on u32 in <asm/cpuid/api.h> Ingo Molnar
2025-03-18 5:59 ` Xin Li
2025-03-18 12:00 ` [tip: x86/cpu] " tip-bot2 for Ingo Molnar
2025-03-19 11:03 ` [tip: x86/core] " tip-bot2 for Ingo Molnar
2025-03-17 22:18 ` [PATCH 5/5] x86/cpuid: Use u32 in instead of uint32_t " Ingo Molnar
2025-03-18 6:01 ` Xin Li
2025-03-18 8:34 ` Ingo Molnar
2025-03-18 9:37 ` Borislav Petkov
2025-03-18 11:53 ` Ingo Molnar
2025-03-18 12:15 ` Borislav Petkov
2025-03-18 18:20 ` Ingo Molnar
2025-03-19 8:08 ` Thomas Gleixner
2025-03-19 20:16 ` Ingo Molnar
2025-03-18 12:00 ` [tip: x86/cpu] " tip-bot2 for Ingo Molnar
2025-03-19 11:03 ` [tip: x86/core] " tip-bot2 for Ingo Molnar
2025-03-18 14:05 ` [PATCH 0/5] x86/cpu: Introduce <asm/cpuid/types.h> and <asm/cpuid/api.h> and clean them up H. Peter Anvin
2025-03-18 18:04 ` Ingo Molnar
2025-03-18 18:33 ` Linus Torvalds
2025-03-18 18:46 ` Ingo Molnar
2025-03-18 20:11 ` [PATCH] compiler/gcc: Make asm() templates asm __inline__() by default Ingo Molnar
2025-03-18 22:07 ` Josh Poimboeuf
2025-03-19 4:57 ` Uros Bizjak
2025-03-19 22:34 ` Ingo Molnar [this message]
2025-03-20 8:21 ` Uros Bizjak
2025-03-20 8:59 ` Ingo Molnar
2025-03-20 10:30 ` Uros Bizjak
2025-03-20 11:58 ` Uros Bizjak
2025-03-19 3:30 ` [PATCH 0/5] x86/cpu: Introduce <asm/cpuid/types.h> and <asm/cpuid/api.h> and clean them up H. Peter Anvin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z9tGh0Fa96gACmpQ@gmail.com \
--to=mingo@kernel.org \
--cc=andrew.cooper3@citrix.com \
--cc=bp@alien8.de \
--cc=darwi@linutronix.de \
--cc=hpa@zytor.com \
--cc=jgross@suse.com \
--cc=john.ogness@linutronix.de \
--cc=jpoimboe@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=peterz@infradead.org \
--cc=sstabellini@kernel.org \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
--cc=ubizjak@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox