From: Ingo Molnar <mingo@kernel.org>
To: Uros Bizjak <ubizjak@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
"H. Peter Anvin" <hpa@zytor.com>,
linux-kernel@vger.kernel.org, Juergen Gross <jgross@suse.com>,
Stefano Stabellini <sstabellini@kernel.org>,
"Ahmed S . Darwish" <darwi@linutronix.de>,
Andrew Cooper <andrew.cooper3@citrix.com>,
John Ogness <john.ogness@linutronix.de>,
Peter Zijlstra <peterz@infradead.org>,
Borislav Petkov <bp@alien8.de>,
Thomas Gleixner <tglx@linutronix.de>,
Josh Poimboeuf <jpoimboe@redhat.com>
Subject: Re: [PATCH] compiler/gcc: Make asm() templates asm __inline__() by default
Date: Wed, 19 Mar 2025 23:34:47 +0100 [thread overview]
Message-ID: <Z9tGh0Fa96gACmpQ@gmail.com> (raw)
In-Reply-To: <CAFULd4aFUun7+1izxbDM8nTEEta5PApysyTGsn1hjvQND=2UtQ@mail.gmail.com>
* Uros Bizjak <ubizjak@gmail.com> wrote:
> On Tue, Mar 18, 2025 at 9:11 PM Ingo Molnar <mingo@kernel.org> wrote:
>
> > #ifdef CONFIG_CC_HAS_ASM_INLINE
> > # define asm_inline __asm__ __inline
> > # define asm(...) asm_inline(__VA_ARGS__)
> > #else
> > # define asm_inline asm
> > #endif
> >
> > And I fixed up the places where this isn't syntactically correct:
> >
> > 35 files changed, 82 insertions(+), 79 deletions(-)
> >
> > I haven't looked at code generation much yet, but text size changes are
> > minimal:
> >
> > text data bss dec hex filename
> > 29429076 7931870 1401196 38762142 24f769e vmlinux.before
> > 29429631 7931870 1401200 38762701 24f78cd vmlinux.after
> >
> > Which is promising, assuming I haven't messed up anywhere.
>
> Please use bloat-o-meter, it is more precise.
Here's the bloat-o-meter output between vanilla and patched vmlinux:
add/remove: 6/20 grow/shrink: 43/13 up/down: 4245/-2812 (1433)
Function old new delta
__ia32_sys_pidfd_send_signal 21 818 +797
__x64_sys_pidfd_send_signal 22 818 +796
nl80211_send_wiphy 11189 11867 +678
icl_update_topdown_event 473 691 +218
intel_joiner_adjust_timings.isra - 145 +145
deactivate_locked_super 148 249 +101
tcp_v4_send_synack 301 389 +88
kill_anon_super 53 137 +84
xa_destroy 291 371 +80
__xa_set_mark 135 209 +74
ip_fraglist_prepare 204 269 +65
store_hwp_dynamic_boost 136 200 +64
csum_partial 239 302 +63
ip_fraglist_init 155 217 +62
store_max_perf_pct 252 311 +59
ip_frag_next 377 434 +57
ip_do_fragment 1644 1701 +57
__ip_local_out 283 340 +57
store_min_perf_pct 273 329 +56
__udp4_lib_rcv 2917 2970 +53
freeze_super 1124 1175 +51
super_wake - 47 +47
ieee80211_parse_tx_radiotap 1267 1311 +44
ic_bootp_recv 1333 1371 +38
vlv_compute_watermarks 2267 2299 +32
intel_atomic_commit_tail 5423 5455 +32
_g4x_compute_pipe_wm 397 429 +32
__ieee80211_xmit_fast 2611 2643 +32
intel_format_info_is_yuv_semiplanar 43 72 +29
skl_main_to_aux_plane 113 136 +23
ip_auto_config 4029 4050 +21
__memcpy_flushcache 369 386 +17
validate_beacon_head 240 256 +16
tcp_v4_rcv 4850 4866 +16
intel_enable_transcoder 1443 1459 +16
intel_cx0pll_state_verify 1859 1875 +16
intel_cx0_phy_check_hdmi_link_rate 181 197 +16
inet_gro_receive 569 585 +16
__pfx_super_wake - 16 +16
__pfx_intel_joiner_adjust_timings.isra - 16 +16
__pfx_intel_cx0_read.constprop - 16 +16
intel_fb_is_gen12_ccs_aux_plane.isra 95 107 +12
_intel_modeset_primary_pipes 79 91 +12
intel_cx0_read.constprop - 10 +10
intel_fb_is_ccs_aux_plane 110 117 +7
intel_crtc_readout_derived_state 513 516 +3
wq_update_node_max_active 538 540 +2
intel_set_cdclk_pre_plane_update 838 840 +2
intel_sseu_subslice_total 67 68 +1
intel_crtc_compute_config 890 889 -1
_intel_modeset_secondary_pipes 50 47 -3
intel_mtl_pll_enable 7472 7460 -12
bxt_set_cdclk 1328 1316 -12
vfs_get_tree 205 189 -16
intel_fb_modifier_to_tiling 186 170 -16
__pfx_nl80211_send_iftype_data 16 - -16
__pfx_lane_mask_to_lane 16 - -16
__pfx_kill_super_notify.part 16 - -16
__pfx_ip_fast_csum 16 - -16
__pfx_intel_pstate_update_policies 16 - -16
__pfx_intel_joiner_adjust_timings 16 - -16
__pfx_format_is_yuv_semiplanar.part.isra 16 - -16
__pfx_cdclk_divider 16 - -16
__pfx___icl_update_topdown_event 16 - -16
__pfx___do_sys_pidfd_send_signal 16 - -16
intel_c20_sram_read.constprop 173 149 -24
xas_split_alloc 302 270 -32
nl80211_parse_sched_scan 3311 3279 -32
kill_litter_super 70 31 -39
format_is_yuv_semiplanar.part.isra 41 - -41
lane_mask_to_lane 44 - -44
ip_fast_csum 48 - -48
intel_cx0pll_readout_hw_state 1085 1037 -48
intel_pstate_update_policies 66 - -66
cdclk_divider 69 - -69
kill_super_notify.part 111 - -111
__icl_update_topdown_event 112 - -112
intel_joiner_adjust_timings 140 - -140
intel_fill_fb_info 2998 2813 -185
intel_tile_width_bytes 747 531 -216
nl80211_send_iftype_data 574 - -574
__do_sys_pidfd_send_signal 811 - -811
Total: Before=22547058, After=22548491, chg +0.01%
A lot fewer functions are affected than I expected from such a
large-scope change.
> Actually, functions with the most impact (x86 locking functions and
> __arch_hweight) were recently converted to asm_inline, so besides
> __untagged_addr, the remaining have very little impact, if at all
> (c.f. amd_clear_divider() ). There is also no need to convert asm()
> without directives inside.
I did the test with Linus-vanilla (81e4f8d68c66) to maximize the
potential effect, which doesn't have those changes yet.
See tip:WIP.x86/core.
> My proposal would be to convert the remaining few cases (the remaining
> asms involving ALTERNATIVE and exceptions) "by hand" to asm_inline()
> and stick a rule in checkpatch to use asm_inline() in the code
> involving asm(), like we have the rule with asm volatile.
>
> I don't think redefining an important C keyword is a good approach, it
> obfuscates its meaning too much. And as has been shown by Ingo's
> experiment, there is a substantial effort to fix false positives.
> Instead of fixing these, we can trivially convert the remaining cases
> to asm_volatile() as well, without obfuscating asm(). Checkpatch can
> take care of future cases.
This would work for me too. The cross-arch impact and churn seems
substantial.
Thanks,
Ingo
next prev parent reply other threads:[~2025-03-19 22:34 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-03-17 22:18 [PATCH 0/5] x86/cpu: Introduce <asm/cpuid/types.h> and <asm/cpuid/api.h> and clean them up Ingo Molnar
2025-03-17 22:18 ` [PATCH 1/5] x86/cpuid: Refactor <asm/cpuid.h> Ingo Molnar
2025-03-18 12:00 ` [tip: x86/cpu] " tip-bot2 for Ahmed S. Darwish
2025-03-19 11:03 ` [tip: x86/core] " tip-bot2 for Ahmed S. Darwish
2025-03-17 22:18 ` [PATCH 2/5] x86/cpuid: Clean up <asm/cpuid/types.h> Ingo Molnar
2025-03-18 12:00 ` [tip: x86/cpu] " tip-bot2 for Ingo Molnar
2025-03-19 11:03 ` [tip: x86/core] " tip-bot2 for Ingo Molnar
2025-03-17 22:18 ` [PATCH 3/5] x86/cpuid: Clean up <asm/cpuid/api.h> Ingo Molnar
2025-03-18 12:00 ` [tip: x86/cpu] " tip-bot2 for Ingo Molnar
2025-03-19 11:03 ` [tip: x86/core] " tip-bot2 for Ingo Molnar
2025-03-17 22:18 ` [PATCH 4/5] x86/cpuid: Standardize on u32 in <asm/cpuid/api.h> Ingo Molnar
2025-03-18 5:59 ` Xin Li
2025-03-18 12:00 ` [tip: x86/cpu] " tip-bot2 for Ingo Molnar
2025-03-19 11:03 ` [tip: x86/core] " tip-bot2 for Ingo Molnar
2025-03-17 22:18 ` [PATCH 5/5] x86/cpuid: Use u32 in instead of uint32_t " Ingo Molnar
2025-03-18 6:01 ` Xin Li
2025-03-18 8:34 ` Ingo Molnar
2025-03-18 9:37 ` Borislav Petkov
2025-03-18 11:53 ` Ingo Molnar
2025-03-18 12:15 ` Borislav Petkov
2025-03-18 18:20 ` Ingo Molnar
2025-03-19 8:08 ` Thomas Gleixner
2025-03-19 20:16 ` Ingo Molnar
2025-03-18 12:00 ` [tip: x86/cpu] " tip-bot2 for Ingo Molnar
2025-03-19 11:03 ` [tip: x86/core] " tip-bot2 for Ingo Molnar
2025-03-18 14:05 ` [PATCH 0/5] x86/cpu: Introduce <asm/cpuid/types.h> and <asm/cpuid/api.h> and clean them up H. Peter Anvin
2025-03-18 18:04 ` Ingo Molnar
2025-03-18 18:33 ` Linus Torvalds
2025-03-18 18:46 ` Ingo Molnar
2025-03-18 20:11 ` [PATCH] compiler/gcc: Make asm() templates asm __inline__() by default Ingo Molnar
2025-03-18 22:07 ` Josh Poimboeuf
2025-03-19 4:57 ` Uros Bizjak
2025-03-19 22:34 ` Ingo Molnar [this message]
2025-03-20 8:21 ` Uros Bizjak
2025-03-20 8:59 ` Ingo Molnar
2025-03-20 10:30 ` Uros Bizjak
2025-03-20 11:58 ` Uros Bizjak
2025-03-19 3:30 ` [PATCH 0/5] x86/cpu: Introduce <asm/cpuid/types.h> and <asm/cpuid/api.h> and clean them up H. Peter Anvin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z9tGh0Fa96gACmpQ@gmail.com \
--to=mingo@kernel.org \
--cc=andrew.cooper3@citrix.com \
--cc=bp@alien8.de \
--cc=darwi@linutronix.de \
--cc=hpa@zytor.com \
--cc=jgross@suse.com \
--cc=john.ogness@linutronix.de \
--cc=jpoimboe@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=peterz@infradead.org \
--cc=sstabellini@kernel.org \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
--cc=ubizjak@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.