public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@kernel.org>
To: Uros Bizjak <ubizjak@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	"H. Peter Anvin" <hpa@zytor.com>,
	linux-kernel@vger.kernel.org, Juergen Gross <jgross@suse.com>,
	Stefano Stabellini <sstabellini@kernel.org>,
	"Ahmed S . Darwish" <darwi@linutronix.de>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	John Ogness <john.ogness@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>,
	Borislav Petkov <bp@alien8.de>,
	Thomas Gleixner <tglx@linutronix.de>,
	Josh Poimboeuf <jpoimboe@redhat.com>
Subject: Re: [PATCH] compiler/gcc: Make asm() templates asm __inline__() by default
Date: Wed, 19 Mar 2025 23:34:47 +0100	[thread overview]
Message-ID: <Z9tGh0Fa96gACmpQ@gmail.com> (raw)
In-Reply-To: <CAFULd4aFUun7+1izxbDM8nTEEta5PApysyTGsn1hjvQND=2UtQ@mail.gmail.com>


* Uros Bizjak <ubizjak@gmail.com> wrote:

> On Tue, Mar 18, 2025 at 9:11 PM Ingo Molnar <mingo@kernel.org> wrote:
> 
> >  #ifdef CONFIG_CC_HAS_ASM_INLINE
> >  # define asm_inline __asm__ __inline
> >  # define asm(...) asm_inline(__VA_ARGS__)
> >  #else
> >  # define asm_inline asm
> >  #endif
> >
> > And I fixed up the places where this isn't syntactically correct:
> >
> >  35 files changed, 82 insertions(+), 79 deletions(-)
> >
> > I haven't looked at code generation much yet, but text size changes are
> > minimal:
> >
> >       text         data     bss      dec            hex filename
> >   29429076      7931870 1401196 38762142        24f769e vmlinux.before
> >   29429631      7931870 1401200 38762701        24f78cd vmlinux.after
> >
> > Which is promising, assuming I haven't messed up anywhere.
> 
> Please use bloat-o-meter, it is more precise.

Here's the bloat-o-meter output between vanilla and patched vmlinux:

add/remove: 6/20 grow/shrink: 43/13 up/down: 4245/-2812 (1433)
Function                                     old     new   delta
__ia32_sys_pidfd_send_signal                  21     818    +797
__x64_sys_pidfd_send_signal                   22     818    +796
nl80211_send_wiphy                         11189   11867    +678
icl_update_topdown_event                     473     691    +218
intel_joiner_adjust_timings.isra               -     145    +145
deactivate_locked_super                      148     249    +101
tcp_v4_send_synack                           301     389     +88
kill_anon_super                               53     137     +84
xa_destroy                                   291     371     +80
__xa_set_mark                                135     209     +74
ip_fraglist_prepare                          204     269     +65
store_hwp_dynamic_boost                      136     200     +64
csum_partial                                 239     302     +63
ip_fraglist_init                             155     217     +62
store_max_perf_pct                           252     311     +59
ip_frag_next                                 377     434     +57
ip_do_fragment                              1644    1701     +57
__ip_local_out                               283     340     +57
store_min_perf_pct                           273     329     +56
__udp4_lib_rcv                              2917    2970     +53
freeze_super                                1124    1175     +51
super_wake                                     -      47     +47
ieee80211_parse_tx_radiotap                 1267    1311     +44
ic_bootp_recv                               1333    1371     +38
vlv_compute_watermarks                      2267    2299     +32
intel_atomic_commit_tail                    5423    5455     +32
_g4x_compute_pipe_wm                         397     429     +32
__ieee80211_xmit_fast                       2611    2643     +32
intel_format_info_is_yuv_semiplanar           43      72     +29
skl_main_to_aux_plane                        113     136     +23
ip_auto_config                              4029    4050     +21
__memcpy_flushcache                          369     386     +17
validate_beacon_head                         240     256     +16
tcp_v4_rcv                                  4850    4866     +16
intel_enable_transcoder                     1443    1459     +16
intel_cx0pll_state_verify                   1859    1875     +16
intel_cx0_phy_check_hdmi_link_rate           181     197     +16
inet_gro_receive                             569     585     +16
__pfx_super_wake                               -      16     +16
__pfx_intel_joiner_adjust_timings.isra         -      16     +16
__pfx_intel_cx0_read.constprop                 -      16     +16
intel_fb_is_gen12_ccs_aux_plane.isra          95     107     +12
_intel_modeset_primary_pipes                  79      91     +12
intel_cx0_read.constprop                       -      10     +10
intel_fb_is_ccs_aux_plane                    110     117      +7
intel_crtc_readout_derived_state             513     516      +3
wq_update_node_max_active                    538     540      +2
intel_set_cdclk_pre_plane_update             838     840      +2
intel_sseu_subslice_total                     67      68      +1
intel_crtc_compute_config                    890     889      -1
_intel_modeset_secondary_pipes                50      47      -3
intel_mtl_pll_enable                        7472    7460     -12
bxt_set_cdclk                               1328    1316     -12
vfs_get_tree                                 205     189     -16
intel_fb_modifier_to_tiling                  186     170     -16
__pfx_nl80211_send_iftype_data                16       -     -16
__pfx_lane_mask_to_lane                       16       -     -16
__pfx_kill_super_notify.part                  16       -     -16
__pfx_ip_fast_csum                            16       -     -16
__pfx_intel_pstate_update_policies            16       -     -16
__pfx_intel_joiner_adjust_timings             16       -     -16
__pfx_format_is_yuv_semiplanar.part.isra      16       -     -16
__pfx_cdclk_divider                           16       -     -16
__pfx___icl_update_topdown_event              16       -     -16
__pfx___do_sys_pidfd_send_signal              16       -     -16
intel_c20_sram_read.constprop                173     149     -24
xas_split_alloc                              302     270     -32
nl80211_parse_sched_scan                    3311    3279     -32
kill_litter_super                             70      31     -39
format_is_yuv_semiplanar.part.isra            41       -     -41
lane_mask_to_lane                             44       -     -44
ip_fast_csum                                  48       -     -48
intel_cx0pll_readout_hw_state               1085    1037     -48
intel_pstate_update_policies                  66       -     -66
cdclk_divider                                 69       -     -69
kill_super_notify.part                       111       -    -111
__icl_update_topdown_event                   112       -    -112
intel_joiner_adjust_timings                  140       -    -140
intel_fill_fb_info                          2998    2813    -185
intel_tile_width_bytes                       747     531    -216
nl80211_send_iftype_data                     574       -    -574
__do_sys_pidfd_send_signal                   811       -    -811
Total: Before=22547058, After=22548491, chg +0.01%

A lot fewer functions are affected than I expected from such a 
large-scope change.

> Actually, functions with the most impact (x86 locking functions and
> __arch_hweight) were recently converted to asm_inline, so besides
> __untagged_addr, the remaining have very little impact, if at all
> (c.f. amd_clear_divider() ). There is also no need to convert asm()
> without directives inside.

I did the test with Linus-vanilla (81e4f8d68c66) to maximize the 
potential effect, which doesn't have those changes yet.

See tip:WIP.x86/core.

> My proposal would be to convert the remaining few cases (the remaining
> asms involving ALTERNATIVE and exceptions) "by hand" to asm_inline()
> and stick a rule in checkpatch to use asm_inline() in the code
> involving asm(), like we have the rule with asm volatile.
> 
> I don't think redefining an important C keyword is a good approach, it
> obfuscates its meaning too much. And as has been shown by Ingo's
> experiment, there is a substantial effort to fix false positives.
> Instead of fixing these, we can trivially convert the remaining cases
> to asm_volatile() as well, without obfuscating asm(). Checkpatch can
> take care of future cases.

This would work for me too. The cross-arch impact and churn seems 
substantial.

Thanks,

	Ingo

  reply	other threads:[~2025-03-19 22:34 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-03-17 22:18 [PATCH 0/5] x86/cpu: Introduce <asm/cpuid/types.h> and <asm/cpuid/api.h> and clean them up Ingo Molnar
2025-03-17 22:18 ` [PATCH 1/5] x86/cpuid: Refactor <asm/cpuid.h> Ingo Molnar
2025-03-18 12:00   ` [tip: x86/cpu] " tip-bot2 for Ahmed S. Darwish
2025-03-19 11:03   ` [tip: x86/core] " tip-bot2 for Ahmed S. Darwish
2025-03-17 22:18 ` [PATCH 2/5] x86/cpuid: Clean up <asm/cpuid/types.h> Ingo Molnar
2025-03-18 12:00   ` [tip: x86/cpu] " tip-bot2 for Ingo Molnar
2025-03-19 11:03   ` [tip: x86/core] " tip-bot2 for Ingo Molnar
2025-03-17 22:18 ` [PATCH 3/5] x86/cpuid: Clean up <asm/cpuid/api.h> Ingo Molnar
2025-03-18 12:00   ` [tip: x86/cpu] " tip-bot2 for Ingo Molnar
2025-03-19 11:03   ` [tip: x86/core] " tip-bot2 for Ingo Molnar
2025-03-17 22:18 ` [PATCH 4/5] x86/cpuid: Standardize on u32 in <asm/cpuid/api.h> Ingo Molnar
2025-03-18  5:59   ` Xin Li
2025-03-18 12:00   ` [tip: x86/cpu] " tip-bot2 for Ingo Molnar
2025-03-19 11:03   ` [tip: x86/core] " tip-bot2 for Ingo Molnar
2025-03-17 22:18 ` [PATCH 5/5] x86/cpuid: Use u32 in instead of uint32_t " Ingo Molnar
2025-03-18  6:01   ` Xin Li
2025-03-18  8:34     ` Ingo Molnar
2025-03-18  9:37       ` Borislav Petkov
2025-03-18 11:53         ` Ingo Molnar
2025-03-18 12:15           ` Borislav Petkov
2025-03-18 18:20             ` Ingo Molnar
2025-03-19  8:08               ` Thomas Gleixner
2025-03-19 20:16                 ` Ingo Molnar
2025-03-18 12:00   ` [tip: x86/cpu] " tip-bot2 for Ingo Molnar
2025-03-19 11:03   ` [tip: x86/core] " tip-bot2 for Ingo Molnar
2025-03-18 14:05 ` [PATCH 0/5] x86/cpu: Introduce <asm/cpuid/types.h> and <asm/cpuid/api.h> and clean them up H. Peter Anvin
2025-03-18 18:04   ` Ingo Molnar
2025-03-18 18:33     ` Linus Torvalds
2025-03-18 18:46       ` Ingo Molnar
2025-03-18 20:11         ` [PATCH] compiler/gcc: Make asm() templates asm __inline__() by default Ingo Molnar
2025-03-18 22:07           ` Josh Poimboeuf
2025-03-19  4:57           ` Uros Bizjak
2025-03-19 22:34             ` Ingo Molnar [this message]
2025-03-20  8:21               ` Uros Bizjak
2025-03-20  8:59                 ` Ingo Molnar
2025-03-20 10:30                   ` Uros Bizjak
2025-03-20 11:58                     ` Uros Bizjak
2025-03-19  3:30     ` [PATCH 0/5] x86/cpu: Introduce <asm/cpuid/types.h> and <asm/cpuid/api.h> and clean them up H. Peter Anvin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z9tGh0Fa96gACmpQ@gmail.com \
    --to=mingo@kernel.org \
    --cc=andrew.cooper3@citrix.com \
    --cc=bp@alien8.de \
    --cc=darwi@linutronix.de \
    --cc=hpa@zytor.com \
    --cc=jgross@suse.com \
    --cc=john.ogness@linutronix.de \
    --cc=jpoimboe@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=sstabellini@kernel.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=ubizjak@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox