From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
To: Jesper Dangaard Brouer <hawk@kernel.org>
Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
"David S. Miller" <davem@davemloft.net>,
"Daniel Bristot de Oliveira" <bristot@kernel.org>,
"Boqun Feng" <boqun.feng@gmail.com>,
"Daniel Borkmann" <daniel@iogearbox.net>,
"Eric Dumazet" <edumazet@google.com>,
"Frederic Weisbecker" <frederic@kernel.org>,
"Ingo Molnar" <mingo@redhat.com>,
"Jakub Kicinski" <kuba@kernel.org>,
"Paolo Abeni" <pabeni@redhat.com>,
"Peter Zijlstra" <peterz@infradead.org>,
"Thomas Gleixner" <tglx@linutronix.de>,
"Waiman Long" <longman@redhat.com>,
"Will Deacon" <will@kernel.org>,
"Alexei Starovoitov" <ast@kernel.org>,
"Andrii Nakryiko" <andrii@kernel.org>,
"Eduard Zingerman" <eddyz87@gmail.com>,
"Hao Luo" <haoluo@google.com>, "Jiri Olsa" <jolsa@kernel.org>,
"John Fastabend" <john.fastabend@gmail.com>,
"KP Singh" <kpsingh@kernel.org>,
"Martin KaFai Lau" <martin.lau@linux.dev>,
"Song Liu" <song@kernel.org>,
"Stanislav Fomichev" <sdf@google.com>,
"Toke Høiland-Jørgensen" <toke@redhat.com>,
"Yonghong Song" <yonghong.song@linux.dev>,
bpf@vger.kernel.org
Subject: Re: [PATCH v5 net-next 14/15] net: Reference bpf_redirect_info via task_struct on PREEMPT_RT.
Date: Tue, 11 Jun 2024 10:39:18 +0200 [thread overview]
Message-ID: <20240611083918.fJTJJtBu@linutronix.de> (raw)
In-Reply-To: <18328cc2-c135-4b69-8c5f-cd45998e970f@kernel.org>
On 2024-06-11 09:55:11 [+0200], Jesper Dangaard Brouer wrote:
> > For gcc the stosq vs movq depends on the CPU settings. The generic uses
> > movq up to 40 bytes, skylake uses movq even for 64bytes. clang…
> > This could be tuned via -mmemset-strategy=libcall:64:align,rep_8byte:-1:align
> >
>
> Cool I didn't know of this tuning. Is this a compiler option?
> Where do I change this setting, as I would like to experiment with this
> for our prod kernels.
This is what I play with right now, I'm not sure it is what I want… For
reference:
---->8-----
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 1d7122a1883e8..b35b7b21598de 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -775,6 +775,9 @@ config SCHED_OMIT_FRAME_POINTER
If in doubt, say "Y".
+config X86_OPT_MEMSET
+ bool "X86 memset playground"
+
menuconfig HYPERVISOR_GUEST
bool "Linux guest support"
help
diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 801fd85c3ef69..bab37787fe5cd 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -151,6 +151,15 @@ else
KBUILD_AFLAGS += -m64
KBUILD_CFLAGS += -m64
+ ifeq ($(CONFIG_X86_OPT_MEMSET),y)
+ #export X86_MEMSET_CFLAGS := -mmemset-strategy=libcall:64:align,rep_8byte:-1:align
+ export X86_MEMSET_CFLAGS := -mmemset-strategy=libcall:-1:align
+ else
+ export X86_MEMSET_CFLAGS :=
+ endif
+
+ KBUILD_CFLAGS += $(X86_MEMSET_CFLAGS)
+
# Align jump targets to 1 byte, not the default 16 bytes:
KBUILD_CFLAGS += $(call cc-option,-falign-jumps=1)
diff --git a/arch/x86/entry/vdso/Makefile b/arch/x86/entry/vdso/Makefile
index 215a1b202a918..d0c9a589885ef 100644
--- a/arch/x86/entry/vdso/Makefile
+++ b/arch/x86/entry/vdso/Makefile
@@ -121,6 +121,7 @@ KBUILD_CFLAGS_32 := $(filter-out -m64,$(KBUILD_CFLAGS))
KBUILD_CFLAGS_32 := $(filter-out -mcmodel=kernel,$(KBUILD_CFLAGS_32))
KBUILD_CFLAGS_32 := $(filter-out -fno-pic,$(KBUILD_CFLAGS_32))
KBUILD_CFLAGS_32 := $(filter-out -mfentry,$(KBUILD_CFLAGS_32))
+KBUILD_CFLAGS_32 := $(filter-out $(X86_MEMSET_CFLAGS),$(KBUILD_CFLAGS_32))
KBUILD_CFLAGS_32 := $(filter-out $(RANDSTRUCT_CFLAGS),$(KBUILD_CFLAGS_32))
KBUILD_CFLAGS_32 := $(filter-out $(GCC_PLUGINS_CFLAGS),$(KBUILD_CFLAGS_32))
KBUILD_CFLAGS_32 := $(filter-out $(RETPOLINE_CFLAGS),$(KBUILD_CFLAGS_32))
---->8-----
I dug this up in the gcc source code and initially played on the command
line with it. The snippet compiles the kernel and it boots so…
> My other finding is, this primarily a kernel compile problem, because
> for userspace compiler chooses to use MMX instructions (e.g. movaps
> xmmword ptr[rsp], xmm0). The kernel compiler options (-mno-sse -mno-mmx
> -mno-sse2 -mno-3dnow -mno-avx) disables this, which aparently changes
> the tipping point.
sure.
>
> > I folded this into the last two patches:
> >
> > diff --git a/include/linux/filter.h b/include/linux/filter.h
> > index d2b4260d9d0be..1588d208f1348 100644
> > --- a/include/linux/filter.h
> > +++ b/include/linux/filter.h
> > @@ -744,27 +744,40 @@ struct bpf_redirect_info {
> > struct bpf_nh_params nh;
> > };
> > +enum bpf_ctx_init_type {
> > + bpf_ctx_ri_init,
> > + bpf_ctx_cpu_map_init,
> > + bpf_ctx_dev_map_init,
> > + bpf_ctx_xsk_map_init,
> > +};
> > +
> > struct bpf_net_context {
> > struct bpf_redirect_info ri;
> > struct list_head cpu_map_flush_list;
> > struct list_head dev_map_flush_list;
> > struct list_head xskmap_map_flush_list;
> > + unsigned int flags;
>
> Why have yet another flags variable, when we already have two flags in
> bpf_redirect_info ?
Ah you want to fold this into ri member including the status for the
lists? Could try. It is splitted in order to delay the initialisation of
the lists, too. We would need to be careful to not overwrite the
flags if `ri' is initialized after the lists. That would be the case
with CONFIG_DEBUG_NET=y and not doing redirect (the empty list check
initializes that).
Sebastian
next prev parent reply other threads:[~2024-06-11 8:39 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20240607070427.1379327-1-bigeasy@linutronix.de>
2024-06-07 6:53 ` [PATCH v5 net-next 11/15] lwt: Don't disable migration prio invoking BPF Sebastian Andrzej Siewior
2024-06-07 6:53 ` [PATCH v5 net-next 12/15] seg6: Use nested-BH locking for seg6_bpf_srh_states Sebastian Andrzej Siewior
2024-06-07 6:53 ` [PATCH v5 net-next 13/15] net: Use nested-BH locking for bpf_scratchpad Sebastian Andrzej Siewior
2024-06-07 6:53 ` [PATCH v5 net-next 14/15] net: Reference bpf_redirect_info via task_struct on PREEMPT_RT Sebastian Andrzej Siewior
2024-06-07 11:51 ` Jesper Dangaard Brouer
2024-06-10 16:50 ` Sebastian Andrzej Siewior
2024-06-11 7:55 ` Jesper Dangaard Brouer
2024-06-11 8:39 ` Sebastian Andrzej Siewior [this message]
2024-06-12 10:42 ` Sebastian Andrzej Siewior
2024-06-07 6:53 ` [PATCH v5 net-next 15/15] net: Move per-CPU flush-lists to bpf_net_context " Sebastian Andrzej Siewior
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240611083918.fJTJJtBu@linutronix.de \
--to=bigeasy@linutronix.de \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=boqun.feng@gmail.com \
--cc=bpf@vger.kernel.org \
--cc=bristot@kernel.org \
--cc=daniel@iogearbox.net \
--cc=davem@davemloft.net \
--cc=eddyz87@gmail.com \
--cc=edumazet@google.com \
--cc=frederic@kernel.org \
--cc=haoluo@google.com \
--cc=hawk@kernel.org \
--cc=john.fastabend@gmail.com \
--cc=jolsa@kernel.org \
--cc=kpsingh@kernel.org \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=longman@redhat.com \
--cc=martin.lau@linux.dev \
--cc=mingo@redhat.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=peterz@infradead.org \
--cc=sdf@google.com \
--cc=song@kernel.org \
--cc=tglx@linutronix.de \
--cc=toke@redhat.com \
--cc=will@kernel.org \
--cc=yonghong.song@linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox