netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
To: "Toke Høiland-Jørgensen" <toke@redhat.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Network Development <netdev@vger.kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	Boqun Feng <boqun.feng@gmail.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Eric Dumazet <edumazet@google.com>,
	Frederic Weisbecker <frederic@kernel.org>,
	Ingo Molnar <mingo@redhat.com>, Jakub Kicinski <kuba@kernel.org>,
	Paolo Abeni <pabeni@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Waiman Long <longman@redhat.com>, Will Deacon <will@kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	Cong Wang <xiyou.wangcong@gmail.com>, Hao Luo <haoluo@google.com>,
	Jamal Hadi Salim <jhs@mojatatu.com>,
	Jesper Dangaard Brouer <hawk@kernel.org>,
	Jiri Olsa <jolsa@kernel.org>, Jiri Pirko <jiri@resnulli.us>,
	John Fastabend <john.fastabend@gmail.com>,
	KP Singh <kpsingh@kernel.org>,
	Martin KaFai Lau <martin.lau@linux.dev>,
	Ronak Doshi <doshir@vmware.com>, Song Liu <song@kernel.org>,
	Stanislav Fomichev <sdf@google.com>,
	VMware PV-Drivers Reviewers <pv-drivers@vmware.com>,
	Yonghong Song <yonghong.song@linux.dev>,
	bpf <bpf@vger.kernel.org>
Subject: Re: [PATCH net-next 15/24] net: Use nested-BH locking for XDP redirect.
Date: Fri, 12 Jan 2024 18:41:38 +0100	[thread overview]
Message-ID: <20240112174138.tMmUs11o@linutronix.de> (raw)
In-Reply-To: <87r0iw524h.fsf@toke.dk>

On 2024-01-04 20:29:02 [+0100], Toke Høiland-Jørgensen wrote:
> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
> 
> >> @@ -3925,6 +3926,7 @@ struct sk_buff *tcf_qevent_handle(struct tcf_qevent *qe, struct Qdisc *sch, stru
> >>
> >>         fl = rcu_dereference_bh(qe->filter_chain);
> >>
> >> +       guard(local_lock_nested_bh)(&bpf_run_lock.redirect_lock);
> >>         switch (tcf_classify(skb, NULL, fl, &cl_res, false)) {
> >>         case TC_ACT_SHOT:
> >>                 qdisc_qstats_drop(sch);
> >
> > Here and in all other places this patch adds locks that
> > will kill performance of XDP, tcx and everything else in networking.
> >
> > I'm surprised Jesper and other folks are not jumping in with nacks.
> > We measure performance in nanoseconds here.
> > Extra lock is no go.
> > Please find a different way without ruining performance.
> 
> I'll add that while all this compiles out as no-ops on !PREEMPT_RT, I do
> believe there are people who are using XDP on PREEMPT_RT kernels and
> still expect decent performance. And to achieve that it is absolutely
> imperative that we can amortise expensive operations (such as locking)
> over multiple packets.
> 
> I realise there's a fundamental trade-off between the amount of
> amortisation and the latency hit that we take from holding locks for
> longer, but tuning the batch size (while still keeping some amount of
> batching) may be a way forward? I suppose Jakub's suggestion in the
> other part of the thread, of putting the locks around napi->poll(), is a
> step towards something like this.

The RT requirements are usually different. Networking as in CAN might be
important but Ethernet could only used for remote communication and so
"not" important. People complained that they need to wait for Ethernet
to be done until the CAN packet can be injected into the stack.
With that expectation you would like to pause Ethernet immediately and
switch over the CAN interrupt thread.

But if someone managed to setup XDP then it is likely to be important.
With RT traffic it is usually not the throughput that matters but the
latency. You are likely in the position to receive a packet, say every
1ms, and need to respond immediately. XDP would be used to inspect the
packet and either hand it over to the stack or process it.

I expected the lock operation (under RT) to always succeeds and not
cause any delay because it should not be contended. It should only
block if something with higher priority preempted the current interrupt
thread _and_ also happen to use XDP on the same CPU. In that case (XDP
is needed) it would flush the current user out of the locked section
before the higher-prio thread could take over. Doing bulk and allowing
the low-priority thread to complete would delay the high-priority
thread. Maybe I am too pessimistic here and having two XDP programs on
one CPU is unlikely to happen.

Adding the lock on per-NAPI basis would allow to batch packets.
Acquiring the lock only if XDP is supported would not block the CAN
drivers since they dont't support XDP. But sounds like a hack.

Daniel said netkit doesn't need this locking because it is not
supporting this redirect and it made me think. Would it work to make
the redirect structures part of the bpf_prog-structure instead of
per-CPU? My understanding is that eBPF's programs data structures are
part of it and contain locking allowing one eBPF program preempt
another one.
Having the redirect structures part of the program would obsolete
locking. Do I miss anything?

> -Toke

Sebastian

  reply	other threads:[~2024-01-12 17:41 UTC|newest]

Thread overview: 60+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-15 17:07 [PATCH net-next 00/24] locking: Introduce nested-BH locking Sebastian Andrzej Siewior
2023-12-15 17:07 ` [PATCH net-next 01/24] locking/local_lock: Introduce guard definition for local_lock Sebastian Andrzej Siewior
2023-12-18  8:16   ` Paolo Abeni
2024-01-11 16:19     ` Sebastian Andrzej Siewior
2023-12-15 17:07 ` [PATCH net-next 02/24] locking/local_lock: Add local nested BH locking infrastructure Sebastian Andrzej Siewior
2023-12-15 17:07 ` [PATCH net-next 03/24] net: Use __napi_alloc_frag_align() instead of open coding it Sebastian Andrzej Siewior
2023-12-18  7:48   ` Paolo Abeni
2024-01-12  9:01     ` Sebastian Andrzej Siewior
2023-12-15 17:07 ` [PATCH net-next 04/24] net: Use nested-BH locking for napi_alloc_cache Sebastian Andrzej Siewior
2023-12-16  4:43   ` kernel test robot
2024-01-12 10:58     ` Sebastian Andrzej Siewior
2023-12-15 17:07 ` [PATCH net-next 05/24] net/tcp_sigpool: Use nested-BH locking for sigpool_scratch Sebastian Andrzej Siewior
2023-12-15 17:07 ` [PATCH net-next 06/24] net/ipv4: Use nested-BH locking for ipv4_tcp_sk Sebastian Andrzej Siewior
2023-12-15 17:07 ` [PATCH net-next 07/24] netfilter: br_netfilter: Use nested-BH locking for brnf_frag_data_storage Sebastian Andrzej Siewior
2023-12-15 17:07 ` [PATCH net-next 08/24] net: softnet_data: Make xmit.recursion per task Sebastian Andrzej Siewior
2023-12-15 17:07 ` [PATCH net-next 09/24] dev: Use the RPS lock for softnet_data::input_pkt_queue on PREEMPT_RT Sebastian Andrzej Siewior
2023-12-15 17:07 ` [PATCH net-next 10/24] dev: Use nested-BH locking for softnet_data.process_queue Sebastian Andrzej Siewior
2023-12-15 17:07 ` [PATCH net-next 11/24] lwt: Don't disable migration prio invoking BPF Sebastian Andrzej Siewior
2023-12-15 17:07 ` [PATCH net-next 12/24] seg6: Use nested-BH locking for seg6_bpf_srh_states Sebastian Andrzej Siewior
2023-12-16  3:39   ` kernel test robot
2023-12-18  8:33   ` Paolo Abeni
2024-01-12 11:23     ` Sebastian Andrzej Siewior
2023-12-15 17:07 ` [PATCH net-next 13/24] net: Use nested-BH locking for bpf_scratchpad Sebastian Andrzej Siewior
2023-12-15 17:07 ` [PATCH net-next 14/24] net: Add a lock which held during the redirect process Sebastian Andrzej Siewior
2023-12-15 17:07 ` [PATCH net-next 15/24] net: Use nested-BH locking for XDP redirect Sebastian Andrzej Siewior
2023-12-16  4:12   ` kernel test robot
2023-12-20  0:25   ` Alexei Starovoitov
2024-01-04 19:29     ` Toke Høiland-Jørgensen
2024-01-12 17:41       ` Sebastian Andrzej Siewior [this message]
2024-01-17 16:37         ` Toke Høiland-Jørgensen
2024-01-18  2:04           ` Jakub Kicinski
2024-01-18  8:27             ` Sebastian Andrzej Siewior
2024-01-18 16:38               ` Jakub Kicinski
2024-01-18 16:50                 ` Sebastian Andrzej Siewior
2024-01-18 11:51             ` Toke Høiland-Jørgensen
2024-01-18 16:37               ` Jakub Kicinski
2024-01-20 14:41                 ` Toke Høiland-Jørgensen
2024-01-18  7:35           ` Sebastian Andrzej Siewior
2024-01-18 11:58             ` Toke Høiland-Jørgensen
2023-12-15 17:07 ` [PATCH net-next 16/24] net: netkit, veth, tun, virt*: " Sebastian Andrzej Siewior
2023-12-16 19:28   ` kernel test robot
2023-12-18  8:52   ` Daniel Borkmann
2024-01-12 15:37     ` Sebastian Andrzej Siewior
2023-12-15 17:07 ` [PATCH net-next 17/24] net: amazon, aquanti, broadcom, cavium, engleder: " Sebastian Andrzej Siewior
2023-12-16 22:09   ` Kiyanovski, Arthur
2024-01-12 17:53     ` Sebastian Andrzej Siewior
2023-12-15 17:07 ` [PATCH net-next 18/24] net: Freescale: " Sebastian Andrzej Siewior
2023-12-15 17:07 ` [PATCH net-next 19/24] net: fungible, gve, mtk, microchip, mana: " Sebastian Andrzej Siewior
2023-12-15 17:07 ` [PATCH net-next 20/24] net: intel: " Sebastian Andrzej Siewior
2023-12-16  4:53   ` kernel test robot
2023-12-19  0:01     ` Nathan Chancellor
2023-12-19 16:55       ` Nick Desaulniers
2023-12-15 17:07 ` [PATCH net-next 21/24] net: marvell: " Sebastian Andrzej Siewior
2023-12-15 17:07 ` [PATCH net-next 22/24] net: mellanox, nfp, sfc: " Sebastian Andrzej Siewior
2023-12-15 17:07 ` [PATCH net-next 23/24] net: qlogic, socionext, stmmac, cpsw: " Sebastian Andrzej Siewior
2023-12-15 17:07 ` [PATCH net-next 24/24] net: bpf: Add lockdep assert for the redirect process Sebastian Andrzej Siewior
2023-12-15 22:50 ` [PATCH net-next 00/24] locking: Introduce nested-BH locking Jakub Kicinski
2023-12-18 17:23   ` Sebastian Andrzej Siewior
2023-12-19  0:41     ` Jakub Kicinski
2023-12-21 20:46       ` Sebastian Andrzej Siewior

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240112174138.tMmUs11o@linutronix.de \
    --to=bigeasy@linutronix.de \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=boqun.feng@gmail.com \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=davem@davemloft.net \
    --cc=doshir@vmware.com \
    --cc=edumazet@google.com \
    --cc=frederic@kernel.org \
    --cc=haoluo@google.com \
    --cc=hawk@kernel.org \
    --cc=jhs@mojatatu.com \
    --cc=jiri@resnulli.us \
    --cc=john.fastabend@gmail.com \
    --cc=jolsa@kernel.org \
    --cc=kpsingh@kernel.org \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=longman@redhat.com \
    --cc=martin.lau@linux.dev \
    --cc=mingo@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=peterz@infradead.org \
    --cc=pv-drivers@vmware.com \
    --cc=sdf@google.com \
    --cc=song@kernel.org \
    --cc=tglx@linutronix.de \
    --cc=toke@redhat.com \
    --cc=will@kernel.org \
    --cc=xiyou.wangcong@gmail.com \
    --cc=yonghong.song@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).