From: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
To: Ilya Maximets <i.maximets@ovn.org>
Cc: netdev@vger.kernel.org, linux-rt-devel@lists.linux.dev,
dev@openvswitch.org, Eric Dumazet <edumazet@google.com>,
Simon Horman <horms@kernel.org>, Jakub Kicinski <kuba@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Paolo Abeni <pabeni@redhat.com>,
"David S. Miller" <davem@davemloft.net>
Subject: Re: [ovs-dev] [PATCH net-next 11/18] openvswitch: Use nested-BH locking for ovs_actions.
Date: Thu, 13 Mar 2025 12:50:18 +0100 [thread overview]
Message-ID: <20250313115018.7xe77nJ-@linutronix.de> (raw)
In-Reply-To: <fd4c8167-0c2d-4f5f-bf70-1efcdf3de2fb@ovn.org>
On 2025-03-10 17:56:09 [+0100], Ilya Maximets wrote:
> >>> + local_lock_nested_bh(&ovs_actions.bh_lock);
> >>
> >> Wouldn't this cause a warning when we're in a syscall/process context?
> >
> > My understanding is that is only invoked in softirq context. Did I
> > misunderstood it?
>
> It can be called from the syscall/process context while processing
> OVS_PACKET_CMD_EXECUTE request.
>
> > Otherwise that this_cpu_ptr() above should complain
> > that preemption is not disabled and if preemption is indeed not disabled
> > how do you ensure that you don't get preempted after the
> > __this_cpu_inc_return() in several tasks (at the same time) leading to
> > exceeding the OVS_RECURSION_LIMIT?
>
> We disable BH in this case, so it should be safe (on non-RT). See the
> ovs_packet_cmd_execute() for more details.
Yes, exactly. So if BH is disabled then local_lock_nested_bh() can
safely acquire a per-CPU spinlock on PREEMPT_RT here. This basically
mimics the local_bh_disable() behaviour in terms of exclusive data
structures on a smaller scope.
> >>> + ovs_act->owner = current;
> >>> + }
> >>> +
> >>> level = __this_cpu_inc_return(ovs_actions.exec_level);
> >>> if (unlikely(level > OVS_RECURSION_LIMIT)) {
> >>> net_crit_ratelimited("ovs: recursion limit reached on datapath %s, probable configuration error\n",
> >>> @@ -1710,5 +1718,10 @@ int ovs_execute_actions(struct datapath *dp, struct sk_buff *skb,
> >>>
> >>> out:
> >>> __this_cpu_dec(ovs_actions.exec_level);
> >>> +
> >>> + if (level == 1) {
> >>> + ovs_act->owner = NULL;
> >>> + local_unlock_nested_bh(&ovs_actions.bh_lock);
> >>> + }
> >>
> >> Seems dangerous to lock every time the owner changes but unlock only
> >> once on level 1. Even if this works fine, it seems unnecessarily
> >> complicated. Maybe it's better to just lock once before calling
> >> ovs_execute_actions() instead?
> >
> > My understanding is this can be invoked recursively. That means on first
> > invocation owner == NULL and then you acquire the lock at which point
> > exec_level goes 0->1. On the recursive invocation owner == current and
> > you skip the lock but exec_level goes 1 -> 2.
> > On your return path once level becomes 1, then it means that dec made it
> > go 1 -> 0, you unlock the lock.
>
> My point is: why locking here with some extra non-obvious logic of owner
> tracking if we can lock (unconditionally?) in ovs_packet_cmd_execute() and
> ovs_dp_process_packet() instead? We already disable BH in one of those
> and take appropriate RCU locks in both. So, feels like a better place
> for the extra locking if necessary. We will also not need to move around
> any code in actions.c if the code there is guaranteed to be safe by holding
> locks outside of it.
I think I was considering it but dropped it because it looks like one
can call the other.
ovs_packet_cmd_execute() is an unique entry to ovs_execute_actions().
This could the lock unconditionally.
Then we have ovs_dp_process_packet() as the second entry point towards
ovs_execute_actions() and is the tricky one. One originates from
netdev_frame_hook() which the "normal" packet receiving.
Then within ovs_execute_actions() there is ovs_vport_send() which could
use internal_dev_recv() for forwarding. This one throws the packet into
the networking stack so it could come back via netdev_frame_hook().
Then there is this internal forwarding via internal_dev_xmit() which
also ends up in ovs_execute_actions(). Here I don't know if this can
originate from within the recursion.
After looking at this and seeing the internal_dev_recv() I decided to
move it to within ovs_execute_actions() where the recursion check itself
is.
> > The locking part happens only on PREEMPT_RT because !PREEMPT_RT has
> > softirqs disabled which guarantee that there will be no preemption.
> >
> > tools/testing/selftests/net/openvswitch should cover this?
>
> It's not a comprehensive test suite, it covers some cases, but it
> doesn't test anything related to preemptions specifically.
From looking at the traces, everything originates from
netdev_frame_hook() and there is sometimes one recursion from within
ovs_execute_actions(). I haven't seen anything else.
> >> Also, the name of the struct ovs_action doesn't make a lot of sense,
> >> I'd suggest to call it pcpu_storage or something like that instead.
> >> I.e. have a more generic name as the fields inside are not directly
> >> related to each other.
> >
> > Understood. ovs_pcpu_storage maybe?
>
> It's OK, I guess, but see also a point about locking inside datapath.c
> instead and probably not needing to change anything in actions.c.
If you say that adding a lock to ovs_dp_process_packet() and another to
ovs_packet_cmd_execute() then I can certainly update. However based on
what I wrote above, I am not sure.
> >> Best regards, Ilya Maximets.
Sebastian
next prev parent reply other threads:[~2025-03-13 11:50 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-03-09 14:46 [PATCH net-next 00/18] net: Cover more per-CPU storage with local nested BH locking Sebastian Andrzej Siewior
2025-03-09 14:46 ` [PATCH net-next 01/18] net: page_pool: Don't recycle into cache on PREEMPT_RT Sebastian Andrzej Siewior
2025-03-09 14:46 ` [PATCH net-next 02/18] net: dst_cache: Use nested-BH locking for dst_cache::cache Sebastian Andrzej Siewior
2025-03-09 14:46 ` [PATCH net-next 03/18] ipv4/route: Use this_cpu_inc() for stats on PREEMPT_RT Sebastian Andrzej Siewior
2025-03-09 14:46 ` [PATCH net-next 04/18] ipv6: sr: Use nested-BH locking for hmac_storage Sebastian Andrzej Siewior
2025-03-09 14:46 ` [PATCH net-next 05/18] xdp: Use nested-BH locking for system_page_pool Sebastian Andrzej Siewior
2025-03-09 14:46 ` [PATCH net-next 06/18] netfilter: nf_dup{4, 6}: Move duplication check to task_struct Sebastian Andrzej Siewior
2025-03-17 17:29 ` Paolo Abeni
2025-04-07 9:33 ` Juri Lelli
2025-03-09 14:46 ` [PATCH net-next 07/18] netfilter: nft_inner: Use nested-BH locking for nft_pcpu_tun_ctx Sebastian Andrzej Siewior
2025-03-09 14:46 ` [PATCH net-next 08/18] netfilter: nf_dup_netdev: Move the recursion counter struct netdev_xmit Sebastian Andrzej Siewior
2025-03-09 14:46 ` [PATCH net-next 09/18] xfrm: Use nested-BH locking for nat_keepalive_sk_ipv[46] Sebastian Andrzej Siewior
2025-03-09 14:46 ` [PATCH net-next 10/18] openvswitch: Merge three per-CPU structures into one Sebastian Andrzej Siewior
2025-03-17 17:34 ` Paolo Abeni
2025-03-17 17:38 ` Paolo Abeni
2025-03-09 14:46 ` [PATCH net-next 11/18] openvswitch: Use nested-BH locking for ovs_actions Sebastian Andrzej Siewior
2025-03-10 14:17 ` [ovs-dev] " Ilya Maximets
2025-03-10 14:44 ` Sebastian Andrzej Siewior
2025-03-10 16:56 ` Ilya Maximets
2025-03-13 11:50 ` Sebastian Andrzej Siewior [this message]
2025-03-13 13:23 ` Ilya Maximets
2025-03-13 14:02 ` Sebastian Andrzej Siewior
2025-03-13 14:11 ` Aaron Conole
2025-03-09 14:46 ` [PATCH net-next 12/18] openvswitch: Move ovs_frag_data_storage into the struct ovs_action Sebastian Andrzej Siewior
2025-03-09 14:46 ` [PATCH net-next 13/18] net/sched: act_mirred: Move the recursion counter struct netdev_xmit Sebastian Andrzej Siewior
2025-03-21 17:14 ` Davide Caratti
2025-04-11 13:12 ` Sebastian Andrzej Siewior
2025-03-09 14:46 ` [PATCH net-next 14/18] net/sched: Use nested-BH locking for sch_frag_data_storage Sebastian Andrzej Siewior
2025-03-09 14:46 ` [PATCH net-next 15/18] mptcp: Use nested-BH locking for hmac_storage Sebastian Andrzej Siewior
2025-03-09 14:46 ` [PATCH net-next 16/18] rds: Disable only bottom halves in rds_page_remainder_alloc() Sebastian Andrzej Siewior
2025-03-09 14:46 ` [PATCH net-next 17/18] rds: Acquire per-CPU pointer within BH disabled section Sebastian Andrzej Siewior
2025-03-09 14:46 ` [PATCH net-next 18/18] rds: Use nested-BH locking for rds_page_remainder Sebastian Andrzej Siewior
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250313115018.7xe77nJ-@linutronix.de \
--to=bigeasy@linutronix.de \
--cc=davem@davemloft.net \
--cc=dev@openvswitch.org \
--cc=edumazet@google.com \
--cc=horms@kernel.org \
--cc=i.maximets@ovn.org \
--cc=kuba@kernel.org \
--cc=linux-rt-devel@lists.linux.dev \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).