From: Aaron Conole <aconole@redhat.com>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: netdev@vger.kernel.org, linux-rt-devel@lists.linux.dev,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>,
Paolo Abeni <pabeni@redhat.com>,
Simon Horman <horms@kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
Eelco Chaudron <echaudro@redhat.com>,
Ilya Maximets <i.maximets@ovn.org>,
dev@openvswitch.org
Subject: Re: [PATCH net-next v2 12/18] openvswitch: Move ovs_frag_data_storage into the struct ovs_pcpu_storage
Date: Tue, 15 Apr 2025 12:26:13 -0400
Message-ID: <f7tbjsxfl22.fsf@redhat.com>
In-Reply-To: <20250414160754.503321-13-bigeasy@linutronix.de> (Sebastian Andrzej Siewior's message of "Mon, 14 Apr 2025 18:07:48 +0200")
Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:
> ovs_frag_data_storage is a per-CPU variable and relies on disabled BH for its
> locking. Without per-CPU locking in local_bh_disable() on PREEMPT_RT
> this data structure requires explicit locking.
>
> Move ovs_frag_data_storage into the struct ovs_pcpu_storage which already
> provides locking for the structure.
>
> Cc: Aaron Conole <aconole@redhat.com>
> Cc: Eelco Chaudron <echaudro@redhat.com>
> Cc: Ilya Maximets <i.maximets@ovn.org>
> Cc: dev@openvswitch.org
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
I'm going to reply here, but I need to bisect a bit more (though I
suspect the results below are due to 11/18). When I tested with this
patch, there were lots of "unexplained" latency spikes during processing
(note that I'm not using PREEMPT_RT in my testing, but I guess it would
smooth the spikes out at the cost of max performance).
With the series:
[SUM] 0.00-300.00 sec 3.28 TBytes 96.1 Gbits/sec 9417 sender
[SUM] 0.00-300.00 sec 3.28 TBytes 96.1 Gbits/sec receiver
Without the series:
[SUM] 0.00-300.00 sec 3.26 TBytes 95.5 Gbits/sec 149 sender
[SUM] 0.00-300.00 sec 3.26 TBytes 95.5 Gbits/sec receiver
And while the 'final' numbers might look acceptable, one thing I'll note
is that I saw multiple stalls such as:
[ 5] 57.00-58.00 sec 128 KBytes 903 Kbits/sec 0 4.02 MBytes
Without the patch, I didn't see such stalls. My testing:
1. Install openvswitch userspace and ipcalc.
2. Start userspace.
3. Set up two netns and connect them (I have a more complicated script to
set up the flows, and I can send that to you).
4. Use iperf3 to test (-P5 -t 300).
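For anyone trying to reproduce this without the full script, the steps above can be sketched roughly as follows. This is only a minimal approximation, not the script mentioned in step 3: the bridge name `br0`, the namespace and veth names, and the 10.0.0.0/24 addresses are all assumptions, and no explicit flows are installed (the bridge is left with its default behaviour). The helper only builds the command list so it can be reviewed before anything is run as root.

```python
import shlex


def build_test_commands(bridge="br0", duration=300, streams=5):
    """Return the commands for a two-netns OVS test as argument lists.

    All names and addresses are hypothetical placeholders.
    """
    cmds = [["ovs-vsctl", "--may-exist", "add-br", bridge]]
    for idx in (1, 2):
        ns = f"ns{idx}"
        host_if = f"veth{idx}"      # end attached to the OVS bridge
        ns_if = f"veth{idx}-ns"     # end moved into the namespace
        addr = f"10.0.0.{idx}/24"
        cmds += [
            ["ip", "netns", "add", ns],
            ["ip", "link", "add", host_if, "type", "veth",
             "peer", "name", ns_if],
            ["ip", "link", "set", ns_if, "netns", ns],
            ["ip", "netns", "exec", ns, "ip", "addr", "add", addr,
             "dev", ns_if],
            ["ip", "netns", "exec", ns, "ip", "link", "set", ns_if, "up"],
            ["ip", "link", "set", host_if, "up"],
            ["ovs-vsctl", "add-port", bridge, host_if],
        ]
    # iperf3 server in ns1 (daemonized), client in ns2.
    cmds.append(["ip", "netns", "exec", "ns1", "iperf3", "-s", "-D"])
    cmds.append(["ip", "netns", "exec", "ns2", "iperf3", "-c", "10.0.0.1",
                 f"-P{streams}", "-t", str(duration)])
    return cmds


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for cmd in build_test_commands():
        print(shlex.join(cmd))
```

Printing the list first and then feeding it to `subprocess.run(cmd, check=True)` as root would be one way to execute it; the real setup uses more elaborate flows, so results may differ.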
As I wrote, I suspect the locking in 11/18 is leading to these stalls, as
the data I'm sending shouldn't be hitting the frag path.
Do these results seem expected to you?
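To make the stall comparison less dependent on eyeballing the output, the per-interval lines could be scanned automatically. The following is a small sketch that assumes the default human-readable iperf3 output format shown above; the function name `find_stalls` and the 1 Mbit/s threshold are arbitrary choices for illustration, not anything iperf3 provides.

```python
import re

# Matches iperf3 interval lines such as:
# [  5]  57.00-58.00  sec   128 KBytes   903 Kbits/sec    0   4.02 MBytes
LINE_RE = re.compile(
    r"\[\s*(?P<id>\w+)\]\s+(?P<start>\d+\.\d+)-(?P<end>\d+\.\d+)\s+sec\s+"
    r"[\d.]+\s+\w*Bytes\s+(?P<rate>[\d.]+)\s+(?P<unit>[KMGT]?)bits/sec"
)
SCALE = {"": 1.0, "K": 1e3, "M": 1e6, "G": 1e9, "T": 1e12}


def find_stalls(lines, threshold_bps=1e6):
    """Return (stream, start, end, bits_per_sec) for slow intervals.

    threshold_bps is an assumed cut-off; tune it for the link under test.
    """
    stalls = []
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue  # not an interval or summary line
        bps = float(m["rate"]) * SCALE[m["unit"]]
        if bps < threshold_bps:
            stalls.append((m["id"], float(m["start"]), float(m["end"]), bps))
    return stalls
```

Running it over a saved log from each kernel (with and without the series) and comparing the number of flagged intervals would give a simple stall count to report alongside the throughput and retransmit totals.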
> net/openvswitch/actions.c | 20 ++------------------
> net/openvswitch/datapath.h | 16 ++++++++++++++++
> 2 files changed, 18 insertions(+), 18 deletions(-)
>
> diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
> index f4996c11aefac..4d20eadd77ceb 100644
> --- a/net/openvswitch/actions.c
> +++ b/net/openvswitch/actions.c
> @@ -39,22 +39,6 @@
> #include "flow_netlink.h"
> #include "openvswitch_trace.h"
>
> -#define MAX_L2_LEN (VLAN_ETH_HLEN + 3 * MPLS_HLEN)
> -struct ovs_frag_data {
> - unsigned long dst;
> - struct vport *vport;
> - struct ovs_skb_cb cb;
> - __be16 inner_protocol;
> - u16 network_offset; /* valid only for MPLS */
> - u16 vlan_tci;
> - __be16 vlan_proto;
> - unsigned int l2_len;
> - u8 mac_proto;
> - u8 l2_data[MAX_L2_LEN];
> -};
> -
> -static DEFINE_PER_CPU(struct ovs_frag_data, ovs_frag_data_storage);
> -
> DEFINE_PER_CPU(struct ovs_pcpu_storage, ovs_pcpu_storage) = {
> .bh_lock = INIT_LOCAL_LOCK(bh_lock),
> };
> @@ -771,7 +755,7 @@ static int set_sctp(struct sk_buff *skb, struct sw_flow_key *flow_key,
> static int ovs_vport_output(struct net *net, struct sock *sk,
> struct sk_buff *skb)
> {
> - struct ovs_frag_data *data = this_cpu_ptr(&ovs_frag_data_storage);
> + struct ovs_frag_data *data = this_cpu_ptr(&ovs_pcpu_storage.frag_data);
> struct vport *vport = data->vport;
>
> if (skb_cow_head(skb, data->l2_len) < 0) {
> @@ -823,7 +807,7 @@ static void prepare_frag(struct vport *vport, struct sk_buff *skb,
> unsigned int hlen = skb_network_offset(skb);
> struct ovs_frag_data *data;
>
> - data = this_cpu_ptr(&ovs_frag_data_storage);
> + data = this_cpu_ptr(&ovs_pcpu_storage.frag_data);
> data->dst = skb->_skb_refdst;
> data->vport = vport;
> data->cb = *OVS_CB(skb);
> diff --git a/net/openvswitch/datapath.h b/net/openvswitch/datapath.h
> index 4a665c3cfa906..1b5348b0f5594 100644
> --- a/net/openvswitch/datapath.h
> +++ b/net/openvswitch/datapath.h
> @@ -13,6 +13,7 @@
> #include <linux/skbuff.h>
> #include <linux/u64_stats_sync.h>
> #include <net/ip_tunnels.h>
> +#include <net/mpls.h>
>
> #include "conntrack.h"
> #include "flow.h"
> @@ -173,6 +174,20 @@ struct ovs_net {
> bool xt_label;
> };
>
> +#define MAX_L2_LEN (VLAN_ETH_HLEN + 3 * MPLS_HLEN)
> +struct ovs_frag_data {
> + unsigned long dst;
> + struct vport *vport;
> + struct ovs_skb_cb cb;
> + __be16 inner_protocol;
> + u16 network_offset; /* valid only for MPLS */
> + u16 vlan_tci;
> + __be16 vlan_proto;
> + unsigned int l2_len;
> + u8 mac_proto;
> + u8 l2_data[MAX_L2_LEN];
> +};
> +
> struct deferred_action {
> struct sk_buff *skb;
> const struct nlattr *actions;
> @@ -200,6 +215,7 @@ struct action_flow_keys {
> struct ovs_pcpu_storage {
> struct action_fifo action_fifos;
> struct action_flow_keys flow_keys;
> + struct ovs_frag_data frag_data;
> int exec_level;
> struct task_struct *owner;
> local_lock_t bh_lock;