netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jiri Pirko <jiri@resnulli.us>
To: taoyuan_eddy@hotmail.com
Cc: dev@openvswitch.org, Pravin B Shelar <pshelar@ovn.org>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] net: openvswitch: reduce cpu_used_mask memory consumption
Date: Wed, 1 Feb 2023 13:09:20 +0100	[thread overview]
Message-ID: <Y9pWcFdwaa4l6bPP@nanopsycho> (raw)
In-Reply-To: <OS3P286MB2295C30BCD41592C25805136F5D09@OS3P286MB2295.JPNP286.PROD.OUTLOOK.COM>

Tue, Jan 31, 2023 at 02:58:22PM CET, taoyuan_eddy@hotmail.com wrote:
>From: eddytaoyuan <taoyuan_eddy@hotmail.com>
>
>struct cpumask cpu_used_mask is directly embedded in struct sw_flow
>however, its size is hardcoded to CONFIG_NR_CPUS bits, which
>can be as large as 8192 by default, it cost memory and slows down
>ovs_flow_alloc, this fix used actual CPU number instead
>
>Signed-off-by: eddytaoyuan <taoyuan_eddy@hotmail.com>

Hmm, I guess that the name should be rather "Dddy Taoyuan" ? Please fix,
also in your "From:" header and preferably email client.


>---
> net/openvswitch/flow.c       |  6 +++---
> net/openvswitch/flow.h       |  2 +-
> net/openvswitch/flow_table.c | 25 ++++++++++++++++++++++---
> 3 files changed, 26 insertions(+), 7 deletions(-)
>
>diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
>index e20d1a973417..06345cd8c777 100644
>--- a/net/openvswitch/flow.c
>+++ b/net/openvswitch/flow.c
>@@ -107,7 +107,7 @@ void ovs_flow_stats_update(struct sw_flow *flow, __be16 tcp_flags,
> 
> 					rcu_assign_pointer(flow->stats[cpu],
> 							   new_stats);
>-					cpumask_set_cpu(cpu, &flow->cpu_used_mask);
>+					cpumask_set_cpu(cpu, flow->cpu_used_mask);
> 					goto unlock;
> 				}
> 			}
>@@ -135,7 +135,7 @@ void ovs_flow_stats_get(const struct sw_flow *flow,
> 	memset(ovs_stats, 0, sizeof(*ovs_stats));
> 
> 	/* We open code this to make sure cpu 0 is always considered */
>-	for (cpu = 0; cpu < nr_cpu_ids; cpu = cpumask_next(cpu, &flow->cpu_used_mask)) {
>+	for (cpu = 0; cpu < nr_cpu_ids; cpu = cpumask_next(cpu, flow->cpu_used_mask)) {
> 		struct sw_flow_stats *stats = rcu_dereference_ovsl(flow->stats[cpu]);
> 
> 		if (stats) {
>@@ -159,7 +159,7 @@ void ovs_flow_stats_clear(struct sw_flow *flow)
> 	int cpu;
> 
> 	/* We open code this to make sure cpu 0 is always considered */
>-	for (cpu = 0; cpu < nr_cpu_ids; cpu = cpumask_next(cpu, &flow->cpu_used_mask)) {
>+	for (cpu = 0; cpu < nr_cpu_ids; cpu = cpumask_next(cpu, flow->cpu_used_mask)) {
> 		struct sw_flow_stats *stats = ovsl_dereference(flow->stats[cpu]);
> 
> 		if (stats) {
>diff --git a/net/openvswitch/flow.h b/net/openvswitch/flow.h
>index 073ab73ffeaa..b5711aff6e76 100644
>--- a/net/openvswitch/flow.h
>+++ b/net/openvswitch/flow.h
>@@ -229,7 +229,7 @@ struct sw_flow {
> 					 */
> 	struct sw_flow_key key;
> 	struct sw_flow_id id;
>-	struct cpumask cpu_used_mask;
>+	struct cpumask *cpu_used_mask;
> 	struct sw_flow_mask *mask;
> 	struct sw_flow_actions __rcu *sf_acts;
> 	struct sw_flow_stats __rcu *stats[]; /* One for each CPU.  First one
>diff --git a/net/openvswitch/flow_table.c b/net/openvswitch/flow_table.c
>index 0a0e4c283f02..c0fdff73272f 100644
>--- a/net/openvswitch/flow_table.c
>+++ b/net/openvswitch/flow_table.c
>@@ -87,11 +87,12 @@ struct sw_flow *ovs_flow_alloc(void)
> 	if (!stats)
> 		goto err;
> 
>+	flow->cpu_used_mask = (struct cpumask *)&(flow->stats[nr_cpu_ids]);
> 	spin_lock_init(&stats->lock);
> 
> 	RCU_INIT_POINTER(flow->stats[0], stats);
> 
>-	cpumask_set_cpu(0, &flow->cpu_used_mask);
>+	cpumask_set_cpu(0, flow->cpu_used_mask);
> 
> 	return flow;
> err:
>@@ -115,7 +116,7 @@ static void flow_free(struct sw_flow *flow)
> 					  flow->sf_acts);
> 	/* We open code this to make sure cpu 0 is always considered */
> 	for (cpu = 0; cpu < nr_cpu_ids;
>-	     cpu = cpumask_next(cpu, &flow->cpu_used_mask)) {
>+	     cpu = cpumask_next(cpu, flow->cpu_used_mask)) {
> 		if (flow->stats[cpu])
> 			kmem_cache_free(flow_stats_cache,
> 					(struct sw_flow_stats __force *)flow->stats[cpu]);
>@@ -1194,9 +1195,27 @@ int ovs_flow_init(void)
> 	BUILD_BUG_ON(__alignof__(struct sw_flow_key) % __alignof__(long));
> 	BUILD_BUG_ON(sizeof(struct sw_flow_key) % sizeof(long));
> 
>+        /*
>+         * Simply including 'struct cpumask' in 'struct sw_flow'
>+         * consumes memory unnecessarily large.
>+         * The reason is that compilation option CONFIG_NR_CPUS(default 8192)
>+         * decides the value of NR_CPUS, which in turn decides size of
>+         * 'struct cpumask', which means 1024 bytes are needed for the cpumask
>+         * It affects ovs_flow_alloc performance as well as memory footprint.
>+         * We should use actual CPU count instead of hardcoded value.
>+         *
>+         * To address this, 'cpu_used_mask' is redefined to pointer
>+         * and append a cpumask_size() after 'stat' to hold the actual memory
>+         * of struct cpumask
>+         *
>+         * cpumask APIs like cpumask_next and cpumask_set_cpu have been defined
>+         * to never access bits beyond cpu count by design, thus above change is
>+         * safe even though the actual memory provided is smaller than
>+         * sizeof(struct cpumask)

I don't understand the reason to have this comment in the code. From
what I see, this should be moved to the patch description. Of course
with changing the mood to imperation when you say the codebase what to
do.




>+         */
> 	flow_cache = kmem_cache_create("sw_flow", sizeof(struct sw_flow)
> 				       + (nr_cpu_ids
>-					  * sizeof(struct sw_flow_stats *)),
>+					  * sizeof(struct sw_flow_stats *)) + cpumask_size(),
> 				       0, 0, NULL);
> 	if (flow_cache == NULL)
> 		return -ENOMEM;
>-- 
>2.27.0
>

  reply	other threads:[~2023-02-01 12:09 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-31 13:58 [PATCH] net: openvswitch: reduce cpu_used_mask memory consumption taoyuan_eddy
2023-02-01 12:09 ` Jiri Pirko [this message]
  -- strict thread matches above, loose matches on Subject: below --
2023-02-01  9:21 taoyuan_eddy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y9pWcFdwaa4l6bPP@nanopsycho \
    --to=jiri@resnulli.us \
    --cc=davem@davemloft.net \
    --cc=dev@openvswitch.org \
    --cc=edumazet@google.com \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=pshelar@ovn.org \
    --cc=taoyuan_eddy@hotmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).