Re: [PATCH net-next v2 4/7] netfilter: flowtable: allow updating offloaded rules asynchronously


From: Vlad Buslov <vladbu@nvidia.com>
To: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Cc: <davem@davemloft.net>, <kuba@kernel.org>, <pabeni@redhat.com>,
	<pablo@netfilter.org>, <netdev@vger.kernel.org>,
	<netfilter-devel@vger.kernel.org>, <jhs@mojatatu.com>,
	<xiyou.wangcong@gmail.com>, <jiri@resnulli.us>, <ozsh@nvidia.com>,
	<simon.horman@corigine.com>
Subject: Re: [PATCH net-next v2 4/7] netfilter: flowtable: allow updating offloaded rules asynchronously
Date: Tue, 17 Jan 2023 19:33:31 +0200
Message-ID: <875yd5cbcl.fsf@nvidia.com>
In-Reply-To: <Y8a+qBjcr7KuPf+e@t14s.localdomain>


On Tue 17 Jan 2023 at 12:28, Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> wrote:
> On Fri, Jan 13, 2023 at 05:55:45PM +0100, Vlad Buslov wrote:
>> Following patches in the series need to update a flowtable rule several
>> times during its lifetime in order to synchronize the hardware offload
>> with the actual ct status. However, reusing the existing 'refresh' logic
>> in act_ct would cause the data path to potentially schedule a significant
>> number of spurious tasks in the 'add' workqueue, since it is executed
>> per-packet. Instead, introduce a new flow 'update' flag and use it to
>> schedule an async flow refresh in flowtable gc, which will only be
>> executed once per gc iteration.
>> 
>> Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
>> ---
>>  include/net/netfilter/nf_flow_table.h |  3 ++-
>>  net/netfilter/nf_flow_table_core.c    | 20 +++++++++++++++-----
>>  net/netfilter/nf_flow_table_offload.c |  5 +++--
>>  3 files changed, 20 insertions(+), 8 deletions(-)
>> 
>> diff --git a/include/net/netfilter/nf_flow_table.h b/include/net/netfilter/nf_flow_table.h
>> index 88ab98ab41d9..e396424e2e68 100644
>> --- a/include/net/netfilter/nf_flow_table.h
>> +++ b/include/net/netfilter/nf_flow_table.h
>> @@ -165,6 +165,7 @@ enum nf_flow_flags {
>>  	NF_FLOW_HW_DEAD,
>>  	NF_FLOW_HW_PENDING,
>>  	NF_FLOW_HW_BIDIRECTIONAL,
>> +	NF_FLOW_HW_UPDATE,
>>  };
>>  
>>  enum flow_offload_type {
>> @@ -300,7 +301,7 @@ unsigned int nf_flow_offload_ipv6_hook(void *priv, struct sk_buff *skb,
>>  #define MODULE_ALIAS_NF_FLOWTABLE(family)	\
>>  	MODULE_ALIAS("nf-flowtable-" __stringify(family))
>>  
>> -void nf_flow_offload_add(struct nf_flowtable *flowtable,
>> +bool nf_flow_offload_add(struct nf_flowtable *flowtable,
>>  			 struct flow_offload *flow);
>>  void nf_flow_offload_del(struct nf_flowtable *flowtable,
>>  			 struct flow_offload *flow);
>> diff --git a/net/netfilter/nf_flow_table_core.c b/net/netfilter/nf_flow_table_core.c
>> index 04bd0ed4d2ae..5b495e768655 100644
>> --- a/net/netfilter/nf_flow_table_core.c
>> +++ b/net/netfilter/nf_flow_table_core.c
>> @@ -316,21 +316,28 @@ int flow_offload_add(struct nf_flowtable *flow_table, struct flow_offload *flow)
>>  }
>>  EXPORT_SYMBOL_GPL(flow_offload_add);
>>  
>> +static bool __flow_offload_refresh(struct nf_flowtable *flow_table,
>> +				   struct flow_offload *flow)
>> +{
>> +	if (likely(!nf_flowtable_hw_offload(flow_table)))
>> +		return true;
>> +
>> +	return nf_flow_offload_add(flow_table, flow);
>> +}
>> +
>>  void flow_offload_refresh(struct nf_flowtable *flow_table,
>>  			  struct flow_offload *flow)
>>  {
>>  	u32 timeout;
>>  
>>  	timeout = nf_flowtable_time_stamp + flow_offload_get_timeout(flow);
>> -	if (timeout - READ_ONCE(flow->timeout) > HZ)
>> +	if (timeout - READ_ONCE(flow->timeout) > HZ &&
>> +	    !test_bit(NF_FLOW_HW_UPDATE, &flow->flags))
>>  		WRITE_ONCE(flow->timeout, timeout);
>>  	else
>>  		return;
>>  
>> -	if (likely(!nf_flowtable_hw_offload(flow_table)))
>> -		return;
>> -
>> -	nf_flow_offload_add(flow_table, flow);
>> +	__flow_offload_refresh(flow_table, flow);
>>  }
>>  EXPORT_SYMBOL_GPL(flow_offload_refresh);
>>  
>> @@ -435,6 +442,9 @@ static void nf_flow_offload_gc_step(struct nf_flowtable *flow_table,
>>  		} else {
>>  			flow_offload_del(flow_table, flow);
>>  		}
>> +	} else if (test_and_clear_bit(NF_FLOW_HW_UPDATE, &flow->flags)) {
>> +		if (!__flow_offload_refresh(flow_table, flow))
>> +			set_bit(NF_FLOW_HW_UPDATE, &flow->flags);
>>  	} else if (test_bit(NF_FLOW_HW, &flow->flags)) {
>>  		nf_flow_offload_stats(flow_table, flow);
>
> AFAICT even after this patchset it is possible to have both flags set
> at the same time.
> With that, this would cause the stats to skip a beat.
> This would be better:
>
> -	} else if (test_bit(NF_FLOW_HW, &flow->flags)) {
> -		nf_flow_offload_stats(flow_table, flow);
> +	} else {
> +		if (test_and_clear_bit(NF_FLOW_HW_UPDATE, &flow->flags))
> +			if (!__flow_offload_refresh(flow_table, flow))
> +				set_bit(NF_FLOW_HW_UPDATE, &flow->flags);
> +		if (test_bit(NF_FLOW_HW, &flow->flags))
> +			nf_flow_offload_stats(flow_table, flow);
>  	}
>
> But a flow cannot have 2 pending actions at a time.

Yes. And the timeouts are quite generous, so IMO there is no problem in
skipping one iteration. This wq is not high priority, and we cannot
guarantee any exact update interval here anyway.
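
To make the "skipped iteration" concrete, here is a rough user-space
sketch of the two gc branch layouts discussed above. It is not kernel
code: the model_* names and MODEL_F_* bits are made up for illustration,
the flag handling is non-atomic, and the refresh is assumed to always
succeed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for NF_FLOW_HW and NF_FLOW_HW_UPDATE. */
enum { MODEL_F_HW = 1u << 0, MODEL_F_UPDATE = 1u << 1 };

struct model_flow {
	unsigned int flags;
	int refreshes;   /* times the refresh path ran */
	int stats_polls; /* times the stats path ran */
};

/* Non-atomic model of test_and_clear_bit(). */
static bool model_test_and_clear(struct model_flow *f, unsigned int bit)
{
	bool was_set = (f->flags & bit) != 0;

	f->flags &= ~bit;
	return was_set;
}

/* Layout as posted: update and stats are exclusive else-if branches. */
static void model_gc_posted(struct model_flow *f)
{
	if (model_test_and_clear(f, MODEL_F_UPDATE))
		f->refreshes++;		/* assumed to succeed */
	else if (f->flags & MODEL_F_HW)
		f->stats_polls++;
}

/* Layout suggested in review: both may run in the same gc iteration. */
static void model_gc_suggested(struct model_flow *f)
{
	if (model_test_and_clear(f, MODEL_F_UPDATE))
		f->refreshes++;		/* assumed to succeed */
	if (f->flags & MODEL_F_HW)
		f->stats_polls++;
}

/* Run gc on a flow that starts with both flags set; count stats polls. */
static int model_stats_polls(bool suggested, int iterations)
{
	struct model_flow f = { .flags = MODEL_F_HW | MODEL_F_UPDATE };

	assert(iterations >= 0);
	for (int i = 0; i < iterations; i++) {
		if (suggested)
			model_gc_suggested(&f);
		else
			model_gc_posted(&f);
	}
	return f.stats_polls;
}
```

Over three gc iterations, model_stats_polls(false, 3) gives 2 and
model_stats_polls(true, 3) gives 3, i.e. the posted layout delays stats
by exactly one gc iteration per pending update.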

> Then maybe an update to nf_flow_offload_tuple() to make it handle the
> stats implicitly?

I considered this, but didn't want to over-complicate this series which
is tricky enough as it is.

>
>>  	}
>> diff --git a/net/netfilter/nf_flow_table_offload.c b/net/netfilter/nf_flow_table_offload.c
>> index 8b852f10fab4..103b2ca8d123 100644
>> --- a/net/netfilter/nf_flow_table_offload.c
>> +++ b/net/netfilter/nf_flow_table_offload.c
>> @@ -1036,16 +1036,17 @@ nf_flow_offload_work_alloc(struct nf_flowtable *flowtable,
>>  }
>>  
>>  
>> -void nf_flow_offload_add(struct nf_flowtable *flowtable,
>> +bool nf_flow_offload_add(struct nf_flowtable *flowtable,
>>  			 struct flow_offload *flow)
>>  {
>>  	struct flow_offload_work *offload;
>>  
>>  	offload = nf_flow_offload_work_alloc(flowtable, flow, FLOW_CLS_REPLACE);
>>  	if (!offload)
>> -		return;
>> +		return false;
>>  
>>  	flow_offload_queue_work(offload);
>> +	return true;
>>  }
>>  
>>  void nf_flow_offload_del(struct nf_flowtable *flowtable,
>> -- 
>> 2.38.1
>>
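
For reference, the retry behaviour that the bool return value enables
(nf_flow_offload_add() returning false when the work allocation fails,
and gc setting NF_FLOW_HW_UPDATE again) can be modelled roughly as
below. This is only a user-space sketch under assumptions: retry_model,
model_offload_add() and the other names are hypothetical, and the real
code uses atomic bitops on flow->flags.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the NF_FLOW_HW_UPDATE bit. */
enum { MODEL_HW_UPDATE = 1u << 0 };

struct retry_model {
	unsigned int flags;
	int allocs_to_fail; /* how many upcoming work allocations fail */
	int queued;         /* work items successfully queued */
};

/* Stand-in for nf_flow_offload_add(): false when allocation fails. */
static bool model_offload_add(struct retry_model *m)
{
	if (m->allocs_to_fail > 0) {
		m->allocs_to_fail--;
		return false;
	}
	m->queued++;
	return true;
}

/* Stand-in for the new branch in nf_flow_offload_gc_step(). */
static void model_gc_step(struct retry_model *m)
{
	if (m->flags & MODEL_HW_UPDATE) {
		m->flags &= ~MODEL_HW_UPDATE;
		if (!model_offload_add(m))
			m->flags |= MODEL_HW_UPDATE; /* retry on next gc run */
	}
}

/* Number of gc iterations until the pending update is finally queued. */
static int model_iterations_until_queued(int failing_allocs)
{
	struct retry_model m = {
		.flags = MODEL_HW_UPDATE,
		.allocs_to_fail = failing_allocs,
	};
	int iterations = 0;

	assert(failing_allocs >= 0);
	while (m.flags & MODEL_HW_UPDATE) {
		model_gc_step(&m);
		iterations++;
	}
	assert(m.queued == 1); /* exactly one work item ends up queued */
	return iterations;
}
```

With no allocation failures the update is queued in one gc iteration;
each failed allocation costs one extra iteration, so two failures take
three.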
