netdev.vger.kernel.org archive mirror
From: Pablo Neira Ayuso <pablo@netfilter.org>
To: Oz Shlomo <ozsh@nvidia.com>
Cc: Sven Auhagen <sven.auhagen@voleatech.de>,
	Felix Fietkau <nbd@nbd.name>,
	netdev@vger.kernel.org, netfilter-devel@vger.kernel.org,
	Florian Westphal <fw@strlen.de>, Paul Blakey <paulb@nvidia.com>
Subject: Re: [PATCH net v2] netfilter: nf_flow_table: fix teardown flow timeout
Date: Mon, 16 May 2022 14:13:03 +0200
Message-ID: <YoI/z+aWkmAAycR3@salvia>
In-Reply-To: <YoIt5rHw4Xwl1zgY@salvia>

[-- Attachment #1: Type: text/plain, Size: 2052 bytes --]

On Mon, May 16, 2022 at 12:56:41PM +0200, Pablo Neira Ayuso wrote:
> On Thu, May 12, 2022 at 09:28:03PM +0300, Oz Shlomo wrote:
> > Connections leaving the established state (due to RST / FIN TCP packets)
> > set the flow table teardown flag. The packet path continues to set a lower
> > timeout value as per the new TCP state, but the offload flag remains set.
> >
> > Hence, the conntrack garbage collector may race to undo the timeout
> > adjustment of the packet path, leaving the conntrack entry in place with
> > the internal offload timeout (one day).
> >
> > Avoid the ct gc timeout overwrite by flagging torn-down flowtable
> > connections.
> >
> > On the nftables side we only need to allow established TCP connections to
> > create a flow offload entry. Since we cannot guarantee that
> > flow_offload_teardown is called by a TCP FIN packet, we also need to make
> > sure that flow_offload_fixup_ct is called in flow_offload_del
> > and only fixes up established TCP connections.
> [...]
> > diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
> > index 0164e5f522e8..324fdb62c08b 100644
> > --- a/net/netfilter/nf_conntrack_core.c
> > +++ b/net/netfilter/nf_conntrack_core.c
> > @@ -1477,7 +1477,8 @@ static void gc_worker(struct work_struct *work)
> >  			tmp = nf_ct_tuplehash_to_ctrack(h);
> >  
> >  			if (test_bit(IPS_OFFLOAD_BIT, &tmp->status)) {
> > -				nf_ct_offload_timeout(tmp);
> 
> Hm, it is the trick to avoid checking for IPS_OFFLOAD from the packet
> path that triggers the race, i.e. nf_ct_is_expired().
> 
> The flowtable ct fixup races with the conntrack garbage collector.
> 
> Clearing IPS_OFFLOAD might result in offloading the entry again for
> the closing packets.
> 
> Probably clear IPS_OFFLOAD from teardown, and skip offload if the flow is
> in a TCP state that represents closure?
> 
>   		if (unlikely(!tcph || tcph->fin || tcph->rst))
>   			goto out;
> 
> This is already the intention in the existing code.

I'm attaching an incomplete sketch patch. My goal is to avoid the
extra IPS_ bit.
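
The ordering problem discussed above can be illustrated with a small
userspace model. This is only a sketch under stated assumptions: the
struct, the function names and the closing timeout value are hypothetical
stand-ins, not kernel definitions, and the model is single-threaded, so it
shows the effect of one gc pass after teardown rather than a true
concurrent race.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model values, not the kernel's definitions. */
#define MODEL_OFFLOAD_TIMEOUT 86400u /* "one day", as quoted in the thread */
#define MODEL_CLOSING_TIMEOUT 30u    /* assumed short timeout for a closing flow */

struct model_ct {
	bool offload;     /* stands in for IPS_OFFLOAD_BIT in ct->status */
	uint32_t timeout; /* seconds until the entry may expire */
};

/* Models the gc_worker() branch: an entry still marked as offloaded
 * gets its timeout pushed back to the long offload value. */
void model_gc_visit(struct model_ct *ct)
{
	if (ct->offload)
		ct->timeout = MODEL_OFFLOAD_TIMEOUT;
}

/* Models teardown before the sketch patch: the timeout is lowered,
 * but the offload mark stays set until the flow is deleted. */
void model_teardown_keep_bit(struct model_ct *ct)
{
	ct->timeout = MODEL_CLOSING_TIMEOUT;
}

/* Models teardown as sketched in the attachment: clear the offload
 * mark first, then lower the timeout. */
void model_teardown_clear_bit(struct model_ct *ct)
{
	ct->offload = false;
	ct->timeout = MODEL_CLOSING_TIMEOUT;
}
```

With the mark left set, a later gc visit restores the one-day timeout;
with the mark cleared at teardown, the gc visit leaves the lowered
timeout alone.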

[-- Attachment #2: x.patch --]
[-- Type: text/x-diff, Size: 1804 bytes --]

diff --git a/net/netfilter/nf_flow_table_core.c b/net/netfilter/nf_flow_table_core.c
index 20b4a14e5d4e..7af1e2e8f595 100644
--- a/net/netfilter/nf_flow_table_core.c
+++ b/net/netfilter/nf_flow_table_core.c
@@ -362,8 +362,6 @@ static void flow_offload_del(struct nf_flowtable *flow_table,
 			       &flow->tuplehash[FLOW_OFFLOAD_DIR_REPLY].node,
 			       nf_flow_offload_rhash_params);
 
-	clear_bit(IPS_OFFLOAD_BIT, &flow->ct->status);
-
 	if (nf_flow_has_expired(flow))
 		flow_offload_fixup_ct(flow->ct);
 	else
@@ -375,6 +373,7 @@ static void flow_offload_del(struct nf_flowtable *flow_table,
 void flow_offload_teardown(struct flow_offload *flow)
 {
 	set_bit(NF_FLOW_TEARDOWN, &flow->flags);
+	clear_bit(IPS_OFFLOAD_BIT, &flow->ct->status);
 
 	flow_offload_fixup_ct_state(flow->ct);
 }
diff --git a/net/netfilter/nft_flow_offload.c b/net/netfilter/nft_flow_offload.c
index 187b8cb9a510..7bc56377496c 100644
--- a/net/netfilter/nft_flow_offload.c
+++ b/net/netfilter/nft_flow_offload.c
@@ -273,6 +273,12 @@ static bool nft_flow_offload_skip(struct sk_buff *skb, int family)
 	return false;
 }
 
+static bool flow_offload_teardown_state(const struct ip_ct_tcp *state)
+{
+	return state->state > TCP_CONNTRACK_ESTABLISHED &&
+	       state->state <= TCP_CONNTRACK_CLOSE;
+}
+
 static void nft_flow_offload_eval(const struct nft_expr *expr,
 				  struct nft_regs *regs,
 				  const struct nft_pktinfo *pkt)
@@ -298,7 +304,8 @@ static void nft_flow_offload_eval(const struct nft_expr *expr,
 	case IPPROTO_TCP:
 		tcph = skb_header_pointer(pkt->skb, nft_thoff(pkt),
 					  sizeof(_tcph), &_tcph);
-		if (unlikely(!tcph || tcph->fin || tcph->rst))
+		if (unlikely(!tcph || tcph->fin || tcph->rst ||
+			     flow_offload_teardown_state(&ct->proto.tcp)))
 			goto out;
 		break;
 	case IPPROTO_UDP:
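
The range check in flow_offload_teardown_state() can be tried outside the
kernel with a hand-copied mirror of the TCP conntrack state ordering. This
is a sketch, not the kernel code: the MODEL_ names are hypothetical, and
the enum order is an assumption that should be verified against
include/uapi/linux/netfilter/nf_conntrack_tcp.h.

```c
#include <stdbool.h>

/* Hand-copied mirror of the first values of enum tcp_conntrack.
 * Assumption: the ordering matches the kernel header. */
enum model_tcp_state {
	MODEL_TCP_NONE,
	MODEL_TCP_SYN_SENT,
	MODEL_TCP_SYN_RECV,
	MODEL_TCP_ESTABLISHED,
	MODEL_TCP_FIN_WAIT,
	MODEL_TCP_CLOSE_WAIT,
	MODEL_TCP_LAST_ACK,
	MODEL_TCP_TIME_WAIT,
	MODEL_TCP_CLOSE,
};

/* Same comparison as the sketch patch: true for every state after
 * ESTABLISHED up to and including CLOSE. */
bool model_teardown_state(enum model_tcp_state state)
{
	return state > MODEL_TCP_ESTABLISHED && state <= MODEL_TCP_CLOSE;
}
```

Note that under this ordering the check rejects only the closing states;
SYN_SENT and SYN_RECV fall outside the range and are not filtered by it.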
