netfilter-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* ct hardware offload ignores RST packet
@ 2024-09-23  9:47 Chris Mi
  2024-09-23 10:03 ` Florian Westphal
  0 siblings, 1 reply; 8+ messages in thread
From: Chris Mi @ 2024-09-23  9:47 UTC (permalink / raw)
  To: Pablo Neira Ayuso, Ali Abdallah; +Cc: netfilter-devel

Hi Pablo & Ali,

Our customer reported an issue. I found that it can be reproduced like
this. If the tcp client program sets socketopt linger to 0, when the 
client program exits, RST packet will be sent instead of FIN.

But this RST packet doesn't match the expected sequence, server will
ignore it and the ct entry will be in ESTABLISHED state for 5 days.
It seems like an expected behavior due to commit [1].

We found another commit [2] in recent kernel. We tried to set 
nf_conntrack_tcp_ignore_invalid_rst to 1.
It doesn't work as well. And the commit message is too short. We don't
know what's the usecase for it.

In our case, if we have the following diff, ct will be closed normally:

diff --git a/net/netfilter/nf_conntrack_proto_tcp.c 
b/net/netfilter/nf_conntrack_proto_tcp.c
index ae493599a3ef..04c0e5a86990 100644
--- a/net/netfilter/nf_conntrack_proto_tcp.c
+++ b/net/netfilter/nf_conntrack_proto_tcp.c
@@ -1218,7 +1218,8 @@ int nf_conntrack_tcp_packet(struct nf_conn *ct,
                         /* ... RST sequence number doesn't match 
exactly, keep
                          * established state to allow a possible 
challenge ACK.
                          */
-                       new_state = old_state;
+                       if (!tn->tcp_ignore_invalid_rst)
+                               new_state = old_state;
                 }
                 if (((test_bit(IPS_SEEN_REPLY_BIT, &ct->status)
                          && ct->proto.tcp.last_index == TCP_SYN_SET)

Before I submit it, I'm wondering if you have any suggestion for this
issue and diff?

Thanks,
Chris

[1] netfilter: conntrack: tcp: only close if RST matches exact sequence
[2] netfilter: conntrack: add new sysctl to disable RST check

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: ct hardware offload ignores RST packet
  2024-09-23  9:47 ct hardware offload ignores RST packet Chris Mi
@ 2024-09-23 10:03 ` Florian Westphal
  2024-09-23 15:47   ` Chris Mi
  0 siblings, 1 reply; 8+ messages in thread
From: Florian Westphal @ 2024-09-23 10:03 UTC (permalink / raw)
  To: Chris Mi; +Cc: Pablo Neira Ayuso, Ali Abdallah, netfilter-devel

Chris Mi <cmi@nvidia.com> wrote:
> Hi Pablo & Ali,
> 
> Our customer reported an issue. I found that it can be reproduced like
> this. If the tcp client program sets socketopt linger to 0, when the client
> program exits, RST packet will be sent instead of FIN.
> 
> But this RST packet doesn't match the expected sequence, server will
> ignore it and the ct entry will be in ESTABLISHED state for 5 days.
> It seems like an expected behavior due to commit [1].
> 
> We found another commit [2] in recent kernel. We tried to set
> nf_conntrack_tcp_ignore_invalid_rst to 1.
> It doesn't work as well. And the commit message is too short. We don't
> know what's the usecase for it.
> 
> In our case, if we have the following diff, ct will be closed normally:
> 
> diff --git a/net/netfilter/nf_conntrack_proto_tcp.c
> b/net/netfilter/nf_conntrack_proto_tcp.c
> index ae493599a3ef..04c0e5a86990 100644
> --- a/net/netfilter/nf_conntrack_proto_tcp.c
> +++ b/net/netfilter/nf_conntrack_proto_tcp.c
> @@ -1218,7 +1218,8 @@ int nf_conntrack_tcp_packet(struct nf_conn *ct,
>                         /* ... RST sequence number doesn't match exactly,
> keep
>                          * established state to allow a possible challenge
> ACK.
>                          */
> -                       new_state = old_state;
> +                       if (!tn->tcp_ignore_invalid_rst)
> +                               new_state = old_state;

Can you test if a call to
nf_tcp_handle_invalid() here resolves the problem as well?
Intent would be to reduce timeout but keep connecton state
as-is.

I don't think we should force customers to tweak sysctls to
make expiry work as intended.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: ct hardware offload ignores RST packet
  2024-09-23 10:03 ` Florian Westphal
@ 2024-09-23 15:47   ` Chris Mi
  2024-09-23 16:51     ` Florian Westphal
  0 siblings, 1 reply; 8+ messages in thread
From: Chris Mi @ 2024-09-23 15:47 UTC (permalink / raw)
  To: Florian Westphal; +Cc: Pablo Neira Ayuso, Ali Abdallah, netfilter-devel

On 9/23/2024 6:03 PM, Florian Westphal wrote:
> Chris Mi <cmi@nvidia.com> wrote:
>> Hi Pablo & Ali,
>>
>> Our customer reported an issue. I found that it can be reproduced like
>> this. If the tcp client program sets socketopt linger to 0, when the client
>> program exits, RST packet will be sent instead of FIN.
>>
>> But this RST packet doesn't match the expected sequence, server will
>> ignore it and the ct entry will be in ESTABLISHED state for 5 days.
>> It seems like an expected behavior due to commit [1].
>>
>> We found another commit [2] in recent kernel. We tried to set
>> nf_conntrack_tcp_ignore_invalid_rst to 1.
>> It doesn't work as well. And the commit message is too short. We don't
>> know what's the usecase for it.
>>
>> In our case, if we have the following diff, ct will be closed normally:
>>
>> diff --git a/net/netfilter/nf_conntrack_proto_tcp.c
>> b/net/netfilter/nf_conntrack_proto_tcp.c
>> index ae493599a3ef..04c0e5a86990 100644
>> --- a/net/netfilter/nf_conntrack_proto_tcp.c
>> +++ b/net/netfilter/nf_conntrack_proto_tcp.c
>> @@ -1218,7 +1218,8 @@ int nf_conntrack_tcp_packet(struct nf_conn *ct,
>>                          /* ... RST sequence number doesn't match exactly,
>> keep
>>                           * established state to allow a possible challenge
>> ACK.
>>                           */
>> -                       new_state = old_state;
>> +                       if (!tn->tcp_ignore_invalid_rst)
>> +                               new_state = old_state;
> 
> Can you test if a call to
> nf_tcp_handle_invalid() here resolves the problem as well?
> Intent would be to reduce timeout but keep connecton state
> as-is.
> 
> I don't think we should force customers to tweak sysctls to
> make expiry work as intended.

It doesn't work. The if statement is not executed because the condition
is not met.

[Mon Sep 23 18:41:59 2024] nf_tcp_handle_invalid: 756, last_dir: 0, dir: 
0, last_index: 3

Even if the if statement is executed, the timeout is still not changed.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: ct hardware offload ignores RST packet
  2024-09-23 15:47   ` Chris Mi
@ 2024-09-23 16:51     ` Florian Westphal
  2024-09-23 17:41       ` Pablo Neira Ayuso
  2024-09-24  1:03       ` Chris Mi
  0 siblings, 2 replies; 8+ messages in thread
From: Florian Westphal @ 2024-09-23 16:51 UTC (permalink / raw)
  To: Chris Mi; +Cc: Florian Westphal, Pablo Neira Ayuso, Ali Abdallah,
	netfilter-devel

Chris Mi <cmi@nvidia.com> wrote:
> > nf_tcp_handle_invalid() here resolves the problem as well?
> > Intent would be to reduce timeout but keep connecton state
> > as-is.
> > 
> > I don't think we should force customers to tweak sysctls to
> > make expiry work as intended.
> 
> It doesn't work. The if statement is not executed because the condition
> is not met.
> 
> [Mon Sep 23 18:41:59 2024] nf_tcp_handle_invalid: 756, last_dir: 0, dir: 0,
> last_index: 3

How about relaxing nf_tcp_handle_invalid() to no longer check dir and
last_index?

It already makes sure that timeout can only be reduced by such invalid
fin/rst.

I.e. also get rid of else clause and extra indent level.

> Even if the if statement is executed, the timeout is still not changed.

Hmm, why not? Can you elaborate? Is the timeout already below 2 minutes?
If so, what is the exact expectation?

Could you propose a patch? As I said, I dislike tying this to sysctls.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: ct hardware offload ignores RST packet
  2024-09-23 16:51     ` Florian Westphal
@ 2024-09-23 17:41       ` Pablo Neira Ayuso
  2024-09-24  1:04         ` Chris Mi
  2024-09-24  1:03       ` Chris Mi
  1 sibling, 1 reply; 8+ messages in thread
From: Pablo Neira Ayuso @ 2024-09-23 17:41 UTC (permalink / raw)
  To: Florian Westphal; +Cc: Chris Mi, Ali Abdallah, netfilter-devel

On Mon, Sep 23, 2024 at 06:51:15PM +0200, Florian Westphal wrote:
> Could you propose a patch? As I said, I dislike tying this to sysctls.

I prefer to remove the sysctl too and try to handle this via the
invalid routine handling as Florian suggests.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: ct hardware offload ignores RST packet
  2024-09-23 16:51     ` Florian Westphal
  2024-09-23 17:41       ` Pablo Neira Ayuso
@ 2024-09-24  1:03       ` Chris Mi
  2024-09-24 19:11         ` Florian Westphal
  1 sibling, 1 reply; 8+ messages in thread
From: Chris Mi @ 2024-09-24  1:03 UTC (permalink / raw)
  To: Florian Westphal; +Cc: Pablo Neira Ayuso, Ali Abdallah, netfilter-devel

On 9/24/2024 12:51 AM, Florian Westphal wrote:
> Chris Mi <cmi@nvidia.com> wrote:
>>> nf_tcp_handle_invalid() here resolves the problem as well?
>>> Intent would be to reduce timeout but keep connecton state
>>> as-is.
>>>
>>> I don't think we should force customers to tweak sysctls to
>>> make expiry work as intended.
>>
>> It doesn't work. The if statement is not executed because the condition
>> is not met.
>>
>> [Mon Sep 23 18:41:59 2024] nf_tcp_handle_invalid: 756, last_dir: 0, dir: 0,
>> last_index: 3
> 
> How about relaxing nf_tcp_handle_invalid() to no longer check dir and
> last_index?

Yes, I did that. I removed the check. The timeout value is 1 day.
I remember it should be 5 days. Not sure what changed.

> 
> It already makes sure that timeout can only be reduced by such invalid
> fin/rst.
> 
> I.e. also get rid of else clause and extra indent level.
> 
>> Even if the if statement is executed, the timeout is still not changed.
> 
> Hmm, why not? Can you elaborate? Is the timeout already below 2 minutes?

As I mentioned above, the timeout is 1 day.

> If so, what is the exact expectation?
> 
> Could you propose a patch? As I said, I dislike tying this to sysctls.

Sure. I will add more debug log to understand the function
nf_tcp_handle_invalid() and propose a patch.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: ct hardware offload ignores RST packet
  2024-09-23 17:41       ` Pablo Neira Ayuso
@ 2024-09-24  1:04         ` Chris Mi
  0 siblings, 0 replies; 8+ messages in thread
From: Chris Mi @ 2024-09-24  1:04 UTC (permalink / raw)
  To: Pablo Neira Ayuso, Florian Westphal; +Cc: Ali Abdallah, netfilter-devel



On 9/24/2024 1:41 AM, Pablo Neira Ayuso wrote:
> On Mon, Sep 23, 2024 at 06:51:15PM +0200, Florian Westphal wrote:
>> Could you propose a patch? As I said, I dislike tying this to sysctls.
> 
> I prefer to remove the sysctl too and try to handle this via the
> invalid routine handling as Florian suggests.

Sure. I will propose a patch to call nf_tcp_handle_invalid().

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: ct hardware offload ignores RST packet
  2024-09-24  1:03       ` Chris Mi
@ 2024-09-24 19:11         ` Florian Westphal
  0 siblings, 0 replies; 8+ messages in thread
From: Florian Westphal @ 2024-09-24 19:11 UTC (permalink / raw)
  To: Chris Mi; +Cc: Florian Westphal, Pablo Neira Ayuso, Ali Abdallah,
	netfilter-devel

Chris Mi <cmi@nvidia.com> wrote:
> > If so, what is the exact expectation?
> > 
> > Could you propose a patch? As I said, I dislike tying this to sysctls.
> 
> Sure. I will add more debug log to understand the function
> nf_tcp_handle_invalid() and propose a patch.

Thanks!  Earlier this year I had a look at conntrack <-> flowtable
plumbing and found multiple issues.  Not sure they are related, I am
recovering the patches that I made back then and will post them asap
after they cleared basic testing.

I'll place you on CC.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2024-09-24 19:11 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-23  9:47 ct hardware offload ignores RST packet Chris Mi
2024-09-23 10:03 ` Florian Westphal
2024-09-23 15:47   ` Chris Mi
2024-09-23 16:51     ` Florian Westphal
2024-09-23 17:41       ` Pablo Neira Ayuso
2024-09-24  1:04         ` Chris Mi
2024-09-24  1:03       ` Chris Mi
2024-09-24 19:11         ` Florian Westphal

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).