From: Pablo Neira Ayuso <pablo@netfilter.org>
To: Tula Kraiser <tkraiser@arista.com>
Cc: netfilter-devel@vger.kernel.org
Subject: Re: Avoid race between tcp_packet packet processing and timeout set by a netfilter CTA_TIMEOUT message
Date: Fri, 18 Nov 2022 16:13:53 +0100
Message-ID: <Y3ehMVHetV5Vx7R3@salvia>
In-Reply-To: <CAKh0D7xP9rmwes4zjwDAYvrB706Au3aLvfA25NV0+sYR17+-NQ@mail.gmail.com>

Hi,

On Thu, Nov 10, 2022 at 10:03:43AM -0800, Tula Kraiser wrote:
> Hello,
>
> We have been using the nat netfilter module to create NAT translations
> and then we offload the translations to our hardware. Once the
> translation is offloaded to hardware we expect only FIN and RST to be
> received by the linux stack. Once we finish programming the hardware
> we send a NETLINK message to the kernel setting the entry timeout to a
> larger value (we use the CTA_TIMEOUT for that). That's because we rely
> on hardware hitbit to indicate when the entry should be removed due to
> inactivity.

This sounds very much like the flowtable infrastructure [1]; TCP FIN
and RST handling is done in a similar way there.

> Unfortunately there is a delay between receiving the notification of
> the translation (we subscribe to netfilter conntrack events for that)
> and the time we program the hardware where packets still make it into
> the kernel input queue. There is a race between the CTA_TIMEOUT
> message and the queue packets where the kernel can replace the timeout
> with its default values leading to the entry being removed
> prematurely.
>
>
> To avoid that we are proposing introducing a new attribute to the
> CTA_PROTOINFO for TCP where we set the IPS_FIXED_TIMEOUT_FLAG on the
> conntrack entry if the conntrack TCP state is less than or equal to
> TCP_ESTABLISHED. That takes care of the race. We are modifying the
> tcp_packet routine to reset the IPS_FIXED_TIMEOUT_FLAG when the TCP
> state moves past the established state so FINs can be processed correctly.
>
>
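
To make sure I read the race and the proposal correctly, here is a toy
model in plain C. It is only a sketch of the logic described above, not
kernel code: every toy_* name, the state enum and the timeout values
are made up for illustration, and the fixed_timeout field merely stands
in for the flag the proposal calls IPS_FIXED_TIMEOUT_FLAG.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not kernel definitions. */
enum toy_tcp_state {
	TOY_SYN_SENT,
	TOY_SYN_RECV,
	TOY_ESTABLISHED,
	TOY_FIN_WAIT,
	TOY_CLOSE,
};

/* Arbitrary values in seconds, chosen only to make the effect visible. */
#define TOY_DEFAULT_ESTABLISHED	300u
#define TOY_DEFAULT_CLOSING	120u
#define TOY_OFFLOAD_TIMEOUT	86400u

struct toy_conn {
	enum toy_tcp_state state;
	uint32_t timeout;	/* seconds until the entry expires */
	bool fixed_timeout;	/* models the proposed "pinned" flag */
};

/* Models the per-packet path: a packet normally resets the timeout to
 * the default for the new state. With the proposed change, a pinned
 * entry keeps its timeout until the state moves past established. */
static void toy_tcp_packet(struct toy_conn *ct, enum toy_tcp_state new_state)
{
	ct->state = new_state;

	/* Proposed: drop the pin once the connection starts closing. */
	if (new_state > TOY_ESTABLISHED)
		ct->fixed_timeout = false;

	if (ct->fixed_timeout)
		return;		/* keep the timeout userspace asked for */

	ct->timeout = new_state > TOY_ESTABLISHED ?
		TOY_DEFAULT_CLOSING : TOY_DEFAULT_ESTABLISHED;
}

/* Models the CTA_TIMEOUT update from userspace. With pin set, it also
 * models the proposed attribute: pin only up to the established state. */
static void toy_set_timeout(struct toy_conn *ct, uint32_t timeout, bool pin)
{
	ct->timeout = timeout;
	if (pin && ct->state <= TOY_ESTABLISHED)
		ct->fixed_timeout = true;
}

/* Replays the race: the timeout update arrives first, then a packet
 * that was already queued is processed. Returns the final timeout. */
static uint32_t toy_race_demo(bool pin)
{
	struct toy_conn ct = {
		.state = TOY_ESTABLISHED,
		.timeout = TOY_DEFAULT_ESTABLISHED,
		.fixed_timeout = false,
	};

	toy_set_timeout(&ct, TOY_OFFLOAD_TIMEOUT, pin);
	toy_tcp_packet(&ct, TOY_ESTABLISHED);	/* late queued packet */
	return ct.timeout;
}
```

In this model toy_race_demo(false) ends with the default timeout again
(the premature expiry you describe), while toy_race_demo(true) keeps
the larger value. A later FIN clears the pin, so the closing timeout
applies as usual.
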
> Does this sound like a reasonable solution to the problem or are there
> better suggestions? Does this sound like an interesting patch to push
> upstream?

Several drivers already support hardware offload through the flowtable
infrastructure.

Someone contributed more detailed documentation about the flowtable [2].

The code is in the net/netfilter/nf_flow_table*.c files.

[1] https://docs.kernel.org/networking/nf_flowtable.html
[2] https://thermalcircle.de/doku.php?id=blog:linux:flowtables_1_a_netfilter_nftables_fastpath