* Avoid race between tcp_packet packet processing and timeout set by a netfilter CTA_TIMEOUT message
@ 2022-11-10 18:03 Tula Kraiser
From: Tula Kraiser @ 2022-11-10 18:03 UTC
  To: netfilter-devel
Hello,
We have been using the netfilter NAT module to create NAT translations,
which we then offload to our hardware. Once a translation is offloaded,
we expect only FIN and RST packets to reach the Linux stack. After we
finish programming the hardware, we send a netlink message to the
kernel that sets the entry timeout to a larger value (using the
CTA_TIMEOUT attribute). We do this because we rely on the hardware hit
bit to indicate when the entry should be removed due to inactivity.
Unfortunately, there is a delay between receiving the notification of
the translation (we subscribe to netfilter conntrack events for that)
and the time we program the hardware, during which packets still make
it into the kernel input queue. There is a race between the CTA_TIMEOUT
message and the queued packets: the kernel can replace the timeout
with its default value, leading to the entry being removed
prematurely.
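The ordering problem described above can be illustrated with a small Python model. This is only a sketch, not kernel code: the `Entry` class, the function names, and both timeout values are hypothetical and chosen purely for illustration. It shows that the final timeout depends on whether the CTA_TIMEOUT update or the queued packet is handled last.

```python
from dataclasses import dataclass

# Illustrative values only; neither number comes from the thread.
DEFAULT_TIMEOUT = 300       # what packet processing would (re)apply
OFFLOAD_TIMEOUT = 86_400    # the "larger value" sent via CTA_TIMEOUT


@dataclass
class Entry:
    """Hypothetical stand-in for a conntrack entry."""
    timeout: int = DEFAULT_TIMEOUT


def set_timeout_from_userspace(entry: Entry, seconds: int) -> None:
    """Models the netlink update carrying CTA_TIMEOUT."""
    entry.timeout = seconds


def process_queued_packet(entry: Entry) -> None:
    """Models packet processing refreshing the timeout to its default."""
    entry.timeout = DEFAULT_TIMEOUT


def run(events: list[str]) -> int:
    """Apply events in order and return the resulting timeout."""
    entry = Entry()
    for event in events:
        if event == "cta_timeout":
            set_timeout_from_userspace(entry, OFFLOAD_TIMEOUT)
        elif event == "packet":
            process_queued_packet(entry)
        else:
            raise ValueError(f"unknown event: {event}")
    return entry.timeout


# Packet handled first: the large timeout survives.
packet_first = run(["packet", "cta_timeout"])
# Packet handled after the netlink update: the default overwrites it.
packet_last = run(["cta_timeout", "packet"])
print(packet_first, packet_last)
```

In the second ordering the entry ends up with the short default timeout even though userspace asked for the long one, which is the premature removal described above.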
To avoid that, we propose introducing a new attribute in
CTA_PROTOINFO for TCP that sets the IPS_FIXED_TIMEOUT_FLAG on the
conntrack entry if the conntrack TCP state is less than or equal to
TCP_ESTABLISHED. That takes care of the race. We are also modifying the
tcp_packet routine to reset the IPS_FIXED_TIMEOUT_FLAG when the TCP
state moves past the established state, so FINs can be processed correctly.
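The proposed behaviour can be sketched as a small Python model. This is an assumption-laden illustration, not the actual patch: the `State` enum is a reduced subset of TCP states, and `Entry`, `offload`, `tcp_packet`, and all timeout values are hypothetical names and numbers. The flag is set only while the state is at or below established, packet processing leaves the timeout alone while the flag is set, and the flag is cleared once the state moves past established.

```python
from dataclasses import dataclass
from enum import IntEnum


class State(IntEnum):
    """Reduced, illustrative subset of TCP tracking states."""
    SYN_SENT = 1
    SYN_RECV = 2
    ESTABLISHED = 3
    FIN_WAIT = 4


# Illustrative per-state timeouts in seconds (assumed values).
DEFAULT_TIMEOUT = {
    State.SYN_SENT: 120,
    State.SYN_RECV: 60,
    State.ESTABLISHED: 300,
    State.FIN_WAIT: 120,
}
OFFLOAD_TIMEOUT = 86_400


@dataclass
class Entry:
    """Hypothetical stand-in for a conntrack entry."""
    state: State = State.ESTABLISHED
    timeout: int = DEFAULT_TIMEOUT[State.ESTABLISHED]
    fixed_timeout: bool = False


def offload(entry: Entry, seconds: int) -> None:
    """Models the netlink update: set the timeout and, if the state is
    at or below established, also set the fixed-timeout flag."""
    entry.timeout = seconds
    if entry.state <= State.ESTABLISHED:
        entry.fixed_timeout = True


def tcp_packet(entry: Entry, new_state: State) -> None:
    """Models packet processing: clear the flag once the state moves
    past established, and refresh the timeout only if the flag is clear."""
    if new_state > State.ESTABLISHED:
        entry.fixed_timeout = False
    entry.state = new_state
    if not entry.fixed_timeout:
        entry.timeout = DEFAULT_TIMEOUT[new_state]


entry = Entry()
offload(entry, OFFLOAD_TIMEOUT)
tcp_packet(entry, State.ESTABLISHED)   # late queued packet
timeout_after_late_packet = entry.timeout
tcp_packet(entry, State.FIN_WAIT)      # FIN arrives later
timeout_after_fin = entry.timeout
print(timeout_after_late_packet, timeout_after_fin)
```

In this model the late packet no longer overwrites the offload timeout, while a later FIN still brings the entry back under the normal per-state timeouts.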
Does this sound like a reasonable solution to the problem or are there
better suggestions? Does this sound like an interesting patch to push
upstream?
Thanks,
Tula
* Re: Avoid race between tcp_packet packet processing and timeout set by a netfilter CTA_TIMEOUT message
@ 2022-11-18 15:13 ` Pablo Neira Ayuso
From: Pablo Neira Ayuso @ 2022-11-18 15:13 UTC
  To: Tula Kraiser; +Cc: netfilter-devel
Hi,
On Thu, Nov 10, 2022 at 10:03:43AM -0800, Tula Kraiser wrote:
> Hello,
> 
> We have been using the netfilter NAT module to create NAT translations,
> which we then offload to our hardware. Once a translation is offloaded,
> we expect only FIN and RST packets to reach the Linux stack. After we
> finish programming the hardware, we send a netlink message to the
> kernel that sets the entry timeout to a larger value (using the
> CTA_TIMEOUT attribute). We do this because we rely on the hardware hit
> bit to indicate when the entry should be removed due to inactivity.
This sounds very much like the flowtable infrastructure [1]; TCP FIN
and RST handling is done in a similar way there.
> Unfortunately, there is a delay between receiving the notification of
> the translation (we subscribe to netfilter conntrack events for that)
> and the time we program the hardware, during which packets still make
> it into the kernel input queue. There is a race between the CTA_TIMEOUT
> message and the queued packets: the kernel can replace the timeout
> with its default value, leading to the entry being removed
> prematurely.
> 
> 
> To avoid that, we propose introducing a new attribute in
> CTA_PROTOINFO for TCP that sets the IPS_FIXED_TIMEOUT_FLAG on the
> conntrack entry if the conntrack TCP state is less than or equal to
> TCP_ESTABLISHED. That takes care of the race. We are also modifying the
> tcp_packet routine to reset the IPS_FIXED_TIMEOUT_FLAG when the TCP
> state moves past the established state, so FINs can be processed correctly.
> 
> 
> Does this sound like a reasonable solution to the problem or are there
> better suggestions? Does this sound like an interesting patch to push
> upstream?
There are already several drivers that support hardware offload
through this infrastructure.
Someone contributed more detailed documentation about the flowtable [2].
The code can be found in the net/netfilter/nf_flow_table*.c files.
[1] https://docs.kernel.org/networking/nf_flowtable.html
[2] https://thermalcircle.de/doku.php?id=blog:linux:flowtables_1_a_netfilter_nftables_fastpath