From: Patrick McHardy <kaber@trash.net>
To: Juneja Kapil <Kapil.Juneja@freescale.com>
Cc: netfilter-devel@vger.kernel.org,
	Medve Emilian <Emilian.Medve@freescale.com>
Subject: Re: [PATCH] nf_conntrack_core: Updated nf_conntrack to destroy/refresh conn irrespective of del_timer status
Date: Wed, 27 Feb 2008 14:00:56 +0100
Message-ID: <47C55F08.9070905@trash.net>
In-Reply-To: <2A6F278C5B66C4459AF4013E77A40CD30122F288@zin33exm20.fsl.freescale.net>

Juneja Kapil wrote:
>> -----Original Message-----
>> From: Patrick McHardy [mailto:kaber@trash.net] 
>> Sent: Monday, February 25, 2008 5:41 PM
>> To: Juneja Kapil
>> Cc: netfilter-devel@vger.kernel.org; Medve Emilian
>> Subject: Re: [PATCH] nf_conntrack_core: Updated nf_conntrack 
>> to destroy/refresh conn irrespective of del_timer status
>>
>> Kapil Juneja wrote:
>>> Currently NF_CONNTRACK assumes that a running timer is present
>>> before refreshing the connection or destroying it. This may not be
>>> the case when, for example, another forwarding engine hooks up to
>>> it to listen to new connections but disables the NF_CONNTRACK timer
>>> in order to have more control. In such a scenario, only control
>>> packets may be terminated to NF_CONNTRACK for it to decode and
>>> update the connection status. It will not impact the present
>>> scenario of kernel forwarding without the aid of any forwarding
>>> engine.
>>
>> Do you have a pointer to the code you're talking about?
> 
> The forwarding engine concept is the same as the Grand Unified Flow Cache
> idea mooted by Rusty Russell some time back:
> http://lwn.net/Articles/194443/
> 
> Our architecture runs on three components - Linux NF_CONNTRACK, a
> Control (Linux) Module(CM), and a Forwarding Engine or Flow Caching
> system (FC):
> 1) The CM registers itself to the NF_CONNTRACK notifier chain. Whenever
> an assured connection notifier event is received, it extracts all the
> relevant tuple parameters (Src IP, Dst IP, Protocol, Src Port, Dst Port
> etc.) and caches them to the FC. Due to reasons mentioned subsequently,
> it also disables the timer for the said conntrack object. The object,
> however, remains in the conntrack list as long as it is not destroyed.
> 3) The FC, sitting at the ethernet driver level, sends all the data
> packets belonging to the cached connection directly to the outbound port
> (as identified in step 2), bypassing the Linux stack altogether. All the
> packets not belonging to either of the cached tuples are terminated to
> the Linux stack. Also TCP control packets FIN/RST/SYN are terminated
> irrespective of whether the connection is cached or not.
> 4) With assistance from the FC, the CM also runs aging on the cached
> connections (hence the requirement of deleting the NF_CONNTRACK timer in
> step 2)
> 5) Cached connections can be terminated (i.e., removed from cache) in two
> ways:
> 	i) Aging out by the CM: In this scenario, the CM removes the
> connection tuple from FC as well as NF_CONNTRACK by calling the
> corresponding timer destroy function directly.
> 	ii) Destroy via TCP control packet: All the FIN-ACK, RST,
> RST-ACK packets are sent to conntrack irrespective of the fact that they
> match a cached tuple. They are picked up by the TCP conntrack module
> which restarts the accounting and refreshes the connection state. It is
> at this point that the first chunk of this patch comes into picture. 
> 6) When the NF_CONNTRACK module is removed, it iterates through the list
> to destroy the detected connections. Currently, it does not remove those
> connections whose timers have gone off (which is the case with
> connections cached to FC). This is fixed by the second chunk of the
> patch.


That sounds pretty reasonable. Is that code available somewhere?

>>> +		if (newtime - ct->timeout.expires >= HZ) {
>>> +			/*
>>> +			 * The timer could have already been deleted
>>> +			 * while still alive (for example connection
>>> +			 * offloaded to a forwarding module other than
>>> +			 * the kernel stack).
>>> +			 */
>>> +			mod_timer(&ct->timeout, newtime);
>>>  			event = IPCT_REFRESH;
>> This adds a race: we don't want to update the timer if it already
>> went off, since that means the connection is already destroyed.
>> Same problem with the other chunk.
>>
> 
> A timer call would have invalidated the conntrack by a call to
> 'death_by_timeout' (or similar such routine), thereby rendering this
> check redundant.  Theoretically, I think the check is irrelevant unless
> a hypothetical timeout doesn't really invalidate the conntrack. Can you
> describe the race scenario mentioned by you?

Very simple:

CPU0					CPU1
					timer goes off
refresh_timer: mod_timer, rearm		death_by_timeout()

timer goes off again

Using del_timer prevents us from rearming the timer if it
already went off.
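
To make the difference concrete, here is a minimal user-space sketch in C.
It is not kernel code: sim_timer, sim_del_timer(), sim_arm_timer() and
sim_fire() are hypothetical stand-ins for struct timer_list, del_timer(),
add_timer()/mod_timer() and the timer expiring. The only property it relies
on is that del_timer() reports whether the timer was still pending.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct timer_list (illustration only). */
struct sim_timer {
	bool pending;          /* armed and not yet fired */
	unsigned long expires; /* expiry time, in arbitrary ticks */
};

/* Like del_timer(): deactivate, report whether it was still pending. */
static bool sim_del_timer(struct sim_timer *t)
{
	bool was_pending = t->pending;

	t->pending = false;
	return was_pending;
}

/* Like add_timer()/mod_timer(): (re)arm unconditionally. */
static void sim_arm_timer(struct sim_timer *t, unsigned long expires)
{
	t->expires = expires;
	t->pending = true;
}

/* The timer goes off; its handler would now destroy the owning object. */
static void sim_fire(struct sim_timer *t)
{
	t->pending = false;
}

/* Guarded refresh: rearm only if we won the race against expiry. */
static bool refresh_guarded(struct sim_timer *t, unsigned long newtime)
{
	if (!sim_del_timer(t))
		return false; /* already fired, object is going away */
	sim_arm_timer(t, newtime);
	return true;
}

/* Unguarded refresh: rearms even after the timer has gone off. */
static bool refresh_unguarded(struct sim_timer *t, unsigned long newtime)
{
	sim_arm_timer(t, newtime);
	return true;
}

/* Replay the sequence above: the timer fires first, then the refresh runs.
 * Returns true if the timer ends up armed again. */
static bool timer_armed_after_race(bool guarded)
{
	struct sim_timer t = { .pending = true, .expires = 100 };

	sim_fire(&t);
	if (guarded)
		refresh_guarded(&t, 200);
	else
		refresh_unguarded(&t, 200);
	return t.pending;
}
```

timer_armed_after_race(true) leaves the timer disarmed, while
timer_armed_after_race(false) leaves it armed a second time, which is the
"timer goes off again" case on an object that is already being destroyed.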
