* Bug with conntracks created arbitrarily through netlink
@ 2008-09-24 17:43 Luca Landi
2008-09-24 18:00 ` Patrick McHardy
0 siblings, 1 reply; 4+ messages in thread
From: Luca Landi @ 2008-09-24 17:43 UTC (permalink / raw)
To: netfilter-devel
Hello everybody,
I think I have found a bug with conntracks created by userspace through
netlink. The bug resides in the kernel.
Basically, when you create a new conntrack through netlink, you normally
do so in advance. That is, before any actual packet of that connection
arrives (or goes out). (Sidenote: in fact honestly I can't think of
situations where one would want to manually create a conntrack truly
after the connection to be tracked was already in progress, and also I
guess that that would be anyway impossible as such a conntrack would
have been created automatically by the kernel).
Anyway, as such manual in-advance creation (through netlink) does not go
through the L4 initialization normally done by l4proto->new() (as well
as all the other things done by init_conntrack()), this at least in the
case of a TCP connection does affect tcp_in_window() in that when it
sees the first ever packet for that connection (normally the SYN),
tcp_in_window() regardlessly goes through the
"i-am-in-the-middle-of-a-connection" codepath, not syncing properly with
the window. The outcome is that after some kbytes of data transferred,
tcp_in_window() starts rejecting packets. In case of nat-ed connections
this also means that packets won't be nat-ed any more. Seen all this
happening live myself.
I already went through the analysis of the relevant kernel code, spotted
what happens exactly at every stage, and found (and tested successfully)
a possible fix. Just wondered whether you knew this already and had a
good argument for it. I know about the "be-liberal-with-windows" flag,
which might be considered as a ready workaround, but I thought we could
and should do better for this case. What do you think?
PS: I'm using ubuntu 8.04 and working with its official kernel. At a
superficial glance at the latest stable from kernel.org it seems to me
that the problem is still there. Blame me if I am mistaken.
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Bug with conntracks created arbitrarily through netlink
2008-09-24 17:43 Bug with conntracks created arbitrarily through netlink Luca Landi
@ 2008-09-24 18:00 ` Patrick McHardy
[not found] ` <1222287997.6546.22.camel@quasar>
0 siblings, 1 reply; 4+ messages in thread
From: Patrick McHardy @ 2008-09-24 18:00 UTC (permalink / raw)
To: Luca Landi; +Cc: netfilter-devel
Luca Landi wrote:
> Hello everybody,
>
> I think I have found a bug with conntracks created by userspace through
> netlink. The bug resides in the kernel.
>
> Basically, when you create a new conntrack through netlink, you normally
> do so in advance. That is, before any actual packet of that connection
> arrives (or goes out). (Sidenote: in fact honestly I can't think of
> situations where one would want to manually create a conntrack truly
> after the connection to be tracked was already in progress, and also I
> guess that that would be anyway impossible as such a conntrack would
> have been created automatically by the kernel).
>
> Anyway, as such manual in-advance creation (through netlink) does not go
> through the L4 initialization normally done by l4proto->new() (as well
> as all the other things done by init_conntrack()), this at least in the
> case of a TCP connection does affect tcp_in_window() in that when it
> sees the first ever packet for that connection (normally the SYN),
> tcp_in_window() regardlessly goes through the
> "i-am-in-the-middle-of-a-connection" codepath, not syncing properly with
> the window. The outcome is that after some kbytes of data transferred,
> tcp_in_window() starts rejecting packets. In case of nat-ed connections
> this also means that packets won't be nat-ed any more. Seen all this
> happening live myself.
>
> I already went through the analysis of the relevant kernel code, spotted
> what happens exactly at every stage, and found (and tested successfully)
> a possible fix. Just wondered whether you knew this already and had a
> good argument for it. I know about the "be-liberal-with-windows" flag,
> which might be considered as a ready workaround, but I thought we could
> and should do better for this case. What do you think?
>
> PS: I'm using ubuntu 8.04 and working with its official kernel. At a
> superficial glance at the latest stable from kernel.org it seems to me
> that the problem is still there. Blame me if I am mistaken.
We're automatically enabling the be-liberal logic for picked up
connections nowadays, so you shouldn't get out of window packets.
Which kernel does ubuntu 8.04 use?
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Bug with conntracks created arbitrarily through netlink
[not found] ` <1222287997.6546.22.camel@quasar>
@ 2008-09-24 20:41 ` Patrick McHardy
2008-09-25 8:56 ` Luca Landi
0 siblings, 1 reply; 4+ messages in thread
From: Patrick McHardy @ 2008-09-24 20:41 UTC (permalink / raw)
To: Luca Landi, Netfilter Development Mailinglist
Don't remove netfilter-devel from the CC list please.
Luca Landi wrote:
> Il giorno mer, 24/09/2008 alle 20.00 +0200, Patrick McHardy ha scritto:
>
>> We're automatically enabling the be-liberal logic for picked up
>> connections nowadays,
>
> Currently (as of 2.6.26.5) as well as on the ubuntu's kernel that's done
> only by tcp_new(), not by tcp_in_window()
Indeed. You're able to specify that flag from userspace though.
> However, my point is that in case of a manually created conntrack we
> could avoid enabling the be-liberal logic, because the subsystem _will_
> see the true first packet of the tracked connection eventually (the SYN
> in case of a tcp stream, but conceptually speaking the equivalent should
> apply to any proto), and thus should be able to set up proper tracking.
> Am I wrong?
No, thats correct. However the structure of the code doesn't allow
to do that easily since the ->new function is only called when
initializing a new conntrack at runtime. It might be possible to
move invocation up to resolve_normal_ct and make it dependant on
the connection state, it mainly depends on whether the other
functions called during initialization need that state from ->new.
They should not I think, but I haven't checked. Then you could also
invoke it based on some other condition controlable through ctnetlink.
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Bug with conntracks created arbitrarily through netlink
2008-09-24 20:41 ` Patrick McHardy
@ 2008-09-25 8:56 ` Luca Landi
0 siblings, 0 replies; 4+ messages in thread
From: Luca Landi @ 2008-09-25 8:56 UTC (permalink / raw)
To: Patrick McHardy; +Cc: Netfilter Development Mailinglist
Il giorno mer, 24/09/2008 alle 22.41 +0200, Patrick McHardy ha scritto:
> Don't remove netfilter-devel from the CC list please.
Sorry, I hit "reply to sender" instead of "reply to all"
> > However, my point is that in case of a manually created conntrack we
> > could avoid enabling the be-liberal logic, because the subsystem _will_
> > see the true first packet of the tracked connection eventually (the SYN
> > in case of a tcp stream, but conceptually speaking the equivalent should
> > apply to any proto), and thus should be able to set up proper tracking.
> > Am I wrong?
>
> No, thats correct. However the structure of the code doesn't allow
> to do that easily since the ->new function is only called when
> initializing a new conntrack at runtime. It might be possible to
> move invocation up to resolve_normal_ct and make it dependant on
> the connection state,
Yes, I've done precisely that. Split init_conntrack in two separate
functions: create_conntrack and setup_conntrack. Where was
init_conntrack now lies create_conntrack which only does alloc and
insertion into the unconfirmed hash table. Immediately below that there
already was the connection's status tracking, and in the IP_CT_NEW case
is just where I added the call to setup_conntrack, which does all the
rest that was previously (before my modify) done by init_conntrack (with
relationship and helper setup wrapped around an "already set up" check
in order not to override nor duplicate what netlink could have done
already).
Yesterday before writing to you I tested it for 1 hour in production on
a box which deals with hundreds new connections per second, thousands
established connections at any given moment (it's a smtp proxy). It
worked seamlessly. That box never deals with related conns though, nor
helpers, so I could not test those cases. Also, I had to leave office
and not felt confident enough to leave everything unattended. I'll put
it back in production when I'll be present on-site for one full day
long.
> it mainly depends on whether the other
> functions called during initialization need that state from ->new.
> They should not I think, but I haven't checked.
With the code setup said above, the resulting run-path for a new
conntrack created automatically acts like before, as if everything was
still in the same block of code, with the only addition of the
conntrack's status bits checks-and-sets now in between. Those
checks-and-sets do not seem to need any state from ->new (nor from
relation and helper setup).
Admittedly one drawback is that this code setup does not make it very
clear that you must not put anything that needs state from ->new and
friends in the run-path between create_conntrack and setup_conntrack.
> Then you could also
> invoke it based on some other condition controlable through ctnetlink.
Naturally you could. If it was my choice though I'd go for a solution
which makes it simpler for userspace, and let daddy kernel take care of
the dirtiest work. At least as a default behavior.
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2008-09-25 8:56 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-09-24 17:43 Bug with conntracks created arbitrarily through netlink Luca Landi
2008-09-24 18:00 ` Patrick McHardy
[not found] ` <1222287997.6546.22.camel@quasar>
2008-09-24 20:41 ` Patrick McHardy
2008-09-25 8:56 ` Luca Landi
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.