* [PATCH] Unconditionaly push mark to conntrack structure
@ 2006-05-19 8:45 Eric Leblond
2006-05-30 23:43 ` Patrick McHardy
0 siblings, 1 reply; 10+ messages in thread
From: Eric Leblond @ 2006-05-19 8:45 UTC (permalink / raw)
To: netfilter-devel
This is needed in userspace as the mark can be used to select
efficiently a subset of the conntrack events to work on.
Signed-off-by: Eric Leblond <eric@inl.fr>
---
net/ipv4/netfilter/ip_conntrack_netlink.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
a0d56f7badb6267e07582a569c82779b2cdb2711
diff --git a/net/ipv4/netfilter/ip_conntrack_netlink.c b/net/ipv4/netfilter/ip_conntrack_netlink.c
index 0fee630..02531fe 100644
--- a/net/ipv4/netfilter/ip_conntrack_netlink.c
+++ b/net/ipv4/netfilter/ip_conntrack_netlink.c
@@ -381,8 +381,8 @@ static int ctnetlink_conntrack_event(str
if (events & IPCT_HELPINFO
&& ctnetlink_dump_helpinfo(skb, ct) < 0)
goto nfattr_failure;
- if (events & IPCT_MARK
- && ctnetlink_dump_mark(skb, ct) < 0)
+
+ if (ctnetlink_dump_mark(skb, ct) < 0)
goto nfattr_failure;
if (ctnetlink_dump_counters(skb, ct, IP_CT_DIR_ORIGINAL) < 0 ||
--
1.2.6
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH] Unconditionaly push mark to conntrack structure
2006-05-19 8:45 [PATCH] Unconditionaly push mark to conntrack structure Eric Leblond
@ 2006-05-30 23:43 ` Patrick McHardy
2006-05-30 23:55 ` Patrick McHardy
0 siblings, 1 reply; 10+ messages in thread
From: Patrick McHardy @ 2006-05-30 23:43 UTC (permalink / raw)
To: Eric Leblond; +Cc: netfilter-devel
Eric Leblond wrote:
> This is needed in userspace as the mark can be used to select
> efficiently a subset of the conntrack events to work on.
I'm a bit reluctant to special case mark, but mostly because I wonder
whether we shouldn't just behave like all other networking subsystems
and send update messages containing the entire new state. If you look
at the optional information:
- status bits are only 4 byte.
- timeout is currently transmitted for every packet anyway - its better
to just reduce the event rate (we even had a patch for this for ages)
- protoinfo: at least for the majority of traffic (tcp) included in
every message as well
- helpinfo: most connections don't have helpers
The mark is currently missing entirely (your patch is on top of one
of my patches I didn't submit yet), another 4 bytes.
So basically we have an extra 8 bytes per message and reduce complexity
for users by sending the entire state .. a good tradeoff in my opinion.
With the patch to reduce timer events we should still need _a lot_ less
bandwidth than today.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] Unconditionaly push mark to conntrack structure
2006-05-30 23:43 ` Patrick McHardy
@ 2006-05-30 23:55 ` Patrick McHardy
2006-05-31 0:26 ` Patrick McHardy
0 siblings, 1 reply; 10+ messages in thread
From: Patrick McHardy @ 2006-05-30 23:55 UTC (permalink / raw)
To: Pablo Neira Ayuso; +Cc: netfilter-devel, Eric Leblond
Patrick McHardy wrote:
> Eric Leblond wrote:
>
>>This is needed in userspace as the mark can be used to select
>>efficiently a subset of the conntrack events to work on.
>
>
> I'm a bit reluctant to special case mark, but mostly because I wonder
> whether we shouldn't just behave like all other networking subsystems
> and send update messages containing the entire new state. If you look
> at the optional information:
>
> - status bits are only 4 byte.
> - timeout is currently transmitted for every packet anyway - its better
> to just reduce the event rate (we even had a patch for this for ages)
Actually this isn't true, I just noticed we never send timeout update
notifications except for the first packet (which means we have tons
of unnecessary notifier chain calls). I think this isn't really
intended and was done to work around the high timeout event generation
rate. Pablo, do you more about this?
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] Unconditionaly push mark to conntrack structure
2006-05-30 23:55 ` Patrick McHardy
@ 2006-05-31 0:26 ` Patrick McHardy
2006-05-31 0:35 ` Pablo Neira Ayuso
0 siblings, 1 reply; 10+ messages in thread
From: Patrick McHardy @ 2006-05-31 0:26 UTC (permalink / raw)
To: Pablo Neira Ayuso; +Cc: netfilter-devel, Eric Leblond
Patrick McHardy wrote:
> Patrick McHardy wrote:
>
>>Eric Leblond wrote:
>>
>>
>>>This is needed in userspace as the mark can be used to select
>>>efficiently a subset of the conntrack events to work on.
>>
>>
>>I'm a bit reluctant to special case mark, but mostly because I wonder
>>whether we shouldn't just behave like all other networking subsystems
>>and send update messages containing the entire new state. If you look
>>at the optional information:
>>
>>- status bits are only 4 byte.
>>- timeout is currently transmitted for every packet anyway - its better
>> to just reduce the event rate (we even had a patch for this for ages)
>
>
> Actually this isn't true, I just noticed we never send timeout update
> notifications except for the first packet (which means we have tons
> of unnecessary notifier chain calls). I think this isn't really
> intended and was done to work around the high timeout event generation
> rate. Pablo, do you more about this?
More bad news .. the timeout is sent in HZ instead of USER_HZ. This
unfortunately seems to call for an ABI break, I'd really hate to add
a CTA_TIMEOUT2 attribute. I guess we can live with it since its
usually not even included in the messages.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] Unconditionaly push mark to conntrack structure
2006-05-31 0:26 ` Patrick McHardy
@ 2006-05-31 0:35 ` Pablo Neira Ayuso
2006-05-31 1:01 ` Patrick McHardy
0 siblings, 1 reply; 10+ messages in thread
From: Pablo Neira Ayuso @ 2006-05-31 0:35 UTC (permalink / raw)
To: Patrick McHardy; +Cc: netfilter-devel, Eric Leblond
Patrick McHardy wrote:
> Patrick McHardy wrote:
>
>>Patrick McHardy wrote:
>>
>>
>>>Eric Leblond wrote:
>>>
>>>>This is needed in userspace as the mark can be used to select
>>>>efficiently a subset of the conntrack events to work on.
>>>
>>>I'm a bit reluctant to special case mark, but mostly because I wonder
>>>whether we shouldn't just behave like all other networking subsystems
>>>and send update messages containing the entire new state. If you look
>>>at the optional information:
>>>
>>>- status bits are only 4 byte.
>>>- timeout is currently transmitted for every packet anyway - its better
>>> to just reduce the event rate (we even had a patch for this for ages)
>>
>>
>>Actually this isn't true, I just noticed we never send timeout update
>>notifications except for the first packet (which means we have tons
>>of unnecessary notifier chain calls). I think this isn't really
>>intended and was done to work around the high timeout event generation
>>rate. Pablo, do you more about this?
Indeed, the timer refresh event through netlink just burden the system
and overrun the socket queue, so netlink starts dropping messages.
> More bad news .. the timeout is sent in HZ instead of USER_HZ. This
> unfortunately seems to call for an ABI break, I'd really hate to add
> a CTA_TIMEOUT2 attribute. I guess we can live with it since its
> usually not even included in the messages.
To be frank, I can't see how the timer can be useful from userspace. I
think that we should remove it.
About Eric's patch, I think that he can keep a cache of conntracks in
userspace, as conntrackd does, instead of increasing the message size
for something that is not always required.
--
The dawn of the fourth age of Linux firewalling is coming; a time of
great struggle and heroic deeds -- J.Kadlecsik got inspired by J.Morris
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] Unconditionaly push mark to conntrack structure
2006-05-31 0:35 ` Pablo Neira Ayuso
@ 2006-05-31 1:01 ` Patrick McHardy
2006-06-06 11:35 ` Pablo Neira Ayuso
2006-06-06 17:27 ` Pablo Neira Ayuso
0 siblings, 2 replies; 10+ messages in thread
From: Patrick McHardy @ 2006-05-31 1:01 UTC (permalink / raw)
To: Pablo Neira Ayuso; +Cc: netfilter-devel, Eric Leblond
Pablo Neira Ayuso wrote:
> Patrick McHardy wrote:
>
>>> Actually this isn't true, I just noticed we never send timeout update
>>> notifications except for the first packet (which means we have tons
>>> of unnecessary notifier chain calls). I think this isn't really
>>> intended and was done to work around the high timeout event generation
>>> rate. Pablo, do you more about this?
>
>
> Indeed, the timer refresh event through netlink just burden the system
> and overrun the socket queue, so netlink starts dropping messages.
This is easy to fix by only sending an timer update for each connection
once every n seconds. If done in ip_ct_refresh_acct it will also reduce
the notifier load.
>> More bad news .. the timeout is sent in HZ instead of USER_HZ. This
>> unfortunately seems to call for an ABI break, I'd really hate to add
>> a CTA_TIMEOUT2 attribute. I guess we can live with it since its
>> usually not even included in the messages.
>
>
> To be frank, I can't see how the timer can be useful from userspace. I
> think that we should remove it.
Don't you need it for synchronization? One example where it could be
useful is to implement different timeout strategies (for example
something like pf's adaptive timeouts) in userspace.
> About Eric's patch, I think that he can keep a cache of conntracks in
> userspace, as conntrackd does, instead of increasing the message size
> for something that is not always required.
Well, I do agree that any serious use of ctnetlink needs to take
care of the unreliability of netlink and therefore maintain its
own database that is resynchronized after losses etc. (I hope
conntrackd does that :)) But I think at least the networking
netlink subsystems should behave similar (it is for example
a requirement for maintaining automatic libnl caches, should
it be possible to use it with nfnetlink's different byteorder),
and I don't think the few bytes saved are worth beeing incompatible
with assumptions that hold for every other networking netlink
protocol.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] Unconditionaly push mark to conntrack structure
2006-05-31 1:01 ` Patrick McHardy
@ 2006-06-06 11:35 ` Pablo Neira Ayuso
2006-06-08 7:25 ` Patrick McHardy
2006-06-06 17:27 ` Pablo Neira Ayuso
1 sibling, 1 reply; 10+ messages in thread
From: Pablo Neira Ayuso @ 2006-06-06 11:35 UTC (permalink / raw)
To: Patrick McHardy; +Cc: netfilter-devel, Eric Leblond
Patrick McHardy wrote:
> Pablo Neira Ayuso wrote:
>
>>Patrick McHardy wrote:
>>
>>
>>>>Actually this isn't true, I just noticed we never send timeout update
>>>>notifications except for the first packet (which means we have tons
>>>>of unnecessary notifier chain calls). I think this isn't really
>>>>intended and was done to work around the high timeout event generation
>>>>rate. Pablo, do you more about this?
>>
>>
>>Indeed, the timer refresh event through netlink just burden the system
>>and overrun the socket queue, so netlink starts dropping messages.
>
>
> This is easy to fix by only sending an timer update for each connection
> once every n seconds. If done in ip_ct_refresh_acct it will also reduce
> the notifier load.
>
>
>>>More bad news .. the timeout is sent in HZ instead of USER_HZ. This
>>>unfortunately seems to call for an ABI break, I'd really hate to add
>>>a CTA_TIMEOUT2 attribute. I guess we can live with it since its
>>>usually not even included in the messages.
>>
>>
>>To be frank, I can't see how the timer can be useful from userspace. I
>>think that we should remove it.
>
>
> Don't you need it for synchronization? One example where it could be
> useful is to implement different timeout strategies (for example
> something like pf's adaptive timeouts) in userspace.
But these adaptive timeouts could be implemented in kernelspace.
Although I don't know too much about the in-deep details of adaptive
timeouts.
>>About Eric's patch, I think that he can keep a cache of conntracks in
>>userspace, as conntrackd does, instead of increasing the message size
>>for something that is not always required.
>
>
> Well, I do agree that any serious use of ctnetlink needs to take
> care of the unreliability of netlink and therefore maintain its
> own database that is resynchronized after losses etc. (I hope
> conntrackd does that :)) But I think at least the networking
> netlink subsystems should behave similar (it is for example
> a requirement for maintaining automatic libnl caches, should
> it be possible to use it with nfnetlink's different byteorder),
> and I don't think the few bytes saved are worth beeing incompatible
> with assumptions that hold for every other networking netlink
> protocol.
Unfortunately, ctnetlink is not doing any sequence tracking of the
events at the moment :( and we have to. Here my old PIII 866MHz with a
100Mbits network card starts dropping events when it reaches ~300
simultaneos short TCP connections (2 seconds) with netperf. I'm going to
cook a patch for this.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] Unconditionaly push mark to conntrack structure
2006-05-31 1:01 ` Patrick McHardy
2006-06-06 11:35 ` Pablo Neira Ayuso
@ 2006-06-06 17:27 ` Pablo Neira Ayuso
1 sibling, 0 replies; 10+ messages in thread
From: Pablo Neira Ayuso @ 2006-06-06 17:27 UTC (permalink / raw)
To: Patrick McHardy; +Cc: netfilter-devel, Eric Leblond
Patrick McHardy wrote:
> Pablo Neira Ayuso wrote:
>
>>Patrick McHardy wrote:
>>
>>
>>>>Actually this isn't true, I just noticed we never send timeout update
>>>>notifications except for the first packet (which means we have tons
>>>>of unnecessary notifier chain calls). I think this isn't really
>>>>intended and was done to work around the high timeout event generation
>>>>rate. Pablo, do you more about this?
>>
>>
>>Indeed, the timer refresh event through netlink just burden the system
>>and overrun the socket queue, so netlink starts dropping messages.
>
>
> This is easy to fix by only sending an timer update for each connection
> once every n seconds. If done in ip_ct_refresh_acct it will also reduce
> the notifier load.
>
>
>>>More bad news .. the timeout is sent in HZ instead of USER_HZ. This
>>>unfortunately seems to call for an ABI break, I'd really hate to add
>>>a CTA_TIMEOUT2 attribute. I guess we can live with it since its
>>>usually not even included in the messages.
>>
>>
>>To be frank, I can't see how the timer can be useful from userspace. I
>>think that we should remove it.
>
> Don't you need it for synchronization? One example where it could be
> useful is to implement different timeout strategies (for example
> something like pf's adaptive timeouts) in userspace.
Thinking well, you are right, this feature could be easily added to
conntrackd.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] Unconditionaly push mark to conntrack structure
2006-06-06 11:35 ` Pablo Neira Ayuso
@ 2006-06-08 7:25 ` Patrick McHardy
2006-06-11 22:00 ` Pablo Neira Ayuso
0 siblings, 1 reply; 10+ messages in thread
From: Patrick McHardy @ 2006-06-08 7:25 UTC (permalink / raw)
To: Pablo Neira Ayuso; +Cc: netfilter-devel, Eric Leblond
Pablo Neira Ayuso wrote:
> Patrick McHardy wrote:
>
>> Pablo Neira Ayuso wrote:
>>
>>> To be frank, I can't see how the timer can be useful from userspace. I
>>> think that we should remove it.
>>
>>
>>
>> Don't you need it for synchronization? One example where it could be
>> useful is to implement different timeout strategies (for example
>> something like pf's adaptive timeouts) in userspace.
>
>
> But these adaptive timeouts could be implemented in kernelspace.
Thats not a good argument .. by that logic we wouldn't need ctnetlink
at all :)
> Unfortunately, ctnetlink is not doing any sequence tracking of the
> events at the moment :( and we have to. Here my old PIII 866MHz with a
> 100Mbits network card starts dropping events when it reaches ~300
> simultaneos short TCP connections (2 seconds) with netperf. I'm going to
> cook a patch for this.
That seems to be pretty poor performance - by sequence tracking you
mean TCP state updates? Is that poor performance with or without
them?
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH] Unconditionaly push mark to conntrack structure
2006-06-08 7:25 ` Patrick McHardy
@ 2006-06-11 22:00 ` Pablo Neira Ayuso
0 siblings, 0 replies; 10+ messages in thread
From: Pablo Neira Ayuso @ 2006-06-11 22:00 UTC (permalink / raw)
To: Patrick McHardy; +Cc: netfilter-devel, Eric Leblond
Patrick McHardy wrote:
> Pablo Neira Ayuso wrote:
>
>>Patrick McHardy wrote:
>>
>>
>>>Pablo Neira Ayuso wrote:
>>>
>>>
>>>>To be frank, I can't see how the timer can be useful from userspace. I
>>>>think that we should remove it.
>>>
>>>
>>>
>>>Don't you need it for synchronization? One example where it could be
>>>useful is to implement different timeout strategies (for example
>>>something like pf's adaptive timeouts) in userspace.
>>
>>
>>But these adaptive timeouts could be implemented in kernelspace.
>
> Thats not a good argument .. by that logic we wouldn't need ctnetlink
> at all :)
Indeed, weak argument. I've been working today on conntrackd to
generalize it more, now it's fairly extensible I think, so this feature
can be easily added
>>Unfortunately, ctnetlink is not doing any sequence tracking of the
>>events at the moment :( and we have to. Here my old PIII 866MHz with a
>>100Mbits network card starts dropping events when it reaches ~300
>>simultaneos short TCP connections (2 seconds) with netperf. I'm going to
>>cook a patch for this.
>
> That seems to be pretty poor performance - by sequence tracking you
> mean TCP state updates? Is that poor performance with or without
> them?
I meant that nlmsg_seq is not set in the events sent by ctnetlink. I
sent you a RFC patch some days ago, please let me know what you think.
About the performance test, I'll do some testing to get some results
tomorrow again, then I'll post them so we could comment them.
--
The dawn of the fourth age of Linux firewalling is coming; a time of
great struggle and heroic deeds -- J.Kadlecsik got inspired by J.Morris
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2006-06-11 22:00 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-05-19 8:45 [PATCH] Unconditionaly push mark to conntrack structure Eric Leblond
2006-05-30 23:43 ` Patrick McHardy
2006-05-30 23:55 ` Patrick McHardy
2006-05-31 0:26 ` Patrick McHardy
2006-05-31 0:35 ` Pablo Neira Ayuso
2006-05-31 1:01 ` Patrick McHardy
2006-06-06 11:35 ` Pablo Neira Ayuso
2006-06-08 7:25 ` Patrick McHardy
2006-06-11 22:00 ` Pablo Neira Ayuso
2006-06-06 17:27 ` Pablo Neira Ayuso
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.