* [RFC] [PATCH] ctnetlink updates
@ 2005-03-27 23:55 Pablo Neira
2005-04-01 6:59 ` Harald Welte
2005-04-03 18:01 ` Patrick McHardy
0 siblings, 2 replies; 48+ messages in thread
From: Pablo Neira @ 2005-03-27 23:55 UTC
To: Netfilter Development Mailinglist; +Cc: Harald Welte, Patrick McHardy
Hi,
I've ported nfnetlink-ctnetlink to 2.6 ip_conntrack to make the
transition easier. So my intentions are porting it to nfconntrack once
it gets pushed forward. My work is done on top of the ct-event-API.
There are some issues I'd like to discuss:
 o Declaring ID as unsigned int. I think it's just fine.

    - A conntrack must be identified with one of the tuples (original or
      reply) and its id. That way it can be uniquely identified.

    - Using u_int64_t just reduces the possibility of wrapping around,
      but the problem is still there.

 o dump_table() has problems once wrapping around happens.

    - The ordered list isn't ordered anymore once the id wraps around.
      New conntracks with low ids are inserted at the end. While
      dumping the table, the branch that checks ct->id <= cb->args[0]
      is taken and those new conntracks aren't dumped.

I've introduced a function that inserts conntracks ordered by id in the
buckets.
static inline void
list_insert_ordered(struct list_head *head,
		    struct ip_conntrack *ct,
		    enum ip_conntrack_dir dir)
{
	struct list_head *i;
	struct ip_conntrack *cur;

	ASSERT_WRITE_LOCK(head);
	list_for_each(i, head) {
		cur = (struct ip_conntrack *) i;
		if (ct->id <= cur->id) {
			list_add_tail(&ct->tuplehash[dir].list, i);
			return;
		}
	}
	list_add_tail(&ct->tuplehash[dir].list, head);
}
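The dump problem described above can be reproduced with a small user-space model. This is only a sketch under assumptions: a single bucket is modelled as an array of ids in list order, and dump_bucket(), wrapped_order and sorted_order are hypothetical names, not ctnetlink code. The skip test mirrors the ct->id <= cb->args[0] comparison mentioned in the mail.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Simplified model of dumping one hash bucket.
 * ids[] holds the conntrack ids in list order. Every entry whose id is
 * <= the last id already dumped is skipped (the role cb->args[0] plays
 * in the mail). Returns how many entries actually get dumped.
 */
size_t dump_bucket(const uint32_t *ids, size_t n, uint32_t last_dumped)
{
	size_t dumped = 0;

	for (size_t k = 0; k < n; k++) {
		if (ids[k] <= last_dumped)
			continue;	/* treated as "already dumped" */
		last_dumped = ids[k];
		dumped++;
	}
	return dumped;
}

/* Plain tail insertion after the 32-bit counter wrapped: low ids last. */
const uint32_t wrapped_order[] = { 4294967294u, 4294967295u, 1u, 2u };

/* The same entries kept sorted by id, as an ordered insert would do. */
const uint32_t sorted_order[] = { 1u, 2u, 4294967294u, 4294967295u };
```

In this model, dump_bucket(wrapped_order, 4, 0) reports only the two high ids, while dump_bucket(sorted_order, 4, 0) reports all four entries.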
--
Pablo
^ permalink raw reply [flat|nested] 48+ messages in thread

* Re: [RFC] [PATCH] ctnetlink updates
From: Harald Welte @ 2005-04-01 6:59 UTC
To: Pablo Neira; +Cc: Netfilter Development Mailinglist, Patrick McHardy

On Mon, Mar 28, 2005 at 01:55:15AM +0200, Pablo Neira wrote:
> Hi,
>
> I've ported nfnetlink-ctnetlink to 2.6 ip_conntrack to make the transition
> easier. So my intentions are porting it to nfconntrack once it gets pushed
> forward. My work is done on top of the ct-event-API.

We have the habit of working simultaneously in the same area :(
Unfortunately I've reorganized the tree and shuffled the files, so I'll
have a hard time merging...

> There are some issues I'd like to discuss:
>
> o Declaring ID as unsigned int. I think it's just fine.
>
>   - A conntrack must be identified with one of the tuples
>     (original or reply) and its id. That way it can be uniquely
>     identified.
>
>   - Using u_int64_t just reduces the possibility of the wrapping
>     around but such possible problem is still there.

Well, those of you who know the discussion know my point of view: I
don't want an Id and/or an ordered list. If the user tells us to delete
a connection with a given tuple, we simply delete it.

--
- Harald Welte <laforge@netfilter.org> http://netfilter.org/
============================================================================
"Fragmentation is like classful addressing -- an interesting early
architectural error that shows how much experimentation was going
on while IP was being designed." -- Paul Vixie
* Re: [RFC] [PATCH] ctnetlink updates
From: Patrick McHardy @ 2005-04-03 18:01 UTC
To: Pablo Neira; +Cc: Harald Welte, Netfilter Development Mailinglist

Pablo Neira wrote:
> I've ported nfnetlink-ctnetlink to 2.6 ip_conntrack to make the
> transition easier. So my intentions are porting it to nfconntrack once
> it gets pushed forward. My work is done on top of the ct-event-API.
>
> There are some issues I'd like to discuss:
>
> o Declaring ID as unsigned int. I think it's just fine.
>
>   - A conntrack must be identified with one of the tuples (original or
>     reply) and its id. That way it can be uniquely identified.

Good idea, although I'm not completely convinced.

>   - Using u_int64_t just reduces the possibility of the wrapping
>     around but such possible problem is still there.

The time until a wrap is many many years even if you assume very
high connection rate and many CPUs, so it's not a practical problem.
The difference to your solution is that you can tell for sure that
clashes won't occur until a date under known conditions. I dislike
the idea of an unreliable API that has no possibility of even noticing
and/or handling the error, so I'm not sure about this.

> o dump_table() has problems once wrapping around happens.
>
>   - The ordered list isn't ordered anymore once id wrapping around
>     happens. New conntracks with low id's are inserted at the end. While
>     dumping the table, the branch that compares that ct->id <= cb->args[0]
>     returns true and those new conntracks aren't dumped.
>
> I've introduced a function that inserts conntrack ordered by id in the
> buckets.

I don't like this, but let's talk about the other problem first, maybe
it will just go away :)

Regards
Patrick
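To put rough numbers on the wrap argument, here is a back-of-the-envelope sketch. It assumes one id is consumed per new connection at a constant rate; the function names and the example rate of one million connections per second are illustrative assumptions, not figures from the thread.

```c
/* Seconds until an id counter of the given width wraps, assuming one id
 * is consumed per new connection at a constant rate (per second). */
double seconds_to_wrap(int bits, double conns_per_sec)
{
	double values = 1.0;

	for (int i = 0; i < bits; i++)
		values *= 2.0;	/* 2^bits distinct ids */
	return values / conns_per_sec;
}

/* The same figure expressed in years of 365.25 days. */
double years_to_wrap(int bits, double conns_per_sec)
{
	const double seconds_per_year = 365.25 * 24.0 * 3600.0;

	return seconds_to_wrap(bits, conns_per_sec) / seconds_per_year;
}
```

At the assumed rate, seconds_to_wrap(32, 1e6) comes to roughly 72 minutes, whereas years_to_wrap(64, 1e6) is on the order of several hundred thousand years.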
* Re: [RFC] [PATCH] ctnetlink updates
From: Pablo Neira @ 2005-04-06 18:08 UTC
To: Patrick McHardy; +Cc: Harald Welte, Netfilter Development Mailinglist

Patrick McHardy wrote:
> Pablo Neira wrote:
>> There are some issues I'd like to discuss:
>>
>> o Declaring ID as unsigned int. I think it's just fine.
>>
>>   - A conntrack must be identified with one of the tuples (original
>>     or reply) and its id. That way it can be uniquely identified.
>
> Good idea, although I'm not completely convinced.

Now I've changed my mind :).

I think that we can identify a connection with both the original and
reply tuple. Since a connection is represented by means of a conntrack,
if a user kills a conntrack via ctnetlink, he's willing to kill the
connection that the conntrack represents, and not to such conntrack itself.

There aren't two conntracks with the same original and reply tuples. I
can't see anymore why we need such id.

>>   - Using u_int64_t just reduces the possibility of the wrapping
>>     around but such possible problem is still there.
>
> The time until a wrap is many many years even if you assume very
> high connection rate and many CPUs, so it's not a practical problem.
> The difference to your solution is that you can tell for sure that
> clashes won't occur until a date under known conditions. I dislike
> the idea of an unreliable API that has no possibility of even noticing
> and/or handling the error, so I'm not sure about this.

Yes, you are right this is not a practical problem. Well, if we keep
using the id, we could detect a wrap and adjust all sequence numbers. Of
course this is an expensive operation but it would happen once in a
blue moon :).

--
Pablo
* Re: [RFC] [PATCH] ctnetlink updates
From: Patrick McHardy @ 2005-04-17 15:07 UTC
To: Pablo Neira; +Cc: Harald Welte, Netfilter Development Mailinglist

Pablo Neira wrote:
> Now I've changed my mind :).
>
> I think that we can identify a connection with both the original and
> reply tuple. Since a connection is represented by means of a conntrack,
> if a user kills a conntrack via ctnetlink, he's willing to kill the
> connection that the conntrack represents, and not to such conntrack itself.

It depends on by what criteria the user selects the conntrack. I might
choose to kill/remark/... every connection that has transferred more than
X bytes, in which case I don't want to touch a new connection with the
same tuples that has transferred less than X. How can we handle this
and similar cases without an identifier that is unique over time?

Regards
Patrick
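The case raised here can be illustrated with a toy model. Everything below is a hypothetical sketch: struct toy_tuple, struct toy_conntrack and both delete helpers are invented for illustration and are not ctnetlink interfaces. The only point is the difference between deleting "whatever currently owns the tuple" and deleting "the entry I looked at".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-ins for a tuple and a conntrack. */
struct toy_tuple {
	uint32_t src_ip, dst_ip;
	uint16_t src_port, dst_port;
};

struct toy_conntrack {
	struct toy_tuple tuple;
	uint64_t id;		/* assumed unique over time in this model */
	uint64_t bytes;		/* bytes transferred so far */
	bool in_use;
};

bool tuple_equal(const struct toy_tuple *a, const struct toy_tuple *b)
{
	return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
	       a->src_port == b->src_port && a->dst_port == b->dst_port;
}

/* Delete whatever entry currently owns the tuple. */
bool delete_by_tuple(struct toy_conntrack *ct, const struct toy_tuple *t)
{
	if (!ct->in_use || !tuple_equal(&ct->tuple, t))
		return false;
	ct->in_use = false;
	return true;
}

/* Delete only if the entry is still the instance the caller inspected. */
bool delete_by_tuple_and_id(struct toy_conntrack *ct,
			    const struct toy_tuple *t, uint64_t id)
{
	if (!ct->in_use || !tuple_equal(&ct->tuple, t) || ct->id != id)
		return false;
	ct->in_use = false;
	return true;
}
```

If the inspected connection dies and a new one with the same tuple takes its place, the id-guarded variant refuses to touch the newcomer, while the tuple-only variant removes it.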
* Re: [RFC] [PATCH] ctnetlink updates
From: Jozsef Kadlecsik @ 2005-04-29 7:14 UTC
To: Patrick McHardy; +Cc: Harald Welte, Netfilter Development Mailinglist, Pablo Neira

On Sun, 17 Apr 2005, Patrick McHardy wrote:
> Pablo Neira wrote:
> > Now I've changed my mind :).
> >
> > I think that we can identify a connection with both the original and
> > reply tuple. Since a connection is represented by means of a conntrack,
> > if a user kills a conntrack via ctnetlink, he's willing to kill the
> > connection that the conntrack represents, and not to such conntrack itself.
>
> It depends on by what criteria the user selects the conntrack. I might
> choose to kill/remark/... every connection that has transferred more than
> X bytes, in which case I don't want to touch a new connection with the
> same tuples that has transferred less than X. How can we handle this
> and similar cases without an identifier that is unique over time?

That is independent from id/tuple, because the condition is formulated
in the terms of transferred bytes.

I don't like id either. Conntrack can be uniquely identified by

- src/dst tuples, globally, even in a cluster
- the pointer of the conntrack entry, locally

Why should we need another unique id?

Looking at the last changes, I think it'd be much better to port
ip_queue to nfnetlink than to reserve another netlink ID: the hooks in
nfnetlink are already there. I know that'd create backward compatibility
issues at the existing queue applications, though... :-(

Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlec@sunserv.kfki.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : KFKI Research Institute for Particle and Nuclear Physics
          H-1525 Budapest 114, POB. 49, Hungary
* Re: [RFC] [PATCH] ctnetlink updates
From: Harald Welte @ 2005-04-29 8:02 UTC
To: Jozsef Kadlecsik; +Cc: Netfilter Development Mailinglist, Pablo Neira, Patrick McHardy

On Fri, Apr 29, 2005 at 09:14:16AM +0200, Jozsef Kadlecsik wrote:
> I don't like id either. Conntrack can be uniquely identified by
>
> - src/dst tuples, globally, even in a cluster
> - the pointer of the conntrack entry, locally

Yes, but not over time, i.e. if your cycle of reading the table and
issuing a 'delete' is long enough, then you could remove a connection
that was using the same tuple but was established meanwhile (after the
old died). However looking at current timeouts, that would be more than
one or two minutes delay between read and delete.

My point of view is that we don't need the ID. If there is too much
delay, well then the user has a certain risk. If we would call it
'deleting a flow' then we'd be safe, since a flow has no start and
end, and multiple successive connections can comprise one flow ;)

> Looking at the last changes, I think it'd be much better to port
> ip_queue to nfnetlink than to reserve another netlink ID: the hooks in
> nfnetlink are already there. I know that'd create backward compatibility
> issues at the existing queue applications, though... :-(

We've discussed that with David Miller at netconf'04. The result was
that we can get another NETLINK family, as there is a number of
obsolete/outdated ones in the kernel at the moment. Also, if we keep
ULOG and ip_queue for now, and later migrate them into nfnetlink, there
will be again more free numbers.

ip_queue needs to be renamed to pkt_queue or nf_queue and made layer3
independent. Same goes for ULOG. Also, ULOG should no longer have a
fixed header containing interface names, ... but rather have that in
TLV's that are added according to the rule specified by the admin.

I've also been thinking of experimenting with a mmap'ed ring buffer for
ulog... at least it would be worth investigating at some point.

> Best regards,
> Jozsef

--
- Harald Welte <laforge@netfilter.org> http://netfilter.org/
* [RFC] alternative to conntrack ID
From: Amin Azez @ 2005-05-04 9:18 UTC
To: Harald Welte; +Cc: Netfilter Development Mailinglist, Pablo Neira, Patrick McHardy

Harald Welte wrote:
> On Fri, Apr 29, 2005 at 09:14:16AM +0200, Jozsef Kadlecsik wrote:
>
>>I don't like id either. Conntrack can be uniquely identified by
>>
>>- src/dst tuples, globally, even in a cluster
>>- the pointer of the conntrack entry, locally
>
> Yes, but not over time, i.e. if your cycle of reading the table and
> issuing a 'delete' is long enough, then you could remove a connection
> that was using the same tuple but was established meanwhile (after the
> old died). However looking at current timeouts, that would be more than
> one or two minutes delay between read and delete.
>
> My point of view is that we don't need the ID. If there is too much
> delay, well then the user has a certain risk. If we would call it
> 'deleting a flow' then we'd be safe, since a flow has no start and
> end, and multiple successive connections can comprise one flow ;)

I hope I am bringing a new angle to this and not the same old stuff.

With Pablo's new conntrack(-tool) there is an increased risk of this
race condition. No longer will a userspace application read the table
and "issue a delete" but it receives events via the netlink socket.

Any userspace tool tracking connections based on conntrack events will
receive an event some time after a conntrack is destroyed, but possibly
after taking action on a new conntrack with the same tuples.

Here is an ascii art timeline with one of the failure cases

time+----+----+----+----+----+----+----+----+----+----+
               destroyed     created again????
conntrack    *==*????????????????????????????????
netlink create event       *
user prog create event       *
netlink destroy event          *
user prog create action          * action may happen on new conntrack
user prog destroy event            *
user prog destroy action             * now we know we may have raced and lost

It is entirely possible that a new conntrack with the same tuples is
created before the user program can be aware the old one has been
destroyed.

Defining multiple successive connections as "one flow" is convenient,
but as user space clients are notified of "interruptions and
restorations" to this "one flow", it would be also convenient if they
could safely take advantage of such notifications.

If an ID is not desirable as part of the tuple (and I can see that it is
not) perhaps a "created time-stamp" per conntrack would suffice as an
extra "guard" which MAY be provided to conntrack manipulation routines,
and if so provided MUST also be satisfied for the operation to take place.

That is my suggestion. It does not introduce an alternative ID, it does
avoid the problem of race conditions.

Comments?

Amin
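A minimal sketch of the MAY/MUST guard proposed above, under stated assumptions: struct toy_entry and guarded_delete() are hypothetical names, the tuple is reduced to a single integer key, and the timestamp unit is arbitrary. A NULL guard means the caller supplied none, so only the tuple is checked.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy entry: one key standing in for the full tuple. */
struct toy_entry {
	uint32_t tuple_key;
	uint64_t created;	/* creation timestamp, arbitrary units */
	bool in_use;
};

/*
 * Delete the entry that owns tuple_key. The guard is optional (MAY be
 * provided); when it is provided it MUST equal the entry's creation
 * timestamp, otherwise nothing is deleted.
 */
bool guarded_delete(struct toy_entry *e, uint32_t tuple_key,
		    const uint64_t *created_guard)
{
	if (!e->in_use || e->tuple_key != tuple_key)
		return false;
	if (created_guard != NULL && *created_guard != e->created)
		return false;	/* guard supplied but not satisfied */
	e->in_use = false;
	return true;
}
```

A caller that recorded the creation time from an earlier event passes it as the guard; a stale value then makes the operation fail instead of hitting a newer conntrack with the same tuple.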
* Re: [RFC] alternative to conntrack ID
From: Patrick Schaaf @ 2005-05-04 9:32 UTC
To: Amin Azez; +Cc: netfilter-devel, Pablo Neira, Patrick McHardy

> perhaps a "created time-stamp" per conntrack

I like this idea. One could then also have a match expressing "conntrack
has been live for at most / at least X seconds". Which would be a useful
new feature.

best regards
Patrick
* Re: [RFC] alternative to conntrack ID
From: Patrick McHardy @ 2005-05-04 11:30 UTC
To: Amin Azez; +Cc: Harald Welte, Netfilter Development Mailinglist, Pablo Neira

Amin Azez wrote:
> It is entirely possible that a new conntrack with the same tuples is
> created before the user program can be aware the old one has been
> destroyed.
>
> Defining multiple successive connections as "one flow" is convenient,
> but as user space clients are notified of "interruptions and
> restorations" to this "one flow", it would be also convenient if they
> could safely take advantage of such notifications.

Agreed. Besides, this is an interface to conntrack, not flowtrack :)

> If an ID is not desirable as part of the tuple (and I can see that it is
> not) perhaps a "created time-stamp" per conntrack would suffice as an
> extra "guard" which MAY be provided to conntrack manipulation routines,
> and if so provided MUST also be satisfied for the operation to take place.
>
> That is my suggestion. It does not introduce an alternative ID, it does
> avoid the problem of race conditions.
>
> Comments?

Why is that better than a unique ID? It needs space as well, but can't
be used to identify the conntrack without further information.

Regards
Patrick
* Re: [RFC] alternative to conntrack ID
From: Amin Azez @ 2005-05-04 12:01 UTC
To: Patrick McHardy; +Cc: Harald Welte, Netfilter Development Mailinglist, Pablo Neira

Patrick McHardy wrote:
>Amin Azez wrote:
>
>>It is entirely possible that a new conntrack with the same tuples is
>>created before the user program can be aware the old one has been
>>destroyed.
>>
>>Defining multiple successive connections as "one flow" is convenient,
>>but as user space clients are notified of "interruptions and
>>restorations" to this "one flow", it would be also convenient if they
>>could safely take advantage of such notifications.
>
>Agreed. Besides, this is an interface to conntrack, not flowtrack :)
>
>>If an ID is not desirable as part of the tuple (and I can see that it is
>>not) perhaps a "created time-stamp" per conntrack would suffice as an
>>extra "guard" which MAY be provided to conntrack manipulation routines,
>>and if so provided MUST also be satisfied for the operation to take place.
>>
>>That is my suggestion. It does not introduce an alternative ID, it does
>>avoid the problem of race conditions.
>>
>>Comments?
>
>Why is that better than a unique ID? It needs space as well, but can't
>be used to identify the conntrack without further information.

There isn't the problem of having to generate a unique id, or the worry
of it finally wrapping every few years as we don't pretend it is unique.
However, combined with either tuple it forms a unique id that wraps only
when the calendar does.

Further, as pointed out by Patrick Schaaf, start time has the potential
to be more useful than a unique id in filtering.

Amin
* Re: [RFC] alternative to conntrack ID
From: Patrick McHardy @ 2005-05-06 15:16 UTC
To: Amin Azez; +Cc: Harald Welte, Netfilter Development Mailinglist, Pablo Neira

Amin Azez wrote:
> Patrick McHardy wrote:
>
>> Why is that better than a unique ID? It needs space as well, but can't
>> be used to identify the conntrack without further information.
>
> There isn't the problem of having to generate a unique id, or the worry
> of it finally wrapping every few years as we don't pretend it is unique.
> However, combined with either tuple it forms a unique id that wraps only
> when the calendar does.

Wrapping is not a problem with a 64bit id. One thing I'm worried about
with using a timestamp is that it might not be of high enough precision
with very fast CPU and network to uniquely identify each connection.

> Further, as pointed out by Patrick Schaaf, start time has the potential
> to be more useful than a unique id in filtering

Agreed, but it is secondary to solving the problem.

Regards
Patrick
* Re: [RFC] alternative to conntrack ID
From: Marcus Sundberg @ 2005-05-07 20:36 UTC
To: Patrick McHardy; +Cc: Harald Welte, Netfilter Development Mailinglist, Pablo Neira, Amin Azez

Patrick McHardy wrote:
> Amin Azez wrote:
>
>>There isn't the problem of having to generate a unique id, or the worry
>>of it finally wrapping every few years as we don't pretend it is unique.
>>However, combined with either tuple it forms a unique id that wraps only
>>when the calendar does.
>
> Wrapping is not a problem with a 64bit id. One thing I'm worried about
> with using a timestamp is that it might not be of high enough precision
> with very fast CPU and network to uniquely identify each connection.

You don't even need fast CPUs or networks to risk precision problems
- think multiple NICs and SMP.

//Marcus
--
---------------------------------------+--------------------------
  Marcus Sundberg <marcus@ingate.com>  | Firewalls with SIP & NAT
 Software Developer, Ingate Systems AB |  http://www.ingate.com/
* Re: [RFC] alternative to conntrack ID
From: Patrick McHardy @ 2005-05-07 22:18 UTC
To: Marcus Sundberg; +Cc: Harald Welte, Netfilter Development Mailinglist, Pablo Neira, Amin Azez

Marcus Sundberg wrote:
> You don't even need fast CPUs or networks to risk precision problems
> - think multiple NICs and SMP.

SMP or multiple NICs don't matter because at any point in time only
one instance of a connection can exist. The challenge is to have a
unique identifier over time.

Regards
Patrick
* Re: [RFC] alternative to conntrack ID
From: Marcus Sundberg @ 2005-05-07 22:32 UTC
To: Patrick McHardy; +Cc: Harald Welte, Netfilter Development Mailinglist, Pablo Neira, Amin Azez

Patrick McHardy wrote:
> Marcus Sundberg wrote:
>
>>You don't even need fast CPUs or networks to risk precision problems
>>- think multiple NICs and SMP.
>
> SMP or multiple NICs don't matter because at any point in time only
> one instance of a connection can exist. The challenge is to have a
> unique identifier over time.

Yes, having a unique identifier over time was what was being discussed,
and I was merely pointing out that with SMP you can get two conntracks
with identical timestamps even if you have infinite precision, since
two new conntracks can be timestamped simultaneously by different CPUs.

//Marcus
* Re: [RFC] alternative to conntrack ID
From: KOVACS Krisztian @ 2005-05-09 14:17 UTC
To: Marcus Sundberg; +Cc: Harald Welte, Netfilter Development Mailinglist, Pablo Neira, Patrick McHardy, Amin Azez

Hi,

On Sun, 2005-05-08 at 00:32, Marcus Sundberg wrote:
> >>You don't even need fast CPUs or networks to risk precision problems
> >>- think multiple NICs and SMP.
> >
> > SMP or multiple NICs don't matter because at any point in time only
> > one instance of a connection can exist. The challenge is to have a
> > unique identifier over time.
>
> Yes, having a unique identifier over time was what was being discussed,
> and I was merely pointing out that with SMP you can get two conntracks
> with identical timestamps even if you have infinite precision, since
> two new conntracks can be timestamped simultaneously by different CPUs.

OK, but it's not a problem. If the two conntracks are identical (their
tuples are the same) then equal timestamps are not a problem (and of
course one of them will be dropped anyway, since hash insertion is
serialized). If they are not the same, then it does not matter because
the timestamp is just an additional info -- the tuple identifies the
conntrack by itself.

Probably Patrick was referring to a possible problem where the
following happens: a new connection is established and destroyed in a
very short time. If a new connection with the same tuple is created
before the timestamp increases (which is perfectly possible IMHO if you
have some slow embedded HW with no high precision timer available) then
you won't be able to tell the difference in the userspace app, so that
the race described by Amin is still possible.

--
Regards,
Krisztian Kovacs
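The coarse-timer concern can be made concrete with a small sketch. The 10 ms tick used in the example below is an assumption chosen for illustration (no specific hardware is named in the thread), and coarse_stamp() and stamps_collide() are hypothetical helpers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Quantize a time in microseconds to a clock with the given tick length. */
uint64_t coarse_stamp(uint64_t time_us, uint64_t tick_us)
{
	return time_us / tick_us;
}

/* True if two creation times cannot be told apart on that clock. */
bool stamps_collide(uint64_t a_us, uint64_t b_us, uint64_t tick_us)
{
	return coarse_stamp(a_us, tick_us) == coarse_stamp(b_us, tick_us);
}
```

With a 10 ms tick, two conntracks for the same tuple created 3 ms apart (say at 1000 us and 4000 us) carry the same timestamp, so a (tuple, timestamp) pair no longer separates them; with a 1 us clock they differ.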
* Re: [RFC] alternative to conntrack ID
From: Amin Azez @ 2005-05-09 15:08 UTC
To: KOVACS Krisztian; +Cc: Harald Welte, Netfilter Development Mailinglist, Pablo Neira, Marcus Sundberg, Patrick McHardy

KOVACS Krisztian wrote:
> OK, but it's not a problem. If the two conntracks are identical (their
>tuples are the same) then equal timestamps are not a problem (and of
>course one of them will be dropped anyway, since hash insertion is
>serialized). If they are not the same, then it does not matter because
>the timestamp is just an additional info -- the tuple identifies the
>conntrack by itself.
>
> Probably Patrick was referring to a possible problem where the
>following happens: a new connection is established and destroyed in a
>very short time. If a new connection with the same tuple is created
>before the timestamp increases (which is perfectly possible IMHO if you
>have some slow embedded HW with no high precision timer available) then
>you won't be able to tell the difference in the userspace app, so that
>the race described by Amin is still possible.

The time struct used in skb's has time and microtime. Is there a
sequence of packets that conntrack could monitor so that a conntrack
could be created and destroyed and re-created in the same microsecond?

I can't imagine any slow embedded hardware being called upon to process
a sequence of packets that occur so quickly; are installations actually
called upon to process packets at a rate beyond their ability to time?
(maybe so, just curious)

Perhaps if the conntrack would be destroyed in the same time instance
that it is created it is instead saved but destroyed later by a timer
callback. If a conntrack is then to be "re-created" before it has been
destroyed, a small starts-at-0 counter in the conntrack struct is
increased to indicate re-use. The size of this counter would have to
reflect the number of times a conntrack could be destroyed and
resurrected in the same timer-tick.

It is sad that the addition of the extra field, and deferral of
destruction on conntracks in the same microsecond they were created in
make the solution less simple, but they would make it robust.

Amin
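One way to picture the re-use counter idea is the sketch below. It is a guess at the mechanism under assumptions: struct toy_slot and its helpers are hypothetical, the deferred destruction by timer is not modelled, and counter overflow (which the mail says the counter must be sized to avoid) is not handled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-conntrack guard: creation tick plus a re-use counter. */
struct toy_slot {
	uint64_t created_tick;
	uint8_t reuse;		/* starts at 0, bumped on same-tick re-creation */
};

void slot_create(struct toy_slot *s, uint64_t now_tick)
{
	s->created_tick = now_tick;
	s->reuse = 0;
}

/* Re-create in place: same tick bumps the counter, a later tick resets it. */
void slot_recreate(struct toy_slot *s, uint64_t now_tick)
{
	if (s->created_tick == now_tick) {
		s->reuse++;	/* overflow is not handled in this sketch */
	} else {
		s->created_tick = now_tick;
		s->reuse = 0;
	}
}

/* Two snapshots name the same instance only if both fields match. */
bool same_instance(const struct toy_slot *a, const struct toy_slot *b)
{
	return a->created_tick == b->created_tick && a->reuse == b->reuse;
}
```

A user-space program that saved a copy of the guard before the re-creation would see same_instance() fail afterwards, even though the timestamp alone is unchanged.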
* Re: [RFC] alternative to conntrack ID
From: Harald Welte @ 2005-05-10 6:49 UTC
To: Amin Azez; +Cc: Patrick McHardy, Netfilter Development Mailinglist, Pablo Neira, Marcus Sundberg, KOVACS Krisztian

On Mon, May 09, 2005 at 04:08:49PM +0100, Amin Azez wrote:
> The time struct used in skb's has time and microtime.

If you're referring to the skb receive timestamp: That doesn't exist for
locally-generated packets, and on 'real' incoming packets from the
network it doesn't exist by default unless some application (such as
tcpdump) requests it.

--
- Harald Welte <laforge@netfilter.org> http://netfilter.org/
* Re: [RFC] alternative to conntrack ID
From: Amin Azez @ 2005-05-17 16:12 UTC
To: KOVACS Krisztian; +Cc: Harald Welte, Netfilter Development Mailinglist, Pablo Neira, Patrick McHardy

KOVACS Krisztian wrote:
> Probably Patrick was referring to a possible problem where the
> following happens: a new connection is established and destroyed in a
> very short time. If a new connection with the same tuple is created
> before the timestamp increases (which is perfectly possible IMHO if you
> have some slow embedded HW with no high precision timer available)

After further reading I think this scenario is highly unlikely. I don't
mean improbable, I mean, is there any such hardware?

If a socket is not to be reused until TCP_TIME_WAIT which is recommended
to be in the region of 4 minutes, is there really any hardware that
can't time to that resolution?

Are there really any devices that will re-use a TCP socket in the same
timer tick as they closed it? Any devices that re-use sockets too
quickly are going to have problems anyway (of course that doesn't mean
we can ignore them) but surely they are so buggy as to not remain in
use?

Amin

> then
> you won't be able to tell the difference in the userspace app, so that
> the race described by Amin is still possible.
* Re: [RFC] alternative to conntrack ID
From: Patrick McHardy @ 2005-05-17 20:17 UTC
To: Amin Azez; +Cc: Harald Welte, Netfilter Development Mailinglist, Pablo Neira, KOVACS Krisztian

Amin Azez wrote:
> KOVACS Krisztian wrote:
>
>> Probably Patrick was referring to a possible problem where the
>> following happens: a new connection is established and destroyed in a
>> very short time. If a new connection with the same tuple is created
>> before the timestamp increases (which is perfectly possible IMHO if you
>> have some slow embedded HW with no high precision timer available)

Exactly.

> After further reading I think this scenario is highly unlikely.

Unlikely is still enough reason to handle it properly in an API.
Otherwise anything you build on top of it has to take this into
account for any guarantees it would like to give. And so far, I
haven't even seen a suggestion how to notice it - which would
also be fine with me.

Regards
Patrick
* Re: [RFC] alternative to conntrack ID 2005-05-17 20:17 ` Patrick McHardy @ 2005-05-18 7:24 ` Amin Azez 2005-05-18 9:30 ` Jozsef Kadlecsik 1 sibling, 0 replies; 48+ messages in thread From: Amin Azez @ 2005-05-18 7:24 UTC (permalink / raw) To: Patrick McHardy Cc: Harald Welte, Netfilter Development Mailinglist, Pablo Neira, KOVACS Krisztian Patrick McHardy wrote: >Amin Azez wrote: > > >>KOVACS Krisztian wrote: >> >> >> >>> Probably Patrick was referring to a possible problem where the >>>following happens: a new connection is established and destroyed in a >>>very short time. If a new connection with the same tuple is created >>>before the timestamp increases (which is perfectly possible IMHO if you >>>have some slow embedded HW with no high precision timer available) >>> >>> > >Exactly. > > > >>After further reading I think this scenario is highly unlikely. >> >> > >Unlikely is still enough reason to handle it properly in an API. > > By unlikely I didn't mean it would rarely happen I meant the hardware with which it could ever happen is surely unlikely. (A different order of unlikiness) However I guess your comment below still holds. >Otherwise anything you build on top of it has to take this into >account for any guarantees it would like to give. And so far, I >haven't even seen a suggestion how to notice it - which would >also be fine with me. > > One such suggestion is: IFF the conntrack is to be destroyed in the same clock tick as it was created, to instead destroy the conntrack one clock tick later through death-by-timeout. Then the new conntrack would have to be created (although the same clock tick) with a different internal conntrack id. The costs of this would only be borne when such unusual hardware was in use, and when the problem case came up, but the internal conntrack id could then be used in conjunction with the timestamp to form a unique qualifier that (takes deep breath) could be used with the tuple to recognize a specific conntrack instance. 
It would require no extra storage but would increase the amount of data
sent through the netlink socket.

This would still offer some slight benefit over a public conntrack serial
number in that it would also allow conntrack creation time matching in
iptables rules.

I do wonder about the possibility of a denial of service through queueing
lots of conntracks to be destroyed by timeout 1 tick later, but I think
this is hardly any worse in practice than without a timeout.

Another hacky "policy" fix would be to drop the SYN packet that would
re-create the conntrack in the same tick as its original creation and let
it be sent again. It's barely normal behaviour to do such a thing; such
packets deserve to be dropped (for the sins of their parents? Hmm). Would
such packets get re-sent via a loopback interface? But then again, devices
that abuse themselves in such a way, beyond the resolution of their own
timers, are surely on drugs?

Amin

^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [RFC] alternative to conntrack ID 2005-05-17 20:17 ` Patrick McHardy 2005-05-18 7:24 ` Amin Azez @ 2005-05-18 9:30 ` Jozsef Kadlecsik 2005-06-04 23:52 ` Pablo Neira 1 sibling, 1 reply; 48+ messages in thread From: Jozsef Kadlecsik @ 2005-05-18 9:30 UTC (permalink / raw) To: Patrick McHardy; +Cc: Netfilter Development Mailinglist On Tue, 17 May 2005, Patrick McHardy wrote: > Amin Azez wrote: > > KOVACS Krisztian wrote: > > > >> Probably Patrick was referring to a possible problem where the > >> following happens: a new connection is established and destroyed in a > >> very short time. If a new connection with the same tuple is created > >> before the timestamp increases (which is perfectly possible IMHO if you > >> have some slow embedded HW with no high precision timer available) > > Exactly. > > > After further reading I think this scenario is highly unlikely. > > Unlikely is still enough reason to handle it properly in an API. > Otherwise anything you build on top of it has to take this into > account for any guarantees it would like to give. And so far, I > haven't even seen a suggestion how to notice it - which would > also be fine with me. I think we should not state any guarantee here. Conntrack entries are uniquely identified by tuples, that's all we should say. There *is* a certain ambiquity, when, during te kernel-userspace communication, a conntrack entry is deleted and a new one with the same tuples is created, but that can be documented clearly. In order to create unique identification of conntrack entries, there were a couple of clever suggestions, all of them burdened by something: - timer based solutions may be not fine-grained enough - pointer of conntrack is not unique as it can be reused - id creates a new possible bottleneck What wrong can happen, if a reborn conntrack entry is deleted instead of the original one? If the conntrack entry is to be dropped due to a change in policy, then what we did is just fine! 
If there was a "stuck" conntrack entry and the admin was going to delete it manually but it went away and he deleted the new conntrack entry, that's an unfortunate event - but the user was in trouble anyway. So - do we really need such accuracy? Best regards, Jozsef - E-mail : kadlec@blackhole.kfki.hu, kadlec@sunserv.kfki.hu PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt Address : KFKI Research Institute for Particle and Nuclear Physics H-1525 Budapest 114, POB. 49, Hungary ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [RFC] alternative to conntrack ID 2005-05-18 9:30 ` Jozsef Kadlecsik @ 2005-06-04 23:52 ` Pablo Neira 2005-06-05 1:02 ` Pablo Neira 2005-06-06 8:17 ` Jozsef Kadlecsik 0 siblings, 2 replies; 48+ messages in thread From: Pablo Neira @ 2005-06-04 23:52 UTC (permalink / raw) To: Jozsef Kadlecsik; +Cc: Netfilter Development Mailinglist, Patrick McHardy Jozsef Kadlecsik wrote: > On Tue, 17 May 2005, Patrick McHardy wrote: >>Amin Azez wrote: >> >>>KOVACS Krisztian wrote: >>> >>>> Probably Patrick was referring to a possible problem where the >>>>following happens: a new connection is established and destroyed in a >>>>very short time. If a new connection with the same tuple is created >>>>before the timestamp increases (which is perfectly possible IMHO if you >>>>have some slow embedded HW with no high precision timer available) >> >>Exactly. >> >>>After further reading I think this scenario is highly unlikely. >> >>Unlikely is still enough reason to handle it properly in an API. >>Otherwise anything you build on top of it has to take this into >>account for any guarantees it would like to give. And so far, I >>haven't even seen a suggestion how to notice it - which would >>also be fine with me. > > I think we should not state any guarantee here. Conntrack entries are > uniquely identified by tuples, that's all we should say. > > There *is* a certain ambiquity, when, during te kernel-userspace > communication, a conntrack entry is deleted and a new one with the same > tuples is created, but that can be documented clearly. > > In order to create unique identification of conntrack entries, there were > a couple of clever suggestions, all of them burdened by something: > > - timer based solutions may be not fine-grained enough > - pointer of conntrack is not unique as it can be reused > - id creates a new possible bottleneck > > What wrong can happen, if a reborn conntrack entry is deleted instead of > the original one? 
>
> If the conntrack entry is to be dropped due to a change in policy, then
> what we did is just fine! If there was a "stuck" conntrack entry and the
> admin was going to delete it manually but it went away and he deleted the
> new conntrack entry, that's an unfortunate event - but the user was in
> trouble anyway.
>
> So - do we really need such accuracy?

I want to give this issue another spin. A small digest of the ID
discussion:

+ The unique ID eats 8 extra bytes, since it will be an __u64. On my laptop
(1787 buckets, 14296 max), that makes 114368 extra bytes (worst case).

+ "Slow" devices. As Krisztian and Patrick pointed out, a conntrack could
be destroyed while the user is trying to kill it, and then another
conntrack is created with the same tuples. Result: the user kills a
connection that he didn't mean to.

+ If we've got an ID, the user could decide whether he wants such accuracy
when killing connections. If not, we would need to document this issue.

I'd definitely like to have such accuracy, but I still see this incident
as unlikely. I think that a TCP stack must be broken if it starts a
brand new connection using the same source/destination ports that it
has recently used.

--
Pablo

^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [RFC] alternative to conntrack ID 2005-06-04 23:52 ` Pablo Neira @ 2005-06-05 1:02 ` Pablo Neira 2005-06-06 8:48 ` Jozsef Kadlecsik 2005-06-06 8:17 ` Jozsef Kadlecsik 1 sibling, 1 reply; 48+ messages in thread From: Pablo Neira @ 2005-06-05 1:02 UTC (permalink / raw) To: Pablo Neira Cc: Netfilter Development Mailinglist, Patrick McHardy, Jozsef Kadlecsik Pablo Neira wrote: > I'd definitely like to have such accuracy, but I still see this incident > unlikely. I think that such TCP stack must be broken if it starts a > brand new connection using the same source/destination ports that it's > recently used. Forget this, this can happen in an attempt to reopen a closed connection, and such case is likely. We need such ID in order to achieve accuracy. I think that it must be the user who has to choose if he wants accuracy or not, in such case we have to provide the corresponding methods to achieve it. A user could kill a conntrack by means of: a) the tuples, if he doesn't want accuracy b) the tuples + the id, if he does. -- Pablo ^ permalink raw reply [flat|nested] 48+ messages in thread
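The two ways of killing a conntrack listed above, (a) by tuples only and (b) by tuples plus id, can be sketched as a single match predicate. This is only an illustration: `struct tuple`, `struct conntrack` and `delete_matches()` are hypothetical, simplified stand-ins and not the ip_conntrack or ctnetlink types.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; not the kernel's ip_conntrack types. */
struct tuple {
	uint32_t src_ip, dst_ip;
	uint16_t src_port, dst_port;
	uint8_t proto;
};

struct conntrack {
	struct tuple orig;	/* original-direction tuple */
	uint64_t id;		/* per-conntrack id, as discussed in the thread */
};

static bool tuple_equal(const struct tuple *a, const struct tuple *b)
{
	return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
	       a->src_port == b->src_port && a->dst_port == b->dst_port &&
	       a->proto == b->proto;
}

/*
 * want_id == NULL: the user does not ask for accuracy, match on tuple only.
 * want_id != NULL: the tuple and the id must both match, so a conntrack
 * reborn with the same tuple but a new id is left alone.
 */
static bool delete_matches(const struct conntrack *ct,
			   const struct tuple *t, const uint64_t *want_id)
{
	if (!tuple_equal(&ct->orig, t))
		return false;
	return want_id == NULL || ct->id == *want_id;
}
```

With a reborn entry carrying id 43, a request for the tuple alone matches it, while a request for the tuple plus the stale id 42 does not.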
* Re: [RFC] alternative to conntrack ID 2005-06-05 1:02 ` Pablo Neira @ 2005-06-06 8:48 ` Jozsef Kadlecsik 2005-06-09 12:52 ` Pablo Neira 0 siblings, 1 reply; 48+ messages in thread From: Jozsef Kadlecsik @ 2005-06-06 8:48 UTC (permalink / raw) To: Pablo Neira; +Cc: Netfilter Development Mailinglist, Patrick McHardy Hi Pablo, On Sun, 5 Jun 2005, Pablo Neira wrote: > Pablo Neira wrote: > > I'd definitely like to have such accuracy, but I still see this incident > > unlikely. I think that such TCP stack must be broken if it starts a > > brand new connection using the same source/destination ports that it's > > recently used. > > Forget this, this can happen in an attempt to reopen a closed > connection, and such case is likely. We need such ID in order to achieve > accuracy. I think that it must be the user who has to choose if he wants > accuracy or not, in such case we have to provide the corresponding > methods to achieve it. A user could kill a conntrack by means of: > > a) the tuples, if he doesn't want accuracy > b) the tuples + the id, if he does. I share your feelings about giving complete accurate access to the users over conntrack entries. Still, I'm not completely convinced about the practical usefulness of such accuracy. Let's therefore look at it again: a. Policy changed and admin wants to enforce the new policy on the living conntrack entries as well: here the id does not buy anything, tuples are just sufficient. b. Admin wants to kill a "stuck" conntrack entry, in order to make possible to build up a new connection. In my opinion that's just not the proper way to deal with the problem, conntrack should be able to handle such cases automatically. And I believe we worked very hard and that part is highly polished in conntrack in the recent 2.6 tree, so that it's just a theoretical example ;-) c. conntrack table is full and admin wants to get rid of a bunch of entries manually. Somehow I don't think id would be very useful here either. 
Other possibilities? I do not think users should poke conntrack without very good reason, at their whim. Best regards, Jozsef - E-mail : kadlec@blackhole.kfki.hu, kadlec@sunserv.kfki.hu PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt Address : KFKI Research Institute for Particle and Nuclear Physics H-1525 Budapest 114, POB. 49, Hungary ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [RFC] alternative to conntrack ID 2005-06-06 8:48 ` Jozsef Kadlecsik @ 2005-06-09 12:52 ` Pablo Neira 2005-06-09 13:00 ` Pablo Neira 0 siblings, 1 reply; 48+ messages in thread From: Pablo Neira @ 2005-06-09 12:52 UTC (permalink / raw) To: Jozsef Kadlecsik; +Cc: Netfilter Development Mailinglist, Patrick McHardy Hi Jozsef, Jozsef Kadlecsik wrote: >>Pablo Neira wrote: >> >>>I'd definitely like to have such accuracy, but I still see this incident >>>unlikely. I think that such TCP stack must be broken if it starts a >>>brand new connection using the same source/destination ports that it's >>>recently used. >> >>Forget this, this can happen in an attempt to reopen a closed >>connection, and such case is likely. We need such ID in order to achieve >>accuracy. I think that it must be the user who has to choose if he wants >>accuracy or not, in such case we have to provide the corresponding >>methods to achieve it. A user could kill a conntrack by means of: >> >>a) the tuples, if he doesn't want accuracy >>b) the tuples + the id, if he does. > > > I share your feelings about giving complete accurate access to the users > over conntrack entries. Still, I'm not completely convinced about the > practical usefulness of such accuracy. Let's therefore look at it again: > > a. Policy changed and admin wants to enforce the new policy on the living > conntrack entries as well: here the id does not buy anything, tuples > are just sufficient. > b. Admin wants to kill a "stuck" conntrack entry, in order to make > possible to build up a new connection. In my opinion that's just not > the proper way to deal with the problem, conntrack should be able to > handle such cases automatically. And I believe we worked very hard > and that part is highly polished in conntrack in the recent 2.6 > tree, so that it's just a theoretical example ;-) > c. conntrack table is full and admin wants to get rid of a bunch of > entries manually. 
Somehow I don't think id would be very useful here
> either.
>
> Other possibilities?
>
> I do not think users should poke conntrack without very good reason, at
> their whim.

Agreed, those scenarios look pretty realistic. But if the ID goes away,
I'll have another concern. ctnetlink_dump_table[_w] currently uses the ID
to know where it stopped dumping the conntrack table, since netlink dumping
is not atomic.

I could increase the conntrack refcount and hold a pointer to it, but if
the timeout expires while data is being returned to user space, the
conntrack won't be in the hashes anymore, so the dump couldn't continue
traversing the conntrack table.

I thought about freezing the conntrack timer and reactivating it once I
continue traversing the list. That could result in problems: since netlink
dumping is not atomic, someone could interrupt the dump and that conntrack
would be stuck there forever. Moreover, I don't like it.

All the things I've thought of so far are burdened by something. The
cleanest way to do this looks like the ID. Any other ideas?

P.S.: Thanks to Krisztian Kovacs for the feedback.

--
Pablo

^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [RFC] alternative to conntrack ID
2005-06-09 12:52 ` Pablo Neira
@ 2005-06-09 13:00 ` Pablo Neira
2005-06-09 13:34 ` Jozsef Kadlecsik
0 siblings, 1 reply; 48+ messages in thread
From: Pablo Neira @ 2005-06-09 13:00 UTC (permalink / raw)
To: Pablo Neira
Cc: Netfilter Development Mailinglist, Patrick McHardy, Jozsef Kadlecsik

Pablo Neira wrote:
> All the things I've though so far are burned by something. The cleanest
> way to do this looks the ID. Any other ideas?

Hm, this idea just came to mind. We could use an unsigned 8-bit
per-bucket id that, together with the tuple, could uniquely identify a
conntrack (and make Patrick sleep with both eyes closed), reduce memory
consumption (and make Jozsef happier) and fix my problem with dumping the
conntrack table (and let Pablo drink his beer calmly).

Anything else?

--
Pablo

^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [RFC] alternative to conntrack ID 2005-06-09 13:00 ` Pablo Neira @ 2005-06-09 13:34 ` Jozsef Kadlecsik 2005-06-10 10:21 ` Pablo Neira 0 siblings, 1 reply; 48+ messages in thread From: Jozsef Kadlecsik @ 2005-06-09 13:34 UTC (permalink / raw) To: Pablo Neira; +Cc: Netfilter Development Mailinglist, Patrick McHardy Hi Pablo, On Thu, 9 Jun 2005, Pablo Neira wrote: > Pablo Neira wrote: > > All the things I've though so far are burned by something. The cleanest > > way to do this looks the ID. Any other ideas? > > Hm, this idea just came to my head. We could use a unsigned 8 bit > per-bucket-id that, together with the tuple, could uniquely identify a > conntrack (and make Patrick sleep with both eyes closed), reduce memory > comsumption (and make Jozsef happier) and fix my problem of the > conntrack table dumping (let Pablo drinks beer calmly). Let the id be at least unsigned 16 bit. No hash function can guarantee that a given bucket won't happen to grow above 256 entries. How are you going to handle id collision due to wraparound? I like the idea! Per bucket id don't destroy what one can gain by per bucket locking ;-). (But the latter would require something more scalable than the unconfirmed list as well...) Best regards, Jozsef - E-mail : kadlec@blackhole.kfki.hu, kadlec@sunserv.kfki.hu PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt Address : KFKI Research Institute for Particle and Nuclear Physics H-1525 Budapest 114, POB. 49, Hungary ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [RFC] alternative to conntrack ID
2005-06-09 13:34 ` Jozsef Kadlecsik
@ 2005-06-10 10:21 ` Pablo Neira
2005-06-13 7:41 ` Jozsef Kadlecsik
0 siblings, 1 reply; 48+ messages in thread
From: Pablo Neira @ 2005-06-10 10:21 UTC (permalink / raw)
To: Jozsef Kadlecsik; +Cc: Netfilter Development Mailinglist, Patrick McHardy

Hi Jozsef,

Jozsef Kadlecsik wrote:
>>Pablo Neira wrote:
>>
>>>All the things I've though so far are burned by something. The cleanest
>>>way to do this looks the ID. Any other ideas?
>>
>>Hm, this idea just came to my head. We could use a unsigned 8 bit
>>per-bucket-id [blah... blah... blah]
>
> Let the id be at least unsigned 16 bit. No hash function can guarantee
> that a given bucket won't happen to grow above 256 entries.
>
> How are you going to handle id collision due to wraparound?

Yes, 8 bits is too short. About the wraparound problem, I'm planning to
re-use ids. The id of a new conntrack will be set to the id of the latest
conntrack inserted in the bucket plus one.

However, this alone wouldn't uniquely identify a conntrack. Say a
connection is established and the latest conntrack in the bucket uses
id A, so its id will be set to A+1. After quite some time the connection
is closed. Then, within a very short period of time, another connection
with the same tuples is established while the latest conntrack id in the
bucket is still A; in that case the id of the new conntrack will be set to
A+1 again.

To avoid that, I could keep an array of the latest ids released per bucket
and set the id based on:

	if (id_of_lastest_ct_inserted > lastest_id_released[bucket])
		id = id_of_lastest_ct_inserted + 1;
	else
		id = lastest_id_released[bucket] + 1;

About memory consumption. On my laptop, ip_conntrack version 2.1 (1787
buckets, 14296 max)

This approach:

1787 * 2 = 3574 extra bytes to store the latest id used
14296 * 2 (extra bytes per conntrack) = 28592 extra bytes (worst case)

Compared with the current u64 id approach:

14296 * 8 (extra bytes per conntrack) = 114368 extra bytes (worst case)

--
Pablo

^ permalink raw reply [flat|nested] 48+ messages in thread
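The allocation rule from the message above can be written out as a small function. This is a minimal sketch assuming 16-bit ids, as Jozsef asked for; `struct bucket_ids` and `next_bucket_id()` are hypothetical names, and locking and the update of the two fields on insert and release are left out.

```c
#include <stdint.h>

/* Hypothetical per-bucket bookkeeping; the field names are illustrative. */
struct bucket_ids {
	uint16_t newest_live;	/* id of the conntrack inserted last in the bucket */
	uint16_t last_released;	/* id of the conntrack released last from the bucket */
};

/*
 * The new id is one more than the larger of the two values, so a
 * connection reborn right after its predecessor was released does not
 * get the predecessor's id back.
 */
static uint16_t next_bucket_id(const struct bucket_ids *b)
{
	uint16_t base = b->newest_live > b->last_released ?
			b->newest_live : b->last_released;

	return (uint16_t)(base + 1);	/* still wraps to 0 after 65535 */
}
```

In the A / A+1 scenario with A = 5, the first connection gets id 6; once 6 has been released, the reborn connection gets 7 instead of 6. The rule only postpones reuse, since the 16-bit value itself still wraps.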
* Re: [RFC] alternative to conntrack ID 2005-06-10 10:21 ` Pablo Neira @ 2005-06-13 7:41 ` Jozsef Kadlecsik 2005-06-14 2:30 ` Pablo Neira 0 siblings, 1 reply; 48+ messages in thread From: Jozsef Kadlecsik @ 2005-06-13 7:41 UTC (permalink / raw) To: Pablo Neira; +Cc: Netfilter Development Mailinglist, Patrick McHardy Hi Pablo, On Fri, 10 Jun 2005, Pablo Neira wrote: > > How are you going to handle id collision due to wraparound? > > yes, 8 bits is too short. About the wraparound problem, I'm planning to > re-use id's. The id of a new conntrack will be set to the lastest > inserted in the bucket plus one. > > However this wouldn't uniquely identify a conntrack: Say a connection is > established, lastest conntrack in the bucket uses id A, so its id will > be set to A+1. After quite some time the connection is closed. Then, in > a very short period of time, another connection with the same tuples is > established and lastest conntrack id is still A, in that case the id of > the new conntrack will be set to A+1 again. Wouldn't be more straightforward to store the last assigned id value in the bucket and simply increment that whenever the next value is used up? (Id collision is actually not a big problem, because the entries are identified by the tuples in the first place.) At dumping we could use the flip-bit solution: entries which are already dumped were marked with the next value of the bit. Of course user requests for dumping must be serialized, but conntrack replication could benefit from such schema, because new entries could be added to the conntrack table and replicated during full conntrack table replication as well. Best regards, Jozsef - E-mail : kadlec@blackhole.kfki.hu, kadlec@sunserv.kfki.hu PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt Address : KFKI Research Institute for Particle and Nuclear Physics H-1525 Budapest 114, POB. 49, Hungary ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [RFC] alternative to conntrack ID
2005-06-13 7:41 ` Jozsef Kadlecsik
@ 2005-06-14 2:30 ` Pablo Neira
2005-06-14 2:42 ` Patrick McHardy
0 siblings, 1 reply; 48+ messages in thread
From: Pablo Neira @ 2005-06-14 2:30 UTC (permalink / raw)
To: Jozsef Kadlecsik; +Cc: Netfilter Development Mailinglist, Patrick McHardy

Hello Jozsef,

Jozsef Kadlecsik wrote:
> On Fri, 10 Jun 2005, Pablo Neira wrote:
>
>
>>>How are you going to handle id collision due to wraparound?
>>
>>yes, 8 bits is too short. About the wraparound problem, I'm planning to
>>re-use id's. The id of a new conntrack will be set to the lastest
>>inserted in the bucket plus one.
>>
>>However this wouldn't uniquely identify a conntrack: Say a connection is
>>established, lastest conntrack in the bucket uses id A, so its id will
>>be set to A+1. After quite some time the connection is closed. Then, in
>>a very short period of time, another connection with the same tuples is
>>established and lastest conntrack id is still A, in that case the id of
>>the new conntrack will be set to A+1 again.
>
>
> Wouldn't be more straightforward to store the last assigned id value in
> the bucket and simply increment that whenever the next value is used up?
> (Id collision is actually not a big problem, because the entries are
> identified by the tuples in the first place.)

Right, but then I'll have to face another problem: once the wraparound
happens, the conntracks inserted in the bucket aren't ordered by id
anymore.

Currently, if the skbuff that is going to be sent to user space via
netlink gets full (it is one page in size), I need to know which conntrack
was processed last, even in the face of race conditions, i.e. the conntrack
expiring while netlink is returning the packet to user space. This is
controlled by the following branch while iterating over the list:

	if (ct->id <= cb->args[1])
		continue;

That's why I came up with the idea of re-using ids: I want to avoid a
wraparound. BTW, inserting conntracks in order isn't a solution either,
since this would break LRU early drop.

> At dumping we could use the flip-bit solution: entries which are already
> dumped were marked with the next value of the bit. Of course user requests
> for dumping must be serialized, but conntrack replication could benefit
> from such schema, because new entries could be added to the conntrack
> table and replicated during full conntrack table replication as well.

Could you elaborate on this flip-bit idea, please? It looks interesting.

I'm still looking for a solution based on simpler logic, we'll see ;).

--
Pablo

^ permalink raw reply [flat|nested] 48+ messages in thread
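Why the `ct->id <= cb->args[1]` branch depends on the list being ordered by id can be shown with a toy model. This is a sketch under simplifying assumptions: a bucket is reduced to a plain array of ids in list order, and `dump_chunk()` is a hypothetical helper, not the ctnetlink dump callback.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Copies into `out` up to `room` ids that are greater than `last`, in
 * list order, and returns how many were copied. `last` plays the role of
 * the checkpoint kept in cb->args[1] between two partial dumps.
 */
static size_t dump_chunk(const uint32_t *ids, size_t n, uint32_t last,
			 uint32_t *out, size_t room)
{
	size_t copied = 0;

	for (size_t i = 0; i < n && copied < room; i++) {
		if (ids[i] <= last)
			continue;	/* treated as already dumped */
		out[copied++] = ids[i];
	}
	return copied;
}
```

With the list `{10, 11, 12, 13}` and a checkpoint of 11, the resumed dump returns 12 and 13. With `{10, 11, 12, 3}`, where 3 stands for a conntrack inserted at the tail after the id counter wrapped, the same checkpoint returns only 12 and the new entry is silently skipped.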
* Re: [RFC] alternative to conntrack ID 2005-06-14 2:30 ` Pablo Neira @ 2005-06-14 2:42 ` Patrick McHardy 2005-06-15 2:41 ` Pablo Neira 2005-06-20 16:04 ` Amin Azez 0 siblings, 2 replies; 48+ messages in thread From: Patrick McHardy @ 2005-06-14 2:42 UTC (permalink / raw) To: Pablo Neira; +Cc: Netfilter Development Mailinglist, Jozsef Kadlecsik On Tue, 14 Jun 2005, Pablo Neira wrote: >> At dumping we could use the flip-bit solution: entries which are already >> dumped were marked with the next value of the bit. Of course user requests >> for dumping must be serialized, but conntrack replication could benefit >> from such schema, because new entries could be added to the conntrack >> table and replicated during full conntrack table replication as well. > > Could you elaborate this idea about the flip-bit solution, please? looks > interesting. You only allow one process to dump the table at a time, In each conntrack entry you flip a bit when dumping it. When continuing you continue with the next entry that has the bit unflipped. This way you don't need an ID at all. You need a timeout to make sure a hung process isn't blocking dumps forever. A malicious acting process could probably still block others for a long time, but dumping the conntrack table should only be possible with root priviliges anyway. When a dump is interrupted the state of the bits is inconsistent, in this case you need to reset all of them to a known state. Regards Patrick ^ permalink raw reply [flat|nested] 48+ messages in thread
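The flip-bit scheme described above can be sketched as follows. This is only an illustration under stated assumptions: the table is a flat array, `struct entry` and `dump_some()` are hypothetical names, and the serialization of dumpers, the timeout for hung processes and the reset after an interrupted dump are not modeled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical table entry; only the fields needed for the sketch. */
struct entry {
	int payload;	/* stands in for the conntrack data sent to user space */
	bool dumped;	/* the per-entry flip bit */
};

/*
 * Dumps up to `room` entries whose bit differs from `target`, setting the
 * bit to `target` as each entry is dumped. Calling it again with the same
 * `target` continues with the entries that are still unflipped, so no id
 * is needed as a checkpoint. The next full pass uses the opposite `target`.
 */
static size_t dump_some(struct entry *tbl, size_t n, bool target,
			int *out, size_t room)
{
	size_t copied = 0;

	for (size_t i = 0; i < n && copied < room; i++) {
		if (tbl[i].dumped == target)
			continue;	/* already dumped in this pass */
		out[copied++] = tbl[i].payload;
		tbl[i].dumped = target;
	}
	return copied;
}
```

Each resume scans from the start of the table and skips flipped entries, which is the price paid for not keeping a position; an entry added during a pass with its bit set to the old value is picked up by a later call of the same pass.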
* Re: [RFC] alternative to conntrack ID
2005-06-14 2:42 ` Patrick McHardy
@ 2005-06-15 2:41 ` Pablo Neira
2005-06-20 16:04 ` Amin Azez
1 sibling, 0 replies; 48+ messages in thread
From: Pablo Neira @ 2005-06-15 2:41 UTC (permalink / raw)
To: Patrick McHardy; +Cc: Netfilter Development Mailinglist, Jozsef Kadlecsik

Patrick McHardy wrote:
> On Tue, 14 Jun 2005, Pablo Neira wrote:
>
>>> At dumping we could use the flip-bit solution: entries which are already
>>> dumped were marked with the next value of the bit. Of course user
>>> requests
>>> for dumping must be serialized, but conntrack replication could benefit
>>> from such schema, because new entries could be added to the conntrack
>>> table and replicated during full conntrack table replication as well.
>>
>>
>> Could you elaborate this idea about the flip-bit solution, please?
>> looks interesting.
>
>
> You only allow one process to dump the table at a time, In each
> conntrack entry you flip a bit when dumping it. When continuing
> you continue with the next entry that has the bit unflipped.
> This way you don't need an ID at all. You need a timeout to make
> sure a hung process isn't blocking dumps forever. A malicious
> acting process could probably still block others for a long time,
> but dumping the conntrack table should only be possible with
> root priviliges anyway. When a dump is interrupted the state of
> the bits is inconsistent, in this case you need to reset all of
> them to a known state.

This would complicate the logic. Moreover, a top-like process dumping the
conntrack table every x seconds could be just such an "evil" process.

I'm going to stick with the idea of using a u64 id. It's the simplest
solution at the moment.

--
Pablo

^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [RFC] alternative to conntrack ID 2005-06-14 2:42 ` Patrick McHardy 2005-06-15 2:41 ` Pablo Neira @ 2005-06-20 16:04 ` Amin Azez 2005-06-20 16:12 ` Patrick McHardy 1 sibling, 1 reply; 48+ messages in thread From: Amin Azez @ 2005-06-20 16:04 UTC (permalink / raw) To: Patrick McHardy; +Cc: Netfilter Development Mailinglist, Jozsef Kadlecsik Patrick McHardy wrote: > On Tue, 14 Jun 2005, Pablo Neira wrote: > >>> At dumping we could use the flip-bit solution: entries which are already >>> dumped were marked with the next value of the bit. Of course user >>> requests >>> for dumping must be serialized, but conntrack replication could benefit >>> from such schema, because new entries could be added to the conntrack >>> table and replicated during full conntrack table replication as well. >> >> >> Could you elaborate this idea about the flip-bit solution, please? >> looks interesting. > > > You only allow one process to dump the table at a time, In each > conntrack entry you flip a bit when dumping it. When continuing > you continue with the next entry that has the bit unflipped. > This way you don't need an ID at all. You need a timeout to make > sure a hung process isn't blocking dumps forever. A malicious > acting process could probably still block others for a long time, > but dumping the conntrack table should only be possible with > root priviliges anyway. When a dump is interrupted the state of > the bits is inconsistent, in this case you need to reset all of > them to a known state. One of my uses for conntrack is for statistics and analysis and to reduce race conditions in taking actions on a particular conntrack. I need some kind of conntrack ID that will be consistent in the medium term accross different conntrack manipulations Amin ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [RFC] alternative to conntrack ID 2005-06-20 16:04 ` Amin Azez @ 2005-06-20 16:12 ` Patrick McHardy 2005-06-22 9:09 ` Amin Azez 0 siblings, 1 reply; 48+ messages in thread From: Patrick McHardy @ 2005-06-20 16:12 UTC (permalink / raw) To: Amin Azez; +Cc: Netfilter Development Mailinglist, Jozsef Kadlecsik Amin Azez wrote: > One of my uses for conntrack is for statistics and analysis and to > reduce race conditions in taking actions on a particular conntrack. > > I need some kind of conntrack ID that will be consistent in the medium > term accross different conntrack manipulations That is why I've always argued in favour of the ID. Since its needed for other reasons too, I suggest to just keep it and get on. Regards Patrick ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [RFC] alternative to conntrack ID 2005-06-20 16:12 ` Patrick McHardy @ 2005-06-22 9:09 ` Amin Azez 2005-06-22 9:30 ` Oscar Mechanic 2005-06-22 17:23 ` Patrick McHardy 0 siblings, 2 replies; 48+ messages in thread From: Amin Azez @ 2005-06-22 9:09 UTC (permalink / raw) To: Patrick McHardy; +Cc: Netfilter Development Mailinglist, Jozsef Kadlecsik Patrick McHardy wrote: >Amin Azez wrote: > > >>One of my uses for conntrack is for statistics and analysis and to >>reduce race conditions in taking actions on a particular conntrack. >> >>I need some kind of conntrack ID that will be consistent in the medium >>term accross different conntrack manipulations >> >> > >That is why I've always argued in favour of the ID. Since its needed for >other reasons too, I suggest to just keep it and get on. > > Err... the current problem is that the conntrack id _may_ be re-used within milli-seconds? I was trying to find a safe conntrack id. Amin ^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [RFC] alternative to conntrack ID
2005-06-22 9:09 ` Amin Azez
@ 2005-06-22 9:30 ` Oscar Mechanic
2005-06-22 17:23 ` Patrick McHardy
1 sibling, 0 replies; 48+ messages in thread
From: Oscar Mechanic @ 2005-06-22 9:30 UTC (permalink / raw)
To: Amin Azez
Cc: Netfilter Development Mailinglist, Patrick McHardy, Jozsef Kadlecsik

I was thinking about this along the lines of using a random number, or a
multiplier or divider on the connection params.

One thought: from ip_conntrack_max and the number of buckets you have an
approximate number of connections that it is feasible to pass, e.g. 32k.
So the conntrack id goes from 0 --> 32k.

These could then be looked at like slots, e.g. if the ID goes over 32k,
start from the bottom again and find an empty slot.

Quite a simple suggestion, which probably eloquently displays that I don't
know what I am talking about. This is not going to be unique for
accounting, and I don't think anything you choose can ensure that, as we
are dealing with a state machine.

On Wed, 2005-06-22 at 10:09 +0100, Amin Azez wrote:
> Patrick McHardy wrote:
>
> >Amin Azez wrote:
> >
> >
> >>One of my uses for conntrack is for statistics and analysis and to
> >>reduce race conditions in taking actions on a particular conntrack.
> >>
> >>I need some kind of conntrack ID that will be consistent in the medium
> >>term accross different conntrack manipulations
> >>
> >>
> >
> >That is why I've always argued in favour of the ID. Since its needed for
> >other reasons too, I suggest to just keep it and get on.
> >
> >
> Err... the current problem is that the conntrack id _may_ be re-used
> within milli-seconds?
> I was trying to find a safe conntrack id.
>
> Amin

^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: [RFC] alternative to conntrack ID
2005-06-22 9:09 ` Amin Azez
2005-06-22 9:30 ` Oscar Mechanic
@ 2005-06-22 17:23 ` Patrick McHardy
2005-07-11 5:41 ` Harald Welte
1 sibling, 1 reply; 48+ messages in thread
From: Patrick McHardy @ 2005-06-22 17:23 UTC (permalink / raw)
To: Amin Azez; +Cc: Netfilter Development Mailinglist, Jozsef Kadlecsik

Amin Azez wrote:
> Err... the current problem is that the conntrack id _may_ be re-used
> within milli-seconds?
> I was trying to find a safe conntrack id.

No, it is 64 bit wide and does not wrap for a long time.

Regards
Patrick
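To put a number on "does not wrap for a long time", here is a small back-of-the-envelope helper. The allocation rate is an assumption supplied by the caller, not a figure from this thread, and the function names are hypothetical.

```c
#include <stdint.h>

/*
 * Seconds until a counter that starts at 0 and counts up to max_id wraps,
 * given an assumed rate of new ids per second.
 */
static uint64_t seconds_until_wrap(uint64_t max_id, uint64_t ids_per_second)
{
	if (ids_per_second == 0)
		return UINT64_MAX;      /* never allocates, never wraps */
	return max_id / ids_per_second;
}

/* Same figure in whole years of 365 days. */
static uint64_t years_until_wrap(uint64_t max_id, uint64_t ids_per_second)
{
	const uint64_t seconds_per_year = 365ULL * 24 * 60 * 60;

	return seconds_until_wrap(max_id, ids_per_second) / seconds_per_year;
}
```

With an assumed one million new conntracks per second, `seconds_until_wrap(UINT32_MAX, 1000000)` gives 4294 seconds (a little over an hour) for a 32-bit id, while `years_until_wrap(UINT64_MAX, 1000000)` gives 584942 years for a 64-bit one.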
* Re: [RFC] alternative to conntrack ID
2005-06-22 17:23 ` Patrick McHardy
@ 2005-07-11 5:41 ` Harald Welte
2005-07-11 7:47 ` Patrick McHardy
0 siblings, 1 reply; 48+ messages in thread
From: Harald Welte @ 2005-07-11 5:41 UTC (permalink / raw)
To: Patrick McHardy
Cc: Netfilter Development Mailinglist, Amin Azez, Jozsef Kadlecsik

On Wed, Jun 22, 2005 at 07:23:20PM +0200, Patrick McHardy wrote:
> Amin Azez wrote:
> > Err... the current problem is that the conntrack id _may_ be re-used
> > within milli-seconds?
> > I was trying to find a safe conntrack id.
>
> No, it is 64 bit wide and does not wrap for a long time.

I'm still not convinced that the ID is a good idea (or that it is needed
in most cases).

However, flow based accounting is basically finished, all that it lacks
is nfnetlink/ctnetlink. So I want to submit them pretty soon for
mainline inclusion.

If you have decided on which form of ID to use, please try to merge those
patches (if any) soon and tell me when I can finalize ctnetlink/nfnetlink
for submission.

Thanks!
--
- Harald Welte <laforge@netfilter.org> http://netfilter.org/
============================================================================
"Fragmentation is like classful addressing -- an interesting early
architectural error that shows how much experimentation was going
on while IP was being designed." -- Paul Vixie
* Re: [RFC] alternative to conntrack ID
2005-07-11 5:41 ` Harald Welte
@ 2005-07-11 7:47 ` Patrick McHardy
2005-07-11 9:50 ` Pablo Neira
0 siblings, 1 reply; 48+ messages in thread
From: Patrick McHardy @ 2005-07-11 7:47 UTC (permalink / raw)
To: Harald Welte
Cc: Netfilter Development Mailinglist, Pablo Neira, Amin Azez, Jozsef Kadlecsik

Harald Welte wrote:
> I'm still not convinced that the ID is a good idea (or that it is needed
> in most cases).
>
> However, flow based accounting is basically finished, all that it lacks
> is nfnetlink/ctnetlink. So I want to submit them pretty soon for
> mainline inclusion.
>
> If you have decided onto which form of ID, please try to merge those patches
> (if any) soon and tell me when I can finalize ctnetlink/nfnetlink for
> submission.

Pablo decided to keep the 64bit ID, mainly because there is no better
alternative for dumping. I don't know about the state, but AFAIK
he is currently reworking the ctnetlink message format to use
nested attributes instead of kernel structures. Unicast communication
also needs to be fixed: right now everything is only broadcast and
userspace needs to filter. It should behave like all other netlink
families. That's all I know of that needs to be done, Pablo probably
has more.

Regards
Patrick
* Re: [RFC] alternative to conntrack ID
2005-07-11 7:47 ` Patrick McHardy
@ 2005-07-11 9:50 ` Pablo Neira
0 siblings, 0 replies; 48+ messages in thread
From: Pablo Neira @ 2005-07-11 9:50 UTC (permalink / raw)
To: Patrick McHardy
Cc: Harald Welte, Netfilter Development Mailinglist, Amin Azez, Jozsef Kadlecsik

Hi!

Patrick McHardy wrote:
> Harald Welte wrote:
>
>>I'm still not convinced that the ID is a good idea (or that it is needed
>>in most cases).
>
> Pablo decided to keep the 64bit ID, mainly there is no better
> alternative for dumping.

Yes, we don't know any reliable way to know from which point the dumping
stopped once the skbuff gets full.

> he is currently reworking the ctnetlink message format to use
> nested attributes instead of kernel structures.

Indeed. The new message format has required tons of changes but it's the
way to go.

> Unicast communication
> also needs to be fixed, right now everything is only broadcasted and
> userspace needs to filter. It should behave like all other netlink
> families. That's all I know of that needs to be done, Pablo probably
> has more.

I expect to send the patches tomorrow, so we could discuss on the code.

--
Pablo
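The dumping problem mentioned above (knowing where to resume once the buffer is full) can be modelled in userspace. This is only a sketch of the general idea, not the ctnetlink implementation: `struct entry`, `struct dump_state` and `dump_chunk` are hypothetical, and the sketch assumes ids start at 1, are unique, and that entries are visited in ascending id order. The saved `last_id` plays the role that `cb->args[0]` plays in the first mail of this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a conntrack entry; only the id matters here. */
struct entry {
	uint64_t id;
};

/* Resume state kept between calls. */
struct dump_state {
	uint64_t last_id;       /* highest id already dumped, 0 = none yet */
};

/*
 * Copy up to cap ids from table[0..n) into out[], skipping entries that
 * were already dumped (id <= last_id).
 * Returns the number of ids written; 0 means the dump is complete.
 */
static size_t dump_chunk(const struct entry *table, size_t n,
			 struct dump_state *st, uint64_t *out, size_t cap)
{
	size_t written = 0;

	for (size_t i = 0; i < n && written < cap; i++) {
		if (table[i].id <= st->last_id)
			continue;       /* dumped in an earlier call */
		out[written++] = table[i].id;
		st->last_id = table[i].id;
	}
	return written;
}
```

Because the resume point is an id rather than a list position, entries removed between two calls do not shift the cursor. The scheme breaks as soon as ids wrap, since new entries with low ids would then be skipped, which is why the width of the id matters for dumping.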
* Re: [RFC] alternative to conntrack ID
2005-06-04 23:52 ` Pablo Neira
2005-06-05 1:02 ` Pablo Neira
@ 2005-06-06 8:17 ` Jozsef Kadlecsik
1 sibling, 0 replies; 48+ messages in thread
From: Jozsef Kadlecsik @ 2005-06-06 8:17 UTC (permalink / raw)
To: Pablo Neira; +Cc: Netfilter Development Mailinglist, Patrick McHardy

On Sun, 5 Jun 2005, Pablo Neira wrote:

> + The unique ID eats 8 extra bytes, since it will be an __u64. On my
> laptop (1787 buckets, 14296 max), that makes 114368 extra bytes (worst
> case).

And if nf_conntrack is submitted in its present form, every IP address in
conntrack will require an extra 12 bytes, which makes an extra 48 bytes
per entry.

Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlec@sunserv.kfki.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : KFKI Research Institute for Particle and Nuclear Physics
          H-1525 Budapest 114, POB. 49, Hungary
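The two memory figures above follow from a single multiplication. A tiny helper makes the comparison explicit; `extra_memory` is a hypothetical name, and applying the 48-byte figure to the same 14296-entry table is an extrapolation for comparison, not a number given in the thread.

```c
#include <stddef.h>

/*
 * Worst-case extra memory, in bytes, when each of max_entries conntrack
 * entries grows by extra_bytes.
 */
static size_t extra_memory(size_t max_entries, size_t extra_bytes)
{
	return max_entries * extra_bytes;
}
```

`extra_memory(14296, 8)` reproduces the 114368 bytes quoted for the `__u64` ID. With 48 extra bytes per entry (12 bytes for each of four addresses, which is how the 48 appears to be derived), the same table would grow by `extra_memory(14296, 48)`, i.e. 686208 bytes, six times the cost of the ID.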
* Re: [RFC] alternative to conntrack ID
2005-05-17 16:12 ` Amin Azez
2005-05-17 20:17 ` Patrick McHardy
@ 2005-05-18 6:45 ` Jozsef Kadlecsik
2005-05-18 7:08 ` Amin Azez
1 sibling, 1 reply; 48+ messages in thread
From: Jozsef Kadlecsik @ 2005-05-18 6:45 UTC (permalink / raw)
To: Amin Azez; +Cc: Netfilter Development Mailinglist

On Tue, 17 May 2005, Amin Azez wrote:

> KOVACS Krisztian wrote:
> > Probably Patrick was referring to a possible problem where the
> > following happens: a new connection is established and destroyed in a
> > very short time. If a new connection with the same tuple is created
> > before the timestamp increases (which is perfectly possible IMHO if you
> > have some slow embedded HW with no high precision timer available)
>
> After further reading I think this scenario is highly unlikely.
>
> I don't mean improbable, I mean, is there any such hardware?
> If a socket is not to be reused until TCP_TIME_WAIT which is recommended
> to be in the region of 4 minutes, is there really any hardware that
> can't time to that resolution? Are there really any devices that will
> re-use a TCP socket in the same timer tick as they closed it?

We have to deal with other protocols as well, not just TCP. For example,
using UDP one could fairly easily trigger the described situation.

Best regards,
Jozsef
* Re: [RFC] alternative to conntrack ID
2005-05-18 6:45 ` Jozsef Kadlecsik
@ 2005-05-18 7:08 ` Amin Azez
2005-05-18 7:17 ` Jozsef Kadlecsik
0 siblings, 1 reply; 48+ messages in thread
From: Amin Azez @ 2005-05-18 7:08 UTC (permalink / raw)
To: Jozsef Kadlecsik; +Cc: Netfilter Development Mailinglist

Jozsef Kadlecsik wrote:

>On Tue, 17 May 2005, Amin Azez wrote:
>
>>KOVACS Krisztian wrote:
>>
>>> Probably Patrick was referring to a possible problem where the
>>>following happens: a new connection is established and destroyed in a
>>>very short time. If a new connection with the same tuple is created
>>>before the timestamp increases (which is perfectly possible IMHO if you
>>>have some slow embedded HW with no high precision timer available)
>>
>>After further reading I think this scenario is highly unlikely.
>>
>>I don't mean improbable, I mean, is there any such hardware?
>>If a socket is not to be reused until TCP_TIME_WAIT which is recommended
>>to be in the region of 4 minutes, is there really any hardware that
>>can't time to that resolution? Are there really any devices that will
>>re-use a TCP socket in the same timer tick as they closed it?
>
>We have to deal with other protocols as well, not just TCP. For example
>using UDP one could fairly easily trigger the described situation.

I think this situation could not be triggered by UDP, as there are no
explicit close sequences for UDP that conntrack recognizes. The
conntrack would only be destroyed after a conntrack timer expires (which
must be larger than the minimum resolution of the timer), so it
becomes impossible to bring up two conntracks with the same tuples in
the same clock tick.

Amin
* Re: [RFC] alternative to conntrack ID
2005-05-18 7:08 ` Amin Azez
@ 2005-05-18 7:17 ` Jozsef Kadlecsik
0 siblings, 0 replies; 48+ messages in thread
From: Jozsef Kadlecsik @ 2005-05-18 7:17 UTC (permalink / raw)
To: Amin Azez; +Cc: Netfilter Development Mailinglist

On Wed, 18 May 2005, Amin Azez wrote:

> >>> Probably Patrick was referring to a possible problem where the
> >>>following happens: a new connection is established and destroyed in a
> >>>very short time. If a new connection with the same tuple is created
> >>>before the timestamp increases (which is perfectly possible IMHO if you
> >>>have some slow embedded HW with no high precision timer available)
> >>
> >>After further reading I think this scenario is highly unlikely.
> >>
> >>I don't mean improbable, I mean, is there any such hardware?
> >>If a socket is not to be reused until TCP_TIME_WAIT which is recommended
> >>to be in the region of 4 minutes, is there really any hardware that
> >>can't time to that resolution? Are there really any devices that will
> >>re-use a TCP socket in the same timer tick as they closed it?
> >
> >We have to deal with other protocols as well, not just TCP. For example
> >using UDP one could fairly easily trigger the described situation.
>
> I think this situation could not be triggered by UDP, as there are no
> explicit close sequences for udp that conntrack recognizes, so the
> conntrack would only be destroyed after a conntrack timer expires (which
> must be larger than the minimum resolution of the timer), therefore it
> becomes impossible to bring up two conntracks with the same tuples in
> the same clock tick.

You're right. That was a broken example.

Best regards,
Jozsef
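The concern running through this subthread is that a tuple plus a creation timestamp identifies a connection only if two incarnations of the same tuple never fall into the same timer tick. A rough model of that condition, with hypothetical helper names and the timer frequency passed in as an assumption:

```c
#include <stdbool.h>
#include <stdint.h>

/* Convert a time in microseconds to ticks of a timer running at hz ticks/s. */
static uint64_t usec_to_ticks(uint64_t usec, uint32_t hz)
{
	return usec * hz / 1000000;
}

/*
 * Two connections with the same tuple would share a (tuple, timestamp)
 * identity if both were created within the same tick.
 */
static bool same_tick(uint64_t created_a_usec, uint64_t created_b_usec,
		      uint32_t hz)
{
	return usec_to_ticks(created_a_usec, hz) ==
	       usec_to_ticks(created_b_usec, hz);
}
```

With an assumed 100 Hz timer a tick lasts 10 ms, so creations at 1 ms and 9 ms collide (`same_tick(1000, 9000, 100)` is true), while at 1000 Hz they do not. Whether a conntrack can actually be destroyed and re-created that quickly is the question debated above.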
* Re: [RFC] alternative to conntrack ID
2005-05-07 22:32 ` Marcus Sundberg
2005-05-09 14:17 ` KOVACS Krisztian
@ 2005-05-11 8:43 ` Amin Azez
1 sibling, 0 replies; 48+ messages in thread
From: Amin Azez @ 2005-05-11 8:43 UTC (permalink / raw)
To: Marcus Sundberg
Cc: Harald Welte, Netfilter Development Mailinglist, Pablo Neira, Patrick McHardy

I re-propose the possibility of using a serial number for conntracks as
an "additional qualifier" (although also unique) to be used by user-space
applications.

This way we keep the efficiency of using the tuple as a hash key to
retrieve the conntrack, and use the serial number to guard retrieval of
the right one.

I think it is clear that although a timestamp may sometimes be useful
in a conntrack, it does not universally solve the problem of identifying
a particular connection over short periods of time, because of the claim
that it may be possible on some platforms to create, destroy and
re-create a conntrack in the same tick.

UDP conntracks cannot be recreated in the same tick because their
destruction is timer based, relating to a period of inactivity.

I think some cases where a TCP conntrack can be re-created in the same
tick are where
1) SO_DONTLINGER/SO_REUSEADDR & friends are used on participating
   originating machines
2) Embedded and other weird network devices re-connect rapidly
3) A different MAC address takes over an IP address

Hacks to overcome this unlikely situation render the whole solution less
attractive than a conntrack serial number.

User-space applications monitoring and manipulating conntracks do need a
more permanent reference to a conntrack that is likely to remain unique
over a timescale of at least a few minutes, so I re-propose a serial
number to this end.

Amin
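The proposal above (find the conntrack by tuple, then use the serial number to check that it is still the same connection) might look roughly like this. It is a simplified userspace sketch under stated assumptions: `struct tuple` and `struct conn` are cut-down hypothetical types, and a linear scan stands in for the real hash lookup.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical tuple; real conntrack tuples carry more fields. */
struct tuple {
	uint32_t src_ip, dst_ip;
	uint16_t src_port, dst_port;
	uint8_t proto;
};

struct conn {
	struct tuple tuple;
	uint64_t serial;        /* taken from a counter at creation time */
	int in_use;
};

static int tuple_equal(const struct tuple *a, const struct tuple *b)
{
	return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
	       a->src_port == b->src_port && a->dst_port == b->dst_port &&
	       a->proto == b->proto;
}

/*
 * Find the entry by tuple, then accept it only if its serial matches the
 * one the caller remembered. Returns NULL if the tuple is unknown or now
 * belongs to a different (re-created) connection.
 */
static struct conn *conn_lookup(struct conn *table, size_t n,
				const struct tuple *t, uint64_t serial)
{
	for (size_t i = 0; i < n; i++) {
		if (table[i].in_use && tuple_equal(&table[i].tuple, t))
			return table[i].serial == serial ? &table[i] : NULL;
	}
	return NULL;
}
```

The tuple still does the work of locating the entry; the serial only guards against acting on a connection that was re-created with the same tuple after the caller last saw it.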
* Re: [RFC] [PATCH] ctnetlink updates
2005-04-29 7:14 ` Jozsef Kadlecsik
2005-04-29 8:02 ` Harald Welte
@ 2005-05-01 23:49 ` Pablo Neira
2005-05-02 10:47 ` Harald Welte
1 sibling, 1 reply; 48+ messages in thread
From: Pablo Neira @ 2005-05-01 23:49 UTC (permalink / raw)
To: Jozsef Kadlecsik
Cc: Harald Welte, Netfilter Development Mailinglist, Patrick McHardy

Jozsef Kadlecsik wrote:
> Looking at the last changes, I think it'd be much better to port
> ip_queue to nfnetlink than to reserve another netlink ID: the hooks in
> nfnetlink are already there. I know that'd create backward compatibility
> issues at the existing queue applications, though... :-(

I was playing around with an experimental port of ip_queue to
nf_queue/nfnetlink during the xmas holidays; it is crap, so I'll probably
start it from scratch.

In this specific case, where we can break third party applications, I
think that we can keep both ip_queue and nf_queue in the kernel tree for
quite some time to ensure backward compatibility.

--
Pablo
* Re: [RFC] [PATCH] ctnetlink updates
2005-05-01 23:49 ` [RFC] [PATCH] ctnetlink updates Pablo Neira
@ 2005-05-02 10:47 ` Harald Welte
0 siblings, 0 replies; 48+ messages in thread
From: Harald Welte @ 2005-05-02 10:47 UTC (permalink / raw)
To: Pablo Neira
Cc: Netfilter Development Mailinglist, Patrick McHardy, Jozsef Kadlecsik

On Mon, May 02, 2005 at 01:49:38AM +0200, Pablo Neira wrote:
> In this specific case where we can break third party applications, to ensure
> backward compatibility, I think that we can keep both ip_queue and
> nf_queue in kernel tree for quite some time.

Yes. Also, as long as the libipq API can be offered to applications, I
don't see that much of an issue.

--
- Harald Welte <laforge@netfilter.org> http://netfilter.org/