All of lore.kernel.org
 help / color / mirror / Atom feed
* priv_data patch
@ 2006-08-14 13:34 Patrick McHardy
  2006-08-14 14:25 ` Joakim Axelsson
                   ` (2 more replies)
  0 siblings, 3 replies; 31+ messages in thread
From: Patrick McHardy @ 2006-08-14 13:34 UTC (permalink / raw)
  To: Massimiliano Hofer; +Cc: Netfilter Development Mailinglist

I'm afraid I have some bad news ..

While merging the priv_data patch I noticed an oversight. Currently,
when modifying the ruleset, all modules dump their entire state
(user configuration + internal state kept in the same structure)
to userspace, which will return it to the kernel. That means for
example that the limit match will not loose its current state when
modifying other rules. When we move the state out of the data shared
with userspace this can't be done anymore, so each modification to
the table will cause all modules to loose their current state, even
if they we're not directly affected by the change. We can't break
this behaviour, so this limits potential users of the priv_data stuff
to things like hashlimit or recent, which do a lookup of state stored
completely external from the ruleset (and could use it to cache the
lookup result). I don't think that this is worth it, we probably need
to wait until we have a better userspace interface before we can do
something like this ..

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: priv_data patch
  2006-08-14 13:34 priv_data patch Patrick McHardy
@ 2006-08-14 14:25 ` Joakim Axelsson
  2006-08-14 14:31   ` Patrick McHardy
  2006-08-14 14:40 ` Massimiliano Hofer
       [not found] ` <200608141557.35918.max@nucleus.it>
  2 siblings, 1 reply; 31+ messages in thread
From: Joakim Axelsson @ 2006-08-14 14:25 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Massimiliano Hofer, Netfilter Development Mailinglist

2006-08-14 15:34:05+0200, Patrick McHardy <kaber@trash.net> ->
> I'm afraid I have some bad news ..
> 
> While merging the priv_data patch I noticed an oversight. Currently,
> when modifying the ruleset, all modules dump their entire state
> (user configuration + internal state kept in the same structure)
> to userspace, which will return it to the kernel. That means for
> example that the limit match will not loose its current state when
> modifying other rules. When we move the state out of the data shared
> with userspace this can't be done anymore, so each modification to
> the table will cause all modules to loose their current state, even
> if they we're not directly affected by the change. We can't break
> this behaviour, so this limits potential users of the priv_data stuff
> to things like hashlimit or recent, which do a lookup of state stored
> completely external from the ruleset (and could use it to cache the
> lookup result). I don't think that this is worth it, we probably need
> to wait until we have a better userspace interface before we can do
> something like this ..
> 

I do not completly understand you. Today a modification of ONE rule will or
will not trigger the checkentry()/init() of ALL rules? 

I know they did before (in 2.4) since modules i have written has code to
workaround this. Having a low limiter like say a few packets each 5min can't
just be reset each time we modify another unrelated rule.

Latly howver it seams as it doesn't? What do you mean we are breaking with
this patch? A match/target doesn't have to use this new data area. Just let
don't alter them and they will continue to act aas they always done? We will
however provide better tools for new modules (not yet in pom-ng).

However, a problem that i just mailed about in another thread still exists.

The problem with limit acting up can be solved with my new module "lim". Its
so complety different that a new name is needed rather than a revision.
Solving the problem. Im currently porting it now.

Or did i missunderstand you completly?

--
Joakim Axelsson

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: priv_data patch
  2006-08-14 14:25 ` Joakim Axelsson
@ 2006-08-14 14:31   ` Patrick McHardy
  2006-08-14 15:20     ` Joakim Axelsson
  0 siblings, 1 reply; 31+ messages in thread
From: Patrick McHardy @ 2006-08-14 14:31 UTC (permalink / raw)
  To: Joakim Axelsson; +Cc: Massimiliano Hofer, Netfilter Development Mailinglist

Joakim Axelsson wrote:
> 2006-08-14 15:34:05+0200, Patrick McHardy <kaber@trash.net> ->
> 
>>I'm afraid I have some bad news ..
>>
>>[...]
>
> I do not completly understand you. Today a modification of ONE rule will or
> will not trigger the checkentry()/init() of ALL rules? 

Yes it will. Modification happens like this:

- dump entire table to userspace
- modify table
- send new table to kernel

_All_ matches and target and reinstantiated, since the kernel doesn't
know which rule in the currently active table corresponds to which
in the new table. When moving state out of the data shared with
userspace it will get lost during this.

> I know they did before (in 2.4) since modules i have written has code to
> workaround this. Having a low limiter like say a few packets each 5min can't
> just be reset each time we modify another unrelated rule.

Exactly.

> Latly howver it seams as it doesn't? What do you mean we are breaking with
> this patch? A match/target doesn't have to use this new data area. Just let
> don't alter them and they will continue to act aas they always done? We will
> however provide better tools for new modules (not yet in pom-ng).

Well, if nobody can use it reasonable there is no reason to introduce
it.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: priv_data patch
  2006-08-14 13:34 priv_data patch Patrick McHardy
  2006-08-14 14:25 ` Joakim Axelsson
@ 2006-08-14 14:40 ` Massimiliano Hofer
  2006-08-14 14:48   ` Patrick McHardy
       [not found] ` <200608141557.35918.max@nucleus.it>
  2 siblings, 1 reply; 31+ messages in thread
From: Massimiliano Hofer @ 2006-08-14 14:40 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Netfilter Development Mailinglist

On Monday 14 August 2006 3:34 pm, you wrote:

> While merging the priv_data patch I noticed an oversight. Currently,
> when modifying the ruleset, all modules dump their entire state
> (user configuration + internal state kept in the same structure)
> to userspace, which will return it to the kernel. That means for
> example that the limit match will not loose its current state when
> modifying other rules. When we move the state out of the data shared
> with userspace this can't be done anymore, so each modification to
> the table will cause all modules to loose their current state, even
> if they we're not directly affected by the change. We can't break
> this behaviour, so this limits potential users of the priv_data stuff
> to things like hashlimit or recent, which do a lookup of state stored
> completely external from the ruleset (and could use it to cache the
> lookup result). I don't think that this is worth it, we probably need
> to wait until we have a better userspace interface before we can do
> something like this ..

I'm afraid you are right. This limits the usefulness to volatile data that can 
safely be discarded. This can be some sort of disposable statistic or 
performance enhancing cache of data that can be retrieved in other ways.

Without this patch we are condemned to ugly tricks for data needed by 
condition, hashlimit and recent. I think this is useful, but I'll leave it to 
you to decide if it's worth.

Any idea for a better userspace interface?
It's not the first time you tell me that we could have a better "next 
generation" userspace interface. Maybe it's time to start planning.
Does anyone have wishes for new or different ways to do things?

Just an example without proper thought or planning: we could set an optional 
way rules could use to tag themselves and have their data back if they want 
it. As with priv_data this won't benefit everyone. I'll keep thinking of 
better ways.

IMHO a portion of data outside the one passed by userspace (persistent or 
volatile) is a must in the long run and will free us from an arbitrary 
constraint between userspace and kernelspace. I see other people are writing 
matches that rely on separate user input to complete its data (interface 
groups?) and they will need somewhere to store it.

-- 
Saluti,
   Massimiliano Hofer
        Nucleus

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: priv_data patch
  2006-08-14 14:40 ` Massimiliano Hofer
@ 2006-08-14 14:48   ` Patrick McHardy
  2006-08-14 14:58     ` Joakim Axelsson
  2006-08-14 16:19     ` Massimiliano Hofer
  0 siblings, 2 replies; 31+ messages in thread
From: Patrick McHardy @ 2006-08-14 14:48 UTC (permalink / raw)
  To: Massimiliano Hofer; +Cc: Netfilter Development Mailinglist

Massimiliano Hofer wrote:
> On Monday 14 August 2006 3:34 pm, you wrote:
> 
>>[...]
>
> I'm afraid you are right. This limits the usefulness to volatile data that can 
> safely be discarded. This can be some sort of disposable statistic or 
> performance enhancing cache of data that can be retrieved in other ways.

Agreed.

> Without this patch we are condemned to ugly tricks for data needed by 
> condition, hashlimit and recent. I think this is useful, but I'll leave it to 
> you to decide if it's worth.

Hmm .. recent does a table lookup during runtime and the table could be
cached. That would improve things a bit, but in my opinion not enough
to justify this patch. Same for hashlimit. What data would condition
store exactly?

> Any idea for a better userspace interface?
> It's not the first time you tell me that we could have a better "next 
> generation" userspace interface. Maybe it's time to start planning.
> Does anyone have wishes for new or different ways to do things?


Its actually quite clear what is needed. We want a userspace interface
built on netlink, that acts on individual rules, not entire rulesets.
There are a few more ideas, like handling negation centrally, allowing
userspace to specify whether a target is terminal or not, allow multiple
non-terminal targets in a row, etc, but nothing really fundamental.

> Just an example without proper thought or planning: we could set an optional 
> way rules could use to tag themselves and have their data back if they want 
> it. As with priv_data this won't benefit everyone. I'll keep thinking of 
> better ways.


We want to get rid of the atomic table replacements entirely.

> IMHO a portion of data outside the one passed by userspace (persistent or 
> volatile) is a must in the long run and will free us from an arbitrary 
> constraint between userspace and kernelspace. I see other people are writing 
> matches that rely on separate user input to complete its data (interface 
> groups?) and they will need somewhere to store it.


Once we stop replacing entire rulesets and move to a finer grained level
this problem will be gone, state will be kept for all rules except the
ones affected. Using netlink attributes will also allow us to flexibly
enhance the interface as needed.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: priv_data patch
  2006-08-14 14:48   ` Patrick McHardy
@ 2006-08-14 14:58     ` Joakim Axelsson
  2006-08-14 15:05       ` Patrick McHardy
  2006-08-14 16:19     ` Massimiliano Hofer
  1 sibling, 1 reply; 31+ messages in thread
From: Joakim Axelsson @ 2006-08-14 14:58 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Massimiliano Hofer, Netfilter Development Mailinglist

2006-08-14 16:48:27+0200, Patrick McHardy <kaber@trash.net> ->
> > Any idea for a better userspace interface?
> > It's not the first time you tell me that we could have a better "next 
> > generation" userspace interface. Maybe it's time to start planning.
> > Does anyone have wishes for new or different ways to do things?
> 
> 
> Its actually quite clear what is needed. We want a userspace interface
> built on netlink, that acts on individual rules, not entire rulesets.
> There are a few more ideas, like handling negation centrally, allowing
> userspace to specify whether a target is terminal or not, allow multiple
> non-terminal targets in a row, etc, but nothing really fundamental.
> 

I have suggested this some years ago. But a new module type "action" could be
used, along with "match" and "target". Meaning:
1. After zero, one or more matches
2. You run a zero, one or more actions
3. And finally end up in zero or one target.

Example:
iptables -m condition -m limit -a LOG -j DROP 

This means that the only targets we have today (as i can remember now) are:
-j ACCEPT
-j DROP
-j REJECT
-j other_chain

As of now i have a few modules cheating this, being a match when they should
be a target. A match always matching. This to lower the number of rules
needed to perform some things.

--
Joakim Axelsson

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: priv_data patch
  2006-08-14 14:58     ` Joakim Axelsson
@ 2006-08-14 15:05       ` Patrick McHardy
  0 siblings, 0 replies; 31+ messages in thread
From: Patrick McHardy @ 2006-08-14 15:05 UTC (permalink / raw)
  To: Joakim Axelsson; +Cc: Massimiliano Hofer, Netfilter Development Mailinglist

Joakim Axelsson wrote:
> 2006-08-14 16:48:27+0200, Patrick McHardy <kaber@trash.net> ->
> 
>>Its actually quite clear what is needed. We want a userspace interface
>>built on netlink, that acts on individual rules, not entire rulesets.
>>There are a few more ideas, like handling negation centrally, allowing
>>userspace to specify whether a target is terminal or not, allow multiple
>>non-terminal targets in a row, etc, but nothing really fundamental.
>>
> 
> 
> I have suggested this some years ago. But a new module type "action" could be
> used, along with "match" and "target". Meaning:
> 1. After zero, one or more matches
> 2. You run a zero, one or more actions
> 3. And finally end up in zero or one target.
> 
> Example:
> iptables -m condition -m limit -a LOG -j DROP 
> 
> This means that the only targets we have today (as i can remember now) are:
> -j ACCEPT
> -j DROP
> -j REJECT
> -j other_chain


Yes, we clearly want something like that. The exact details need to be
worked out when actually implementing it.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: priv_data patch
       [not found]     ` <200608141702.50753.max@nucleus.it>
@ 2006-08-14 15:14       ` Patrick McHardy
  0 siblings, 0 replies; 31+ messages in thread
From: Patrick McHardy @ 2006-08-14 15:14 UTC (permalink / raw)
  To: Massimiliano Hofer; +Cc: Netfilter Development Mailinglist

Massimiliano Hofer wrote:
> On Monday 14 August 2006 4:37 pm, you wrote:
> 
>>Hmm .. recent does a table lookup during runtime and the table could be
>>cached. That would improve things a bit, but in my opinion not enough
>>to justify this patch. Same for hashlimit. What data would condition
>>store exactly?
> 
> 
> I need a pointer to per condition data, so that multiple rules with the same 
> name refer to the same flag.
> I can break userspace compatibility and store a pointer in the userspace 
> structure. I just thought this could be useful to everyone (and let me 
> maintain userspace compatibility along the way).


That looks like the only valid type of usage. Which means your initial
implementation, which just provided space for a pointer to the
individual instances, might have been the better way. I need to think
about this some more and look at the modules that could make use of
this again.

>>Its actually quite clear what is needed. We want a userspace interface
>>built on netlink, that acts on individual rules, not entire rulesets.
>>There are a few more ideas, like handling negation centrally, allowing
>>userspace to specify whether a target is terminal or not, allow multiple
>>non-terminal targets in a row, etc, but nothing really fundamental.
> 
> 
> I thought the current way of doing things was specifically designed to 
> minimize softirq locking (especially with arbitarily long chains and 
> arbitrary initialization code). We could switch to RCU lists, though...


Yes, it should be possible to do lockless ruleset evaluation (at least
on the ruleset level, some modules will still need locking).

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: priv_data patch
  2006-08-14 14:31   ` Patrick McHardy
@ 2006-08-14 15:20     ` Joakim Axelsson
  2006-08-14 15:28       ` Patrick McHardy
                         ` (2 more replies)
  0 siblings, 3 replies; 31+ messages in thread
From: Joakim Axelsson @ 2006-08-14 15:20 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Massimiliano Hofer, Netfilter Development Mailinglist

2006-08-14 16:31:34+0200, Patrick McHardy <kaber@trash.net> ->
> Joakim Axelsson wrote:
> > 2006-08-14 15:34:05+0200, Patrick McHardy <kaber@trash.net> ->
> > 
> >>I'm afraid I have some bad news ..
> >>
> >>[...]
> >
> > I do not completly understand you. Today a modification of ONE rule will or
> > will not trigger the checkentry()/init() of ALL rules? 
> 
> Yes it will. Modification happens like this:
> 
> - dump entire table to userspace
> - modify table
> - send new table to kernel
> 
> _All_ matches and target and reinstantiated, since the kernel doesn't
> know which rule in the currently active table corresponds to which
> in the new table. When moving state out of the data shared with
> userspace it will get lost during this.
> 

Lost? Like the memory will be reallocated and we have a memory leak from the
old priv_data?

Can't we just figure out if thie pointer is null and don't allocate new
memory?

Or am i lost here?

> > I know they did before (in 2.4) since modules i have written has code to
> > workaround this. Having a low limiter like say a few packets each 5min can't
> > just be reset each time we modify another unrelated rule.
> 
> Exactly.
> 
> > Latly howver it seams as it doesn't? What do you mean we are breaking with
> > this patch? A match/target doesn't have to use this new data area. Just let
> > don't alter them and they will continue to act aas they always done? We will
> > however provide better tools for new modules (not yet in pom-ng).
> 
> Well, if nobody can use it reasonable there is no reason to introduce
> it.

Alot of my patches can use it. Not having todo an ugly solution trying to
sneak away from being reseted when another rule is altered. I sure would
like to have it added. Simpyl do not change for example -m limit into using
it if it breaks the "feature" of reseting its state then altering another
unrelated rule.

Please have a look here for 4 modules "needing" this patch:
http://www.gozem.se/~gozem/netfilter/

I'm copying here the code they are using today to workaround this
reset-"feature":

struct info {
	... data here ...
	atomic_t refcount;
};

init() {
	/* Already initiated? Since this is runned each time ANY rule is changed */
	if (lim->state != NULL) {
		/* Increase the reference counter so we wont delete this match */
		atomic_inc( &lim->state->refcount );

		DEBUGPRINT("already initiated, abort ref=%u", 
		   atomic_read( &lim->state->refcount) );

		return 1;
	}
		
	/* init state data, set refcount to 1 */
	lim->state = kmalloc( sizeof(struct ipt_lim_state), GFP_ATOMIC );
	if (lim->state == NULL)
		return -ENOMEM;
	atomic_set(&lim->state->refcount, 1);
}

destroy() {
	/* Decrease our reference counter and test if its zero*/
	if ( atomic_dec_and_test(&lim->state->refcount) ) {
		/* Really delete this match */
		DEBUGPRINTP("really delete");

		/* free state */
		kfree(lim->state);
	}
}


--
Joakim Axelsson

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: priv_data patch
  2006-08-14 15:20     ` Joakim Axelsson
@ 2006-08-14 15:28       ` Patrick McHardy
  2006-08-14 16:04         ` Joakim Axelsson
  2006-08-14 15:31       ` Patrick McHardy
  2006-08-14 15:53       ` Massimiliano Hofer
  2 siblings, 1 reply; 31+ messages in thread
From: Patrick McHardy @ 2006-08-14 15:28 UTC (permalink / raw)
  To: Joakim Axelsson; +Cc: Massimiliano Hofer, Netfilter Development Mailinglist

Joakim Axelsson wrote:
> 2006-08-14 16:31:34+0200, Patrick McHardy <kaber@trash.net> ->
> 
>
>>>I do not completly understand you. Today a modification of ONE rule will or
>>>will not trigger the checkentry()/init() of ALL rules? 
>>
>>Yes it will. Modification happens like this:
>>
>>- dump entire table to userspace
>>- modify table
>>- send new table to kernel
>>
>>_All_ matches and target and reinstantiated, since the kernel doesn't
>>know which rule in the currently active table corresponds to which
>>in the new table. When moving state out of the data shared with
>>userspace it will get lost during this.
>>
> 
> Lost? Like the memory will be reallocated and we have a memory leak from the
> old priv_data?

No, the contents will be lost since the allocated memory belonging to
the old table will get freed and new memory is allocated for the new
table.

> Can't we just figure out if thie pointer is null and don't allocate new
> memory?
> 
> Or am i lost here?

It won't be non-NULL since we're always initializing a new table from
the kernels POV.

>>Well, if nobody can use it reasonable there is no reason to introduce
>>it.
> 
> 
> Alot of my patches can use it. Not having todo an ugly solution trying to
> sneak away from being reseted when another rule is altered. I sure would
> like to have it added. Simpyl do not change for example -m limit into using
> it if it breaks the "feature" of reseting its state then altering another
> unrelated rule.
> 
> Please have a look here for 4 modules "needing" this patch:
> http://www.gozem.se/~gozem/netfilter/

Please post your examples to the list.

> I'm copying here the code they are using today to workaround this
> reset-"feature":
> 
> struct info {
> 	... data here ...
> 	atomic_t refcount;
> };
> 
> init() {
> 	/* Already initiated? Since this is runned each time ANY rule is changed */
> 	if (lim->state != NULL) {


I'm not sure I understand what you're trying to show here,
but I assume its some kind of shared state between multiple
instances of your "lim" match. the first question would be:
where does the state pointer get its value from here?
You can't rely on userspace passing back a valid pointer,
this is questionable today (CAP_NET_ADMIN might crash the
box), but its a huge bug once you consider things like
OpenVZ.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: priv_data patch
  2006-08-14 15:20     ` Joakim Axelsson
  2006-08-14 15:28       ` Patrick McHardy
@ 2006-08-14 15:31       ` Patrick McHardy
  2006-08-14 15:40         ` Joakim Axelsson
  2006-08-14 15:53       ` Massimiliano Hofer
  2 siblings, 1 reply; 31+ messages in thread
From: Patrick McHardy @ 2006-08-14 15:31 UTC (permalink / raw)
  To: Joakim Axelsson; +Cc: Massimiliano Hofer, Netfilter Development Mailinglist

Joakim Axelsson wrote:
> Alot of my patches can use it. Not having todo an ugly solution trying to
> sneak away from being reseted when another rule is altered. I sure would
> like to have it added. Simpyl do not change for example -m limit into using
> it if it breaks the "feature" of reseting its state then altering another
> unrelated rule.

I forgot to reply to this. You seem to misunderstand, limit doesn't
reset its state today. It will when moving private data out of the
structures shared with userspace. Same for all other users of this,
they will "forget" their state on each ruleset change.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: priv_data patch
  2006-08-14 15:31       ` Patrick McHardy
@ 2006-08-14 15:40         ` Joakim Axelsson
  2006-08-14 15:46           ` Patrick McHardy
  0 siblings, 1 reply; 31+ messages in thread
From: Joakim Axelsson @ 2006-08-14 15:40 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Massimiliano Hofer, Netfilter Development Mailinglist

2006-08-14 17:31:18+0200, Patrick McHardy <kaber@trash.net> ->
> Joakim Axelsson wrote:
> > Alot of my patches can use it. Not having todo an ugly solution trying to
> > sneak away from being reseted when another rule is altered. I sure would
> > like to have it added. Simpyl do not change for example -m limit into using
> > it if it breaks the "feature" of reseting its state then altering another
> > unrelated rule.
> 
> I forgot to reply to this. You seem to misunderstand, limit doesn't
> reset its state today. It will when moving private data out of the
> structures shared with userspace. Same for all other users of this,
> they will "forget" their state on each ruleset change.

Okie, now I get it. This seams to have changed from 2.4 then. As altering
one unrelated rule will trigger the checkentry for _all_ rules. The code i
posted was a (somewhat ugly) workaround for this, and yes relying on
userspace not altering a kernel-space pointer for us. However, the case is
the same for xt_limit with r->master = r; (and quota). Alter master in
userspace after the limit rule has been initiated and you will get some
really nasty result.

--
Joakim Axelsson

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: priv_data patch
  2006-08-14 15:40         ` Joakim Axelsson
@ 2006-08-14 15:46           ` Patrick McHardy
  2006-08-14 15:56             ` Joakim Axelsson
  0 siblings, 1 reply; 31+ messages in thread
From: Patrick McHardy @ 2006-08-14 15:46 UTC (permalink / raw)
  To: Joakim Axelsson; +Cc: Massimiliano Hofer, Netfilter Development Mailinglist

Joakim Axelsson wrote:
> 2006-08-14 17:31:18+0200, Patrick McHardy <kaber@trash.net> ->
> 
>>Joakim Axelsson wrote:
>>
>>>Alot of my patches can use it. Not having todo an ugly solution trying to
>>>sneak away from being reseted when another rule is altered. I sure would
>>>like to have it added. Simpyl do not change for example -m limit into using
>>>it if it breaks the "feature" of reseting its state then altering another
>>>unrelated rule.
>>
>>I forgot to reply to this. You seem to misunderstand, limit doesn't
>>reset its state today. It will when moving private data out of the
>>structures shared with userspace. Same for all other users of this,
>>they will "forget" their state on each ruleset change.
> 
> 
> Okie, now I get it. This seams to have changed from 2.4 then.

No, this behaviour has been there since the beginning.

> As altering
> one unrelated rule will trigger the checkentry for _all_ rules. The code i
> posted was a (somewhat ugly) workaround for this, and yes relying on
> userspace not altering a kernel-space pointer for us. However, the case is
> the same for xt_limit with r->master = r; (and quota). Alter master in
> userspace after the limit rule has been initiated and you will get some
> really nasty result.


Thats not true, the master pointer is reinitialized on every change by
the checkentry function (which, as you note, is called on all rules for
every change). The simple reason why it keeps its current state is
because it is dumped to userspace and echoed back. If you move it out of
the structure shared with userspace, this can not happen anymore.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: priv_data patch
  2006-08-14 15:20     ` Joakim Axelsson
  2006-08-14 15:28       ` Patrick McHardy
  2006-08-14 15:31       ` Patrick McHardy
@ 2006-08-14 15:53       ` Massimiliano Hofer
  2 siblings, 0 replies; 31+ messages in thread
From: Massimiliano Hofer @ 2006-08-14 15:53 UTC (permalink / raw)
  To: Joakim Axelsson; +Cc: Netfilter Development Mailinglist, Patrick McHardy

On Monday 14 August 2006 5:20 pm, Joakim Axelsson wrote:

> Lost? Like the memory will be reallocated and we have a memory leak from
> the old priv_data?

No.
Every time a modification is done a complete new chain is created and 
initialized, swapped for the current one and then the old one is removed.
This means that any memory needed by priv_data is reallocated for the new 
chain end than freed while destroying the old one.
My patch doesn't care since I keep a global list where I can fetch my data 
back using the condition name and use priv_data only as a way to keep it at 
hand and achieve O(1) performance.
If you don't have an alternative means to retrieve your data I'm afraid you'r 
bust. :(
I agree with Patrick and I think it's time to think about other ways to manage 
changes.

-- 
Saluti,
   Massimiliano Hofer
        Nucleus

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: priv_data patch
  2006-08-14 15:46           ` Patrick McHardy
@ 2006-08-14 15:56             ` Joakim Axelsson
  2006-08-14 16:01               ` Patrick McHardy
  0 siblings, 1 reply; 31+ messages in thread
From: Joakim Axelsson @ 2006-08-14 15:56 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Massimiliano Hofer, Netfilter Development Mailinglist

2006-08-14 17:46:59+0200, Patrick McHardy <kaber@trash.net> ->
> Joakim Axelsson wrote:
> > 2006-08-14 17:31:18+0200, Patrick McHardy <kaber@trash.net> ->
> > 
> >>Joakim Axelsson wrote:
> >>
> >>>Alot of my patches can use it. Not having todo an ugly solution trying to
> >>>sneak away from being reseted when another rule is altered. I sure would
> >>>like to have it added. Simpyl do not change for example -m limit into using
> >>>it if it breaks the "feature" of reseting its state then altering another
> >>>unrelated rule.
> >>
> >>I forgot to reply to this. You seem to misunderstand, limit doesn't
> >>reset its state today. It will when moving private data out of the
> >>structures shared with userspace. Same for all other users of this,
> >>they will "forget" their state on each ruleset change.
> > 
> > 
> > Okie, now I get it. This seams to have changed from 2.4 then.
> 
> No, this behaviour has been there since the beginning.
> 
> > As altering
> > one unrelated rule will trigger the checkentry for _all_ rules. The code i
> > posted was a (somewhat ugly) workaround for this, and yes relying on
> > userspace not altering a kernel-space pointer for us. However, the case is
> > the same for xt_limit with r->master = r; (and quota). Alter master in
> > userspace after the limit rule has been initiated and you will get some
> > really nasty result.
> 
> 
> Thats not true, the master pointer is reinitialized on every change by
> the checkentry function (which, as you note, is called on all rules for
> every change). The simple reason why it keeps its current state is
> because it is dumped to userspace and echoed back. If you move it out of
> the structure shared with userspace, this can not happen anymore.

I think we define reinitiated differently then. I define reinitated by
checkentry() being runned even tho the match/rule isn't a new (or altered)
match/rule. 

If checkentry() is runned for _all_ matches/targers everytime an unrelated
rule is altered then limit will lose its state because the code resets the
bucket and refills it. Meaning that if I have a limit of 1 packet each hour
and I change an (unrelated) rule every 30mins the limit will be a limit of 1
packet every 30min instead. The (ugly) workaround i posted will solve this
issue by keeping track if it has been running the checkentry() before. This
is where the priv_data is needed.

But if you say that the priv_data pointer will be lost for all rules on any
alter of rules its kinda void to have it. However a solution keeping that
priv_data pointer intact could be very useful. 

One reason for not posting my patches is this uglyness where i have to rely
on kernel pointers not being altered by userspace.

--
Joakim Axelsson

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: priv_data patch
  2006-08-14 15:56             ` Joakim Axelsson
@ 2006-08-14 16:01               ` Patrick McHardy
  2006-08-14 16:13                 ` Joakim Axelsson
  0 siblings, 1 reply; 31+ messages in thread
From: Patrick McHardy @ 2006-08-14 16:01 UTC (permalink / raw)
  To: Joakim Axelsson; +Cc: Massimiliano Hofer, Netfilter Development Mailinglist

Joakim Axelsson wrote:
> 2006-08-14 17:46:59+0200, Patrick McHardy <kaber@trash.net> ->
> 
>>Thats not true, the master pointer is reinitialized on every change by
>>the checkentry function (which, as you note, is called on all rules for
>>every change). The simple reason why it keeps its current state is
>>because it is dumped to userspace and echoed back. If you move it out of
>>the structure shared with userspace, this can not happen anymore.
> 
> 
> I think we define reinitiated differently then. I define reinitated by
> checkentry() being runned even tho the match/rule isn't a new (or altered)
> match/rule. 
> 
> If checkentry() is runned for _all_ matches/targers everytime an unrelated
> rule is altered then limit will lose its state because the code resets the
> bucket and refills it.


You're right. That is actually a bug in my opinion.

> Meaning that if I have a limit of 1 packet each hour
> and I change an (unrelated) rule every 30mins the limit will be a limit of 1
> packet every 30min instead. The (ugly) workaround i posted will solve this
> issue by keeping track if it has been running the checkentry() before. This
> is where the priv_data is needed.


No, we can simply check if f.e. credit is non-zero.

> But if you say that the priv_data pointer will be lost for all rules on any
> alter of rules its kinda void to have it. However a solution keeping that
> priv_data pointer intact could be very useful. 


That would mean the kernel needs to associate new rules with rules in
the old table (or the individual modules itself need to do it somehow,
like hashlimit or recent). This is really getting too ugly to even
consider in my opinion, especially since the obvious solution of not
doing an atomic table exchange also solves a lot of other problems.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: priv_data patch
  2006-08-14 15:28       ` Patrick McHardy
@ 2006-08-14 16:04         ` Joakim Axelsson
  2006-08-14 16:13           ` Patrick McHardy
  0 siblings, 1 reply; 31+ messages in thread
From: Joakim Axelsson @ 2006-08-14 16:04 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Massimiliano Hofer, Netfilter Development Mailinglist

2006-08-14 17:28:19+0200, Patrick McHardy <kaber@trash.net> ->
> Joakim Axelsson wrote:
> > 2006-08-14 16:31:34+0200, Patrick McHardy <kaber@trash.net> ->
> > 
> >
> >>>I do not completly understand you. Today a modification of ONE rule will or
> >>>will not trigger the checkentry()/init() of ALL rules? 
> >>
> >>Yes it will. Modification happens like this:
> >>
> >>- dump entire table to userspace
> >>- modify table
> >>- send new table to kernel
> >>
> >>_All_ matches and target and reinstantiated, since the kernel doesn't
> >>know which rule in the currently active table corresponds to which
> >>in the new table. When moving state out of the data shared with
> >>userspace it will get lost during this.
> >>
> > 
> > Lost? Like the memory will be reallocated and we have a memory leak from the
> > old priv_data?
> 
> No, the contents will be lost since the allocated memory belonging to
> the old table will get freed and new memory is allocated for the new
> table.
> 
> > Can't we just figure out if thie pointer is null and don't allocate new
> > memory?
> > 
> > Or am i lost here?
> 
> It won't be non-NULL since we're always initializing a new table from
> the kernels POV.
> 

This was a hard one. Seams then the only solution for this is to keep some
global state in a list and for every checkentry find your entry in it and
cache it in info (or priv_data if we apply the patch).


> >>Well, if nobody can use it reasonable there is no reason to introduce
> >>it.
> > 
> > 
> > Alot of my patches can use it. Not having todo an ugly solution trying to
> > sneak away from being reseted when another rule is altered. I sure would
> > like to have it added. Simpyl do not change for example -m limit into using
> > it if it breaks the "feature" of reseting its state then altering another
> > unrelated rule.
> > 
> > Please have a look here for 4 modules "needing" this patch:
> > http://www.gozem.se/~gozem/netfilter/
> 
> Please post your examples to the list.
> 

In which format? As a full patch for the kernel or something that fits in
pom-ng? The current code works for 2.4 and early 2.6. I don't want to spend
time porting it into "wrong" API now when we are discussing priv_data to be
or not to be :-)

> > I'm copying here the code they are using today to workaround this
> > reset-"feature":
> > 
> > struct info {
> > 	... data here ...
> > 	atomic_t refcount;
> > };
> > 
> > init() {
> > 	/* Already initiated? Since this is runned each time ANY rule is changed */
> > 	if (lim->state != NULL) {
> 
> 
> I'm not sure I understand what you're trying to show here,
> but I assume its some kind of shared state between multiple
> instances of your "lim" match. the first question would be:
> where does the state pointer get its value from here?
> You can't rely on userspace passing back a valid pointer,
> this is questionable today (CAP_NET_ADMIN might crash the
> box), but its a huge bug once you consider things like
> OpenVZ.
> 

Yes, its ugly but has been working in our routers for atleast a year now. I
will however port it to use a global list state instead as some of the
modules already does.
 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: priv_data patch
  2006-08-14 16:01               ` Patrick McHardy
@ 2006-08-14 16:13                 ` Joakim Axelsson
  2006-08-14 16:26                   ` Patrick McHardy
  0 siblings, 1 reply; 31+ messages in thread
From: Joakim Axelsson @ 2006-08-14 16:13 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Massimiliano Hofer, Netfilter Development Mailinglist

2006-08-14 18:01:19+0200, Patrick McHardy <kaber@trash.net> ->
> Joakim Axelsson wrote:
> > 2006-08-14 17:46:59+0200, Patrick McHardy <kaber@trash.net> ->
> > 
> >>Thats not true, the master pointer is reinitialized on every change by
> >>the checkentry function (which, as you note, is called on all rules for
> >>every change). The simple reason why it keeps its current state is
> >>because it is dumped to userspace and echoed back. If you move it out of
> >>the structure shared with userspace, this can not happen anymore.
> > 
> > 
> > I think we define reinitiated differently then. I define reinitated by
> > checkentry() being runned even tho the match/rule isn't a new (or altered)
> > match/rule. 
> > 
> > If checkentry() is runned for _all_ matches/targers everytime an unrelated
> > rule is altered then limit will lose its state because the code resets the
> > bucket and refills it.
> 
> 
> You're right. That is actually a bug in my opinion.
> 
> > Meaning that if I have a limit of 1 packet each hour
> > and I change an (unrelated) rule every 30mins the limit will be a limit of 1
> > packet every 30min instead. The (ugly) workaround i posted will solve this
> > issue by keeping track if it has been running the checkentry() before. This
> > is where the priv_data is needed.
> 
> 
> No, we can simply check if f.e. credit is non-zero.
> 

It needs to be coded then :-). It's not the case today.

> > But if you say that the priv_data pointer will be lost for all rules on any
> > alter of rules its kinda void to have it. However a solution keeping that
> > priv_data pointer intact could be very useful. 
> 
> 
> That would mean the kernel needs to associate new rules with rules in
> the old table (or the individual modules itself need to do it somehow,
> like hashlimit or recent). This is really getting too ugly to even
> consider in my opinion, especially since the obvious solution of not
> doing an atomic table exchange also solves a lot of other problems.

I agree. A better solution is needed or no solution at all. However building
a new API that actually only will change the parts it needs to will need a
huge amount of work and will most probably break existing userspace. It will
be very hard to keep backwards compability.

As i see it some new hooks are needed on every module to retrieve the kernel
data. This means we need 5 things:
1. start a new rule/match/target (todays checkentry())
2. remove an old rule/match/target (todays destroy())
3. the worker (todays match() and target())
4. A new 'list' that will feed info needed by iptables -L and iptables-save
to userspace.
5. Possible a new 'alter' that will alter info in the rules/match/targets
private kernel data.

Point 4 and 5 can be views as read and write of the rule. Point 1 and 2 as
create and destroy.

A uniq ID will be needed for each rule so userspace can define which rule it
wishes to alter/remove.

Fairly simple interface to build with netlink. However the real challenge is
to try to keep backward compability. 

--
Joakim Axelsson

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: priv_data patch
  2006-08-14 16:04         ` Joakim Axelsson
@ 2006-08-14 16:13           ` Patrick McHardy
  2006-08-14 16:55             ` Joakim Axelsson
  0 siblings, 1 reply; 31+ messages in thread
From: Patrick McHardy @ 2006-08-14 16:13 UTC (permalink / raw)
  To: Joakim Axelsson; +Cc: Massimiliano Hofer, Netfilter Development Mailinglist

Joakim Axelsson wrote:
> 2006-08-14 17:28:19+0200, Patrick McHardy <kaber@trash.net> ->
> 
>>>
>>>Please have a look here for 4 modules "needing" this patch:
>>>http://www.gozem.se/~gozem/netfilter/
>>
>>Please post your examples to the list.
>>
> 
> 
> In which format? As a full patch for the kernel or something that fits in
> pom-ng? The current code works for 2.4 and early 2.6. I don't want to spend
> time porting it into "wrong" API now when we are discussing priv_data to be
> or not to be :-)

I'm mostly interested in the way you wish to use it, so I can better
judge whether this change is worth doing or not.

> Yes, its ugly but has been working in our routers for atleast a year now. I
> will however port it to use a global list state instead as some of the
> modules already does.

I don't say its not working, but we can't put something like this
in the kernel. So you need some identifier to find your state after
changes. Besides the uglyness, we don't have anywhere to put it in
the blob so userspace can return it, so it would end up beeing something
local to the individual modules again (like table name in recent
and hashlimit), just what we're doing today.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: priv_data patch
  2006-08-14 14:48   ` Patrick McHardy
  2006-08-14 14:58     ` Joakim Axelsson
@ 2006-08-14 16:19     ` Massimiliano Hofer
  2006-08-14 16:32       ` Joakim Axelsson
  1 sibling, 1 reply; 31+ messages in thread
From: Massimiliano Hofer @ 2006-08-14 16:19 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Netfilter Development Mailinglist

On Monday 14 August 2006 4:48 pm, Patrick McHardy wrote:

> Hmm .. recent does a table lookup during runtime and the table could be
> cached. That would improve things a bit, but in my opinion not enough
> to justify this patch. Same for hashlimit. What data would condition
> store exactly?

I need a pointer to per condition data, so that multiple rules with the same 
name refer to the same flag.
I can break userspace compatibility and store a pointer in the userspace 
structure. I just thought this could be useful to everyone (and let me 
maintain userspace compatibility along the way).

> Its actually quite clear what is needed. We want a userspace interface
> built on netlink, that acts on individual rules, not entire rulesets.
> There are a few more ideas, like handling negation centrally, allowing
> userspace to specify whether a target is terminal or not, allow multiple
> non-terminal targets in a row, etc, but nothing really fundamental.

I thought the current way of doing things was specifically designed to 
minimize softirq locking (especially with arbitarily long chains and 
arbitrary initialization code). We could switch to RCU lists, though...

-- 
Saluti,
   Massimiliano Hofer
        Nucleus

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: priv_data patch
  2006-08-14 16:13                 ` Joakim Axelsson
@ 2006-08-14 16:26                   ` Patrick McHardy
  2006-08-14 16:40                     ` Joakim Axelsson
  0 siblings, 1 reply; 31+ messages in thread
From: Patrick McHardy @ 2006-08-14 16:26 UTC (permalink / raw)
  To: Joakim Axelsson; +Cc: Massimiliano Hofer, Netfilter Development Mailinglist

Joakim Axelsson wrote:
> 2006-08-14 18:01:19+0200, Patrick McHardy <kaber@trash.net> ->
> 
>>[...]
> 
> I agree. A better solution is needed or no solution at all. However building
> a new API that actually only will change the parts it needs to will need a
> huge amount of work and will most probably break existing userspace. It will
> be very hard to keep backwards compability.

We need to keep the current API stable of course, but that doesn't
prevent us from introducing an entirely new one. Matches and targets
probably can be reused to a certain amount, although I wouldn't
mind getting rid of the excessive amount of matches doing basically
the same thing at the same time.

> As i see it some new hooks are needed on every module to retrieve the kernel
> data. This means we need 5 things:
> 1. start a new rule/match/target (todays checkentry())
> 2. remove an old rule/match/target (todays destroy())
> 3. the worker (todays match() and target())
> 4. A new 'list' that will feed info needed by iptables -L and iptables-save
> to userspace.

Commonly called dump in other netlink interfaces.

> 5. Possible a new 'alter' that will alter info in the rules/match/targets
> private kernel data.

This is tricky to get right on the rule level, a rule consist of
multiple elements that would need to be changed atomically.

> Point 4 and 5 can be views as read and write of the rule. Point 1 and 2 as
> create and destroy.
> 
> A uniq ID will be needed for each rule so userspace can define which rule it
> wishes to alter/remove.

Yes.

> Fairly simple interface to build with netlink. However the real challenge is
> to try to keep backward compability. 

That is basically impossible. We can keep a compatible command-line
interface, but the ABI can't be kept compatible. The interface itself
it quite simple, but we also need new ruleset evaluation functions,
new loop detection and probably a few other things.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: priv_data patch
  2006-08-14 16:19     ` Massimiliano Hofer
@ 2006-08-14 16:32       ` Joakim Axelsson
  0 siblings, 0 replies; 31+ messages in thread
From: Joakim Axelsson @ 2006-08-14 16:32 UTC (permalink / raw)
  To: Massimiliano Hofer; +Cc: Netfilter Development Mailinglist, Patrick McHardy

2006-08-14 18:19:36+0200, Massimiliano Hofer <max@nucleus.it> ->
> On Monday 14 August 2006 4:48 pm, Patrick McHardy wrote:
> 
> > Hmm .. recent does a table lookup during runtime and the table could be
> > cached. That would improve things a bit, but in my opinion not enough
> > to justify this patch. Same for hashlimit. What data would condition
> > store exactly?
> 
> I need a pointer to per condition data, so that multiple rules with the same 
> name refer to the same flag.
> I can break userspace compatibility and store a pointer in the userspace 
> structure. I just thought this could be useful to everyone (and let me 
> maintain userspace compatibility along the way).
> 

Same goes for all but one of my matches. All of them are accessing "their"
global variable that several rules might share. All but my lim (new limiter)
that has its own data that should be re-checkentry():ed everytime another
rule changes. I sure can keep a global list of all limits but i have no good
way of making each uniq. There is no naming as in condition (and all other
of my modules).

> > Its actually quite clear what is needed. We want a userspace interface
> > built on netlink, that acts on individual rules, not entire rulesets.
> > There are a few more ideas, like handling negation centrally, allowing
> > userspace to specify whether a target is terminal or not, allow multiple
> > non-terminal targets in a row, etc, but nothing really fundamental.
> 
> I thought the current way of doing things was specifically designed to 
> minimize softirq locking (especially with arbitarily long chains and 
> arbitrary initialization code). We could switch to RCU lists, though...
> 

That would solve alot of problems and make the data structure much more
flexible in the futhure for alterations. Guess we have to change back to the
old ipchains name then :-P

--
Joakim Axelsson

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: priv_data patch
  2006-08-14 16:26                   ` Patrick McHardy
@ 2006-08-14 16:40                     ` Joakim Axelsson
  2006-08-14 16:50                       ` Patrick McHardy
  0 siblings, 1 reply; 31+ messages in thread
From: Joakim Axelsson @ 2006-08-14 16:40 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Massimiliano Hofer, Netfilter Development Mailinglist

2006-08-14 18:26:21+0200, Patrick McHardy <kaber@trash.net> ->
> Joakim Axelsson wrote:
> > 2006-08-14 18:01:19+0200, Patrick McHardy <kaber@trash.net> ->
> > 
> >>[...]
> > 
> > I agree. A better solution is needed or no solution at all. However building
> > a new API that actually only will change the parts it needs to will need a
> > huge amount of work and will most probably break existing userspace. It will
> > be very hard to keep backwards compability.
> 
> We need to keep the current API stable of course, but that doesn't
> prevent us from introducing an entirely new one. Matches and targets
> probably can be reused to a certain amount, although I wouldn't
> mind getting rid of the excessive amount of matches doing basically
> the same thing at the same time.
> 
> > As i see it some new hooks are needed on every module to retrieve the kernel
> > data. This means we need 5 things:
> > 1. start a new rule/match/target (todays checkentry())
> > 2. remove an old rule/match/target (todays destroy())
> > 3. the worker (todays match() and target())
> > 4. A new 'list' that will feed info needed by iptables -L and iptables-save
> > to userspace.
> 
> Commonly called dump in other netlink interfaces.
> 
> > 5. Possible a new 'alter' that will alter info in the rules/match/targets
> > private kernel data.
> 
> This is tricky to get right on the rule level, a rule consist of
> multiple elements that would need to be changed atomically.
> 

Yes, sorry. I mean per match/target.

> > Point 4 and 5 can be views as read and write of the rule. Point 1 and 2 as
> > create and destroy.
> > 
> > A uniq ID will be needed for each rule so userspace can define which rule it
> > wishes to alter/remove.
> 
> Yes.
> 
> > Fairly simple interface to build with netlink. However the real challenge is
> > to try to keep backward compability. 
> 
> That is basically impossible. We can keep a compatible command-line
> interface, but the ABI can't be kept compatible. The interface itself
> it quite simple, but we also need new ruleset evaluation functions,
> new loop detection and probably a few other things.

This work is huge, but really needed. I don't feel I am skilled enough to
write it, only contribute with porting matches and other things. I did
however write most of the code that ipset is based on now. So I have the
"extreme amount of hook functions needed" in my back. 

The real question is. Do we really want to force each match/target to
implement a fair amount of functions for it to work? We need to think big
from the start here, not missing some feature that will be hard to add
after. I rather see one too many needed function then one too few.

Is this really something we want? It will most probably end up in a new
ipfwadmin/ipchains/iptables -version. Its the easiest way todo it. Drop the
backward compability completly and possibly only make a new iptables
userspace command-line compability tool using the new API.

--
Joakim Axelsson
 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: priv_data patch
  2006-08-14 16:40                     ` Joakim Axelsson
@ 2006-08-14 16:50                       ` Patrick McHardy
  2006-08-14 17:11                         ` Joakim Axelsson
  0 siblings, 1 reply; 31+ messages in thread
From: Patrick McHardy @ 2006-08-14 16:50 UTC (permalink / raw)
  To: Joakim Axelsson; +Cc: Massimiliano Hofer, Netfilter Development Mailinglist

Joakim Axelsson wrote:
> 2006-08-14 18:26:21+0200, Patrick McHardy <kaber@trash.net> ->
> 
>>>5. Possible a new 'alter' that will alter info in the rules/match/targets
>>>private kernel data.
>>
>>This is tricky to get right on the rule level, a rule consist of
>>multiple elements that would need to be changed atomically.
>>
> 
> Yes, sorry. I mean per match/target.


Is that really useful without beeing able to change more than
one component

>>That is basically impossible. We can keep a compatible command-line
>>interface, but the ABI can't be kept compatible. The interface itself
>>it quite simple, but we also need new ruleset evaluation functions,
>>new loop detection and probably a few other things.
> 
> 
> This work is huge, but really needed. I don't feel I am skilled enough to
> write it, only contribute with porting matches and other things. I did
> however write most of the code that ipset is based on now. So I have the
> "extreme amount of hook functions needed" in my back. 
> 
> The real question is. Do we really want to force each match/target to
> implement a fair amount of functions for it to work? We need to think big
> from the start here, not missing some feature that will be hard to add
> after. I rather see one too many needed function then one too few.

Its not so much. The interface comes down to "init", "destroy", "dump",
"do something". If we really want "change" it should be possible to
do it in one function with "init".

And as I already said, I would like to get rid of the large amount of
matches doing the same thing anyway. connbytes, connmark, conntrack,
helper, ... basically all do "take data from conntrack, compare".
realm, length, pkttype, .. do the same with skb metadata. A lot
of matches on real packet data are also quite similar. We could
easily get rid of 50%-75% of all matches and still have the same
functionality.

> Is this really something we want? It will most probably end up in a new
> ipfwadmin/ipchains/iptables -version. Its the easiest way todo it. Drop the
> backward compability completly and possibly only make a new iptables
> userspace command-line compability tool using the new API.

We can't do this, people expect to be able to user old versions of
the iptables tool. But we can introduce a new interface and new
tools without breaking the old ones.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: priv_data patch
  2006-08-14 16:13           ` Patrick McHardy
@ 2006-08-14 16:55             ` Joakim Axelsson
  2006-08-14 16:59               ` Patrick McHardy
  2006-08-15  8:27               ` Amin Azez
  0 siblings, 2 replies; 31+ messages in thread
From: Joakim Axelsson @ 2006-08-14 16:55 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Netfilter Development Mailinglist

2006-08-14 18:13:38+0200, Patrick McHardy <kaber@trash.net> ->
> Joakim Axelsson wrote:
> > 2006-08-14 17:28:19+0200, Patrick McHardy <kaber@trash.net> ->
> > 
> >>>
> >>>Please have a look here for 4 modules "needing" this patch:
> >>>http://www.gozem.se/~gozem/netfilter/
> >>
> >>Please post your examples to the list.
> >>
> > 
> > 
> > In which format? As a full patch for the kernel or something that fits in
> > pom-ng? The current code works for 2.4 and early 2.6. I don't want to spend
> > time porting it into "wrong" API now when we are discussing priv_data to be
> > or not to be :-)
> 
> I'm mostly interested in the way you wish to use it, so I can better
> judge whether this change is worth doing or not.
> 

The change is NOT worth it if we can't keep the priv_data pointer between
changes of unrelated rules. I better use a global list instead as condition
is using today.

Still, i want to know in whcih format the new modules are wanted. Patch,
files fitting pom-ng or my own svn-repository to source.list?

I will port my matches as soon as possible and post them for you to judge.
Probably for tomorrow evening.

--
Joakim Axelsson

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: priv_data patch
  2006-08-14 16:55             ` Joakim Axelsson
@ 2006-08-14 16:59               ` Patrick McHardy
  2006-08-15  8:27               ` Amin Azez
  1 sibling, 0 replies; 31+ messages in thread
From: Patrick McHardy @ 2006-08-14 16:59 UTC (permalink / raw)
  To: Joakim Axelsson; +Cc: Netfilter Development Mailinglist

Joakim Axelsson wrote:
> The change is NOT worth it if we can't keep the priv_data pointer between
> changes of unrelated rules. I better use a global list instead as condition
> is using today.

Thats my feeling as well.

> Still, i want to know in whcih format the new modules are wanted. Patch,
> files fitting pom-ng or my own svn-repository to source.list?

An URL to an external repository is preferred.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: priv_data patch
  2006-08-14 16:50                       ` Patrick McHardy
@ 2006-08-14 17:11                         ` Joakim Axelsson
  2006-08-14 17:48                           ` Patrick McHardy
  0 siblings, 1 reply; 31+ messages in thread
From: Joakim Axelsson @ 2006-08-14 17:11 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Massimiliano Hofer, Netfilter Development Mailinglist

2006-08-14 18:50:28+0200, Patrick McHardy <kaber@trash.net> ->
> Joakim Axelsson wrote:
> > 2006-08-14 18:26:21+0200, Patrick McHardy <kaber@trash.net> ->
> > 
> >>>5. Possible a new 'alter' that will alter info in the rules/match/targets
> >>>private kernel data.
> >>
> >>This is tricky to get right on the rule level, a rule consist of
> >>multiple elements that would need to be changed atomically.
> >>
> > 
> > Yes, sorry. I mean per match/target.
> 
> 
> Is that really useful without beeing able to change more than
> one component
> 

I don't really understand your consern here?

> >>That is basically impossible. We can keep a compatible command-line
> >>interface, but the ABI can't be kept compatible. The interface itself
> >>it quite simple, but we also need new ruleset evaluation functions,
> >>new loop detection and probably a few other things.
> > 
> > 
> > This work is huge, but really needed. I don't feel I am skilled enough to
> > write it, only contribute with porting matches and other things. I did
> > however write most of the code that ipset is based on now. So I have the
> > "extreme amount of hook functions needed" in my back. 
> > 
> > The real question is. Do we really want to force each match/target to
> > implement a fair amount of functions for it to work? We need to think big
> > from the start here, not missing some feature that will be hard to add
> > after. I rather see one too many needed function then one too few.
> 
> Its not so much. The interface comes down to "init", "destroy", "dump",
> "do something". If we really want "change" it should be possible to
> do it in one function with "init".
> 

So we are down to 4 function:
1. init
2. destroy
3. match/target (use)
4. dump (list)

Meaning an alter is really a "dump, destroy, init"? This might/will give you
atomic problems. I rather add one too many function not being used by most
modules rather than one too few. Remeber its one hook for each modules, not
for each rule. Its not a memory waster. Also think large matcher as recent
here. If we are to skip any /proc -involement we would like to be able to
remove one IP in a recent-match. That's an "alter".

Perhaps we should add one or two "userdefined" functions. Something that the
modules can define for them self to use. For example in the quota-case this
can be "add more quota". It sure will make the new iptables-structure really
flexible. 

Example:
iptables -U1 --match-id xxxxx -m module --params
iptables -U1 --match-id 12345 -m quota --add-quota 4000

We might be able to do this in an alter if we construct the alter to be able
to take commands that the module can specify. Easily done if a seperate
struct is passed with defined data that the module can interpetrate it self.

We will probably be needing to be able to identify a rule and/or a
match/target/module in some way uniqly. A simple tripple (chainname,
rulenumber, matchnumber) might do it though.

> And as I already said, I would like to get rid of the large amount of
> matches doing the same thing anyway. connbytes, connmark, conntrack,
> helper, ... basically all do "take data from conntrack, compare".
> realm, length, pkttype, .. do the same with skb metadata. A lot
> of matches on real packet data are also quite similar. We could
> easily get rid of 50%-75% of all matches and still have the same
> functionality.
>

Just like u32 match. It can replace alot of matches.
 
> > Is this really something we want? It will most probably end up in a new
> > ipfwadmin/ipchains/iptables -version. Its the easiest way todo it. Drop the
> > backward compability completly and possibly only make a new iptables
> > userspace command-line compability tool using the new API.
> 
> We can't do this, people expect to be able to user old versions of
> the iptables tool. But we can introduce a new interface and new
> tools without breaking the old ones.

Chainging to using RCU-list instead of tables will probably break old
iptables badly. We can try writing a new iptables using the new API, but i
guess nobody will be really intressed in doing this work.

--
Joakim Axelsson

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: priv_data patch
  2006-08-14 17:11                         ` Joakim Axelsson
@ 2006-08-14 17:48                           ` Patrick McHardy
  2006-08-14 17:59                             ` Joakim Axelsson
  0 siblings, 1 reply; 31+ messages in thread
From: Patrick McHardy @ 2006-08-14 17:48 UTC (permalink / raw)
  To: Joakim Axelsson; +Cc: Massimiliano Hofer, Netfilter Development Mailinglist

Joakim Axelsson wrote:
> 2006-08-14 18:50:28+0200, Patrick McHardy <kaber@trash.net> ->
> 
>>Is that really useful without beeing able to change more than
>>one component
>>
> 
> 
> I don't really understand your consern here?

Seems I accidentally cut out half of the sentence. I meant to ask
if this is really useful if only one component of a rule can be
changed at a time atomically.


>>>>That is basically impossible. We can keep a compatible command-line
>>>>interface, but the ABI can't be kept compatible. The interface itself
>>>>it quite simple, but we also need new ruleset evaluation functions,
>>>>new loop detection and probably a few other things.
>>>
>>>
>>>This work is huge, but really needed. I don't feel I am skilled enough to
>>>write it, only contribute with porting matches and other things. I did
>>>however write most of the code that ipset is based on now. So I have the
>>>"extreme amount of hook functions needed" in my back. 
>>>
>>>The real question is. Do we really want to force each match/target to
>>>implement a fair amount of functions for it to work? We need to think big
>>>from the start here, not missing some feature that will be hard to add
>>>after. I rather see one too many needed function then one too few.
>>
>>Its not so much. The interface comes down to "init", "destroy", "dump",
>>"do something". If we really want "change" it should be possible to
>>do it in one function with "init".
>>
> 
> 
> So we are down to 4 function:
> 1. init
> 2. destroy
> 3. match/target (use)
> 4. dump (list)
> 
> Meaning an alter is really a "dump, destroy, init"?

No, but init and change will be pretty similar most of the time, so they
can probably be handled by the same function.

> Perhaps we should add one or two "userdefined" functions. Something that the
> modules can define for them self to use. For example in the quota-case this
> can be "add more quota". It sure will make the new iptables-structure really
> flexible. 
> 
> Example:
> iptables -U1 --match-id xxxxx -m module --params
> iptables -U1 --match-id 12345 -m quota --add-quota 4000
> 
> We might be able to do this in an alter if we construct the alter to be able
> to take commands that the module can specify. Easily done if a seperate
> struct is passed with defined data that the module can interpetrate it self.

This is not needed, netlink attributes can be nested, so module-specific
stuff can be encapsulated in the module attributes.

>>And as I already said, I would like to get rid of the large amount of
>>matches doing the same thing anyway. connbytes, connmark, conntrack,
>>helper, ... basically all do "take data from conntrack, compare".
>>realm, length, pkttype, .. do the same with skb metadata. A lot
>>of matches on real packet data are also quite similar. We could
>>easily get rid of 50%-75% of all matches and still have the same
>>functionality.
>>
> 
> 
> Just like u32 match. It can replace alot of matches.

Something like that, but hopefully without running over the
packet data. I had something like the meta ematch in mind.

> Chainging to using RCU-list instead of tables will probably break old
> iptables badly. We can try writing a new iptables using the new API, but i
> guess nobody will be really intressed in doing this work.

Both I and Harald are definitely interested in doing this work.
Again, it must not break existing stuff, it will be something
completely new (at least the core stuff, maybe not the targets
or matches).

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: priv_data patch
  2006-08-14 17:48                           ` Patrick McHardy
@ 2006-08-14 17:59                             ` Joakim Axelsson
  0 siblings, 0 replies; 31+ messages in thread
From: Joakim Axelsson @ 2006-08-14 17:59 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: Massimiliano Hofer, Netfilter Development Mailinglist

2006-08-14 19:48:48+0200, Patrick McHardy <kaber@trash.net> ->
> Joakim Axelsson wrote:
> > 2006-08-14 18:50:28+0200, Patrick McHardy <kaber@trash.net> ->
> > 
> >>Is that really useful without beeing able to change more than
> >>one component
> >>
> > 
> > 
> > I don't really understand your consern here?
> 
> Seems I accidentally cut out half of the sentence. I meant to ask
> if this is really useful if only one component of a rule can be
> changed at a time atomically.
> 
> 
> >>>>That is basically impossible. We can keep a compatible command-line
> >>>>interface, but the ABI can't be kept compatible. The interface itself
> >>>>it quite simple, but we also need new ruleset evaluation functions,
> >>>>new loop detection and probably a few other things.
> >>>
> >>>
> >>>This work is huge, but really needed. I don't feel I am skilled enough to
> >>>write it, only contribute with porting matches and other things. I did
> >>>however write most of the code that ipset is based on now. So I have the
> >>>"extreme amount of hook functions needed" in my back. 
> >>>
> >>>The real question is. Do we really want to force each match/target to
> >>>implement a fair amount of functions for it to work? We need to think big
> >>>from the start here, not missing some feature that will be hard to add
> >>>after. I rather see one too many needed function then one too few.
> >>
> >>Its not so much. The interface comes down to "init", "destroy", "dump",
> >>"do something". If we really want "change" it should be possible to
> >>do it in one function with "init".
> >>
> > 
> > 
> > So we are down to 4 function:
> > 1. init
> > 2. destroy
> > 3. match/target (use)
> > 4. dump (list)
> > 
> > Meaning an alter is really a "dump, destroy, init"?
> 
> No, but init and change will be pretty similar most of the time, so they
> can probably be handled by the same function.
> 
> > Perhaps we should add one or two "userdefined" functions. Something that the
> > modules can define for them self to use. For example in the quota-case this
> > can be "add more quota". It sure will make the new iptables-structure really
> > flexible. 
> > 
> > Example:
> > iptables -U1 --match-id xxxxx -m module --params
> > iptables -U1 --match-id 12345 -m quota --add-quota 4000
> > 
> > We might be able to do this in an alter if we construct the alter to be able
> > to take commands that the module can specify. Easily done if a seperate
> > struct is passed with defined data that the module can interpetrate it self.
> 
> This is not needed, netlink attributes can be nested, so module-specific
> stuff can be encapsulated in the module attributes.
> 

I don't really follow here? If you think really big. Include recent, ipset
and the old ippool-idea. Where data is added and removed into the modules
state as it works. If we want to change this state with a userspace tool we
either need:
1. /proc
2. a new userspace-tool hooking, like ipset (ippool).
3. Function added to this new iptables being capable of it. A simply init
will not do it. You can't remove one IP in a recent-pool using an
init-function. It will probably wipe them all and empty the recent-pool.

Removing the need to extra userspace programs (ipset, ippool, i know
accounting has a tool as well) or the need for /proc is a huge win in my
opinion. This also calls for two different forms of listing. The first one
is the one we have today. List the rules for your firewall. The second in is
to list the state of one match/target. Like 'cat
/proc/net/ipt_recent/xxxx' today.

Try to cover functions needed to operate ipset and/or recent from the
begining.

> > Chainging to using RCU-list instead of tables will probably break old
> > iptables badly. We can try writing a new iptables using the new API, but i
> > guess nobody will be really intressed in doing this work.
> 
> Both I and Harald are definitely interested in doing this work.
> Again, it must not break existing stuff, it will be something
> completely new (at least the core stuff, maybe not the targets
> or matches).

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: priv_data patch
  2006-08-14 16:55             ` Joakim Axelsson
  2006-08-14 16:59               ` Patrick McHardy
@ 2006-08-15  8:27               ` Amin Azez
  2006-08-15  8:40                 ` Joakim Axelsson
  1 sibling, 1 reply; 31+ messages in thread
From: Amin Azez @ 2006-08-15  8:27 UTC (permalink / raw)
  To: Patrick McHardy, Netfilter Development Mailinglist

* Joakim Axelsson wrote, On 14/08/06 17:55:
> 2006-08-14 18:13:38+0200, Patrick McHardy <kaber@trash.net> ->
>> Joakim Axelsson wrote:
>>> 2006-08-14 17:28:19+0200, Patrick McHardy <kaber@trash.net> ->
>>>
>>>>> Please have a look here for 4 modules "needing" this patch:
>>>>> http://www.gozem.se/~gozem/netfilter/
>>>> Please post your examples to the list.
>>>>
>>>
>>> In which format? As a full patch for the kernel or something that fits in
>>> pom-ng? The current code works for 2.4 and early 2.6. I don't want to spend
>>> time porting it into "wrong" API now when we are discussing priv_data to be
>>> or not to be :-)
>> I'm mostly interested in the way you wish to use it, so I can better
>> judge whether this change is worth doing or not.
>>
> 
> The change is NOT worth it if we can't keep the priv_data pointer between
> changes of unrelated rules. I better use a global list instead as condition
> is using today.


Or, anticipating the future, why not let the current implementation of
priv_data be used to cache the entry from the global list. When
priv_data is satsifactory the global list can go. Why not let priv_data
manage the "backup" global list while it is needed, then client modules
of priv_data don't even need to know about this; which is better than
client modules all implementing their own global list.

I'm saying that while a global list may currently be needed we can still
encapsulate it in the priv_data api and still get MOST of the benefit
now and easily get all of the benefit later.

Sam

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: priv_data patch
  2006-08-15  8:27               ` Amin Azez
@ 2006-08-15  8:40                 ` Joakim Axelsson
  0 siblings, 0 replies; 31+ messages in thread
From: Joakim Axelsson @ 2006-08-15  8:40 UTC (permalink / raw)
  To: Amin Azez; +Cc: Netfilter Development Mailinglist, Patrick McHardy

2006-08-15 09:27:40+0100, Amin Azez <azez@ufomechanic.net> ->
> * Joakim Axelsson wrote, On 14/08/06 17:55:
> > 2006-08-14 18:13:38+0200, Patrick McHardy <kaber@trash.net> ->
> >> Joakim Axelsson wrote:
> >>> 2006-08-14 17:28:19+0200, Patrick McHardy <kaber@trash.net> ->
> >>>
> >>>>> Please have a look here for 4 modules "needing" this patch:
> >>>>> http://www.gozem.se/~gozem/netfilter/
> >>>> Please post your examples to the list.
> >>>>
> >>>
> >>> In which format? As a full patch for the kernel or something that fits in
> >>> pom-ng? The current code works for 2.4 and early 2.6. I don't want to spend
> >>> time porting it into "wrong" API now when we are discussing priv_data to be
> >>> or not to be :-)
> >> I'm mostly interested in the way you wish to use it, so I can better
> >> judge whether this change is worth doing or not.
> >>
> > 
> > The change is NOT worth it if we can't keep the priv_data pointer between
> > changes of unrelated rules. I better use a global list instead as condition
> > is using today.
> 
> 
> Or, anticipating the future, why not let the current implementation of
> priv_data be used to cache the entry from the global list. 

This can be done in the info userspace passes. Might be somewhat ugly to
have a kernel-only-used pointer to this internal kernel state. Of couse will
this pointer be needed to be set in every call of checkentry().

> When
> priv_data is satsifactory the global list can go. Why not let priv_data
> manage the "backup" global list while it is needed, then client modules
> of priv_data don't even need to know about this; which is better than
> client modules all implementing their own global list.
> 
> I'm saying that while a global list may currently be needed we can still
> encapsulate it in the priv_data api and still get MOST of the benefit
> now and easily get all of the benefit later.
> 
> Sam

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2006-08-15  8:40 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-08-14 13:34 priv_data patch Patrick McHardy
2006-08-14 14:25 ` Joakim Axelsson
2006-08-14 14:31   ` Patrick McHardy
2006-08-14 15:20     ` Joakim Axelsson
2006-08-14 15:28       ` Patrick McHardy
2006-08-14 16:04         ` Joakim Axelsson
2006-08-14 16:13           ` Patrick McHardy
2006-08-14 16:55             ` Joakim Axelsson
2006-08-14 16:59               ` Patrick McHardy
2006-08-15  8:27               ` Amin Azez
2006-08-15  8:40                 ` Joakim Axelsson
2006-08-14 15:31       ` Patrick McHardy
2006-08-14 15:40         ` Joakim Axelsson
2006-08-14 15:46           ` Patrick McHardy
2006-08-14 15:56             ` Joakim Axelsson
2006-08-14 16:01               ` Patrick McHardy
2006-08-14 16:13                 ` Joakim Axelsson
2006-08-14 16:26                   ` Patrick McHardy
2006-08-14 16:40                     ` Joakim Axelsson
2006-08-14 16:50                       ` Patrick McHardy
2006-08-14 17:11                         ` Joakim Axelsson
2006-08-14 17:48                           ` Patrick McHardy
2006-08-14 17:59                             ` Joakim Axelsson
2006-08-14 15:53       ` Massimiliano Hofer
2006-08-14 14:40 ` Massimiliano Hofer
2006-08-14 14:48   ` Patrick McHardy
2006-08-14 14:58     ` Joakim Axelsson
2006-08-14 15:05       ` Patrick McHardy
2006-08-14 16:19     ` Massimiliano Hofer
2006-08-14 16:32       ` Joakim Axelsson
     [not found] ` <200608141557.35918.max@nucleus.it>
     [not found]   ` <44E08AC7.2050204@trash.net>
     [not found]     ` <200608141702.50753.max@nucleus.it>
2006-08-14 15:14       ` Patrick McHardy

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.