All of lore.kernel.org
 help / color / mirror / Atom feed
* what's the lockingrules for ip_conntrack_expect_list?
@ 2002-10-10 21:40 Martin Josefsson
  2002-10-11 13:06 ` Jozsef Kadlecsik
  0 siblings, 1 reply; 19+ messages in thread
From: Martin Josefsson @ 2002-10-10 21:40 UTC (permalink / raw)
  To: Netfilter-devel; +Cc: Harald Welte

Hi,

I've been going through the latest bugreport about failed ASSERT's...
It's the usual suspects, exp_for_packet that calls ip_ct_find_proto
instead of __find_proto, this isn't a bug, it's just that the
debug-macros can't handle it, we've been over this before...

But here's the real question...
What's the lockingrules for ip_conntrack_expect_list?

I've gone through 2.4.20-pre8aa2 which was the kernel the bugreport was
from. And I've found some inconsistencies.

here's a list of all functions that deals with ip_conntrack_expect_list
and which locks they are holding when they do so, and the specific
operation they perform on the list.


ip_conntrack_expect_find_get 
	READ_LOCK(&ip_conntrack_lock);
	READ_LOCK(&ip_conntrack_expect_tuple_lock);
	__ip_ct_expect_find
		MUST_BE_READ_LOCKED(&ip_conntrack_lock);
		MUST_BE_READ_LOCKED(&ip_conntrack_expect_tuple_lock);
		LIST_FIND

death_by_timeout
	WRITE_LOCK(&ip_conntrack_lock);
	clean_from_lists
		MUST_BE_WRITE_LOCKED(&ip_conntrack_lock);
		remove_expectations
			list_inlist

init_conntrack
	first:
		WRITE_LOCK(&ip_conntrack_lock);
		READ_LOCK(&ip_conntrack_expect_tuple_lock);
		LIST_FIND
	later: 
		WRITE_LOCK(&ip_conntrack_lock);
		LIST_DELETE

ip_conntrack_expect_related
	WRITE_LOCK(&ip_conntrack_lock);
	LIST_FIND
	LIST_FIND
	list_prepend

ip_conntrack_change_expect
	MUST_BE_READ_LOCKED(&ip_conntrack_lock);
	LIST_FIND

	FAILS!! called from nat-helpers which only hold their own locks 	if any
at all.


So what's the story? sometimes we hold only ip_conntrack_lock and
sometimes we hold that plus ip_conntrack_expect_tuple_lock and now I've
found out that sometimes we never hold any lock.

Give me a hint and I'll write up a patch to fix it.

-- 
/Martin

Never argue with an idiot. They drag you down to their level, then beat
you with experience.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: what's the lockingrules for ip_conntrack_expect_list?
  2002-10-10 21:40 what's the lockingrules for ip_conntrack_expect_list? Martin Josefsson
@ 2002-10-11 13:06 ` Jozsef Kadlecsik
  2002-10-11 13:56   ` Martin Josefsson
  2002-10-12 12:17   ` Harald Welte
  0 siblings, 2 replies; 19+ messages in thread
From: Jozsef Kadlecsik @ 2002-10-11 13:06 UTC (permalink / raw)
  To: Martin Josefsson; +Cc: Netfilter-devel, Harald Welte

Hi Martin,

On 10 Oct 2002, Martin Josefsson wrote:

> What's the lockingrules for ip_conntrack_expect_list?

As far as I remember, the guiding rules were the following:
ip_conntrack_lock is always held when we do something with the
expectations. If it's write-locked, then there is no need for additional
locking. If it's read-locked, then we use the additional
ip_conntrack_expect_tuple_lock to protect the expectation lists.

> I've gone through 2.4.20-pre8aa2 which was the kernel the bugreport was
> from. And I've found some inconsistencies.

> init_conntrack
> 	first:
> 		WRITE_LOCK(&ip_conntrack_lock);
> 		READ_LOCK(&ip_conntrack_expect_tuple_lock);
> 		LIST_FIND

Read-locking ip_conntrack_expect_tuple_lock is unnecessary here then.

> 	later:
> 		WRITE_LOCK(&ip_conntrack_lock);
> 		LIST_DELETE
>
> ip_conntrack_expect_related
> 	WRITE_LOCK(&ip_conntrack_lock);
> 	LIST_FIND
> 	LIST_FIND
> 	list_prepend
>
> ip_conntrack_change_expect
> 	MUST_BE_READ_LOCKED(&ip_conntrack_lock);
> 	LIST_FIND
>
> 	FAILS!! called from nat-helpers which only hold their own locks 	if any
> at all.

Well spotted! It's a locking bug then.

Regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlec@sunserv.kfki.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : KFKI Research Institute for Particle and Nuclear Physics
          H-1525 Budapest 114, POB. 49, Hungary

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: what's the lockingrules for ip_conntrack_expect_list?
  2002-10-11 13:06 ` Jozsef Kadlecsik
@ 2002-10-11 13:56   ` Martin Josefsson
  2002-10-11 14:07     ` Jozsef Kadlecsik
  2002-10-12 12:17   ` Harald Welte
  1 sibling, 1 reply; 19+ messages in thread
From: Martin Josefsson @ 2002-10-11 13:56 UTC (permalink / raw)
  To: Jozsef Kadlecsik; +Cc: Netfilter-devel, Harald Welte

On Fri, 2002-10-11 at 15:06, Jozsef Kadlecsik wrote:
> Hi Martin,

Hi Jozsef,
 
> As far as I remember, the guiding rules were the following:
> ip_conntrack_lock is always held when we do something with the
> expectations. If it's write-locked, then there is no need for additional
> locking. If it's read-locked, then we use the additional
> ip_conntrack_expect_tuple_lock to protect the expectation lists.

Ahh that makes sense.
But is it really neccessary to hold ip_conntrack_lock when we are only
going to change an expectation and not touch anything else?
If we enforce that we always hold a readlock or writelock on
ip_conntrack_expect_tuple_lock when we touch it we should be safe?

I think Rusty has changed the lockingrules in his conntrack-optimization
patch to only require that we hold the ip_conntrack_expect_tuple_lock.

I'll look at Rusty's patch and make two versions of this fix, one for
current conntrack and one for his patch (if the bug exists there).

> > ip_conntrack_change_expect
> > 	MUST_BE_READ_LOCKED(&ip_conntrack_lock);
> > 	LIST_FIND
> >
> > 	FAILS!! called from nat-helpers which only hold their own locks 	if any
> > at all.
> 
> Well spotted! It's a locking bug then.

I just examined a bugreport :)

Thanks for the explanation.
 
-- 
/Martin

Never argue with an idiot. They drag you down to their level, then beat
you with experience.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: what's the lockingrules for ip_conntrack_expect_list?
  2002-10-11 13:56   ` Martin Josefsson
@ 2002-10-11 14:07     ` Jozsef Kadlecsik
  2002-10-11 18:42       ` [PATCH] " Martin Josefsson
  0 siblings, 1 reply; 19+ messages in thread
From: Jozsef Kadlecsik @ 2002-10-11 14:07 UTC (permalink / raw)
  To: Martin Josefsson; +Cc: Netfilter-devel, Harald Welte

On 11 Oct 2002, Martin Josefsson wrote:

> But is it really neccessary to hold ip_conntrack_lock when we are only
> going to change an expectation and not touch anything else?
> If we enforce that we always hold a readlock or writelock on
> ip_conntrack_expect_tuple_lock when we touch it we should be safe?

We wanted to avoid possible cross-locking bugs by keeping a strict
hierarchy of the locks. Sometimes too complicated functions are called
while a lock is held.

> I think Rusty has changed the lockingrules in his conntrack-optimization
> patch to only require that we hold the ip_conntrack_expect_tuple_lock.

The other way around: he removed ip_conntrack_expect_tuple_lock
completely. I think that is the proper way, but it implies that later
the fine-grained locking is introduced in order to remove the dependency
of everything on the single ip_conntrack_lock.

Regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlec@sunserv.kfki.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : KFKI Research Institute for Particle and Nuclear Physics
          H-1525 Budapest 114, POB. 49, Hungary

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH] Re: what's the lockingrules for ip_conntrack_expect_list?
  2002-10-11 14:07     ` Jozsef Kadlecsik
@ 2002-10-11 18:42       ` Martin Josefsson
  2002-10-12 12:25         ` Harald Welte
  0 siblings, 1 reply; 19+ messages in thread
From: Martin Josefsson @ 2002-10-11 18:42 UTC (permalink / raw)
  To: Jozsef Kadlecsik; +Cc: Netfilter-devel, Harald Welte

[-- Attachment #1: Type: text/plain, Size: 1725 bytes --]

On Fri, 2002-10-11 at 16:07, Jozsef Kadlecsik wrote:
> On 11 Oct 2002, Martin Josefsson wrote:
> 
> > But is it really neccessary to hold ip_conntrack_lock when we are only
> > going to change an expectation and not touch anything else?
> > If we enforce that we always hold a readlock or writelock on
> > ip_conntrack_expect_tuple_lock when we touch it we should be safe?
> 
> We wanted to avoid possible cross-locking bugs by keeping a strict
> hierarchy of the locks. Sometimes too complicated functions are called
> while a lock is held.

Ok.
Patch for ip_conntrack_change_expect is attached.

Harald, is this something you would accept and forward for inclusion?
And while we're at it, I still think we should change exp_for_packet to
use __find_proto instead of ip_ct_find_proto to avoid the ASSERT's that
people are complaining about. Sure they are harmless but users don't
know that and send reports. (this problem goes away with Rustys patches
for 2.5, exp_for_packet is completely removed)

> > I think Rusty has changed the lockingrules in his conntrack-optimization
> > patch to only require that we hold the ip_conntrack_expect_tuple_lock.
> 
> The other way around: he removed ip_conntrack_expect_tuple_lock
> completely. I think that is the proper way, but it implies that later
> the fine-grained locking is introduced in order to remove the dependency
> of everything on the single ip_conntrack_lock.

Yes I saw that when I read the patch. That change combined with the
speedup patch will be nice. I found two small locking-bugs in those
patches aswell, I'll send Rusty some patches in a short while.

-- 
/Martin

Never argue with an idiot. They drag you down to their level, then beat
you with experience.

[-- Attachment #2: ip_conntrack_change_expect-locking.diff --]
[-- Type: text/plain, Size: 2161 bytes --]

--- linux-2.4.20-pre10.orig/net/ipv4/netfilter/ip_conntrack_core.c	2002-10-11 15:48:28.000000000 +0200
+++ linux-2.4.20-pre10/net/ipv4/netfilter/ip_conntrack_core.c	2002-10-11 20:04:42.000000000 +0200
@@ -695,10 +695,8 @@
 
 	WRITE_LOCK(&ip_conntrack_lock);
 	/* Need finding and deleting of expected ONLY if we win race */
-	READ_LOCK(&ip_conntrack_expect_tuple_lock);
 	expected = LIST_FIND(&ip_conntrack_expect_list, expect_cmp,
 			     struct ip_conntrack_expect *, tuple);
-	READ_UNLOCK(&ip_conntrack_expect_tuple_lock);
 
 	/* Look up the conntrack helper for master connections only */
 	if (!expected)
@@ -1060,7 +1058,7 @@
 int ip_conntrack_change_expect(struct ip_conntrack_expect *expect,
 			       struct ip_conntrack_tuple *newtuple)
 {
-	MUST_BE_READ_LOCKED(&ip_conntrack_lock);
+	int ret;
 
 	DEBUGP("change_expect:\n");
 	DEBUGP("exp tuple: "); DUMP_TUPLE(&expect->tuple);
@@ -1069,30 +1067,32 @@
 	if (expect->ct_tuple.dst.protonum == 0) {
 		/* Never seen before */
 		DEBUGP("change expect: never seen before\n");
+		READ_LOCK(&ip_conntrack_lock);
+		WRITE_LOCK(&ip_conntrack_expect_tuple_lock);
 		if (!ip_ct_tuple_equal(&expect->tuple, newtuple) 
 		    && LIST_FIND(&ip_conntrack_expect_list, expect_clash,
 			         struct ip_conntrack_expect *, newtuple, &expect->mask)) {
 			/* Force NAT to find an unused tuple */
-			return -1;
+			ret = -1;
 		} else {
-			WRITE_LOCK(&ip_conntrack_expect_tuple_lock);
 			memcpy(&expect->ct_tuple, &expect->tuple, sizeof(expect->tuple));
 			memcpy(&expect->tuple, newtuple, sizeof(expect->tuple));
-			WRITE_UNLOCK(&ip_conntrack_expect_tuple_lock);
-			return 0;
+			ret = 0;
 		}
+		WRITE_UNLOCK(&ip_conntrack_expect_tuple_lock);
+		READ_UNLOCK(&ip_conntrack_lock);
 	} else {
 		/* Resent packet */
 		DEBUGP("change expect: resent packet\n");
 		if (ip_ct_tuple_equal(&expect->tuple, newtuple)) {
-			return 0;
+			ret = 0;
 		} else {
 			/* Force NAT to choose again the same port */
-			return -1;
+			ret = -1;
 		}
 	}
 	
-	return -1;
+	return ret;
 }
 
 /* Alter reply tuple (maybe alter helper).  If it's already taken,

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: what's the lockingrules for ip_conntrack_expect_list?
  2002-10-11 13:06 ` Jozsef Kadlecsik
  2002-10-11 13:56   ` Martin Josefsson
@ 2002-10-12 12:17   ` Harald Welte
  2002-10-12 13:09     ` Martin Josefsson
  1 sibling, 1 reply; 19+ messages in thread
From: Harald Welte @ 2002-10-12 12:17 UTC (permalink / raw)
  To: Jozsef Kadlecsik; +Cc: Martin Josefsson, Netfilter-devel


[-- Attachment #1.1: Type: text/plain, Size: 3657 bytes --]

On Fri, Oct 11, 2002 at 03:06:59PM +0200, Jozsef Kadlecsik wrote:
> Hi Martin,
> 
> On 10 Oct 2002, Martin Josefsson wrote:
> 
> > What's the lockingrules for ip_conntrack_expect_list?
> 
> As far as I remember, the guiding rules were the following:
> ip_conntrack_lock is always held when we do something with the
> expectations. If it's write-locked, then there is no need for additional
> locking. If it's read-locked, then we use the additional
> ip_conntrack_expect_tuple_lock to protect the expectation lists.

Unfortunately I (the inventor of this second lock) don't even remember 
the exact purpose anymore.  Sometimes I feel really stupid.  so let's go
read the code and my personal notes again...

I think the idea of the expect_tuple_lock() was to protect the tuple/mask
inside a struct ip_conntrack_expect.  This way we can implement
ip_conntrack_change_expect() without the need to grab a writelock on
ip_conntrack_lock. This is not jut an optimization, but absolutely necessarry
for the NAT helpers:

We grab a readlock on ip_conntrack_lock and traverse the list of siblings.
For every sibling we call the nat helper function, which in turn wants to
alter the expectation by calling ip_conntrack_change_expect().  Altering the
expectation needs some kind of locking.  If we only had one lock, we would
need a writelock on ip_conntrack_lock.  Since there is no primitive for
atomically upgrading a lock from readlock to writelock, we would need to
grab a writelock before traversing the sibling list.  And this sucks, since
nobody else can do anything inside connection tracking for a very long time.

> > I've gone through 2.4.20-pre8aa2 which was the kernel the bugreport was
> > from. And I've found some inconsistencies.
> 
> > init_conntrack
> > 	first:
> > 		WRITE_LOCK(&ip_conntrack_lock);
> > 		READ_LOCK(&ip_conntrack_expect_tuple_lock);
> > 		LIST_FIND
> 
> Read-locking ip_conntrack_expect_tuple_lock is unnecessary here then.

Well, this is correct. But it's an optimiziation based on the assumption
that there is only one place where we WRITE_LOCK(&expect_tuple_lock),
which is in turn protected by READ_LOCK(&ip_conntrack_lock).  While this
is true now, I'm not sure that this is a good idea in general.

> > 	later:
> > 		WRITE_LOCK(&ip_conntrack_lock);
> > 		LIST_DELETE

This is OK, since list_delete doesn't read the tuple/mask inside this expect.

> > ip_conntrack_expect_related
> > 	WRITE_LOCK(&ip_conntrack_lock);
> > 	LIST_FIND
> > 	LIST_FIND
> > 	list_prepend

This is clearly a bug, but works according to the optimization described 
above.  I think we should either grab the tuple_lock or at least make 
a verbose comment in the code describing why we don't grab it.

> > ip_conntrack_change_expect
> > 	MUST_BE_READ_LOCKED(&ip_conntrack_lock);
> > 	LIST_FIND
> >
> > 	FAILS!! called from nat-helpers which only hold their own locks
> > 	if any at all.

No.  The helper's help functions are called from do_bindings, which grab
a readlock on ip_conntrack_lock before traversing the list of expectations
matching this packet.  See also my comment above.

Please correct me if I'm overlooking something.

> Regards,
> Jozsef

Just for your info, I'm describing a  textfiles about my initial
thoughs on newnat locking..
 
-- 
Live long and prosper
- Harald Welte / laforge@gnumonks.org               http://www.gnumonks.org/
============================================================================
"If this were a dictatorship, it'd be a heck of a lot easier, just so long
 as I'm the dictator."  --  George W. Bush Dec 18, 2000

[-- Attachment #1.2: newnat-problems --]
[-- Type: text/plain, Size: 4350 bytes --]


Problems with locking and the current multirel/newnat code

The general problem is that somebody, either the nat core or the nat helper
have to traverse the sibling_list of ip_conntrack and possibly modify this
list while traversing it.

While this is not unsolvable, it is always ugly... maybe somebody has some 
ideas.

Possible ways of implementation:

a) traverse sibling_list and do not alter it while traversal
- grab readlock before traversal
- ip_ct_unexpect_related() just 'notes' the to-be-deleted expectations
- release readlock
- grab writelock
- delete all 'noted' expectations

b) let the helper do it by itself
- call helper once for every packet (like current code)
- helper grabs writelock before traversal
- helper can alter his list exclusively while traversing
  (he has to know what he is doing, if for example deleting while traversing)
- helper releases writelock after traversal


Either way we go, there is some locking involved.  'traditionally' we would
use the ip_conntrack_lock.  However, as described below, this has some problems
and using a seperate lock seems interesting.  In any case, it looks unclean.
Comments?


1. General locking problems if we would use a seperate sibling_lock:

  While iterating over the sibling list, the helper has to grab the
sibling_lock.  Then, while iterating, it calls expect_related/unexpect_related
functions.  Those functions need to grab the sibling_lock as well, as they have
to add/ remove items from the list.  Of course they can not grab the lock, as
it is already grabbed.

  So what we could do, is provide non-locking variants of expect_related and
unexpect_related, which are to be called in this case.

  However, expect_related and unexpect_related have to grab the
ip_conntrack_lock in addition to the sibling_lock, because they have to 
possibly modify the global expect_list.

destroy_expectations:
  Then there is destroy_expectations, which is called with an already locked
ip_conntrack_lock, and wishes to grab sibling_lock as well, because it also
has to delete expectations.  -> possibility of deadlock.

destroy_conntrack:
  Has to grab ip_conntrack_lock as well as sibling_lock, because it has
to remove itself from the sibling_list.


2. Using only one lock

  If we only use one lock, the already existing ip_conntrack_lock, it would
have to be grabbed for a long time.  And there is no functionality for 
atomically changing a read lock into a write lock.  So we would have to grab
a write lock before we call the helper / traverse the list, and could release
this only after the whole job is done.

  The problem with expect_related / unexpect_related still persists, because
they still want to grab an already taken lock.  Solution like above, provide
non-locking-counterparts.

destroy_expectations and/or destroy_conntrack shouldn't raise any additional
problems in this setup..

Disadvantage: Most of the time, the nat helper will be called without having
some actual work to do.  Most of the time, the traffic will not raise any
expectations, but still we have to grab the write lock ind advance, thus 
hurting SMP scalability :(


=== sat oct 12, 2002:
Unfortunately I (the inventor of this second lock) don't even remember
the exact purpose anymore.  Sometimes I feel really stupid.  so let's go
read the code and my personal notes again...

I think the idea of the expect_tuple_lock() was to protect the tuple/mask
inside a struct ip_conntrack_expect.  This way we can implement
ip_conntrack_change_expect() without the need to grab a writelock on
ip_conntrack_lock. This is not jut an optimization, but absolutely necessarry
for the NAT helpers:

We grab a readlock on ip_conntrack_lock and traverse the list of siblings.
For every sibling we call the nat helper function, which in turn wants to
alter the expectation by calling ip_conntrack_change_expect().Altering the
expectation needs some kind of locking.  If we only had one lock, we would
need a writelock on ip_conntrack_lock.  Since there is no primitive for
atomically upgrading a lock from readlock to writelock, we would need to
grab a writelock before traversing the sibling list.  And this sucks, since
nobody else can do anything inside connection tracking for a very long time.



[-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] Re: what's the lockingrules for ip_conntrack_expect_list?
  2002-10-11 18:42       ` [PATCH] " Martin Josefsson
@ 2002-10-12 12:25         ` Harald Welte
  2002-10-12 13:11           ` Martin Josefsson
  0 siblings, 1 reply; 19+ messages in thread
From: Harald Welte @ 2002-10-12 12:25 UTC (permalink / raw)
  To: Martin Josefsson; +Cc: Jozsef Kadlecsik, Netfilter-devel

[-- Attachment #1: Type: text/plain, Size: 1337 bytes --]

On Fri, Oct 11, 2002 at 08:42:51PM +0200, Martin Josefsson wrote:

> Harald, is this something you would accept and forward for inclusion?

See my other mail, I don't agree with everything said in the thread before.
Sorry for my late comments, I should have sent that mail two days earlier.

> And while we're at it, I still think we should change exp_for_packet to
> use __find_proto instead of ip_ct_find_proto to avoid the ASSERT's that
> people are complaining about. 

It's always ugly to export a non-locking __foo function as API to another
module.

> Sure they are harmless but users don't know that and send reports. (this
> problem goes away with Rustys patches for 2.5, exp_for_packet is completely
> removed)

Yes I agree with the problem, but not with the style of the solution. 
*sigh*. 

__find_proto should be inlined anyway. So if we put it into a header file,
which is included by conntrack and nat (ip_conntrack.h), we would avoid
exporting it as a symbol.

> -- 
> /Martin
-- 
Live long and prosper
- Harald Welte / laforge@gnumonks.org               http://www.gnumonks.org/
============================================================================
"If this were a dictatorship, it'd be a heck of a lot easier, just so long
 as I'm the dictator."  --  George W. Bush Dec 18, 2000

[-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: what's the lockingrules for ip_conntrack_expect_list?
  2002-10-12 12:17   ` Harald Welte
@ 2002-10-12 13:09     ` Martin Josefsson
  2002-10-12 13:20       ` Martin Josefsson
                         ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Martin Josefsson @ 2002-10-12 13:09 UTC (permalink / raw)
  To: Harald Welte; +Cc: Jozsef Kadlecsik, Netfilter-devel

On Sat, 2002-10-12 at 14:17, Harald Welte wrote:

> > > init_conntrack
> > > 	first:
> > > 		WRITE_LOCK(&ip_conntrack_lock);
> > > 		READ_LOCK(&ip_conntrack_expect_tuple_lock);
> > > 		LIST_FIND
> > 
> > Read-locking ip_conntrack_expect_tuple_lock is unnecessary here then.
> 
> Well, this is correct. But it's an optimiziation based on the assumption
> that there is only one place where we WRITE_LOCK(&expect_tuple_lock),
> which is in turn protected by READ_LOCK(&ip_conntrack_lock).  While this
> is true now, I'm not sure that this is a good idea in general.

Ok then that stays.
 
> > > 	later:
> > > 		WRITE_LOCK(&ip_conntrack_lock);
> > > 		LIST_DELETE
> 
> This is OK, since list_delete doesn't read the tuple/mask inside this expect.

Ok.
 
> > > ip_conntrack_expect_related
> > > 	WRITE_LOCK(&ip_conntrack_lock);
> > > 	LIST_FIND
> > > 	LIST_FIND
> > > 	list_prepend
> 
> This is clearly a bug, but works according to the optimization described 
> above.  I think we should either grab the tuple_lock or at least make 
> a verbose comment in the code describing why we don't grab it.

I'll add a read-lock on the tuple_lock here just to mark that it should
be taken.

All modifications of the liststructure are done under a write-locked
ip_conntrack_lock, the only things that happen under a read-locked
ip_conntrack_lock are a find and a change, nothing that changes the
structure. see below.
 
> > > ip_conntrack_change_expect
> > > 	MUST_BE_READ_LOCKED(&ip_conntrack_lock);
> > > 	LIST_FIND
> > >
> > > 	FAILS!! called from nat-helpers which only hold their own locks
> > > 	if any at all.
> 
> No.  The helper's help functions are called from do_bindings, which grab
> a readlock on ip_conntrack_lock before traversing the list of expectations
> matching this packet.  See also my comment above.

Ahh I missed this, the helper is called after exp_for_packet which has
already screwed up the lock-debug.

But we'll need to hold at least a read-lock on the tuple_lock during
that LIST_FIND since we only have a read-lock on ip_conntrack_lock.
and we modify tupe/mask inside this read-locked ip_conntrack_lock but
within a write-lock'd tuple_lock. And since we'll probably need to take
that write-lock anyway we might just as well take it from the beginning.
 
> Please correct me if I'm overlooking something.

I think you got everything covered.

So the lockingrules comes down to this if I've understood things
correctly:

1. All accesses to the expect-list are to be performed with
ip_conntrack_lock held.

2. All list-structure changes are done under a write-locked
ip_conntrack_lock and tuple_lock is not needed.

3. All other accesses are performed under at least a read-locked
ip_conntrack_lock.

4. All read-accesses to tuple/mask data inside the list are performed
under at least a read-locked tuple_lock.

5. All write-accesses to tuple/mask data inside the list are performed
under a write-locked tuple_lock.

Does that sound ok to you?

-- 
/Martin

Never argue with an idiot. They drag you down to their level, then beat
you with experience.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] Re: what's the lockingrules for ip_conntrack_expect_list?
  2002-10-12 12:25         ` Harald Welte
@ 2002-10-12 13:11           ` Martin Josefsson
  0 siblings, 0 replies; 19+ messages in thread
From: Martin Josefsson @ 2002-10-12 13:11 UTC (permalink / raw)
  To: Harald Welte; +Cc: Jozsef Kadlecsik, Netfilter-devel

On Sat, 2002-10-12 at 14:25, Harald Welte wrote:
 
> See my other mail, I don't agree with everything said in the thread before.
> Sorry for my late comments, I should have sent that mail two days earlier.

No problem, I'm happy that you cleared things up.

> It's always ugly to export a non-locking __foo function as API to another
> module.

Yes I know, but we need to do something about this.
 
> > Sure they are harmless but users don't know that and send reports. (this
> > problem goes away with Rustys patches for 2.5, exp_for_packet is completely
> > removed)
> 
> Yes I agree with the problem, but not with the style of the solution. 
> *sigh*. 
> 
> __find_proto should be inlined anyway. So if we put it into a header file,
> which is included by conntrack and nat (ip_conntrack.h), we would avoid
> exporting it as a symbol.

I'll whip up a patch.

-- 
/Martin

Never argue with an idiot. They drag you down to their level, then beat
you with experience.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: what's the lockingrules for ip_conntrack_expect_list?
  2002-10-12 13:09     ` Martin Josefsson
@ 2002-10-12 13:20       ` Martin Josefsson
  2002-10-12 13:29       ` [PATCH 2] " Martin Josefsson
  2002-10-12 14:34       ` what's the lockingrules for ip_conntrack_expect_list? Harald Welte
  2 siblings, 0 replies; 19+ messages in thread
From: Martin Josefsson @ 2002-10-12 13:20 UTC (permalink / raw)
  To: Harald Welte; +Cc: Jozsef Kadlecsik, Netfilter-devel

On Sat, 2002-10-12 at 15:09, Martin Josefsson wrote:
  
> > > > ip_conntrack_expect_related
> > > > 	WRITE_LOCK(&ip_conntrack_lock);
> > > > 	LIST_FIND
> > > > 	LIST_FIND
> > > > 	list_prepend
> > 
> > This is clearly a bug, but works according to the optimization described 
> > above.  I think we should either grab the tuple_lock or at least make 
> > a verbose comment in the code describing why we don't grab it.
> 
> I'll add a read-lock on the tuple_lock here just to mark that it should
> be taken.

Oops, spoke too soon, there already exists such a comment:

         /* Because of the write lock, no reader can walk the lists,
          * so there is no need to use the tuple lock too */

I'll leave it at that.

-- 
/Martin

Never argue with an idiot. They drag you down to their level, then beat
you with experience.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 2] Re: what's the lockingrules for ip_conntrack_expect_list?
  2002-10-12 13:09     ` Martin Josefsson
  2002-10-12 13:20       ` Martin Josefsson
@ 2002-10-12 13:29       ` Martin Josefsson
  2002-10-12 14:36         ` Harald Welte
  2002-10-12 14:34       ` what's the lockingrules for ip_conntrack_expect_list? Harald Welte
  2 siblings, 1 reply; 19+ messages in thread
From: Martin Josefsson @ 2002-10-12 13:29 UTC (permalink / raw)
  To: Harald Welte; +Cc: Jozsef Kadlecsik, Netfilter-devel

[-- Attachment #1: Type: text/plain, Size: 1217 bytes --]

On Sat, 2002-10-12 at 15:09, Martin Josefsson wrote:
> On Sat, 2002-10-12 at 14:17, Harald Welte wrote:

> > > > ip_conntrack_change_expect
> > > > 	MUST_BE_READ_LOCKED(&ip_conntrack_lock);
> > > > 	LIST_FIND
> > > >
> > > > 	FAILS!! called from nat-helpers which only hold their own locks
> > > > 	if any at all.
> > 
> > No.  The helper's help functions are called from do_bindings, which grab
> > a readlock on ip_conntrack_lock before traversing the list of expectations
> > matching this packet.  See also my comment above.
> 
> Ahh I missed this, the helper is called after exp_for_packet which has
> already screwed up the lock-debug.
> 
> But we'll need to hold at least a read-lock on the tuple_lock during
> that LIST_FIND since we only have a read-lock on ip_conntrack_lock.
> and we modify tupe/mask inside this read-locked ip_conntrack_lock but
> within a write-lock'd tuple_lock. And since we'll probably need to take
> that write-lock anyway we might just as well take it from the beginning.

Attached patch changes the write-locked tuple_lock to cover the entire
function.
Is this ok with you?

-- 
/Martin

Never argue with an idiot. They drag you down to their level, then beat
you with experience.

[-- Attachment #2: ip_conntrack_change_expect-lockfix.diff --]
[-- Type: text/plain, Size: 1494 bytes --]

--- linux-2.4.20-pre10/net/ipv4/netfilter/ip_conntrack_core.c.mjufs	2002-10-12 15:23:22.000000000 +0200
+++ linux-2.4.20-pre10/net/ipv4/netfilter/ip_conntrack_core.c	2002-10-12 15:23:00.000000000 +0200
@@ -1060,7 +1060,10 @@
 int ip_conntrack_change_expect(struct ip_conntrack_expect *expect,
 			       struct ip_conntrack_tuple *newtuple)
 {
+	int ret;
+
 	MUST_BE_READ_LOCKED(&ip_conntrack_lock);
+	WRITE_LOCK(&ip_conntrack_expect_tuple_lock);
 
 	DEBUGP("change_expect:\n");
 	DEBUGP("exp tuple: "); DUMP_TUPLE(&expect->tuple);
@@ -1073,26 +1076,25 @@
 		    && LIST_FIND(&ip_conntrack_expect_list, expect_clash,
 			         struct ip_conntrack_expect *, newtuple, &expect->mask)) {
 			/* Force NAT to find an unused tuple */
-			return -1;
+			ret = -1;
 		} else {
-			WRITE_LOCK(&ip_conntrack_expect_tuple_lock);
 			memcpy(&expect->ct_tuple, &expect->tuple, sizeof(expect->tuple));
 			memcpy(&expect->tuple, newtuple, sizeof(expect->tuple));
-			WRITE_UNLOCK(&ip_conntrack_expect_tuple_lock);
-			return 0;
+			ret = 0;
 		}
 	} else {
 		/* Resent packet */
 		DEBUGP("change expect: resent packet\n");
 		if (ip_ct_tuple_equal(&expect->tuple, newtuple)) {
-			return 0;
+			ret = 0;
 		} else {
 			/* Force NAT to choose again the same port */
-			return -1;
+			ret = -1;
 		}
 	}
+	WRITE_UNLOCK(&ip_conntrack_expect_tuple_lock);
 	
-	return -1;
+	return ret;
 }
 
 /* Alter reply tuple (maybe alter helper).  If it's already taken,

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: what's the lockingrules for ip_conntrack_expect_list?
  2002-10-12 13:09     ` Martin Josefsson
  2002-10-12 13:20       ` Martin Josefsson
  2002-10-12 13:29       ` [PATCH 2] " Martin Josefsson
@ 2002-10-12 14:34       ` Harald Welte
  2 siblings, 0 replies; 19+ messages in thread
From: Harald Welte @ 2002-10-12 14:34 UTC (permalink / raw)
  To: Martin Josefsson; +Cc: Jozsef Kadlecsik, Netfilter-devel

[-- Attachment #1: Type: text/plain, Size: 2201 bytes --]

On Sat, Oct 12, 2002 at 03:09:10PM +0200, Martin Josefsson wrote:

> > > > ip_conntrack_change_expect
> > > > 	MUST_BE_READ_LOCKED(&ip_conntrack_lock);
> > > > 	LIST_FIND
> > > >
> > > > 	FAILS!! called from nat-helpers which only hold their own locks
> > > > 	if any at all.
> > 
> > No.  The helper's help functions are called from do_bindings, which grab
> > a readlock on ip_conntrack_lock before traversing the list of expectations
> > matching this packet.  See also my comment above.
> 
> Ahh I missed this, the helper is called after exp_for_packet which has
> already screwed up the lock-debug.
> 
> But we'll need to hold at least a read-lock on the tuple_lock during
> that LIST_FIND since we only have a read-lock on ip_conntrack_lock.
> and we modify tupe/mask inside this read-locked ip_conntrack_lock but
> within a write-lock'd tuple_lock. And since we'll probably need to take
> that write-lock anyway we might just as well take it from the beginning.

this is correct, we should grab a write-tuple-lock around the whole block,
just as your patch did.  Could you prepare (and test) a patch which does
only this (and the ip_ct_find_proto() change?).

> So the lockingrules comes down to this if I've understood things
> correctly:
> 
> 1. All accesses to the expect-list are to be performed with
> ip_conntrack_lock held.
> 
> 2. All list-structure changes are done under a write-locked
> ip_conntrack_lock and tuple_lock is not needed.
> 
> 3. All other accesses are performed under at least a read-locked
> ip_conntrack_lock.
> 
> 4. All read-accesses to tuple/mask data inside the list are performed
> under at least a read-locked tuple_lock.
> 
> 5. All write-accesses to tuple/mask data inside the list are performed
> under a write-locked tuple_lock.
> 
> Does that sound ok to you?

sounds correct.

> /Martin

-- 
Live long and prosper
- Harald Welte / laforge@gnumonks.org               http://www.gnumonks.org/
============================================================================
"If this were a dictatorship, it'd be a heck of a lot easier, just so long
 as I'm the dictator."  --  George W. Bush Dec 18, 2000

[-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2] Re: what's the lockingrules for ip_conntrack_expect_list?
  2002-10-12 13:29       ` [PATCH 2] " Martin Josefsson
@ 2002-10-12 14:36         ` Harald Welte
  2002-10-12 14:59           ` Min Li
  2002-10-12 15:32           ` Martin Josefsson
  0 siblings, 2 replies; 19+ messages in thread
From: Harald Welte @ 2002-10-12 14:36 UTC (permalink / raw)
  To: Martin Josefsson; +Cc: Jozsef Kadlecsik, Netfilter-devel

[-- Attachment #1: Type: text/plain, Size: 616 bytes --]

On Sat, Oct 12, 2002 at 03:29:12PM +0200, Martin Josefsson wrote:
> Attached patch changes the write-locked tuple_lock to cover the entire
> function.
> Is this ok with you?

looks fine.  Now we only need the find_proto() change, and I'll submit 
the two combined to the kernel.

> /Martin
-- 
Live long and prosper
- Harald Welte / laforge@gnumonks.org               http://www.gnumonks.org/
============================================================================
"If this were a dictatorship, it'd be a heck of a lot easier, just so long
 as I'm the dictator."  --  George W. Bush Dec 18, 2000

[-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2] Re: what's the lockingrules for ip_conntrack_expect_list?
  2002-10-12 14:36         ` Harald Welte
@ 2002-10-12 14:59           ` Min Li
  2002-10-12 15:32           ` Martin Josefsson
  1 sibling, 0 replies; 19+ messages in thread
From: Min Li @ 2002-10-12 14:59 UTC (permalink / raw)
  Cc: Netfilter-devel


Dear All,

I have to implement one senario which is to differentiate traffic coming
through the router. For any new connections forwarded by the router, I
have to change the source address of those packets. Also, the new
source addresses are known at user space only. So, by simply using
iptables NAT table, I won't be able to do that. In my application, I plan
to inform the kernel about the new source addresses, and then create a
address mapping table in the kernel. Then add a new module in iptables to
take care of specially my scenario. So, in the kernel, I will do all the
address translation based on the connection tracking results. However,
that means I need to change iptables tool as well as adding a new module.
I have never done linux kernel module programming before. Will it be
possible for you to give me some advice?

Thanks a lot.

Min

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2] Re: what's the lockingrules for ip_conntrack_expect_list?
  2002-10-12 14:36         ` Harald Welte
  2002-10-12 14:59           ` Min Li
@ 2002-10-12 15:32           ` Martin Josefsson
  2002-10-14 11:33             ` Harald Welte
  2002-10-23 15:16             ` how you use nfnetlink-ctnetlink-0.11.patch marian stagarescu
  1 sibling, 2 replies; 19+ messages in thread
From: Martin Josefsson @ 2002-10-12 15:32 UTC (permalink / raw)
  To: Harald Welte; +Cc: Jozsef Kadlecsik, Netfilter-devel

[-- Attachment #1: Type: text/plain, Size: 972 bytes --]

On Sat, 2002-10-12 at 16:36, Harald Welte wrote:
> On Sat, Oct 12, 2002 at 03:29:12PM +0200, Martin Josefsson wrote:
> > Attached patch changes the write-locked tuple_lock to cover the entire
> > function.
> > Is this ok with you?
> 
> looks fine.  Now we only need the find_proto() change, and I'll submit 
> the two combined to the kernel.

The find_proto() change is a little trickier... it depends on
ip_conntrack_protocol.h. I suggest that we move find_proto() to
ip_conntrack_protocol.h. And then find_proto() also depends on
listhelp.h... include-dependencies isn't fun.

Making find_proto() an inline adds two exported symbols, protocol_list
and ip_conntrack_generic_protocol.

I've attached a patch that at least compiles, I havn't tried booting it
yet. Does this look even remotely ok to you? If it does I'll try booting
it and send you a combined patch.

-- 
/Martin

Never argue with an idiot. They drag you down to their level, then beat
you with experience.

[-- Attachment #2: find_proto-inline.diff --]
[-- Type: text/plain, Size: 10383 bytes --]

--- linux-2.4.20-pre10/include/linux/netfilter_ipv4/ip_conntrack_core.h.meep	2002-10-12 16:54:30.000000000 +0200
+++ linux-2.4.20-pre10/include/linux/netfilter_ipv4/ip_conntrack_core.h	2002-10-12 17:17:16.000000000 +0200
@@ -16,9 +16,6 @@
 
 struct ip_conntrack_protocol;
 extern struct ip_conntrack_protocol *ip_ct_find_proto(u_int8_t protocol);
-/* Like above, but you already have conntrack read lock. */
-extern struct ip_conntrack_protocol *__find_proto(u_int8_t protocol);
-extern struct list_head protocol_list;
 
 /* Returns conntrack if it dealt with ICMP, and filled in skb->nfct */
 extern struct ip_conntrack *icmp_error_track(struct sk_buff *skb,
--- linux-2.4.20-pre10/include/linux/netfilter_ipv4/ip_conntrack_protocol.h.meep	2002-10-12 16:55:28.000000000 +0200
+++ linux-2.4.20-pre10/include/linux/netfilter_ipv4/ip_conntrack_protocol.h	2002-10-12 17:14:40.000000000 +0200
@@ -62,4 +62,28 @@
 extern struct ip_conntrack_protocol ip_conntrack_protocol_udp;
 extern struct ip_conntrack_protocol ip_conntrack_protocol_icmp;
 extern int ip_conntrack_protocol_tcp_init(void);
+
+extern struct list_head protocol_list;
+extern struct ip_conntrack_protocol ip_conntrack_generic_protocol;
+
+static inline int proto_cmpfn(const struct ip_conntrack_protocol *curr,
+			      u_int8_t protocol)
+{
+	return protocol == curr->proto;
+}
+
+static inline struct ip_conntrack_protocol *__find_proto(u_int8_t protocol)
+{
+	struct ip_conntrack_protocol *p;
+
+	MUST_BE_READ_LOCKED(&ip_conntrack_lock);
+	p = LIST_FIND(&protocol_list, proto_cmpfn,
+		      struct ip_conntrack_protocol *, protocol);
+	if (!p)
+		p = &ip_conntrack_generic_protocol;
+
+	return p;
+}
+
+
 #endif /*_IP_CONNTRACK_PROTOCOL_H*/
--- linux-2.4.20-pre10/net/ipv4/netfilter/ip_conntrack_core.c.meep	2002-10-12 15:29:47.000000000 +0200
+++ linux-2.4.20-pre10/net/ipv4/netfilter/ip_conntrack_core.c	2002-10-12 17:09:32.000000000 +0200
@@ -39,11 +39,11 @@
 #define ASSERT_READ_LOCK(x) MUST_BE_READ_LOCKED(&ip_conntrack_lock)
 #define ASSERT_WRITE_LOCK(x) MUST_BE_WRITE_LOCKED(&ip_conntrack_lock)
 
+#include <linux/netfilter_ipv4/listhelp.h>
 #include <linux/netfilter_ipv4/ip_conntrack.h>
 #include <linux/netfilter_ipv4/ip_conntrack_protocol.h>
 #include <linux/netfilter_ipv4/ip_conntrack_helper.h>
 #include <linux/netfilter_ipv4/ip_conntrack_core.h>
-#include <linux/netfilter_ipv4/listhelp.h>
 
 #define IP_CONNTRACK_VERSION	"2.1"
 
@@ -66,27 +66,6 @@
 struct list_head *ip_conntrack_hash;
 static kmem_cache_t *ip_conntrack_cachep;
 
-extern struct ip_conntrack_protocol ip_conntrack_generic_protocol;
-
-static inline int proto_cmpfn(const struct ip_conntrack_protocol *curr,
-			      u_int8_t protocol)
-{
-	return protocol == curr->proto;
-}
-
-struct ip_conntrack_protocol *__find_proto(u_int8_t protocol)
-{
-	struct ip_conntrack_protocol *p;
-
-	MUST_BE_READ_LOCKED(&ip_conntrack_lock);
-	p = LIST_FIND(&protocol_list, proto_cmpfn,
-		      struct ip_conntrack_protocol *, protocol);
-	if (!p)
-		p = &ip_conntrack_generic_protocol;
-
-	return p;
-}
-
 struct ip_conntrack_protocol *ip_ct_find_proto(u_int8_t protocol)
 {
 	struct ip_conntrack_protocol *p;
--- linux-2.4.20-pre10/net/ipv4/netfilter/ip_nat_core.c.meep	2002-10-12 15:37:02.000000000 +0200
+++ linux-2.4.20-pre10/net/ipv4/netfilter/ip_nat_core.c	2002-10-12 17:30:38.000000000 +0200
@@ -21,6 +21,7 @@
 #define ASSERT_READ_LOCK(x) MUST_BE_READ_LOCKED(&ip_nat_lock)
 #define ASSERT_WRITE_LOCK(x) MUST_BE_WRITE_LOCKED(&ip_nat_lock)
 
+#include <linux/netfilter_ipv4/listhelp.h>
 #include <linux/netfilter_ipv4/ip_conntrack.h>
 #include <linux/netfilter_ipv4/ip_conntrack_core.h>
 #include <linux/netfilter_ipv4/ip_conntrack_protocol.h>
@@ -29,7 +30,6 @@
 #include <linux/netfilter_ipv4/ip_nat_core.h>
 #include <linux/netfilter_ipv4/ip_nat_helper.h>
 #include <linux/netfilter_ipv4/ip_conntrack_helper.h>
-#include <linux/netfilter_ipv4/listhelp.h>
 
 #if 0
 #define DEBUGP printk
@@ -739,7 +739,7 @@
 	int ret = 1;
 
 	MUST_BE_READ_LOCKED(&ip_conntrack_lock);
-	proto = ip_ct_find_proto((*pskb)->nh.iph->protocol);
+	proto = __find_proto((*pskb)->nh.iph->protocol);
 	if (proto->exp_matches_pkt)
 		ret = proto->exp_matches_pkt(exp, pskb);
 
--- linux-2.4.20-pre10/net/ipv4/netfilter/ip_conntrack_standalone.c.meep	2002-10-12 17:00:17.000000000 +0200
+++ linux-2.4.20-pre10/net/ipv4/netfilter/ip_conntrack_standalone.c	2002-10-12 17:25:42.000000000 +0200
@@ -21,11 +21,11 @@
 #define ASSERT_READ_LOCK(x) MUST_BE_READ_LOCKED(&ip_conntrack_lock)
 #define ASSERT_WRITE_LOCK(x) MUST_BE_WRITE_LOCKED(&ip_conntrack_lock)
 
+#include <linux/netfilter_ipv4/listhelp.h>
 #include <linux/netfilter_ipv4/ip_conntrack.h>
 #include <linux/netfilter_ipv4/ip_conntrack_protocol.h>
 #include <linux/netfilter_ipv4/ip_conntrack_core.h>
 #include <linux/netfilter_ipv4/ip_conntrack_helper.h>
-#include <linux/netfilter_ipv4/listhelp.h>
 
 #if 0
 #define DEBUGP printk
@@ -361,6 +361,8 @@
 EXPORT_SYMBOL(ip_ct_selective_cleanup);
 EXPORT_SYMBOL(ip_ct_refresh);
 EXPORT_SYMBOL(ip_ct_find_proto);
+EXPORT_SYMBOL(protocol_list);
+EXPORT_SYMBOL(ip_conntrack_generic_protocol);
 EXPORT_SYMBOL(ip_ct_find_helper);
 EXPORT_SYMBOL(ip_conntrack_expect_related);
 EXPORT_SYMBOL(ip_conntrack_change_expect);
--- linux-2.4.20-pre10/net/ipv4/netfilter/ip_conntrack_proto_generic.c.meep	2002-10-12 17:09:55.000000000 +0200
+++ linux-2.4.20-pre10/net/ipv4/netfilter/ip_conntrack_proto_generic.c	2002-10-12 17:11:35.000000000 +0200
@@ -1,7 +1,12 @@
 #include <linux/types.h>
 #include <linux/sched.h>
 #include <linux/timer.h>
+
+#define ASSERT_READ_LOCK(x)
+#define ASSERT_WRITE_LOCK(x)
+
 #include <linux/netfilter.h>
+#include <linux/netfilter_ipv4/listhelp.h>
 #include <linux/netfilter_ipv4/ip_conntrack_protocol.h>
 
 #define GENERIC_TIMEOUT (600*HZ)
--- linux-2.4.20-pre10/net/ipv4/netfilter/ip_conntrack_proto_tcp.c.meep	2002-10-12 17:11:56.000000000 +0200
+++ linux-2.4.20-pre10/net/ipv4/netfilter/ip_conntrack_proto_tcp.c	2002-10-12 17:12:24.000000000 +0200
@@ -11,6 +11,10 @@
 
 #include <net/tcp.h>
 
+#define ASSERT_READ_LOCK(x)
+#define ASSERT_WRITE_LOCK(x)
+
+#include <linux/netfilter_ipv4/listhelp.h>
 #include <linux/netfilter_ipv4/ip_conntrack.h>
 #include <linux/netfilter_ipv4/ip_conntrack_protocol.h>
 #include <linux/netfilter_ipv4/lockhelp.h>
--- linux-2.4.20-pre10/net/ipv4/netfilter/ip_conntrack_proto_udp.c.meep	2002-10-12 17:12:34.000000000 +0200
+++ linux-2.4.20-pre10/net/ipv4/netfilter/ip_conntrack_proto_udp.c	2002-10-12 17:12:55.000000000 +0200
@@ -4,6 +4,11 @@
 #include <linux/netfilter.h>
 #include <linux/in.h>
 #include <linux/udp.h>
+
+#define ASSERT_READ_LOCK(x)
+#define ASSERT_WRITE_LOCK(x)
+
+#include <linux/netfilter_ipv4/listhelp.h>
 #include <linux/netfilter_ipv4/ip_conntrack_protocol.h>
 
 #define UDP_TIMEOUT (30*HZ)
--- linux-2.4.20-pre10/net/ipv4/netfilter/ip_conntrack_proto_icmp.c.meep	2002-10-12 17:13:10.000000000 +0200
+++ linux-2.4.20-pre10/net/ipv4/netfilter/ip_conntrack_proto_icmp.c	2002-10-12 17:13:22.000000000 +0200
@@ -4,6 +4,11 @@
 #include <linux/netfilter.h>
 #include <linux/in.h>
 #include <linux/icmp.h>
+
+#define ASSERT_READ_LOCK(x)
+#define ASSERT_WRITE_LOCK(x)
+
+#include <linux/netfilter_ipv4/listhelp.h>
 #include <linux/netfilter_ipv4/ip_conntrack_protocol.h>
 
 #define ICMP_TIMEOUT (30*HZ)
--- linux-2.4.20-pre10/net/ipv4/netfilter/ip_nat_helper.c.meep	2002-10-12 17:15:10.000000000 +0200
+++ linux-2.4.20-pre10/net/ipv4/netfilter/ip_nat_helper.c	2002-10-12 17:15:19.000000000 +0200
@@ -26,13 +26,13 @@
 #define ASSERT_READ_LOCK(x) MUST_BE_READ_LOCKED(&ip_nat_lock)
 #define ASSERT_WRITE_LOCK(x) MUST_BE_WRITE_LOCKED(&ip_nat_lock)
 
+#include <linux/netfilter_ipv4/listhelp.h>
 #include <linux/netfilter_ipv4/ip_conntrack.h>
 #include <linux/netfilter_ipv4/ip_conntrack_helper.h>
 #include <linux/netfilter_ipv4/ip_nat.h>
 #include <linux/netfilter_ipv4/ip_nat_protocol.h>
 #include <linux/netfilter_ipv4/ip_nat_core.h>
 #include <linux/netfilter_ipv4/ip_nat_helper.h>
-#include <linux/netfilter_ipv4/listhelp.h>
 
 #if 0
 #define DEBUGP printk
--- linux-2.4.20-pre10/net/ipv4/netfilter/ip_nat_rule.c.meep	2002-10-12 17:15:28.000000000 +0200
+++ linux-2.4.20-pre10/net/ipv4/netfilter/ip_nat_rule.c	2002-10-12 17:15:45.000000000 +0200
@@ -15,11 +15,11 @@
 #define ASSERT_READ_LOCK(x) MUST_BE_READ_LOCKED(&ip_nat_lock)
 #define ASSERT_WRITE_LOCK(x) MUST_BE_WRITE_LOCKED(&ip_nat_lock)
 
+#include <linux/netfilter_ipv4/listhelp.h>
 #include <linux/netfilter_ipv4/ip_tables.h>
 #include <linux/netfilter_ipv4/ip_nat.h>
 #include <linux/netfilter_ipv4/ip_nat_core.h>
 #include <linux/netfilter_ipv4/ip_nat_rule.h>
-#include <linux/netfilter_ipv4/listhelp.h>
 
 #if 0
 #define DEBUGP printk
--- linux-2.4.20-pre10/net/ipv4/netfilter/ip_nat_standalone.c.meep	2002-10-12 17:15:53.000000000 +0200
+++ linux-2.4.20-pre10/net/ipv4/netfilter/ip_nat_standalone.c	2002-10-12 17:16:01.000000000 +0200
@@ -28,6 +28,7 @@
 #define ASSERT_READ_LOCK(x) MUST_BE_READ_LOCKED(&ip_nat_lock)
 #define ASSERT_WRITE_LOCK(x) MUST_BE_WRITE_LOCKED(&ip_nat_lock)
 
+#include <linux/netfilter_ipv4/listhelp.h>
 #include <linux/netfilter_ipv4/ip_nat.h>
 #include <linux/netfilter_ipv4/ip_nat_rule.h>
 #include <linux/netfilter_ipv4/ip_nat_protocol.h>
@@ -35,7 +36,6 @@
 #include <linux/netfilter_ipv4/ip_nat_helper.h>
 #include <linux/netfilter_ipv4/ip_tables.h>
 #include <linux/netfilter_ipv4/ip_conntrack_core.h>
-#include <linux/netfilter_ipv4/listhelp.h>
 
 #if 0
 #define DEBUGP printk
--- linux-2.4.20-pre10/net/ipv4/netfilter/ip_fw_compat_masq.c.meep	2002-10-12 17:20:08.000000000 +0200
+++ linux-2.4.20-pre10/net/ipv4/netfilter/ip_fw_compat_masq.c	2002-10-12 17:20:35.000000000 +0200
@@ -20,11 +20,11 @@
 #define ASSERT_READ_LOCK(x) MUST_BE_READ_LOCKED(&ip_conntrack_lock)
 #define ASSERT_WRITE_LOCK(x) MUST_BE_WRITE_LOCKED(&ip_conntrack_lock)
 
+#include <linux/netfilter_ipv4/listhelp.h>
 #include <linux/netfilter_ipv4/ip_conntrack.h>
 #include <linux/netfilter_ipv4/ip_conntrack_core.h>
 #include <linux/netfilter_ipv4/ip_nat.h>
 #include <linux/netfilter_ipv4/ip_nat_core.h>
-#include <linux/netfilter_ipv4/listhelp.h>
 
 #if 0
 #define DEBUGP printk

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2] Re: what's the lockingrules for ip_conntrack_expect_list?
  2002-10-12 15:32           ` Martin Josefsson
@ 2002-10-14 11:33             ` Harald Welte
  2002-10-14 13:26               ` Martin Josefsson
  2002-10-23 15:16             ` how you use nfnetlink-ctnetlink-0.11.patch marian stagarescu
  1 sibling, 1 reply; 19+ messages in thread
From: Harald Welte @ 2002-10-14 11:33 UTC (permalink / raw)
  To: Martin Josefsson; +Cc: Jozsef Kadlecsik, Netfilter-devel

[-- Attachment #1: Type: text/plain, Size: 1244 bytes --]

On Sat, Oct 12, 2002 at 05:32:55PM +0200, Martin Josefsson wrote:
> On Sat, 2002-10-12 at 16:36, Harald Welte wrote:
> > On Sat, Oct 12, 2002 at 03:29:12PM +0200, Martin Josefsson wrote:
> > > Attached patch changes the write-locked tuple_lock to cover the entire
> > > function.
> > > Is this ok with you?
> > 
> > looks fine.  Now we only need the find_proto() change, and I'll submit 
> > the two combined to the kernel.
> 
> The find_proto() change is a little trickier... it depends on
> ip_conntrack_protocol.h. I suggest that we move find_proto() to
> ip_conntrack_protocol.h. And then find_proto() also depends on
> listhelp.h... include-dependencies isn't fun.
> 
> Making find_proto() an inline adds two exported symbols, protocol_list
> and ip_conntrack_generic_protocol.

I didn't think about this, sorry.  So I think we now go for the previous
solution... export an __ function :(

> /Martin
-- 
Live long and prosper
- Harald Welte / laforge@gnumonks.org               http://www.gnumonks.org/
============================================================================
"If this were a dictatorship, it'd be a heck of a lot easier, just so long
 as I'm the dictator."  --  George W. Bush Dec 18, 2000

[-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 2] Re: what's the lockingrules for ip_conntrack_expect_list?
  2002-10-14 11:33             ` Harald Welte
@ 2002-10-14 13:26               ` Martin Josefsson
  0 siblings, 0 replies; 19+ messages in thread
From: Martin Josefsson @ 2002-10-14 13:26 UTC (permalink / raw)
  To: Harald Welte; +Cc: Jozsef Kadlecsik, Netfilter-devel

[-- Attachment #1: Type: text/plain, Size: 674 bytes --]

On Mon, 2002-10-14 at 13:33, Harald Welte wrote:

> > Making find_proto() an inline adds two exported symbols, protocol_list
> > and ip_conntrack_generic_protocol.
> 
> I didn't think about this, sorry.  So I think we now go for the previous
> solution... export an __ function :(

You can't think of everything :)

I saw that you submitted the __ patch, I havn't had time to test the
ip_conntrack_change_expect patch but it looks correct to me, and it
compiles. If you feel dangerous you can submit it.

I've attached it again. (it applies cleanly to 2.4.20-pre10)
 
-- 
/Martin

Never argue with an idiot. They drag you down to their level, then beat
you with experience.

[-- Attachment #2: ip_conntrack_change_expect-lockfix.diff --]
[-- Type: text/plain, Size: 1494 bytes --]

--- linux-2.4.20-pre10/net/ipv4/netfilter/ip_conntrack_core.c.mjufs	2002-10-12 15:23:22.000000000 +0200
+++ linux-2.4.20-pre10/net/ipv4/netfilter/ip_conntrack_core.c	2002-10-12 15:23:00.000000000 +0200
@@ -1060,7 +1060,10 @@
 int ip_conntrack_change_expect(struct ip_conntrack_expect *expect,
 			       struct ip_conntrack_tuple *newtuple)
 {
+	int ret;
+
 	MUST_BE_READ_LOCKED(&ip_conntrack_lock);
+	WRITE_LOCK(&ip_conntrack_expect_tuple_lock);
 
 	DEBUGP("change_expect:\n");
 	DEBUGP("exp tuple: "); DUMP_TUPLE(&expect->tuple);
@@ -1073,26 +1076,25 @@
 		    && LIST_FIND(&ip_conntrack_expect_list, expect_clash,
 			         struct ip_conntrack_expect *, newtuple, &expect->mask)) {
 			/* Force NAT to find an unused tuple */
-			return -1;
+			ret = -1;
 		} else {
-			WRITE_LOCK(&ip_conntrack_expect_tuple_lock);
 			memcpy(&expect->ct_tuple, &expect->tuple, sizeof(expect->tuple));
 			memcpy(&expect->tuple, newtuple, sizeof(expect->tuple));
-			WRITE_UNLOCK(&ip_conntrack_expect_tuple_lock);
-			return 0;
+			ret = 0;
 		}
 	} else {
 		/* Resent packet */
 		DEBUGP("change expect: resent packet\n");
 		if (ip_ct_tuple_equal(&expect->tuple, newtuple)) {
-			return 0;
+			ret = 0;
 		} else {
 			/* Force NAT to choose again the same port */
-			return -1;
+			ret = -1;
 		}
 	}
+	WRITE_UNLOCK(&ip_conntrack_expect_tuple_lock);
 	
-	return -1;
+	return ret;
 }
 
 /* Alter reply tuple (maybe alter helper).  If it's already taken,

^ permalink raw reply	[flat|nested] 19+ messages in thread

* how you use nfnetlink-ctnetlink-0.11.patch
  2002-10-12 15:32           ` Martin Josefsson
  2002-10-14 11:33             ` Harald Welte
@ 2002-10-23 15:16             ` marian stagarescu
  2002-10-23 15:52               ` Harald Welte
  1 sibling, 1 reply; 19+ messages in thread
From: marian stagarescu @ 2002-10-23 15:16 UTC (permalink / raw)
  To: laforge, jschlst; +Cc: Netfilter-devel

can someone give me some input on this patch: 
stability and how to use it ? 
sometime ago i was looking for a way to flush individual conntrack
entries. i think this patch may let me access conntrack entries and, for
example, set their TTL to 0 so that it can be flushed. but i'm not user
how to use it: does it have a corresponding userland iptables part ?


thanks,
marian

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: how you use nfnetlink-ctnetlink-0.11.patch
  2002-10-23 15:16             ` how you use nfnetlink-ctnetlink-0.11.patch marian stagarescu
@ 2002-10-23 15:52               ` Harald Welte
  0 siblings, 0 replies; 19+ messages in thread
From: Harald Welte @ 2002-10-23 15:52 UTC (permalink / raw)
  To: marian stagarescu; +Cc: jschlst, Netfilter-devel

[-- Attachment #1: Type: text/plain, Size: 994 bytes --]

On Wed, Oct 23, 2002 at 11:16:36AM -0400, marian stagarescu wrote:
> can someone give me some input on this patch: 
> stability and how to use it ? 
> sometime ago i was looking for a way to flush individual conntrack
> entries. i think this patch may let me access conntrack entries and, for
> example, set their TTL to 0 so that it can be flushed. but i'm not user
> how to use it: does it have a corresponding userland iptables part ?

It has a userspace library, called libctnetlink (see our cvs tree or
http://cvs.netfilter.org/cgi-bin/cvsweb/netfilter/iptables2/libctnetlink/

This library has an example program, called 'ctnltest.c'

> thanks,
> marian

-- 
Live long and prosper
- Harald Welte / laforge@gnumonks.org               http://www.gnumonks.org/
============================================================================
"If this were a dictatorship, it'd be a heck of a lot easier, just so long
 as I'm the dictator."  --  George W. Bush Dec 18, 2000

[-- Attachment #2: Type: application/pgp-signature, Size: 232 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2002-10-23 15:52 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2002-10-10 21:40 what's the lockingrules for ip_conntrack_expect_list? Martin Josefsson
2002-10-11 13:06 ` Jozsef Kadlecsik
2002-10-11 13:56   ` Martin Josefsson
2002-10-11 14:07     ` Jozsef Kadlecsik
2002-10-11 18:42       ` [PATCH] " Martin Josefsson
2002-10-12 12:25         ` Harald Welte
2002-10-12 13:11           ` Martin Josefsson
2002-10-12 12:17   ` Harald Welte
2002-10-12 13:09     ` Martin Josefsson
2002-10-12 13:20       ` Martin Josefsson
2002-10-12 13:29       ` [PATCH 2] " Martin Josefsson
2002-10-12 14:36         ` Harald Welte
2002-10-12 14:59           ` Min Li
2002-10-12 15:32           ` Martin Josefsson
2002-10-14 11:33             ` Harald Welte
2002-10-14 13:26               ` Martin Josefsson
2002-10-23 15:16             ` how you use nfnetlink-ctnetlink-0.11.patch marian stagarescu
2002-10-23 15:52               ` Harald Welte
2002-10-12 14:34       ` what's the lockingrules for ip_conntrack_expect_list? Harald Welte

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.