* [PATCH] tipc: checking returns and Re: Possible Circular Locking in TIPC
From: Jarek Poplawski @ 2006-12-28 12:17 UTC
To: Eric Sesterhenn; +Cc: Per Liden, netdev, linux-kernel

On 22-12-2006 15:28, Eric Sesterhenn wrote:
> hi,
>
> while running my usual stuff on 2.6.20-rc1-git5, sfuzz (http://www.digitaldwarf.be/products/sfuzz.c)
> did the following, to produce the lockdep warning below:
...
> Here is the stacktrace:
>
> [ 313.239556] =======================================================
> [ 313.239718] [ INFO: possible circular locking dependency detected ]
> [ 313.239795] 2.6.20-rc1-git5 #26
> [ 313.239858] -------------------------------------------------------
> [ 313.239929] sfuzz/4133 is trying to acquire lock:
> [ 313.239996] (ref_table_lock){-+..}, at: [<c04cd0a9>] tipc_ref_discard+0x29/0xe0
> [ 313.241101]
> [ 313.241105] but task is already holding lock:
> [ 313.241225] (&table[i].lock){-+..}, at: [<c04cb7c0>] tipc_deleteport+0x40/0x1a0
> [ 313.241524]
> [ 313.241528] which lock already depends on the new lock.
> [ 313.241535]
> [ 313.241709]
> [ 313.241713] the existing dependency chain (in reverse order) is:
> [ 313.241837]
> [ 313.241841] -> #1 (&table[i].lock){-+..}:
> [ 313.242096] [<c01366c5>] __lock_acquire+0xd05/0xde0
> [ 313.242562] [<c0136809>] lock_acquire+0x69/0xa0
> [ 313.243013] [<c04d4040>] _spin_lock_bh+0x40/0x60
> [ 313.243476] [<c04cd1cb>] tipc_ref_acquire+0x6b/0xe0
> [ 313.244115] [<c04cac73>] tipc_createport_raw+0x33/0x260
> [ 313.244562] [<c04cbe21>] tipc_createport+0x41/0x120
> [ 313.245007] [<c04c57ec>] tipc_subscr_start+0xcc/0x120
> [ 313.245458] [<c04bdb56>] process_signal_queue+0x56/0xa0
> [ 313.245906] [<c011ea18>] tasklet_action+0x38/0x80
> [ 313.246361] [<c011ecbb>] __do_softirq+0x5b/0xc0
> [ 313.246817] [<c01060e8>] do_softirq+0x88/0xe0
> [ 313.247450] [<ffffffff>] 0xffffffff
> [ 313.247894]
> [ 313.247898] -> #0 (ref_table_lock){-+..}:
> [ 313.248155] [<c0136415>] __lock_acquire+0xa55/0xde0
> [ 313.248601] [<c0136809>] lock_acquire+0x69/0xa0
> [ 313.249037] [<c04d4100>] _write_lock_bh+0x40/0x60
> [ 313.249486] [<c04cd0a9>] tipc_ref_discard+0x29/0xe0
> [ 313.249922] [<c04cb7da>] tipc_deleteport+0x5a/0x1a0
> [ 313.250543] [<c04cd4f8>] tipc_create+0x58/0x160
> [ 313.250980] [<c0431cb2>] __sock_create+0x112/0x280
> [ 313.251422] [<c0431e5a>] sock_create+0x1a/0x20
> [ 313.251863] [<c04320fb>] sys_socket+0x1b/0x40
> [ 313.252301] [<c0432a72>] sys_socketcall+0x92/0x260
> [ 313.252738] [<c0102fd0>] syscall_call+0x7/0xb
> [ 313.253175] [<ffffffff>] 0xffffffff
> [ 313.253778]
> [ 313.253782] other info that might help us debug this:
> [ 313.253790]
> [ 313.253956] 1 lock held by sfuzz/4133:
> [ 313.254019] #0: (&table[i].lock){-+..}, at: [<c04cb7c0>] tipc_deleteport+0x40/0x1a0
> [ 313.254346]
> [ 313.254351] stack backtrace:
> [ 313.254470] [<c01045da>] show_trace_log_lvl+0x1a/0x40
> [ 313.254594] [<c0104d72>] show_trace+0x12/0x20
> [ 313.254711] [<c0104e79>] dump_stack+0x19/0x20
> [ 313.254829] [<c013480f>] print_circular_bug_tail+0x6f/0x80
> [ 313.254952] [<c0136415>] __lock_acquire+0xa55/0xde0
> [ 313.255070] [<c0136809>] lock_acquire+0x69/0xa0
> [ 313.255188] [<c04d4100>] _write_lock_bh+0x40/0x60
> [ 313.255315] [<c04cd0a9>] tipc_ref_discard+0x29/0xe0
> [ 313.255435] [<c04cb7da>] tipc_deleteport+0x5a/0x1a0
> [ 313.255565] [<c04cd4f8>] tipc_create+0x58/0x160
> [ 313.255687] [<c0431cb2>] __sock_create+0x112/0x280
> [ 313.255811] [<c0431e5a>] sock_create+0x1a/0x20
> [ 313.255942] [<c04320fb>] sys_socket+0x1b/0x40
> [ 313.256059] [<c0432a72>] sys_socketcall+0x92/0x260
> [ 313.256179] [<c0102fd0>] syscall_call+0x7/0xb
> [ 313.256300] =======================
>
> Greetings, Eric

Hello,

Maybe I misinterpret this but, IMHO lockdep
complains about locks acquired in different
order: tipc_ref_acquire() gets ref_table_lock
and then tipc_ref_table.entries[index]->lock,
but tipc_deleteport() inversely (with:
tipc_port_lock() and tipc_ref_discard()).
I hope maintainers will decide the correct
order.

Btw. there is a problem with tipc_ref_discard():
it should be called with tipc_port_lock, but
how to discard a ref if this lock can't be
acquired? Is it OK to call it without the lock
like in subscr_named_msg_event()?

Btw. #2: during this checking I've found
two places where return values from
tipc_ref_lock() and tipc_port_lock() are not
checked, so I attach a patch proposal for
this (compiled but not tested):

Regards,
Jarek P.
---

[PATCH] tipc: checking returns from locking functions

Checking of return values from tipc_ref_lock()
and tipc_port_lock() added in 2 places.

Signed-off-by: Jarek Poplawski <jarkao2@o2.pl>
---

diff -Nurp linux-2.6.20-rc2-/net/tipc/port.c linux-2.6.20-rc2/net/tipc/port.c
--- linux-2.6.20-rc2-/net/tipc/port.c	2006-11-29 22:57:37.000000000 +0100
+++ linux-2.6.20-rc2/net/tipc/port.c	2006-12-28 11:05:17.000000000 +0100
@@ -238,7 +238,12 @@ u32 tipc_createport_raw(void *usr_handle
 		return 0;
 	}
 
-	tipc_port_lock(ref);
+	if (!tipc_port_lock(ref)) {
+		tipc_ref_discard(ref);
+		warn("Port creation failed, reference table invalid\n");
+		kfree(p_ptr);
+		return 0;
+	}
 	p_ptr->publ.ref = ref;
 	msg = &p_ptr->publ.phdr;
 	msg_init(msg, DATA_LOW, TIPC_NAMED_MSG, TIPC_OK, LONG_H_SIZE, 0);
diff -Nurp linux-2.6.20-rc2-/net/tipc/subscr.c linux-2.6.20-rc2/net/tipc/subscr.c
--- linux-2.6.20-rc2-/net/tipc/subscr.c	2006-12-18 09:01:04.000000000 +0100
+++ linux-2.6.20-rc2/net/tipc/subscr.c	2006-12-28 11:31:27.000000000 +0100
@@ -499,7 +499,12 @@ static void subscr_named_msg_event(void
 
 	/* Add subscriber to topology server's subscriber list */
 
-	tipc_ref_lock(subscriber->ref);
+	if (!tipc_ref_lock(subscriber->ref)) {
+		warn("Subscriber rejected, unable to find port\n");
+		tipc_ref_discard(subscriber->ref);
+		kfree(subscriber);
+		return;
+	}
 	spin_lock_bh(&topsrv.lock);
 	list_add(&subscriber->subscriber_list, &topsrv.subscriber_list);
 	spin_unlock_bh(&topsrv.lock);
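Note: the report above is about lock classes, not individual lock
instances. As a rough illustration only, the userspace sketch below
records which class was taken while another was held and flags the
opposite order when it shows up later. The names order_record(),
order_reset() and the enum values are hypothetical and exist only for
this sketch; the real lockdep walks a full dependency graph and is far
more elaborate than this pairwise check.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes, named after the two locks in the report. */
enum lock_class { REF_TABLE_LOCK, ENTRY_LOCK, NUM_CLASSES };

/* seen[a][b] is true once some path took class b while holding class a. */
static bool seen[NUM_CLASSES][NUM_CLASSES];

/* Forget all recorded orderings. */
static void order_reset(void)
{
	memset(seen, 0, sizeof(seen));
}

/*
 * Record "held, then wanted". Returns true if the opposite order
 * (wanted, then held) was recorded earlier, i.e. an inversion.
 */
static bool order_record(enum lock_class held, enum lock_class wanted)
{
	assert(held < NUM_CLASSES && wanted < NUM_CLASSES);
	seen[held][wanted] = true;
	return held != wanted && seen[wanted][held];
}
```

Calling order_record(REF_TABLE_LOCK, ENTRY_LOCK) models the
tipc_ref_acquire() path, and order_record(ENTRY_LOCK, REF_TABLE_LOCK)
models tipc_deleteport() calling tipc_ref_discard(). The second call
reports the inversion no matter which table entry was involved, because
only the class is tracked.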
* Re: [PATCH] tipc: checking returns and Re: Possible Circular Locking in TIPC
From: Jon Maloy @ 2007-01-03 23:16 UTC
To: Jarek Poplawski
Cc: Eric Sesterhenn, Per Liden, netdev, linux-kernel, 'tipc-discussion@lists.sourceforge.net'

See my comments below.
Regards
///jon

Jarek Poplawski wrote:

> ..........
>
>Maybe I misinterpret this but, IMHO lockdep
>complains about locks acquired in different
>order: tipc_ref_acquire() gets ref_table_lock
>and then tipc_ref_table.entries[index]->lock,
>but tipc_deleteport() inversely (with:
>tipc_port_lock() and tipc_ref_discard()).
>I hope maintainers will decide the correct
>order.
>
>
This order is correct. There can never be parallel access to the
same _instance_ of tipc_ref_table.entries[index]->lock from
the two functions you mention.
Note that tipc_deleteport() takes as argument the reference (=index)
returned from tipc_ref_acquire(), so it can not be (and is not) called
until and unless the latter function has returned a valid reference.
As a parallel, you can't do free() on a memory chunk until
malloc() has given you a pointer to it.

>Btw. there is a problem with tipc_ref_discard():
>it should be called with tipc_port_lock, but
>how to discard a ref if this lock can't be
>acquired? Is it OK to call it without the lock
>like in subscr_named_msg_event()?
>
>
I suspect you are mixing up things here.
We are handling two different reference entries and two
different locks in this function.
One reference entry points to a subscription instance, and its
reference (index) is obtainable from subscriber->ref. So, we
could easily lock the entry if needed. However, in this
particular case it is unnecessary, since there is no chance that
anybody else could have obtained the new reference, and
hence no risk for race conditions.
The other reference entry was intended to point to a new port,
but, since we didn't obtain any reference in the first place,
there is no port to delete and no reference to discard.

>Btw. #2: during this checking I've found
>two places where return values from
>tipc_ref_lock() and tipc_port_lock() are not
>checked, so I attach a patch proposal for
>this (compiled but not tested):
>
>
Thanks.

>Regards,
>Jarek P.
>---
>
>[PATCH] tipc: checking returns from locking functions
>
>Checking of return values from tipc_ref_lock()
>and tipc_port_lock() added in 2 places.
>
>Signed-off-by: Jarek Poplawski <jarkao2@o2.pl>
>---
>
>diff -Nurp linux-2.6.20-rc2-/net/tipc/port.c linux-2.6.20-rc2/net/tipc/port.c
>--- linux-2.6.20-rc2-/net/tipc/port.c	2006-11-29 22:57:37.000000000 +0100
>+++ linux-2.6.20-rc2/net/tipc/port.c	2006-12-28 11:05:17.000000000 +0100
>@@ -238,7 +238,12 @@ u32 tipc_createport_raw(void *usr_handle
> 		return 0;
> 	}
>
>-	tipc_port_lock(ref);
>+	if (!tipc_port_lock(ref)) {
>+		tipc_ref_discard(ref);
>+		warn("Port creation failed, reference table invalid\n");
>+		kfree(p_ptr);
>+		return 0;
>+	}
> 	p_ptr->publ.ref = ref;
> 	msg = &p_ptr->publ.phdr;
> 	msg_init(msg, DATA_LOW, TIPC_NAMED_MSG, TIPC_OK, LONG_H_SIZE, 0);
>diff -Nurp linux-2.6.20-rc2-/net/tipc/subscr.c linux-2.6.20-rc2/net/tipc/subscr.c
>--- linux-2.6.20-rc2-/net/tipc/subscr.c	2006-12-18 09:01:04.000000000 +0100
>+++ linux-2.6.20-rc2/net/tipc/subscr.c	2006-12-28 11:31:27.000000000 +0100
>@@ -499,7 +499,12 @@ static void subscr_named_msg_event(void
>
> 	/* Add subscriber to topology server's subscriber list */
>
>-	tipc_ref_lock(subscriber->ref);
>+	if (!tipc_ref_lock(subscriber->ref)) {
>+		warn("Subscriber rejected, unable to find port\n");
>+		tipc_ref_discard(subscriber->ref);
>+		kfree(subscriber);
>+		return;
>+	}
> 	spin_lock_bh(&topsrv.lock);
> 	list_add(&subscriber->subscriber_list, &topsrv.subscriber_list);
> 	spin_unlock_bh(&topsrv.lock);
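Note: the quoted patch checks the return value of a "look up and lock"
call before touching the object. A minimal userspace sketch of that
pattern follows. Everything in it is an assumption made for
illustration: the table layout, TABLE_SIZE, the locked flag standing in
for a spinlock, and the helpers ref_lock(), ref_unlock() and use_port()
are not the TIPC implementation.

```c
#include <assert.h>
#include <stddef.h>

#define TABLE_SIZE 4u	/* hypothetical size, must be a power of two */

struct entry {
	unsigned int reference;	/* reference valid for this slot, 0 if none */
	void *object;
	int locked;		/* stand-in for a spinlock, single-threaded model */
};

static struct entry table[TABLE_SIZE];

/* Lock the slot and return its object, or NULL if ref is stale or invalid. */
static void *ref_lock(unsigned int ref)
{
	struct entry *e = &table[ref & (TABLE_SIZE - 1)];

	if (ref == 0 || e->reference != ref)
		return NULL;	/* nothing was locked */
	e->locked = 1;
	return e->object;
}

/* Release a slot that a successful ref_lock() call locked. */
static void ref_unlock(unsigned int ref)
{
	table[ref & (TABLE_SIZE - 1)].locked = 0;
}

/* Caller in the style of the patch: bail out when the lock was not taken. */
static int use_port(unsigned int ref)
{
	int *port = ref_lock(ref);

	if (!port)
		return -1;	/* do not touch the object, do not unlock */
	*port += 1;
	ref_unlock(ref);
	return 0;
}
```

In this model a caller that ignored the NULL return would dereference a
null pointer and then unlock a slot it never locked, which is the kind
of mistake the unchecked calls could allow.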
* Re: [PATCH] tipc: checking returns and Re: Possible Circular Locking in TIPC
From: Jarek Poplawski @ 2007-01-04 12:28 UTC
To: Jon Maloy
Cc: Eric Sesterhenn, Per Liden, netdev, linux-kernel, 'tipc-discussion@lists.sourceforge.net'

On Wed, Jan 03, 2007 at 11:16:59PM +0000, Jon Maloy wrote:
> See my comments below.
> Regards
> ///jon
>
> Jarek Poplawski wrote:
>
> >..........
> >
> >Maybe I misinterpret this but, IMHO lockdep
> >complains about locks acquired in different
> >order: tipc_ref_acquire() gets ref_table_lock
> >and then tipc_ref_table.entries[index]->lock,
> >but tipc_deleteport() inversely (with:
> >tipc_port_lock() and tipc_ref_discard()).
> >I hope maintainers will decide the correct
> >order.
> >
> >
> This order is correct. There can never be parallel access to the
> same _instance_ of tipc_ref_table.entries[index]->lock from
> the two functions you mention.
> Note that tipc_deleteport() takes as argument the reference (=index)
> returned from tipc_ref_acquire(), so it can not be (and is not) called
> until and unless the latter function has returned a valid reference.
> As a parallel, you can't do free() on a memory chunk until
> malloc() has given you a pointer to it.

I'm happy the order is correct! But the warning
probably will be back. I know lockdep is sometimes
too careful but nevertheless some change is needed
to fix a real bug or give additional information
to lockdep.

> >Btw. there is a problem with tipc_ref_discard():
> >it should be called with tipc_port_lock, but
> >how to discard a ref if this lock can't be
> >acquired? Is it OK to call it without the lock
> >like in subscr_named_msg_event()?
> >
> >
> I suspect you are mixing up things here.
> We are handling two different reference entries and two
> different locks in this function.
> One reference entry points to a subscription instance, and its
> reference (index) is obtainable from subscriber->ref. So, we
> could easily lock the entry if needed. However, in this
> particular case it is unnecessary, since there is no chance that
> anybody else could have obtained the new reference, and
> hence no risk for race conditions.
> The other reference entry was intended to point to a new port,
> but, since we didn't obtain any reference in the first place,
> there is no port to delete and no reference to discard.

I admit I don't know this program and I hope I
didn't mislead anybody with my message. I only
tried to point at some doubts and maybe this
function could be better commented about when
the lock is needed.

Thanks for explanations & best regards,

Jarek P.
* Re: [PATCH] tipc: checking returns and Re: Possible Circular Locking in TIPC
From: Jon Maloy @ 2007-01-04 16:16 UTC
To: Jarek Poplawski
Cc: Eric Sesterhenn, Per Liden, netdev, linux-kernel, 'tipc-discussion@lists.sourceforge.net'

Regards
///jon

Jarek Poplawski wrote:

>
>I know lockdep is sometimes
>too careful but nevertheless some change is needed
>to fix a real bug or give additional information
>to lockdep.
>
>
I don't know lockdep well enough yet, but I will try to find out if that
is possible.

>
>>>Btw. there is a problem with tipc_ref_discard():
>>>it should be called with tipc_port_lock, but
>>>how to discard a ref if this lock can't be
>>>acquired? Is it OK to call it without the lock
>>>like in subscr_named_msg_event()?
>>>
>>>
>>I suspect you are mixing up things here.
>>We are handling two different reference entries and two
>>different locks in this function.
>>One reference entry points to a subscription instance, and its
>>reference (index) is obtainable from subscriber->ref. So, we
>>could easily lock the entry if needed. However, in this
>>particular case it is unnecessary, since there is no chance that
>>anybody else could have obtained the new reference, and
>>hence no risk for race conditions.
>>The other reference entry was intended to point to a new port,
>>but, since we didn't obtain any reference in the first place,
>>there is no port to delete and no reference to discard.
>>
>>
>
>I admit I don't know this program and I hope I
>didn't mislead anybody with my message. I only
>tried to point at some doubts and maybe this
>function could be better commented about when
>the lock is needed.
>
>
Agreed.

>Thanks for explanations & best regards,
>
>Jarek P.
* Re: [PATCH] tipc: checking returns and Re: Possible Circular Locking in TIPC
From: Jarek Poplawski @ 2007-01-05 7:58 UTC
To: Jon Maloy
Cc: Eric Sesterhenn, Per Liden, netdev, linux-kernel, 'tipc-discussion@lists.sourceforge.net'

On Thu, Jan 04, 2007 at 04:16:20PM +0000, Jon Maloy wrote:
> Regards
> ///jon
>
> Jarek Poplawski wrote:
>
> >
> >I know lockdep is sometimes
> >too careful but nevertheless some change is needed
> >to fix a real bug or give additional information
> >to lockdep.
> >
> >
> I don't know lockdep well enough yet, but I will try to find out if that
> is possible.

If you are sure there is no circular locking possible
between these two functions and this entry->lock here
isn't endangered by other functions, you could try to
make lockdep "silent" like this:


	write_lock_bh(&ref_table_lock);
	if (tipc_ref_table.first_free) {
		index = tipc_ref_table.first_free;
		entry = &(tipc_ref_table.entries[index]);
		index_mask = tipc_ref_table.index_mask;
		/* take lock in case a previous user of entry still holds it */

-		spin_lock_bh(&entry->lock);
+		local_bh_disable();
+		spin_lock_nested(&entry->lock, SINGLE_DEPTH_NESTING);

		next_plus_upper = entry->data.next_plus_upper;
		tipc_ref_table.first_free = next_plus_upper & index_mask;
		reference = (next_plus_upper & ~index_mask) + index;
		entry->data.reference = reference;
		entry->object = object;
		if (lock != 0)
			*lock = &entry->lock;

/* may stay as is or: */
-		spin_unlock_bh(&entry->lock);
+		spin_unlock(&entry->lock);
+		local_bh_enable();

	}
	write_unlock_bh(&ref_table_lock);

Cheers,

Jarek P.
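Note: the suggestion above relies on lockdep treating a lock taken with
a non-zero subclass as a different node from the same lock class taken
with the default subclass. The userspace sketch below models only that
idea. The names node(), nested_record(), nested_reset() and
NUM_SUBCLASSES are hypothetical, and the pairwise check is a
simplification of what lockdep really does. As the message says, such
an annotation only makes sense if the ordering is known to be safe; it
hides the report, it does not prove anything.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes, named after the two locks in the report. */
enum lock_class { REF_TABLE_LOCK, ENTRY_LOCK, NUM_CLASSES };

#define NUM_SUBCLASSES 2	/* 0 = default, 1 = "nested" annotation */
#define NUM_NODES (NUM_CLASSES * NUM_SUBCLASSES)

/* edge[a][b] is true once node b was taken while node a was held. */
static bool edge[NUM_NODES][NUM_NODES];

/* Map a (class, subclass) pair to one node of the ordering table. */
static int node(enum lock_class c, int subclass)
{
	assert(c < NUM_CLASSES && subclass >= 0 && subclass < NUM_SUBCLASSES);
	return (int)c * NUM_SUBCLASSES + subclass;
}

/* Forget all recorded orderings. */
static void nested_reset(void)
{
	memset(edge, 0, sizeof(edge));
}

/* Record held -> wanted; returns true if wanted -> held was seen before. */
static bool nested_record(int held, int wanted)
{
	edge[held][wanted] = true;
	return held != wanted && edge[wanted][held];
}
```

With the entry lock recorded under subclass 1 on the acquire path and
under subclass 0 on the delete path, the two orderings no longer touch
the same pair of nodes, so the model stays quiet. Recording both paths
with subclass 0 reproduces the inversion.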
* Re: [PATCH] tipc: checking returns and Re: Possible Circular Locking in TIPC
From: Jon Maloy @ 2007-01-05 17:22 UTC
To: Jarek Poplawski
Cc: Eric Sesterhenn, Per Liden, netdev, linux-kernel, 'tipc-discussion@lists.sourceforge.net'

Jarek Poplawski wrote:

>
>If you are sure there is no circular locking possible
>between these two functions and this entry->lock here
>isn't endangered by other functions, you could try to
>make lockdep "silent" like this:
>
>
>	write_lock_bh(&ref_table_lock);
>	if (tipc_ref_table.first_free) {
>		index = tipc_ref_table.first_free;
>		entry = &(tipc_ref_table.entries[index]);
>		index_mask = tipc_ref_table.index_mask;
>		/* take lock in case a previous user of entry still holds it */
>
>-		spin_lock_bh(&entry->lock);
>+		local_bh_disable();
>+		spin_lock_nested(&entry->lock, SINGLE_DEPTH_NESTING);
>
>		next_plus_upper = entry->data.next_plus_upper;
>		tipc_ref_table.first_free = next_plus_upper & index_mask;
>		reference = (next_plus_upper & ~index_mask) + index;
>		entry->data.reference = reference;
>		entry->object = object;
>		if (lock != 0)
>			*lock = &entry->lock;
>
>/* may stay as is or: */
>-		spin_unlock_bh(&entry->lock);
>+		spin_unlock(&entry->lock);
>+		local_bh_enable();
>
>	}
>	write_unlock_bh(&ref_table_lock);
>
>
Looks like an acceptable solution. I will try this.
Thanks
///Jon
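Note: apart from the locking change, the quoted snippet unpacks one
word, next_plus_upper, into two values: the index of the next free
entry (low bits) and the upper part of the reference being handed out.
The sketch below repeats just that arithmetic in userspace. The struct
split, the helper names and the mask value used in the example are
hypothetical; ref_index() additionally assumes that index never exceeds
index_mask, which the snippet itself does not state.

```c
#include <assert.h>
#include <stdint.h>

/* Results of unpacking one next_plus_upper word (hypothetical struct). */
struct split {
	uint32_t next_free;	/* becomes tipc_ref_table.first_free */
	uint32_t reference;	/* value returned to the caller */
};

/* Same two expressions as in the quoted snippet. */
static struct split split_word(uint32_t next_plus_upper, uint32_t index,
			       uint32_t index_mask)
{
	struct split s;

	s.next_free = next_plus_upper & index_mask;
	s.reference = (next_plus_upper & ~index_mask) + index;
	return s;
}

/* Recover the table index from a reference (assumes index <= index_mask). */
static uint32_t ref_index(uint32_t reference, uint32_t index_mask)
{
	return reference & index_mask;
}
```

For example, with index_mask 0xff, index 0x05 and next_plus_upper
0x0307, the next free entry is 0x07 and the reference is 0x0305, from
which the index 0x05 can be recovered by masking.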