public inbox for linux-kernel@vger.kernel.org
* [PATCH 0/2] lockdep vs recursive read locks
@ 2008-03-12 12:09 Peter Zijlstra
  2008-03-12 12:09 ` [PATCH 1/2] lockdep: fix recursive read lock validation Peter Zijlstra
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Peter Zijlstra @ 2008-03-12 12:09 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Gautham R Shenoy, Hugh Dickins, Peter Zijlstra

We seem to have some lockdep trouble with recursive read locks. Today Gautham
asked me some specific questions about it, which led to these patches.

IIRC Hugh ran into something similar a while back. This hopefully fixes it.



^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/2] lockdep: fix recursive read lock validation
  2008-03-12 12:09 [PATCH 0/2] lockdep vs recursive read locks Peter Zijlstra
@ 2008-03-12 12:09 ` Peter Zijlstra
  2008-03-12 20:26   ` Gautham R Shenoy
  2008-03-12 12:09 ` [PATCH 2/2] lockdep: fix fib_hash softirq inversion Peter Zijlstra
  2008-03-13 11:25 ` [PATCH 0/2] lockdep vs recursive read locks Hugh Dickins
  2 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2008-03-12 12:09 UTC (permalink / raw)
  To: linux-kernel; +Cc: Ingo Molnar, Gautham R Shenoy, Hugh Dickins, Peter Zijlstra

[-- Attachment #1: lockdep-fix-recursive-locking.patch --]
[-- Type: text/plain, Size: 2733 bytes --]

__lock_acquire( .read = 2 )
  hlock->read = read; /* [1] */
  validate_chain()
    ret = check_deadlock(); /* returns 2 when recursive */

    if (ret == 2)
      hlock->read = 2; /* but it was already 2 from [1] */

    check_prevs_add()
      if (hlock->read != 2)
        /* add to dependency chain */

So lockdep will never add a recursive read lock to the dependency chain. Fix
this by setting hlock->read to 1 when it's the first recursive read instance.

This means that the following sequence is now invalid, whereas previously
it was considered valid:

  rlock(a); rlock(b); runlock(b); runlock(a);
  rlock(b); rlock(a);

It really is invalid: once write locks on the same classes are considered,
the inverted ordering can deadlock.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Gautham R Shenoy <ego@in.ibm.com>
---
 kernel/lockdep.c       |    9 ++++-----
 lib/locking-selftest.c |   12 ++++++------
 2 files changed, 10 insertions(+), 11 deletions(-)

Index: linux-2.6-2/kernel/lockdep.c
===================================================================
--- linux-2.6-2.orig/kernel/lockdep.c
+++ linux-2.6-2/kernel/lockdep.c
@@ -1557,12 +1557,11 @@ static int validate_chain(struct task_st
 		if (!ret)
 			return 0;
 		/*
-		 * Mark recursive read, as we jump over it when
-		 * building dependencies (just like we jump over
-		 * trylock entries):
+		 * If we are the first recursive read, don't jump over our
+		 * dependency.
 		 */
-		if (ret == 2)
-			hlock->read = 2;
+		if (hlock->read == 2 && ret != 2)
+			hlock->read = 1;
 		/*
 		 * Add dependency only if this lock is not the head
 		 * of the chain, and if it's not a secondary read-lock:
Index: linux-2.6-2/lib/locking-selftest.c
===================================================================
--- linux-2.6-2.orig/lib/locking-selftest.c
+++ linux-2.6-2/lib/locking-selftest.c
@@ -1135,12 +1135,12 @@ void locking_selftest(void)
 	debug_locks_silent = !debug_locks_verbose;
 
 	DO_TESTCASE_6R("A-A deadlock", AA);
-	DO_TESTCASE_6R("A-B-B-A deadlock", ABBA);
-	DO_TESTCASE_6R("A-B-B-C-C-A deadlock", ABBCCA);
-	DO_TESTCASE_6R("A-B-C-A-B-C deadlock", ABCABC);
-	DO_TESTCASE_6R("A-B-B-C-C-D-D-A deadlock", ABBCCDDA);
-	DO_TESTCASE_6R("A-B-C-D-B-D-D-A deadlock", ABCDBDDA);
-	DO_TESTCASE_6R("A-B-C-D-B-C-D-A deadlock", ABCDBCDA);
+	DO_TESTCASE_6("A-B-B-A deadlock", ABBA);
+	DO_TESTCASE_6("A-B-B-C-C-A deadlock", ABBCCA);
+	DO_TESTCASE_6("A-B-C-A-B-C deadlock", ABCABC);
+	DO_TESTCASE_6("A-B-B-C-C-D-D-A deadlock", ABBCCDDA);
+	DO_TESTCASE_6("A-B-C-D-B-D-D-A deadlock", ABCDBDDA);
+	DO_TESTCASE_6("A-B-C-D-B-C-D-A deadlock", ABCDBCDA);
 	DO_TESTCASE_6("double unlock", double_unlock);
 	DO_TESTCASE_6("initialize held", init_held);
 	DO_TESTCASE_6_SUCCESS("bad unlock order", bad_unlock_order);

--



* [PATCH 2/2] lockdep: fix fib_hash softirq inversion
  2008-03-12 12:09 [PATCH 0/2] lockdep vs recursive read locks Peter Zijlstra
  2008-03-12 12:09 ` [PATCH 1/2] lockdep: fix recursive read lock validation Peter Zijlstra
@ 2008-03-12 12:09 ` Peter Zijlstra
  2008-03-13  3:50   ` [PATCH] net/lockdep: Make nl_table_lock read acquire softirq safe Gautham R Shenoy
  2008-03-13  6:08   ` [PATCH 2/2] lockdep: fix fib_hash softirq inversion David Miller
  2008-03-13 11:25 ` [PATCH 0/2] lockdep vs recursive read locks Hugh Dickins
  2 siblings, 2 replies; 9+ messages in thread
From: Peter Zijlstra @ 2008-03-12 12:09 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Gautham R Shenoy, Hugh Dickins, Peter Zijlstra,
	Eric Dumazet, David S. Miller

[-- Attachment #1: lockdep-fib_hash-softirq-inversion.patch --]
[-- Type: text/plain, Size: 32391 bytes --]

=========================================================
[ INFO: possible irq lock inversion dependency detected ]
2.6.25-rc4-sched-devel.git #56
---------------------------------------------------------
swapper/0 just changed the state of lock:
 (&rt_hash_locks[i]){-+..}, at: [<ffffffff804721ac>] rt_intern_hash+0x8c/0x3b0
but this lock took another, soft-read-irq-unsafe lock in the past:
 (fib_hash_lock){-.-?}

and interrupts could create inverse lock ordering between them.


other info that might help us debug this:
1 lock held by swapper/0:
 #0:  (rcu_read_lock){..--}, at: [<ffffffff8045a706>] netif_receive_skb+0x76/0x350

the first lock's dependencies:
-> (&rt_hash_locks[i]){-+..} ops: 0 {
   initial-use  at:
                                       [<ffffffff80275900>] __lock_acquire+0x150/0x1080
                                       [<ffffffff80276895>] lock_acquire+0x65/0x90
                                       [<ffffffff804e99a6>] _spin_lock_bh+0x36/0x70
                                       [<ffffffff804721ac>] rt_intern_hash+0x8c/0x3b0
                                       [<ffffffff80472acd>] __ip_route_output_key+0x5fd/0xb10
                                       [<ffffffff80472ffc>] ip_route_output_flow+0x1c/0x70
                                       [<ffffffff8049b49d>] udp_sendmsg+0x5dd/0x760
                                       [<ffffffff804a2235>] inet_sendmsg+0x45/0x80
                                       [<ffffffff8044d677>] sock_sendmsg+0x127/0x140
                                       [<ffffffff8044e43a>] sys_sendto+0xea/0x120
                                       [<ffffffff8022044b>] system_call_after_swapgs+0x7b/0x80
                                       [<ffffffffffffffff>] 0xffffffffffffffff
   in-softirq-W at:
                                       [<ffffffffffffffff>] 0xffffffffffffffff
   hardirq-on-W at:
                                       [<ffffffff80275e72>] __lock_acquire+0x6c2/0x1080
                                       [<ffffffff80276895>] lock_acquire+0x65/0x90
                                       [<ffffffff804e99a6>] _spin_lock_bh+0x36/0x70
                                       [<ffffffff804721ac>] rt_intern_hash+0x8c/0x3b0
                                       [<ffffffff80472acd>] __ip_route_output_key+0x5fd/0xb10
                                       [<ffffffff80472ffc>] ip_route_output_flow+0x1c/0x70
                                       [<ffffffff8049b49d>] udp_sendmsg+0x5dd/0x760
                                       [<ffffffff804a2235>] inet_sendmsg+0x45/0x80
                                       [<ffffffff8044d677>] sock_sendmsg+0x127/0x140
                                       [<ffffffff8044e43a>] sys_sendto+0xea/0x120
                                       [<ffffffff8022044b>] system_call_after_swapgs+0x7b/0x80
                                       [<ffffffffffffffff>] 0xffffffffffffffff
 }
 ... key      at: [<ffffffff80b96b14>] __key.37724+0x0/0x8
 -> (&tbl->lock){-+-+} ops: 0 {
    initial-use  at:
                                         [<ffffffffffffffff>] 0xffffffffffffffff
    in-softirq-W at:
                                         [<ffffffffffffffff>] 0xffffffffffffffff
    hardirq-on-W at:
                                         [<ffffffffffffffff>] 0xffffffffffffffff
    in-softirq-R at:
                                         [<ffffffffffffffff>] 0xffffffffffffffff
    hardirq-on-R at:
                                         [<ffffffff80275bf0>] __lock_acquire+0x440/0x1080
                                         [<ffffffff80276895>] lock_acquire+0x65/0x90
                                         [<ffffffff804e9b59>] _read_lock_bh+0x39/0x70
                                         [<ffffffff80461a46>] neigh_lookup+0x76/0x150
                                         [<ffffffff8049c5a2>] arp_bind_neighbour+0x72/0xb0
                                         [<ffffffff8047223a>] rt_intern_hash+0x11a/0x3b0
                                         [<ffffffff80472acd>] __ip_route_output_key+0x5fd/0xb10
                                         [<ffffffff80472ffc>] ip_route_output_flow+0x1c/0x70
                                         [<ffffffff8049b49d>] udp_sendmsg+0x5dd/0x760
                                         [<ffffffff804a2235>] inet_sendmsg+0x45/0x80
                                         [<ffffffff8044d677>] sock_sendmsg+0x127/0x140
                                         [<ffffffff8044e43a>] sys_sendto+0xea/0x120
                                         [<ffffffff8022044b>] system_call_after_swapgs+0x7b/0x80
                                         [<ffffffffffffffff>] 0xffffffffffffffff
  }
  ... key      at: [<ffffffff80b967f8>] __key.29558+0x0/0x8
  -> (base_lock_keys + cpu){++..} ops: 0 {
     initial-use  at:
                                           [<ffffffff80275900>] __lock_acquire+0x150/0x1080
                                           [<ffffffff80276895>] lock_acquire+0x65/0x90
                                           [<ffffffff804e9d27>] _spin_lock_irqsave+0x47/0x80
                                           [<ffffffff8025a946>] lock_timer_base+0x36/0x70
                                           [<ffffffff8025aaab>] __mod_timer+0x3b/0xf0
                                           [<ffffffff8025adbe>] mod_timer+0x2e/0x50
                                           [<ffffffff806ccc6b>] con_init+0x26b/0x290
                                           [<ffffffff806cc0f2>] console_init+0x32/0x50
                                           [<ffffffff806abbc0>] start_kernel+0x1e0/0x330
                                           [<ffffffff806ab1b2>] _sinittext+0x1b2/0x200
                                           [<ffffffffffffffff>] 0xffffffffffffffff
     in-hardirq-W at:
                                           [<ffffffffffffffff>] 0xffffffffffffffff
     in-softirq-W at:
                                           [<ffffffffffffffff>] 0xffffffffffffffff
   }
   ... key      at: [<ffffffff807a7400>] base_lock_keys+0x0/0x40
  ... acquired at:
   [<ffffffffffffffff>] 0xffffffffffffffff

 ... acquired at:
   [<ffffffff80276734>] __lock_acquire+0xf84/0x1080
   [<ffffffff80276895>] lock_acquire+0x65/0x90
   [<ffffffff804e9b59>] _read_lock_bh+0x39/0x70
   [<ffffffff80461a46>] neigh_lookup+0x76/0x150
   [<ffffffff8049c5a2>] arp_bind_neighbour+0x72/0xb0
   [<ffffffff8047223a>] rt_intern_hash+0x11a/0x3b0
   [<ffffffff80472acd>] __ip_route_output_key+0x5fd/0xb10
   [<ffffffff80472ffc>] ip_route_output_flow+0x1c/0x70
   [<ffffffff8049b49d>] udp_sendmsg+0x5dd/0x760
   [<ffffffff804a2235>] inet_sendmsg+0x45/0x80
   [<ffffffff8044d677>] sock_sendmsg+0x127/0x140
   [<ffffffff8044e43a>] sys_sendto+0xea/0x120
   [<ffffffff8022044b>] system_call_after_swapgs+0x7b/0x80
   [<ffffffffffffffff>] 0xffffffffffffffff

 -> (fib_hash_lock){-.-?} ops: 0 {
    initial-use  at:
                                         [<ffffffff80275900>] __lock_acquire+0x150/0x1080
                                         [<ffffffff80276895>] lock_acquire+0x65/0x90
                                         [<ffffffff804e9a76>] _write_lock_bh+0x36/0x70
                                         [<ffffffff804ad443>] fn_hash_insert+0x5a3/0x720
                                         [<ffffffff804a88ef>] fib_magic+0x10f/0x120
                                         [<ffffffff804a8973>] fib_add_ifaddr+0x73/0x170
                                         [<ffffffff804a8b7b>] fib_inetaddr_event+0x4b/0x280
                                         [<ffffffff804ed193>] notifier_call_chain+0x53/0x80
                                         [<ffffffff8026b998>] __blocking_notifier_call_chain+0x58/0x80
                                         [<ffffffff8026b9d1>] blocking_notifier_call_chain+0x11/0x20
                                         [<ffffffff8049fa44>] __inet_insert_ifa+0xd4/0x170
                                         [<ffffffff8049faed>] inet_insert_ifa+0xd/0x10
                                         [<ffffffff804a0eff>] inetdev_event+0x3df/0x480
                                         [<ffffffff804ed193>] notifier_call_chain+0x53/0x80
                                         [<ffffffff8026b739>] __raw_notifier_call_chain+0x9/0x10
                                         [<ffffffff8026b751>] raw_notifier_call_chain+0x11/0x20
                                         [<ffffffff8045b3a6>] call_netdevice_notifiers+0x16/0x20
                                         [<ffffffff8045ce52>] dev_open+0x82/0x90
                                         [<ffffffff8045b529>] dev_change_flags+0x99/0x1b0
                                         [<ffffffff804a171e>] devinet_ioctl+0x5ce/0x780
                                         [<ffffffff804a205c>] inet_ioctl+0x5c/0x80
                                         [<ffffffff8044c8b5>] sock_ioctl+0x65/0x250
                                         [<ffffffff802d5241>] vfs_ioctl+0x31/0x90
                                         [<ffffffff802d5313>] do_vfs_ioctl+0x73/0x2d0
                                         [<ffffffff802d55f2>] sys_ioctl+0x82/0xa0
                                         [<ffffffff8022044b>] system_call_after_swapgs+0x7b/0x80
                                         [<ffffffffffffffff>] 0xffffffffffffffff
    hardirq-on-W at:
                                         [<ffffffff80275e72>] __lock_acquire+0x6c2/0x1080
                                         [<ffffffff80276895>] lock_acquire+0x65/0x90
                                         [<ffffffff804e9a76>] _write_lock_bh+0x36/0x70
                                         [<ffffffff804ad443>] fn_hash_insert+0x5a3/0x720
                                         [<ffffffff804a88ef>] fib_magic+0x10f/0x120
                                         [<ffffffff804a8973>] fib_add_ifaddr+0x73/0x170
                                         [<ffffffff804a8b7b>] fib_inetaddr_event+0x4b/0x280
                                         [<ffffffff804ed193>] notifier_call_chain+0x53/0x80
                                         [<ffffffff8026b998>] __blocking_notifier_call_chain+0x58/0x80
                                         [<ffffffff8026b9d1>] blocking_notifier_call_chain+0x11/0x20
                                         [<ffffffff8049fa44>] __inet_insert_ifa+0xd4/0x170
                                         [<ffffffff8049faed>] inet_insert_ifa+0xd/0x10
                                         [<ffffffff804a0eff>] inetdev_event+0x3df/0x480
                                         [<ffffffff804ed193>] notifier_call_chain+0x53/0x80
                                         [<ffffffff8026b739>] __raw_notifier_call_chain+0x9/0x10
                                         [<ffffffff8026b751>] raw_notifier_call_chain+0x11/0x20
                                         [<ffffffff8045b3a6>] call_netdevice_notifiers+0x16/0x20
                                         [<ffffffff8045ce52>] dev_open+0x82/0x90
                                         [<ffffffff8045b529>] dev_change_flags+0x99/0x1b0
                                         [<ffffffff804a171e>] devinet_ioctl+0x5ce/0x780
                                         [<ffffffff804a205c>] inet_ioctl+0x5c/0x80
                                         [<ffffffff8044c8b5>] sock_ioctl+0x65/0x250
                                         [<ffffffff802d5241>] vfs_ioctl+0x31/0x90
                                         [<ffffffff802d5313>] do_vfs_ioctl+0x73/0x2d0
                                         [<ffffffff802d55f2>] sys_ioctl+0x82/0xa0
                                         [<ffffffff8022044b>] system_call_after_swapgs+0x7b/0x80
                                         [<ffffffffffffffff>] 0xffffffffffffffff
    in-softirq-R at:
                                         [<ffffffffffffffff>] 0xffffffffffffffff
    softirq-on-R at:
                                         [<ffffffff80275c1b>] __lock_acquire+0x46b/0x1080
                                         [<ffffffff80276895>] lock_acquire+0x65/0x90
                                         [<ffffffff804e9bc4>] _read_lock+0x34/0x70
                                         [<ffffffff804acdd7>] fn_hash_lookup+0x27/0xf0
                                         [<ffffffff804a855d>] inet_addr_type+0x6d/0xf0
                                         [<ffffffff804aada2>] fib_create_info+0x5a2/0xbf0
                                         [<ffffffff804acf0b>] fn_hash_insert+0x6b/0x720
                                         [<ffffffff804a88ef>] fib_magic+0x10f/0x120
                                         [<ffffffff804a89c8>] fib_add_ifaddr+0xc8/0x170
                                         [<ffffffff804a8b7b>] fib_inetaddr_event+0x4b/0x280
                                         [<ffffffff804ed193>] notifier_call_chain+0x53/0x80
                                         [<ffffffff8026b998>] __blocking_notifier_call_chain+0x58/0x80
                                         [<ffffffff8026b9d1>] blocking_notifier_call_chain+0x11/0x20
                                         [<ffffffff8049fa44>] __inet_insert_ifa+0xd4/0x170
                                         [<ffffffff8049faed>] inet_insert_ifa+0xd/0x10
                                         [<ffffffff804a0eff>] inetdev_event+0x3df/0x480
                                         [<ffffffff804ed193>] notifier_call_chain+0x53/0x80
                                         [<ffffffff8026b739>] __raw_notifier_call_chain+0x9/0x10
                                         [<ffffffff8026b751>] raw_notifier_call_chain+0x11/0x20
                                         [<ffffffff8045b3a6>] call_netdevice_notifiers+0x16/0x20
                                         [<ffffffff8045ce52>] dev_open+0x82/0x90
                                         [<ffffffff8045b529>] dev_change_flags+0x99/0x1b0
                                         [<ffffffff804a171e>] devinet_ioctl+0x5ce/0x780
                                         [<ffffffff804a205c>] inet_ioctl+0x5c/0x80
                                         [<ffffffff8044c8b5>] sock_ioctl+0x65/0x250
                                         [<ffffffff802d5241>] vfs_ioctl+0x31/0x90
                                         [<ffffffff802d5313>] do_vfs_ioctl+0x73/0x2d0
                                         [<ffffffff802d55f2>] sys_ioctl+0x82/0xa0
                                         [<ffffffff8022044b>] system_call_after_swapgs+0x7b/0x80
                                         [<ffffffffffffffff>] 0xffffffffffffffff
    hardirq-on-R at:
                                         [<ffffffff80275bf0>] __lock_acquire+0x440/0x1080
                                         [<ffffffff80276895>] lock_acquire+0x65/0x90
                                         [<ffffffff804e9bc4>] _read_lock+0x34/0x70
                                         [<ffffffff804acdd7>] fn_hash_lookup+0x27/0xf0
                                         [<ffffffff804a855d>] inet_addr_type+0x6d/0xf0
                                         [<ffffffff804aada2>] fib_create_info+0x5a2/0xbf0
                                         [<ffffffff804acf0b>] fn_hash_insert+0x6b/0x720
                                         [<ffffffff804a88ef>] fib_magic+0x10f/0x120
                                         [<ffffffff804a89c8>] fib_add_ifaddr+0xc8/0x170
                                         [<ffffffff804a8b7b>] fib_inetaddr_event+0x4b/0x280
                                         [<ffffffff804ed193>] notifier_call_chain+0x53/0x80
                                         [<ffffffff8026b998>] __blocking_notifier_call_chain+0x58/0x80
                                         [<ffffffff8026b9d1>] blocking_notifier_call_chain+0x11/0x20
                                         [<ffffffff8049fa44>] __inet_insert_ifa+0xd4/0x170
                                         [<ffffffff8049faed>] inet_insert_ifa+0xd/0x10
                                         [<ffffffff804a0eff>] inetdev_event+0x3df/0x480
                                         [<ffffffff804ed193>] notifier_call_chain+0x53/0x80
                                         [<ffffffff8026b739>] __raw_notifier_call_chain+0x9/0x10
                                         [<ffffffff8026b751>] raw_notifier_call_chain+0x11/0x20
                                         [<ffffffff8045b3a6>] call_netdevice_notifiers+0x16/0x20
                                         [<ffffffff8045ce52>] dev_open+0x82/0x90
                                         [<ffffffff8045b529>] dev_change_flags+0x99/0x1b0
                                         [<ffffffff804a171e>] devinet_ioctl+0x5ce/0x780
                                         [<ffffffff804a205c>] inet_ioctl+0x5c/0x80
                                         [<ffffffff8044c8b5>] sock_ioctl+0x65/0x250
                                         [<ffffffff802d5241>] vfs_ioctl+0x31/0x90
                                         [<ffffffff802d5313>] do_vfs_ioctl+0x73/0x2d0
                                         [<ffffffff802d55f2>] sys_ioctl+0x82/0xa0
                                         [<ffffffff8022044b>] system_call_after_swapgs+0x7b/0x80
                                         [<ffffffffffffffff>] 0xffffffffffffffff
  }
  ... key      at: [<ffffffff806799b8>] fib_hash_lock+0x18/0x40
 ... acquired at:
   [<ffffffff80276734>] __lock_acquire+0xf84/0x1080
   [<ffffffff80276895>] lock_acquire+0x65/0x90
   [<ffffffff804e9bc4>] _read_lock+0x34/0x70
   [<ffffffff804acdd7>] fn_hash_lookup+0x27/0xf0
   [<ffffffff804a855d>] inet_addr_type+0x6d/0xf0
   [<ffffffff8049dcc6>] arp_constructor+0x86/0x280
   [<ffffffff8046272f>] neigh_create+0x19f/0x5b0
   [<ffffffff8049c5cf>] arp_bind_neighbour+0x9f/0xb0
   [<ffffffff8047223a>] rt_intern_hash+0x11a/0x3b0
   [<ffffffff80472acd>] __ip_route_output_key+0x5fd/0xb10
   [<ffffffff804947ec>] tcp_v4_connect+0x10c/0x550
   [<ffffffff804a343a>] inet_stream_connect+0x23a/0x2d0
   [<ffffffff8044e530>] sys_connect+0xa0/0xc0
   [<ffffffff8022044b>] system_call_after_swapgs+0x7b/0x80
   [<ffffffffffffffff>] 0xffffffffffffffff


the second lock's dependencies:
-> (fib_hash_lock){-.-?} ops: 0 {
   initial-use  at:
                                       [<ffffffff80275900>] __lock_acquire+0x150/0x1080
                                       [<ffffffff80276895>] lock_acquire+0x65/0x90
                                       [<ffffffff804e9a76>] _write_lock_bh+0x36/0x70
                                       [<ffffffff804ad443>] fn_hash_insert+0x5a3/0x720
                                       [<ffffffff804a88ef>] fib_magic+0x10f/0x120
                                       [<ffffffff804a8973>] fib_add_ifaddr+0x73/0x170
                                       [<ffffffff804a8b7b>] fib_inetaddr_event+0x4b/0x280
                                       [<ffffffff804ed193>] notifier_call_chain+0x53/0x80
                                       [<ffffffff8026b998>] __blocking_notifier_call_chain+0x58/0x80
                                       [<ffffffff8026b9d1>] blocking_notifier_call_chain+0x11/0x20
                                       [<ffffffff8049fa44>] __inet_insert_ifa+0xd4/0x170
                                       [<ffffffff8049faed>] inet_insert_ifa+0xd/0x10
                                       [<ffffffff804a0eff>] inetdev_event+0x3df/0x480
                                       [<ffffffff804ed193>] notifier_call_chain+0x53/0x80
                                       [<ffffffff8026b739>] __raw_notifier_call_chain+0x9/0x10
                                       [<ffffffff8026b751>] raw_notifier_call_chain+0x11/0x20
                                       [<ffffffff8045b3a6>] call_netdevice_notifiers+0x16/0x20
                                       [<ffffffff8045ce52>] dev_open+0x82/0x90
                                       [<ffffffff8045b529>] dev_change_flags+0x99/0x1b0
                                       [<ffffffff804a171e>] devinet_ioctl+0x5ce/0x780
                                       [<ffffffff804a205c>] inet_ioctl+0x5c/0x80
                                       [<ffffffff8044c8b5>] sock_ioctl+0x65/0x250
                                       [<ffffffff802d5241>] vfs_ioctl+0x31/0x90
                                       [<ffffffff802d5313>] do_vfs_ioctl+0x73/0x2d0
                                       [<ffffffff802d55f2>] sys_ioctl+0x82/0xa0
                                       [<ffffffff8022044b>] system_call_after_swapgs+0x7b/0x80
                                       [<ffffffffffffffff>] 0xffffffffffffffff
   hardirq-on-W at:
                                       [<ffffffff80275e72>] __lock_acquire+0x6c2/0x1080
                                       [<ffffffff80276895>] lock_acquire+0x65/0x90
                                       [<ffffffff804e9a76>] _write_lock_bh+0x36/0x70
                                       [<ffffffff804ad443>] fn_hash_insert+0x5a3/0x720
                                       [<ffffffff804a88ef>] fib_magic+0x10f/0x120
                                       [<ffffffff804a8973>] fib_add_ifaddr+0x73/0x170
                                       [<ffffffff804a8b7b>] fib_inetaddr_event+0x4b/0x280
                                       [<ffffffff804ed193>] notifier_call_chain+0x53/0x80
                                       [<ffffffff8026b998>] __blocking_notifier_call_chain+0x58/0x80
                                       [<ffffffff8026b9d1>] blocking_notifier_call_chain+0x11/0x20
                                       [<ffffffff8049fa44>] __inet_insert_ifa+0xd4/0x170
                                       [<ffffffff8049faed>] inet_insert_ifa+0xd/0x10
                                       [<ffffffff804a0eff>] inetdev_event+0x3df/0x480
                                       [<ffffffff804ed193>] notifier_call_chain+0x53/0x80
                                       [<ffffffff8026b739>] __raw_notifier_call_chain+0x9/0x10
                                       [<ffffffff8026b751>] raw_notifier_call_chain+0x11/0x20
                                       [<ffffffff8045b3a6>] call_netdevice_notifiers+0x16/0x20
                                       [<ffffffff8045ce52>] dev_open+0x82/0x90
                                       [<ffffffff8045b529>] dev_change_flags+0x99/0x1b0
                                       [<ffffffff804a171e>] devinet_ioctl+0x5ce/0x780
                                       [<ffffffff804a205c>] inet_ioctl+0x5c/0x80
                                       [<ffffffff8044c8b5>] sock_ioctl+0x65/0x250
                                       [<ffffffff802d5241>] vfs_ioctl+0x31/0x90
                                       [<ffffffff802d5313>] do_vfs_ioctl+0x73/0x2d0
                                       [<ffffffff802d55f2>] sys_ioctl+0x82/0xa0
                                       [<ffffffff8022044b>] system_call_after_swapgs+0x7b/0x80
                                       [<ffffffffffffffff>] 0xffffffffffffffff
   in-softirq-R at:
                                       [<ffffffffffffffff>] 0xffffffffffffffff
   softirq-on-R at:
                                       [<ffffffff80275c1b>] __lock_acquire+0x46b/0x1080
                                       [<ffffffff80276895>] lock_acquire+0x65/0x90
                                       [<ffffffff804e9bc4>] _read_lock+0x34/0x70
                                       [<ffffffff804acdd7>] fn_hash_lookup+0x27/0xf0
                                       [<ffffffff804a855d>] inet_addr_type+0x6d/0xf0
                                       [<ffffffff804aada2>] fib_create_info+0x5a2/0xbf0
                                       [<ffffffff804acf0b>] fn_hash_insert+0x6b/0x720
                                       [<ffffffff804a88ef>] fib_magic+0x10f/0x120
                                       [<ffffffff804a89c8>] fib_add_ifaddr+0xc8/0x170
                                       [<ffffffff804a8b7b>] fib_inetaddr_event+0x4b/0x280
                                       [<ffffffff804ed193>] notifier_call_chain+0x53/0x80
                                       [<ffffffff8026b998>] __blocking_notifier_call_chain+0x58/0x80
                                       [<ffffffff8026b9d1>] blocking_notifier_call_chain+0x11/0x20
                                       [<ffffffff8049fa44>] __inet_insert_ifa+0xd4/0x170
                                       [<ffffffff8049faed>] inet_insert_ifa+0xd/0x10
                                       [<ffffffff804a0eff>] inetdev_event+0x3df/0x480
                                       [<ffffffff804ed193>] notifier_call_chain+0x53/0x80
                                       [<ffffffff8026b739>] __raw_notifier_call_chain+0x9/0x10
                                       [<ffffffff8026b751>] raw_notifier_call_chain+0x11/0x20
                                       [<ffffffff8045b3a6>] call_netdevice_notifiers+0x16/0x20
                                       [<ffffffff8045ce52>] dev_open+0x82/0x90
                                       [<ffffffff8045b529>] dev_change_flags+0x99/0x1b0
                                       [<ffffffff804a171e>] devinet_ioctl+0x5ce/0x780
                                       [<ffffffff804a205c>] inet_ioctl+0x5c/0x80
                                       [<ffffffff8044c8b5>] sock_ioctl+0x65/0x250
                                       [<ffffffff802d5241>] vfs_ioctl+0x31/0x90
                                       [<ffffffff802d5313>] do_vfs_ioctl+0x73/0x2d0
                                       [<ffffffff802d55f2>] sys_ioctl+0x82/0xa0
                                       [<ffffffff8022044b>] system_call_after_swapgs+0x7b/0x80
                                       [<ffffffffffffffff>] 0xffffffffffffffff
   hardirq-on-R at:
                                       [<ffffffff80275bf0>] __lock_acquire+0x440/0x1080
                                       [<ffffffff80276895>] lock_acquire+0x65/0x90
                                       [<ffffffff804e9bc4>] _read_lock+0x34/0x70
                                       [<ffffffff804acdd7>] fn_hash_lookup+0x27/0xf0
                                       [<ffffffff804a855d>] inet_addr_type+0x6d/0xf0
                                       [<ffffffff804aada2>] fib_create_info+0x5a2/0xbf0
                                       [<ffffffff804acf0b>] fn_hash_insert+0x6b/0x720
                                       [<ffffffff804a88ef>] fib_magic+0x10f/0x120
                                       [<ffffffff804a89c8>] fib_add_ifaddr+0xc8/0x170
                                       [<ffffffff804a8b7b>] fib_inetaddr_event+0x4b/0x280
                                       [<ffffffff804ed193>] notifier_call_chain+0x53/0x80
                                       [<ffffffff8026b998>] __blocking_notifier_call_chain+0x58/0x80
                                       [<ffffffff8026b9d1>] blocking_notifier_call_chain+0x11/0x20
                                       [<ffffffff8049fa44>] __inet_insert_ifa+0xd4/0x170
                                       [<ffffffff8049faed>] inet_insert_ifa+0xd/0x10
                                       [<ffffffff804a0eff>] inetdev_event+0x3df/0x480
                                       [<ffffffff804ed193>] notifier_call_chain+0x53/0x80
                                       [<ffffffff8026b739>] __raw_notifier_call_chain+0x9/0x10
                                       [<ffffffff8026b751>] raw_notifier_call_chain+0x11/0x20
                                       [<ffffffff8045b3a6>] call_netdevice_notifiers+0x16/0x20
                                       [<ffffffff8045ce52>] dev_open+0x82/0x90
                                       [<ffffffff8045b529>] dev_change_flags+0x99/0x1b0
                                       [<ffffffff804a171e>] devinet_ioctl+0x5ce/0x780
                                       [<ffffffff804a205c>] inet_ioctl+0x5c/0x80
                                       [<ffffffff8044c8b5>] sock_ioctl+0x65/0x250
                                       [<ffffffff802d5241>] vfs_ioctl+0x31/0x90
                                       [<ffffffff802d5313>] do_vfs_ioctl+0x73/0x2d0
                                       [<ffffffff802d55f2>] sys_ioctl+0x82/0xa0
                                       [<ffffffff8022044b>] system_call_after_swapgs+0x7b/0x80
                                       [<ffffffffffffffff>] 0xffffffffffffffff
 }
 ... key      at: [<ffffffff806799b8>] fib_hash_lock+0x18/0x40

stack backtrace:
Pid: 0, comm: swapper Not tainted 2.6.25-rc4-sched-devel.git #56

Call Trace:
 <IRQ>  [<ffffffff80273a4c>] print_irq_inversion_bug+0x13c/0x160
 [<ffffffff80274a9d>] check_usage_forwards+0x4d/0x60
 [<ffffffff80274c46>] mark_lock+0x196/0x5e0
 [<ffffffff80275d15>] __lock_acquire+0x565/0x1080
 [<ffffffff80276895>] lock_acquire+0x65/0x90
 [<ffffffff804721ac>] ? rt_intern_hash+0x8c/0x3b0
 [<ffffffff804e99a6>] _spin_lock_bh+0x36/0x70
 [<ffffffff804721ac>] rt_intern_hash+0x8c/0x3b0
 [<ffffffff80473a4e>] ip_route_input+0x8fe/0x1250
 [<ffffffff8027530d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffff8027530d>] ? trace_hardirqs_on+0xd/0x10
 [<ffffffff804769a8>] ip_rcv+0x518/0x650
 [<ffffffff8045a8f9>] netif_receive_skb+0x269/0x350
 [<ffffffff80405529>] e1000_clean_rx_irq+0x1a9/0x5a0
 [<ffffffff8040072b>] e1000_clean+0x1fb/0x5b0
 [<ffffffff8027268d>] ? trace_hardirqs_off+0xd/0x10
 [<ffffffff8045d18f>] net_rx_action+0xbf/0x180
 [<ffffffff80255c55>] __do_softirq+0x75/0x100
 [<ffffffff8022173c>] call_softirq+0x1c/0x30
 [<ffffffff80223505>] do_softirq+0x65/0xa0
 [<ffffffff80255b67>] irq_exit+0x97/0xa0
 [<ffffffff80223618>] do_IRQ+0xa8/0x130
 [<ffffffff80220a06>] ret_from_intr+0x0/0xf
 <EOI>  [<ffffffff8021ef83>] ? default_idle+0x43/0x80
 [<ffffffff8021ef81>] ? default_idle+0x41/0x80
 [<ffffffff8021ef40>] ? default_idle+0x0/0x80
 [<ffffffff8021f02d>] ? cpu_idle+0x6d/0x120
 [<ffffffff804e3edb>] ? start_secondary+0x31b/0x420

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Eric Dumazet <dada1@cosmosbay.com>
CC: David S. Miller <davem@davemloft.net>
---
 net/ipv4/fib_hash.c |   18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

Index: linux-2.6-2/net/ipv4/fib_hash.c
===================================================================
--- linux-2.6-2.orig/net/ipv4/fib_hash.c
+++ linux-2.6-2/net/ipv4/fib_hash.c
@@ -251,7 +251,7 @@ fn_hash_lookup(struct fib_table *tb, con
 	struct fn_zone *fz;
 	struct fn_hash *t = (struct fn_hash*)tb->tb_data;
 
-	read_lock(&fib_hash_lock);
+	read_lock_bh(&fib_hash_lock);
 	for (fz = t->fn_zone_list; fz; fz = fz->fz_next) {
 		struct hlist_head *head;
 		struct hlist_node *node;
@@ -273,7 +273,7 @@ fn_hash_lookup(struct fib_table *tb, con
 	}
 	err = 1;
 out:
-	read_unlock(&fib_hash_lock);
+	read_unlock_bh(&fib_hash_lock);
 	return err;
 }
 
@@ -295,7 +295,7 @@ fn_hash_select_default(struct fib_table 
 	last_resort = NULL;
 	order = -1;
 
-	read_lock(&fib_hash_lock);
+	read_lock_bh(&fib_hash_lock);
 	hlist_for_each_entry(f, node, &fz->fz_hash[0], fn_hash) {
 		struct fib_alias *fa;
 
@@ -343,7 +343,7 @@ fn_hash_select_default(struct fib_table 
 		fib_result_assign(res, last_resort);
 	tb->tb_default = last_idx;
 out:
-	read_unlock(&fib_hash_lock);
+	read_unlock_bh(&fib_hash_lock);
 }
 
 /* Insert node F to FZ. */
@@ -753,18 +753,18 @@ static int fn_hash_dump(struct fib_table
 	struct fn_hash *table = (struct fn_hash*)tb->tb_data;
 
 	s_m = cb->args[2];
-	read_lock(&fib_hash_lock);
+	read_lock_bh(&fib_hash_lock);
 	for (fz = table->fn_zone_list, m=0; fz; fz = fz->fz_next, m++) {
 		if (m < s_m) continue;
 		if (fn_hash_dump_zone(skb, cb, tb, fz) < 0) {
 			cb->args[2] = m;
-			read_unlock(&fib_hash_lock);
+			read_unlock_bh(&fib_hash_lock);
 			return -1;
 		}
 		memset(&cb->args[3], 0,
 		       sizeof(cb->args) - 3*sizeof(cb->args[0]));
 	}
-	read_unlock(&fib_hash_lock);
+	read_unlock_bh(&fib_hash_lock);
 	cb->args[2] = m;
 	return skb->len;
 }
@@ -962,7 +962,7 @@ static void *fib_seq_start(struct seq_fi
 	struct fib_iter_state *iter = seq->private;
 	void *v = NULL;
 
-	read_lock(&fib_hash_lock);
+	read_lock_bh(&fib_hash_lock);
 	if (fib_get_table(iter->p.net, RT_TABLE_MAIN))
 		v = *pos ? fib_get_idx(seq, *pos - 1) : SEQ_START_TOKEN;
 	return v;
@@ -977,7 +977,7 @@ static void *fib_seq_next(struct seq_fil
 static void fib_seq_stop(struct seq_file *seq, void *v)
 	__releases(fib_hash_lock)
 {
-	read_unlock(&fib_hash_lock);
+	read_unlock_bh(&fib_hash_lock);
 }
 
 static unsigned fib_flag_trans(int type, __be32 mask, struct fib_info *fi)

--


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] lockdep: fix recursive read lock validation
  2008-03-12 12:09 ` [PATCH 1/2] lockdep: fix recursive read lock validation Peter Zijlstra
@ 2008-03-12 20:26   ` Gautham R Shenoy
  0 siblings, 0 replies; 9+ messages in thread
From: Gautham R Shenoy @ 2008-03-12 20:26 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: linux-kernel, Ingo Molnar, Hugh Dickins

On Wed, Mar 12, 2008 at 01:09:21PM +0100, Peter Zijlstra wrote:
> __lock_acquire( .read = 2 )
>   hlock->read = read; /* [1] */
>   validate_chain()
>     ret = check_deadlock(); /* returns 2 when recursive */
> 
>     if (ret == 2)
>       hlock->read = 2; /* but it was already 2 from [1] */
> 
>     check_prevs_add()
>       if (hlock->read != 2)
>         /* add to dependency chain */
> 
> So it will never add a recursive read lock to the dependency chain. Fix this
> by setting hlock->read to 1 when it's the first recursive lock instance.
> 
> This means that the following sequence is now invalid, whereas previously
> it was considered valid:
> 
>   rlock(a); rlock(b); runlock(b); runlock(a)
>   rlock(b); rlock(a);
> 
> It really is invalid when considered against write locks.
> 
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> CC: Gautham R Shenoy <ego@in.ibm.com>

Tested-by: Gautham R Shenoy <ego@in.ibm.com>

> ---
>  kernel/lockdep.c       |    9 ++++-----
>  lib/locking-selftest.c |   12 ++++++------
>  2 files changed, 10 insertions(+), 11 deletions(-)
> 
> Index: linux-2.6-2/kernel/lockdep.c
> ===================================================================
> --- linux-2.6-2.orig/kernel/lockdep.c
> +++ linux-2.6-2/kernel/lockdep.c
> @@ -1557,12 +1557,11 @@ static int validate_chain(struct task_st
>  		if (!ret)
>  			return 0;
>  		/*
> -		 * Mark recursive read, as we jump over it when
> -		 * building dependencies (just like we jump over
> -		 * trylock entries):
> +		 * If we are the first recursive read, don't jump over our
> +		 * dependency.
>  		 */
> -		if (ret == 2)
> -			hlock->read = 2;
> +		if (hlock->read == 2 && ret != 2)
> +			hlock->read = 1;
>  		/*
>  		 * Add dependency only if this lock is not the head
>  		 * of the chain, and if it's not a secondary read-lock:
> Index: linux-2.6-2/lib/locking-selftest.c
> ===================================================================
> --- linux-2.6-2.orig/lib/locking-selftest.c
> +++ linux-2.6-2/lib/locking-selftest.c
> @@ -1135,12 +1135,12 @@ void locking_selftest(void)
>  	debug_locks_silent = !debug_locks_verbose;
> 
>  	DO_TESTCASE_6R("A-A deadlock", AA);
> -	DO_TESTCASE_6R("A-B-B-A deadlock", ABBA);
> -	DO_TESTCASE_6R("A-B-B-C-C-A deadlock", ABBCCA);
> -	DO_TESTCASE_6R("A-B-C-A-B-C deadlock", ABCABC);
> -	DO_TESTCASE_6R("A-B-B-C-C-D-D-A deadlock", ABBCCDDA);
> -	DO_TESTCASE_6R("A-B-C-D-B-D-D-A deadlock", ABCDBDDA);
> -	DO_TESTCASE_6R("A-B-C-D-B-C-D-A deadlock", ABCDBCDA);
> +	DO_TESTCASE_6("A-B-B-A deadlock", ABBA);
> +	DO_TESTCASE_6("A-B-B-C-C-A deadlock", ABBCCA);
> +	DO_TESTCASE_6("A-B-C-A-B-C deadlock", ABCABC);
> +	DO_TESTCASE_6("A-B-B-C-C-D-D-A deadlock", ABBCCDDA);
> +	DO_TESTCASE_6("A-B-C-D-B-D-D-A deadlock", ABCDBDDA);
> +	DO_TESTCASE_6("A-B-C-D-B-C-D-A deadlock", ABCDBCDA);
>  	DO_TESTCASE_6("double unlock", double_unlock);
>  	DO_TESTCASE_6("initialize held", init_held);
>  	DO_TESTCASE_6_SUCCESS("bad unlock order", bad_unlock_order);
> 
> --

-- 
Thanks and Regards
gautham


* [PATCH] net/lockdep: Make nl_table_lock read acquire softirq safe.
  2008-03-12 12:09 ` [PATCH 2/2] lockdep: fix fib_hash softirq inversion Peter Zijlstra
@ 2008-03-13  3:50   ` Gautham R Shenoy
  2008-03-13  6:08   ` [PATCH 2/2] lockdep: fix fib_hash softirq inversion David Miller
  1 sibling, 0 replies; 9+ messages in thread
From: Gautham R Shenoy @ 2008-03-13  3:50 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, Ingo Molnar, Hugh Dickins, Eric Dumazet,
	David S. Miller

On Wed, Mar 12, 2008 at 01:09:22PM +0100, Peter Zijlstra wrote:
> =========================================================
> [ INFO: possible irq lock inversion dependency detected ]
> 2.6.25-rc4-sched-devel.git #56
> ---------------------------------------------------------
> swapper/0 just changed the state of lock:
>  (&rt_hash_locks[i]){-+..}, at: [<ffffffff804721ac>] rt_intern_hash+0x8c/0x3b0
> but this lock took another, soft-read-irq-unsafe lock in the past:
>  (fib_hash_lock){-.-?}
> 
> and interrupts could create inverse lock ordering between them.
> 
> 

I found one more of these during bootup.

It appears as though rwlocks totally missed lockdep validation because
lockdep was jumping over their dependency chains!

-->

   From: Gautham R Shenoy <ego@in.ibm.com>
   
   net: Make nl_table_lock read acquire softirq safe.

    =========================================================
    [ INFO: possible irq lock inversion dependency detected ]
    2.6.25-rc3 #61
    ---------------------------------------------------------
    swapper/0 just changed the state of lock:
     (&tb->tb6_lock){-+..}, at: [<c049a5ab>] __ip6_ins_rt+0x1e/0x3c
    but this lock took another, soft-read-irq-unsafe lock in the past:
     (nl_table_lock){..-?}
    
    and interrupts could create inverse lock ordering between them.
    
    Full lockdep report at
    http://gautham.shenoy.googlepages.com/nl_table_lock_irq_inversion.log
    
    This patch fixes the problem by making the read acquire of nl_table_lock
    softirq safe.
    
    Signed-off-by: Gautham R Shenoy <ego@in.ibm.com>
    Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Cc: Eric Dumazet <dada1@cosmosbay.com>
    Cc: David S. Miller <davem@davemloft.net>

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 1ab0da2..7ade317 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -205,9 +205,9 @@ netlink_lock_table(void)
 {
 	/* read_lock() synchronizes us to netlink_table_grab */
 
-	read_lock(&nl_table_lock);
+	read_lock_bh(&nl_table_lock);
 	atomic_inc(&nl_table_users);
-	read_unlock(&nl_table_lock);
+	read_unlock_bh(&nl_table_lock);
 }
 
 static inline void
@@ -225,7 +225,7 @@ static inline struct sock *netlink_lookup(struct net *net, int protocol,
 	struct sock *sk;
 	struct hlist_node *node;
 
-	read_lock(&nl_table_lock);
+	read_lock_bh(&nl_table_lock);
 	head = nl_pid_hashfn(hash, pid);
 	sk_for_each(sk, node, head) {
 		if ((sk->sk_net == net) && (nlk_sk(sk)->pid == pid)) {
@@ -235,7 +235,7 @@ static inline struct sock *netlink_lookup(struct net *net, int protocol,
 	}
 	sk = NULL;
 found:
-	read_unlock(&nl_table_lock);
+	read_unlock_bh(&nl_table_lock);
 	return sk;
 }
 
@@ -1078,12 +1078,12 @@ void netlink_set_err(struct sock *ssk, u32 pid, u32 group, int code)
 	info.group = group;
 	info.code = code;
 
-	read_lock(&nl_table_lock);
+	read_lock_bh(&nl_table_lock);
 
 	sk_for_each_bound(sk, node, &nl_table[ssk->sk_protocol].mc_list)
 		do_one_set_err(sk, &info);
 
-	read_unlock(&nl_table_lock);
+	read_unlock_bh(&nl_table_lock);
 }
 
 /* must be called with netlink table grabbed */
@@ -1776,7 +1776,7 @@ static struct sock *netlink_seq_socket_idx(struct seq_file *seq, loff_t pos)
 static void *netlink_seq_start(struct seq_file *seq, loff_t *pos)
 	__acquires(nl_table_lock)
 {
-	read_lock(&nl_table_lock);
+	read_lock_bh(&nl_table_lock);
 	return *pos ? netlink_seq_socket_idx(seq, *pos - 1) : SEQ_START_TOKEN;
 }
 
@@ -1825,7 +1825,7 @@ static void *netlink_seq_next(struct seq_file *seq, void *v, loff_t *pos)
 static void netlink_seq_stop(struct seq_file *seq, void *v)
 	__releases(nl_table_lock)
 {
-	read_unlock(&nl_table_lock);
+	read_unlock_bh(&nl_table_lock);
 }
 
 
-- 
Thanks and Regards
gautham


* Re: [PATCH 2/2] lockdep: fix fib_hash softirq inversion
  2008-03-12 12:09 ` [PATCH 2/2] lockdep: fix fib_hash softirq inversion Peter Zijlstra
  2008-03-13  3:50   ` [PATCH] net/lockdep: Make nl_table_lock read acquire softirq safe Gautham R Shenoy
@ 2008-03-13  6:08   ` David Miller
  2008-03-13  9:40     ` Arjan van de Ven
  1 sibling, 1 reply; 9+ messages in thread
From: David Miller @ 2008-03-13  6:08 UTC (permalink / raw)
  To: a.p.zijlstra; +Cc: linux-kernel, mingo, ego, hugh, dada1

From: Peter Zijlstra <a.p.zijlstra@chello.nl>
Date: Wed, 12 Mar 2008 13:09:22 +0100

> =========================================================
> [ INFO: possible irq lock inversion dependency detected ]
> 2.6.25-rc4-sched-devel.git #56
> ---------------------------------------------------------
> swapper/0 just changed the state of lock:
>  (&rt_hash_locks[i]){-+..}, at: [<ffffffff804721ac>] rt_intern_hash+0x8c/0x3b0
> but this lock took another, soft-read-irq-unsafe lock in the past:
>  (fib_hash_lock){-.-?}
> 
> and interrupts could create inverse lock ordering between them.

I tried to figure out what lockdep doesn't like here.

Could you show me the specific code path that could cause
the lock conflict?

Adding BH disabling to fib_hash_lock will add non-trivial
costs to these code paths, so I'd like to avoid this if
possible.


* Re: [PATCH 2/2] lockdep: fix fib_hash softirq inversion
  2008-03-13  6:08   ` [PATCH 2/2] lockdep: fix fib_hash softirq inversion David Miller
@ 2008-03-13  9:40     ` Arjan van de Ven
  0 siblings, 0 replies; 9+ messages in thread
From: Arjan van de Ven @ 2008-03-13  9:40 UTC (permalink / raw)
  To: David Miller; +Cc: a.p.zijlstra, linux-kernel, mingo, ego, hugh, dada1

On Wed, 12 Mar 2008 23:08:43 -0700 (PDT)
David Miller <davem@davemloft.net> wrote:

> From: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Date: Wed, 12 Mar 2008 13:09:22 +0100
> 
> > =========================================================
> > [ INFO: possible irq lock inversion dependency detected ]
> > 2.6.25-rc4-sched-devel.git #56
> > ---------------------------------------------------------
> > swapper/0 just changed the state of lock:
> >  (&rt_hash_locks[i]){-+..}, at: [<ffffffff804721ac>]
> > rt_intern_hash+0x8c/0x3b0 but this lock took another,
> > soft-read-irq-unsafe lock in the past: (fib_hash_lock){-.-?}
> > 
> > and interrupts could create inverse lock ordering between them.
> 
> I tried to figure out what lockdep doesn't like here.
> 
> Could you show me the specific code path that could cause
> the lock conflict?
> 
> Adding BH disabling to fib_hash_lock will add non-trivial
> costs to these code paths, so I'd like to avoid this if
> possible.

From what I can see, the scenario is the following:

Abstracted form of the deadlock:

There is a lock A that is used in process and irq context
There is a lock B that is used only in process context

There is a case where, in user context, lock A is taken (irqs off) and then lock B
is taken, and there is a case where lock B is taken without disabling irqs.

This can lead to the following deadlock:

cpu 0                      cpu 1                        note
..                         spin_lock_irq(lockA)
spin_lock(lockB)           ....
... interrupt hits ...     .....
spin_lock_irq(lockA)       .....                        in the irq handler; spins
...                        spin_lock(lockB)             AB-BA deadlock



This case:

for this case it's about BHs rather than hard irqs, but for all intents and purposes that's the same

lock A is the rt_hash_locks[i] bucket lock taken in rt_intern_hash
lock B is fib_hash_lock

Previously, lockdep observed that
   [<ffffffff804acdd7>] fn_hash_lookup+0x27/0xf0
   [<ffffffff804a855d>] inet_addr_type+0x6d/0xf0
   [<ffffffff8049dcc6>] arp_constructor+0x86/0x280
   [<ffffffff8046272f>] neigh_create+0x19f/0x5b0
   [<ffffffff8049c5cf>] arp_bind_neighbour+0x9f/0xb0
   [<ffffffff8047223a>] rt_intern_hash+0x11a/0x3b0
   [<ffffffff80472acd>] __ip_route_output_key+0x5fd/0xb10
   [<ffffffff804947ec>] tcp_v4_connect+0x10c/0x550
   [<ffffffff804a343a>] inet_stream_connect+0x23a/0x2d0
   [<ffffffff8044e530>] sys_connect+0xa0/0xc0
This call chain creates the cpu 1 relationship from the scenario above (but at this
point lockdep doesn't yet know that the rt_hash_locks[i] lock is ever taken in irq
context, so it cannot complain yet).

And now lockdep observes ip_rcv calling ip_route_input calling rt_intern_hash from softirq context,
 [<ffffffff804721ac>] rt_intern_hash+0x8c/0x3b0
 [<ffffffff80473a4e>] ip_route_input+0x8fe/0x1250
 [<ffffffff804769a8>] ip_rcv+0x518/0x650
 [<ffffffff8045a8f9>] netif_receive_skb+0x269/0x350
 [<ffffffff80405529>] e1000_clean_rx_irq+0x1a9/0x5a0
 [<ffffffff8040072b>] e1000_clean+0x1fb/0x5b0
 [<ffffffff8045d18f>] net_rx_action+0xbf/0x180
 [<ffffffff80255c55>] __do_softirq+0x75/0x100
 [<ffffffff8022173c>] call_softirq+0x1c/0x30
 [<ffffffff80223505>] do_softirq+0x65/0xa0
 [<ffffffff80255b67>] irq_exit+0x97/0xa0
 [<ffffffff80223618>] do_IRQ+0xa8/0x130
which creates the irq part of the relationship between lock A and lock B in the
scenario above, and lockdep complains that the abstracted deadlock scenario from
above now becomes real.


-- 
If you want to reach me at my work email, use arjan@linux.intel.com
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org


* Re: [PATCH 0/2] lockdep vs recursive read locks
  2008-03-12 12:09 [PATCH 0/2] lockdep vs recursive read locks Peter Zijlstra
  2008-03-12 12:09 ` [PATCH 1/2] lockdep: fix recursive read lock validation Peter Zijlstra
  2008-03-12 12:09 ` [PATCH 2/2] lockdep: fix fib_hash softirq inversion Peter Zijlstra
@ 2008-03-13 11:25 ` Hugh Dickins
  2008-03-14  7:35   ` Peter Zijlstra
  2 siblings, 1 reply; 9+ messages in thread
From: Hugh Dickins @ 2008-03-13 11:25 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: linux-kernel, Ingo Molnar, Gautham R Shenoy

On Wed, 12 Mar 2008, Peter Zijlstra wrote:
> We seemed to have some lockdep trouble wrt recursive read locks. Today Gautham
> asked me some specific questions about it, which led to these patches.
> 
> IIRC Hugh ran into something similar a while back. This hopefully fixes it.

Hmm, I don't remember that myself: can you jog my memory?

What I do remember is that when I fixed madvise(MADV_REMOVE i.e. holepunch)
not to hold mmap_sem (at that time down_write but today down_read, though
would still be bad) across vmtruncate_range which takes i_mutex (contrast
writing from an unfaulted area that takes i_mutex then down_read mmap_sem),
Ingo asked me offline if lockdep caught that and it didn't.  Just tried
again now, with your patches applied, and madvise_remove restored to its
old misbehaviour, and it looks like lockdep still doesn't catch it
(but suspect me of pilot error if you find otherwise).

Hugh


* Re: [PATCH 0/2] lockdep vs recursive read locks
  2008-03-13 11:25 ` [PATCH 0/2] lockdep vs recursive read locks Hugh Dickins
@ 2008-03-14  7:35   ` Peter Zijlstra
  0 siblings, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2008-03-14  7:35 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: linux-kernel, Ingo Molnar, Gautham R Shenoy

On Thu, 2008-03-13 at 11:25 +0000, Hugh Dickins wrote:
> On Wed, 12 Mar 2008, Peter Zijlstra wrote:
> > We seemed to have some lockdep trouble wrt recursive read locks. Today Gautham
> > asked me some specific questions about it, which led to these patches.
> > 
> > IIRC Hugh ran into something similar a while back. This hopefully fixes it.
> 
> Hmm, I don't remember that myself: can you jog my memory?

Sadly that is about as far as my memory went: 

  Hugh spotted problem that lockdep missed, involves r/w lock.

> What I do remember is that when I fixed madvise(MADV_REMOVE i.e. holepunch)
> not to hold mmap_sem (at that time down_write but today down_read, though
> would still be bad) across vmtruncate_range which takes i_mutex (contrast
> writing from an unfaulted area that takes i_mutex then down_read mmap_sem),
> Ingo asked me offline if lockdep caught that and it didn't.  Just tried
> again now, with your patches applied, and madvise_remove restored to its
> old misbehaviour, and it looks like lockdep still doesn't catch it
> (but suspect me of pilot error if you find otherwise).

Ah, ok. Yes, the rwsem will not be affected by this change, as it was
specific to recursive readers, which only rwlock_t allows.

I'll try to look into the scenario you described. Would be interesting
to figure that out.


