netdev.vger.kernel.org archive mirror
* BUG: warning at ... (netlink) [Was: 2.6.17-rc5-mm1]
       [not found] <20060530022925.8a67b613.akpm@osdl.org>
@ 2006-05-30 11:02 ` Jiri Slaby
  2006-05-30 11:55   ` [patch, -rc5-mm1] lock validator: remove softirq.c WARN_ON Ingo Molnar
  2006-05-30 15:59 ` 2.6.17-rc5-mm1 Michal Piotrowski
  1 sibling, 1 reply; 8+ messages in thread
From: Jiri Slaby @ 2006-05-30 11:02 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, jgarzik, netdev, kuznet, alan

Andrew Morton wrote:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm1/

BUG: warning at /l/latest/xxx/kernel/softirq.c:86/local_bh_disable()
 [<c0103e66>] show_trace+0x1b/0x1d
 [<c01045a4>] dump_stack+0x26/0x28
 [<c012708f>] local_bh_disable+0x53/0x55
 [<c0399fd6>] _write_lock_bh+0x10/0x15
 [<c034e314>] netlink_table_grab+0x12/0xe9
 [<c034e6f6>] netlink_insert+0x2a/0x156
 [<c034fa46>] netlink_kernel_create+0xad/0x143
 [<c051f869>] rtnetlink_init+0x70/0xc7
 [<c051fb9f>] netlink_proto_init+0x187/0x192
 [<c01003cb>] init+0x12b/0x2f1
 [<c0101005>] kernel_thread_helper+0x5/0xb

If more info needed, feel free to ask.

regards,
-- 
Jiri Slaby         www.fi.muni.cz/~xslaby
\_.-^-._   jirislaby@gmail.com   _.-^-._/
B67499670407CE62ACC8 22A032CC55C339D47A7E

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [patch, -rc5-mm1] lock validator: remove softirq.c WARN_ON
  2006-05-30 11:02 ` BUG: warning at ... (netlink) [Was: 2.6.17-rc5-mm1] Jiri Slaby
@ 2006-05-30 11:55   ` Ingo Molnar
  2006-05-30 16:00     ` Alexey Kuznetsov
  0 siblings, 1 reply; 8+ messages in thread
From: Ingo Molnar @ 2006-05-30 11:55 UTC (permalink / raw)
  To: Jiri Slaby; +Cc: Andrew Morton, linux-kernel, jgarzik, netdev, kuznet, alan


* Jiri Slaby <jirislaby@gmail.com> wrote:

> Andrew Morton wrote:
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm1/
> 
> BUG: warning at /l/latest/xxx/kernel/softirq.c:86/local_bh_disable()

ok, that WARN_ON is over-eager. Fix is below:

--------------
Subject: lock validator: remove softirq.c WARN_ON
From: Ingo Molnar <mingo@elte.hu>

there is nothing wrong with calling local_bh_disable() in an irqs-off
section (as long as the matching local_bh_enable() isn't done with irqs
off), so remove this over-eager WARN_ON().

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
---
 kernel/softirq.c |    1 -
 1 file changed, 1 deletion(-)

Index: linux/kernel/softirq.c
===================================================================
--- linux.orig/kernel/softirq.c
+++ linux/kernel/softirq.c
@@ -83,7 +83,6 @@ static void __local_bh_disable(unsigned 
 
 void local_bh_disable(void)
 {
-	WARN_ON_ONCE(irqs_disabled());
 	__local_bh_disable((unsigned long)__builtin_return_address(0));
 }
 

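The asymmetry described in the patch above (disabling bottom halves with irqs off is harmless, enabling them with irqs off is not) can be sketched with a small userspace model. This is only an illustration under stated assumptions: `struct cpu_model`, `model_bh_disable()`, `model_bh_enable()` and the two scenario helpers are hypothetical names, not kernel APIs, and the model reduces the real accounting to a nesting counter plus a "pending softirq" flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; these names are illustrative, not kernel APIs. */
struct cpu_model {
	bool irqs_off;         /* models "hardware interrupts disabled" */
	int  bh_disable_depth; /* models the softirq-disable nesting count */
	bool softirq_pending;  /* models a raised softirq that has not run yet */
	int  softirqs_run;     /* number of pending softirqs the model has run */
	int  warnings;         /* number of times the model complained */
};

/* Disabling only bumps a counter, so it is harmless with irqs off. */
static void model_bh_disable(struct cpu_model *c)
{
	c->bh_disable_depth++;
}

/*
 * Enabling is the sensitive side: dropping the last reference may run
 * pending softirq work, which should not happen while irqs are off.
 */
static void model_bh_enable(struct cpu_model *c)
{
	assert(c->bh_disable_depth > 0);
	if (c->irqs_off)
		c->warnings++;
	if (--c->bh_disable_depth == 0 && c->softirq_pending && !c->irqs_off) {
		c->softirq_pending = false;
		c->softirqs_run++;
	}
}

/* Warnings seen for: disable with irqs off, enable after irqs are back on. */
static int model_scenario_ok(void)
{
	struct cpu_model c = { .irqs_off = true, .softirq_pending = true };

	model_bh_disable(&c);   /* irqs off here: no complaint in this model */
	c.irqs_off = false;     /* irqs turned back on before the enable */
	model_bh_enable(&c);    /* runs the pending softirq */
	return c.warnings;
}

/* Warnings seen when the enable itself runs with irqs still off. */
static int model_scenario_bad(void)
{
	struct cpu_model c = { .irqs_off = true };

	model_bh_disable(&c);
	model_bh_enable(&c);    /* still irqs off: the model warns once */
	return c.warnings;
}
```

In this model the first scenario reports no warnings and the second reports one, which mirrors the reasoning in the commit message: only the enable side needs the irqs-off check.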

* Re: 2.6.17-rc5-mm1
       [not found] <20060530022925.8a67b613.akpm@osdl.org>
  2006-05-30 11:02 ` BUG: warning at ... (netlink) [Was: 2.6.17-rc5-mm1] Jiri Slaby
@ 2006-05-30 15:59 ` Michal Piotrowski
  2006-05-30 16:08   ` 2.6.17-rc5-mm1 Arjan van de Ven
  1 sibling, 1 reply; 8+ messages in thread
From: Michal Piotrowski @ 2006-05-30 15:59 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, netdev

Hi,

On 30/05/06, Andrew Morton <akpm@osdl.org> wrote:
>
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm1/
>
>

It looks like a network stack problem.

May 30 17:50:34 ltg01-fedora init: Switching to runlevel: 6
May 30 17:50:35 ltg01-fedora avahi-daemon[1878]: Got SIGTERM, quitting.
May 30 17:50:35 ltg01-fedora avahi-daemon[1878]: Leaving mDNS multicast group on interface eth0.IPv4 with address 192.168.0.14.
May 30 17:50:35 ltg01-fedora kernel:
May 30 17:50:35 ltg01-fedora kernel: ======================================
May 30 17:50:35 ltg01-fedora kernel: [ BUG: bad unlock ordering detected! ]
May 30 17:50:35 ltg01-fedora kernel: --------------------------------------
May 30 17:50:35 ltg01-fedora kernel: avahi-daemon/1878 is trying to release lock (&in_dev->mc_list_lock) at:
May 30 17:50:35 ltg01-fedora kernel:  [<c02e693b>] ip_mc_del_src+0x5e/0xd5
May 30 17:50:35 ltg01-fedora kernel: but the next lock to release is:
May 30 17:50:35 ltg01-fedora kernel:  (&im->lock){-...}, at: [<c02e6934>] ip_mc_del_src+0x57/0xd5
May 30 17:50:35 ltg01-fedora kernel:
May 30 17:50:35 ltg01-fedora kernel: other info that might help us debug this:
May 30 17:50:35 ltg01-fedora kernel: 2 locks held by avahi-daemon/1878:
May 30 17:50:35 ltg01-fedora kernel:  #0:  (rtnl_mutex){--..}, at: [<c02f0b0f>] mutex_lock+0x1c/0x1f
May 30 17:50:35 ltg01-fedora kernel:  #1:  (&in_dev->mc_list_lock){-.-?}, at: [<c02e6905>] ip_mc_del_src+0x28/0xd5
May 30 17:50:35 ltg01-fedora kernel:
May 30 17:50:35 ltg01-fedora kernel: stack backtrace:
May 30 17:50:35 ltg01-fedora kernel:  [<c0103e52>] show_trace_log_lvl+0x4b/0xf4
May 30 17:50:35 ltg01-fedora kernel:  [<c01044b3>] show_trace+0xd/0x10
May 30 17:50:35 ltg01-fedora kernel:  [<c010457b>] dump_stack+0x19/0x1b
May 30 17:50:35 ltg01-fedora kernel:  [<c0139bfa>] lockdep_release+0x18b/0x350
May 30 17:50:35 ltg01-fedora kernel:  [<c02f2640>] _read_unlock+0x16/0x4d
May 30 17:50:35 ltg01-fedora kernel:  [<c02e693b>] ip_mc_del_src+0x5e/0xd5
May 30 17:50:35 ltg01-fedora kernel:  [<c02e69de>] ip_mc_leave_src+0x2c/0x6c
May 30 17:50:35 ltg01-fedora kernel:  [<c02e6c5b>] ip_mc_leave_group+0x3d/0x97
May 30 17:50:35 ltg01-fedora kernel:  [<c02c8a68>] ip_setsockopt+0x4d0/0x9a6
May 30 17:50:35 ltg01-fedora kernel:  [<c02def6d>] udp_setsockopt+0x1f/0x9c
May 30 17:50:35 ltg01-fedora kernel:  [<c02a7006>] sock_common_setsockopt+0x13/0x18
May 30 17:50:35 ltg01-fedora kernel:  [<c02a5956>] sys_setsockopt+0x73/0xa4
May 30 17:50:35 ltg01-fedora kernel:  [<c02a6c53>] sys_socketcall+0x148/0x186
May 30 17:50:35 ltg01-fedora kernel:  [<c02f2ad5>] sysenter_past_esp+0x56/0x8d

Here is config
http://www.stardust.webpages.pl/files/mm/2.6.17-rc5-mm1/mm-config

Regards,
Michal

-- 
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/wiki/)

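The "bad unlock ordering" report above says that the lock being released is not the most recently acquired one. A rough userspace model of such a check follows, assuming a plain stack of held lock names. `struct held_locks`, `model_acquire()`, `model_release()` and `model_replay_ip_mc_del_src()` are hypothetical and are not the lock validator's actual data structures or functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_HELD 8

/* Hypothetical model of a per-task stack of held locks. */
struct held_locks {
	const char *name[MAX_HELD];
	size_t depth;
};

static void model_acquire(struct held_locks *h, const char *lock)
{
	assert(h->depth < MAX_HELD);
	h->name[h->depth++] = lock;
}

/*
 * Release a lock. Returns 1 if it was the most recently acquired lock
 * (a nested release), 0 if it was released out of order, -1 if not held.
 */
static int model_release(struct held_locks *h, const char *lock)
{
	for (size_t i = h->depth; i-- > 0; ) {
		if (strcmp(h->name[i], lock) != 0)
			continue;
		int nested = (i == h->depth - 1);
		/* Close the gap left by the released entry. */
		memmove(&h->name[i], &h->name[i + 1],
			(h->depth - i - 1) * sizeof(h->name[0]));
		h->depth--;
		return nested;
	}
	return -1;
}

/* Replays the acquire/release order reported for ip_mc_del_src(). */
static int model_replay_ip_mc_del_src(void)
{
	struct held_locks h = { .depth = 0 };

	model_acquire(&h, "rtnl_mutex");
	model_acquire(&h, "&in_dev->mc_list_lock");
	model_acquire(&h, "&im->lock");
	/* mc_list_lock is dropped while im->lock is still held. */
	return model_release(&h, "&in_dev->mc_list_lock");
}
```

The replay returns 0 in this model, i.e. an out-of-order release. Whether that ordering is a bug or an intentional pattern is a separate question that the checker alone cannot answer.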

* Re: [patch, -rc5-mm1] lock validator: remove softirq.c WARN_ON
  2006-05-30 11:55   ` [patch, -rc5-mm1] lock validator: remove softirq.c WARN_ON Ingo Molnar
@ 2006-05-30 16:00     ` Alexey Kuznetsov
  2006-05-30 16:05       ` Arjan van de Ven
  0 siblings, 1 reply; 8+ messages in thread
From: Alexey Kuznetsov @ 2006-05-30 16:00 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Jiri Slaby, Andrew Morton, linux-kernel, jgarzik, netdev, alan

Hello!

> ok, that WARN_ON is over-eager. Fix is below:

Nevertheless, I cannot figure out what's happening here.

This local_bh_disable() is called right after schedule().
No way irqs can be disabled there. What is wrong?


static void netlink_table_grab(void)
{
        write_lock_bh(&nl_table_lock);

        if (atomic_read(&nl_table_users)) {
                DECLARE_WAITQUEUE(wait, current);

                add_wait_queue_exclusive(&nl_table_wait, &wait);
                for(;;) {
                        set_current_state(TASK_UNINTERRUPTIBLE);
                        if (atomic_read(&nl_table_users) == 0)
                                break;
                        write_unlock_bh(&nl_table_lock);
                        schedule();
                        write_lock_bh(&nl_table_lock);
                }

                __set_current_state(TASK_RUNNING);
                remove_wait_queue(&nl_table_wait, &wait);
        }
}


Alexey

* Re: [patch, -rc5-mm1] lock validator: remove softirq.c WARN_ON
  2006-05-30 16:00     ` Alexey Kuznetsov
@ 2006-05-30 16:05       ` Arjan van de Ven
  2006-05-30 16:15         ` Alexey Kuznetsov
  0 siblings, 1 reply; 8+ messages in thread
From: Arjan van de Ven @ 2006-05-30 16:05 UTC (permalink / raw)
  To: Alexey Kuznetsov
  Cc: Ingo Molnar, Jiri Slaby, Andrew Morton, linux-kernel, jgarzik,
	netdev, alan

On Tue, 2006-05-30 at 20:00 +0400, Alexey Kuznetsov wrote:
> Hello!
> 
> > ok, that WARN_ON is over-eager. Fix is below:
> 
> Nevertheless, I cannot figure out what's happening here.
> 
> This local_bh_disable() is called right after schedule().
> No way irqs can be disabled there. What is wrong?
> 
> 
> static void netlink_table_grab(void)
> {
>         write_lock_bh(&nl_table_lock);

well it could be this one as well...

> 
>         if (atomic_read(&nl_table_users)) {



* Re: 2.6.17-rc5-mm1
  2006-05-30 15:59 ` 2.6.17-rc5-mm1 Michal Piotrowski
@ 2006-05-30 16:08   ` Arjan van de Ven
  2006-05-30 18:51     ` 2.6.17-rc5-mm1 Michal Piotrowski
  0 siblings, 1 reply; 8+ messages in thread
From: Arjan van de Ven @ 2006-05-30 16:08 UTC (permalink / raw)
  To: Michal Piotrowski; +Cc: netdev, linux-kernel, Andrew Morton, mingo

On Tue, 2006-05-30 at 17:59 +0200, Michal Piotrowski wrote:
> Hi,
> 
> On 30/05/06, Andrew Morton <akpm@osdl.org> wrote:
> >
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm1/
> >
> >
> 
> It looks like a network stack problem.
> 
> May 30 17:50:34 ltg01-fedora init: Switching to runlevel: 6
> May 30 17:50:35 ltg01-fedora avahi-daemon[1878]: Got SIGTERM, quitting.
> May 30 17:50:35 ltg01-fedora avahi-daemon[1878]: Leaving mDNS multicast group on interface eth0.IPv4 with address 192.168.0.14.
> May 30 17:50:35 ltg01-fedora kernel:
> May 30 17:50:35 ltg01-fedora kernel: ======================================
> May 30 17:50:35 ltg01-fedora kernel: [ BUG: bad unlock ordering detected! ]
> May 30 17:50:35 ltg01-fedora kernel: --------------------------------------
> May 30 17:50:35 ltg01-fedora kernel: avahi-daemon/1878 is trying to

does this fix it for you?



Mark out-of-order unlocking in igmp.c as such

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
---
 net/ipv4/igmp.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6.17-rc5-mm1-lockdep/net/ipv4/igmp.c
===================================================================
--- linux-2.6.17-rc5-mm1-lockdep.orig/net/ipv4/igmp.c
+++ linux-2.6.17-rc5-mm1-lockdep/net/ipv4/igmp.c
@@ -1472,7 +1472,7 @@ static int ip_mc_del_src(struct in_devic
 		return -ESRCH;
 	}
 	spin_lock_bh(&pmc->lock);
-	read_unlock(&in_dev->mc_list_lock);
+	read_unlock_non_nested(&in_dev->mc_list_lock);
 #ifdef CONFIG_IP_MULTICAST
 	sf_markstate(pmc);
 #endif



* Re: [patch, -rc5-mm1] lock validator: remove softirq.c WARN_ON
  2006-05-30 16:05       ` Arjan van de Ven
@ 2006-05-30 16:15         ` Alexey Kuznetsov
  0 siblings, 0 replies; 8+ messages in thread
From: Alexey Kuznetsov @ 2006-05-30 16:15 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Ingo Molnar, Jiri Slaby, Andrew Morton, linux-kernel, jgarzik,
	netdev, alan

Hello!

> > static void netlink_table_grab(void)
> > {
> >         write_lock_bh(&nl_table_lock);
> 
> well it could be this one as well...

Indeed.

But it still looks very strange.
There are some GFP_KERNEL allocations on the way to this function, and
those may sleep, so irqs should not be disabled on that path either.

* Re: 2.6.17-rc5-mm1
  2006-05-30 16:08   ` 2.6.17-rc5-mm1 Arjan van de Ven
@ 2006-05-30 18:51     ` Michal Piotrowski
  0 siblings, 0 replies; 8+ messages in thread
From: Michal Piotrowski @ 2006-05-30 18:51 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: netdev, linux-kernel, Andrew Morton, mingo

Hi Arjan,

On 30/05/06, Arjan van de Ven <arjan@linux.intel.com> wrote:
> On Tue, 2006-05-30 at 17:59 +0200, Michal Piotrowski wrote:
> > Hi,
> >
> > On 30/05/06, Andrew Morton <akpm@osdl.org> wrote:
> > >
> > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.17-rc5/2.6.17-rc5-mm1/
> > >
> > >
> >
> > It looks like a network stack problem.
> >
> > May 30 17:50:34 ltg01-fedora init: Switching to runlevel: 6
> > May 30 17:50:35 ltg01-fedora avahi-daemon[1878]: Got SIGTERM, quitting.
> > May 30 17:50:35 ltg01-fedora avahi-daemon[1878]: Leaving mDNS multicast group on interface eth0.IPv4 with address 192.168.0.14.
> > May 30 17:50:35 ltg01-fedora kernel:
> > May 30 17:50:35 ltg01-fedora kernel: ======================================
> > May 30 17:50:35 ltg01-fedora kernel: [ BUG: bad unlock ordering detected! ]
> > May 30 17:50:35 ltg01-fedora kernel: --------------------------------------
> > May 30 17:50:35 ltg01-fedora kernel: avahi-daemon/1878 is trying to
>
> does this fix it for you?

Yes, thanks.

> Mark out-of-order unlocking in igmp.c as such
>
> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>

Regards,
Michal

-- 
Michal K. K. Piotrowski
LTG - Linux Testers Group
(http://www.stardust.webpages.pl/ltg/wiki/)

end of thread

Thread overview: 8+ messages
     [not found] <20060530022925.8a67b613.akpm@osdl.org>
2006-05-30 11:02 ` BUG: warning at ... (netlink) [Was: 2.6.17-rc5-mm1] Jiri Slaby
2006-05-30 11:55   ` [patch, -rc5-mm1] lock validator: remove softirq.c WARN_ON Ingo Molnar
2006-05-30 16:00     ` Alexey Kuznetsov
2006-05-30 16:05       ` Arjan van de Ven
2006-05-30 16:15         ` Alexey Kuznetsov
2006-05-30 15:59 ` 2.6.17-rc5-mm1 Michal Piotrowski
2006-05-30 16:08   ` 2.6.17-rc5-mm1 Arjan van de Ven
2006-05-30 18:51     ` 2.6.17-rc5-mm1 Michal Piotrowski
