From: Jean Tourrilhes <jt@bougret.hpl.hp.com>
To: Martin Diehl <lists@mdiehl.de>
Cc: Jeff Garzik <jgarzik@pobox.com>,
Linux kernel mailing list <linux-kernel@vger.kernel.org>
Subject: Re: [2.5.69] rtnl-deadlock with usermodehelper and keventd
Date: Thu, 15 May 2003 13:12:55 -0700
Message-ID: <20030515201255.GA18643@bougret.hpl.hp.com>
In-Reply-To: <Pine.LNX.4.44.0305151443180.1435-100000@notebook.home.mdiehl.de>
Greg,

This is a HotPlug problem, so would you mind forwarding this
to the relevant person and helping Martin?

Thanks in advance...

Jean

On Thu, May 15, 2003 at 03:14:36PM +0200, Martin Diehl wrote:
>
> Hi,
>
> It seems we may run into a mutual deadlock in the unregister_netdev() path
> with CONFIG_HOTPLUG=y. I managed to reproduce an irda-user report, leading
> to the following description:
>
> * killing irattach (userland daemon comparable to pppd) starts closing the
> irda tty-ldisc
>
> * there we call unregister_netdev() on behalf of the (already closed)
> irda0 network device.
>
> * unregister_netdev() takes rtnl_lock
>
> * further down in unregister_netdevice(), with CONFIG_HOTPLUG, the network
> layer wants to call the userland hotplug helper
>
> * the request to fork the usermodehelper gets queued for the events/0
> workqueue (aka keventd), and we block waiting for its completion with rtnl
> still held.
>
> * at this moment, for some reason, keventd has a linkwatch_event()
> apparently already scheduled before the usermode helper. So we run into
> linkwatch_event(), which tries to get rtnl_lock.
>
> -> mutual deadlock: keventd is waiting for rtnl_lock, which is still held by
> unregister_netdev(), which is blocking for completion of work scheduled for
> keventd.
>
> I can reproduce this with 2.5.69 with CONFIG_HOTPLUG enabled, no matter
> what /proc/sys/kernel/hotplug is, even /bin/true is sufficient. I've no
> idea why I get this with irda0 but not with eth0 for example.
> FWIW kernel is SMP running on UP without preempt.
>
> As I don't see how the irda stuff could cause unregister_netdev() to
> schedule the hotplug stuff with some linkwatch_event already scheduled,
> I've no idea what the real problem and fix might be.
>
> Below is a commented call trace caught right when it hangs as described.
>
> Thanks
> Martin
>
> -----------------------------
>
> > May 14 13:14:17 laptop kernel: events/0 D C12FDF04 412092 3 1 4 2 (L-TLB)
> > May 14 13:14:17 laptop kernel: Call Trace:
> > May 14 13:14:17 laptop kernel: [__down+150/256] __down+0x96/0x100
> > May 14 13:14:17 laptop kernel: [default_wake_function+0/32] default_wake_function+0x0/0x20
> > May 14 13:14:17 laptop kernel: [__down_failed+8/12] __down_failed+0x8/0xc
> > May 14 13:14:17 laptop kernel: [.text.lock.rtnetlink+5/54] .text.lock.rtnetlink+0x5/0x36
> > May 14 13:14:17 laptop kernel: [linkwatch_event+29/48] linkwatch_event+0x1d/0x30
> > May 14 13:14:17 laptop kernel: [worker_thread+511/736] worker_thread+0x1ff/0x2e0
> > May 14 13:14:17 laptop kernel: [linkwatch_event+0/48] linkwatch_event+0x0/0x30
> > May 14 13:14:17 laptop kernel: [default_wake_function+0/32] default_wake_function+0x0/0x20
> > May 14 13:14:17 laptop kernel: [ret_from_fork+6/20] ret_from_fork+0x6/0x14
> > May 14 13:14:17 laptop kernel: [default_wake_function+0/32] default_wake_function+0x0/0x20
> > May 14 13:14:17 laptop kernel: [worker_thread+0/736] worker_thread+0x0/0x2e0
> > May 14 13:14:17 laptop kernel: [kernel_thread_helper+5/24] kernel_thread_helper+0x5/0x18
>
> This is the keventd-thread. It has some work scheduled for the network
> layer, namely linkwatch_event(). This is currently blocking to get the
> rtnl_lock semaphore.
>
>
> > May 14 13:14:17 laptop kernel: irattach D 00000000 4283667124 400 1 537 396 (NOTLB)
> > May 14 13:14:17 laptop kernel: Call Trace:
> > May 14 13:14:17 laptop kernel: [try_to_wake_up+296/464] try_to_wake_up+0x128/0x1d0
> > May 14 13:14:17 laptop kernel: [wait_for_completion+153/224] wait_for_completion+0x99/0xe0
>
> (5)
>
> > May 14 13:14:17 laptop kernel: [default_wake_function+0/32] default_wake_function+0x0/0x20
> > May 14 13:14:17 laptop kernel: [default_wake_function+0/32] default_wake_function+0x0/0x20
> > May 14 13:14:17 laptop kernel: [queue_work+132/160] queue_work+0x84/0xa0
>
> (4)
>
> > May 14 13:14:17 laptop kernel: [call_usermodehelper+257/272] call_usermodehelper+0x101/0x110
> > May 14 13:14:17 laptop kernel: [__call_usermodehelper+0/112] __call_usermodehelper+0x0/0x70
> > May 14 13:14:17 laptop kernel: [vsprintf+39/48] vsprintf+0x27/0x30
> > May 14 13:14:17 laptop kernel: [sprintf+31/48] sprintf+0x1f/0x30
> > May 14 13:14:17 laptop kernel: [net_run_sbin_hotplug+174/195] net_run_sbin_hotplug+0xae/0xc3
>
> (3)
>
> > May 14 13:14:17 laptop kernel: [try_to_wake_up+296/464] try_to_wake_up+0x128/0x1d0
> > May 14 13:14:17 laptop kernel: [pfifo_fast_reset+158/160] pfifo_fast_reset+0x9e/0xa0
> > May 14 13:14:17 laptop kernel: [qdisc_destroy+158/160] qdisc_destroy+0x9e/0xa0
> > May 14 13:14:17 laptop kernel: [unregister_netdevice+211/608] unregister_netdevice+0xd3/0x260
> > May 14 13:14:17 laptop kernel: [_end+282800068/1070304612] sirdev_dtor+0x0/0x20 [sir_dev]
>
> (2)
>
> > May 14 13:14:17 laptop kernel: [unregister_netdev+24/48] unregister_netdev+0x18/0x30
>
> (1)
>
> > May 14 13:14:17 laptop kernel: [_end+282800429/1070304612] sirdev_put_instance+0x149/0x1ad [sir_dev]
> > May 14 13:14:17 laptop kernel: [_end+282804705/1070304612] __func__.9+0x0/0x14 [sir_dev]
> > May 14 13:14:17 laptop kernel: [_end+282131315/1070304612] irtty_close+0x4f/0x120 [irtty_sir]
> > May 14 13:14:17 laptop kernel: [default_wake_function+0/32] default_wake_function+0x0/0x20
> > May 14 13:14:17 laptop kernel: [tty_set_ldisc+1091/1200] tty_set_ldisc+0x443/0x4b0
> > May 14 13:14:17 laptop kernel: [uart_wait_until_sent+144/224] uart_wait_until_sent+0x90/0xe0
> > May 14 13:14:17 laptop kernel: [tty_wait_until_sent+243/272] tty_wait_until_sent+0xf3/0x110
> > May 14 13:14:17 laptop kernel: [default_wake_function+0/32] default_wake_function+0x0/0x20
> > May 14 13:14:17 laptop kernel: [sock_destroy_inode+27/32] sock_destroy_inode+0x1b/0x20
> > May 14 13:14:17 laptop kernel: [_end+282132178/1070304612] +0x15a/0x16c [irtty_sir]
> > May 14 13:14:17 laptop kernel: [_end+282130740/1070304612] irtty_open+0x0/0x1f0 [irtty_sir]
> > May 14 13:14:17 laptop kernel: [_end+282131236/1070304612] irtty_close+0x0/0x120 [irtty_sir]
> > May 14 13:14:17 laptop kernel: [_end+282130132/1070304612] irtty_ioctl+0x0/0x260 [irtty_sir]
> > May 14 13:14:17 laptop kernel: [_end+282129076/1070304612] irtty_receive_buf+0x0/0xc0 [irtty_sir]
> > May 14 13:14:17 laptop kernel: [_end+282129268/1070304612] irtty_receive_room+0x0/0x30 [irtty_sir]
> > May 14 13:14:17 laptop kernel: [_end+282129316/1070304612] irtty_write_wakeup+0x0/0x40 [irtty_sir]
> > May 14 13:14:17 laptop kernel: [_end+282134820/1070304612] +0x0/0xe0 [irtty_sir]
> > May 14 13:14:17 laptop kernel: [sys_ioctl+256/656] sys_ioctl+0x100/0x290
> > May 14 13:14:17 laptop kernel: [syscall_call+7/11] syscall_call+0x7/0xb
>
> Ok, nice trace btw: The last printk from sir_dev was at (1) before we
> called unregister_netdev() - which in turn acquired rtnl_lock (2). Due to
> the disappearing irda0 device (and CONFIG_HOTPLUG=y) the network layer
> decided to call the hotplug stuff (3). To fork the usermode helper, it
> scheduled some work for keventd (4). Finally we are blocking for
> completion until keventd finishes waiting for the usermodehelper (5).
>
> Unfortunately we are blocking for completion with rtnl still locked and
> keventd apparently having the linkwatch_event() scheduled before the
> usermodehelper -> mutual deadlock between irattach and keventd!
>
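
The cycle described above can be reproduced outside the kernel with a toy
model. The sketch below is Python, not kernel code, and every name in it
(simulate, worker, run_helper, the timeout values) is made up for
illustration: a threading.Lock stands in for rtnl, and a single thread
draining a queue stands in for events/0. Timeouts replace the indefinite
sleeps so that the model reports the stall instead of hanging.

```python
import queue
import threading


def simulate(linkwatch_queued_first, timeout=0.5):
    """Return True if the modelled unregister path completes, False if it stalls."""
    rtnl = threading.Lock()          # stands in for the rtnl semaphore
    work = queue.Queue()             # stands in for the events/0 work list
    stop = threading.Event()         # lets the model shut down cleanly
    helper_done = threading.Event()  # stands in for the completion

    def worker():
        # Single worker thread: runs queued items strictly in order.
        while True:
            item = work.get()
            if item is None:
                return
            item()

    def linkwatch_event():
        # The real handler sleeps in rtnl_lock(); the model polls so that
        # the thread can exit once the experiment is over.
        while not stop.is_set():
            if rtnl.acquire(timeout=0.05):
                rtnl.release()
                return

    def run_helper():
        # Stands in for forking the hotplug helper and signalling completion.
        helper_done.set()

    if linkwatch_queued_first:
        work.put(linkwatch_event)    # already pending before the unregister

    thread = threading.Thread(target=worker, daemon=True)
    with rtnl:                       # unregister path takes the lock
        work.put(run_helper)         # helper request queued behind linkwatch
        thread.start()               # started here to keep the model deterministic
        # Wait for the helper while still holding the lock.
        completed = helper_done.wait(timeout)

    stop.set()
    work.put(None)
    thread.join()
    return completed


if __name__ == "__main__":
    print(simulate(linkwatch_queued_first=False))  # True
    print(simulate(linkwatch_queued_first=True))   # False
```

With nothing queued ahead of the helper, the wait finishes at once. With
the linkwatch item queued first, the worker never reaches the helper while
the lock is held, so the wait only ends when the model's timeout expires.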