From mboxrd@z Thu Jan 1 00:00:00 1970 From: Marc Kleine-Budde Subject: [PATCH] net: sched: dev_deactivate_many(): use msleep(1) instead of yield() to wait for outstanding qdisc_run calls Date: Wed, 5 Mar 2014 00:49:47 +0100 Message-ID: <1393976987-23555-1-git-send-email-mkl@pengutronix.de> Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org, kernel@pengutronix.de, Marc Kleine-Budde To: linux-rt-users@vger.kernel.org Return-path: Sender: linux-kernel-owner@vger.kernel.org List-Id: netdev.vger.kernel.org On PREEMPT_RT enabled systems the interrupt handler run as threads at prio 50 (by default). If a high priority userspace process tries to shut down a busy network interface it might spin in a yield loop waiting for the device to become idle. With the interrupt thread having a lower priority than the looping process it might never be scheduled and so result in a deadlock on UP systems. With Magic SysRq the following backtrace can be produced: > test_app R running 0 174 168 0x00000000 > [] (__schedule+0x220/0x3fc) from [] (preempt_schedule_irq+0x48/0x80) > [] (preempt_schedule_irq+0x48/0x80) from [] (svc_preempt+0x8/0x20) > [] (svc_preempt+0x8/0x20) from [] (local_bh_enable+0x18/0x88) > [] (local_bh_enable+0x18/0x88) from [] (dev_deactivate_many+0x220/0x264) > [] (dev_deactivate_many+0x220/0x264) from [] (__dev_close_many+0x64/0xd4) > [] (__dev_close_many+0x64/0xd4) from [] (__dev_close+0x28/0x3c) > [] (__dev_close+0x28/0x3c) from [] (__dev_change_flags+0x88/0x130) > [] (__dev_change_flags+0x88/0x130) from [] (dev_change_flags+0x10/0x48) > [] (dev_change_flags+0x10/0x48) from [] (do_setlink+0x370/0x7ec) > [] (do_setlink+0x370/0x7ec) from [] (rtnl_newlink+0x2b4/0x450) > [] (rtnl_newlink+0x2b4/0x450) from [] (rtnetlink_rcv_msg+0x158/0x1f4) > [] (rtnetlink_rcv_msg+0x158/0x1f4) from [] (netlink_rcv_skb+0xac/0xc0) > [] (netlink_rcv_skb+0xac/0xc0) from [] (rtnetlink_rcv+0x18/0x24) > [] (rtnetlink_rcv+0x18/0x24) from [] (netlink_unicast+0x13c/0x198) > [] (netlink_unicast+0x13c/0x198) from [] (netlink_sendmsg+0x264/0x2e0) > [] (netlink_sendmsg+0x264/0x2e0) from [] (sock_sendmsg+0x78/0x98) > [] (sock_sendmsg+0x78/0x98) from [] (___sys_sendmsg.part.25+0x268/0x278) > [] (___sys_sendmsg.part.25+0x268/0x278) from [] (__sys_sendmsg+0x48/0x78) > [] (__sys_sendmsg+0x48/0x78) from [] (ret_fast_syscall+0x0/0x2c) This patch works around the problem by replacing yield() by msleep(1), giving the interrupt thread time to finish, similar to other changes contained in the rt patch set. Using wait_for_completion() instead would probably be a better solution. Signed-off-by: Marc Kleine-Budde --- Hello, this patch is intended for -rt, as the problem probably doesn't occur on non rt systems. However, I put netdev on Cc, as they probably can come up with a proper solution. regards, Marc net/sched/sch_generic.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c index a7f838b..d5b00ad 100644 --- a/net/sched/sch_generic.c +++ b/net/sched/sch_generic.c @@ -839,7 +839,7 @@ void dev_deactivate_many(struct list_head *head) /* Wait for outstanding qdisc_run calls. */ list_for_each_entry(dev, head, unreg_list) while (some_qdisc_is_busy(dev)) - yield(); + msleep(1) } void dev_deactivate(struct net_device *dev) -- 1.9.0