From mboxrd@z Thu Jan 1 00:00:00 1970
From: Shawn Guo
Subject: Re: [PATCH v5 1/1 net] net: fec: fix kernel oops when plug/unplug cable many times
Date: Mon, 13 May 2013 13:07:43 +0800
Message-ID:
References: <1367971724-1974-1-git-send-email-Frank.Li@freescale.com> <20130513045710.GA4440@S2101-09.ap.freescale.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Cc: romieu@fr.zoreil.com, Robert Schwebel , David Miller , l.stach@pengutronix.de, Linux Netdev List , Fabio Estevam , lznuaa@gmail.com
To: Frank Li
Return-path:
Received: from mail-pa0-f48.google.com ([209.85.220.48]:61106 "EHLO mail-pa0-f48.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751308Ab3EMFHo (ORCPT ); Mon, 13 May 2013 01:07:44 -0400
Received: by mail-pa0-f48.google.com with SMTP id kp6so4298685pab.21 for ; Sun, 12 May 2013 22:07:43 -0700 (PDT)
In-Reply-To: <20130513045710.GA4440@S2101-09.ap.freescale.net>
Sender: netdev-owner@vger.kernel.org
List-ID:

On 13 May 2013 12:57, Shawn Guo wrote:
> Hi Frank,
>
> On Wed, May 08, 2013 at 08:08:44AM +0800, Frank Li wrote:
>> reproduce steps
>> 1. flood ping from other machine
>> ping -f -s 41000 IP
>> 2. run below script
>> while [ 1 ]; do ethtool -s eth0 autoneg off;
>> sleep 3;ethtool -s eth0 autoneg on; sleep 4; done;
>>
>> You can see oops in one hour.
>>
>> The reason is fec_restart clear BD but NAPI may use it.
>> The solution is disable NAPI and stop xmit when reset BD.
>> disable NAPI may sleep, so fec_restart can't be call in
>> atomic context.
>>
>> Signed-off-by: Frank Li
>> Reviewed-by: Lucas Stach
>> Tested-by: Lucas Stach
>
> The patch has landed on 3.10-rc1. Seems that it introduces a lock
> warning as below. Turn on CONFIG_PROVE_LOCKING and you will be able
> to see it.

The warning message looks a little different in one of my imx28 boot tests.

Shawn

[    4.749454] =================================
[    4.753829] [ INFO: inconsistent lock state ]
[    4.758207] 3.10.0-rc1-00013-gc5f5ad9 #77 Not tainted
[    4.763270] ---------------------------------
[    4.767641] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
[    4.773669] kworker/0:1/20 [HC0[0]:SC1[3]:HE1:SE0] takes:
[    4.779080]  (_xmit_ETHER#2){+.?...}, at: [] sch_direct_xmit+0x9c/0x2d0
[    4.786686] {SOFTIRQ-ON-W} state was registered at:
[    4.791579]   [] __lock_acquire+0x5fc/0x1a90
[    4.796767]   [] lock_acquire+0x9c/0x124
[    4.801598]   [] _raw_spin_lock+0x2c/0x3c
[    4.806532]   [] fec_restart+0x5bc/0x654
[    4.811370]   [] fec_enet_adjust_link+0x7c/0xb4
[    4.816806]   [] phy_state_machine+0x17c/0x394
[    4.822153]   [] process_one_work+0x1bc/0x4c4
[    4.827422]   [] worker_thread+0x134/0x3a0
[    4.832420]   [] kthread+0xa4/0xb0
[    4.836726]   [] ret_from_fork+0x14/0x34
[    4.841564] irq event stamp: 828
[    4.844806] hardirqs last enabled at (828): [] local_bh_enable_ip+0x84/0xf0
[    4.852773] hardirqs last disabled at (827): [] local_bh_enable_ip+0x44/0xf0
[    4.860727] softirqs last enabled at (808): [] __do_softirq+0x180/0x284
[    4.868333] softirqs last disabled at (811): [] irq_exit+0x9c/0xd8
[    4.875416]
[    4.875416] other info that might help us debug this:
[    4.881959]  Possible unsafe locking scenario:
[    4.881959]
[    4.887891]        CPU0
[    4.890345]        ----
[    4.892799]   lock(_xmit_ETHER#2);
[    4.896247]   <Interrupt>
[    4.898874]     lock(_xmit_ETHER#2);
[    4.902497]
[    4.902497]  *** DEADLOCK ***
[    4.902497]
[    4.908444] 7 locks held by kworker/0:1/20:
[    4.912637]  #0: (rpciod){.+.+.+}, at: [] process_one_work+0x138/0x4c4
[    4.920217]  #1: ((&(&transport->connect_worker)->work)){+.+.+.}, at: [] process_one_work+0x138/0x4c4
[    4.930482]  #2: (sk_lock-AF_INET-RPC){+.+...}, at: [] inet_stream_connect+0x20/0x48
[    4.939293]  #3: (rcu_read_lock){.+.+..}, at: [] ip_queue_xmit+0x0/0x3f8
[    4.947043]  #4: (rcu_read_lock){.+.+..}, at: [] __netif_receive_skb_core+0x34/0x5b0
[    4.955834]  #5: (rcu_read_lock){.+.+..}, at: [] neigh_update+0x2a4/0x50c
[    4.963667]  #6: (rcu_read_lock_bh){.+....}, at: [] dev_queue_xmit+0x0/0x684
[    4.971767]
[    4.971767] stack backtrace:
[    4.976162] CPU: 0 PID: 20 Comm: kworker/0:1 Not tainted 3.10.0-rc1-00013-gc5f5ad9 #77
[    4.984122] Workqueue: rpciod xs_tcp_setup_socket
[    4.988909] [] (unwind_backtrace+0x0/0xf0) from [] (show_stack+0x10/0x14)
[    4.997491] [] (show_stack+0x10/0x14) from [] (print_usage_bug.part.28+0x218/0x280)
[    5.006937] [] (print_usage_bug.part.28+0x218/0x280) from [] (mark_lock+0x528/0x668)
[    5.016461] [] (mark_lock+0x528/0x668) from [] (__lock_acquire+0x5b4/0x1a90)
[    5.025289] [] (__lock_acquire+0x5b4/0x1a90) from [] (lock_acquire+0x9c/0x124)
[    5.034298] [] (lock_acquire+0x9c/0x124) from [] (_raw_spin_lock+0x2c/0x3c)
[    5.043050] [] (_raw_spin_lock+0x2c/0x3c) from [] (sch_direct_xmit+0x9c/0x2d0)
[    5.052055] [] (sch_direct_xmit+0x9c/0x2d0) from [] (dev_queue_xmit+0x2a8/0x684)
[    5.061234] [] (dev_queue_xmit+0x2a8/0x684) from [] (neigh_update+0x214/0x50c)
[    5.070241] [] (neigh_update+0x214/0x50c) from [] (arp_process+0x264/0x62c)
[    5.078985] [] (arp_process+0x264/0x62c) from [] (__netif_receive_skb_core+0x298/0x5b0)
[    5.088770] [] (__netif_receive_skb_core+0x298/0x5b0) from [] (napi_gro_receive+0x74/0xa0)
[    5.098824] [] (napi_gro_receive+0x74/0xa0) from [] (fec_enet_rx_napi+0x250/0x5d8)
[    5.108182] [] (fec_enet_rx_napi+0x250/0x5d8) from [] (net_rx_action+0xc0/0x238)
[    5.117364] [] (net_rx_action+0xc0/0x238) from [] (__do_softirq+0xec/0x284)
[    5.126110] [] (__do_softirq+0xec/0x284) from [] (irq_exit+0x9c/0xd8)
[    5.134335] [] (irq_exit+0x9c/0xd8) from [] (handle_IRQ+0x34/0x84)
[    5.142296] [] (handle_IRQ+0x34/0x84) from [] (icoll_handle_irq+0x34/0x4c)
[    5.150949] [] (icoll_handle_irq+0x34/0x4c) from [] (__irq_svc+0x44/0x54)
[    5.159494] Exception stack(0xc7561c78 to 0xc7561cc0)
[    5.164571] 1c60:                                                       00000001 c7520338
[    5.172783] 1c80: 00000000 20000013 c7560000 c03d3e84 c76d6d00 c7736400 c77582c0 00000000
[    5.180994] 1ca0: 00000000 00000000 04228060 c7561cc0 c005c944 c002307c 20000013 ffffffff
[    5.189217] [] (__irq_svc+0x44/0x54) from [] (local_bh_enable+0x90/0xf0)
[    5.197700] [] (local_bh_enable+0x90/0xf0) from [] (ip_finish_output+0x21c/0x504)
[    5.206960] [] (ip_finish_output+0x21c/0x504) from [] (ip_local_out+0x30/0x94)
[    5.215955] [] (ip_local_out+0x30/0x94) from [] (ip_queue_xmit+0x144/0x3f8)
[    5.224707] [] (ip_queue_xmit+0x144/0x3f8) from [] (tcp_transmit_skb+0x3cc/0x80c)
[    5.233981] [] (tcp_transmit_skb+0x3cc/0x80c) from [] (tcp_connect+0x57c/0x5d4)
[    5.243081] [] (tcp_connect+0x57c/0x5d4) from [] (tcp_v4_connect+0x288/0x3a8)
[    5.252010] [] (tcp_v4_connect+0x288/0x3a8) from [] (__inet_stream_connect+0x258/0x2dc)
[    5.261799] [] (__inet_stream_connect+0x258/0x2dc) from [] (inet_stream_connect+0x34/0x48)
[    5.271863] [] (inet_stream_connect+0x34/0x48) from [] (kernel_connect+0x10/0x14)
[    5.281139] [] (kernel_connect+0x10/0x14) from [] (xs_tcp_setup_socket+0x10c/0x360)
[    5.290587] [] (xs_tcp_setup_socket+0x10c/0x360) from [] (process_one_work+0x1bc/0x4c4)
[    5.300372] [] (process_one_work+0x1bc/0x4c4) from [] (worker_thread+0x134/0x3a0)
[    5.309632] [] (worker_thread+0x134/0x3a0) from [] (kthread+0xa4/0xb0)
[    5.317944] [] (kthread+0xa4/0xb0) from [] (ret_from_fork+0x14/0x34)