* [3.19-rc3] tg3: BUG: sleeping function called from invalid context
@ 2015-01-13 0:59 Peter Hurley
2015-01-13 2:30 ` Prashant Sreedharan
2015-01-13 6:49 ` Michael Chan
0 siblings, 2 replies; 7+ messages in thread
From: Peter Hurley @ 2015-01-13 0:59 UTC (permalink / raw)
To: Prashant Sreedharan, Michael Chan; +Cc: netdev, Linux kernel
On 3.19-rc3, I'm seeing this might_sleep() warning [1] from the tg3_open()
call stack. Let me know if I need to bisect this.
Regards,
Peter Hurley
[1]
[ 17.203009] BUG: sleeping function called from invalid context at /home/peter/src/kernels/mainline/kernel/irq/manage.c:104
[ 17.203067] in_atomic(): 1, irqs_disabled(): 0, pid: 1106, name: ip
[ 17.203092] 2 locks held by ip/1106:
[ 17.205255] #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff816adf1f>] rtnetlink_rcv+0x1f/0x40
[ 17.207445] #1: (&(&tp->lock)->rlock){+.....}, at: [<ffffffffa01073e6>] tg3_start+0xc06/0x11f0 [tg3]
[ 17.209725] CPU: 2 PID: 1106 Comm: ip Not tainted 3.19.0-rc3+wip-xeon+lockdep #rc3+wip
[ 17.211900] Hardware name: Dell Inc. Precision WorkStation T5400 /0RW203, BIOS A11 04/30/2012
[ 17.214086] 0000000000000068 ffff8802ac823498 ffffffff817af7e8 0000000000000005
[ 17.216265] ffffffff81a9be78 ffff8802ac8234a8 ffffffff810998a5 ffff8802ac8234d8
[ 17.218446] ffffffff8109991a ffff8802ac8234c8 ffff8802af0aae00 ffffffffa00ed000
[ 17.220636] Call Trace:
[ 17.222743] [<ffffffff817af7e8>] dump_stack+0x4f/0x7b
[ 17.224808] [<ffffffff810998a5>] ___might_sleep+0x105/0x140
[ 17.226842] [<ffffffff8109991a>] __might_sleep+0x3a/0xa0
[ 17.228869] [<ffffffffa00ed000>] ? 0xffffffffa00ed000
[ 17.230939] [<ffffffff810d7d78>] synchronize_irq+0x38/0xa0
[ 17.232967] [<ffffffffa00ed000>] ? 0xffffffffa00ed000
[ 17.234991] [<ffffffffa010105f>] tg3_chip_reset+0x13f/0x9c0 [tg3]
[ 17.236988] [<ffffffffa01020ae>] tg3_reset_hw+0x7e/0x2d20 [tg3]
[ 17.238996] [<ffffffff813bfaff>] ? __udelay+0x2f/0x40
[ 17.241007] [<ffffffffa00ef2f7>] ? _tw32_flush+0x47/0x80 [tg3]
[ 17.243066] [<ffffffffa0104dac>] tg3_init_hw+0x5c/0x70 [tg3]
[ 17.245438] [<ffffffffa010740b>] tg3_start+0xc2b/0x11f0 [tg3]
[ 17.247444] [<ffffffffa0107ad7>] ? tg3_open+0x107/0x2e0 [tg3]
[ 17.249556] [<ffffffff810c338d>] ? trace_hardirqs_on+0xd/0x10
[ 17.251581] [<ffffffff8107806f>] ? __local_bh_enable_ip+0x6f/0x100
[ 17.253710] [<ffffffffa0107af8>] tg3_open+0x128/0x2e0 [tg3]
[ 17.255758] [<ffffffff816ba3f5>] ? netpoll_poll_disable+0x5/0xa0
[ 17.257932] [<ffffffff816a14af>] __dev_open+0xbf/0x140
[ 17.260091] [<ffffffff816a17c1>] __dev_change_flags+0xa1/0x160
[ 17.262222] [<ffffffff816a18a9>] dev_change_flags+0x29/0x60
[ 17.264360] [<ffffffff816b0e02>] do_setlink+0x2f2/0xa30
[ 17.266431] [<ffffffff816b1b7f>] rtnl_newlink+0x51f/0x750
[ 17.268485] [<ffffffff816b1749>] ? rtnl_newlink+0xe9/0x750
[ 17.270483] [<ffffffff811869c2>] ? free_pages_prepare+0x1d2/0x270
[ 17.272507] [<ffffffff810c32bd>] ? trace_hardirqs_on_caller+0x11d/0x1e0
[ 17.274531] [<ffffffff813dd1b2>] ? nla_parse+0x32/0x120
[ 17.276531] [<ffffffff81021ab5>] ? native_sched_clock+0x35/0xa0
[ 17.278514] [<ffffffff816adfd5>] rtnetlink_rcv_msg+0x95/0x250
[ 17.280485] [<ffffffff8109f699>] ? preempt_count_sub+0x49/0x50
[ 17.282448] [<ffffffff817b4a02>] ? mutex_lock_nested+0x382/0x530
[ 17.284402] [<ffffffff816adf1f>] ? rtnetlink_rcv+0x1f/0x40
[ 17.286290] [<ffffffff816adf1f>] ? rtnetlink_rcv+0x1f/0x40
[ 17.288142] [<ffffffff816adf40>] ? rtnetlink_rcv+0x40/0x40
[ 17.290031] [<ffffffff816cedc1>] netlink_rcv_skb+0xc1/0xe0
[ 17.291836] [<ffffffff816adf2e>] rtnetlink_rcv+0x2e/0x40
[ 17.293615] [<ffffffff816ce473>] netlink_unicast+0xf3/0x1d0
[ 17.295420] [<ffffffff816ce863>] netlink_sendmsg+0x313/0x690
[ 17.297132] [<ffffffff811ada4f>] ? might_fault+0x5f/0xb0
[ 17.298799] [<ffffffff8168253c>] do_sock_sendmsg+0x8c/0x100
[ 17.300493] [<ffffffff81681e3e>] ? copy_msghdr_from_user+0x15e/0x1f0
[ 17.302173] [<ffffffff81682aeb>] ___sys_sendmsg+0x30b/0x320
[ 17.303798] [<ffffffff81021ab5>] ? native_sched_clock+0x35/0xa0
[ 17.305431] [<ffffffff810bdee0>] ? cpuacct_account_field+0x80/0xb0
[ 17.307085] [<ffffffff81021ab5>] ? native_sched_clock+0x35/0xa0
[ 17.308744] [<ffffffff810a4f35>] ? sched_clock_local+0x25/0x90
[ 17.310375] [<ffffffff810a5dc1>] ? vtime_account_user+0x91/0xa0
[ 17.311948] [<ffffffff810a5198>] ? sched_clock_cpu+0xb8/0xe0
[ 17.313509] [<ffffffff810bf8be>] ? put_lock_stats.isra.26+0xe/0x30
[ 17.315069] [<ffffffff810c007e>] ? lock_release_holdtime.part.27+0x12e/0x1b0
[ 17.316618] [<ffffffff810a5dc1>] ? vtime_account_user+0x91/0xa0
[ 17.318162] [<ffffffff8109f5d1>] ? get_parent_ip+0x11/0x50
[ 17.319703] [<ffffffff8109f699>] ? preempt_count_sub+0x49/0x50
[ 17.321235] [<ffffffff811807e5>] ? context_tracking_user_exit+0x55/0x130
[ 17.322732] [<ffffffff811807e5>] ? context_tracking_user_exit+0x55/0x130
[ 17.324197] [<ffffffff816834f2>] __sys_sendmsg+0x42/0x80
[ 17.325634] [<ffffffff81683542>] SyS_sendmsg+0x12/0x20
[ 17.327048] [<ffffffff817ba12d>] system_call_fastpath+0x16/0x1b
* Re: [3.19-rc3] tg3: BUG: sleeping function called from invalid context
2015-01-13 0:59 [3.19-rc3] tg3: BUG: sleeping function called from invalid context Peter Hurley
@ 2015-01-13 2:30 ` Prashant Sreedharan
2015-01-13 12:30 ` Peter Hurley
2015-01-14 16:06 ` Peter Hurley
2015-01-13 6:49 ` Michael Chan
1 sibling, 2 replies; 7+ messages in thread
From: Prashant Sreedharan @ 2015-01-13 2:30 UTC (permalink / raw)
To: Peter Hurley; +Cc: Michael Chan, netdev, Linux kernel

On Mon, 2015-01-12 at 19:59 -0500, Peter Hurley wrote:
> On 3.19-rc3, I'm seeing this might_sleep() warning [1] from the tg3_open()
> call stack. Let me know if I need to bisect this.
>
> Regards,
> Peter Hurley
>
> [1]
>
> [ 17.203009] BUG: sleeping function called from invalid context at /home/peter/src/kernels/mainline/kernel/irq/manage.c:104
> [ 17.203067] in_atomic(): 1, irqs_disabled(): 0, pid: 1106, name: ip
> [ 17.203092] 2 locks held by ip/1106:
> [ 17.205255] #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff816adf1f>] rtnetlink_rcv+0x1f/0x40
> [ 17.207445] #1: (&(&tp->lock)->rlock){+.....}, at: [<ffffffffa01073e6>] tg3_start+0xc06/0x11f0 [tg3]
> [...]

Please bisect, there hasn't been tg3 code changes in this path that
might cause this. It would help to know the commit changes that is
triggering the problem. Also could you provide the device details, from
syslog look for "Tigon3 [partno(BCMxxxxx) rev xxxxxxx]". Thanks.
* Re: [3.19-rc3] tg3: BUG: sleeping function called from invalid context
2015-01-13 2:30 ` Prashant Sreedharan
@ 2015-01-13 12:30 ` Peter Hurley
2015-01-14 16:06 ` Peter Hurley
1 sibling, 0 replies; 7+ messages in thread
From: Peter Hurley @ 2015-01-13 12:30 UTC (permalink / raw)
To: Prashant Sreedharan; +Cc: Michael Chan, netdev, Linux kernel

On 01/12/2015 09:30 PM, Prashant Sreedharan wrote:
> On Mon, 2015-01-12 at 19:59 -0500, Peter Hurley wrote:
>> On 3.19-rc3, I'm seeing this might_sleep() warning [1] from the tg3_open()
>> call stack. Let me know if I need to bisect this.
>>
>> Regards,
>> Peter Hurley
>>
>> [1]
>>
>> [ 17.203009] BUG: sleeping function called from invalid context at /home/peter/src/kernels/mainline/kernel/irq/manage.c:104
>> [ 17.203067] in_atomic(): 1, irqs_disabled(): 0, pid: 1106, name: ip
>> [ 17.203092] 2 locks held by ip/1106:
>> [ 17.205255] #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff816adf1f>] rtnetlink_rcv+0x1f/0x40
>> [ 17.207445] #1: (&(&tp->lock)->rlock){+.....}, at: [<ffffffffa01073e6>] tg3_start+0xc06/0x11f0 [tg3]
>> [...]
>
> Please bisect, there hasn't been tg3 code changes in this path that
> might cause this. It would help to know the commit changes that is
> triggering the problem.

Ok, will do.

> Also could you provide the device details, from
> syslog look for "Tigon3 [partno(BCMxxxxx) rev xxxxxxx]". Thanks.

[ 1.430884] tg3 0000:08:00.0 eth0: Tigon3 [partno(BCM95754) rev b002] (PCI Express) MAC address xx:xx:xx:xx:xx:xx
[ 1.431095] tg3 0000:08:00.0 eth0: attached PHY is 5787 (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[0])
[ 1.431295] tg3 0000:08:00.0 eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]
[ 1.431488] tg3 0000:08:00.0 eth0: dma_rwctrl[76180000] dma_mask[64-bit]

Regards,
Peter Hurley
* Re: [3.19-rc3] tg3: BUG: sleeping function called from invalid context
2015-01-13 2:30 ` Prashant Sreedharan
2015-01-13 12:30 ` Peter Hurley
@ 2015-01-14 16:06 ` Peter Hurley
1 sibling, 0 replies; 7+ messages in thread
From: Peter Hurley @ 2015-01-14 16:06 UTC (permalink / raw)
To: Prashant Sreedharan; +Cc: Michael Chan, netdev, Linux kernel

On 01/12/2015 09:30 PM, Prashant Sreedharan wrote:
> On Mon, 2015-01-12 at 19:59 -0500, Peter Hurley wrote:
>> On 3.19-rc3, I'm seeing this might_sleep() warning [1] from the tg3_open()
>> call stack. Let me know if I need to bisect this.
>>
>> Regards,
>> Peter Hurley
>>
>> [1]
>>
>> [ 17.203009] BUG: sleeping function called from invalid context at /home/peter/src/kernels/mainline/kernel/irq/manage.c:104
>> [ 17.203067] in_atomic(): 1, irqs_disabled(): 0, pid: 1106, name: ip
>> [ 17.203092] 2 locks held by ip/1106:
>> [ 17.205255] #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff816adf1f>] rtnetlink_rcv+0x1f/0x40
>> [ 17.207445] #1: (&(&tp->lock)->rlock){+.....}, at: [<ffffffffa01073e6>] tg3_start+0xc06/0x11f0 [tg3]
>> [...]
>
> Please bisect, there hasn't been tg3 code changes in this path that
> might cause this.

What triggers this is the new debugging code added to catch nested sleeps;
specifically e22b886 ("sched/wait: Add might_sleep() checks").

Regards,
Peter Hurley
* Re: [3.19-rc3] tg3: BUG: sleeping function called from invalid context
2015-01-13 0:59 [3.19-rc3] tg3: BUG: sleeping function called from invalid context Peter Hurley
2015-01-13 2:30 ` Prashant Sreedharan
@ 2015-01-13 6:49 ` Michael Chan
2015-01-13 12:47 ` Peter Hurley
1 sibling, 1 reply; 7+ messages in thread
From: Michael Chan @ 2015-01-13 6:49 UTC (permalink / raw)
To: Peter Hurley; +Cc: Prashant Sreedharan, netdev, Linux kernel

On Mon, 2015-01-12 at 19:59 -0500, Peter Hurley wrote:
> [ 17.203009] BUG: sleeping function called from invalid context at /home/peter/src/kernels/mainline/kernel/irq/manage.c:104
> [ 17.203067] in_atomic(): 1, irqs_disabled(): 0, pid: 1106, name: ip
> [ 17.203092] 2 locks held by ip/1106:
> [ 17.205255] #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff816adf1f>] rtnetlink_rcv+0x1f/0x40
> [ 17.207445] #1: (&(&tp->lock)->rlock){+.....}, at: [<ffffffffa01073e6>] tg3_start+0xc06/0x11f0 [tg3]
> [ 17.209725] CPU: 2 PID: 1106 Comm: ip Not tainted 3.19.0-rc3+wip-xeon+lockdep #rc3+wip
> [ 17.211900] Hardware name: Dell Inc. Precision WorkStation T5400 /0RW203, BIOS A11 04/30/2012
> [ 17.214086] 0000000000000068 ffff8802ac823498 ffffffff817af7e8 0000000000000005
> [ 17.216265] ffffffff81a9be78 ffff8802ac8234a8 ffffffff810998a5 ffff8802ac8234d8
> [ 17.218446] ffffffff8109991a ffff8802ac8234c8 ffff8802af0aae00 ffffffffa00ed000
> [ 17.220636] Call Trace:
> [ 17.222743] [<ffffffff817af7e8>] dump_stack+0x4f/0x7b
> [ 17.224808] [<ffffffff810998a5>] ___might_sleep+0x105/0x140
> [ 17.226842] [<ffffffff8109991a>] __might_sleep+0x3a/0xa0
> [ 17.228869] [<ffffffffa00ed000>] ? 0xffffffffa00ed000
> [ 17.230939] [<ffffffff810d7d78>] synchronize_irq+0x38/0xa0
> [ 17.232967] [<ffffffffa00ed000>] ? 0xffffffffa00ed000
> [ 17.234991] [<ffffffffa010105f>] tg3_chip_reset+0x13f/0x9c0 [tg3]
> [ 17.236988] [<ffffffffa01020ae>] tg3_reset_hw+0x7e/0x2d20 [tg3]

tp->lock is held in this code path. If synchronize_irq() sleeps in
wait_event(desc->wait_for_threads, ...), we'll get the warning.

The synchronize_irq() call is to wait for any tg3 irq handler to finish
so that it is guaranteed that next time it will see the CHIP_RESETTING
flag and do nothing.

Not sure if we can drop the tp->lock before we call synchronize_irq()
and then take it again after synchronize_irq().
* Re: [3.19-rc3] tg3: BUG: sleeping function called from invalid context
2015-01-13 6:49 ` Michael Chan
@ 2015-01-13 12:47 ` Peter Hurley
2015-01-13 17:25 ` Michael Chan
0 siblings, 1 reply; 7+ messages in thread
From: Peter Hurley @ 2015-01-13 12:47 UTC (permalink / raw)
To: Michael Chan; +Cc: Prashant Sreedharan, netdev, Linux kernel

On 01/13/2015 01:49 AM, Michael Chan wrote:
> On Mon, 2015-01-12 at 19:59 -0500, Peter Hurley wrote:
>> [ 17.203009] BUG: sleeping function called from invalid context at /home/peter/src/kernels/mainline/kernel/irq/manage.c:104
>> [ 17.203067] in_atomic(): 1, irqs_disabled(): 0, pid: 1106, name: ip
>> [ 17.203092] 2 locks held by ip/1106:
>> [ 17.205255] #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff816adf1f>] rtnetlink_rcv+0x1f/0x40
>> [ 17.207445] #1: (&(&tp->lock)->rlock){+.....}, at: [<ffffffffa01073e6>] tg3_start+0xc06/0x11f0 [tg3]
>> [ 17.209725] CPU: 2 PID: 1106 Comm: ip Not tainted 3.19.0-rc3+wip-xeon+lockdep #rc3+wip
>> [ 17.211900] Hardware name: Dell Inc. Precision WorkStation T5400 /0RW203, BIOS A11 04/30/2012
>> [ 17.214086] 0000000000000068 ffff8802ac823498 ffffffff817af7e8 0000000000000005
>> [ 17.216265] ffffffff81a9be78 ffff8802ac8234a8 ffffffff810998a5 ffff8802ac8234d8
>> [ 17.218446] ffffffff8109991a ffff8802ac8234c8 ffff8802af0aae00 ffffffffa00ed000
>> [ 17.220636] Call Trace:
>> [ 17.222743] [<ffffffff817af7e8>] dump_stack+0x4f/0x7b
>> [ 17.224808] [<ffffffff810998a5>] ___might_sleep+0x105/0x140
>> [ 17.226842] [<ffffffff8109991a>] __might_sleep+0x3a/0xa0
>> [ 17.228869] [<ffffffffa00ed000>] ? 0xffffffffa00ed000
>> [ 17.230939] [<ffffffff810d7d78>] synchronize_irq+0x38/0xa0
>> [ 17.232967] [<ffffffffa00ed000>] ? 0xffffffffa00ed000
>> [ 17.234991] [<ffffffffa010105f>] tg3_chip_reset+0x13f/0x9c0 [tg3]
>> [ 17.236988] [<ffffffffa01020ae>] tg3_reset_hw+0x7e/0x2d20 [tg3]
>
> tp->lock is held in this code path. If synchronize_irq() sleeps in
> wait_event(desc->wait_for_threads, ...), we'll get the warning.
>
> The synchronize_irq() call is to wait for any tg3 irq handler to finish
> so that it is guaranteed that next time it will see the CHIP_RESETTING
> flag and do nothing.
>
> Not sure if we can drop the tp->lock before we call synchronize_irq()
> and then take it again after synchronize_irq().

Well, this device [1] is using MSI (INTx disabled) so if the synchronize_irq()
is _only_ for the CHIP_RESETTING logic then it would seem ok to skip it (the
synchronize_irq()).

Regards,
Peter Hurley

[1] lspci -vv

08:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5754 Gigabit Ethernet PCI Express (rev 02)
	Subsystem: Dell Precision T5400
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 31
	Region 0: Memory at d3ff0000 (64-bit, non-prefetchable) [size=64K]
	Expansion ROM at <ignored> [disabled]
	Capabilities: [48] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=1 PME-
	Capabilities: [50] Vital Product Data
		Product Name: Broadcom NetLink Gigabit Ethernet Controller
		Read-only fields:
			[PN] Part number: BCM95754
			[EC] Engineering changes: 106679-15
			[SN] Serial number: 0123456789
			[MN] Manufacture ID: 31 34 65 34
			[RV] Reserved: checksum good, 30 byte(s) reserved
		Read/write fields:
			[YA] Asset tag: XYZ01234567
			[RW] Read-write area: 107 byte(s) free
		End
	Capabilities: [58] Vendor Specific Information: Len=78 <?>
	Capabilities: [e8] MSI: Enable+ Count=1/1 Maskable- 64bit+
		Address: 00000000fee0400c Data: 41a2
	Capabilities: [d0] Express (v1) Endpoint, MSI 00
		DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
		DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported-
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 128 bytes, MaxReadReq 512 bytes
		DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
		LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s, Exit Latency L0s <4us, L1 <64us
			ClockPM- Surprise- LLActRep- BwNot-
		LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
	Capabilities: [100 v1] Advanced Error Reporting
		UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta: RxErr+ BadTLP- BadDLLP+ Rollover- Timeout- NonFatalErr-
		CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
	Capabilities: [13c v1] Virtual Channel
		Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
		Arb: Fixed- WRR32- WRR64- WRR128-
		Ctrl: ArbSelect=Fixed
		Status: InProgress-
		VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
			Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
			Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
			Status: NegoPending- InProgress-
	Capabilities: [160 v1] Device Serial Number xx-xx-xx-xx-xx-xx-xx-xx
	Capabilities: [16c v1] Power Budgeting <?>
	Kernel driver in use: tg3
* Re: [3.19-rc3] tg3: BUG: sleeping function called from invalid context
2015-01-13 12:47 ` Peter Hurley
@ 2015-01-13 17:25 ` Michael Chan
0 siblings, 0 replies; 7+ messages in thread
From: Michael Chan @ 2015-01-13 17:25 UTC (permalink / raw)
To: Peter Hurley; +Cc: Prashant Sreedharan, netdev, Linux kernel

On Tue, 2015-01-13 at 07:47 -0500, Peter Hurley wrote:
> > tp->lock is held in this code path. If synchronize_irq() sleeps in
> > wait_event(desc->wait_for_threads, ...), we'll get the warning.
> >
> > The synchronize_irq() call is to wait for any tg3 irq handler to finish
> > so that it is guaranteed that next time it will see the CHIP_RESETTING
> > flag and do nothing.
> >
> > Not sure if we can drop the tp->lock before we call synchronize_irq()
> > and then take it again after synchronize_irq().
>
> Well, this device [1] is using MSI (INTx disabled) so if the synchronize_irq()
> is _only_ for the CHIP_RESETTING logic then it would seem ok to skip it (the
> synchronize_irq()).

It is only for INTx. But any device can operate in INTx mode if MSI/MSIX
is not available, so the fix needs to work in all cases.

Let me review the code some more. If we can guarantee that another
reset, the timer code, etc, cannot come in even if we drop the tp->lock,
the simplest fix will be to drop it before calling synchronize_irq().

Thanks.