* Kernel oops when clearing bgp neighbor info with TCP MD5SUM enabled
@ 2009-10-08 22:19 Anirban Sinha
2009-10-08 22:54 ` David Miller
0 siblings, 1 reply; 15+ messages in thread
From: Anirban Sinha @ 2009-10-08 22:19 UTC (permalink / raw)
To: netdev; +Cc: davem, asinha
Hi All;
We are noticing a kernel OOPS on 2.6.26 kernel when we issue the command
"clear ip bgp <bgp-peer-ip>" on Quagga BGP routing software. Also we are
seeing a very similar crash when we set and then reset the local TCP MD5
password. I am attaching the backtrace. Has anyone else seen this, and is it a
known issue? We are running our kernel on MIPS hardware.
Thanks,
Ani
# [23:10:35.108808] Kernel bug detected[#1]:
[23:10:35.112527] Cpu 0
[23:10:35.114676] $ 0 : 0000000000000000 0000000014001fe0
0000000000000066 0000000000000004
[23:10:35.122845] $ 4 : ffffffff80516c10 0000000014001fe0
ffffffff8050c010 0000000000000004
[23:10:35.131015] $ 8 : 0000000000000000 0000000000000041
ffffffff805142e8 0000000000000001
[23:10:35.139184] $12 : ffffffff80600000 ffffffff805f0000
0000000000000064 0000000000000190
[23:10:35.147354] $16 : 0000000000000102 ffffffff803afdf0
ffffffff80539040 ffffffff80600780
[23:10:35.155526] $20 : ffffffff80540000 0000000000200200
ffffffff804c0000 000000000000000a
[23:10:35.163695] $24 : a3d70a3d70a3d70b 8000000000000003
[23:10:35.171865] $28 : ffffffff8050c000 ffffffff8050fd90
9000000010030000 ffffffff801487a8
[23:10:35.180035] Hi : 0000000000000000
[23:10:35.183819] Lo : 0000000000000000
[23:10:35.187603] epc : ffffffff801487a8 run_timer_softirq+0x198/0x258
Tainted: P
[23:10:35.196032] ra : ffffffff801487a8 run_timer_softirq+0x198/0x258
[23:10:35.202395] Status: 14001fe3 KX SX UX KERNEL EXL IE
[23:10:35.207814] Cause : 00808024
[23:10:35.210911] PrId : 01041100 (SiByte SB1A)
[23:10:35.215209] Modules linked in: xt_state ipt_REJECT iptable_filter
nf_conntrack_ftp ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4
ip_tables ebtable_filter ebtables bridge llc zeug_ipmcdrv(P) irqdisp(P)
zvirt(P) zeugmod(P) softdog
[23:10:35.236024] Process swapper (pid: 0, threadinfo=ffffffff8050c000,
task=ffffffff805142e8, tls=0000000000000000)
[23:10:35.246169] Stack : ffffffff8050fd90 ffffffff8050fd90
0000000014001fe0 ffffffff805ff3e0
[23:10:35.254166] ffffffff806003c4 0000000000000001
ffffffff8053f650 ffffffff805706d0
[23:10:35.262337] ffffffff80572020 ffffffff80142280
ffffffff806003c0 0000000000000000
[23:10:35.270507] 0000000014001fe0 000000000000c5b0
ffffffff8fefc520 ffffffff8feea52c
[23:10:35.278676] 0000000000000015 0000000000004460
0000000000000940 ffffffff8fe1bf00
[23:10:35.286846] ffffffff8fffdab0 ffffffff80142410
0000000000000000 ffffffff80142778
[23:10:35.295017] ffffffff80103d20 ffffffff80103d20
0000000000000000 0000000014001fe1
[23:10:35.303187] 0000000000040000 ffffffff8050c010
0000000000000000 a80000017f87c138
[23:10:35.311357] 0000000014001fe0 ffffffffffff00fe
0000000000000004 a80000017e7e0680
[23:10:35.319528] 0000000000000000 000000000000001d
ffffffff8050ffe0 0000000000001f00
[23:10:35.327696] ...
[23:10:35.330536] Call Trace:
[23:10:35.333201] [<ffffffff801487a8>] run_timer_softirq+0x198/0x258
[23:10:35.339224] [<ffffffff80142280>] __do_softirq+0x198/0x288
[23:10:35.344812] [<ffffffff80142410>] do_softirq+0xa0/0xa8
[23:10:35.350057] [<ffffffff80142778>] irq_exit+0x70/0x88
[23:10:35.355131] [<ffffffff80103d20>] ret_from_irq+0x0/0x4
[23:10:35.360377] [<ffffffff801063f4>] cpu_idle+0x1c/0x88
[23:10:35.365455]
[23:10:35.367171]
[23:10:35.367174] Code: 0040382d 0c04ef4c 00000000 <0200000d> 0c10ee9c
0260202d dfa60000 17a6ffe5 00000000
[23:10:35.378822] Kernel panic - not syncing: Fatal exception in
interrupt
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Kernel oops when clearing bgp neighbor info with TCP MD5SUM enabled
2009-10-08 22:19 Kernel oops when clearing bgp neighbor info with TCP MD5SUM enabled Anirban Sinha
@ 2009-10-08 22:54 ` David Miller
2009-10-08 23:33 ` Anirban Sinha
0 siblings, 1 reply; 15+ messages in thread
From: David Miller @ 2009-10-08 22:54 UTC (permalink / raw)
To: asinha; +Cc: netdev
From: Anirban Sinha <asinha@zeugmasystems.com>
Date: Thu, 8 Oct 2009 15:19:48 -0700 (PDT)
> We are noticing a kernel OOPS on 2.6.26 kernel when we issue the command
> "clear ip bgp <bgp-peer-ip>" on Quagga BGP routing software.
You will need to update your kernel, there have been many TCP
MD5 bug fixes since 2.6.26
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Kernel oops when clearing bgp neighbor info with TCP MD5SUM enabled
2009-10-08 22:54 ` David Miller
@ 2009-10-08 23:33 ` Anirban Sinha
2009-10-09 0:57 ` David Miller
0 siblings, 1 reply; 15+ messages in thread
From: Anirban Sinha @ 2009-10-08 23:33 UTC (permalink / raw)
To: David Miller; +Cc: netdev
Hi:
Thanks for responding.
> > We are noticing a kernel OOPS on 2.6.26 kernel when we issue the command
> > "clear ip bgp <bgp-peer-ip>" on Quagga BGP routing software.
>
> You will need to update your kernel, there have been many TCP
> MD5 bug fixes since 2.6.26
>
Sigh ... wish that were that easy! Anyway, as far as I could, I have tried to
apply the upstream patches that seemed relevant to TCP MD5SUM. Am I missing
some other patches? It would be great if someone could point me to any patch
related to the TCP MD5SUM support that I might be missing.
I applied the following patches:
(a)
author Adam Langley <agl@imperialviolet.org>
Sat, 19 Jul 2008 07:01:42 +0000 (00:01 -0700)
committer David S. Miller <davem@davemloft.net>
Sat, 19 Jul 2008 07:01:42 +0000 (00:01 -0700)
commit 49a72dfb8814c2d65bd9f8c9c6daf6395a1ec58d
tree 38804d609f21503573bbdd8bb9af38df99275ff5 tree | snapshot
parent 845525a642c1c9e1335c33a274d4273906ee58eb commit | diff
tcp: Fix MD5 signatures for non-linear skbs
Currently, the MD5 code assumes that the SKBs are linear and, in the case
that they aren't, happily goes off and hashes off the end of the SKB and
into random memory.
Reported by Stephen Hemminger in [1]. Advice thanks to Stephen and Evgeniy
Polyakov. Also includes a couple of missed route_caps from Stephen's patch
in [2].
[1] http://marc.info/?l=linux-netdev&m=121445989106145&w=2
[2] http://marc.info/?l=linux-netdev&m=121459157816964&w=2
Signed-off-by: Adam Langley <agl@imperialviolet.org>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(b)
author YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Fri, 18 Apr 2008 03:45:16 +0000 (12:45 +0900)
committer YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Wed, 11 Jun 2008 18:46:30 +0000 (03:46 +0900)
commit 9501f9722922f2e80e1f9dc6682311d65c2b5690
tree ca8195e04ea63e8273801030ce26527fe5a8a7c7 tree | snapshot
parent 8d26d76dd4a4c87ef037a44a42a0608ffc730199 commit | diff
tcp md5sig: Let the caller pass appropriate key for
tcp_v{4,6}_do_calc_md5_hash().
As we do for other socket/timewait-socket specific parameters,
let the callers pass appropriate arguments to
tcp_v{4,6}_do_calc_md5_hash().
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
(c)
author YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Thu, 17 Apr 2008 04:19:16 +0000 (13:19 +0900)
committer YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Wed, 11 Jun 2008 17:38:20 +0000 (02:38 +0900)
commit 8d26d76dd4a4c87ef037a44a42a0608ffc730199
tree 884ff53a83e460aa3f1837cc336a5a34f364156e tree | snapshot
parent 076fb7223357769c39f3ddf900bba6752369c76a commit | diff
tcp md5sig: Share most of hash calcucaltion bits between IPv4 and IPv6.
We can share most part of the hash calculation code because
the only difference between IPv4 and IPv6 is their pseudo headers.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
(d)
author YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Thu, 17 Apr 2008 03:48:12 +0000 (12:48 +0900)
committer YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Wed, 11 Jun 2008 17:38:19 +0000 (02:38 +0900)
commit 076fb7223357769c39f3ddf900bba6752369c76a
tree db75c2af3bf71cda4d0cccd6ebcfa8d1a62c3620 tree | snapshot
parent 7d5d5525bd88313e6fd90c0659665aee5114bc2d commit | diff
tcp md5sig: Remove redundant protocol argument.
Protocol is always TCP, so remove useless protocol argument.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
(e)
author YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Thu, 17 Apr 2008 03:29:53 +0000 (12:29 +0900)
committer YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Wed, 11 Jun 2008 17:38:18 +0000 (02:38 +0900)
commit 7d5d5525bd88313e6fd90c0659665aee5114bc2d
tree 41517e753220261c8cc46d975977cfd711892f6c tree | snapshot
parent 81b302a321a0d99ff172b8cb2a8de17bff2f9499 commit | diff
tcp md5sig: Share MD5 Signature option parser between IPv4 and IPv6.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Cheers,
Ani
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Kernel oops when clearing bgp neighbor info with TCP MD5SUM enabled
2009-10-08 23:33 ` Anirban Sinha
@ 2009-10-09 0:57 ` David Miller
2009-10-17 17:57 ` Anirban Sinha
0 siblings, 1 reply; 15+ messages in thread
From: David Miller @ 2009-10-09 0:57 UTC (permalink / raw)
To: asinha; +Cc: netdev
From: Anirban Sinha <asinha@zeugmasystems.com>
Date: Thu, 8 Oct 2009 16:33:47 -0700 (PDT)
> Hi:
>
> Thanks for responding.
>
>> > We are noticing a kernel OOPS on 2.6.26 kernel when we issue the command
>> > "clear ip bgp <bgp-peer-ip>" on Quagga BGP routing software.
>>
>> You will need to update your kernel, there have been many TCP
>> MD5 bug fixes since 2.6.26
>>
>
> Sigh ... wish that were that easy!
Contact your vendor for support :-)
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Kernel oops when clearing bgp neighbor info with TCP MD5SUM enabled
2009-10-09 0:57 ` David Miller
@ 2009-10-17 17:57 ` Anirban Sinha
2009-10-18 2:35 ` Anirban Sinha
0 siblings, 1 reply; 15+ messages in thread
From: Anirban Sinha @ 2009-10-17 17:57 UTC (permalink / raw)
To: David Miller; +Cc: netdev, linux-kernel, ani
On Thu, 8 Oct 2009, David Miller wrote:
> >> > We are noticing a kernel OOPS on 2.6.26 kernel when we issue the command
> >> > "clear ip bgp <bgp-peer-ip>" on Quagga BGP routing software.
> >>
> >> You will need to update your kernel, there have been many TCP
> >> MD5 bug fixes since 2.6.26
> >>
> >
> > Sigh ... wish that were that easy!
>
> Contact your vendor for support :-)
But we *are* the vendor for our distribution, and even though I may not be
a networking guru, I know a little about working my way through the
kernel code. I have traced down the cause of the BUG(), and when I did a git
pull against Linus' tree, I see the same issue in the git tip as well.
The BUG() is triggered from kernel/timer.c, line 1037 within the
function __run_timers(). I am reporting these line numbers from the latest git
tip.
What happens is that before and after the callback, the code grabs the preempt
count and catches unbalanced preempt_enable() and preempt_disable() calls from
within the callback function. In this case, the callback function is
inet_twdr_hangman() as can be seen from this instrumented log:
[02:15:15.941981] Kernel panic - not syncing: <3>huh, entered ffffffff803fbd60
(inet_twdr_hangman+0x0/0xe0)with preempt_count 00000102, exited with 00000101?
Clearly there is an extra unbalanced preempt_enable() somewhere within the
callback function.
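For readers following along, the check described above can be modelled in user space. This is only a sketch under stated assumptions: preempt_count here is a plain int, the model_* helpers and run_timer_checked() are hypothetical names, and the real kernel calls BUG() where this model returns -1.

```c
#include <assert.h>
#include <stdio.h>

/* User-space stand-in for the kernel's preempt_count (assumption). */
static int preempt_count;

static void model_preempt_disable(void) { preempt_count++; }
static void model_preempt_enable(void)  { preempt_count--; }

/* A well-behaved callback: every disable is paired with an enable. */
static void balanced_callback(void)
{
	model_preempt_disable();
	model_preempt_enable();
}

/* A buggy callback: one enable with no matching disable. */
static void unbalanced_callback(void)
{
	model_preempt_enable();
}

/*
 * Same shape as the check described above: sample the count, run the
 * callback, compare. Returns 0 if balanced, -1 otherwise. The count is
 * restored on mismatch only so that this model can keep running.
 */
static int run_timer_checked(void (*fn)(void))
{
	int before = preempt_count;

	assert(fn != NULL);
	fn();
	if (before != preempt_count) {
		fprintf(stderr,
			"entered with preempt_count %08x, exited with %08x\n",
			(unsigned)before, (unsigned)preempt_count);
		preempt_count = before;
		return -1;
	}
	return 0;
}
```

Starting from a count of 0x102, running unbalanced_callback through run_timer_checked() reports an exit value of 0x101, which is the same pattern as the instrumented log.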
When I looked at the function, I see that in net/ipv4/inet_timewait_sock.c
line 215, the function calls schedule_work(). schedule_work() calls
queue_work() which in turn calls put_cpu() that ultimately does a
preempt_enable(). It is this unbalanced preempt_enable() that decrements the
preempt_count by one as can be seen from the above trace.
I suspect that workqueue-related operations are illegal from a timer callback
function. In that case, the above-mentioned callback function needs to be
fixed.
Yes, I can't explain why others are not seeing the same crash. We
don't have the luxury of pulling in the latest and greatest kernel from the
git tree every time an update is made and trying it out, so I am unable to
reproduce the issue with the latest kernel. But if that means that this issue
should be ignored, then that is fine by me. We will fix our private kernel
with an appropriate patch as we continue to investigate.
Cheers,
Ani
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Kernel oops when clearing bgp neighbor info with TCP MD5SUM enabled
2009-10-17 17:57 ` Anirban Sinha
@ 2009-10-18 2:35 ` Anirban Sinha
2009-10-18 20:19 ` Anirban Sinha
0 siblings, 1 reply; 15+ messages in thread
From: Anirban Sinha @ 2009-10-18 2:35 UTC (permalink / raw)
To: linux-kernel; +Cc: David Miller, netdev, Anirban Sinha
Once upon a time, like on 09-10-17 10:57 AM, Anirban Sinha wrote:
> On Thu, 8 Oct 2009, David Miller wrote:
>
>>>>> We are noticing a kernel OOPS on 2.6.26 kernel when we issue the command
>>>>> "clear ip bgp <bgp-peer-ip>" on Quagga BGP routing software.
and btw, this is the crash (on mips) we are talking about:
# [23:10:35.108808] Kernel bug detected[#1]:
[23:10:35.112527] Cpu 0
[23:10:35.114676] $ 0 : 0000000000000000 0000000014001fe0
0000000000000066 0000000000000004
[23:10:35.122845] $ 4 : ffffffff80516c10 0000000014001fe0
ffffffff8050c010 0000000000000004
[23:10:35.131015] $ 8 : 0000000000000000 0000000000000041
ffffffff805142e8 0000000000000001
[23:10:35.139184] $12 : ffffffff80600000 ffffffff805f0000
0000000000000064 0000000000000190
[23:10:35.147354] $16 : 0000000000000102 ffffffff803afdf0
ffffffff80539040 ffffffff80600780
[23:10:35.155526] $20 : ffffffff80540000 0000000000200200
ffffffff804c0000 000000000000000a
[23:10:35.163695] $24 : a3d70a3d70a3d70b 8000000000000003
[23:10:35.171865] $28 : ffffffff8050c000 ffffffff8050fd90
9000000010030000 ffffffff801487a8
[23:10:35.180035] Hi : 0000000000000000
[23:10:35.183819] Lo : 0000000000000000
[23:10:35.187603] epc : ffffffff801487a8 run_timer_softirq+0x198/0x258
Tainted: P
[23:10:35.196032] ra : ffffffff801487a8 run_timer_softirq+0x198/0x258
[23:10:35.202395] Status: 14001fe3 KX SX UX KERNEL EXL IE
[23:10:35.207814] Cause : 00808024
[23:10:35.210911] PrId : 01041100 (SiByte SB1A)
[23:10:35.215209] Modules linked in: xt_state ipt_REJECT iptable_filter
nf_conntrack_ftp ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4
ip_tables ebtable_filter ebtables bridge llc zeug_ipmcdrv(P) irqdisp(P)
zvirt(P) zeugmod(P) softdog
[23:10:35.236024] Process swapper (pid: 0, threadinfo=ffffffff8050c000,
task=ffffffff805142e8, tls=0000000000000000)
[23:10:35.246169] Stack : ffffffff8050fd90 ffffffff8050fd90
0000000014001fe0 ffffffff805ff3e0
[23:10:35.254166] ffffffff806003c4 0000000000000001
ffffffff8053f650 ffffffff805706d0
[23:10:35.262337] ffffffff80572020 ffffffff80142280
ffffffff806003c0 0000000000000000
[23:10:35.270507] 0000000014001fe0 000000000000c5b0
ffffffff8fefc520 ffffffff8feea52c
[23:10:35.278676] 0000000000000015 0000000000004460
0000000000000940 ffffffff8fe1bf00
[23:10:35.286846] ffffffff8fffdab0 ffffffff80142410
0000000000000000 ffffffff80142778
[23:10:35.295017] ffffffff80103d20 ffffffff80103d20
0000000000000000 0000000014001fe1
[23:10:35.303187] 0000000000040000 ffffffff8050c010
0000000000000000 a80000017f87c138
[23:10:35.311357] 0000000014001fe0 ffffffffffff00fe
0000000000000004 a80000017e7e0680
[23:10:35.319528] 0000000000000000 000000000000001d
ffffffff8050ffe0 0000000000001f00
[23:10:35.327696] ...
[23:10:35.330536] Call Trace:
[23:10:35.333201] [<ffffffff801487a8>] run_timer_softirq+0x198/0x258
[23:10:35.339224] [<ffffffff80142280>] __do_softirq+0x198/0x288
[23:10:35.344812] [<ffffffff80142410>] do_softirq+0xa0/0xa8
[23:10:35.350057] [<ffffffff80142778>] irq_exit+0x70/0x88
[23:10:35.355131] [<ffffffff80103d20>] ret_from_irq+0x0/0x4
[23:10:35.360377] [<ffffffff801063f4>] cpu_idle+0x1c/0x88
[23:10:35.365455]
[23:10:35.367171]
[23:10:35.367174] Code: 0040382d 0c04ef4c 00000000 <0200000d> 0c10ee9c
0260202d dfa60000 17a6ffe5 00000000
[23:10:35.378822] Kernel panic - not syncing: Fatal exception in
interrupt
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Kernel oops when clearing bgp neighbor info with TCP MD5SUM enabled
2009-10-18 2:35 ` Anirban Sinha
@ 2009-10-18 20:19 ` Anirban Sinha
2009-10-19 12:13 ` Oleg Nesterov
0 siblings, 1 reply; 15+ messages in thread
From: Anirban Sinha @ 2009-10-18 20:19 UTC (permalink / raw)
To: linux-kernel, Oleg Nesterov; +Cc: David Miller, netdev, Anirban Sinha
Hi Oleg:
I have a question for you. The queue_work() routine, which is called from
schedule_work(), does a put_cpu(), which in turn does a preempt_enable(). Is
this an attempt to trigger the scheduler? One of the side effects of this
preempt_enable() is the crash that we see below. What is happening is that a
timer callback routine, in this case inet_twdr_hangman(), attempts a bunch of
cleanup until a threshold is reached. If further cleanup needs to be done
beyond the threshold, it queues a work function. Now when the timer callback
is run in __run_timers(), the routine grabs the value of preempt_count before
and after the callback function call. If the two counts do not match, it calls
BUG() (line 1037 in kernel/timer.c). Is it illegal to schedule a work function
from within a timer callback? What would be a good solution? I have already
posted to netdev, but since workqueues and timers are general kernel
infrastructure, I thought I might as well post the question to the main linux
mailing list and to you.
Here's the output from my instrumented BUG() call:
[02:15:15.941981] Kernel panic - not syncing: <3>huh, entered ffffffff803fbd60
(inet_twdr_hangman+0x0/0xe0)with preempt_count 00000102, exited with 00000101?
I was thinking of a hacky solution, to replace schedule_work() with schedule_delayed_work() just to get around the issue. But I am sure this is just too hacky and probably not the ideal solution ...
Cheers,
Ani
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Kernel oops when clearing bgp neighbor info with TCP MD5SUM enabled
2009-10-18 20:19 ` Anirban Sinha
@ 2009-10-19 12:13 ` Oleg Nesterov
2009-10-19 15:32 ` Anirban Sinha
2009-10-20 0:56 ` Anirban Sinha
0 siblings, 2 replies; 15+ messages in thread
From: Oleg Nesterov @ 2009-10-19 12:13 UTC (permalink / raw)
To: Anirban Sinha; +Cc: linux-kernel, David Miller, netdev, Anirban Sinha
Hi Anirban,
On 10/18, Anirban Sinha wrote:
>
> I have a question for you. The queue_work() routine which is called from
> schedule_work() does a put_cpu() which in turn does a preempt_enable(). Is
> this an attempt to trigger the scheduler?
No. please note that queue_work() does get_cpu() + put_cpu() to protect
against cpu_down() in between.
This can trigger the scheduler of course, but everything should be OK.
> One of the side effects of
> this preempt_enable() is the crash that we see below. What is happening
> is that a timer callback routine, in this case inet_twdr_hangman(),
> tries a bunch of cleanup until a threshold is reached. If further cleanups
> needs to be done beyond the threshold, it queues a work function. Now when
> the timer callback is run in __run_timers(), the routine grabs the value
> of preempt_count before and after the callback function call. If the two
> counts do not match, it calls BUG() (line 1037 in kernel/timer.c).
Yes, but I can't see how queue_work() can be involved, it doesn't change
->preempt_count. Note again it does put after get.
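The get-then-put pairing can be modelled in user space as follows. This is a rough sketch, not the workqueue code itself: the model_* names and the current_cpu variable are hypothetical, and the counter is a plain int standing in for preempt_count.

```c
#include <assert.h>
#include <stddef.h>

static int preempt_count; /* stand-in for the kernel counter (assumption) */
static int current_cpu;   /* hypothetical: the CPU this model "runs" on */

/* Like get_cpu(): disable preemption, then report the current CPU. */
static int model_get_cpu(void)
{
	preempt_count++;
	return current_cpu;
}

/* Like put_cpu(): re-enable preemption. */
static void model_put_cpu(void)
{
	preempt_count--;
}

/*
 * Shape of the queueing path as described: pin the CPU, queue the work
 * on it, unpin. The increment and decrement cancel out, so the caller
 * sees the same count before and after.
 */
static int model_queue_work(int *queued_on)
{
	int cpu;

	assert(queued_on != NULL);
	cpu = model_get_cpu();
	*queued_on = cpu; /* record which CPU the work was queued on */
	model_put_cpu();
	return 1;
}
```

Because the pair is balanced, a timer callback that calls model_queue_work() leaves the count exactly where it found it, so the imbalance has to come from elsewhere.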
> Is
> it illegal to schedule a work function from within a timer callback?
Yes sure.
I'd suppose that this imbalance comes from the inet_twdr_hangman() paths.
Could you verify this?
Oleg.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Kernel oops when clearing bgp neighbor info with TCP MD5SUM enabled
2009-10-19 12:13 ` Oleg Nesterov
@ 2009-10-19 15:32 ` Anirban Sinha
2009-10-19 15:36 ` Oleg Nesterov
2009-10-20 0:56 ` Anirban Sinha
1 sibling, 1 reply; 15+ messages in thread
From: Anirban Sinha @ 2009-10-19 15:32 UTC (permalink / raw)
To: Oleg Nesterov; +Cc: linux-kernel, David Miller, netdev, Anirban Sinha
Once upon a time, like on 09-10-19 5:13 AM, Oleg Nesterov wrote:
> Hi Anirban,
>
> On 10/18, Anirban Sinha wrote:
>>
>> I have a question for you. The queue_work() routine which is called from
>> schedule_work() does a put_cpu() which in turn does a preempt_enable(). Is
>> this an attempt to trigger the scheduler?
>
> No. please note that queue_work() does get_cpu() + put_cpu() to protect
> against cpu_down() in between.
grrr! Ah yes, my eyes failed me (or it saw what I wanted it to see :)). You do have a get_cpu() and put_cpu() together in the same code path. I guess I will have to keep looking at inet_twdr_hangman().
>> Is
>> it illegal to schedule a work function from within a timer callback?
>
> Yes sure.
Hmm. Maybe in that case, that function needs to be rewritten.
> I'd suppose that this imbalance comes from the inet_twdr_hangman() paths.
>
> Could you verify this?
I'll keep looking. Thanks for the help Oleg.
Ani
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Kernel oops when clearing bgp neighbor info with TCP MD5SUM enabled
2009-10-19 15:32 ` Anirban Sinha
@ 2009-10-19 15:36 ` Oleg Nesterov
2009-10-19 16:01 ` Anirban Sinha
0 siblings, 1 reply; 15+ messages in thread
From: Oleg Nesterov @ 2009-10-19 15:36 UTC (permalink / raw)
To: Anirban Sinha; +Cc: linux-kernel, David Miller, netdev, Anirban Sinha
On 10/19, Anirban Sinha wrote:
>
> Once upon a time, like on 09-10-19 5:13 AM, Oleg Nesterov wrote:
>
> >> Is
> >> it illegal to schedule a work function from within a timer callback?
> >
> > Yes sure.
>
> Hmm. Maybe in that case, that function needs to be rewritten.
OOPS!!!! I misread your question, didn't notice "il" above...
I meant: yes sure it _is legal_ to schedule a work from within a timer
callback (in fact it is legal from any context).
Sorry for confusion.
Oleg.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Kernel oops when clearing bgp neighbor info with TCP MD5SUM enabled
2009-10-19 15:36 ` Oleg Nesterov
@ 2009-10-19 16:01 ` Anirban Sinha
0 siblings, 0 replies; 15+ messages in thread
From: Anirban Sinha @ 2009-10-19 16:01 UTC (permalink / raw)
To: Oleg Nesterov; +Cc: linux-kernel, David Miller, netdev, Anirban Sinha
> I meant: yes sure it _is legal_ to schedule a work from within a timer
> callback (in fact it is legal from any context).
ok, then that part of the function looks fine.
Ani
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: Kernel oops when clearing bgp neighbor info with TCP MD5SUM enabled
2009-10-19 12:13 ` Oleg Nesterov
2009-10-19 15:32 ` Anirban Sinha
@ 2009-10-20 0:56 ` Anirban Sinha
2009-10-20 1:08 ` [PATCH] " Anirban Sinha
1 sibling, 1 reply; 15+ messages in thread
From: Anirban Sinha @ 2009-10-20 0:56 UTC (permalink / raw)
To: Oleg Nesterov; +Cc: Anirban Sinha, linux-kernel, David Miller, netdev
> I'd suppose that this imbalance comes from the inet_twdr_hangman() paths.
>
> Could you verify this?
Yes, I have now verified this. There is indeed an issue with one of the
functions called by inet_twdr_hangman(). The call sequence is:
inet_twdr_hangman() -> inet_twdr_do_twkill_work() -> inet_twsk_put() ->
twsk_destructor().
In this case, the destructor callback is tcp_twsk_destructor() (installed
from line 1208 in net/ipv4/tcp_ipv4.c and line 906 in net/ipv6/tcp_ipv6.c).
Without CONFIG_TCP_MD5SIG compiled in, the function is a no-op. However, with
CONFIG_TCP_MD5SIG compiled in, it calls tcp_put_md5sig_pool() (when keylen is
non-zero), which does an unbalanced put_cpu(). I did a grep across the entire
tree. tcp_put_md5sig_pool() is the matching function for
tcp_get_md5sig_pool(), and in all other TCP IPv4 cases it is called from
net/ipv4/tcp_ipv4.c, from the functions tcp_v4_md5_hash_hdr() and
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH] Re: Kernel oops when clearing bgp neighbor info with TCP MD5SUM enabled
2009-10-20 0:56 ` Anirban Sinha
@ 2009-10-20 1:08 ` Anirban Sinha
2009-10-20 1:13 ` David Miller
0 siblings, 1 reply; 15+ messages in thread
From: Anirban Sinha @ 2009-10-20 1:08 UTC (permalink / raw)
To: Oleg Nesterov; +Cc: Anirban Sinha, linux-kernel, David Miller, netdev
> I'd suppose that this imbalance comes from the inet_twdr_hangman() paths.
>
> Could you verify this?
Yes, I have now verified this. There is indeed an issue with one of the
functions called by inet_twdr_hangman(). The call sequence is:
inet_twdr_hangman() -> inet_twdr_do_twkill_work() -> inet_twsk_put() ->
twsk_destructor().
In this case, the destructor callback is tcp_twsk_destructor() (installed
from line 1208 in net/ipv4/tcp_ipv4.c and line 906 in net/ipv6/tcp_ipv6.c).
Without CONFIG_TCP_MD5SIG compiled in, the function is a no-op. However, with
CONFIG_TCP_MD5SIG compiled in, it calls tcp_put_md5sig_pool() (when keylen is
non-zero), which does an unbalanced put_cpu(). I did a grep across the entire
tree. tcp_put_md5sig_pool() is the matching function for
tcp_get_md5sig_pool(), and in all other TCP IPv4 cases it is called from
net/ipv4/tcp_ipv4.c, from the functions tcp_v4_md5_hash_hdr() and
tcp_v4_md5_hash_skb(), along with the matching get() function. So I would
think that in tcp_twsk_destructor(), the call should be replaced by
tcp_free_md5sig_pool() instead.
Signed-off-by: Anirban Sinha <asinha@zeugmasystems.com>
---
net/ipv4/tcp_minisocks.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index e48c37d..dccc01e 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -363,7 +363,7 @@ void tcp_twsk_destructor(struct sock *sk)
 #ifdef CONFIG_TCP_MD5SIG
 	struct tcp_timewait_sock *twsk = tcp_twsk(sk);
 	if (twsk->tw_md5_keylen)
-		tcp_put_md5sig_pool();
+		tcp_free_md5sig_pool();
 #endif
 }
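To illustrate why the one-line change matters, here is a small user-space model of the two pairings involved. It is a sketch under assumptions, not kernel code: the model_* helpers and the two destructor functions are hypothetical, preempt_count and pool_users are plain ints, and the premise is that alloc/free manage a pool reference count while get/put bracket per-CPU use of the pool.

```c
#include <assert.h>

static int preempt_count; /* stand-in for the kernel counter (assumption) */
static int pool_users;    /* stand-in for the MD5 pool reference count */

/* alloc/free pair: take and drop a reference on the pool. */
static void model_alloc_md5sig_pool(void) { pool_users++; }
static void model_free_md5sig_pool(void)
{
	assert(pool_users > 0);
	pool_users--;
}

/* get/put pair: pin the CPU while the pool is in use, then unpin. */
static void model_get_md5sig_pool(void) { preempt_count++; }
static void model_put_md5sig_pool(void) { preempt_count--; }

/* Hashing path: get and put are paired, so the count is unchanged. */
static void model_hash_segment(void)
{
	model_get_md5sig_pool();
	/* ... hashing would happen here ... */
	model_put_md5sig_pool();
}

/* Destructor before the patch: a put with no matching get. */
static void destructor_before_patch(int keylen)
{
	if (keylen)
		model_put_md5sig_pool();
}

/* Destructor after the patch: drops the pool reference instead. */
static void destructor_after_patch(int keylen)
{
	if (keylen)
		model_free_md5sig_pool();
}
```

In this model the pre-patch destructor lowers the count from 0x102 to 0x101 when called with a non-zero keylen, while the post-patch destructor leaves the count alone and releases the reference taken by model_alloc_md5sig_pool().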
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH] Re: Kernel oops when clearing bgp neighbor info with TCP MD5SUM enabled
2009-10-20 1:08 ` [PATCH] " Anirban Sinha
@ 2009-10-20 1:13 ` David Miller
2009-10-20 1:17 ` Anirban Sinha
0 siblings, 1 reply; 15+ messages in thread
From: David Miller @ 2009-10-20 1:13 UTC (permalink / raw)
To: asinha; +Cc: oleg, ani, linux-kernel, netdev
From: Anirban Sinha <asinha@zeugmasystems.com>
Date: Mon, 19 Oct 2009 18:08:21 -0700 (PDT)
> @@ -363,7 +363,7 @@ void tcp_twsk_destructor(struct sock *sk)
>  #ifdef CONFIG_TCP_MD5SIG
>  	struct tcp_timewait_sock *twsk = tcp_twsk(sk);
>  	if (twsk->tw_md5_keylen)
> -		tcp_put_md5sig_pool();
> +		tcp_free_md5sig_pool();
>  #endif
>  }
This has been fixed in the tree for a month or so:
commit 657e9649e745b06675aa5063c84430986cdc3afa
Author: Robert Varga <nite@hq.alert.sk>
Date: Tue Sep 15 23:49:21 2009 -0700
tcp: fix CONFIG_TCP_MD5SIG + CONFIG_PREEMPT timer BUG()
I have recently came across a preemption imbalance detected by:
<4>huh, entered ffffffff80644630 with preempt_count 00000102, exited with 00000101?
<0>------------[ cut here ]------------
<2>kernel BUG at /usr/src/linux/kernel/timer.c:664!
<0>invalid opcode: 0000 [1] PREEMPT SMP
with ffffffff80644630 being inet_twdr_hangman().
This appeared after I enabled CONFIG_TCP_MD5SIG and played with it a
bit, so I looked at what might have caused it.
One thing that struck me as strange is tcp_twsk_destructor(), as it
calls tcp_put_md5sig_pool() -- which entails a put_cpu(), causing the
detected imbalance. Found on 2.6.23.9, but 2.6.31 is affected as well,
as far as I can tell.
Signed-off-by: Robert Varga <nite@hq.alert.sk>
Signed-off-by: David S. Miller <davem@davemloft.net>
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index 045bcfd..624c3c9 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -363,7 +363,7 @@ void tcp_twsk_destructor(struct sock *sk)
 #ifdef CONFIG_TCP_MD5SIG
 	struct tcp_timewait_sock *twsk = tcp_twsk(sk);
 	if (twsk->tw_md5_keylen)
-		tcp_put_md5sig_pool();
+		tcp_free_md5sig_pool();
 #endif
 }
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH] Re: Kernel oops when clearing bgp neighbor info with TCP MD5SUM enabled
2009-10-20 1:13 ` David Miller
@ 2009-10-20 1:17 ` Anirban Sinha
0 siblings, 0 replies; 15+ messages in thread
From: Anirban Sinha @ 2009-10-20 1:17 UTC (permalink / raw)
To: David Miller; +Cc: oleg, ani, linux-kernel, netdev
> This has been fixed in the tree for a month or so:
Grrrr! Time for me to do a git pull again. The kernel source tree in my work
machine must be out of date by about the same time.
^ permalink raw reply [flat|nested] 15+ messages in thread