* Unhandled kernel unaligned access on IP32 w/ network I/O && 3.7.1?
@ 2012-12-28 4:27 Joshua Kinard
2012-12-28 21:52 ` Ralf Baechle
0 siblings, 1 reply; 5+ messages in thread
From: Joshua Kinard @ 2012-12-28 4:27 UTC
To: Linux MIPS List
Has anyone run into an unhandled kernel unaligned access under 3.7.1? I've
triggered it twice with network I/O on an SGI IP32 machine; however, the stack
trace does not appear to be specific to any of IP32's own drivers. 3.6.7
was very stable, and the two oopses I've triggered so far both happened
under 3.7.1.

It looks like the culprit is in sk_stream_alloc_skb or tcp_sendmsg; however,
I have little experience in the higher-level networking stack within Linux
and wanted to see if anyone else has triggered this on other MIPS systems.

It seems to happen when I am logged in via SSH (on IPv6) and generating a
burst of console output.

Unhandled kernel unaligned access[#3]:
Cpu 0
$ 0 : 0000000000000000 0000000000000010 0000000000000000 bfffff005671271c
$ 4 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000
$ 8 : 980000005c24e000 0000000000000000 980000005c24e000 00000000000000cc
$12 : ffffffff9001fce1 000000001000001e fffffffffffff000 000000000000001f
$16 : 980000005c00fa40 ffffffffde0300b8 ffffff0000000000 0000000000000005
$20 : 000000007f875700 00000000000005a8 0000000000000008 0000000000000005
$24 : 0000000000000001 00000000000003f0
$28 : 980000005c00c000 980000005c00fa10 0000000000000000 ffffffff800059a0
Hi : 0000000007a11c93
Lo : b645a1cac992645e
epc : ffffffff8000b700 do_ade+0x1b0/0x480
Tainted: G D
ra : ffffffff800059a0 ret_from_exception+0x0/0x24
Status: 9001fce3 KX SX UX KERNEL EXL IE
Cause : 00000010
BadVA : bfffff005671271c
PrId : 00002733 (RM7000)
Process rsync (pid: 3844, threadinfo=980000005c00c000,
task=980000005fada000, tls=0000000077dc9490)
Stack : 980000005c24e6a0 9800000056712664 980000005c228000 00000000000005a8
0000000000000005 ffffffff800059a0 0000000000000000 0000000000000010
00000000000000d0 0000000000000000 980000005c228000 00000000000008a0
0000000000000000 0000000000000000 980000005c24e000 0000000000000000
980000005c24e000 00000000000000cc 0000000000000020 ffffffff80223b6c
fffffffffffff000 000000000000001f 9800000056712664 980000005c228000
00000000000005a8 0000000000000005 000000007f875700 00000000000005a8
0000000000000008 0000000000000005 0000000000000001 00000000000003f0
0000000000000014 ffffffff802de0d0 980000005c00c000 980000005c00fb70
0000000000000000 ffffffff80334ef8 ffffffff9001fce3 0000000007a11c93
...
Call Trace:
[<ffffffff8000b700>] do_ade+0x1b0/0x480
[<ffffffff800059a0>] ret_from_exception+0x0/0x24
[<ffffffff80334f24>] sk_stream_alloc_skb+0x6c/0x118
[<ffffffff80335e8c>] tcp_sendmsg+0x6fc/0xe90
[<ffffffff802d3744>] sock_aio_write+0x10c/0x150
[<ffffffff800b48c4>] do_sync_write+0x9c/0x108
[<ffffffff800b4a98>] vfs_write+0x168/0x180
[<ffffffff800b4bbc>] SyS_write+0x54/0xb8
[<ffffffff80013538>] handle_sys+0x118/0x13c
Code: 00441024 5440ffe6 de030100 <68730000> 6c730007 24030000 14600040
00000000 8e020124
---[ end trace e5d137adb9de32d0 ]---
--
Joshua Kinard
Gentoo/MIPS
kumba@gentoo.org
4096R/D25D95E3 2011-03-28
"The past tempts us, the present confuses us, the future frightens us. And
our lives slip away, moment by moment, lost in that vast, terrible in-between."
--Emperor Turhan, Centauri Republic
* Re: Unhandled kernel unaligned access on IP32 w/ network I/O && 3.7.1?
2012-12-28 4:27 Unhandled kernel unaligned access on IP32 w/ network I/O && 3.7.1? Joshua Kinard
@ 2012-12-28 21:52 ` Ralf Baechle
2012-12-28 23:37 ` Joshua Kinard
2012-12-30 8:23 ` Joshua Kinard
0 siblings, 2 replies; 5+ messages in thread
From: Ralf Baechle @ 2012-12-28 21:52 UTC
To: Joshua Kinard; +Cc: Linux MIPS List
On Thu, Dec 27, 2012 at 11:27:37PM -0500, Joshua Kinard wrote:
> Has anyone run into an unhandled kernel unaligned access under 3.7.1? I've
> triggered it twice w/ network I/O on an SGI IP32 machine, however, the stack
> trace does not appear to be specific to any of IP32's own drivers. 3.6.7
> was very stable, and the two oopses I've triggered so far both happened
> under 3.7.1.
>
> It looks like the culprit is in sk_stream_alloc_skb or tcp_sendmsg, however,
> I have little experience in the higher-level networking stack within Linux
> and wanted to see if anyone else has triggered this on other MIPS systems.
>
> Seems to happen when I am logged in via SSH (on IPv6) and generating a burst
> of console output.
>
> Unhandled kernel unaligned access[#3]:
> Cpu 0
> $ 0 : 0000000000000000 0000000000000010 0000000000000000 bfffff005671271c
> $ 4 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> $ 8 : 980000005c24e000 0000000000000000 980000005c24e000 00000000000000cc
> $12 : ffffffff9001fce1 000000001000001e fffffffffffff000 000000000000001f
> $16 : 980000005c00fa40 ffffffffde0300b8 ffffff0000000000 0000000000000005
> $20 : 000000007f875700 00000000000005a8 0000000000000008 0000000000000005
> $24 : 0000000000000001 00000000000003f0
> $28 : 980000005c00c000 980000005c00fa10 0000000000000000 ffffffff800059a0
> Hi : 0000000007a11c93
> Lo : b645a1cac992645e
> epc : ffffffff8000b700 do_ade+0x1b0/0x480
> Tainted: G D
^^^
The 'D' taint flag shows that this kernel has already oopsed before, which
means this oops message is pretty much worthless.
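
For readers of the archive, the letters on a "Tainted:" line are individual flags, and 'D' is the one set after an earlier OOPS or BUG. A minimal sketch of expanding such a line follows; the table is deliberately partial, and the helper name explain_taint is made up for illustration.

```python
# Sketch: expand the flag letters from a "Tainted:" line of an oops.
# The table below is a partial list of taint letters, not the full set.
TAINT_FLAGS = {
    "P": "proprietary module was loaded",
    "G": "no proprietary module loaded (placeholder for 'P')",
    "F": "module was force-loaded",
    "D": "kernel has died recently (an earlier OOPS or BUG)",
    "W": "a warning was issued earlier",
}

def explain_taint(line):
    """Return (flag, meaning) pairs for a 'Tainted: ...' line."""
    flags_text = line.split(":", 1)[1]
    # Flags are single letters; the whitespace between them carries no meaning.
    flags = [c for c in flags_text if not c.isspace()]
    return [(f, TAINT_FLAGS.get(f, "flag not in this partial table")) for f in flags]

def already_oopsed(line):
    """True if the 'D' flag is present, i.e. this is not the first oops."""
    return any(flag == "D" for flag, _ in explain_taint(line))

for flag, meaning in explain_taint("Tainted: G D"):
    print(flag, "-", meaning)
```

The counter in the oops header points the same way: "Unhandled kernel unaligned access[#3]" is the third oops since boot, so only the report numbered [#1] describes a kernel in a known-good state.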
Ralf
* Re: Unhandled kernel unaligned access on IP32 w/ network I/O && 3.7.1?
2012-12-28 21:52 ` Ralf Baechle
@ 2012-12-28 23:37 ` Joshua Kinard
2012-12-30 8:23 ` Joshua Kinard
1 sibling, 0 replies; 5+ messages in thread
From: Joshua Kinard @ 2012-12-28 23:37 UTC
To: Ralf Baechle; +Cc: Linux MIPS List
On 12/28/2012 4:52 PM, Ralf Baechle wrote:
> On Thu, Dec 27, 2012 at 11:27:37PM -0500, Joshua Kinard wrote:
>
>> Has anyone run into an unhandled kernel unaligned access under 3.7.1? I've
>> triggered it twice w/ network I/O on an SGI IP32 machine, however, the stack
>> trace does not appear to be specific to any of IP32's own drivers. 3.6.7
>> was very stable, and the two oopses I've triggered so far both happened
>> under 3.7.1.
>>
>> It looks like the culprit is in sk_stream_alloc_skb or tcp_sendmsg, however,
>> I have little experience in the higher-level networking stack within Linux
>> and wanted to see if anyone else has triggered this on other MIPS systems.
>>
>> Seems to happen when I am logged in via SSH (on IPv6) and generating a burst
>> of console output.
>>
>> Unhandled kernel unaligned access[#3]:
>> Cpu 0
>> $ 0 : 0000000000000000 0000000000000010 0000000000000000 bfffff005671271c
>> $ 4 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>> $ 8 : 980000005c24e000 0000000000000000 980000005c24e000 00000000000000cc
>> $12 : ffffffff9001fce1 000000001000001e fffffffffffff000 000000000000001f
>> $16 : 980000005c00fa40 ffffffffde0300b8 ffffff0000000000 0000000000000005
>> $20 : 000000007f875700 00000000000005a8 0000000000000008 0000000000000005
>> $24 : 0000000000000001 00000000000003f0
>> $28 : 980000005c00c000 980000005c00fa10 0000000000000000 ffffffff800059a0
>> Hi : 0000000007a11c93
>> Lo : b645a1cac992645e
>> epc : ffffffff8000b700 do_ade+0x1b0/0x480
>> Tainted: G D
> ^^^
>
> This kernel has already oopsed before. Which means this oops message is
> pretty much worthless.
>
> Ralf
I'll try to recapture it over the weekend. This was the only output left in
the serial console buffer when I reattached, as I had the console
disconnected when it happened. Thanks!
--
Joshua Kinard
Gentoo/MIPS
kumba@gentoo.org
4096R/D25D95E3 2011-03-28
"The past tempts us, the present confuses us, the future frightens us. And
our lives slip away, moment by moment, lost in that vast, terrible in-between."
--Emperor Turhan, Centauri Republic
* Re: Unhandled kernel unaligned access on IP32 w/ network I/O && 3.7.1?
2012-12-28 21:52 ` Ralf Baechle
2012-12-28 23:37 ` Joshua Kinard
@ 2012-12-30 8:23 ` Joshua Kinard
2013-01-01 3:25 ` Joshua Kinard
1 sibling, 1 reply; 5+ messages in thread
From: Joshua Kinard @ 2012-12-30 8:23 UTC
To: Ralf Baechle; +Cc: Linux MIPS List
On 12/28/2012 4:52 PM, Ralf Baechle wrote:
> On Thu, Dec 27, 2012 at 11:27:37PM -0500, Joshua Kinard wrote:
>
>> Has anyone run into an unhandled kernel unaligned access under 3.7.1? I've
>> triggered it twice w/ network I/O on an SGI IP32 machine, however, the stack
>> trace does not appear to be specific to any of IP32's own drivers. 3.6.7
>> was very stable, and the two oopses I've triggered so far both happened
>> under 3.7.1.
>>
>> It looks like the culprit is in sk_stream_alloc_skb or tcp_sendmsg, however,
>> I have little experience in the higher-level networking stack within Linux
>> and wanted to see if anyone else has triggered this on other MIPS systems.
>>
>> Seems to happen when I am logged in via SSH (on IPv6) and generating a burst
>> of console output.
>>
>> Unhandled kernel unaligned access[#3]:
>> Cpu 0
>> $ 0 : 0000000000000000 0000000000000010 0000000000000000 bfffff005671271c
>> $ 4 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>> $ 8 : 980000005c24e000 0000000000000000 980000005c24e000 00000000000000cc
>> $12 : ffffffff9001fce1 000000001000001e fffffffffffff000 000000000000001f
>> $16 : 980000005c00fa40 ffffffffde0300b8 ffffff0000000000 0000000000000005
>> $20 : 000000007f875700 00000000000005a8 0000000000000008 0000000000000005
>> $24 : 0000000000000001 00000000000003f0
>> $28 : 980000005c00c000 980000005c00fa10 0000000000000000 ffffffff800059a0
>> Hi : 0000000007a11c93
>> Lo : b645a1cac992645e
>> epc : ffffffff8000b700 do_ade+0x1b0/0x480
>> Tainted: G D
> ^^^
>
> This kernel has already oopsed before. Which means this oops message is
> pretty much worthless.
Here's an untainted oops from IP32. Triggered by logging in over SSH on
IPv6 and running 'dmesg':
Unhandled kernel unaligned access[#1]:
Cpu 0
$ 0 : 0000000000000000 0000000000000010 0000000000000000 bfffff005e17aac4
$ 4 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000
$ 8 : 980000005e00e000 0000000000000000 980000005e00e000 0000000000000410
$12 : ffffffff9001fce1 000000001000001e fffffffffffff000 000000000000001f
$16 : 980000005e03fa40 ffffffffde0300b8 ffffff0000000000 0000000000000034
$20 : 00000000006532d8 0000000000000594 00000000004a1134 00000000004a0000
$24 : 0000000000000001 00000000000003f0
$28 : 980000005e03c000 980000005e03fa10 0000000000000000 ffffffff800059a0
Hi : 000000000011a02a
Lo : 000000000005e00e
epc : ffffffff8000b700 do_ade+0x1b0/0x480
Not tainted
ra : ffffffff800059a0 ret_from_exception+0x0/0x24
Status: 9001fce3 KX SX UX KERNEL EXL IE
Cause : 00000010
BadVA : bfffff005e17aac4
PrId : 00002733 (RM7000)
Process sshd (pid: 1323, threadinfo=980000005e03c000, task=980000005fe76000,
tls=0000000077010490)
Stack : 980000005e00e6a0 980000005e17aa0c 980000005faef000 0000000000000594
0000000000000034 ffffffff800059a0 0000000000000000 0000000000000010
00000000000000d0 0000000000000000 980000005faef000 00000000000008a0
0000000000000000 0000000000000000 980000005e00e000 0000000000000000
980000005e00e000 0000000000000410 0000000000000020 ffffffff80223b6c
fffffffffffff000 000000000000001f 980000005e17aa0c 980000005faef000
0000000000000594 0000000000000034 00000000006532d8 0000000000000594
00000000004a1134 00000000004a0000 0000000000000001 00000000000003f0
0000000000000014 ffffffff802de0d0 980000005e03c000 980000005e03fb70
0000000000000000 ffffffff80334ef8 ffffffff9001fce3 000000000011a02a
...
Call Trace:
[<ffffffff8000b700>] do_ade+0x1b0/0x480
[<ffffffff800059a0>] ret_from_exception+0x0/0x24
[<ffffffff80334f24>] sk_stream_alloc_skb+0x6c/0x118
[<ffffffff80335e8c>] tcp_sendmsg+0x6fc/0xe90
[<ffffffff802d3744>] sock_aio_write+0x10c/0x150
[<ffffffff800b48c4>] do_sync_write+0x9c/0x108
[<ffffffff800b4a98>] vfs_write+0x168/0x180
[<ffffffff800b4bbc>] SyS_write+0x54/0xb8
[<ffffffff80013538>] handle_sys+0x118/0x13c
Code: 00441024 5440ffe6 de030100 <68730000> 6c730007 24030000 14600040
00000000 8e020124
---[ end trace 8127ff095caa30f9 ]---
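
The words on the "Code:" line can be decoded by hand. The sketch below assumes the standard MIPS I-type layout (6-bit opcode, 5-bit base register, 5-bit target register, 16-bit signed offset) and lists only the load opcodes needed here; decode_itype is a made-up helper name, not a kernel or binutils function.

```python
# Sketch: decode I-type load words taken from the "Code:" line of the oops.
# Only the opcodes that occur in this thread are listed.
OPCODES = {0x1A: "ldl", 0x1B: "ldr", 0x23: "lw", 0x37: "ld"}

# Conventional names for the few registers that show up in these words.
REGS = {2: "v0", 3: "v1", 4: "a0", 16: "s0", 19: "s3"}

def reg_name(num):
    """Return the conventional name, or $N for registers not in REGS."""
    return REGS.get(num, "$%d" % num)

def decode_itype(word):
    """Return 'op rt,imm(rs)' for a 32-bit MIPS I-type load word."""
    opcode = (word >> 26) & 0x3F
    rs = (word >> 21) & 0x1F      # base register
    rt = (word >> 16) & 0x1F      # register being loaded
    imm = word & 0xFFFF
    if imm & 0x8000:              # sign-extend the 16-bit offset
        imm -= 0x10000
    op = OPCODES.get(opcode, "op%#x" % opcode)
    return "%s %s,%d(%s)" % (op, reg_name(rt), imm, reg_name(rs))

# The word in <...> is the one at epc; the others are its neighbours.
for word in (0xDE030100, 0x68730000, 0x6C730007):
    print("%08x  %s" % (word, decode_itype(word)))
```

Under those assumptions the marked word <68730000> is an ldl through v1, followed by the matching ldr at offset 7, which is the usual big-endian sequence for an unaligned doubleword load. Register $3 (v1) in the dump equals BadVA, so this would be consistent with do_ade faulting while emulating the original access, though the oops alone does not prove that.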

Turns out it is non-fatal for the machine. The serial console is still alive,
but the sshd process did not survive: ps ux shows it stuck in the 'Ds' state
(uninterruptible sleep, session leader).
--
Joshua Kinard
Gentoo/MIPS
kumba@gentoo.org
4096R/D25D95E3 2011-03-28
"The past tempts us, the present confuses us, the future frightens us. And
our lives slip away, moment by moment, lost in that vast, terrible in-between."
--Emperor Turhan, Centauri Republic
* Re: Unhandled kernel unaligned access on IP32 w/ network I/O && 3.7.1?
2012-12-30 8:23 ` Joshua Kinard
@ 2013-01-01 3:25 ` Joshua Kinard
0 siblings, 0 replies; 5+ messages in thread
From: Joshua Kinard @ 2013-01-01 3:25 UTC
To: Ralf Baechle; +Cc: Linux MIPS List
On 12/30/2012 3:23 AM, Joshua Kinard wrote:
>
> Here's an untainted oops from IP32. Triggered by logging in over SSH on
> IPv6 and running 'dmesg':
>
> Unhandled kernel unaligned access[#1]:
> Cpu 0
> $ 0 : 0000000000000000 0000000000000010 0000000000000000 bfffff005e17aac4
> $ 4 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> $ 8 : 980000005e00e000 0000000000000000 980000005e00e000 0000000000000410
> $12 : ffffffff9001fce1 000000001000001e fffffffffffff000 000000000000001f
> $16 : 980000005e03fa40 ffffffffde0300b8 ffffff0000000000 0000000000000034
> $20 : 00000000006532d8 0000000000000594 00000000004a1134 00000000004a0000
> $24 : 0000000000000001 00000000000003f0
> $28 : 980000005e03c000 980000005e03fa10 0000000000000000 ffffffff800059a0
> Hi : 000000000011a02a
> Lo : 000000000005e00e
> epc : ffffffff8000b700 do_ade+0x1b0/0x480
> Not tainted
> ra : ffffffff800059a0 ret_from_exception+0x0/0x24
> Status: 9001fce3 KX SX UX KERNEL EXL IE
> Cause : 00000010
> BadVA : bfffff005e17aac4
> PrId : 00002733 (RM7000)
> Process sshd (pid: 1323, threadinfo=980000005e03c000, task=980000005fe76000,
> tls=0000000077010490)
> Stack : 980000005e00e6a0 980000005e17aa0c 980000005faef000 0000000000000594
> 0000000000000034 ffffffff800059a0 0000000000000000 0000000000000010
> 00000000000000d0 0000000000000000 980000005faef000 00000000000008a0
> 0000000000000000 0000000000000000 980000005e00e000 0000000000000000
> 980000005e00e000 0000000000000410 0000000000000020 ffffffff80223b6c
> fffffffffffff000 000000000000001f 980000005e17aa0c 980000005faef000
> 0000000000000594 0000000000000034 00000000006532d8 0000000000000594
> 00000000004a1134 00000000004a0000 0000000000000001 00000000000003f0
> 0000000000000014 ffffffff802de0d0 980000005e03c000 980000005e03fb70
> 0000000000000000 ffffffff80334ef8 ffffffff9001fce3 000000000011a02a
> ...
> Call Trace:
> [<ffffffff8000b700>] do_ade+0x1b0/0x480
> [<ffffffff800059a0>] ret_from_exception+0x0/0x24
> [<ffffffff80334f24>] sk_stream_alloc_skb+0x6c/0x118
> [<ffffffff80335e8c>] tcp_sendmsg+0x6fc/0xe90
> [<ffffffff802d3744>] sock_aio_write+0x10c/0x150
> [<ffffffff800b48c4>] do_sync_write+0x9c/0x108
> [<ffffffff800b4a98>] vfs_write+0x168/0x180
> [<ffffffff800b4bbc>] SyS_write+0x54/0xb8
> [<ffffffff80013538>] handle_sys+0x118/0x13c
>
>
> Code: 00441024 5440ffe6 de030100 <68730000> 6c730007 24030000 14600040
> 00000000 8e020124
> ---[ end trace 8127ff095caa30f9 ]---
>
>
> Turns out it is non-fatal. The serial console is still alive, but sshd was
> terminated as a result (it's in the 'Ds' state under ps ux output).
Some quick digging via objdump and a new oops, from a rebuilt kernel
including full debugging, points at an inlined call to skb_reserve from
within sk_stream_alloc_skb in net/ipv4/tcp.c.
Bottom of new oops:
Call Trace:
[<ffffffff8000b710>] do_ade+0x1b0/0x480
[<ffffffff800059a0>] ret_from_exception+0x0/0x24
[<ffffffff803352dc>] sk_stream_alloc_skb+0x6c/0x118
[<ffffffff8033624c>] tcp_sendmsg+0x6fc/0xe98
[<ffffffff802d3c44>] sock_aio_write+0x10c/0x150
[<ffffffff800b5cd4>] do_sync_write+0x9c/0x108
[<ffffffff800b5ea8>] vfs_write+0x168/0x180
[<ffffffff800b5fcc>] SyS_write+0x54/0xb8
[<ffffffff80013558>] handle_sys+0x118/0x13c
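
Each trace entry has the form "[<address>] symbol+0xoffset/0xsize", so subtracting the offset from the address gives the start of the function, and adding the size gives its end. The sketch below splits one entry into those parts; the regular expression and the name parse_trace_line are illustrative assumptions, not part of any kernel tool.

```python
import re

# Matches entries such as "[<ffffffff803352dc>] sk_stream_alloc_skb+0x6c/0x118".
TRACE_RE = re.compile(
    r"\[<([0-9a-f]+)>\]\s+([\w.]+)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
)

def parse_trace_line(line):
    """Return symbol, address, and function bounds for one trace entry."""
    match = TRACE_RE.search(line)
    if match is None:
        return None                    # header lines, "...", and so on
    addr = int(match.group(1), 16)
    offset = int(match.group(3), 16)
    size = int(match.group(4), 16)
    start = addr - offset
    return {"symbol": match.group(2), "addr": addr,
            "start": start, "end": start + size}

entry = parse_trace_line("[<ffffffff803352dc>] sk_stream_alloc_skb+0x6c/0x118")
print("%s: %#x .. %#x" % (entry["symbol"], entry["start"], entry["end"]))
```

The resulting range can be handed to objdump with --start-address and --stop-address so that only the function of interest is disassembled.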
Disassembling vmlinux and matching address ffffffff803352dc yields this:
        if (sk_wmem_schedule(sk, skb->truesize)) {
                skb_reserve(skb, sk->sk_prot->max_header);
ffffffff803352d8:       8c420108        lw      v0,264(v0)
 *      Increase the headroom of an empty &sk_buff by reducing the tail
 *      room. This is only allowed for an empty buffer.
 */
static inline void skb_reserve(struct sk_buff *skb, int len)
{
        skb->data += len;
ffffffff803352dc:       de0300b8        ld      v1,184(s0)
        skb->tail += len;
ffffffff803352e0:       8e0400a8        lw      a0,168(s0)
 *      Increase the headroom of an empty &sk_buff by reducing the tail
 *      room. This is only allowed for an empty buffer.
 */
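
The posted numbers can be cross-checked with a little arithmetic. In the earlier untainted oops, $17 holds ffffffffde0300b8, whose low word de0300b8 is the same "ld v1,184(s0)" shown above, so the sketch assumes the 184-byte offset also applies to that build. The values are copied from that oops; the variable and function names are made up for illustration.

```python
# Values copied from the untainted oops earlier in the thread.
BADVA = 0xBFFFFF005E17AAC4        # "BadVA :" line
STACK_WORD = 0x980000005E17AA0C   # second word of the "Stack :" dump
DATA_OFFSET = 184                 # from "ld v1,184(s0)"

def low32(value):
    """Keep only the low 32 bits of a 64-bit value."""
    return value & 0xFFFFFFFF

def base_from_badva(badva, field_offset):
    """Base pointer implied by a faulting field address."""
    return badva - field_offset

base = base_from_badva(BADVA, DATA_OFFSET)
print("implied base pointer: %#x" % base)
print("low 32 bits match stack word:", low32(base) == low32(STACK_WORD))
print("BadVA modulo 8:", BADVA % 8)   # ld needs an 8-byte aligned address
```

With these inputs the implied base shares its low 32 bits with the stack word, and BadVA is 4 bytes past an 8-byte boundary, which is what an address error on ld would require. This is only arithmetic on the posted values; it does not explain why the pointer is misaligned or why the upper bits of BadVA differ from the stack word.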
I looked around at several files in git, mainly net/ipv4/tcp.c, and none of
the recent changes in 3.7 stands out immediately as the cause. I'll either
have to use git bisect or run kgdb on it to figure anything else out.
Does this look like a case of scheduling while atomic? There's a fix in
davem's -next tree that addresses such a cause, but I haven't tried that
just yet to see if it's the same issue.
--
Joshua Kinard
Gentoo/MIPS
kumba@gentoo.org
4096R/D25D95E3 2011-03-28
"The past tempts us, the present confuses us, the future frightens us. And
our lives slip away, moment by moment, lost in that vast, terrible in-between."
--Emperor Turhan, Centauri Republic