From: Eric Dumazet <dada1@cosmosbay.com>
To: Ron Yorgason <yorgasor@gmail.com>
Cc: netdev@vger.kernel.org
Subject: Re: Kernel Oops in UDP w/ ARM architecture
Date: Mon, 09 Mar 2009 18:16:48 +0100
Message-ID: <49B54F00.5090706@cosmosbay.com>
In-Reply-To: <93d1fdd10903090852g268b4141h31dc39a5848fcf32@mail.gmail.com>

Ron Yorgason wrote:
> I'm working on an embedded video streaming application using gstreamer
> over RTP/UDP on a Freescale iMX27 ARM platform.  I have one board
> doing the video capture and compression, and streaming it across the
> network to another board which does the decoding and display.  I'm
> stuck right now with a kernel oops we're getting.  It usually occurs
> within 2-6 hours, but sometimes it takes longer for it to happen.  I
> believe it always dies with the same address in the failure.
> 
> I'm using a 2.6.19.2 kernel release.  I don't know if this problem has
> already been found and fixed in a future release (I didn't see any
> mention of it in the changelogs of the next few releases), but this is
> a customized kernel and I don't know how feasible it would be to port
> all the changes to a newer kernel.  We haven't touched the networking
> stack, so it's most likely this bug is in the stock release.
> 
> Unable to handle kernel paging request at virtual address c6f9202a
> pgd = c6d7c000
> [c6f9202a] *pgd=a6e0041e(bad)
> Internal error: Oops: 1 [#3]
> Modules linked in:
> CPU: 0
> PC is at udp_recvmsg+0x184/0x21c
> LR is at 0xf2799669
> pc : [<c024a3e0>]    lr : [<f2799669>]    Not tainted
> sp : c6f9fd48  ip : 00000000  fp : c6f9fd80
> r10: c6f9fea0  r9 : 00000000  r8 : 00000400
> r7 : 00000400  r6 : c7a52200  r5 : c6f9ff20  r4 : c6291780
> r3 : c6f9201e  r2 : 00000000  r1 : 00000008  r0 : c6f9fea8
> Flags: NzCv  IRQs on  FIQs on  Mode SVC_32  Segment user
> Control: 5317F
> Table: A6D7C000  DAC: 00000015
> Process gst-launch-0.10 (pid: 18165, stack limit = 0xc6f9e250)
> Stack: (0xc6f9fd48 to 0xc6fa0000)
> fd40:                   00000001 00000000 00000000 00000000 c02fbb80 c6f9ff20
> fd60: c6f9ff20 00000400 00000000 00000000 00000000 c6f9fda8 c6f9fd84 c0207468
> fd80: c024a26c 00000000 00000000 c6f9fd90 00000010 c6f9fdb0 c7c4fac0 c6f9fe9c
> fda0: c6f9fdac c0205ae0 c020742c 00000000 c02e06c8 00000001 00000000 00000001
> fdc0: ffffffff 00000000 00000000 00000000 00000000 00000000 c7c4fac0 00000000
> fde0: 00000000 c6c5d720 c7c4fac0 c006a3a4 c6f9fdf0 c6f9fdf0 c6f9e000 ffffffff
> fe00: c6f9fe34 c7176b60 c7176b90 8511a8c0 c6f9fea8 00000408 c6f9fe44 c6f9fe28
> fe20: c0209ff8 00000001 00000004 40ee9e04 40ee9e04 00000000 00000000 00000000
> fe40: 00000400 c759bba0 00000000 00000000 c6f9ff20 00000500 00000000 00000000
> fe60: 00000400 00000000 00000000 c03714a4 c6f9fef8 00000000 00000400 00093800
> fe80: c6f9fea0 c76d45a0 c6f9e000 40ee9e84 c6f9ff70 c6f9fea0 c0206990 c0205a30
> fea0: 03080002 c005d660 a0000093 00043887 c7d6a000 000002c0 c7d6a2c0 60000013
> fec0: c6f9fedc c6f9fed0 c005dbc0 c005da94 c6f9ff34 c6f9fee0 c018455c c005db90
> fee0: 485a7d2d 00046731 00000400 c6f9ff10 c6f9fefc c024a130 c0059780 c76d45a0
> ff00: 0000541b c6f9ff20 c6f9ff14 c024ff7c c024a0a8 c6f9ff3c c6f9ff24 c02052cc
> ff20: c6f9fea0 00000080 c6f9ff3c 00000001 00000000 00000000 c00a8cf8 00093c00
> ff40: 00000000 00000001 40ee9e9c 0000000c 00093800 00000400 00000066 c0038f84
> ff60: 404fa2f0 c6f9ffa4 c6f9ff74 c0206e9c c0206908 40ee9e84 40ee9ea0 0000000a
> ff80: 00093800 00000400 00000000 40ee9e84 40ee9ea0 000001c4 00000000 c6f9ffa8
> ffa0: c0038de0 c0206d10 000001c4 00093800 0000000c 40ee9dd4 40eea56c 00000002
> ffc0: 000001c4 00093800 00000400 0000000a 40ee9ea0 40ee9e84 404fa2f0 000350d0
> ffe0: 00000000 40ee9dd0 4020fe74 40210808 80000010 0000000c 033a0000 8c020000
> Backtrace:
> [<c024a25c>] (udp_recvmsg+0x0/0x21c) from [<c0207468>] (sock_common_recvmsg+0x4)
> [<c020741c>] (sock_common_recvmsg+0x0/0x60) from [<c0205ae0>] (sock_recvmsg+0xc)
>  r5 = C7C4FAC0  r4 = C6F9FDB0
> [<c0205a20>] (sock_recvmsg+0x0/0xec) from [<c0206990>] (sys_recvfrom+0x98/0xf0)
> [<c02068f8>] (sys_recvfrom+0x0/0xf0) from [<c0206e9c>] (sys_socketcall+0x19c/0x)
> [<c0206d00>] (sys_socketcall+0x0/0x1f0) from [<c0038de0>] (ret_fast_syscall+0x0)
>  r4 = 000001C4
> Code: e28a0008 e1d330b0 e3a01008 e1ca30b2 (e5943020)
> 
> 
> I did the disassembly to find out exactly where the failure occurs.  I
> put an asterisk by the address offset mentioned in the oops, but I
> believe it's the next line down where it references the address where
> it chokes.

Yes, I agree: the access at (r3 + offset) is what chokes, not the one at (r4 + offset).

> 
> 00001ae4 <udp_recvmsg>:
>     1ae4:	e1a0c00d 	mov	ip, sp
>     1ae8:	e92ddff0 	stmdb	sp!, {r4, r5, r6, r7, r8, r9, sl, fp, ip, lr, pc}
>     1aec:	e24cb004 	sub	fp, ip, #4	; 0x4
>     1af0:	e24dd010 	sub	sp, sp, #16	; 0x10
>     1af4:	e59b000c 	ldr	r0, [fp, #12]
>     1af8:	e59b9008 	ldr	r9, [fp, #8]
>     1afc:	e3500000 	cmp	r0, #0	; 0x0
>     1b00:	e1a08003 	mov	r8, r3
>     1b04:	13a03010 	movne	r3, #16	; 0x10
>     1b08:	e592a000 	ldr	sl, [r2]
>     1b0c:	15803000 	strne	r3, [r0]
>     1b10:	e3190a02 	tst	r9, #8192	; 0x2000
>     1b14:	e1a05002 	mov	r5, r2
>     1b18:	e1a06001 	mov	r6, r1
>     1b1c:	0a000004 	beq	1b34 <udp_recvmsg+0x50>
>     1b20:	e1a00001 	mov	r0, r1
>     1b24:	e1a01002 	mov	r1, r2
>     1b28:	e1a02008 	mov	r2, r8
>     1b2c:	ebfffffe 	bl	0 <ip_recv_error>
>     1b30:	ea00006e 	b	1cf0 <udp_recvmsg+0x20c>
>     1b34:	e1a01009 	mov	r1, r9
>     1b38:	e59b2004 	ldr	r2, [fp, #4]
>     1b3c:	e24b302c 	sub	r3, fp, #44	; 0x2c
>     1b40:	e1a00006 	mov	r0, r6
>     1b44:	ebfffffe 	bl	0 <skb_recv_datagram>
>     1b48:	e2504000 	subs	r4, r0, #0	; 0x0
>     1b4c:	e3a01008 	mov	r1, #8	; 0x8
>     1b50:	0a000057 	beq	1cb4 <udp_recvmsg+0x1d0>
>     1b54:	e5943060 	ldr	r3, [r4, #96]
>     1b58:	e2437008 	sub	r7, r3, #8	; 0x8
>     1b5c:	e1570008 	cmp	r7, r8
>     1b60:	85953018 	ldrhi	r3, [r5, #24]
>     1b64:	81a07008 	movhi	r7, r8
>     1b68:	83833020 	orrhi	r3, r3, #32	; 0x20
>     1b6c:	85853018 	strhi	r3, [r5, #24]
>     1b70:	e5d43074 	ldrb	r3, [r4, #116]
>     1b74:	e203300c 	and	r3, r3, #12	; 0xc
>     1b78:	e3530008 	cmp	r3, #8	; 0x8
>     1b7c:	01a01003 	moveq	r1, r3
>     1b80:	0a000007 	beq	1ba4 <udp_recvmsg+0xc0>
>     1b84:	e5953018 	ldr	r3, [r5, #24]
>     1b88:	e3130020 	tst	r3, #32	; 0x20
>     1b8c:	0a000009 	beq	1bb8 <udp_recvmsg+0xd4>
>     1b90:	ebfffffe 	bl	0 <__skb_checksum_complete>
>     1b94:	e3500000 	cmp	r0, #0	; 0x0
>     1b98:	1a000047 	bne	1cbc <udp_recvmsg+0x1d8>
>     1b9c:	e1a00004 	mov	r0, r4
>     1ba0:	e3a01008 	mov	r1, #8	; 0x8
>     1ba4:	e5952008 	ldr	r2, [r5, #8]
>     1ba8:	e1a03007 	mov	r3, r7
>     1bac:	ebfffffe 	bl	0 <skb_copy_datagram_iovec>
>     1bb0:	e50b002c 	str	r0, [fp, #-44]
>     1bb4:	ea000004 	b	1bcc <udp_recvmsg+0xe8>
>     1bb8:	e5952008 	ldr	r2, [r5, #8]
>     1bbc:	ebfffffe 	bl	0 <skb_copy_and_csum_datagram_iovec>
>     1bc0:	e3700016 	cmn	r0, #22	; 0x16
>     1bc4:	e50b002c 	str	r0, [fp, #-44]
>     1bc8:	0a00003b 	beq	1cbc <udp_recvmsg+0x1d8>
>     1bcc:	e51b302c 	ldr	r3, [fp, #-44]
>     1bd0:	e3530000 	cmp	r3, #0	; 0x0
>     1bd4:	1a000033 	bne	1ca8 <udp_recvmsg+0x1c4>
>     1bd8:	e594100c 	ldr	r1, [r4, #12]
>     1bdc:	e5962094 	ldr	r2, [r6, #148]
>     1be0:	e50b1034 	str	r1, [fp, #-52]
>     1be4:	e5943010 	ldr	r3, [r4, #16]
>     1be8:	e3120b02 	tst	r2, #2048	; 0x800
>     1bec:	e50b3030 	str	r3, [fp, #-48]
>     1bf0:	0a00000f 	beq	1c34 <udp_recvmsg+0x150>
>     1bf4:	e3510000 	cmp	r1, #0	; 0x0
>     1bf8:	1a000001 	bne	1c04 <udp_recvmsg+0x120>
>     1bfc:	e24b0034 	sub	r0, fp, #52	; 0x34
>     1c00:	ebfffffe 	bl	0 <do_gettimeofday>
>     1c04:	e51b3034 	ldr	r3, [fp, #-52]
>     1c08:	e24bc034 	sub	ip, fp, #52	; 0x34
>     1c0c:	e584300c 	str	r3, [r4, #12]
>     1c10:	e51b3030 	ldr	r3, [fp, #-48]
>     1c14:	e1a00005 	mov	r0, r5
>     1c18:	e5843010 	str	r3, [r4, #16]
>     1c1c:	e3a01001 	mov	r1, #1	; 0x1
>     1c20:	e3a0201d 	mov	r2, #29	; 0x1d
>     1c24:	e3a03008 	mov	r3, #8	; 0x8
>     1c28:	e58dc000 	str	ip, [sp]
>     1c2c:	ebfffffe 	bl	0 <put_cmsg>
>     1c30:	ea000003 	b	1c44 <udp_recvmsg+0x160>
>     1c34:	e24b2034 	sub	r2, fp, #52	; 0x34
>     1c38:	e892000c 	ldmia	r2, {r2, r3}
>     1c3c:	e58620f8 	str	r2, [r6, #248]
>     1c40:	e58630fc 	str	r3, [r6, #252]
>     1c44:	e35a0000 	cmp	sl, #0	; 0x0
> 
> 
>     1c48:	0a00000a 	beq	1c78 <udp_recvmsg+0x194>
>     1c4c:	e3a03002 	mov	r3, #2	; 0x2
>     1c50:	e1ca30b0 	strh	r3, [sl]
>     1c54:	e594301c 	ldr	r3, [r4, #28]
>     1c58:	e28a0008 	add	r0, sl, #8	; 0x8
>     1c5c:	e1d330b0 	ldrh	r3, [r3]
>     1c60:	e3a01008 	mov	r1, #8	; 0x8
>     1c64:	e1ca30b2 	strh	r3, [sl, #2]
> *    1c68:	e5943020 	ldr	r3, [r4, #32]
>     1c6c:	e593300c 	ldr	r3, [r3, #12]
>     1c70:	e58a3004 	str	r3, [sl, #4]
>     1c74:	ebfffffe 	bl	0 <__memzero>
>     1c78:	e59f3078 	ldr	r3, [pc, #120]	; 1cf8 <.text+0x1cf8>
>     1c7c:	e19630b3 	ldrh	r3, [r6, r3]
> 
> 
>     1c80:	e3530000 	cmp	r3, #0	; 0x0
>     1c84:	0a000002 	beq	1c94 <udp_recvmsg+0x1b0>
>     1c88:	e1a00005 	mov	r0, r5
>     1c8c:	e1a01004 	mov	r1, r4
>     1c90:	ebfffffe 	bl	0 <ip_cmsg_recv>
>     1c94:	e3190020 	tst	r9, #32	; 0x20
>     1c98:	e50b702c 	str	r7, [fp, #-44]
>     1c9c:	15943060 	ldrne	r3, [r4, #96]
>     1ca0:	12433008 	subne	r3, r3, #8	; 0x8
>     1ca4:	150b302c 	strne	r3, [fp, #-44]
>     1ca8:	e1a00006 	mov	r0, r6
>     1cac:	e1a01004 	mov	r1, r4
>     1cb0:	ebfffffe 	bl	0 <skb_free_datagram>
>     1cb4:	e51b002c 	ldr	r0, [fp, #-44]
>     1cb8:	ea00000c 	b	1cf0 <udp_recvmsg+0x20c>
>     1cbc:	e59f3038 	ldr	r3, [pc, #56]	; 1cfc <.text+0x1cfc>
>     1cc0:	e1a02009 	mov	r2, r9
>     1cc4:	e593c000 	ldr	ip, [r3]
>     1cc8:	e1a01004 	mov	r1, r4
>     1ccc:	e59c300c 	ldr	r3, [ip, #12]
>     1cd0:	e1a00006 	mov	r0, r6
>     1cd4:	e2833001 	add	r3, r3, #1	; 0x1
>     1cd8:	e58c300c 	str	r3, [ip, #12]
>     1cdc:	ebfffffe 	bl	0 <skb_kill_datagram>
>     1ce0:	e59b2004 	ldr	r2, [fp, #4]
>     1ce4:	e3520000 	cmp	r2, #0	; 0x0
>     1ce8:	0affff91 	beq	1b34 <udp_recvmsg+0x50>
>     1cec:	e3e0000a 	mvn	r0, #10	; 0xa
>     1cf0:	e24bd028 	sub	sp, fp, #40	; 0x28
>     1cf4:	e89daff0 	ldmia	sp, {r4, r5, r6, r7, r8, r9, sl, fp, sp, pc}
>     1cf8:	00000146 	andeq	r0, r0, r6, asr #2
>     1cfc:	00000000 	andeq	r0, r0, r0
> 
> 
> In the udp_recvmsg() function, the fault occurs in this code:
> 	/* Copy the address. */
> 	if (sin)
> 	{
> 		sin->sin_family = AF_INET;
> 		sin->sin_port = skb->h.uh->source;
> 		sin->sin_addr.s_addr = skb->nh.iph->saddr;  // <- failure accessing memory at saddr
> 		memset(sin->sin_zero, 0, sizeof(sin->sin_zero));
>   	}
> 
> 
> After reviewing the assembly and the source code, it looks like the
> address "c6f9202a" is where it thinks saddr should be.  Ideally, I'd

This address is not word-aligned (it is not a multiple of 4), which seems strange...

Maybe ARM doesn't handle unaligned accesses?


    1c48:	0a00000a 	beq	1c78 <udp_recvmsg+0x194>
    1c4c:	e3a03002 	mov	r3, #2	; 0x2
    1c50:	e1ca30b0 	strh	r3, [sl]
    1c54:	e594301c 	ldr	r3, [r4, #28]       skb->h.uh (udp hdr)  OK
    1c58:	e28a0008 	add	r0, sl, #8	; 0x8
    1c5c:	e1d330b0 	ldrh	r3, [r3]
    1c60:	e3a01008 	mov	r1, #8	; 0x8
    1c64:	e1ca30b2 	strh	r3, [sl, #2]
*    1c68:	e5943020 	ldr	r3, [r4, #32]   skb->nh.iph  (IP header) OK
    1c6c:	e593300c 	ldr	r3, [r3, #12]   but (r3 + 12) is unaligned
    1c70:	e58a3004 	str	r3, [sl, #4]
    1c74:	ebfffffe 	bl	0 <__memzero>
    1c78:	e59f3078 	ldr	r3, [pc, #120]	; 1cf8 <.text+0x1cf8>
    1c7c:	e19630b3 	ldrh	r3, [r6, r3]

What is your NIC driver?

> like to figure out how to solve the problem.  From ifconfig, I'm
> finding a few errors with overruns, so maybe the queue is wrapping
> around and clobbering the sk_buffs.
> 
> eth0      Link encap:Ethernet  HWaddr 00:00:D0:D0:DA:D2
>           inet addr:192.168.17.133  Bcast:192.168.17.255  Mask:255.255.255.0
>           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
>           RX packets:440979642 errors:8 dropped:0 overruns:8 frame:0
>           TX packets:601998 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:2838009823 (2.6 GiB)  TX bytes:155320893 (148.1 MiB)
>           Base address:0xb000
> 
> I'd also be willing to settle for a short term solution of finding a
> way to test whether it's safe to dereference that pointer, and
> skipping that sk_buff if it's bad.

