IPv6/sparc64: icmp port unreachable corruption

netdev.vger.kernel.org archive mirror

* IPv6/sparc64: icmp port unreachable corruption
@ 2003-11-09 12:28 Jan Oravec
  2003-11-09 13:25 ` Jan Oravec
  2003-11-11  5:46 ` David S. Miller
  0 siblings, 2 replies; 12+ messages in thread
From: Jan Oravec @ 2003-11-09 12:28 UTC (permalink / raw)
  To: netdev, davem, yoshfuji

Hello,


I have found the following problem with 2.6.0-test9-bk13 on sparc64:

We do traceroute6 to 3ffe:80ee:3bd:0:a00:20ff:fec7:a192 (IP of that
sparc64). We get the following corrupted answer:

13:17:47.191547 3ffe:80ee:3bd:0:a00:20ff:fec7:a192 > 3ffe:80ee:a:0:201:3ff:fed5:bd1e: [|icmp6] (len 72, hlim 62)
0x0000   6000 0000 0048 3a3e 3ffe 80ee 03bd 0000        ....H:>?.......
0x0010   0a00 20ff fec7 a192 3ffe 80ee 000a 0000        ........?.......
0x0020   0201 03ff fed5 bd1e 0104 aa7c 0000 0000        ...........|....
0x0030   0000 0064 0000 0000 0100 0000 0100 0000        ...d............
0x0040   aaaa aaaa aaaa aaaa 9680 c00b c622 7fec        ............."..
0x0050   aaaa aaaa aaaa aaaa 9680 c00b c622 7ffc        ............."..
0x0060   aaaa aaaa 0000 0000 8a10 2000 04c2 8049        ...............I


When doing exactly the same to an x86 box (running 2.6.0-test7-bk7), we get
the correct answer:

13:17:31.140230 3ffe:80ee:1:0:204:76ff:fe97:d69a > 3ffe:80ee:a:0:201:3ff:fed5:bd1e: icmp6: 3ffe:80ee:1:0:204:76ff:fe97:d69a udp port 33434 unreachable (len 72, hlim 63)
0x0000   6000 0000 0048 3a3f 3ffe 80ee 0001 0000        ....H:??.......
0x0010   0204 76ff fe97 d69a 3ffe 80ee 000a 0000        ..v.....?.......
0x0020   0201 03ff fed5 bd1e 0104 fb79 0000 0000        ...........y....
0x0030   6000 0000 0018 1101 3ffe 80ee 000a 0000        .......?.......
0x0040   0201 03ff fed5 bd1e 3ffe 80ee 0001 0000        ........?.......
0x0050   0204 76ff fe97 d69a 8018 829a 0018 0c82        ..v.............
0x0060   0000 1df3 0000 0005 5b30 ae3f 3512 0200        ........[0.?5...


Jan


* Re: IPv6/sparc64: icmp port unreachable corruption
  2003-11-09 12:28 IPv6/sparc64: icmp port unreachable corruption Jan Oravec
@ 2003-11-09 13:25 ` Jan Oravec
  2003-11-09 13:39   ` Jan Oravec
  2003-11-11  5:46 ` David S. Miller
  1 sibling, 1 reply; 12+ messages in thread
From: Jan Oravec @ 2003-11-09 13:25 UTC (permalink / raw)
  To: netdev, davem, yoshfuji

This may be related to the problem (on sparc64):

# traceroute6 3ffe:80ee:3bd:0:a00:20ff:fec7:a192
traceroute to 3ffe:80ee:3bd:0:a00:20ff:fec7:a192 (3ffe:80ee:3bd:0:a00:20ff:fec7:a192) from ::1, 30 hops max, 24 byte packets
Bus error

# traceroute6 3ffe:80ee:3bd:0:a00:20ff:fec7:a192 -s 3ffe:80ee:3bd:0:a00:20ff:fec7:a192
traceroute to 3ffe:80ee:3bd:0:a00:20ff:fec7:a192 (3ffe:80ee:3bd:0:a00:20ff:fec7:a192) from 3ffe:80ee:3bd:0:a00:20ff:fec7:a192, 30 hops max, 24 byte packets
Bus error

# traceroute6 www.kame.net
traceroute to orange.kame.net (2001:200:0:8002:203:47ff:fea5:3085) from 3ffe:80ee:3bd:0:a00:20ff:fec7:a192, 30 hops max, 24 byte packets
 1  skbra-00-01.pop.xs26.net (3ffe:80ee:3bd:0:a00:20ff:fec9:3aad)  0.953 ms 0.305 ms  0.341 ms
...

The following line appears in dmesg:
raw v6 hw csum failure.

All of this worked fine in 2.4.22-pre6.


IPv4 traceroute fails on both 2.4 and 2.6, but that is probably due to a
buggy 64-bit traceroute binary, because it worked fine with a 32-bit
userspace:

# traceroute www.google.com
traceroute to www.google.akadns.net (216.239.57.99), 30 hops max, 52 byte packets
Bus error





* Re: IPv6/sparc64: icmp port unreachable corruption
  2003-11-09 13:25 ` Jan Oravec
@ 2003-11-09 13:39   ` Jan Oravec
  2003-11-09 14:37     ` Jan Oravec
  0 siblings, 1 reply; 12+ messages in thread
From: Jan Oravec @ 2003-11-09 13:39 UTC (permalink / raw)
  To: netdev, davem, yoshfuji

Another observation: on 2.6.0-test9-bk4 on an Opteron (x86_64), when I run:

# traceroute6 ::1

the kernel crashes.

I will have the kernel oops output tomorrow (the box is in the office).





* Re: IPv6/sparc64: icmp port unreachable corruption
  2003-11-09 13:39   ` Jan Oravec
@ 2003-11-09 14:37     ` Jan Oravec
  0 siblings, 0 replies; 12+ messages in thread
From: Jan Oravec @ 2003-11-09 14:37 UTC (permalink / raw)
  To: netdev, davem, yoshfuji

A colleague of mine has an Opteron at home; he tried traceroute6 ::1 on
2.6.0-test9-bk4. Here is the kernel output:

RDX: 0000000000000048 RSI: 000001001ec06048 RDI: 000001011ec06218
RBP: 0000000000000048 R08: 0000000000000000 R09: 0000000000000000
R10: 000001001ea6d1c0 R11: 00000000000000dc R12: 0000000000000001
R13: 0000000000000000 R14: 000001001f95f740 R15: 000001001ec06048
FS:  0000002a958d2060(0000) GS:ffffffff804f4500(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000000101000 CR4: 00000000000006a0
Process traceroute6 (pid: 787, stackpage=1001eaf04e0)
Stack: 0000000000000000 0000000000000000 0000000000000048 0000000000000048
       000001001f95f740 0000000000000000 0000000000000048 ffffffff802e710f
       0000000000000246 ffffffff8021f8d0
Call Trace:<ffffffff802e710f>{csum_partial_copy_nocheck+15} <ffffffff8021f8d0>{skb_copy_and_csum_bits+96}
       <ffffffff802bc043>{icmpv6_getfrag+35} <ffffffff802a91e6>{ip6_append_data+1158}
       <ffffffff802bc020>{icmpv6_getfrag+0} <ffffffff802bc4a4>{icmpv6_send+1044}
       <ffffffff802b8ca7>{updv6_rcv+647} <ffffffff802aa34d>{ip6_input_finish+429}
       <ffffffff802aa1a0>{ip6_input_finish+0} <ffffffff8022b403>{nf_hook_slow+227}
       <ffffffff802aa1a0>{ip6_input_finish+0} <ffffffff802aa160>{ip6_rcv_finish+0} 
       <ffffffff802aa0e6>{ip6_input+662} <ffffffff8022b01e>{nf_interate+94}
       <ffffffff802aa160>{ip6_rcv_finish+0} <ffffffff802aa17f>{ip6_rcv_finish+31}
       <ffffffff8022b403>{nf_hook_slow+227} <ffffffff802aa160>{ip6_rcv_finish+0}
       <ffffffff802a9df7>{ipv6_rcv+503} <ffffffff8022290a>{netif_receive_skb+394}
       <ffffffff802229ca>{process_backlog+138} <ffffffff80222acb>{net_rx_action+123}
       <ffffffff80130a1b>{do_softirq+123} <ffffffff80222442>{dev_queue_xmit+354}
       <ffffffff802281d2>{neigh_resolve_output+322} <ffffffff802a9a80>{ip6_output_finish+0}
       <ffffffff802a9b23>{ip6_output_finish+163} <ffffffff802a9a80>{ip6_output_finish+0}
       <ffffffff8022b403>{nf_hook_slow+227} <ffffffff802a9a80>{ip6_output_finish+0}
       <ffffffff802a9a50>{dst_output+0} <ffffffff802a69dc>{ip6_output2+540}
       <ffffffff802a9a61>{dst_output+17} <ffffffff8022b403>{nf_hook_slow+227}
       <ffffffff802a9a50>{dst_output+0} <ffffffff802a98a0>{ip6_push_pending_frames+784} 
       <ffffffff802a91e6>{ip6_append_data+1158} <ffffffff802b906f>{udp_v6_push_pending_frames+319}
       <ffffffff802b97d5>{udpv6_sendmsg+1861} <ffffffff80273154>{inet_sendmsg+84}
       <ffffffff8021adad>{sock_sendmsg+125} <ffffffff80141e4d>{find_get_page+13}
       <ffffffff80142ccd>{filemap_nopage+269} <ffffffff8014eead>{do_no_page+813}
       <ffffffff8021abd0>{sockfd_lookup+32} <ffffffff8021a897>{move_addr_to_kernel+39}
       <ffffffff8021bf99>{sys_sendto+233} <ffffffff80272312>{inet_setsockopt+18}
       <ffffffff8021c203>{sys_setsockopt+147} <ffffffff8010eb60>{system_call+124}

Code: c7 00 f2 ff ff ff eb d6 48 8b 44 24 08 c7 00 f2 ff ff ff eb
RIP <ffffffff802e72cd>{csum_partial_copy_generic+349} RSP <000001001e6354c8>
CR2: 0000000000000000
 <0>Kernel panic: Aiee, killing interrupt handler!
In interrupt handler - not syncing




* Re: IPv6/sparc64: icmp port unreachable corruption
  2003-11-09 12:28 IPv6/sparc64: icmp port unreachable corruption Jan Oravec
  2003-11-09 13:25 ` Jan Oravec
@ 2003-11-11  5:46 ` David S. Miller
  2003-11-11  7:06   ` YOSHIFUJI Hideaki / 吉藤英明
  2003-11-11 22:26   ` Jan Oravec
  1 sibling, 2 replies; 12+ messages in thread
From: David S. Miller @ 2003-11-11  5:46 UTC (permalink / raw)
  To: Jan Oravec; +Cc: netdev, yoshfuji

On Sun, 9 Nov 2003 13:28:44 +0100
Jan Oravec <jan.oravec@6com.sk> wrote:

> We do traceroute6 to 3ffe:80ee:3bd:0:a00:20ff:fec7:a192 (IP of that
> sparc64). We get the following corrupted answer:
> 
> 13:17:47.191547 3ffe:80ee:3bd:0:a00:20ff:fec7:a192 > 3ffe:80ee:a:0:201:3ff:fed5:bd1e: [|icmp6] (len 72, hlim 62)
> 0x0000   6000 0000 0048 3a3e 3ffe 80ee 03bd 0000        ....H:>?.......
> 0x0010   0a00 20ff fec7 a192 3ffe 80ee 000a 0000        ........?.......
> 0x0020   0201 03ff fed5 bd1e 0104 aa7c 0000 0000        ...........|....
> 0x0030   0000 0064 0000 0000 0100 0000 0100 0000        ...d............
> 0x0040   aaaa aaaa aaaa aaaa 9680 c00b c622 7fec        ............."..
> 0x0050   aaaa aaaa aaaa aaaa 9680 c00b c622 7ffc        ............."..
> 0x0060   aaaa aaaa 0000 0000 8a10 2000 04c2 8049        ...............I

What specifically about this packet makes you think it is corrupted?

Let's look at the ICMP header you say is "correct" from the x86 box:

> 0104 fb79 0000 0000

type = ICMPV6_DEST_UNREACH
code = ICMPV6_PORT_UNREACH

In the sparc64 generated packet these two values are identical:

> 0104 aa7c 0000 0000

So why does tcpdump not say that this is "udp port XXX unreachable",
as it does for the x86-generated packet?

Incorrect checksum or corrupted payload after the icmp6 header?
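
The header comparison above can be reproduced mechanically. The following
is a minimal sketch, assuming only the standard ICMPv6 header layout
(one byte of type, one byte of code, a 16-bit big-endian checksum). The
struct and function names are illustrative and are not kernel identifiers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative summary of the first four bytes of an ICMPv6 header.
 * Hypothetical type, not the kernel's struct icmp6hdr. */
struct icmp6_summary {
	uint8_t  type;      /* 1 = Destination Unreachable */
	uint8_t  code;      /* 4 = Port Unreachable */
	uint16_t checksum;  /* host-order value of the big-endian field */
};

/* Parse type, code and checksum from the raw bytes shown in the dump. */
static struct icmp6_summary icmp6_parse(const uint8_t *p, size_t n)
{
	struct icmp6_summary s;

	assert(n >= 4);
	s.type = p[0];
	s.code = p[1];
	s.checksum = (uint16_t)((p[2] << 8) | p[3]);
	return s;
}

/* The two 8-byte ICMPv6 headers quoted in this message. */
static const uint8_t hdr_x86[8]     = { 0x01, 0x04, 0xfb, 0x79, 0, 0, 0, 0 };
static const uint8_t hdr_sparc64[8] = { 0x01, 0x04, 0xaa, 0x7c, 0, 0, 0, 0 };
```

Both headers parse to type 1, code 4 and differ only in the checksum
field, which is why the remaining suspects are the checksum itself and
the payload that follows the header.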

What compiler are you using to build 2.6.x kernels btw?  We could
be looking at a miscompile here.

The bus error you reported from running traceroute6 on the sparc64
system is not that useful. Can you use gdb or some other tool to
figure out where inside of traceroute6 the bus error is occurring?  Is it
happening in the traceroute6 program itself?  Is it due to a failed system
call?

Thanks.


* Re: IPv6/sparc64: icmp port unreachable corruption
  2003-11-11  5:46 ` David S. Miller
@ 2003-11-11  7:06   ` YOSHIFUJI Hideaki / 吉藤英明
  2003-11-11 22:26   ` Jan Oravec
  1 sibling, 0 replies; 12+ messages in thread
From: YOSHIFUJI Hideaki / 吉藤英明 @ 2003-11-11  7:06 UTC (permalink / raw)
  To: davem; +Cc: jan.oravec, netdev

In article <20031110214603.0057e365.davem@redhat.com> (at Mon, 10 Nov 2003 21:46:03 -0800), "David S. Miller" <davem@redhat.com> says:

> > 13:17:47.191547 3ffe:80ee:3bd:0:a00:20ff:fec7:a192 > 3ffe:80ee:a:0:201:3ff:fed5:bd1e: [|icmp6] (len 72, hlim 62)
> > 0x0000   6000 0000 0048 3a3e 3ffe 80ee 03bd 0000        ....H:>?.......
> > 0x0010   0a00 20ff fec7 a192 3ffe 80ee 000a 0000        ........?.......
> > 0x0020   0201 03ff fed5 bd1e 0104 aa7c 0000 0000        ...........|....
> > 0x0030   0000 0064 0000 0000 0100 0000 0100 0000        ...d............
> > 0x0040   aaaa aaaa aaaa aaaa 9680 c00b c622 7fec        ............."..
> > 0x0050   aaaa aaaa aaaa aaaa 9680 c00b c622 7ffc        ............."..
> > 0x0060   aaaa aaaa 0000 0000 8a10 2000 04c2 8049        ...............I
> 
> What specifically about this packet makes you think it is corrupted?
:
> So why does tcpdump not say that this is "udp port XXX unreachable"
> like it does for the x86 generated packet.
> 
> Incorrect checksum or corrupted payload after the icmp6 header?

The data from offset 0x0030 onward should be a copy of the original
packet. It is corrupted.
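
One way to see this from the dumps is to test whether the bytes at 0x0030
look like the start of an IPv6 header carrying UDP. This is a rough sketch
that relies only on the fixed IPv6 header layout (version in the top
nibble of byte 0, next header in byte 6); the function name and the two
arrays are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NEXT_HEADER_UDP 17  /* IANA protocol number for UDP */

/* Return true if buf plausibly begins with an IPv6 header whose
 * next-header field says UDP. Only the first 8 bytes are inspected. */
static bool looks_like_ipv6_udp(const uint8_t *buf, size_t n)
{
	assert(buf != NULL);
	if (n < 8)
		return false;
	if ((buf[0] >> 4) != 6)          /* version nibble must be 6 */
		return false;
	return buf[6] == NEXT_HEADER_UDP;
}

/* First 8 bytes at offset 0x0030 of each capture in this thread. */
static const uint8_t embedded_x86[8] =
	{ 0x60, 0x00, 0x00, 0x00, 0x00, 0x18, 0x11, 0x01 };
static const uint8_t embedded_sparc64[8] =
	{ 0x00, 0x00, 0x00, 0x64, 0x00, 0x00, 0x00, 0x00 };
```

The x86 bytes pass this check and the sparc64 bytes do not, which matches
tcpdump decoding one reply and printing [|icmp6] for the other.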

-- 
Hideaki YOSHIFUJI @ USAGI Project <yoshfuji@linux-ipv6.org>
GPG FP: 9022 65EB 1ECF 3AD1 0BDF  80D8 4807 F894 E062 0EEA


* Re: IPv6/sparc64: icmp port unreachable corruption
  2003-11-11  5:46 ` David S. Miller
  2003-11-11  7:06   ` YOSHIFUJI Hideaki / 吉藤英明
@ 2003-11-11 22:26   ` Jan Oravec
  2003-11-11 23:13     ` David S. Miller
  1 sibling, 1 reply; 12+ messages in thread
From: Jan Oravec @ 2003-11-11 22:26 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, yoshfuji

> > We do traceroute6 to 3ffe:80ee:3bd:0:a00:20ff:fec7:a192 (IP of that
> > sparc64). We get the following corrupted answer:
> > 
> > 13:17:47.191547 3ffe:80ee:3bd:0:a00:20ff:fec7:a192 > 3ffe:80ee:a:0:201:3ff:fed5:bd1e: [|icmp6] (len 72, hlim 62)
> > 0x0000   6000 0000 0048 3a3e 3ffe 80ee 03bd 0000        ....H:>?.......
> > 0x0010   0a00 20ff fec7 a192 3ffe 80ee 000a 0000        ........?.......
> > 0x0020   0201 03ff fed5 bd1e 0104 aa7c 0000 0000        ...........|....
> > 0x0030   0000 0064 0000 0000 0100 0000 0100 0000        ...d............
> > 0x0040   aaaa aaaa aaaa aaaa 9680 c00b c622 7fec        ............."..
> > 0x0050   aaaa aaaa aaaa aaaa 9680 c00b c622 7ffc        ............."..
> > 0x0060   aaaa aaaa 0000 0000 8a10 2000 04c2 8049        ...............I
> 
> What specifically about this packet makes you think it is corrupted?

The ICMP reply should contain the original packet.


> What compiler are you using to build 2.6.x kernels btw?  We could
> be looking at a miscompile here.

3.3.2


> The bus error you reported from running traceroute6 on the sparc64
> system is not that useful. Can you use gdb or some other tool to
> figure out where inside of traceroute6 the bus error is occurring?  Is it
> happening in the traceroute6 program itself?  Is it due to a failed system
> call?

I am running 64-bit-only userspace and there is no gdb/strace for sparc64
yet :(.


But I think I have found the problem:

icmpv6_send() can get an skb where skb->nh.raw < skb->data, so the computed
plen (see icmp.c:382) is negative. When passed as an unsigned int to
__skb_pull, it wraps around and is interpreted as 0x100000000 minus a small
number. __skb_pull then increases skb->data by that amount; because skb->data
is 64-bit while plen is 32-bit, we get a pointer that is 0x100000000 higher
than intended. On a 32-bit platform this causes no trouble because the
pointer arithmetic wraps around again.
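
The wraparound described above can be modelled in userspace. This is a
sketch, not kernel code: addresses are plain integers, the function names
are made up for illustration, and the value -40 assumes a 40-byte IPv6
header sitting in front of skb->data.

```c
#include <assert.h>
#include <stdint.h>

/* Model of "skb->data += len" where len is an unsigned 32-bit value and
 * the pointer is 64 bits wide. */
static uint64_t pull_ptr64(uint64_t data, int plen)
{
	uint32_t len = (uint32_t)plen;  /* negative plen becomes 2^32 - |plen| */

	return data + len;              /* zero-extended, not sign-extended */
}

/* Same operation with a 32-bit pointer: the addition wraps modulo 2^32,
 * so the result is what a signed addition would have produced. */
static uint32_t pull_ptr32(uint32_t data, int plen)
{
	uint32_t len = (uint32_t)plen;

	return data + len;
}

/* Worked example with plen = -40. */
static void wrap_demo(void)
{
	uint64_t d64 = 0x0000010000001000ULL;
	uint32_t d32 = 0x10001000u;

	/* 64-bit: lands 4 GiB above the intended address. */
	assert(pull_ptr64(d64, -40) == d64 - 40 + 0x100000000ULL);
	/* 32-bit: lands exactly on the intended address. */
	assert(pull_ptr32(d32, -40) == d32 - 40);
}
```

The 64-bit result is off by exactly 0x100000000, while the 32-bit result
is correct, which is consistent with the bug showing up on sparc64 and
x86_64 but not on 32-bit x86.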

I do not know whether icmpv6_send() is meant to receive only skbs whose
->data has not been pulled past nh.raw. If so, I suggest the following patch
(against test9-bk16):

--- linux/net/ipv6/udp.c.orig	2003-11-11 23:04:08.393138608 +0100
+++ linux/net/ipv6/udp.c	2003-11-11 23:07:20.964089789 +0100
@@ -677,6 +677,7 @@
 			goto discard;
 		UDP6_INC_STATS_BH(UdpNoPorts);
 
+		__skb_push(skb, skb->data - skb->nh.raw);
 		icmpv6_send(skb, ICMPV6_DEST_UNREACH, ICMPV6_PORT_UNREACH, 0, dev);
 
 		kfree_skb(skb);



I looked at the other icmpv6_send() calls; they seem to be OK (not 100%
sure).

Alternatively, if we want to make icmpv6_send() work with any ->data, we
could use this patch:

--- linux/net/ipv6/icmp.c.orig	2003-10-25 20:43:17.000000000 +0200
+++ linux/net/ipv6/icmp.c	2003-11-11 23:23:09.661409756 +0100
@@ -380,7 +380,11 @@
 	}
 
 	plen = skb->nh.raw - skb->data;
-	__skb_pull(skb, plen);
+	if (plen < 0)
+		__skb_push(skb, -plen);
+	else
+		__skb_pull(skb, plen);
+
 	len = skb->len;
 	len = min_t(unsigned int, len, IPV6_MIN_MTU - sizeof(struct ipv6hdr) -sizeof(struct icmp6hdr));
 	if (len < 0) {
@@ -399,7 +403,10 @@
 		goto out_put;
 	}
 	err = icmpv6_push_pending_frames(sk, &fl, &tmp_hdr, len + sizeof(struct icmp6hdr));
-	__skb_push(skb, plen);
+	if (plen < 0)
+		__skb_pull(skb, -plen);
+	else
+		__skb_push(skb, plen);
 
 	if (type >= ICMPV6_DEST_UNREACH && type <= ICMPV6_PARAMPROB)
 		ICMP6_INC_STATS_OFFSET_BH(idev, Icmp6OutDestUnreachs, type - ICMPV6_DEST_UNREACH);


Jan


* Re: IPv6/sparc64: icmp port unreachable corruption
  2003-11-11 22:26   ` Jan Oravec
@ 2003-11-11 23:13     ` David S. Miller
  2003-11-12  0:41       ` Jan Oravec
  2003-11-12  9:26       ` David S. Miller
  0 siblings, 2 replies; 12+ messages in thread
From: David S. Miller @ 2003-11-11 23:13 UTC (permalink / raw)
  To: Jan Oravec; +Cc: netdev, yoshfuji

On Tue, 11 Nov 2003 23:26:11 +0100
Jan Oravec <jan.oravec@6com.sk> wrote:

> > The bus error you reported from running traceroute6 on the sparc64
> > system is not that useful. Can you use gdb or some other tool to
> > figure out where inside of traceroute6 the bus error is occurring?  Is it
> > happening in the traceroute6 program itself?  Is it due to a failed system
> > call?
> 
> I am running 64-bit-only userspace and there is no gdb/strace for sparc64
> yet :(.

Yes there is a gdb, here is a prebuilt 64-bit gdb for you.  It is even
statically linked so there are no shared library dependencies.  It can
debug 32-bit processes as well:

	ftp://pizda.ninka.net/pub/for_jakub/gdb64

Enjoy.

> But I think I have found the problem:
> 
> icmpv6_send() can get an skb where skb->nh.raw < skb->data, so the computed
> plen (see icmp.c:382) is negative. When passed as an unsigned int to
> __skb_pull, it wraps around and is interpreted as 0x100000000 minus a small
> number. __skb_pull then increases skb->data by that amount; because skb->data
> is 64-bit while plen is 32-bit, we get a pointer that is 0x100000000 higher
> than intended. On a 32-bit platform this causes no trouble because the
> pointer arithmetic wraps around again.
> 
> I do not know whether icmpv6_send() is meant to receive only skbs whose
> ->data has not been pulled past nh.raw. If so, I suggest the following patch
> (against test9-bk16):

Great analysis, thanks a lot.

I will look at your patch proposals.


* Re: IPv6/sparc64: icmp port unreachable corruption
  2003-11-11 23:13     ` David S. Miller
@ 2003-11-12  0:41       ` Jan Oravec
  2003-11-12  9:26       ` David S. Miller
  1 sibling, 0 replies; 12+ messages in thread
From: Jan Oravec @ 2003-11-12  0:41 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, yoshfuji

> Yes there is a gdb, here is a prebuilt 64-bit gdb for you.  It is even
> statically linked so there are no shared library dependencies.  It can
> debug 32-bit processes as well:
> 
> 	ftp://pizda.ninka.net/pub/for_jakub/gdb64

Thanks a lot!

That is exactly what I was looking for, now I can debug IPv4 traceroute :-).


Jan


* Re: IPv6/sparc64: icmp port unreachable corruption
  2003-11-11 23:13     ` David S. Miller
  2003-11-12  0:41       ` Jan Oravec
@ 2003-11-12  9:26       ` David S. Miller
  2003-11-12 15:14         ` Jan Oravec
  1 sibling, 1 reply; 12+ messages in thread
From: David S. Miller @ 2003-11-12  9:26 UTC (permalink / raw)
  To: David S. Miller; +Cc: jan.oravec, netdev, yoshfuji

On Tue, 11 Nov 2003 15:13:40 -0800
"David S. Miller" <davem@redhat.com> wrote:

> I will look at your patch proposals.

All of the __skb_{push,pull}() modifications made by icmpv6_send()
are illegal.  The SKB could be cloned and under inspection by other
entities in the networking stack, so moving the pointers around
could cause problems.

Therefore what we do instead is propagate the:

	skb->nh.raw - skb->data

offset into the skb_copy_and_csum_bits() calls.  This is what
the pre-IPSEC version of the icmpv6 code did.

When you pass a negative offset into skb_copy_and_csum_bits(),
it starts that many bytes before skb->data.
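
The idea of reading at a signed offset without touching the buffer's
pointers can be sketched in userspace as follows. struct toy_skb and
toy_copy_bits() are hypothetical stand-ins, not the kernel's sk_buff or
skb_copy_and_csum_bits(); the checksum part is left out and only the
offset handling is modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for an sk_buff: head is the start of the buffer and
 * data points somewhere inside it. */
struct toy_skb {
	const unsigned char *head;
	const unsigned char *data;
	size_t len;                 /* bytes available from data onward */
};

/* Copy len bytes starting at data + offset. A negative offset reaches
 * back into the headroom in front of data. The toy_skb is const, so a
 * clone sharing it would see no pointer movement.
 * Returns 0 on success, -1 if the range falls outside the buffer. */
static int toy_copy_bits(const struct toy_skb *skb, ptrdiff_t offset,
			 unsigned char *to, size_t len)
{
	ptrdiff_t headroom = skb->data - skb->head;

	assert(to != NULL);
	if (offset < -headroom)
		return -1;      /* would start before the buffer */
	if ((ptrdiff_t)len > (ptrdiff_t)skb->len - offset)
		return -1;      /* would run past the end of the data */
	memcpy(to, skb->data + offset, len);
	return 0;
}
```

With a 4-byte header in front of 4 bytes of data, calling
toy_copy_bits(&skb, -4, out, 8) copies header and data together while
skb.data stays where it was.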

Jan, can you give this patch a try with your setup?
Thanks a lot.

--- net/ipv6/icmp.c.~1~	Wed Nov 12 01:04:01 2003
+++ net/ipv6/icmp.c	Wed Nov 12 01:26:16 2003
@@ -86,15 +86,6 @@
 	.flags		=	INET6_PROTO_FINAL,
 };
 
-struct icmpv6_msg {
-	struct icmp6hdr		icmph;
-	struct sk_buff		*skb;
-	int			offset;
-	struct in6_addr		*daddr;
-	int			len;
-	__u32			csum;
-};
-
 static __inline__ int icmpv6_xmit_lock(void)
 {
 	local_bh_disable();
@@ -258,11 +249,19 @@
 	return err;
 }
 
+struct icmpv6_msg {
+	struct sk_buff	*skb;
+	int		offset;
+};
+
 static int icmpv6_getfrag(void *from, char *to, int offset, int len, int odd, struct sk_buff *skb)
 {
-	struct sk_buff *org_skb = (struct sk_buff *)from;
+	struct icmpv6_msg *msg = (struct icmpv6_msg *) from;
+	struct sk_buff *org_skb = msg->skb;
 	__u32 csum = 0;
-	csum = skb_copy_and_csum_bits(org_skb, offset, to, len, csum);
+
+	csum = skb_copy_and_csum_bits(org_skb, msg->offset + offset,
+				      to, len, csum);
 	skb->csum = csum_block_add(skb->csum, csum, odd);
 	return 0;
 }
@@ -281,9 +280,10 @@
 	struct dst_entry *dst;
 	struct icmp6hdr tmp_hdr;
 	struct flowi fl;
+	struct icmpv6_msg msg;
 	int iif = 0;
 	int addr_type = 0;
-	int len, plen;
+	int len;
 	int hlimit = -1;
 	int err = 0;
 
@@ -379,27 +379,29 @@
 			hlimit = dst_metric(dst, RTAX_HOPLIMIT);
 	}
 
-	plen = skb->nh.raw - skb->data;
-	__skb_pull(skb, plen);
+	msg.skb = skb;
+	msg.offset = skb->nh.raw - skb->data;
+
 	len = skb->len;
 	len = min_t(unsigned int, len, IPV6_MIN_MTU - sizeof(struct ipv6hdr) -sizeof(struct icmp6hdr));
 	if (len < 0) {
 		if (net_ratelimit())
 			printk(KERN_DEBUG "icmp: len problem\n");
-		__skb_push(skb, plen);
 		goto out_dst_release;
 	}
 
 	idev = in6_dev_get(skb->dev);
 
-	err = ip6_append_data(sk, icmpv6_getfrag, skb, len + sizeof(struct icmp6hdr), sizeof(struct icmp6hdr),
-				hlimit, NULL, &fl, (struct rt6_info*)dst, MSG_DONTWAIT);
+	err = ip6_append_data(sk, icmpv6_getfrag, &msg,
+			      len + sizeof(struct icmp6hdr),
+			      sizeof(struct icmp6hdr),
+			      hlimit, NULL, &fl, (struct rt6_info*)dst,
+			      MSG_DONTWAIT);
 	if (err) {
 		ip6_flush_pending_frames(sk);
 		goto out_put;
 	}
 	err = icmpv6_push_pending_frames(sk, &fl, &tmp_hdr, len + sizeof(struct icmp6hdr));
-	__skb_push(skb, plen);
 
 	if (type >= ICMPV6_DEST_UNREACH && type <= ICMPV6_PARAMPROB)
 		ICMP6_INC_STATS_OFFSET_BH(idev, Icmp6OutDestUnreachs, type - ICMPV6_DEST_UNREACH);
@@ -423,6 +425,7 @@
 	struct icmp6hdr *icmph = (struct icmp6hdr *) skb->h.raw;
 	struct icmp6hdr tmp_hdr;
 	struct flowi fl;
+	struct icmpv6_msg msg;
 	struct dst_entry *dst;
 	int err = 0;
 	int hlimit = -1;
@@ -464,7 +467,10 @@
 
 	idev = in6_dev_get(skb->dev);
 
-	err = ip6_append_data(sk, icmpv6_getfrag, skb, skb->len + sizeof(struct icmp6hdr),
+	msg.skb = skb;
+	msg.offset = 0;
+
+	err = ip6_append_data(sk, icmpv6_getfrag, &msg, skb->len + sizeof(struct icmp6hdr),
 				sizeof(struct icmp6hdr), hlimit, NULL, &fl,
 				(struct rt6_info*)dst, MSG_DONTWAIT);
 


* Re: IPv6/sparc64: icmp port unreachable corruption
  2003-11-12  9:26       ` David S. Miller
@ 2003-11-12 15:14         ` Jan Oravec
  2003-11-12 22:40           ` David S. Miller
  0 siblings, 1 reply; 12+ messages in thread
From: Jan Oravec @ 2003-11-12 15:14 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, yoshfuji

> All of the __skb_{push,pull}() modifications made by icmpv6_send()
> are illegal.  The SKB could be cloned and under inspection by other
> entities in the networking stack, so moving the pointers around
> could cause problems.
> 
> Therefore what we do instead is propagate the:
> 
> 	skb->nh.raw - skb->data
> 
> offset into the skb_copy_and_csum_bits() calls.  This is what
> the pre-IPSEC version of the icmpv6 code did.
> 
> When you pass a negative offset into skb_copy_and_csum_bits(),
> it starts that many bytes before skb->data.

OK, I like your patch more.

You forgot to decrease 'len' by msg.offset here:


> -	plen = skb->nh.raw - skb->data;
> -	__skb_pull(skb, plen);
> +	msg.skb = skb;
> +	msg.offset = skb->nh.raw - skb->data;
> +
>  	len = skb->len;


I've fixed that and tested, here is a working patch:

--- linux/net/ipv6/icmp.c.orig	2003-11-12 16:02:23.000000000 +0100
+++ linux/net/ipv6/icmp.c	2003-11-12 16:03:59.000000000 +0100
@@ -86,15 +86,6 @@
 	.flags		=	INET6_PROTO_FINAL,
 };
 
-struct icmpv6_msg {
-	struct icmp6hdr		icmph;
-	struct sk_buff		*skb;
-	int			offset;
-	struct in6_addr		*daddr;
-	int			len;
-	__u32			csum;
-};
-
 static __inline__ int icmpv6_xmit_lock(void)
 {
 	local_bh_disable();
@@ -258,11 +249,19 @@
 	return err;
 }
 
+struct icmpv6_msg {
+	struct sk_buff	*skb;
+	int		offset;
+};
+
 static int icmpv6_getfrag(void *from, char *to, int offset, int len, int odd, struct sk_buff *skb)
 {
-	struct sk_buff *org_skb = (struct sk_buff *)from;
+	struct icmpv6_msg *msg = (struct icmpv6_msg *) from;
+	struct sk_buff *org_skb = msg->skb;
 	__u32 csum = 0;
-	csum = skb_copy_and_csum_bits(org_skb, offset, to, len, csum);
+
+	csum = skb_copy_and_csum_bits(org_skb, msg->offset + offset,
+				      to, len, csum);
 	skb->csum = csum_block_add(skb->csum, csum, odd);
 	return 0;
 }
@@ -281,9 +280,10 @@
 	struct dst_entry *dst;
 	struct icmp6hdr tmp_hdr;
 	struct flowi fl;
+	struct icmpv6_msg msg;
 	int iif = 0;
 	int addr_type = 0;
-	int len, plen;
+	int len;
 	int hlimit = -1;
 	int err = 0;
 
@@ -379,27 +379,29 @@
 			hlimit = dst_metric(dst, RTAX_HOPLIMIT);
 	}
 
-	plen = skb->nh.raw - skb->data;
-	__skb_pull(skb, plen);
-	len = skb->len;
+	msg.skb = skb;
+	msg.offset = skb->nh.raw - skb->data;
+
+	len = skb->len - msg.offset;
 	len = min_t(unsigned int, len, IPV6_MIN_MTU - sizeof(struct ipv6hdr) -sizeof(struct icmp6hdr));
 	if (len < 0) {
 		if (net_ratelimit())
 			printk(KERN_DEBUG "icmp: len problem\n");
-		__skb_push(skb, plen);
 		goto out_dst_release;
 	}
 
 	idev = in6_dev_get(skb->dev);
 
-	err = ip6_append_data(sk, icmpv6_getfrag, skb, len + sizeof(struct icmp6hdr), sizeof(struct icmp6hdr),
-				hlimit, NULL, &fl, (struct rt6_info*)dst, MSG_DONTWAIT);
+	err = ip6_append_data(sk, icmpv6_getfrag, &msg,
+			      len + sizeof(struct icmp6hdr),
+			      sizeof(struct icmp6hdr),
+			      hlimit, NULL, &fl, (struct rt6_info*)dst,
+			      MSG_DONTWAIT);
 	if (err) {
 		ip6_flush_pending_frames(sk);
 		goto out_put;
 	}
 	err = icmpv6_push_pending_frames(sk, &fl, &tmp_hdr, len + sizeof(struct icmp6hdr));
-	__skb_push(skb, plen);
 
 	if (type >= ICMPV6_DEST_UNREACH && type <= ICMPV6_PARAMPROB)
 		ICMP6_INC_STATS_OFFSET_BH(idev, Icmp6OutDestUnreachs, type - ICMPV6_DEST_UNREACH);
@@ -423,6 +425,7 @@
 	struct icmp6hdr *icmph = (struct icmp6hdr *) skb->h.raw;
 	struct icmp6hdr tmp_hdr;
 	struct flowi fl;
+	struct icmpv6_msg msg;
 	struct dst_entry *dst;
 	int err = 0;
 	int hlimit = -1;
@@ -464,7 +467,10 @@
 
 	idev = in6_dev_get(skb->dev);
 
-	err = ip6_append_data(sk, icmpv6_getfrag, skb, skb->len + sizeof(struct icmp6hdr),
+	msg.skb = skb;
+	msg.offset = 0;
+
+	err = ip6_append_data(sk, icmpv6_getfrag, &msg, skb->len + sizeof(struct icmp6hdr),
 				sizeof(struct icmp6hdr), hlimit, NULL, &fl,
 				(struct rt6_info*)dst, MSG_DONTWAIT);
 


Jan


* Re: IPv6/sparc64: icmp port unreachable corruption
  2003-11-12 15:14         ` Jan Oravec
@ 2003-11-12 22:40           ` David S. Miller
  0 siblings, 0 replies; 12+ messages in thread
From: David S. Miller @ 2003-11-12 22:40 UTC (permalink / raw)
  To: Jan Oravec; +Cc: netdev, yoshfuji

On Wed, 12 Nov 2003 16:14:44 +0100
Jan Oravec <jan.oravec@6com.sk> wrote:

> You forgot to decrease 'len' by msg.offset here:

Indeed, thanks a lot for all of your help Jan.
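
The length correction can be checked against the numbers in the x86
capture earlier in this thread: the quoted IPv6 header has a payload
length of 0x0018 (24 bytes) and the outer packet a payload length of
0x0048 (72 bytes). The sketch below assumes that skb->data points at the
UDP header when icmpv6_send() is called, so that the offset is -40. The
function and macro names are illustrative, not kernel identifiers.

```c
#include <assert.h>

#define TOY_IPV6_MIN_MTU  1280  /* minimum IPv6 link MTU */
#define TOY_IPV6_HDR_LEN  40    /* fixed IPv6 header size */
#define TOY_ICMP6_HDR_LEN 8     /* ICMPv6 error header size */

/* Number of bytes of the offending packet to quote in the ICMPv6 error.
 * skb_len counts bytes from skb->data onward; offset stands for
 * skb->nh.raw - skb->data and is negative once the IPv6 header has been
 * pulled. Returns -1 if the resulting length would be negative. */
static int icmp6_quoted_len(int skb_len, int offset)
{
	int max = TOY_IPV6_MIN_MTU - TOY_IPV6_HDR_LEN - TOY_ICMP6_HDR_LEN;
	int len;

	assert(skb_len >= 0);
	len = skb_len - offset;         /* subtracting a negative adds back */
	if (len < 0)
		return -1;
	return len < max ? len : max;
}
```

With skb_len = 24 and offset = -40 this gives 64 quoted bytes, and
64 + 8 bytes of ICMPv6 header is the 72-byte payload seen in the capture.
Without the subtraction only 24 bytes would be quoted.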

