netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] don't assume the existence of skb->dev when trying to reset ip_options in ipv4_send_dest_unreach
@ 2023-09-07  2:43 Kyle Zeng
  2023-09-07 11:03 ` Paolo Abeni
  2023-09-07 14:39 ` David Ahern
  0 siblings, 2 replies; 8+ messages in thread
From: Kyle Zeng @ 2023-09-07  2:43 UTC (permalink / raw)
  To: davem, dsahern, netdev; +Cc: ssuryaextr

Currently, we assume the skb is associated with a device before calling __ip_options_compile, which is not always the case if it is re-routed by ipvs.
When skb->dev is NULL, dev_net(skb->dev) will become null-dereference.
Since we know that all the options will be set to IPOPT_END, which does
not depend on struct net, we pass NULL to it.

KASAN splash:

[    6.289675] general protection fault, probably for non-canonical address 0xdffffc0000000096: 0000 [#1] PREEMPT SMP KASAN NOPTI
[    6.292146] KASAN: null-ptr-deref in range [0x00000000000004b0-0x00000000000004b7]
[    6.293823] CPU: 0 PID: 509 Comm: poc Not tainted 6.1.47+ #59
[    6.294699] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[    6.295151] RIP: 0010:ipv4_link_failure+0x2dc/0x610
[    6.295423] Code: 80 3c 28 00 48 89 e9 74 12 4c 89 f7 e8 5d e8 bd fd 48 b9 00 00 00 00 00 fc ff df bd b0 04 00 00 49 03 2e 48 89 e8 48 c1 e8 03 <80> 3c 08 00 74 08 48 89 ef e8 36 e8 bd fd 48 8b 7d 00 48 8d 74 24
[    6.296423] RSP: 0018:ffff88800bc87530 EFLAGS: 00010206
[    6.296710] RAX: 0000000000000096 RBX: ffff88800af22c04 RCX: dffffc0000000000
[    6.297096] RDX: dffffc0000000000 RSI: 00000000fffffff8 RDI: ffff88800bc87578
[    6.297482] RBP: 00000000000004b0 R08: dffffc0000000000 R09: ffff88800bc87570
[    6.297868] R10: dfffe91001790eb1 R11: 1ffff11001790eae R12: ffff88800af22b40
[    6.298282] R13: 1ffff110015e4576 R14: ffff88800af22b50 R15: 1ffff110015e4576
[    6.298679] FS:  00000000032493c0(0000) GS:ffff888034e00000(0000) knlGS:0000000000000000
[    6.299123] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    6.299442] CR2: 0000000000404dfe CR3: 000000000d492004 CR4: 0000000000770ef0
[    6.299841] PKRU: 55555554
[    6.299995] Call Trace:
[    6.300137]  <TASK>
[    6.300259]  ? __die_body+0x67/0xb0
[    6.300455]  ? die_addr+0xb2/0xe0
[    6.300641]  ? exc_general_protection+0x27f/0x3c0
[    6.300902]  ? asm_exc_general_protection+0x22/0x30
[    6.301167]  ? ipv4_link_failure+0x2dc/0x610
[    6.301398]  __ip_vs_get_out_rt+0x54a/0x1060
[    6.301629]  ? kasan_save_free_info+0x27/0x40
[    6.301871]  ip_vs_nat_xmit+0x144/0x800
[    6.302114]  ? ip_vs_in_stats+0x1ca/0x2d0
[    6.302334]  ip_vs_in_hook+0xc13/0x1b20
[    6.302546]  ? ip_vs_out_hook+0xd70/0xd70
[    6.302767]  nf_hook_slow+0xb4/0x190
[    6.302963]  __ip_local_out+0x347/0x450
[    6.303171]  ? __ip_local_out+0x450/0x450
[    6.303387]  ip_send_skb+0x48/0x110
[    6.303589]  udp_send_skb+0x6e4/0x1370
[    6.303805]  udp_sendmsg+0x16ba/0x2850
[    6.304016]  ? ip_skb_dst_mtu+0x5e0/0x5e0
[    6.304250]  ? inet_send_prepare+0x2f0/0x2f0
[    6.304492]  ____sys_sendmsg+0x560/0x6d0
[    6.304726]  __sys_sendmsg+0x1bd/0x240
[    6.304959]  do_syscall_64+0x67/0x90
[    6.305165]  ? exit_to_user_mode_prepare+0x12/0xa0
[    6.305429]  ? syscall_exit_to_user_mode+0x28/0x150
[    6.305701]  ? do_syscall_64+0x75/0x90
[    6.305915]  ? exit_to_user_mode_prepare+0x12/0xa0
[    6.306199]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[    6.306485] RIP: 0033:0x474087
[    6.306690] Code: ff ff f7 d8 64 89 02 b8 ff ff ff ff eb b8 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
[    6.308329] RSP: 002b:00007ffe8e9dce28 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[    6.308792] RAX: ffffffffffffffda RBX: 00007ffe8e9dd0a8 RCX: 0000000000474087
[    6.309199] RDX: 0000000000000000 RSI: 00007ffe8e9dce40 RDI: 0000000000000006
[    6.309629] RBP: 00007ffe8e9dcea0 R08: 0000000000000004 R09: 000000000324b480
[    6.310071] R10: 00007ffe8e9dce30 R11: 0000000000000246 R12: 0000000000000001
[    6.310451] R13: 00007ffe8e9dd098 R14: 00000000004fd740 R15: 0000000000000002
[    6.310869]  </TASK>
[    6.311002] Modules linked in:
[    6.311222] ---[ end trace 0000000000000000 ]---
[    6.311493] RIP: 0010:ipv4_link_failure+0x2dc/0x610
[    6.311807] Code: 80 3c 28 00 48 89 e9 74 12 4c 89 f7 e8 5d e8 bd fd 48 b9 00 00 00 00 00 fc ff df bd b0 04 00 00 49 03 2e 48 89 e8 48 c1 e8 03 <80> 3c 08 00 74 08 48 89 ef e8 36 e8 bd fd 48 8b 7d 00 48 8d 74 24
[    6.312941] RSP: 0018:ffff88800bc87530 EFLAGS: 00010206
[    6.313283] RAX: 0000000000000096 RBX: ffff88800af22c04 RCX: dffffc0000000000
[    6.313766] RDX: dffffc0000000000 RSI: 00000000fffffff8 RDI: ffff88800bc87578
[    6.314248] RBP: 00000000000004b0 R08: dffffc0000000000 R09: ffff88800bc87570
[    6.314676] R10: dfffe91001790eb1 R11: 1ffff11001790eae R12: ffff88800af22b40
[    6.315138] R13: 1ffff110015e4576 R14: ffff88800af22b50 R15: 1ffff110015e4576
[    6.315591] FS:  00000000032493c0(0000) GS:ffff888034e00000(0000) knlGS:0000000000000000
[    6.316130] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    6.316501] CR2: 0000000000404dfe CR3: 000000000d492004 CR4: 0000000000770ef0
[    6.316979] PKRU: 55555554
[    6.317146] Kernel panic - not syncing: Fatal exception
[    6.317688] Kernel Offset: disabled
[    6.317910] Rebooting in 1000 seconds..

Fixes: ed0de45 ("ipv4: recompile ip options in ipv4_link_failure")
Signed-off-by: Kyle Zeng <zengyhkyle@gmail.com>
Cc: Stephen Suryaputra <ssuryaextr@gmail.com>
---
 net/ipv4/route.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index d8c99bdc617..fcb8e57b95f 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1230,7 +1230,7 @@ static void ipv4_send_dest_unreach(struct sk_buff *skb)
 		opt.optlen = ip_hdr(skb)->ihl * 4 - sizeof(struct iphdr);
 
 		rcu_read_lock();
-		res = __ip_options_compile(dev_net(skb->dev), &opt, skb, NULL);
+		res = __ip_options_compile(NULL, &opt, skb, NULL);
 		rcu_read_unlock();
 
 		if (res)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] don't assume the existence of skb->dev when trying to reset ip_options in ipv4_send_dest_unreach
  2023-09-07  2:43 [PATCH] don't assume the existence of skb->dev when trying to reset ip_options in ipv4_send_dest_unreach Kyle Zeng
@ 2023-09-07 11:03 ` Paolo Abeni
  2023-09-07 22:53   ` Kyle Zeng
  2023-09-07 14:39 ` David Ahern
  1 sibling, 1 reply; 8+ messages in thread
From: Paolo Abeni @ 2023-09-07 11:03 UTC (permalink / raw)
  To: Kyle Zeng, davem, dsahern, netdev; +Cc: ssuryaextr

On Wed, 2023-09-06 at 19:43 -0700, Kyle Zeng wrote:
> Currently, we assume the skb is associated with a device before calling __ip_options_compile, which is not always the case if it is re-routed by ipvs.
> When skb->dev is NULL, dev_net(skb->dev) will become null-dereference.
> Since we know that all the options will be set to IPOPT_END, which does
> not depend on struct net, we pass NULL to it.

It's not clear to me why we can infer the above. Possibly would be more
safe to skip entirely the __ip_options_compile() call?!?

Please at least clarify the changelog and trim it to 72 chars. 

Additionally trim the subj to the same len and include the target tree
(net) into the subj prefix.

Thanks!

Paolo


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] don't assume the existence of skb->dev when trying to reset ip_options in ipv4_send_dest_unreach
  2023-09-07  2:43 [PATCH] don't assume the existence of skb->dev when trying to reset ip_options in ipv4_send_dest_unreach Kyle Zeng
  2023-09-07 11:03 ` Paolo Abeni
@ 2023-09-07 14:39 ` David Ahern
  1 sibling, 0 replies; 8+ messages in thread
From: David Ahern @ 2023-09-07 14:39 UTC (permalink / raw)
  To: Kyle Zeng, davem, netdev; +Cc: ssuryaextr

On 9/6/23 8:43 PM, Kyle Zeng wrote:
> Currently, we assume the skb is associated with a device before calling __ip_options_compile, which is not always the case if it is re-routed by ipvs.
> When skb->dev is NULL, dev_net(skb->dev) will become null-dereference.
> Since we know that all the options will be set to IPOPT_END, which does
> not depend on struct net, we pass NULL to it.
> 
> KASAN splash:
> 
> [    6.289675] general protection fault, probably for non-canonical address 0xdffffc0000000096: 0000 [#1] PREEMPT SMP KASAN NOPTI
> [    6.292146] KASAN: null-ptr-deref in range [0x00000000000004b0-0x00000000000004b7]
> [    6.293823] CPU: 0 PID: 509 Comm: poc Not tainted 6.1.47+ #59
> [    6.294699] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> [    6.295151] RIP: 0010:ipv4_link_failure+0x2dc/0x610
> [    6.295423] Code: 80 3c 28 00 48 89 e9 74 12 4c 89 f7 e8 5d e8 bd fd 48 b9 00 00 00 00 00 fc ff df bd b0 04 00 00 49 03 2e 48 89 e8 48 c1 e8 03 <80> 3c 08 00 74 08 48 89 ef e8 36 e8 bd fd 48 8b 7d 00 48 8d 74 24
> [    6.296423] RSP: 0018:ffff88800bc87530 EFLAGS: 00010206
> [    6.296710] RAX: 0000000000000096 RBX: ffff88800af22c04 RCX: dffffc0000000000
> [    6.297096] RDX: dffffc0000000000 RSI: 00000000fffffff8 RDI: ffff88800bc87578
> [    6.297482] RBP: 00000000000004b0 R08: dffffc0000000000 R09: ffff88800bc87570
> [    6.297868] R10: dfffe91001790eb1 R11: 1ffff11001790eae R12: ffff88800af22b40
> [    6.298282] R13: 1ffff110015e4576 R14: ffff88800af22b50 R15: 1ffff110015e4576
> [    6.298679] FS:  00000000032493c0(0000) GS:ffff888034e00000(0000) knlGS:0000000000000000
> [    6.299123] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    6.299442] CR2: 0000000000404dfe CR3: 000000000d492004 CR4: 0000000000770ef0
> [    6.299841] PKRU: 55555554
> [    6.299995] Call Trace:
> [    6.300137]  <TASK>
> [    6.300259]  ? __die_body+0x67/0xb0
> [    6.300455]  ? die_addr+0xb2/0xe0
> [    6.300641]  ? exc_general_protection+0x27f/0x3c0
> [    6.300902]  ? asm_exc_general_protection+0x22/0x30
> [    6.301167]  ? ipv4_link_failure+0x2dc/0x610
> [    6.301398]  __ip_vs_get_out_rt+0x54a/0x1060
> [    6.301629]  ? kasan_save_free_info+0x27/0x40
> [    6.301871]  ip_vs_nat_xmit+0x144/0x800
> [    6.302114]  ? ip_vs_in_stats+0x1ca/0x2d0
> [    6.302334]  ip_vs_in_hook+0xc13/0x1b20
> [    6.302546]  ? ip_vs_out_hook+0xd70/0xd70
> [    6.302767]  nf_hook_slow+0xb4/0x190
> [    6.302963]  __ip_local_out+0x347/0x450
> [    6.303171]  ? __ip_local_out+0x450/0x450
> [    6.303387]  ip_send_skb+0x48/0x110
> [    6.303589]  udp_send_skb+0x6e4/0x1370
> [    6.303805]  udp_sendmsg+0x16ba/0x2850
> [    6.304016]  ? ip_skb_dst_mtu+0x5e0/0x5e0
> [    6.304250]  ? inet_send_prepare+0x2f0/0x2f0
> [    6.304492]  ____sys_sendmsg+0x560/0x6d0
> [    6.304726]  __sys_sendmsg+0x1bd/0x240
> [    6.304959]  do_syscall_64+0x67/0x90
> [    6.305165]  ? exit_to_user_mode_prepare+0x12/0xa0
> [    6.305429]  ? syscall_exit_to_user_mode+0x28/0x150
> [    6.305701]  ? do_syscall_64+0x75/0x90
> [    6.305915]  ? exit_to_user_mode_prepare+0x12/0xa0
> [    6.306199]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
> [    6.306485] RIP: 0033:0x474087
> [    6.306690] Code: ff ff f7 d8 64 89 02 b8 ff ff ff ff eb b8 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
> [    6.308329] RSP: 002b:00007ffe8e9dce28 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
> [    6.308792] RAX: ffffffffffffffda RBX: 00007ffe8e9dd0a8 RCX: 0000000000474087
> [    6.309199] RDX: 0000000000000000 RSI: 00007ffe8e9dce40 RDI: 0000000000000006
> [    6.309629] RBP: 00007ffe8e9dcea0 R08: 0000000000000004 R09: 000000000324b480
> [    6.310071] R10: 00007ffe8e9dce30 R11: 0000000000000246 R12: 0000000000000001
> [    6.310451] R13: 00007ffe8e9dd098 R14: 00000000004fd740 R15: 0000000000000002
> [    6.310869]  </TASK>
> [    6.311002] Modules linked in:
> [    6.311222] ---[ end trace 0000000000000000 ]---
> [    6.311493] RIP: 0010:ipv4_link_failure+0x2dc/0x610
> [    6.311807] Code: 80 3c 28 00 48 89 e9 74 12 4c 89 f7 e8 5d e8 bd fd 48 b9 00 00 00 00 00 fc ff df bd b0 04 00 00 49 03 2e 48 89 e8 48 c1 e8 03 <80> 3c 08 00 74 08 48 89 ef e8 36 e8 bd fd 48 8b 7d 00 48 8d 74 24
> [    6.312941] RSP: 0018:ffff88800bc87530 EFLAGS: 00010206
> [    6.313283] RAX: 0000000000000096 RBX: ffff88800af22c04 RCX: dffffc0000000000
> [    6.313766] RDX: dffffc0000000000 RSI: 00000000fffffff8 RDI: ffff88800bc87578
> [    6.314248] RBP: 00000000000004b0 R08: dffffc0000000000 R09: ffff88800bc87570
> [    6.314676] R10: dfffe91001790eb1 R11: 1ffff11001790eae R12: ffff88800af22b40
> [    6.315138] R13: 1ffff110015e4576 R14: ffff88800af22b50 R15: 1ffff110015e4576
> [    6.315591] FS:  00000000032493c0(0000) GS:ffff888034e00000(0000) knlGS:0000000000000000
> [    6.316130] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    6.316501] CR2: 0000000000404dfe CR3: 000000000d492004 CR4: 0000000000770ef0
> [    6.316979] PKRU: 55555554
> [    6.317146] Kernel panic - not syncing: Fatal exception
> [    6.317688] Kernel Offset: disabled
> [    6.317910] Rebooting in 1000 seconds..
> 
> Fixes: ed0de45 ("ipv4: recompile ip options in ipv4_link_failure")
> Signed-off-by: Kyle Zeng <zengyhkyle@gmail.com>
> Cc: Stephen Suryaputra <ssuryaextr@gmail.com>
> ---
>  net/ipv4/route.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/ipv4/route.c b/net/ipv4/route.c
> index d8c99bdc617..fcb8e57b95f 100644
> --- a/net/ipv4/route.c
> +++ b/net/ipv4/route.c
> @@ -1230,7 +1230,7 @@ static void ipv4_send_dest_unreach(struct sk_buff *skb)
>  		opt.optlen = ip_hdr(skb)->ihl * 4 - sizeof(struct iphdr);
>  
>  		rcu_read_lock();
> -		res = __ip_options_compile(dev_net(skb->dev), &opt, skb, NULL);
> +		res = __ip_options_compile(NULL, &opt, skb, NULL);
>  		rcu_read_unlock();
>  
>  		if (res)

ipv4_send_dest_unreach is called from ipv4_link_failure which might have
an rtable (dst_entry) which has a device which is in a net namespace.
That is better than blindly ignoring the namepsace.


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] don't assume the existence of skb->dev when trying to reset ip_options in ipv4_send_dest_unreach
  2023-09-07 11:03 ` Paolo Abeni
@ 2023-09-07 22:53   ` Kyle Zeng
  2023-09-08  1:08     ` [PATCH] fix null-deref in ipv4_link_failure kernel test robot
                       ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Kyle Zeng @ 2023-09-07 22:53 UTC (permalink / raw)
  To: Paolo Abeni, dsahern; +Cc: davem, netdev, ssuryaextr

[-- Attachment #1: Type: text/plain, Size: 2098 bytes --]

On Thu, Sep 07, 2023 at 01:03:52PM +0200, Paolo Abeni wrote:
> On Wed, 2023-09-06 at 19:43 -0700, Kyle Zeng wrote:
> > Currently, we assume the skb is associated with a device before calling __ip_options_compile, which is not always the case if it is re-routed by ipvs.
> > When skb->dev is NULL, dev_net(skb->dev) will become null-dereference.
> > Since we know that all the options will be set to IPOPT_END, which does
> > not depend on struct net, we pass NULL to it.
> 
> It's not clear to me why we can infer the above. Possibly would be more
> safe to skip entirely the __ip_options_compile() call?!?
> 
> Please at least clarify the changelog and trim it to 72 chars. 
> 
> Additionally trim the subj to the same len and include the target tree
> (net) into the subj prefix.
> 
> Thanks!
> 
> Paolo
> 

Hi Paolo,

> It's not clear to me why we can infer the above. Possibly would be more
> safe to skip entirely the __ip_options_compile() call?!?
Sorry, after you pointed it out, I realized that I misunderstood the
code. Initially I thought `memset(&opt, 0, sizeof(opt));` would reset all
the option to OPOPT_END. But after carefully reading the code, it seems
that it only resets the io_options struct and the `optptr` is still the
original one. 

Do you think it is better to do:
`struct net = skb->dev ? dev_net(skb->dev) : NULL` ?

> Please at least clarify the changelog and trim it to 72 chars. 
> 
> Additionally trim the subj to the same len and include the target tree
> (net) into the subj prefix.
Sorry for that. I'm new to the Linux kernel community and I wonder whether
I should initiate a different patch or send another patch in this thread
in this case.

Hi David,

> ipv4_send_dest_unreach is called from ipv4_link_failure which might have
> an rtable (dst_entry) which has a device which is in a net namespace.
> That is better than blindly ignoring the namepsace.
Following your suggestion, I drafted another patch which is attached to
this email. I verified that the crash does not happen anymore. Can you
please advise whether it is a correct patch?

Thanks,
Kyle Zeng

[-- Attachment #2: 0001-fix-null-deref-in-ipv4_link_failure.patch --]
[-- Type: text/x-diff, Size: 1590 bytes --]

From ddf42a72bd2aabc7b66529ddadd90df420a73610 Mon Sep 17 00:00:00 2001
From: Kyle Zeng <zengyhkyle@gmail.com>
Date: Thu, 7 Sep 2023 15:49:46 -0700
Subject: [PATCH] fix null-deref in ipv4_link_failure

Currently, we assume the skb is associated with a device before calling
__ip_options_compile, which is not always the case if it is re-routed by
ipvs.
When skb->dev is NULL, dev_net(skb->dev) will become null-dereference.
This patch adds a check for the edge case and switch to use the net_device
from the rtable when skb->dev is NULL.

Suggested-by: Paolo Abeni<pabeni@redhat.com>
Suggested-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Kyle Zeng <zengyhkyle@gmail.com>
Cc: Stephen Suryaputra <ssuryaextr@gmail.com>
---
 net/ipv4/route.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index d8c99bdc617..735a491e1ff 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1215,6 +1215,7 @@ static void ipv4_send_dest_unreach(struct sk_buff *skb)
 {
 	struct ip_options opt;
 	int res;
+	struct net_device *dev;
 
 	/* Recompile ip options since IPCB may not be valid anymore.
 	 * Also check we have a reasonable ipv4 header.
@@ -1230,7 +1231,8 @@ static void ipv4_send_dest_unreach(struct sk_buff *skb)
 		opt.optlen = ip_hdr(skb)->ihl * 4 - sizeof(struct iphdr);
 
 		rcu_read_lock();
-		res = __ip_options_compile(dev_net(skb->dev), &opt, skb, NULL);
+		dev = skb->dev ? skb->dev : skb_rtable(skb)->dst.dev;
+		res = __ip_options_compile(dev_net(net), &opt, skb, NULL);
 		rcu_read_unlock();
 
 		if (res)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] fix null-deref in ipv4_link_failure
  2023-09-07 22:53   ` Kyle Zeng
@ 2023-09-08  1:08     ` kernel test robot
  2023-09-08  1:40     ` kernel test robot
  2023-09-08  1:45     ` [PATCH] don't assume the existence of skb->dev when trying to reset ip_options in ipv4_send_dest_unreach Kyle Zeng
  2 siblings, 0 replies; 8+ messages in thread
From: kernel test robot @ 2023-09-08  1:08 UTC (permalink / raw)
  To: Kyle Zeng, Paolo Abeni, dsahern; +Cc: oe-kbuild-all, davem, netdev, ssuryaextr

Hi Kyle,

kernel test robot noticed the following build warnings:

[auto build test WARNING on linus/master]
[also build test WARNING on v6.5 next-20230907]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Kyle-Zeng/fix-null-deref-in-ipv4_link_failure/20230908-065510
base:   linus/master
patch link:    https://lore.kernel.org/r/ZPpUfm%2FHhFet3ejH%40westworld
patch subject: [PATCH] fix null-deref in ipv4_link_failure
config: m68k-allyesconfig (https://download.01.org/0day-ci/archive/20230908/202309080837.urhzWjw6-lkp@intel.com/config)
compiler: m68k-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20230908/202309080837.urhzWjw6-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202309080837.urhzWjw6-lkp@intel.com/

All warnings (new ones prefixed by >>):

   net/ipv4/route.c: In function 'ipv4_send_dest_unreach':
   net/ipv4/route.c:1235:52: error: 'net' undeclared (first use in this function)
    1235 |                 res = __ip_options_compile(dev_net(net), &opt, skb, NULL);
         |                                                    ^~~
   net/ipv4/route.c:1235:52: note: each undeclared identifier is reported only once for each function it appears in
>> net/ipv4/route.c:1218:28: warning: variable 'dev' set but not used [-Wunused-but-set-variable]
    1218 |         struct net_device *dev;
         |                            ^~~


vim +/dev +1218 net/ipv4/route.c

  1213	
  1214	static void ipv4_send_dest_unreach(struct sk_buff *skb)
  1215	{
  1216		struct ip_options opt;
  1217		int res;
> 1218		struct net_device *dev;
  1219	
  1220		/* Recompile ip options since IPCB may not be valid anymore.
  1221		 * Also check we have a reasonable ipv4 header.
  1222		 */
  1223		if (!pskb_network_may_pull(skb, sizeof(struct iphdr)) ||
  1224		    ip_hdr(skb)->version != 4 || ip_hdr(skb)->ihl < 5)
  1225			return;
  1226	
  1227		memset(&opt, 0, sizeof(opt));
  1228		if (ip_hdr(skb)->ihl > 5) {
  1229			if (!pskb_network_may_pull(skb, ip_hdr(skb)->ihl * 4))
  1230				return;
  1231			opt.optlen = ip_hdr(skb)->ihl * 4 - sizeof(struct iphdr);
  1232	
  1233			rcu_read_lock();
  1234			dev = skb->dev ? skb->dev : skb_rtable(skb)->dst.dev;
  1235			res = __ip_options_compile(dev_net(net), &opt, skb, NULL);
  1236			rcu_read_unlock();
  1237	
  1238			if (res)
  1239				return;
  1240		}
  1241		__icmp_send(skb, ICMP_DEST_UNREACH, ICMP_HOST_UNREACH, 0, &opt);
  1242	}
  1243	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] fix null-deref in ipv4_link_failure
  2023-09-07 22:53   ` Kyle Zeng
  2023-09-08  1:08     ` [PATCH] fix null-deref in ipv4_link_failure kernel test robot
@ 2023-09-08  1:40     ` kernel test robot
  2023-09-08  1:45     ` [PATCH] don't assume the existence of skb->dev when trying to reset ip_options in ipv4_send_dest_unreach Kyle Zeng
  2 siblings, 0 replies; 8+ messages in thread
From: kernel test robot @ 2023-09-08  1:40 UTC (permalink / raw)
  To: Kyle Zeng, Paolo Abeni, dsahern
  Cc: llvm, oe-kbuild-all, davem, netdev, ssuryaextr

Hi Kyle,

kernel test robot noticed the following build errors:

[auto build test ERROR on linus/master]
[also build test ERROR on v6.5 next-20230907]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Kyle-Zeng/fix-null-deref-in-ipv4_link_failure/20230908-065510
base:   linus/master
patch link:    https://lore.kernel.org/r/ZPpUfm%2FHhFet3ejH%40westworld
patch subject: [PATCH] fix null-deref in ipv4_link_failure
config: um-allnoconfig (https://download.01.org/0day-ci/archive/20230908/202309080905.JnJFQ6YN-lkp@intel.com/config)
compiler: clang version 17.0.0 (https://github.com/llvm/llvm-project.git 4a5ac14ee968ff0ad5d2cc1ffa0299048db4c88a)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20230908/202309080905.JnJFQ6YN-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202309080905.JnJFQ6YN-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from net/ipv4/route.c:67:
   In file included from include/linux/memblock.h:13:
   In file included from arch/um/include/asm/dma.h:5:
   In file included from arch/um/include/asm/io.h:24:
   include/asm-generic/io.h:547:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     547 |         val = __raw_readb(PCI_IOBASE + addr);
         |                           ~~~~~~~~~~ ^
   include/asm-generic/io.h:560:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     560 |         val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
         |                                                         ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/little_endian.h:37:51: note: expanded from macro '__le16_to_cpu'
      37 | #define __le16_to_cpu(x) ((__force __u16)(__le16)(x))
         |                                                   ^
   In file included from net/ipv4/route.c:67:
   In file included from include/linux/memblock.h:13:
   In file included from arch/um/include/asm/dma.h:5:
   In file included from arch/um/include/asm/io.h:24:
   include/asm-generic/io.h:573:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     573 |         val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
         |                                                         ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/little_endian.h:35:51: note: expanded from macro '__le32_to_cpu'
      35 | #define __le32_to_cpu(x) ((__force __u32)(__le32)(x))
         |                                                   ^
   In file included from net/ipv4/route.c:67:
   In file included from include/linux/memblock.h:13:
   In file included from arch/um/include/asm/dma.h:5:
   In file included from arch/um/include/asm/io.h:24:
   include/asm-generic/io.h:584:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     584 |         __raw_writeb(value, PCI_IOBASE + addr);
         |                             ~~~~~~~~~~ ^
   include/asm-generic/io.h:594:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     594 |         __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
         |                                                       ~~~~~~~~~~ ^
   include/asm-generic/io.h:604:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     604 |         __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
         |                                                       ~~~~~~~~~~ ^
   include/asm-generic/io.h:692:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     692 |         readsb(PCI_IOBASE + addr, buffer, count);
         |                ~~~~~~~~~~ ^
   include/asm-generic/io.h:700:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     700 |         readsw(PCI_IOBASE + addr, buffer, count);
         |                ~~~~~~~~~~ ^
   include/asm-generic/io.h:708:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     708 |         readsl(PCI_IOBASE + addr, buffer, count);
         |                ~~~~~~~~~~ ^
   include/asm-generic/io.h:717:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     717 |         writesb(PCI_IOBASE + addr, buffer, count);
         |                 ~~~~~~~~~~ ^
   include/asm-generic/io.h:726:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     726 |         writesw(PCI_IOBASE + addr, buffer, count);
         |                 ~~~~~~~~~~ ^
   include/asm-generic/io.h:735:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     735 |         writesl(PCI_IOBASE + addr, buffer, count);
         |                 ~~~~~~~~~~ ^
   net/ipv4/route.c:880:6: warning: variable 'log_martians' set but not used [-Wunused-but-set-variable]
     880 |         int log_martians;
         |             ^
>> net/ipv4/route.c:1235:38: error: use of undeclared identifier 'net'
    1235 |                 res = __ip_options_compile(dev_net(net), &opt, skb, NULL);
         |                                                    ^
   13 warnings and 1 error generated.


vim +/net +1235 net/ipv4/route.c

  1213	
  1214	static void ipv4_send_dest_unreach(struct sk_buff *skb)
  1215	{
  1216		struct ip_options opt;
  1217		int res;
  1218		struct net_device *dev;
  1219	
  1220		/* Recompile ip options since IPCB may not be valid anymore.
  1221		 * Also check we have a reasonable ipv4 header.
  1222		 */
  1223		if (!pskb_network_may_pull(skb, sizeof(struct iphdr)) ||
  1224		    ip_hdr(skb)->version != 4 || ip_hdr(skb)->ihl < 5)
  1225			return;
  1226	
  1227		memset(&opt, 0, sizeof(opt));
  1228		if (ip_hdr(skb)->ihl > 5) {
  1229			if (!pskb_network_may_pull(skb, ip_hdr(skb)->ihl * 4))
  1230				return;
  1231			opt.optlen = ip_hdr(skb)->ihl * 4 - sizeof(struct iphdr);
  1232	
  1233			rcu_read_lock();
  1234			dev = skb->dev ? skb->dev : skb_rtable(skb)->dst.dev;
> 1235			res = __ip_options_compile(dev_net(net), &opt, skb, NULL);
  1236			rcu_read_unlock();
  1237	
  1238			if (res)
  1239				return;
  1240		}
  1241		__icmp_send(skb, ICMP_DEST_UNREACH, ICMP_HOST_UNREACH, 0, &opt);
  1242	}
  1243	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] don't assume the existence of skb->dev when trying to reset ip_options in ipv4_send_dest_unreach
  2023-09-07 22:53   ` Kyle Zeng
  2023-09-08  1:08     ` [PATCH] fix null-deref in ipv4_link_failure kernel test robot
  2023-09-08  1:40     ` kernel test robot
@ 2023-09-08  1:45     ` Kyle Zeng
  2023-09-08  3:15       ` David Ahern
  2 siblings, 1 reply; 8+ messages in thread
From: Kyle Zeng @ 2023-09-08  1:45 UTC (permalink / raw)
  To: Paolo Abeni, dsahern; +Cc: davem, netdev, ssuryaextr

[-- Attachment #1: Type: text/plain, Size: 2504 bytes --]

On Thu, Sep 7, 2023 at 3:53 PM Kyle Zeng <zengyhkyle@gmail.com> wrote:
>
> On Thu, Sep 07, 2023 at 01:03:52PM +0200, Paolo Abeni wrote:
> > On Wed, 2023-09-06 at 19:43 -0700, Kyle Zeng wrote:
> > > Currently, we assume the skb is associated with a device before calling __ip_options_compile, which is not always the case if it is re-routed by ipvs.
> > > When skb->dev is NULL, dev_net(skb->dev) will become null-dereference.
> > > Since we know that all the options will be set to IPOPT_END, which does
> > > not depend on struct net, we pass NULL to it.
> >
> > It's not clear to me why we can infer the above. Possibly would be more
> > safe to skip entirely the __ip_options_compile() call?!?
> >
> > Please at least clarify the changelog and trim it to 72 chars.
> >
> > Additionally trim the subj to the same len and include the target tree
> > (net) into the subj prefix.
> >
> > Thanks!
> >
> > Paolo
> >
>
> Hi Paolo,
>
> > It's not clear to me why we can infer the above. Possibly would be more
> > safe to skip entirely the __ip_options_compile() call?!?
> Sorry, after you pointed it out, I realized that I misunderstood the
> code. Initially I thought `memset(&opt, 0, sizeof(opt));` would reset all
> the option to OPOPT_END. But after carefully reading the code, it seems
> that it only resets the io_options struct and the `optptr` is still the
> original one.
>
> Do you think it is better to do:
> `struct net = skb->dev ? dev_net(skb->dev) : NULL` ?
>
> > Please at least clarify the changelog and trim it to 72 chars.
> >
> > Additionally trim the subj to the same len and include the target tree
> > (net) into the subj prefix.
> Sorry for that. I'm new to the Linux kernel community and I wonder whether
> I should initiate a different patch or send another patch in this thread
> in this case.
>
> Hi David,
>
> > ipv4_send_dest_unreach is called from ipv4_link_failure which might have
> > an rtable (dst_entry) which has a device which is in a net namespace.
> > That is better than blindly ignoring the namepsace.
> Following your suggestion, I drafted another patch which is attached to
> this email. I verified that the crash does not happen anymore. Can you
> please advise whether it is a correct patch?
>
> Thanks,
> Kyle Zeng

Sorry for the typo in the previous patch. I fixed it and tested it.
My proof-of-concept code can no longer trigger the crash.
The patch is attached to this email.

Thanks,
Kyle Zeng

[-- Attachment #2: 0001-fix-null-deref-in-ipv4_link_failure.patch --]
[-- Type: text/x-patch, Size: 1583 bytes --]

From d4ae89c676373bec9e922eb3ade04128624615e0 Mon Sep 17 00:00:00 2001
From: Kyle Zeng <zengyhkyle@gmail.com>
Date: Thu, 7 Sep 2023 18:19:00 -0700
Subject: [PATCH] fix null-deref in ipv4_link_failure

Currently, we assume the skb is associated with a device before calling
__ip_options_compile, which is not always the case if it is re-routed by
ipvs.
When skb->dev is NULL, dev_net(skb->dev) will become null-dereference.
This patch adds a check for the edge case and switch to use the net_device
from the rtable when skb->dev is NULL.

Suggested-by: Paolo Abeni<pabeni@redhat.com>
Suggested-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Kyle Zeng <zengyhkyle@gmail.com>
Cc: Stephen Suryaputra <ssuryaextr@gmail.com>
---
 net/ipv4/route.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index d8c99bdc617..1be34e5eea1 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -1214,6 +1214,7 @@ EXPORT_INDIRECT_CALLABLE(ipv4_dst_check);
 static void ipv4_send_dest_unreach(struct sk_buff *skb)
 {
 	struct ip_options opt;
+	struct net_device *dev;
 	int res;
 
 	/* Recompile ip options since IPCB may not be valid anymore.
@@ -1230,7 +1231,8 @@ static void ipv4_send_dest_unreach(struct sk_buff *skb)
 		opt.optlen = ip_hdr(skb)->ihl * 4 - sizeof(struct iphdr);
 
 		rcu_read_lock();
-		res = __ip_options_compile(dev_net(skb->dev), &opt, skb, NULL);
+		dev = skb->dev ? skb->dev : skb_rtable(skb)->dst.dev;
+		res = __ip_options_compile(dev_net(dev), &opt, skb, NULL);
 		rcu_read_unlock();
 
 		if (res)
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] don't assume the existence of skb->dev when trying to reset ip_options in ipv4_send_dest_unreach
  2023-09-08  1:45     ` [PATCH] don't assume the existence of skb->dev when trying to reset ip_options in ipv4_send_dest_unreach Kyle Zeng
@ 2023-09-08  3:15       ` David Ahern
  0 siblings, 0 replies; 8+ messages in thread
From: David Ahern @ 2023-09-08  3:15 UTC (permalink / raw)
  To: Kyle Zeng, Paolo Abeni; +Cc: davem, netdev, ssuryaextr

> 
> Sorry for the typo in the previous patch. I fixed it and tested it.
> My proof-of-concept code can no longer trigger the crash.
> The patch is attached to this email.
> 

Send as a standalone patch, not an attachment.


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2023-09-08  3:15 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-09-07  2:43 [PATCH] don't assume the existence of skb->dev when trying to reset ip_options in ipv4_send_dest_unreach Kyle Zeng
2023-09-07 11:03 ` Paolo Abeni
2023-09-07 22:53   ` Kyle Zeng
2023-09-08  1:08     ` [PATCH] fix null-deref in ipv4_link_failure kernel test robot
2023-09-08  1:40     ` kernel test robot
2023-09-08  1:45     ` [PATCH] don't assume the existence of skb->dev when trying to reset ip_options in ipv4_send_dest_unreach Kyle Zeng
2023-09-08  3:15       ` David Ahern
2023-09-07 14:39 ` David Ahern

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).