Re: KASAN veth use after free in XDP_REDIRECT

All of lore.kernel.org
 help / color / mirror / Atom feed

From: "Toke Høiland-Jørgensen" <toke@kernel.org>
To: Shawn Bohrer <sbohrer@cloudflare.com>,
	netdev@vger.kernel.org, bpf@vger.kernel.org
Cc: makita.toshiaki@lab.ntt.co.jp
Subject: Re: KASAN veth use after free in XDP_REDIRECT
Date: Wed, 25 Jan 2023 02:54:52 +0100	[thread overview]
Message-ID: <87357znztf.fsf@toke.dk> (raw)
In-Reply-To: <Y9BfknDG0LXmruDu@JNXK7M3>

Shawn Bohrer <sbohrer@cloudflare.com> writes:

> Hello,
>
> We've seen the following KASAN report on our systems. When using
> AF_XDP on a veth.
>
> KASAN report:
>
> BUG: KASAN: use-after-free in __xsk_rcv+0x18d/0x2c0
> Read of size 78 at addr ffff888976250154 by task napi/iconduit-g/148640
>
> CPU: 5 PID: 148640 Comm: napi/iconduit-g Kdump: loaded Tainted: G           O       6.1.4-cloudflare-kasan-2023.1.2 #1
> Hardware name: Quanta Computer Inc. QuantaPlex T41S-2U/S2S-MB, BIOS S2S_3B10.03 06/21/2018
> Call Trace:
>  <TASK>
>  dump_stack_lvl+0x34/0x48
>  print_report+0x170/0x473
>  ? __xsk_rcv+0x18d/0x2c0
>  kasan_report+0xad/0x130
>  ? __xsk_rcv+0x18d/0x2c0
>  kasan_check_range+0x149/0x1a0
>  memcpy+0x20/0x60
>  __xsk_rcv+0x18d/0x2c0
>  __xsk_map_redirect+0x1f3/0x490
>  ? veth_xdp_rcv_skb+0x89c/0x1ba0 [veth]
>  xdp_do_redirect+0x5ca/0xd60
>  veth_xdp_rcv_skb+0x935/0x1ba0 [veth]
>  ? __netif_receive_skb_list_core+0x671/0x920
>  ? veth_xdp+0x670/0x670 [veth]
>  veth_xdp_rcv+0x304/0xa20 [veth]
>  ? do_xdp_generic+0x150/0x150
>  ? veth_xdp_rcv_one+0xde0/0xde0 [veth]
>  ? _raw_spin_lock_bh+0xe0/0xe0
>  ? newidle_balance+0x887/0xe30
>  ? __perf_event_task_sched_in+0xdb/0x800
>  veth_poll+0x139/0x571 [veth]
>  ? veth_xdp_rcv+0xa20/0xa20 [veth]
>  ? _raw_spin_unlock+0x39/0x70
>  ? finish_task_switch.isra.0+0x17e/0x7d0
>  ? __switch_to+0x5cf/0x1070
>  ? __schedule+0x95b/0x2640
>  ? io_schedule_timeout+0x160/0x160
>  __napi_poll+0xa1/0x440
>  napi_threaded_poll+0x3d1/0x460
>  ? __napi_poll+0x440/0x440
>  ? __kthread_parkme+0xc6/0x1f0
>  ? __napi_poll+0x440/0x440
>  kthread+0x2a2/0x340
>  ? kthread_complete_and_exit+0x20/0x20
>  ret_from_fork+0x22/0x30
>  </TASK>
>
> Freed by task 148640:
>  kasan_save_stack+0x23/0x50
>  kasan_set_track+0x21/0x30
>  kasan_save_free_info+0x2a/0x40
>  ____kasan_slab_free+0x169/0x1d0
>  slab_free_freelist_hook+0xd2/0x190
>  __kmem_cache_free+0x1a1/0x2f0
>  skb_release_data+0x449/0x600
>  consume_skb+0x9f/0x1c0
>  veth_xdp_rcv_skb+0x89c/0x1ba0 [veth]
>  veth_xdp_rcv+0x304/0xa20 [veth]
>  veth_poll+0x139/0x571 [veth]
>  __napi_poll+0xa1/0x440
>  napi_threaded_poll+0x3d1/0x460
>  kthread+0x2a2/0x340
>  ret_from_fork+0x22/0x30
>
>
> The buggy address belongs to the object at ffff888976250000
>  which belongs to the cache kmalloc-2k of size 2048
> The buggy address is located 340 bytes inside of
>  2048-byte region [ffff888976250000, ffff888976250800)
>
> The buggy address belongs to the physical page:
> page:00000000ae18262a refcount:2 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x976250
> head:00000000ae18262a order:3 compound_mapcount:0 compound_pincount:0
> flags: 0x2ffff800010200(slab|head|node=0|zone=2|lastcpupid=0x1ffff)
> raw: 002ffff800010200 0000000000000000 dead000000000122 ffff88810004cf00
> raw: 0000000000000000 0000000080080008 00000002ffffffff 0000000000000000
> page dumped because: kasan: bad access detected
>
> Memory state around the buggy address:
>  ffff888976250000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>  ffff888976250080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>>ffff888976250100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>                                                  ^
>  ffff888976250180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>  ffff888976250200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>
>
> If I understand the code correctly it looks like a xdp_buf is
> constructed pointing to the memory backed by a skb but consume_skb()
> is called while the xdp_buf() is still in use.
>
> ```
> 	case XDP_REDIRECT:
> 		veth_xdp_get(&xdp);
> 		consume_skb(skb);
> 		xdp.rxq->mem = rq->xdp_mem;
> 		if (xdp_do_redirect(rq->dev, &xdp, xdp_prog)) {
> 			stats->rx_drops++;
> 			goto err_xdp;
> 		}
> 		stats->xdp_redirect++;
> 		rcu_read_unlock();
> 		goto xdp_xmit;
> ```
>
> It is worth noting that I think XDP_TX has the exact same problem.
>
> Again assuming I understand the problem one naive solution might be to
> move the consum_skb() call after xdp_do_redirect().  I think this
> might work for BPF_MAP_TYPE_XSKMAP, BPF_MAP_TYPE_DEVMAP, and
> BPF_MAP_TYPE_DEVMAP_HASH since those all seem to copy the xdb_buf to
> new memory.  The copy happens for XSKMAP in __xsk_rcv() and for the
> DEVMAP cases happens in dev_map_enqueue_clone().
>
> However, it would appear that for BPF_MAP_TYPE_CPUMAP that memory can
> live much longer, possibly even after xdp_do_flush().  If I'm correct,
> I'm not really sure where it would be safe to call consume_skb().

So the idea is that veth_xdp_get() does a
get_page(virt_to_page(xdp->data)), where xdp->data in this case points
to skb->head. This should keep the data page alive even if the skb
surrounding it is freed by the call to consume_skb().

However, because the skb->head in this case was allocated from a slab
allocator, taking a page refcount is not enough to prevent it from being
freed.

I'm not sure how best to fix this. I guess we could try to detect this
case in the veth driver and copy the data, like we do for skb_share()
etc. However, I'm not sure how to actually detect this case...

Where are the skbs being processed coming from? I.e., what path
allocated them?

-Toke

next prev parent reply	other threads:[~2023-01-25  1:55 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-01-24 22:45 KASAN veth use after free in XDP_REDIRECT Shawn Bohrer
2023-01-25  1:54 ` Toke Høiland-Jørgensen [this message]
2023-01-25  2:02   ` Jakub Kicinski
2023-01-25  2:07   ` Toshiaki Makita
2023-03-08 22:33     ` Shawn Bohrer
2023-03-14  8:02       ` Toshiaki Makita
2023-03-14 15:25         ` Shawn Bohrer
2023-03-14 15:33           ` [PATCH] veth: Fix " Shawn Bohrer
2023-03-14 17:13             ` Lorenzo Bianconi
2023-03-15  6:19             ` Toshiaki Makita
2023-03-15 18:08             ` Toke Høiland-Jørgensen
2023-03-16  4:20             ` patchwork-bot+netdevbpf
2023-01-25 16:40   ` KASAN veth " Shawn Bohrer

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87357znztf.fsf@toke.dk \
    --to=toke@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=makita.toshiaki@lab.ntt.co.jp \
    --cc=netdev@vger.kernel.org \
    --cc=sbohrer@cloudflare.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.