From: Daniel Borkmann <daniel@iogearbox.net>
To: Roi Dayan <roid@mellanox.com>,
Cong Wang <xiyou.wangcong@gmail.com>,
netdev@vger.kernel.org
Cc: jiri@mellanox.com, John Fastabend <john.fastabend@gmail.com>
Subject: Re: [Patch net-next] net_sched: move the empty tp check from ->destroy() to ->delete()
Date: Thu, 24 Nov 2016 18:18:40 +0100
Message-ID: <583720F0.7090606@iogearbox.net>
In-Reply-To: <58370558.9070004@iogearbox.net>

On 11/24/2016 04:20 PM, Daniel Borkmann wrote:
> On 11/24/2016 12:01 PM, Roi Dayan wrote:
>> On 24/11/2016 12:14, Daniel Borkmann wrote:
>>> On 11/24/2016 09:29 AM, Roi Dayan wrote:
>>>> Hi,
>>>>
>>>> I'm testing this patch with KASAN enabled and got into a new kernel crash I didn't hit before.
>>>>
>>>> [ 1860.725065] ==================================================================
>>>> [ 1860.733893] BUG: KASAN: use-after-free in __netif_receive_skb_core+0x1ebe/0x29a0 at addr ffff880a68b04028
>>>> [ 1860.745415] Read of size 8 by task CPU 0/KVM/5334
>>>> [ 1860.751368] CPU: 8 PID: 5334 Comm: CPU 0/KVM Tainted: G O 4.9.0-rc3+ #18
>
> (Btw, your kernel is tainted with o-o-tree module? Anything relevant?)
>
>>>> [ 1860.760547] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 07/01/2015
>>>> [ 1860.768036] Call Trace:
>>>> [ 1860.771307] [<ffffffffa9b6dc42>] dump_stack+0x63/0x81
>>>> [ 1860.777167] [<ffffffffa95fb751>] kasan_object_err+0x21/0x70
>>>> [ 1860.783826] [<ffffffffa95fb9dd>] kasan_report_error+0x1ed/0x4e0
>>>> [ 1860.790640] [<ffffffffa9b9b841>] ? csum_partial+0x11/0x20
>>>> [ 1860.796871] [<ffffffffaa44a6b9>] ? csum_partial_ext+0x9/0x10
>>>> [ 1860.803571] [<ffffffffaa453155>] ? __skb_checksum+0x115/0x8d0
>>>> [ 1860.810370] [<ffffffffa95fbe81>] __asan_report_load8_noabort+0x61/0x70
>>>> [ 1860.818263] [<ffffffffaa49c3fe>] ? __netif_receive_skb_core+0x1ebe/0x29a0
>>>> [ 1860.826215] [<ffffffffaa49c3fe>] __netif_receive_skb_core+0x1ebe/0x29a0
>>>> [ 1860.833991] [<ffffffffaa49a540>] ? netdev_info+0x100/0x100
>>>> [ 1860.840529] [<ffffffffaa671792>] ? udp4_gro_receive+0x802/0x1090
>>>> [ 1860.847783] [<ffffffffa9bb9a08>] ? find_next_bit+0x18/0x20
>>>> [ 1860.854126] [<ffffffffaa49cf04>] __netif_receive_skb+0x24/0x150
>>>> [ 1860.861695] [<ffffffffaa49d0d1>] netif_receive_skb_internal+0xa1/0x1d0
>>>> [ 1860.869366] [<ffffffffaa49d030>] ? __netif_receive_skb+0x150/0x150
>>>> [ 1860.876464] [<ffffffffaa49f7e9>] ? dev_gro_receive+0x969/0x1660
>>>> [ 1860.883924] [<ffffffffaa4a0e1f>] napi_gro_receive+0x1df/0x300
>>>> [ 1860.890744] [<ffffffffc02e885d>] mlx5e_handle_rx_cqe_rep+0x83d/0xd30 [mlx5_core]
>>>>
>>>> checking with gdb
>>>>
>>>> (gdb) l *(__netif_receive_skb_core+0x1ebe)
>>>> 0xffffffff8249c3fe is in __netif_receive_skb_core (net/core/dev.c:3937).
>>>> 3932 *pt_prev = NULL;
>>>> 3933 }
>>>> 3934
>>>> 3935 qdisc_skb_cb(skb)->pkt_len = skb->len;
>>>> 3936 skb->tc_verd = SET_TC_AT(skb->tc_verd, AT_INGRESS);
>>>> 3937 qdisc_bstats_cpu_update(cl->q, skb);
>>>> 3938
>>>> 3939 switch (tc_classify(skb, cl, &cl_res, false)) {
>>>> 3940 case TC_ACT_OK:
>>>> 3941 case TC_ACT_RECLASSIFY:
>>>
>>> Can you elaborate some more on your test-case? Adding/dropping ingress qdisc with
>>> some classifier on it in a loop while traffic goes through?
>>
>> I first delete the qdisc ingress from the relevant interface
>> I start traffic on it then I add the qdisc ingress to the relevant interface and start adding tc flower rules to match the traffic.
>
> Ok, strange, qdisc_destroy() calls into ops->destroy(), where ingress
> drops its entire chain via tcf_destroy_chain(), so that will be NULL
> eventually. The tps are freed by call_rcu() as well as qdisc itself
> later on via qdisc_rcu_free(), where it frees per-cpu bstats as well.
> Outstanding readers should either bail out due to if (!cl) or can still
> process the chain until read section ends, but during that time, cl->q
> resp. bstats should be good. Do you happen to know what's at address
> ffff880a68b04028? I was wondering wrt call_rcu() vs call_rcu_bh(), but
> at least on ingress (netif_receive_skb_internal()) we hold rcu_read_lock()
> here. The KASAN report is reliably happening at this location, right?

I tried to reproduce this on my physical machine on top of Cong's patch,
but had no luck hitting the above so far. I have a KASAN-enabled kernel
with pktgen hitting ingress while an ingress qdisc + flower filter rules
are added/destroyed in a loop. Hmm, could you share your kernel config
(in particular the RCU settings)?
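
For reference, a churn loop of the kind described above could look roughly
like the sketch below. This is not the exact script used in this thread: the
device name, the iteration count, the flower match (UDP, since pktgen sends
UDP) and the helper names run/churn_once/churn are all assumptions made for
illustration. By default it only prints the tc commands; set DRY_RUN=0 and
run it as root, with traffic already flowing into the device, to actually
apply them.

```shell
#!/bin/sh
# Sketch: add an ingress qdisc, attach one flower rule, tear it all down,
# and repeat. DEV, ITERATIONS and the flower match are placeholders
# (assumptions), not values taken from the thread.
DEV="${DEV:-eth0}"
ITERATIONS="${ITERATIONS:-3}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# One add/destroy cycle. Deleting the ingress qdisc also destroys the
# filter chain attached to it.
churn_once() {
    run tc qdisc add dev "$DEV" ingress
    run tc filter add dev "$DEV" parent ffff: protocol ip prio 1 \
        flower ip_proto udp action drop
    run tc qdisc del dev "$DEV" ingress
}

churn() {
    i=0
    while [ "$i" -lt "$ITERATIONS" ]; do
        churn_once
        i=$((i + 1))
    done
}

churn
```

The teardown deliberately goes through "qdisc del" rather than deleting the
filter first, so that each cycle exercises the qdisc destroy path while
packets may still be in flight on ingress.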