From mboxrd@z Thu Jan  1 00:00:00 1970
From: John Fastabend <john.fastabend@gmail.com>
Subject: Re: RCU callback crashes
Date: Wed, 20 Dec 2017 12:18:42 -0800
Message-ID: <e75a11f4-8a53-3c92-9d19-5a4a9241ffef@gmail.com>
References: <20171219175921.7db9b0e1@cakuba.netronome.com>
 <20171220061118.GB1916@nanopsycho>
 <20171219222227.402e684a@cakuba.netronome.com>
 <20171219223404.03786d66@cakuba.netronome.com>
 <CAM_iQpWUjfv2-Sirmdb5WfV4pZ4uF0m7=HR5YGWaKxb4KHp8gQ@mail.gmail.com>
 <20171220195922.GB1760@nanopsycho> <20171220201555.GF1760@nanopsycho>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Cc: Jakub Kicinski <kubakici@wp.pl>,
        "netdev@vger.kernel.org" <netdev@vger.kernel.org>
To: Jiri Pirko <jiri@resnulli.us>, Cong Wang <xiyou.wangcong@gmail.com>
Return-path: <netdev-owner@vger.kernel.org>
Received: from mail-pl0-f41.google.com ([209.85.160.41]:33672 "EHLO
        mail-pl0-f41.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1755839AbdLTUTM (ORCPT
        <rfc822;netdev@vger.kernel.org>); Wed, 20 Dec 2017 15:19:12 -0500
Received: by mail-pl0-f41.google.com with SMTP id 1so8195073plv.0
        for <netdev@vger.kernel.org>; Wed, 20 Dec 2017 12:19:12 -0800 (PST)
In-Reply-To: <20171220201555.GF1760@nanopsycho>
Content-Language: en-US
Sender: netdev-owner@vger.kernel.org
List-ID: <netdev.vger.kernel.org>

On 12/20/2017 12:15 PM, Jiri Pirko wrote:
> Wed, Dec 20, 2017 at 08:59:22PM CET, jiri@resnulli.us wrote:
>> Wed, Dec 20, 2017 at 07:17:50PM CET, xiyou.wangcong@gmail.com wrote:
>>> On Tue, Dec 19, 2017 at 10:34 PM, Jakub Kicinski <kubakici@wp.pl> wrote:
>>>> Ah, no object debug, but turning KASAN on produces this:
>>>>
>>>
>>> I bet it is an ingress qdisc which is being freed?
>>>
>>>> [   39.268209] BUG: KASAN: use-after-free in cpu_needs_another_gp+0x246/0x2b0
>>>> [   39.275965] Read of size 8 at addr ffff8803aa64f138 by task swapper/13/0
>>>> [   39.283524]
>>>> [   39.285256] CPU: 13 PID: 0 Comm: swapper/13 Not tainted 4.15.0-rc3-perf-00955-g1d0b01347dd5-dirty #8
>>>> [   39.295535] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.3.4 11/08/2016
>>>> [   39.303969] Call Trace:
>>>> [   39.306769]  <IRQ>
>>>> [   39.309088]  dump_stack+0xa6/0x118
>>>> [   39.312957]  ? _atomic_dec_and_lock+0xe8/0xe8
>>>> [   39.317895]  ? cpu_needs_another_gp+0x246/0x2b0
>>>> [   39.323030]  print_address_description+0x6a/0x270
>>>> [   39.328380]  ? cpu_needs_another_gp+0x246/0x2b0
>>>> [   39.333510]  kasan_report+0x23f/0x350
>>>> [   39.337672]  cpu_needs_another_gp+0x246/0x2b0
>>>> ...
>>>> [   39.383026]  rcu_process_callbacks+0x1a0/0x620
>>>> ...
>>>
>>>
>>> This is confusing.
>>>
>>> I guess it is q->miniqp which is freed in qdisc_graft() without properly
>>> waiting for rcu readers?
>>
>> miniqp is inside qdisc private data:
>> struct ingress_sched_data {
>>        struct tcf_block *block;
>>        struct tcf_block_ext_info block_info;
>>        struct mini_Qdisc_pair miniqp;
>> };
>>
>> That is freed along with the qdisc itself in:
>> qdisc_destroy->qdisc_free
>>
>> Before miniq, tp was checked in the rcu reader path. In case it was not
>> null, q was processed. In the slow path, tp is freed after an rcu grace
>> period: tcf_proto_destroy->kfree_rcu
>>
>> I assumed that since q is processed in rcu reader, it is also freed after
>> a grace period, but now looking at the code I don't see it happening
>> like that.
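
To make the lifetime point concrete, here is a rough userspace sketch (not
kernel code). The toy_* names are made up for illustration and only mirror
the shape of ingress_sched_data. It shows that a member embedded by value
lives inside the parent allocation, so freeing the parent invalidates any
pointer a reader still holds to that member.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical placeholder for struct mini_Qdisc_pair. */
struct toy_mini_pair {
	int a;
	int b;
};

/* Hypothetical placeholder mirroring the layout of ingress_sched_data. */
struct toy_ingress_data {
	void *block;
	struct toy_mini_pair miniqp;	/* embedded by value, not a pointer */
};

/* Return 1 if p points inside the allocation [base, base + size). */
static int toy_points_inside(const void *base, size_t size, const void *p)
{
	uintptr_t lo = (uintptr_t)base;
	uintptr_t addr = (uintptr_t)p;

	return addr >= lo && addr - lo < size;
}
```

Since &d->miniqp falls inside the block returned for d, there is no separate
lifetime for miniqp: whatever frees the private data frees it too.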
> 
> Aha! It was removed by:
> commit c5ad119fb6c09b0297446be05bd66602fa564758
> Author: John Fastabend <john.fastabend@gmail.com>
> Date:   Thu Dec 7 09:58:19 2017 -0800
> 
>     net: sched: pfifo_fast use skb_array
> 

Even further back, right?

commit 752fbcc33405d6f8249465e4b2c4e420091bb825
Author: Cong Wang <xiyou.wangcong@gmail.com>
Date:   Tue Sep 19 13:15:42 2017 -0700

    net_sched: no need to free qdisc in RCU callback


> 
>>
>> So I think that change to miniq made the existing race window
>> a bit wider and easier to hit.
>>
>> I believe that freeing q via kfree_rcu or call_rcu should resolve this.
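
For reference, the deferred-free pattern being discussed can be sketched in
plain userspace C. This is only a toy model under my own assumptions, not the
kernel implementation: toy_call_rcu() and toy_grace_period_elapsed() are
hypothetical stand-ins for call_rcu() and the end of a grace period, and
toy_qdisc is a placeholder object. The point is just that destroy queues a
callback, and the memory stays valid until the queued callbacks run.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's struct rcu_head. */
struct toy_rcu_head {
	struct toy_rcu_head *next;
	void (*func)(struct toy_rcu_head *head);
};

/* Hypothetical stand-in for a qdisc whose data readers may still touch. */
struct toy_qdisc {
	int miniqp;			/* placeholder for reader-visible data */
	struct toy_rcu_head rcu;	/* embedded so the callback can find q */
};

static struct toy_rcu_head *pending;	/* callbacks awaiting a grace period */
static int freed_count;

/* Queue a callback instead of freeing right away (models call_rcu()). */
static void toy_call_rcu(struct toy_rcu_head *head,
			 void (*func)(struct toy_rcu_head *head))
{
	head->func = func;
	head->next = pending;
	pending = head;
}

/* Run all queued callbacks (models the grace period having ended). */
static void toy_grace_period_elapsed(void)
{
	while (pending) {
		struct toy_rcu_head *head = pending;

		pending = head->next;	/* read next before the callback frees head */
		head->func(head);
	}
}

/* Callback: recover the enclosing object from its rcu member and free it. */
static void toy_qdisc_free_cb(struct toy_rcu_head *head)
{
	struct toy_qdisc *q = (struct toy_qdisc *)
		((char *)head - offsetof(struct toy_qdisc, rcu));

	free(q);
	freed_count++;
}

/* Destroy defers the actual free until after the grace period. */
static void toy_qdisc_destroy(struct toy_qdisc *q)
{
	toy_call_rcu(&q->rcu, toy_qdisc_free_cb);
}
```

In this model a reader that picked up q before toy_qdisc_destroy() can keep
using q->miniqp until toy_grace_period_elapsed() runs, which is the property
the free path needs.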