From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jakub Kicinski
Subject: Re: RCU callback crashes
Date: Wed, 20 Dec 2017 16:08:55 -0800
Message-ID: <20171220160855.0c0fbcd7@cakuba.netronome.com>
References: <20171219175921.7db9b0e1@cakuba.netronome.com> <20171220061118.GB1916@nanopsycho> <20171219222227.402e684a@cakuba.netronome.com> <20171219223404.03786d66@cakuba.netronome.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Cc: Jiri Pirko , "netdev@vger.kernel.org"
To: Cong Wang
Return-path:
Received: from mx3.wp.pl ([212.77.101.9]:35359 "EHLO mx3.wp.pl" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755811AbdLUAJD (ORCPT ); Wed, 20 Dec 2017 19:09:03 -0500
In-Reply-To:
Sender: netdev-owner@vger.kernel.org
List-ID:

On Wed, 20 Dec 2017 16:03:49 -0800, Cong Wang wrote:
> On Wed, Dec 20, 2017 at 10:31 AM, Cong Wang wrote:
> > On Wed, Dec 20, 2017 at 10:17 AM, Cong Wang wrote:
> >>
> >> I guess it is q->miniqp which is freed in qdisc_graft() without properly
> >> waiting for rcu readers?
> >
> > It is probably so, the call_rcu_bh(&miniq_old->rcu, mini_qdisc_rcu_func)
> > in the end of mini_qdisc_pair_swap() is invoked on miniq_old->rcu,
> > but miniq is being freed, no rcu barrier waits for it...
> >
> > You can try to add a rcu_barrier_bh() at the end to see if this crash
> > is gone, but I don't think people like adding yet another rcu barrier...
>
> Hi, Jakub
>
> Can you test the following fix? I am not a fan of rcu barrier but we
> already have one so...
>
> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> index 876fab2604b8..1b68fedea124 100644
> --- a/net/sched/sch_generic.c
> +++ b/net/sched/sch_generic.c
> @@ -1240,6 +1240,8 @@ void mini_qdisc_pair_swap(struct mini_Qdisc_pair *miniqp,
>
>  	if (!tp_head) {
>  		RCU_INIT_POINTER(*miniqp->p_miniq, NULL);
> +		/* Wait for existing flying RCU callback before being freed. */
> +		rcu_barrier_bh();
>  		return;
>  	}

Mm..
I was running with this hack for the last two hours and it was OK:

diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 876fab2604b8..d7e0c3ad0a1c 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -1260,6 +1260,7 @@ void mini_qdisc_pair_swap(struct mini_Qdisc_pair *miniqp,
 	 * are not seeing it.
 	 */
 	call_rcu_bh(&miniq_old->rcu, mini_qdisc_rcu_func);
+	rcu_barrier_bh();
 }
 EXPORT_SYMBOL(mini_qdisc_pair_swap);
 

Let me try to move the barrier...
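
For anyone following the thread, the suspected lifetime problem can be modelled in user space. This is only a toy sketch, not kernel code: every toy_* name is hypothetical, toy_call_rcu() and toy_rcu_barrier() merely stand in for call_rcu_bh() and rcu_barrier_bh(), and the struct layout is simplified (the rcu head is placed first so a plain cast works where the kernel would use container_of()). It assumes, as the discussion above suggests, that the callback head lives inside the memory that gets freed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for struct rcu_head: one deferred callback, singly linked. */
struct toy_rcu_head {
	struct toy_rcu_head *next;
	void (*func)(struct toy_rcu_head *head);
};

/* Callbacks that were queued but have not been invoked yet. */
static struct toy_rcu_head *toy_pending;

/* Model of call_rcu_bh(): queue the callback, do not run it yet. */
static void toy_call_rcu(struct toy_rcu_head *head,
			 void (*func)(struct toy_rcu_head *head))
{
	head->func = func;
	head->next = toy_pending;
	toy_pending = head;
}

/* Model of rcu_barrier_bh(): return only after every queued callback ran. */
static void toy_rcu_barrier(void)
{
	while (toy_pending) {
		struct toy_rcu_head *head = toy_pending;

		toy_pending = head->next;
		head->func(head);
	}
}

/* Simplified miniq: the callback head is embedded in the object itself. */
struct toy_miniq {
	struct toy_rcu_head rcu;	/* first member, see cast below */
	int callback_ran;
};

struct toy_pair {
	struct toy_miniq miniq1;
	struct toy_miniq miniq2;
};

static void toy_miniq_rcu_func(struct toy_rcu_head *head)
{
	/* Valid only because rcu is the first member of toy_miniq. */
	struct toy_miniq *miniq = (struct toy_miniq *)head;

	miniq->callback_ran = 1;
}

/*
 * Queue a callback on the pair, optionally wait for it, then free the pair.
 * Returns how many callbacks still pointed into the pair at free time
 * (or -1 on allocation failure).
 */
static int toy_teardown(int use_barrier)
{
	struct toy_pair *pair = calloc(1, sizeof(*pair));
	struct toy_rcu_head *head;
	int still_queued = 0;

	if (!pair)
		return -1;

	/* Like the tail of mini_qdisc_pair_swap(): defer work on a miniq. */
	toy_call_rcu(&pair->miniq1.rcu, toy_miniq_rcu_func);

	if (use_barrier)
		toy_rcu_barrier();

	/* Anything counted here would dangle once the pair is freed. */
	for (head = toy_pending; head; head = head->next)
		still_queued++;

	/* Drop the queue so the model itself never touches freed memory. */
	toy_pending = NULL;
	free(pair);
	return still_queued;
}
```

In this model toy_teardown(0) reports one callback still queued when the memory is released, which is the situation suspected above, while toy_teardown(1) reports none. It says nothing about where the barrier is cheapest to place in the real code.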