From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from zimbra.alphalink.fr ([217.15.80.77]:35330 "EHLO
	zimbra.alphalink.fr" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1033214AbeCAUBo (ORCPT );
	Thu, 1 Mar 2018 15:01:44 -0500
Date: Thu, 1 Mar 2018 21:01:40 +0100
From: Guillaume Nault
To: Denys Fedoryshchenko
Cc: Cong Wang, Linux Kernel Network Developers,
	netdev-owner@vger.kernel.org
Subject: Re: ppp/pppoe, still panic 4.15.3 in ppp_push
Message-ID: <20180301200140.GC1413@alphalink.fr>
References: <79ba31ea47137ca539beea112bec8e57@nuclearcat.com>
	<20180215193114.GK1422@alphalink.fr>
	<20180222183038.GD1322@alphalink.fr>
	<336f2012881f89f2b86b29a853c5a672@nuclearcat.com>
	<20180223093851.GE1322@alphalink.fr>
	<20180223100710.GF1322@alphalink.fr>
	<20180227185627.GI1322@alphalink.fr>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20180227185627.GI1322@alphalink.fr>
Sender: netdev-owner@vger.kernel.org
List-ID:

On Tue, Feb 27, 2018 at 07:56:27PM +0100, Guillaume Nault wrote:
> On Tue, Feb 27, 2018 at 12:58:55PM +0200, Denys Fedoryshchenko wrote:
> > On 2018-02-23 12:07, Guillaume Nault wrote:
> > > On Fri, Feb 23, 2018 at 11:41:43AM +0200, Denys Fedoryshchenko wrote:
> > > > On 2018-02-23 11:38, Guillaume Nault wrote:
> > > > > On Thu, Feb 22, 2018 at 08:51:19PM +0200, Denys Fedoryshchenko wrote:
> > > > > > I'm using accel-ppp that has unit-cache option, i guess for
> > > > > > "reusing" ppp interfaces (because creating a lot of
> > > > > > interfaces on BRAS with 8k users quite expensive).
> > > > > > Maybe it is somehow related and can be that scenario causing
> > > > > > this bug?
> > > > > >
> > > > > Indeed, it'd be interesting to know if unit-cache is part of the
> > > > > equation (if it's workable for you to disable it).
> > > > Already did that and testing, unfortunately i had to disable KASAN
> > > > and full refcount, as performance hit is too heavy for me.
> > > > I will try to enable KASAN alone tomorrow.
> > > >
> > > Don't hesitate to post the result even if you can't afford enabling
> > > KASAN.
> > Till now 4 days and no reboots.
> >
> That unit-cache information was very useful. I can now reproduce the
> issue and work on a fix.
>
You can try the following patch.
Sorry for the delay, I'm a bit out of time these days.

diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c
index 255a5def56e9..2acf4b0eabd1 100644
--- a/drivers/net/ppp/ppp_generic.c
+++ b/drivers/net/ppp/ppp_generic.c
@@ -3161,6 +3161,15 @@ ppp_connect_channel(struct channel *pch, int unit)
 		goto outl;
 
 	ppp_lock(ppp);
+	spin_lock_bh(&pch->downl);
+	if (!pch->chan) {
+		/* Don't connect unregistered channels */
+		ppp_unlock(ppp);
+		spin_unlock_bh(&pch->downl);
+		ret = -ENOTCONN;
+		goto outl;
+	}
+	spin_unlock_bh(&pch->downl);
 	if (pch->file.hdrlen > ppp->file.hdrlen)
 		ppp->file.hdrlen = pch->file.hdrlen;
 	hdrlen = pch->file.hdrlen + 2;	/* for protocol bytes */