From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751472AbaEWCb7 (ORCPT ); Thu, 22 May 2014 22:31:59 -0400 Received: from mail.linuxfoundation.org ([140.211.169.12]:35562 "EHLO mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751302AbaEWCb5 (ORCPT ); Thu, 22 May 2014 22:31:57 -0400 Date: Fri, 23 May 2014 11:31:41 +0900 From: Greg Kroah-Hartmann To: Guenter Roeck Cc: Francesco Ruggeri , Hannes Reinecke , linux-kernel@vger.kernel.org Subject: Re: pci: kernel crash in bus_find_device Message-ID: <20140523023141.GC13900@kroah.com> References: <20140520195041.GA28913@roeck-us.net> <20140520233812.GA15640@roeck-us.net> <20140521193010.GA1721@roeck-us.net> <20140521225958.GB2467@roeck-us.net> <20140522071427.GA21230@kroah.com> <537DA5C0.2070609@roeck-us.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <537DA5C0.2070609@roeck-us.net> User-Agent: Mutt/1.5.23 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, May 22, 2014 at 12:22:40AM -0700, Guenter Roeck wrote: > On 05/22/2014 12:14 AM, Greg Kroah-Hartmann wrote: > > On Wed, May 21, 2014 at 03:59:58PM -0700, Guenter Roeck wrote: > >> On Wed, May 21, 2014 at 01:04:04PM -0700, Francesco Ruggeri wrote: > >>> I have been using an x86 platform. > >>> When I started working on it I got early crashes until I added the > >>> check for p not NULL in > >>> > >>> +void bus_release_device(struct device *dev) > >>> +{ > >>> + struct device_private *p = dev->p; > >>> + > >>> + if (p && klist_node_attached(&p->knode_bus)) > >>> + klist_put_last(&p->knode_bus); > >>> +} > >>> + > >>> > >>> Maybe on powerpc *p is overriden between device_del and device_release? > >>> > >>> Or maybe some of the BUG_ONs in the patch? The ones on knode_dead are > >>> treated as WARN_ONs in the current klist code. > >>> The one in BUG_ON(!klist_dec_and_del(n)); is new, and in my tests I > >>> ran into it without the second patch (but only when I ran my module > >>> and tests). > >>> > >> Hi Francesco, > >> > >> I replaced the BUG_ON with WARN_ON; still crashes. > >> > >> Anyway, the problem seems to be known. I found two related exchanges. > >> > >> [1] describes pretty much the same problem. I don't see if/where it was > >> ever fixed, though. > >> > >> [2] is a patch to fix the problem. It did not apply cleanly to 3.14, > >> so I had to make some adjustments in klist_iter_init_node. Resulting > >> patch is below. With this patch, the problem is gone. It is not perfect, > >> as it aborts the loop if it encounters a deleted kobject, but it is better > >> than nothing. Unfortunately, the patch never made it upstream; no idea why. > >> Copying the author and Greg to get additional feedback. > >> > >> Guenter > >> > >> [1] https://lkml.org/lkml/2008/10/26/79 > >> [2] https://lkml.org/lkml/2012/4/16/218 > > > > 2 years ago? I have no idea what was up with that, sorry... > > > > Ok, but do you have comments on the patch itself in its current version ? I have no idea, and at the moment, no time to look at this at all, sorry. Feel free to work on it and verify if it is a valid fix or not for this issue and let me know. thanks, greg k-h