From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755387AbcAMRMA (ORCPT ); Wed, 13 Jan 2016 12:12:00 -0500 Received: from out3-smtp.messagingengine.com ([66.111.4.27]:36486 "EHLO out3-smtp.messagingengine.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755200AbcAMRL4 (ORCPT ); Wed, 13 Jan 2016 12:11:56 -0500 X-Sasl-enc: JGfOGNE+z6ciw2+bPnTYivarROxpHaHHbTP+D5FL/bom 1452705115 Date: Wed, 13 Jan 2016 09:11:54 -0800 From: Greg Kroah-Hartman To: James Bottomley Cc: linux-kernel , linux-scsi , "Ewan D. Milne" Subject: Re: klist: fix starting point removed bug in klist iterators Message-ID: <20160113171154.GA29535@kroah.com> References: <1452701431.2363.12.camel@HansenPartnership.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1452701431.2363.12.camel@HansenPartnership.com> User-Agent: Mutt/1.5.24 (2015-08-30) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Jan 13, 2016 at 08:10:31AM -0800, James Bottomley wrote: > The starting node for a klist iteration is often passed in from > somewhere way above the klist infrastructure, meaning there's no > guarantee the node is still on the list. We've seen this in SCSI where > we use bus_find_device() to iterate through a list of devices. In the > face of heavy hotplug activity, the last device returned by > bus_find_device() can be removed before the next call. This leads to > > Dec 3 13:22:02 localhost kernel: WARNING: CPU: 2 PID: 28073 at include/linux/kref.h:47 klist_iter_init_node+0x3d/0x50() > Dec 3 13:22:02 localhost kernel: Modules linked in: scsi_debug x86_pkg_temp_thermal kvm_intel kvm irqbypass crc32c_intel joydev iTCO_wdt dcdbas ipmi_devintf acpi_power_meter iTCO_vendor_support ipmi_si imsghandler pcspkr wmi acpi_cpufreq tpm_tis tpm shpchp lpc_ich mfd_core nfsd nfs_acl lockd grace sunrpc tg3 ptp pps_core > Dec 3 13:22:02 localhost kernel: CPU: 2 PID: 28073 Comm: cat Not tainted 4.4.0-rc1+ #2 > Dec 3 13:22:02 localhost kernel: Hardware name: Dell Inc. PowerEdge R320/08VT7V, BIOS 2.0.22 11/19/2013 > Dec 3 13:22:02 localhost kernel: ffffffff81a20e77 ffff880613acfd18 ffffffff81321eef 0000000000000000 > Dec 3 13:22:02 localhost kernel: ffff880613acfd50 ffffffff8107ca52 ffff88061176b198 0000000000000000 > Dec 3 13:22:02 localhost kernel: ffffffff814542b0 ffff880610cfb100 ffff88061176b198 ffff880613acfd60 > Dec 3 13:22:02 localhost kernel: Call Trace: > Dec 3 13:22:02 localhost kernel: [] dump_stack+0x44/0x55 > Dec 3 13:22:02 localhost kernel: [] warn_slowpath_common+0x82/0xc0 > Dec 3 13:22:02 localhost kernel: [] ? proc_scsi_show+0x20/0x20 > Dec 3 13:22:02 localhost kernel: [] warn_slowpath_null+0x1a/0x20 > Dec 3 13:22:02 localhost kernel: [] klist_iter_init_node+0x3d/0x50 > Dec 3 13:22:02 localhost kernel: [] bus_find_device+0x51/0xb0 > Dec 3 13:22:02 localhost kernel: [] scsi_seq_next+0x2d/0x40 > [...] > > And an eventual crash. It can actually occur in any hotplug system > which has a device finder and a starting device. > > We can fix this globally by making sure the starting node for > klist_iter_init_node() is actually a member of the list before using it > (and by starting from the beginning if it isn't). > > Reported-by: Ewan D. Milne > Tested-by: Ewan D. Milne > Cc: stable@vger.kernel.org > Signed-off-by: James Bottomley > > --- > > diff --git a/lib/klist.c b/lib/klist.c > index d74cf7a..0507fa5 100644 > --- a/lib/klist.c > +++ b/lib/klist.c > @@ -282,9 +282,9 @@ void klist_iter_init_node(struct klist *k, struct klist_iter *i, > struct klist_node *n) > { > i->i_klist = k; > - i->i_cur = n; > - if (n) > - kref_get(&n->n_ref); > + i->i_cur = NULL; > + if (n && kref_get_unless_zero(&n->n_ref)) > + i->i_cur = n; > } > EXPORT_SYMBOL_GPL(klist_iter_init_node); Thanks for this, looks good, I'll queue it up after 4.5-rc1 is out. greg k-h