From mboxrd@z Thu Jan 1 00:00:00 1970
From: malahal@us.ibm.com
Subject: Re: [PATCH] aic94xx: Hotplug ex_change_count race fix
Date: Wed, 4 Oct 2006 16:52:57 -0700
Message-ID: <20061004235257.GA6594@us.ibm.com>
References: <1159308320.9567.55.camel@alexis>
 <1159885160.3443.20.camel@mulgrave.il.steeleye.com>
 <1159889413.7024.15.camel@localhost.localdomain>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Received: from e32.co.us.ibm.com ([32.97.110.150]:44755 "EHLO e32.co.us.ibm.com")
 by vger.kernel.org with ESMTP id S1751238AbWJDXxA (ORCPT );
 Wed, 4 Oct 2006 19:53:00 -0400
Received: from westrelay02.boulder.ibm.com (westrelay02.boulder.ibm.com [9.17.195.11])
 by e32.co.us.ibm.com (8.13.8/8.12.11) with ESMTP id k94NqxOj018779
 for ; Wed, 4 Oct 2006 19:52:59 -0400
Received: from d03av01.boulder.ibm.com (d03av01.boulder.ibm.com [9.17.195.167])
 by westrelay02.boulder.ibm.com (8.13.6/8.13.6/NCO v8.1.1) with ESMTP id k94NqxB5337156
 for ; Wed, 4 Oct 2006 17:52:59 -0600
Received: from d03av01.boulder.ibm.com (loopback [127.0.0.1])
 by d03av01.boulder.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id k94NqwVo012191
 for ; Wed, 4 Oct 2006 17:52:58 -0600
Content-Disposition: inline
In-Reply-To: <1159889413.7024.15.camel@localhost.localdomain>
Sender: linux-scsi-owner@vger.kernel.org
List-Id: linux-scsi@vger.kernel.org
To: Alexis Bruemmer
Cc: James Bottomley, linux-scsi

Yes, I noticed that we stop processing as soon as we find one device
whose change count has changed. We should go through all the devices
that have change counts. In the FCP world, at least, an RSCN (similar
to BROADCAST in SAS) may be delayed by the fabric so that it can
collect a few changes.

Thanks, Malahal.

Alexis Bruemmer [alexisb@us.ibm.com] wrote:
> On Tue, 2006-10-03 at 09:19 -0500, James Bottomley wrote:
> > On Tue, 2006-09-26 at 15:05 -0700, Alexis Bruemmer wrote:
> > > In some cases while hotplugging disks on a system with an expander the
> > > broadcast primitive will be posted and begin processing before the
> > > expander change count is updated. This causes the device that triggered
> > > the broadcast to not be found.
> >
> > Thanks; I'll stick this in.
> >
> > However, it has always struck me that this broadcast code is fragile
> > because of the way event processing works. If we get two fairly close
> > together broadcast events, we'll amalgamate them into a single event and
> > then stop processing as soon as we find one expander that changed, if
> > you want to look at sorting that out ...
>
> I will look into it.
>
> --Alexis
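
The concern raised in this thread (two broadcasts merged into one event, with
processing stopping at the first expander that changed) can be illustrated
with a small standalone sketch. This is not the libsas or aic94xx code: the
struct, its fields, and all function names below are hypothetical stand-ins
chosen for illustration, and "rediscovery" is reduced to syncing a cached
counter. It only contrasts "return on first match" with "walk every
expander".

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for an expander (not a libsas type). */
struct expander {
    int cached_change_count; /* value recorded at the last discovery */
    int hw_change_count;     /* value the expander currently reports */
    int rediscovered;        /* times rediscovery ran, for demonstration */
};

/* Stand-in for revalidating the devices behind one expander. */
static void rediscover(struct expander *ex)
{
    ex->cached_change_count = ex->hw_change_count;
    ex->rediscovered++;
}

/*
 * Stops at the first expander whose change count differs. If two
 * broadcasts were merged into one event, later changes are missed.
 */
static size_t handle_broadcast_first_only(struct expander *exps, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (exps[i].hw_change_count != exps[i].cached_change_count) {
            rediscover(&exps[i]);
            return 1;
        }
    }
    return 0;
}

/*
 * Visits every expander, so a single merged event still handles
 * all expanders that changed. Returns how many were rediscovered.
 */
static size_t handle_broadcast_all(struct expander *exps, size_t n)
{
    size_t handled = 0;

    for (size_t i = 0; i < n; i++) {
        if (exps[i].hw_change_count != exps[i].cached_change_count) {
            rediscover(&exps[i]);
            handled++;
        }
    }
    return handled;
}
```

With two changed expanders in the array, the first variant rediscovers only
one of them per event, while the second handles both in a single pass. A real
fix would also have to deal with nested expanders and locking, which this
sketch leaves out.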