From: malahal@us.ibm.com
Subject: Re: AIC94XX discovery timeout problem details...
Date: Tue, 19 Sep 2006 13:59:02 -0700
Message-ID: <20060919205902.GA4326@us.ibm.com>
References: <20060915072030.GA25595@us.ibm.com> <20060918073506.74938.qmail@web31807.mail.mud.yahoo.com>
In-Reply-To: <20060918073506.74938.qmail@web31807.mail.mud.yahoo.com>
List-Id: linux-scsi@vger.kernel.org
To: Luben Tuikov
Cc: linux-scsi@vger.kernel.org

I am planning to process port/phy events in two phases. Phase I sets up
the asd_sas_port and asd_sas_phy lists and calls the LLDD with a correct
phy_mask. Phase II runs in thread context: it is queued to the SCSI work
queue and sets up the sysfs objects. There would still be just one thread
setting up and tearing down sysfs objects, which should avoid any races.

I would appreciate any suggestions on, or issues with, this approach.

Thanks, Malahal.

Luben Tuikov [ltuikov@yahoo.com] wrote:
> --- malahal@us.ibm.com wrote:
> > I chased the time out problem and found that the PORTE_BYTES_DMAED port
> > event must be responded with a call to lldd_port_formed() which will
> > update PHY_IS_UP and port-links fields in DDB 0. As there is a single
> > thread handling PHY/port events as well as discovery, we really can't
> > handle PORTE_BYTES_DMAED event until the discovery is complete. This
> > results in SCSI commands timing out in the discovery thread no matter
> > what I do! This problem may be unique to Vitesse expander, read the PS
> > section for details.
> >
> > Tried using two threads, one for events and the other for discovery.
>
> Malahal,
>
> If you go back in the archives, and take a look at the SAS Stack
> as I submitted it last year, you'll notice that this is exactly how
> the code is: there is a separate event thread and a separate
> discovery thread.
>
> That is, from inception, my code has always had two separate threads,
> one for events and one for discovery.
>
> I wasn't aware that the code had been changed such that a single
> thread handles events and discovery. This is a very naive approach,
> and a regression over the original code.
>
> > That avoided the timeout problems, but the discovery thread would die
> > after few iterations due to the event thread and the discovery thread
> > racing each other for setting up and tearing down of sysfs objects.
>
> Indeed, the situation presented in such circumstances is tricky.
> These "races" have been dealt with in my (original) code. Currently,
> I don't experience any problems with my SAS Stack, as I maintain it.
>
> If any of my original comments are still left in the code or the README
> files, that would give you a hint of how such "races" are handled.
>
> > I tried calling lldd_port_formed() with appropriate phy_mask from the
> > notify_port_event() itself. That worked fine. It is just a hack for now!
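
For illustration, here is a rough user-space sketch of the two-phase
split described at the top of this message. It is not driver code and
has not been run against the aic94xx tree. Every name in it
(sketch_port, sketch_lldd_port_formed, phase1_bytes_dmaed,
phase1_phy_down, phase2_run_worker) is hypothetical, the LLDD call is
reduced to a stub that only records the phy_mask it was given, and a
plain array stands in for the SCSI work queue. The only point it tries
to show is that the phy_mask update and the LLDD call happen
immediately, while all sysfs-like work is deferred to one worker.

```c
#include <stdint.h>

#define MAX_PHYS 8
#define MAX_WORK 16   /* power of two, so the unsigned indices wrap cleanly */

/* Hypothetical stand-in for a SAS port: only what the sketch needs. */
struct sketch_port {
    uint8_t phy_mask;      /* bit i set => phy i is part of this port */
    int     sysfs_present; /* 1 once phase II has "registered" the port */
};

enum work_kind { WORK_SYSFS_ADD, WORK_SYSFS_REMOVE };

struct work_item {
    enum work_kind      kind;
    struct sketch_port *port;
};

/* FIFO that stands in for the work queue; drained by a single worker. */
static struct work_item work_queue[MAX_WORK];
static unsigned work_head, work_tail;

/* Records what the stubbed LLDD was last told (assumption: mask only). */
static uint8_t last_mask_seen_by_lldd;

void sketch_lldd_port_formed(uint8_t phy_mask)
{
    last_mask_seen_by_lldd = phy_mask;
}

int queue_work_item(enum work_kind kind, struct sketch_port *port)
{
    if (work_tail - work_head == MAX_WORK)
        return -1; /* queue full */
    work_queue[work_tail % MAX_WORK].kind = kind;
    work_queue[work_tail % MAX_WORK].port = port;
    work_tail++;
    return 0;
}

/* Phase I: update the port's phy list and tell the LLDD right away. */
int phase1_bytes_dmaed(struct sketch_port *port, unsigned phy_id)
{
    int first_phy;

    if (phy_id >= MAX_PHYS)
        return -1;
    first_phy = (port->phy_mask == 0);
    port->phy_mask |= (uint8_t)(1u << phy_id);
    sketch_lldd_port_formed(port->phy_mask);
    /* Only the sysfs-like setup is deferred to phase II. */
    return first_phy ? queue_work_item(WORK_SYSFS_ADD, port) : 0;
}

/* Phase I for a phy going away: clear the bit, defer the teardown. */
int phase1_phy_down(struct sketch_port *port, unsigned phy_id)
{
    if (phy_id >= MAX_PHYS)
        return -1;
    port->phy_mask &= (uint8_t)~(1u << phy_id);
    if (port->phy_mask == 0)
        return queue_work_item(WORK_SYSFS_REMOVE, port);
    return 0;
}

/* Phase II: the one place that sets up or tears down sysfs-like state. */
unsigned phase2_run_worker(void)
{
    unsigned done = 0;

    while (work_head != work_tail) {
        struct work_item *w = &work_queue[work_head % MAX_WORK];

        w->port->sysfs_present = (w->kind == WORK_SYSFS_ADD);
        work_head++;
        done++;
    }
    return done;
}
```

Since phase2_run_worker() is the only function that touches
sysfs_present, add and remove requests are applied strictly in the
order they were queued. In a real driver the queue would also need
locking against the event context, which this sketch leaves out.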