From mboxrd@z Thu Jan 1 00:00:00 1970
From: Mike Anderson
Subject: Re: [linux-usb-devel] Re: 2.6.14-rc1 load average calculation broken?
Date: Sun, 18 Sep 2005 20:58:04 -0700
Message-ID: <20050919035804.GA5260@us.ibm.com>
References: <432B43B3.2080801@ppp0.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Return-path:
Received: from e4.ny.us.ibm.com ([32.97.182.144]:52932 "EHLO e4.ny.us.ibm.com")
	by vger.kernel.org with ESMTP id S932163AbVISD6a (ORCPT );
	Sun, 18 Sep 2005 23:58:30 -0400
Received: from d01relay02.pok.ibm.com (d01relay02.pok.ibm.com [9.56.227.234])
	by e4.ny.us.ibm.com (8.12.11/8.12.11) with ESMTP id j8J3wTER030087
	for ; Sun, 18 Sep 2005 23:58:29 -0400
Received: from d01av01.pok.ibm.com (d01av01.pok.ibm.com [9.56.224.215])
	by d01relay02.pok.ibm.com (8.12.10/NCO/VERS6.7) with ESMTP id j8J3wT0f075832
	for ; Sun, 18 Sep 2005 23:58:29 -0400
Received: from d01av01.pok.ibm.com (loopback [127.0.0.1])
	by d01av01.pok.ibm.com (8.12.11/8.13.3) with ESMTP id j8J3wSKI032706
	for ; Sun, 18 Sep 2005 23:58:29 -0400
Content-Disposition: inline
In-Reply-To:
Sender: linux-scsi-owner@vger.kernel.org
List-Id: linux-scsi@vger.kernel.org
To: Alan Stern
Cc: Jan Dittmer , James Bottomley , Pavel Machek , Greg KH , SCSI development list

Alan Stern wrote:
> On Sat, 17 Sep 2005, Jan Dittmer wrote:
>
> > > Maybe the wakeup occurred before ap->ops was set correctly, or after it
> > > was unset. Jan, at what point did the oops happen? Was it right after
> > > the device was detected, during removal, or some other time?
> > >
> > > Can you put in some debugging printk's to see what values are in ap,
> > > ap->ops, and ap->ops->eng_timeout?
> >
> > ap->ops is 0, on dereferencing I get a backtrace. ap has a valid pointer
> > (-573296044 whatever that maps to).
>
> Hmm... I imagine that when the error handler is first starting up,
> ->host_busy is equal to ->host_failed because both are 0. So that really
> is not the appropriate condition to wait for.
> A better approach would be
> to have an atomic_t variable recording the number of pending invocations.
>
> On the whole, I wonder if using kthread_stop here is such a good idea.
> The old mechanism for stopping worked well...
>

Since scsi_eh_wakeup can only be called on a completion or timeout of an
IO, you cannot get a comparison when both are 0 (unless we have a bug
somewhere).

If the increment of host_failed, the increment of host_busy, the decrement
of host_busy, and the comparison of host_busy to host_failed are all done
under the host_lock, why would the atomic_t be better?

-andmike
--
Michael Anderson
andmike@us.ibm.com
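
To make the locking argument above concrete, here is a rough userspace
sketch in C. It is not the kernel code: struct host_model, the model_*
functions, and the eh_wakeups and zero_compares counters are hypothetical
names invented for illustration, and a pthread mutex stands in for the
host_lock. It assumes, as described above, that the host_busy ==
host_failed comparison is only reached from a failure/timeout or from a
completion that finds host_failed non-zero, and that a failed command
stays counted in host_busy.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical userspace stand-in for the host counters discussed above. */
struct host_model {
	pthread_mutex_t host_lock;  /* plays the role of the host_lock */
	unsigned int host_busy;     /* commands issued and not yet completed */
	unsigned int host_failed;   /* outstanding commands that failed or timed out */
	unsigned int eh_wakeups;    /* times the error handler would have been woken */
	unsigned int zero_compares; /* comparisons made while both counters were 0 */
};

static void model_init(struct host_model *h)
{
	pthread_mutex_init(&h->host_lock, NULL);
	h->host_busy = 0;
	h->host_failed = 0;
	h->eh_wakeups = 0;
	h->zero_compares = 0;
}

/* Models the wakeup comparison; the caller must hold host_lock. */
static void model_eh_wakeup(struct host_model *h)
{
	if (h->host_busy == 0 && h->host_failed == 0)
		h->zero_compares++;
	if (h->host_busy == h->host_failed)
		h->eh_wakeups++;
}

/* A new command is issued to the host. */
static void model_issue(struct host_model *h)
{
	pthread_mutex_lock(&h->host_lock);
	h->host_busy++;
	pthread_mutex_unlock(&h->host_lock);
}

/* A command fails or times out; it stays counted in host_busy. */
static void model_fail(struct host_model *h)
{
	pthread_mutex_lock(&h->host_lock);
	assert(h->host_failed < h->host_busy); /* only an outstanding command can fail */
	h->host_failed++;
	model_eh_wakeup(h);
	pthread_mutex_unlock(&h->host_lock);
}

/* A command completes normally. */
static void model_complete(struct host_model *h)
{
	pthread_mutex_lock(&h->host_lock);
	assert(h->host_busy > h->host_failed); /* a non-failed command is outstanding */
	h->host_busy--;
	if (h->host_failed)
		model_eh_wakeup(h);
	pthread_mutex_unlock(&h->host_lock);
}
```

In this model, issuing two commands, failing one, and completing the other
reaches the comparison twice, and only the second time (host_busy ==
host_failed == 1) counts as a wakeup. Every counter update and the
comparison happen with the lock held, so the comparison always sees a
consistent pair of values, and it is never reached with both at 0.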