From mboxrd@z Thu Jan 1 00:00:00 1970
From: Mark Lord
Subject: Re: sata_mv port lockup on hotplug (kernel 2.6.38.2)
Date: Fri, 10 Jun 2011 08:28:51 -0400
Message-ID: <4DF20E03.9030502@teksavvy.com>
References: <20110423005610.GC1576@mtj.dyndns.org> <20110425162242.GB30828@mtj.dyndns.org> <20110426135027.GI878@htj.dyndns.org> <20110426155229.GM878@htj.dyndns.org> <20110430140109.GJ29280@htj.dyndns.org> <20110525094127.GF10146@htj.dyndns.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Return-path:
Received: from ironport2-out.teksavvy.com ([206.248.154.183]:36825 "EHLO ironport2-out.pppoe.ca" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1755770Ab1FJM24 (ORCPT ); Fri, 10 Jun 2011 08:28:56 -0400
In-Reply-To: <20110525094127.GF10146@htj.dyndns.org>
Sender: linux-ide-owner@vger.kernel.org
List-Id: linux-ide@vger.kernel.org
To: Tejun Heo
Cc: Bruce Stenning, "linux-kernel@vger.kernel.org", "linux-ide@vger.kernel.org"

On 11-05-25 05:41 AM, Tejun Heo wrote:
> Hello, sorry about the long delay.
>
> On Tue, May 17, 2011 at 04:30:20PM +0100, Bruce Stenning wrote:
>> __ata_port_freeze: ata4 port frozen
>> ata4: hard resetting link
>> sata_link_hardreset: ENTER
>> ata4: COMRESET failed (errno=-32)
>> sata_link_hardreset: EXIT, rc=-32
>> ata4: reset failed (errno=-32), retrying in 33 secs
>> __ata_port_freeze: ata4 port frozen
>> ata4: hard resetting link
>> sata_link_hardreset: ENTER
>> ata4: COMRESET failed (errno=-32)
>> sata_link_hardreset: EXIT, rc=-32
>> ata4: reset failed, giving up
>> ata_eh_recover: EXIT, rc=-32
>> ata4.00: disabled
>> ata4: EH complete
>> ata_scsi_error: EXIT
>>
>> The IRQ for that port is masked off afterwards.
>
> This is a different issue. libata EH plugs the port if reset fails
> repeatedly. This behavior was implemented to avoid causing continuous
> resets on a port in case it has flaky PHY state reporting; however, it
> seems to cause more trouble than fixing issues - ie. plugging in a
> broken device may end up plugging the port even after the offending
> device is removed until manual rescan or reboot. I've been pondering
> about changing the behavior like the following.
>
> diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
> index dfb6e9d..05797fe 100644
> --- a/drivers/ata/libata-eh.c
> +++ b/drivers/ata/libata-eh.c
> @@ -2885,8 +2885,17 @@ int ata_eh_reset(struct ata_link *link, int classify,
>  		    sata_scr_read(link, SCR_STATUS, &sstatus))
>  			rc = -ERESTART;
>
> -		if (rc == -ERESTART || try >= max_tries)
> +		if (rc == -ERESTART || try >= max_tries) {
> +			/*
> +			 * Thaw host port even if reset failed, so that the port
> +			 * can be retried on the next phy event. This risks
> +			 * repeated EH runs but seems to be a better tradeoff than
> +			 * shutting down a port after a botched hotplug attempt.
> +			 */
> +			if (ata_is_host_link(link))
> +				ata_eh_thaw_port(ap);
>  			goto out;
> +		}
>
>  		now = jiffies;
>  		if (time_before(now, deadline)) {

Tejun, did this ever go upstream and to -stable ??

I'm asking because I see the same issue with other SATA controllers,
in particular with sata_sil boards. Hot plug generally works _once_
per port, and then stops working.
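
For readers following the thread, the "works once, then stops" symptom and the effect of the proposed thaw can be illustrated with a small userspace toy model. This is only a sketch under stated assumptions, not libata code: `struct toy_port`, `toy_eh_reset()`, `toy_hotplug_event()` and the `thaw_on_failure` switch are hypothetical names invented for illustration. The one detail taken from the log above is the error value, since errno 32 is EPIPE on Linux.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy model only: every name here is hypothetical, not a libata API. */
struct toy_port {
	bool frozen;     /* a frozen port ignores hotplug (PHY) events */
	bool device_ok;  /* whether a reset attempt would currently succeed */
	int eh_runs;     /* number of error-handling passes so far */
};

/*
 * One error-handling pass: freeze the port, attempt a reset up to
 * max_tries times, and thaw on success.  After the final failure the
 * port is thawed only if thaw_on_failure is set, which stands in for
 * the behavior change proposed in the quoted patch.
 */
static int toy_eh_reset(struct toy_port *p, int max_tries, bool thaw_on_failure)
{
	assert(max_tries > 0);

	p->eh_runs++;
	p->frozen = true;

	for (int attempt = 0; attempt < max_tries; attempt++) {
		if (p->device_ok) {
			p->frozen = false;
			return 0;
		}
	}

	if (thaw_on_failure)
		p->frozen = false;

	return -EPIPE; /* errno 32 on Linux, as in the quoted log */
}

/*
 * A hotplug event reaches error handling only while the port is thawed.
 * Returns true if an error-handling pass was triggered.
 */
static bool toy_hotplug_event(struct toy_port *p, int max_tries,
			      bool thaw_on_failure)
{
	if (p->frozen)
		return false;

	toy_eh_reset(p, max_tries, thaw_on_failure);
	return true;
}
```

With `thaw_on_failure` false, a first hotplug of a broken device leaves the toy port frozen, so a later hotplug of a good device is never noticed. With it true, the second event triggers a fresh error-handling pass, at the cost of possibly repeated passes if the PHY keeps reporting events.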