From: Hannes Reinecke <hare@suse.de>
To: Sean Bruno <sean.bruno@dsl-only.net>
Cc: linux-scsi@vger.kernel.org
Subject: Re: Adaptec 29320 [aic79xx] fails on power cycle of LUN
Date: Fri, 20 Oct 2006 09:01:54 +0200
Message-ID: <45387462.10300@suse.de>
In-Reply-To: <1161274711.3204.41.camel@home-desk>
[-- Attachment #1: Type: text/plain, Size: 2649 bytes --]
Sean Bruno wrote:
> On Thu, 2006-10-19 at 16:10 +0200, Hannes Reinecke wrote:
>> Sean Bruno wrote:
>>> On Thu, 2006-10-19 at 01:52 -0400, Mike Christie wrote:
>>>> On Wed, 2006-10-18 at 15:32 -0700, Sean Bruno wrote:
>>>>> On Wed, 2006-10-18 at 15:24 -0700, Sean Bruno wrote:
>>>>>> I have had a tough time tracking this one down, however I can say for
>>>>>> certain that the 29320 is really having trouble if a LUN is power
>>>>>> cycled.
>>>>>>
>>>>>> I don't have access to a BUS analyzer right now, but here is my
>>>>>> regression.
>>>>>>
>>>>>> 1. Hook an external SCSI array/disk to a 29320.
>>>>>> 2. Power up SCSI array/disk
>>>>>> 3. Power up PC with 29320.
>>>>>> 4. When PC has booted, log in and test the device by creating a file
>>>>>> system, e.g. mkfs /dev/sda (or whatever disk the array is called on
>>>>>> your machine).
>>>>>> 5. Power cycle array/disk
>>>>>> 6. Retest device with another 'mkfs /dev/sda' ... panic/crash/lock-up
>>>>>> ensues.
>>>>>>
>>>>>>
>>>>>>
>>>>>> This did not happen in 2.6.15.7 but did appear in 2.6.16 and higher.
>>>>>>
>>>> Does this only occur with sg or is that the only way you got a trace? In
>>>> the original bug report you mentioned it occurring with mkfs, but the
>>>> bug oops is from a sg request. Is tdg_2 run while the mkfs is running?
>>> Snippets from 'dmesg' during step 6:
>>>
>>> scsi0: Someone reset channel A
>>> sd 0:0:4:0: Attempting to queue an ABORT message:CDB: 0x28 0x0 0x0 0x0
>>> 0x0 0x80 0x0 0x0 0x80 0x0
>>> Infinite interrupt loop, INTSTAT = 8scsi0: At time of recovery, card was
>>> paused
>> Ah. Hmm. Infinite SCSI interrupt.
>>
>> Maybe someone forgot to clear the status ...
>>
>> Can you try the attached patch?
>>
>> Cheers,
>>
>> Hannes
>
> Better. The patch allows me to cycle power on the array exactly once.
> So the new regression is:
>
> 1. Hook an external SCSI array/disk to a 29320.
> 2. Power up SCSI array/disk
> 3. Power up PC with 29320.
> 4. When PC has booted, log in and test the device by creating a file
> system, e.g. mkfs /dev/sda (or whatever disk the array is called on
> your machine).
> 5. Power cycle array/disk
> 6. Retest device with another 'mkfs /dev/sda' <-- works just fine!
> 7. Power cycle array/disk
> 8. No need to do anything, card dump in dmesg/messages appears and
> device is not usable:
>
OK, not bad. So we have to switch to non-packetized commands after a reset.
Makes sense. Care to try the updated patch?
Thanks for all the testing!
Cheers,
Hannes
--
Dr. Hannes Reinecke hare@suse.de
SuSE Linux Products GmbH S390 & zSeries
Maxfeldstraße 5 +49 911 74053 688
90409 Nürnberg http://www.suse.de
[-- Attachment #2: aic79xx-external-device-reset --]
[-- Type: text/plain, Size: 3984 bytes --]
diff --git a/drivers/scsi/aic7xxx/aic79xx_core.c b/drivers/scsi/aic7xxx/aic79xx_core.c
index 653818d..555920a 100644
--- a/drivers/scsi/aic7xxx/aic79xx_core.c
+++ b/drivers/scsi/aic7xxx/aic79xx_core.c
@@ -1053,10 +1053,12 @@ #endif
* If a target takes us into the command phase
* assume that it has been externally reset and
* has thus lost our previous packetized negotiation
- * agreement.
- * Revert to async/narrow transfers until we
- * can renegotiate with the device and notify
- * the OSM about the reset.
+ * agreement. Since we have not sent an identify
+ * message and may not have fully qualified the
+ * connection, we change our command to TUR, assert
+ * ATN and ABORT the task when we go to message in
+ * phase. The OSM will see the REQUEUE_REQUEST
+ * status and retry the command.
*/
scbid = ahd_get_scbptr(ahd);
scb = ahd_lookup_scb(ahd, scbid);
@@ -1083,7 +1085,28 @@ #endif
ahd_set_syncrate(ahd, &devinfo, /*period*/0,
/*offset*/0, /*ppr_options*/0,
AHD_TRANS_ACTIVE, /*paused*/TRUE);
- scb->flags |= SCB_EXTERNAL_RESET;
+ /* Hand-craft TUR command */
+ ahd_outb(ahd, SCB_CDB_STORE, 0);
+ ahd_outb(ahd, SCB_CDB_STORE+1, 0);
+ ahd_outb(ahd, SCB_CDB_STORE+2, 0);
+ ahd_outb(ahd, SCB_CDB_STORE+3, 0);
+ ahd_outb(ahd, SCB_CDB_STORE+4, 0);
+ ahd_outb(ahd, SCB_CDB_STORE+5, 0);
+ ahd_outb(ahd, SCB_CDB_LEN, 6);
+ scb->hscb->control &= ~(TAG_ENB|SCB_TAG_TYPE);
+ scb->hscb->control |= MK_MESSAGE;
+ ahd_outb(ahd, SCB_CONTROL, scb->hscb->control);
+ ahd_outb(ahd, MSG_OUT, HOST_MSG);
+ ahd_outb(ahd, SAVED_SCSIID, scb->hscb->scsiid);
+ /*
+ * The lun is 0, regardless of the SCB's lun
+ * as we have not sent an identify message.
+ */
+ ahd_outb(ahd, SAVED_LUN, 0);
+ ahd_outb(ahd, SEQ_FLAGS, 0);
+ ahd_assert_atn(ahd);
+ scb->flags &= ~SCB_PACKETIZED;
+ scb->flags |= SCB_ABORT|SCB_EXTERNAL_RESET;
ahd_freeze_devq(ahd, scb);
ahd_set_transaction_status(scb, CAM_REQUEUE_REQ);
ahd_freeze_scb(scb);
@@ -1519,8 +1542,10 @@ ahd_handle_scsiint(struct ahd_softc *ahd
/*
* Ignore external resets after a bus reset.
*/
- if (((status & SCSIRSTI) != 0) && (ahd->flags & AHD_BUS_RESET_ACTIVE))
+ if (((status & SCSIRSTI) != 0) && (ahd->flags & AHD_BUS_RESET_ACTIVE)) {
+ ahd_outb(ahd, CLRSINT1, CLRSCSIRSTI);
return;
+ }
/*
* Clear bus reset flag
@@ -2200,6 +2225,22 @@ ahd_handle_nonpkt_busfree(struct ahd_sof
if (sent_msg == MSG_ABORT_TAG)
tag = SCB_GET_TAG(scb);
+ if ((scb->flags & SCB_EXTERNAL_RESET) != 0) {
+ /*
+ * This abort is in response to an
+ * unexpected switch to command phase
+ * for a packetized connection. Since
+ * the identify message was never sent,
+ * "saved lun" is 0. We really want to
+ * abort only the SCB that encountered
+ * this error, which could have a different
+ * lun. The SCB will be retried so the OS
+ * will see the UA after renegotiating to
+ * packetized.
+ */
+ tag = SCB_GET_TAG(scb);
+ saved_lun = scb->hscb->lun;
+ }
found = ahd_abort_scbs(ahd, target, 'A', saved_lun,
tag, ROLE_INITIATOR,
CAM_REQ_ABORTED);
@@ -7920,6 +7961,11 @@ #endif
ahd_clear_fifo(ahd, 1);
/*
+ * Clear SCSI interrupt status
+ */
+ ahd_outb(ahd, CLRSINT1, CLRSCSIRSTI);
+
+ /*
* Reenable selections
*/
ahd_outb(ahd, SIMODE1, ahd_inb(ahd, SIMODE1) | ENSCSIRST);
@@ -7952,10 +7998,6 @@ #ifdef AHD_TARGET_MODE
}
}
#endif
- /* Notify the XPT that a bus reset occurred */
- ahd_send_async(ahd, devinfo.channel, CAM_TARGET_WILDCARD,
- CAM_LUN_WILDCARD, AC_BUS_RESET);
-
/*
* Revert to async/narrow transfers until we renegotiate.
*/
@@ -7977,6 +8019,10 @@ #endif
}
}
+ /* Notify the XPT that a bus reset occurred */
+ ahd_send_async(ahd, devinfo.channel, CAM_TARGET_WILDCARD,
+ CAM_LUN_WILDCARD, AC_BUS_RESET);
+
ahd_restart(ahd);
return (found);
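
For readers following the hunk that hand-crafts the TUR command: the six
zero bytes written to SCB_CDB_STORE, with SCB_CDB_LEN set to 6, form a SCSI
TEST UNIT READY CDB (opcode 0x00, all remaining bytes zero). The sketch
below is only an illustration of that byte layout in plain user-space C, and
it also decodes the READ(10) CDB printed in the dmesg snippet earlier in the
thread. The names build_tur_cdb, decode_read10 and struct read10 are
hypothetical helpers made up for this example; they are not part of the
aic79xx driver.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CDB6_LEN            6
#define CDB10_LEN           10
#define OP_TEST_UNIT_READY  0x00
#define OP_READ_10          0x28

/*
 * Hypothetical helper: fill a 6-byte buffer with TEST UNIT READY.
 * Every byte of this CDB is zero, which matches the six zero writes
 * to SCB_CDB_STORE..SCB_CDB_STORE+5 in the patch.
 */
static void build_tur_cdb(uint8_t *cdb)
{
    assert(cdb != NULL);
    memset(cdb, 0, CDB6_LEN);
    cdb[0] = OP_TEST_UNIT_READY;
}

/* Hypothetical container for the fields of a READ(10) CDB. */
struct read10 {
    uint32_t lba;     /* logical block address, CDB bytes 2..5 */
    uint16_t blocks;  /* transfer length in blocks, CDB bytes 7..8 */
};

/*
 * Hypothetical helper: decode a 10-byte READ(10) CDB.
 * Returns 0 on success, -1 if the opcode is not READ(10).
 * Multi-byte fields in SCSI CDBs are big-endian.
 */
static int decode_read10(const uint8_t *cdb, struct read10 *out)
{
    assert(cdb != NULL && out != NULL);
    if (cdb[0] != OP_READ_10)
        return -1;
    out->lba = ((uint32_t)cdb[2] << 24) | ((uint32_t)cdb[3] << 16) |
               ((uint32_t)cdb[4] << 8)  |  (uint32_t)cdb[5];
    out->blocks = (uint16_t)(((uint16_t)cdb[7] << 8) | cdb[8]);
    return 0;
}
```

Feeding decode_read10 the bytes from the log (0x28 0x0 0x0 0x0 0x0 0x80 0x0
0x0 0x80 0x0) yields LBA 128 and a transfer length of 128 blocks, i.e. the
command being aborted was an ordinary read.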