* [patch 16/17] mptbase: reset ioc initiator during PCI resume
@ 2007-10-02 21:38 akpm
  2007-10-02 22:51 ` Moore, Eric
  0 siblings, 1 reply; 4+ messages in thread
From: akpm @ 2007-10-02 21:38 UTC (permalink / raw)
  To: James.Bottomley; +Cc: linux-scsi, akpm, djwong

From: "Darrick J. Wong" <djwong@us.ibm.com>

It appears that the LSI SAS 1064E chip needs to be reset after a
suspend/resume cycle before the driver attempts further communications with
the chip.  Without this patch, resuming the chip results in this error
message being printed repeatedly and no more disk I/O.

mptbase: ioc0: ERROR - Invalid IOC facts reply, msgLength=0 offsetof=6!

So far it seems to fix suspend/resume on all the MPT Fusion cards I have
(SAS and U320 SCSI) but since I don't know the internals of that chip I
can't say for sure if this is a proper fix.

Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 drivers/message/fusion/mptbase.c |    6 ++++++
 1 file changed, 6 insertions(+)

diff -puN drivers/message/fusion/mptbase.c~mptbase-reset-ioc-initiator-during-pci-resume drivers/message/fusion/mptbase.c
--- a/drivers/message/fusion/mptbase.c~mptbase-reset-ioc-initiator-during-pci-resume
+++ a/drivers/message/fusion/mptbase.c
@@ -1830,6 +1830,12 @@ mpt_resume(struct pci_dev *pdev)
 		(mpt_GetIocState(ioc, 1) >> MPI_IOC_STATE_SHIFT),
 		CHIPREG_READ32(&ioc->chip->Doorbell));
 
+	/* put ioc into READY_STATE */
+	if(SendIocReset(ioc, MPI_FUNCTION_IOC_MESSAGE_UNIT_RESET, CAN_SLEEP)) {
+		printk(MYIOC_s_ERR_FMT
+		"pci-resume:  IOC msg unit reset failed!\n", ioc->name);
+	}
+
 	/* bring ioc to operational state */
 	if ((recovery_state = mpt_do_ioc_recovery(ioc,
 	    MPT_HOSTEVENT_IOC_RECOVER, CAN_SLEEP)) != 0) {
_

* RE: [patch 16/17] mptbase: reset ioc initiator during PCI resume
  2007-10-02 21:38 [patch 16/17] mptbase: reset ioc initiator during PCI resume akpm
@ 2007-10-02 22:51 ` Moore, Eric
  2007-10-02 23:06   ` Darrick J. Wong
  0 siblings, 1 reply; 4+ messages in thread
From: Moore, Eric @ 2007-10-02 22:51 UTC (permalink / raw)
  To: akpm, James.Bottomley; +Cc: linux-scsi, djwong

On Tuesday, October 02, 2007 3:38 PM,  Darrick J. Wong wrote:

> It appears that the LSI SAS 1064E chip needs to be reset after a
> suspend/resume cycle before the driver attempts further communications with
> the chip.  Without this patch, resuming the chip results in this error
> message being printed repeatedly and no more disk I/O.
>
> mptbase: ioc0: ERROR - Invalid IOC facts reply, msgLength=0 offsetof=6!
>
> So far it seems to fix suspend/resume on all the MPT Fusion cards I have
> (SAS and U320 SCSI) but since I don't know the internals of that chip I
> can't say for sure if this is a proper fix.
>

I replied to this thread a couple of times last week and got no response
from Darrick.  I doubt this is required because the MESSAGE_UNIT_RESET is
issued from inside mpt_do_ioc_recovery.  I need some logs with debug
enabled.  Darrick, did you see my email?

Eric  

* Re: [patch 16/17] mptbase: reset ioc initiator during PCI resume
  2007-10-02 22:51 ` Moore, Eric
@ 2007-10-02 23:06   ` Darrick J. Wong
  2007-10-03 19:32     ` Moore, Eric
  0 siblings, 1 reply; 4+ messages in thread
From: Darrick J. Wong @ 2007-10-02 23:06 UTC (permalink / raw)
  To: Moore, Eric; +Cc: akpm, James.Bottomley, linux-scsi

On Tue, Oct 02, 2007 at 04:51:48PM -0600, Moore, Eric wrote:

> I replied to this thread a couple of times last week and got no response
> from Darrick.  I doubt this is required because the MESSAGE_UNIT_RESET is
> issued from inside mpt_do_ioc_recovery.  I need some logs with debug
> enabled.  Darrick, did you see my email?

Yep.  Replied to it, too.  Apparently it never got to you, so I've
attached it below.

--D

---------------------

On Thu, Sep 20, 2007 at 07:06:35PM -0600, Moore, Eric wrote:
> Darrick - MESSAGE_UNIT_RESET is already issued from inside
> mpt_do_ioc_recovery(), so you don't need to send this in advance of
> that.  You will find that occurring in the function MakeIocReady.
> Anyway, would it be possible for you to enable debug logging so I can
> see what problem you're having?  I suggest MPT_DEBUG and MPT_DEBUG_INIT.
> If it's possible, load mptbase manually; that way you can set
> the command line option.

I took a look at MakeIocReady(), and this section caught my eye:

/* Is it already READY? */
if (!statefault && (ioc_state & MPI_IOC_STATE_MASK) == MPI_IOC_STATE_READY)
	return 0;

So I turned on a whole lot more debugging (mpt_debug_level=65535), and
caught this from the dhsprintk() just above that code snippet:

mptbase::MakeIocReady, ioc0 [raw] state=10000000

state=10000000 seems to correspond with MPI_IOC_STATE_READY, which means
that the adapter isn't getting reset because the chip claims to be
ready.  It doesn't seem to be ready, as demonstrated by the original error
message that I reported with the patch.  I'll append the log entries
pertaining to mpt to the end of this message.
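
For reference, the state decoding can be reproduced outside the driver.
This is only a minimal sketch: the mask, shift, and state values below are
written out by hand and are assumed to match the driver's MPI headers, and
ioc_state_name() and ioc_state_code() are hypothetical helper names, not
functions in mptbase.

```c
#include <stdint.h>

/* Assumed to mirror the MPI header definitions; verify against the tree. */
#define MPI_IOC_STATE_MASK        0xF0000000u
#define MPI_IOC_STATE_SHIFT       28
#define MPI_IOC_STATE_RESET       0x00000000u
#define MPI_IOC_STATE_READY       0x10000000u
#define MPI_IOC_STATE_OPERATIONAL 0x20000000u
#define MPI_IOC_STATE_FAULT       0x40000000u

/* Hypothetical helper: name the IOC state held in the top doorbell bits. */
static const char *ioc_state_name(uint32_t doorbell)
{
	switch (doorbell & MPI_IOC_STATE_MASK) {
	case MPI_IOC_STATE_RESET:
		return "RESET";
	case MPI_IOC_STATE_READY:
		return "READY";
	case MPI_IOC_STATE_OPERATIONAL:
		return "OPERATIONAL";
	case MPI_IOC_STATE_FAULT:
		return "FAULT";
	default:
		return "UNKNOWN";
	}
}

/* Hypothetical helper: the shifted state code, as printed by pci-resume. */
static uint32_t ioc_state_code(uint32_t doorbell)
{
	return (doorbell & MPI_IOC_STATE_MASK) >> MPI_IOC_STATE_SHIFT;
}
```

Under those assumptions, a raw value of 0x10000000 decodes to READY with a
shifted code of 1, which agrees with the "ioc-state=0x1,doorbell=0x10000000"
line in the pci-resume log below.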

--D

(Driver sign-on message if you were curious)

[  164.467481] Fusion MPT base driver 3.04.05
[  164.471706] Copyright (c) 1999-2007 LSI Logic Corporation
[  164.492483] Fusion MPT SAS Host driver 3.04.05
[  167.066482] ACPI: PCI Interrupt 0000:0c:03.0[A] -> <6>ACPI: PCI Interrupt 0000:01:00.0[A] -> GSI 16 (level, low) -> IRQ 16
[  167.066534] mptbase: Initiating ioc0 bringup
[  167.761481] ioc0: LSISAS1064E B0: Capabilities={Initiator}
[  178.681050] scsi6 : ioc0: LSISAS1064E B0, FwRev=00060200h, Ports=1, MaxQ=511, IRQ=16
[  178.741821] scsi 6:0:0:0: Direct-Access     IBM-ESXS GNA073C3ESTT0Z N BH0C PQ: 0 ANSI: 5
[  178.816476] sd 6:0:0:0: [sda] 143374000 512-byte hardware sectors (73407 MB)
[  178.825198] sd 6:0:0:0: [sda] Write Protect is off
[  178.830088] sd 6:0:0:0: [sda] Mode Sense: d3 00 10 08
[  178.831204] sd 6:0:0:0: [sda] Write cache: disabled, read cache: enabled, supports DPO and FUA
[  178.845101] sd 6:0:0:0: [sda] 143374000 512-byte hardware sectors (73407 MB)
[  178.853483] sd 6:0:0:0: [sda] Write Protect is off
[  178.858343] sd 6:0:0:0: [sda] Mode Sense: d3 00 10 08
[  178.859961] sd 6:0:0:0: [sda] Write cache: disabled, read cache: enabled, supports DPO and FUA
[  178.869069]  sda: sda1 sda2 sda3 sda4
[  178.877690] sd 6:0:0:0: [sda] Attached SCSI disk
[  178.912356] sd 6:0:0:0: Attached scsi generic sg0 type 0

(put system to sleep)

[  821.678155] mptbase: ioc0: pci-suspend: pdev=0xffff81003f64a000, slot=0000:01:00.0, Entering operating state [D3]
[  821.678195] mptbase: ioc0: Sending IOC reset(0x40)!
[  821.813585] mptbase: ioc0: WaitForDoorbell ACK (count=16)
[  821.814120] ACPI: PCI interrupt for device 0000:01:00.0 disabled

(wake system up)

[  891.307583] mptbase: ioc0: pci-resume: pdev=0xffff81003f64a000, slot=0000:01:00.0, Previous operating state [D3]
[  891.431146] PM: Writing back config space on device 0000:01:00.0 at offset 1 (was 100000, writing 100107)
[  891.431174] ACPI: PCI Interrupt 0000:01:00.0[A] -> GSI 16 (level, low) -> IRQ 16
[  891.431179] mptbase: ioc0: pci-resume: ioc-state=0x1,doorbell=0x10000000
[  891.431182] mptbase: Initiating ioc0 recovery
[  891.431184] mptbase::MakeIocReady, ioc0 [raw] state=10000000
[  891.431187] mptbase: ioc0: Sending get IocFacts request req_sz=12 reply_sz=80
[  894.723823] mptbase: ioc0: WaitForDoorbell INT (cnt=412) howlong=5
[  894.723826] mptbase: ioc0: HandShake request start reqBytes=12, WaitCnt=412
[  894.723830] mptbase: ioc0: Sending get IocFacts request req_sz=12 reply_sz=80
[  894.731815] mptbase: ioc0: WaitForDoorbell INT (cnt=1) howlong=5
[  894.731817] mptbase: ioc0: HandShake request start reqBytes=12, WaitCnt=1
[  894.739806] mptbase: ioc0: WaitForDoorbell ACK (count=0)
[  894.747799] mptbase: ioc0: WaitForDoorbell ACK (count=0)
[  894.755791] mptbase: ioc0: WaitForDoorbell ACK (count=0)
[  894.763781] mptbase: ioc0: WaitForDoorbell ACK (count=0)
[  894.763784] mptbase: ioc0: Handshake request frame (@ffff810028c81918) header
[  894.763786] mptbase: ioc0: HandShake request post done, WaitCnt=0
[  894.763789] mptbase: ioc0: WaitForDoorbell INT (cnt=0) howlong=5
[  894.771775] mptbase: ioc0: WaitForDoorbell INT (cnt=1) howlong=5
[  894.771778] mptbase: ioc0: WaitCnt=1 First handshake reply word=03000000
[  894.779766] mptbase: ioc0: WaitForDoorbell INT (cnt=1) howlong=5
[  894.779769] mptbase: ioc0: Got Handshake reply:
[  894.779770] mptbase: ioc0: WaitForDoorbell REPLY WaitCnt=1 (sz=1)
[  894.779772] mptbase: ioc0: HandShake reply count=1
[  894.779775] mptbase: ioc0: ERROR - Invalid IOC facts reply, msgLength=0 offsetof=6!
<repeat>

* RE: [patch 16/17] mptbase: reset ioc initiator during PCI resume
  2007-10-02 23:06   ` Darrick J. Wong
@ 2007-10-03 19:32     ` Moore, Eric
  0 siblings, 0 replies; 4+ messages in thread
From: Moore, Eric @ 2007-10-03 19:32 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: akpm, James.Bottomley, linux-scsi

On Tuesday, October 02, 2007 5:06 PM,  Darrick J. Wong wrote:
> Yep.  Replied to it, too.  Apparently it never got to you, so I've
> attached it below.
> 

Sorry, I didn't receive the previous email you sent. 

> ---------------------
> 
> On Thu, Sep 20, 2007 at 07:06:35PM -0600, Moore, Eric wrote:
> > Darrick - MESSAGE_UNIT_RESET is already issued from inside
> > mpt_do_ioc_recovery(), so you don't need to send this in advance of
> > that.  You will find that occurring in the function MakeIocReady.
> > Anyway, would it be possible for you to enable debug logging so I can
> > see what problem you're having?  I suggest MPT_DEBUG and MPT_DEBUG_INIT.
> > If it's possible, load mptbase manually; that way you can set
> > the command line option.
>
> I took a look at MakeIocReady(), and this section caught my eye:
>
> /* Is it already READY? */
> if (!statefault && (ioc_state & MPI_IOC_STATE_MASK) == MPI_IOC_STATE_READY)
> 	return 0;

Yes, the purpose of MakeIocReady is to get the card into READY state.  If
you're already in READY state, there is no reason to continue in
MakeIocReady.  A MESSAGE_UNIT_RESET places the card into READY state.
You will see that we already issue MESSAGE_UNIT_RESET from mpt_suspend,
so the card should be in READY state coming into mpt_resume, depending on
which power state you transitioned to during suspend.  The code you added
in this patch is not required: we don't need to send MESSAGE_UNIT_RESET
prior to mpt_do_ioc_recovery, because MakeIocReady will issue a
MESSAGE_UNIT_RESET if you're not already in READY.  I suspect there must
be something else going on if you have to issue MESSAGE_UNIT_RESET when
you're already in READY state.  My card works fine without your patch.
I did the following:

# echo standby > /sys/power/state
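
To make the MakeIocReady short-circuit described above concrete, here is a
simplified user-space model.  It is a sketch under stated assumptions, not
the driver code: struct sketch_ioc, sketch_send_reset() and
sketch_make_ioc_ready() are hypothetical names, the SKETCH_* state values
are assumed to match the MPI headers, and the real function also handles
timeouts and harder reset paths that are left out here.

```c
#include <stdint.h>

/* Assumed state encoding (top nibble of the doorbell register). */
#define SKETCH_STATE_MASK        0xF0000000u
#define SKETCH_STATE_READY       0x10000000u
#define SKETCH_STATE_OPERATIONAL 0x20000000u
#define SKETCH_STATE_FAULT       0x40000000u

/* Hypothetical stand-in for the adapter: a doorbell value plus a counter. */
struct sketch_ioc {
	uint32_t doorbell;
	int resets_sent;
};

/* Model of a message unit reset: count it and report READY afterwards. */
static void sketch_send_reset(struct sketch_ioc *ioc)
{
	ioc->resets_sent++;
	ioc->doorbell = SKETCH_STATE_READY;
}

/* Returns 0 once the modeled IOC reports READY, -1 otherwise. */
static int sketch_make_ioc_ready(struct sketch_ioc *ioc)
{
	uint32_t state = ioc->doorbell & SKETCH_STATE_MASK;
	int statefault = (state == SKETCH_STATE_FAULT);

	/* Is it already READY?  Then no reset is issued at all. */
	if (!statefault && state == SKETCH_STATE_READY)
		return 0;

	/* Any other state: reset, then re-check what the doorbell reports. */
	sketch_send_reset(ioc);
	state = ioc->doorbell & SKETCH_STATE_MASK;
	return (state == SKETCH_STATE_READY) ? 0 : -1;
}
```

In this model an adapter that reports READY never receives a reset, even if
it is not actually responsive, while one reporting OPERATIONAL or FAULT is
reset exactly once.  That is the case the debug log in this thread points
at: the doorbell reads 0x10000000 on resume, so the reset path is skipped.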


There could be issues in the firmware you're using.  I noticed
FwRev=00060200h in the log, which is 6.02, and over a year old.

I will send out a separate email, with you copied, to the IBM system
engineer support here at LSI, who should be able to assist on this issue.

Eric

