public inbox for linux-scsi@vger.kernel.org
* phase change messages causing slowdown with sym53c8xx_2 driver
@ 2004-11-30  3:02 Jose R. Santos
  2004-12-01 16:56 ` Jose R. Santos
  0 siblings, 1 reply; 7+ messages in thread
From: Jose R. Santos @ 2004-11-30  3:02 UTC (permalink / raw)
  To: linux-scsi; +Cc: linux-kernel

Hi folks,

I'm having a bit of trouble with an integrated SCSI adapter using the
sym53c8xx_2 driver on a RS6K-170.  Somewhere during 2.6.9 development I started
seeing a bunch of "phase change" messages generated every time I did any IO on 
the disks attached to the SCSI adapter.

Nov 28 23:05:12 orb kernel: sym0: <896> rev 0x5 at pci 0000:00:0c.0 irq 20
Nov 28 23:05:12 orb kernel: sym0: No NVRAM, ID 7, Fast-40, SE, parity checking
Nov 28 23:05:12 orb kernel: sym0: SCSI BUS has been reset.
Nov 28 23:05:12 orb kernel: scsi0 : sym-2.1.18m
Nov 28 23:05:12 orb kernel: sym0:1: FAST-20 WIDE SCSI 40.0 MB/s ST (50.0 ns, offset 15)
Nov 28 23:05:12 orb kernel:   Vendor: IBM       Model: DGHS09U	Rev: 03E0
Nov 28 23:05:12 orb kernel:   Type:   Direct-Access		ANSI SCSI revision: 03
Nov 28 23:05:12 orb kernel:  target0:0:1: Beginning Domain Validation
Nov 28 23:05:12 orb kernel: sym0:1:0:phase change 6-7 9@10050390 resid=6.
Nov 28 23:05:12 orb last message repeated 10 times
Nov 28 23:05:12 orb kernel: sym0:1:0:phase change 6-7 9@1005039c resid=6.
Nov 28 23:05:12 orb kernel: sym0:1:0:phase change 6-7 9@10050390 resid=6.
Nov 28 23:05:12 orb kernel: sym0:1:0:phase change 6-7 9@10050390 resid=6.
Nov 28 23:05:12 orb kernel: sym0:1:0:phase change 6-7 9@1005039c resid=6.
Nov 28 23:05:12 orb kernel:  target0:0:1: Domain Validation skipping write tests
Nov 28 23:05:12 orb kernel:  target0:0:1: Ending Domain Validation
Nov 28 23:05:12 orb kernel: sym0:1:0:phase change 6-7 9@10050390 resid=6.

When these errors show up, the maximum performance I can get out of the disk is
about 1.3MB/s.  After several hours, the adapter seems to receive some ABORT
operations and the messages stop showing.  Once this happens, performance for 
the disk goes back to 15MB/s.

Nov 29 06:32:43 orb kernel: sym0:1:0: ABORT operation started.
Nov 29 06:32:43 orb kernel: sym0:1:control msgout: 80 6.
Nov 29 06:32:43 orb kernel: sym0:1:0: ABORT operation complete.
Nov 29 06:32:53 orb kernel: sym0:1:0: ABORT operation started.
Nov 29 06:32:53 orb kernel: sym0:1:0: ABORT operation complete.
Nov 29 06:32:53 orb kernel: sym0:1:0: ABORT operation started.
Nov 29 06:32:53 orb kernel: sym0:1:control msgout: 80 6.
Nov 29 06:32:53 orb kernel: sym0:1:0: ABORT operation complete.
Nov 29 06:32:53 orb kernel: sym0:1:0:phase change 6-7 9@10050b90 resid=6.
Nov 29 06:32:53 orb kernel: sym0:1:0: DEVICE RESET operation started.
Nov 29 06:32:53 orb kernel: sym0:1:0: DEVICE RESET operation complete.
Nov 29 06:32:53 orb kernel: sym0:1:control msgout: c.
Nov 29 06:32:53 orb kernel: sym0: TARGET 1 has been reset.
Nov 29 06:33:03 orb kernel: sym0:1:0: ABORT operation started.
Nov 29 06:33:03 orb kernel: sym0:1:0: ABORT operation complete.
Nov 29 06:33:03 orb kernel: sym0:1:0: BUS RESET operation started.
Nov 29 06:33:03 orb kernel: sym0:1:0: BUS RESET operation complete.
Nov 29 06:33:03 orb kernel: sym0: SCSI BUS reset detected.
Nov 29 06:33:03 orb kernel: sym0: SCSI BUS has been reset.
Nov 29 06:33:14 orb kernel: sym0:1:0:phase change 6-7 9@10050b90 resid=6.
Nov 29 06:33:14 orb kernel: sym0:1:0:phase change 6-7 9@10050b9c resid=6.

The last kernel I tried was a bk pull this Sunday and it still shows the
problem.  I don't know enough about SCSI to figure out what's going on.  Anybody
care to enlighten me as to what the problem is?

Thanks

-JRS


* Re: phase change messages causing slowdown with sym53c8xx_2 driver
  2004-11-30  3:02 phase change messages causing slowdown with sym53c8xx_2 driver Jose R. Santos
@ 2004-12-01 16:56 ` Jose R. Santos
  2004-12-01 17:16   ` James Bottomley
  0 siblings, 1 reply; 7+ messages in thread
From: Jose R. Santos @ 2004-12-01 16:56 UTC (permalink / raw)
  To: linux-scsi; +Cc: linux-kernel, Matthew Wilcox

Jose R. Santos <jrsantos@austin.ibm.com> [041129]:
> I'm having a bit of trouble with an integrated SCSI adapter using the
> sym53c8xx_2 driver on a RS6K-170.  Somewhere during 2.6.9 development I started
> seeing a bunch of "phase change" messages generated every time I did any IO on 
> the disks attached to the SCSI adapter.
> 
> Nov 28 23:05:12 orb kernel: sym0: <896> rev 0x5 at pci 0000:00:0c.0 irq 20
> Nov 28 23:05:12 orb kernel: sym0: No NVRAM, ID 7, Fast-40, SE, parity checking
> Nov 28 23:05:12 orb kernel: sym0: SCSI BUS has been reset.
> Nov 28 23:05:12 orb kernel: scsi0 : sym-2.1.18m
> Nov 28 23:05:12 orb kernel: sym0:1: FAST-20 WIDE SCSI 40.0 MB/s ST (50.0 ns, offset 15)
> Nov 28 23:05:12 orb kernel:   Vendor: IBM       Model: DGHS09U	Rev: 03E0
> Nov 28 23:05:12 orb kernel:   Type:   Direct-Access		ANSI SCSI revision: 03
> Nov 28 23:05:12 orb kernel:  target0:0:1: Beginning Domain Validation
> Nov 28 23:05:12 orb kernel: sym0:1:0:phase change 6-7 9@10050390 resid=6.
> Nov 28 23:05:12 orb last message repeated 10 times
> Nov 28 23:05:12 orb kernel: sym0:1:0:phase change 6-7 9@1005039c resid=6.
> Nov 28 23:05:12 orb kernel: sym0:1:0:phase change 6-7 9@10050390 resid=6.
> Nov 28 23:05:12 orb kernel: sym0:1:0:phase change 6-7 9@10050390 resid=6.
> Nov 28 23:05:12 orb kernel: sym0:1:0:phase change 6-7 9@1005039c resid=6.
> Nov 28 23:05:12 orb kernel:  target0:0:1: Domain Validation skipping write tests
> Nov 28 23:05:12 orb kernel:  target0:0:1: Ending Domain Validation
> Nov 28 23:05:12 orb kernel: sym0:1:0:phase change 6-7 9@10050390 resid=6.
> 
> When these errors show up, the maximum performance I can get out of the disk is
> about 1.3MB/s.  After several hours, the adapter seems to receive some ABORT
> operations and the messages stop showing.  Once this happens, performance for 
> the disk goes back to 15MB/s.

I managed to get access to another PPC64 box that uses this same
driver and was unable to reproduce the problem on it, but I was able to
reproduce it on another machine of the same model.  It seems there could be
something at initialization that only affects this revision of the SCSI
adapter.  Since the problem seems to disappear after a BUS RESET, I assume
that something was left out when the driver was initializing the adapter.

Any ideas?

Thanks

-JRS


* Re: phase change messages causing slowdown with sym53c8xx_2 driver
  2004-12-01 16:56 ` Jose R. Santos
@ 2004-12-01 17:16   ` James Bottomley
  2004-12-01 20:32     ` Matthew Wilcox
  0 siblings, 1 reply; 7+ messages in thread
From: James Bottomley @ 2004-12-01 17:16 UTC (permalink / raw)
  To: Jose R. Santos; +Cc: SCSI Mailing List, Linux Kernel, Matthew Wilcox

On Wed, 2004-12-01 at 11:56, Jose R. Santos wrote:
> > Nov 28 23:05:12 orb kernel: sym0: <896> rev 0x5 at pci 0000:00:0c.0 irq 20
> > Nov 28 23:05:12 orb kernel: sym0: No NVRAM, ID 7, Fast-40, SE, parity checking
> > Nov 28 23:05:12 orb kernel: sym0: SCSI BUS has been reset.
> > Nov 28 23:05:12 orb kernel: scsi0 : sym-2.1.18m
> > Nov 28 23:05:12 orb kernel: sym0:1: FAST-20 WIDE SCSI 40.0 MB/s ST (50.0 ns, offset 15)
> > Nov 28 23:05:12 orb kernel:   Vendor: IBM       Model: DGHS09U	Rev: 03E0
> > Nov 28 23:05:12 orb kernel:   Type:   Direct-Access		ANSI SCSI revision: 03
> > Nov 28 23:05:12 orb kernel:  target0:0:1: Beginning Domain Validation
> > Nov 28 23:05:12 orb kernel: sym0:1:0:phase change 6-7 9@10050390 resid=6.
> > Nov 28 23:05:12 orb last message repeated 10 times
> > Nov 28 23:05:12 orb kernel: sym0:1:0:phase change 6-7 9@1005039c resid=6.
> > Nov 28 23:05:12 orb kernel: sym0:1:0:phase change 6-7 9@10050390 resid=6.
> > Nov 28 23:05:12 orb kernel: sym0:1:0:phase change 6-7 9@10050390 resid=6.
> > Nov 28 23:05:12 orb kernel: sym0:1:0:phase change 6-7 9@1005039c resid=6.
> > Nov 28 23:05:12 orb kernel:  target0:0:1: Domain Validation skipping write tests
> > Nov 28 23:05:12 orb kernel:  target0:0:1: Ending Domain Validation
> > Nov 28 23:05:12 orb kernel: sym0:1:0:phase change 6-7 9@10050390 resid=6.
> > 
> > When these errors show up, the maximum performance I can get out of the disk is
> > about 1.3MB/s.  After several hours, the adapter seems to receive some ABORT
> > operations and the messages stop showing.  Once this happens, performance for 
> > the disk goes back to 15MB/s.
> 
> I managed to get access to another PPC64 box that uses this same
> driver and was unable to reproduce the problem on it, but I was able to
> reproduce it on another machine of the same model.  It seems there could be
> something at initialization that only affects this revision of the SCSI
> adapter.  Since the problem seems to disappear after a BUS RESET, I assume
> that something was left out when the driver was initializing the adapter.

Matthew,

does this look like the "drive won't respond properly to PPR if the bus
is SE" problem again?

James




* Re: phase change messages causing slowdown with sym53c8xx_2 driver
  2004-12-01 17:16   ` James Bottomley
@ 2004-12-01 20:32     ` Matthew Wilcox
  2004-12-01 20:43       ` James Bottomley
  0 siblings, 1 reply; 7+ messages in thread
From: Matthew Wilcox @ 2004-12-01 20:32 UTC (permalink / raw)
  To: James Bottomley
  Cc: Jose R. Santos, SCSI Mailing List, Linux Kernel, Matthew Wilcox

On Wed, Dec 01, 2004 at 12:16:33PM -0500, James Bottomley wrote:
> does this look like the "drive won't respond properly to PPR if the bus
> is SE" problem again?

Thomas Babut who tested that fix reported it didn't solve his problem ;-(

http://marc.theaimsgroup.com/?l=linux-scsi&m=109968716312783&w=2
http://marc.theaimsgroup.com/?l=linux-scsi&m=109969829411685&w=2

I'm out of ideas for fixing that one.  Would you consider Richard
Waltham's patch?

http://marc.theaimsgroup.com/?l=linux-kernel&m=109967237930243&w=2

-- 
"Next the statesmen will invent cheap lies, putting the blame upon 
the nation that is attacked, and every man will be glad of those
conscience-soothing falsities, and will diligently study them, and refuse
to examine any refutations of them; and thus he will by and by convince 
himself that the war is just, and will thank God for the better sleep 
he enjoys after this process of grotesque self-deception." -- Mark Twain


* Re: phase change messages causing slowdown with sym53c8xx_2 driver
  2004-12-01 20:32     ` Matthew Wilcox
@ 2004-12-01 20:43       ` James Bottomley
  2004-12-01 21:22         ` Jose R. Santos
  2004-12-01 23:34         ` Doug Ledford
  0 siblings, 2 replies; 7+ messages in thread
From: James Bottomley @ 2004-12-01 20:43 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Jose R. Santos, SCSI Mailing List, Linux Kernel

On Wed, 2004-12-01 at 15:32, Matthew Wilcox wrote:
> On Wed, Dec 01, 2004 at 12:16:33PM -0500, James Bottomley wrote:
> > does this look like the "drive won't respond properly to PPR if the bus
> > is SE" problem again?
> 
> Thomas Babut who tested that fix reported it didn't solve his problem ;-(
> 
> http://marc.theaimsgroup.com/?l=linux-scsi&m=109968716312783&w=2
> http://marc.theaimsgroup.com/?l=linux-scsi&m=109969829411685&w=2
> 
> I'm out of ideas for fixing that one.  Would you consider Richard
> Waltham's patch?
> 
> http://marc.theaimsgroup.com/?l=linux-kernel&m=109967237930243&w=2

Actually, yes, or the attached variant of it.  Does this solve the
problem?

There's no reason why we should assume a SCSI_3 or greater device
automatically supports ppr (especially if its inquiry bit is
advertising that it doesn't...)

James

===== drivers/scsi/scsi_scan.c 1.134 vs edited =====
--- 1.134/drivers/scsi/scsi_scan.c	2004-10-24 07:09:48 -04:00
+++ edited/drivers/scsi/scsi_scan.c	2004-12-01 15:41:03 -05:00
@@ -554,10 +554,8 @@
 	sdev->removable = (0x80 & inq_result[1]) >> 7;
 	sdev->lockable = sdev->removable;
 	sdev->soft_reset = (inq_result[7] & 1) && ((inq_result[3] & 7) == 2);
+	sdev->ppr = (sdev->inquiry_len > 56 && (inq_result[56] & 0x04) == 0x04);
 
-	if (sdev->scsi_level >= SCSI_3 || (sdev->inquiry_len > 56 &&
-		inq_result[56] & 0x04))
-		sdev->ppr = 1;
 	if (inq_result[7] & 0x60)
 		sdev->wdtr = 1;
 	if (inq_result[7] & 0x10)
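
For readers comparing the two rules outside the kernel, the change above can be sketched as a pair of standalone predicates. This is only an illustration under stated assumptions: `SKETCH_SCSI_3`, `ppr_old_rule` and `ppr_new_rule` are hypothetical names that do not exist in the kernel, and the numeric value given to `SKETCH_SCSI_3` is a placeholder, not taken from this thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's SCSI_3 level constant (assumed value). */
#define SKETCH_SCSI_3 4

/*
 * Rule before the patch: PPR is assumed for every device at SCSI_3 level
 * or above, or for any device whose INQUIRY byte 56 has bit 0x04 set.
 */
static bool ppr_old_rule(int scsi_level, const unsigned char *inq, size_t inq_len)
{
    return scsi_level >= SKETCH_SCSI_3 ||
           (inq_len > 56 && (inq[56] & 0x04) != 0);
}

/*
 * Rule after the patch: PPR is assumed only when the INQUIRY data is long
 * enough to contain byte 56 and bit 0x04 of that byte is set.
 */
static bool ppr_new_rule(const unsigned char *inq, size_t inq_len)
{
    return inq_len > 56 && (inq[56] & 0x04) == 0x04;
}
```

With a zeroed 64-byte INQUIRY buffer and a level of `SKETCH_SCSI_3`, the old rule reports PPR support and the new rule does not. That is the case the patch is meant to change: a SCSI-3 device whose INQUIRY data does not set the bit.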



* Re: phase change messages causing slowdown with sym53c8xx_2 driver
  2004-12-01 20:43       ` James Bottomley
@ 2004-12-01 21:22         ` Jose R. Santos
  2004-12-01 23:34         ` Doug Ledford
  1 sibling, 0 replies; 7+ messages in thread
From: Jose R. Santos @ 2004-12-01 21:22 UTC (permalink / raw)
  To: James Bottomley
  Cc: Matthew Wilcox, Jose R. Santos, SCSI Mailing List, Linux Kernel

James Bottomley <James.Bottomley@SteelEye.com> [041201]:
> Actually, yes, or the attached variant of it.  Does this solve the
> problem?
> 
> There's no reason why we should assume a SCSI_3 or greater device
> automatically supports ppr (especially if its inquiry bit is
> advertising that it doesn't...)
> 
> James

That fixes the problem.  No more "phase change" messages are
showing up and the disk performance is as expected.

Thanks

-JRS


* Re: phase change messages causing slowdown with sym53c8xx_2 driver
  2004-12-01 20:43       ` James Bottomley
  2004-12-01 21:22         ` Jose R. Santos
@ 2004-12-01 23:34         ` Doug Ledford
  1 sibling, 0 replies; 7+ messages in thread
From: Doug Ledford @ 2004-12-01 23:34 UTC (permalink / raw)
  To: James Bottomley
  Cc: Matthew Wilcox, Jose R. Santos, linux-scsi mailing list,
	Linux Kernel

On Wed, 2004-12-01 at 15:43 -0500, James Bottomley wrote:
> On Wed, 2004-12-01 at 15:32, Matthew Wilcox wrote:
> > On Wed, Dec 01, 2004 at 12:16:33PM -0500, James Bottomley wrote:
> > > does this look like the "drive won't respond properly to PPR if the bus
> > > is SE" problem again?
> > 
> > Thomas Babut who tested that fix reported it didn't solve his problem ;-(
> > 
> > http://marc.theaimsgroup.com/?l=linux-scsi&m=109968716312783&w=2
> > http://marc.theaimsgroup.com/?l=linux-scsi&m=109969829411685&w=2
> > 
> > I'm out of ideas for fixing that one.  Would you consider Richard
> > Waltham's patch?
> > 
> > http://marc.theaimsgroup.com/?l=linux-kernel&m=109967237930243&w=2
> 
> Actually, yes, or the attached variant of it.  Does this solve the
> problem?
> 
> There's no reason why we should assume a SCSI_3 or greater device
> automatically supports ppr (especially if its inquiry bit is
> advertising that it doesn't...)

Eh?  Where did that mickey mouse idea come from?  PPR is Mandatory in
SCSI-3 and above SPI class devices, so *yes*, we *should* be assuming
that.  The DT INQUIRY bit was not added to indicate PPR (although PPR is
a requirement of implementing it), it was added so that devices that
didn't conform to the entire SCSI-3 spec and couldn't qualify to carry a
SCSI-3 level could still use higher transfer speeds.  Consider it an LVD
addendum to the SCSI-2 spec.

All this patch does is *hide* the problem, the bug still exists and it's
in the sym_2 driver.

Further, most devices don't change the DT bit in the INQUIRY return data
or their SCSI level depending on bus conditions that will affect whether
or not they can do LVD signaling rates, so the PPR bit is a capability
bit, not a guaranteed-to-be-appropriate-at-this-moment-in-time bit.
With this patch, those devices that are on LVD buses, support higher
speed transfers, and are full SCSI-3 (or at least some portion of them;
most SCSI-3 devices didn't use the DT INQUIRY bit as it's redundant, and
I don't know about the latest stuff) will start magically negotiating
all the way down to a max of 80MB/s data transfer rates without any CRC
checks on the data, parity checks only.
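
The distinction between a capability bit and current bus conditions can be illustrated with a small decision function. This is a rough sketch only: it is not the Nov. 5th description mentioned below, which is not reproduced in this thread, and it is not code from the sym_2 driver. Every name in it (`bus_mode`, `target_caps`, `nego`, `pick_negotiation`) is hypothetical, and the fallback order is an assumption.

```c
#include <stdbool.h>

/* Hypothetical view of how the bus is currently signaling. */
enum bus_mode { BUS_SE, BUS_LVD };

/* Hypothetical per-target capability flags, e.g. derived from INQUIRY data. */
struct target_caps {
    bool supports_ppr;   /* target is capable of PPR in principle */
    bool supports_wide;  /* target can do wide negotiation (WDTR) */
    bool supports_sync;  /* target can do sync negotiation (SDTR) */
};

enum nego { NEGO_NONE, NEGO_SYNC, NEGO_WIDE, NEGO_PPR };

/*
 * Choose which negotiation to attempt. The capability flag alone is not
 * sufficient: PPR is attempted only when the bus is currently in LVD mode.
 * On a single-ended bus the function falls back to wide, then sync.
 */
static enum nego pick_negotiation(const struct target_caps *t, enum bus_mode mode)
{
    if (t->supports_ppr && mode == BUS_LVD)
        return NEGO_PPR;
    if (t->supports_wide)
        return NEGO_WIDE;
    if (t->supports_sync)
        return NEGO_SYNC;
    return NEGO_NONE;
}
```

Under these assumptions, a fully capable target on an LVD bus gets `NEGO_PPR`, while the same target on an SE bus gets `NEGO_WIDE`, without the capability flag itself having to change.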

This is a totally broken patch.  I wrote up a description of what the
driver needs to be doing for this stuff and sent it to linux-scsi
mailing list on Nov. 5th.  Just because no one has fixed up the sym_2
driver to do the right thing is no reason to hork the SCSI layer. NAK.

> James
> 
> ===== drivers/scsi/scsi_scan.c 1.134 vs edited =====
> --- 1.134/drivers/scsi/scsi_scan.c	2004-10-24 07:09:48 -04:00
> +++ edited/drivers/scsi/scsi_scan.c	2004-12-01 15:41:03 -05:00
> @@ -554,10 +554,8 @@
>  	sdev->removable = (0x80 & inq_result[1]) >> 7;
>  	sdev->lockable = sdev->removable;
>  	sdev->soft_reset = (inq_result[7] & 1) && ((inq_result[3] & 7) == 2);
> +	sdev->ppr = (sdev->inquiry_len > 56 && (inq_result[56] & 0x04) == 0x04);
>  
> -	if (sdev->scsi_level >= SCSI_3 || (sdev->inquiry_len > 56 &&
> -		inq_result[56] & 0x04))
> -		sdev->ppr = 1;
>  	if (inq_result[7] & 0x60)
>  		sdev->wdtr = 1;
>  	if (inq_result[7] & 0x10)
> 
-- 
  Doug Ledford <dledford@redhat.com>
         Red Hat, Inc.
         1801 Varsity Dr.
         Raleigh, NC 27606


