linux-scsi.vger.kernel.org archive mirror
* Sym53c8xx_2
@ 2004-08-13  3:00 Kai OM
  2004-08-13 19:40 ` Sym53c8xx_2 Guennadi Liakhovetski
  0 siblings, 1 reply; 18+ messages in thread
From: Kai OM @ 2004-08-13  3:00 UTC (permalink / raw)
  To: linux-scsi

A while back I mailed the list a few times about an issue I was having
with the Sym53c8xx_2 driver in the 2.6.7-8 kernels, as well as mailing
the driver maintainer listed in the kernel source.

Nobody ever replied or acknowledged my e-mails, save a couple other
people with the same issue.

I'm wondering, with all this activity in the list now, would it be worth
mailing everyone again, or does anyone have suggestions for someone else
I can e-mail?


* Re: Sym53c8xx_2
  2004-08-13  3:00 Sym53c8xx_2 Kai OM
@ 2004-08-13 19:40 ` Guennadi Liakhovetski
  2004-08-13 21:24   ` Sym53c8xx_2 Kai OM
  0 siblings, 1 reply; 18+ messages in thread
From: Guennadi Liakhovetski @ 2004-08-13 19:40 UTC (permalink / raw)
  To: Kai OM; +Cc: linux-scsi

Hi

On Thu, 12 Aug 2004, Kai OM wrote:

> A while back I mailed the list a few times about an issue I was having
> with the Sym53c8xx_2 driver in the 2.6.7-8 kernels, as well as mailing
> the driver maintainer listed in the kernel source.

Disclaimer: I don't think I am the best person to solve this problem;
I'll just try to help you collect some more information... well,
we'll see.

The first thing I would do is try the latest 2.6.8-preX or, better yet,
a -mmY kernel. If that still doesn't work, try to narrow down the problem
within the sequence of changes between 2.6.5 and 2.6.6. It should be
possible somehow. Are you using BitKeeper?
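
Narrowing down amounts to a bisection over the ordered list of changes. A
minimal sketch in Python, assuming the changes can be applied in order on
top of a known-good 2.6.5 tree and that the bug, once introduced, stays.
The names first_bad and is_bad are hypothetical; in practice is_bad(i)
would mean building and booting a kernel with changes 0..i applied.

```python
def first_bad(changes, is_bad):
    """Return the index of the first change at which is_bad() becomes True.

    Assumes the tree with all changes applied is known to be bad and that
    is_bad(i) is False before the culprit and True from the culprit onwards.
    """
    lo, hi = 0, len(changes) - 1      # invariant: the culprit is in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(mid):               # bug present with changes[0..mid] applied
            hi = mid
        else:
            lo = mid + 1
    return lo


# Hypothetical placeholders standing in for the real 2.6.5 -> 2.6.6 changes.
changes = [f"change-{n}" for n in range(10)]
culprit = first_bad(changes, lambda i: i >= 6)   # pretend change-6 broke it
print(changes[culprit])
```

With N changes this needs about log2(N) test boots rather than N, which
matters when every test is a kernel build.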

Another thing to do is to turn SCSI logging on with something like
scsi_mod.scsi_logging_level=511 debug
on your kernel command line. It would be useless without a serial
console, though. Do you really have no chance to configure one? You would
also need to enable SCSI logging in your kernel SCSI configuration. There
are some debug flags in the sym53c8xx_2 driver, too. I don't think things
like pci=noacpi would help you...
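
For reference, the value 511 is a bit mask. A small sketch in Python of how
it decomposes, assuming the layout in drivers/scsi/scsi_logging.h (ten
3-bit fields, with error, timeout and scan at the low end). The field names
below are shorthand, not kernel identifiers.

```python
# Assumed field order, lowest bits first, 3 bits per field.
FIELDS = ["error", "timeout", "scan", "mlqueue", "mlcomplete",
          "llqueue", "llcomplete", "hlqueue", "hlcomplete", "ioctl"]


def decode_logging_level(level):
    """Split a scsi_logging_level value into per-facility levels (0..7)."""
    return {name: (level >> (3 * i)) & 0x7 for i, name in enumerate(FIELDS)}


levels = decode_logging_level(511)
for name, value in levels.items():
    print(f"{name:11s} {value}")
```

Under that assumption, 511 (binary 111111111) sets the error, timeout and
scan fields to their maximum and leaves the queueing fields at zero.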

Regards
Guennadi
---
Guennadi Liakhovetski



* Re: Sym53c8xx_2
  2004-08-13 19:40 ` Sym53c8xx_2 Guennadi Liakhovetski
@ 2004-08-13 21:24   ` Kai OM
  2004-08-14 22:16     ` Sym53c8xx_2 Guennadi Liakhovetski
  0 siblings, 1 reply; 18+ messages in thread
From: Kai OM @ 2004-08-13 21:24 UTC (permalink / raw)
  To: Guennadi Liakhovetski; +Cc: linux-scsi




----- Original message -----
From: "Guennadi Liakhovetski" <g.liakhovetski@gmx.de>
To: "Kai OM" <epimetreus@fastmail.fm>
Date: Fri, 13 Aug 2004 21:40:28 +0200 (CEST)
Subject: Re: Sym53c8xx_2

> Hi
>
> On Thu, 12 Aug 2004, Kai OM wrote:
>
> > A while back I mailed the list a few times about an issue I was having
> > with the Sym53c8xx_2 driver in the 2.6.7-8 kernels, as well as mailing
> > the driver maintainer listed in the kernel source.
>
> Disclaimer: I don't think I am the best person to try to solve this
> problem, I'll just try to help you collect some more information... well,
> will see.
>
> The first thing I would do is try the latest 2.6.8-preX, or, better yet,
> -mmY kernel. If that still doesn't work, try to narrow down the problem in
> the sequence of changes between 2.6.5 and 2.6.6. It should be possible
> somehow. Are you using BitKeeper?

Negative on that, normally I just download the source from kernel.org.
There was one entry in the 2.6.6 changelog, though, that said something
about modifying the driver in question; adding generic domain validation
and such -- and it's during domain validation that the driver stops.

> Other things to do are turn scsi-logging on with something like
> scsi_mod.scsi_logging_level=511 debug
> on your kernel command line. But, it would be useless without a serial
> console. Do you really have no chance to configure one? You would also
> need to enable scsi-logging in your kernel SCSI configuration. There are
> also some debug flags in sym53c8xx_2 driver too. Don't think things like
> pci=noacpi would help you...

In the drivers that don't work, I never even get a chance to mount the
root FS. I'll e-mail some logs of what happens, copied by hand, but I do
not have access to a serial console.

> Regards
> Guennadi
> ---
> Guennadi Liakhovetski



* Re: Sym53c8xx_2
  2004-08-13 21:24   ` Sym53c8xx_2 Kai OM
@ 2004-08-14 22:16     ` Guennadi Liakhovetski
  2004-08-15  7:21       ` Sym53c8xx_2 Kai Makisara
  0 siblings, 1 reply; 18+ messages in thread
From: Guennadi Liakhovetski @ 2004-08-14 22:16 UTC (permalink / raw)
  To: Kai OM; +Cc: Guennadi Liakhovetski, linux-scsi

Hi

On Fri, 13 Aug 2004, Kai OM wrote:

> There was one entry in the 2.6.6 changelog, though, that said something
> about modifying the driver in question; adding generic domain validation
> and such -- and it's during domain validation that the driver stops.

I looked through the diffs between the 2.6.5 and 2.6.7 versions of
sym53c8xx_2 - there are not so many of them, so we should be able to find
the guilty one. In fact, I think there are only 2 lines of code that can
be essential. But before you test them, one question: in your original
posts you quoted dmesg from 2.6.7 and from 2.6.7 with a replaced
sym53c8xx_2 directory. However, the output of the working version with the
older driver is missing 2 lines:

sym0:0:0: Tagged command queue enabled, command queue depth 16
scsi(0:0:0:0): Beginning Domain Validation

which are present in the non-working version. Why are they missing from 
the working version - did you use a different configuration or different 
command-line parameters?

And the patches you could try, to revert the changes I suspect, are:

1)

diff -u a/drivers/scsi/sym53c8xx_2/sym_glue.c b/drivers/scsi/sym53c8xx_2/sym_glue.c
--- a/drivers/scsi/sym53c8xx_2/sym_glue.c	19 May 2004 16:07:46
+++ b/drivers/scsi/sym53c8xx_2/sym_glue.c	14 Aug 2004 21:13:03
@@ -390,7 +390,7 @@
  			 * condition otherwise the device will always return
  			 * BUSY.  Use a big stick.
  			 */
-			sym_reset_scsi_target(np, csio->device->id);
+//			sym_reset_scsi_target(np, csio->device->id);
  			cam_status = DID_ERROR;
  		}
  	} else if (cp->host_status == HS_COMPLETE) 	/* Bad SCSI status */


and 2)

diff -u a/drivers/scsi/sym53c8xx_2/sym_glue.c b/drivers/scsi/sym53c8xx_2/sym_glue.c
--- a/drivers/scsi/sym53c8xx_2/sym_glue.c	19 May 2004 16:07:46
+++ b/drivers/scsi/sym53c8xx_2/sym_glue.c	14 Aug 2004 22:12:32
@@ -931,6 +931,7 @@
  	switch(to_do) {
  	default:
  	case SYM_EH_DO_IGNORE:
+		goto finish;
  		break;
  	case SYM_EH_DO_WAIT:
  		init_MUTEX_LOCKED(&ep->sem);

Apply them to stock 2.6.7 and see whether either or both of them fix your
problem.

Regards
Guennadi
---
Guennadi Liakhovetski



* Re: Sym53c8xx_2
  2004-08-14 22:16     ` Sym53c8xx_2 Guennadi Liakhovetski
@ 2004-08-15  7:21       ` Kai Makisara
  2004-08-15  8:38         ` Sym53c8xx_2 Guennadi Liakhovetski
                           ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Kai Makisara @ 2004-08-15  7:21 UTC (permalink / raw)
  To: Guennadi Liakhovetski; +Cc: Kai OM, linux-scsi

On Sun, 15 Aug 2004, Guennadi Liakhovetski wrote:

> Hi
> 
> On Fri, 13 Aug 2004, Kai OM wrote:
> 
> > There was one entry in the 2.6.6 changelog, though, that said something
> > about modifying the driver in question; adding generic domain validation
> > and such -- and it's during domain validation that the driver stops.
> 
> I looked through the diffs between 2.6.5 and 2.6.7 versions of sym53c8xx_2 
> - there are not so many of them, so, we should be able to find the guilty 
> one. In fact, I think, there are only 2 lines of code that can be 
> essential. But before you test them, one question: in your original posts 
> you quoted dmesg with 2.6.7 and 2.6.7 with replaced sym53c8xx_2 directory. 
> However, the output of the working version with the older driver is 
> missing 2 lines:
> 
> sym0:0:0: Tagged command queue enabled, command queue depth 16
> scsi(0:0:0:0): Beginning Domain Validation
> 
> which are present in the non-working version. Why are they missing from 
> the working version - did you use a different configuration or different 
> command-line parameters?
> 
You missed one essential change in 2.6.6:

@@ -908,6 +916,7 @@
 config SCSI_SYM53C8XX_2
        tristate "SYM53C8XX Version 2 SCSI support"
        depends on PCI && SCSI
+       select SCSI_SPI_ATTRS

(Many changes in sym_glue.c accompany this. You can't just remove this 
config change to try without spi.)

Because of this, the system is trying to do domain validation with the 
SCSI devices. For some reason it fails. The problem may be in the drive 
firmware, the mid-level spi code, the sym53c8xx_2 driver, or an interaction 
between some or all of those.

Domain Validation does not fail in all configurations. I am using 
sym53c8xx_2 in two machines to drive the system disk and some other 
devices. I have had no SCSI problems with these systems.

-- 
Kai


* Re: Sym53c8xx_2
  2004-08-15  7:21       ` Sym53c8xx_2 Kai Makisara
@ 2004-08-15  8:38         ` Guennadi Liakhovetski
  2004-08-15  8:45           ` Sym53c8xx_2 Guennadi Liakhovetski
  2004-08-15 21:52         ` Sym53c8xx_2 Guennadi Liakhovetski
  2004-08-16  2:54         ` Sym53c8xx_2 Kai OM
  2 siblings, 1 reply; 18+ messages in thread
From: Guennadi Liakhovetski @ 2004-08-15  8:38 UTC (permalink / raw)
  To: Kai Makisara; +Cc: Kai OM, linux-scsi

On Sun, 15 Aug 2004, Kai Makisara wrote:

> You missed one essential change in 2.6.6:

No, I didn't. I saw it, but just didn't realise that it's the introduction 
of the SPI that triggered the Domain Validation. The transport template 
functions look quite harmless and are only triggered through sysfs. What 
if you disable SPI in 2.6.7 with the original driver? Does it work then?

Guennadi
---
Guennadi Liakhovetski



* Re: Sym53c8xx_2
  2004-08-15  8:38         ` Sym53c8xx_2 Guennadi Liakhovetski
@ 2004-08-15  8:45           ` Guennadi Liakhovetski
  0 siblings, 0 replies; 18+ messages in thread
From: Guennadi Liakhovetski @ 2004-08-15  8:45 UTC (permalink / raw)
  To: Kai Makisara; +Cc: Kai OM, linux-scsi

On Sun, 15 Aug 2004, Guennadi Liakhovetski wrote:

> functions look quite harmless and are only triggered through sysfs. What if 
> you disable SPI in 2.6.7 with the original driver? Does it work then?

Ough, I see:

config SCSI_SYM53C8XX_2
         tristate "SYM53C8XX Version 2 SCSI support"
         depends on PCI && SCSI
         select SCSI_SPI_ATTRS

...and it works on my 2-way too... Ok, I'll re-think it:-)

Guennadi
---
Guennadi Liakhovetski



* Re: Sym53c8xx_2
  2004-08-15  7:21       ` Sym53c8xx_2 Kai Makisara
  2004-08-15  8:38         ` Sym53c8xx_2 Guennadi Liakhovetski
@ 2004-08-15 21:52         ` Guennadi Liakhovetski
  2004-08-16  2:54         ` Sym53c8xx_2 Kai OM
  2 siblings, 0 replies; 18+ messages in thread
From: Guennadi Liakhovetski @ 2004-08-15 21:52 UTC (permalink / raw)
  To: Kai Makisara; +Cc: Kai OM, linux-scsi

On Sun, 15 Aug 2004, Kai Makisara wrote:

>> scsi(0:0:0:0): Beginning Domain Validation

...

> You missed one essential change in 2.6.6:
>
> @@ -908,6 +916,7 @@
> config SCSI_SYM53C8XX_2
>        tristate "SYM53C8XX Version 2 SCSI support"
>        depends on PCI && SCSI
> +       select SCSI_SPI_ATTRS
>
> (Many changes in sym_glue.c accompany this. You can't just remove this
> config change to try without spi.)

Yep, I could just as well just read your message:-)

Ok, some other ideas (hopefully, better ones than the first two) to try - 
do you have a chance to either connect another device to the same 
controller and / or move the device to another controller under the same 
driver? My bet would be the bug will follow the device. Just to make sure.

Then, to check exactly which of the commands issued during the DV causes 
the problem, can you apply the following debugging patch and try 2.6.7 
with it? It is to be applied with -p0. Note that my mail is known to mangle 
tabs, although I tested it recently and couldn't reproduce the problem. 
If it does occur, I'll re-send the patch as an attachment.

Index: drivers/scsi/scsi_transport_spi.c
===================================================================
RCS file: /usr/src/cvs/linux-2_6/drivers/scsi/scsi_transport_spi.c,v
retrieving revision 1.1.1.2
diff -p -u -r1.1.1.2 scsi_transport_spi.c
--- drivers/scsi/scsi_transport_spi.c	19 May 2004 16:07:33 -0000	1.1.1.2
+++ drivers/scsi/scsi_transport_spi.c	15 Aug 2004 21:41:52 -0000
@@ -460,6 +460,7 @@ spi_dv_device_get_echo_buffer(struct scs
  		}
  	}

+	printk(KERN_WARNING"%s:%d\n", __FUNCTION__, __LINE__);
  	sreq->sr_cmd_len = 0;
  	sreq->sr_data_direction = DMA_FROM_DEVICE;

@@ -500,7 +501,7 @@ spi_dv_device_internal(struct scsi_reque
  			i->f->set_width(sdev, 0);
  		}
  	}
-
+	printk(KERN_WARNING"%s: SPI INQUIRY successful\n", __FUNCTION__);
  	if (!i->f->set_period)
  		return;

@@ -511,10 +512,13 @@ spi_dv_device_internal(struct scsi_reque
  	/* now set up to the maximum */
  	DV_SET(offset, 255);
  	DV_SET(period, 1);
+
+	printk(KERN_WARNING"%s:%d\n", __FUNCTION__, __LINE__);
  	if (!spi_dv_retrain(sreq, buffer, buffer + len,
  			    spi_dv_device_compare_inquiry))
  		return;

+	printk(KERN_WARNING"%s:%d\n", __FUNCTION__, __LINE__);
  	/* OK, now we have our initial speed set by the read only inquiry
  	 * test, now try an echo buffer test (if the device allows it) */

@@ -527,8 +531,10 @@ spi_dv_device_internal(struct scsi_reque
  		len = SPI_MAX_ECHO_BUFFER_SIZE;
  	}

+	printk(KERN_WARNING"%s:%d\n", __FUNCTION__, __LINE__);
  	spi_dv_retrain(sreq, buffer, buffer + len,
  		       spi_dv_device_echo_buffer);
+	printk(KERN_WARNING"%s:%d\n", __FUNCTION__, __LINE__);
  }


Thanks and regards 
Guennadi
---
Guennadi Liakhovetski



* Re: Sym53c8xx_2
  2004-08-15  7:21       ` Sym53c8xx_2 Kai Makisara
  2004-08-15  8:38         ` Sym53c8xx_2 Guennadi Liakhovetski
  2004-08-15 21:52         ` Sym53c8xx_2 Guennadi Liakhovetski
@ 2004-08-16  2:54         ` Kai OM
  2004-08-16 11:12           ` Sym53c8xx_2 Matthew Wilcox
                             ` (2 more replies)
  2 siblings, 3 replies; 18+ messages in thread
From: Kai OM @ 2004-08-16  2:54 UTC (permalink / raw)
  To: Kai Makisara, Guennadi Liakhovetski; +Cc: linux-scsi


On Sun, 15 Aug 2004 10:21:35 +0300 (EEST), "Kai Makisara"
<Kai.Makisara@kolumbus.fi> said:

> Domain Validation does not fail in all configurations. I am using 
> sym53c8xx_2 in two machines to drive the system disk and some other 
> devices. I have had no SCSI problems with these systems.

What chipset is your controller based on?

I'm using an LSIU160, based on the LSI53C1010-33 chipset.

I'm pretty sure at least one other person I know that had this problem
was using a very similar chipset/controller, which strikes me as
pertinent.


* Re: Sym53c8xx_2
  2004-08-16  2:54         ` Sym53c8xx_2 Kai OM
@ 2004-08-16 11:12           ` Matthew Wilcox
  2004-08-16 11:39           ` Sym53c8xx_2 Lance Dryden
  2004-08-16 18:02           ` Sym53c8xx_2 Kai Makisara
  2 siblings, 0 replies; 18+ messages in thread
From: Matthew Wilcox @ 2004-08-16 11:12 UTC (permalink / raw)
  To: Kai OM; +Cc: Kai Makisara, Guennadi Liakhovetski, linux-scsi

On Sun, Aug 15, 2004 at 10:54:49PM -0400, Kai OM wrote:
> 
> On Sun, 15 Aug 2004 10:21:35 +0300 (EEST), "Kai Makisara"
> <Kai.Makisara@kolumbus.fi> said:
> 
> > Domain Validation does not fail in all configurations. I am using 
> > sym53c8xx_2 in two machines to drive the system disk and some other 
> > devices. I have had no SCSI problems with these systems.
> 
> What chipset is your controller based on?

I don't think the problem lies with the controller; I think it lies with
one or more devices on the bus.

-- 
"Next the statesmen will invent cheap lies, putting the blame upon 
the nation that is attacked, and every man will be glad of those
conscience-soothing falsities, and will diligently study them, and refuse
to examine any refutations of them; and thus he will by and by convince 
himself that the war is just, and will thank God for the better sleep 
he enjoys after this process of grotesque self-deception." -- Mark Twain


* Re: Sym53c8xx_2
  2004-08-16  2:54         ` Sym53c8xx_2 Kai OM
  2004-08-16 11:12           ` Sym53c8xx_2 Matthew Wilcox
@ 2004-08-16 11:39           ` Lance Dryden
  2004-08-17 10:24             ` Sym53c8xx_2 Kai OM
  2004-08-16 18:02           ` Sym53c8xx_2 Kai Makisara
  2 siblings, 1 reply; 18+ messages in thread
From: Lance Dryden @ 2004-08-16 11:39 UTC (permalink / raw)
  To: linux-scsi

Hello.

Kai OM wrote:

> On Sun, 15 Aug 2004 10:21:35 +0300 (EEST), "Kai Makisara"
> <Kai.Makisara@kolumbus.fi> said:
> 
>>Domain Validation does not fail in all configurations. I am using 
>>sym53c8xx_2 in two machines to drive the system disk and some other 
>>devices. I have had no SCSI problems with these systems.
> 
> What chipset is your controller based on?
> 
> I'm using an LSIU160, based on the LSI53C1010-33 chipset
> 
> I'm pretty sure at least one other person I know that had this problem
> was using a very similar chipset/controller, which strikes me as
> pertinent.

I can replicate the same problem with a Tekram DC390U3W, which is also
based on the LSI53C1010.  I am guessing it's a specific problem with the
53c1010, as I have since replaced it with a 53c895-based card (Tekram
DC390U2W) and it functions normally in the same system.

For me, the failure symptom starts with domain validation of the first 
device on the bus (Quantum Atlas V 18).  The last lines indicate that 
domain validation is beginning for the device.  After a delay, the SCSI 
layer reports an attempted device reset.  Once that finishes, another 
delay and then the SCSI layer reports an attempted bus reset.  I will 
need to re-attach a c1010 to the host in order to provide a precise log.

When I had the 53c1010-based card, the other way to keep working with
kernels newer than 2.6.5 was to deliberately stub out the call to
spi_dv_device_internal() within spi_dv_device(struct scsi_device *sdev)
in drivers/scsi/scsi_transport_spi.c.  Obviously not a nice thing to do.

I can only speculate that the 53c1010 somehow messes up domain
validation where the 53c895 does not.

   Cheers,
   Lance Dryden



* Re: Sym53c8xx_2
  2004-08-16  2:54         ` Sym53c8xx_2 Kai OM
  2004-08-16 11:12           ` Sym53c8xx_2 Matthew Wilcox
  2004-08-16 11:39           ` Sym53c8xx_2 Lance Dryden
@ 2004-08-16 18:02           ` Kai Makisara
  2004-08-17 10:19             ` Sym53c8xx_2 Kai OM
  2 siblings, 1 reply; 18+ messages in thread
From: Kai Makisara @ 2004-08-16 18:02 UTC (permalink / raw)
  To: Kai OM; +Cc: Guennadi Liakhovetski, linux-scsi

On Sun, 15 Aug 2004, Kai OM wrote:

> 
> On Sun, 15 Aug 2004 10:21:35 +0300 (EEST), "Kai Makisara"
> <Kai.Makisara@kolumbus.fi> said:
> 
> > Domain Validation does not fail in all configurations. I am using 
> > sym53c8xx_2 in two machines to drive the system disk and some other 
> > devices. I have had no SCSI problems with these systems.
> 
> What chipset is your controller based on?
> 
> I'm using an LSIU160, based on the LSI53C1010-33 chipset
> 
I actually have three systems using sym53c8xx_2. One system has 53c1010-66 
(Tekram DC390U3W), one has 53c896 (ASUS SC896), and one has 53c895 (Tekram 
DC390U2W).

> I'm pretty sure at least one other person I know that had this problem
> was using a very similar chipset/controller, which strikes me as
> pertinent.
> 
I am sorry to break this theory ;-)

-- 
Kai


* Re: Sym53c8xx_2
  2004-08-16 18:02           ` Sym53c8xx_2 Kai Makisara
@ 2004-08-17 10:19             ` Kai OM
  0 siblings, 0 replies; 18+ messages in thread
From: Kai OM @ 2004-08-17 10:19 UTC (permalink / raw)
  To: Guennadi Liakhovetski; +Cc: linux-scsi


On Mon, 16 Aug 2004 21:02:49 +0300 (EEST), "Kai Makisara"
<Kai.Makisara@kolumbus.fi> said:

> > I'm pretty sure at least one other person I know that had this problem
> > was using a very similar chipset/controller, which strikes me as
> > pertinent.
> > 
> I am sorry to break this theory ;-)

Well, it weakens it, but it could also be an issue with very specific
chipset revisions, or something more obscure. I can't run tests to show
that you're wrong, so I'd like someone who gets the same errors as me
(ABORT operations followed by device, bus, and host resets, for those
newly reading this thread) to try the tests suggested, namely moving
devices and controllers around to see which one the problem follows.

Could anyone who's gotten this error do some swapping around so we can
get some more data?


* Re: Sym53c8xx_2
  2004-08-16 11:39           ` Sym53c8xx_2 Lance Dryden
@ 2004-08-17 10:24             ` Kai OM
  2004-08-21 22:51               ` sym53c8xx_2 version 2.1.18j domain validation failures (was Re: Sym53c8xx_2) Lance Dryden
  0 siblings, 1 reply; 18+ messages in thread
From: Kai OM @ 2004-08-17 10:24 UTC (permalink / raw)
  To: Lance Dryden, linux-scsi


On Mon, 16 Aug 2004 07:39:46 -0400, "Lance Dryden" <lance@jound.net>
said:

> I can replicate the same problem with a Tekram DC390U3W, which is also
> based on the LSI53C1010.  I am guessing it's a specific problem with the
> 53c1010, as I have since replaced it with a 53c895-based card (Tekram
> DC390U2W) and it functions normally in the same system.
> 
> For me, the failure symptom starts with domain validation of the first 
> device on the bus (Quantum Atlas V 18).  The last lines indicate that 
> domain validation is beginning for the device.  After a delay, the SCSI 
> layer reports an attempted device reset.  Once that finishes, another 
> delay and then the SCSI layer reports an attempted bus reset.  I will 
> need to re-attach a c1010 to the host in order to provide a precise log.

The device I have attached to my LSIU160 is a Quantum Atlas 10K II.
Model ATLAS10K2-TY734J.
I wonder if it's an issue with 53c1010 variants in general, or maybe
only when connected to certain devices?

Can you test a different drive on the 53c1010 controller?


* Re: Sym53c8xx_2
@ 2004-08-17 18:06 Vladimir G. Ivanovic
  0 siblings, 0 replies; 18+ messages in thread
From: Vladimir G. Ivanovic @ 2004-08-17 18:06 UTC (permalink / raw)
  To: linux-scsi

I've had a similar problem for years with a Quantum ATLAS 10K3_18_WLS
attached to an LSI Logic/Symbios Logic 53c1010 Ultra3 SCSI Adapter.

I get errors like:

   sym0:6: ERROR (81:0) (47-67-19) (3e/18/80) @ (scripta 4b8:f31c0004).
   sym0:6: ERROR (81:0) (47-67-67) (3e/18/80) @ (mem f000ef74:ffffffff).
   sym0:6: ERROR (81:0) (47-67-7) (3e/18/80) @ (mem f000ef74:ffffffff).
   sym0:6: ERROR (81:0) (8-0-0) (3e/18/80) @ (mem 10c48404:fffec887).
   sym0:6: ERROR (81:0) (8-0-0) (3e/18/80) @ (scripta 38:f31c0004).
   sym0:6: ERROR (81:0) (c-0-0) (3e/18/80) @ (mem 1510158:04000000).
   sym0:6: ERROR (81:0) (c-0-0) (3e/18/80) @ (mem 6c6f707c:04000000).
   sym0:6: ERROR (81:0) (c-0-0) (3e/18/80) @ (mem 948174:04000000).

These are all Illegal Instruction Detected errors (81:0).
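
To compare such lines across boots it helps to split them into fields. A
rough sketch in Python follows. The register names (dstat, sist, socl,
sbcl, sbdl, sxfer, scntl3, scntl4) and the DSTAT bit names are assumed from
the driver's error printout and the 53C8xx register definitions, so treat
them as assumptions rather than a reference.

```python
import re

# Assumed field order of the driver's hard-error line:
# (dstat:sist) (socl-sbcl-sbdl) (sxfer/scntl3/scntl4) @ (where offset:opcode)
ERROR_LINE = re.compile(
    r"(?P<host>sym\d+):(?P<target>\d+): ERROR "
    r"\((?P<dstat>[0-9a-f]+):(?P<sist>[0-9a-f]+)\) "
    r"\((?P<socl>[0-9a-f]+)-(?P<sbcl>[0-9a-f]+)-(?P<sbdl>[0-9a-f]+)\) "
    r"\((?P<sxfer>[0-9a-f]+)/(?P<scntl3>[0-9a-f]+)/(?P<scntl4>[0-9a-f]+)\) "
    r"@ \((?P<where>\w+) (?P<offset>[0-9a-f]+):(?P<opcode>[0-9a-f]+)\)"
)

# Assumed DSTAT bit names (bit 1 is reserved); IID = illegal instruction.
DSTAT_BITS = [(0x80, "DFE"), (0x40, "MDPE"), (0x20, "BF"), (0x10, "ABRT"),
              (0x08, "SSI"), (0x04, "SIR"), (0x01, "IID")]


def parse_error(line):
    """Return the fields of one ERROR line as a dict, or None if no match."""
    m = ERROR_LINE.search(line)
    if m is None:
        return None
    fields = m.groupdict()
    dstat = int(fields["dstat"], 16)
    fields["dstat_flags"] = [name for bit, name in DSTAT_BITS if dstat & bit]
    return fields


info = parse_error(
    "sym0:6: ERROR (81:0) (47-67-19) (3e/18/80) @ (scripta 4b8:f31c0004).")
print(info["dstat_flags"], info["where"], info["offset"])
```

Under these assumptions, 0x81 decodes to the DFE and IID bits, which is
consistent with reading (81:0) as an illegal instruction error.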

Switching controllers hasn't helped, nor has turning on and off
SCSI termination in the BIOS (on-board SCSI), nor has upgrading from a
2.4.x kernel to a 2.6.x kernel.

I have built a custom 2.6 kernel with all kernel debugging turned on,
and I am playing around with various debugging commands that can be sent
to the driver, but I'd really appreciate any advice on how to log and
debug this kind of problem.

Thanks.

--- Vladimir

------------------------------------------------------------------------
Vladimir G. Ivanovic                        http://leonora.org/~vladimir
Palo Alto, CA 94306                                      +1 650 678 8014
------------------------------------------------------------------------


* sym53c8xx_2 version 2.1.18j domain validation failures (was Re: Sym53c8xx_2)
  2004-08-17 10:24             ` Sym53c8xx_2 Kai OM
@ 2004-08-21 22:51               ` Lance Dryden
  2004-08-22  1:01                 ` Matthew Wilcox
  2004-08-22  2:45                 ` Vladimir G. Ivanovic
  0 siblings, 2 replies; 18+ messages in thread
From: Lance Dryden @ 2004-08-21 22:51 UTC (permalink / raw)
  To: Kai OM; +Cc: linux-scsi

Apologies for the late reply.

What I have found is that only my Quantum Atlas disks seem to have the 
problem with domain validation at boot-up.

My results are below.  The first test was successful, the second was 
not.  Both host adapters were installed and enumerated by the kernel in 
the order listed.  Kernel was 2.6.8.1 from kernel.org.

=== BEGIN TEST 1 ===
53c895
- 2x Quantum Atlas V
- 1x Ecrix VXA-1
53c1010
- 1x Compaq OEM device
- 1x Plextor
Status: Successful DV, all devices

=== BEGIN TEST 2 ===
53c895
- 1x Atlas
- 1x Ecrix
53c1010
- 1x Atlas
- 1x Compaq
- 1x Plextor
Status: UNSUCCESSFUL DV against
- Vendor:"QUANTUM" Model:"ATLAS_V_18_WLS" Rev:"0200"

=== BEGIN TEST 2 CONSOLE LOG SNIPPET ===
sym1:2:0: tagged command queueing enabled, command queue depth 16
scsi(1:0:2:0): Beginning Domain Validation
sym1:2: wide asynchronous
sym1:2:0: ABORT operation started.
sym1:2:0: ABORT operation timed-out
sym1:2:0: DEVICE RESET operation started
sym1:2:0: DEVICE RESET operation timed-out.
sym1:2:0: BUS RESET operation started.
sym1: SCSI BUS reset detected.
sym1: SCSI BUS has been reset.
sym1:2:0: BUS RESET operation complete
sym1:2:0: ABORT operation started.
sym1:2:0: ABORT operation timed-out.
sym1:2:0: HOST RESET operation started.
sym1: SCSI BUS has been reset
=== END TEST 2 CONSOLE LOG SNIPPET ===
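
The escalation in that snippet can be pulled out mechanically when
comparing logs from different setups. A minimal sketch in Python, assuming
the error-handler lines always read "<OP> operation <state>" with an
optional trailing period (hand-copied logs vary); the function name and the
set of states matched are assumptions.

```python
import re

EH_LINE = re.compile(
    r"^(?P<host>sym\d+):(?P<target>\d+):(?P<lun>\d+): "
    r"(?P<op>ABORT|DEVICE RESET|BUS RESET|HOST RESET) operation "
    r"(?P<state>started|timed-out|complete|failed)\.?$"
)


def escalation(log_text):
    """Pair each 'started' line with the outcome reported after it."""
    steps = []
    for line in log_text.splitlines():
        m = EH_LINE.match(line.strip())
        if m is None:
            continue                      # not an error-handler line
        if m.group("state") == "started":
            steps.append([m.group("op"), None])
        elif steps and steps[-1][0] == m.group("op") and steps[-1][1] is None:
            steps[-1][1] = m.group("state")
    return [tuple(step) for step in steps]


SAMPLE = """\
sym1:2:0: ABORT operation started.
sym1:2:0: ABORT operation timed-out
sym1:2:0: DEVICE RESET operation started
sym1:2:0: DEVICE RESET operation timed-out.
sym1:2:0: BUS RESET operation started.
sym1: SCSI BUS reset detected.
sym1:2:0: BUS RESET operation complete
"""
for op, outcome in escalation(SAMPLE):
    print(op, "->", outcome)
```

A step whose outcome is None never reported a result in the captured text,
as with the final HOST RESET in the snippet above.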

I wonder if anyone around can test with some more Quantum Atlas drives 
against a 53c1010.
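
To make it easier to merge results from other testers, the two tests can be
written down as (controller, device, outcome) records. A small sketch in
Python; the record layout and function names are just one hypothetical way
to tabulate it, and the device labels are abbreviated from the lists above.

```python
# (controller, device, domain validation succeeded?)
RESULTS = [
    ("53c895",  "Atlas V",     True),    # test 1
    ("53c895",  "Ecrix VXA-1", True),
    ("53c1010", "Compaq",      True),
    ("53c1010", "Plextor",     True),
    ("53c895",  "Atlas V",     True),    # test 2
    ("53c895",  "Ecrix VXA-1", True),
    ("53c1010", "Atlas V",     False),
    ("53c1010", "Compaq",      True),
    ("53c1010", "Plextor",     True),
]


def failing_pairs(results):
    """Controller/device combinations that failed DV at least once."""
    return sorted({(c, d) for c, d, ok in results if not ok})


def never_passed(results, column):
    """Values of one column (0 = controller, 1 = device) seen only in failures."""
    failed = {r[column] for r in results if not r[2]}
    passed = {r[column] for r in results if r[2]}
    return sorted(failed - passed)


print(failing_pairs(RESULTS))
print(never_passed(RESULTS, 0), never_passed(RESULTS, 1))
```

On this data neither the controller nor the drive fails on its own; only
the Atlas on the 53c1010 does, which is why more Atlas plus 53c1010 reports
would be useful.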

   Cheers,
   Lance Dryden

Kai OM wrote:
> 
> The device I have attached to my LSIU160 is a Quantum Atlas 10K II.
> Model ATLAS10K2-TY734J.
> I wonder if it's an issue with 53c1010 variants in general, or maybe
> only when connected to certain devices?
> 
> Can you test a different drive on the 53c1010 controller?


* Re: sym53c8xx_2 version 2.1.18j domain validation failures (was Re: Sym53c8xx_2)
  2004-08-21 22:51               ` sym53c8xx_2 version 2.1.18j domain validation failures (was Re: Sym53c8xx_2) Lance Dryden
@ 2004-08-22  1:01                 ` Matthew Wilcox
  2004-08-22  2:45                 ` Vladimir G. Ivanovic
  1 sibling, 0 replies; 18+ messages in thread
From: Matthew Wilcox @ 2004-08-22  1:01 UTC (permalink / raw)
  To: Lance Dryden; +Cc: Kai OM, linux-scsi

On Sat, Aug 21, 2004 at 06:51:52PM -0400, Lance Dryden wrote:
> What I have found is that only my Quantum Atlas disks seem to have the 
> problem with domain validation at boot-up.

Correct.  James managed to isolate the problem to the Atlas drives too.
Basically, the sym2 driver was trying to negotiate an invalid state.
Most drives recognised it was invalid and rejected it, but the Atlas drives
tried to obey it.  Here's the patch to fix the problem:

http://marc.theaimsgroup.com/?l=linux-scsi&m=109311381603239&w=2

The patch James mentions that it depends on is:

http://marc.theaimsgroup.com/?l=linux-scsi&m=109305341428550&w=2
(and maybe the followup supplemental patch)
http://marc.theaimsgroup.com/?l=linux-scsi&m=109306123627231&w=2

-- 
"Next the statesmen will invent cheap lies, putting the blame upon 
the nation that is attacked, and every man will be glad of those
conscience-soothing falsities, and will diligently study them, and refuse
to examine any refutations of them; and thus he will by and by convince 
himself that the war is just, and will thank God for the better sleep 
he enjoys after this process of grotesque self-deception." -- Mark Twain


* Re: sym53c8xx_2 version 2.1.18j domain validation failures (was Re: Sym53c8xx_2)
  2004-08-21 22:51               ` sym53c8xx_2 version 2.1.18j domain validation failures (was Re: Sym53c8xx_2) Lance Dryden
  2004-08-22  1:01                 ` Matthew Wilcox
@ 2004-08-22  2:45                 ` Vladimir G. Ivanovic
  1 sibling, 0 replies; 18+ messages in thread
From: Vladimir G. Ivanovic @ 2004-08-22  2:45 UTC (permalink / raw)
  To: linux-scsi

I have posted to this list examples of the kinds of errors that I get
with my Atlas drive, but I haven't gotten any response or suggestion on
how to debug.

My boot command line includes "buschk:0x2", and my smartd configuration
file includes "/dev/sda -a -d scsi". 

Here are some examples (more or less in chronological order). (I've
elided the date and time, and my machine name from entries below. I can
provide them if anyone wishes.)  

Boot-time information:

   kernel: sym0: <1010-33> rev 0x1 at pci 0000:00:08.0 irq 177
   kernel: sym0: using 64 bit DMA addressing
   kernel: sym0: Symbios NVRAM, ID 7, Fast-80, LVD, parity checking
   kernel: sym0: open drain IRQ line driver, using on-chip SRAM
   kernel: sym0: using LOAD/STORE-based firmware.
   kernel: sym0: handling phase mismatch from SCRIPTS.
   kernel: sym0: SCSI BUS has been reset.
   kernel: scsi0 : sym-2.1.18j
   kernel:   Vendor: QUANTUM   Model: ATLAS10K3_18_WLS  Rev: 020K
   kernel:   Type:   Direct-Access                      ANSI SCSI revision: 03
   kernel: sym0:6:0: tagged command queuing enabled, command queue depth 16.
   kernel: scsi(0:0:6:0): Beginning Domain Validation
   kernel: sym0:6: wide asynchronous.
   kernel: sym0:6: FAST-80 WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 62)
   kernel: scsi(0:0:6:0): Ending Domain Validation
   kernel: SCSI device sda: 35916548 512-byte hdwr sectors (18389 MB)
   kernel: SCSI device sda: drive cache: write back
   kernel:  sda: sda1 sda2 sda3 sda4
   kernel: Attached scsi disk sda at scsi0, channel 0, id 6, lun 0
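
As a sanity check on the negotiated settings, the 160.0 MB/s figure follows
from the 12.5 ns period and the wide (16-bit) bus. A worked sketch in
Python, assuming the reported period is the time per transfer and that a
wide bus moves 2 bytes per transfer; the function name is hypothetical.

```python
def sync_rate_mb_s(period_ns, wide):
    """Transfer rate in MB/s for a given transfer period and bus width."""
    mega_transfers_per_s = 1000.0 / period_ns   # 12.5 ns -> 80 MT/s
    bytes_per_transfer = 2 if wide else 1       # wide SCSI is 16 bits
    return mega_transfers_per_s * bytes_per_transfer


print(sync_rate_mb_s(12.5, wide=True))
```

So domain validation did end at the full Ultra160 rate on this boot, and
the errors quoted below appear after that point.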

Right after smartd starts up:

   kernel: sym0:6:0:ODD transfer in DATA IN phase.
   kernel: sym0:6:0:COMMAND FAILED (87 0 10).
   kernel: sym0:6:0:ODD transfer in DATA IN phase.
   kernel: sym0:6:0:COMMAND FAILED (87 0 10).
   kernel: sym0:6:0:ODD transfer in DATA IN phase.
   kernel: sym0:6:0:COMMAND FAILED (87 0 10).
   kernel: sym0:6:0:ODD transfer in DATA IN phase.
   kernel: sym0:6:0:COMMAND FAILED (87 0 10).
   kernel: sym0:6:0:ODD transfer in DATA IN phase.
   kernel: sym0:6:0:COMMAND FAILED (87 0 10).

Still during bootup, but I can't associate any particular event with the
errors: 

   kernel: sym0:6:0:ordered tag forced.
   kernel: sym0:6:0:ordered tag forced.

Here are the errors I get during a typical night:

   kernel: sym0:6: ERROR (81:0) (8-0-0) (3e/18/80) @ (scripta 50:f31c0004).
   kernel: sym0: script cmd = 90080000
   kernel: sym0: regdump: da 00 00 18 47 3e 06 0f 00 08 86 00 80 00 07 0a 89 cd c1 e9 02 00 00 00.
   kernel: sym0: SCSI BUS reset detected.
   kernel: sym0: SCSI BUS has been reset.

   kernel:   "ff ff ff 7f b1 2d 24 02 8c 41 44 02 c0 6c 34 02 "
   kernel: sym0:6: ERROR (81:0) (8-0-0) (3e/18/80) @ (scripta 50:f31c0004).
   kernel: sym0: script cmd = 90080000
   kernel: sym0: regdump: da 00 00 18 47 3e 06 0f 04 08 86 00 80 00 0f 0a 89 cd c1 e9 02 00 00 00.
   kernel: sym0: SCSI BUS reset detected.
   kernel: sym0: SCSI BUS has been reset.

--- Vladimir

------------------------------------------------------------------------
Vladimir G. Ivanovic                        http://leonora.org/~vladimir
Palo Alto, CA 94306                                      +1 650 678 8014
------------------------------------------------------------------------

