* Sym53c8xx_2
@ 2004-08-13 3:00 Kai OM
2004-08-13 19:40 ` Sym53c8xx_2 Guennadi Liakhovetski
0 siblings, 1 reply; 18+ messages in thread
From: Kai OM @ 2004-08-13 3:00 UTC (permalink / raw)
To: linux-scsi
A while back I mailed the list a few times about an issue I was having
with the Sym53c8xx_2 driver in the 2.6.7-8 kernels, as well as mailing
the driver maintainer listed in the kernel source.
Nobody ever replied or acknowledged my e-mails, save a couple other
people with the same issue.
I'm wondering, with all this activity in the list now, would it be worth
mailing everyone again, or does anyone have suggestions for someone else
I can e-mail?
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Sym53c8xx_2
2004-08-13 3:00 Sym53c8xx_2 Kai OM
@ 2004-08-13 19:40 ` Guennadi Liakhovetski
2004-08-13 21:24 ` Sym53c8xx_2 Kai OM
0 siblings, 1 reply; 18+ messages in thread
From: Guennadi Liakhovetski @ 2004-08-13 19:40 UTC (permalink / raw)
To: Kai OM; +Cc: linux-scsi
Hi
On Thu, 12 Aug 2004, Kai OM wrote:
> A while back I mailed the list a few times about an issue I was having
> with the Sym53c8xx_2 driver in the 2.6.7-8 kernels, as well as mailing
> the driver maintainer listed in the kernel source.
Disclaimer: I don't think I am the best person to try to solve this
problem; I'll just try to help you collect some more information... well,
we'll see.
The first thing I would do is try the latest 2.6.8-preX, or, better yet,
-mmY kernel. If that still doesn't work, try to narrow down the problem in
the sequence of changes between 2.6.5 and 2.6.6. It should be possible
somehow. Are you using BitKeeper?
Another thing to do is turn SCSI logging on with something like
scsi_mod.scsi_logging_level=511 debug
on your kernel command line. However, it would be useless without a serial
console. Do you really have no chance to configure one? You would also
need to enable SCSI logging in your kernel SCSI configuration. There are
also some debug flags in the sym53c8xx_2 driver. I don't think things like
pci=noacpi would help you...
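To see what a value like 511 selects, here is a minimal sketch of the decoding. It assumes the logging level is a packed bitfield with three bits per facility, in the order defined in drivers/scsi/scsi_logging.h; the facility names and their order are that assumption, so check them against your own kernel tree. The helper name decode_logging_level is made up for this example.

```python
# Facility order assumed from drivers/scsi/scsi_logging.h (3 bits each);
# check your own kernel tree before relying on it.
FACILITIES = [
    "error", "timeout", "scan", "mlqueue", "mlcomplete",
    "llqueue", "llcomplete", "hlqueue", "hlcomplete", "ioctl",
]
BITS_PER_FACILITY = 3


def decode_logging_level(value):
    """Split a packed scsi_logging_level into per-facility levels (0-7)."""
    mask = (1 << BITS_PER_FACILITY) - 1
    return {
        name: (value >> (index * BITS_PER_FACILITY)) & mask
        for index, name in enumerate(FACILITIES)
    }


if __name__ == "__main__":
    for name, level in decode_logging_level(511).items():
        print(f"{name:<11} {level}")
```

Under that assumption, 511 (0x1ff) sets only the three lowest facilities to their maximum level and leaves the rest at 0.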
Regards
Guennadi
---
Guennadi Liakhovetski
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Sym53c8xx_2
2004-08-13 19:40 ` Sym53c8xx_2 Guennadi Liakhovetski
@ 2004-08-13 21:24 ` Kai OM
2004-08-14 22:16 ` Sym53c8xx_2 Guennadi Liakhovetski
0 siblings, 1 reply; 18+ messages in thread
From: Kai OM @ 2004-08-13 21:24 UTC (permalink / raw)
To: Guennadi Liakhovetski; +Cc: linux-scsi
On Fri, 13 Aug 2004 21:40:28 +0200 (CEST), "Guennadi Liakhovetski"
<g.liakhovetski@gmx.de> said:
> Hi
>
> On Thu, 12 Aug 2004, Kai OM wrote:
> > A while back I mailed the list a few times about an issue I was having
> > with the Sym53c8xx_2 driver in the 2.6.7-8 kernels, as well as mailing
> > the driver maintainer listed in the kernel source.
>
> Disclaimer: I don't think I am the best person to try to solve this
> problem, I'll just try to help you collect some more information... well,
> will see.
>
> The first thing I would do is try the latest 2.6.8-preX, or, better yet,
> -mmY kernel. If that still doesn't work, try to narrow down the problem in
> the sequence of changes between 2.6.5 and 2.6.6. It should be possible
> somehow. Are you using BitKeeper?
Negative on that, normally I just download the source from kernel.org.
There was one entry in the 2.6.6 changelog, though, that said something
about modifying the driver in question; adding generic domain validation
and such -- and it's during domain validation that the driver stops.
> Other things to do are turn scsi-logging on with something like
> scsi_mod.scsi_logging_level=511 debug
> on your kernel command line. But, it would be useless without a serial
> console. Do you really have no chance to configure one? You would also
> need to enable scsi-logging in your kernel SCSI configuration. There are
> also some debug flags in sym53c8xx_2 driver too. Don't think things like
> pci=noacpi would help you...
In the drivers that don't work, I never even get a chance to mount the
root FS. I'll e-mail some logs of what happens, copied by hand, but I do
not have access to a serial console.
> Regards
> Guennadi
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Sym53c8xx_2
2004-08-13 21:24 ` Sym53c8xx_2 Kai OM
@ 2004-08-14 22:16 ` Guennadi Liakhovetski
2004-08-15 7:21 ` Sym53c8xx_2 Kai Makisara
0 siblings, 1 reply; 18+ messages in thread
From: Guennadi Liakhovetski @ 2004-08-14 22:16 UTC (permalink / raw)
To: Kai OM; +Cc: Guennadi Liakhovetski, linux-scsi
Hi
On Fri, 13 Aug 2004, Kai OM wrote:
> There was one entry in the 2.6.6 changelog, though, that said something
> about modifying the driver in question; adding generic domain validation
> and such -- and it's during domain validation that the driver stops.
I looked through the diffs between 2.6.5 and 2.6.7 versions of sym53c8xx_2
- there are not so many of them, so we should be able to find the guilty
one. In fact, I think there are only 2 lines of code that can be
essential. But before you test them, one question: in your original posts
you quoted dmesg with 2.6.7 and 2.6.7 with replaced sym53c8xx_2 directory.
However, the output of the working version with the older driver is
missing 2 lines:
sym0:0:0: Tagged command queue enabled, command queue depth 16
scsi(0:0:0:0): Beginning Domain Validation
which are present in the non-working version. Why are they missing from
the working version - did you use a different configuration or different
command-line parameters?
And the patches you could try, to revert the changes I suspect, are:
1)
diff -u a/drivers/scsi/sym53c8xx_2/sym_glue.c b/drivers/scsi/sym53c8xx_2/sym_glue.c
--- a/drivers/scsi/sym53c8xx_2/sym_glue.c 19 May 2004 16:07:46
+++ b/drivers/scsi/sym53c8xx_2/sym_glue.c 14 Aug 2004 21:13:03
@@ -390,7 +390,7 @@
* condition otherwise the device will always return
* BUSY. Use a big stick.
*/
- sym_reset_scsi_target(np, csio->device->id);
+// sym_reset_scsi_target(np, csio->device->id);
cam_status = DID_ERROR;
}
} else if (cp->host_status == HS_COMPLETE) /* Bad SCSI status */
and 2)
diff -u a/drivers/scsi/sym53c8xx_2/sym_glue.c b/drivers/scsi/sym53c8xx_2/sym_glue.c
--- a/drivers/scsi/sym53c8xx_2/sym_glue.c 19 May 2004 16:07:46
+++ b/drivers/scsi/sym53c8xx_2/sym_glue.c 14 Aug 2004 22:12:32
@@ -931,6 +931,7 @@
switch(to_do) {
default:
case SYM_EH_DO_IGNORE:
+ goto finish;
break;
case SYM_EH_DO_WAIT:
init_MUTEX_LOCKED(&ep->sem);
Apply them to stock 2.6.7 and see if either or both of them fix your problem.
Regards
Guennadi
---
Guennadi Liakhovetski
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Sym53c8xx_2
2004-08-14 22:16 ` Sym53c8xx_2 Guennadi Liakhovetski
@ 2004-08-15 7:21 ` Kai Makisara
2004-08-15 8:38 ` Sym53c8xx_2 Guennadi Liakhovetski
` (2 more replies)
0 siblings, 3 replies; 18+ messages in thread
From: Kai Makisara @ 2004-08-15 7:21 UTC (permalink / raw)
To: Guennadi Liakhovetski; +Cc: Kai OM, linux-scsi
On Sun, 15 Aug 2004, Guennadi Liakhovetski wrote:
> Hi
>
> On Fri, 13 Aug 2004, Kai OM wrote:
>
> > There was one entry in the 2.6.6 changelog, though, that said something
> > about modifying the driver in question; adding generic domain validation
> > and such -- and it's during domain validation that the driver stops.
>
> I looked through the diffs between 2.6.5 and 2.6.7 versions of sym53c8xx_2
> - there are not so many of them, so, we'll have to able to find the guilty
> one. In fact, I think, there are only 2 lines of code that can be
> essential. But before you test them, one question: in your original posts
> you quoted dmesg with 2.6.7 and 2.6.7 with replaced sym53c8xx_2 directory.
> However, the output of the working version with the older driver is
> missing 2 lines:
>
> sym0:0:0: Tagged command queue enabled, command queue depth 16
> scsi(0:0:0:0): Beginning Domain Validation
>
> which are present in the non-working version. Why are they missing from
> the working version - did you use a different configuration or different
> command-line parameters?
>
You missed one essential change in 2.6.6:
@@ -908,6 +916,7 @@
config SCSI_SYM53C8XX_2
tristate "SYM53C8XX Version 2 SCSI support"
depends on PCI && SCSI
+ select SCSI_SPI_ATTRS
(Many changes in sym_glue.c accompany this. You can't just remove this
config change to try without spi.)
Because of this, the system is trying to do domain validation with the
SCSI devices. For some reason it fails. The problem may be in the drive
firmware, mid-level SPI code, the sym53c8xx_2 driver, or an interaction
with some/all of those.
Domain Validation does not fail in all configurations. I am using
sym53c8xx_2 in two machines to drive the system disk and some other
devices. I have had no SCSI problems with these systems.
--
Kai
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Sym53c8xx_2
2004-08-15 7:21 ` Sym53c8xx_2 Kai Makisara
@ 2004-08-15 8:38 ` Guennadi Liakhovetski
2004-08-15 8:45 ` Sym53c8xx_2 Guennadi Liakhovetski
2004-08-15 21:52 ` Sym53c8xx_2 Guennadi Liakhovetski
2004-08-16 2:54 ` Sym53c8xx_2 Kai OM
2 siblings, 1 reply; 18+ messages in thread
From: Guennadi Liakhovetski @ 2004-08-15 8:38 UTC (permalink / raw)
To: Kai Makisara; +Cc: Kai OM, linux-scsi
On Sun, 15 Aug 2004, Kai Makisara wrote:
> You missed one essential change in 2.6.6:
No, I didn't. I saw it, but just didn't realise that it was the introduction
of the SPI that triggered the Domain Validation. The transport template
functions look quite harmless and are only triggered through sysfs. What
if you disable SPI in 2.6.7 with the original driver? Does it work then?
Guennadi
---
Guennadi Liakhovetski
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Sym53c8xx_2
2004-08-15 8:38 ` Sym53c8xx_2 Guennadi Liakhovetski
@ 2004-08-15 8:45 ` Guennadi Liakhovetski
0 siblings, 0 replies; 18+ messages in thread
From: Guennadi Liakhovetski @ 2004-08-15 8:45 UTC (permalink / raw)
To: Kai Makisara; +Cc: Kai OM, linux-scsi
On Sun, 15 Aug 2004, Guennadi Liakhovetski wrote:
> functions look quite harmless and are only triggered through sysfs. What if
> you disable SPI in 2.6.7 with the original driver? Does it work then?
Ough, I see:
config SCSI_SYM53C8XX_2
tristate "SYM53C8XX Version 2 SCSI support"
depends on PCI && SCSI
select SCSI_SPI_ATTRS
...and it works on my 2-way too... Ok, I'll re-think it:-)
Guennadi
---
Guennadi Liakhovetski
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Sym53c8xx_2
2004-08-15 7:21 ` Sym53c8xx_2 Kai Makisara
2004-08-15 8:38 ` Sym53c8xx_2 Guennadi Liakhovetski
@ 2004-08-15 21:52 ` Guennadi Liakhovetski
2004-08-16 2:54 ` Sym53c8xx_2 Kai OM
2 siblings, 0 replies; 18+ messages in thread
From: Guennadi Liakhovetski @ 2004-08-15 21:52 UTC (permalink / raw)
To: Kai Makisara; +Cc: Kai OM, linux-scsi
On Sun, 15 Aug 2004, Kai Makisara wrote:
>> scsi(0:0:0:0): Beginning Domain Validation
...
> You missed one essential change in 2.6.6:
>
> @@ -908,6 +916,7 @@
> config SCSI_SYM53C8XX_2
> tristate "SYM53C8XX Version 2 SCSI support"
> depends on PCI && SCSI
> + select SCSI_SPI_ATTRS
>
> (Many changes in sym_glue.c accompany this. You can't just remove this
> config change to try without spi.)
Yep, I could just as well just read your message:-)
Ok, some other ideas (hopefully, better ones than the first two) to try:
do you have a chance to connect another device to the same controller
and/or move the device to another controller under the same driver? My bet
would be that the bug will follow the device. Just to make sure.
Then, to check exactly which of the commands issued during the DV causes
the problem, can you apply the following debugging patch and try 2.6.7
with it? It is to be applied with -p0. Note that my mail client is known to
mangle tabs, although I tested it recently and couldn't reproduce the
problem. But if it does occur, I'll re-send it as an attachment.
Index: drivers/scsi/scsi_transport_spi.c
===================================================================
RCS file: /usr/src/cvs/linux-2_6/drivers/scsi/scsi_transport_spi.c,v
retrieving revision 1.1.1.2
diff -p -u -r1.1.1.2 scsi_transport_spi.c
--- drivers/scsi/scsi_transport_spi.c 19 May 2004 16:07:33 -0000 1.1.1.2
+++ drivers/scsi/scsi_transport_spi.c 15 Aug 2004 21:41:52 -0000
@@ -460,6 +460,7 @@ spi_dv_device_get_echo_buffer(struct scs
}
}
+ printk(KERN_WARNING"%s:%d\n", __FUNCTION__, __LINE__);
sreq->sr_cmd_len = 0;
sreq->sr_data_direction = DMA_FROM_DEVICE;
@@ -500,7 +501,7 @@ spi_dv_device_internal(struct scsi_reque
i->f->set_width(sdev, 0);
}
}
-
+ printk(KERN_WARNING"%s: SPI INQUIRY successful\n", __FUNCTION__);
if (!i->f->set_period)
return;
@@ -511,10 +512,13 @@ spi_dv_device_internal(struct scsi_reque
/* now set up to the maximum */
DV_SET(offset, 255);
DV_SET(period, 1);
+
+ printk(KERN_WARNING"%s:%d\n", __FUNCTION__, __LINE__);
if (!spi_dv_retrain(sreq, buffer, buffer + len,
spi_dv_device_compare_inquiry))
return;
+ printk(KERN_WARNING"%s:%d\n", __FUNCTION__, __LINE__);
/* OK, now we have our initial speed set by the read only inquiry
* test, now try an echo buffer test (if the device allows it) */
@@ -527,8 +531,10 @@ spi_dv_device_internal(struct scsi_reque
len = SPI_MAX_ECHO_BUFFER_SIZE;
}
+ printk(KERN_WARNING"%s:%d\n", __FUNCTION__, __LINE__);
spi_dv_retrain(sreq, buffer, buffer + len,
spi_dv_device_echo_buffer);
+ printk(KERN_WARNING"%s:%d\n", __FUNCTION__, __LINE__);
}
Thanks and regards
Guennadi
---
Guennadi Liakhovetski
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Sym53c8xx_2
2004-08-15 7:21 ` Sym53c8xx_2 Kai Makisara
2004-08-15 8:38 ` Sym53c8xx_2 Guennadi Liakhovetski
2004-08-15 21:52 ` Sym53c8xx_2 Guennadi Liakhovetski
@ 2004-08-16 2:54 ` Kai OM
2004-08-16 11:12 ` Sym53c8xx_2 Matthew Wilcox
` (2 more replies)
2 siblings, 3 replies; 18+ messages in thread
From: Kai OM @ 2004-08-16 2:54 UTC (permalink / raw)
To: Kai Makisara, Guennadi Liakhovetski; +Cc: linux-scsi
On Sun, 15 Aug 2004 10:21:35 +0300 (EEST), "Kai Makisara"
<Kai.Makisara@kolumbus.fi> said:
> Domain Validation does not fail in all configurations. I am using
> sym53c8xx_2 in two machines to drive the system disk and some other
> devices. I have had no SCSI problems with these systems.
What chipset is your controller based on?
I'm using an LSIU160, based on the LSI53C1010-33 chipset
I'm pretty sure at least one other person I know that had this problem
was using a very similar chipset/controller, which strikes me as
pertinent.
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Sym53c8xx_2
2004-08-16 2:54 ` Sym53c8xx_2 Kai OM
@ 2004-08-16 11:12 ` Matthew Wilcox
2004-08-16 11:39 ` Sym53c8xx_2 Lance Dryden
2004-08-16 18:02 ` Sym53c8xx_2 Kai Makisara
2 siblings, 0 replies; 18+ messages in thread
From: Matthew Wilcox @ 2004-08-16 11:12 UTC (permalink / raw)
To: Kai OM; +Cc: Kai Makisara, Guennadi Liakhovetski, linux-scsi
On Sun, Aug 15, 2004 at 10:54:49PM -0400, Kai OM wrote:
>
> On Sun, 15 Aug 2004 10:21:35 +0300 (EEST), "Kai Makisara"
> <Kai.Makisara@kolumbus.fi> said:
>
> > Domain Validation does not fail in all configurations. I am using
> > sym53c8xx_2 in two machines to drive the system disk and some other
> > devices. I have had no SCSI problems with these systems.
>
> What chipset is your controller based on?
I don't think the problem lies with the controller; I think it lies with
one or more devices on the bus.
--
"Next the statesmen will invent cheap lies, putting the blame upon
the nation that is attacked, and every man will be glad of those
conscience-soothing falsities, and will diligently study them, and refuse
to examine any refutations of them; and thus he will by and by convince
himself that the war is just, and will thank God for the better sleep
he enjoys after this process of grotesque self-deception." -- Mark Twain
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Sym53c8xx_2
2004-08-16 2:54 ` Sym53c8xx_2 Kai OM
2004-08-16 11:12 ` Sym53c8xx_2 Matthew Wilcox
@ 2004-08-16 11:39 ` Lance Dryden
2004-08-17 10:24 ` Sym53c8xx_2 Kai OM
2004-08-16 18:02 ` Sym53c8xx_2 Kai Makisara
2 siblings, 1 reply; 18+ messages in thread
From: Lance Dryden @ 2004-08-16 11:39 UTC (permalink / raw)
To: linux-scsi
Hello.
Kai OM wrote:
> On Sun, 15 Aug 2004 10:21:35 +0300 (EEST), "Kai Makisara"
> <Kai.Makisara@kolumbus.fi> said:
>
>>Domain Validation does not fail in all configurations. I am using
>>sym53c8xx_2 in two machines to drive the system disk and some other
>>devices. I have had no SCSI problems with these systems.
>
> What chipset is your controller based on?
>
> I'm using an LSIU160, based on the LSI53C1010-33 chipset
>
> I'm pretty sure at least one other person I know that had this problem
> was using a very similar chipset/controller, which strikes me as
> pertinent.
I can replicate the same problem with a Tekram DC390U3W, which is also
based on the LSI53C1010. I am guessing it's a specific problem with the
53c1010, as I have since replaced it with a 53c895-based card (Tekram
DC390U2W) and it functions normally in the same system.
For me, the failure symptom starts with domain validation of the first
device on the bus (Quantum Atlas V 18). The last lines indicate that
domain validation is beginning for the device. After a delay, the SCSI
layer reports an attempted device reset. Once that finishes, another
delay and then the SCSI layer reports an attempted bus reset. I will
need to re-attach a c1010 to the host in order to provide a precise log.
When I had the 53c1010-based card, the other way to continue working
with the > 2.6.5 kernels was to deliberately stub out the call to
"spi_dv_device_internal()" within
drivers/scsi/scsi_transport_spi.c:spi_dv_device(struct scsi_device
*sdev). Obviously not a nice thing to do.
I can only speculate that the 53c1010 somehow messes up domain
validation where the 53c895 does not.
Cheers,
Lance Dryden
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Sym53c8xx_2
2004-08-16 2:54 ` Sym53c8xx_2 Kai OM
2004-08-16 11:12 ` Sym53c8xx_2 Matthew Wilcox
2004-08-16 11:39 ` Sym53c8xx_2 Lance Dryden
@ 2004-08-16 18:02 ` Kai Makisara
2004-08-17 10:19 ` Sym53c8xx_2 Kai OM
2 siblings, 1 reply; 18+ messages in thread
From: Kai Makisara @ 2004-08-16 18:02 UTC (permalink / raw)
To: Kai OM; +Cc: Guennadi Liakhovetski, linux-scsi
On Sun, 15 Aug 2004, Kai OM wrote:
>
> On Sun, 15 Aug 2004 10:21:35 +0300 (EEST), "Kai Makisara"
> <Kai.Makisara@kolumbus.fi> said:
>
> > Domain Validation does not fail in all configurations. I am using
> > sym53c8xx_2 in two machines to drive the system disk and some other
> > devices. I have had no SCSI problems with these systems.
>
> What chipset is your controller based on?
>
> I'm using an LSIU160, based on the LSI53C1010-33 chipset
>
I actually have three systems using sym53c8xx_2. One system has 53c1010-66
(Tekram DC390U3W), one has 53c896 (ASUS SC896), and one has 53c895 (Tekram
DC390U2W).
> I'm pretty sure at least one other person I know that had this problem
> was using a very similar chipset/controller, which strikes me as
> pertinent.
>
I am sorry to break this theory ;-)
--
Kai
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Sym53c8xx_2
2004-08-16 18:02 ` Sym53c8xx_2 Kai Makisara
@ 2004-08-17 10:19 ` Kai OM
0 siblings, 0 replies; 18+ messages in thread
From: Kai OM @ 2004-08-17 10:19 UTC (permalink / raw)
To: Guennadi Liakhovetski; +Cc: linux-scsi
On Mon, 16 Aug 2004 21:02:49 +0300 (EEST), "Kai Makisara"
<Kai.Makisara@kolumbus.fi> said:
> > I'm pretty sure at least one other person I know that had this problem
> > was using a very similar chipset/controller, which strikes me as
> > pertinent.
> >
> I am sorry to break this theory ;-)
Well, it weakens it, but it could always also be an issue with very
specific chipset revisions, or something more obscure. I can't test to
say that you're wrong, so I'd like someone who gets the same errors as
me (ABORT operations followed by device, bus, and host resets, for those
newly reading this thread) to try the tests suggested -- namely, moving
devices and controllers around, to see which the problem follows.
Could anyone who's gotten this error do some swapping around so we can
get some more data?
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Sym53c8xx_2
2004-08-16 11:39 ` Sym53c8xx_2 Lance Dryden
@ 2004-08-17 10:24 ` Kai OM
2004-08-21 22:51 ` sym53c8xx_2 version 2.1.18j domain validation failures (was Re: Sym53c8xx_2) Lance Dryden
0 siblings, 1 reply; 18+ messages in thread
From: Kai OM @ 2004-08-17 10:24 UTC (permalink / raw)
To: Lance Dryden, linux-scsi
On Mon, 16 Aug 2004 07:39:46 -0400, "Lance Dryden" <lance@jound.net>
said:
> I can replicate the same problem with a Tekram DC390U3W, which is also
> based on the LSI53C1010. I am guessing it's a specific problem with the
> 53c1010, as I have since replaced it with a 53c895-based card (Tekram
> DC390U2W) and it functions normally in the same system.
>
> For me, the failure symptom starts with domain validation of the first
> device on the bus (Quantum Atlas V 18). The last lines indicate that
> domain validation is beginning for the device. After a delay, the SCSI
> layer reports an attempted device reset. Once that finishes, another
> delay and then the SCSI layer reports an attempted bus reset. I will
> need to re-attach a c1010 to the host in order to provide a precise log.
The device I have attached to my LSIU160 is a Quantum Atlas 10K II.
Model ATLAS10K2-TY734J.
I wonder if it's an issue with 53c1010 variants in general, or maybe
only when connected to certain devices?
Can you test a different drive on the 53c1010 controller?
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: Sym53c8xx_2
@ 2004-08-17 18:06 Vladimir G. Ivanovic
0 siblings, 0 replies; 18+ messages in thread
From: Vladimir G. Ivanovic @ 2004-08-17 18:06 UTC (permalink / raw)
To: linux-scsi
I've had a similar problem for years with a Quantum ATLAS 10K3_18_WLS
attached to a LSI Logic/Symbios Logic 53c1010 Ultra3 SCSI Adapter.
I get errors like:
sym0:6: ERROR (81:0) (47-67-19) (3e/18/80) @ (scripta 4b8:f31c0004).
sym0:6: ERROR (81:0) (47-67-67) (3e/18/80) @ (mem f000ef74:ffffffff).
sym0:6: ERROR (81:0) (47-67-7) (3e/18/80) @ (mem f000ef74:ffffffff).
sym0:6: ERROR (81:0) (8-0-0) (3e/18/80) @ (mem 10c48404:fffec887).
sym0:6: ERROR (81:0) (8-0-0) (3e/18/80) @ (scripta 38:f31c0004).
sym0:6: ERROR (81:0) (c-0-0) (3e/18/80) @ (mem 1510158:04000000).
sym0:6: ERROR (81:0) (c-0-0) (3e/18/80) @ (mem 6c6f707c:04000000).
sym0:6: ERROR (81:0) (c-0-0) (3e/18/80) @ (mem 948174:04000000).
These are all Illegal Instruction Detected errors (81:0).
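For anyone collecting these from a long log, a small sketch that tallies the ERROR lines by their leading code and by the kind of location reported may help. The field names in the regular expression are descriptive labels chosen for this example only, not the driver's own terminology, and the helper name summarize is hypothetical.

```python
import re
from collections import Counter

# Matches lines such as:
#   sym0:6: ERROR (81:0) (47-67-19) (3e/18/80) @ (scripta 4b8:f31c0004).
# Group names are labels chosen for this sketch, not driver terminology.
ERROR_RE = re.compile(
    r"(?P<host>sym\d+):(?P<target>\d+): ERROR "
    r"\((?P<code>[0-9a-f]+:[0-9a-f]+)\) "
    r"\((?P<triple>[0-9a-f]+-[0-9a-f]+-[0-9a-f]+)\) "
    r"\((?P<regs>[0-9a-f/]+)\) "
    r"@ \((?P<where>\w+) (?P<addr>[0-9a-f]+):(?P<value>[0-9a-f]+)\)"
)

SAMPLE = [
    "sym0:6: ERROR (81:0) (47-67-19) (3e/18/80) @ (scripta 4b8:f31c0004).",
    "sym0:6: ERROR (81:0) (8-0-0) (3e/18/80) @ (mem 10c48404:fffec887).",
    "sym0: SCSI BUS has been reset.",
]


def summarize(lines):
    """Count ERROR lines by (code, location kind); skip other lines."""
    counts = Counter()
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            counts[(match["code"], match["where"])] += 1
    return counts


if __name__ == "__main__":
    for key, count in summarize(SAMPLE).items():
        print(key, count)
```

Because the pattern is applied with search, lines carrying a syslog prefix such as "kernel: " are matched as well.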
Switching controllers hasn't helped, nor has turning on and off
SCSI termination in the BIOS (on-board SCSI), nor has upgrading from a
2.4.x kernel to a 2.6.x kernel.
I have built a custom 2.6 kernel with all kernel debugging turned on,
and I am playing around with various debugging commands that can be sent
to the driver, but I'd really appreciate any advice on how to log and
debug this kind of problem.
Thanks.
--- Vladimir
------------------------------------------------------------------------
Vladimir G. Ivanovic http://leonora.org/~vladimir
Palo Alto, CA 94306 +1 650 678 8014
------------------------------------------------------------------------
^ permalink raw reply [flat|nested] 18+ messages in thread
* sym53c8xx_2 version 2.1.18j domain validation failures (was Re: Sym53c8xx_2)
2004-08-17 10:24 ` Sym53c8xx_2 Kai OM
@ 2004-08-21 22:51 ` Lance Dryden
2004-08-22 1:01 ` Matthew Wilcox
2004-08-22 2:45 ` Vladimir G. Ivanovic
0 siblings, 2 replies; 18+ messages in thread
From: Lance Dryden @ 2004-08-21 22:51 UTC (permalink / raw)
To: Kai OM; +Cc: linux-scsi
Apologies for the late reply.
What I have found is that only my Quantum Atlas disks seem to have the
problem with domain validation at boot-up.
My results are below. The first test was successful, the second was
not. Both host adapters were installed and enumerated by the kernel in
the order listed. Kernel was 2.6.8.1 from kernel.org.
=== BEGIN TEST 1 ===
53c895
- 2x Quantum Atlas V
- 1x Ecrix VXA-1
53c1010
- 1x Compaq OEM device
- 1x Plextor
Status: Successful DV, all devices
=== BEGIN TEST 2 ===
53c895
- 1x Atlas
- 1x Ecrix
53c1010
- 1x Atlas
- 1x Compaq
- 1x Plextor
Status: UNSUCCESSFUL DV against
- Vendor:"QUANTUM" Model:"ATLAS_V_18_WLS" Rev:"0200"
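The reasoning behind the two tests can be written down as a small set computation: record every (controller, device) pairing tried, mark the ones that failed Domain Validation, and keep the failures that were never seen passing. This is only a sketch with shortened device names; the helper name suspects is made up for the example.

```python
# (controller, device) pairs taken from the two tests above; each test
# records the set of pairs that failed Domain Validation.
TESTS = [
    {
        "pairs": {("53c895", "Atlas"), ("53c895", "Ecrix"),
                  ("53c1010", "Compaq"), ("53c1010", "Plextor")},
        "failed": set(),
    },
    {
        "pairs": {("53c895", "Atlas"), ("53c895", "Ecrix"),
                  ("53c1010", "Atlas"), ("53c1010", "Compaq"),
                  ("53c1010", "Plextor")},
        "failed": {("53c1010", "Atlas")},
    },
]


def suspects(tests):
    """Return pairs that failed and were never seen passing."""
    failed = set().union(*(t["failed"] for t in tests))
    passed = set().union(*(t["pairs"] - t["failed"] for t in tests))
    return failed - passed


if __name__ == "__main__":
    print(suspects(TESTS))
```

With only these two tests, the single suspect is the Atlas on the 53c1010: the same drive passes on the 53c895, and the other devices pass on the 53c1010.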
=== BEGIN TEST 2 CONSOLE LOG SNIPPET ===
sym1:2:0: tagged command queueing enabled, command queue depth 16
scsi(1:0:2:0): Beginning Domain Validation
sym1:2: wide asynchronous
sym1:2:0: ABORT operation started.
sym1:2:0: ABORT operation timed-out
sym1:2:0: DEVICE RESET operation started
sym1:2:0: DEVICE RESET operation timed-out.
sym1:2:0: BUS RESET operation started.
sym1: SCSI BUS reset detected.
sym1: SCSI BUS has been reset.
sym1:2:0: BUS RESET operation complete
sym1:2:0: ABORT operation started.
sym1:2:0: ABORT operation timed-out.
sym1:2:0: HOST RESET operation started.
sym1: SCSI BUS has been reset
=== END TEST 2 CONSOLE LOG SNIPPET ===
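To compare logs from different machines, the error-handling escalation can be pulled out of the console text mechanically. The sketch below extracts the "operation" lines from the snippet above and lists the recovery steps in the order they were started; the regular expression covers only the wording that appears in this log, and the function names are hypothetical.

```python
import re

# Only the message wording seen in the snippet above is covered.
OP_RE = re.compile(
    r"(?P<dev>sym\d+:\d+:\d+): "
    r"(?P<op>ABORT|DEVICE RESET|BUS RESET|HOST RESET) operation "
    r"(?P<state>started|timed-out|complete)"
)

LOG = """\
sym1:2:0: ABORT operation started.
sym1:2:0: ABORT operation timed-out
sym1:2:0: DEVICE RESET operation started
sym1:2:0: DEVICE RESET operation timed-out.
sym1:2:0: BUS RESET operation started.
sym1: SCSI BUS reset detected.
sym1: SCSI BUS has been reset.
sym1:2:0: BUS RESET operation complete
sym1:2:0: ABORT operation started.
sym1:2:0: ABORT operation timed-out.
sym1:2:0: HOST RESET operation started.
"""


def events(log_text):
    """Return (operation, state) tuples in the order they appear."""
    return [(m["op"], m["state"]) for m in OP_RE.finditer(log_text)]


def started(log_text):
    """Return the operations in the order they were started."""
    return [op for op, state in events(log_text) if state == "started"]


if __name__ == "__main__":
    print(" -> ".join(started(LOG)))
```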
I wonder if anyone around can test with some more Quantum Atlas drives
against a 53c1010.
Cheers,
Lance Dryden
Kai OM wrote:
>
> The device I have attached to my LSIU160 is a Quantum Atlas 10K II.
> Model ATLAS10K2-TY734J.
> I wonder if it's an issue with 53c1010 variants in general, or maybe
> only when connected to certain devices?
>
> Can you test a different drive on the 53c1010 controller?
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: sym53c8xx_2 version 2.1.18j domain validation failures (was Re: Sym53c8xx_2)
2004-08-21 22:51 ` sym53c8xx_2 version 2.1.18j domain validation failures (was Re: Sym53c8xx_2) Lance Dryden
@ 2004-08-22 1:01 ` Matthew Wilcox
2004-08-22 2:45 ` Vladimir G. Ivanovic
1 sibling, 0 replies; 18+ messages in thread
From: Matthew Wilcox @ 2004-08-22 1:01 UTC (permalink / raw)
To: Lance Dryden; +Cc: Kai OM, linux-scsi
On Sat, Aug 21, 2004 at 06:51:52PM -0400, Lance Dryden wrote:
> What I have found is that only my Quantum Atlas disks seem to have the
> problem with domain validation at boot-up.
Correct. James managed to isolate the problem to the Atlas drives too.
Basically, the sym2 driver was trying to negotiate an invalid state.
Most drives recognised it was invalid and rejected it, but the Atlas drives
tried to obey it. Here's the patch to fix the problem:
http://marc.theaimsgroup.com/?l=linux-scsi&m=109311381603239&w=2
The patch James mentions that it depends on is:
http://marc.theaimsgroup.com/?l=linux-scsi&m=109305341428550&w=2
(and maybe the followup supplemental patch)
http://marc.theaimsgroup.com/?l=linux-scsi&m=109306123627231&w=2
--
"Next the statesmen will invent cheap lies, putting the blame upon
the nation that is attacked, and every man will be glad of those
conscience-soothing falsities, and will diligently study them, and refuse
to examine any refutations of them; and thus he will by and by convince
himself that the war is just, and will thank God for the better sleep
he enjoys after this process of grotesque self-deception." -- Mark Twain
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: sym53c8xx_2 version 2.1.18j domain validation failures (was Re: Sym53c8xx_2)
2004-08-21 22:51 ` sym53c8xx_2 version 2.1.18j domain validation failures (was Re: Sym53c8xx_2) Lance Dryden
2004-08-22 1:01 ` Matthew Wilcox
@ 2004-08-22 2:45 ` Vladimir G. Ivanovic
1 sibling, 0 replies; 18+ messages in thread
From: Vladimir G. Ivanovic @ 2004-08-22 2:45 UTC (permalink / raw)
To: linux-scsi
I have posted to this list examples of the kinds of errors that I get
with my Atlas drive, but I haven't gotten any response or suggestion on
how to debug.
My boot command line includes "buschk:0x2", and my smartd configuration
file includes "/dev/sda -a -d scsi".
Here are some examples (more or less in chronological order). (I've
elided the date and time, and my machine name from entries below. I can
provide them if anyone wishes.)
Boot-time information:
kernel: sym0: <1010-33> rev 0x1 at pci 0000:00:08.0 irq 177
kernel: sym0: using 64 bit DMA addressing
kernel: sym0: Symbios NVRAM, ID 7, Fast-80, LVD, parity checking
kernel: sym0: open drain IRQ line driver, using on-chip SRAM
kernel: sym0: using LOAD/STORE-based firmware.
kernel: sym0: handling phase mismatch from SCRIPTS.
kernel: sym0: SCSI BUS has been reset.
kernel: scsi0 : sym-2.1.18j
kernel: Vendor: QUANTUM Model: ATLAS10K3_18_WLS Rev: 020K
kernel: Type: Direct-Access ANSI SCSI revision: 03
kernel: sym0:6:0: tagged command queuing enabled, command queue depth 16.
kernel: scsi(0:0:6:0): Beginning Domain Validation
kernel: sym0:6: wide asynchronous.
kernel: sym0:6: FAST-80 WIDE SCSI 160.0 MB/s DT (12.5 ns, offset 62)
kernel: scsi(0:0:6:0): Ending Domain Validation
kernel: SCSI device sda: 35916548 512-byte hdwr sectors (18389 MB)
kernel: SCSI device sda: drive cache: write back
kernel: sda: sda1 sda2 sda3 sda4
kernel: Attached scsi disk sda at scsi0, channel 0, id 6, lun 0
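As a sanity check on the figures in this boot log, the negotiated rate and the reported capacity can be reproduced with simple arithmetic. The sketch assumes that "wide" means a 16-bit bus, that the 12.5 ns period is the time per transfer, and that the kernel reports capacity in decimal megabytes; the function names are made up for the example.

```python
def sync_rate_mb_s(period_ns, bus_width_bits):
    """Peak rate in MB/s: one transfer of bus_width_bits every period_ns."""
    transfers_per_second = 1e9 / period_ns
    bytes_per_transfer = bus_width_bits // 8
    return transfers_per_second * bytes_per_transfer / 1e6


def capacity_mb(sectors, sector_size=512):
    """Capacity in decimal megabytes (10**6 bytes), truncated."""
    return sectors * sector_size // 10**6


if __name__ == "__main__":
    # 12.5 ns per transfer on a 16-bit (wide) bus
    print(sync_rate_mb_s(12.5, 16))
    # 35916548 sectors of 512 bytes
    print(capacity_mb(35916548))
```

Under those assumptions the results agree with the "160.0 MB/s" and "18389 MB" values printed by the kernel.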
Right after smartd starts up:
kernel: sym0:6:0:ODD transfer in DATA IN phase.
kernel: sym0:6:0:COMMAND FAILED (87 0 10).
kernel: sym0:6:0:ODD transfer in DATA IN phase.
kernel: sym0:6:0:COMMAND FAILED (87 0 10).
kernel: sym0:6:0:ODD transfer in DATA IN phase.
kernel: sym0:6:0:COMMAND FAILED (87 0 10).
kernel: sym0:6:0:ODD transfer in DATA IN phase.
kernel: sym0:6:0:COMMAND FAILED (87 0 10).
kernel: sym0:6:0:ODD transfer in DATA IN phase.
kernel: sym0:6:0:COMMAND FAILED (87 0 10).
Still during bootup, but I can't associate any particular event with the
errors:
kernel: sym0:6:0:ordered tag forced.
kernel: sym0:6:0:ordered tag forced.
Here are the errors I get during a typical night:
kernel: sym0:6: ERROR (81:0) (8-0-0) (3e/18/80) @ (scripta 50:f31c0004).
kernel: sym0: script cmd = 90080000
kernel: sym0: regdump: da 00 00 18 47 3e 06 0f 00 08 86 00 80 00 07 0a 89 cd c1 e9 02 00 00 00.
kernel: sym0: SCSI BUS reset detected.
kernel: sym0: SCSI BUS has been reset.
kernel: "ff ff ff 7f b1 2d 24 02 8c 41 44 02 c0 6c 34 02 "
kernel: sym0:6: ERROR (81:0) (8-0-0) (3e/18/80) @ (scripta 50:f31c0004).
kernel: sym0: script cmd = 90080000
kernel: sym0: regdump: da 00 00 18 47 3e 06 0f 04 08 86 00 80 00 0f 0a 89 cd c1 e9 02 00 00 00.
kernel: sym0: SCSI BUS reset detected.
kernel: sym0: SCSI BUS has been reset.
--- Vladimir
------------------------------------------------------------------------
Vladimir G. Ivanovic http://leonora.org/~vladimir
Palo Alto, CA 94306 +1 650 678 8014
------------------------------------------------------------------------
^ permalink raw reply [flat|nested] 18+ messages in thread