Re: st corruption with 2.4.3-pre4


* Re: st corruption with 2.4.3-pre4
       [not found] <3AB63F5F.B7C3E71A@mandrakesoft.com>
@ 2001-03-20  9:03 ` Geert Uytterhoeven
  2001-03-20 19:29   ` Geert Uytterhoeven
  0 siblings, 1 reply; 13+ messages in thread
From: Geert Uytterhoeven @ 2001-03-20  9:03 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Linux Kernel Development

On Mon, 19 Mar 2001, Jeff Garzik wrote:
> Is the corruption reproducible?  If so, does the corruption go away if

Yes, it is reproducible. In all my tests, I tarred 16 files of 16 MB each to
tape (I used a new one).
  - test 1: 4 files with failed md5sum (no further investigation on type of
	    corruption)
  - test 2: 7 files with failed md5sum, 7 blocks of 32 consecutive bytes were
	    corrupted, all starting at an offset of the form 32*x+1.
  - test 3: 7 files with failed md5sum, 7 blocks of 32 consecutive bytes were
	    corrupted, all starting at an offset of the form 32*x+1.
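
The md5sum comparison used in these tests can be sketched as follows. This is only an illustration of the check, not the procedure actually used: the directory layout (one directory with the original files, one with the files read back from tape) and the function names `md5_of` and `failed_files` are assumptions.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def failed_files(original_dir, restored_dir):
    """Return the names of files whose restored copy is missing or differs."""
    failed = []
    for src in sorted(Path(original_dir).iterdir()):
        if not src.is_file():
            continue
        restored = Path(restored_dir) / src.name
        if not restored.is_file() or md5_of(src) != md5_of(restored):
            failed.append(src.name)
    return failed
```

A digest mismatch only says that a file differs; locating the damaged bytes needs a byte-level comparison of the two copies.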

The files seem to be corrupted during writing only, as reading always gives the
exact same (corrupted) data back.

Copying files from the disk on the MESH to a disk on the Sym53c875 (which also
has the tape drive) shows no corruption.

> you rip out the scsi_error patch in 2.4.3-preXX?

After reverting that patch, the problem got worse:
  - test 4: 15 files with failed md5sum, a total of 40 blocks of 32 consecutive
	    bytes were corrupted, all starting at an offset of the form 32*x+1.

So it seems to be related to scsi_error.c.

If you have some suggestions, I'm willing to try them. I'd like to trust
whatever Amanda writes to my backup tapes :-)

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
							    -- Linus Torvalds



* Re: st corruption with 2.4.3-pre4
  2001-03-20  9:03 ` st corruption with 2.4.3-pre4 Geert Uytterhoeven
@ 2001-03-20 19:29   ` Geert Uytterhoeven
  2001-03-20 19:50     ` Gérard Roudier
  2001-03-20 20:13     ` Problems with SCSI controller !!! Mircea Ciocan
  0 siblings, 2 replies; 13+ messages in thread
From: Geert Uytterhoeven @ 2001-03-20 19:29 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Linux Kernel Development

On Tue, 20 Mar 2001, Geert Uytterhoeven wrote:
> On Mon, 19 Mar 2001, Jeff Garzik wrote:
> > Is the corruption reproducible?  If so, does the corruption go away if
> 
> Yes, it is reproducible. In all my tests, I tarred 16 files of 16 MB each to
> tape (I used a new one).
>   - test 1: 4 files with failed md5sum (no further investigation on type of
> 	    corruption)
>   - test 2: 7 files with failed md5sum, 7 blocks of 32 consecutive bytes were
> 	    corrupted, all starting at an offset of the form 32*x+1.
>   - test 3: 7 files with failed md5sum, 7 blocks of 32 consecutive bytes were
> 	    corrupted, all starting at an offset of the form 32*x+1.
> 
> The files seem to be corrupted during writing only, as reading always gives the
> exact same (corrupted) data back.
> 
> Copying files from the disk on the MESH to a disk on the Sym53c875 (which also
> has the tape drive) shows no corruption.

I did some more tests:
  - The problem also occurs when tarring up files from a disk on the Sym53c875.
  - The corrupted data always occurs at offset 32*x (the `+1' above was caused
    by hexdump, starting counting at 1).
  - The 32 bytes of corrupted data at offset 32*x are always a copy of the data
    at offset 32*x-10240.
  - Since 10240 is the default blocksize of tar (bug in tar?), I made a tarball
    on disk instead of on tape, but no corruption.
  - 32 is the size of a cacheline on PPC. Is there a missing cacheflush
    somewhere in the Sym53c875 driver? But then it should happen on disk as
    well?
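
The pattern described above (32-byte blocks at offset 32*x that are a copy of the data 10240 bytes earlier) can be checked mechanically. The following is a minimal sketch, assuming the written stream and the stream read back from tape are both available as byte strings; the function name and the default values are illustrative only.

```python
def corrupted_blocks(original, restored, block=32, lag=10240):
    """Compare two byte strings block by block.

    Returns a list of (offset, is_stale_copy) pairs, one per differing
    aligned block. is_stale_copy is True when the bad block equals the
    original data found `lag` bytes earlier.
    """
    results = []
    length = min(len(original), len(restored))
    for off in range(0, length, block):
        good = original[off:off + block]
        bad = restored[off:off + block]
        if good == bad:
            continue
        stale = off >= lag and bad == original[off - lag:off - lag + block]
        results.append((off, stale))
    return results
```

If every reported block has `is_stale_copy` set, the corruption is consistent with stale data from the previous 10240-byte tar record being written in place of the current cache line.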

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
							    -- Linus Torvalds



* Re: st corruption with 2.4.3-pre4
  2001-03-20 19:29   ` Geert Uytterhoeven
@ 2001-03-20 19:50     ` Gérard Roudier
  2001-03-21  7:19       ` Geert Uytterhoeven
  2001-03-22 18:51       ` Geert Uytterhoeven
  2001-03-20 20:13     ` Problems with SCSI controller !!! Mircea Ciocan
  1 sibling, 2 replies; 13+ messages in thread
From: Gérard Roudier @ 2001-03-20 19:50 UTC (permalink / raw)
  To: Geert Uytterhoeven; +Cc: Jeff Garzik, Linux Kernel Development

On Tue, 20 Mar 2001, Geert Uytterhoeven wrote:

> On Tue, 20 Mar 2001, Geert Uytterhoeven wrote:
> > On Mon, 19 Mar 2001, Jeff Garzik wrote:
> > > Is the corruption reproducible?  If so, does the corruption go away if
> > 
> > Yes, it is reproducible. In all my tests, I tarred 16 files of 16 MB each to
> > tape (I used a new one).
> >   - test 1: 4 files with failed md5sum (no further investigation on type of
> > 	    corruption)
> >   - test 2: 7 files with failed md5sum, 7 blocks of 32 consecutive bytes were
> > 	    corrupted, all starting at an offset of the form 32*x+1.
> >   - test 3: 7 files with failed md5sum, 7 blocks of 32 consecutive bytes were
> > 	    corrupted, all starting at an offset of the form 32*x+1.
> > 
> > The files seem to be corrupted during writing only, as reading always gives the
> > exact same (corrupted) data back.
> > 
> > Copying files from the disk on the MESH to a disk on the Sym53c875 (which also
> > has the tape drive) shows no corruption.
> 
> I did some more tests:
>   - The problem also occurs when tarring up files from a disk on the Sym53c875.
>   - The corrupted data always occurs at offset 32*x (the `+1' above was caused
>     by hexdump, starting counting at 1).
>   - The 32 bytes of corrupted data at offset 32*x are always a copy of the data
>     at offset 32*x-10240.
>   - Since 10240 is the default blocksize of tar (bug in tar?), I made a tarball
>     on disk instead of on tape, but no corruption.
>   - 32 is the size of a cacheline on PPC. Is there a missing cacheflush
>     somewhere in the Sym53c875 driver? But then it should happen on disk as
>     well?

The only PCI transaction that requires the cache line size to be correctly
configured is PCI WRITE and INVALIDATE. This transaction may be used by
the 875 only for data read from a SCSI device and DMAed to memory.

Note that the controller may use optimized PCI transactions only if the 
cache line size is configured in its PCI device configuration space.
Otherwise only normal PCI memory read and PCI memory write transactions 
will be used.

Could you check if the cache line size is configured for your 875?

Let me imagine it is so. Btw, I may be wasting my time if it is not ...
Then the 875 may also use PCI read multiple transactions and/or PCI read
line transactions when reading data from memory. If the corruption is due
to the use of these transactions, the PCI-HOST bridges may well be the
culprit, in my opinion.

Anyway, since the sym53c8xx driver does not try to change the configured
cache line size on PPC, I would suggest running the same tests again with
the cache line size set to zero for the 875. If needed, you may hack the
driver code or the PPC PCI code so that zero is written to the proper place
in the PCI configuration space of the 875.

  Gérard.


* Re: st corruption with 2.4.3-pre4
  2001-03-20 19:50     ` Gérard Roudier
@ 2001-03-21  7:19       ` Geert Uytterhoeven
  2001-03-22 18:51       ` Geert Uytterhoeven
  1 sibling, 0 replies; 13+ messages in thread
From: Geert Uytterhoeven @ 2001-03-21  7:19 UTC (permalink / raw)
  To: Gérard Roudier; +Cc: Jeff Garzik, Linux Kernel Development

On Tue, 20 Mar 2001, Gérard Roudier wrote:
> On Tue, 20 Mar 2001, Geert Uytterhoeven wrote:
> > On Tue, 20 Mar 2001, Geert Uytterhoeven wrote:
> > > On Mon, 19 Mar 2001, Jeff Garzik wrote:
> > > > Is the corruption reproducible?  If so, does the corruption go away if
> > > 
> > > Yes, it is reproducible. In all my tests, I tarred 16 files of 16 MB each to
> > > tape (I used a new one).
> > >   - test 1: 4 files with failed md5sum (no further investigation on type of
> > > 	    corruption)
> > >   - test 2: 7 files with failed md5sum, 7 blocks of 32 consecutive bytes were
> > > 	    corrupted, all starting at an offset of the form 32*x+1.
> > >   - test 3: 7 files with failed md5sum, 7 blocks of 32 consecutive bytes were
> > > 	    corrupted, all starting at an offset of the form 32*x+1.
> > > 
> > > The files seem to be corrupted during writing only, as reading always gives the
> > > exact same (corrupted) data back.
> > > 
> > > Copying files from the disk on the MESH to a disk on the Sym53c875 (which also
> > > has the tape drive) shows no corruption.
> > 
> > I did some more tests:
> >   - The problem also occurs when tarring up files from a disk on the Sym53c875.
> >   - The corrupted data always occurs at offset 32*x (the `+1' above was caused
> >     by hexdump, starting counting at 1).
> >   - The 32 bytes of corrupted data at offset 32*x are always a copy of the data
> >     at offset 32*x-10240.
> >   - Since 10240 is the default blocksize of tar (bug in tar?), I made a tarball
> >     on disk instead of on tape, but no corruption.
> >   - 32 is the size of a cacheline on PPC. Is there a missing cacheflush
> >     somewhere in the Sym53c875 driver? But then it should happen on disk as
> >     well?
> 
> The only PCI transaction that requires the cache line size to be correctly
> configured is PCI WRITE and INVALIDATE. This transaction may be used by
> the 875 only for data read from a SCSI device and DMAed to memory.

So if this were the problem, I should see the corruption when reading files
from disks too? But my tests indicate it happens when writing to tape only, not
when reading from tape, nor when copying between disks.

> Note that the controller may use optimized PCI transactions only if the 
> cache line size is configured in its PCI device configuration space.
> Otherwise only normal PCI memory read and PCI memory write transactions 
> will be used.
> 
> Could you check if the cache line size is configured for your 875?
> 
> Let me imagine it is so. Btw, I may be wasting my time if it is not ...
> Then the 875 may also use PCI read multiple transactions and/or PCI read
> line transactions when reading data from memory. If the corruption is due
> to the use of these transactions, the PCI-HOST bridges may well be the
> culprit, in my opinion.
> 
> Anyway, since the sym53c8xx driver does not try to change the configured
> cache line size on PPC, I would suggest running the same tests again with
> the cache line size set to zero for the 875. If needed, you may hack the
> driver code or the PPC PCI code so that zero is written to the proper place
> in the PCI configuration space of the 875.

I'll try that.

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
							    -- Linus Torvalds



* Re: st corruption with 2.4.3-pre4
  2001-03-20 19:50     ` Gérard Roudier
  2001-03-21  7:19       ` Geert Uytterhoeven
@ 2001-03-22 18:51       ` Geert Uytterhoeven
  2001-04-05 18:48         ` Geert Uytterhoeven
  1 sibling, 1 reply; 13+ messages in thread
From: Geert Uytterhoeven @ 2001-03-22 18:51 UTC (permalink / raw)
  To: Gérard Roudier; +Cc: Jeff Garzik, Linux Kernel Development

On Tue, 20 Mar 2001, Gérard Roudier wrote:
> On Tue, 20 Mar 2001, Geert Uytterhoeven wrote:
> > On Tue, 20 Mar 2001, Geert Uytterhoeven wrote:
> > > On Mon, 19 Mar 2001, Jeff Garzik wrote:
> > I did some more tests:
> >   - The problem also occurs when tarring up files from a disk on the Sym53c875.
> >   - The corrupted data always occurs at offset 32*x (the `+1' above was caused
> >     by hexdump, starting counting at 1).
> >   - The 32 bytes of corrupted data at offset 32*x are always a copy of the data
> >     at offset 32*x-10240.
> >   - Since 10240 is the default blocksize of tar (bug in tar?), I made a tarball
> >     on disk instead of on tape, but no corruption.
> >   - 32 is the size of a cacheline on PPC. Is there a missing cacheflush
> >     somewhere in the Sym53c875 driver? But then it should happen on disk as
> >     well?
> 
> The only PCI transaction that requires the cache line size to be correctly
> configured is PCI WRITE and INVALIDATE. This transaction may be used by
> the 875 only for data read from a SCSI device and DMAed to memory.
> 
> Note that the controller may use optimized PCI transactions only if the 
> cache line size is configured in its PCI device configuration space.
> Otherwise only normal PCI memory read and PCI memory write transactions 
> will be used.
> 
> Could you check if the cache line size is configured for your 875?

It's set to 0 :-(

> Let me imagine it is so. Btw, I may be wasting my time if it is not ...
> Then the 875 may also use PCI read multiple transactions and/or PCI read
> line transactions when reading data from memory. If the corruption is due
> to the use of these transactions, the PCI-HOST bridges may well be the
> culprit, in my opinion.
> 
> Anyway, since the sym53c8xx driver does not try to change the configured
> cache line size on PPC, I would suggest running the same tests again with
> the cache line size set to zero for the 875. If needed, you may hack the
> driver code or the PPC PCI code so that zero is written to the proper place
> in the PCI configuration space of the 875.

So this is not the case.

Any more clues? I want to try different tape drives as well, but so far the
first batch of old DDS drives I found at work seem to be no longer functional.
Let's fetch some other drives tomorrow :-)

BTW, I tried my good old 2.4.0-test1-ac10 kernel from June 2000, and it also
suffered from the same problem. Also note that I did read/write tests on the
tape drive when I just bought it and when I installed the Sym53c875 later, and
I never noticed the problem. So I'm still willing to believe it's a software
bug in recent(?) kernels...

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
							    -- Linus Torvalds



* Re: st corruption with 2.4.3-pre4
  2001-03-22 18:51       ` Geert Uytterhoeven
@ 2001-04-05 18:48         ` Geert Uytterhoeven
  2001-04-05 19:37           ` Geert Uytterhoeven
  2001-04-06  7:29           ` Stefano Coluccini
  0 siblings, 2 replies; 13+ messages in thread
From: Geert Uytterhoeven @ 2001-04-05 18:48 UTC (permalink / raw)
  To: Gérard Roudier
  Cc: Jeff Garzik, Linux Kernel Development, Linux/PPC Development

On Thu, 22 Mar 2001, Geert Uytterhoeven wrote:
> On Tue, 20 Mar 2001, Gérard Roudier wrote:
> > On Tue, 20 Mar 2001, Geert Uytterhoeven wrote:
> > > On Tue, 20 Mar 2001, Geert Uytterhoeven wrote:
> > > > On Mon, 19 Mar 2001, Jeff Garzik wrote:
> > > I did some more tests:
> > >   - The problem also occurs when tarring up files from a disk on the Sym53c875.
> > >   - The corrupted data always occurs at offset 32*x (the `+1' above was caused
> > >     by hexdump, starting counting at 1).
> > >   - The 32 bytes of corrupted data at offset 32*x are always a copy of the data
> > >     at offset 32*x-10240.
> > >   - Since 10240 is the default blocksize of tar (bug in tar?), I made a tarball
> > >     on disk instead of on tape, but no corruption.
> > >   - 32 is the size of a cacheline on PPC. Is there a missing cacheflush
> > >     somewhere in the Sym53c875 driver? But then it should happen on disk as
> > >     well?
> 
> BTW, I tried my good old 2.4.0-test1-ac10 kernel from June 2000, and it also
> suffered from the same problem. Also note that I did read/write tests on the
> tape drive when I just bought it and when I installed the Sym53c875 later, and
> I never noticed the problem. So I'm still willing to believe it's a software
> bug in recent(?) kernels...

Status update:
  - When I connect my DDS1 to the MESH, I see no corruption (as long as I get
    no `lost arbitration' messages from the MESH driver. I never get those with
    the disk BTW. Anyone who knows what needs to be done to make the MESH
    driver recover from lost arbitration errors?). So the tape drive seems to
    be fine.
  - I wanted to try different tape drives, but all retired DDS drives I found
    at work seem to be in a non-functional state. I tried 3 of them, without
    any luck.
  - I wanted to try a 2.2.x kernel, but linuxppc_2_2 (2.2.19-pre3) just says
    `illegal instruction' and returns me to the OF prompt.
  - Adding more eieio/syncs to the sym53c8xx driver doesn't help. In fact there
    are already memory barriers where I'd expect them (as could be expected, of
    course :-)

[...]

OK, I managed to compile an old 2.2.13 kernel from the PPC bk repository that
boots more or less (no video) on my box.

Surprise! So far no corruption!! Time to let Amanda make some dumps tonight :-)

So something broke the st/sym53c8xx combination on my box between 2.2.13 and
2.4.0-test1-ac10...

I'm still waiting for other reports of st/sym53c8xx on PPC under 2.4.x. BTW,
does it work on other big-endian platforms, like sparc?

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
							    -- Linus Torvalds



* Re: st corruption with 2.4.3-pre4
  2001-04-05 18:48         ` Geert Uytterhoeven
@ 2001-04-05 19:37           ` Geert Uytterhoeven
  2001-04-06 17:09             ` Gérard Roudier
  2001-04-06  7:29           ` Stefano Coluccini
  1 sibling, 1 reply; 13+ messages in thread
From: Geert Uytterhoeven @ 2001-04-05 19:37 UTC (permalink / raw)
  To: Gérard Roudier; +Cc: Linux Kernel Development


BTW, my 2.4.3-pre8 kernel just said

| sym53c875-0:0: ERROR (81:0) (3-21-0) (10/9d) @ (script 8a8:0b000000).
| sym53c875-0: script cmd = 11000000
| sym53c875-0: regdump: da 10 80 9d 47 10 00 0d 00 03 80 21 80 01 09 09 00 30 4e 00 08 ff ff ff.
| sym53c875-0-<0,*>: FAST-20 WIDE SCSI 40.0 MB/s (50.0 ns, offset 16)

during the boot process, and continued without problems. What does this mean?

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
							    -- Linus Torvalds




* Re: st corruption with 2.4.3-pre4
  2001-04-05 19:37           ` Geert Uytterhoeven
@ 2001-04-06 17:09             ` Gérard Roudier
  0 siblings, 0 replies; 13+ messages in thread
From: Gérard Roudier @ 2001-04-06 17:09 UTC (permalink / raw)
  To: Geert Uytterhoeven; +Cc: Linux Kernel Development

On Thu, 5 Apr 2001, Geert Uytterhoeven wrote:

> 
> BTW, my 2.4.3-pre8 kernel just said
> 
> | sym53c875-0:0: ERROR (81:0) (3-21-0) (10/9d) @ (script 8a8:0b000000).

Illegal instruction detected.

> | sym53c875-0: script cmd = 11000000
> | sym53c875-0: regdump: da 10 80 9d 47 10 00 0d 00 03 80 21 80 01 09 09 00 30 4e 00 08 ff ff ff.
> | sym53c875-0-<0,*>: FAST-20 WIDE SCSI 40.0 MB/s (50.0 ns, offset 16)
> 
> during the boot process, and continued without problems. What does this mean?

Looks extremely serious to me.

The SCRIPTS processor should be fetching CHMOV DSA-relative WHEN DATA_IN
instructions. This corresponds to opcode 0x11000000.

However, it seems to have fetched instruction 0x0b000000 which is a 
MOVE ABSOLUTE WHEN STATUS PHASE.

In (3-21-0) we can see that the chip is expecting STATUS PHASE (3), but
the target is driving DATA IN phase (21 - the 1 indicates DATA IN phase).

In other words, the SCRIPTS processor seems to have fetched a bogus
instruction. The signaled 'illegal instruction detected' may be due to the
count of bytes to transfer being zero.
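
The two opcodes discussed above can be decoded with a few bit masks. The sketch below assumes the usual 53C8xx block move layout in initiator mode (bits 31-30 zero, bit 29 indirect, bit 28 table indirect / DSA relative, bit 27 selecting MOVE over CHMOV, bits 26-24 the SCSI phase, bits 23-0 the byte count). That layout is stated here as an assumption; it is consistent with the two values quoted in this message, but the chip data manual is the authority.

```python
# SCSI phase codes as encoded in bits 26-24 of a block move instruction
# (assumed layout, see the note above).
PHASES = {
    0: "DATA_OUT",
    1: "DATA_IN",
    2: "COMMAND",
    3: "STATUS",
    6: "MSG_OUT",
    7: "MSG_IN",
}


def decode_block_move(word):
    """Decode the first 32-bit word of a SCRIPTS block move instruction."""
    if word >> 30 != 0:
        raise ValueError("not a block move instruction")
    if word & 0x10000000:
        addressing = "DSA relative"
    elif word & 0x20000000:
        addressing = "indirect"
    else:
        addressing = "absolute"
    return {
        "opcode": "MOVE" if word & 0x08000000 else "CHMOV",  # initiator mode
        "addressing": addressing,
        "phase": PHASES.get((word >> 24) & 0x7, "reserved"),
        "byte_count": word & 0x00FFFFFF,
    }
```

Under these assumptions, 0x11000000 decodes to CHMOV, DSA relative, DATA_IN, and 0x0b000000 decodes to MOVE, absolute, STATUS with a byte count of zero, which matches the reading given above.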

> Gr{oetje,eeting}s,

  Gérard.


* RE: st corruption with 2.4.3-pre4
  2001-04-05 18:48         ` Geert Uytterhoeven
  2001-04-05 19:37           ` Geert Uytterhoeven
@ 2001-04-06  7:29           ` Stefano Coluccini
  2001-04-06  7:47             ` Geert Uytterhoeven
  2001-04-06 14:31             ` Gérard Roudier
  1 sibling, 2 replies; 13+ messages in thread
From: Stefano Coluccini @ 2001-04-06  7:29 UTC (permalink / raw)
  To: Geert Uytterhoeven, Gérard Roudier
  Cc: Jeff Garzik, Linux Kernel Development, Linux/PPC Development

> I'm still waiting for other reports of st/sym53c8xx on PPC under
> 2.4.x. BTW,
> does it work on other big-endian platforms, like sparc?

I don't know if it is the same problem, but ...
I have a Motorola MVME5100 (PowerPC 750 based CPU) with a mezzanine PCI
based on the sym53c875 chip. I'm using the 2_5 kernel from fmslabs. The
first time I downloaded the kernel everything worked fine, but in a
subsequent update the sym53c8xx driver was changed and my board doesn't work
anymore. The driver hangs while downloading the SCSI scripts.
I'm not a SCSI driver expert, so I solved the problem by installing the old
version of the driver.
Tom Rini told me that it happened when he merged some updates from
the 2_4 tree, so I think my problem is related to the latest updates to the
driver.
I hope this helps you.
Bye.
Stefano.


* RE: st corruption with 2.4.3-pre4
  2001-04-06  7:29           ` Stefano Coluccini
@ 2001-04-06  7:47             ` Geert Uytterhoeven
  2001-04-06 14:31             ` Gérard Roudier
  1 sibling, 0 replies; 13+ messages in thread
From: Geert Uytterhoeven @ 2001-04-06  7:47 UTC (permalink / raw)
  To: Stefano Coluccini
  Cc: Gérard Roudier, Jeff Garzik, Linux Kernel Development,
	Linux/PPC Development

On Fri, 6 Apr 2001, Stefano Coluccini wrote:
> > I'm still waiting for other reports of st/sym53c8xx on PPC under
> > 2.4.x. BTW,
> > does it work on other big-endian platforms, like sparc?
> 
> I don't know if it is the same problem, but ...
> I have a Motorola MVME5100 (PowerPC 750 based CPU) with a mezzanine PCI
> based on the sym53c875 chip. I'm using the 2_5 kernel from fmslabs. The
> first time I downloaded the kernel everything worked fine, but in a
> subsequent update the sym53c8xx driver was changed and my board doesn't work
> anymore. The driver hangs while downloading the SCSI scripts.
> I'm not a SCSI driver expert, so I solved the problem by installing the old
> version of the driver.
> Tom Rini told me that it happened when he merged some updates from
> the 2_4 tree, so I think my problem is related to the latest updates to the
> driver.

This is a different problem. You have to do the equivalent of what
process_bridge_ranges()/pci_process_OF_bridge_ranges() (the function got
renamed recently) does for your machine. Else PCI memory space won't work.

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
							    -- Linus Torvalds



* RE: st corruption with 2.4.3-pre4
  2001-04-06  7:29           ` Stefano Coluccini
  2001-04-06  7:47             ` Geert Uytterhoeven
@ 2001-04-06 14:31             ` Gérard Roudier
  2001-04-06 17:20               ` Gérard Roudier
  1 sibling, 1 reply; 13+ messages in thread
From: Gérard Roudier @ 2001-04-06 14:31 UTC (permalink / raw)
  To: Stefano Coluccini
  Cc: Geert Uytterhoeven, Jeff Garzik, Linux Kernel Development,
	Linux/PPC Development

On Fri, 6 Apr 2001, Stefano Coluccini wrote:

> > I'm still waiting for other reports of st/sym53c8xx on PPC under
> > 2.4.x. BTW,
> > does it work on other big-endian platforms, like sparc?
> 
> I don't know if it is the same problem, but ...
> I have a Motorola MVME5100 (PowerPC 750 based CPU) with a mezzanine PCI
> based on the sym53c875 chip. I'm using the 2_5 kernel from fmslabs. The
> first time I downloaded the kernel everything worked fine, but in a
> subsequent update the sym53c8xx driver was changed and my board doesn't work
> anymore. The driver hangs while downloading the SCSI scripts.
> I'm not a SCSI driver expert, so I solved the problem by installing the old
> version of the driver.
> Tom Rini told me that it happened when he merged some updates from
> the 2_4 tree, so I think my problem is related to the latest updates to the
> driver.
> I hope this helps you.
> Bye.
> Stefano.

IMO, it might well be the Linux/PPC PCI interface that doesn't return
expected values.

1) The [sym|ncr]53c8xx need to know about BAR addresses as physical
   address values as seen from the BUS. These values are used by the 
   SCSI SCRIPTS and _NOT_ by the CPU.

2) The pcidev structure returns cookies instead, that commonly are
   BARs physical addresses as seen from CPU.

The recent change in the Symbios driver regarding point (1) is that the
driver now reads the BARs using the pci_read_config*() interface. If these
functions do not return the actual BAR values usable from the BUS for some
obscure reason, this may explain your problem.

The cookies contained in the pcidev structure are completely useless for
the driver and probably for any driver. They just have to be used to remap
memory BARs to CPU virtual addresses. Then the driver forgets about them.
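
To illustrate what "the actual BAR values" means, the sketch below extracts the raw BAR registers from a dump of PCI configuration space. It is a user-space illustration, not driver code: it only uses the standard header layout (BARs start at offset 0x10, bit 0 set marks an I/O BAR, the low four bits of a memory BAR are flags). The function names are hypothetical and 64-bit memory BARs are not handled.

```python
import struct

PCI_BASE_ADDRESS_0 = 0x10  # offset of the first BAR in a type 0 header


def raw_bars(config, count=6):
    """Return the raw 32-bit BAR register values stored in config space."""
    return list(struct.unpack_from("<%dI" % count, config, PCI_BASE_ADDRESS_0))


def describe_bar(value):
    """Split a raw BAR value into (space, bus address) with flag bits masked off."""
    if value & 0x1:
        return ("io", value & ~0x3)
    return ("mem", value & ~0xF)
```

The addresses obtained this way are bus addresses, which is what the SCRIPTS processor needs. On a platform where the host bridge translates addresses, they can differ from the CPU-side values kept in the pcidev structure.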

There are still some PPC PCI specific hacks in the sym53c8xx driver and it
has been reported to me that they can be removed. If the PPC PCI interface
is correct, then they should be removed without problems, IMO.

Here is a patch that removes the offending PPC PCI hacky area from the
driver (sym53c8xx_defs.h):

--- sym53c8xx_defs.h	Fri Apr  6 16:23:48 2001
+++ sym53c8xx_defs.h.orig	Sun Mar  4 13:54:11 2001
@@ -175,6 +175,9 @@
 #define	SCSI_NCR_IOMAPPED
 #elif defined(__alpha__)
 #define	SCSI_NCR_IOMAPPED
+#elif defined(__powerpc__)
+#define	SCSI_NCR_IOMAPPED
+#define SCSI_NCR_PCI_MEM_NOT_SUPPORTED
 #elif defined(__sparc__)
 #undef SCSI_NCR_IOMAPPED
 #endif
-------------------- Cut Here ------------------

Regards,
  Gérard.


* RE: st corruption with 2.4.3-pre4
  2001-04-06 14:31             ` Gérard Roudier
@ 2001-04-06 17:20               ` Gérard Roudier
  0 siblings, 0 replies; 13+ messages in thread
From: Gérard Roudier @ 2001-04-06 17:20 UTC (permalink / raw)
  To: Stefano Coluccini
  Cc: Geert Uytterhoeven, Jeff Garzik, Linux Kernel Development,
	Linux/PPC Development



On Fri, 6 Apr 2001, Gérard Roudier wrote:

> Here is a patch that removes the offending PPC PCI hacky area from the
> driver (sym53c8xx_defs.h):
> 
> --- sym53c8xx_defs.h	Fri Apr  6 16:23:48 2001
> +++ sym53c8xx_defs.h.orig	Sun Mar  4 13:54:11 2001
> @@ -175,6 +175,9 @@
>  #define	SCSI_NCR_IOMAPPED
>  #elif defined(__alpha__)
>  #define	SCSI_NCR_IOMAPPED
> +#elif defined(__powerpc__)
> +#define	SCSI_NCR_IOMAPPED
> +#define SCSI_NCR_PCI_MEM_NOT_SUPPORTED
>  #elif defined(__sparc__)
>  #undef SCSI_NCR_IOMAPPED
>  #endif
> -------------------- Cut Here ------------------

The patch is obviously reversed. You just have to remove the 3 lines that
apply to powerpc using your preferred editor.
Btw, using the one you dislike the most will also do. :-)

  Gérard.



* Problems with SCSI controller !!!
  2001-03-20 19:29   ` Geert Uytterhoeven
  2001-03-20 19:50     ` Gérard Roudier
@ 2001-03-20 20:13     ` Mircea Ciocan
  1 sibling, 0 replies; 13+ messages in thread
From: Mircea Ciocan @ 2001-03-20 20:13 UTC (permalink / raw)
  To: georg; +Cc: Linux Kernel Development

		Hello everybody,

	This is a message on behalf of a friend who is not subscribed to the list:

	It's about an ASUS board that has this ncr53-1010 dual 160 SCSI
controller (sym53c1010).
	With both of the latest kernels (2.2.18ac19 and 2.4.2ac18) the log and
console are filled with this:

	sym53c1010-33-0: unable to abort current chip operation.
	sym53c1010-33-0: Downloading SCSI SCRIPTS.
	sym53c8xx_reset: pid=0 reset_flags=2 ...

	and the controller suddenly locks up and the system has to be restarted.

	Does anyone know the meaning of these messages and what the matter is?
Do you want more details, and if so, which ones?

			Regards,
			
			Mircea C.


end of thread, other threads:[~2001-04-06 20:31 UTC | newest]

Thread overview: 13+ messages
     [not found] <3AB63F5F.B7C3E71A@mandrakesoft.com>
2001-03-20  9:03 ` st corruption with 2.4.3-pre4 Geert Uytterhoeven
2001-03-20 19:29   ` Geert Uytterhoeven
2001-03-20 19:50     ` Gérard Roudier
2001-03-21  7:19       ` Geert Uytterhoeven
2001-03-22 18:51       ` Geert Uytterhoeven
2001-04-05 18:48         ` Geert Uytterhoeven
2001-04-05 19:37           ` Geert Uytterhoeven
2001-04-06 17:09             ` Gérard Roudier
2001-04-06  7:29           ` Stefano Coluccini
2001-04-06  7:47             ` Geert Uytterhoeven
2001-04-06 14:31             ` Gérard Roudier
2001-04-06 17:20               ` Gérard Roudier
2001-03-20 20:13     ` Problems with SCSI controller !!! Mircea Ciocan
