From: Joel Soete <soete.joel@scarlet.be>
To: Grant Grundler <grundler@parisc-linux.org>
Cc: kyle <kyle@mcmartin.ca>, parisc-linux <parisc-linux@parisc-linux.org>
Subject: Re: [parisc-linux] [patch 2/2] backport of sba sg list management to ccio-dma
Date: Sat, 24 Nov 2007 20:36:03 +0000
Message-ID: <47488B33.5040904@scarlet.be>
In-Reply-To: <20071029053015.GA14763@colo.lackof.org>

Hello Grant, Kyle,

I finally found an interesting paper on this Runway bc:
<http://www.hpl.hp.com/hpjournal/96feb/feb96a6.htm>

I read it, and although I don't yet understand it in depth, this detail:
"The lower 12 bits of the address must be left alone because of the 4K-byte page size defined by the architecture."

makes me think that the IOVP_SHIFT of this ccio-dma driver should always be 12, whatever PAGE_SHIFT is (a larger PAGE_SHIFT
is not possible yet, but for pa8000 and later it could be)?

That said, for the moment IOVP_SHIFT == PAGE_SHIFT, so this can't be the reason for the following issues.
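
To make the IOVP_SHIFT question concrete, here is a minimal userspace sketch of what a fixed 12-bit I/O page would mean if PAGE_SHIFT ever became larger. It is not taken from ccio-dma.c: the EX_* macros and ex_* helpers are hypothetical names for illustration only, and the fixed 4K I/O page size is only my assumption from the sentence quoted above.

```c
#include <assert.h>

/* Assumption: the I/O page size stays at 4 KiB (shift of 12) whatever the
 * CPU page size is. These EX_* names are illustrative, not the driver's. */
#define EX_IOVP_SHIFT 12
#define EX_IOVP_SIZE  (1UL << EX_IOVP_SHIFT)
#define EX_IOVP_MASK  (~(EX_IOVP_SIZE - 1))

/* Number of 4 KiB I/O pages needed to cover one CPU page of 2^page_shift bytes. */
static unsigned long ex_iovp_entries_per_cpu_page(unsigned int page_shift)
{
    assert(page_shift >= EX_IOVP_SHIFT);
    return 1UL << (page_shift - EX_IOVP_SHIFT);
}

/* Number of 4 KiB I/O pages touched by the buffer [addr, addr + len). */
static unsigned long ex_iovp_pages_spanned(unsigned long addr, unsigned long len)
{
    unsigned long first, last;

    assert(len > 0);
    first = addr & EX_IOVP_MASK;             /* I/O page holding the first byte */
    last = (addr + len - 1) & EX_IOVP_MASK;  /* I/O page holding the last byte */
    return ((last - first) >> EX_IOVP_SHIFT) + 1;
}
```

With PAGE_SHIFT == 12 the first helper gives one I/O page per CPU page, which is today's situation; with a 16K CPU page (shift 14) it would give four, so any code that silently assumes IOVP_SHIFT == PAGE_SHIFT would have to be revisited.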

On my d380, I managed to re-run some stress tests on the scsi ncr53c720 and LASI 53c700 with a simple read/write loop like:
# while true ; do nice -n -3 tar -xspf linux-2.6.11-rc3-pa3.tar ; nice -n -3 rm -rf linux-2.6.11-rc3-pa3 ; date ; done

With a fs built on a disk connected to a 53c710 hba, with or without my backport patch, unfortunately I always got the same
errors after some iterations of the loop:
scsi3: (3:0) phase mismatch at 01e8, phase IO CD MSG BSY REQ MSG IN
  scsi3: Bus Reset detected, executing command 10a304e0, slot 10a0864c, dsp 001681e8[01e8]
   failing command because of reset, slot 10a08520, cmnd 10a30720
   failing command because of reset, slot 10a0864c, cmnd 10a304e0
   failing command because of reset, slot 10a08778, cmnd 10a303c0
   failing command because of reset, slot 10a088a4, cmnd 16eacd40
  scsi3: (3:0) phase mismatch at 01e8, phase IO CD MSG BSY REQ MSG IN
  scsi3: Bus Reset detected, executing command 10a30600, slot 10a088a4, dsp 001681e8[01e8]
   failing command because of reset, slot 10a08520, cmnd 16eacd40
   failing command because of reset, slot 10a0864c, cmnd 10a304e0
   failing command because of reset, slot 10a08778, cmnd 10a30720
   failing command because of reset, slot 10a088a4, cmnd 10a30600
  scsi3: (3:0) phase mismatch at 01e8, phase IO CD MSG BSY REQ MSG IN
  scsi3: Bus Reset detected, executing command 16eac9e0, slot 10a088a4, dsp 001681e8[01e8]
   failing command because of reset, slot 10a08520, cmnd 16eace60
   failing command because of reset, slot 10a0864c, cmnd 10a30600
   failing command because of reset, slot 10a08778, cmnd 16eacd40
   failing command because of reset, slot 10a088a4, cmnd 16eac9e0
[snip]

(this same disk, connected to the same lasi 53c710 of a b180, i.e. without ccio-dma, could loop for several days without
showing any issue)

On a disk attached to a ncr53c720 hba I also get errors:
  EXT3-fs error (device dm-0): ext3_free_blocks: Freeing blocks not in datazone - block = 1818455657, count = 1
  EXT3-fs error (device dm-0): ext3_free_blocks: Freeing blocks not in datazone - block = 157639797, count = 1
  EXT3-fs error (device dm-0): ext3_free_blocks: Freeing blocks not in datazone - block = 1852402748, count = 1
  EXT3-fs error (device dm-0): ext3_free_blocks: Freeing blocks not in datazone - block = 1714387061, count = 1
[snip]

With the original ccio-dma driver it occurs after a few iterations of the loop (about 5), but my patch only delays the
problem by several hours (not useless work, but not yet enough).

Anyway, the fs is corrupted, and this brings me to the next major issue, with my c110 (using the same ncr53c720, lasi 53c710
and ccio-dma drivers as the d380). This box had been sleeping for about a year, so I removed the additional 512Mb ram kit for
another use and restored the original 64Mb of ram, but the internal boot disk stayed unchanged, connected to the ncr53c720 hba.
When I tried to reboot it some weeks ago with existing & known-working kernels (from the time the system still had 512Mb;
e.g. 2.6.8.1-pa7, 2.6.14-pa0, 2.6.19), it obviously started with a fsck, but sadly this always ended in fs corruption too
(fsck generating the fs corruption, well, not directly but as a side effect). Only with the very old debian install kernel
2.4.17-32 did I manage to reboot this system and install the latest 2.6.23-pa.orig and 2.6.23-pa+patch kernels. I could
also reboot this box with those latest kernels, but as soon as I launched an 'apt-get dist-upgrade' (after an update,
obviously) fs corruption occurred again. I was innocently expecting that after some reboots, fscks and repeated
dist-upgrades, I would finally recover an operational system like my d380. But I was wrong, and after 2 or 3 reboots this box
became unbootable (having lost too many critical files on the root fs :_().

[Sad sad sad for me: in 10 years of linux, it's the very first time I lost a system because of a sw issue :__(]

All this story to say, in summary: a d380 with 256Mb of ram works more or less fine (if I don't stress the scsi disks), but a
c110 with only 64Mb is not usable at all (with either the original or the patched 2.6.23-pa kernel).

I have the feeling that some cache coherency handling (I/O, mem??) is lacking somewhere, but I don't understand where/what the
code is that does it now, so if you have some time to point it out to me, I would greatly appreciate it.

TIA,
	J.

