From: Joel Soete <soete.joel@scarlet.be>
To: Grant Grundler <grundler@parisc-linux.org>
Cc: kyle <kyle@mcmartin.ca>, parisc-linux <parisc-linux@parisc-linux.org>
Subject: Re: [parisc-linux] [patch 2/2] backport of sba sg list management to ccio-dma
Date: Sat, 24 Nov 2007 20:36:03 +0000
Message-ID: <47488B33.5040904@scarlet.be>
In-Reply-To: <20071029053015.GA14763@colo.lackof.org>
Hello Grant, Kyle,
I finally found an interesting paper on this Runway bc:
<http://www.hpl.hp.com/hpjournal/96feb/feb96a6.htm>
I have read it but do not yet understand it in depth. Still, this detail:
"The lower 12 bits of the address must be left alone because of the 4K-byte page size defined by the architecture."
makes me think that IOVP_SHIFT in this ccio-dma driver should always be 12, whatever PAGE_SHIFT is (a larger PAGE_SHIFT is
not yet possible, but for pa8000 and later it could be)?
That said, for the moment IOVP_SHIFT == PAGE_SHIFT, so this cannot be the cause of the following issues.
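To make the IOVP_SHIFT question concrete, here is a minimal standalone C sketch (not taken from the ccio-dma driver) of what
a shift fixed at 12 would imply. It assumes, as the sentence quoted from the paper suggests, that the I/O page is always
4 KB; the helpers iovp_index(), iovp_offset() and iovp_per_cpu_page() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the I/O page size is fixed at 4 KB, independent of PAGE_SHIFT. */
#define IOVP_SHIFT 12u
#define IOVP_SIZE  (1ul << IOVP_SHIFT)
#define IOVP_MASK  (~(IOVP_SIZE - 1))

/* Hypothetical helper: I/O virtual page number, i.e. the address without its low 12 bits. */
static inline unsigned long iovp_index(unsigned long addr)
{
    return addr >> IOVP_SHIFT;
}

/* Hypothetical helper: byte offset inside the 4 KB I/O page (the bits "left alone"). */
static inline unsigned long iovp_offset(unsigned long addr)
{
    return addr & ~IOVP_MASK;
}

/* Hypothetical helper: number of 4 KB I/O pages needed to cover one CPU page
 * of size (1 << page_shift). */
static inline unsigned long iovp_per_cpu_page(unsigned int page_shift)
{
    assert(page_shift >= IOVP_SHIFT);
    return 1ul << (page_shift - IOVP_SHIFT);
}
```

With PAGE_SHIFT == 12, iovp_per_cpu_page() returns 1, which matches today's situation where IOVP_SHIFT == PAGE_SHIFT. With a
hypothetical PAGE_SHIFT of 14 it would return 4, so each CPU page would need four I/O page entries.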
On my d380, I managed to rerun some stress tests on the scsi ncr53c720 and LASI 53c700 with a simple read/write loop like:
# while true ; do nice -n -3 tar -xspf linux-2.6.11-rc3-pa3.tar ; nice -n -3 rm -rf linux-2.6.11-rc3-pa3 ; date ; done
With a fs built on a disk connected to a 53c710 hba, with or without my backport patch, I unfortunately always get the same
errors after a few iterations of the loop:
scsi3: (3:0) phase mismatch at 01e8, phase IO CD MSG BSY REQ MSG IN
scsi3: Bus Reset detected, executing command 10a304e0, slot 10a0864c, dsp 001681e8[01e8]
failing command because of reset, slot 10a08520, cmnd 10a30720
failing command because of reset, slot 10a0864c, cmnd 10a304e0
failing command because of reset, slot 10a08778, cmnd 10a303c0
failing command because of reset, slot 10a088a4, cmnd 16eacd40
scsi3: (3:0) phase mismatch at 01e8, phase IO CD MSG BSY REQ MSG IN
scsi3: Bus Reset detected, executing command 10a30600, slot 10a088a4, dsp 001681e8[01e8]
failing command because of reset, slot 10a08520, cmnd 16eacd40
failing command because of reset, slot 10a0864c, cmnd 10a304e0
failing command because of reset, slot 10a08778, cmnd 10a30720
failing command because of reset, slot 10a088a4, cmnd 10a30600
scsi3: (3:0) phase mismatch at 01e8, phase IO CD MSG BSY REQ MSG IN
scsi3: Bus Reset detected, executing command 16eac9e0, slot 10a088a4, dsp 001681e8[01e8]
failing command because of reset, slot 10a08520, cmnd 16eace60
failing command because of reset, slot 10a0864c, cmnd 10a30600
failing command because of reset, slot 10a08778, cmnd 16eacd40
failing command because of reset, slot 10a088a4, cmnd 16eac9e0
[snip]
(This same disk, connected to the same lasi 53c710 on a b180, i.e. without ccio-dma, could loop for several days without
showing any issue.)
On a disk attached to a ncr53c720 hba I also get errors:
EXT3-fs error (device dm-0): ext3_free_blocks: Freeing blocks not in datazone - block = 1818455657, count = 1
EXT3-fs error (device dm-0): ext3_free_blocks: Freeing blocks not in datazone - block = 157639797, count = 1
EXT3-fs error (device dm-0): ext3_free_blocks: Freeing blocks not in datazone - block = 1852402748, count = 1
EXT3-fs error (device dm-0): ext3_free_blocks: Freeing blocks not in datazone - block = 1714387061, count = 1
[snip]
With the original ccio-dma driver this occurs after a few iterations of the loop (about 5); my patch only delays the problem
by several hours (not useless work, but not yet enough).
Anyway, the fs gets corrupted, and this brings me to the next major issue, with my c110 (using the same ncr53c720, lasi
53c710 and ccio-dma drivers as the d380). This box had been sleeping for about a year, so I removed the additional 512Mb ram
kit for another use and restored the original 64Mb of ram, but the internal boot disk stayed unchanged, connected to the
ncr53c720 hba.
When I tried to reboot it some weeks ago with existing, known-working kernels (from the time the system still had 512Mb;
e.g. 2.6.8.1-pa7, 2.6.14-pa0, 2.6.19), it obviously started by running fsck, but sadly this always ended in fs corruption
too (fsck generating the fs corruption, well, not directly but as a side effect). Only with the very old debian install
kernel 2.4.17-32 did I manage to reboot this system and install the latest 2.6.23-pa.orig and 2.6.23-pa+patch kernels. I
could also reboot this box with those latest kernels, but as soon as I launched an 'apt-get dist-upgrade' (after an update,
obviously), fs corruption occurred again. I was innocently expecting that after some reboots, fscks and renewed
dist-upgrades, I would finally recover a system as operational as my d380. But I was wrong, and after 2 or 3 reboots this
box was not bootable anymore (having lost too many critical files on the root fs :_().
[Sad, sad, sad for me: in 10 years of linux, this is the very first time I have lost a system because of a sw issue :__(]
To summarize this whole story: a d380 with 256Mb of ram works more or less fine (if I don't stress the scsi disks), but a
c110 with only 64Mb is not usable at all (with either the original or the patched 2.6.23-pa kernel).
I have the feeling that some cache coherency (I/O, mem??) is lacking somewhere, but I don't understand where the code that
currently handles it is, so if you have some time to point it out to me, I would greatly appreciate it.
TIA,
J.
_______________________________________________
parisc-linux mailing list
parisc-linux@lists.parisc-linux.org
http://lists.parisc-linux.org/mailman/listinfo/parisc-linux