* [parisc-linux] BUG 2.6.11-rc4-pa1 XFS data page fault
@ 2005-02-16 6:55 Grant Grundler
[not found] ` <4208D51500003293@mail-5-bnl.tiscali.it>
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Grant Grundler @ 2005-02-16 6:55 UTC (permalink / raw)
To: parisc-linux
Bad night. :^(
Running 2.6.11-rc4-pa0-plus (equivalent to -pa1) on a500-65 (riot).
xfs ran spew for 48h but panic'd when I did a "cvs up -A"
to the CVS source tree on the same file system.
This file system is a raid0 of 6 disks (3 per scsi bus).
Then default "mkfs -t xfs /dev/md0".
grant
Backtrace:
[ argh...no backtrace got dumped ]
Kernel Fault: Code=15 regs=0000000032860d20 (Addr=00000027452d93e1)
YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
PSW: 00001000000001000000000100001111 Not tainted
r00-03 0000000000000000 0000002647c44fe0 00000000a7f9a6c0 00000000f9aa3000
r04-07 0000000000000001 0000000000242000 00000000732bcab8 0000000032860ae8
r08-11 0000000000000001 00000000732bcab8 0000000000011c1b 0000000000000010
r12-15 0000000000000001 00000000f9aa3288 00000000f9aa3150 0000000000000001
r16-19 00000000328607b8 00000000cd3bb090 0000000032860560 00000000fd694400
Press Q/q to quit, Enter to continue:
r20-23 00000000002a2a40 000000000000055c 0000000058414749 00000000cb631400
r24-27 00000027452d93e0 0000000000000001 0000000000242000 0000000000242000
r28-31 00000000cb631400 0000000032860cf0 0000000032860d20 0000000000000000
sr0-3 000000000a95f800 000000000a95f800 0000000000000000 000000000a95e800
sr4-7 0000000000000000 0000000000000000 0000000000000000 0000000000000000
IASQ: 0000000000000000 0000000000000000 IAOQ: 00000000002a1574 00000000002a1578
IIR: 0f021014 ISR: 0000000000000000 IOR: 00000027452d93e1
CPU: 0 CR30: 0000000032860000 CR31: 0000000010508000
ORIG_R28: 0000000000242000
IAOQ[0]: xfs_ialloc_read_agi+0x164/0x188 [xfs]
IAOQ[1]: xfs_ialloc_read_agi+0x168/0x188 [xfs]
RP(r2): 0xa7f9a6c0
Kernel panic - not syncing: Kernel Fault
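When reading a dump like this, it can help to pull the general registers and the IOR out of the text and see which register sits closest to the faulting address. The following is only a sketch in Python: the helper names (parse_gprs, parse_ior, registers_near) and the 0x100 window are invented for illustration, and only three rows of the dump are pasted in.

```python
import re

# Three rows copied from the dump above; the other rows parse the same way.
DUMP = """\
r20-23 00000000002a2a40 000000000000055c 0000000058414749 00000000cb631400
r24-27 00000027452d93e0 0000000000000001 0000000000242000 0000000000242000
IIR: 0f021014 ISR: 0000000000000000 IOR: 00000027452d93e1
"""

def parse_gprs(text):
    """Return {register number: value} for every 'rNN-MM v v v v' row."""
    regs = {}
    for line in text.splitlines():
        m = re.match(r"r(\d+)-(\d+)\s+(.*)", line)
        if not m:
            continue
        first = int(m.group(1))
        for i, word in enumerate(m.group(3).split()):
            regs[first + i] = int(word, 16)
    return regs

def parse_ior(text):
    """Return the faulting data address from the 'IOR:' field."""
    return int(re.search(r"IOR:\s*([0-9a-fA-F]+)", text).group(1), 16)

def registers_near(regs, addr, window=0x100):
    """List registers whose value lies within `window` bytes of addr."""
    return sorted(n for n, v in regs.items() if abs(v - addr) < window)

regs = parse_gprs(DUMP)
ior = parse_ior(DUMP)
near = registers_near(regs, ior)
print(near, hex(ior - regs[near[0]]))
```

On the rows pasted here, the only register near the fault address is r24, which is one byte below the IOR.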
_______________________________________________
parisc-linux mailing list
parisc-linux@lists.parisc-linux.org
http://lists.parisc-linux.org/mailman/listinfo/parisc-linux
* Re: [parisc-linux] BUG 2.6.11-rc4-pa1 XFS data page fault
[not found] ` <4208D51500003293@mail-5-bnl.tiscali.it>
@ 2005-02-16 7:45 ` Grant Grundler
0 siblings, 0 replies; 12+ messages in thread
From: Grant Grundler @ 2005-02-16 7:45 UTC (permalink / raw)
To: Joel Soete; +Cc: parisc-linux
On Wed, Feb 16, 2005 at 08:28:15AM +0100, Joel Soete wrote:
> > RP(r2): 0xa7f9a6c0
> > Kernel panic - not syncing: Kernel Fault
> >
> Any chance to know what 0xa7f9a6c0 is (iirc calling address)?
return pointer - yes, the caller's address.
> and fwiw it's different from previous panic: RP was 0xfa285480?
both look like garbage to me.
> mmm patch-2.6.11-rc4-bk3 (including some xfs fixes) is small, could you let
> a try?
Sure - tomorrow. You have a URL?
(Sorry - I think you posted it once before)
thanks,
grant
* Re: [parisc-linux] BUG 2.6.11-rc4-pa1 XFS data page fault
2005-02-16 6:55 [parisc-linux] BUG 2.6.11-rc4-pa1 XFS data page fault Grant Grundler
[not found] ` <4208D51500003293@mail-5-bnl.tiscali.it>
@ 2005-02-17 4:06 ` Carlos O'Donell
2005-02-19 22:27 ` Grant Grundler
2 siblings, 0 replies; 12+ messages in thread
From: Carlos O'Donell @ 2005-02-17 4:06 UTC (permalink / raw)
To: Grant Grundler; +Cc: parisc-linux
On Tue, Feb 15, 2005 at 11:55:39PM -0700, Grant Grundler wrote:
> Bad night. :^(
Yes, the new kernels don't seem so safe. I'm not very happy to move my
a500 out of 2.6.9-rc2-pa11... 'cause it "just works (tm)"
> Running 2.6.11-rc4-pa0-plus (equivalent to -pa1) on a500-65 (riot).
> xfs ran spew for 48h but panic'd when I did a "cvs up -A"
> to the CVS source tree on the same file system.
>
> This file system is a raid0 of 6 disks (3 per scsi bus).
> Then default "mkfs -t xfs /dev/md0".
The rp is not correct, but it doesn't have to be until it is reloaded
before function return; until then it can be used as scratch. IAOQ/IASQ
is a better indicator of what the cpu was doing at the time.
> Kernel Fault: Code=15 regs=0000000032860d20 (Addr=00000027452d93e1)
>
> YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
> PSW: 00001000000001000000000100001111 Not tainted
> r00-03 0000000000000000 0000002647c44fe0 00000000a7f9a6c0 00000000f9aa3000
> r04-07 0000000000000001 0000000000242000 00000000732bcab8 0000000032860ae8
> r08-11 0000000000000001 00000000732bcab8 0000000000011c1b 0000000000000010
> r12-15 0000000000000001 00000000f9aa3288 00000000f9aa3150 0000000000000001
> r16-19 00000000328607b8 00000000cd3bb090 0000000032860560 00000000fd694400
>
>
> Press Q/q to quit, Enter to continue:
> r20-23 00000000002a2a40 000000000000055c 0000000058414749 00000000cb631400
> r24-27 00000027452d93e0 0000000000000001 0000000000242000 0000000000242000
> r28-31 00000000cb631400 0000000032860cf0 0000000032860d20 0000000000000000
> sr0-3 000000000a95f800 000000000a95f800 0000000000000000 000000000a95e800
> sr4-7 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>
> IASQ: 0000000000000000 0000000000000000 IAOQ: 00000000002a1574 00000000002a1578
> IIR: 0f021014 ISR: 0000000000000000 IOR: 00000027452d93e1
> CPU: 0 CR30: 0000000032860000 CR31: 0000000010508000
> ORIG_R28: 0000000000242000
> IAOQ[0]: xfs_ialloc_read_agi+0x164/0x188 [xfs]
> IAOQ[1]: xfs_ialloc_read_agi+0x168/0x188 [xfs]
> RP(r2): 0xa7f9a6c0
> Kernel panic - not syncing: Kernel Fault
The sp is aligned, and at some nominal value. I'm not sure what
instruction that IIR works out to be; can you disassemble your kernel at
that point to see the faulting instruction?
c.
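One way to get a first idea of what the IIR encodes, without a disassembler, is to pull the instruction fields out by hand. The sketch below assumes the PA-RISC short-displacement load layout (opcode in bits 0-5, base register in bits 6-10, a 5-bit low-sign-extended displacement in bits 11-15, target register in bits 27-31, bit 0 being the most significant). That layout is an assumption recalled from the architecture manual and has not been checked against this kernel build, so objdump output on the actual xfs module remains the authority. The helper names are hypothetical.

```python
def bits(word, start, length):
    """Extract `length` bits starting at bit `start`, where bit 0 is the
    most significant bit of a 32-bit word (PA-RISC bit numbering)."""
    return (word >> (32 - start - length)) & ((1 << length) - 1)

def low_sign_ext(field, length):
    """Low-sign-extended immediate: the sign is the rightmost bit."""
    sign = field & 1
    value = field >> 1
    return value - (sign << (length - 1))

def decode_short_load(iir):
    """Split an instruction word into fields (assumed layout, see above)."""
    return {
        "op": bits(iir, 0, 6),
        "base": bits(iir, 6, 5),
        "disp": low_sign_ext(bits(iir, 11, 5), 5),
        "target": bits(iir, 27, 5),
    }

IIR = 0x0F021014          # from the dump
R24 = 0x00000027452D93E0  # from the dump
IOR = 0x00000027452D93E1  # from the dump

fields = decode_short_load(IIR)
print(fields)
# Does base register + displacement reproduce the faulting address?
print(fields["base"] == 24 and R24 + fields["disp"] == IOR)
```

Under those assumptions the word reads as a load from 1(r24) into r20, and r24 + 1 matches the IOR, which would point at a bad pointer in r24 rather than at a corrupted instruction stream.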
* Re: [parisc-linux] BUG 2.6.11-rc4-pa1 XFS data page fault
[not found] ` <20050217210915.GB1081@colo.lackof.org>
@ 2005-02-18 1:00 ` Michael S. Zick
0 siblings, 0 replies; 12+ messages in thread
From: Michael S. Zick @ 2005-02-18 1:00 UTC (permalink / raw)
To: parisc-linux
On Thu February 17 2005 15:09, Grant Grundler wrote:
> On Thu, Feb 17, 2005 at 06:53:52AM -0600, Michael S. Zick wrote:
> > A suggestion...
> >
> > I am tracking a timing and/or race condition(s) with x86-udev.
> > Debian doesn't use it yet, they still have a hard disk based /dev.
>
> OK. I'll try that tonight.
>
Grant,
One of my problems was the video card driver - the manufacturer's
driver has problems, the X.org driver does not. Tossing the vendor's
driver fixed that.
Still having network driver problems, haven't narrowed down all
the possible causes.
The hardware, software, and kernel are as similar as possible
with the exceptions:
works:
kernel 2.6.8, gcc 3.3.4, single processor, hard-disk /dev
problems:
kernel 2.6.x, gcc 3.3.5, dual processor, udev /dev
My guess is that the udev code is not 100% smp ready.
But it could actually be anything, including cockpit error.
Time to sneaker net (by cd-rom) code from one box
to the other. Will advise.
Mike
* Re: [parisc-linux] BUG 2.6.11-rc4-pa1 XFS data page fault
[not found] ` <200502170653.52415.mszick@wolfbutter.com>
[not found] ` <20050217210915.GB1081@colo.lackof.org>
@ 2005-02-18 6:23 ` Grant Grundler
2005-02-18 9:28 ` Thibaut VARENE
1 sibling, 1 reply; 12+ messages in thread
From: Grant Grundler @ 2005-02-18 6:23 UTC (permalink / raw)
To: Michael S. Zick; +Cc: parisc-linux
On Thu, Feb 17, 2005 at 06:53:52AM -0600, Michael S. Zick wrote:
> I am tracking a timing and/or race condition(s) with x86-udev.
> Debian doesn't use it yet, they still have a hard disk based /dev.
> But it is not a compile time option.
> If you have a 2.6.+ kernel, you have it.
>
> Try turning it off with the command line option: 'noudev'
I thought this helped at first. (2.6.11-rc4-pa1)
Could run spew+"make -j4" over xfs without crashing the box (panic).
Then I rebooted with a freshly built kernel (and "noudev")
and right away when I start spew on the XFS file system I get:
Kernel Fault: Code=26 regs=00000000fd0b0de0 (Addr=00000000000000c8)
...
IAOQ[0]: xfs_btree_init_cursor+0x70/0x1b8 [xfs]
IAOQ[1]: xfs_btree_init_cursor+0x8c/0x1b8 [xfs]
RP(r2): xfs_btree_init_cursor+0x58/0x1b8 [xfs]
The system panics with and without the xfs patches Joel pointed me at.
The only difference in the .config was sym2 driver IOMAPPED.
I had it enabled (using IO Port space) and now disable it
since I really want MMIO space. I'm very skeptical this is
related to the problem.
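To confirm that two builds really differ in only one option, the two .config files can be compared symbol by symbol. A minimal sketch in Python follows; the option name CONFIG_SCSI_SYM53C8XX_IOMAPPED is an assumption about how the sym2 port-IO choice is spelled in this tree, and the two config fragments are invented samples, not the real files.

```python
def parse_config(text):
    """Map each CONFIG_ symbol to its value; '# CONFIG_X is not set' -> None."""
    suffix = " is not set"
    opts = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            opts[line[2:-len(suffix)]] = None
        elif line.startswith("CONFIG_") and "=" in line:
            key, value = line.split("=", 1)
            opts[key] = value
    return opts

def config_diff(old_text, new_text):
    """Return {symbol: (old value, new value)} for symbols that differ."""
    old, new = parse_config(old_text), parse_config(new_text)
    return {
        key: (old.get(key), new.get(key))
        for key in sorted(set(old) | set(new))
        if old.get(key) != new.get(key)
    }

# Invented sample fragments; the symbol names are assumptions.
OLD = """\
CONFIG_XFS_FS=m
CONFIG_SCSI_SYM53C8XX_2=y
CONFIG_SCSI_SYM53C8XX_IOMAPPED=y
"""
NEW = """\
CONFIG_XFS_FS=m
CONFIG_SCSI_SYM53C8XX_2=y
# CONFIG_SCSI_SYM53C8XX_IOMAPPED is not set
"""

diff = config_diff(OLD, NEW)
for key, (before, after) in diff.items():
    print(key, before, "->", after)
```

In practice the two strings would be read from the saved .config of each build.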
> If your compile time options include devfs, try turning it
> off also. Use the command line option: 'nodevfs'
I don't use devfs. Willy won't let me. :^P
> In my case, for kernels 2.6.8, 2.6.9, 2.6.10 and 2.6.11 -
> If I boot from slow devices (cdrom, zip disk) everything is
> fine, If I boot from fast devices (hard disk) - something isn't
> getting setup the same.
Is there a difference in how the devices are used?
ie. serialized vs. multiple IOs in flight at the same time?
I didn't start hitting problems until I launched a second job
on the same file system.
thanks,
grant
* Re: [parisc-linux] BUG 2.6.11-rc4-pa1 XFS data page fault
2005-02-18 6:23 ` Grant Grundler
@ 2005-02-18 9:28 ` Thibaut VARENE
2005-02-18 14:07 ` Michael S. Zick
[not found] ` <20050219022808.GA28925@colo.lackof.org>
0 siblings, 2 replies; 12+ messages in thread
From: Thibaut VARENE @ 2005-02-18 9:28 UTC (permalink / raw)
To: Grant Grundler; +Cc: Michael S.Zick, parisc-linux
-------------------
> On Thu, Feb 17, 2005 at 06:53:52AM -0600, Michael S. Zick wrote:
> > I am tracking a timing and/or race condition(s) with x86-udev.
> > Debian doesn't use it yet, they still have a hard disk based /dev.
> > But it is not a compile time option.
> > If you have a 2.6.+ kernel, you have it.
> >
> > Try turning it off with the command line option: 'noudev'
>
> I thought this helped at first. (2.6.11-rc4-pa1)
> Could run spew+"make -j4" over xfs without crashing the box (panic).
>
> Then I rebooted with a freshly built kernel (and "noudev")
> and right away when I start spew on the XFS file system I get:
> The only difference in the .config was sym2 driver IOMAPPED.
> I had it enabled (using IO Port space) and now disable it
> since I really want MMIO space. I'm very skeptical this is
> related to the problem.
Hmmm.
Back in the early days of 2.5, i recall issues with SYM2 in MMIO mode,
so that it only worked in PIO mode (I think i had posted something on
this m-l about that). At that time i was using ext3 for that matter.
OTOH, XFS relies a lot on memory and cache quality. I don't know how
MMIO would interfere with that, but I suspect that if enabling MMIO
has the side effect of messing with that somehow, it's very likely XFS
won't like it at all. Just a random thought though.
> > If your compile time options include devfs, try turning it
> > off also. Use the command line option: 'nodevfs'
>
> I don't use devfs. Willy won't let me. :^P
well, devfs is 3v1l and deprecated, so you'd better not use it :)
> > In my case, for kernels 2.6.8, 2.6.9, 2.6.10 and 2.6.11 -
> > If I boot from slow devices (cdrom, zip disk) everything is
> > fine, If I boot from fast devices (hard disk) - something isn't
> > getting setup the same.
>
> Is there a difference in how the devices are used?
> ie. serialized vs. multiple IOs in flight at the same time?
>
> I didn't start hitting problems until I launched a second job
> on the same file system.
Come to think of it, I wonder how the XFS bug you're pointing at can
be related to the bug Joel and I reported with ext3:
http://lists.parisc-linux.org/pipermail/parisc-linux/2005-February/025867.html
Maybe one would like to try XFS on a non-SYM2 box and see what
happens? That way we might have a clue whether sym2's at fault or
something else...
My 2 cents in the morning ;)
Thibaut VARENE
PA/Linux ESIEE Team
http://www.pateam.org/
* Re: [parisc-linux] BUG 2.6.11-rc4-pa1 XFS data page fault
2005-02-18 9:28 ` Thibaut VARENE
@ 2005-02-18 14:07 ` Michael S. Zick
[not found] ` <20050219022808.GA28925@colo.lackof.org>
1 sibling, 0 replies; 12+ messages in thread
From: Michael S. Zick @ 2005-02-18 14:07 UTC (permalink / raw)
To: parisc-linux
On Fri February 18 2005 03:28, Thibaut VARENE wrote:
> -------------------
> > On Thu, Feb 17, 2005 at 06:53:52AM -0600, Michael S. Zick wrote:
> > > I am tracking a timing and/or race condition(s) with x86-udev.
> > > Debian doesn't use it yet, they still have a hard disk based /dev.
> > > But it is not a compile time option.
> > > If you have a 2.6.+ kernel, you have it.
> > >
> > > Try turning it off with the command line option: 'noudev'
> >
> > I thought this helped at first. (2.6.11-rc4-pa1)
> > Could run spew+"make -j4" over xfs without crashing the box (panic).
> >
> > Then I rebooted with a freshly built kernel (and "noudev")
> > and right away when I start spew on the XFS file system I get:
>
>
> > The only difference in the .config was sym2 driver IOMAPPED.
> > I had it enabled (using IO Port space) and now disable it
> > since I really want MMIO space. I'm very skeptical this is
> > related to the problem.
>
Try the command line option: pci=routeirq
I know that pa-risc pci is a totally different beast, so I can't say
how much the changes in the generic code affect your systems.
> Hmmm.
> Back in the early days of 2.5, i recall issues with SYM2 in MMIO mode,
> so that it only worked in PIO mode (I think i had posted something on
> this m-l about that). At that time i was using ext3 for that matter.
>
> OTOH, XFS relies a lot on memory and cache quality. I don't know how
> MMIO would interfere with that, but I suspect that if enabling MMIO
> has the side effect of messing with that somehow, it's very likely XFS
> won't like it at all. Just a random thought though.
>
> > > If your compile time options include devfs, try turning it
> > > off also. Use the command line option: 'nodevfs'
> >
> > I don't use devfs. Willy won't let me. :^P
>
I have to agree with Willy on this one.
Taking it out of the kernel is a step forward for the kernel designers.
The udev system, although envisioned as a replacement for devfs
is much more general and can (could, might) lead to mountable
cpu and memory segments.
? ? ? ?
To take a cpu off-line, such as for test/maintenance:
mount -o remount,ro /dev/cpu/xxx
To shut it down, such as for replacement:
umount /dev/cpu/xxx
To take a memory range off-line, such as for test/maintenance:
mount -o remount,ro /dev/mem/<memory-range-id>
To take failed memory range down for replacement:
umount /dev/mem/<memory-range-id>
These operations are common in both tightly coupled
smp (symmetrical) systems and loosely coupled amp
(asymmetrical) systems.
A loosely coupled multiple processor system might not
have all of its cpus and memory in a single box.
Even a tightly coupled multiple processor system used
for 'non-stop' service can use these features.
This is the real power of what is now the udev system, not
just dynamic device naming.
It allows Linux to grow into both a single processor OS
and a general purpose, parallel processing OS.
It looks to me as if nearly all of the code is in-place, just
needs some changes in kernel initialization and some
general consolidation of existing kernel functions.
Mike
PS: These changes also give new reasons for including
my version of system sched.c into the kernel as an option
in 2.9.x
> well, devfs is 3v1l and deprecated, so you'd better not use it :)
>
> > > In my case, for kernels 2.6.8, 2.6.9, 2.6.10 and 2.6.11 -
> > > If I boot from slow devices (cdrom, zip disk) everything is
> > > fine, If I boot from fast devices (hard disk) - something isn't
> > > getting setup the same.
> >
> > Is there a difference in how the devices are used?
> > ie. serialized vs. multiple IOs in flight at the same time?
> >
> > I didn't start hitting problems until I launched a second job
> > on the same file system.
>
> Come to think of it, I wonder how the XFS bug you're pointing at can
> be related to the bug Joel and I reported with ext3:
> http://lists.parisc-linux.org/pipermail/parisc-linux/2005-February/025867.html
>
> Maybe one would like to try XFS on a non-SYM2 box and see what
> happens? That way we might have a clue whether sym2's at fault or
> something else...
>
> My 2 cents in the morning ;)
>
> Thibaut VARENE
> PA/Linux ESIEE Team
> http://www.pateam.org/
* Re: [parisc-linux] BUG 2.6.11-rc4-pa1 XFS data page fault
@ 2005-02-18 17:35 Joel Soete
0 siblings, 0 replies; 12+ messages in thread
From: Joel Soete @ 2005-02-18 17:35 UTC (permalink / raw)
To: Thibaut VARENE, Grant Grundler; +Cc: Michael S.Zick, parisc-linux
...
>
> Come to think of it, I wonder how the XFS bug you're pointing at can
> be related to the bug Joel and I reported with ext3:
> http://lists.parisc-linux.org/pipermail/parisc-linux/2005-February/025867.html
>
> Maybe one would like to try XFS on a non-SYM2 box and see what
> happens? That way we might have a clue whether sym2's at fault or
> something else...
>
In fact, because I have no other choice (the only spare disk I found is a
small narrow SE 4GB HP disk which I can connect to my b180 unstable Debian
install), I used the zalon ncr53c710 of my b180 and just built a 2GB xfs
on a slice of the disk with defaults:
# mkfs -t xfs /dev/sdd1
meta-data=/dev/sdd1              isize=256    agcount=8, agsize=64034 blks
         =                       sectsz=512
data     =                       bsize=4096   blocks=512272, imaxpct=25
         =                       sunit=0      swidth=0 blks, unwritten=1
naming   =version 2              bsize=4096
log      =internal log           bsize=4096   blocks=2560, version=1
         =                       sectsz=512   sunit=0 blks
realtime =none                   extsz=65536  blocks=0, rtextents=0
then cvs -z3 co -A linux-2.6 without any problem ;-)
I am convinced that this doesn't help a lot: that makes 3 big differences
from Grant's test:
- 32bit kernel,
- UP system (curious to see results in UP too on Grant's a500: just
deconfigure a cpu, is it possible on a500?)
- different driver (ncr53c710 in place of sym2)
- different driver (ncr53c710 in place of sym_2)
The running kernel was 2.6.11-rc4-pa1 (cvs dated 20050217) 32bit without
any other patch, no devfs support builtin (nor module).
Hth anyway,
Joel
PS: I still have a chroot disk on this system, but a fw one connected to
the Symbios controller using sym-2.1.18n. It was dedicated to Carlos' libc
tests, but I trust I could safely convert some extra ext3 fs to xfs (I mean
backup/restore :) ... on Monday :-(
* Re: [parisc-linux] BUG 2.6.11-rc4-pa1 XFS data page fault
[not found] ` <20050219022808.GA28925@colo.lackof.org>
@ 2005-02-19 10:17 ` Michael S. Zick
2005-02-21 16:58 ` Matthew Wilcox
0 siblings, 1 reply; 12+ messages in thread
From: Michael S. Zick @ 2005-02-19 10:17 UTC (permalink / raw)
To: parisc-linux
On Fri February 18 2005 20:28, Grant Grundler wrote:
> On Fri, Feb 18, 2005 at 10:28:26AM +0100, Thibaut VARENE wrote:
> > > The only difference in the .config was sym2 driver IOMAPPED.
> > > I had it enabled (using IO Port space) and now disable it
> > > since I really want MMIO space. I'm very skeptical this is
> > > related to the problem.
> >
> > Hmmm.
> > Back in the early days of 2.5, i recall issues with SYM2 in MMIO mode,
> > so that it only worked in PIO mode (I think i had posted something on
> > this m-l about that). At that time i was using ext3 for that matter.
> >
> > OTOH, XFS relies a lot on memory and cache quality. I don't know how
> > MMIO would interfere with that, but I suspect that if enabling MMIO
> > has the side effect of messing with that somehow, it's very likely XFS
> > won't like it at all. Just a random thought though.
>
> *sigh* Using IO Port space works. Either there is a bug in the Sym2
> driver (not posting writes when it needs to) or the lower cost of
> the MMIO writes is changing the timing enough to expose bugs
> in other subsystems.
>
A note,
There are recent messages on the powerPC list about the irq routing
and balancing changes in generic that cause the Sym2 driver to fail.
In my case, it is the tulip driver.
I have eliminated compiler, unip/smp, work loads, make -j levels;
and feel certain I have a solid 2.6.8 on this machine.
No problems with the IP network stack collapsing; produces good
executable of OpenOffice at "make -j3".
Next - binary search between 2.6.11 and 2.6.8 - find where things
went bad. If I find anything that might also affect pa-risc, I'll post.
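The binary search over kernel versions can be sketched as follows. This is an illustration in Python, not the procedure actually used: first_bad and is_bad are hypothetical names, the stand-in predicate marks an arbitrary version as the first bad one, and in practice each test means building, booting and running the workload. It also assumes that once a version is bad, every later one is bad too.

```python
def first_bad(versions, is_bad):
    """Return the earliest bad version.

    Assumes versions[0] is known good, versions[-1] is known bad, and
    that every version after the first bad one is also bad.
    """
    lo, hi = 0, len(versions) - 1   # lo is good, hi is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid
        else:
            lo = mid
    return versions[hi]

VERSIONS = ["2.6.8", "2.6.9", "2.6.10", "2.6.11"]
tested = []

def is_bad(version):
    """Stand-in for 'build, boot, run the workload, watch for failure'.
    The cut-off at 2.6.10 is arbitrary and purely for demonstration."""
    tested.append(version)
    return VERSIONS.index(version) >= VERSIONS.index("2.6.10")

culprit = first_bad(VERSIONS, is_bad)
print(culprit, "after testing", tested)
```

With four releases only two test boots are needed; the same loop works on a finer list such as the intermediate -rc or -bk snapshots.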
Note: The changes to irq routing and balancing might also be
behind the high latency in servicing the timer interrupt that Willy
found on some pa-risc machines.
I would suggest sticking with: 'noudev nodevfs pci=routeirq'
command line options until more is known.
(They should make system work like pre-2.6.9)
Mike
* Re: [parisc-linux] BUG 2.6.11-rc4-pa1 XFS data page fault
2005-02-16 6:55 [parisc-linux] BUG 2.6.11-rc4-pa1 XFS data page fault Grant Grundler
[not found] ` <4208D51500003293@mail-5-bnl.tiscali.it>
2005-02-17 4:06 ` Carlos O'Donell
@ 2005-02-19 22:27 ` Grant Grundler
2005-02-21 0:04 ` Grant Grundler
2 siblings, 1 reply; 12+ messages in thread
From: Grant Grundler @ 2005-02-19 22:27 UTC (permalink / raw)
To: parisc-linux
On Tue, Feb 15, 2005 at 11:55:39PM -0700, Grant Grundler wrote:
> Running 2.6.11-rc4-pa0-plus (equivalent to -pa1) on a500-65 (riot).
> xfs ran spew for 48h but panic'd when I did a "cvs up -A"
> to the CVS source tree on the same file system.
XFS behaves better with sym2+MMIO on j6000 running a 64-bit SMP kernel.
I was able to run spew several hours with no problems.
This is without the patch that Joel pointed out from the -mm tree.
This suggests there is a problem with Astro/Elroy programming
where "Sprockets" firmware (for Astro based workstations) does
something different than PAT PDC firmware on a500.
I'm investigating.
However, did see one "WARN_ON" when I rsync (/home to /home2):
root@gggj6k:/home# df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda3              1783984   1522420    170940  90% /
tmpfs                  4110708         0   4110708   0% /dev/shm
/dev/sda4             15439312  13086812   1568220  90% /home
/dev/sdb3             17019904   5155384  11864520  31% /mnt
/dev/sdb4             17773568  12486340   5287228  71% /home2
root@gggj6k:/home# lsscsi
[1:0:5:0] disk SEAGATE ST318203LC 0001 /dev/sda
[1:0:6:0] disk FUJITSU MAJ3364MC HP12 /dev/sdb
The rsync completed but one error showed up 4 times on the console
during the second half (about) of the rsync:
Filesystem "sdb4": XFS internal error xfs_btree_check_sblock at line 342 of file fs/xfs/xfs_btree.c. Caller 0x00000000102f1b84
Backtrace:
[<00000000101145e0>] dump_stack+0x18/0x28
[<00000000102eceac>] xfs_error_report+0x64/0x70
[<00000000102d7718>] xfs_btree_check_sblock+0x78/0xf8
[<00000000102f1b84>] xfs_inobt_lookup+0x144/0x3a0
[<00000000102f3514>] xfs_inobt_lookup_le+0x24/0x30
[<00000000102f08f0>] xfs_dialloc+0xe90/0x12e0
[<00000000102f6638>] xfs_ialloc+0x70/0x4e0
[<0000000010310a10>] xfs_dir_ialloc+0xb8/0x3e8
[<0000000010316d30>] xfs_create+0x2f0/0x810
[<0000000010326140>] linvfs_mknod+0x1e0/0x298
[<000000001032620c>] linvfs_create+0x14/0x20
[<00000000101d3c04>] vfs_create+0xbc/0xf8
[<00000000101d4a30>] open_namei+0xb58/0xc10
[<00000000101b5b14>] filp_open+0x44/0x98
[<00000000101b5f98>] sys_open+0x98/0x110
[<0000000010107fac>] syscall_exit+0x0/0x14
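The "Caller 0x..." address in the XFS error line can be matched against the backtrace mechanically. A small sketch in Python, with hypothetical helper names (parse_frames, resolve) and only three of the frames pasted in:

```python
import re

# Three frames copied from the backtrace above.
TRACE = """\
[<00000000102d7718>] xfs_btree_check_sblock+0x78/0xf8
[<00000000102f1b84>] xfs_inobt_lookup+0x144/0x3a0
[<00000000102f3514>] xfs_inobt_lookup_le+0x24/0x30
"""

FRAME = re.compile(r"\[<([0-9a-f]+)>\]\s+(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)")

def parse_frames(text):
    """Turn '[<addr>] symbol+0xoff/0xsize' lines into dictionaries."""
    frames = []
    for addr, symbol, offset, size in FRAME.findall(text):
        frames.append({
            "addr": int(addr, 16),
            "symbol": symbol,
            "offset": int(offset, 16),
            "size": int(size, 16),
        })
    return frames

def resolve(frames, addr):
    """Return 'symbol+0xoff' for the frame at exactly this address."""
    for frame in frames:
        if frame["addr"] == addr:
            return "%s+0x%x" % (frame["symbol"], frame["offset"])
    return None

CALLER = 0x00000000102F1B84  # from the "XFS internal error" line
print(resolve(parse_frames(TRACE), CALLER))
```

Here the caller address is the xfs_inobt_lookup frame, which is consistent with the order of the backtrace.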
Perhaps this is fixed in the patches; I haven't looked.
hth,
grant
* Re: [parisc-linux] BUG 2.6.11-rc4-pa1 XFS data page fault
2005-02-19 22:27 ` Grant Grundler
@ 2005-02-21 0:04 ` Grant Grundler
0 siblings, 0 replies; 12+ messages in thread
From: Grant Grundler @ 2005-02-21 0:04 UTC (permalink / raw)
To: parisc-linux
On Sat, Feb 19, 2005 at 03:27:23PM -0700, Grant Grundler wrote:
> This suggests there is a problem with Astro/Elroy programming
> where "Sprockets" firmware (for Astro based workstations) does
> something different than PAT PDC firmware on a500.
This was not the only issue, but the first one.
Two more bits in the IOC_CTRL register were getting
set on j6k (vs A500):
D4 Disable 4-byte coalescing. Apparently could
result in 12 byte transfers (4 byte-lanes not
enabled on the second 8 byte transaction)
that "some devices don't tolerate".
Documentation doesn't list these devices.
DD Disable Distributed LMMIO coalescing.
Disables coalescing for all PIO MMIO transactions
to addresses below distributed LMMIO ranges.
Directed LMMIO and other GMMIO ranges
may still coalesce (e.g. graphics buffers usually
get a directed range).
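Comparing the register between two machines and carrying the missing bits over can be sketched as below. Everything numeric here is an assumption: the bit positions given for D4 and DD are for illustration only and should be checked against the chipset documentation or the kernel's Astro IOMMU driver, and the two register readings are made up. The sketch also assumes the change is to set the same two bits on the A500, which is one reading of the message rather than something it states.

```python
# Assumed bit positions, for illustration only.
IOC_CTRL_FLAGS = {
    "D4": 1 << 11,  # disable 4-byte coalescing
    "DD": 1 << 13,  # disable distributed LMMIO coalescing
}

def flags_only_in(a, b, flags=IOC_CTRL_FLAGS):
    """Names of the flags that are set in register value a but clear in b."""
    return sorted(n for n, mask in flags.items() if a & mask and not b & mask)

def with_flags(value, names, flags=IOC_CTRL_FLAGS):
    """Return value with the named flags set."""
    for name in names:
        value |= flags[name]
    return value

# Made-up register readings, not values taken from either machine.
j6k_ioc_ctrl = 0x2801
a500_ioc_ctrl = 0x0001

extra = flags_only_in(j6k_ioc_ctrl, a500_ioc_ctrl)
print(extra)
print(hex(with_flags(a500_ioc_ctrl, extra)))
```

Keeping the flags in a named table makes it easy to print which bits differ between firmware setups before deciding what the driver should force.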
Spew ran for an hour and managed to push 80+ GB
of data to 6 disks. I'll commit this change shortly.
Then I got ambitious and ran "make clean" and XFS panic'd.
I expect this is a different problem.
Same one I've seen before:
[ no backtrace :^( ]
Kernel Fault: Code=26 regs=000000006cd40c10 (Addr=00000000000000c8)
YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
PSW: 00001000000011001111111100001111 Not tainted
r00-03 0000000000000000 000000001026d1d0 000000000028afa8 0000000000000000
r04-07 0000000000243000 0000000000000000 0000000000000000 00000000fff00800
r08-11 00000000156b60b8 0000000000000000 0000000000000008 0000000000000000
r12-15 0000000000000008 0000000000000000 0000000000000000 000000006cd40768
r16-19 0008000000000001 0000000000000000 0000000000000022 0000000000000000
r20-23 0000000000000000 0000000000000000 00000000fdfafbf8 0000000000000000
r24-27 0000000000000000 0000000000000000 00000000fdfafb38 00000000105424c0
r28-31 00000000fdfafb38 000000006cd40c70 000000006cd40c10 00000000a0000000
sr0-3 000000000044d000 0000000000000000 0000000000000000 000000000044f000
sr4-7 0000000000000000 0000000000000000 0000000000000000 0000000000000000
IASQ: 0000000000000000 0000000000000000 IAOQ: 000000000028afc0 000000000028afdc
Press Q/q to quit, Enter to continue:
IIR: 50d30190 ISR: 0000000000000000 IOR: 00000000000000c8
CPU: 0 CR30: 000000006cd40000 CR31: 000000001050c000
ORIG_R28: 00000000fda4fa80
IAOQ[0]: xfs_btree_init_cursor+0x70/0x1b8 [xfs]
IAOQ[1]: xfs_btree_init_cursor+0x8c/0x1b8 [xfs]
RP(r2): xfs_btree_init_cursor+0x58/0x1b8 [xfs]
Kernel panic - not syncing: Kernel Fault
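The "Kernel Fault:" line alone already separates this panic from the first one in the thread. The sketch below, in Python, parses that line and flags addresses that fall inside the first page, which usually means a small offset from a NULL pointer. The trap names for codes 15 and 26 follow the PA-RISC interruption numbering as recalled, and the 4 KB page size is an assumption; parse_fault and describe are hypothetical helper names.

```python
import re

# Names for the two interruption numbers seen in this thread (assumed
# from the PA-RISC interruption list; other codes are not covered).
TRAP_NAMES = {
    15: "data TLB miss / data page fault",
    26: "data memory access rights trap",
}
PAGE_SIZE = 4096  # assumed kernel page size

FAULT = re.compile(r"Code=(\d+) regs=([0-9a-f]+) \(Addr=([0-9a-f]+)\)")

def parse_fault(line):
    """Extract the trap code, pt_regs pointer and fault address."""
    m = FAULT.search(line)
    return {
        "code": int(m.group(1)),
        "regs": int(m.group(2), 16),
        "addr": int(m.group(3), 16),
    }

def describe(fault):
    """Return (trap name, True if the address is within the first page)."""
    name = TRAP_NAMES.get(fault["code"], "unknown")
    return name, fault["addr"] < PAGE_SIZE

first = parse_fault(
    "Kernel Fault: Code=15 regs=0000000032860d20 (Addr=00000027452d93e1)")
second = parse_fault(
    "Kernel Fault: Code=26 regs=000000006cd40c10 (Addr=00000000000000c8)")

for fault in (first, second):
    print(fault["code"], hex(fault["addr"]), describe(fault))
```

The first fault is at a wild, high address, while this one is at offset 0xc8 (200) from zero, which fits a structure member being read through a NULL pointer.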
hth,
grant
* Re: [parisc-linux] BUG 2.6.11-rc4-pa1 XFS data page fault
2005-02-19 10:17 ` Michael S. Zick
@ 2005-02-21 16:58 ` Matthew Wilcox
0 siblings, 0 replies; 12+ messages in thread
From: Matthew Wilcox @ 2005-02-21 16:58 UTC (permalink / raw)
To: Michael S. Zick; +Cc: parisc-linux
On Sat, Feb 19, 2005 at 04:17:59AM -0600, Michael S. Zick wrote:
> I would suggest sticking with: 'noudev nodevfs pci=routeirq'
> command line options until more is known.
> (They should make system work like pre-2.6.9)
pci=routeirq only has effect on i386 and ia64 boxes.
--
"Next the statesmen will invent cheap lies, putting the blame upon
the nation that is attacked, and every man will be glad of those
conscience-soothing falsities, and will diligently study them, and refuse
to examine any refutations of them; and thus he will by and by convince
himself that the war is just, and will thank God for the better sleep
he enjoys after this process of grotesque self-deception." -- Mark Twain