* Bad page state in process 'nfsd' with xfs
@ 2006-04-28 20:22 David Greaves
2006-04-30 2:06 ` Nick Piggin
2006-04-30 22:04 ` Nathan Scott
0 siblings, 2 replies; 10+ messages in thread
From: David Greaves @ 2006-04-28 20:22 UTC (permalink / raw)
To: 'linux-kernel@vger.kernel.org', linux-xfs
This was with 2.6.16.9
There's an nfs export from an xfs on an lvm on a raid5 on some
libata/sata disks.
(cc'ing xfs since I recall rumoured(?) badness in old nfs/xfs/md/lvm
setups and xfs_sendfile is mentioned)
dmesg had:
Bad page state in process 'nfsd'
page:b1602060 flags:0x80000008 mapping:00000000 mapcount:0 count:16777216
Trying to fix it up, but a reboot is needed
Backtrace:
[<b013bda2>] bad_page+0x62/0x90
[<b013c1c8>] prep_new_page+0x78/0x80
[<b013c6b6>] buffered_rmqueue+0xf6/0x1f0
[<b013c8e2>] get_page_from_freelist+0x92/0xb0
[<b013c956>] __alloc_pages+0x56/0x300
[<b013f00c>] __do_page_cache_readahead+0xdc/0x120
[<b013f1b9>] blockable_page_cache_readahead+0x59/0xd0
[<b013f2aa>] make_ahead_window+0x7a/0xb0
[<b013f39f>] page_cache_readahead+0xbf/0x1b0
[<b0138b91>] do_generic_mapping_read+0x4b1/0x4c0
[<b01390e2>] generic_file_sendfile+0x62/0x70
[<f1097080>] nfsd_read_actor+0x0/0xd0 [nfsd]
[<b021bab0>] xfs_sendfile+0xc0/0x190
[<f1097080>] nfsd_read_actor+0x0/0xd0 [nfsd]
[<b0217fe8>] linvfs_open+0x48/0x50
[<b0217f97>] linvfs_sendfile+0x57/0x60
[<f1097080>] nfsd_read_actor+0x0/0xd0 [nfsd]
[<f109734f>] nfsd_vfs_read+0x1ff/0x370 [nfsd]
[<f1097080>] nfsd_read_actor+0x0/0xd0 [nfsd]
[<f1097933>] nfsd_read+0x103/0x120 [nfsd]
[<f109e234>] nfsd3_proc_read+0xe4/0x170 [nfsd]
[<f1093649>] nfsd_dispatch+0xd9/0x210 [nfsd]
[<f10e4792>] svc_process+0x482/0x670 [sunrpc]
[<f10933fc>] nfsd+0x18c/0x300 [nfsd]
[<f1093270>] nfsd+0x0/0x300 [nfsd]
[<b0101391>] kernel_thread_helper+0x5/0x14
more info on request but I have rebooted as suggested.
David
* Re: Bad page state in process 'nfsd' with xfs
2006-04-28 20:22 Bad page state in process 'nfsd' with xfs David Greaves
@ 2006-04-30 2:06 ` Nick Piggin
2006-04-30 22:04 ` Nathan Scott
1 sibling, 0 replies; 10+ messages in thread
From: Nick Piggin @ 2006-04-30 2:06 UTC (permalink / raw)
To: David Greaves; +Cc: 'linux-kernel@vger.kernel.org', linux-xfs
David Greaves wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> This was with 2.6.16.9
>
> There's an nfs export from an xfs on an lvm on a raid5 on some
> libata/sata disks.
> (cc'ing xfs since I recall rumoured(?) badness in old nfs/xfs/md/lvm
> setups and xfs_sendfile is mentioned)
>
> dmesg had:
>
> Bad page state in process 'nfsd'
> page:b1602060 flags:0x80000008 mapping:00000000 mapcount:0 count:16777216
> Trying to fix it up, but a reboot is needed
> Backtrace:
> [<b013bda2>] bad_page+0x62/0x90
> [<b013c1c8>] prep_new_page+0x78/0x80
Looks like you have a bit flipped in 'count', which was not flipped
when the page was last freed. Probably buggy RAM.
Running memtest overnight might confirm that.
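The single-bit claim can be checked with a short sketch. It assumes a freed page is expected to carry a count of zero; the helper name `flipped_bits` is made up for illustration and is not a kernel interface.

```python
def flipped_bits(expected, observed):
    """Return zero-based positions of the bits that differ between two values."""
    diff = expected ^ observed
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]

# 'count' as printed in the dmesg line; 0 is the assumed value for a free page.
observed_count = 16777216
print(flipped_bits(0, observed_count))  # [24]
```

Exactly one differing position is what a lone flipped bit would produce, although the arithmetic alone cannot say whether RAM or software caused it.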
--
SUSE Labs, Novell Inc.
* Re: Bad page state in process 'nfsd' with xfs
@ 2006-04-30 15:19 Zoltan Boszormenyi
2006-05-01 0:24 ` Nathan Scott
0 siblings, 1 reply; 10+ messages in thread
From: Zoltan Boszormenyi @ 2006-04-30 15:19 UTC (permalink / raw)
To: linux-kernel; +Cc: Nick Piggin
Hi.
> David Greaves wrote:
> > -----BEGIN PGP SIGNED MESSAGE-----
> > Hash: SHA1
> >
> > This was with 2.6.16.9
> >
> > There's an nfs export from an xfs on an lvm on a raid5 on some
> > libata/sata disks.
> > (cc'ing xfs since I recall rumoured(?) badness in old nfs/xfs/md/lvm
> > setups and xfs_sendfile is mentioned)
> >
> > dmesg had:
> >
> > Bad page state in process 'nfsd'
> > page:b1602060 flags:0x80000008 mapping:00000000 mapcount:0 count:16777216
> > Trying to fix it up, but a reboot is needed
> > Backtrace:
> > [<b013bda2>] bad_page+0x62/0x90
> > [<b013c1c8>] prep_new_page+0x78/0x80
>
> Looks like you have a bit flipped in 'count', which was not flipped
> when the page was last freed. Probably buggy RAM.
>
> Running memtest overnight might confirm that.
Or not. I had an FC3/x86-64 system until two days ago, now I have FC5/x86-64.
When FC3 was installed I chose to format the partitions to XFS and since then
I had Oopses regularly with or without VMWare modules.
I ran memtest86+ for 12+ hours and it indicated two separate single-bit
errors in the topmost 64MB of my 1GB. Since then I have been running with
mem=960M, but I still got Oopses under somewhat heavier disk loads, and every
time XFS was involved.
I backed up my /home with rsync to a new hard disk in single-user mode;
the new disk was formatted as EXT3. During the backup I had Oopses
about 5 or 6 times and had to reboot. Rsync was able to continue,
which is why I chose it for the backup...
I installed FC5 using only EXT3 partitions and copied my 80+ GB data
back to /home. Guess what? No Oopses...
Best regards,
Zoltán Böszörményi
* Re: Bad page state in process 'nfsd' with xfs
2006-04-28 20:22 Bad page state in process 'nfsd' with xfs David Greaves
2006-04-30 2:06 ` Nick Piggin
@ 2006-04-30 22:04 ` Nathan Scott
2006-05-01 9:41 ` David Greaves
1 sibling, 1 reply; 10+ messages in thread
From: Nathan Scott @ 2006-04-30 22:04 UTC (permalink / raw)
To: David Greaves; +Cc: 'linux-kernel@vger.kernel.org', linux-xfs
Hi there,
On Fri, Apr 28, 2006 at 09:22:23PM +0100, David Greaves wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> This was with 2.6.16.9
>
> There's an nfs export from an xfs on an lvm on a raid5 on some
> libata/sata disks.
> (cc'ing xfs since I recall rumoured(?) badness in old nfs/xfs/md/lvm
> setups and xfs_sendfile is mentioned)
With really old (early 2.6) kernels, or previously with 4K stacks, this
used to be a problem, but it should not be with your kernel version.
> Bad page state in process 'nfsd'
> page:b1602060 flags:0x80000008 mapping:00000000 mapcount:0 count:16777216
> Trying to fix it up, but a reboot is needed
> Backtrace:
> [<b013bda2>] bad_page+0x62/0x90
> [<b013c1c8>] prep_new_page+0x78/0x80
> [<b013c6b6>] buffered_rmqueue+0xf6/0x1f0
> [<b013c8e2>] get_page_from_freelist+0x92/0xb0
Hmm... so, your page flags field there (0x80000008) has the 32nd and
4th bits set - 4 is pageuptodate, which is fine, but 32 seems odd
(perhaps some arch-specific bit? or a single bit error...).
But, the warning is triggered by the page count (16777216 above), and
that is 0x1000000 -- which is a huge, improbable count; that looks to
me like it could very well be the result of a single bit error too.
You may have a hardware problem - try running memtest I guess.
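As a rough check of the arithmetic above, the following sketch lists the set bits of the reported flags value, counting both from zero and from one. The helper name `set_bits` is hypothetical, and the code reads no meaning into any bit.

```python
def set_bits(value):
    """Return the zero-based positions of all set bits in value."""
    return [i for i in range(value.bit_length()) if (value >> i) & 1]

flags = 0x80000008  # 'flags' as printed in the dmesg line

zero_based = set_bits(flags)
one_based = [i + 1 for i in zero_based]
print(zero_based)  # [3, 31]
print(one_based)   # [4, 32]
```

Both positions fit in a 32-bit word, so the high bit is the topmost bit of the field on a 32-bit kernel.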
cheers.
--
Nathan
* Re: Bad page state in process 'nfsd' with xfs
2006-04-30 15:19 Zoltan Boszormenyi
@ 2006-05-01 0:24 ` Nathan Scott
2006-05-01 8:07 ` Zoltan Boszormenyi
0 siblings, 1 reply; 10+ messages in thread
From: Nathan Scott @ 2006-05-01 0:24 UTC (permalink / raw)
To: Zoltan Boszormenyi; +Cc: linux-kernel, Nick Piggin
On Sun, Apr 30, 2006 at 05:19:56PM +0200, Zoltan Boszormenyi wrote:
> ...
> Or not. I had an FC3/x86-64 system until two days ago, now I have FC5/x86-64.
>
> When FC3 was installed I chose to format the partitions to XFS and since then
> I had Oopses regularly with or without VMWare modules.
What was the stack trace for your oops...?
cheers.
--
Nathan
* Re: Bad page state in process 'nfsd' with xfs
2006-05-01 0:24 ` Nathan Scott
@ 2006-05-01 8:07 ` Zoltan Boszormenyi
2006-05-01 21:33 ` Nathan Scott
0 siblings, 1 reply; 10+ messages in thread
From: Zoltan Boszormenyi @ 2006-05-01 8:07 UTC (permalink / raw)
To: Nathan Scott; +Cc: linux-kernel, Nick Piggin
Hi,
Nathan Scott wrote:
> On Sun, Apr 30, 2006 at 05:19:56PM +0200, Zoltan Boszormenyi wrote:
>
>> ...
>> Or not. I had an FC3/x86-64 system until two days ago, now I have FC5/x86-64.
>>
>> When FC3 was installed I chose to format the partitions to XFS and since then
>> I had Oopses regularly with or without VMWare modules.
>>
>
> What was the stack trace for your oops...?
>
> cheers.
>
I reported some Oopses for earlier kernels, they are here:
http://marc.theaimsgroup.com/?t=113649735300003&r=1&w=2
http://marc.theaimsgroup.com/?l=linux-kernel&m=113166035904096&w=2
http://marc.theaimsgroup.com/?l=fedora-list&m=113611408900505&w=2
With FC3, the last kernel I used was vanilla 2.6.15.
It may be that those above were fixed since.
Best regards,
Zoltán Böszörményi
* Re: Bad page state in process 'nfsd' with xfs
2006-04-30 22:04 ` Nathan Scott
@ 2006-05-01 9:41 ` David Greaves
2006-05-01 15:21 ` Chris Wedgwood
0 siblings, 1 reply; 10+ messages in thread
From: David Greaves @ 2006-05-01 9:41 UTC (permalink / raw)
To: Nathan Scott
Cc: 'linux-kernel@vger.kernel.org', linux-xfs, nickpiggin
Nathan Scott wrote:
> Hi there,
>
> On Fri, Apr 28, 2006 at 09:22:23PM +0100, David Greaves wrote:
>
> But, the warning is triggered by the page count (16777216 above), and
> that is 0x1000000 -- which is a huge, improbable count; that looks to
> me like it could very well be the result of a single bit error too.
>
> You may have a hardware problem - try running memtest I guess.
Thanks guys
It's in use a lot so I'll schedule some downtime, blow out the dust
and run memtest (though I've done that before and it has been clean).
I'll let you know how it goes...
David
* Re: Bad page state in process 'nfsd' with xfs
2006-05-01 9:41 ` David Greaves
@ 2006-05-01 15:21 ` Chris Wedgwood
0 siblings, 0 replies; 10+ messages in thread
From: Chris Wedgwood @ 2006-05-01 15:21 UTC (permalink / raw)
To: David Greaves
Cc: Nathan Scott, 'linux-kernel@vger.kernel.org', linux-xfs,
nickpiggin
On Mon, May 01, 2006 at 10:41:59AM +0100, David Greaves wrote:
> It's in use a lot so I'll schedule some downtime, blow out the dust
> and run memtest (though I've done that before and it has been
> clean).
memtest doesn't always find bad memory sadly
finding bad memory is hard, and sometimes it's exacerbated by
complicated factors (heat from drives for example)
i wish ecc memory was standard
* Re: Bad page state in process 'nfsd' with xfs
2006-05-01 8:07 ` Zoltan Boszormenyi
@ 2006-05-01 21:33 ` Nathan Scott
2006-05-01 22:08 ` Zoltan Boszormenyi
0 siblings, 1 reply; 10+ messages in thread
From: Nathan Scott @ 2006-05-01 21:33 UTC (permalink / raw)
To: Zoltan Boszormenyi; +Cc: linux-kernel, Nick Piggin
On Mon, May 01, 2006 at 10:07:44AM +0200, Zoltan Boszormenyi wrote:
> Hi,
>
> Nathan Scott wrote:
> > On Sun, Apr 30, 2006 at 05:19:56PM +0200, Zoltan Boszormenyi wrote:
> >
> >> ...
> >> Or not. I had an FC3/x86-64 system until two days ago, now I have FC5/x86-64.
> >>
> >> When FC3 was installed I chose to format the partitions to XFS and since then
> >> I had Oopses regularly with or without VMWare modules.
> >>
> >
> > What was the stack trace for your oops...?
> >
> > cheers.
> >
>
> I reported some Oopses for earlier kernels, they are here:
These aren't oopses. They do look similar, but slightly
different to the other report - your page count there is
off with the pixies, but it's not as clear that it's a single
bit error - yours are more like 0xfffe0000. Quite strange.
You also have the odd high-32-bits-mirrors-low-32-bits in
page flags, both with one bit set.
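A small sketch of why a count like 0xfffe0000 does not look like a single flipped bit, assuming the count is a 32-bit value. The flags values from the linked reports are not reproduced here, so the 64-bit flags value below is purely hypothetical, as are the helper names.

```python
def as_signed32(value):
    """Interpret the low 32 bits of value as a two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value

def halves_mirror(flags64):
    """True if the high 32 bits of a 64-bit value equal its low 32 bits."""
    return (flags64 >> 32) == (flags64 & 0xFFFFFFFF)

count = 0xfffe0000                 # approximate count quoted above
print(bin(count).count("1"))       # 15 bits set, so not one flipped bit
print(as_signed32(count))          # -131072

hypothetical_flags = 0x0000000800000008  # made-up example of a mirrored value
print(halves_mirror(hypothetical_flags))  # True
```

Read as a signed number the count is a negative power of two, which could point at something other than a lone bit flip, though this sketch cannot identify the cause.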
Not sure XFS can be causing this (we don't touch page count
for regular file pages, and only touch PageUptodate in flags
IIRC, like most/all filesystems).
Were you also using NFS, as in the other report?
cheers.
--
Nathan
* Re: Bad page state in process 'nfsd' with xfs
2006-05-01 21:33 ` Nathan Scott
@ 2006-05-01 22:08 ` Zoltan Boszormenyi
0 siblings, 0 replies; 10+ messages in thread
From: Zoltan Boszormenyi @ 2006-05-01 22:08 UTC (permalink / raw)
To: Nathan Scott; +Cc: linux-kernel, Nick Piggin
Hi,
Nathan Scott wrote:
> On Mon, May 01, 2006 at 10:07:44AM +0200, Zoltan Boszormenyi wrote:
>
>> Hi,
>>
>> Nathan Scott wrote:
>>
>>> On Sun, Apr 30, 2006 at 05:19:56PM +0200, Zoltan Boszormenyi wrote:
>>>
>>>
>>>> ...
>>>> Or not. I had an FC3/x86-64 system until two days ago, now I have FC5/x86-64.
>>>>
>>>> When FC3 was installed I chose to format the partitions to XFS and since then
>>>> I had Oopses regularly with or without VMWare modules.
>>>>
>>>>
>>> What was the stack trace for your oops...?
>>>
>>> cheers.
>>>
>>>
>> I reported some Oopses for earlier kernels, they are here:
>>
>
> These aren't oopses. They do look similar, but slightly
> different to the other report - your page count there is
> off with the pixies, but it's not as clear that it's a single
> bit error - yours are more like 0xfffe0000. Quite strange.
> You also have the odd high-32-bits-mirrors-low-32-bits in
> page flags, both with one bit set.
>
> Not sure XFS can be causing this (we don't touch page count
> for regular file pages, and only touch PageUptodate in flags
> IIRC, like most/all filesystems).
>
> Were you also using NFS, as in the other report?
>
> cheers.
>
>
No, it's just a standalone machine.
Best regards,
Zoltán Böszörményi