* OOM on quotacheck (again?)
@ 2012-09-19 14:12 blafoo
2012-09-19 20:59 ` Dave Chinner
0 siblings, 1 reply; 10+ messages in thread
From: blafoo @ 2012-09-19 14:12 UTC (permalink / raw)
To: xfs
Hi all,
For the last couple of days I've been trying to compile a new kernel for
our webserver platform, which is based on Debian Squeeze.
Hardware: a mix of Dell PE2850, 2950, R710
- raid-10 with 4 disks (old setup, PE2850)
- raid-1 system, raid-10 content (current setup)
- currently running linux-2.6.37 custom built, vmalloc set to default
(128MB)
All systems have an XFS filesystem as their content partition and have
group quota enabled (no other XFS settings active). The content
partition varies in size between 250GB and 1TB and contains
between 3 and 10 million files.
Every time I try to mount the XFS filesystem and a quota-check is
needed, the server runs out of memory (OOM). I can easily reproduce this
by rebooting the server, resetting the quota flags with
xfs_db -x -c 'sb 0' -c 'write qflags 0'
and rerunning the quota-check.
This is true for various kernels, but not all. What I've tried so far:
2.6.37.x - fails with OOM
2.6.39.4 - surprisingly works (see below why)
3.2.29 - fails with OOM
3.4.10 - fails with OOM
3.6.0rc5 - fails with a vmalloc error (XFS (sda7): xfs_buf_get_map: failed
to map pages); with vmalloc=256 the system hangs indefinitely on mount.
Some more info from my test system is available here:
http://pastebin.com/2DkDyH4R
I found a couple of references regarding this problem but no final
solution so far.
Please correct the following if I misunderstood anything:
1. There was an OOM problem with quota-checks, fixed in 2.6.39.4,
which is mentioned here:
a) http://permalink.gmane.org/gmane.comp.file-systems.xfs.general/43565
and fixed here:
b) http://patchwork.xfs.org/patch/3337/
That is why 2.6.39.4 works for me.
2. That fix was later replaced (not extended) with a nicer patch which
is mentioned/published here:
c) http://oss.sgi.com/archives/xfs/2011-03/msg00240.html
I checked all the kernel versions above for the patch mentioned in 2. and
can confirm its presence in each kernel tree. Still, our servers fail to
complete the quota-check successfully.
Am I missing something here?
PS: As a side note: we've been running XFS for years without any
problems. But after we activated the gquota feature, we've been having
problems in a couple of places. One is the OOM on quota-check; another
is XFS errors on high-I/O volumes with gquota enabled. But since the
high-I/O problem might be connected to the OOM problem, we'll try
to fix the latter first :-)
best regards
Volker
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: OOM on quotacheck (again?)
2012-09-19 14:12 OOM on quotacheck (again?) blafoo
@ 2012-09-19 20:59 ` Dave Chinner
2012-09-20 9:32 ` Volker
0 siblings, 1 reply; 10+ messages in thread
From: Dave Chinner @ 2012-09-19 20:59 UTC (permalink / raw)
To: blafoo; +Cc: xfs
On Wed, Sep 19, 2012 at 04:12:04PM +0200, blafoo wrote:
> Hi all,
>
> for the last couple of days i've been trying to compile a new kernel for
> our webserver-platform which is based on debian-squeeze.
>
> Hardware: a mix of Dell PE2850, 2950, R710
> - raid-10 with 4 disks (old setup, PE2850)
> - raid-1 system, raid-10 content (current setup)
> - currently running linux-2.6.37 custom built, vmalloc set to default
> (128MB)
Which implies you are running a 32 bit kernel even on 64 bit CPUs
(e.g. R710).
>
> All systems have an xfs-filesystem as their content-partition and have
> group-quota enabled (no other xfs-settings active). the
> content-partition varies in size between 250GB and 1TB and contains
> between 3 and 10 million files.
>
> Every time i try to mount the xfs-file-system and a quota-check is
> needed, the server goes out of memory (oom). I can easily reproduce this
> by rebooting the server, resetting the quota-flags with
No surprise if you are running an i686 kernel (32 bit). You've got
way more inodes than can fit in the kernel memory segment.
> xfs_db -x -c 'sb 0' -c 'write qflags 0'
>
> and rerun the quota-check.
>
> This is true for various kernels but not all. What i've tried so far:
>
> 2.6.37.x - fails with OOM
> 2.6.39.4 - surprisingly works (see below why)
> 3.2.29 - fails with OOM
> 3.4.10 - fails with OOM
8a00ebe xfs: Ensure inode reclaim can run during quotacheck
$ git describe --contains 8a00ebe
v3.5-rc1~91^2~54
So the OOM problem was fixed in 3.5.
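For anyone following along, here is a self-contained sketch of what `git describe --contains` does. The throwaway repository, commit messages, and tag name below are invented purely for illustration; against a real kernel clone you would pass the actual commit id (e.g. 8a00ebe) and get the real release tag back.

```shell
# Build a tiny stand-in for a kernel tree: one "fix" commit, one later
# commit, and a release tag that contains both.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "xfs: the fix we care about"
fix=$(git rev-parse HEAD)
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "later work"
git tag v3.5-rc1

# Name the fix relative to the first tag that contains it.
git describe --contains "$fix"
```

If the fix commit is an ancestor of the tag, the output names that tag (here `v3.5-rc1~1`), which is how you can confirm whether a given fix landed in the kernel you are running.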
> 3.6.0rc5 - fails with a vmalloc error (XFS (sda7): xfs_buf_get_map: failed
> to map pages); with vmalloc=256 the system hangs indefinitely on mount.
Running on an x86-64 kernel will make the vmalloc problem go away.
There's very little we can do about the limited vmalloc address
space on i686 kernels. As it is, the known recent regression in this
space:
bcf62ab xfs: Fix overallocation in xfs_buf_allocate_memory()
$ git describe --contains bcf62ab
v3.6-rc1~42^2~35
was fixed in 3.6-rc1, so I'm not really sure why you'd be
running out of vmalloc space, as there shouldn't be any metadata that
is vmalloc'd in your given filesystem configuration...
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: OOM on quotacheck (again?)
2012-09-19 20:59 ` Dave Chinner
@ 2012-09-20 9:32 ` Volker
2012-09-24 13:21 ` Dave Chinner
0 siblings, 1 reply; 10+ messages in thread
From: Volker @ 2012-09-20 9:32 UTC (permalink / raw)
Cc: xfs
Hi,
> Which implies you are running a 32 bit kernel even on 64 bit CPUs
> (e.g. R710).
My mistake. That is not yet the case, but it is the plan for the future.
Thanks for pointing that out.
> No surprise if you are running an i686 kernel (32 bit). You've got
> way more inodes than can fit in the kernel memory segment.
Could you elaborate slightly on that, or give me a link or two that
explain the matter?
If a 32-bit kernel is not supposed to work because of the number of
inodes, why does the 2.6.39.4 kernel work flawlessly on quota-checks on
the same filesystem that a 32-bit 3.6.0-rc5 (which is supposed to work) fails on?
Doesn't that imply that the fix submitted for 2.6.39.4 fixed a problem
which was "reinvented" by the later patch, and is now being worked
around by using a 64-bit kernel for more memory?
> Running on a x86-64 kernel will make the vmalloc problem go away.
> There's very little we can do about the limited vmalloc address
> space on i686 kernels. As it is, the known recent regression in this
> space:
>
> bcf62ab xfs: Fix overallocation in xfs_buf_allocate_memory()
>
> $ git describe --contains bcf62ab
> v3.6-rc1~42^2~35
>
> was fixed in 3.6-rc1,
Confirmed. The current 3.6.0-rc5 in 64-bit completes the quota-check.
I'll do some more testing with xfs_fsr etc. and report back.
best regards
volker
* Re: OOM on quotacheck (again?)
2012-09-20 9:32 ` Volker
@ 2012-09-24 13:21 ` Dave Chinner
2012-09-24 14:47 ` Volker
0 siblings, 1 reply; 10+ messages in thread
From: Dave Chinner @ 2012-09-24 13:21 UTC (permalink / raw)
To: Volker; +Cc: xfs
On Thu, Sep 20, 2012 at 11:32:17AM +0200, Volker wrote:
> Hi,
>
> > Which implies you are running a 32 bit kernel even on 64 bit CPUs
> > (e.g. R710).
>
> My mistake. That is not yet the case, but the plan for the future.
> Thanks for pointing that out.
>
> > No surprise if you are running an i686 kernel (32 bit). You've got
> > way more inodes than can fit in the kernel memory segment.
>
> Could you slightly elaborate on that or give me a link or two which
> explain the matter?
The kernel segment is limited to 960MB of RAM on ia32 machines
unless you build with special config options that allow for up to
3GB of kernel memory. The trade-off is that you've only got 4GB of
address space per process, so by default you have 3GB of RAM
for each process (i.e. a 960MB/3GB kernel/user split). If you change
that to a 3GB/1GB split, you'll have problems with applications
that are memory hogs running out of memory.
As to the memory used by the inode cache, inodes tend to use between
1-1.5KB of RAM each. Hence for a 960MB kernel segment, you *might* be
able to cache 500,000 inodes if you don't cache anything else.
Typically it will be 25-30% of that number (200-300MB of RAM used for
caching inodes during filesystem traversal). Seeing as you have
millions of inodes, that's way more than you can cache in the
available kernel memory...
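Spelled out as back-of-envelope shell arithmetic (a rough sketch: the 2KB-per-inode figure simply rounds up the 1-1.5KB estimate above, it is not a hard limit):

```shell
# How many inodes could fit in the default i686 kernel segment?
kernel_segment_kb=$((960 * 1024))   # ~960MB lowmem on the default split
per_inode_kb=2                      # 1-1.5KB per cached inode, rounded up
max_cached_inodes=$((kernel_segment_kb / per_inode_kb))
echo "$max_cached_inodes"
```

That ceiling is roughly the "500,000 inodes" figure, an order of magnitude below the 3-10 million inodes on these filesystems, which is why the quotacheck exhausts kernel memory.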
> If a 32bit kernel is not supposed to work because of the number of
> inodes, why does the 2.6.39.4-kernel work flawlessly on quota-checks on
> the same filesystem a 3.6.0-rc5 32bit (which is supposed to work) fails on?
Because inode reclaim on 2.6.39 is running during the quotacheck.
> Doesn't that imply, that the fix submitted for 2.6.39.4 fixed a problem
> which was "reinvented" by the later patch, which is now being worked
> around by using a 64bit kernel for more memory?
It's called a regression, and we do try to avoid them. However, ia32
gets relatively little attention due to its limitations, and hence
changes that work fine on x86-64 but cause regressions on ia32 may
go unnoticed for some time because relatively few people run
ia32 servers anymore.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: OOM on quotacheck (again?)
2012-09-24 13:21 ` Dave Chinner
@ 2012-09-24 14:47 ` Volker
2012-10-02 16:29 ` Volker
0 siblings, 1 reply; 10+ messages in thread
From: Volker @ 2012-09-24 14:47 UTC (permalink / raw)
Cc: xfs
Hi,
>> Could you slightly elaborate on that or give me a link or two which
>> explain the matter?
>
> The kernel segment is limited to 960MB of RAM on ia32 bit machines
> unless you build with special config options that allow for up to
> 3GB of kernel memory. The trade off is that you've only got 4GB of
> RAM in the process address space, so by default you have 3GB of RAM
> for each process (i.e. 960MB/3GB kernel/user split). If you change
> that to a 3GB/1GB split, you'll have problems with applications
> that are memory hogs running out of memory.
Interesting!
> As to the memory used by the inode cache, inodes tend to use between
> 1-1.5k of RAM each. Hence for a 960MB kernel segment, you *might* be
> able to cache 500,000 inodes if you don't cache anything else.
> Typically it will be 25-30% of that number (200-300MB of RAM in
> caching inodes during filesystem traversal). Seeing as you have
> millions of inodes, that's way more than you can cache in available
> kernel memory...
Great! That answered all my questions! Thanks a lot!
3.6.0-rc6-x64 is currently running fine on 6 machines.
-volker
* Re: OOM on quotacheck (again?)
2012-09-24 14:47 ` Volker
@ 2012-10-02 16:29 ` Volker
2012-10-02 20:09 ` Dave Chinner
0 siblings, 1 reply; 10+ messages in thread
From: Volker @ 2012-10-02 16:29 UTC (permalink / raw)
To: xfs
Hi again,
> Great! That answered all my questions! Thanks a lot!
>
> 3.6.0-rc6-x64 is currently running fine on 6 machines.
just as a follow up i would like to share some info.
The six machines mentioned above are still running fine. So are a few
more we tested with the new kernel. All of the servers tested so far
were rebooted immediately after the new 3.6 kernel was installed.
Because of that, we decided to roll out the new kernel to all our
servers (approximately 330) and let it "sink in" over the next
few days as the machines get rebooted.
This morning we experienced some problems with the superblock being
corrupted on 6 machines that had been rebooted during the night. For all
of them, the following was true:
a) the server was still running the old buggy 2.6.37 and had
filesystem troubles on heavy I/O (that was our problem to begin with,
besides the OOM)
b) because of the filesystem troubles, the server had been rebooted by
our hardware support team (sadly not necessarily using SysRq)
because the XFS partition was unresponsive
c) after being rebooted into the new 3.6 kernel, the server complained
about the superblock of the XFS partition being corrupted and was not
able to mount the partition
d) by running xfs_repair -L -P <device> we were able to fix the problem
e) trying to remount the fixed partition triggered a quota-check which
always ended in a stack trace; after a reboot, the quota-check was fine
and the partition mounted successfully
Has anyone ever experienced problems like this updating from an older
kernel to the current 3.6?
Any idea what could have caused the bad superblock the 3.6 kernel
complained about?
Is it possible that the 2.6.37 kernel left a superblock behind that
could not be recognized by the 3.6 kernel?
If it's of any interest, I can supply the stack traces.
- volker
* Re: OOM on quotacheck (again?)
2012-10-02 16:29 ` Volker
@ 2012-10-02 20:09 ` Dave Chinner
2012-10-02 20:49 ` Volker
0 siblings, 1 reply; 10+ messages in thread
From: Dave Chinner @ 2012-10-02 20:09 UTC (permalink / raw)
To: Volker; +Cc: xfs
On Tue, Oct 02, 2012 at 06:29:27PM +0200, Volker wrote:
> Hi again,
>
> > Great! That answered all my questions! Thanks a lot!
> >
> > 3.6.0-rc6-x64 ist currently running fine on 6 machines.
>
> just as a follow up i would like to share some info.
>
> The six machines mentioned above are still running fine. So are few more
> we tested with the new kernel. All of the servers tested so far, were
> rebooted immediately after the new 3.6 kernel was installed.
>
> Because of that, we decided to roll out the new kernel to all our
> servers (approximately 330) and have the kernel "sink in" over the next
> few days if the machines get rebooted.
>
> This morning we experienced some problems with the superblock being
> corrupted on 6 machines that had been rebooted during the night. For all
> of them, the following was true:
>
> a) the server was still running the old buggy 2.6.37 and had
> filesystem-troubles on heavy i/o (that was our problem to begin with
> besides the OOM)
>
> b) because of the filesystem-troubles the server had been rebooted by
> our hardware-support-team (sadly not necessarily using sys-requests)
> because the xfs-partition was unresponsive
>
> c) after being rebooted with the new 3.6 kernel, the server complained
> about the super-block of the xfs-partition being corrupted and was not
> able to mount the partition
>
> d) by running xfs_repair -L -P <device> we were able to fix the problem
>
> e) trying a remount of the fixed partition caused a quota-check which
> always ended in a stack-trace, after a reboot, the quota-check was fine
> and the partition successfully mounted
>
> Has anyone ever experienced problems like this updating from an older
> kernel to the current 3.6?
>
> Any Idea what could have caused the bad superblock the 3.6 kernel
> complained about?
>
> Is it possible that the 2.6.37 kernel left a superblock behind that
> could not be recognized by the 3.6 kernel?
>
> If it's of any interest, I can supply the stack-traces.
Yes, it is of interest, can you post everything you found out about
the problem? (dmesg, stack traces, repair output, etc).
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: OOM on quotacheck (again?)
2012-10-02 20:09 ` Dave Chinner
@ 2012-10-02 20:49 ` Volker
2012-10-02 22:15 ` Dave Chinner
0 siblings, 1 reply; 10+ messages in thread
From: Volker @ 2012-10-02 20:49 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
Hi,
>> If it's of any interest, I can supply the stack-traces.
>
> Yes, it is of interest, can you post everything you found out about
> the problem? (dmesg, stack traces, repair output, etc).
Everything posted here is from a single server, in chronological order
from top to bottom. Without having checked each and every stack trace,
it looked quite similar on the other servers.
http://pastebin.com/PXquE4sM
Sidenote:
xfs_repair would not finish without supplying -P; otherwise the
repair hung in phase 6 (might be related to this bug:
http://oss.sgi.com/archives/xfs-masters/2011-01/msg00009.html)
Hope it helps! Since we have about 300 servers left to go from 2.6.37 to
3.6, I'd be happy to do some testing, as long as we are not gambling
with our customers' data :-)
- volker
* Re: OOM on quotacheck (again?)
2012-10-02 20:49 ` Volker
@ 2012-10-02 22:15 ` Dave Chinner
2012-10-04 14:19 ` Volker
0 siblings, 1 reply; 10+ messages in thread
From: Dave Chinner @ 2012-10-02 22:15 UTC (permalink / raw)
To: Volker; +Cc: xfs
On Tue, Oct 02, 2012 at 10:49:27PM +0200, Volker wrote:
> Hi,
>
> >> If it's of any interest, I can supply the stack-traces.
> >
> > Yes, it is of interest, can you post everything you found out about
> > the problem? (dmesg, stack traces, repair output, etc).
>
> Everything posted here is from a single server and its chronologically
> top to bottom. Without having checked each and every stacktrace, it
> looked quite similar on the other servers.
>
> http://pastebin.com/PXquE4sM
So you had a hang on 2.6.37 to do with dquot reclaim, and you rebooted
the server into what I think is a 3.6 kernel.
Log recovery failed with "bad clientid 0x0", so no superblock
problem. It does tend to indicate that 2.6.37 wrote bad data to the
log, though. If you reboot into 2.6.37, does log recovery run
successfully? i.e. does the failure only occur on 2.6.37 -> 3.6
with a dirty log?
You then ran xfs_repair -P -L, which threw lots of metadata
away and moved lots of stuff to lost+found.
You then mounted the filesystem on the same kernel (the trace has
xfs_trans_read_buf_map() in it, hence the 3.6 version), and
it appears to be hung waiting for IO to complete on a dquot buffer.
That tends to indicate that maybe there's a problem with IO
completion somewhere below the XFS layer.
And if there's a problem below XFS w.r.t. IO completion, that also
makes me wonder if the log recovery problem isn't also caused by
something below XFS...
What mount options are you using on the 2.6.37 kernel?
> Sidenote:
> The xfs_repair would not finish without supplying -P, otherwise the
> repair hung in phase 6 (might be related to this bug:
> http://oss.sgi.com/archives/xfs-masters/2011-01/msg00009.html)
If you are upgrading your kernel, you should also upgrade your
xfsprogs installation as well.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: OOM on quotacheck (again?)
2012-10-02 22:15 ` Dave Chinner
@ 2012-10-04 14:19 ` Volker
0 siblings, 0 replies; 10+ messages in thread
From: Volker @ 2012-10-04 14:19 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
Hi
> So you had a hang on 2.6.37 to do with dquot reclaim, you rebooted
> the server into what I think is a 3.6 kernel.
Correct.
> Log recovery failed with "bad clientid 0x0", so no superblock
> problem.
I was told by 'mount' that it's a superblock problem :-)
###
server044:~# mount -a
mount: /dev/sdb1: can't read superblock
###
What does the bad client ID in syslog indicate?
> It does tend to indicate that 2.6.37 wrote bad data to the
> log, though. If you reboot into 2.6.37, does log recovery run
> successfully?
Yes. A server which was rebooted on Oct 3rd at 07:18am, running 2.6.37
with a stack trace involving xfs_qm_dqreclaim_one, came back up fine a
couple of minutes later on 2.6.37.
If this had not been working, we would have had way more trouble
with crashed XFS partitions in the past, since the
xfs_qm_dqreclaim_one stack trace has been a very common error for us.
> i.e. does the failure only occur on 2.6.37 -> 3.6
> with a dirty log?
Yes. All 6 servers failed to mount the XFS partition after they had
XFS troubles on 2.6.37 and came back up on the new 3.6 kernel. I did
not try to reboot them into 2.6.37, though.
> You then mounted the filesystem on the same kernel (has
> xfs_trans_read_buf_map() in the trace, hence the 3.6 version)
Correct. A quota-check was performed on all servers, and it ended in
the shown stack trace on all of them (see pastebin). After a reboot
the partition mounted just fine.
> What mount options are you using on the 2.6.37 kernel?
2.6.37 and 3.6 use the same options:
noatime,nosuid,nodev,gquota
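For reference, the corresponding /etc/fstab line would look something like the following (the device and mount point here are made up for illustration; only the option string is from the thread):

```
/dev/sdb1  /content  xfs  noatime,nosuid,nodev,gquota  0  0
```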
> If you are upgrading your kernel, you should also upgrade your
> xfsprogs installation as well.
Will do.
- volker