From: Dave Chinner <david@fromorbit.com>
To: Christian Kujau <lists@nerdbynature.de>
Cc: xfs@oss.sgi.com
Subject: Re: possible memory allocation deadlock in xfs_buf_allocate_memory
Date: Fri, 12 Oct 2012 08:33:23 +1100
Message-ID: <20121011213323.GD2739@dastard>
In-Reply-To: <alpine.DEB.2.01.1210111048170.10880@trent.utfs.org>
On Thu, Oct 11, 2012 at 11:13:14AM -0700, Christian Kujau wrote:
> Hi,
>
> since Linux 3.5 I'm seeing these "inconsistent lock state" lockdep
> warnings [0]. They show up in 3.6 as well [1]. I was being told[2] that I
> may have run out of inode attributes. This may well be the case, but I
> cannot reformat the disk right now and will have to live with that warning
> a while longer.
>
> I got the warning again today, but 8h later the system hung and eventually
> shutdown. The last message from the box was received via netconsole:
>
> XFS: possible memory allocation deadlock in xfs_buf_allocate_memory (mode:0x250)
>
> This has been reported[3] for 3.2.1, but when the message was printed I
> was not around and could not watch /proc/slabinfo.
>
> The last -MARK- message (some kind of heartbeat message from syslog,
> printing "MARK" every 5min) has been received 07:55 local time, the
> netconsole message above was received 08:09, so two -MARK- messages were
> lost. sar(1) stopped recording at 08:05.
>
> These two incidents (the lockdep warning and the final lockup) may be
> unrelated (and timestamp-wise, they seem to be), but I thought I'd
> better report it.
>
> Full dmesg and .config: http://nerdbynature.de/bits/3.6.0/xfs/o
The inconsistent lock state is this path:
Oct 11 00:18:27 alice kernel: [261506.767190] [e5e2fc70] [c00b5d44] vm_map_ram+0x228/0x5a0
Oct 11 00:18:27 alice kernel: [261506.768354] [e5e2fcf0] [c01a6f9c] _xfs_buf_map_pages+0x44/0x104
Oct 11 00:18:27 alice kernel: [261506.769522] [e5e2fd10] [c01a8090] xfs_buf_get_map+0x74/0x11c
Oct 11 00:18:27 alice kernel: [261506.770698] [e5e2fd30] [c01ffad8] xfs_trans_get_buf_map+0xc0/0xdc
Oct 11 00:18:27 alice kernel: [261506.772106] [e5e2fd50] [c01e9504] xfs_ifree_cluster+0x3f4/0x5b0
Oct 11 00:18:27 alice kernel: [261506.773309] [e5e2fde0] [c01eabc0] xfs_ifree+0xec/0xf0
Oct 11 00:18:27 alice kernel: [261506.774493] [e5e2fe20] [c01bd5c0] xfs_inactive+0x274/0x448
Oct 11 00:18:27 alice kernel: [261506.775672] [e5e2fe60] [c01b8450] xfs_fs_evict_inode+0x60/0x74
Oct 11 00:18:27 alice kernel: [261506.776863] [e5e2fe70] [c00dfa88] evict+0xc0/0x1b4
Oct 11 00:18:27 alice kernel: [261506.778046] [e5e2fe90] [c00db968] d_delete+0x1b0/0x1ec
Oct 11 00:18:27 alice kernel: [261506.779235] [e5e2feb0] [c00d2e24] vfs_rmdir+0x124/0x128
Oct 11 00:18:27 alice kernel: [261506.780425] [e5e2fed0] [c00d2f10] do_rmdir+0xe8/0x110
Oct 11 00:18:27 alice kernel: [261506.781619] [e5e2ff40] [c0010aac] ret_from_syscall+0x0/0x38
Which indicates that it is this:
> [1] http://oss.sgi.com/archives/xfs/2012-09/msg00305.html
which is a real bug in the VM code that the VM developers refuse to
fix even though it is simple to do and patches have been posted
several times to fix it.
/me points to his recent rant rather than repeating it:
https://lkml.org/lkml/2012/6/13/628
However, that is unrelated to this message:
XFS: possible memory allocation deadlock in xfs_buf_allocate_memory (mode:0x250)
which triggers when memory cannot be allocated. Mode 0x250 is
    ___GFP_NOWARN | ___GFP_IO | ___GFP_WAIT
or, as it is more commonly known:
    GFP_NOFS
with warnings turned off. Basically the warning is saying "we're
trying really hard to allocate memory, but we're not making
progress". If it was only emitted once, however, it means that
progress was made, as the message is emitted every 100 times through
the loop and so only one message means it looped fewer than 200
times...
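As a rough illustration of the two points above, here is a minimal Python
sketch that decodes the mode value and models the "one message per 100
passes" arithmetic. The bit values in the table are an assumption taken
from 3.x-era include/linux/gfp.h and should be checked against your own
tree; decode_gfp() and messages_emitted() are hypothetical helper names,
not kernel or XFS interfaces.

```python
# Assumed bit values for the ___GFP_* flags (3.x-era include/linux/gfp.h).
# Verify them against your own kernel tree before relying on them.
GFP_BITS = {
    0x01: "___GFP_DMA",
    0x02: "___GFP_HIGHMEM",
    0x04: "___GFP_DMA32",
    0x08: "___GFP_MOVABLE",
    0x10: "___GFP_WAIT",
    0x20: "___GFP_HIGH",
    0x40: "___GFP_IO",
    0x80: "___GFP_FS",
    0x100: "___GFP_COLD",
    0x200: "___GFP_NOWARN",
}


def decode_gfp(mode):
    """Return the names of the known flag bits set in mode, highest bit first.

    Any bits not present in GFP_BITS are reported as one trailing hex value.
    """
    names = [name for bit, name in sorted(GFP_BITS.items(), reverse=True)
             if mode & bit]
    unknown = mode & ~sum(GFP_BITS)
    if unknown:
        names.append(hex(unknown))
    return names


def messages_emitted(failed_attempts, interval=100):
    """Model of 'one message every `interval` failed passes through the loop'."""
    return failed_attempts // interval


print(" | ".join(decode_gfp(0x250)))
print(messages_emitted(199), messages_emitted(200))
```

Under this model a single message corresponds to somewhere between 100
and 199 failed passes, which is why one isolated message still implies
the allocation eventually succeeded.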
What it does imply, however, is that vm_map_ram() is being called
from GFP_NOFS context quite regularly and might be blocking there,
and so the lockdep warning is more than just a nuisance - your
system may indeed have hung there, but I don't have enough
information to say for sure.
When it hangs next time, can you get a blocked task dump from sysrq
(i.e. sysrq-w, or "echo w > /proc/sysrq-trigger")? That's the only
way we're going to know where the system hung. You might also want
to ensure the hung task functionality is built into your kernel, so
it automatically dumps stack traces for tasks hung longer than
120s...
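For scripting the two checks above, a minimal Python sketch follows. It
assumes the usual procfs locations, /proc/sysrq-trigger and
/proc/sys/kernel/hung_task_timeout_secs; both function names are
hypothetical, and writing to the trigger normally requires root.

```python
def request_blocked_task_dump(trigger="/proc/sysrq-trigger"):
    """Write 'w' to the sysrq trigger file.

    Equivalent to `echo w > /proc/sysrq-trigger`; the resulting dump of
    blocked tasks goes to the kernel log, not to this process.
    """
    with open(trigger, "w") as f:
        f.write("w")


def hung_task_timeout(path="/proc/sys/kernel/hung_task_timeout_secs"):
    """Return the hung-task timeout in seconds, or None if the file is absent.

    A missing file is taken here to mean the hung task detector is not
    available in the running kernel (an assumption of this sketch).
    """
    try:
        with open(path) as f:
            return int(f.read().strip())
    except FileNotFoundError:
        return None
```

Calling hung_task_timeout() before the next hang shows whether automatic
stack dumps can be expected; request_blocked_task_dump() is only useful
while the box still responds enough to run it.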
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com