From: Linus Torvalds <torvalds@linux-foundation.org>
To: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Alex Riesen <raa.lkml@gmail.com>,
	David Miller <davem@davemloft.net>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: Heads up Linux 2.6.38-rc4 compile problems.
Date: Sun, 13 Feb 2011 18:45:13 -0800
Message-ID: <AANLkTimO3h80Wk44OhpfkjzxpzdDKnf6emj2fNObn_jD@mail.gmail.com>
In-Reply-To: <m14o87wev8.fsf@fess.ebiederm.org>

On Sun, Feb 13, 2011 at 6:04 PM, Eric W. Biederman
<ebiederm@xmission.com> wrote:
>
> The build failures appear to have been due to a corrupted ccache. A
> coworker turned off using the ccache and the compiles started working
> again.  Unfortunately I can't qualify when my ccache got corrupted,
> or give a hint at which kernel bug caused the corrupted cache.  I
> expect it happened in whatever I tested just before -rc3.

Ok, that certainly explains how it was reproducible, and why it would
show up in rc4 despite there not being a lot of reasons for any of the
post-rc3 changes to introduce anything like that.

It does sound like memory corruption. I'm not at all sure that it's
the rcu lookup thing (although it's a possible case), and especially
if you've been playing around with some of the more experimental VM
features (memcg? transparent hugepage? migration/compaction?) it could
easily be something there. There have been several bug-fixes in those
areas.

Having SLUB debugging on would be a good start. Obviously,
CONFIG_DEBUG_PAGEALLOC would be wonderful, but it's expensive as heck,
so it can be a bit painful to use on a machine that is actually used
for real work. But it can really help pinpoint those kinds of
problems.

> There is something corrupting my page tables.
>
> messages:Feb 13 12:50:00 bs38 kernel: BUG: Bad page map in process [manager]  pte:ffff88028688b748 pmd:28688b067
> messages:Feb 13 12:50:00 bs38 kernel: BUG: Bad page map in process [manager]  pte:ffff88028688b748 pmd:28688b067
> messages:Feb 13 12:52:17 bs38 kernel: BUG: Bad page map in process [manager]  pte:ffff880011065748 pmd:11065067

Odd pattern. That is a totally invalid pte, and I do not see where the
pattern would come from. It's a kernel pointer, afaik, and obviously
shouldn't show up in the pte.
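
As a rough illustration of why the value reads as a kernel pointer: the
sketch below assumes the x86-64 direct-mapping base used by kernels of
that era (0xffff880000000000, called ASSUMED_PAGE_OFFSET here) and the
usual 4 KiB physical-address mask. The macro and helper names are made
up for the example. Under those assumptions, each logged pte value lands
inside the page-table page that its pmd refers to. This is only
arithmetic on the logged numbers and says nothing about which code wrote
the value.

```c
#include <stdint.h>

/* Assumption: direct-map base of x86-64 kernels of this era. */
#define ASSUMED_PAGE_OFFSET 0xffff880000000000ULL
/* Bits 12..51 of a page-table entry hold the physical frame address. */
#define PHYS_ADDR_MASK      0x000ffffffffff000ULL

/* Direct-map virtual address of the page table a pmd entry points at. */
static uint64_t table_virt_from_pmd(uint64_t pmd)
{
    return ASSUMED_PAGE_OFFSET + (pmd & PHYS_ADDR_MASK);
}

/* Distance of the bogus pte *value* from the start of that page table. */
static uint64_t offset_in_table(uint64_t pte, uint64_t pmd)
{
    return pte - table_virt_from_pmd(pmd);
}
```

For the first log line, offset_in_table(0xffff88028688b748ULL,
0x28688b067ULL) evaluates to 0x748, and the values from the third log
line give the same offset.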

But it could be the result of a use-after-free. Or a double free.
Which I _think_ is that rcu lookup bug pattern, but I may be barking
up the wrong tree. Again, SLUB or PAGEALLOC debugging would probably
give more information.

I'm adding Andrew to the cc too, in case it's simply some of the VM patches.

> I have some unexpected kernel crashes as well.
> With 2.6.38-rc3 (I think this was a git snapshot) I saw:
>
> <1>BUG: unable to handle kernel NULL pointer dereference at 0000000000000008

The instruction is the "lock xadd %ax,(%rdi)" that is the actual
locked spin-lock instruction. It's this:

   spin_lock(&root_anon_vma->lock);

in __page_lock_anon_vma(), and %rdi is 8. Which is consistent with
root_anon_vma being NULL.
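
To illustrate why a faulting address of 8 is consistent with a NULL
root_anon_vma: taking the address of a member just adds the member's
offset to the base pointer. The sketch below uses a simplified, assumed
layout (a root pointer followed by the lock) and hypothetical type names
rather than the real kernel definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for spinlock_t; only its position matters here. */
struct fake_spinlock {
    uint32_t slock;
};

/* Assumed, simplified layout: a root pointer followed by the lock. */
struct fake_anon_vma {
    struct fake_anon_vma *root;   /* offset 0 */
    struct fake_spinlock lock;    /* offset 8 on a 64-bit build */
};

/*
 * Address the CPU is handed for &base->lock, computed on integers so
 * that a zero base does not involve dereferencing a NULL pointer.
 */
static uintptr_t lock_address(uintptr_t base)
{
    return base + offsetof(struct fake_anon_vma, lock);
}
```

On a 64-bit build, lock_address(0) is 8, which matches the %rdi value
in the oops.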

> <0>Call Trace:
> <4> [<ffffffff813d0a0c>] _raw_spin_lock+0x9/0xb
> <4> [<ffffffff810d30cd>] __page_lock_anon_vma+0x3a/0x54
> <4> [<ffffffff810d3633>] page_referenced+0xaf/0x240
> <4> [<ffffffff810bcfda>] shrink_page_list+0x154/0x49e
> <4> [<ffffffff810bd762>] shrink_inactive_list+0x234/0x386
> <4> [<ffffffff810bdede>] shrink_zone+0x356/0x418
> <4> [<ffffffff810bed0e>] kswapd+0x4f6/0x84d
> <4> [<ffffffff81057de9>] kthread+0x7d/0x85
> <4> [<ffffffff810037a4>] kernel_thread_helper+0x4/0x10

It goes without saying that root_anon_vma shouldn't have been NULL
here. But maybe this triggers something for Andrew?

> With 2.6.38-rc4 I have seen:
> <0>general protection fault: 0000 [#1] SMP
> <4>RIP: 0010:[<ffffffff810326b0>]  [<ffffffff810326b0>] post_schedule+0x7/0x4e
> <4>RSP: 0000:ffff8802981c5bf8  EFLAGS: 00010287
> <4>RAX: 0000000000000006 RBX: ffff100367f45c28 RCX: ffff8801a6af0dc0
> <4>RDX: ffff8802981c5fd8 RSI: ffff8801a6af0dc0 RDI: ffff100367f45c28
> <0>Call Trace:
> <4> [<ffffffff813cf98c>] schedule+0x544/0x577
> <4> [<ffffffff813cfb4f>] schedule_timeout+0x22/0xbb
> <4> [<ffffffff813386e5>] __skb_recv_datagram+0x1ec/0x264
> <4> [<ffffffff8133877c>] skb_recv_datagram+0x1f/0x21
> <4> [<ffffffff813aefeb>] unix_accept+0x55/0x103
> <4> [<ffffffff8132efcb>] sys_accept4+0xf3/0x1c3
> <4> [<ffffffff81353b97>] compat_sys_socketcall+0x17d/0x186
> <4> [<ffffffff8102cd90>] sysenter_dispatch+0x7/0x2e
> <0>Code: 49 89 c4 8b 75 e8 48 89 df 31 c9 e8 a3 d4 ff ff 4c 89 e6 48 89 df e8 ae e3 39 00 48 83 c4 20 5b 41 5c c9 c3 55 48 89 e5 41 54 53 <83> bf 74 08 00 00 00 48 89 fb 74 36 e8 4d e3 39 00 49 89 c4 48
> <1>RIP  [<ffffffff810326b0>] post_schedule+0x7/0x4e

This is the very first memory access in post_schedule, the

   if (rq->post_schedule) {

load. (The trapping instruction is "cmpl $0x0,0x874(%rdi)".) It looks
like %rdi is corrupt, which makes the resulting pointer invalid.
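
One way to see that the pointer is invalid, and a plausible reason the
oops is a general protection fault rather than a page fault: with 48-bit
virtual addresses, x86-64 requires bits 63..47 of an address to be all
zeros or all ones. The sketch below assumes 48-bit addressing, and the
function names are illustrative only.

```c
#include <stdbool.h>
#include <stdint.h>

/* With 48-bit virtual addresses, bits 63..47 must all be equal. */
static bool is_canonical_48(uint64_t addr)
{
    uint64_t top = addr >> 47;   /* the 17 uppermost bits */
    return top == 0 || top == 0x1ffff;
}

/* Effective address of "cmpl $0x0,0x874(%rdi)" for a given %rdi. */
static uint64_t post_schedule_load_addr(uint64_t rdi)
{
    return rdi + 0x874;
}
```

With the RDI value from the dump (0xffff100367f45c28) the load address
fails this check, while RSI (0xffff8801a6af0dc0) passes it.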

Odd, and looks pretty random. Maybe it really is just memory corruption.

> With 2.6.38-rc4 I have seen:
> <1>BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
> <1>IP: [<ffffffff811016cb>] shrink_dcache_parent+0x104/0x23c
> <0>Call Trace:
> <4> [<ffffffff8113c8bc>] proc_flush_task+0xae/0x1d2
> <4> [<ffffffff8104061a>] release_task+0x35/0x3b9
> <4> [<ffffffff81040f53>] wait_consider_task+0x5b5/0x911
> <4> [<ffffffff810413a6>] do_wait+0xf7/0x222
> <4> [<ffffffff8104266f>] sys_wait4+0x99/0xbc
> <4> [<ffffffff81076155>] compat_sys_wait4+0x26/0xc3
> <4> [<ffffffff8102d9e0>] sys32_waitpid+0xb/0xd
> <4> [<ffffffff8102cd90>] sysenter_dispatch+0x7/0x2e
> <0>Code: 00 49 89 87 80 00 00 00 49 89 8f 88 00 00 00 48 89 11 49 8b 47 68 ff 05 28 04 72 00 ff 80 f0 00 00 00 eb 33 49 8b b7 88 00 00 00 <48> 89 72 08 48 89 16 48 8b 90 e8 00 00 00 48 89 88 e8 00 00 00
> <1>RIP  [<ffffffff811016cb>] shrink_dcache_parent+0x104/0x23c

I dunno. That instruction sequence looks like a list_del(), but I'm
not certain ("mov %rsi,0x8(%rdx) ; mov %rdx,(%rsi)"). With %rdx being
NULL. But shrink_dcache tends to be where a lot of random memory
corruption ends up then blowing up (because the dcache is very
pointer-intensive, and it can be a large cache), so again, I don't
think the oops really tells us anything. It looks more like a
symptom than a cause.
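
For reference, the two-instruction pattern lines up with a doubly linked
list unlink as sketched below. This is a minimal user-space imitation
with a hypothetical function name, not the kernel's own list code, and
the register assignment (%rdx as next, %rsi as prev) is inferred from
the quoted instructions.

```c
#include <stddef.h>

/* Same shape as the kernel's list node: next at offset 0, prev at 8. */
struct list_head {
    struct list_head *next, *prev;
};

/* Unlink whatever sits between prev and next. */
static void sketch_list_del(struct list_head *prev, struct list_head *next)
{
    next->prev = prev;   /* mov %rsi,0x8(%rdx): faults at 8 if next is NULL */
    prev->next = next;   /* mov %rdx,(%rsi) */
}
```

With next being NULL, the first store targets address 8, which matches
the "NULL pointer dereference at 0000000000000008" line in the report.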

                                    Linus
