From: Andrea Arcangeli <aarcange@redhat.com>
To: Mel Gorman <mel@csn.ul.ie>
Cc: Minchan Kim <minchan.kim@gmail.com>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
Christoph Lameter <cl@linux-foundation.org>,
Andrew Morton <akpm@linux-foundation.org>,
Adam Litke <agl@us.ibm.com>, Avi Kivity <avi@redhat.com>,
David Rientjes <rientjes@google.com>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Rik van Riel <riel@redhat.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH 04/14] mm,migration: Allow the migration of PageSwapCache pages
Date: Sun, 25 Apr 2010 16:41:13 +0200
Message-ID: <20100425144113.GB5789@random.random>
In-Reply-To: <20100424105226.GF14351@csn.ul.ie>

On Sat, Apr 24, 2010 at 11:52:27AM +0100, Mel Gorman wrote:
> It should be. I expect that's why you have never seen the bugon in
> swapops.

Oh, I just got the very crash you're talking about, running aa.git with
your v8 code. Weird that I never reproduced it before! I think it's
because I fixed gcc to always be fully backed by hugepages (without
khugepaged) and I was rebuilding a couple of packages; that now
triggers memory compaction much more often, mixed with heavy
fork/execve. This is the only instability I managed to reproduce over
24 hours of stress testing. It's clearly not related to transparent
hugepage support; it's either a bug in migrate.c (more likely) or in
memory compaction.

Note that I'm running with the 2.6.33 anon-vma code, so you'll be
relieved to know it's not the recent anon-vma changes causing this
(well, I can't rule out anon-vma bugs, but if it is anon-vma, it's a
longstanding one).

kernel BUG at include/linux/swapops.h:105!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:12.0/host0/target0:0:0/0:0:0:0/block/sr0/size
CPU 0
Modules linked in: nls_iso8859_1 loop twofish twofish_common tun bridge stp llc bnep sco rfcomm l2cap bluetooth snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss usbhid gspca_pac207 gspca_main videodev v4l1_compat v4l2_compat_ioctl32 snd_hda_codec_realtek ohci_hcd snd_hda_intel ehci_hcd usbcore snd_hda_codec snd_pcm snd_timer snd snd_page_alloc sg psmouse sr_mod pcspkr
Pid: 13351, comm: basename Not tainted 2.6.34-rc5 #23 M2A-VM/System Product Name
RIP: 0010:[<ffffffff810e66b0>] [<ffffffff810e66b0>] migration_entry_wait+0x170/0x180
RSP: 0000:ffff88009ab6fa58 EFLAGS: 00010246
RAX: ffffea0000000000 RBX: ffffea000234eed8 RCX: ffff8800aaa95298
RDX: 00000000000a168d RSI: ffff88000411ae28 RDI: ffffea00025550a8
RBP: ffffea0002555098 R08: ffff88000411ae28 R09: 0000000000000000
R10: 0000000000000008 R11: 0000000000000009 R12: 00000000aaa95298
R13: 00007ffff8a53000 R14: ffff88000411ae28 R15: ffff88011108a7c0
FS: 00002adf29469b90(0000) GS:ffff880001a00000(0000) knlGS:0000000055700d50
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007ffff8a53000 CR3: 0000000004f80000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process basename (pid: 13351, threadinfo ffff88009ab6e000, task ffff88009ab96c70)
Stack:
ffff8800aaa95280 ffffffff810ce472 ffff8801134a7ce8 0000000000000000
<0> 00000000142d1a3e ffffffff810c2e35 79f085e9c08a4db7 62d38944fd014000
<0> 76b07a274b0c057a ffffea00025649f8 f8000000000a168d d19934e84d2a74f3
Call Trace:
[<ffffffff810ce472>] ? page_add_new_anon_rmap+0x72/0xc0
[<ffffffff810c2e35>] ? handle_pte_fault+0x7a5/0x7d0
[<ffffffff8150506d>] ? do_page_fault+0x13d/0x420
[<ffffffff8150215f>] ? page_fault+0x1f/0x30
[<ffffffff81273bfb>] ? strnlen_user+0x4b/0x80
[<ffffffff81131f4e>] ? load_elf_binary+0x12be/0x1c80
[<ffffffff810f426d>] ? search_binary_handler+0xad/0x2c0
[<ffffffff810f5ce7>] ? do_execve+0x247/0x320
[<ffffffff8100ab16>] ? sys_execve+0x36/0x60
[<ffffffff8100314a>] ? stub_execve+0x6a/0xc0
Code: 5e ff ff ff 8d 41 01 89 4c 24 08 89 44 24 04 8b 74 24 04 8b 44 24 08 f0 0f b1 32 89 44 24 0c 8b 44 24 0c 39 c8 74 a4 89 c1 eb d1 <0f> 0b eb fe 66 66 66 2e 0f 1f 84 00 00 00 00 00 41 54 49 89 d4
RIP [<ffffffff810e66b0>] migration_entry_wait+0x170/0x180
RSP <ffff88009ab6fa58>
---[ end trace 840ce8bc6f6dc402 ]---

It doesn't look like a coincidence that the page with the migration PTE
set was the argv in the user stack during execve. The bug has to be
there. Or maybe it is a coincidence and it will mislead us. If you have
other stack traces, please post them so I have more info. I'll post
more stack traces if I get them again; it doesn't look easy to
reproduce. Presumably the bug has been there since the first time I
used memory compaction, and this is the first time I've reproduced it.