* Re: perf bug: bad page map
@ 2013-11-15 18:04 Vince Weaver
  2013-11-18 15:17 ` Peter Zijlstra
  0 siblings, 1 reply; 8+ messages in thread
From: Vince Weaver @ 2013-11-15 18:04 UTC
To: Vince Weaver
Cc: Peter Zijlstra, LKML, Ingo Molnar, Paul Mackerras, Arnaldo Carvalho de Melo

(figured out the minicom issue).

Anyway while trying to reproduce the last bug I instead got this with
the perf_fuzzer.

Is it worth continuing to run and report these issues? I'm losing track
of all the open bugs.

Vince

[ 1618.118179] BUG: Bad page map in process perf_fuzzer pte:ffff8800c4d60040 pmd:bd86a067
[ 1618.142177] addr:0000000000409000 vm_flags:00000875 anon_vma: (null) mapping:ffff8800cb74adf0 index:9
[ 1618.172142] vma->vm_ops->fault: filemap_fault+0x0/0x358
[ 1618.187783] vma->vm_file->f_op->mmap: ext4_file_mmap+0x0/0x48
[ 1618.204981] CPU: 1 PID: 24819 Comm: perf_fuzzer Not tainted 3.12.0 #4
[ 1618.224256] Hardware name: AOpen DE7000/nMCP7ALPx-DE R1.06 Oct.19.2012, BIOS 080015 10/19/2012
[ 1618.250825] 0000000000409000 ffff8800bf6dfaa8 ffffffff8151d8ec 0000000000000000
[ 1618.273081] ffff8800c89ac928 ffff8800bf6dfaf8 ffffffff810ed692 dead000000200200
[ 1618.295345] 00000000c03df067 ffff8800bf6dfbe8 0000000000409000 ffffea0002bc2fe8
[ 1618.317603] Call Trace:
[ 1618.324951] [<ffffffff8151d8ec>] dump_stack+0x49/0x5d
[ 1618.340355] [<ffffffff810ed692>] print_bad_pte+0x1f5/0x213
[ 1618.357059] [<ffffffff810ef43c>] unmap_single_vma+0x511/0x666
[ 1618.374540] [<ffffffff810ef5c3>] unmap_vmas+0x32/0x49
[ 1618.389934] [<ffffffff810f3804>] exit_mmap+0x84/0x10d
[ 1618.405343] [<ffffffff8105bb15>] ? hrtimer_try_to_cancel+0x41/0x4b
[ 1618.424129] [<ffffffff8103ac43>] mmput+0x4b/0xd1
[ 1618.438227] [<ffffffff8103ec76>] do_exit+0x36c/0x936
[ 1618.453366] [<ffffffff810c7312>] ? update_context_time+0x11/0x34
[ 1618.471628] [<ffffffff8100951b>] ? native_sched_clock+0x3b/0x3d
[ 1618.489635] [<ffffffff8106730d>] ? sched_clock_local+0x1c/0x82
[ 1618.507376] [<ffffffff8103f2b8>] do_group_exit+0x78/0xa0
[ 1618.523563] [<ffffffff8104c898>] get_signal_to_deliver+0x46d/0x48a
[ 1618.542347] [<ffffffff810c8ac7>] ? ctx_sched_in+0x35/0x185
[ 1618.559051] [<ffffffff810c8c80>] ? perf_event_sched_in+0x69/0x72
[ 1618.577318] [<ffffffff81002513>] do_signal+0x46/0x5f5
[ 1618.592724] [<ffffffff810c8ffe>] ? __perf_event_task_sched_in+0x3a/0x10e
[ 1618.613071] [<ffffffff8106699f>] ? finish_task_switch+0x46/0x98
[ 1618.631075] [<ffffffff8151f832>] ? __schedule+0x51c/0x54b
[ 1618.647516] [<ffffffff81002aee>] do_notify_resume+0x2c/0x64
[ 1618.664486] [<ffffffff81520ef5>] retint_signal+0x3d/0x78
[ 1618.680661] Disabling lock debugging due to kernel taint

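The `vm_flags:00000875` and `index:9` fields in the report above can be unpacked by hand. What follows is only a minimal sketch: the bit table is an assumption, written from memory of `include/linux/mm.h` in kernels of roughly the 3.12 era, and should be checked against the tree in question. `decode_vm_flags` and `page_index` are hypothetical helper names, not kernel or perf interfaces, and the `0x400000` mapping start is likewise an assumption (a common load address for a non-PIE x86-64 executable), not something the report states.

```python
# Assumed VM_* bit values (include/linux/mm.h, roughly 3.12 era).
# This table is NOT taken from the thread; verify it against your kernel tree.
VM_FLAG_BITS = {
    0x001: "VM_READ",
    0x002: "VM_WRITE",
    0x004: "VM_EXEC",
    0x008: "VM_SHARED",
    0x010: "VM_MAYREAD",
    0x020: "VM_MAYWRITE",
    0x040: "VM_MAYEXEC",
    0x080: "VM_MAYSHARE",
    0x100: "VM_GROWSDOWN",
    0x400: "VM_PFNMAP",
    0x800: "VM_DENYWRITE",
}

PAGE_SIZE = 4096  # 4 KiB pages, the usual x86-64 base page size


def decode_vm_flags(flags):
    """Return the names of the known bits set in flags, lowest bit first.

    Any bits not covered by the table are appended as one hex string.
    """
    names = [name for bit, name in sorted(VM_FLAG_BITS.items()) if flags & bit]
    known = sum(bit for bit in VM_FLAG_BITS if flags & bit)
    leftover = flags & ~known
    if leftover:
        names.append(hex(leftover))
    return names


def page_index(addr, vma_start, pgoff=0):
    """Page offset into the backing file for addr inside a mapping."""
    return pgoff + (addr - vma_start) // PAGE_SIZE


if __name__ == "__main__":
    print(decode_vm_flags(0x875))
    # 0x400000 is an assumed mapping start, see the note above.
    print(page_index(0x409000, 0x400000))
```

Under those assumptions the flags would describe a private, readable and executable, file-backed mapping, and the computed page index agrees with the `index:9` in the log. That is what the text segment of the perf_fuzzer binary itself might look like, which may be one way ext4 ends up in the report without any explicit file I/O; this is a reading of the log, not something confirmed in the thread.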
* Re: perf bug: bad page map
  2013-11-15 18:04 perf bug: bad page map Vince Weaver
@ 2013-11-18 15:17 ` Peter Zijlstra
  2013-11-18 15:36   ` Ingo Molnar
  2013-11-18 16:41   ` Vince Weaver
  0 siblings, 2 replies; 8+ messages in thread
From: Peter Zijlstra @ 2013-11-18 15:17 UTC
To: Vince Weaver
Cc: LKML, Ingo Molnar, Paul Mackerras, Arnaldo Carvalho de Melo, tytso, adilger.kernel

On Fri, Nov 15, 2013 at 01:04:23PM -0500, Vince Weaver wrote:
>
> (figured out the minicom issue).
>
> Anyway while trying to reproduce the last bug I instead got this with
> the perf_fuzzer.
>
> Is it worth continuing to run and report these issues? I'm losing track
> of all the open bugs.

This looks like ext4. Not entirely sure how perf ties into this.

Anyway, yes, I do think it's useful to keep running these tests, we do
fix various issues -- although probably not at the rate you seem to be
finding them.

> [ 1618.118179] BUG: Bad page map in process perf_fuzzer pte:ffff8800c4d60040 pmd:bd86a067
> [ 1618.142177] addr:0000000000409000 vm_flags:00000875 anon_vma: (null) mapping:ffff8800cb74adf0 index:9
> [ 1618.172142] vma->vm_ops->fault: filemap_fault+0x0/0x358
> [ 1618.187783] vma->vm_file->f_op->mmap: ext4_file_mmap+0x0/0x48
> [ 1618.204981] CPU: 1 PID: 24819 Comm: perf_fuzzer Not tainted 3.12.0 #4
> [ 1618.224256] Hardware name: AOpen DE7000/nMCP7ALPx-DE R1.06 Oct.19.2012, BIOS 080015 10/19/2012
> [ 1618.250825] 0000000000409000 ffff8800bf6dfaa8 ffffffff8151d8ec 0000000000000000
> [ 1618.273081] ffff8800c89ac928 ffff8800bf6dfaf8 ffffffff810ed692 dead000000200200
> [ 1618.295345] 00000000c03df067 ffff8800bf6dfbe8 0000000000409000 ffffea0002bc2fe8
> [ 1618.317603] Call Trace:
> [ 1618.324951] [<ffffffff8151d8ec>] dump_stack+0x49/0x5d
> [ 1618.340355] [<ffffffff810ed692>] print_bad_pte+0x1f5/0x213
> [ 1618.357059] [<ffffffff810ef43c>] unmap_single_vma+0x511/0x666
> [ 1618.374540] [<ffffffff810ef5c3>] unmap_vmas+0x32/0x49
> [ 1618.389934] [<ffffffff810f3804>] exit_mmap+0x84/0x10d
> [ 1618.405343] [<ffffffff8105bb15>] ? hrtimer_try_to_cancel+0x41/0x4b
> [ 1618.424129] [<ffffffff8103ac43>] mmput+0x4b/0xd1
> [ 1618.438227] [<ffffffff8103ec76>] do_exit+0x36c/0x936
> [ 1618.453366] [<ffffffff810c7312>] ? update_context_time+0x11/0x34
> [ 1618.471628] [<ffffffff8100951b>] ? native_sched_clock+0x3b/0x3d
> [ 1618.489635] [<ffffffff8106730d>] ? sched_clock_local+0x1c/0x82
> [ 1618.507376] [<ffffffff8103f2b8>] do_group_exit+0x78/0xa0
> [ 1618.523563] [<ffffffff8104c898>] get_signal_to_deliver+0x46d/0x48a
> [ 1618.542347] [<ffffffff810c8ac7>] ? ctx_sched_in+0x35/0x185
> [ 1618.559051] [<ffffffff810c8c80>] ? perf_event_sched_in+0x69/0x72
> [ 1618.577318] [<ffffffff81002513>] do_signal+0x46/0x5f5
> [ 1618.592724] [<ffffffff810c8ffe>] ? __perf_event_task_sched_in+0x3a/0x10e
> [ 1618.613071] [<ffffffff8106699f>] ? finish_task_switch+0x46/0x98
> [ 1618.631075] [<ffffffff8151f832>] ? __schedule+0x51c/0x54b
> [ 1618.647516] [<ffffffff81002aee>] do_notify_resume+0x2c/0x64
> [ 1618.664486] [<ffffffff81520ef5>] retint_signal+0x3d/0x78
> [ 1618.680661] Disabling lock debugging due to kernel taint

* Re: perf bug: bad page map
  2013-11-18 15:17 ` Peter Zijlstra
@ 2013-11-18 15:36   ` Ingo Molnar
  2013-11-18 16:41   ` Vince Weaver
  1 sibling, 0 replies; 8+ messages in thread
From: Ingo Molnar @ 2013-11-18 15:36 UTC
To: Peter Zijlstra
Cc: Vince Weaver, LKML, Paul Mackerras, Arnaldo Carvalho de Melo, tytso, adilger.kernel

* Peter Zijlstra <peterz@infradead.org> wrote:

> On Fri, Nov 15, 2013 at 01:04:23PM -0500, Vince Weaver wrote:
> >
> > (figured out the minicom issue).
> >
> > Anyway while trying to reproduce the last bug I instead got this
> > with the perf_fuzzer.
> >
> > Is it worth continuing to run and report these issues? I'm losing
> > track of all the open bugs.
>
> This looks like ext4. Not entirely sure how perf ties into this.
>
> Anyway, yes, I do think it's useful to keep running these tests, we
> do fix various issues -- although probably not at the rate you seem
> to be finding them.

I'm trying to slow down the merging of kernel side features, so that
the fixing effort has a chance to catch up...

Thanks,

	Ingo

* Re: perf bug: bad page map
  2013-11-18 15:17 ` Peter Zijlstra
  2013-11-18 15:36   ` Ingo Molnar
@ 2013-11-18 16:41   ` Vince Weaver
  2013-11-18 17:13     ` Ingo Molnar
  2013-11-18 23:05     ` One Thousand Gnomes
  1 sibling, 2 replies; 8+ messages in thread
From: Vince Weaver @ 2013-11-18 16:41 UTC
To: Peter Zijlstra
Cc: Vince Weaver, LKML, Ingo Molnar, Paul Mackerras, Arnaldo Carvalho de Melo, tytso, adilger.kernel

On Mon, 18 Nov 2013, Peter Zijlstra wrote:

> On Fri, Nov 15, 2013 at 01:04:23PM -0500, Vince Weaver wrote:
> >
> > (figured out the minicom issue).
> >
> > Anyway while trying to reproduce the last bug I instead got this with
> > the perf_fuzzer.
> >
> > Is it worth continuing to run and report these issues? I'm losing track
> > of all the open bugs.
>
> This looks like ext4. Not entirely sure how perf ties into this.

It's believable the filesystem could have issues (it's a fuzzer machine,
so it's had 100+ unclean shutdowns on an SSD drive in the past few months)
but as far as I know there shouldn't have been any filesystem accesses
happening at all when the bug triggered.

I thought it might be perf related due to the perf references in the
backtrace (and since it was being perf-fuzzed at the time).

> Anyway, yes, I do think it's useful to keep running these tests, we do
> fix various issues -- although probably not at the rate you seem to be
> finding them.
>
> > [ 1618.118179] BUG: Bad page map in process perf_fuzzer pte:ffff8800c4d60040 pmd:bd86a067
> > [ 1618.142177] addr:0000000000409000 vm_flags:00000875 anon_vma: (null) mapping:ffff8800cb74adf0 index:9
> > [ 1618.172142] vma->vm_ops->fault: filemap_fault+0x0/0x358
> > [ 1618.187783] vma->vm_file->f_op->mmap: ext4_file_mmap+0x0/0x48
> > [ 1618.204981] CPU: 1 PID: 24819 Comm: perf_fuzzer Not tainted 3.12.0 #4
> > [ 1618.224256] Hardware name: AOpen DE7000/nMCP7ALPx-DE R1.06 Oct.19.2012, BIOS 080015 10/19/2012
> > [ 1618.250825] 0000000000409000 ffff8800bf6dfaa8 ffffffff8151d8ec 0000000000000000
> > [ 1618.273081] ffff8800c89ac928 ffff8800bf6dfaf8 ffffffff810ed692 dead000000200200
> > [ 1618.295345] 00000000c03df067 ffff8800bf6dfbe8 0000000000409000 ffffea0002bc2fe8
> > [ 1618.317603] Call Trace:
> > [ 1618.324951] [<ffffffff8151d8ec>] dump_stack+0x49/0x5d
> > [ 1618.340355] [<ffffffff810ed692>] print_bad_pte+0x1f5/0x213
> > [ 1618.357059] [<ffffffff810ef43c>] unmap_single_vma+0x511/0x666
> > [ 1618.374540] [<ffffffff810ef5c3>] unmap_vmas+0x32/0x49
> > [ 1618.389934] [<ffffffff810f3804>] exit_mmap+0x84/0x10d
> > [ 1618.405343] [<ffffffff8105bb15>] ? hrtimer_try_to_cancel+0x41/0x4b
> > [ 1618.424129] [<ffffffff8103ac43>] mmput+0x4b/0xd1
> > [ 1618.438227] [<ffffffff8103ec76>] do_exit+0x36c/0x936
> > [ 1618.453366] [<ffffffff810c7312>] ? update_context_time+0x11/0x34
> > [ 1618.471628] [<ffffffff8100951b>] ? native_sched_clock+0x3b/0x3d
> > [ 1618.489635] [<ffffffff8106730d>] ? sched_clock_local+0x1c/0x82
> > [ 1618.507376] [<ffffffff8103f2b8>] do_group_exit+0x78/0xa0
> > [ 1618.523563] [<ffffffff8104c898>] get_signal_to_deliver+0x46d/0x48a
> > [ 1618.542347] [<ffffffff810c8ac7>] ? ctx_sched_in+0x35/0x185
> > [ 1618.559051] [<ffffffff810c8c80>] ? perf_event_sched_in+0x69/0x72
> > [ 1618.577318] [<ffffffff81002513>] do_signal+0x46/0x5f5
> > [ 1618.592724] [<ffffffff810c8ffe>] ? __perf_event_task_sched_in+0x3a/0x10e
> > [ 1618.613071] [<ffffffff8106699f>] ? finish_task_switch+0x46/0x98
> > [ 1618.631075] [<ffffffff8151f832>] ? __schedule+0x51c/0x54b
> > [ 1618.647516] [<ffffffff81002aee>] do_notify_resume+0x2c/0x64
> > [ 1618.664486] [<ffffffff81520ef5>] retint_signal+0x3d/0x78
> > [ 1618.680661] Disabling lock debugging due to kernel taint

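On the "perf references in the backtrace": in x86 stack dumps of this era, a frame printed with a leading `?` is an address that was found on the stack but that the unwinder could not confirm as part of the actual call chain, so it may be left over from an earlier call. A rough sketch of separating the two kinds of frame follows; `split_frames` is a hypothetical helper, and the regular expression is an assumption that only covers the `[<addr>] symbol+off/len` line format seen in this thread.

```python
import re

# Matches "[<address>] symbol+0xoff/0xlen", with an optional "? " marker
# in front of the symbol. The "[ 1618.38...]" timestamps do not match
# because they lack the "<" after the bracket.
FRAME_RE = re.compile(r"\[<([0-9a-f]+)>\]\s+(\?\s+)?(\S+)")


def split_frames(log_text):
    """Split call-trace symbols into (reliable, uncertain) lists.

    'uncertain' holds the frames the kernel printed with a leading '?'.
    """
    reliable, uncertain = [], []
    for match in FRAME_RE.finditer(log_text):
        symbol = match.group(3).split("+")[0]  # drop the +offset/length part
        if match.group(2):
            uncertain.append(symbol)
        else:
            reliable.append(symbol)
    return reliable, uncertain


# A few lines copied from the report in this thread.
SAMPLE = """
[ 1618.389934] [<ffffffff810f3804>] exit_mmap+0x84/0x10d
[ 1618.405343] [<ffffffff8105bb15>] ? hrtimer_try_to_cancel+0x41/0x4b
[ 1618.424129] [<ffffffff8103ac43>] mmput+0x4b/0xd1
[ 1618.542347] [<ffffffff810c8ac7>] ? ctx_sched_in+0x35/0x185
[ 1618.559051] [<ffffffff810c8c80>] ? perf_event_sched_in+0x69/0x72
[ 1618.577318] [<ffffffff81002513>] do_signal+0x46/0x5f5
"""

if __name__ == "__main__":
    reliable, uncertain = split_frames(SAMPLE)
    print("reliable: ", reliable)
    print("uncertain:", uncertain)
```

Run over the full trace, every perf-named symbol (`update_context_time`, `ctx_sched_in`, `perf_event_sched_in`, `__perf_event_task_sched_in`) lands in the uncertain list, while the confirmed chain is the exit path from `retint_signal` down to `unmap_single_vma`. That does not rule perf out as the cause; it only means the backtrace by itself is weak evidence for it.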
* Re: perf bug: bad page map
  2013-11-18 16:41   ` Vince Weaver
@ 2013-11-18 17:13     ` Ingo Molnar
  2013-11-18 23:05     ` One Thousand Gnomes
  1 sibling, 0 replies; 8+ messages in thread
From: Ingo Molnar @ 2013-11-18 17:13 UTC
To: Vince Weaver
Cc: Peter Zijlstra, LKML, Paul Mackerras, Arnaldo Carvalho de Melo, tytso, adilger.kernel

* Vince Weaver <vincent.weaver@maine.edu> wrote:

> On Mon, 18 Nov 2013, Peter Zijlstra wrote:
>
> > On Fri, Nov 15, 2013 at 01:04:23PM -0500, Vince Weaver wrote:
> > >
> > > (figured out the minicom issue).
> > >
> > > Anyway while trying to reproduce the last bug I instead got this with
> > > the perf_fuzzer.
> > >
> > > Is it worth continuing to run and report these issues? I'm losing track
> > > of all the open bugs.
> >
> > This looks like ext4. Not entirely sure how perf ties into this.
>
> It's believable the filesystem could have issues (it's a fuzzer
> machine, so it's had 100+ unclean shutdowns on an SSD drive in the
> past few months) but as far as I know there shouldn't have been any
> filesystem accesses happening at all when the bug triggered.
>
> I thought it might be perf related due to the perf references in the
> backtrace (and since it was being perf-fuzzed at the time).

Maybe the connection is that ext4 has lots of tracepoints?

Thanks,

	Ingo

* Re: perf bug: bad page map
  2013-11-18 16:41   ` Vince Weaver
  2013-11-18 17:13     ` Ingo Molnar
@ 2013-11-18 23:05     ` One Thousand Gnomes
  2013-11-19 1:57       ` Vince Weaver
  1 sibling, 1 reply; 8+ messages in thread
From: One Thousand Gnomes @ 2013-11-18 23:05 UTC
To: Vince Weaver
Cc: Peter Zijlstra, LKML, Ingo Molnar, Paul Mackerras, Arnaldo Carvalho de Melo, tytso, adilger.kernel

On Mon, 18 Nov 2013 11:41:22 -0500 (EST)
Vince Weaver <vincent.weaver@maine.edu> wrote:

> On Mon, 18 Nov 2013, Peter Zijlstra wrote:
>
> > On Fri, Nov 15, 2013 at 01:04:23PM -0500, Vince Weaver wrote:
> > >
> > > (figured out the minicom issue).
> > >
> > > Anyway while trying to reproduce the last bug I instead got this with
> > > the perf_fuzzer.
> > >
> > > Is it worth continuing to run and report these issues? I'm losing track
> > > of all the open bugs.
> >
> > This looks like ext4. Not entirely sure how perf ties into this.
>
> It's believable the filesystem could have issues (it's a fuzzer machine,
> so it's had 100+ unclean shutdowns on an SSD drive in the past few months)
> but as far as I know there shouldn't have been any filesystem accesses
> happening at all when the bug triggered.

Obvious question - does it pass fsck currently. If it does then
presumably it was sane at the time it went pop ?

* Re: perf bug: bad page map
  2013-11-18 23:05     ` One Thousand Gnomes
@ 2013-11-19 1:57       ` Vince Weaver
  2013-11-19 7:06         ` Ingo Molnar
  0 siblings, 1 reply; 8+ messages in thread
From: Vince Weaver @ 2013-11-19 1:57 UTC
To: One Thousand Gnomes
Cc: Peter Zijlstra, LKML, Ingo Molnar, Paul Mackerras, Arnaldo Carvalho de Melo, tytso, adilger.kernel

On Mon, 18 Nov 2013, One Thousand Gnomes wrote:

> On Mon, 18 Nov 2013 11:41:22 -0500 (EST)
> Vince Weaver <vincent.weaver@maine.edu> wrote:
>
> > On Mon, 18 Nov 2013, Peter Zijlstra wrote:
> >
> > > On Fri, Nov 15, 2013 at 01:04:23PM -0500, Vince Weaver wrote:
> > > >
> > > > (figured out the minicom issue).
> > > >
> > > > Anyway while trying to reproduce the last bug I instead got this with
> > > > the perf_fuzzer.
> > > >
> > > > Is it worth continuing to run and report these issues? I'm losing track
> > > > of all the open bugs.
> > >
> > > This looks like ext4. Not entirely sure how perf ties into this.
> >
> > It's believable the filesystem could have issues (it's a fuzzer machine,
> > so it's had 100+ unclean shutdowns on an SSD drive in the past few months)
> > but as far as I know there shouldn't have been any filesystem accesses
> > happening at all when the bug triggered.
>
> Obvious question - does it pass fsck currently. If it does then
> presumably it was sane at the time it went pop ?

# e2fsck -f /dev/sda1
e2fsck 1.42.8 (20-Jun-2013)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/sda1: 620972/3514368 files (0.5% non-contiguous), 9796212/14047744 blocks

so it looks clean now...

Vince

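For anyone collecting these fsck results across many fuzzer runs, the final e2fsck summary line can be turned into numbers. This is only a sketch: `parse_e2fsck_summary` is a hypothetical helper, and the pattern is an assumption fitted to the one summary line shown in this thread, so other e2fsprogs versions or locales may print something it does not match.

```python
import re

# Pattern fitted to the summary line quoted in this thread, e.g.
# "/dev/sda1: 620972/3514368 files (0.5% non-contiguous), 9796212/14047744 blocks"
SUMMARY_RE = re.compile(
    r"(?P<dev>\S+): (?P<files>\d+)/(?P<inodes>\d+) files "
    r"\((?P<noncontig>[\d.]+)% non-contiguous\), "
    r"(?P<used>\d+)/(?P<total>\d+) blocks"
)


def parse_e2fsck_summary(line):
    """Extract inode and block usage from an e2fsck summary line."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError("not an e2fsck summary line: %r" % line)
    files = int(match.group("files"))
    inodes = int(match.group("inodes"))
    used = int(match.group("used"))
    total = int(match.group("total"))
    return {
        "device": match.group("dev"),
        "inode_use_pct": 100.0 * files / inodes,
        "block_use_pct": 100.0 * used / total,
        "noncontig_pct": float(match.group("noncontig")),
    }


if __name__ == "__main__":
    line = ("/dev/sda1: 620972/3514368 files (0.5% non-contiguous), "
            "9796212/14047744 blocks")
    info = parse_e2fsck_summary(line)
    print("%s: %.2f%% inodes, %.2f%% blocks in use"
          % (info["device"], info["inode_use_pct"], info["block_use_pct"]))
```

The summary line only reports usage. Whether the check was clean is better read from the absence of error messages in the pass output, as above, or from the exit status of e2fsck.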
* Re: perf bug: bad page map
  2013-11-19 1:57       ` Vince Weaver
@ 2013-11-19 7:06         ` Ingo Molnar
  0 siblings, 0 replies; 8+ messages in thread
From: Ingo Molnar @ 2013-11-19 7:06 UTC
To: Vince Weaver
Cc: One Thousand Gnomes, Peter Zijlstra, LKML, Paul Mackerras, Arnaldo Carvalho de Melo, tytso, adilger.kernel

* Vince Weaver <vincent.weaver@maine.edu> wrote:

> On Mon, 18 Nov 2013, One Thousand Gnomes wrote:
>
> > On Mon, 18 Nov 2013 11:41:22 -0500 (EST)
> > Vince Weaver <vincent.weaver@maine.edu> wrote:
> >
> > > On Mon, 18 Nov 2013, Peter Zijlstra wrote:
> > >
> > > > On Fri, Nov 15, 2013 at 01:04:23PM -0500, Vince Weaver wrote:
> > > > >
> > > > > (figured out the minicom issue).
> > > > >
> > > > > Anyway while trying to reproduce the last bug I instead got this with
> > > > > the perf_fuzzer.
> > > > >
> > > > > Is it worth continuing to run and report these issues? I'm losing track
> > > > > of all the open bugs.
> > > >
> > > > This looks like ext4. Not entirely sure how perf ties into this.
> > >
> > > It's believable the filesystem could have issues (it's a fuzzer machine,
> > > so it's had 100+ unclean shutdowns on an SSD drive in the past few months)
> > > but as far as I know there shouldn't have been any filesystem accesses
> > > happening at all when the bug triggered.
> >
> > Obvious question - does it pass fsck currently. If it does then
> > presumably it was sane at the time it went pop ?
>
> # e2fsck -f /dev/sda1
> e2fsck 1.42.8 (20-Jun-2013)
> Pass 1: Checking inodes, blocks, and sizes
> Pass 2: Checking directory structure
> Pass 3: Checking directory connectivity
> Pass 4: Checking reference counts
> Pass 5: Checking group summary information
> /dev/sda1: 620972/3514368 files (0.5% non-contiguous), 9796212/14047744 blocks
>
> so it looks clean now...

Also, in no way should a corrupted filesystem be able to provoke
kernel crashes. So even if the filesystem had errors, this would
still be a kernel bug we need to fix.

Thanks,

	Ingo

end of thread, other threads:[~2013-11-19 7:06 UTC | newest]

Thread overview: 8+ messages
2013-11-15 18:04 perf bug: bad page map Vince Weaver
2013-11-18 15:17 ` Peter Zijlstra
2013-11-18 15:36   ` Ingo Molnar
2013-11-18 16:41   ` Vince Weaver
2013-11-18 17:13     ` Ingo Molnar
2013-11-18 23:05     ` One Thousand Gnomes
2013-11-19 1:57       ` Vince Weaver
2013-11-19 7:06         ` Ingo Molnar