Re: mm: shm: hang in shmem_fallocate

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Sasha Levin <sasha.levin@oracle.com>
To: Hugh Dickins <hughd@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>,
	Konstantin Khlebnikov <koct9i@gmail.com>,
	Dave Jones <davej@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: mm: shm: hang in shmem_fallocate
Date: Sat, 28 Jun 2014 17:41:10 -0400	[thread overview]
Message-ID: <53AF3676.8080509@oracle.com> (raw)
In-Reply-To: <alpine.LSU.2.11.1406271043270.28744@eggly.anvils>

On 06/27/2014 02:03 PM, Hugh Dickins wrote:
> On Fri, 27 Jun 2014, Sasha Levin wrote:
>> On 06/27/2014 01:59 AM, Hugh Dickins wrote:
>>>>> First, this:
>>>>>
>>>>> [  681.267487] BUG: unable to handle kernel paging request at ffffea0003480048
>>>>> [  681.268621] IP: zap_pte_range (mm/memory.c:1132)
>>> Weird, I don't think we've seen anything like that before, have we?
>>> I'm pretty sure it's not a consequence of my "index = min(index, end)",
>>> but what it portends I don't know.  Please confirm mm/memory.c:1132 -
>>> that's the "if (PageAnon(page))" line, isn't it?  Which indeed matches
>>> the code below.  So accessing page->mapping is causing an oops...
>>
>> Right, that's the correct line.
>>
>> At this point I'm pretty sure that it's somehow related to that one line
>> patch since it reproduced fairly quickly after applying it, and when I
>> removed it I didn't see it happening again during the overnight fuzzing.
> 
> Oh, I assumed it was a one-off: you're saying that you saw it more than
> once with the min(index, end) patch in?  But not since removing it (did
> you replace that by the newer patch? or by the older? or by nothing?).

It reproduced exactly twice, can't say it happens too often.

What I did was revert your original fix for the issue and apply the one-liner.

I've spent most of yesterday chasing a different bug with a "clean" -next
tree (without the revert and the one-line patch) and didn't see any mm/
issues.

However, about 2 hours after doing the revert and applying the one-line
patch I've encountered the following:

[ 3686.797859] BUG: unable to handle kernel paging request at ffff88028a488f98
[ 3686.805732] IP: do_read_fault.isra.40 (mm/memory.c:2856 mm/memory.c:2889)
[ 3686.805732] PGD 12b82067 PUD 704d49067 PMD 704cf6067 PTE 800000028a488060
[ 3686.805732] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[ 3686.815852] Dumping ftrace buffer:
[ 3686.815852]    (ftrace buffer empty)
[ 3686.815852] Modules linked in:
[ 3686.815852] CPU: 10 PID: 8890 Comm: modprobe Not tainted 3.16.0-rc2-next-20140627-sasha-00024-ga284b83-dirty #753
[ 3686.815852] task: ffff8801d1c20000 ti: ffff8801c6a08000 task.ti: ffff8801c6a08000
[ 3686.826134] RIP: do_read_fault.isra.40 (mm/memory.c:2856 mm/memory.c:2889)
[ 3686.826134] RSP: 0000:ffff8801c6a0bc78  EFLAGS: 00010297
[ 3686.826134] RAX: 0000000000000000 RBX: ffff880288531200 RCX: 000000000000001f
[ 3686.826134] RDX: 0000000000000014 RSI: 00007f22949f3000 RDI: ffff88028a488f98
[ 3686.826134] RBP: ffff8801c6a0bd18 R08: 00007f2294a13000 R09: 000000000000000c
[ 3686.826134] R10: 0000000000000000 R11: 00000000000000a8 R12: 00007f2294a07c50
[ 3686.826134] R13: ffff880279fec4b0 R14: 00007f22949f3000 R15: ffff88028ebbc528
[ 3686.826134] FS:  0000000000000000(0000) GS:ffff880292e00000(0000) knlGS:0000000000000000
[ 3686.826134] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 3686.826134] CR2: ffff88028a488f98 CR3: 000000026b766000 CR4: 00000000000006a0
[ 3686.826134] Stack:
[ 3686.826134]  ffff8801c6a0bc98 0000000000000001 ffff8802000000a8 0000000000000014
[ 3686.826134]  ffff88028a489038 0000000000000000 ffff88026d40d000 ffff88028eafaee0
[ 3686.826134]  ffff8801c6a0bcd8 ffffffff8e572715 ffffea000a292240 000000000028a489
[ 3686.826134] Call Trace:
[ 3686.826134] ? _raw_spin_unlock (./arch/x86/include/asm/preempt.h:98 include/linux/spinlock_api_smp.h:152 kernel/locking/spinlock.c:183)
[ 3686.826134] ? __pte_alloc (mm/memory.c:598 mm/memory.c:593)
[ 3686.826134] __handle_mm_fault (mm/memory.c:3037 mm/memory.c:3198 mm/memory.c:3322)
[ 3686.826134] handle_mm_fault (include/linux/memcontrol.h:124 mm/memory.c:3348)
[ 3686.826134] ? __do_page_fault (arch/x86/mm/fault.c:1163)
[ 3686.826134] __do_page_fault (arch/x86/mm/fault.c:1230)
[ 3686.826134] ? vtime_account_user (kernel/sched/cputime.c:687)
[ 3686.826134] ? get_parent_ip (kernel/sched/core.c:2550)
[ 3686.826134] ? context_tracking_user_exit (include/linux/vtime.h:89 include/linux/jump_label.h:115 include/trace/events/context_tracking.h:47 kernel/context_tracking.c:180)
[ 3686.826134] ? preempt_count_sub (kernel/sched/core.c:2606)
[ 3686.826134] ? context_tracking_user_exit (kernel/context_tracking.c:184)
[ 3686.826134] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63)
[ 3686.826134] ? trace_hardirqs_off_caller (kernel/locking/lockdep.c:2638 (discriminator 2))
[ 3686.826134] trace_do_page_fault (arch/x86/mm/fault.c:1313 include/linux/jump_label.h:115 include/linux/context_tracking_state.h:27 include/linux/context_tracking.h:45 arch/x86/mm/fault.c:1314)
[ 3686.826134] do_async_page_fault (arch/x86/kernel/kvm.c:264)
[ 3686.826134] async_page_fault (arch/x86/kernel/entry_64.S:1322)
[ 3686.826134] Code: 89 c0 4c 8b 43 08 48 8d 4c 08 ff 49 01 c1 49 39 c9 4c 0f 47 c9 4c 89 c1 4c 29 f1 48 c1 e9 0c 49 8d 4c 0a ff 49 39 c9 4c 0f 47 c9 <48> 83 3f 00 74 3c 48 83 c0 01 4c 39 c8 77 74 48 81 c6 00 10 00
All code
========
   0:	89 c0                	mov    %eax,%eax
   2:	4c 8b 43 08          	mov    0x8(%rbx),%r8
   6:	48 8d 4c 08 ff       	lea    -0x1(%rax,%rcx,1),%rcx
   b:	49 01 c1             	add    %rax,%r9
   e:	49 39 c9             	cmp    %rcx,%r9
  11:	4c 0f 47 c9          	cmova  %rcx,%r9
  15:	4c 89 c1             	mov    %r8,%rcx
  18:	4c 29 f1             	sub    %r14,%rcx
  1b:	48 c1 e9 0c          	shr    $0xc,%rcx
  1f:	49 8d 4c 0a ff       	lea    -0x1(%r10,%rcx,1),%rcx
  24:	49 39 c9             	cmp    %rcx,%r9
  27:	4c 0f 47 c9          	cmova  %rcx,%r9
  2b:*	48 83 3f 00          	cmpq   $0x0,(%rdi)		<-- trapping instruction
  2f:	74 3c                	je     0x6d
  31:	48 83 c0 01          	add    $0x1,%rax
  35:	4c 39 c8             	cmp    %r9,%rax
  38:	77 74                	ja     0xae
  3a:	48 81 c6 00 10 00 00 	add    $0x1000,%rsi

Code starting with the faulting instruction
===========================================
   0:	48 83 3f 00          	cmpq   $0x0,(%rdi)
   4:	74 3c                	je     0x42
   6:	48 83 c0 01          	add    $0x1,%rax
   a:	4c 39 c8             	cmp    %r9,%rax
   d:	77 74                	ja     0x83
   f:	48 81 c6 00 10 00 00 	add    $0x1000,%rsi
[ 3686.826134] RIP do_read_fault.isra.40 (mm/memory.c:2856 mm/memory.c:2889)
[ 3686.826134]  RSP <ffff8801c6a0bc78>
[ 3686.826134] CR2: ffff88028a488f98

Association is not causation but this is pretty suspicious...

> I want to exclaim "That makes no sense!", but bugs don't make sense
> anyway.  It's going to be a challenge to work out a connection though.
> I think I want to ask for more attempts to reproduce, with and without
> the min(index, end) patch (if you have enough time - there must be a
> limit to the amount of time you can give me on this).
> 
> I rather hoped that the oops on PageAnon might shed light from another
> direction on the outstanding page_mapped bug: both seem like page table
> corruption of some kind (though I've not seen a plausible path to either).
> 
> And regarding the page_mapped bug: we've heard nothing since Dave
> Hansen suggested a VM_BUG_ON_PAGE for that - has it gone away now?

Seems like it. I'm carrying Dave's patch still, but haven't seen it
triggering.


Thanks,
Sasha

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

next prev parent reply	other threads:[~2014-06-28 21:47 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-12-16  4:01 mm: shm: hang in shmem_fallocate Sasha Levin
2014-02-08 19:46 ` Sasha Levin
2014-02-09  3:25   ` Hugh Dickins
2014-02-10  1:41     ` Sasha Levin
2014-06-12 20:38       ` Sasha Levin
2014-06-16  2:29         ` Hugh Dickins
2014-06-17 20:32           ` Sasha Levin
2014-06-24 16:31           ` Vlastimil Babka
2014-06-25 22:36             ` Hugh Dickins
2014-06-26  9:14               ` Vlastimil Babka
2014-06-26 15:19                 ` Vlastimil Babka
2014-06-27  5:36                 ` Hugh Dickins
2014-07-01 11:52                   ` Vlastimil Babka
2014-07-02  1:49                     ` Hugh Dickins
2014-07-09 21:59                   ` Johannes Weiner
2014-07-09 22:48                     ` Hugh Dickins
2014-07-10  0:51                       ` Hugh Dickins
2014-06-26 15:11               ` Sasha Levin
2014-06-27  5:59                 ` Hugh Dickins
2014-06-27 14:50                   ` Sasha Levin
2014-06-27 18:03                     ` Hugh Dickins
2014-06-28 21:41                       ` Sasha Levin [this message]
2014-07-01 22:37                       ` mm: shmem: hang in shmem_fault (WAS: mm: shm: hang in shmem_fallocate) Sasha Levin
2014-07-02  0:17                         ` Hugh Dickins

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=53AF3676.8080509@oracle.com \
    --to=sasha.levin@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=davej@redhat.com \
    --cc=hughd@google.com \
    --cc=koct9i@gmail.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).