From: Oleg Nesterov <oleg@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>,
	Richard Henderson <rth@twiddle.net>,
	Jakub Jelinek <jakub@redhat.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Ingo Molnar <mingo@kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [x86] BUG: unable to handle kernel paging request at 00740060
Date: Wed, 9 Oct 2013 16:17:37 +0200
Message-ID: <20131009141737.GA16168@redhat.com>
In-Reply-To: <20131009140734.GH3081@twins.programming.kicks-ass.net>

OK, thanks...

I didn't notice that Richard and Jakub were not cc'ed. Adding them now,
perhaps they can take a look.

On 10/09, Peter Zijlstra wrote:
>
> On Wed, Oct 09, 2013 at 02:43:10PM +0200, Oleg Nesterov wrote:
> > I'm afraid I am wrong, my asm skills are close to zero... but this
> > code looks wrong to me, and this can explain the oopses.
> >
> > > task_work_add:
> > > 	pushl	%ebp	#
> > > 	movl	%esp, %ebp	#,
> > > 	pushl	%edi	#
> > > 	pushl	%esi	#
> > > 	pushl	%ebx	#
> > > 	subl	$12, %esp	#,
> > > 	call	mcount
> > > 	movl	%eax, %edi	# task, task
> > > 	movl	%edx, -16(%ebp)	# work, %sfp
> > > 	movb	%cl, -21(%ebp)	# notify, %sfp
> > > 	.p2align 4,,15
> > > .L3:
> > > 	movl	904(%edi), %esi	# task_3(D)->task_works, head
> > > 	cmpl	$work_exited, %esi	#, head
> > > 	sete	%bl	#, D.14145
> > > 	andl	$255, %ebx	#, D.14145
> > > 	xorl	%ecx, %ecx	#
> > > 	movl	%ebx, %edx	# D.14145,
> > > 	movl	$______f.14042, %eax	#,
> > > 	call	ftrace_likely_update	#
> > > 	testl	%ebx, %ebx	# D.14145
> > > 	jne	.L4	#,
> > > 	movl	-16(%ebp), %edx	# %sfp,
> > > 	movl	%esi, (%edx)	# head, work_13(D)->next
> > > 	movl	%esi, %eax	# head, __ret
> > > #APP
> > > # 34 "/c/wfg/tip/kernel/task_work.c" 1
> > > 	cmpxchgl %edx,904(%edi)	#, *__ptr_16
> > > # 0 "" 2
> > > #NO_APP
> > > 	cmpl	%eax, %esi	# __ret, head
> > > 	jne	.L3	#,
> >
> > OK, we added the new work successfully, we should return 0. If we return
> > non-zero, fput() (the likely caller) assumes that it should use the workqueues
> > to close/free this file. Then later task_work_run() will do __fput() again.
> >
> > > 	cmpb	$0, -21(%ebp)	#, %sfp
> > > 	je	.L5	#,
> > > 	movl	4(%edi), %eax	# task_3(D)->stack, task_3(D)->stack
> > > #APP
> > > # 208 "/c/wfg/tip/arch/x86/include/asm/bitops.h" 1
> > > 	bts $1, 8(%eax); jc .L2	#, MEM[(volatile long unsigned int *)D.14203_29],
> >
> > This is set_notify_resume(). Probably !CONFIG_SMP (I do not see kick_process).
> >
> > > # 0 "" 2
> > > #NO_APP
> > > .L5:
> > > 	movl	$0, -20(%ebp)	#, %sfp
> > > .L2:
> > > 	movl	-20(%ebp), %eax	# %sfp,
> >
> > This is what we are going to return. But note that -20(%ebp) was not
> > initialized if TIF_NOTIFY_RESUME was already set, "jc .L2" skips .L5
> > above. IOW, in this case we seem to return a random value from stack.
>
> I think you're quite right, and I can confirm I can reproduce this with
> gcc-4.8.1 and Wu's .config:
>
>         .p2align 4,,15
>         .globl  task_work_add
>         .type   task_work_add, @function
> task_work_add:
>         pushl   %ebp    #
>         movl    %esp, %ebp      #,
>         pushl   %edi    #
>         pushl   %esi    #
>         pushl   %ebx    #
>         subl    $12, %esp       #,
>         call    mcount
>         movl    %eax, %esi      # task, task
>         movl    %edx, %edi      # work, work
>         movl    %ecx, -24(%ebp) # notify, %sfp
>         jmp     .L4     #
>         .p2align 4,,15
> .L9:
>         movl    %ebx, (%edi)    # __old, work_15(D)->next
>         movl    %ebx, %eax      # __old, __ret
> #APP
> # 34 "/usr/src/linux-2.6/kernel/task_work.c" 1
>         cmpxchgl %edi,904(%esi) # work, *__ptr_17
> # 0 "" 2
> #NO_APP
>         cmpl    %eax, %ebx      # __ret, __old
>         je      .L8     #,
> .L4:
>         movl    904(%esi), %ebx # task_7(D)->task_works, __old
>         cmpl    $work_exited, %ebx      #, __old
>         sete    -13(%ebp)       #, %sfp
>         xorl    %edx, %edx      # ______r
>         movb    -13(%ebp), %dl  # %sfp, ______r
>         xorl    %ecx, %ecx      #
>         movl    $______f.14204, %eax    #,
>         call    ftrace_likely_update    #
>         cmpb    $0, -13(%ebp)   #, %sfp
>         je      .L9     #,
>         movl    $-3, -20(%ebp)  #, %sfp
> .L2:
>         movl    -20(%ebp), %eax # %sfp,
>         addl    $12, %esp       #,
>         popl    %ebx    #
>         popl    %esi    #
>         popl    %edi    #
>         popl    %ebp    #
>         ret
>         .p2align 4,,15
> .L8:
>         cmpb    $0, -24(%ebp)   #, %sfp
>         je      .L6     #,
>         movl    4(%esi), %eax   # task_7(D)->stack, task_7(D)->stack
> #APP
> # 208 "/usr/src/linux-2.6/arch/x86/include/asm/bitops.h" 1
>         bts $1, 8(%eax); jc .L2 #, MEM[(volatile long unsigned int *)_23],
> # 0 "" 2
> #NO_APP
> .L6:
>         movl    $0, -20(%ebp)   #, %sfp
>         movl    -20(%ebp), %eax # %sfp,
>         addl    $12, %esp       #,
>         popl    %ebx    #
>         popl    %esi    #
>         popl    %edi    #
>         popl    %ebp    #
>         ret
>         .size   task_work_add, .-task_work_add
>
> Once I force a x86_64 build using the 'same' config it goes away and
> generates 'sensible' code again (although I don't see why L9 isn't
> merged with L2):
>
>         .p2align 4,,15
>         .globl  task_work_add
>         .type   task_work_add, @function
> task_work_add:
>         call    __fentry__
>         pushq   %rbp    #
>         movq    %rsp, %rbp      #,
>         pushq   %r15    #
>         pushq   %r14    #
>         movl    %edx, %r14d     # notify, notify
>         pushq   %r13    #
>         movq    %rsi, %r13      # work, work
>         pushq   %r12    #
>         movq    %rdi, %r12      # task, task
>         pushq   %rbx    #
>         jmp     .L4     #
>         .p2align 4,,10
>         .p2align 3
> .L11:
>         movq    %rbx, 0(%r13)   # __old, work_17(D)->next
>         movq    %rbx, %rax      # __old, __ret
> #APP
> # 34 "/usr/src/linux-2.6/kernel/task_work.c" 1
>         cmpxchgq %r13,1312(%r12)        # work, *__ptr_19
> # 0 "" 2
> #NO_APP
>         cmpq    %rax, %rbx      # __ret, __old
>         je      .L10    #,
> .L4:
>         movq    1312(%r12), %rbx        # task_7(D)->task_works, __old
>         movq    $______f.14855, %rdi    #,
>         cmpq    $work_exited, %rbx      #, __old
>         sete    %r15b   #, tmp72
>         xorl    %edx, %edx      #
>         movzbl  %r15b, %esi     # tmp72, ______r
>         call    ftrace_likely_update    #
>         testb   %r15b, %r15b    # tmp72
>         je      .L11    #,
>         movl    $-3, %eax       #, D.15034
> .L2:
>         popq    %rbx    #
>         popq    %r12    #
>         popq    %r13    #
>         popq    %r14    #
>         popq    %r15    #
>         popq    %rbp    #
>         ret
>         .p2align 4,,10
>         .p2align 3
> .L10:
>         xorl    %eax, %eax      # D.15034
>         testb   %r14b, %r14b    # notify
>         je      .L2     #,
>         movq    8(%r12), %rdx   # task_7(D)->stack, task_7(D)->stack
> #APP
> # 208 "/usr/src/linux-2.6/arch/x86/include/asm/bitops.h" 1
>         bts $1, 16(%rdx); jc .L9        #, MEM[(volatile long unsigned int *)_25],
> # 0 "" 2
> #NO_APP
> .L9:
>         popq    %rbx    #
>         popq    %r12    #
>         popq    %r13    #
>         popq    %r14    #
>         popq    %r15    #
>         popq    %rbp    #
>         ret
>         .size   task_work_add, .-task_work_add
>
>
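For those just added to the cc, here is a rough user-space model of the
control flow discussed above. It is only a sketch, not the kernel code: the
types, the function names and the "slot" parameter are made up for
illustration, and C11 atomics stand in for cmpxchg and the bts instruction.
add_intended() is what the C source seems to ask for; add_as_compiled()
mirrors what the 32-bit listing appears to do, with "slot" playing the role
of -20(%ebp).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins, not the kernel definitions. */
struct cb_head { struct cb_head *next; };

struct task_model {
	_Atomic(struct cb_head *) works;  /* plays the role of task->task_works */
	atomic_ulong flags;               /* plays the role of the thread flags */
};

#define NOTIFY_BIT   1     /* the "bts $1" in the listing */
#define ERR_EXITED (-3)    /* the "$-3" in the listing */

static struct cb_head work_exited_model;  /* sentinel, like work_exited */

/* Set the notify bit; return true if it was already set (the "jc" case). */
static bool notify_bit_was_set(struct task_model *t)
{
	unsigned long mask = 1UL << NOTIFY_BIT;

	return (atomic_fetch_or(&t->flags, mask) & mask) != 0;
}

/* Push work onto the list; false if the list is already marked exited. */
static bool push_work(struct task_model *t, struct cb_head *work)
{
	struct cb_head *head = atomic_load(&t->works);

	assert(work != NULL);
	do {
		if (head == &work_exited_model)
			return false;
		work->next = head;
		/* on failure, head is reloaded with the current list head */
	} while (!atomic_compare_exchange_weak(&t->works, &head, work));
	return true;
}

/* What the C source asks for: 0 whenever the work was queued. */
static int add_intended(struct task_model *t, struct cb_head *work, bool notify)
{
	if (!push_work(t, work))
		return ERR_EXITED;
	if (notify)
		(void)notify_bit_was_set(t);
	return 0;
}

/*
 * What the 32-bit listing appears to do. "slot" models -20(%ebp); the
 * caller passes in whatever stale value happens to be there.
 */
static int add_as_compiled(struct task_model *t, struct cb_head *work,
			   bool notify, int slot)
{
	if (!push_work(t, work)) {
		slot = ERR_EXITED;
		return slot;
	}
	if (notify && notify_bit_was_set(t))
		return slot;	/* "jc .L2": slot was never written */
	slot = 0;		/* the .L5 / .L6 store */
	return slot;
}
```

With the notify bit already set, add_intended() still returns 0, while
add_as_compiled() hands back whatever was passed in as "slot", even though
the work was queued in both cases. That is the "random value from stack"
case, and a non-zero return is what makes the caller think the add failed.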


