Re: Possible race in copy of fpu->state in copy_process against the exeve'ing parent?

From: Ingo Molnar <mingo@kernel.org>
To: Jianyu Zhan <nasa4836@gmail.com>
Cc: mingo@redhat.com, "H. Peter Anvin" <hpa@zytor.com>,
	suresh.b.siddha@intel.com, x86@kernel.org,
	LKML <linux-kernel@vger.kernel.org>,
	Andy Lutomirski <luto@kernel.org>, Borislav Petkov <bp@alien8.de>,
	Thomas Gleixner <tglx@linutronix.de>,
	Oleg Nesterov <oleg@redhat.com>, Dave Hansen <dave@sr71.net>
Subject: Re: Possible race in copy of fpu->state in copy_process against the exeve'ing parent?
Date: Wed, 13 Apr 2016 08:09:43 +0200
Message-ID: <20160413060943.GA4705@gmail.com>
In-Reply-To: <CAHz2CGXb1AW5DaGd9=vzkLdP2X9fep36H913ChyJEtd6UUtzVA@mail.gmail.com>


* Jianyu Zhan <nasa4836@gmail.com> wrote:

> Hi,
> 
> I encountered a panic on a Linux 3.2 kernel on an x86_64 machine, and
> suspect it is a race condition.  I checked the current mainline and
> found that it appears to have been fixed unintentionally.
> 
> So I hope the x86/fpu maintainers can help verify this.  Thanks very much.
> 
> 
> The panic stack trace :
> 
>  #0 [ffff88529d33f990] try_crashdump at ffffffff8105b8ca
>  #1 [ffff88529d33f9a0] dump_on_panic at ffffffff8105b965
>  #2 [ffff88529d33fa60] notifier_call_chain at ffffffff8139f784
>  #3 [ffff88529d33fac0] atomic_notifier_call_chain at ffffffff8139f81d
>  #4 [ffff88529d33fad0] panic at ffffffff8139971c
>  #5 [ffff88529d33fb50] oops_end at ffffffff8139d34a
>  #6 [ffff88529d33fb80] no_context at ffffffff81021569
>  #7 [ffff88529d33fbd0] __bad_area_nosemaphore at ffffffff81021730
>  #8 [ffff88529d33fc20] bad_area at ffffffff810217ac
>  #9 [ffff88529d33fc50] do_page_fault at ffffffff8139f509
> #10 [ffff88529d33fd70] page_fault at ffffffff8139caef
>     [exception RIP: prepare_to_copy+35]   <------------------ PANIC !!!
>     RIP: ffffffff810013f4  RSP: ffff88529d33fe20  RFLAGS: 00010286
>     RAX: 00000000ffffffff  RBX: 0000000001200011  RCX: ffff884fe73f6320
>     RDX: 00000000ffffffff  RSI: 00007fff07d36bd0  RDI: 0000000000000000
>     RBP: ffff88529d33fe20   R8: 00007f5c4a209770   R9: 0000000000000000
>     R10: 00007f5c4a209770  R11: 0000000000000202  R12: 0000000000000000
>     R13: 0000000000000000  R14: ffff884fe73f6320  R15: 0000000000000001
>     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
> #11 [ffff88529d33fe28] copy_process at ffffffff81038211
> #12 [ffff88529d33fea8] do_fork at ffffffff810393ec
> #13 [ffff88529d33ff38] sys_clone at ffffffff81009118
> #14 [ffff88529d33ff48] stub_clone at ffffffff813a31d3
> 
> 
> crash7> dis -r prepare_to_copy+35
> 0xffffffff810013d1 <prepare_to_copy>:   push   %rbp
> 0xffffffff810013d2 <prepare_to_copy+1>: mov    %rsp,%rbp
> 0xffffffff810013d5 <prepare_to_copy+4>: nopl   0x0(%rax,%rax,1)
> 0xffffffff810013da <prepare_to_copy+9>: mov    %rdi,%rcx
> 0xffffffff810013dd <prepare_to_copy+12>:        cmpl   $0x0,0x4d8(%rdi)
> 0xffffffff810013e4 <prepare_to_copy+19>:        je     0xffffffff8100142e <prepare_to_copy+93>
> 0xffffffff810013e6 <prepare_to_copy+21>:        mov    0x4e0(%rdi),%rdi
> 0xffffffff810013ed <prepare_to_copy+28>:        xchg   %ax,%ax
> 0xffffffff810013ef <prepare_to_copy+30>:        or     $0xffffffff,%eax
> 0xffffffff810013f2 <prepare_to_copy+33>:        mov    %eax,%edx
> 0xffffffff810013f4 <prepare_to_copy+35>:        xsaveopt64 (%rdi)   <---- PANIC HERE
> 
> At the time of the panic, %rdi is 0x0000000000000000; it holds fpu->state.
> 
> 
> 
> So I suspect there is a possible race:
> 
> 
>    Parent:
> 
>        sys_execve
>          do_execve
>            do_execve_common
>              search_binary_handler
>                 load_elf_binary
>                   start_thread
>                     start_thread_common
>                        free_thread_xstate(current)
>                          fpu_free
>                             fpu->state = NULL
> 
> 
>     Child:
> 
>         sys_clone
>           do_fork
>              copy_process
>                 dup_task_struct
>                    prepare_to_copy
>                       unlazy_fpu
>                          __save_init_fpu
>                            fpu_save_init
>                              fpu_xsave(fpu)   <---- fpu->state is NULL, which
>                                                     causes a NULL dereference.
> 
> Scenario: the Parent is still in execve() and has just set fpu->state to
> NULL when a concurrent clone() forks a Child, in which fpu_xsave() runs
> while fpu->state is NULL.
> 
> The race window seems quite small, and I have checked the Parent's
> 'sum_exec_runtime' is 536920255(~0.53s).
> 
> I checked the mainline and found that commit 304bceda6a18 ("x86, fpu: use
> non-lazy fpu restore for processors supporting xsave") seems to have
> fixed this unintentionally?

So I'm not sure I understand the suggested race. Separate tasks have separate
fpu->state states, so a parallel execve() and clone() have no effect on each other.
There's no FPU state sharing.

Thanks,

	Ingo
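
The point about separate per-task fpu->state can be illustrated with a
small userspace sketch. This is only a model under stated assumptions, not
kernel code: struct task_model, struct fpu_model, XSTATE_SIZE and the three
helper functions are hypothetical names invented for the illustration. They
stand in loosely for a task, its fpu->state buffer, the free on the exec
path, and the save on the fork path.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical size of the save area; the real size depends on the CPU. */
#define XSTATE_SIZE 64

/* Hypothetical stand-in for the per-task FPU container. */
struct fpu_model {
	unsigned char *state;	/* models fpu->state */
};

/* Hypothetical stand-in for a task; each task embeds its own fpu_model. */
struct task_model {
	struct fpu_model fpu;
};

/* Allocate a private save area for one task. Returns 0 on success. */
int task_init(struct task_model *t)
{
	t->fpu.state = calloc(1, XSTATE_SIZE);
	return t->fpu.state ? 0 : -1;
}

/* Models the exec path freeing the state and setting the pointer to NULL. */
void task_free_xstate(struct task_model *t)
{
	free(t->fpu.state);
	t->fpu.state = NULL;
}

/*
 * Models the save on the fork path. It only touches the buffer of the task
 * it is given, and reports -1 instead of dereferencing a NULL pointer.
 */
int task_save_fpu(struct task_model *t)
{
	if (!t->fpu.state)
		return -1;
	memset(t->fpu.state, 0xff, XSTATE_SIZE);
	return 0;
}
```

In this model, calling task_free_xstate() on one task leaves every other
task's pointer untouched, so task_save_fpu() on a different task still
succeeds. A save can only see a NULL pointer if it is issued for the very
task whose state was freed.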

