* [PATCH, 3.18] sleeping function called from invalid context @ 2014-12-11 0:25 Daniel J Blueman 2014-12-11 0:37 ` Andy Lutomirski 0 siblings, 1 reply; 8+ messages in thread From: Daniel J Blueman @ 2014-12-11 0:25 UTC (permalink / raw) To: Linux Kernel, Linus Torvalds, H. Peter Anvin, Suresh Siddha, Andy Lutomirski With 3.18, I was seeing a significant number of allocation warnings from load_elf_binary call paths [1]. Allocating FPU state atomically [2] does prevent potentially sleeping while atomic, but probably isn't the correct fix. Anyone familiar with this area? -- [1] BUG: sleeping function called from invalid context at mm/slub.c:1240 in_atomic(): 1, irqs_disabled(): 0, pid: 29016, name: dmesg INFO: lockdep is turned off. Preemption disabled at:[<ffffffff8114625f>] handle_mm_fault+0x5ff/0x8a0 CPU: 122 PID: 29016 Comm: dmesg Tainted: G W 3.18.0-test #6 Hardware name: Supermicro AS -1042G-LTF/H8QGL, BIOS DS3.5a 11/13/2014 ffffffff8203b81b ffff886fdf62f888 ffffffff81b942e8 0000000000000007 0000000000000000 ffff886fdf62f8b8 ffffffff8109ac4c 0000000000000000 0000000000000010 ffff8807df84bc00 00000000000000d0 ffff886fdf62f908 Call Trace: [<ffffffff81b942e8>] dump_stack+0x4f/0x7c [<ffffffff8109ac4c>] __might_sleep+0x164/0x250 [<ffffffff8116372b>] kmem_cache_alloc+0xeb/0x138 [<ffffffff8100e0e1>] init_fpu+0x71/0xb0 [<ffffffff81003966>] math_state_restore+0xce/0x208 [<ffffffff8100437b>] do_device_not_available+0x2b/0x60 [<ffffffff81ba70f5>] device_not_available+0x15/0x20 [<ffffffff81487102>] ? copy_page+0x12/0x33 [<ffffffff8109edb9>] ? get_parent_ip+0x11/0x58 [<ffffffff8109ee55>] ? preempt_count_add+0x55/0xb0 [<ffffffff811437a1>] ? do_cow_fault+0xe9/0x258 [<ffffffff8114625f>] handle_mm_fault+0x5ff/0x8a0 [<ffffffff81032218>] ? __do_page_fault+0xc8/0x498 [<ffffffff811481b9>] ? vma_gap_callbacks_rotate+0x19/0x20 [<ffffffff8103227d>] __do_page_fault+0x12d/0x498 [<ffffffff810b682e>] ? up_write+0x1e/0x48 [<ffffffff81148d18>] ? vma_link+0x80/0xc0 [<ffffffff81149cb2>] ? vma_set_page_prot+0x3a/0x60 [<ffffffff8114ae6e>] ? mmap_region+0x1be/0x5e0 [<ffffffff8103262e>] do_page_fault+0x1e/0x70 [<ffffffff81ba737f>] page_fault+0x1f/0x30 [<ffffffff81487f3e>] ? __clear_user+0x2e/0x50 [<ffffffff81487f22>] ? __clear_user+0x12/0x50 [<ffffffff81487f8a>] clear_user+0x2a/0x30 [<ffffffff811ca1e1>] padzero+0x21/0x30 [<ffffffff811cc41f>] load_elf_binary+0x8cf/0xdd0 [<ffffffff8117d18f>] search_binary_handler+0x7f/0x1f8 [<ffffffff8117ec06>] do_execve_common.isra.34+0x616/0x790 [<ffffffff8117eb62>] ? do_execve_common.isra.34+0x572/0x790 [<ffffffff8117ed93>] do_execve+0x13/0x18 [<ffffffff8117f070>] SyS_execve+0x20/0x30 [<ffffffff81ba6159>] stub_execve+0x69/0xa0 -- [2] diff --git a/arch/x86/include/asm/fpu-internal.h b/arch/x86/include/asm/fpu-internal.h index e97622f..57029ad 100644 --- a/arch/x86/include/asm/fpu-internal.h +++ b/arch/x86/include/asm/fpu-internal.h @@ -574,7 +574,7 @@ static inline int fpu_alloc(struct fpu *fpu) { if (fpu_allocated(fpu)) return 0; - fpu->state = kmem_cache_alloc(task_xstate_cachep, GFP_KERNEL); + fpu->state = kmem_cache_alloc(task_xstate_cachep, GFP_ATOMIC); if (!fpu->state) return -ENOMEM; WARN_ON((unsigned long)fpu->state & 15); -- Daniel J Blueman ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH, 3.18] sleeping function called from invalid context 2014-12-11 0:25 [PATCH, 3.18] sleeping function called from invalid context Daniel J Blueman @ 2014-12-11 0:37 ` Andy Lutomirski 2014-12-11 0:46 ` Daniel J Blueman 0 siblings, 1 reply; 8+ messages in thread From: Andy Lutomirski @ 2014-12-11 0:37 UTC (permalink / raw) To: Daniel J Blueman, Rik van Riel Cc: Linux Kernel, Linus Torvalds, H. Peter Anvin, Suresh Siddha Adding Rik, who is probably the most recent person to have looked at the disaster fondly known as "FPU". On Wed, Dec 10, 2014 at 4:25 PM, Daniel J Blueman <daniel@quora.org> wrote: > With 3.18, I was seeing a significant number of allocation warnings > from load_elf_binary call paths [1]. > > Allocating FPU state atomically [2] does prevent potentially sleeping > while atomic, but probably isn't the correct fix. Anyone familiar with > this area? > > -- [1] > > BUG: sleeping function called from invalid context at mm/slub.c:1240 > in_atomic(): 1, irqs_disabled(): 0, pid: 29016, name: dmesg > INFO: lockdep is turned off. > Preemption disabled at:[<ffffffff8114625f>] handle_mm_fault+0x5ff/0x8a0 > > CPU: 122 PID: 29016 Comm: dmesg Tainted: G W 3.18.0-test #6 > Hardware name: Supermicro AS -1042G-LTF/H8QGL, BIOS DS3.5a 11/13/2014 > ffffffff8203b81b ffff886fdf62f888 ffffffff81b942e8 0000000000000007 > 0000000000000000 ffff886fdf62f8b8 ffffffff8109ac4c 0000000000000000 > 0000000000000010 ffff8807df84bc00 00000000000000d0 ffff886fdf62f908 > Call Trace: > [<ffffffff81b942e8>] dump_stack+0x4f/0x7c > [<ffffffff8109ac4c>] __might_sleep+0x164/0x250 > [<ffffffff8116372b>] kmem_cache_alloc+0xeb/0x138 > [<ffffffff8100e0e1>] init_fpu+0x71/0xb0 > [<ffffffff81003966>] math_state_restore+0xce/0x208 > [<ffffffff8100437b>] do_device_not_available+0x2b/0x60 > [<ffffffff81ba70f5>] device_not_available+0x15/0x20 > [<ffffffff81487102>] ? copy_page+0x12/0x33 wtf? I don't see any fpu state being accessed in copy_page. Can you disassemble "[<ffffffff81487102>] ? copy_page+0x12/0x33"? --Andy > [<ffffffff8109edb9>] ? get_parent_ip+0x11/0x58 > [<ffffffff8109ee55>] ? preempt_count_add+0x55/0xb0 > [<ffffffff811437a1>] ? do_cow_fault+0xe9/0x258 > [<ffffffff8114625f>] handle_mm_fault+0x5ff/0x8a0 > [<ffffffff81032218>] ? __do_page_fault+0xc8/0x498 > [<ffffffff811481b9>] ? vma_gap_callbacks_rotate+0x19/0x20 > [<ffffffff8103227d>] __do_page_fault+0x12d/0x498 > [<ffffffff810b682e>] ? up_write+0x1e/0x48 > [<ffffffff81148d18>] ? vma_link+0x80/0xc0 > [<ffffffff81149cb2>] ? vma_set_page_prot+0x3a/0x60 > [<ffffffff8114ae6e>] ? mmap_region+0x1be/0x5e0 > [<ffffffff8103262e>] do_page_fault+0x1e/0x70 > [<ffffffff81ba737f>] page_fault+0x1f/0x30 > [<ffffffff81487f3e>] ? __clear_user+0x2e/0x50 > [<ffffffff81487f22>] ? __clear_user+0x12/0x50 > [<ffffffff81487f8a>] clear_user+0x2a/0x30 > [<ffffffff811ca1e1>] padzero+0x21/0x30 > [<ffffffff811cc41f>] load_elf_binary+0x8cf/0xdd0 > [<ffffffff8117d18f>] search_binary_handler+0x7f/0x1f8 > [<ffffffff8117ec06>] do_execve_common.isra.34+0x616/0x790 > [<ffffffff8117eb62>] ? do_execve_common.isra.34+0x572/0x790 > [<ffffffff8117ed93>] do_execve+0x13/0x18 > [<ffffffff8117f070>] SyS_execve+0x20/0x30 > [<ffffffff81ba6159>] stub_execve+0x69/0xa0 > > -- [2] > > diff --git a/arch/x86/include/asm/fpu-internal.h > b/arch/x86/include/asm/fpu-internal.h > index e97622f..57029ad 100644 > --- a/arch/x86/include/asm/fpu-internal.h > +++ b/arch/x86/include/asm/fpu-internal.h > @@ -574,7 +574,7 @@ static inline int fpu_alloc(struct fpu *fpu) > { > if (fpu_allocated(fpu)) > return 0; > - fpu->state = kmem_cache_alloc(task_xstate_cachep, GFP_KERNEL); > + fpu->state = kmem_cache_alloc(task_xstate_cachep, GFP_ATOMIC); > if (!fpu->state) > return -ENOMEM; > WARN_ON((unsigned long)fpu->state & 15); > -- > Daniel J Blueman -- Andy Lutomirski AMA Capital Management, LLC ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH, 3.18] sleeping function called from invalid context 2014-12-11 0:37 ` Andy Lutomirski @ 2014-12-11 0:46 ` Daniel J Blueman 2014-12-11 0:49 ` Rik van Riel 0 siblings, 1 reply; 8+ messages in thread From: Daniel J Blueman @ 2014-12-11 0:46 UTC (permalink / raw) To: Andy Lutomirski Cc: Rik van Riel, Linux Kernel, Linus Torvalds, H. Peter Anvin Gah. I had some non-temporal copy changes in the wrong tree. I'll check with a definitely clean tree and follow up if it still occurs. On 10 December 2014 at 16:37, Andy Lutomirski <luto@amacapital.net> wrote: > Adding Rik, who is probably the most recent person to have looked at > the disaster fondly known as "FPU". > > On Wed, Dec 10, 2014 at 4:25 PM, Daniel J Blueman <daniel@quora.org> wrote: >> With 3.18, I was seeing a significant number of allocation warnings >> from load_elf_binary call paths [1]. >> >> Allocating FPU state atomically [2] does prevent potentially sleeping >> while atomic, but probably isn't the correct fix. Anyone familiar with >> this area? >> >> -- [1] >> >> BUG: sleeping function called from invalid context at mm/slub.c:1240 >> in_atomic(): 1, irqs_disabled(): 0, pid: 29016, name: dmesg >> INFO: lockdep is turned off. >> Preemption disabled at:[<ffffffff8114625f>] handle_mm_fault+0x5ff/0x8a0 >> >> CPU: 122 PID: 29016 Comm: dmesg Tainted: G W 3.18.0-test #6 >> Hardware name: Supermicro AS -1042G-LTF/H8QGL, BIOS DS3.5a 11/13/2014 >> ffffffff8203b81b ffff886fdf62f888 ffffffff81b942e8 0000000000000007 >> 0000000000000000 ffff886fdf62f8b8 ffffffff8109ac4c 0000000000000000 >> 0000000000000010 ffff8807df84bc00 00000000000000d0 ffff886fdf62f908 >> Call Trace: >> [<ffffffff81b942e8>] dump_stack+0x4f/0x7c >> [<ffffffff8109ac4c>] __might_sleep+0x164/0x250 >> [<ffffffff8116372b>] kmem_cache_alloc+0xeb/0x138 >> [<ffffffff8100e0e1>] init_fpu+0x71/0xb0 >> [<ffffffff81003966>] math_state_restore+0xce/0x208 >> [<ffffffff8100437b>] do_device_not_available+0x2b/0x60 >> [<ffffffff81ba70f5>] device_not_available+0x15/0x20 >> [<ffffffff81487102>] ? copy_page+0x12/0x33 > > wtf? I don't see any fpu state being accessed in copy_page. > > Can you disassemble "[<ffffffff81487102>] ? copy_page+0x12/0x33"? > > --Andy > >> [<ffffffff8109edb9>] ? get_parent_ip+0x11/0x58 >> [<ffffffff8109ee55>] ? preempt_count_add+0x55/0xb0 >> [<ffffffff811437a1>] ? do_cow_fault+0xe9/0x258 >> [<ffffffff8114625f>] handle_mm_fault+0x5ff/0x8a0 >> [<ffffffff81032218>] ? __do_page_fault+0xc8/0x498 >> [<ffffffff811481b9>] ? vma_gap_callbacks_rotate+0x19/0x20 >> [<ffffffff8103227d>] __do_page_fault+0x12d/0x498 >> [<ffffffff810b682e>] ? up_write+0x1e/0x48 >> [<ffffffff81148d18>] ? vma_link+0x80/0xc0 >> [<ffffffff81149cb2>] ? vma_set_page_prot+0x3a/0x60 >> [<ffffffff8114ae6e>] ? mmap_region+0x1be/0x5e0 >> [<ffffffff8103262e>] do_page_fault+0x1e/0x70 >> [<ffffffff81ba737f>] page_fault+0x1f/0x30 >> [<ffffffff81487f3e>] ? __clear_user+0x2e/0x50 >> [<ffffffff81487f22>] ? __clear_user+0x12/0x50 >> [<ffffffff81487f8a>] clear_user+0x2a/0x30 >> [<ffffffff811ca1e1>] padzero+0x21/0x30 >> [<ffffffff811cc41f>] load_elf_binary+0x8cf/0xdd0 >> [<ffffffff8117d18f>] search_binary_handler+0x7f/0x1f8 >> [<ffffffff8117ec06>] do_execve_common.isra.34+0x616/0x790 >> [<ffffffff8117eb62>] ? do_execve_common.isra.34+0x572/0x790 >> [<ffffffff8117ed93>] do_execve+0x13/0x18 >> [<ffffffff8117f070>] SyS_execve+0x20/0x30 >> [<ffffffff81ba6159>] stub_execve+0x69/0xa0 >> >> -- [2] >> >> diff --git a/arch/x86/include/asm/fpu-internal.h >> b/arch/x86/include/asm/fpu-internal.h >> index e97622f..57029ad 100644 >> --- a/arch/x86/include/asm/fpu-internal.h >> +++ b/arch/x86/include/asm/fpu-internal.h >> @@ -574,7 +574,7 @@ static inline int fpu_alloc(struct fpu *fpu) >> { >> if (fpu_allocated(fpu)) >> return 0; >> - fpu->state = kmem_cache_alloc(task_xstate_cachep, GFP_KERNEL); >> + fpu->state = kmem_cache_alloc(task_xstate_cachep, GFP_ATOMIC); >> if (!fpu->state) >> return -ENOMEM; >> WARN_ON((unsigned long)fpu->state & 15); >> -- >> Daniel J Blueman > > > > -- > Andy Lutomirski > AMA Capital Management, LLC -- Daniel J Blueman ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH, 3.18] sleeping function called from invalid context 2014-12-11 0:46 ` Daniel J Blueman @ 2014-12-11 0:49 ` Rik van Riel 2014-12-11 0:51 ` Andy Lutomirski 0 siblings, 1 reply; 8+ messages in thread From: Rik van Riel @ 2014-12-11 0:49 UTC (permalink / raw) To: Daniel J Blueman, Andy Lutomirski Cc: Linux Kernel, Linus Torvalds, H. Peter Anvin -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 12/10/2014 07:46 PM, Daniel J Blueman wrote: > Gah. I had some non-temporal copy changes in the wrong tree. I'll > check with a definitely clean tree and follow up if it still > occurs. The exception handlers should definitely allow sleeping, so I suspect those changes may be related. > On 10 December 2014 at 16:37, Andy Lutomirski <luto@amacapital.net> > wrote: >> Adding Rik, who is probably the most recent person to have looked >> at the disaster fondly known as "FPU". >> >> On Wed, Dec 10, 2014 at 4:25 PM, Daniel J Blueman >> <daniel@quora.org> wrote: >>> With 3.18, I was seeing a significant number of allocation >>> warnings from load_elf_binary call paths [1]. >>> >>> Allocating FPU state atomically [2] does prevent potentially >>> sleeping while atomic, but probably isn't the correct fix. >>> Anyone familiar with this area? >>> >>> -- [1] >>> >>> BUG: sleeping function called from invalid context at >>> mm/slub.c:1240 in_atomic(): 1, irqs_disabled(): 0, pid: 29016, >>> name: dmesg INFO: lockdep is turned off. Preemption disabled >>> at:[<ffffffff8114625f>] handle_mm_fault+0x5ff/0x8a0 >>> >>> CPU: 122 PID: 29016 Comm: dmesg Tainted: G W >>> 3.18.0-test #6 Hardware name: Supermicro AS -1042G-LTF/H8QGL, >>> BIOS DS3.5a 11/13/2014 ffffffff8203b81b ffff886fdf62f888 >>> ffffffff81b942e8 0000000000000007 0000000000000000 >>> ffff886fdf62f8b8 ffffffff8109ac4c 0000000000000000 >>> 0000000000000010 ffff8807df84bc00 00000000000000d0 >>> ffff886fdf62f908 Call Trace: [<ffffffff81b942e8>] >>> dump_stack+0x4f/0x7c [<ffffffff8109ac4c>] >>> __might_sleep+0x164/0x250 [<ffffffff8116372b>] >>> kmem_cache_alloc+0xeb/0x138 [<ffffffff8100e0e1>] >>> init_fpu+0x71/0xb0 [<ffffffff81003966>] >>> math_state_restore+0xce/0x208 [<ffffffff8100437b>] >>> do_device_not_available+0x2b/0x60 [<ffffffff81ba70f5>] >>> device_not_available+0x15/0x20 [<ffffffff81487102>] ? >>> copy_page+0x12/0x33 >> >> wtf? I don't see any fpu state being accessed in copy_page. >> >> Can you disassemble "[<ffffffff81487102>] ? >> copy_page+0x12/0x33"? >> >> --Andy >> >>> [<ffffffff8109edb9>] ? get_parent_ip+0x11/0x58 >>> [<ffffffff8109ee55>] ? preempt_count_add+0x55/0xb0 >>> [<ffffffff811437a1>] ? do_cow_fault+0xe9/0x258 >>> [<ffffffff8114625f>] handle_mm_fault+0x5ff/0x8a0 >>> [<ffffffff81032218>] ? __do_page_fault+0xc8/0x498 >>> [<ffffffff811481b9>] ? vma_gap_callbacks_rotate+0x19/0x20 >>> [<ffffffff8103227d>] __do_page_fault+0x12d/0x498 >>> [<ffffffff810b682e>] ? up_write+0x1e/0x48 [<ffffffff81148d18>] >>> ? vma_link+0x80/0xc0 [<ffffffff81149cb2>] ? >>> vma_set_page_prot+0x3a/0x60 [<ffffffff8114ae6e>] ? >>> mmap_region+0x1be/0x5e0 [<ffffffff8103262e>] >>> do_page_fault+0x1e/0x70 [<ffffffff81ba737f>] >>> page_fault+0x1f/0x30 [<ffffffff81487f3e>] ? >>> __clear_user+0x2e/0x50 [<ffffffff81487f22>] ? >>> __clear_user+0x12/0x50 [<ffffffff81487f8a>] >>> clear_user+0x2a/0x30 [<ffffffff811ca1e1>] padzero+0x21/0x30 >>> [<ffffffff811cc41f>] load_elf_binary+0x8cf/0xdd0 >>> [<ffffffff8117d18f>] search_binary_handler+0x7f/0x1f8 >>> [<ffffffff8117ec06>] do_execve_common.isra.34+0x616/0x790 >>> [<ffffffff8117eb62>] ? do_execve_common.isra.34+0x572/0x790 >>> [<ffffffff8117ed93>] do_execve+0x13/0x18 [<ffffffff8117f070>] >>> SyS_execve+0x20/0x30 [<ffffffff81ba6159>] >>> stub_execve+0x69/0xa0 >>> >>> -- [2] >>> >>> diff --git a/arch/x86/include/asm/fpu-internal.h >>> b/arch/x86/include/asm/fpu-internal.h index e97622f..57029ad >>> 100644 --- a/arch/x86/include/asm/fpu-internal.h +++ >>> b/arch/x86/include/asm/fpu-internal.h @@ -574,7 +574,7 @@ >>> static inline int fpu_alloc(struct fpu *fpu) { if >>> (fpu_allocated(fpu)) return 0; - fpu->state = >>> kmem_cache_alloc(task_xstate_cachep, GFP_KERNEL); + >>> fpu->state = kmem_cache_alloc(task_xstate_cachep, GFP_ATOMIC); >>> if (!fpu->state) return -ENOMEM; WARN_ON((unsigned >>> long)fpu->state & 15); -- Daniel J Blueman >> >> >> >> -- Andy Lutomirski AMA Capital Management, LLC > > > - -- All rights reversed -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAEBAgAGBQJUiOoBAAoJEM553pKExN6DfrQH/2qDtTACDGUfQ0VgeK6hkDK0 mMp2Xk8xHhUTKkyLQtA1NSHAjfv0RsEvrnr11yJtbr8+vfMjJyxp0/Qq0w9gqqyR zqVBPyE9yOauKvkyzBDtLOZW8u+hb+LUY8Cv4FkZieJg7cACPwW2jSeUI+ALStI/ nb5qM6wUNdeZGnRxd82x23J/u17ILZ8HsdNb2Pk2FpmgRnZ0/NWZdw2Kp714uF47 fvPrV41go7OEBoGFPtOIdF9Z7XagpfwN1Jg1ThfQqlLFtVv/Yf0P6hMX6siPjmMn YuxYXsypfdWAWPCP7PloJW+nraHz0poX5/gN4YycZD4+Dh4pzNZ0oVSADQkageI= =Hs9Y -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH, 3.18] sleeping function called from invalid context 2014-12-11 0:49 ` Rik van Riel @ 2014-12-11 0:51 ` Andy Lutomirski 2014-12-11 1:32 ` Rik van Riel 0 siblings, 1 reply; 8+ messages in thread From: Andy Lutomirski @ 2014-12-11 0:51 UTC (permalink / raw) To: Rik van Riel Cc: Daniel J Blueman, Linux Kernel, Linus Torvalds, H. Peter Anvin On Wed, Dec 10, 2014 at 4:49 PM, Rik van Riel <riel@redhat.com> wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On 12/10/2014 07:46 PM, Daniel J Blueman wrote: >> Gah. I had some non-temporal copy changes in the wrong tree. I'll >> check with a definitely clean tree and follow up if it still >> occurs. > > The exception handlers should definitely allow sleeping, > so I suspect those changes may be related. It would be really, really nice if we could arrange for kernel_fpu_begin to be unconditionally usable in anything except NMI context. The crypto code would be much less scary, we could make non-temporal copies safe, etc. Can we have ponies, too? --Andy > >> On 10 December 2014 at 16:37, Andy Lutomirski <luto@amacapital.net> >> wrote: >>> Adding Rik, who is probably the most recent person to have looked >>> at the disaster fondly known as "FPU". >>> >>> On Wed, Dec 10, 2014 at 4:25 PM, Daniel J Blueman >>> <daniel@quora.org> wrote: >>>> With 3.18, I was seeing a significant number of allocation >>>> warnings from load_elf_binary call paths [1]. >>>> >>>> Allocating FPU state atomically [2] does prevent potentially >>>> sleeping while atomic, but probably isn't the correct fix. >>>> Anyone familiar with this area? >>>> >>>> -- [1] >>>> >>>> BUG: sleeping function called from invalid context at >>>> mm/slub.c:1240 in_atomic(): 1, irqs_disabled(): 0, pid: 29016, >>>> name: dmesg INFO: lockdep is turned off. Preemption disabled >>>> at:[<ffffffff8114625f>] handle_mm_fault+0x5ff/0x8a0 >>>> >>>> CPU: 122 PID: 29016 Comm: dmesg Tainted: G W >>>> 3.18.0-test #6 Hardware name: Supermicro AS -1042G-LTF/H8QGL, >>>> BIOS DS3.5a 11/13/2014 ffffffff8203b81b ffff886fdf62f888 >>>> ffffffff81b942e8 0000000000000007 0000000000000000 >>>> ffff886fdf62f8b8 ffffffff8109ac4c 0000000000000000 >>>> 0000000000000010 ffff8807df84bc00 00000000000000d0 >>>> ffff886fdf62f908 Call Trace: [<ffffffff81b942e8>] >>>> dump_stack+0x4f/0x7c [<ffffffff8109ac4c>] >>>> __might_sleep+0x164/0x250 [<ffffffff8116372b>] >>>> kmem_cache_alloc+0xeb/0x138 [<ffffffff8100e0e1>] >>>> init_fpu+0x71/0xb0 [<ffffffff81003966>] >>>> math_state_restore+0xce/0x208 [<ffffffff8100437b>] >>>> do_device_not_available+0x2b/0x60 [<ffffffff81ba70f5>] >>>> device_not_available+0x15/0x20 [<ffffffff81487102>] ? >>>> copy_page+0x12/0x33 >>> >>> wtf? I don't see any fpu state being accessed in copy_page. >>> >>> Can you disassemble "[<ffffffff81487102>] ? >>> copy_page+0x12/0x33"? >>> >>> --Andy >>> >>>> [<ffffffff8109edb9>] ? get_parent_ip+0x11/0x58 >>>> [<ffffffff8109ee55>] ? preempt_count_add+0x55/0xb0 >>>> [<ffffffff811437a1>] ? do_cow_fault+0xe9/0x258 >>>> [<ffffffff8114625f>] handle_mm_fault+0x5ff/0x8a0 >>>> [<ffffffff81032218>] ? __do_page_fault+0xc8/0x498 >>>> [<ffffffff811481b9>] ? vma_gap_callbacks_rotate+0x19/0x20 >>>> [<ffffffff8103227d>] __do_page_fault+0x12d/0x498 >>>> [<ffffffff810b682e>] ? up_write+0x1e/0x48 [<ffffffff81148d18>] >>>> ? vma_link+0x80/0xc0 [<ffffffff81149cb2>] ? >>>> vma_set_page_prot+0x3a/0x60 [<ffffffff8114ae6e>] ? >>>> mmap_region+0x1be/0x5e0 [<ffffffff8103262e>] >>>> do_page_fault+0x1e/0x70 [<ffffffff81ba737f>] >>>> page_fault+0x1f/0x30 [<ffffffff81487f3e>] ? >>>> __clear_user+0x2e/0x50 [<ffffffff81487f22>] ? >>>> __clear_user+0x12/0x50 [<ffffffff81487f8a>] >>>> clear_user+0x2a/0x30 [<ffffffff811ca1e1>] padzero+0x21/0x30 >>>> [<ffffffff811cc41f>] load_elf_binary+0x8cf/0xdd0 >>>> [<ffffffff8117d18f>] search_binary_handler+0x7f/0x1f8 >>>> [<ffffffff8117ec06>] do_execve_common.isra.34+0x616/0x790 >>>> [<ffffffff8117eb62>] ? do_execve_common.isra.34+0x572/0x790 >>>> [<ffffffff8117ed93>] do_execve+0x13/0x18 [<ffffffff8117f070>] >>>> SyS_execve+0x20/0x30 [<ffffffff81ba6159>] >>>> stub_execve+0x69/0xa0 >>>> >>>> -- [2] >>>> >>>> diff --git a/arch/x86/include/asm/fpu-internal.h >>>> b/arch/x86/include/asm/fpu-internal.h index e97622f..57029ad >>>> 100644 --- a/arch/x86/include/asm/fpu-internal.h +++ >>>> b/arch/x86/include/asm/fpu-internal.h @@ -574,7 +574,7 @@ >>>> static inline int fpu_alloc(struct fpu *fpu) { if >>>> (fpu_allocated(fpu)) return 0; - fpu->state = >>>> kmem_cache_alloc(task_xstate_cachep, GFP_KERNEL); + >>>> fpu->state = kmem_cache_alloc(task_xstate_cachep, GFP_ATOMIC); >>>> if (!fpu->state) return -ENOMEM; WARN_ON((unsigned >>>> long)fpu->state & 15); -- Daniel J Blueman >>> >>> >>> >>> -- Andy Lutomirski AMA Capital Management, LLC >> >> >> > > > - -- > All rights reversed > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1 > > iQEcBAEBAgAGBQJUiOoBAAoJEM553pKExN6DfrQH/2qDtTACDGUfQ0VgeK6hkDK0 > mMp2Xk8xHhUTKkyLQtA1NSHAjfv0RsEvrnr11yJtbr8+vfMjJyxp0/Qq0w9gqqyR > zqVBPyE9yOauKvkyzBDtLOZW8u+hb+LUY8Cv4FkZieJg7cACPwW2jSeUI+ALStI/ > nb5qM6wUNdeZGnRxd82x23J/u17ILZ8HsdNb2Pk2FpmgRnZ0/NWZdw2Kp714uF47 > fvPrV41go7OEBoGFPtOIdF9Z7XagpfwN1Jg1ThfQqlLFtVv/Yf0P6hMX6siPjmMn > YuxYXsypfdWAWPCP7PloJW+nraHz0poX5/gN4YycZD4+Dh4pzNZ0oVSADQkageI= > =Hs9Y > -----END PGP SIGNATURE----- -- Andy Lutomirski AMA Capital Management, LLC ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH, 3.18] sleeping function called from invalid context 2014-12-11 0:51 ` Andy Lutomirski @ 2014-12-11 1:32 ` Rik van Riel 2014-12-11 1:53 ` Andy Lutomirski 0 siblings, 1 reply; 8+ messages in thread From: Rik van Riel @ 2014-12-11 1:32 UTC (permalink / raw) To: Andy Lutomirski Cc: Daniel J Blueman, Linux Kernel, Linus Torvalds, H. Peter Anvin -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 12/10/2014 07:51 PM, Andy Lutomirski wrote: > On Wed, Dec 10, 2014 at 4:49 PM, Rik van Riel <riel@redhat.com> > wrote: On 12/10/2014 07:46 PM, Daniel J Blueman wrote: >>>> Gah. I had some non-temporal copy changes in the wrong tree. >>>> I'll check with a definitely clean tree and follow up if it >>>> still occurs. > > The exception handlers should definitely allow sleeping, so I > suspect those changes may be related. > >> It would be really, really nice if we could arrange for >> kernel_fpu_begin to be unconditionally usable in anything except >> NMI context. The crypto code would be much less scary, we could >> make non-temporal copies safe, etc. Can we have ponies, too? Isn't it already? I see nothing in __kernel_fpu_begin that looks like it would ever need to sleep. - -- All rights reversed -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAEBAgAGBQJUiPQyAAoJEM553pKExN6DKJoH/iWa64YibC17ADt2F7Uq4sSD 7gsi7oG7t/WJGUH5QbWy3O+36ppTw2lZwLMUUlPMqHaqzyhNa8TkGr2auyJyH45h vk7YCBpHjzqGB8IUxgbeaie6n95Z8RG28FP8HOB1EjCwQwqJR3lQqnvRP6QD17FX H/I9+w5+Hdkcu0sNj8jppdap+q6Bu/y1WHli3iWIlHMJIsfMMmkmiZR/Gw0ewf0w F8bePmQYQjyB/xkEPtfXFGUuJfnbLYkYFyHUHHe/J+/MI1VkKNPDBGelvNKwniEC 41R7KinuqVh6DiAa9WprAXcrmllHI4L5aIFIRWI/W03/omFrF6WAdWs8OXioHfI= =s4Eb -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH, 3.18] sleeping function called from invalid context 2014-12-11 1:32 ` Rik van Riel @ 2014-12-11 1:53 ` Andy Lutomirski 2014-12-11 2:10 ` Rik van Riel 0 siblings, 1 reply; 8+ messages in thread From: Andy Lutomirski @ 2014-12-11 1:53 UTC (permalink / raw) To: Rik van Riel Cc: Daniel J Blueman, Linux Kernel, Linus Torvalds, H. Peter Anvin On Wed, Dec 10, 2014 at 5:32 PM, Rik van Riel <riel@redhat.com> wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > On 12/10/2014 07:51 PM, Andy Lutomirski wrote: >> On Wed, Dec 10, 2014 at 4:49 PM, Rik van Riel <riel@redhat.com> >> wrote: On 12/10/2014 07:46 PM, Daniel J Blueman wrote: >>>>> Gah. I had some non-temporal copy changes in the wrong tree. >>>>> I'll check with a definitely clean tree and follow up if it >>>>> still occurs. >> >> The exception handlers should definitely allow sleeping, so I >> suspect those changes may be related. >> >>> It would be really, really nice if we could arrange for >>> kernel_fpu_begin to be unconditionally usable in anything except >>> NMI context. The crypto code would be much less scary, we could >>> make non-temporal copies safe, etc. Can we have ponies, too? > > Isn't it already? > > I see nothing in __kernel_fpu_begin that looks like it would > ever need to sleep. > It never needs to sleep, but it does need somewhere to save the previous state. See irq_fpu_usable. FWIW, I don't understand what the comments above interrupted_kernel_fpu_idle are talking about. The issue that I understand is: kernel_fpu_begin() irq: kernel_fpu_begin() use xstate kernel_fpu_end() we're screwed now :( kernel_fpu_end() IOW we need somewhere to put the state from the thing we interrupted. This gets extra fun if some thread does something that takes a page fault that uses fpu that gets interrupted, etc. Fortunately, I think that can't happen -- kernel_fpu_begin disables preemption. So I think we have a maximum of one active FPU context per thread plus some number per cpu. Maybe we could have a percpu array of ten or twenty xstates to handle all possible nesting. Also, can we just delete the non-eager code some day? --Andy ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH, 3.18] sleeping function called from invalid context 2014-12-11 1:53 ` Andy Lutomirski @ 2014-12-11 2:10 ` Rik van Riel 0 siblings, 0 replies; 8+ messages in thread From: Rik van Riel @ 2014-12-11 2:10 UTC (permalink / raw) To: Andy Lutomirski Cc: Daniel J Blueman, Linux Kernel, Linus Torvalds, H. Peter Anvin -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 12/10/2014 08:53 PM, Andy Lutomirski wrote: > On Wed, Dec 10, 2014 at 5:32 PM, Rik van Riel <riel@redhat.com> > wrote: >> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 >> >> On 12/10/2014 07:51 PM, Andy Lutomirski wrote: >>> On Wed, Dec 10, 2014 at 4:49 PM, Rik van Riel >>> <riel@redhat.com> wrote: On 12/10/2014 07:46 PM, Daniel J >>> Blueman wrote: >>>>>> Gah. I had some non-temporal copy changes in the wrong >>>>>> tree. I'll check with a definitely clean tree and follow >>>>>> up if it still occurs. >>> >>> The exception handlers should definitely allow sleeping, so I >>> suspect those changes may be related. >>> >>>> It would be really, really nice if we could arrange for >>>> kernel_fpu_begin to be unconditionally usable in anything >>>> except NMI context. The crypto code would be much less >>>> scary, we could make non-temporal copies safe, etc. Can we >>>> have ponies, too? >> >> Isn't it already? >> >> I see nothing in __kernel_fpu_begin that looks like it would ever >> need to sleep. >> > > It never needs to sleep, but it does need somewhere to save the > previous state. See irq_fpu_usable. > > FWIW, I don't understand what the comments above > interrupted_kernel_fpu_idle are talking about. The issue that I > understand is: > > kernel_fpu_begin() > > irq: kernel_fpu_begin() use xstate kernel_fpu_end() > > we're screwed now :( > > kernel_fpu_end() > > IOW we need somewhere to put the state from the thing we > interrupted. Good point. An interruptible kernel_fpu_begin needs to provide a place to put the state. Alternatively, the one that runs from irq context could provide the place to store the current context, with a kernel_fpu_begin_irq() function... > This gets extra fun if some thread does something that takes a > page fault that uses fpu that gets interrupted, etc. Fortunately, > I think that can't happen -- kernel_fpu_begin disables preemption. > So I think we have a maximum of one active FPU context per thread > plus some number per cpu. Maybe we could have a percpu array of > ten or twenty xstates to handle all possible nesting. > > Also, can we just delete the non-eager code some day? The XSAVEOPT and XRSTOR optimizations do not work across VMENTER and VMEXIT, so with the eager code we end up always loading and saving 384 bytes of state at every context switch, even for tasks that never once touched the FPU. That is 6 cache lines worth of unused stuff for each task involved in the context switch. Somehow I am not convinced that is a good idea... - -- All rights reversed -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAEBAgAGBQJUiP0yAAoJEM553pKExN6DD24H/jCcahi8z3LP6JEL4tpJUMpv NIYuAdzdIbns4ycijSqYJSVGxZJEx2c7w8QSrRgTslx6zPfP2Q0MYZQ9J++CJbgQ FZZcbnor6OL/KQtJWnPNSTN0ElBehWo+nZqK4SIbBoAwqJDQpBqo1Me9alxGoZ3P SixN4NAwfDMP3nz0wNGuxV/Hr3/T1Hz5tDYS4226z5yhz84Sd4ODzAuu9NDGANHg 5g08uocMz6lpMlxO2m+bSElDxh6V0J6bqedgxzZbjVZM3zYk+txFw1TwmMSJoHT3 /rUzwfwGmA/8WDNWgSEscUZILJiEZgrGAtWVyJed0IZTg+A7MSqP9y57OXrDD60= =YmP/ -----END PGP SIGNATURE----- ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2014-12-11 2:47 UTC | newest] Thread overview: 8+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-12-11 0:25 [PATCH, 3.18] sleeping function called from invalid context Daniel J Blueman 2014-12-11 0:37 ` Andy Lutomirski 2014-12-11 0:46 ` Daniel J Blueman 2014-12-11 0:49 ` Rik van Riel 2014-12-11 0:51 ` Andy Lutomirski 2014-12-11 1:32 ` Rik van Riel 2014-12-11 1:53 ` Andy Lutomirski 2014-12-11 2:10 ` Rik van Riel
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox