linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] powerpc/livepatch: Fix livepatch stack access
@ 2017-09-20 10:19 Kamalesh Babulal
  2017-09-20 11:11 ` Naveen N . Rao
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Kamalesh Babulal @ 2017-09-20 10:19 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Kamalesh Babulal, Balbir Singh, Naveen N . Rao, linuxppc-dev

While running stress test with livepatch module loaded, kernel
bug was triggered.

cpu 0x5: Vector: 400 (Instruction Access) at [c0000000eb9d3b60]
    pc: c0000000eb9d3e30
    lr: c0000000eb9d3e30
    sp: c0000000eb9d3de0
   msr: 800000001280b033
  current = 0xc0000000dbd38700
  paca    = 0xc00000000fe01400   softe: 0        irq_happened: 0x01
    pid   = 8618, comm = make
Linux version 4.13.0+ (root@ubuntu) (gcc version 6.3.0 20170406 (Ubuntu 6.3.0-12ubuntu2)) #1 SMP Wed Sep 13 03:49:27 EDT 2017

5:mon> t
[c0000000eb9d3de0] c0000000eb9d3e30 (unreliable)
[c0000000eb9d3e30] c000000000008ab4 hardware_interrupt_common+0x114/0x120
 --- Exception: 501 (Hardware Interrupt) at c000000000053040 livepatch_handler+0x4c/0x74
[c0000000eb9d4120] 0000000057ac6e9d (unreliable)
[d0000000089d9f78] 2e0965747962382e
SP (965747962342e09) is in userspace

When an interrupt is served in between the livepatch_handler execution,
there are chances of the livepatch_stack/task task getting corrupted.

                    CPU 1
                 =============
Task A                          Interrupt Handler
=========                       =================
livepatch_handler:
 mr r0, r1
 ld r1, TI_livepatch_sp(r12)
                                hardware_interrupt_common
                                |_do_IRQ
                                  |_ call_do_irq:
                                        mflr r0
                                        std  r0,16(r1)
                                        stdu r1,THREAD_SIZE-STACK_FRAME_OVERHEAD(r4)
                                            ...
lis r2, STACK_END_MAGIC@h
ori r2, r2, STACK_END_MAGIC@l
ld  r12, -8(r1)      <- livepatch stack is corrupted.

Fix the corruption by using r11 register for livepatch stack manipulation,
instead of shuffling task stack and livepatch stack into r1 register.
Using r11 register also avoids disabling/enabling irq's while setting
up the livepatch stack.

Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
Cc: Balbir Singh <bsingharora@gmail.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
---
 arch/powerpc/kernel/trace/ftrace_64_mprofile.S | 45 +++++++++-----------------
 1 file changed, 15 insertions(+), 30 deletions(-)

diff --git a/arch/powerpc/kernel/trace/ftrace_64_mprofile.S b/arch/powerpc/kernel/trace/ftrace_64_mprofile.S
index c98e90b..b4e2b71 100644
--- a/arch/powerpc/kernel/trace/ftrace_64_mprofile.S
+++ b/arch/powerpc/kernel/trace/ftrace_64_mprofile.S
@@ -181,34 +181,25 @@ _GLOBAL(ftrace_stub)
 	 *  - we have no stack frame and can not allocate one
 	 *  - LR points back to the original caller (in A)
 	 *  - CTR holds the new NIP in C
-	 *  - r0 & r12 are free
-	 *
-	 * r0 can't be used as the base register for a DS-form load or store, so
-	 * we temporarily shuffle r1 (stack pointer) into r0 and then put it back.
+	 *  - r0, r11 & r12 are free
 	 */
 livepatch_handler:
 	CURRENT_THREAD_INFO(r12, r1)

-	/* Save stack pointer into r0 */
-	mr	r0, r1
-
 	/* Allocate 3 x 8 bytes */
-	ld	r1, TI_livepatch_sp(r12)
-	addi	r1, r1, 24
-	std	r1, TI_livepatch_sp(r12)
+	ld	r11, TI_livepatch_sp(r12)
+	addi	r11, r11, 24
+	std	r11, TI_livepatch_sp(r12)

 	/* Save toc & real LR on livepatch stack */
-	std	r2,  -24(r1)
+	std	r2,  -24(r11)
 	mflr	r12
-	std	r12, -16(r1)
+	std	r12, -16(r11)

 	/* Store stack end marker */
 	lis     r12, STACK_END_MAGIC@h
 	ori     r12, r12, STACK_END_MAGIC@l
-	std	r12, -8(r1)
-
-	/* Restore real stack pointer */
-	mr	r1, r0
+	std	r12, -8(r11)

 	/* Put ctr in r12 for global entry and branch there */
 	mfctr	r12
@@ -216,36 +207,30 @@ livepatch_handler:

 	/*
 	 * Now we are returning from the patched function to the original
-	 * caller A. We are free to use r0 and r12, and we can use r2 until we
+	 * caller A. We are free to use r11, r12 and we can use r2 until we
 	 * restore it.
 	 */

 	CURRENT_THREAD_INFO(r12, r1)

-	/* Save stack pointer into r0 */
-	mr	r0, r1
-
-	ld	r1, TI_livepatch_sp(r12)
+	ld	r11, TI_livepatch_sp(r12)

 	/* Check stack marker hasn't been trashed */
 	lis     r2,  STACK_END_MAGIC@h
 	ori     r2,  r2, STACK_END_MAGIC@l
-	ld	r12, -8(r1)
+	ld	r12, -8(r11)
 1:	tdne	r12, r2
 	EMIT_BUG_ENTRY 1b, __FILE__, __LINE__ - 1, 0

 	/* Restore LR & toc from livepatch stack */
-	ld	r12, -16(r1)
+	ld	r12, -16(r11)
 	mtlr	r12
-	ld	r2,  -24(r1)
+	ld	r2,  -24(r11)

 	/* Pop livepatch stack frame */
-	CURRENT_THREAD_INFO(r12, r0)
-	subi	r1, r1, 24
-	std	r1, TI_livepatch_sp(r12)
-
-	/* Restore real stack pointer */
-	mr	r1, r0
+	CURRENT_THREAD_INFO(r12, r1)
+	subi	r11, r11, 24
+	std	r11, TI_livepatch_sp(r12)

 	/* Return to original caller of live patched function */
 	blr
--
2.7.4

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] powerpc/livepatch: Fix livepatch stack access
  2017-09-20 10:19 [PATCH] powerpc/livepatch: Fix livepatch stack access Kamalesh Babulal
@ 2017-09-20 11:11 ` Naveen N . Rao
  2017-09-29 11:47   ` Balbir Singh
  2017-09-21 10:02 ` Michael Ellerman
  2017-10-12  0:20 ` Michael Ellerman
  2 siblings, 1 reply; 9+ messages in thread
From: Naveen N . Rao @ 2017-09-20 11:11 UTC (permalink / raw)
  To: Kamalesh Babulal; +Cc: Michael Ellerman, Balbir Singh, linuxppc-dev

On 2017/09/20 03:49PM, Kamalesh Babulal wrote:
> While running stress test with livepatch module loaded, kernel
> bug was triggered.
> 
> cpu 0x5: Vector: 400 (Instruction Access) at [c0000000eb9d3b60]
>     pc: c0000000eb9d3e30
>     lr: c0000000eb9d3e30
>     sp: c0000000eb9d3de0
>    msr: 800000001280b033
>   current = 0xc0000000dbd38700
>   paca    = 0xc00000000fe01400   softe: 0        irq_happened: 0x01
>     pid   = 8618, comm = make
> Linux version 4.13.0+ (root@ubuntu) (gcc version 6.3.0 20170406 (Ubuntu 6.3.0-12ubuntu2)) #1 SMP Wed Sep 13 03:49:27 EDT 2017
> 
> 5:mon> t
> [c0000000eb9d3de0] c0000000eb9d3e30 (unreliable)
> [c0000000eb9d3e30] c000000000008ab4 hardware_interrupt_common+0x114/0x120
>  --- Exception: 501 (Hardware Interrupt) at c000000000053040 livepatch_handler+0x4c/0x74
> [c0000000eb9d4120] 0000000057ac6e9d (unreliable)
> [d0000000089d9f78] 2e0965747962382e
> SP (965747962342e09) is in userspace
> 
> When an interrupt is served in between the livepatch_handler execution,
> there are chances of the livepatch_stack/task task getting corrupted.
> 
>                     CPU 1
>                  =============
> Task A                          Interrupt Handler
> =========                       =================
> livepatch_handler:
>  mr r0, r1
>  ld r1, TI_livepatch_sp(r12)
>                                 hardware_interrupt_common
>                                 |_do_IRQ
>                                   |_ call_do_irq:
>                                         mflr r0
>                                         std  r0,16(r1)
>                                         stdu r1,THREAD_SIZE-STACK_FRAME_OVERHEAD(r4)
>                                             ...
> lis r2, STACK_END_MAGIC@h
> ori r2, r2, STACK_END_MAGIC@l
> ld  r12, -8(r1)      <- livepatch stack is corrupted.
> 
> Fix the corruption by using r11 register for livepatch stack manipulation,
> instead of shuffling task stack and livepatch stack into r1 register.
> Using r11 register also avoids disabling/enabling irq's while setting
> up the livepatch stack.
> 
> Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
> Cc: Balbir Singh <bsingharora@gmail.com>
> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
> Cc: linuxppc-dev@lists.ozlabs.org
> ---
>  arch/powerpc/kernel/trace/ftrace_64_mprofile.S | 45 +++++++++-----------------
>  1 file changed, 15 insertions(+), 30 deletions(-)

Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>


- Naveen

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] powerpc/livepatch: Fix livepatch stack access
  2017-09-20 10:19 [PATCH] powerpc/livepatch: Fix livepatch stack access Kamalesh Babulal
  2017-09-20 11:11 ` Naveen N . Rao
@ 2017-09-21 10:02 ` Michael Ellerman
  2017-09-21 11:00   ` Balbir Singh
  2017-10-12  0:20 ` Michael Ellerman
  2 siblings, 1 reply; 9+ messages in thread
From: Michael Ellerman @ 2017-09-21 10:02 UTC (permalink / raw)
  To: Kamalesh Babulal
  Cc: Kamalesh Babulal, Balbir Singh, Naveen N . Rao, linuxppc-dev

Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> writes:

> While running stress test with livepatch module loaded, kernel
> bug was triggered.
>
> cpu 0x5: Vector: 400 (Instruction Access) at [c0000000eb9d3b60]
>     pc: c0000000eb9d3e30
>     lr: c0000000eb9d3e30
>     sp: c0000000eb9d3de0
>    msr: 800000001280b033
>   current = 0xc0000000dbd38700
>   paca    = 0xc00000000fe01400   softe: 0        irq_happened: 0x01
>     pid   = 8618, comm = make
> Linux version 4.13.0+ (root@ubuntu) (gcc version 6.3.0 20170406 (Ubuntu 6.3.0-12ubuntu2)) #1 SMP Wed Sep 13 03:49:27 EDT 2017
>
> 5:mon> t
> [c0000000eb9d3de0] c0000000eb9d3e30 (unreliable)
> [c0000000eb9d3e30] c000000000008ab4 hardware_interrupt_common+0x114/0x120
>  --- Exception: 501 (Hardware Interrupt) at c000000000053040 livepatch_handler+0x4c/0x74
> [c0000000eb9d4120] 0000000057ac6e9d (unreliable)
> [d0000000089d9f78] 2e0965747962382e
> SP (965747962342e09) is in userspace
>
> When an interrupt is served in between the livepatch_handler execution,
> there are chances of the livepatch_stack/task task getting corrupted.

Ouch. That's pretty broken by me.

> Fix the corruption by using r11 register for livepatch stack manipulation,
> instead of shuffling task stack and livepatch stack into r1 register.
> Using r11 register also avoids disabling/enabling irq's while setting
> up the livepatch stack.

I'm trying to think if there's some reason I didn't use r11. But I can't
remember anything specific. I suspect I just didn't check the ABI
closely enough, and knew I could use r0 and r12 so stuck with them.

cheers

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] powerpc/livepatch: Fix livepatch stack access
  2017-09-21 10:02 ` Michael Ellerman
@ 2017-09-21 11:00   ` Balbir Singh
  2017-09-21 11:53     ` Kamalesh Babulal
  2017-09-21 16:55     ` Naveen N . Rao
  0 siblings, 2 replies; 9+ messages in thread
From: Balbir Singh @ 2017-09-21 11:00 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Kamalesh Babulal, Naveen N . Rao,
	open list:LINUX FOR POWERPC (32-BIT AND 64-BIT)

On Thu, Sep 21, 2017 at 8:02 PM, Michael Ellerman <mpe@ellerman.id.au> wrote:
> Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> writes:
>
>> While running stress test with livepatch module loaded, kernel
>> bug was triggered.
>>
>> cpu 0x5: Vector: 400 (Instruction Access) at [c0000000eb9d3b60]
>>     pc: c0000000eb9d3e30
>>     lr: c0000000eb9d3e30
>>     sp: c0000000eb9d3de0
>>    msr: 800000001280b033
>>   current = 0xc0000000dbd38700
>>   paca    = 0xc00000000fe01400   softe: 0        irq_happened: 0x01
>>     pid   = 8618, comm = make
>> Linux version 4.13.0+ (root@ubuntu) (gcc version 6.3.0 20170406 (Ubuntu 6.3.0-12ubuntu2)) #1 SMP Wed Sep 13 03:49:27 EDT 2017
>>
>> 5:mon> t
>> [c0000000eb9d3de0] c0000000eb9d3e30 (unreliable)
>> [c0000000eb9d3e30] c000000000008ab4 hardware_interrupt_common+0x114/0x120
>>  --- Exception: 501 (Hardware Interrupt) at c000000000053040 livepatch_handler+0x4c/0x74
>> [c0000000eb9d4120] 0000000057ac6e9d (unreliable)
>> [d0000000089d9f78] 2e0965747962382e
>> SP (965747962342e09) is in userspace
>>
>> When an interrupt is served in between the livepatch_handler execution,
>> there are chances of the livepatch_stack/task task getting corrupted.
>
> Ouch. That's pretty broken by me.
>

I was worried more about preemption as I said in the review comment earlier,
this is new. It looks like we restored the wrong r1 on returning from
the interrupt
context? It would be nice to see any pt_regs changes due to the interrupt.
Did the interrupt handling code called something that needed live-patching?


>> Fix the corruption by using r11 register for livepatch stack manipulation,
>> instead of shuffling task stack and livepatch stack into r1 register.
>> Using r11 register also avoids disabling/enabling irq's while setting
>> up the livepatch stack.
>
> I'm trying to think if there's some reason I didn't use r11. But I can't
> remember anything specific. I suspect I just didn't check the ABI

We can't use r11, this is ftrace with regs, we've restore registers before
calling livepatch_handler, I don't think we can clobber r11, but I might
be sleep deprived and missing something

> closely enough, and knew I could use r0 and r12 so stuck with them.
>
> cheers

Balbir

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] powerpc/livepatch: Fix livepatch stack access
  2017-09-21 11:00   ` Balbir Singh
@ 2017-09-21 11:53     ` Kamalesh Babulal
  2017-09-21 16:55     ` Naveen N . Rao
  1 sibling, 0 replies; 9+ messages in thread
From: Kamalesh Babulal @ 2017-09-21 11:53 UTC (permalink / raw)
  To: Balbir Singh, Michael Ellerman
  Cc: Naveen N . Rao, open list:LINUX FOR POWERPC (32-BIT AND 64-BIT)

On Thursday 21 September 2017 04:30 PM, Balbir Singh wrote:
> On Thu, Sep 21, 2017 at 8:02 PM, Michael Ellerman <mpe@ellerman.id.au> wrote:
>> Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> writes:
>>
>>> While running stress test with livepatch module loaded, kernel
>>> bug was triggered.
>>>
>>> cpu 0x5: Vector: 400 (Instruction Access) at [c0000000eb9d3b60]
>>>     pc: c0000000eb9d3e30
>>>     lr: c0000000eb9d3e30
>>>     sp: c0000000eb9d3de0
>>>    msr: 800000001280b033
>>>   current = 0xc0000000dbd38700
>>>   paca    = 0xc00000000fe01400   softe: 0        irq_happened: 0x01
>>>     pid   = 8618, comm = make
>>> Linux version 4.13.0+ (root@ubuntu) (gcc version 6.3.0 20170406 (Ubuntu 6.3.0-12ubuntu2)) #1 SMP Wed Sep 13 03:49:27 EDT 2017
>>>
>>> 5:mon> t
>>> [c0000000eb9d3de0] c0000000eb9d3e30 (unreliable)
>>> [c0000000eb9d3e30] c000000000008ab4 hardware_interrupt_common+0x114/0x120
>>>  --- Exception: 501 (Hardware Interrupt) at c000000000053040 livepatch_handler+0x4c/0x74
>>> [c0000000eb9d4120] 0000000057ac6e9d (unreliable)
>>> [d0000000089d9f78] 2e0965747962382e
>>> SP (965747962342e09) is in userspace
>>>
>>> When an interrupt is served in between the livepatch_handler execution,
>>> there are chances of the livepatch_stack/task task getting corrupted.
>>
>> Ouch. That's pretty broken by me.
>>
>
> I was worried more about preemption as I said in the review comment earlier,
> this is new. It looks like we restored the wrong r1 on returning from
> the interrupt
> context?

Yes, consider the example in the mail thread. Where the r0 is temporary 
used to hold the stack pointer value before loading r1 with the 
livepatch_sp (current->thread_info + 24). An interrupt occurs while the 
livepatch stack is being setup to store or restore the LR/TOC value of 
the calling function.

do_IRQ -> call_do_irq's first instruction mflr r0, over-writes the stack 
value with LR value. Consecutive instruction referring to r1 in 
call_do_irq are actually referring to the livepatch_sp instead of task 
stack. On the return to livepatch_handler, stack is restored from r0 
value (which has been over-written with LR value in call_do_irq). Any 
access therefore made to the stack is to a wrong address.

> It would be nice to see any pt_regs changes due to the interrupt.
> Did the interrupt handling code called something that needed live-patching?
>

AFAIK, the livepatch_sp value for softirq_ctx/hardirq_ctx are 
initialized, as it's part of thread_info. Am I missing something here.


>
>>> Fix the corruption by using r11 register for livepatch stack manipulation,
>>> instead of shuffling task stack and livepatch stack into r1 register.
>>> Using r11 register also avoids disabling/enabling irq's while setting
>>> up the livepatch stack.
>>

The r11 is restored before calling the livepatch_handler by ftrace_caller:

         /* Load CTR with the possibly modified NIP */
         mtctr   r15

         /* Restore gprs */
         REST_GPR(0,r1)
         REST_10GPRS(2,r1)
         REST_10GPRS(12,r1)
         REST_10GPRS(22,r1)

         [...]

#ifdef CONFIG_LIVEPATCH
         /*
          * Based on the cmpd or cmpdi above, if the NIP was altered and 
we're
          * not on a kprobe/jprobe, then handle livepatch.
          */
         bne-    livepatch_handler
#endif


-- 
cheers,
Kamalesh.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] powerpc/livepatch: Fix livepatch stack access
  2017-09-21 11:00   ` Balbir Singh
  2017-09-21 11:53     ` Kamalesh Babulal
@ 2017-09-21 16:55     ` Naveen N . Rao
  2017-09-22  5:16       ` Kamalesh Babulal
  1 sibling, 1 reply; 9+ messages in thread
From: Naveen N . Rao @ 2017-09-21 16:55 UTC (permalink / raw)
  To: Balbir Singh
  Cc: Michael Ellerman, Kamalesh Babulal,
	open list:LINUX FOR POWERPC (32-BIT AND 64-BIT)

On 2017/09/21 09:00PM, Balbir Singh wrote:
> On Thu, Sep 21, 2017 at 8:02 PM, Michael Ellerman <mpe@ellerman.id.au> wrote:
> > Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> writes:
> >
> >> While running stress test with livepatch module loaded, kernel
> >> bug was triggered.
> >>
> >> cpu 0x5: Vector: 400 (Instruction Access) at [c0000000eb9d3b60]
> >>     pc: c0000000eb9d3e30
> >>     lr: c0000000eb9d3e30
> >>     sp: c0000000eb9d3de0
> >>    msr: 800000001280b033
> >>   current = 0xc0000000dbd38700
> >>   paca    = 0xc00000000fe01400   softe: 0        irq_happened: 0x01
> >>     pid   = 8618, comm = make
> >> Linux version 4.13.0+ (root@ubuntu) (gcc version 6.3.0 20170406 (Ubuntu 6.3.0-12ubuntu2)) #1 SMP Wed Sep 13 03:49:27 EDT 2017
> >>
> >> 5:mon> t
> >> [c0000000eb9d3de0] c0000000eb9d3e30 (unreliable)
> >> [c0000000eb9d3e30] c000000000008ab4 hardware_interrupt_common+0x114/0x120
> >>  --- Exception: 501 (Hardware Interrupt) at c000000000053040 livepatch_handler+0x4c/0x74
> >> [c0000000eb9d4120] 0000000057ac6e9d (unreliable)
> >> [d0000000089d9f78] 2e0965747962382e
> >> SP (965747962342e09) is in userspace
> >>
> >> When an interrupt is served in between the livepatch_handler execution,
> >> there are chances of the livepatch_stack/task task getting corrupted.
> >
> > Ouch. That's pretty broken by me.
> >
> 
> I was worried more about preemption as I said in the review comment earlier,
> this is new. It looks like we restored the wrong r1 on returning from
> the interrupt
> context? It would be nice to see any pt_regs changes due to the interrupt.
> Did the interrupt handling code called something that needed live-patching?

The problem is just that the livepatch stack grows up, rather than down.  
So, when this stack is used during interrupt handling, we'll clobber 
part of this stack.

There aren't any pt_regs changes due to this afaics. Just the livepatch 
stack being over-written.

> 
> >> Fix the corruption by using r11 register for livepatch stack manipulation,
> >> instead of shuffling task stack and livepatch stack into r1 register.
> >> Using r11 register also avoids disabling/enabling irq's while setting
> >> up the livepatch stack.
> >
> > I'm trying to think if there's some reason I didn't use r11. But I can't
> > remember anything specific. I suspect I just didn't check the ABI
> 
> We can't use r11, this is ftrace with regs, we've restore registers before
> calling livepatch_handler, I don't think we can clobber r11, but I might
> be sleep deprived and missing something

r11 is volatile and meant for usage in function linkage. With 
-mprofile-kernel, we've just entered the new function and haven't 
touched r11 yet. So, it appears to me that we can definetely use/clobber 
it, along with r0 and r12.

- Naveen

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] powerpc/livepatch: Fix livepatch stack access
  2017-09-21 16:55     ` Naveen N . Rao
@ 2017-09-22  5:16       ` Kamalesh Babulal
  0 siblings, 0 replies; 9+ messages in thread
From: Kamalesh Babulal @ 2017-09-22  5:16 UTC (permalink / raw)
  To: Naveen N . Rao, Balbir Singh
  Cc: Michael Ellerman, open list:LINUX FOR POWERPC (32-BIT AND 64-BIT)

On Thursday 21 September 2017 10:25 PM, Naveen N . Rao wrote:
> On 2017/09/21 09:00PM, Balbir Singh wrote:
>> On Thu, Sep 21, 2017 at 8:02 PM, Michael Ellerman <mpe@ellerman.id.au> wrote:
>>> Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> writes:
>>>
>>>> While running stress test with livepatch module loaded, kernel
>>>> bug was triggered.
>>>>
>>>> cpu 0x5: Vector: 400 (Instruction Access) at [c0000000eb9d3b60]
>>>>     pc: c0000000eb9d3e30
>>>>     lr: c0000000eb9d3e30
>>>>     sp: c0000000eb9d3de0
>>>>    msr: 800000001280b033
>>>>   current = 0xc0000000dbd38700
>>>>   paca    = 0xc00000000fe01400   softe: 0        irq_happened: 0x01
>>>>     pid   = 8618, comm = make
>>>> Linux version 4.13.0+ (root@ubuntu) (gcc version 6.3.0 20170406 (Ubuntu 6.3.0-12ubuntu2)) #1 SMP Wed Sep 13 03:49:27 EDT 2017
>>>>
>>>> 5:mon> t
>>>> [c0000000eb9d3de0] c0000000eb9d3e30 (unreliable)
>>>> [c0000000eb9d3e30] c000000000008ab4 hardware_interrupt_common+0x114/0x120
>>>>  --- Exception: 501 (Hardware Interrupt) at c000000000053040 livepatch_handler+0x4c/0x74
>>>> [c0000000eb9d4120] 0000000057ac6e9d (unreliable)
>>>> [d0000000089d9f78] 2e0965747962382e
>>>> SP (965747962342e09) is in userspace
>>>>
>>>> When an interrupt is served in between the livepatch_handler execution,
>>>> there are chances of the livepatch_stack/task task getting corrupted.
>>>
>>> Ouch. That's pretty broken by me.
>>>
>>
>> I was worried more about preemption as I said in the review comment earlier,
>> this is new. It looks like we restored the wrong r1 on returning from
>> the interrupt
>> context? It would be nice to see any pt_regs changes due to the interrupt.
>> Did the interrupt handling code called something that needed live-patching?
>
> The problem is just that the livepatch stack grows up, rather than down.
> So, when this stack is used during interrupt handling, we'll clobber
> part of this stack.
>
> There aren't any pt_regs changes due to this afaics. Just the livepatch
> stack being over-written.
>
>>
>>>> Fix the corruption by using r11 register for livepatch stack manipulation,
>>>> instead of shuffling task stack and livepatch stack into r1 register.
>>>> Using r11 register also avoids disabling/enabling irq's while setting
>>>> up the livepatch stack.
>>>
>>> I'm trying to think if there's some reason I didn't use r11. But I can't
>>> remember anything specific. I suspect I just didn't check the ABI
>>
>> We can't use r11, this is ftrace with regs, we've restore registers before
>> calling livepatch_handler, I don't think we can clobber r11, but I might
>> be sleep deprived and missing something
>
> r11 is volatile and meant for usage in function linkage. With
> -mprofile-kernel, we've just entered the new function and haven't
> touched r11 yet. So, it appears to me that we can definetely use/clobber
> it, along with r0 and r12.
>

Misread the question from Balbir. livepatch_handler is called in mcount
context, which is very early in the function execution. r11 might
have live values used by the calling function, those are restored by
ftrace code but the called/calling function cannot rely on it, after
a function call as pointed out by Naveen.

In non-ftrace context, r11 is also used/clobbered between function calls 
to load the stub address.

-- 
cheers,
Kamalesh.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] powerpc/livepatch: Fix livepatch stack access
  2017-09-20 11:11 ` Naveen N . Rao
@ 2017-09-29 11:47   ` Balbir Singh
  0 siblings, 0 replies; 9+ messages in thread
From: Balbir Singh @ 2017-09-29 11:47 UTC (permalink / raw)
  To: Naveen N . Rao, Kamalesh Babulal; +Cc: Michael Ellerman, linuxppc-dev

On Wed, 2017-09-20 at 16:41 +0530, Naveen N . Rao wrote:
> On 2017/09/20 03:49PM, Kamalesh Babulal wrote:
> > While running stress test with livepatch module loaded, kernel
> > bug was triggered.
> > 
> > cpu 0x5: Vector: 400 (Instruction Access) at [c0000000eb9d3b60]
> >     pc: c0000000eb9d3e30
> >     lr: c0000000eb9d3e30
> >     sp: c0000000eb9d3de0
> >    msr: 800000001280b033
> >   current = 0xc0000000dbd38700
> >   paca    = 0xc00000000fe01400   softe: 0        irq_happened: 0x01
> >     pid   = 8618, comm = make
> > Linux version 4.13.0+ (root@ubuntu) (gcc version 6.3.0 20170406 (Ubuntu 6.3.0-12ubuntu2)) #1 SMP Wed Sep 13 03:49:27 EDT 2017
> > 
> > 5:mon> t
> > [c0000000eb9d3de0] c0000000eb9d3e30 (unreliable)
> > [c0000000eb9d3e30] c000000000008ab4 hardware_interrupt_common+0x114/0x120
> >  --- Exception: 501 (Hardware Interrupt) at c000000000053040 livepatch_handler+0x4c/0x74
> > [c0000000eb9d4120] 0000000057ac6e9d (unreliable)
> > [d0000000089d9f78] 2e0965747962382e
> > SP (965747962342e09) is in userspace
> > 
> > When an interrupt is served in between the livepatch_handler execution,
> > there are chances of the livepatch_stack/task task getting corrupted.
> > 
> >                     CPU 1
> >                  =============
> > Task A                          Interrupt Handler
> > =========                       =================
> > livepatch_handler:
> >  mr r0, r1
> >  ld r1, TI_livepatch_sp(r12)
> >                                 hardware_interrupt_common
> >                                 |_do_IRQ
> >                                   |_ call_do_irq:
> >                                         mflr r0
> >                                         std  r0,16(r1)
> >                                         stdu r1,THREAD_SIZE-STACK_FRAME_OVERHEAD(r4)
> >                                             ...
> > lis r2, STACK_END_MAGIC@h
> > ori r2, r2, STACK_END_MAGIC@l
> > ld  r12, -8(r1)      <- livepatch stack is corrupted.
> > 
> > Fix the corruption by using r11 register for livepatch stack manipulation,
> > instead of shuffling task stack and livepatch stack into r1 register.
> > Using r11 register also avoids disabling/enabling irq's while setting
> > up the livepatch stack.
> > 
> > Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
> > Cc: Balbir Singh <bsingharora@gmail.com>
> > Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
> > Cc: linuxppc-dev@lists.ozlabs.org
> > ---
> >  arch/powerpc/kernel/trace/ftrace_64_mprofile.S | 45 +++++++++-----------------
> >  1 file changed, 15 insertions(+), 30 deletions(-)
> 
> Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: powerpc/livepatch: Fix livepatch stack access
  2017-09-20 10:19 [PATCH] powerpc/livepatch: Fix livepatch stack access Kamalesh Babulal
  2017-09-20 11:11 ` Naveen N . Rao
  2017-09-21 10:02 ` Michael Ellerman
@ 2017-10-12  0:20 ` Michael Ellerman
  2 siblings, 0 replies; 9+ messages in thread
From: Michael Ellerman @ 2017-10-12  0:20 UTC (permalink / raw)
  To: Kamalesh Babulal; +Cc: linuxppc-dev, Naveen N . Rao, Kamalesh Babulal

On Wed, 2017-09-20 at 10:19:51 UTC, Kamalesh Babulal wrote:
> While running stress test with livepatch module loaded, kernel
> bug was triggered.
...
> 
> Signed-off-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
> Cc: Balbir Singh <bsingharora@gmail.com>
> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
> Cc: linuxppc-dev@lists.ozlabs.org

Applied to powerpc fixes, thanks.

https://git.kernel.org/powerpc/c/e36a82ee4c514a2f4f8fa30c780ad0

cheers

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2017-10-12  0:20 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-09-20 10:19 [PATCH] powerpc/livepatch: Fix livepatch stack access Kamalesh Babulal
2017-09-20 11:11 ` Naveen N . Rao
2017-09-29 11:47   ` Balbir Singh
2017-09-21 10:02 ` Michael Ellerman
2017-09-21 11:00   ` Balbir Singh
2017-09-21 11:53     ` Kamalesh Babulal
2017-09-21 16:55     ` Naveen N . Rao
2017-09-22  5:16       ` Kamalesh Babulal
2017-10-12  0:20 ` Michael Ellerman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).