* Query: ARM64: Behavior of el1_dbg exception while executing el0_dbg
@ 2015-01-08 13:15 Pratyush Anand
2015-01-08 15:49 ` William Cohen
2015-01-08 16:23 ` Will Deacon
0 siblings, 2 replies; 18+ messages in thread
From: Pratyush Anand @ 2015-01-08 13:15 UTC (permalink / raw)
To: linux-arm-kernel
Hi All,
I am trying to test the following scenario, which seems valid to me. But I
am very new to ARM64 as well as to debugging tools, so I am seeking expert
comments here.
-- I have inserted a kprobe at the function uprobe_breakpoint_handler,
which is called from el0_dbg
(el0_dbg->do_debug_exception->brk_handler->call_break_hook->uprobe_breakpoint_handler)
-- kprobe is enabled.
-- a uprobe is inserted into a test application and enabled.
So, when the uprobe is enabled and test code execution reaches the probed
instruction, it executes the uprobe breakpoint instruction and an el0_dbg
exception is raised.
When control reaches the start of uprobe_breakpoint_handler and it
executes the first instruction (which has been replaced with a kprobe
breakpoint instruction), an el1_dbg exception is raised.
The call sequence then goes
el1_dbg->do_debug_exception->brk_handler->call_break_hook->kprobe_breakpoint_handler,
and the kprobe breakpoint handler does everything it should have done.
After returning from the above (first) el1_dbg, a second el1_dbg is raised
for single-stepping the kprobe instruction, but the instruction pointer does
not match kcb->ss_ctx.match_addr, so kprobe_ss_hit fails,
which is strange.
To debug it further, I examined the ELR_EL1 value in el1_dbg after execution
of the first el1_dbg, and it was fffffdfffc000004.
So, my question is: how can the instruction pointer have the value
fffffe0000092470 (which is actually el1_inv + 0x4) when the second el1_dbg is
received?
Am I missing something, or trying something which is not supported by ARM64?
I have put some printks in the code. You can have a detailed view of the
debug code and print log here:
https://github.com/pratyushanand/linux.git
branch:
ml_arm64_uprobe_devel_debug_kprobe_insertion_at_uprobe_breakpoint_handler
~Pratyush
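For readers following the failing check, here is a minimal sketch of the single-step match described above, written as a standalone C model rather than the actual kprobes code. The struct ss_model, the helpers arm_single_step and single_step_hit, and the slot address 0xfffffdfffc000000 are assumptions made for illustration; only the match_addr field name and the two reported addresses come from the message.

```c
#include <stdbool.h>
#include <stdint.h>

/* AArch64 instructions are a fixed 4 bytes wide. */
#define INSN_SIZE 4u

/* Hypothetical stand-in for the single-step state kept in kcb->ss_ctx. */
struct ss_model {
    bool     pending;     /* a single-step exception is expected */
    uint64_t match_addr;  /* PC expected once the step has completed */
};

/* Arm the model before stepping the instruction copied to slot_addr. */
static void arm_single_step(struct ss_model *ctx, uint64_t slot_addr)
{
    ctx->pending = true;
    ctx->match_addr = slot_addr + INSN_SIZE;
}

/* True only if a step is pending and the exception PC is the expected one. */
static bool single_step_hit(const struct ss_model *ctx, uint64_t pc)
{
    return ctx->pending && ctx->match_addr == pc;
}
```

Under these assumptions, arming the model with the slot address gives an expected PC of 0xfffffdfffc000004, so a step exception reported at 0xfffffe0000092470 is rejected, which is the mismatch reported above.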
^ permalink raw reply [flat|nested] 18+ messages in thread
* Query: ARM64: Behavior of el1_dbg exception while executing el0_dbg
2015-01-08 13:15 Query: ARM64: Behavior of el1_dbg exception while executing el0_dbg Pratyush Anand
@ 2015-01-08 15:49 ` William Cohen
2015-01-08 17:19 ` Pratyush Anand
2015-01-08 16:23 ` Will Deacon
1 sibling, 1 reply; 18+ messages in thread
From: William Cohen @ 2015-01-08 15:49 UTC (permalink / raw)
To: linux-arm-kernel
On 01/08/2015 08:15 AM, Pratyush Anand wrote:
> Hi All,
>
> I am trying to test following scenario, which seems valid to me. But I am very new to ARM64 as well as to debugging tools, so seeking expert's comment here.
>
> -- I have inserted a kprobe to the function uprobe_breakpoint_handler which is called from elo_dbg (el0_dbg->do_debug_exception->brk_handler->call_break_hook->uprobe_breakpoint_handler)
>
> -- kprobe is enabled.
>
> -- an uprobe is inserted into a test application and enabled.
>
> So, when uprobe is enabled and test code execution reaches to probe instruction, it executes uprobe breakpoint instruction and el0_dbg exception is raised.
>
> When control reaches to start of uprobe_breakpoint_handler and it executes first instruction (which has been replaced with a kprobe breakpoint instruction), el1_dbg exception is raised.
>
> Further Call sequence goes like, el1_dbg->do_debug_exception->brk_handler->call_break_hook->kprobe_breakpoint_handler, and kprobe breakpoint handler does everything what it should have done.
>
> After return from above (first) el1_dbg, second el1_dbg is raised for single steping of kprobe instruction, and instruction pointer does not matches with the kcb->ss_ctx.match_addr and so, kprobe_ss_hit fails, which is strange.
>
> To debug it further, I examined ELR_EL1 value in el1_dbg after execution of first el1_dbg, and it was fffffdfffc000004.
>
> So, my question is how can instruction pointer has a value fffffe0000092470(which is actually el1_inv + 0x4) when second el1_dbg is received?
>
> Am I missing something or trying something which is not supported by ARM64?
>
>
> I have put some printk in the code. You can have a detailed view of debug code and print log here:
>
> https://github.com/pratyushanand/linux.git
>
> branch: ml_arm64_uprobe_devel_debug_kprobe_insertion_at_uprobe_breakpoint_handler
>
>
> ~Pratyush
Hi Pratyush,
Could the problem be that uprobe handling is already handling the breakpoint exception, so when the kprobe fires in the uprobes code it is like nested kprobes? When this happens, some state information gets clobbered and the process never gets out of the nested kprobe/uprobe. I wonder if similar failures can be triggered with kprobes in the kernel code that user-space gdb uses to handle breakpoints or hardware watchpoints/breakpoints.
-Will
^ permalink raw reply [flat|nested] 18+ messages in thread
* Query: ARM64: Behavior of el1_dbg exception while executing el0_dbg
2015-01-08 13:15 Query: ARM64: Behavior of el1_dbg exception while executing el0_dbg Pratyush Anand
2015-01-08 15:49 ` William Cohen
@ 2015-01-08 16:23 ` Will Deacon
2015-01-08 17:28 ` Pratyush Anand
1 sibling, 1 reply; 18+ messages in thread
From: Will Deacon @ 2015-01-08 16:23 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Jan 08, 2015 at 01:15:58PM +0000, Pratyush Anand wrote:
> Hi All,
>
> I am trying to test following scenario, which seems valid to me. But I
> am very new to ARM64 as well as to debugging tools, so seeking expert's
> comment here.
>
> -- I have inserted a kprobe to the function uprobe_breakpoint_handler
> which is called from elo_dbg
> (el0_dbg->do_debug_exception->brk_handler->call_break_hook->uprobe_breakpoint_handler)
>
> -- kprobe is enabled.
>
> -- an uprobe is inserted into a test application and enabled.
>
> So, when uprobe is enabled and test code execution reaches to probe
> instruction, it executes uprobe breakpoint instruction and el0_dbg
> exception is raised.
>
> When control reaches to start of uprobe_breakpoint_handler and it
> executes first instruction (which has been replaced with a kprobe
> breakpoint instruction), el1_dbg exception is raised.
Hmm, debug exceptions should be masked at this point so I don't see why
you're taking the second debug exception.
Will
^ permalink raw reply [flat|nested] 18+ messages in thread
* Query: ARM64: Behavior of el1_dbg exception while executing el0_dbg
2015-01-08 15:49 ` William Cohen
@ 2015-01-08 17:19 ` Pratyush Anand
0 siblings, 0 replies; 18+ messages in thread
From: Pratyush Anand @ 2015-01-08 17:19 UTC (permalink / raw)
To: linux-arm-kernel
On Thursday 08 January 2015 09:19 PM, William Cohen wrote:
> On 01/08/2015 08:15 AM, Pratyush Anand wrote:
>> Hi All,
>>
>> I am trying to test following scenario, which seems valid to me. But I am very new to ARM64 as well as to debugging tools, so seeking expert's comment here.
>>
>> -- I have inserted a kprobe to the function uprobe_breakpoint_handler which is called from elo_dbg (el0_dbg->do_debug_exception->brk_handler->call_break_hook->uprobe_breakpoint_handler)
>>
>> -- kprobe is enabled.
>>
>> -- an uprobe is inserted into a test application and enabled.
>>
>> So, when uprobe is enabled and test code execution reaches to probe instruction, it executes uprobe breakpoint instruction and el0_dbg exception is raised.
>>
>> When control reaches to start of uprobe_breakpoint_handler and it executes first instruction (which has been replaced with a kprobe breakpoint instruction), el1_dbg exception is raised.
>>
>> Further Call sequence goes like, el1_dbg->do_debug_exception->brk_handler->call_break_hook->kprobe_breakpoint_handler, and kprobe breakpoint handler does everything what it should have done.
>>
>> After return from above (first) el1_dbg, second el1_dbg is raised for single steping of kprobe instruction, and instruction pointer does not matches with the kcb->ss_ctx.match_addr and so, kprobe_ss_hit fails, which is strange.
>>
>> To debug it further, I examined ELR_EL1 value in el1_dbg after execution of first el1_dbg, and it was fffffdfffc000004.
>>
>> So, my question is how can instruction pointer has a value fffffe0000092470(which is actually el1_inv + 0x4) when second el1_dbg is received?
>>
>> Am I missing something or trying something which is not supported by ARM64?
>>
>>
>> I have put some printk in the code. You can have a detailed view of debug code and print log here:
>>
>> https://github.com/pratyushanand/linux.git
>>
>> branch: ml_arm64_uprobe_devel_debug_kprobe_insertion_at_uprobe_breakpoint_handler
>>
>>
>> ~Pratyush
>
> Hi Pratyush,
>
> Could the problem be that uprobe handling is already handling the breakpoint exception, so when the kprobe fires in the uprobes code it is like nested kprobes? When this happen some state information is getting clobbered and process never gets out of the nested kprobe/uprobe. I wonder if similar failures can be trigger with kprobes in the kernel code that user-space gdb uses the handle breakpoints or hardware watchpoints/breakpoints.
Yes, if kprobes are inserted in the kernel code that user-space gdb uses,
or in the hardware watchpoint/breakpoint handling code, I believe you will
land in a similar situation.
~Pratyush
>
> -Will
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Query: ARM64: Behavior of el1_dbg exception while executing el0_dbg
2015-01-08 16:23 ` Will Deacon
@ 2015-01-08 17:28 ` Pratyush Anand
2015-01-09 15:46 ` Will Deacon
0 siblings, 1 reply; 18+ messages in thread
From: Pratyush Anand @ 2015-01-08 17:28 UTC (permalink / raw)
To: linux-arm-kernel
On Thursday 08 January 2015 09:53 PM, Will Deacon wrote:
> On Thu, Jan 08, 2015 at 01:15:58PM +0000, Pratyush Anand wrote:
>> Hi All,
>>
>> I am trying to test following scenario, which seems valid to me. But I
>> am very new to ARM64 as well as to debugging tools, so seeking expert's
>> comment here.
>>
>> -- I have inserted a kprobe to the function uprobe_breakpoint_handler
>> which is called from elo_dbg
>> (el0_dbg->do_debug_exception->brk_handler->call_break_hook->uprobe_breakpoint_handler)
>>
>> -- kprobe is enabled.
>>
>> -- an uprobe is inserted into a test application and enabled.
>>
>> So, when uprobe is enabled and test code execution reaches to probe
>> instruction, it executes uprobe breakpoint instruction and el0_dbg
>> exception is raised.
>>
>> When control reaches to start of uprobe_breakpoint_handler and it
>> executes first instruction (which has been replaced with a kprobe
>> breakpoint instruction), el1_dbg exception is raised.
>
> Hmm, debug exceptions should be masked at this point so I don't see why
> you're taking the second debug exception.
>
So, you mean to say that while an exception taken from a lower
exception level (EL0) is being handled, exceptions from the current
exception level (EL1) are kept masked as well...
If so, then how do we handle it? One way is to add the __kprobes
qualifier to uprobe_breakpoint_handler and uprobe_single_step_handler,
so that a user cannot insert a kprobe there. But that does not seem
to be a good idea, because it only prevents these two functions from
being probed. What about the functions called by them, such as
uprobe_pre_sstep_notifier & uprobe_post_sstep_notifier, which live in
generic kernel code? So maybe we need something in debug-monitor
which handles this situation, no?
~Pratyush
> Will
>
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Query: ARM64: Behavior of el1_dbg exception while executing el0_dbg
2015-01-08 17:28 ` Pratyush Anand
@ 2015-01-09 15:46 ` Will Deacon
2015-01-09 17:13 ` Pratyush Anand
0 siblings, 1 reply; 18+ messages in thread
From: Will Deacon @ 2015-01-09 15:46 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Jan 08, 2015 at 05:28:37PM +0000, Pratyush Anand wrote:
> On Thursday 08 January 2015 09:53 PM, Will Deacon wrote:
> > On Thu, Jan 08, 2015 at 01:15:58PM +0000, Pratyush Anand wrote:
> >> I am trying to test following scenario, which seems valid to me. But I
> >> am very new to ARM64 as well as to debugging tools, so seeking expert's
> >> comment here.
> >>
> >> -- I have inserted a kprobe to the function uprobe_breakpoint_handler
> >> which is called from elo_dbg
> >> (el0_dbg->do_debug_exception->brk_handler->call_break_hook->uprobe_breakpoint_handler)
> >>
> >> -- kprobe is enabled.
> >>
> >> -- an uprobe is inserted into a test application and enabled.
> >>
> >> So, when uprobe is enabled and test code execution reaches to probe
> >> instruction, it executes uprobe breakpoint instruction and el0_dbg
> >> exception is raised.
> >>
> >> When control reaches to start of uprobe_breakpoint_handler and it
> >> executes first instruction (which has been replaced with a kprobe
> >> breakpoint instruction), el1_dbg exception is raised.
> >
> > Hmm, debug exceptions should be masked at this point so I don't see why
> > you're taking the second debug exception.
> >
>
> So, you mean to say that when an exception which has been taken from
> lower exception level (EL0) is being executed, then we keep masked also
> the exception from current exception level (EL1)...
Yeah, if you look at entry.S then you'll see that neither el0_dbg nor el1_dbg
re-enables debug exceptions (masked automatically by the CPU after taking the
exception) until *after* the handling has completed. This is to prevent
recursive debug exceptions, which I don't see how we can reasonably handle.
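The masking on exception entry mentioned above can be sketched as a small standalone C model. This is only an illustration under stated assumptions: struct cpu_model and both helper functions are hypothetical names, the bit positions follow the AArch64 PSTATE/SPSR layout (D = bit 9, A = bit 8, I = bit 7, F = bit 6), and the model tracks the mask bits only. It says nothing about which exception classes the D bit actually suppresses.

```c
#include <stdint.h>

/* DAIF mask bits and mode field as laid out in AArch64 PSTATE/SPSR. */
#define PSR_F_BIT      (1u << 6)
#define PSR_I_BIT      (1u << 7)
#define PSR_A_BIT      (1u << 8)
#define PSR_D_BIT      (1u << 9)
#define PSR_DAIF_MASK  (PSR_D_BIT | PSR_A_BIT | PSR_I_BIT | PSR_F_BIT)
#define PSR_MODE_MASK  0xFu
#define PSR_MODE_EL1h  0x5u

/* Hypothetical model of the two registers involved. */
struct cpu_model {
    uint32_t pstate;    /* current processor state */
    uint32_t spsr_el1;  /* state saved when an exception is taken to EL1 */
};

/* Exception entry: save the interrupted state, then mask D, A, I and F. */
static void take_exception_to_el1(struct cpu_model *cpu)
{
    cpu->spsr_el1 = cpu->pstate;
    cpu->pstate |= PSR_DAIF_MASK;
    cpu->pstate = (cpu->pstate & ~PSR_MODE_MASK) | PSR_MODE_EL1h;
}

/* Exception return: restore the state that was saved on entry. */
static void exception_return(struct cpu_model *cpu)
{
    cpu->pstate = cpu->spsr_el1;
}
```

In this model the handler runs with all four mask bits set until something clears them explicitly or the exception returns, which is why the question of where they get re-enabled matters.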
> If, so then how to handle it. One way is that I assign a __kprobe
> qualifier to uprobe_breakpoint_handler and uprobe_single_step_handler,
> so that an user can not insert a kprobe there. But, that does not seem
> to be a good idea, because it will only prevent these two functions to
> be probed. What about the functions which is being called by these
> functions like uprobe_pre_sstep_notifier & uprobe_post_sstep_notifier
> which lie in generic kernel code. So, may be we need something in
> debug-monitor, which handles this situation, no?
I'm not sure how to solve it, but we certainly can't allow debug exceptions
to trigger on the debug exception handling path. The first thing to do would
be finding out where they are getting re-enabled.
Will
^ permalink raw reply [flat|nested] 18+ messages in thread
* Query: ARM64: Behavior of el1_dbg exception while executing el0_dbg
2015-01-09 15:46 ` Will Deacon
@ 2015-01-09 17:13 ` Pratyush Anand
2015-01-12 17:30 ` Will Deacon
0 siblings, 1 reply; 18+ messages in thread
From: Pratyush Anand @ 2015-01-09 17:13 UTC (permalink / raw)
To: linux-arm-kernel
On Friday 09 January 2015 09:16 PM, Will Deacon wrote:
> On Thu, Jan 08, 2015 at 05:28:37PM +0000, Pratyush Anand wrote:
>> On Thursday 08 January 2015 09:53 PM, Will Deacon wrote:
>>> On Thu, Jan 08, 2015 at 01:15:58PM +0000, Pratyush Anand wrote:
>>>> I am trying to test following scenario, which seems valid to me. But I
>>>> am very new to ARM64 as well as to debugging tools, so seeking expert's
>>>> comment here.
>>>>
>>>> -- I have inserted a kprobe to the function uprobe_breakpoint_handler
>>>> which is called from elo_dbg
>>>> (el0_dbg->do_debug_exception->brk_handler->call_break_hook->uprobe_breakpoint_handler)
>>>>
>>>> -- kprobe is enabled.
>>>>
>>>> -- an uprobe is inserted into a test application and enabled.
>>>>
>>>> So, when uprobe is enabled and test code execution reaches to probe
>>>> instruction, it executes uprobe breakpoint instruction and el0_dbg
>>>> exception is raised.
>>>>
>>>> When control reaches to start of uprobe_breakpoint_handler and it
>>>> executes first instruction (which has been replaced with a kprobe
>>>> breakpoint instruction), el1_dbg exception is raised.
>>>
>>> Hmm, debug exceptions should be masked at this point so I don't see why
>>> you're taking the second debug exception.
>>>
>>
>> So, you mean to say that when an exception which has been taken from
>> lower exception level (EL0) is being executed, then we keep masked also
>> the exception from current exception level (EL1)...
>
> Yeah, if you look at entry.S then you'll see that neither el0_dbg or el1_dbg
> re-enable debug exceptions (masked automatically by the CPU after taking the
> exception) until *after* the handling has completed. This is to prevent
> recursive debug exceptions, which I don't see how we can reasonable handle.
Maybe I am missing something, but my observation on silicon is
different. Please have a look at the git log of HEAD of the following branch,
which says that an el1_dbg exception has been raised while el0_dbg was
executing. I do not know what I am missing.
https://github.com/pratyushanand/linux/tree/ml_arm64_uprobe_devel_debug_kprobe_insertion_at_uprobe_breakpoint_handler
>
>> If, so then how to handle it. One way is that I assign a __kprobe
>> qualifier to uprobe_breakpoint_handler and uprobe_single_step_handler,
>> so that an user can not insert a kprobe there. But, that does not seem
>> to be a good idea, because it will only prevent these two functions to
>> be probed. What about the functions which is being called by these
>> functions like uprobe_pre_sstep_notifier & uprobe_post_sstep_notifier
>> which lie in generic kernel code. So, may be we need something in
>> debug-monitor, which handles this situation, no?
>
> I'm not sure how to solve it, but we certainly can't allow debug exceptions
> to trigger on the debug exception handling path. The first thing to do would
> be finding out where they are getting re-enabled.
As of now I will put the uprobe_breakpoint_handler and
uprobe_single_step_handler symbols under NOKPROBE_SYMBOL.
Other than these, we should also put functions like brk_handler and
do_debug_exception (all those which come in the debug exception handling
path) under NOKPROBE_SYMBOL, as has been done in
arch/x86/kernel/traps.c.
In my opinion uprobe_pre_sstep_notifier and uprobe_post_sstep_notifier
should also be put under NOKPROBE_SYMBOL. Adding linux-kernel to comment.
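The effect of blacklisting functions this way can be sketched as an address-range check made before a probe is accepted. This is a userspace model under stated assumptions, not the kernel's implementation: struct blacklist_entry, within_blacklist, try_register_probe and the -1 return value are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical blacklist entry covering one function: [start, end). */
struct blacklist_entry {
    uint64_t start;
    uint64_t end;
};

/* True if addr falls inside any blacklisted function. */
static bool within_blacklist(const struct blacklist_entry *bl, size_t n,
                             uint64_t addr)
{
    for (size_t i = 0; i < n; i++) {
        if (addr >= bl[i].start && addr < bl[i].end)
            return true;
    }
    return false;
}

/* Refuse (return -1) a probe on a blacklisted address, else accept (0). */
static int try_register_probe(const struct blacklist_entry *bl, size_t n,
                              uint64_t addr)
{
    return within_blacklist(bl, n, addr) ? -1 : 0;
}
```

A check of this shape only protects the functions that are listed, which is the limitation raised earlier in the thread: every function reachable from the debug exception path would need an entry.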
~Pratyush
>
> Will
>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Query: ARM64: Behavior of el1_dbg exception while executing el0_dbg
2015-01-09 17:13 ` Pratyush Anand
@ 2015-01-12 17:30 ` Will Deacon
2015-01-12 19:25 ` William Cohen
2015-01-13 6:46 ` Pratyush Anand
0 siblings, 2 replies; 18+ messages in thread
From: Will Deacon @ 2015-01-12 17:30 UTC (permalink / raw)
To: linux-arm-kernel
On Fri, Jan 09, 2015 at 05:13:29PM +0000, Pratyush Anand wrote:
>
>
> On Friday 09 January 2015 09:16 PM, Will Deacon wrote:
> > On Thu, Jan 08, 2015 at 05:28:37PM +0000, Pratyush Anand wrote:
> >> On Thursday 08 January 2015 09:53 PM, Will Deacon wrote:
> >>> On Thu, Jan 08, 2015 at 01:15:58PM +0000, Pratyush Anand wrote:
> >>>> I am trying to test following scenario, which seems valid to me. But I
> >>>> am very new to ARM64 as well as to debugging tools, so seeking expert's
> >>>> comment here.
> >>>>
> >>>> -- I have inserted a kprobe to the function uprobe_breakpoint_handler
> >>>> which is called from elo_dbg
> >>>> (el0_dbg->do_debug_exception->brk_handler->call_break_hook->uprobe_breakpoint_handler)
> >>>>
> >>>> -- kprobe is enabled.
> >>>>
> >>>> -- an uprobe is inserted into a test application and enabled.
> >>>>
> >>>> So, when uprobe is enabled and test code execution reaches to probe
> >>>> instruction, it executes uprobe breakpoint instruction and el0_dbg
> >>>> exception is raised.
> >>>>
> >>>> When control reaches to start of uprobe_breakpoint_handler and it
> >>>> executes first instruction (which has been replaced with a kprobe
> >>>> breakpoint instruction), el1_dbg exception is raised.
> >>>
> >>> Hmm, debug exceptions should be masked at this point so I don't see why
> >>> you're taking the second debug exception.
> >>>
> >>
> >> So, you mean to say that when an exception which has been taken from
> >> lower exception level (EL0) is being executed, then we keep masked also
> >> the exception from current exception level (EL1)...
> >
> > Yeah, if you look at entry.S then you'll see that neither el0_dbg or el1_dbg
> > re-enable debug exceptions (masked automatically by the CPU after taking the
> > exception) until *after* the handling has completed. This is to prevent
> > recursive debug exceptions, which I don't see how we can reasonable handle.
>
> May be I am missing something, but my observation on silicon is
> different. Please have a look at git log of HEAD of following branch,
> which says that el1_dbg exception has been raised while el0_dbg was
> executing. Do not know what I am missing..
>
> https://github.com/pratyushanand/linux/tree/ml_arm64_uprobe_devel_debug_kprobe_insertion_at_uprobe_breakpoint_handler
That page just says "Failed to load latest commit information." for me.
Regardless, I think you need to debug further and find out if PSTATE.D is
getting cleared and, if so, who is responsible for that. Somebody could be
enabling IRQs, for example, which will then unmask debug exceptions in
el1_irq.
Will
^ permalink raw reply [flat|nested] 18+ messages in thread
* Query: ARM64: Behavior of el1_dbg exception while executing el0_dbg
2015-01-12 17:30 ` Will Deacon
@ 2015-01-12 19:25 ` William Cohen
2015-01-13 6:46 ` Pratyush Anand
1 sibling, 0 replies; 18+ messages in thread
From: William Cohen @ 2015-01-12 19:25 UTC (permalink / raw)
To: linux-arm-kernel
On 01/12/2015 12:30 PM, Will Deacon wrote:
> On Fri, Jan 09, 2015 at 05:13:29PM +0000, Pratyush Anand wrote:
>>
>>
>> On Friday 09 January 2015 09:16 PM, Will Deacon wrote:
>>> On Thu, Jan 08, 2015 at 05:28:37PM +0000, Pratyush Anand wrote:
>>>> On Thursday 08 January 2015 09:53 PM, Will Deacon wrote:
>>>>> On Thu, Jan 08, 2015 at 01:15:58PM +0000, Pratyush Anand wrote:
>>>>>> I am trying to test following scenario, which seems valid to me. But I
>>>>>> am very new to ARM64 as well as to debugging tools, so seeking expert's
>>>>>> comment here.
>>>>>>
>>>>>> -- I have inserted a kprobe to the function uprobe_breakpoint_handler
>>>>>> which is called from elo_dbg
>>>>>> (el0_dbg->do_debug_exception->brk_handler->call_break_hook->uprobe_breakpoint_handler)
>>>>>>
>>>>>> -- kprobe is enabled.
>>>>>>
>>>>>> -- an uprobe is inserted into a test application and enabled.
>>>>>>
>>>>>> So, when uprobe is enabled and test code execution reaches to probe
>>>>>> instruction, it executes uprobe breakpoint instruction and el0_dbg
>>>>>> exception is raised.
>>>>>>
>>>>>> When control reaches to start of uprobe_breakpoint_handler and it
>>>>>> executes first instruction (which has been replaced with a kprobe
>>>>>> breakpoint instruction), el1_dbg exception is raised.
>>>>>
>>>>> Hmm, debug exceptions should be masked at this point so I don't see why
>>>>> you're taking the second debug exception.
>>>>>
>>>>
>>>> So, you mean to say that when an exception which has been taken from
>>>> lower exception level (EL0) is being executed, then we keep masked also
>>>> the exception from current exception level (EL1)...
>>>
>>> Yeah, if you look at entry.S then you'll see that neither el0_dbg or el1_dbg
>>> re-enable debug exceptions (masked automatically by the CPU after taking the
>>> exception) until *after* the handling has completed. This is to prevent
>>> recursive debug exceptions, which I don't see how we can reasonable handle.
>>
>> May be I am missing something, but my observation on silicon is
>> different. Please have a look at git log of HEAD of following branch,
>> which says that el1_dbg exception has been raised while el0_dbg was
>> executing. Do not know what I am missing..
>>
>> https://github.com/pratyushanand/linux/tree/ml_arm64_uprobe_devel_debug_kprobe_insertion_at_uprobe_breakpoint_handler
>
> That page just says "Failed to load latest commit information." for me.
I got that message too, but I was able to see the history and the information in the first entry of:
https://github.com/pratyushanand/linux/commits/ml_arm64_uprobe_devel_debug_kprobe_insertion_at_uprobe_breakpoint_handler
>
> Regardless, I think you need to debug further and found out if PSTATE.D is
> getting cleared and, if so, who is responsible for that. Somebody could be
> enabling IRQs, for example, which will then unmask debug exceptions in
> el1_irq.
>
> Will
>
If the problem is due to the irq being enabled and then an irq handler re-enabling the flag, it would be possible to use a systemtap script to monitor the irq_handler_entry and irq_handler_exit tracepoints to see if PSTATE.D is getting cleared. Maybe something like the attached script. This script isn't using the kprobe support, so it should avoid the problematic interactions between kprobes and uprobes.
-Will Cohen
-------------- next part --------------
global pstated
function masked_dflag:long(f) { return ((f & (1 << 9)) != 0) }
probe irq_handler.entry {
  // Record if pstate.d is masked
  pstated[cpu(), irq] = masked_dflag(flags)
}
probe irq_handler.exit {
  if ((!masked_dflag(flags)) && pstated[cpu(), irq]) {
    printf("d flag unmasked in irq %d(%s)\n", irq, kernel_string(dev_name));
  }
  delete pstated[cpu(), irq]
}
^ permalink raw reply [flat|nested] 18+ messages in thread
* Query: ARM64: Behavior of el1_dbg exception while executing el0_dbg
2015-01-12 17:30 ` Will Deacon
2015-01-12 19:25 ` William Cohen
@ 2015-01-13 6:46 ` Pratyush Anand
2015-01-13 15:52 ` Catalin Marinas
1 sibling, 1 reply; 18+ messages in thread
From: Pratyush Anand @ 2015-01-13 6:46 UTC (permalink / raw)
To: linux-arm-kernel
On Monday 12 January 2015 11:00 PM, Will Deacon wrote:
> On Fri, Jan 09, 2015 at 05:13:29PM +0000, Pratyush Anand wrote:
>>
>>
>> On Friday 09 January 2015 09:16 PM, Will Deacon wrote:
>>> On Thu, Jan 08, 2015 at 05:28:37PM +0000, Pratyush Anand wrote:
>>>> On Thursday 08 January 2015 09:53 PM, Will Deacon wrote:
>>>>> On Thu, Jan 08, 2015 at 01:15:58PM +0000, Pratyush Anand wrote:
>>>>>> I am trying to test following scenario, which seems valid to me. But I
>>>>>> am very new to ARM64 as well as to debugging tools, so seeking expert's
>>>>>> comment here.
>>>>>>
>>>>>> -- I have inserted a kprobe to the function uprobe_breakpoint_handler
>>>>>> which is called from elo_dbg
>>>>>> (el0_dbg->do_debug_exception->brk_handler->call_break_hook->uprobe_breakpoint_handler)
>>>>>>
>>>>>> -- kprobe is enabled.
>>>>>>
>>>>>> -- an uprobe is inserted into a test application and enabled.
>>>>>>
>>>>>> So, when uprobe is enabled and test code execution reaches to probe
>>>>>> instruction, it executes uprobe breakpoint instruction and el0_dbg
>>>>>> exception is raised.
>>>>>>
>>>>>> When control reaches to start of uprobe_breakpoint_handler and it
>>>>>> executes first instruction (which has been replaced with a kprobe
>>>>>> breakpoint instruction), el1_dbg exception is raised.
>>>>>
>>>>> Hmm, debug exceptions should be masked at this point so I don't see why
>>>>> you're taking the second debug exception.
>>>>>
>>>>
>>>> So, you mean to say that when an exception which has been taken from
>>>> lower exception level (EL0) is being executed, then we keep masked also
>>>> the exception from current exception level (EL1)...
>>>
>>> Yeah, if you look at entry.S then you'll see that neither el0_dbg or el1_dbg
>>> re-enable debug exceptions (masked automatically by the CPU after taking the
>>> exception) until *after* the handling has completed. This is to prevent
>>> recursive debug exceptions, which I don't see how we can reasonable handle.
>>
>> May be I am missing something, but my observation on silicon is
>> different. Please have a look at git log of HEAD of following branch,
>> which says that el1_dbg exception has been raised while el0_dbg was
>> executing. Do not know what I am missing..
>>
>> https://github.com/pratyushanand/linux/tree/ml_arm64_uprobe_devel_debug_kprobe_insertion_at_uprobe_breakpoint_handler
>
> That page just says "Failed to load latest commit information." for me.
Maybe you can fetch https://github.com/pratyushanand/linux.git and
see the git log of HEAD of
ml_arm64_uprobe_devel_debug_kprobe_insertion_at_uprobe_breakpoint_handler.
Or, you can apply the attached patches on top of the v3.18 kernel.
>
> Regardless, I think you need to debug further and found out if PSTATE.D is
> getting cleared and, if so, who is responsible for that. Somebody could be
> enabling IRQs, for example, which will then unmask debug exceptions in
> el1_irq.
>
This is what I see for pstate: when the el0_dbg exception is raised (i.e. an
exception raised with ESR = ESR_EL1_EC_BRK64 after executing the instruction
BRK64_OPCODE_UPROBES = 0xD4200100 in EL0, user mode), the spsr_el1 value is
0x80000000, which means all exceptions are unmasked. Is that expected?
~Pratyush
-------------- next part --------------
A non-text attachment was scrubbed...
Name: uprobe_kprobe_patches_over_v3.18.tar.bz2
Type: application/x-bzip
Size: 26669 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20150113/4cbfc661/attachment-0001.bin>
^ permalink raw reply [flat|nested] 18+ messages in thread
* Query: ARM64: Behavior of el1_dbg exception while executing el0_dbg
2015-01-13 6:46 ` Pratyush Anand
@ 2015-01-13 15:52 ` Catalin Marinas
2015-01-13 17:53 ` Pratyush Anand
0 siblings, 1 reply; 18+ messages in thread
From: Catalin Marinas @ 2015-01-13 15:52 UTC (permalink / raw)
To: linux-arm-kernel
On Tue, Jan 13, 2015 at 06:46:36AM +0000, Pratyush Anand wrote:
> On Monday 12 January 2015 11:00 PM, Will Deacon wrote:
> > On Fri, Jan 09, 2015 at 05:13:29PM +0000, Pratyush Anand wrote:
> >> On Friday 09 January 2015 09:16 PM, Will Deacon wrote:
> >>> On Thu, Jan 08, 2015 at 05:28:37PM +0000, Pratyush Anand wrote:
> >>>> On Thursday 08 January 2015 09:53 PM, Will Deacon wrote:
> >>>>> On Thu, Jan 08, 2015 at 01:15:58PM +0000, Pratyush Anand wrote:
> >>>>>> I am trying to test following scenario, which seems valid to me. But I
> >>>>>> am very new to ARM64 as well as to debugging tools, so seeking expert's
> >>>>>> comment here.
> >>>>>>
> >>>>>> -- I have inserted a kprobe to the function uprobe_breakpoint_handler
> >>>>>> which is called from elo_dbg
> >>>>>> (el0_dbg->do_debug_exception->brk_handler->call_break_hook->uprobe_breakpoint_handler)
> >>>>>>
> >>>>>> -- kprobe is enabled.
> >>>>>>
> >>>>>> -- an uprobe is inserted into a test application and enabled.
> >>>>>>
> >>>>>> So, when uprobe is enabled and test code execution reaches to probe
> >>>>>> instruction, it executes uprobe breakpoint instruction and el0_dbg
> >>>>>> exception is raised.
> >>>>>>
> >>>>>> When control reaches to start of uprobe_breakpoint_handler and it
> >>>>>> executes first instruction (which has been replaced with a kprobe
> >>>>>> breakpoint instruction), el1_dbg exception is raised.
> >>>>>
> >>>>> Hmm, debug exceptions should be masked at this point so I don't see why
> >>>>> you're taking the second debug exception.
> >>>>
> >>>> So, you mean to say that when an exception which has been taken from
> >>>> lower exception level (EL0) is being executed, then we keep masked also
> >>>> the exception from current exception level (EL1)...
> >>>
> >>> Yeah, if you look at entry.S then you'll see that neither el0_dbg or el1_dbg
> >>> re-enable debug exceptions (masked automatically by the CPU after taking the
> >>> exception) until *after* the handling has completed. This is to prevent
> >>> recursive debug exceptions, which I don't see how we can reasonable handle.
> >>
> >> May be I am missing something, but my observation on silicon is
> >> different. Please have a look at git log of HEAD of following branch,
> >> which says that el1_dbg exception has been raised while el0_dbg was
> >> executing. Do not know what I am missing..
[...]
> > Regardless, I think you need to debug further and found out if PSTATE.D is
> > getting cleared and, if so, who is responsible for that. Somebody could be
> > enabling IRQs, for example, which will then unmask debug exceptions in
> > el1_irq.
>
> This is what I see for pstate, When el0_dbg exception is raised (ie an
> exception raised with ESR = ESR_EL1_EC_BRK64 after executing instruction
> BRK64_OPCODE_UPROBES = 0xD4200100 in EL0, user mode), spsr_el1 value is
> 0x80000000. Which means, all exceptions are unmasked. Is it expected?
spsr_el1 is the EL0 pstate saved when entering EL1. So it is expected
that user space always has interrupts enabled.
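As an illustration of the point above, here is a minimal Python sketch that splits the reported spsr_el1 value into its condition flags, D/A/I/F mask bits and mode field. The function and table names are hypothetical; the bit positions are those of the AArch64 SPSR_EL1 layout.

```python
# Bit positions taken from the AArch64 SPSR_EL1 layout.
# All names below are illustrative and do not come from the kernel sources.
CONDITION_FLAGS = {"N": 31, "Z": 30, "C": 29, "V": 28}
EXCEPTION_MASKS = {"D": 9, "A": 8, "I": 7, "F": 6}


def decode_spsr(spsr):
    """Split an SPSR_EL1 value into set flags, set mask bits and M[3:0]."""
    flags = [name for name, bit in CONDITION_FLAGS.items() if (spsr >> bit) & 1]
    masked = [name for name, bit in EXCEPTION_MASKS.items() if (spsr >> bit) & 1]
    return {"flags": flags, "masked": masked, "mode": spsr & 0xF}


# Value reported in the thread for the exception taken from EL0.
print(decode_spsr(0x80000000))
```

For 0x80000000 only the N flag is set, no mask bit is set, and M[3:0] is 0 (EL0t), which is consistent with the value describing the interrupted user-space state rather than the state of the handler.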
--
Catalin
* Query: ARM64: Behavior of el1_dbg exception while executing el0_dbg
2015-01-13 15:52 ` Catalin Marinas
@ 2015-01-13 17:53 ` Pratyush Anand
2015-01-15 16:47 ` Pratyush Anand
0 siblings, 1 reply; 18+ messages in thread
From: Pratyush Anand @ 2015-01-13 17:53 UTC (permalink / raw)
To: linux-arm-kernel
On Tuesday 13 January 2015 09:22 PM, Catalin Marinas wrote:
> On Tue, Jan 13, 2015 at 06:46:36AM +0000, Pratyush Anand wrote:
>> On Monday 12 January 2015 11:00 PM, Will Deacon wrote:
>>> On Fri, Jan 09, 2015 at 05:13:29PM +0000, Pratyush Anand wrote:
>>>> On Friday 09 January 2015 09:16 PM, Will Deacon wrote:
>>>>> On Thu, Jan 08, 2015 at 05:28:37PM +0000, Pratyush Anand wrote:
>>>>>> On Thursday 08 January 2015 09:53 PM, Will Deacon wrote:
>>>>>>> On Thu, Jan 08, 2015 at 01:15:58PM +0000, Pratyush Anand wrote:
>>>>>>>> I am trying to test following scenario, which seems valid to me. But I
>>>>>>>> am very new to ARM64 as well as to debugging tools, so seeking expert's
>>>>>>>> comment here.
>>>>>>>>
>>>>>>>> -- I have inserted a kprobe to the function uprobe_breakpoint_handler
>>>>>>>> which is called from elo_dbg
>>>>>>>> (el0_dbg->do_debug_exception->brk_handler->call_break_hook->uprobe_breakpoint_handler)
>>>>>>>>
>>>>>>>> -- kprobe is enabled.
>>>>>>>>
>>>>>>>> -- an uprobe is inserted into a test application and enabled.
>>>>>>>>
>>>>>>>> So, when uprobe is enabled and test code execution reaches to probe
>>>>>>>> instruction, it executes uprobe breakpoint instruction and el0_dbg
>>>>>>>> exception is raised.
>>>>>>>>
>>>>>>>> When control reaches to start of uprobe_breakpoint_handler and it
>>>>>>>> executes first instruction (which has been replaced with a kprobe
>>>>>>>> breakpoint instruction), el1_dbg exception is raised.
>>>>>>>
>>>>>>> Hmm, debug exceptions should be masked at this point so I don't see why
>>>>>>> you're taking the second debug exception.
>>>>>>
>>>>>> So, you mean to say that when an exception which has been taken from
>>>>>> lower exception level (EL0) is being executed, then we keep masked also
>>>>>> the exception from current exception level (EL1)...
>>>>>
>>>>> Yeah, if you look at entry.S then you'll see that neither el0_dbg or el1_dbg
>>>>> re-enable debug exceptions (masked automatically by the CPU after taking the
>>>>> exception) until *after* the handling has completed. This is to prevent
>>>>> recursive debug exceptions, which I don't see how we can reasonable handle.
>>>>
>>>> May be I am missing something, but my observation on silicon is
>>>> different. Please have a look at git log of HEAD of following branch,
>>>> which says that el1_dbg exception has been raised while el0_dbg was
>>>> executing. Do not know what I am missing..
> [...]
>>> Regardless, I think you need to debug further and found out if PSTATE.D is
>>> getting cleared and, if so, who is responsible for that. Somebody could be
>>> enabling IRQs, for example, which will then unmask debug exceptions in
>>> el1_irq.
>>
>> This is what I see for pstate, When el0_dbg exception is raised (ie an
>> exception raised with ESR = ESR_EL1_EC_BRK64 after executing instruction
>> BRK64_OPCODE_UPROBES = 0xD4200100 in EL0, user mode), spsr_el1 value is
>> 0x80000000. Which means, all exceptions are unmasked. Is it expected?
>
> spsr_el1 is the EL0 pstate saved when entering EL1. So it is expected
> that user space always has interrupts enabled.
>
Yes, I was wrong :(
By the way, is there a way to read cpsr or the current PSTATE.D?
That would help me know whether PSTATE.D was unmasked just before
executing BRK64_OPCODE_UPROBES. Adding a print to the enable_dbg macro
causes other issues and does not allow the system to boot.
I will still try to find some way to capture the enable_dbg macro
path. However, if I just examine the code flow, I do not see a
situation where enable_dbg could have been called after receiving
el0_dbg (or is there some path other than enable_dbg that can
re-enable debug exceptions?).
-- Application executes BRK64_OPCODE_UPROBES.
-- el0_sync is raised.
-- el0_sync
   -> kernel_entry 0
   -> el0_dbg
      -> do_debug_exception
         -> brk_handler
            -> call_break_hook
               -> uprobe_breakpoint_handler
Nothing in the above path seems to call enable_dbg, so how do we receive
el1_sync when the first instruction of uprobe_breakpoint_handler (which has
been replaced with BRK64_OPCODE_KPROBES) is executed?
~Pratyush
* Query: ARM64: Behavior of el1_dbg exception while executing el0_dbg
2015-01-13 17:53 ` Pratyush Anand
@ 2015-01-15 16:47 ` Pratyush Anand
2015-01-16 12:00 ` Pratyush Anand
0 siblings, 1 reply; 18+ messages in thread
From: Pratyush Anand @ 2015-01-15 16:47 UTC (permalink / raw)
To: linux-arm-kernel
Hi Will / Catalin,
On Tuesday 13 January 2015 11:23 PM, Pratyush Anand wrote:
> I will still try to find some way to capture enable_dbg macro path.H
I instrumented debug tap points at all the locations from which the
enable_dbg macro is called (see attached debug patch). However, I do not
see execution reaching any of those tap points between el0_dbg
and el1_dbg, and the tap point debug log also confirms that el1_dbg is
raised before el0_dbg returns.
Details of the log and code base can be seen here:
https://github.com/pratyushanand/linux/tree/ml_arm64_uprobe_devel_debug_kprobe_insertion_at_uprobe_breakpoint_handler
I am also providing the debug log corresponding to the attached patch here for
quick reference. Please see if there is anything I am still
missing in my analysis.
Steps at user level:
================================
// Insert a kprobe at the 1st instruction of uprobe_breakpoint_handler. The
// 1st instruction of uprobe_breakpoint_handler is replaced by
// BRK64_OPCODE_KPROBES when the kprobe is enabled.
echo 'p:myprobe uprobe_breakpoint_handler' > /sys/kernel/debug/tracing/kprobe_events
// Enable the kprobe
echo 1 > /sys/kernel/debug/tracing/events/kprobes/enable
// Run the test application
./test&
// Insert a uprobe at offset 0x5d0 of the test application. The
// instruction at this offset is replaced by BRK64_OPCODE_UPROBES
// when the uprobe is enabled.
echo 'p:test_entry test:0x5d0' > /sys/kernel/debug/tracing/uprobe_events
// Enable the uprobe
echo 1 > /sys/kernel/debug/tracing/events/uprobes/enable
Observed flow summary
========================
A kprobe has been inserted at the 1st instruction of
uprobe_breakpoint_handler, and a
uprobe has been inserted at offset 0x5d0 of the test application.
The observed execution flow is as follows:
-- Application executes BRK64_OPCODE_UPROBES.
-- el0_sync is raised.
-- el0_sync
   -> kernel_entry 0
   -> el0_dbg
      -> do_debug_exception
         -> brk_handler
            -> call_break_hook
               -> uprobe_breakpoint_handler
(1st instruction of uprobe_breakpoint_handler has been replaced with
BRK64_OPCODE_KPROBES)
-- el1_sync is raised.
-- el1_sync
   -> kernel_entry 1
   -> el1_dbg
      -> do_debug_exception
         -> brk_handler
            -> call_break_hook
               -> kprobe_breakpoint_handler
The following printk messages confirm the above flow. printk calls have been
avoided in the el0_dbg and el1_dbg execution paths. All the tap points for
these paths are written into a per_cpu array and then
printed when kprobe_breakpoint_handler is executed.
Tap points have been instrumented wherever the enable_dbg macro is called,
and also in the uprobe/kprobe break/single step exception paths.
printk debug messages with comments
============================================
[ 60.846047] arch_prepare_kprobe called at 89
[ 60.850344] arch_prepare_kprobe called at 97
[ 60.854595] arch_prepare_kprobe called at 110
[ 60.858959] arch_prepare_kprobe called at 114 with slot fffffdfffc000004
[ 60.865633] arch_prepare_ss_slot called at 46
[ 60.874466] arch_arm_kprobe called at 143
[ 60.878487] patch_text called at 136
[ 60.904226] arch_uprobe_analyze_insn called at 54
[ 60.908939] arch_uprobe_analyze_insn called at 68
[ 60.914155] 0.0: event 0 syndrom 0 @cpu 0
[ 60.918151] 0.0: event 0 syndrom 0 @cpu 1
[ 60.922143] 0.0: event 0 syndrom 0 @cpu 2
[ 60.926134] 1421337852.798722179: event 19 syndrom f2000008 @cpu 3
[1][Pratyush]: ESR = f2000008 and event 19 indicate a uprobe breakpoint
exception
[ 60.932286] 1421337852.798722179: event 19 syndrom f2000004 @cpu 3
[2][Pratyush]: ESR = f2000004 and event 19 indicate a kprobe breakpoint
exception
[ 60.938438] 1421337852.798722179: event 23 syndrom f2000004 @cpu 3
[3][Pratyush]: ESR = f2000004 and event 23 indicate that we are in function
kprobe_breakpoint_handler
Since we did not receive any event corresponding to a call of the
enable_dbg macro
or to execution of either uprobe_breakpoint_handler or
uprobe_single_step_handler, it is confirmed that
we received el1_dbg while executing el0_dbg.
[ 60.944590] 0.0: event 0 syndrom 0 @cpu 3
[ 60.948579] 0.0: event 0 syndrom 0 @cpu 4
[ 60.952569] 0.0: event 0 syndrom 0 @cpu 5
[ 60.956558] 0.0: event 0 syndrom 0 @cpu 6
[ 60.960547] 0.0: event 0 syndrom 0 @cpu 7
[ 60.964539] kprobe_handler called at 453 with addr fffffe000009fd80
[ 60.970778] kprobe_handler called at 456
[ 60.974681] kprobe_handler called at 465
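The two syndrome values in the log above can be cross-checked with a short Python sketch. It assumes the standard ESR_ELx layout (EC in bits [31:26], the BRK immediate in the low 16 bits of the ISS); the helper name and constants are hypothetical and are not taken from the debug patch.

```python
# Assumed ESR_ELx field layout: EC = bits [31:26], BRK comment = ISS[15:0].
ESR_EC_SHIFT = 26
ESR_EC_MASK = 0x3F
EC_BRK64 = 0x3C  # BRK instruction executed in AArch64 state


def decode_brk_esr(esr):
    """Return the BRK immediate carried by a BRK64 syndrome value."""
    ec = (esr >> ESR_EC_SHIFT) & ESR_EC_MASK
    if ec != EC_BRK64:
        raise ValueError("not a BRK64 syndrome: EC=%#x" % ec)
    return esr & 0xFFFF


# Syndrome values copied from the log above.
for esr in (0xF2000008, 0xF2000004):
    print(hex(esr), "-> BRK immediate", decode_brk_esr(esr))
```

Both values carry EC 0x3c and differ only in the immediate (8 for the first, 4 for the second), which is how the two breakpoint sources can be told apart in the log.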
~Pratyush
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Debug-kprobe-insertion-at-uprobe_breakpoint_handler.patch
Type: text/x-patch
Size: 33060 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20150115/ba388265/attachment-0001.bin>
* Query: ARM64: Behavior of el1_dbg exception while executing el0_dbg
2015-01-15 16:47 ` Pratyush Anand
@ 2015-01-16 12:00 ` Pratyush Anand
2015-01-16 14:55 ` Pratyush Anand
2015-01-16 16:22 ` Will Deacon
0 siblings, 2 replies; 18+ messages in thread
From: Pratyush Anand @ 2015-01-16 12:00 UTC (permalink / raw)
To: linux-arm-kernel
Hi Will,
On Thursday 15 January 2015 10:17 PM, Pratyush Anand wrote:
> Hi Will / Catalin,
>
> On Tuesday 13 January 2015 11:23 PM, Pratyush Anand wrote:
>> I will still try to find some way to capture enable_dbg macro path.H
>
> I did instrumented debug tap points at all the location from where
> enable_debug macro is called(see attached debug patch). But, I do not
> see that, execution reaches to any of those tap points between el0_dbg
> and el1_dbg, and tap points debug log also confirms that el1_dbg is
> raised before el0_dbg is returned.
Probably we all missed this; the ARMv8 spec is very clear about it. In
section "D2.1 About debug exceptions" it says:
Software Breakpoint Instruction exceptions cannot be masked. The PE
takes Software Breakpoint Instruction exceptions regardless of both of
the following:
- The current Exception level.
- The current Security state.
So, receiving el1_dbg while executing el0_dbg seems perfectly normal
to me. If you agree, then I am back to the original query which I asked
at the beginning of the
thread (http://permalink.gmane.org/gmane.linux.ports.arm.kernel/383672),
i.e. how can instruction_pointer be wrong when the second el1_dbg is raised
recursively (as follows)?
[1] -> el0_dbg (after the user executes the BRK instruction)
[2] -> el1_dbg (when the uprobe break handler at [1] executes the BRK instruction)
(At the end of this, ELR_EL1 is programmed with fffffdfffc000004.)
[3] -> el1_dbg (when the kprobe break handler at [2] enables single stepping)
(Here ELR_EL1 was found to be fffffe0000092470.) So when this el1_dbg was
received, the regs->pc value was not the same as what was programmed into
ELR_EL1 at the return of [2].
~Pratyush
* Query: ARM64: Behavior of el1_dbg exception while executing el0_dbg
2015-01-16 12:00 ` Pratyush Anand
@ 2015-01-16 14:55 ` Pratyush Anand
2015-01-16 16:22 ` Will Deacon
1 sibling, 0 replies; 18+ messages in thread
From: Pratyush Anand @ 2015-01-16 14:55 UTC (permalink / raw)
To: linux-arm-kernel
Sorry for writing so many mails... But I have one more piece of information
which could help to further explain the behavior. See below.
On Friday 16 January 2015 05:30 PM, Pratyush Anand wrote:
> Hi Will,
>
>
> On Thursday 15 January 2015 10:17 PM, Pratyush Anand wrote:
>> Hi Will / Catalin,
>>
>> On Tuesday 13 January 2015 11:23 PM, Pratyush Anand wrote:
>>> I will still try to find some way to capture enable_dbg macro path.H
>>
>> I did instrumented debug tap points at all the location from where
>> enable_debug macro is called(see attached debug patch). But, I do not
>> see that, execution reaches to any of those tap points between el0_dbg
>> and el1_dbg, and tap points debug log also confirms that el1_dbg is
>> raised before el0_dbg is returned.
>
> Probably we all missed this, ARMv8 specs is very clear about it. In
> section "D2.1 About debug exceptions" it says:
>
> Software Breakpoint Instruction exceptions cannot be masked. The PE
> takes Software Breakpoint Instruction exceptions regardless of both of
> the following:
> - The current Exception level.
> - The current Security state.
>
> So, reception of el1_dbg while executing el0_dbg seems perfectly normal
> to me. If you agree then I am back with the original query which I asked
> in the beginning of the
> thread,(http://permalink.gmane.org/gmane.linux.ports.arm.kernel/383672)
> ie how can instruction_pointer be wrong when second el1_dbg is called
> recursively(as follows).
>
> [1]-> el0_dbg (After executing BRK instruction by user)
> [2] -> el1_dbg (when uprobe break handler at [1] executes BRK
> instruction)
> (At the end of this ELR_EL1 is programmed with fffffdfffc000004)
With the new tap point debug in entry.S, I see that:
After this we receive one more exception, and that is el1_inv. Then,
as soon as enable_dbg is called in el1_inv, we receive the next single step
exception, with ELR_EL1 holding the address of the instruction following
enable_dbg in el1_inv. The EC value of ESR_EL1 (0x86000007) in el1_inv is
0x21, i.e. ESR_EL1_EC_IABT_EL1, and the IFSC is 0x07.
Hmm... So why did we receive an instruction abort in EL1 due to a
"Translation fault, third level" here? I do not have enough knowledge yet
to decipher it... :(
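The EC and IFSC values quoted above can be reproduced from the raw syndrome with a small Python sketch. It assumes the standard ESR_ELx layout (EC in bits [31:26], IFSC in bits [5:0]) and lists only the translation-fault encodings; the names are hypothetical.

```python
# Assumed ESR_ELx layout: EC = bits [31:26], IFSC = bits [5:0].
EC_IABT_CUR_EL = 0x21  # instruction abort taken without a change in EL

# Only the translation-fault status codes are listed here (fault code -> level).
TRANSLATION_FAULT_LEVEL = {0x04: 0, 0x05: 1, 0x06: 2, 0x07: 3}


def decode_iabt(esr):
    """Return (EC, IFSC, translation level or None) for a syndrome value."""
    ec = (esr >> 26) & 0x3F
    ifsc = esr & 0x3F
    return ec, ifsc, TRANSLATION_FAULT_LEVEL.get(ifsc)


ec, ifsc, level = decode_iabt(0x86000007)
print("EC=%#x IFSC=%#x translation fault level=%s" % (ec, ifsc, level))
```

A level 3 translation fault on an instruction fetch means the last-level page table lookup for the fetched address found no valid entry, so the address being executed is the first thing to check.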
> [3] -> el1_dbg (when kprobe break handler at [2] enables single
> stepping)
> (Here ELR_EL1 was found fffffe0000092470).So When this el1_dbg
> was received, then regs->pc values are not same what was programmed in
> ELR_EL1 at the return of [2].
>
~Pratyush
PS: Debug code is here:
https://github.com/pratyushanand/linux.git :
ml_arm64_uprobe_devel_debug_el1_inv_while_kprobe_insertion_at_uprobe_breakpoint_handler
* Query: ARM64: Behavior of el1_dbg exception while executing el0_dbg
2015-01-16 12:00 ` Pratyush Anand
2015-01-16 14:55 ` Pratyush Anand
@ 2015-01-16 16:22 ` Will Deacon
2015-01-19 6:10 ` Pratyush Anand
1 sibling, 1 reply; 18+ messages in thread
From: Will Deacon @ 2015-01-16 16:22 UTC (permalink / raw)
To: linux-arm-kernel
On Fri, Jan 16, 2015 at 12:00:09PM +0000, Pratyush Anand wrote:
> On Thursday 15 January 2015 10:17 PM, Pratyush Anand wrote:
> > On Tuesday 13 January 2015 11:23 PM, Pratyush Anand wrote:
> >> I will still try to find some way to capture enable_dbg macro path.H
> >
> > I did instrumented debug tap points at all the location from where
> > enable_debug macro is called(see attached debug patch). But, I do not
> > see that, execution reaches to any of those tap points between el0_dbg
> > and el1_dbg, and tap points debug log also confirms that el1_dbg is
> > raised before el0_dbg is returned.
>
> Probably we all missed this, ARMv8 specs is very clear about it. In
> section "D2.1 About debug exceptions" it says:
>
> Software Breakpoint Instruction exceptions cannot be masked. The PE
> takes Software Breakpoint Instruction exceptions regardless of both of
> the following:
> - The current Exception level.
> - The current Security state.
Ah, of course, I completely forgot you were using software breakpoints!
> So, reception of el1_dbg while executing el0_dbg seems perfectly normal
> to me. If you agree then I am back with the original query which I asked
> in the beginning of the
> thread,(http://permalink.gmane.org/gmane.linux.ports.arm.kernel/383672)
> ie how can instruction_pointer be wrong when second el1_dbg is called
> recursively(as follows).
>
> [1]-> el0_dbg (After executing BRK instruction by user)
> [2] -> el1_dbg (when uprobe break handler at [1] executes BRK instruction)
> (At the end of this ELR_EL1 is programmed with fffffdfffc000004)
> [3] -> el1_dbg (when kprobe break handler at [2] enables single stepping)
> (Here ELR_EL1 was found fffffe0000092470).So When this el1_dbg was
> received, then regs->pc values are not same what was programmed in
> ELR_EL1 at the return of [2].
Perhaps you're not removing the BRK instruction properly, and so you try to
single-step a trapping instruction and end up stepping into the exception?
Will
* Query: ARM64: Behavior of el1_dbg exception while executing el0_dbg
2015-01-16 16:22 ` Will Deacon
@ 2015-01-19 6:10 ` Pratyush Anand
2015-01-19 10:11 ` Will Deacon
0 siblings, 1 reply; 18+ messages in thread
From: Pratyush Anand @ 2015-01-19 6:10 UTC (permalink / raw)
To: linux-arm-kernel
On Friday 16 January 2015 09:52 PM, Will Deacon wrote:
> On Fri, Jan 16, 2015 at 12:00:09PM +0000, Pratyush Anand wrote:
>> On Thursday 15 January 2015 10:17 PM, Pratyush Anand wrote:
>>> On Tuesday 13 January 2015 11:23 PM, Pratyush Anand wrote:
>>>> I will still try to find some way to capture enable_dbg macro path.H
>>>
>>> I did instrumented debug tap points at all the location from where
>>> enable_debug macro is called(see attached debug patch). But, I do not
>>> see that, execution reaches to any of those tap points between el0_dbg
>>> and el1_dbg, and tap points debug log also confirms that el1_dbg is
>>> raised before el0_dbg is returned.
>>
>> Probably we all missed this, ARMv8 specs is very clear about it. In
>> section "D2.1 About debug exceptions" it says:
>>
>> Software Breakpoint Instruction exceptions cannot be masked. The PE
>> takes Software Breakpoint Instruction exceptions regardless of both of
>> the following:
>> - The current Exception level.
>> - The current Security state.
>
> Ah, of course, I completely forgot you were using software breakpoints!
>
>> So, reception of el1_dbg while executing el0_dbg seems perfectly normal
>> to me. If you agree then I am back with the original query which I asked
>> in the beginning of the
>> thread,(http://permalink.gmane.org/gmane.linux.ports.arm.kernel/383672)
>> ie how can instruction_pointer be wrong when second el1_dbg is called
>> recursively(as follows).
>>
>> [1]-> el0_dbg (After executing BRK instruction by user)
>> [2] -> el1_dbg (when uprobe break handler at [1] executes BRK instruction)
>> (At the end of this ELR_EL1 is programmed with fffffdfffc000004)
>> [3] -> el1_dbg (when kprobe break handler at [2] enables single stepping)
>> (Here ELR_EL1 was found fffffe0000092470).So When this el1_dbg was
>> received, then regs->pc values are not same what was programmed in
>> ELR_EL1 at the return of [2].
>
> Perhaps you're not removing the BRK instruction properly, and so you try to
> single-step a trapping instruction and end up stepping into the exception?
>
No, that is probably not the scenario. One thing I agree with: even though the
AArch64 spec says that the SW BRK exception cannot be masked, the current
kernel code is not ready to handle re-entrant software debug exceptions.
So, I will keep those parts of the uprobe code non-kprobable, and then it is
not so important to get into this from a code development perspective.
However, it would be good to understand what went wrong and caused
us to receive an el1_inv. I still fail to pinpoint the reason for the current
issue, and it is not single stepping a trapping instruction (BRK). Sorry,
but please have another look at the sequence of events:
1. The 1st instruction of uprobe_breakpoint_handler is:
ffffffc00059a628: a9bf7bfd stp x29, x30, [sp,#-16]!
which is replaced by BRK64_OPCODE_KPROBES = 0xD4200080 when the kprobe is
instrumented.
2. The user instruction at address 0x4005d0 is replaced by
BRK64_OPCODE_UPROBES = 0xD4200100 when the uprobe is instrumented.
3. When the application executes the instruction at 0x4005d0, we receive el0_dbg.
4. In the el0_dbg handler we execute kernel code at address
ffffffc00059a628, so el1_dbg is raised. (I agree here that el0_dbg has
not been closed properly, which the current entry.S code expects, so we will
need to fix it if we reach consensus on supporting re-entrant software debug
exceptions; however, the issue which I see seems unrelated, so...)
5. Now in el1_dbg, we run kprobe_breakpoint_handler, where we write
the saved instruction (i.e. a9bf7bfd stp x29, x30, [sp,#-16]!) to
the kmalloc allocated address fffffdfffc000004. The kprobe code does
flush_icache_range on this location. regs->pc is set to
fffffdfffc000004, so elr_el1 is programmed with fffffdfffc000004 during
kernel_exit. I have cross-checked the elr_el1 value just before eret is
executed in kernel_exit, and it is correct.
So, here we are trying to single step an STP instruction and not a BRK
instruction.
6. Here I am expecting a single step exception, but I receive an el1_inv
with ESR_EL1 (0x86000007), i.e. EC as "ESR_EL1_EC_IABT_EL1" and IFSC as
"Translation fault, third level". WHY????
As soon as enable_dbg is called in el1_inv, we receive the next single step
exception, with ELR_EL1 holding the address of the instruction following
enable_dbg in el1_inv.
Had we received a single step exception with the correct elr_el1 instead of
el1_inv, kprobe_single_step_handler would have executed properly and we would
have come back to address ffffffc00059a62c (2nd instruction of
uprobe_breakpoint_handler) after returning from this kprobe single step
handler. [Of course, a fix would be needed to correctly come back to this
address and then also to return to user space.]
~Pratyush
* Query: ARM64: Behavior of el1_dbg exception while executing el0_dbg
2015-01-19 6:10 ` Pratyush Anand
@ 2015-01-19 10:11 ` Will Deacon
0 siblings, 0 replies; 18+ messages in thread
From: Will Deacon @ 2015-01-19 10:11 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Jan 19, 2015 at 06:10:08AM +0000, Pratyush Anand wrote:
> On Friday 16 January 2015 09:52 PM, Will Deacon wrote:
> > Perhaps you're not removing the BRK instruction properly, and so you try to
> > single-step a trapping instruction and end up stepping into the exception?
>
> No, probably that is not the scenario. One thing I agree, that even if
> AARCH64 specs says that SW BRK exception can not be masked, current
> kernel code is not ready to handle re-entrant software debug exception.
> So, I will keep those part of uprobe code as non-kprobable, and then its
> not so important to get into it for code development perspective.
>
> However, it would be good to understand that what went wrong and caused
> to receive an el1_inval. I still fail to pin point the reason of current
> issue and its not single stepping a trapping instruction (BRK). Sorry,
> but please have a relook at the sequence of events:
I think my general point still stands (the issue is likely in step 5),
but ok.
> 1. 1st instruction of uprobe_breakpoint_handler is:
> ffffffc00059a628: a9bf7bfd stp x29, x30, [sp,#-16]!
> which is replaced by BRK64_OPCODE_KPROBES = 0xD4200080, when Kprobe is
> instrumented.
>
> 2. User instruction at address 0x4005d0 is replaced by
> BRK64_OPCODE_UPROBES = 0xD4200100, when uprobe is instrumented.
>
> 3. When application executes instruction at 0x4005d0,we receive el0_dbg.
>
> 4. In el0_dbg handler we execute kernel code at address
> ffffffc00059a628, so el1_dbg is raised. (I agree here that el0_dbg has
> not been closed properly, which current entry.S code expects, so we will
> need to fix it if we consensus to support re-entrant software debug
> exception, how ever the issue which I see seems unrelated, so...)
Up to here, we seem to be doing fine.
> 5. Now in el1_dbg, we handle kprobe_breakpoint_handler, where we write
> saved instruction (ie a9bf7bfd stp x29, x30, [sp,#-16]!) to
> the kmalloc allocated address fffffdfffc000004. kprobe code does
> flush_icache_range on this location. regs->pc is set to
> fffffdfffc000004, so elr_el1 is programmed with fffffdfffc000004 during
> kernel_exit. I have cross checked elr_el1 value just before eret is
> executed in kernel_exit, and it is correct.
This is the step I'm concerned about. Can you verify that:
  - Replacing the instruction with a nop does/doesn't change behaviour?
  - 0xfffffdfffc000004 is mapped at the point of exception return?
  - Using __flush_icache_all instead of flush_icache_range makes no
    difference?
> So, here we are trying to single step a STP instruction and not BRK
> instruction.
>
> 6. Here I am expecting a single step exception, but I receive a el1_inv
> with ESR_EL1(0x86000007) ie EC as "ESR_EL1_EC_IABT_EL1" and IFSC as
> "Translation fault, third level". WHY????
That likely means that 0xfffffdfffc000004 isn't mapped. Looking at the
kprobes code, shouldn't it be using the modules area so that it can
guarantee an executable mapping? If so, that should be below PAGE_OFFSET
which isn't true in your case afaict.
Will
end of thread, other threads:[~2015-01-19 10:11 UTC | newest]
Thread overview: 18+ messages
2015-01-08 13:15 Query: ARM64: Behavior of el1_dbg exception while executing el0_dbg Pratyush Anand
2015-01-08 15:49 ` William Cohen
2015-01-08 17:19 ` Pratyush Anand
2015-01-08 16:23 ` Will Deacon
2015-01-08 17:28 ` Pratyush Anand
2015-01-09 15:46 ` Will Deacon
2015-01-09 17:13 ` Pratyush Anand
2015-01-12 17:30 ` Will Deacon
2015-01-12 19:25 ` William Cohen
2015-01-13 6:46 ` Pratyush Anand
2015-01-13 15:52 ` Catalin Marinas
2015-01-13 17:53 ` Pratyush Anand
2015-01-15 16:47 ` Pratyush Anand
2015-01-16 12:00 ` Pratyush Anand
2015-01-16 14:55 ` Pratyush Anand
2015-01-16 16:22 ` Will Deacon
2015-01-19 6:10 ` Pratyush Anand
2015-01-19 10:11 ` Will Deacon