linux-arm-kernel.lists.infradead.org archive mirror
* Supporting KVM_GUESTDBG_BLOCKIRQ or something similar on ARM64
@ 2024-10-28 10:53 puranjay
  2024-10-28 11:23 ` Marc Zyngier
  0 siblings, 1 reply; 7+ messages in thread
From: puranjay @ 2024-10-28 10:53 UTC (permalink / raw)
  To: Arnd Bergmann, Marc Zyngier, Arnd Bergmann, Alex Bennée,
	kvmarm, linux-arm-kernel, Sumit Garg
  Cc: puranjay12



Hi Everyone,

I work on the BPF JIT for arm64 and regularly use Qemu with gdb for
debugging by single stepping parts of the code. I realized that whenever
I enable KVM, single stepping doesn't work as expected and it lands in an
interrupt handler.

It always worked for me on x86 so I looked in the source code and found
that x86 supports KVM_GUESTDBG_BLOCKIRQ that blocks IRQs when single
stepping.

I assume that arm64 doesn't support KVM_GUESTDBG_BLOCKIRQ because it is
not trivial to implement this on arm64 due to some architectural
limitations? There was a patch [1] posted in 2022 to solve this issue
but it was not merged.

Let's start a discussion about what needs to be done to support this on
arm64.

Thanks,
Puranjay

[1] https://lore.kernel.org/lkml/20221219102452.2860088-2-sumit.garg@linaro.org/



* Re: Supporting KVM_GUESTDBG_BLOCKIRQ or something similar on ARM64
  2024-10-28 10:53 Supporting KVM_GUESTDBG_BLOCKIRQ or something similar on ARM64 puranjay
@ 2024-10-28 11:23 ` Marc Zyngier
  2024-10-29  8:52   ` Ard Biesheuvel
  0 siblings, 1 reply; 7+ messages in thread
From: Marc Zyngier @ 2024-10-28 11:23 UTC (permalink / raw)
  To: puranjay
  Cc: Arnd Bergmann, Arnd Bergmann, Alex Bennée, kvmarm,
	linux-arm-kernel, Sumit Garg, puranjay12, Ard Biesheuvel,
	Oliver Upton, Suzuki K Poulose, Joey Gouly, Zenghui Yu

[+ArdB, which I assume you really wanted to Cc on this, as well as the
KVM/arm64 stakeholders]

On Mon, 28 Oct 2024 10:53:34 +0000,
puranjay@kernel.org wrote:
> 
> Hi Everyone,
> 
> I work on the BPF JIT for arm64 and regularly use Qemu with gdb for
> debugging by single stepping parts of the code. I realized that whenever
> I enable KVM, single stepping doesn't work as expected and it lands in an
> interrupt handler.

I disagree. Single-stepping works *exactly* as you should expect, by
not interfering with the rest of the system.

> It always worked for me on x86 so I looked in the source code and found
> that x86 supports KVM_GUESTDBG_BLOCKIRQ that blocks IRQs when single
> stepping.

Right, and that is not an architectural behaviour, but something that
helps the person running the debugger. I'm not saying it is not
useful, but that this is an *additional* behaviour that the
architecture is not supposed to cover.

Also, given that KVM_GUESTDBG_BLOCKIRQ has *zero* documentation,
nobody felt compelled to implement it. I didn't even know of its
existence until you mentioned it.

> I assume that arm64 doesn't support KVM_GUESTDBG_BLOCKIRQ because it is
> not trivial to implement this on arm64 due to some architectural
> limitations? There was a patch [1] posted in 2022 to solve this issue
> but it was not merged.

That patch does the wrong thing when it comes to KVM. We are not
building a Linux-only hypervisor, and we need a solution that works
irrespective of the guest.

> Let's start a discussion about what needs to be done to support this on
> arm64.

A good start would be to define the semantics of such a flag:

- what should it affect? the vcpu you are single-stepping? all vcpu?

- should userspace know that interrupts are pending?

- should this result in any effect on the guest's view of time?

- what of interactions on the rest of the system (such as devices)?

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.



* Re: Supporting KVM_GUESTDBG_BLOCKIRQ or something similar on ARM64
  2024-10-28 11:23 ` Marc Zyngier
@ 2024-10-29  8:52   ` Ard Biesheuvel
  2024-10-29  9:53     ` Marc Zyngier
  0 siblings, 1 reply; 7+ messages in thread
From: Ard Biesheuvel @ 2024-10-29  8:52 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: puranjay, Arnd Bergmann, Arnd Bergmann, Alex Bennée, kvmarm,
	linux-arm-kernel, Sumit Garg, puranjay12, Oliver Upton,
	Suzuki K Poulose, Joey Gouly, Zenghui Yu

On Mon, 28 Oct 2024 at 12:23, Marc Zyngier <maz@kernel.org> wrote:
>
> [+ArdB, which I assume you really wanted to Cc on this, as well as the
> KVM/arm64 stakeholders]
>
> On Mon, 28 Oct 2024 10:53:34 +0000,
> puranjay@kernel.org wrote:
> >
> > Hi Everyone,
> >
> > I work on the BPF JIT for arm64 and regularly use Qemu with gdb for
> > debugging by single stepping parts of the code. I realized that whenever
> > I enable KVM, single stepping doesn't work as expected and it lands in an
> > interrupt handler.
>
> I disagree. Single-stepping works *exactly* as you should expect, by
> not interfering with the rest of the system.
>
> > It always worked for me on x86 so I looked in the source code and found
> > that x86 supports KVM_GUESTDBG_BLOCKIRQ that blocks IRQs when single
> > stepping.
>
> Right, and that is not an architectural behaviour, but something that
> helps the person running the debugger. I'm not saying it is not
> useful, but that this is an *additional* behaviour that the
> architecture is not supposed to cover.
>
> Also, given that KVM_GUESTDBG_BLOCKIRQ has *zero* documentation,
> nobody felt compelled to implement it. I didn't even know of its
> existence until you mentioned it.
>
> > I assume that arm64 doesn't support KVM_GUESTDBG_BLOCKIRQ because it is
> > not trivial to implement this on arm64 due to some architectural
> > limitations? There was a patch [1] posted in 2022 to solve this issue
> > but it was not merged.
>
> That patch does the wrong thing when it comes to KVM. We are not
> building a Linux-only hypervisor, and we need a solution that works
> irrespective of the guest.
>
> > Let's start a discussion about what needs to be done to support this on
> > arm64.
>
> A good start would be to define the semantics of such a flag:
>
> - what should it affect? the vcpu you are single-stepping? all vcpu?
>
> - should userspace know that interrupts are pending?
>
> - should this result in any effect on the guest's view of time?
>
> - what of interactions on the rest of the system (such as devices)?
>

Sorry to give a handwavy answer here, but approaching this from a
usability PoV (like what Puranjay is doing), it is really about
adhering to the principle of least surprise for the user.

So in that sense, it is not really about blocking IRQs at all, as long
as we step over them rather than into them. How that is achieved is
not that relevant from the user PoV, and maybe KVM_GUESTDBG_BLOCKIRQ
is not the right solution for ARM at all.



* Re: Supporting KVM_GUESTDBG_BLOCKIRQ or something similar on ARM64
  2024-10-29  8:52   ` Ard Biesheuvel
@ 2024-10-29  9:53     ` Marc Zyngier
  2024-10-29 10:00       ` Ard Biesheuvel
  0 siblings, 1 reply; 7+ messages in thread
From: Marc Zyngier @ 2024-10-29  9:53 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: puranjay, Arnd Bergmann, Arnd Bergmann, Alex Bennée, kvmarm,
	linux-arm-kernel, Sumit Garg, puranjay12, Oliver Upton,
	Suzuki K Poulose, Joey Gouly, Zenghui Yu

On Tue, 29 Oct 2024 08:52:41 +0000,
Ard Biesheuvel <ardb@kernel.org> wrote:
> 
> On Mon, 28 Oct 2024 at 12:23, Marc Zyngier <maz@kernel.org> wrote:
> >
> > > Let's start a discussion about what needs to be done to support this on
> > > arm64.
> >
> > A good start would be to define the semantics of such a flag:
> >
> > - what should it affect? the vcpu you are single-stepping? all vcpu?
> >
> > - should userspace know that interrupts are pending?
> >
> > - should this result in any effect on the guest's view of time?
> >
> > - what of interactions on the rest of the system (such as devices)?
> >
> 
> Sorry to give a handwavy answer here, but approaching this from a
> usability PoV (like what Puranjay is doing), it is really about
> adhering to the principle of least surprise for the user.
> 
> So in that sense, it is not really about blocking IRQs at all, as long
> as we step over them rather than into them. How that is achieved is
> not that relevant from the user PoV, and maybe KVM_GUESTDBG_BLOCKIRQ
> is not the right solution for ARM at all.

I definitely sympathise with the goal, but there is no simple way to
let interrupts through while stepping (which is what your "step over"
implies):

- the hypervisor (in general) doesn't interact with the guest delivery
  and handling of interrupts -- this is either very opaque (list
  registers) or completely invisible (direct injection)

- replacing the step with a breakpoint after the stepped instruction
  requires us to decode the guest instructions to handle branching
  effects

One possible mechanism would be to:

- while stepping, add breakpoints to the interrupt vectors for the EL
  we are stepping (3 breakpoints for any of the 4 possible exception
  groups),

- when any interrupt breakpoint hits, clear all 3, place a breakpoint
  on the instruction that was about to be single-stepped (pointed to
  by SPSR)

- run to completion, until the breakpoint hits

- disable the breakpoint, reinstall the previous 3 interrupt
  breakpoints

- single-step, rinse, repeat
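
A rough sketch of the loop above, written as a debugger-side state machine against a toy target. Everything here is an assumption made for illustration: ToyTarget, step_hiding_interrupts and async_vector_addresses are hypothetical names, not QEMU, GDB or KVM interfaces, and the interrupt model is deliberately simplistic. The only architectural facts relied on are the AArch64 vector table layout (four groups of four 0x80-byte entries, the last three being IRQ, FIQ and SError) and that the return address of the interrupted instruction is saved in ELR_ELx on exception entry.

```python
# Entry offsets within one vector group, and the four groups of the table.
ENTRY_OFFSETS = {"sync": 0x000, "irq": 0x080, "fiq": 0x100, "serror": 0x180}
GROUP_OFFSETS = {
    "current_el_sp0": 0x000,
    "current_el_spx": 0x200,
    "lower_el_aarch64": 0x400,
    "lower_el_aarch32": 0x600,
}


def async_vector_addresses(vbar, group):
    """The 3 breakpoint addresses (IRQ, FIQ, SError) for one vector group."""
    base = vbar + GROUP_OFFSETS[group]
    return {base + ENTRY_OFFSETS[k] for k in ("irq", "fiq", "serror")}


class ToyTarget:
    """Hypothetical stand-in for a vCPU: straight-line code plus one IRQ handler."""

    def __init__(self, vbar, irq_after, group="current_el_spx", handler_len=3):
        self.vbar = vbar
        self.pc = 0x1000
        self.elr = None
        # An IRQ fires once N thread instructions have executed, for each N here.
        self.irq_after = set(irq_after)
        self.irq_vector = vbar + GROUP_OFFSETS[group] + ENTRY_OFFSETS["irq"]
        # The handler is handler_len instructions long; the last one acts as ERET.
        self.handler_end = self.irq_vector + 4 * (handler_len - 1)
        self.trace = []  # thread instructions that actually executed

    def step(self):
        if self.irq_vector <= self.pc <= self.handler_end:
            # Inside the handler: advance, or return to the interrupted code.
            self.pc = self.elr if self.pc == self.handler_end else self.pc + 4
        elif len(self.trace) in self.irq_after:
            # Pending IRQ: it is taken instead of the stepped instruction.
            self.irq_after.discard(len(self.trace))
            self.elr, self.pc = self.pc, self.irq_vector
        else:
            self.trace.append(self.pc)
            self.pc += 4

    def run_until(self, breakpoints, limit=1000):
        for _ in range(limit):
            self.step()
            if self.pc in breakpoints:
                return
        raise RuntimeError("breakpoint never hit")


def step_hiding_interrupts(target, group):
    """One user-visible single-step that never stops inside an interrupt handler."""
    vector_bps = async_vector_addresses(target.vbar, group)
    while True:
        target.step()  # vector breakpoints armed
        if target.pc not in vector_bps:
            return target.pc  # the stepped instruction completed
        # A vector breakpoint hit: drop the vector breakpoints, place a one-shot
        # breakpoint on the interrupted instruction, run the handler to completion.
        target.run_until({target.elr})
        # Loop: re-arm the vector breakpoints and retry the step.


target = ToyTarget(vbar=0x8000, irq_after={1, 3})
stops = [step_hiding_interrupts(target, "current_el_spx") for _ in range(4)]
print([hex(pc) for pc in stops])  # ['0x1004', '0x1008', '0x100c', '0x1010']
```

In this model the user sees four consecutive stops in the stepped code even though two interrupts were taken along the way. A real implementation would also have to deal with nested exceptions, hardware breakpoint limits, and a guest that changes its vector base.
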

But then I'm asking myself the question: why is this KVM's job? It
seems to me that this is what an external debugger would do when
interacting with HW on bare metal.

So can we implement this as part of the debugger's state machine?

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.



* Re: Supporting KVM_GUESTDBG_BLOCKIRQ or something similar on ARM64
  2024-10-29  9:53     ` Marc Zyngier
@ 2024-10-29 10:00       ` Ard Biesheuvel
  2024-10-29 13:57         ` Mark Rutland
  0 siblings, 1 reply; 7+ messages in thread
From: Ard Biesheuvel @ 2024-10-29 10:00 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: puranjay, Arnd Bergmann, Arnd Bergmann, Alex Bennée, kvmarm,
	linux-arm-kernel, Sumit Garg, puranjay12, Oliver Upton,
	Suzuki K Poulose, Joey Gouly, Zenghui Yu

On Tue, 29 Oct 2024 at 10:53, Marc Zyngier <maz@kernel.org> wrote:
>
> On Tue, 29 Oct 2024 08:52:41 +0000,
> Ard Biesheuvel <ardb@kernel.org> wrote:
> >
> > On Mon, 28 Oct 2024 at 12:23, Marc Zyngier <maz@kernel.org> wrote:
> > >
> > > > Let's start a discussion about what needs to be done to support this on
> > > > arm64.
> > >
> > > A good start would be to define the semantics of such a flag:
> > >
> > > - what should it affect? the vcpu you are single-stepping? all vcpu?
> > >
> > > - should userspace know that interrupts are pending?
> > >
> > > - should this result in any effect on the guest's view of time?
> > >
> > > - what of interactions on the rest of the system (such as devices)?
> > >
> >
> > Sorry to give a handwavy answer here, but approaching this from a
> > usability PoV (like what Puranjay is doing), it is really about
> > adhering to the principle of least surprise for the user.
> >
> > So in that sense, it is not really about blocking IRQs at all, as long
> > as we step over them rather than into them. How that is achieved is
> > not that relevant from the user PoV, and maybe KVM_GUESTDBG_BLOCKIRQ
> > is not the right solution for ARM at all.
>
> I definitely sympathise with the goal, but there is no simple way to
> let interrupts through while stepping (which is what your "step over"
> implies):
>
> - the hypervisor (in general) doesn't interact with the guest delivery
>   and handling of interrupts -- this is either very opaque (list
>   registers) or completely invisible (direct injection)
>
> - replacing the step with a breakpoint after the stepped instruction
>   requires us to decode the guest instructions to handle branching
>   effects
>

Yeah, and we still want to take non-IRQ/FIQ exceptions, so this does
not seem feasible to me.

> One possible mechanism would be to:
>
> - while stepping, add breakpoints to the interrupt vectors for the EL
>   we are stepping (3 breakpoints for any of the 4 possible exception
>   groups),
>
> - when any interrupt breakpoint hits, clear all 3, place a breakpoint
>   on the instruction that was about to be single-stepped (pointed to
>   by SPSR)
>
> - run to completion, until the breakpoint hits
>
> - disable the breakpoint, reinstall the previous 3 interrupt
>   breakpoints
>
> - single-step, rinse, repeat
>
> But then I'm asking myself the question: why is this KVM's job? It
> seems to me that this is what an external debugger would do when
> interacting with HW on bare metal.
>
> So can we implement this as part of the debugger's state machine?
>

Which debugger is that? The GDB stub in QEMU?

Setting a one-shot breakpoint on the address in SPSR when taking an
IRQ exception seems like a reasonable approach to me.



* Re: Supporting KVM_GUESTDBG_BLOCKIRQ or something similar on ARM64
  2024-10-29 10:00       ` Ard Biesheuvel
@ 2024-10-29 13:57         ` Mark Rutland
  2024-10-29 15:36           ` Marc Zyngier
  0 siblings, 1 reply; 7+ messages in thread
From: Mark Rutland @ 2024-10-29 13:57 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Marc Zyngier, puranjay, Arnd Bergmann, Arnd Bergmann,
	Alex Bennée, kvmarm, linux-arm-kernel, Sumit Garg,
	puranjay12, Oliver Upton, Suzuki K Poulose, Joey Gouly,
	Zenghui Yu

On Tue, Oct 29, 2024 at 11:00:24AM +0100, Ard Biesheuvel wrote:
> On Tue, 29 Oct 2024 at 10:53, Marc Zyngier <maz@kernel.org> wrote:
> >
> > On Tue, 29 Oct 2024 08:52:41 +0000,
> > Ard Biesheuvel <ardb@kernel.org> wrote:
> > >
> > > On Mon, 28 Oct 2024 at 12:23, Marc Zyngier <maz@kernel.org> wrote:
> > > >
> > > > > Let's start a discussion about what needs to be done to support this on
> > > > > arm64.
> > > >
> > > > A good start would be to define the semantics of such a flag:
> > > >
> > > > - what should it affect? the vcpu you are single-stepping? all vcpu?
> > > >
> > > > - should userspace know that interrupts are pending?
> > > >
> > > > - should this result in any effect on the guest's view of time?
> > > >
> > > > - what of interactions on the rest of the system (such as devices)?
> > > >
> > >
> > > Sorry to give a handwavy answer here, but approaching this from a
> > > usability PoV (like what Puranjay is doing), it is really about
> > > adhering to the principle of least surprise for the user.
> > >
> > > So in that sense, it is not really about blocking IRQs at all, as long
> > > as we step over them rather than into them. How that is achieved is
> > > not that relevant from the user PoV, and maybe KVM_GUESTDBG_BLOCKIRQ
> > > is not the right solution for ARM at all.
> >
> > I definitely sympathise with the goal, but there is no simple way to
> > let interrupts through while stepping (which is what your "step over"
> > implies):
> >
> > - the hypervisor (in general) doesn't interact with the guest delivery
> >   and handling of interrupts -- this is either very opaque (list
> >   registers) or completely invisible (direct injection)
> >
> > - replacing the step with a breakpoint after the stepped instruction
> >   requires us to decode the guest instructions to handle branching
> >   effects
> >
> 
> Yeah, and we still want to take non-IRQ/FIQ exceptions, so this does
> not seem feasible to me.
> 
> > One possible mechanism would be to:
> >
> > - while stepping, add breakpoints to the interrupt vectors for the EL
> >   we are stepping (3 breakpoints for any of the 4 possible exception
> >   groups),
> >
> > - when any interrupt breakpoint hits, clear all 3, place a breakpoint
> >   on the instruction that was about to be single-stepped (pointed to
> >   by SPSR)
> >
> > - run to completion, until the breakpoint hits
> >
> > - disable the breakpoint, reinstall the previous 3 interrupt
> >   breakpoints
> >
> > - single-step, rinse, repeat
> >
> > But then I'm asking myself the question: why is this KVM's job? It
> > seems to me that this is what an external debugger would do when
> > interacting with HW on bare metal.
> >
> > So can we implement this as part of the debugger's state machine?
> >
> 
> Which debugger is that? The GDB stub in QEMU?
> 
> Setting a one-shot breakpoint on the address in SPSR when taking an
> IRQ exception seems like a reasonable approach to me.

That doesn't work; an IRQ could be taken in the middle of a common
helper that's also used in IRQ context, so you'd take the breakpoint
within the IRQ. You could try to match a bunch of things like the SP and
so on, but that boils down to a bunch of heuristics rather than something
that's guaranteed to work...
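
To make the failure mode concrete, here is a small model of it. The addresses, stack pointer values and the Stop/first_hit names are all made up for illustration; the list simply records what a CPU might execute after an IRQ preempts a thread in the middle of a shared helper.

```python
from typing import NamedTuple


class Stop(NamedTuple):
    pc: int
    sp: int
    in_irq: bool


HELPER = 0x2000  # hypothetical address inside a helper used by thread and IRQ code

# Execution after an IRQ is taken while the thread sits at HELPER.
after_irq = [
    Stop(pc=0x8280, sp=0x7000, in_irq=True),   # IRQ vector entry
    Stop(pc=0x3000, sp=0x6FE0, in_irq=True),   # handler body
    Stop(pc=HELPER, sp=0x6FC0, in_irq=True),   # handler calls the shared helper
    Stop(pc=0x3004, sp=0x6FE0, in_irq=True),   # back in the handler body
    Stop(pc=HELPER, sp=0x7000, in_irq=False),  # after ERET: the interrupted thread
]


def first_hit(trace, bp_pc, bp_sp=None):
    """First point where a breakpoint on bp_pc (optionally also matching SP) fires."""
    for stop in trace:
        if stop.pc == bp_pc and (bp_sp is None or stop.sp == bp_sp):
            return stop
    return None


print(first_hit(after_irq, HELPER).in_irq)                # True: stopped inside the IRQ
print(first_hit(after_irq, HELPER, bp_sp=0x7000).in_irq)  # False: stopped in the thread
```

The SP comparison happens to separate the two hits in this model, but nothing guarantees that in general, which is the heuristic nature described above.
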

More generally, the IRQ can preempt the running thread anyway, so:

* The user cannot use this to trace a kernel thread reliably, since that
  can be switched out behind their back.

* The user cannot use this to trace a CPU regardless of the running
  thread, since they lose anything that happens under an IRQ.

Mark.



* Re: Supporting KVM_GUESTDBG_BLOCKIRQ or something similar on ARM64
  2024-10-29 13:57         ` Mark Rutland
@ 2024-10-29 15:36           ` Marc Zyngier
  0 siblings, 0 replies; 7+ messages in thread
From: Marc Zyngier @ 2024-10-29 15:36 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Ard Biesheuvel, puranjay, Arnd Bergmann, Arnd Bergmann,
	Alex Bennée, kvmarm, linux-arm-kernel, Sumit Garg,
	puranjay12, Oliver Upton, Suzuki K Poulose, Joey Gouly,
	Zenghui Yu

On Tue, 29 Oct 2024 13:57:53 +0000,
Mark Rutland <mark.rutland@arm.com> wrote:
> 
> On Tue, Oct 29, 2024 at 11:00:24AM +0100, Ard Biesheuvel wrote:
> > On Tue, 29 Oct 2024 at 10:53, Marc Zyngier <maz@kernel.org> wrote:
> > >
> > > On Tue, 29 Oct 2024 08:52:41 +0000,
> > > Ard Biesheuvel <ardb@kernel.org> wrote:
> > > >
> > > > On Mon, 28 Oct 2024 at 12:23, Marc Zyngier <maz@kernel.org> wrote:
> > > > >
> > > > > > Let's start a discussion about what needs to be done to support this on
> > > > > > arm64.
> > > > >
> > > > > A good start would be to define the semantics of such a flag:
> > > > >
> > > > > - what should it affect? the vcpu you are single-stepping? all vcpu?
> > > > >
> > > > > - should userspace know that interrupts are pending?
> > > > >
> > > > > - should this result in any effect on the guest's view of time?
> > > > >
> > > > > - what of interactions on the rest of the system (such as devices)?
> > > > >
> > > >
> > > > Sorry to give a handwavy answer here, but approaching this from a
> > > > usability PoV (like what Puranjay is doing), it is really about
> > > > adhering to the principle of least surprise for the user.
> > > >
> > > > So in that sense, it is not really about blocking IRQs at all, as long
> > > > as we step over them rather than into them. How that is achieved is
> > > > not that relevant from the user PoV, and maybe KVM_GUESTDBG_BLOCKIRQ
> > > > is not the right solution for ARM at all.
> > >
> > > I definitely sympathise with the goal, but there is no simple way to
> > > let interrupts through while stepping (which is what your "step over"
> > > implies):
> > >
> > > - the hypervisor (in general) doesn't interact with the guest delivery
> > >   and handling of interrupts -- this is either very opaque (list
> > >   registers) or completely invisible (direct injection)
> > >
> > > - replacing the step with a breakpoint after the stepped instruction
> > >   requires us to decode the guest instructions to handle branching
> > >   effects
> > >
> > 
> > Yeah, and we still want to take non-IRQ/FIQ exceptions, so this does
> > not seem feasible to me.
> > 
> > > One possible mechanism would be to:
> > >
> > > - while stepping, add breakpoints to the interrupt vectors for the EL
> > >   we are stepping (3 breakpoints for any of the 4 possible exception
> > >   groups),
> > >
> > > - when any interrupt breakpoint hits, clear all 3, place a breakpoint
> > >   on the instruction that was about to be single-stepped (pointed to
> > >   by SPSR)
> > >
> > > - run to completion, until the breakpoint hits
> > >
> > > - disable the breakpoint, reinstall the previous 3 interrupt
> > >   breakpoints
> > >
> > > - single-step, rinse, repeat
> > >
> > > But then I'm asking myself the question: why is this KVM's job? It
> > > seems to me that this is what an external debugger would do when
> > > interacting with HW on bare metal.
> > >
> > > So can we implement this as part of the debugger's state machine?
> > >
> > 
> > Which debugger is that? The GDB stub in QEMU?
> > 
> > Setting a one-shot breakpoint on the address in SPSR when taking an
> > IRQ exception seems like a reasonable approach to me.
> 
> That doesn't work; an IRQ could be taken in the middle of a common
> helper that's also used in IRQ context, so you'd take the breakpoint
> within the IRQ. You could try to match a bunch of things like the SP and
> so on, but that boils down to a bunch of heuristics rather than something
> that's guaranteed to work...

Hmmm. Yes, that's a pretty pathological case.

> 
> More generally, the IRQ can preempt the running thread anyway, so:
> 
> * The user cannot use this to trace a kernel thread reliably, since that
>   can be switched out behind their back.
> 
> * The user cannot use this to trace a CPU regardless of the running
>   thread, since they lose anything that happens under an IRQ.

These are understood limitations, I expect, and would be a part of the
contract between the debugger and the user when deciding to hide
asynchronous exceptions.

But if those limitations are not deemed acceptable (or not easily
implementable by the debugger), then the only option we have is to
block IRQ/FIQ the hard way. In a way, this is what the architecture
provides when entering Debug state, where all asynchronous exceptions
are ignored (H2.4.1).

And that brings me back to my earlier set of questions...

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

