* [Patch 0/1] HW-BKPT: Allow per-cpu kernel-space Hardware Breakpoint requests
@ 2009-08-17 12:46 K.Prasad
2009-08-19 16:11 ` K.Prasad
0 siblings, 1 reply; 15+ messages in thread
From: K.Prasad @ 2009-08-17 12:46 UTC (permalink / raw)
To: LKML
Cc: Frederic Weisbecker, Ingo Molnar, Peter Zijlstra, Lai Jiangshan,
Steven Rostedt, Mathieu Desnoyers, Alan Stern
Hi All,
Please find a patch that enables kernel-space breakpoints to be
requested for a subset of the available CPUs in the system. This allows
per-CPU breakpoints and comes with the associated benefit of reduced
overhead during (un)registration.
This enhancement allows 'perf', which produces CPU-wise information, to
exploit the hardware breakpoint registers.
Design changes
--------------
- Every breakpoint request 'consumes' the first available debug register
(starting from HBP_NUM) in each CPU represented by 'cpumask' field in
'struct hw_breakpoint'.
- 'hbp_kernel_pos' (that separates kernel-space breakpoints from the
free/user-space breakpoints) now points to the maximum of debug
registers consumed on any given CPU.
-- 'hbp_kernel_pos' is decremented (one-at-a-time) to allow a new slot
for kernel-space requests iff all debug registers on the given CPU
(from HBP_NUM - 1 to 'hbp_kernel_pos') are already consumed.
-- 'hbp_kernel_pos' is incremented (one-at-a-time) to free a slot iff
a removal request results in the release of a bkpt request that
consumed maximum debug registers for kernel-space.
- Every removal request results in compaction of breakpoint registers
(on a per-cpu basis) to occupy the vacant debug register.
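
The design changes listed above can be sketched as a small user-space model.
This is only an illustration written in Python, not the patch itself: the
class and method names are made up for the example, HBP_NUM is assumed to be
4 (as on x86), user-space breakpoints are ignored, and 'hbp_kernel_pos' is
recomputed after every request instead of being moved one step at a time
(the result is the same, since a request consumes at most one register per
CPU).

```python
HBP_NUM = 4  # assumption: four debug address registers, as on x86


class KernelBreakpointSlots:
    """Toy model of per-CPU kernel breakpoint slots (not kernel code)."""

    def __init__(self, nr_cpus):
        # Per CPU: kernel requests, listed from the highest register downwards.
        self.per_cpu = [[] for _ in range(nr_cpus)]
        # Boundary index: registers at or above it are used by the kernel.
        self.hbp_kernel_pos = HBP_NUM

    def _update_kernel_pos(self):
        # The boundary follows the CPU that consumes the most registers.
        most_used = max(len(slots) for slots in self.per_cpu)
        self.hbp_kernel_pos = HBP_NUM - most_used

    def register(self, name, cpumask):
        # Refuse the request if any CPU in the mask has no register left.
        if any(len(self.per_cpu[cpu]) >= HBP_NUM for cpu in cpumask):
            raise RuntimeError("no free debug register")
        for cpu in cpumask:
            self.per_cpu[cpu].append(name)  # first available register
        self._update_kernel_pos()

    def unregister(self, name):
        for slots in self.per_cpu:
            if name in slots:
                slots.remove(name)  # later entries move up: compaction
        self._update_kernel_pos()

    def register_index(self, name, cpu):
        """Debug register number that holds `name` on `cpu`."""
        return HBP_NUM - 1 - self.per_cpu[cpu].index(name)
```

In this model a request for CPUs {0, 1} followed by a request for CPU 0 only
leaves 'hbp_kernel_pos' at 2; removing the first request compacts CPU 0 so
that the remaining breakpoint moves to register 3 and the boundary returns
to 3.
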
The patch is based on commit b6c720b811aed0eeda89f277f13c1bd1bdf721fd of
-tip tree and has been tested to work fine on an x86 machine for both
cases (i.e. system-wide kernel breakpoints and bkpts for a subset of CPUs).
Please let me know your comments on the same.
Thanks,
K.Prasad

* Re: [Patch 0/1] HW-BKPT: Allow per-cpu kernel-space Hardware Breakpoint requests
From: K.Prasad @ 2009-08-19 16:11 UTC
To: LKML, Frederic Weisbecker
Cc: Ingo Molnar, Peter Zijlstra, Lai Jiangshan,
Steven Rostedt, Mathieu Desnoyers, Alan Stern

On Mon, Aug 17, 2009 at 06:16:41PM +0530, K.Prasad wrote:
> Hi All,
> Please find a patch that enables kernel-space breakpoints to be
> requested for a subset of the available CPUs in the system. This allows
> per-CPU breakpoints and comes with the associated benefit of reduced
> overhead during (un)registration.

Hi Frederic,
Do you find these patches, that provide the ability to restrict
kernel-space breakpoints to any given subset of CPUs, to bring the
requisite features for exploitation of hw-bkpt by 'perf tools'?

Also of interest would be the reduced overhead associated with
(un)register_kernel_hw_breakpoint() operations (no IPI in case of
single-CPU breakpoint request).

Thanks,
K.Prasad

* Re: [Patch 0/1] HW-BKPT: Allow per-cpu kernel-space Hardware Breakpoint requests
From: Frederic Weisbecker @ 2009-08-19 17:33 UTC
To: K.Prasad, Peter Zijlstra, Ingo Molnar
Cc: LKML, Lai Jiangshan, Steven Rostedt, Mathieu Desnoyers, Alan Stern

On Wed, Aug 19, 2009 at 09:41:19PM +0530, K.Prasad wrote:
> Hi Frederic,
> Do you find these patches, that provide the ability to restrict
> kernel-space breakpoints to any given subset of CPUs, to bring the
> requisite features for exploitation of hw-bkpt by 'perf tools'?
>
> Also of interest would be the reduced overhead associated with
> (un)register_kernel_hw_breakpoint() operations (no IPI in case of
> single-CPU breakpoint request).
>
> Thanks,
> K.Prasad

Nice.
Yeah I just reviewed the patch and it looks good.

Now I guess we should meet two other requirements for a pmu
through this high level API:

- only update the hardware registers when needed: while switching
  to another thread of the same group, the hardware register switching
  is wasteful.
  BTW, I wonder if we need a flag while creating a user bp that tells whether
  the bp is inherited through fork/clone calls.

- having a callback that quickly swaps two breakpoints in order to support
  the hardware register multiplexing. I guess the pmu object would just need
  to call it when the multiplexing is decided.

Providing those would let us build a pmu struct on top of this high level API,
hopefully.

All that would be a benefit on both sides. It avoids us building a low level PMU
that reinvents the wheel, i.e. the hardware breakpoints API handles a lot of things
on both the arch and core sides (debug register setting tricks with dr7 and co,
cpu hotplug, kexec, etc...).
In the bp API it brings more power (register switching only if needed, per cpu
support, clone inheritance support, etc...)

And in the end we have a pmu (which unifies the control of this profiling
unit through a well established and known object for perfcounter) controlled by
a high level API that could also benefit other debugging subsystems.

What do you think?
It would also be nice to have Peter's and Ingo's opinions about it, to be sure
we are not going in the wrong direction.

Thanks.

* Re: [Patch 0/1] HW-BKPT: Allow per-cpu kernel-space Hardware Breakpoint requests
From: K.Prasad @ 2009-08-20 17:27 UTC
To: Frederic Weisbecker, Ingo Molnar, Peter Zijlstra
Cc: LKML, Lai Jiangshan, Steven Rostedt, Mathieu Desnoyers, Alan Stern

On Wed, Aug 19, 2009 at 07:33:00PM +0200, Frederic Weisbecker wrote:
> Nice.
> Yeah I just reviewed the patch and it looks good.
>
> Now I guess we should meet two other requirements for a pmu
> through this high level API:
>
> - only update the hardware registers when needed: while switching
>   to another thread of the same group, the hardware register switching
>   is wasteful.
>   BTW, I wonder if we need a flag while creating a user bp that tells whether
>   the bp is inherited through fork/clone calls.
>

So this means avoiding a re-write of addresses into debug registers when
they don't change. It is indeed desirable and would help if the same
breakpoint is used across, say, many/all threads of a process. However
I'd believe that the time taken for this is minuscule compared to the
overhead involved during context switch. Perhaps consider this requirement
at a later time?

> - having a callback that quickly swaps two breakpoints in order to support
>   the hardware register multiplexing. I guess the pmu object would just need
>   to call it when the multiplexing is decided.
>

Are you suggesting something like a modify_kernel_hw_breakpoint() that
can quickly change a breakpoint address/characteristics? That's quite
doable...it requires a quick validation through
arch_validate_hwbkpt_settings() and the requisite IPIs (depending on what
the new cpumask is). I will send a patch to that effect soon.

> Providing those would let us build a pmu struct on top of this high level API,
> hopefully.
>
> All that would be a benefit on both sides. It avoids us building a low level PMU
> that reinvents the wheel, i.e. the hardware breakpoints API handles a lot of things
> on both the arch and core sides (debug register setting tricks with dr7 and co,
> cpu hotplug, kexec, etc...).
> In the bp API it brings more power (register switching only if needed, per cpu
> support, clone inheritance support, etc...)
>
> And in the end we have a pmu (which unifies the control of this profiling
> unit through a well established and known object for perfcounter) controlled by
> a high level API that could also benefit other debugging subsystems.
>
> What do you think?
> It would also be nice to have Peter's and Ingo's opinions about it, to be sure
> we are not going in the wrong direction.
>

Indeed, it will be nice to know from Ingo and Peter that we are heading
right.

Thanks,
K.Prasad
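
The first requirement discussed in this message, skipping debug register
writes whose value would not change across a context switch, can be
illustrated with a small model. This is a hypothetical Python sketch, not
code from the patch: the function name and the list-of-addresses
representation of the registers are assumptions made for the example.

```python
def switch_debug_registers(hw_regs, next_thread_bps):
    """Load the next thread's breakpoint addresses into hw_regs.

    Returns the number of register writes that were actually needed.
    """
    writes = 0
    for i, addr in enumerate(next_thread_bps):
        if hw_regs[i] != addr:  # only touch registers whose value changes
            hw_regs[i] = addr
            writes += 1
    return writes
```

Switching between two threads that carry identical breakpoint addresses
costs zero writes in this model, while switching to a thread that differs
in one address costs a single write.
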

* Re: [Patch 0/1] HW-BKPT: Allow per-cpu kernel-space Hardware Breakpoint requests
From: Ingo Molnar @ 2009-08-21 14:28 UTC
To: K.Prasad
Cc: Frederic Weisbecker, Peter Zijlstra, LKML, Lai Jiangshan,
Steven Rostedt, Mathieu Desnoyers, Alan Stern

* K.Prasad <prasad@linux.vnet.ibm.com> wrote:

> > Providing those would let us build a pmu struct on top of this
> > high level API, hopefully.

Note that there's a PMU struct already in
arch/x86/kernel/cpu/perf_counter.c. Could debug-register ops be
tacked on to it?

> > All that would be a benefit on both sides. It avoids us building
> > a low level PMU that reinvents the wheel, i.e. the hardware
> > breakpoints API handles a lot of things on both the arch and core
> > sides (debug register setting tricks with dr7 and co, cpu
> > hotplug, kexec, etc...). In the bp API it brings more power
> > (register switching only if needed, per cpu support, clone
> > inheritance support, etc...)
> >
> > And in the end we have a pmu (which unifies the control of this
> > profiling unit through a well established and known object for
> > perfcounter) controlled by a high level API that could also
> > benefit other debugging subsystems.
> >
> > What do you think? It would also be nice to have Peter's and
> > Ingo's opinions about it, to be sure we are not going in the wrong
> > direction.
>
> Indeed, it will be nice to know from Ingo and Peter that we are
> heading right.

If you do this proper perfcounters integration then i'm certainly
happy.

Ingo

* Re: [Patch 0/1] HW-BKPT: Allow per-cpu kernel-space Hardware Breakpoint requests
From: Frederic Weisbecker @ 2009-08-26 3:36 UTC
To: Ingo Molnar
Cc: K.Prasad, Peter Zijlstra, LKML, Lai Jiangshan,
Steven Rostedt, Mathieu Desnoyers, Alan Stern

On Fri, Aug 21, 2009 at 04:28:11PM +0200, Ingo Molnar wrote:
>
> * K.Prasad <prasad@linux.vnet.ibm.com> wrote:
>
> > > Providing those would let us build a pmu struct on top of this
> > > high level API, hopefully.
>
> Note that there's a PMU struct already in
> arch/x86/kernel/cpu/perf_counter.c. Could debug-register ops be
> tacked on to it?

No, we don't need to build an arch level pmu since the BP api
already handles the arch abstraction (or well, it is planned to).

Instead, what we need is a core pmu that relies on the BP api.
Such pmu will be allocated dynamically while creating a hardware
breakpoint counter.

* Re: [Patch 0/1] HW-BKPT: Allow per-cpu kernel-space Hardware Breakpoint requests
From: Ingo Molnar @ 2009-08-26 9:16 UTC
To: Frederic Weisbecker
Cc: K.Prasad, Peter Zijlstra, LKML, Lai Jiangshan,
Steven Rostedt, Mathieu Desnoyers, Alan Stern

* Frederic Weisbecker <fweisbec@gmail.com> wrote:

> On Fri, Aug 21, 2009 at 04:28:11PM +0200, Ingo Molnar wrote:
> > Note that there's a PMU struct already in
> > arch/x86/kernel/cpu/perf_counter.c. Could debug-register ops be
> > tacked on to it?
>
> No, we don't need to build an arch level pmu since the BP api
> already handles the arch abstraction (or well, it is planned to).
>
> Instead, what we need is a core pmu that relies on the BP api.
> Such pmu will be allocated dynamically while creating a hardware
> breakpoint counter.

i'm not convinced at all we need all that layering of
perfcounters->pmu->BP. Why not add BP support to the PMU abstraction
and be done with it?

That way we get hardware breakpoints via 'pinned, exclusive, per cpu
hw-breakpoint counters' for example and kernel/hw-breakpoint.c can
go away altogether.

kernel/perf_counter.c already handles scheduling, conflict
resolution, enumeration, syscall exposure and more.

Hm?

Ingo

* Re: [Patch 0/1] HW-BKPT: Allow per-cpu kernel-space Hardware Breakpoint requests
From: Frederic Weisbecker @ 2009-08-26 11:49 UTC
To: Ingo Molnar
Cc: K.Prasad, Peter Zijlstra, LKML, Lai Jiangshan,
Steven Rostedt, Mathieu Desnoyers, Alan Stern

On Wed, Aug 26, 2009 at 11:16:42AM +0200, Ingo Molnar wrote:
> i'm not convinced at all we need all that layering of
> perfcounters->pmu->BP. Why not add BP support to the PMU abstraction
> and be done with it?
>
> That way we get hardware breakpoints via 'pinned, exclusive, per cpu
> hw-breakpoint counters' for example and kernel/hw-breakpoint.c can
> go away altogether.
>
> kernel/perf_counter.c already handles scheduling, conflict
> resolution, enumeration, syscall exposure and more.
>
> Hm?

What you are suggesting is a complete refactoring of the breakpoint API
on top of pmus.

Well, that's possible and would factorize the scheduling, conflict and so
on. So that's theoretically a good point and I hope we'll come to such
centralization, which resembles my suggestion to Peter to share the
perfcounter layer that handles the scheduling of hardware registers.

But the pmu handling is currently not ready for that. For now it's
completely tied to perfcounter. The pmu handling must become completely
standalone wrt perfcounter, because hardware breakpoints shouldn't depend
on perfcounter. They couldn't anyway, because they are not only wanted for
perfcounter but are currently used by ptrace (and perhaps some other
users), and I can't imagine we need to open a perfcounter to use ptrace
facilities.

That said, I really agree with the concept. We could then drop the
scheduling bindings for hardware breakpoints and use a centralized thing
for that, which would be the pmu.

But:

- the PMUs handling is not ready for that, as I explained above
- we still need the hardware breakpoint layer that decodes a breakpoint
  request (address, length of the memory target, number of registers
  limitation). This part is still a mandatory feature to build dynamic
  PMU based breakpoints.

Frederic.

* Re: [Patch 0/1] HW-BKPT: Allow per-cpu kernel-space Hardware Breakpoint requests
From: K.Prasad @ 2009-08-26 18:02 UTC
To: Ingo Molnar, Frederic Weisbecker
Cc: Peter Zijlstra, LKML, Lai Jiangshan, Steven Rostedt,
Mathieu Desnoyers, Alan Stern

On Wed, Aug 26, 2009 at 01:49:57PM +0200, Frederic Weisbecker wrote:
> What you are suggesting is a complete refactoring of the breakpoint API
> on top of pmus.
>
> Well, that's possible and would factorize the scheduling, conflict and so
> on. So that's theoretically a good point and I hope we'll come to such
> centralization, that looks like my suggestion to Peter to share the
> perfcounter layer that handles the scheduling of hardware registers.
>
> But the pmu handling is currently not ready for that.

I am not sure if pmus can handle, (or want to handle) all the
intricacies involved with the hw-breakpoint layer and let the other
in-kernel users of hw-breakpoint such as ptrace and ftrace (at the moment)
operate over it.

The hw-breakpoint infrastructure has now grown to address nearly all
requirements of perf-tools (barring the facility to schedule over-committed
breakpoint requests, and a pending enable/disable feature) while its
interoperability allows co-existence of other users.

Given that there are multiple users of hw-breakpoint and that it is a
contended resource (with diversity in breakpoint characteristics) wouldn't
it be best to leave its management in a layer well below all its users
(including perf/pmu)?

That, in my opinion, would help the hw-breakpoint infrastructure evolve
continuously to help the users exploit the debug registers better.

Thanks,
K.Prasad

* Re: [Patch 0/1] HW-BKPT: Allow per-cpu kernel-space Hardware Breakpoint requests
From: Ingo Molnar @ 2009-08-29 13:41 UTC
To: K.Prasad
Cc: Frederic Weisbecker, Peter Zijlstra, LKML, Lai Jiangshan,
Steven Rostedt, Mathieu Desnoyers, Alan Stern

* K.Prasad <prasad@linux.vnet.ibm.com> wrote:

> I am not sure if pmus can handle, (or want to handle) all the
> intricacies involved with the hw-breakpoint layer [...]

Which are those intricacies? It's all rather straightforward
register scheduling and reservation stuff - which perfcounters
already solves in a very rich way.

Ingo

* Re: [Patch 0/1] HW-BKPT: Allow per-cpu kernel-space Hardware Breakpoint requests
From: K.Prasad @ 2009-09-01 6:38 UTC
To: Ingo Molnar
Cc: Frederic Weisbecker, Peter Zijlstra, LKML, Lai Jiangshan,
Steven Rostedt, Mathieu Desnoyers, Alan Stern, Paul Mackerras,
David Gibson

On Sat, Aug 29, 2009 at 03:41:07PM +0200, Ingo Molnar wrote:
>
> * K.Prasad <prasad@linux.vnet.ibm.com> wrote:
>
> > I am not sure if pmus can handle, (or want to handle) all the
> > intricacies involved with the hw-breakpoint layer [...]
>
> Which are those intricacies? It's all rather straightforward
> register scheduling and reservation stuff - which perfcounters
> already solves in a very rich way.
>
> Ingo

While it is quite true that debug register scheduling and reservation
(using exclusive/pinned properties) are possible through perf's
implementation, breakpoint exception handling and a provision to invoke
user-defined callbacks require an extension to the existing perf
implementation (which allows only counting and sampling upon an event,
as I presently understand).

Breakpoint exception handling involving tasks such as filtering stray
exceptions (arising out of breakpoint length limitations), user-defined
callback invocation and signal generation are, as I see it, not in common
with perf-counter's functionality. And on architectures like PPC64 whose
exception behaviour is 'trigger-before-execute', making it difficult to
bring a 'continuous-trigger' behaviour, sufficient interlocking is necessary
with the single-step exception (required for a
bkpt_exception-->disable_bp-->single_step-->enable_bp-->invoke_callback+signal
process).

And post integration, in-kernel users like ptrace, kgdb* and xmon*
which hitherto have interacted directly with the debug registers
(through set_debugreg()/set_dabr()) should route their requests through the
perf-layer. It is difficult to imagine ptrace's idempotent requests
(through ptrace_<get><set>_debugreg()) having to pass through the perf-layer
(and becoming dependent on CONFIG_PERF_COUNTERS), not to mention the
tricks required to synchronise signal generation timing with exception
behaviour (especially on PPC64).

* - Not converted to use hw-breakpoint layer yet

With debugging and performance monitoring being two primary uses of
hw-breakpoints (apart from the many niche uses that one can think of),
it would be prudent to retain the breakpoints as a separate layer
allowing exploitation by applications with either need, rather than to
tightly integrate with perf-counters.

With plenty of users exploiting the breakpoint layer's debugging
capabilities - like SystemTap http://lwn.net/Articles/343581/
(extensible for user-space), ftrace, ptrace and potentially gdbstub
(http://tinyurl.com/gdbstub-prototype), it is but a sad state to keep
the hw-breakpoint layer waiting in-queue for want of performance
monitoring (through perf-counter exploitation/integration).

Thanks,
K.Prasad
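
The 'trigger-before-execute' sequence mentioned in this message
(bkpt_exception, disable_bp, single_step, enable_bp, invoke_callback) can be
sketched as a two-state machine. This is a toy Python model under stated
assumptions, not kernel code: the class name, method names and boolean
return values are invented for the illustration, and signal generation is
folded into the callback.

```python
class TriggerBeforeExecuteBp:
    """Toy model of one breakpoint on a trigger-before-execute architecture."""

    def __init__(self, callback):
        self.callback = callback  # user-defined handler (hypothetical)
        self.enabled = True
        self.stepping = False

    def breakpoint_exception(self):
        # The exception is raised before the access executes. Returning with
        # the breakpoint still armed would re-trigger it forever, so disarm
        # it and single-step over the instruction.
        if not self.enabled:
            return False  # stray exception, nothing to do
        self.enabled = False
        self.stepping = True
        return True

    def single_step_exception(self):
        # The instruction has now completed: re-arm, then notify the user.
        if not self.stepping:
            return False  # single-step not caused by this breakpoint
        self.stepping = False
        self.enabled = True
        self.callback()
        return True
```

The interlocking is visible in the two flags: the callback runs only from
the single-step handler, and only if that single-step was started by the
breakpoint exception.
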

* Re: [Patch 0/1] HW-BKPT: Allow per-cpu kernel-space Hardware Breakpoint requests
From: Frederic Weisbecker @ 2009-09-01 23:51 UTC
To: K.Prasad
Cc: Ingo Molnar, Peter Zijlstra, LKML, Lai Jiangshan,
Steven Rostedt, Mathieu Desnoyers, Alan Stern, Paul Mackerras,
David Gibson

On Tue, Sep 01, 2009 at 12:08:45PM +0530, K.Prasad wrote:
> While it is quite true that debug register scheduling and reservation
> (using exclusive/pinned properties) are possible through the perf's
> implementation, breakpoint exception handling and a provision to invoke
> user-defined callback require an extension to the existing perf
> implementation (which allows only counting and sampling upon an event,
> as I presently understand).

Well, not that much actually. The upper (core) layer of the hw bp should
remain and still handle specific breakpoint problems. Also that doesn't
imply a complete zapping of the low level, we indeed still need to handle
things like exception callbacks.

Actually the only part that may roughly shrink is the registers
scheduling. We just won't need to handle tricky things like per thread
virtual debug registers anymore.

> Breakpoint exception handling involving tasks such as filtering stray
> exceptions (arising out of breakpoint length limitations), user-defined
> callback invocation and signal generation are, as I see not in common
> with perf-counter's functionality. And on architectures like PPC64 whose
> exception behaviour is 'trigger-before-execute' making it difficult to
> bring a 'continuous-trigger' behaviour, sufficient interlocking is necessary
> with single-step exception (required for a
> bkpt_exception-->disable_bp-->single_step-->enable_bp-->invoke_callback+signal
> process).

No, really, it's not up to perf to handle such peculiar things. It's still
the role of the bp API (low and high level).

> And post integration, in-kernel users like ptrace, kgdb* and xmon*
> which hitherto have interacted directly with the debug registers
> (through set_debugreg()/set_dabr()) should route their requests through the
> perf-layer. It is difficult to imagine ptrace's idempotent requests
> (through ptrace_<get><set>_debugreg()) having to pass through perf-layer
> (and becoming dependant on CONFIG_PERF_COUNTERS), not to mention the
> tricks required to synchronise signal generation timing with exception
> behaviour (especially on PPC64).
> * - Not converted to use hw-breakpoint layer yet

Actually, I see the perf layer here as a middle man between

- the very hardware stuff (dr[0-467]) handling, reading, writing, updating
- the core API (register_kernel_breakpoint(), register_user_breakpoint() etc..)

And this middle man can handle so many things on its own that the two above
get utterly shrunk.

Also the ptrace thing is tricky in itself, and that can't be helped easily.
Because of the direct writing to debug registers done by POKE_USR,
whatever the current breakpoint API with or without perf integration, we still
need subterfuges to carry it.

> With debugging and performance monitoring being two primary uses of
> hw-breakpoints (apart from the many niche uses that one can think of),
> it would be prudent to retain the breakpoints as a separate layer
> allowing exploitation by applications with either needs than to tightly
> integrate with perf-counters.

A lonesome counter would be very limited in itself, we would only have the
perf support for breakpoints. Again, the API is still required.

The goal is to have:

1) A factorization of the registers scheduling, of breakpoint target
   allocation (task/cpu, etc..., it's all handled by perf)
2) Optimization of registers scheduling
3) New features (period to trigger events, target inheritance, context
   exclusion etc...)
4) A shrink of the code

> With plenty of users exploiting the breakpoint layer's debugging
> capabilities - like SystemTap http://lwn.net/Articles/343581/
> (extensible for user-space), ftrace, ptrace and potentially gdbstub
> (http://tinyurl.com/gdbstub-prototype), it is but a sad state to keep
> the hw-breakpoint layer waiting in-queue for want of performance
> monitoring (through perf-counter exploitation/integration).

I first found the idea of a perf based design suspicious, because it
appeared to be real overkill. But actually, after more thought about it,
it could really simplify, factorize, and enhance this API.

I'm currently trying to do something. A quick draft just to see where
we can go with it, and what such a beast could look like...
* Re: [Patch 0/1] HW-BKPT: Allow per-cpu kernel-space Hardware Breakpoint requests
2009-09-01 23:51 ` Frederic Weisbecker
@ 2009-09-03 18:28 ` K.Prasad
2009-09-03 19:22 ` Ingo Molnar
0 siblings, 1 reply; 15+ messages in thread
From: K.Prasad @ 2009-09-03 18:28 UTC (permalink / raw)
To: Frederic Weisbecker, Ingo Molnar
Cc: Peter Zijlstra, LKML, Lai Jiangshan, Steven Rostedt,
Mathieu Desnoyers, Alan Stern, Paul Mackerras, David Gibson
On Wed, Sep 02, 2009 at 01:51:33AM +0200, Frederic Weisbecker wrote:
> On Tue, Sep 01, 2009 at 12:08:45PM +0530, K.Prasad wrote:
> > On Sat, Aug 29, 2009 at 03:41:07PM +0200, Ingo Molnar wrote:
> > >
> > > * K.Prasad <prasad@linux.vnet.ibm.com> wrote:
> > >
> > > > I am not sure if pmus can handle, (or want to handle) all the
> > > > intricacies involved with the hw-breakpoint layer [...]
> > >
> > > Which are those intricacies? It's all rather straightforward
> > > register scheduling and reservation stuff - which perfcounters
> > > already solves in a very rich way.
> > >
> > > Ingo
> >
[edited]
> > And post integration, in-kernel users like ptrace, kgdb* and xmon*
> > which hitherto have interacted directly with the debug registers
> > (through set_debugreg()/set_dabr()) should route their requests through the
> > perf-layer. It is difficult to imagine ptrace's idempotent requests
> > (through ptrace_<get><set>_debugreg()) having to pass through perf-layer
> > (and becoming dependent on CONFIG_PERF_COUNTERS), not to mention the
> > tricks required to synchronise signal generation timing with exception
> > behaviour (especially on PPC64).
> > * - Not converted to use hw-breakpoint layer yet
>
>
> Actually, I see the perf layer here as a middle man between
>
> - the very hardware stuff (dr[0-467]) handling, reading, writing, updating
> - the core API (register_kernel_breakpoint(), register_user_breakpoint() etc..)
>
> And this middle man can handle so many things on its own that the two above
> get utterly shrunk.
>
> Also the ptrace thing is tricky in itself, and that can't be helped easily.
> Because of the direct writing to debug registers done by POKE_USR,
> whatever the current breakpoint API, with or without perf integration, we still
> need subterfuges to carry it.
>

The reverse-dependency this would create on perf (CONFIG_PERF) for the
hw-breakpoint layer is an undesirable side-effect, and gives rise to
at least two immediate questions:

- Handling of requests for hw-breakpoint from users like ptrace when
CONFIG_PERF is not turned on
- Managing 'register scheduling and reservation' on architectures where
the perf layer isn't ported. An inefficient way of handling this would be
to retain the existing register allocation code of hw-breakpoint for
such architectures - thereby artificially imposing arch-specific code
into generic stuff.

A solution here would be to detach the parts of the perf layer's code that
handle register scheduling and reservation (which I learn are in
kernel/perf_counter.c) into a separate entity (outside the ambit of
CONFIG_PERF) that can serve the needs of both hw-breakpoint and perf,
thereby eliminating the two issues enumerated above.

The tight coupling between the functions that perform register scheduling
(in kernel/perf_counter.c) and perf's data structures is quite apparent
and does suggest a non-trivial amount of effort to detach them into a layer
of its own.

However this might be quite necessary in order to balance between a
desire to re-use the 'register scheduling and reservation' code of the
perf-layer while not running into issues as above.

This, along with the framework (described in the previous mail) to
retain the hw-breakpoint's APIs + code interacting with debug registers
(including exception handling) would be a good compromise.

Thanks,
K.Prasad
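
As an illustration of what a 'register scheduling and reservation' entity independent of CONFIG_PERF might minimally offer, here is a user-space C sketch. It is an assumption-laden model, not code from kernel/perf_counter.c or from the hw-breakpoint patches: struct slot_pool and the slot_pool_*() functions are hypothetical, NR_CPUS_MODEL is an arbitrary CPU count, and the top-down allocation order merely mirrors the cover letter's description of kernel-space requests starting from HBP_NUM. Compaction on removal is deliberately left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NR_CPUS_MODEL 4  /* hypothetical CPU count for this model */
#define HBP_NUM 4        /* debug address registers per CPU (x86 has four) */

/* Hypothetical stand-alone pool; owner id 0 means "slot is free". */
struct slot_pool {
	uint32_t owner[NR_CPUS_MODEL][HBP_NUM];
};

void slot_pool_init(struct slot_pool *p)
{
	memset(p, 0, sizeof(*p));
}

/* Highest-numbered free slot on 'cpu', or -1 if the CPU is full. */
static int find_free(const struct slot_pool *p, int cpu)
{
	for (int i = HBP_NUM - 1; i >= 0; i--)
		if (p->owner[cpu][i] == 0)
			return i;
	return -1;
}

/*
 * Reserve one slot on every CPU set in 'cpumask' for 'owner'.
 * All-or-nothing: returns -1 and changes nothing if any CPU is full.
 */
int slot_pool_reserve(struct slot_pool *p, unsigned int cpumask, uint32_t owner)
{
	assert(owner != 0);
	for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		if ((cpumask & (1u << cpu)) && find_free(p, cpu) < 0)
			return -1;
	for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		if (cpumask & (1u << cpu))
			p->owner[cpu][find_free(p, cpu)] = owner;
	return 0;
}

/* Free every slot held by 'owner' on all CPUs. */
void slot_pool_release(struct slot_pool *p, uint32_t owner)
{
	for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		for (int i = 0; i < HBP_NUM; i++)
			if (p->owner[cpu][i] == owner)
				p->owner[cpu][i] = 0;
}

/* Number of occupied slots on 'cpu'. */
int slot_pool_used(const struct slot_pool *p, int cpu)
{
	int n = 0;
	for (int i = 0; i < HBP_NUM; i++)
		n += (p->owner[cpu][i] != 0);
	return n;
}
```

Nothing in this interface refers to perf's data structures, which is the property the proposal above is after; whether the real scheduling code can be cut down to something this narrow is exactly the open question in the thread.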
* Re: [Patch 0/1] HW-BKPT: Allow per-cpu kernel-space Hardware Breakpoint requests
2009-09-03 18:28 ` K.Prasad
@ 2009-09-03 19:22 ` Ingo Molnar
0 siblings, 0 replies; 15+ messages in thread
From: Ingo Molnar @ 2009-09-03 19:22 UTC (permalink / raw)
To: K.Prasad
Cc: Frederic Weisbecker, Peter Zijlstra, LKML, Lai Jiangshan,
Steven Rostedt, Mathieu Desnoyers, Alan Stern, Paul Mackerras,
David Gibson

* K.Prasad <prasad@linux.vnet.ibm.com> wrote:

> On Wed, Sep 02, 2009 at 01:51:33AM +0200, Frederic Weisbecker wrote:
> > On Tue, Sep 01, 2009 at 12:08:45PM +0530, K.Prasad wrote:
> > > On Sat, Aug 29, 2009 at 03:41:07PM +0200, Ingo Molnar wrote:
> > > >
> > > > * K.Prasad <prasad@linux.vnet.ibm.com> wrote:
> > > >
> > > > > I am not sure if pmus can handle, (or want to handle) all the
> > > > > intricacies involved with the hw-breakpoint layer [...]
> > > >
> > > > Which are those intricacies? It's all rather straightforward
> > > > register scheduling and reservation stuff - which perfcounters
> > > > already solves in a very rich way.
> > > >
> > > > Ingo
> > >
> [edited]
> > > And post integration, in-kernel users like ptrace, kgdb* and xmon*
> > > which hitherto have interacted directly with the debug registers
> > > (through set_debugreg()/set_dabr()) should route their requests through the
> > > perf-layer. It is difficult to imagine ptrace's idempotent requests
> > > (through ptrace_<get><set>_debugreg()) having to pass through perf-layer
> > > (and becoming dependent on CONFIG_PERF_COUNTERS), not to mention the
> > > tricks required to synchronise signal generation timing with exception
> > > behaviour (especially on PPC64).
> > > * - Not converted to use hw-breakpoint layer yet
> >
> >
> > Actually, I see the perf layer here as a middle man between
> >
> > - the very hardware stuff (dr[0-467]) handling, reading, writing, updating
> > - the core API (register_kernel_breakpoint(), register_user_breakpoint() etc..)
> >
> > And this middle man can handle so many things on its own that the two above
> > get utterly shrunk.
> >
> > Also the ptrace thing is tricky in itself, and that can't be helped easily.
> > Because of the direct writing to debug registers done by POKE_USR,
> > whatever the current breakpoint API, with or without perf integration, we still
> > need subterfuges to carry it.
> >
>
> The reverse-dependency this would create on perf (CONFIG_PERF) for the
> hw-breakpoint layer is an undesirable side-effect, and gives rise to
> at least two immediate questions:
>
> - Handling of requests for hw-breakpoint from users like ptrace when
> CONFIG_PERF is not turned on

This is basically just a build/layering logistics question and it is
solved easily - we could have a library mode for it.

> - Managing 'register scheduling and reservation' on architectures where
> the perf layer isn't ported. An inefficient way of handling this would be
> to retain the existing register allocation code of hw-breakpoint for
> such architectures - thereby artificially imposing arch-specific code
> into generic stuff.

Minimally porting perf to enable a hw-breakpoints PMU extension is
very easy in practice. For example on s390 it took just 15 lines of
code:

12310e9: [S390] Enable tick based perf_counter on s390.

arch/s390/Kconfig                    |    1 +
arch/s390/include/asm/perf_counter.h |    8 ++++++++
tools/perf/perf.h                    |    6 ++++++
3 files changed, 15 insertions(+), 0 deletions(-)

On FRV it took 38 lines (60% of which are boilerplate copyright
notices), on PARISC 15 lines.

By far the most complexity is in factoring out the hw-breakpoint
code itself - and that has to be done regardless of the register
scheduling model.

> A solution here would be to detach the parts of the perf layer's code that
> handle register scheduling and reservation (which I learn are in
> kernel/perf_counter.c) into a separate entity (outside the ambit
> of CONFIG_PERF) that can serve the needs of both hw-breakpoint and
> perf, thereby eliminating the two issues enumerated above.
>
> The tight coupling between the functions that perform register
> scheduling (in kernel/perf_counter.c) and perf's data structures
> is quite apparent and does suggest a non-trivial amount of effort to
> detach them into a layer of its own.
>
> However this might be quite necessary in order to balance between
> a desire to re-use the 'register scheduling and reservation' code
> of the perf-layer while not running into issues as above.
>
> This, along with the framework (described in the previous mail) to
> retain the hw-breakpoint's APIs + code interacting with debug
> registers (including exception handling) would be a good
> compromise.

I don't think the librarization is all that complex. It's very much
desired, as we'd reuse an existing piece of infrastructure to
implement another one - this is always good.

Ingo
* Re: [Patch 0/1] HW-BKPT: Allow per-cpu kernel-space Hardware Breakpoint requests
2009-08-20 17:27 ` K.Prasad
2009-08-21 14:28 ` Ingo Molnar
@ 2009-08-25 20:33 ` K.Prasad
1 sibling, 0 replies; 15+ messages in thread
From: K.Prasad @ 2009-08-25 20:33 UTC (permalink / raw)
To: Frederic Weisbecker, Ingo Molnar, Peter Zijlstra
Cc: LKML, Lai Jiangshan, Steven Rostedt, Mathieu Desnoyers, Alan Stern
On Thu, Aug 20, 2009 at 10:57:19PM +0530, K.Prasad wrote:
> On Wed, Aug 19, 2009 at 07:33:00PM +0200, Frederic Weisbecker wrote:
> > On Wed, Aug 19, 2009 at 09:41:19PM +0530, K.Prasad wrote:
> > > On Mon, Aug 17, 2009 at 06:16:41PM +0530, K.Prasad wrote:
> > > > Hi All,
> > > > Please find a patch that enables kernel-space breakpoints to be
> > > > requested for a subset of the available CPUs in the system. This allows
> > > > per-CPU breakpoints and comes with the associated benefit of reduced
> > > > overhead during (un)registration.
> > > >
> > > > This enhancement allows exploitation of hardware breakpoint registers by
> > > > 'perf' which produces a CPU-wise information.
> > > >
> [edited]
> > >
> > > Hi Frederic,
> > > Do you find these patches, that provide the ability to restrict
> > > kernel-space breakpoints to any given subset of CPUs, to bring the
> > > requisite features for exploitation of hw-bkpt by 'perf tools'?
> > >
> > > Also of interest would be the reduced overhead associated with
> > > (un)register_kernel_hw_breakpoint() operations (no IPI in case of
> > > single-CPU breakpoint request).
> >
> [edited]
> > - having a callback that quickly swaps two breakpoints in order to support
> > the hardware register multiplexing. I guess the pmu object would just need
> > to call it when the multiplexing is decided.
> >
> >
>
> Are you suggesting something like a modify_kernel_hw_breakpoint() that
> can quickly change a breakpoint address/characteristics?
>
> That's quite doable...it requires a quick validation through
> arch_validate_hwbkpt_settings() and the requisite IPIs (depending on
> what the new cpumask is).
>
> I will send a patch to that effect soon.
>

Hi Frederic,
I just sent a patchset that adds the ability to specify per-cpu
kernel-space breakpoints + a (relatively) lightweight function to modify
the characteristics of a kernel-space breakpoint that can be used to
swap between two breakpoint requests.

Please pull them into the -tip tree if you find them mature and ready.

With these new feature additions, I see the HW-Breakpoint infrastructure
code as ready to meet the needs for exploitation by perf-tools, and I
presume you would restart your effort on the same?

Thanks,
K.Prasad
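
The validate-then-update flow described above (a quick validation followed by an in-place change, usable to swap two requests on one register) can be sketched in user-space C as follows. This is not the patchset's code: struct bp_settings, struct bp_request, validate_settings(), modify_breakpoint() and swap_breakpoints() are hypothetical names, model_dr[] stands in for one CPU's debug address registers, and the validation rule (length 1, 2, 4 or 8 with a length-aligned address) is an assumption modelled on x86 constraints. The IPIs needed for other CPUs are not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical breakpoint characteristics. */
struct bp_settings {
	uint64_t addr;
	unsigned int len;
};

/* Hypothetical registered request holding a slot on one CPU. */
struct bp_request {
	struct bp_settings cur;
	int slot;
};

/* Stand-in for one CPU's four debug address registers. */
uint64_t model_dr[4];

/* Assumed x86-like rule: length 1/2/4/8 and address aligned to length. */
int validate_settings(const struct bp_settings *s)
{
	if (s->len != 1 && s->len != 2 && s->len != 4 && s->len != 8)
		return -1;
	if (s->addr & (uint64_t)(s->len - 1))
		return -1;
	return 0;
}

/*
 * Change an armed breakpoint in place, keeping its slot.
 * On validation failure the old settings stay armed and -1 is returned.
 */
int modify_breakpoint(struct bp_request *bp, const struct bp_settings *next)
{
	assert(bp->slot >= 0 && bp->slot < 4);
	if (validate_settings(next) < 0)
		return -1;
	bp->cur = *next;
	model_dr[bp->slot] = next->addr;  /* the kernel would write the register here */
	return 0;
}

/*
 * Multiplexing helper: arm the parked settings in the request's slot and
 * park the previously armed settings in their place.
 */
int swap_breakpoints(struct bp_request *armed, struct bp_settings *parked)
{
	struct bp_settings old = armed->cur;

	if (modify_breakpoint(armed, parked) < 0)
		return -1;
	*parked = old;
	return 0;
}
```

Because the slot is never released during the swap, no other request can grab the register in between, which is one plausible reason a modify operation is cheaper than an unregister followed by a register.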
end of thread, other threads:[~2009-09-03 19:23 UTC | newest]
Thread overview: 15+ messages
2009-08-17 12:46 [Patch 0/1] HW-BKPT: Allow per-cpu kernel-space Hardware Breakpoint requests K.Prasad
2009-08-19 16:11 ` K.Prasad
2009-08-19 17:33 ` Frederic Weisbecker
2009-08-20 17:27 ` K.Prasad
2009-08-21 14:28 ` Ingo Molnar
2009-08-26 3:36 ` Frederic Weisbecker
2009-08-26 9:16 ` Ingo Molnar
2009-08-26 11:49 ` Frederic Weisbecker
2009-08-26 18:02 ` K.Prasad
2009-08-29 13:41 ` Ingo Molnar
2009-09-01 6:38 ` K.Prasad
2009-09-01 23:51 ` Frederic Weisbecker
2009-09-03 18:28 ` K.Prasad
2009-09-03 19:22 ` Ingo Molnar
2009-08-25 20:33 ` K.Prasad