From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: MIME-Version: 1.0 In-Reply-To: <1486966298-16767-1-git-send-email-leo.yan@linaro.org> References: <1486966298-16767-1-git-send-email-leo.yan@linaro.org> From: Mike Leach Date: Mon, 13 Feb 2017 15:58:47 +0000 Message-ID: Subject: Re: [PATCH RFC 0/3] coresight: enable debug module Content-Type: multipart/alternative; boundary=001a1146d146d866c905486b85ba To: Leo Yan Cc: Mathieu Poirier , Rob Herring , Mark Rutland , Wei Xu , Catalin Marinas , Will Deacon , linux-arm-kernel@lists.infradead.org, devicetree@vger.kernel.org, linux-kernel@vger.kernel.org, Daniel Thompson List-ID: --001a1146d146d866c905486b85ba Content-Type: text/plain; charset=UTF-8 Hi Leo, A few comments about your driver RFC. i) As it stands this looks like it will work for v8 cores, but would need refining for v7. There are subtle differences in the PC sampling between the two architectures. ii) the routine that dumps the PCSR register values across all the available cores, appears to assume that all the cores are powered and as such the registers are accessible. I think it would be useful in the Kconfig help to point this out. At ARM we have encountered systems that have quite aggressive power management policies which will disable both the core power domain and debug power domain for unused cores / clusters. On such a system I think that there would be an slave error return provided by the memory access to unpowered elements, resulting in undesirable behavior, or even a bus lockup. These are both errors we have seen using the external debug registers via an external debugger. At the very least the user needs to be warned so he can adjust his system accordingly (e.g. when we are debugging a Juno based Linux system using the external debug registers / debugger. we will often disable power management using a script to ensure that cores stay up and debuggable.) iii) I would recommend that you take note of the relevant bits in EDPRSR which may indicate that reading EDPCSR will cause an access error. Specifically DLK, PU would gate valid access to OSLAR -> the state of which you can also determine using EDPRSR. iv) When running in higher EL levels, such as might be the case with trusted firmware, then PC sampling may not be permitted by the OS, when the EDPCSR value will return 0xFFFFFFFF. The PCSR dump message should reflect this. v) EDVIDSR - which is sampled at the same time as EDPCSR_lo contains interesting information regarding the validity of the EDPCSR_hi registers and current EL and security state. I think it would be worth reading and interpreting this register in addition to the PC sample. vi) The set of registers you are using are the "PC sample-based profiling extension". (ARM v8 manual section H7.) It might be more accurate to describe the driver as implementing support for this rather than full debug. Best Regards Mike Leach On 13 February 2017 at 06:11, Leo Yan wrote: > This patch series is to enable coresight debug module. With debug > module we can check CPU state and PC value, etc. So this is helpful for > CPU lockup bugs, e.g. if one CPU has run into infinite loop with IRQ > disabled. The hang CPU cannot switch context and handle any interrupt, > so it cannot handle SMP call for stack dump, etc. > > Furthermore, now ARMv8 introduces some runtime firmwares like ARM > trusted firmware BL31, so sometime CPU lockup may happen in the > firmware and cannot return back to kernel. > > This initial patch series enable debug module and registers call back > notifier for PCSR register dumping when panic happens, so we can see > below dumping info for panic: > > [ 13.751777] Coresight debug module: > [ 13.755275] CPU[0]: PSCR=0xffff000008090cbc > [ 13.759469] CPU[1]: PSCR=0xffff0000088bf9e4 > [ 13.763662] CPU[2]: PSCR=0xffff000008090cc0 > [ 13.767856] CPU[3]: PSCR=0xffff000008090cb8 > [ 13.772049] CPU[4]: PSCR=0xffff000008090cc0 > [ 13.776243] CPU[5]: PSCR=0xffff000008090cbc > [ 13.780436] CPU[6]: PSCR=0xffff000008090cc0 > [ 13.784630] CPU[7]: PSCR=0xffff000008090cbc > > This patch series has been verified on 96boards Hikey. > > > Leo Yan (3): > coresight: binding for coresight debug driver > coresight: add support for debug module > arm64: dts: register Hi6220's coresight debug module > > .../devicetree/bindings/arm/coresight.txt | 9 +- > .../boot/dts/hisilicon/hikey_6220_coresight.dtsi | 73 +++++++++ > drivers/hwtracing/coresight/Kconfig | 8 + > drivers/hwtracing/coresight/Makefile | 1 + > drivers/hwtracing/coresight/coresight-debug.c | 169 > +++++++++++++++++++++ > 5 files changed, 258 insertions(+), 2 deletions(-) > create mode 100644 drivers/hwtracing/coresight/coresight-debug.c > > -- > 2.7.4 > > > _______________________________________________ > linux-arm-kernel mailing list > linux-arm-kernel@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel > -- Mike Leach Principal Engineer, ARM Ltd. Blackburn Design Centre. UK --001a1146d146d866c905486b85ba Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
Hi Leo,

A few comments about your drive= r RFC.

i) As it stands this looks like it will work for v8 = cores, but would need refining for v7. There are subtle differences in the = PC sampling between the two architectures.

ii) the routine = that dumps the PCSR register values across all the available cores, appears= to assume that all the cores are powered and as such the registers are acc= essible.
I think it would be useful in the Kconfig help to poin= t this out.

At ARM we have encountered systems that have quite aggr= essive power management policies which will disable both the core power dom= ain and debug power domain for unused cores / clusters. On such a system I = think that there would be an slave error return provided by the memory acce= ss to unpowered elements, resulting in undesirable behavior, or even a bus = lockup. These are both errors we have seen using the external debug registe= rs via an external debugger.

At the very least the user needs to be = warned so he can adjust his system accordingly (e.g. when we are debugging = a Juno based Linux system using the external debug registers / debugger. we= will often disable power management using a script to ensure that cores st= ay up and debuggable.)

iii) I would recommend that you take= note of the relevant bits in EDPRSR which may indicate that reading EDPCSR= will cause an access error. Specifically DLK, PU would gate valid access t= o OSLAR -> the state of which you can also determine using EDPRSR.
iv) When running in higher EL levels, such as might be the case = with trusted firmware, then PC sampling may not be permitted by the OS, whe= n the EDPCSR value will return 0xFFFFFFFF. The PCSR dump message should ref= lect this.

v) EDVIDSR - which is sampled at the s= ame time as EDPCSR_lo contains interesting information regarding the validi= ty of the EDPCSR_hi registers and current EL and security state.
<= div class=3D"gmail_default" style=3D"font-family:arial,helvetica,sans-serif= ">I think it would be worth reading and interpreting this register in addit= ion to the PC sample.

vi) The set of registers you are usin= g are the "PC sample-based profiling extension". (ARM v8 manual s= ection H7.) It might be more accurate to describe the driver as implementin= g support for this rather than full debug.

Best Re= gards

Mike Leach



On 13 February 2017 at 06:11, Leo Ya= n <leo.yan@linaro.org> wrote:
This patch series is to enable coresight debug module. With debug
module we can check CPU state and PC value, etc. So this is helpful for
CPU lockup bugs, e.g. if one CPU has run into infinite loop with IRQ
disabled. The hang CPU cannot switch context and handle any interrupt,
so it cannot handle SMP call for stack dump, etc.

Furthermore, now ARMv8 introduces some runtime firmwares like ARM
trusted firmware BL31, so sometime CPU lockup may happen in the
firmware and cannot return back to kernel.

This initial patch series enable debug module and registers call back
notifier for PCSR register dumping when panic happens, so we can see
below dumping info for panic:

[=C2=A0 =C2=A013.751777] Coresight debug module:
[=C2=A0 =C2=A013.755275] CPU[0]: PSCR=3D0xffff000008090cbc
[=C2=A0 =C2=A013.759469] CPU[1]: PSCR=3D0xffff0000088bf9e4
[=C2=A0 =C2=A013.763662] CPU[2]: PSCR=3D0xffff000008090cc0
[=C2=A0 =C2=A013.767856] CPU[3]: PSCR=3D0xffff000008090cb8
[=C2=A0 =C2=A013.772049] CPU[4]: PSCR=3D0xffff000008090cc0
[=C2=A0 =C2=A013.776243] CPU[5]: PSCR=3D0xffff000008090cbc
[=C2=A0 =C2=A013.780436] CPU[6]: PSCR=3D0xffff000008090cc0
[=C2=A0 =C2=A013.784630] CPU[7]: PSCR=3D0xffff000008090cbc

This patch series has been verified on 96boards Hikey.


Leo Yan (3):
=C2=A0 coresight: binding for coresight debug driver
=C2=A0 coresight: add support for debug module
=C2=A0 arm64: dts: register Hi6220's coresight debug module

=C2=A0.../devicetree/bindings/arm/coresight.txt=C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 |=C2=A0 =C2=A09 +-
=C2=A0.../boot/dts/hisilicon/hikey_6220_coresight.dtsi=C2=A0 =C2=A0|= =C2=A0 73 +++++++++
=C2=A0drivers/hwtracing/coresight/Kconfig=C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 |=C2=A0 =C2=A08 +
=C2=A0drivers/hwtracing/coresight/Makefile=C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0|=C2=A0 =C2=A01 +
=C2=A0drivers/hwtracing/coresight/coresight-debug.c=C2=A0 =C2=A0 =C2= =A0 | 169 +++++++++++++++++++++
=C2=A05 files changed, 258 insertions(+), 2 deletions(-)
=C2=A0create mode 100644 drivers/hwtracing/coresight/coresight-debug.c=

--
2.7.4


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@li= sts.infradead.org
http://lists.infradead.org/mailman/= listinfo/linux-arm-kernel



--
<= div>Mike Leach
Principal Engineer, ARM Ltd.
Blackburn D= esign Centre. UK
--001a1146d146d866c905486b85ba--