public inbox for linux-kernel@vger.kernel.org
* [PATCH v2] watchdog: Prefer use "ref-cycles" for NMI watchdog
@ 2023-05-16 23:58 Song Liu
  2023-05-17  1:23 ` Li, Xin3
  2023-05-17  7:31 ` Peter Zijlstra
  0 siblings, 2 replies; 5+ messages in thread
From: Song Liu @ 2023-05-16 23:58 UTC (permalink / raw)
  To: linux-kernel; +Cc: kernel-team, Song Liu, Andrew Morton, Peter Zijlstra

The NMI watchdog permanently consumes one hardware counter per CPU on the
system. For systems that use many hardware counters, this causes more
aggressive time multiplexing of perf events.

OTOH, some CPUs (mostly Intel) support the "ref-cycles" event, which is
rarely used. Try to use "ref-cycles" for the watchdog, so that one more
hardware counter is available to the user. If the CPU doesn't support
"ref-cycles", fall back to "cycles".

The downside of this change is that users of "ref-cycles" need to disable
nmi_watchdog.
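
The try-then-fall-back behaviour described above can be sketched in plain
userspace C. This is only an illustration of the control flow, not the kernel
code: the names wd_event, wd_create_fn, wd_event_create, wd_init and the three
example backends are all hypothetical, and -ENOENT is just an assumed failure
value standing in for whatever the perf core returns.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for the two perf hardware event configs. */
enum wd_event { WD_EVENT_REF_CYCLES, WD_EVENT_CYCLES };

/* Hypothetical backend: returns 0 if the event can be created, -errno if not. */
typedef int (*wd_create_fn)(enum wd_event ev);

/* Currently selected event, mirroring the role of wd_hw_attr.config. */
static enum wd_event wd_config = WD_EVENT_REF_CYCLES;

/* Try to create the selected event; report a failure only when asked to. */
static int wd_event_create(wd_create_fn create, bool report_failure)
{
	int ret = create(wd_config);

	if (ret && report_failure)
		fprintf(stderr, "event create failed with %d\n", ret);
	return ret;
}

/*
 * Prefer ref-cycles and fall back to cycles. The first attempt stays quiet
 * because a failure there is expected on CPUs without ref-cycles.
 * Unlike the patch, this sketch resets wd_config on every call so that it
 * can be exercised repeatedly.
 */
static int wd_init(wd_create_fn create)
{
	int ret;

	wd_config = WD_EVENT_REF_CYCLES;
	ret = wd_event_create(create, false);
	if (ret) {
		wd_config = WD_EVENT_CYCLES;
		ret = wd_event_create(create, true);
	}
	return ret;
}

/* Example backends (assumptions, for illustration only). */
static int cpu_with_ref_cycles(enum wd_event ev)
{
	(void)ev;
	return 0;
}

static int cpu_without_ref_cycles(enum wd_event ev)
{
	return ev == WD_EVENT_REF_CYCLES ? -ENOENT : 0;
}

static int cpu_without_pmu(enum wd_event ev)
{
	(void)ev;
	return -ENOENT;
}
```

With these backends, wd_init() leaves wd_config at ref-cycles on the first,
switches it to cycles on the second, and returns an error (after one reported
failure) on the third.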

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Song Liu <song@kernel.org>

---

Changes in v2:
1. Do not print a warning when creating the ref-cycles event fails.
---
 kernel/watchdog_hld.c | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
index 247bf0b1582c..a1d2a43ea31f 100644
--- a/kernel/watchdog_hld.c
+++ b/kernel/watchdog_hld.c
@@ -100,7 +100,7 @@ static inline bool watchdog_check_timestamp(void)
 
 static struct perf_event_attr wd_hw_attr = {
 	.type		= PERF_TYPE_HARDWARE,
-	.config		= PERF_COUNT_HW_CPU_CYCLES,
+	.config		= PERF_COUNT_HW_REF_CPU_CYCLES,
 	.size		= sizeof(struct perf_event_attr),
 	.pinned		= 1,
 	.disabled	= 1,
@@ -163,7 +163,7 @@ static void watchdog_overflow_callback(struct perf_event *event,
 	return;
 }
 
-static int hardlockup_detector_event_create(void)
+static int hardlockup_detector_event_create(bool send_warning)
 {
 	unsigned int cpu = smp_processor_id();
 	struct perf_event_attr *wd_attr;
@@ -176,8 +176,10 @@ static int hardlockup_detector_event_create(void)
 	evt = perf_event_create_kernel_counter(wd_attr, cpu, NULL,
 					       watchdog_overflow_callback, NULL);
 	if (IS_ERR(evt)) {
-		pr_debug("Perf event create on CPU %d failed with %ld\n", cpu,
-			 PTR_ERR(evt));
+		if (send_warning) {
+			pr_debug("Perf event create on CPU %d failed with %ld\n", cpu,
+				 PTR_ERR(evt));
+		}
 		return PTR_ERR(evt);
 	}
 	this_cpu_write(watchdog_ev, evt);
@@ -189,7 +191,7 @@ static int hardlockup_detector_event_create(void)
  */
 void hardlockup_detector_perf_enable(void)
 {
-	if (hardlockup_detector_event_create())
+	if (hardlockup_detector_event_create(true))
 		return;
 
 	/* use original value for check */
@@ -284,7 +286,13 @@ void __init hardlockup_detector_perf_restart(void)
  */
 int __init hardlockup_detector_perf_init(void)
 {
-	int ret = hardlockup_detector_event_create();
+	int ret = hardlockup_detector_event_create(false);
+
+	if (ret) {
+		/* Failed to create "ref-cycles", try "cycles" instead */
+		wd_hw_attr.config = PERF_COUNT_HW_CPU_CYCLES;
+		ret = hardlockup_detector_event_create(true);
+	}
 
 	if (ret) {
 		pr_info("Perf NMI watchdog permanently disabled\n");
-- 
2.34.1



* RE: [PATCH v2] watchdog: Prefer use "ref-cycles" for NMI watchdog
  2023-05-16 23:58 [PATCH v2] watchdog: Prefer use "ref-cycles" for NMI watchdog Song Liu
@ 2023-05-17  1:23 ` Li, Xin3
  2023-05-17  4:38   ` Song Liu
  2023-05-17  7:31 ` Peter Zijlstra
  1 sibling, 1 reply; 5+ messages in thread
From: Li, Xin3 @ 2023-05-17  1:23 UTC (permalink / raw)
  To: Song Liu, linux-kernel@vger.kernel.org
  Cc: kernel-team@meta.com, Andrew Morton, Peter Zijlstra

> The NMI watchdog permanently consumes one hardware counter per CPU on the
> system. For systems that use many hardware counters, this causes more
> aggressive time multiplexing of perf events.
> 
> OTOH, some CPUs (mostly Intel) support the "ref-cycles" event, which is
> rarely used. Try to use "ref-cycles" for the watchdog, so that one more
> hardware counter is available to the user. If the CPU doesn't support
> "ref-cycles", fall back to "cycles".
> 
> The downside of this change is that users of "ref-cycles" need to disable
> nmi_watchdog.

From the discussion in v1, the users don't have to disable the NMI watchdog
*permanently*, right?

> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Song Liu <song@kernel.org>
> 
> ---
> 
> Changes in v2:
> 1. Do not print a warning when creating the ref-cycles event fails.
> ---
>  kernel/watchdog_hld.c | 20 ++++++++++++++------
>  1 file changed, 14 insertions(+), 6 deletions(-)
> 
> diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
> index 247bf0b1582c..a1d2a43ea31f 100644
> --- a/kernel/watchdog_hld.c
> +++ b/kernel/watchdog_hld.c
> @@ -100,7 +100,7 @@ static inline bool watchdog_check_timestamp(void)
> 
>  static struct perf_event_attr wd_hw_attr = {
>  	.type		= PERF_TYPE_HARDWARE,
> -	.config		= PERF_COUNT_HW_CPU_CYCLES,
> +	.config		= PERF_COUNT_HW_REF_CPU_CYCLES,
>  	.size		= sizeof(struct perf_event_attr),
>  	.pinned		= 1,
>  	.disabled	= 1,
> @@ -163,7 +163,7 @@ static void watchdog_overflow_callback(struct perf_event *event,
>  	return;
>  }
> 
> -static int hardlockup_detector_event_create(void)
> +static int hardlockup_detector_event_create(bool send_warning)
>  {
>  	unsigned int cpu = smp_processor_id();
>  	struct perf_event_attr *wd_attr;
> @@ -176,8 +176,10 @@ static int hardlockup_detector_event_create(void)
>  	evt = perf_event_create_kernel_counter(wd_attr, cpu, NULL,
>  					       watchdog_overflow_callback, NULL);
>  	if (IS_ERR(evt)) {
> -		pr_debug("Perf event create on CPU %d failed with %ld\n", cpu,
> -			 PTR_ERR(evt));
> +		if (send_warning) {
> +			pr_debug("Perf event create on CPU %d failed with %ld\n", cpu,
> +				 PTR_ERR(evt));
> +		}
>  		return PTR_ERR(evt);
>  	}
>  	this_cpu_write(watchdog_ev, evt);
> @@ -189,7 +191,7 @@ static int hardlockup_detector_event_create(void)
>   */
>  void hardlockup_detector_perf_enable(void)
>  {
> -	if (hardlockup_detector_event_create())
> +	if (hardlockup_detector_event_create(true))
>  		return;
> 
>  	/* use original value for check */
> @@ -284,7 +286,13 @@ void __init hardlockup_detector_perf_restart(void)
>   */
>  int __init hardlockup_detector_perf_init(void)
>  {
> -	int ret = hardlockup_detector_event_create();
> +	int ret = hardlockup_detector_event_create(false);
> +
> +	if (ret) {
> +		/* Failed to create "ref-cycles", try "cycles" instead */
> +		wd_hw_attr.config = PERF_COUNT_HW_CPU_CYCLES;
> +		ret = hardlockup_detector_event_create(true);
> +	}
> 
>  	if (ret) {
>  		pr_info("Perf NMI watchdog permanently disabled\n");
> --
> 2.34.1



* Re: [PATCH v2] watchdog: Prefer use "ref-cycles" for NMI watchdog
  2023-05-17  1:23 ` Li, Xin3
@ 2023-05-17  4:38   ` Song Liu
  0 siblings, 0 replies; 5+ messages in thread
From: Song Liu @ 2023-05-17  4:38 UTC (permalink / raw)
  To: Li, Xin3
  Cc: Song Liu, linux-kernel@vger.kernel.org, Kernel Team,
	Andrew Morton, Peter Zijlstra



> On May 16, 2023, at 6:23 PM, Li, Xin3 <xin3.li@intel.com> wrote:
> 
>> The NMI watchdog permanently consumes one hardware counter per CPU on the
>> system. For systems that use many hardware counters, this causes more
>> aggressive time multiplexing of perf events.
>> 
>> OTOH, some CPUs (mostly Intel) support the "ref-cycles" event, which is
>> rarely used. Try to use "ref-cycles" for the watchdog, so that one more
>> hardware counter is available to the user. If the CPU doesn't support
>> "ref-cycles", fall back to "cycles".
>> 
>> The downside of this change is that users of "ref-cycles" need to disable
>> nmi_watchdog.
> 
> From the discussion in v1, the users don't have to disable the NMI watchdog
> *permanently*, right?

Right. Users need to disable the NMI watchdog only while using ref-cycles.
For example:

    # disable nmi_watchdog
    sysctl kernel.nmi_watchdog=0

    # use ref-cycles
    perf stat/record -e ref-cycles ...

    # re-enable nmi_watchdog
    sysctl kernel.nmi_watchdog=1
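
A tool that wants to do the same toggling programmatically could write to the
procfs file behind that sysctl. The sketch below is a minimal illustration
under that assumption; set_nmi_watchdog is a hypothetical helper name, and the
path is passed in as a parameter so the helper can be tried against an ordinary
file first.

```c
#include <stdio.h>

/* procfs file backing the kernel.nmi_watchdog sysctl. */
#define NMI_WATCHDOG_PATH "/proc/sys/kernel/nmi_watchdog"

/*
 * Hypothetical helper: write "0" or "1" to the given sysctl file.
 * Returns 0 on success and -1 if the file cannot be opened or written.
 */
static int set_nmi_watchdog(const char *path, int enabled)
{
	FILE *f = fopen(path, "w");
	int ok;

	if (!f)
		return -1;

	ok = fprintf(f, "%d\n", enabled ? 1 : 0) > 0;

	/* Buffered write errors may only show up when the stream is closed. */
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}
```

A caller would invoke set_nmi_watchdog(NMI_WATCHDOG_PATH, 0) before the
ref-cycles measurement and set_nmi_watchdog(NMI_WATCHDOG_PATH, 1) afterwards.
Writing to the real file requires root privileges, just like the sysctl
commands above.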

Thanks,
Song


* Re: [PATCH v2] watchdog: Prefer use "ref-cycles" for NMI watchdog
  2023-05-16 23:58 [PATCH v2] watchdog: Prefer use "ref-cycles" for NMI watchdog Song Liu
  2023-05-17  1:23 ` Li, Xin3
@ 2023-05-17  7:31 ` Peter Zijlstra
  2023-05-17 17:51   ` Song Liu
  1 sibling, 1 reply; 5+ messages in thread
From: Peter Zijlstra @ 2023-05-17  7:31 UTC (permalink / raw)
  To: Song Liu; +Cc: linux-kernel, kernel-team, Andrew Morton

On Tue, May 16, 2023 at 04:58:17PM -0700, Song Liu wrote:
> The NMI watchdog permanently consumes one hardware counter per CPU on the
> system. For systems that use many hardware counters, this causes more
> aggressive time multiplexing of perf events.
> 
> OTOH, some CPUs (mostly Intel) support the "ref-cycles" event, which is
> rarely used. Try to use "ref-cycles" for the watchdog, so that one more
> hardware counter is available to the user. If the CPU doesn't support
> "ref-cycles", fall back to "cycles".
> 
> The downside of this change is that users of "ref-cycles" need to disable
> nmi_watchdog.

I still utterly hate how you hardcode ref-cycles


* Re: [PATCH v2] watchdog: Prefer use "ref-cycles" for NMI watchdog
  2023-05-17  7:31 ` Peter Zijlstra
@ 2023-05-17 17:51   ` Song Liu
  0 siblings, 0 replies; 5+ messages in thread
From: Song Liu @ 2023-05-17 17:51 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Song Liu, linux-kernel@vger.kernel.org, Kernel Team,
	Andrew Morton



> On May 17, 2023, at 12:31 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> 
> On Tue, May 16, 2023 at 04:58:17PM -0700, Song Liu wrote:
>> The NMI watchdog permanently consumes one hardware counter per CPU on the
>> system. For systems that use many hardware counters, this causes more
>> aggressive time multiplexing of perf events.
>> 
>> OTOH, some CPUs (mostly Intel) support the "ref-cycles" event, which is
>> rarely used. Try to use "ref-cycles" for the watchdog, so that one more
>> hardware counter is available to the user. If the CPU doesn't support
>> "ref-cycles", fall back to "cycles".
>> 
>> The downside of this change is that users of "ref-cycles" need to disable
>> nmi_watchdog.
> 
> I still utterly hate how you hardcode ref-cycles

OK, let me try making it configurable with kernel cmdline args. Sending v3.

Thanks,
Song


