From: Oleg Nesterov <oleg@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: mingo@kernel.org, gkohli@codeaurora.org, tglx@linutronix.de,
	mpe@ellerman.id.au, bigeasy@linutronix.de,
	linux-kernel@vger.kernel.org, will.deacon@arm.com
Subject: Re: [PATCH 2/4] watchdog/softlockup: Replace "watchdog/%u" threads with cpu_stop_work
Date: Fri, 8 Jun 2018 15:57:04 +0200
Message-ID: <20180608135703.GD18941@redhat.com>
In-Reply-To: <20180607142405.GO12198@hirez.programming.kicks-ass.net>

On 06/07, Peter Zijlstra wrote:
>
> On Thu, Jun 07, 2018 at 02:33:12PM +0200, Peter Zijlstra wrote:
>
> > +static int softlockup_stop_fn(void *data)
> >  {
> > +	watchdog_disable(smp_processor_id());
> > +	return 0;
> >  }
> >
> > +static void softlockup_stop_all(void)
> >  {
> > +	int cpu;
> > +
> > +	for_each_cpu(cpu, &watchdog_allowed_mask)
> > +		stop_one_cpu(cpu, softlockup_stop_fn, NULL);
> > +
> > +	cpumask_clear(&watchdog_allowed_mask);
> >  }
>
> Bugger, that one doesn't quite work.. watchdog_disable() ends up calling
> a sleeping function. I forgot to enable all the debug cruft when
> testing..

And there is probably another problem. Both watchdog_disable(cpu) and
watchdog_nmi_disable(cpu) assume that cpu == smp_processor_id(); the
argument is simply ignored.

But lockup_detector_offline_cpu(cpu) is called by cpuhp_invoke_callback(),
so in this case watchdog_disable(dying_cpu) is simply wrong.
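
To make the failure mode concrete, here is a minimal user-space sketch (not
kernel code). Everything in it is a hypothetical stand-in: watchdog_timer[]
models a per-CPU variable, current_cpu models smp_processor_id(), and the
model_*() helpers model this_cpu_ptr() and per_cpu_ptr(). It only shows why
a function that takes a cpu argument but uses "this CPU" accessors touches
the wrong slot when called on behalf of another CPU.

```c
#include <assert.h>

#define NR_CPUS 4

/* Hypothetical stand-in for a per-CPU variable: one slot per CPU. */
struct fake_timer {
	int armed;
};
static struct fake_timer watchdog_timer[NR_CPUS];

/* Hypothetical stand-in for smp_processor_id(). */
static int current_cpu;

/* Models this_cpu_ptr(): always returns the caller's own slot. */
static struct fake_timer *model_this_cpu_ptr(void)
{
	return &watchdog_timer[current_cpu];
}

/* Models per_cpu_ptr(): returns the slot of an explicitly named CPU. */
static struct fake_timer *model_per_cpu_ptr(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return &watchdog_timer[cpu];
}

/* Problematic shape: accepts @cpu but silently ignores it. */
static void disable_ignoring_arg(int cpu)
{
	(void)cpu;
	model_this_cpu_ptr()->armed = 0;
}

/* Intended shape: acts on the CPU that was asked for. */
static void disable_honouring_arg(int cpu)
{
	model_per_cpu_ptr(cpu)->armed = 0;
}

/* Test helper: arm the timer on every modelled CPU. */
static void arm_all(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		watchdog_timer[cpu].armed = 1;
}
```

With current_cpu set to 0, calling disable_ignoring_arg(2) after arm_all()
disarms slot 0 and leaves slot 2 armed, while disable_honouring_arg(2)
disarms slot 2 only.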

Maybe we can do something like the patch below? Then softlockup_stop_all() can simply do

	for_each_cpu(cpu, &watchdog_allowed_mask)
		watchdog_disable(cpu);

watchdog_nmi_disable() is __weak, but at first glance arch/sparc/kernel/nmi.c
does everything correctly.
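
As a rough illustration of that simplified loop, the following user-space
sketch (again not kernel code) models watchdog_allowed_mask as a plain
bitmask and the per-CPU timer state as an array. All model_*() names,
allowed_mask and timer_armed[] are hypothetical. It assumes the disable
step is safe to run from any CPU, which is exactly what the patch below is
trying to achieve; it does not model hrtimer cancellation or scheduling
priority.

```c
#include <assert.h>

#define NR_CPUS 8

/* Hypothetical model of watchdog_allowed_mask: bit N set means CPU N. */
static unsigned int allowed_mask;

/* Hypothetical model of per-CPU watchdog state. */
static int timer_armed[NR_CPUS];

/* Arm the modelled watchdog on @cpu and record it in the mask. */
static void model_watchdog_enable(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	timer_armed[cpu] = 1;
	allowed_mask |= 1u << cpu;
}

/* Disable the modelled watchdog on @cpu, regardless of the caller's CPU. */
static void model_watchdog_disable(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	timer_armed[cpu] = 0;
}

/*
 * Models the simplified softlockup_stop_all(): walk the mask, disable each
 * CPU directly (no cross-CPU call), then clear the mask.
 * Returns the number of CPUs that were stopped.
 */
static int model_stop_all(void)
{
	int stopped = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!(allowed_mask & (1u << cpu)))
			continue;
		model_watchdog_disable(cpu);
		stopped++;
	}
	allowed_mask = 0;
	return stopped;
}
```

After enabling, say, CPUs 1 and 5, model_stop_all() disarms both and leaves
the mask empty, so a second call has nothing left to do.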

Oleg.

--- x/kernel/watchdog.c
+++ x/kernel/watchdog.c
@@ -108,13 +108,13 @@ __setup("hardlockup_all_cpu_backtrace=",
  */
 int __weak watchdog_nmi_enable(unsigned int cpu)
 {
-	hardlockup_detector_perf_enable();
+	hardlockup_detector_perf_enable(cpu);
 	return 0;
 }
 
 void __weak watchdog_nmi_disable(unsigned int cpu)
 {
-	hardlockup_detector_perf_disable();
+	hardlockup_detector_perf_disable(cpu);
 }
 
 /* Return 0, if a NMI watchdog is available. Error code otherwise */
@@ -479,7 +479,7 @@ static void watchdog_enable(unsigned int
 
 static void watchdog_disable(unsigned int cpu)
 {
-	struct hrtimer *hrtimer = this_cpu_ptr(&watchdog_hrtimer);
+	struct hrtimer *hrtimer = per_cpu_ptr(&watchdog_hrtimer, cpu);
 
 	watchdog_set_prio(SCHED_NORMAL, 0);
 	/*
--- x/kernel/watchdog_hld.c
+++ x/kernel/watchdog_hld.c
@@ -162,9 +162,8 @@ static void watchdog_overflow_callback(s
 	return;
 }
 
-static int hardlockup_detector_event_create(void)
+static int hardlockup_detector_event_create(int cpu)
 {
-	unsigned int cpu = smp_processor_id();
 	struct perf_event_attr *wd_attr;
 	struct perf_event *evt;
 
@@ -179,37 +178,37 @@ static int hardlockup_detector_event_cre
 			PTR_ERR(evt));
 		return PTR_ERR(evt);
 	}
-	this_cpu_write(watchdog_ev, evt);
+	per_cpu(watchdog_ev, cpu) = evt;
 	return 0;
 }
 
 /**
  * hardlockup_detector_perf_enable - Enable the local event
  */
-void hardlockup_detector_perf_enable(void)
+void hardlockup_detector_perf_enable(int cpu)
 {
-	if (hardlockup_detector_event_create())
+	if (hardlockup_detector_event_create(cpu))
 		return;
 
 	/* use original value for check */
 	if (!atomic_fetch_inc(&watchdog_cpus))
 		pr_info("Enabled. Permanently consumes one hw-PMU counter.\n");
 
-	perf_event_enable(this_cpu_read(watchdog_ev));
+	perf_event_enable(per_cpu(watchdog_ev, cpu));
 }
 
 /**
  * hardlockup_detector_perf_disable - Disable the local event
  */
-void hardlockup_detector_perf_disable(void)
+void hardlockup_detector_perf_disable(int cpu)
 {
-	struct perf_event *event = this_cpu_read(watchdog_ev);
+	struct perf_event *event = per_cpu(watchdog_ev, cpu);
 
 	if (event) {
 		perf_event_disable(event);
-		this_cpu_write(watchdog_ev, NULL);
-		this_cpu_write(dead_event, event);
-		cpumask_set_cpu(smp_processor_id(), &dead_events_mask);
+		per_cpu(watchdog_ev, cpu) = NULL;
+		per_cpu(dead_event, cpu) = event;
+		cpumask_set_cpu(cpu, &dead_events_mask);
 		atomic_dec(&watchdog_cpus);
 	}
 }
