public inbox for linux-trace-kernel@vger.kernel.org
* [v1] trace/hwlat: stop worker if !is_percpu_thread due to hotplug event
@ 2024-02-01  6:24 Andy Chiu
From: Andy Chiu @ 2024-02-01  6:24 UTC (permalink / raw)
  To: rostedt, linux-trace-kernel
  Cc: bristot, mhiramat, mathieu.desnoyers, Andy Chiu

If the work happens to run after the CPU has been hot-plug offlined, it
is no longer running on a percpu thread. Instead, it gets re-queued onto
an UNBOUND workqueue. This triggers a smp_processor_id() warning when
kernel preemption is enabled.

Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
---
 kernel/trace/trace_hwlat.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/trace_hwlat.c b/kernel/trace/trace_hwlat.c
index b791524a6536..87258ddc2141 100644
--- a/kernel/trace/trace_hwlat.c
+++ b/kernel/trace/trace_hwlat.c
@@ -511,7 +511,16 @@ static int start_cpu_kthread(unsigned int cpu)
 static void hwlat_hotplug_workfn(struct work_struct *dummy)
 {
 	struct trace_array *tr = hwlat_trace;
-	unsigned int cpu = smp_processor_id();
+	unsigned int cpu;
+
+	/*
+	 * If the work is scheduled after CPU hotplug offline has been invoked,
+	 * then it will be queued on an UNBOUND workqueue
+	 */
+	if (!is_percpu_thread())
+		return;
+
+	cpu = smp_processor_id();
 
 	mutex_lock(&trace_types_lock);
 	mutex_lock(&hwlat_data.lock);
-- 
2.43.0



* Re: [v1] trace/hwlat: stop worker if !is_percpu_thread due to hotplug event
From: Steven Rostedt @ 2024-02-01 14:57 UTC (permalink / raw)
  To: Andy Chiu; +Cc: linux-trace-kernel, bristot, mhiramat, mathieu.desnoyers

On Thu,  1 Feb 2024 14:24:55 +0800
Andy Chiu <andy.chiu@sifive.com> wrote:

> If the work happens to run after the CPU has been hot-plug offlined, it
> is no longer running on a percpu thread. Instead, it gets re-queued onto
> an UNBOUND workqueue. This triggers a smp_processor_id() warning when
> kernel preemption is enabled.
> 
> Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
> ---
>  kernel/trace/trace_hwlat.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/trace/trace_hwlat.c b/kernel/trace/trace_hwlat.c
> index b791524a6536..87258ddc2141 100644
> --- a/kernel/trace/trace_hwlat.c
> +++ b/kernel/trace/trace_hwlat.c
> @@ -511,7 +511,16 @@ static int start_cpu_kthread(unsigned int cpu)
>  static void hwlat_hotplug_workfn(struct work_struct *dummy)
>  {
>  	struct trace_array *tr = hwlat_trace;
> -	unsigned int cpu = smp_processor_id();
> +	unsigned int cpu;
> +
> +	/*
> +	 * If the work is scheduled after CPU hotplug offline has been invoked,
> +	 * then it will be queued on an UNBOUND workqueue

Is it due to a CPU going online and then right back offline? This
function is called by the online path.

From the queue_work_on() comment:

/**
 * queue_work_on - queue work on specific cpu
 * @cpu: CPU number to execute work on
 * @wq: workqueue to use
 * @work: work to queue
 *
 * We queue the work to a specific CPU, the caller must ensure it
 * can't go away.  Callers that fail to ensure that the specified
 * CPU cannot go away will execute on a randomly chosen CPU.
 * But note well that callers specifying a CPU that never has been
 * online will get a splat.
 *
 * Return: %false if @work was already on a queue, %true otherwise.
 */

So the work gets queued on the CPU but then the CPU immediately goes
offline. I wonder what happens if it comes back online?

The above states that if the work is already queued on a CPU, it won't be
queued again. Hmm, reading the comments, I'm not sure you can run the same
work item on multiple CPUs. It looks like only one will run and all the
others will fail.

I think we need to change this to a single global work item and use a CPU
mask. In the *cpu_init() function, just set the current CPU's bit in the
mask and schedule the work, and that one work function is then responsible
for all CPUs.

Then the work function will call cpus_read_lock() and iterate over the CPU
mask, starting the thread for each bit that is set and clearing the bit
(possibly with __test_and_clear_bit()).
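The mask-plus-single-work scheme above can be sketched in self-contained
userspace C. All *_model names here are hypothetical stand-ins; the kernel
version would use cpumask_set_cpu(), schedule_work(), cpus_read_lock(), and
__test_and_clear_bit() rather than C11 atomics:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS 32

/* One global pending mask shared by all CPUs (stands in for a cpumask). */
static atomic_ulong hwlat_pending_mask;
static bool kthread_started[NR_CPUS];

/* Stand-in for start_cpu_kthread(): just record that it ran. */
static void start_cpu_kthread_model(int cpu)
{
	kthread_started[cpu] = true;
}

/* Model of the per-CPU *cpu_init() hook: set this CPU's bit in the mask;
 * in the kernel this would be followed by scheduling the one global work. */
static void hwlat_cpu_init_model(int cpu)
{
	atomic_fetch_or(&hwlat_pending_mask, 1UL << cpu);
}

/* Model of the single work function: test-and-clear each pending bit and
 * start that CPU's kthread; the kernel version would run under
 * cpus_read_lock() to keep CPUs from vanishing mid-iteration. */
static void hwlat_hotplug_workfn_model(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		unsigned long bit = 1UL << cpu;

		/* fetch_and(~bit) plays the role of __test_and_clear_bit() */
		if (atomic_fetch_and(&hwlat_pending_mask, ~bit) & bit)
			start_cpu_kthread_model(cpu);
	}
}
```

Since only the one global work item is ever scheduled, it no longer matters
which CPU (or unbound worker) the work function runs on; it just drains
whatever bits are set when it gets to run.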

I don't think schedule_work_on() is doing what we think it should be
doing here.


-- Steve


> +	 */
> +	if (!is_percpu_thread())
> +		return;
> +
> +	cpu = smp_processor_id();
>  
>  	mutex_lock(&trace_types_lock);
>  	mutex_lock(&hwlat_data.lock);


