* [PATCH v3 00/22] kthread: Use kthread worker API more widely
@ 2015-11-18 13:25 Petr Mladek
2015-11-18 13:25 ` [PATCH v3 21/22] thermal/intel_powerclamp: Remove duplicated code that starts the kthread Petr Mladek
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Petr Mladek @ 2015-11-18 13:25 UTC (permalink / raw)
To: Andrew Morton, Oleg Nesterov, Tejun Heo, Ingo Molnar,
Peter Zijlstra
Cc: Steven Rostedt, Paul E. McKenney, Josh Triplett, Thomas Gleixner,
Linus Torvalds, Jiri Kosina, Borislav Petkov, Michal Hocko,
linux-mm, Vlastimil Babka, linux-api, linux-kernel, Petr Mladek,
Catalin Marinas, linux-watchdog, Corey Minyard,
openipmi-developer, Doug Ledford, Sean Hefty, Hal Rosenstock,
linux-rdma, Maxim Levitsky, Zhang Rui, Eduardo Valentin,
Jacob Pan <jacob.j>
My intention is to make it easier to manipulate and maintain kthreads.
In particular, I want to replace all the custom main loops with a
generic one. I also want to make the kthreads sleep in a consistent
state, in a common place, when there is no work.
My first attempt was with a brand new API (iterant kthread), see
http://thread.gmane.org/gmane.linux.kernel.api/11892 . But I was
directed to improve the existing kthread worker API. This is
the 3rd iteration of the new direction.
1st patch: add support to check if a timer callback is being called
2nd..12th patches: improve the existing kthread worker API
13th..18th, 20th, 22nd patches: convert several kthreads into
the kthread worker API, namely: khugepaged, ring buffer
benchmark, hung_task, kmemleak, ipmi, IB/fmr_pool,
memstick/r592, intel_powerclamp
19th, 21st patches: do some preparation steps; they usually do
some cleanup that makes sense even without the conversion.
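To give a feel for the kind of conversion done in these patches, here is a rough sketch (kernel-style C, not compilable standalone; do_stuff(), my_worker, and my_dwork are hypothetical names, and the queue_delayed_kthread_work() naming follows this series):

```c
/* Before: a custom main loop; do_stuff() is a hypothetical payload. */
static int my_kthread_fn(void *arg)
{
	set_freezable();
	while (!kthread_should_stop()) {
		try_to_freeze();
		do_stuff();
		schedule_timeout_interruptible(HZ);
	}
	return 0;
}

/* After: only the payload remains. The generic worker loop provides
 * the sleeping, freezing, and stopping logic in one common place.
 */
static void my_work_fn(struct kthread_work *work)
{
	do_stuff();
	/* re-arm itself; the requeuing stops once the work is canceled */
	queue_delayed_kthread_work(my_worker, &my_dwork, HZ);
}
```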
Changes against v2:
+ used worker->lock to synchronize the operations with the work
instead of the PENDING bit as suggested by Tejun Heo; it simplified
the implementation in several ways
+ added timer_active(); used it together with del_timer_sync()
to cancel the work in a less tricky way
+ removed the controversial conversion of the RCU kthreads
+ added several other examples: hung_task, kmemleak, ipmi,
IB/fmr_pool, memstick/r592, intel_powerclamp
+ the helper fixes for the ring buffer benchmark have been improved
as suggested by Steven; they are already in Linus' tree now
+ fixed a possible race between the check for existing khugepaged
worker and queuing the work
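As a rough sketch of the less tricky canceling mentioned in the v2 changes above (simplified; timer_active() is the helper proposed in the 1st patch, and the surrounding worker->lock handling is omitted):

```c
/*
 * Sketch: canceling a delayed kthread work (locking simplified).
 * timer_active() reports a timer that is pending or whose callback
 * has not finished yet, so del_timer_sync() waits out the callback
 * that would queue the work.
 */
if (timer_active(&dwork->timer))
	del_timer_sync(&dwork->timer);
/* Now the work is either queued or idle; it can be removed
 * safely under worker->lock.
 */
```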
Changes against v1:
+ remove wrappers to manipulate the scheduling policy and priority
+ remove questionable wakeup_and_destroy_kthread_worker() variant
+ do not check for chained work when draining the queue
+ allocate struct kthread_worker in create_kthread_worker() and
use simpler checks for a running worker
+ add support for delayed kthread works and use them instead
of waiting inside the works
+ rework the "unrelated" fixes for the ring buffer benchmark
as discussed in the 1st RFC; also sent separately
+ convert also the consumer in the ring buffer benchmark
I have tested this patch set against the stable Linus tree
for 4.4-rc1.
Petr Mladek (22):
timer: Allow to check when the timer callback has not finished yet
kthread/smpboot: Do not park in kthread_create_on_cpu()
kthread: Allow to call __kthread_create_on_node() with va_list args
kthread: Add create_kthread_worker*()
kthread: Add drain_kthread_worker()
kthread: Add destroy_kthread_worker()
kthread: Detect when a kthread work is used by more workers
kthread: Initial support for delayed kthread work
kthread: Allow to cancel kthread work
kthread: Allow to modify delayed kthread work
kthread: Better support freezable kthread workers
kthread: Use try_lock_kthread_work() in flush_kthread_work()
mm/huge_page: Convert khugepaged() into kthread worker API
ring_buffer: Convert benchmark kthreads into kthread worker API
hung_task: Convert hungtaskd into kthread worker API
kmemleak: Convert kmemleak kthread into kthread worker API
ipmi: Convert kipmi kthread into kthread worker API
IB/fmr_pool: Convert the cleanup thread into kthread worker API
memstick/r592: Better synchronize debug messages in r592_io kthread
memstick/r592: convert r592_io kthread into kthread worker API
thermal/intel_powerclamp: Remove duplicated code that starts the
kthread
thermal/intel_powerclamp: Convert the kthread to kthread worker API
drivers/char/ipmi/ipmi_si_intf.c | 116 ++++---
drivers/infiniband/core/fmr_pool.c | 54 ++-
drivers/memstick/host/r592.c | 61 ++--
drivers/memstick/host/r592.h | 5 +-
drivers/thermal/intel_powerclamp.c | 302 +++++++++--------
include/linux/kthread.h | 56 ++++
include/linux/timer.h | 2 +
kernel/hung_task.c | 41 ++-
kernel/kthread.c | 618 +++++++++++++++++++++++++++++++----
kernel/smpboot.c | 5 +
kernel/time/timer.c | 24 ++
kernel/trace/ring_buffer_benchmark.c | 133 ++++----
mm/huge_memory.c | 134 ++++----
mm/kmemleak.c | 86 +++--
14 files changed, 1142 insertions(+), 495 deletions(-)
CC: Catalin Marinas <catalin.marinas@arm.com>
CC: linux-watchdog@vger.kernel.org
CC: Corey Minyard <minyard@acm.org>
CC: openipmi-developer@lists.sourceforge.net
CC: Doug Ledford <dledford@redhat.com>
CC: Sean Hefty <sean.hefty@intel.com>
CC: Hal Rosenstock <hal.rosenstock@gmail.com>
CC: linux-rdma@vger.kernel.org
CC: Maxim Levitsky <maximlevitsky@gmail.com>
CC: Zhang Rui <rui.zhang@intel.com>
CC: Eduardo Valentin <edubezval@gmail.com>
CC: Jacob Pan <jacob.jun.pan@linux.intel.com>
CC: linux-pm@vger.kernel.org
--
1.8.5.6
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org
* [PATCH v3 21/22] thermal/intel_powerclamp: Remove duplicated code that starts the kthread
2015-11-18 13:25 [PATCH v3 00/22] kthread: Use kthread worker API more widely Petr Mladek
@ 2015-11-18 13:25 ` Petr Mladek
2015-11-18 13:25 ` [PATCH v3 22/22] thermal/intel_powerclamp: Convert the kthread to kthread worker API Petr Mladek
2015-11-18 14:25 ` [PATCH v3 00/22] kthread: Use kthread worker API more widely Paul E. McKenney
2 siblings, 0 replies; 12+ messages in thread
From: Petr Mladek @ 2015-11-18 13:25 UTC (permalink / raw)
To: Andrew Morton, Oleg Nesterov, Tejun Heo, Ingo Molnar,
Peter Zijlstra
Cc: Steven Rostedt, Paul E. McKenney, Josh Triplett, Thomas Gleixner,
Linus Torvalds, Jiri Kosina, Borislav Petkov, Michal Hocko,
linux-mm, Vlastimil Babka, linux-api, linux-kernel, Petr Mladek,
Zhang Rui, Eduardo Valentin, Jacob Pan, linux-pm
This patch removes duplicated code. It does not modify
the functionality.
Signed-off-by: Petr Mladek <pmladek@suse.com>
CC: Zhang Rui <rui.zhang@intel.com>
CC: Eduardo Valentin <edubezval@gmail.com>
CC: Jacob Pan <jacob.jun.pan@linux.intel.com>
CC: linux-pm@vger.kernel.org
---
drivers/thermal/intel_powerclamp.c | 45 +++++++++++++++++---------------------
1 file changed, 20 insertions(+), 25 deletions(-)
diff --git a/drivers/thermal/intel_powerclamp.c b/drivers/thermal/intel_powerclamp.c
index 6c79588251d5..cb32c38f9828 100644
--- a/drivers/thermal/intel_powerclamp.c
+++ b/drivers/thermal/intel_powerclamp.c
@@ -505,10 +505,27 @@ static void poll_pkg_cstate(struct work_struct *dummy)
schedule_delayed_work(&poll_pkg_cstate_work, HZ);
}
+static void start_power_clamp_thread(unsigned long cpu)
+{
+ struct task_struct **p = per_cpu_ptr(powerclamp_thread, cpu);
+ struct task_struct *thread;
+
+ thread = kthread_create_on_node(clamp_thread,
+ (void *) cpu,
+ cpu_to_node(cpu),
+ "kidle_inject/%ld", cpu);
+ if (IS_ERR(thread))
+ return;
+
+ /* bind to cpu here */
+ kthread_bind(thread, cpu);
+ wake_up_process(thread);
+ *p = thread;
+}
+
static int start_power_clamp(void)
{
unsigned long cpu;
- struct task_struct *thread;
/* check if pkg cstate counter is completely 0, abort in this case */
if (!has_pkg_state_counter()) {
@@ -530,20 +547,7 @@ static int start_power_clamp(void)
/* start one thread per online cpu */
for_each_online_cpu(cpu) {
- struct task_struct **p =
- per_cpu_ptr(powerclamp_thread, cpu);
-
- thread = kthread_create_on_node(clamp_thread,
- (void *) cpu,
- cpu_to_node(cpu),
- "kidle_inject/%ld", cpu);
- /* bind to cpu here */
- if (likely(!IS_ERR(thread))) {
- kthread_bind(thread, cpu);
- wake_up_process(thread);
- *p = thread;
- }
-
+ start_power_clamp_thread(cpu);
}
put_online_cpus();
@@ -575,7 +579,6 @@ static int powerclamp_cpu_callback(struct notifier_block *nfb,
unsigned long action, void *hcpu)
{
unsigned long cpu = (unsigned long)hcpu;
- struct task_struct *thread;
struct task_struct **percpu_thread =
per_cpu_ptr(powerclamp_thread, cpu);
@@ -584,15 +587,7 @@ static int powerclamp_cpu_callback(struct notifier_block *nfb,
switch (action) {
case CPU_ONLINE:
- thread = kthread_create_on_node(clamp_thread,
- (void *) cpu,
- cpu_to_node(cpu),
- "kidle_inject/%lu", cpu);
- if (likely(!IS_ERR(thread))) {
- kthread_bind(thread, cpu);
- wake_up_process(thread);
- *percpu_thread = thread;
- }
+ start_power_clamp_thread(cpu);
/* prefer BSP as controlling CPU */
if (cpu == 0) {
control_cpu = 0;
--
1.8.5.6
* [PATCH v3 22/22] thermal/intel_powerclamp: Convert the kthread to kthread worker API
2015-11-18 13:25 [PATCH v3 00/22] kthread: Use kthread worker API more widely Petr Mladek
2015-11-18 13:25 ` [PATCH v3 21/22] thermal/intel_powerclamp: Remove duplicated code that starts the kthread Petr Mladek
@ 2015-11-18 13:25 ` Petr Mladek
2016-01-07 19:55 ` Jacob Pan
2015-11-18 14:25 ` [PATCH v3 00/22] kthread: Use kthread worker API more widely Paul E. McKenney
2 siblings, 1 reply; 12+ messages in thread
From: Petr Mladek @ 2015-11-18 13:25 UTC (permalink / raw)
To: Andrew Morton, Oleg Nesterov, Tejun Heo, Ingo Molnar,
Peter Zijlstra
Cc: Steven Rostedt, Paul E. McKenney, Josh Triplett, Thomas Gleixner,
Linus Torvalds, Jiri Kosina, Borislav Petkov, Michal Hocko,
linux-mm, Vlastimil Babka, linux-api, linux-kernel, Petr Mladek,
Zhang Rui, Eduardo Valentin, Jacob Pan, linux-pm
Kthreads are currently implemented as an infinite loop. Each
has its own variant of checks for terminating, freezing, and
awakening. In many cases it is hard to say which state a kthread
is in, and sometimes it is handled the wrong way.
The plan is to convert kthreads to the kthread_worker or workqueue
API. This allows splitting the functionality into separate operations
and gives the code a better structure. It also defines a clean state
where no locks are taken and no IRQs are blocked, so the kthread
might sleep or even be safely migrated.
The kthread worker API is useful when we want a dedicated
single thread for the work. It helps to make sure that the thread
is available when needed. It also allows better control, e.g.
defining a scheduling priority.
This patch converts the intel powerclamp kthreads to the kthread
worker API because they need good control over the assigned
CPUs.
IMHO, the most natural way is to split one cycle into two works.
The first one does some balancing and lets the CPU work normally
for some time. The second work checks what the CPU has done
in the meantime and puts it into a C-state to reach the required
idle time ratio. The delay between the two works is achieved
by a delayed kthread work.
The two works have to share some data that used to be local
variables of the single kthread function. This is achieved
by the new per-CPU struct powerclamp_worker_data. It might look
like a complication. On the other hand, the long original kthread
function was not nice either.
The patch tries to avoid extra init and cleanup works. All
these actions can be done outside the thread; they are moved
to the functions that create or destroy the worker. In particular,
I checked that the timers are assigned to the right CPU.
The two works queue each other. This makes it a bit tricky to
break the cycle when we want to stop the worker. We use the global
and per-worker "clamping" variables to make sure that the re-queuing
eventually stops. We also cancel the works to make this faster.
Note that the canceling is not reliable, because the handling
of the two variables and the queuing are not synchronized via
a lock. But that is not a big deal, because it is just an
optimization; the job is stopped faster than before in most cases.
Signed-off-by: Petr Mladek <pmladek@suse.com>
CC: Zhang Rui <rui.zhang@intel.com>
CC: Eduardo Valentin <edubezval@gmail.com>
CC: Jacob Pan <jacob.jun.pan@linux.intel.com>
CC: linux-pm@vger.kernel.org
---
drivers/thermal/intel_powerclamp.c | 287 ++++++++++++++++++++++---------------
1 file changed, 168 insertions(+), 119 deletions(-)
diff --git a/drivers/thermal/intel_powerclamp.c b/drivers/thermal/intel_powerclamp.c
index cb32c38f9828..58ea1862d412 100644
--- a/drivers/thermal/intel_powerclamp.c
+++ b/drivers/thermal/intel_powerclamp.c
@@ -86,11 +86,27 @@ static unsigned int control_cpu; /* The cpu assigned to collect stat and update
*/
static bool clamping;
+static const struct sched_param sparam = {
+ .sched_priority = MAX_USER_RT_PRIO / 2,
+};
+struct powerclamp_worker_data {
+ struct kthread_worker *worker;
+ struct kthread_work balancing_work;
+ struct delayed_kthread_work idle_injection_work;
+ struct timer_list wakeup_timer;
+ unsigned int cpu;
+ unsigned int count;
+ unsigned int guard;
+ unsigned int window_size_now;
+ unsigned int target_ratio;
+ unsigned int duration_jiffies;
+ bool clamping;
+};
-static struct task_struct * __percpu *powerclamp_thread;
+static struct powerclamp_worker_data * __percpu worker_data;
static struct thermal_cooling_device *cooling_dev;
static unsigned long *cpu_clamping_mask; /* bit map for tracking per cpu
- * clamping thread
+ * clamping kthread worker
*/
static unsigned int duration;
@@ -368,100 +384,102 @@ static bool powerclamp_adjust_controls(unsigned int target_ratio,
return set_target_ratio + guard <= current_ratio;
}
-static int clamp_thread(void *arg)
+static void clamp_balancing_func(struct kthread_work *work)
{
- int cpunr = (unsigned long)arg;
- DEFINE_TIMER(wakeup_timer, noop_timer, 0, 0);
- static const struct sched_param param = {
- .sched_priority = MAX_USER_RT_PRIO/2,
- };
- unsigned int count = 0;
- unsigned int target_ratio;
+ struct powerclamp_worker_data *w_data;
+ int sleeptime;
+ unsigned long target_jiffies;
+ unsigned int compensation;
+ int interval; /* jiffies to sleep for each attempt */
- set_bit(cpunr, cpu_clamping_mask);
- set_freezable();
- init_timer_on_stack(&wakeup_timer);
- sched_setscheduler(current, SCHED_FIFO, ¶m);
-
- while (true == clamping && !kthread_should_stop() &&
- cpu_online(cpunr)) {
- int sleeptime;
- unsigned long target_jiffies;
- unsigned int guard;
- unsigned int compensation = 0;
- int interval; /* jiffies to sleep for each attempt */
- unsigned int duration_jiffies = msecs_to_jiffies(duration);
- unsigned int window_size_now;
-
- try_to_freeze();
- /*
- * make sure user selected ratio does not take effect until
- * the next round. adjust target_ratio if user has changed
- * target such that we can converge quickly.
- */
- target_ratio = set_target_ratio;
- guard = 1 + target_ratio/20;
- window_size_now = window_size;
- count++;
+ w_data = container_of(work, struct powerclamp_worker_data,
+ balancing_work);
- /*
- * systems may have different ability to enter package level
- * c-states, thus we need to compensate the injected idle ratio
- * to achieve the actual target reported by the HW.
- */
- compensation = get_compensation(target_ratio);
- interval = duration_jiffies*100/(target_ratio+compensation);
-
- /* align idle time */
- target_jiffies = roundup(jiffies, interval);
- sleeptime = target_jiffies - jiffies;
- if (sleeptime <= 0)
- sleeptime = 1;
- schedule_timeout_interruptible(sleeptime);
- /*
- * only elected controlling cpu can collect stats and update
- * control parameters.
- */
- if (cpunr == control_cpu && !(count%window_size_now)) {
- should_skip =
- powerclamp_adjust_controls(target_ratio,
- guard, window_size_now);
- smp_mb();
- }
+ /*
+ * make sure user selected ratio does not take effect until
+ * the next round. adjust target_ratio if user has changed
+ * target such that we can converge quickly.
+ */
+ w_data->target_ratio = READ_ONCE(set_target_ratio);
+ w_data->guard = 1 + w_data->target_ratio / 20;
+ w_data->window_size_now = window_size;
+ w_data->duration_jiffies = msecs_to_jiffies(duration);
+ w_data->count++;
+
+ /*
+ * systems may have different ability to enter package level
+ * c-states, thus we need to compensate the injected idle ratio
+ * to achieve the actual target reported by the HW.
+ */
+ compensation = get_compensation(w_data->target_ratio);
+ interval = w_data->duration_jiffies * 100 /
+ (w_data->target_ratio + compensation);
+
+ /* align idle time */
+ target_jiffies = roundup(jiffies, interval);
+ sleeptime = target_jiffies - jiffies;
+ if (sleeptime <= 0)
+ sleeptime = 1;
+
+ if (clamping && w_data->clamping && cpu_online(w_data->cpu))
+ queue_delayed_kthread_work(w_data->worker,
+ &w_data->idle_injection_work,
+ sleeptime);
+}
+
+static void clamp_idle_injection_func(struct kthread_work *work)
+{
+ struct powerclamp_worker_data *w_data;
+ unsigned long target_jiffies;
+
+ w_data = container_of(work, struct powerclamp_worker_data,
+ idle_injection_work.work);
+
+ /*
+ * only elected controlling cpu can collect stats and update
+ * control parameters.
+ */
+ if (w_data->cpu == control_cpu &&
+ !(w_data->count % w_data->window_size_now)) {
+ should_skip =
+ powerclamp_adjust_controls(w_data->target_ratio,
+ w_data->guard,
+ w_data->window_size_now);
+ smp_mb();
+ }
- if (should_skip)
- continue;
+ if (should_skip)
+ goto balance;
+
+ target_jiffies = jiffies + w_data->duration_jiffies;
+ mod_timer(&w_data->wakeup_timer, target_jiffies);
+ if (unlikely(local_softirq_pending()))
+ goto balance;
+ /*
+ * stop tick sched during idle time, interrupts are still
+ * allowed. thus jiffies are updated properly.
+ */
+ preempt_disable();
+ /* mwait until target jiffies is reached */
+ while (time_before(jiffies, target_jiffies)) {
+ unsigned long ecx = 1;
+ unsigned long eax = target_mwait;
- target_jiffies = jiffies + duration_jiffies;
- mod_timer(&wakeup_timer, target_jiffies);
- if (unlikely(local_softirq_pending()))
- continue;
/*
- * stop tick sched during idle time, interrupts are still
- * allowed. thus jiffies are updated properly.
+ * REVISIT: may call enter_idle() to notify drivers who
+ * can save power during cpu idle. same for exit_idle()
*/
- preempt_disable();
- /* mwait until target jiffies is reached */
- while (time_before(jiffies, target_jiffies)) {
- unsigned long ecx = 1;
- unsigned long eax = target_mwait;
-
- /*
- * REVISIT: may call enter_idle() to notify drivers who
- * can save power during cpu idle. same for exit_idle()
- */
- local_touch_nmi();
- stop_critical_timings();
- mwait_idle_with_hints(eax, ecx);
- start_critical_timings();
- atomic_inc(&idle_wakeup_counter);
- }
- preempt_enable();
+ local_touch_nmi();
+ stop_critical_timings();
+ mwait_idle_with_hints(eax, ecx);
+ start_critical_timings();
+ atomic_inc(&idle_wakeup_counter);
}
- del_timer_sync(&wakeup_timer);
- clear_bit(cpunr, cpu_clamping_mask);
+ preempt_enable();
- return 0;
+balance:
+ if (clamping && w_data->clamping && cpu_online(w_data->cpu))
+ queue_kthread_work(w_data->worker, &w_data->balancing_work);
}
/*
@@ -505,22 +523,58 @@ static void poll_pkg_cstate(struct work_struct *dummy)
schedule_delayed_work(&poll_pkg_cstate_work, HZ);
}
-static void start_power_clamp_thread(unsigned long cpu)
+static void start_power_clamp_worker(unsigned long cpu)
{
- struct task_struct **p = per_cpu_ptr(powerclamp_thread, cpu);
- struct task_struct *thread;
-
- thread = kthread_create_on_node(clamp_thread,
- (void *) cpu,
- cpu_to_node(cpu),
- "kidle_inject/%ld", cpu);
- if (IS_ERR(thread))
+ struct powerclamp_worker_data *w_data = per_cpu_ptr(worker_data, cpu);
+ struct kthread_worker *worker;
+
+ worker = create_kthread_worker_on_cpu(KTW_FREEZABLE, cpu,
+ "kidle_inject/%ld");
+ if (IS_ERR(worker))
return;
- /* bind to cpu here */
- kthread_bind(thread, cpu);
- wake_up_process(thread);
- *p = thread;
+ w_data->worker = worker;
+ w_data->count = 0;
+ w_data->cpu = cpu;
+ w_data->clamping = true;
+ set_bit(cpu, cpu_clamping_mask);
+ setup_timer(&w_data->wakeup_timer, noop_timer, 0);
+ sched_setscheduler(worker->task, SCHED_FIFO, &sparam);
+ init_kthread_work(&w_data->balancing_work, clamp_balancing_func);
+ init_delayed_kthread_work(&w_data->idle_injection_work,
+ clamp_idle_injection_func);
+ queue_kthread_work(w_data->worker, &w_data->balancing_work);
+}
+
+static void stop_power_clamp_worker(unsigned long cpu)
+{
+ struct powerclamp_worker_data *w_data = per_cpu_ptr(worker_data, cpu);
+
+ if (!w_data->worker)
+ return;
+
+ w_data->clamping = false;
+ /*
+ * Make sure that all works that get queued after this point see
+ * the clamping disabled. The counterpart is not needed because
+ * there is an implicit memory barrier when the queued work
+ * is processed.
+ */
+ smp_wmb();
+ cancel_kthread_work_sync(&w_data->balancing_work);
+ cancel_delayed_kthread_work_sync(&w_data->idle_injection_work);
+ /*
+ * The balancing work still might be queued here because
+ * the handling of the "clamping" variable, cancel, and queue
+ * operations are not synchronized via a lock. But it is not
+ * a big deal. The balancing work is fast and destroy_kthread_worker()
+ * will wait for it.
+ */
+ del_timer_sync(&w_data->wakeup_timer);
+ clear_bit(w_data->cpu, cpu_clamping_mask);
+ destroy_kthread_worker(w_data->worker);
+
+ w_data->worker = NULL;
}
static int start_power_clamp(void)
@@ -545,9 +599,9 @@ static int start_power_clamp(void)
clamping = true;
schedule_delayed_work(&poll_pkg_cstate_work, 0);
- /* start one thread per online cpu */
+ /* start one kthread worker per online cpu */
for_each_online_cpu(cpu) {
- start_power_clamp_thread(cpu);
+ start_power_clamp_worker(cpu);
}
put_online_cpus();
@@ -557,20 +611,17 @@ static int start_power_clamp(void)
static void end_power_clamp(void)
{
int i;
- struct task_struct *thread;
- clamping = false;
/*
- * make clamping visible to other cpus and give per cpu clamping threads
- * sometime to exit, or gets killed later.
+ * Block requeuing in all the kthread workers. They will drain and
+ * stop faster.
*/
- smp_mb();
- msleep(20);
+ clamping = false;
if (bitmap_weight(cpu_clamping_mask, num_possible_cpus())) {
for_each_set_bit(i, cpu_clamping_mask, num_possible_cpus()) {
- pr_debug("clamping thread for cpu %d alive, kill\n", i);
- thread = *per_cpu_ptr(powerclamp_thread, i);
- kthread_stop(thread);
+ pr_debug("clamping worker for cpu %d alive, destroy\n",
+ i);
+ stop_power_clamp_worker(i);
}
}
}
@@ -579,15 +630,13 @@ static int powerclamp_cpu_callback(struct notifier_block *nfb,
unsigned long action, void *hcpu)
{
unsigned long cpu = (unsigned long)hcpu;
- struct task_struct **percpu_thread =
- per_cpu_ptr(powerclamp_thread, cpu);
if (false == clamping)
goto exit_ok;
switch (action) {
case CPU_ONLINE:
- start_power_clamp_thread(cpu);
+ start_power_clamp_worker(cpu);
/* prefer BSP as controlling CPU */
if (cpu == 0) {
control_cpu = 0;
@@ -598,7 +647,7 @@ static int powerclamp_cpu_callback(struct notifier_block *nfb,
if (test_bit(cpu, cpu_clamping_mask)) {
pr_err("cpu %lu dead but powerclamping thread is not\n",
cpu);
- kthread_stop(*percpu_thread);
+ stop_power_clamp_worker(cpu);
}
if (cpu == control_cpu) {
control_cpu = smp_processor_id();
@@ -785,8 +834,8 @@ static int __init powerclamp_init(void)
window_size = 2;
register_hotcpu_notifier(&powerclamp_cpu_notifier);
- powerclamp_thread = alloc_percpu(struct task_struct *);
- if (!powerclamp_thread) {
+ worker_data = alloc_percpu(struct powerclamp_worker_data);
+ if (!worker_data) {
retval = -ENOMEM;
goto exit_unregister;
}
@@ -806,7 +855,7 @@ static int __init powerclamp_init(void)
return 0;
exit_free_thread:
- free_percpu(powerclamp_thread);
+ free_percpu(worker_data);
exit_unregister:
unregister_hotcpu_notifier(&powerclamp_cpu_notifier);
exit_free:
@@ -819,7 +868,7 @@ static void __exit powerclamp_exit(void)
{
unregister_hotcpu_notifier(&powerclamp_cpu_notifier);
end_power_clamp();
- free_percpu(powerclamp_thread);
+ free_percpu(worker_data);
thermal_cooling_device_unregister(cooling_dev);
kfree(cpu_clamping_mask);
--
1.8.5.6
* Re: [PATCH v3 00/22] kthread: Use kthread worker API more widely
2015-11-18 13:25 [PATCH v3 00/22] kthread: Use kthread worker API more widely Petr Mladek
2015-11-18 13:25 ` [PATCH v3 21/22] thermal/intel_powerclamp: Remove duplicated code that starts the kthread Petr Mladek
2015-11-18 13:25 ` [PATCH v3 22/22] thermal/intel_powerclamp: Convert the kthread to kthread worker API Petr Mladek
@ 2015-11-18 14:25 ` Paul E. McKenney
2 siblings, 0 replies; 12+ messages in thread
From: Paul E. McKenney @ 2015-11-18 14:25 UTC (permalink / raw)
To: Petr Mladek
Cc: Andrew Morton, Oleg Nesterov, Tejun Heo, Ingo Molnar,
Peter Zijlstra, Steven Rostedt, Josh Triplett, Thomas Gleixner,
Linus Torvalds, Jiri Kosina, Borislav Petkov, Michal Hocko,
linux-mm, Vlastimil Babka, linux-api, linux-kernel,
Catalin Marinas, linux-watchdog, Corey Minyard,
openipmi-developer, Doug Ledford, Sean Hefty, Hal
On Wed, Nov 18, 2015 at 02:25:05PM +0100, Petr Mladek wrote:
> My intention is to make it easier to manipulate and maintain kthreads.
> Especially, I want to replace all the custom main cycles with a
> generic one. Also I want to make the kthreads sleep in a consistent
> state in a common place when there is no work.
>
> My first attempt was with a brand new API (iterant kthread), see
> http://thread.gmane.org/gmane.linux.kernel.api/11892 . But I was
> directed to improve the existing kthread worker API. This is
> the 3rd iteration of the new direction.
>
>
> 1st patch: add support to check if a timer callback is being called
>
> 2nd..12th patches: improve the existing kthread worker API
>
> 13th..18th, 20th, 22nd patches: convert several kthreads into
> the kthread worker API, namely: khugepaged, ring buffer
> benchmark, hung_task, kmemleak, ipmi, IB/fmr_pool,
> memstick/r592, intel_powerclamp
>
> 21st, 23rd patches: do some preparation steps; they usually do
> some clean up that makes sense even without the conversion.
>
>
> Changes against v2:
>
> + used worker->lock to synchronize the operations with the work
> instead of the PENDING bit as suggested by Tejun Heo; it simplified
> the implementation in several ways
>
> + added timer_active(); used it together with del_timer_sync()
> to cancel the work a less tricky way
>
> + removed the controversial conversion of the RCU kthreads
Thank you! ;-)
Thanx, Paul
* Re: [PATCH v3 22/22] thermal/intel_powerclamp: Convert the kthread to kthread worker API
2015-11-18 13:25 ` [PATCH v3 22/22] thermal/intel_powerclamp: Convert the kthread to kthread worker API Petr Mladek
@ 2016-01-07 19:55 ` Jacob Pan
2016-01-08 16:49 ` Petr Mladek
0 siblings, 1 reply; 12+ messages in thread
From: Jacob Pan @ 2016-01-07 19:55 UTC (permalink / raw)
To: Petr Mladek
Cc: Andrew Morton, Oleg Nesterov, Tejun Heo, Ingo Molnar,
Peter Zijlstra, Steven Rostedt, Paul E. McKenney, Josh Triplett,
Thomas Gleixner, Linus Torvalds, Jiri Kosina, Borislav Petkov,
Michal Hocko, linux-mm, Vlastimil Babka, linux-api, linux-kernel,
Zhang Rui, Eduardo Valentin, linux-pm, jacob.jun.pan
On Wed, 18 Nov 2015 14:25:27 +0100
Petr Mladek <pmladek@suse.com> wrote:
>
> Kthreads are currently implemented as an infinite loop. Each
> has its own variant of checks for terminating, freezing,
> awakening. In many cases it is unclear to say in which state
> it is and sometimes it is done a wrong way.
>
> The plan is to convert kthreads into kthread_worker or workqueues
> API. It allows to split the functionality into separate operations.
> It helps to make a better structure. Also it defines a clean state
> where no locks are taken, IRQs blocked, the kthread might sleep
> or even be safely migrated.
>
> The kthread worker API is useful when we want to have a dedicated
> single thread for the work. It helps to make sure that it is
> available when needed. Also it allows a better control, e.g.
> define a scheduling priority.
>
> This patch converts the intel powerclamp kthreads into the kthread
> worker because they need to have a good control over the assigned
> CPUs.
>
I have tested this patchset and found no obvious issues in terms of
functionality, power, and performance. Tested CPU online/offline,
suspend/resume, freeze, etc.
Power numbers are comparable too, e.g. on an IVB 8C system: inject idle
from 5 to 50% and read package power while running a CPU-bound workload.
Before:
IdlePct Perf RAPL WallPower
5 256.28 16.50 0.0
10 248.86 15.64 0.0
15 209.01 14.57 0.0
20 176.17 13.88 0.0
25 161.25 13.37 0.0
30 165.62 13.38 0.0
35 150.94 12.89 0.0
40 137.45 12.47 0.0
45 123.80 11.83 0.0
50 137.59 11.80 0.0
After:
(deb_chroot)root@ubuntu-jp-nfs:~/powercap-power# ./test.py -c 5
IdlePct Perf RAPL WallPower
5 266.30 16.34 0.0
10 226.32 15.27 0.0
15 195.52 14.29 0.0
20 200.96 13.98 0.0
25 174.77 13.08 0.0
30 162.05 13.04 0.0
35 166.70 12.90 0.0
40 134.78 12.12 0.0
45 128.08 11.70 0.0
50 117.74 11.74 0.0
> IMHO, the most natural way is to split one cycle into two works.
> First one does some balancing and let the CPU work normal
> way for some time. The second work checks what the CPU has done
> in the meantime and put it into C-state to reach the required
> idle time ratio. The delay between the two works is achieved
> by the delayed kthread work.
>
> The two works have to share some data that used to be local
> variables of the single kthread function. This is achieved
> by the new per-CPU struct kthread_worker_data. It might look
> as a complication. On the other hand, the long original kthread
> function was not nice either.
>
> The patch tries to avoid extra init and cleanup works. All the
> actions might be done outside the thread. They are moved
> to the functions that create or destroy the worker. Especially,
> I checked that the timers are assigned to the right CPU.
>
> The two works are queuing each other. It makes it a bit tricky to
> break it when we want to stop the worker. We use the global and
> per-worker "clamping" variables to make sure that the re-queuing
> eventually stops. We also cancel the works to make it faster.
> Note that the canceling is not reliable because the handling
> of the two variables and queuing is not synchronized via a lock.
> But it is not a big deal because it is just an optimization.
> The job is stopped faster than before in most cases.
I am not convinced this added complexity is necessary; here are my
concerns about breaking the loop down into two work items:
- overhead of queuing and the per-cpu data, as you already mentioned.
- since we need very tight timing control, two items may limit
our turnaround time. Wouldn't it take one extra tick for the
scheduler to run the balancing work and then add the delay, as
opposed to just schedule_timeout()?
- vulnerability to future changes in the work queuing code.
Jacob
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: email@kvack.org
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v3 22/22] thermal/intel_powerclamp: Convert the kthread to kthread worker API
2016-01-07 19:55 ` Jacob Pan
@ 2016-01-08 16:49 ` Petr Mladek
2016-01-12 2:17 ` Jacob Pan
0 siblings, 1 reply; 12+ messages in thread
From: Petr Mladek @ 2016-01-08 16:49 UTC (permalink / raw)
To: Jacob Pan
Cc: Andrew Morton, Oleg Nesterov, Tejun Heo, Ingo Molnar,
Peter Zijlstra, Steven Rostedt, Paul E. McKenney, Josh Triplett,
Thomas Gleixner, Linus Torvalds, Jiri Kosina, Borislav Petkov,
Michal Hocko, linux-mm, Vlastimil Babka, linux-api, linux-kernel,
Zhang Rui, Eduardo Valentin, linux-pm
On Thu 2016-01-07 11:55:31, Jacob Pan wrote:
> On Wed, 18 Nov 2015 14:25:27 +0100
> Petr Mladek <pmladek@suse.com> wrote:
> I have tested this patchset and found no obvious issues in terms of
> functionality, power and performance. Tested CPU online/offline,
> suspend/resume, freeze, etc.
> Power numbers are comparable too, e.g. on an IVB 8C system: inject idle
> from 5 to 50% and read package power while running a CPU-bound workload.
Great news. Thanks a lot for testing.
> > IMHO, the most natural way is to split one cycle into two works.
> > First one does some balancing and let the CPU work normal
> > way for some time. The second work checks what the CPU has done
> > in the meantime and put it into C-state to reach the required
> > idle time ratio. The delay between the two works is achieved
> > by the delayed kthread work.
> >
> > The two works have to share some data that used to be local
> > variables of the single kthread function. This is achieved
> > by the new per-CPU struct kthread_worker_data. It might look
> > as a complication. On the other hand, the long original kthread
> > function was not nice either.
> >
> > The two works are queuing each other. It makes it a bit tricky to
> > break it when we want to stop the worker. We use the global and
> > per-worker "clamping" variables to make sure that the re-queuing
> > eventually stops. We also cancel the works to make it faster.
> > Note that the canceling is not reliable because the handling
> > of the two variables and queuing is not synchronized via a lock.
> > But it is not a big deal because it is just an optimization.
> > The job is stopped faster than before in most cases.
> I am not convinced this added complexity is necessary, here are my
> concerns by breaking down into two work items.
I am not super happy with the split either. But the current state has
its drawbacks as well.
> - overhead of queuing,
Good question. Here is a rather typical snippet from the function_graph
tracer for clamp_balancing_func():
31) | clamp_balancing_func() {
31) | queue_delayed_kthread_work() {
31) | __queue_delayed_kthread_work() {
31) | add_timer() {
31) 4.906 us | }
31) 5.959 us | }
31) 9.702 us | }
31) + 10.878 us | }
On one hand, it spends most of the time (10 of 11 us) queueing
the work. On the other hand, half of this time is spent adding
the timer, and schedule_timeout() would need to set up a timer as well.
Here is a snippet from clamp_idle_injection_func():
31) | clamp_idle_injection_func() {
31) | smp_apic_timer_interrupt() {
31) + 67.523 us | }
31) | smp_apic_timer_interrupt() {
31) + 59.946 us | }
...
31) | queue_kthread_work() {
31) 4.314 us | }
31) * 24075.11 us | }
Of course, it spends most of the time in the idle state. Anyway, the
time spent on queuing is negligible compared with the time spent
in the several timer interrupt handlers.
> per cpu data as you already mentioned.
On the other hand, the variables need to be stored somewhere.
It also helps to split the rather long function into smaller pieces.
> - since we need to have very tight timing control, two items may limit
> our turnaround time. Wouldn't it take one extra tick for the scheduler
> to run the balance work then add delay? as opposed to just
> schedule_timeout()?
The kthread worker processes works until the queue is empty. It calls
try_to_freeze() and __preempt_schedule() between the works, where
__preempt_schedule() is hidden in spin_unlock_irq(). try_to_freeze()
is in the original code as well.
Is the __preempt_schedule() a problem? It allows the process to be
switched when needed. I thought that it was safe, because
try_to_freeze() might have slept as well.
> - vulnerable to future changes of queuing work
The question is whether it is safe to sleep, freeze, or even migrate
between the works. It looks like it is, because of the try_to_freeze()
and schedule_timeout_interruptible() calls in the original code.
BTW: I wonder whether the original code correctly handles freezing
after the schedule_timeout(). It does not call try_to_freeze()
there, and the forced idle states might block freezing.
I think that the small overhead of kthread works is worth it
if it helps to solve such bugs. It makes it easier to maintain
these sleeping states.
Thanks a lot for feedback,
Petr
* Re: [PATCH v3 22/22] thermal/intel_powerclamp: Convert the kthread to kthread worker API
2016-01-08 16:49 ` Petr Mladek
@ 2016-01-12 2:17 ` Jacob Pan
2016-01-12 10:11 ` Petr Mladek
0 siblings, 1 reply; 12+ messages in thread
From: Jacob Pan @ 2016-01-12 2:17 UTC (permalink / raw)
To: Petr Mladek
Cc: Andrew Morton, Oleg Nesterov, Tejun Heo, Ingo Molnar,
Peter Zijlstra, Steven Rostedt, Paul E. McKenney, Josh Triplett,
Thomas Gleixner, Linus Torvalds, Jiri Kosina, Borislav Petkov,
Michal Hocko, linux-mm, Vlastimil Babka, linux-api, linux-kernel,
Zhang Rui, Eduardo Valentin, linux-pm, jacob.jun.pan
On Fri, 8 Jan 2016 17:49:31 +0100
Petr Mladek <pmladek@suse.com> wrote:
> Is the __preempt_schedule() a problem? It allows to switch the process
> when needed. I thought that it was safe because try_to_freeze() might
> have slept as well.
>
Not a problem. I originally thought queue_kthread_work() might add
delay, but it doesn't, since there is no other work on this kthread.
>
> > - vulnerable to future changes of queuing work
>
> The question is if it is safe to sleep, freeze, or even migrate
> the system between the works. It looks like because of the
> try_to_freeze() and schedule_interrupt() calls in the original code.
>
> BTW: I wonder if the original code correctly handle freezing after
> the schedule_timeout(). It does not call try_to_freeze()
> there and the forced idle states might block freezing.
> I think that the small overhead of kthread works is worth
> solving such bugs. It makes it easier to maintain these
> sleeping states.
It is in a while loop, so try_to_freeze() gets called. Am I missing
something?
Thanks,
Jacob
* Re: [PATCH v3 22/22] thermal/intel_powerclamp: Convert the kthread to kthread worker API
2016-01-12 2:17 ` Jacob Pan
@ 2016-01-12 10:11 ` Petr Mladek
2016-01-12 16:20 ` Jacob Pan
0 siblings, 1 reply; 12+ messages in thread
From: Petr Mladek @ 2016-01-12 10:11 UTC (permalink / raw)
To: Jacob Pan
Cc: Andrew Morton, Oleg Nesterov, Tejun Heo, Ingo Molnar,
Peter Zijlstra, Steven Rostedt, Paul E. McKenney, Josh Triplett,
Thomas Gleixner, Linus Torvalds, Jiri Kosina, Borislav Petkov,
Michal Hocko, linux-mm, Vlastimil Babka, linux-api, linux-kernel,
Zhang Rui, Eduardo Valentin, linux-pm
On Mon 2016-01-11 18:17:18, Jacob Pan wrote:
> On Fri, 8 Jan 2016 17:49:31 +0100
> Petr Mladek <pmladek@suse.com> wrote:
>
> > Is the __preempt_schedule() a problem? It allows to switch the process
> > when needed. I thought that it was safe because try_to_freeze() might
> > have slept as well.
> >
> not a problem. i originally thought queue_kthread_work() may add
> delay but it doesn't since there is no other work on this kthread.
Great.
> > > - vulnerable to future changes of queuing work
> >
> > The question is if it is safe to sleep, freeze, or even migrate
> > the system between the works. It looks like because of the
> > try_to_freeze() and schedule_interrupt() calls in the original code.
> >
> > BTW: I wonder if the original code correctly handle freezing after
> > the schedule_timeout(). It does not call try_to_freeze()
> > there and the forced idle states might block freezing.
> > I think that the small overhead of kthread works is worth
> > solving such bugs. It makes it easier to maintain these
> > sleeping states.
> it is in a while loop, so try_to_freeze() gets called. Am I missing
> something?
But it might take some time until try_to_freeze() is called.
If I get it correctly, try_to_freeze_tasks() wakes freezable
tasks to get them into the fridge. If clamp_thread() is woken
from that schedule_timeout_interruptible(), it still might inject
the idle state before calling try_to_freeze(). It means that the
freezer needs to wait quite some time until the kthread ends up
in the fridge.
Hmm, even my conversion does not solve this entirely. We might
need to check freezing(current) in the
while (time_before(jiffies, target_jiffies)) {
cycle, and break out of injecting the idle state when freezing is
requested. Or do I miss something, please?
Best Regards,
Petr
* Re: [PATCH v3 22/22] thermal/intel_powerclamp: Convert the kthread to kthread worker API
2016-01-12 10:11 ` Petr Mladek
@ 2016-01-12 16:20 ` Jacob Pan
2016-01-13 10:18 ` Petr Mladek
0 siblings, 1 reply; 12+ messages in thread
From: Jacob Pan @ 2016-01-12 16:20 UTC (permalink / raw)
To: Petr Mladek
Cc: Andrew Morton, Oleg Nesterov, Tejun Heo, Ingo Molnar,
Peter Zijlstra, Steven Rostedt, Paul E. McKenney, Josh Triplett,
Thomas Gleixner, Linus Torvalds, Jiri Kosina, Borislav Petkov,
Michal Hocko, linux-mm, Vlastimil Babka, linux-api, linux-kernel,
Zhang Rui, Eduardo Valentin, linux-pm, jacob.jun.pan
On Tue, 12 Jan 2016 11:11:29 +0100
Petr Mladek <pmladek@suse.com> wrote:
> > > BTW: I wonder if the original code correctly handle freezing after
> > > the schedule_timeout(). It does not call try_to_freeze()
> > > there and the forced idle states might block freezing.
> > > I think that the small overhead of kthread works is worth
> > > solving such bugs. It makes it easier to maintain these
> > > sleeping states.
> > it is in a while loop, so try_to_freeze() gets called. Am I missing
> > something?
>
> But it might take some time until try_to_freeze() is called.
> If I get it correctly. try_to_freeze_tasks() wakes freezable
> tasks to get them into the fridge. If clamp_thread() is waken
> from that schedule_timeout_interruptible(), it still might inject
> the idle state before calling try_to_freeze(). It means that freezer
> needs to wait "quite" some time until the kthread ends up in the
> fridge.
>
> Hmm, even my conversion does not solve this entirely. We might
> need to call freezing(current) in the
>
> while (time_before(jiffies, target_jiffies)) {
>
> cycle. And break injecting the idle state when freezing is requested.
The injection time for each period is very short, 6 ms by default,
while the default freeze timeout is 20 seconds. So I think the task
freeze can wait :)
i.e.
unsigned int __read_mostly freeze_timeout_msecs = 20 * MSEC_PER_SEC;
* Re: [PATCH v3 22/22] thermal/intel_powerclamp: Convert the kthread to kthread worker API
2016-01-12 16:20 ` Jacob Pan
@ 2016-01-13 10:18 ` Petr Mladek
2016-01-13 17:53 ` Jacob Pan
0 siblings, 1 reply; 12+ messages in thread
From: Petr Mladek @ 2016-01-13 10:18 UTC (permalink / raw)
To: Jacob Pan
Cc: Andrew Morton, Oleg Nesterov, Tejun Heo, Ingo Molnar,
Peter Zijlstra, Steven Rostedt, Paul E. McKenney, Josh Triplett,
Thomas Gleixner, Linus Torvalds, Jiri Kosina, Borislav Petkov,
Michal Hocko, linux-mm, Vlastimil Babka, linux-api, linux-kernel,
Zhang Rui, Eduardo Valentin, linux-pm
On Tue 2016-01-12 08:20:21, Jacob Pan wrote:
> On Tue, 12 Jan 2016 11:11:29 +0100
> Petr Mladek <pmladek@suse.com> wrote:
>
> > > > BTW: I wonder if the original code correctly handle freezing after
> > > > the schedule_timeout(). It does not call try_to_freeze()
> > > > there and the forced idle states might block freezing.
> > > > I think that the small overhead of kthread works is worth
> > > > solving such bugs. It makes it easier to maintain these
> > > > sleeping states.
> > > it is in a while loop, so try_to_freeze() gets called. Am I missing
> > > something?
> >
> > But it might take some time until try_to_freeze() is called.
> > If I get it correctly. try_to_freeze_tasks() wakes freezable
> > tasks to get them into the fridge. If clamp_thread() is waken
> > from that schedule_timeout_interruptible(), it still might inject
> > the idle state before calling try_to_freeze(). It means that freezer
> > needs to wait "quite" some time until the kthread ends up in the
> > fridge.
> >
> > Hmm, even my conversion does not solve this entirely. We might
> > need to call freezing(current) in the
> >
> > while (time_before(jiffies, target_jiffies)) {
> >
> > cycle. And break injecting the idle state when freezing is requested.
>
> The injection time for each period is very short, default 6ms. While on
> the other side the default freeze timeout is 20 sec. So I think task
> freeze can wait :)
> i.e.
> unsigned int __read_mostly freeze_timeout_msecs = 20 * MSEC_PER_SEC;
You are right. And it does not make sense to add extra
freezer-specific code if it is not really necessary.
Otherwise, I will keep the conversion into the kthread worker as is
for now. Please, let me know if you are strongly against the split
into the two works.
Best Regards,
Petr
* Re: [PATCH v3 22/22] thermal/intel_powerclamp: Convert the kthread to kthread worker API
2016-01-13 10:18 ` Petr Mladek
@ 2016-01-13 17:53 ` Jacob Pan
2016-01-14 15:37 ` Petr Mladek
0 siblings, 1 reply; 12+ messages in thread
From: Jacob Pan @ 2016-01-13 17:53 UTC (permalink / raw)
To: Petr Mladek
Cc: Andrew Morton, Oleg Nesterov, Tejun Heo, Ingo Molnar,
Peter Zijlstra, Steven Rostedt, Paul E. McKenney, Josh Triplett,
Thomas Gleixner, Linus Torvalds, Jiri Kosina, Borislav Petkov,
Michal Hocko, linux-mm, Vlastimil Babka, linux-api, linux-kernel,
Zhang Rui, Eduardo Valentin, linux-pm, jacob.jun.pan
On Wed, 13 Jan 2016 11:18:31 +0100
Petr Mladek <pmladek@suse.com> wrote:
> > unsigned int __read_mostly freeze_timeout_msecs = 20 *
> > MSEC_PER_SEC;
>
> You are right. And it does not make sense to add an extra
> freezer-specific code if not really necessary.
>
> Otherwise, I will keep the conversion into the kthread worker as is
> for now. Please, let me know if you are strongly against the split
> into the two works.
I am fine with the split now.
Another question: are you planning to convert acpi_pad.c as well? It
uses kthreads in a similar way.
Jacob
* Re: [PATCH v3 22/22] thermal/intel_powerclamp: Convert the kthread to kthread worker API
2016-01-13 17:53 ` Jacob Pan
@ 2016-01-14 15:37 ` Petr Mladek
0 siblings, 0 replies; 12+ messages in thread
From: Petr Mladek @ 2016-01-14 15:37 UTC (permalink / raw)
To: Jacob Pan
Cc: Andrew Morton, Oleg Nesterov, Tejun Heo, Ingo Molnar,
Peter Zijlstra, Steven Rostedt, Paul E. McKenney, Josh Triplett,
Thomas Gleixner, Linus Torvalds, Jiri Kosina, Borislav Petkov,
Michal Hocko, linux-mm, Vlastimil Babka, linux-api, linux-kernel,
Zhang Rui, Eduardo Valentin, linux-pm
On Wed 2016-01-13 09:53:53, Jacob Pan wrote:
> On Wed, 13 Jan 2016 11:18:31 +0100
> Petr Mladek <pmladek@suse.com> wrote:
> > Otherwise, I will keep the conversion into the kthread worker as is
> > for now. Please, let me know if you are strongly against the split
> > into the two works.
> I am fine with the split now.
Great.
> Another question, are you planning to convert acpi_pad.c as well? It
> uses kthread similar way.
Yup. I would like to convert as many kthreads as possible to either
the kthread worker or workqueue APIs in the long term.
Best Regards,
Petr
end of thread, other threads:[~2016-01-14 15:37 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-18 13:25 [PATCH v3 00/22] kthread: Use kthread worker API more widely Petr Mladek
2015-11-18 13:25 ` [PATCH v3 21/22] thermal/intel_powerclamp: Remove duplicated code that starts the kthread Petr Mladek
2015-11-18 13:25 ` [PATCH v3 22/22] thermal/intel_powerclamp: Convert the kthread to kthread worker API Petr Mladek
2016-01-07 19:55 ` Jacob Pan
2016-01-08 16:49 ` Petr Mladek
2016-01-12 2:17 ` Jacob Pan
2016-01-12 10:11 ` Petr Mladek
2016-01-12 16:20 ` Jacob Pan
2016-01-13 10:18 ` Petr Mladek
2016-01-13 17:53 ` Jacob Pan
2016-01-14 15:37 ` Petr Mladek
2015-11-18 14:25 ` [PATCH v3 00/22] kthread: Use kthread worker API more widely Paul E. McKenney