Linux Power Management development
 help / color / mirror / Atom feed
* Re: [RFC PATCH v4 1/9] CPU hotplug: Provide APIs to prevent CPU offline from atomic context
From: Srivatsa S. Bhat @ 2012-12-24 15:50 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: tglx, peterz, paulmck, rusty, mingo, akpm, namhyung,
	vincent.guittot, tj, sbw, amit.kucheria, rostedt, rjw, wangyun,
	xiaoguangrong, nikunj, linux-pm, linux-kernel
In-Reply-To: <20121223164242.GA9979@redhat.com>

On 12/23/2012 10:12 PM, Oleg Nesterov wrote:
> On 12/23, Srivatsa S. Bhat wrote:
>>
>> On 12/20/2012 07:12 PM, Oleg Nesterov wrote:
>>>
>>> We need mb() + rmb(). Plust cli/sti unless this arch has optimized
>>> this_cpu_add() like x86 (as you pointed out).
>>>
>>
>> Hey, IIUC, we actually don't need mb() in the reader!! Just an rmb() will do.
> 
> Well. I don't think so. But when it comes to the barriers I am never sure
> until Paul confirms my understanding ;)
> 
>> #define reader_nested_percpu()						\
>> 	     (__this_cpu_read(reader_percpu_refcnt) & READER_REFCNT_MASK)
>>
>> #define writer_active()							\
>> 				(__this_cpu_read(writer_signal))
>>
>>
>> #define READER_PRESENT		(1UL << 16)
>> #define READER_REFCNT_MASK	(READER_PRESENT - 1)
>>
>> void get_online_cpus_atomic(void)
>> {
>> 	preempt_disable();
>>
>> 	/*
>> 	 * First and foremost, make your presence known to the writer.
>> 	 */
>> 	this_cpu_add(reader_percpu_refcnt, READER_PRESENT);
>>
>> 	/*
>> 	 * If we are already using per-cpu refcounts, it is not safe to switch
>> 	 * the synchronization scheme. So continue using the refcounts.
>> 	 */
>> 	if (reader_nested_percpu()) {
>> 		this_cpu_inc(reader_percpu_refcnt);
>> 	} else {
>> 		smp_rmb();
>> 		if (unlikely(writer_active())) {
>> 			... //take hotplug_rwlock
>> 		}
>> 	}
>>
>> 	...
>>
>> 	/* Prevent reordering of any subsequent reads of cpu_online_mask. */
>> 	smp_rmb();
>> }
>>
>> The smp_rmb() before writer_active() ensures that LOAD(writer_signal) follows
>> LOAD(reader_percpu_refcnt) (at the 'if' condition). And in turn, that load is
>> automatically going to follow the STORE(reader_percpu_refcnt)
> 
> But why this STORE should be visible on another CPU before we LOAD(writer_signal)?
> 
> Lets discuss the simple and artificial example. Suppose we have
> 
> 	int X, Y;
> 
> 	int func(void)
> 	{
> 		X = 1;	// suppose that nobody else can change it
> 		mb();
> 		return Y;
> 	}
> 
> Now you are saying that we can change it and avoid the costly mb():
> 
> 	int func(void)
> 	{
> 		X = 1;
> 
> 		if (X != 1)
> 			BUG();
> 	
> 		rmb();
> 		return Y;
> 	}
> 
> I doubt. rmb() can only guarantee that the preceding LOAD's should be
> completed. Without mb() it is possible that this CPU won't write X to
> memory at all.
> 

Oh, ok :-( Thanks for correcting me and for the detailed explanation!
For a moment, I really thought we had it solved at last! ;-(

Regards,
Srivatsa S. Bhat


^ permalink raw reply

* Re: [RFC PATCH v4 1/9] CPU hotplug: Provide APIs to prevent CPU offline from atomic context
From: Oleg Nesterov @ 2012-12-23 16:42 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: tglx, peterz, paulmck, rusty, mingo, akpm, namhyung,
	vincent.guittot, tj, sbw, amit.kucheria, rostedt, rjw, wangyun,
	xiaoguangrong, nikunj, linux-pm, linux-kernel
In-Reply-To: <50D61561.7090805@linux.vnet.ibm.com>

On 12/23, Srivatsa S. Bhat wrote:
>
> On 12/20/2012 07:12 PM, Oleg Nesterov wrote:
> >
> > We need mb() + rmb(). Plust cli/sti unless this arch has optimized
> > this_cpu_add() like x86 (as you pointed out).
> >
>
> Hey, IIUC, we actually don't need mb() in the reader!! Just an rmb() will do.

Well. I don't think so. But when it comes to the barriers I am never sure
until Paul confirms my understanding ;)

> #define reader_nested_percpu()						\
> 	     (__this_cpu_read(reader_percpu_refcnt) & READER_REFCNT_MASK)
>
> #define writer_active()							\
> 				(__this_cpu_read(writer_signal))
>
>
> #define READER_PRESENT		(1UL << 16)
> #define READER_REFCNT_MASK	(READER_PRESENT - 1)
>
> void get_online_cpus_atomic(void)
> {
> 	preempt_disable();
>
> 	/*
> 	 * First and foremost, make your presence known to the writer.
> 	 */
> 	this_cpu_add(reader_percpu_refcnt, READER_PRESENT);
>
> 	/*
> 	 * If we are already using per-cpu refcounts, it is not safe to switch
> 	 * the synchronization scheme. So continue using the refcounts.
> 	 */
> 	if (reader_nested_percpu()) {
> 		this_cpu_inc(reader_percpu_refcnt);
> 	} else {
> 		smp_rmb();
> 		if (unlikely(writer_active())) {
> 			... //take hotplug_rwlock
> 		}
> 	}
>
> 	...
>
> 	/* Prevent reordering of any subsequent reads of cpu_online_mask. */
> 	smp_rmb();
> }
>
> The smp_rmb() before writer_active() ensures that LOAD(writer_signal) follows
> LOAD(reader_percpu_refcnt) (at the 'if' condition). And in turn, that load is
> automatically going to follow the STORE(reader_percpu_refcnt)

But why this STORE should be visible on another CPU before we LOAD(writer_signal)?

Lets discuss the simple and artificial example. Suppose we have

	int X, Y;

	int func(void)
	{
		X = 1;	// suppose that nobody else can change it
		mb();
		return Y;
	}

Now you are saying that we can change it and avoid the costly mb():

	int func(void)
	{
		X = 1;

		if (X != 1)
			BUG();
	
		rmb();
		return Y;
	}

I doubt. rmb() can only guarantee that the preceding LOAD's should be
completed. Without mb() it is possible that this CPU won't write X to
memory at all.

Oleg.


^ permalink raw reply

* Re: [RFC PATCH v4 1/9] CPU hotplug: Provide APIs to prevent CPU offline from atomic context
From: Srivatsa S. Bhat @ 2012-12-22 20:17 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: tglx, peterz, paulmck, rusty, mingo, akpm, namhyung,
	vincent.guittot, tj, sbw, amit.kucheria, rostedt, rjw, wangyun,
	xiaoguangrong, nikunj, linux-pm, linux-kernel
In-Reply-To: <20121220134203.GB10813@redhat.com>

On 12/20/2012 07:12 PM, Oleg Nesterov wrote:
> On 12/20, Srivatsa S. Bhat wrote:
>>
>> On 12/20/2012 12:44 AM, Oleg Nesterov wrote:
>>>
>>> We need 2 helpers for writer, the 1st one does synchronize_sched() and the
>>> 2nd one takes rwlock. A generic percpu_write_lock() simply calls them both.
>>>
>>
>> Ah, that's the problem no? Users of reader-writer locks expect to run in
>> atomic context (ie., they don't want to sleep).
> 
> Ah, I misunderstood.
> 
> Sure, percpu_write_lock() should be might_sleep(), and this is not
> symmetric to percpu_read_lock().
> 
>> We can't expose an API that
>> can make the task go to sleep under the covers!
> 
> Why? Just this should be documented. However I would not worry until we
> find another user. Until then we do not even need to add percpu_write_lock
> or try to generalize this code too much.
> 
>>> To me, the main question is: can we use synchronize_sched() in cpu_down?
>>> It is slow.
>>>
>>
>> Haha :-) So we don't want smp_mb() in the reader,
> 
> We need mb() + rmb(). Plust cli/sti unless this arch has optimized
> this_cpu_add() like x86 (as you pointed out).
> 

Hey, IIUC, we actually don't need mb() in the reader!! Just an rmb() will do.

This is the reader code I have so far:

#define reader_nested_percpu()						\
	     (__this_cpu_read(reader_percpu_refcnt) & READER_REFCNT_MASK)

#define writer_active()							\
				(__this_cpu_read(writer_signal))


#define READER_PRESENT		(1UL << 16)
#define READER_REFCNT_MASK	(READER_PRESENT - 1)

void get_online_cpus_atomic(void)
{
	preempt_disable();

	/*
	 * First and foremost, make your presence known to the writer.
	 */
	this_cpu_add(reader_percpu_refcnt, READER_PRESENT);

	/*
	 * If we are already using per-cpu refcounts, it is not safe to switch
	 * the synchronization scheme. So continue using the refcounts.
	 */
	if (reader_nested_percpu()) {
		this_cpu_inc(reader_percpu_refcnt);
	} else {
		smp_rmb();
		if (unlikely(writer_active())) {
			... //take hotplug_rwlock
		}
	}

	...

	/* Prevent reordering of any subsequent reads of cpu_online_mask. */
	smp_rmb();
}

The smp_rmb() before writer_active() ensures that LOAD(writer_signal) follows
LOAD(reader_percpu_refcnt) (at the 'if' condition). And in turn, that load is
automatically going to follow the STORE(reader_percpu_refcnt) (at this_cpu_add())
due to the data dependency. So it is something like a transitive relation.

So, the result is that, we mark ourselves as active in reader_percpu_refcnt before
we check writer_signal. This is exactly what we wanted to do right?
And luckily, due to the dependency, we can achieve it without using the heavy
smp_mb(). And, we can't crib about the smp_rmb() because it is unavoidable anyway
(because we want to prevent reordering of the reads to cpu_online_mask, like you
pointed out earlier).

I hope I'm not missing anything...

Regards,
Srivatsa S. Bhat


^ permalink raw reply

* Re: [PATCH 16/25] PM / Domains: don't use [delayed_]work_pending()
From: Rafael J. Wysocki @ 2012-12-22 11:57 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, Rafael J. Wysocki, linux-pm
In-Reply-To: <1356141435-17340-17-git-send-email-tj@kernel.org>

On Friday, December 21, 2012 05:57:06 PM Tejun Heo wrote:
> There's no need to test whether a (delayed) work item in pending
> before queueing, flushing or cancelling it.  Most uses are unnecessary
> and quite a few of them are buggy.

Is the particular one you're removing from domain.c buggy?

> Remove unnecessary pending tests from power domains.  Only compile
> tested.

I can take this one too.

Thanks,
Rafael


> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> Cc: linux-pm@vger.kernel.org
> ---
> Please let me know how this patch should be routed.  I can take it
> through the workqueue tree if necessary.
> 
> Thanks.
> 
>  drivers/base/power/domain.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
> index acc3a8d..9a6b05a 100644
> --- a/drivers/base/power/domain.c
> +++ b/drivers/base/power/domain.c
> @@ -433,8 +433,7 @@ static bool genpd_abort_poweroff(struct generic_pm_domain *genpd)
>   */
>  void genpd_queue_power_off_work(struct generic_pm_domain *genpd)
>  {
> -	if (!work_pending(&genpd->power_off_work))
> -		queue_work(pm_wq, &genpd->power_off_work);
> +	queue_work(pm_wq, &genpd->power_off_work);
>  }
>  
>  /**
> 
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply

* Re: [PATCH 11/25] pm: don't use [delayed_]work_pending()
From: Rafael J. Wysocki @ 2012-12-22 11:53 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, linux-pm
In-Reply-To: <1356141435-17340-12-git-send-email-tj@kernel.org>

On Friday, December 21, 2012 05:57:01 PM Tejun Heo wrote:
> There's no need to test whether a (delayed) work item in pending
> before queueing, flushing or cancelling it.  Most uses are unnecessary
> and quite a few of them are buggy.

Can you please say why they are buggy?

> Remove unnecessary pending tests from pm autosleep and qos.  Only
> compile tested.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
> Cc: linux-pm@vger.kernel.org
> ---
> Please let me know how this patch should be routed.  I can take it
> through the workqueue tree if necessary.

I can take it.  I will send a pull request with fixes later in the cycle
(maybe even before -rc2).

Thanks,
Rafael


>  kernel/power/autosleep.c | 2 +-
>  kernel/power/qos.c       | 9 +++------
>  2 files changed, 4 insertions(+), 7 deletions(-)
> 
> diff --git a/kernel/power/autosleep.c b/kernel/power/autosleep.c
> index ca304046..c6422ff 100644
> --- a/kernel/power/autosleep.c
> +++ b/kernel/power/autosleep.c
> @@ -66,7 +66,7 @@ static DECLARE_WORK(suspend_work, try_to_suspend);
>  
>  void queue_up_suspend_work(void)
>  {
> -	if (!work_pending(&suspend_work) && autosleep_state > PM_SUSPEND_ON)
> +	if (autosleep_state > PM_SUSPEND_ON)
>  		queue_work(autosleep_wq, &suspend_work);
>  }
>  
> diff --git a/kernel/power/qos.c b/kernel/power/qos.c
> index 9322ff7..587ddde 100644
> --- a/kernel/power/qos.c
> +++ b/kernel/power/qos.c
> @@ -359,8 +359,7 @@ void pm_qos_update_request(struct pm_qos_request *req,
>  		return;
>  	}
>  
> -	if (delayed_work_pending(&req->work))
> -		cancel_delayed_work_sync(&req->work);
> +	cancel_delayed_work_sync(&req->work);
>  
>  	if (new_value != req->node.prio)
>  		pm_qos_update_target(
> @@ -386,8 +385,7 @@ void pm_qos_update_request_timeout(struct pm_qos_request *req, s32 new_value,
>  		 "%s called for unknown object.", __func__))
>  		return;
>  
> -	if (delayed_work_pending(&req->work))
> -		cancel_delayed_work_sync(&req->work);
> +	cancel_delayed_work_sync(&req->work);
>  
>  	if (new_value != req->node.prio)
>  		pm_qos_update_target(
> @@ -416,8 +414,7 @@ void pm_qos_remove_request(struct pm_qos_request *req)
>  		return;
>  	}
>  
> -	if (delayed_work_pending(&req->work))
> -		cancel_delayed_work_sync(&req->work);
> +	cancel_delayed_work_sync(&req->work);
>  
>  	pm_qos_update_target(pm_qos_array[req->pm_qos_class]->constraints,
>  			     &req->node, PM_QOS_REMOVE_REQ,
> 
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply

* [PATCH 11/25] pm: don't use [delayed_]work_pending()
From: Tejun Heo @ 2012-12-22  1:57 UTC (permalink / raw)
  To: linux-kernel; +Cc: Tejun Heo, Rafael J. Wysocki, linux-pm
In-Reply-To: <1356141435-17340-1-git-send-email-tj@kernel.org>

There's no need to test whether a (delayed) work item in pending
before queueing, flushing or cancelling it.  Most uses are unnecessary
and quite a few of them are buggy.

Remove unnecessary pending tests from pm autosleep and qos.  Only
compile tested.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: linux-pm@vger.kernel.org
---
Please let me know how this patch should be routed.  I can take it
through the workqueue tree if necessary.

Thanks.

 kernel/power/autosleep.c | 2 +-
 kernel/power/qos.c       | 9 +++------
 2 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/kernel/power/autosleep.c b/kernel/power/autosleep.c
index ca304046..c6422ff 100644
--- a/kernel/power/autosleep.c
+++ b/kernel/power/autosleep.c
@@ -66,7 +66,7 @@ static DECLARE_WORK(suspend_work, try_to_suspend);
 
 void queue_up_suspend_work(void)
 {
-	if (!work_pending(&suspend_work) && autosleep_state > PM_SUSPEND_ON)
+	if (autosleep_state > PM_SUSPEND_ON)
 		queue_work(autosleep_wq, &suspend_work);
 }
 
diff --git a/kernel/power/qos.c b/kernel/power/qos.c
index 9322ff7..587ddde 100644
--- a/kernel/power/qos.c
+++ b/kernel/power/qos.c
@@ -359,8 +359,7 @@ void pm_qos_update_request(struct pm_qos_request *req,
 		return;
 	}
 
-	if (delayed_work_pending(&req->work))
-		cancel_delayed_work_sync(&req->work);
+	cancel_delayed_work_sync(&req->work);
 
 	if (new_value != req->node.prio)
 		pm_qos_update_target(
@@ -386,8 +385,7 @@ void pm_qos_update_request_timeout(struct pm_qos_request *req, s32 new_value,
 		 "%s called for unknown object.", __func__))
 		return;
 
-	if (delayed_work_pending(&req->work))
-		cancel_delayed_work_sync(&req->work);
+	cancel_delayed_work_sync(&req->work);
 
 	if (new_value != req->node.prio)
 		pm_qos_update_target(
@@ -416,8 +414,7 @@ void pm_qos_remove_request(struct pm_qos_request *req)
 		return;
 	}
 
-	if (delayed_work_pending(&req->work))
-		cancel_delayed_work_sync(&req->work);
+	cancel_delayed_work_sync(&req->work);
 
 	pm_qos_update_target(pm_qos_array[req->pm_qos_class]->constraints,
 			     &req->node, PM_QOS_REMOVE_REQ,
-- 
1.8.0.2


^ permalink raw reply related

* [PATCH 16/25] PM / Domains: don't use [delayed_]work_pending()
From: Tejun Heo @ 2012-12-22  1:57 UTC (permalink / raw)
  To: linux-kernel; +Cc: Tejun Heo, Rafael J. Wysocki, linux-pm
In-Reply-To: <1356141435-17340-1-git-send-email-tj@kernel.org>

There's no need to test whether a (delayed) work item in pending
before queueing, flushing or cancelling it.  Most uses are unnecessary
and quite a few of them are buggy.

Remove unnecessary pending tests from power domains.  Only compile
tested.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: linux-pm@vger.kernel.org
---
Please let me know how this patch should be routed.  I can take it
through the workqueue tree if necessary.

Thanks.

 drivers/base/power/domain.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index acc3a8d..9a6b05a 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -433,8 +433,7 @@ static bool genpd_abort_poweroff(struct generic_pm_domain *genpd)
  */
 void genpd_queue_power_off_work(struct generic_pm_domain *genpd)
 {
-	if (!work_pending(&genpd->power_off_work))
-		queue_work(pm_wq, &genpd->power_off_work);
+	queue_work(pm_wq, &genpd->power_off_work);
 }
 
 /**
-- 
1.8.0.2

^ permalink raw reply related

* Re: [Resend][PATCH] PM: Move disabling/enabling runtime PM to late suspend/early resume
From: Rafael J. Wysocki @ 2012-12-21 22:09 UTC (permalink / raw)
  To: Kevin Hilman
  Cc: Linux PM list, LKML, Jan-Matthias Braun, Jiri Kosina, Alan Stern
In-Reply-To: <87d2y3dvxj.fsf@deeprootsystems.com>

On Friday, December 21, 2012 11:52:56 AM Kevin Hilman wrote:
> "Rafael J. Wysocki" <rjw@sisk.pl> writes:
> 
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > Currently, the PM core disables runtime PM for all devices right
> > after executing subsystem/driver .suspend() callbacks for them
> > and re-enables it right before executing subsystem/driver .resume()
> > callbacks for them.  This may lead to problems when there are
> > two devices such that the .suspend() callback executed for one of
> > them depends on runtime PM working for the other.  In that case,
> > if runtime PM has already been disabled for the second device,
> > the first one's .suspend() won't work correctly (and analogously
> > for resume).
> >
> > To make those issues go away, make the PM core disable runtime PM
> > for devices right before executing subsystem/driver .suspend_late()
> > callbacks for them and enable runtime PM for them right after
> > executing subsystem/driver .resume_early() callbacks for them.  This
> > way the potential conflitcs between .suspend_late()/.resume_early()
> > and their runtime PM counterparts are still prevented from happening,
> > but the subtle ordering issues related to disabling/enabling runtime
> > PM for devices during system suspend/resume are much easier to avoid.
> >
> > Reported-and-tested-by: Jan-Matthias Braun <jan_braun@gmx.net>
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Yes!  
> 
> Of course if there are dependencies between drivers late/early
> callbacks, we'll still have the same problems, but those should be
> *very* rare compared to the suspend/resume dependencies.  

Well, let's put it this way: Any dependencies between drivers' late/early
callbacks that make things break because runtime PM is disabled at that time
are bugs that need to be fixed.

Why?

Because runtime PM has always been disabled during late/early callbacks since
they were introduced and if somebody conveniently ignored that, then this is
this person's problem entirely.

> I haven't been able to do any testing with this yet (I'm away from my
> hardware for a bit), but I totally support this change.  
> 
> Reviewed-by: Kevin Hilman <khilman@deeprootsystems.com>

Thanks!

Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply

* Re: [Resend][PATCH] PM: Move disabling/enabling runtime PM to late suspend/early resume
From: Kevin Hilman @ 2012-12-21 19:52 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM list, LKML, Jan-Matthias Braun, Jiri Kosina, Alan Stern
In-Reply-To: <3307221.dy6xvn45xe@vostro.rjw.lan>

"Rafael J. Wysocki" <rjw@sisk.pl> writes:

> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> Currently, the PM core disables runtime PM for all devices right
> after executing subsystem/driver .suspend() callbacks for them
> and re-enables it right before executing subsystem/driver .resume()
> callbacks for them.  This may lead to problems when there are
> two devices such that the .suspend() callback executed for one of
> them depends on runtime PM working for the other.  In that case,
> if runtime PM has already been disabled for the second device,
> the first one's .suspend() won't work correctly (and analogously
> for resume).
>
> To make those issues go away, make the PM core disable runtime PM
> for devices right before executing subsystem/driver .suspend_late()
> callbacks for them and enable runtime PM for them right after
> executing subsystem/driver .resume_early() callbacks for them.  This
> way the potential conflitcs between .suspend_late()/.resume_early()
> and their runtime PM counterparts are still prevented from happening,
> but the subtle ordering issues related to disabling/enabling runtime
> PM for devices during system suspend/resume are much easier to avoid.
>
> Reported-and-tested-by: Jan-Matthias Braun <jan_braun@gmx.net>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Yes!  

Of course if there are dependencies between drivers late/early
callbacks, we'll still have the same problems, but those should be
*very* rare compared to the suspend/resume dependencies.  

I haven't been able to do any testing with this yet (I'm away from my
hardware for a bit), but I totally support this change.  

Reviewed-by: Kevin Hilman <khilman@deeprootsystems.com>

Thanks!

Kevin

^ permalink raw reply

* Re: [RESEND PATCH 0/5] cpufreq: db8500: Rename driver and update some parts
From: Mike Turquette @ 2012-12-21 17:18 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Philippe Begnic, Vincent Guittot, Jonas Aberg,
	linux-pm@vger.kernel.org, Linus Walleij, cpufreq,
	Rickard Andersson, Rafael J. Wysocki, Lee Jones, Ulf Hansson,
	linux-arm-kernel@lists.infradead.org
In-Reply-To: <CAPDyKFpN_MM+3N6U07UFu71rEPFkjwExCsnTwMkXbt0ei9ryfg@mail.gmail.com>

On Thu, Dec 13, 2012 at 3:20 AM, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> On 12 December 2012 21:56, Mike Turquette <mturquette@linaro.org> wrote:
>> On Mon, Dec 10, 2012 at 7:25 AM, Ulf Hansson <ulf.hansson@stericsson.com> wrote:
>>> From: Ulf Hansson <ulf.hansson@linaro.org>
>>>
>>> This patchset starts by renaming the db8500 cpufreq driver to a more generic
>>> name. There are new variants which rely on it too, so instead we give it a
>>> family name of dbx500.
>>>
>>> On top of that a fixup patch for initialization of the driver and some minor
>>> cleanup patches are included as well.
>>>
>>> These patches can be material for 3.8.
>>>
>>> Since some patches for the db8500 cpufreq driver has recently been merged
>>> through Mike Turquette's clk tree, I decided to base these patches on top
>>> of the clk-next branch. So I hope Mike is fine by merging these patches
>>> through his tree.
>>>
>>> So I will try to collect ACKs from Rafael for the cpufreq patches. The
>>> mfd patch has already been ACKED by Samuel.
>>>
>>
>> Ulf,
>>
>> Just FYI clk-next has been pulled by Linus for 3.8.  I can still take
>> this branch once Rafael ACKs the CPUfreq bits if you want, but I think
>> you have more options now that clk-next is merged in.  It's up to you.
>
> I think the easiest way forward (at least for me :-) ) would be if we
> could send them as for 3.8 rcs. Is that possible for you Mike?
>

Hi Ulf,

I did not want to leave you hanging on this point.  I decided at the
last minute to not bring a development platform or a nice laptop on my
vacation so I will not be sending any more pull requests to Linus for
3.8-rc unless some patches come in which are just obvious fixes.  Even
then I will wait until I return in January to properly test any fixes
before I send the request out.

Sorry that these patches did not make it in for 3.8 but I will queue
them up into clk-next in January.

Thanks,
Mike

>>
>> Regards,
>> Mike
>
> Thanks!
>
> Ulf Hansson

^ permalink raw reply

* RE: [PATCH 0/8] Thermal Framework Enhancements
From: R, Durgadoss @ 2012-12-21  9:17 UTC (permalink / raw)
  To: Hongbo Zhang
  Cc: Wei Ni, Zhang, Rui, linux-pm@vger.kernel.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <CAJLyvQztwuvG0GooBUKApR+VnP9=yxKUu6sRn1AqR8Sno_MP8Q@mail.gmail.com>


> -----Original Message-----
> From: Hongbo Zhang [mailto:hongbo.zhang@linaro.org]
> Sent: Friday, December 21, 2012 2:17 PM
> To: R, Durgadoss
> Cc: Wei Ni; Zhang, Rui; linux-pm@vger.kernel.org; linux-
> kernel@vger.kernel.org
> Subject: Re: [PATCH 0/8] Thermal Framework Enhancements
> 
> On 21 December 2012 16:30, R, Durgadoss <durgadoss.r@intel.com> wrote:
> > Hi Ni,
> >
> >> -----Original Message-----
> >> From: Wei Ni [mailto:wni@nvidia.com]
> >> Sent: Friday, December 21, 2012 1:36 PM
> >> To: R, Durgadoss
> >> Cc: Zhang, Rui; linux-pm@vger.kernel.org; linux-kernel@vger.kernel.org;
> >> hongbo.zhang@linaro.org
> >> Subject: Re: [PATCH 0/8] Thermal Framework Enhancements
> >>
> >> On 12/18/2012 05:29 PM, Durgadoss R wrote:
> >> > This patch is a v1 based on the RFC submitted here:
> >> > https://patchwork.kernel.org/patch/1758921/
> >> >
> >> > This patch set is based on Rui's -thermal tree, and is
> >> > tested on a Core-i5 and an Atom netbook.
> >> >
> >> > This series contains 8 patches:
> >> > Patch 1/8: Creates new sensor level APIs
> >> > Patch 2/8: Creates new zone level APIs. The existing tzd structure is
> >> >            kept as such for clarity and compatibility purposes.
> >> > Patch 3/8: Creates functions to add/remove a cdev to/from a zone. The
> >> >            existing tcd structure need not be modified.
> >> > Patch 4/8: Adds a thermal_trip sysfs node, which exposes various trip
> >> >            points for all sensors present in a zone.
> >> > Patch 5/8: Adds a thermal_map sysfs node. It is a compact
> representation
> >> >            of the binding relationship between a sensor and a cdev,
> >> >            within a zone.
> >> > Patch 6/8: Creates Documentation for the new APIs. A new file is
> >> >            created for clarity. Final goal is to merge with the existing
> >> >            file or refactor the files, as whatever seems appropriate.
> >> > Patch 7/8: Make PER ZONE values configurable through Kconfig
> >> > Patch 8/8: A dummy driver that can be used for testing. This is not for
> >> merge.
> >>
> >> I read these patches, they create new APIs and sysfs, but it seems they
> >> didn't use the thermal_zone to handle the thermal_throttle issue,
> >> something like update thermal_zone, update temperature, handle
> >> governors
> >> when cross the trip temp. So will you send out next serial patches for
> >> these implementation?
> >
> > Yes, once these get into Rui's tree, we will start migrating the existing
> drivers/
> > and governors, to get things working.
> Durgadoss,
> See function psy_register_thermal() in power_supply_core.c,
> thermal_zone_device_register() is used here, what will this look like
> in future?

Yes, I know this code.
This will be a thermal_sensor_register.
Basically this will expose battery's temperature as a 'sensor'
under /sys/class/thermal/.

Then, each platform can add it to whatever zone they like to.

Thanks,
Durga

^ permalink raw reply

* Re: [PATCH 0/8] Thermal Framework Enhancements
From: Hongbo Zhang @ 2012-12-21  8:46 UTC (permalink / raw)
  To: R, Durgadoss
  Cc: Wei Ni, Zhang, Rui, linux-pm@vger.kernel.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <4D68720C2E767A4AA6A8796D42C8EB5924D729@BGSMSX101.gar.corp.intel.com>

On 21 December 2012 16:30, R, Durgadoss <durgadoss.r@intel.com> wrote:
> Hi Ni,
>
>> -----Original Message-----
>> From: Wei Ni [mailto:wni@nvidia.com]
>> Sent: Friday, December 21, 2012 1:36 PM
>> To: R, Durgadoss
>> Cc: Zhang, Rui; linux-pm@vger.kernel.org; linux-kernel@vger.kernel.org;
>> hongbo.zhang@linaro.org
>> Subject: Re: [PATCH 0/8] Thermal Framework Enhancements
>>
>> On 12/18/2012 05:29 PM, Durgadoss R wrote:
>> > This patch is a v1 based on the RFC submitted here:
>> > https://patchwork.kernel.org/patch/1758921/
>> >
>> > This patch set is based on Rui's -thermal tree, and is
>> > tested on a Core-i5 and an Atom netbook.
>> >
>> > This series contains 8 patches:
>> > Patch 1/8: Creates new sensor level APIs
>> > Patch 2/8: Creates new zone level APIs. The existing tzd structure is
>> >            kept as such for clarity and compatibility purposes.
>> > Patch 3/8: Creates functions to add/remove a cdev to/from a zone. The
>> >            existing tcd structure need not be modified.
>> > Patch 4/8: Adds a thermal_trip sysfs node, which exposes various trip
>> >            points for all sensors present in a zone.
>> > Patch 5/8: Adds a thermal_map sysfs node. It is a compact representation
>> >            of the binding relationship between a sensor and a cdev,
>> >            within a zone.
>> > Patch 6/8: Creates Documentation for the new APIs. A new file is
>> >            created for clarity. Final goal is to merge with the existing
>> >            file or refactor the files, as whatever seems appropriate.
>> > Patch 7/8: Make PER ZONE values configurable through Kconfig
>> > Patch 8/8: A dummy driver that can be used for testing. This is not for
>> merge.
>>
>> I read these patches, they create new APIs and sysfs, but it seems they
>> didn't use the thermal_zone to handle the thermal_throttle issue,
>> something like update thermal_zone, update temperature, handle
>> governors
>> when cross the trip temp. So will you send out next serial patches for
>> these implementation?
>
> Yes, once these get into Rui's tree, we will start migrating the existing drivers/
> and governors, to get things working.
Durgadoss,
See function psy_register_thermal() in power_supply_core.c,
thermal_zone_device_register() is used here, what will this look like
in future?
Create thermal zone here?
Or add sensor to thermal zone? if so, how do we know which zone to add
sensor to?


>
> Thanks,
> Durga

^ permalink raw reply

* RE: [PATCH 0/8] Thermal Framework Enhancements
From: R, Durgadoss @ 2012-12-21  8:30 UTC (permalink / raw)
  To: Wei Ni
  Cc: Zhang, Rui, linux-pm@vger.kernel.org,
	linux-kernel@vger.kernel.org, hongbo.zhang@linaro.org
In-Reply-To: <50D41860.7040406@nvidia.com>

Hi Ni,

> -----Original Message-----
> From: Wei Ni [mailto:wni@nvidia.com]
> Sent: Friday, December 21, 2012 1:36 PM
> To: R, Durgadoss
> Cc: Zhang, Rui; linux-pm@vger.kernel.org; linux-kernel@vger.kernel.org;
> hongbo.zhang@linaro.org
> Subject: Re: [PATCH 0/8] Thermal Framework Enhancements
> 
> On 12/18/2012 05:29 PM, Durgadoss R wrote:
> > This patch is a v1 based on the RFC submitted here:
> > https://patchwork.kernel.org/patch/1758921/
> >
> > This patch set is based on Rui's -thermal tree, and is
> > tested on a Core-i5 and an Atom netbook.
> >
> > This series contains 8 patches:
> > Patch 1/8: Creates new sensor level APIs
> > Patch 2/8: Creates new zone level APIs. The existing tzd structure is
> >            kept as such for clarity and compatibility purposes.
> > Patch 3/8: Creates functions to add/remove a cdev to/from a zone. The
> >            existing tcd structure need not be modified.
> > Patch 4/8: Adds a thermal_trip sysfs node, which exposes various trip
> >            points for all sensors present in a zone.
> > Patch 5/8: Adds a thermal_map sysfs node. It is a compact representation
> >            of the binding relationship between a sensor and a cdev,
> >            within a zone.
> > Patch 6/8: Creates Documentation for the new APIs. A new file is
> >            created for clarity. Final goal is to merge with the existing
> >            file or refactor the files, as whatever seems appropriate.
> > Patch 7/8: Make PER ZONE values configurable through Kconfig
> > Patch 8/8: A dummy driver that can be used for testing. This is not for
> merge.
> 
> I read these patches, they create new APIs and sysfs, but it seems they
> didn't use the thermal_zone to handle the thermal_throttle issue,
> something like update thermal_zone, update temperature, handle
> governors
> when cross the trip temp. So will you send out next serial patches for
> these implementation?

Yes, once these get into Rui's tree, we will start migrating the existing drivers/
and governors, to get things working.

Thanks,
Durga

^ permalink raw reply

* Re: [PATCH 0/8] Thermal Framework Enhancements
From: Wei Ni @ 2012-12-21  8:05 UTC (permalink / raw)
  To: Durgadoss R
  Cc: rui.zhang@intel.com, linux-pm@vger.kernel.org,
	linux-kernel@vger.kernel.org, hongbo.zhang@linaro.org
In-Reply-To: <1355822977-4804-1-git-send-email-durgadoss.r@intel.com>

On 12/18/2012 05:29 PM, Durgadoss R wrote:
> This patch is a v1 based on the RFC submitted here:
> https://patchwork.kernel.org/patch/1758921/
> 
> This patch set is based on Rui's -thermal tree, and is
> tested on a Core-i5 and an Atom netbook.
> 
> This series contains 8 patches:
> Patch 1/8: Creates new sensor level APIs
> Patch 2/8: Creates new zone level APIs. The existing tzd structure is
>            kept as such for clarity and compatibility purposes.
> Patch 3/8: Creates functions to add/remove a cdev to/from a zone. The
>            existing tcd structure need not be modified.
> Patch 4/8: Adds a thermal_trip sysfs node, which exposes various trip
>            points for all sensors present in a zone.
> Patch 5/8: Adds a thermal_map sysfs node. It is a compact representation
>            of the binding relationship between a sensor and a cdev,
>            within a zone.
> Patch 6/8: Creates Documentation for the new APIs. A new file is
>            created for clarity. Final goal is to merge with the existing
>            file or refactor the files, as whatever seems appropriate.
> Patch 7/8: Make PER ZONE values configurable through Kconfig
> Patch 8/8: A dummy driver that can be used for testing. This is not for merge.

I read these patches, they create new APIs and sysfs, but it seems they
didn't use the thermal_zone to handle the thermal_throttle issue,
something like update thermal_zone, update temperature, handle governors
when cross the trip temp. So will you send out next serial patches for
these implementation?

> 
> Thanks to Rui Zhang, Honghbo Zhang, Wei Ni for their feedback on the
> RFC version.
> 
> Durgadoss R (8):
>   Thermal: Create sensor level APIs
>   Thermal: Create zone level APIs
>   Thermal: Add APIs to bind cdev to new zone structure
>   Thermal: Add Thermal_trip sysfs node
>   Thermal: Add 'thermal_map' sysfs node
>   Thermal: Add Documentation to new APIs
>   Thermal: Make PER_ZONE values configurable
>   Thermal: Dummy driver used for testing
> 
>  Documentation/thermal/sysfs-api2.txt |  248 +++++++++
>  drivers/thermal/Kconfig              |   19 +
>  drivers/thermal/Makefile             |    3 +
>  drivers/thermal/thermal_sys.c        |  932 ++++++++++++++++++++++++++++++++++
>  drivers/thermal/thermal_test.c       |  315 ++++++++++++
>  include/linux/thermal.h              |  124 +++++
>  6 files changed, 1641 insertions(+)
>  create mode 100644 Documentation/thermal/sysfs-api2.txt
>  create mode 100644 drivers/thermal/thermal_test.c
> 


^ permalink raw reply

* Re: [PATCH] PCIe/PM: Do not suspend port if any subordinate device need PME polling
From: Bjorn Helgaas @ 2012-12-20 21:44 UTC (permalink / raw)
  To: Richard Yang
  Cc: Rafael J. Wysocki, Huang Ying, linux-kernel, linux-pci, linux-pm
In-Reply-To: <50cdc89b.2788440a.0cfe.ffff8fb7SMTPIN_ADDED_BROKEN@mx.google.com>

On Sun, Dec 16, 2012 at 6:11 AM, Richard Yang
<weiyang@linux.vnet.ibm.com> wrote:
> On Sat, Dec 15, 2012 at 12:03:33AM +0100, Rafael J. Wysocki wrote:
>>On Friday, December 14, 2012 01:11:31 PM Richard Yang wrote:
>>> On Fri, Dec 14, 2012 at 10:52:11AM +0800, Huang Ying wrote:
>>> >In
>>> >
>>> >  http://www.mail-archive.com/linux-usb@vger.kernel.org/msg07976.html
>>> >
>>> >Ulrich reported that his USB3 cardreader does not work reliably when
>>> >connected to the USB3 port.  It turns out that USB3 controller failed
>>> >to be waken up when plugging in the USB3 cardreader.  Further
>>> >experiment found that the USB3 host controller can only be waken up
>>> >via polling, while not via PME interrupt.  But if the PCIe port that
>>> >the USB3 host controller is connected is suspended, we can not poll
>>> >the USB3 host controller because its config space is not accessible if
>>> >the PCIe port is put into low power state.
>>> >
>>> >To solve the issue, the PCIe port will not be suspended if any
>>> >subordinate device need PME polling.
>>> >
>>> >Reported-by: Ulrich Eckhardt <usb@uli-eckhardt.de>
>>> >Signed-off-by: Huang Ying <ying.huang@intel.com>
>>> >Tested-by: Sarah Sharp <sarah.a.sharp@linux.intel.com>
>>> >Cc: stable@vger.kernel.org  # 3.6+
>>> >---
>>> > drivers/pci/pcie/portdrv_pci.c |   18 +++++++++++++++++-
>>> > 1 file changed, 17 insertions(+), 1 deletion(-)
>>> >
>>> >--- a/drivers/pci/pcie/portdrv_pci.c
>>> >+++ b/drivers/pci/pcie/portdrv_pci.c
>>> >@@ -134,10 +134,26 @@ static int pcie_port_runtime_resume(stru
>>> >    return 0;
>>> > }
>>> >
>>> >+static int pci_dev_pme_poll(struct pci_dev *pdev, void *data)
>>> >+{
>>> >+   int *pme_poll = data;
>>> >+   *pme_poll = *pme_poll || pdev->pme_poll;
>>> >+   return 0;
>>> >+}
>>> >+
>>> > static int pcie_port_runtime_idle(struct device *dev)
>>> > {
>>> >+   struct pci_dev *pdev = to_pci_dev(dev);
>>> >+   int pme_poll = false;
>>>
>>> You want to use int or bool?
>>>
>>> I think bool is better?
>>
>>Well, bool would be nicer, but it's not a big deal.
>>
>
> Yep, not a big deal.

I fixed up the int/bool confusion and added this to my pci/for-3.8
branch.  I'll push it soon after v3.8-rc1.  Thanks!

Bjorn

^ permalink raw reply

* Re: [regression] cpuidle_get_cpu_driver livelocks idle system
From: Daniel Lezcano @ 2012-12-20 18:16 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Sivaram Nair, Russ Anderson, Peter De Schrijver,
	akpm@linux-foundation.org, shuox.liu@intel.com,
	yanmin_zhang@intel.com, linux-pm@vger.kernel.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <2092099.bR8HA2X8Pe@vostro.rjw.lan>

On 12/18/2012 12:33 AM, Rafael J. Wysocki wrote:
> On Monday, December 17, 2012 10:59:09 PM Daniel Lezcano wrote:
>> On 12/17/2012 08:36 PM, Russ Anderson wrote:
>>> The 3.7 kernel grinds to a halt on boot of a system with
>>> 2048 cpus.  NMI showed most of the cpus in
>>> _raw_spin_lock in cpuidle_get_cpu_driver().  (backtrace below)
>>>
>>> A quick look at cpuidle_get_cpu_driver() shows the hot lock.
>>>
>>> In drivers/cpuidle/driver.c:
>>> --------------------------------------------------------
>>> /**
>>>  * cpuidle_get_cpu_driver - return the driver tied with a cpu
>>>  */
>>> struct cpuidle_driver *cpuidle_get_cpu_driver(struct cpuidle_device *dev)
>>> {
>>>         struct cpuidle_driver *drv;
>>>
>>>         if (!dev)
>>>                 return NULL;
>>>
>>>         spin_lock(&cpuidle_driver_lock);
>>>         drv = __cpuidle_get_cpu_driver(dev->cpu);
>>>         spin_unlock(&cpuidle_driver_lock);
>>>
>>>         return drv;
>>> }
>>> --------------------------------------------------------
>>
>> Hi Russ,
>>
>> thanks for investigating the problem. You are right, there is a
>> bottleneck here.
>>
>> Regarding how is used the cpuidle code, I think it is safe to remove the
>> locks.
> 
> OK, a patch would be appreciated. :-)
> 
> If you prepare one, please explain in the changelog why it is safe to drop the
> locks.

Ok, sure. I have some troubles with my x86 hardware, so it could take a
couple of days before I can send a fix a bit tested.


>>> This change was added in on Nov 14th, 2012.
>>> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=bf4d1b5ddb78f86078ac6ae0415802d5f0c68f92
>>>
>>> The patch says it adds support for cpus with different characteristics,
>>> but adds a big global lock.  The comment claims "no impact for the other
>>> platforms if the option is disabled", which leads me to believe the
>>> spin_lock was added inadvertently.  CPU_IDLE_MULTIPLE_DRIVERS is off
>>> in my config file.
>>>
>>> linux$ grep CPU_IDLE_MULTIPLE_DRIVERS .config
>>> # CONFIG_CPU_IDLE_MULTIPLE_DRIVERS is not set
>>>
>>> As more cpus become idle, more cpus fight over the lock until
>>> the system livelocks on the crushing weight of idle.
>>>
>>> The fix may be to move the spin_lock into __cpuidle_get_cpu_driver,
>>> which has different versions for CONFIG_CPU_IDLE_MULTIPLE_DRIVERS,
>>> to avoid impacting the disabled case, or get rid of the spin_lock
>>> all together.
>>>
>>>
>>> --------------------------------------------------------
>>> == UV NMI process trace cpu 12: ==
>>> CPU 12
>>> Pid: 0, comm: swapper/12 Tainted: G           O 3.7.0.rja-sgi+ #38
>>> RIP: 0010:[<ffffffff81614e45>]  [<ffffffff81614e45>] _raw_spin_lock+0x25/0x30
>>> [...]
>>> Call Trace:
>>>  [<ffffffff814c891c>] cpuidle_get_cpu_driver+0x1c/0x30
>>>  [<ffffffff814c871d>] cpuidle_idle_call+0x7d/0x1b0
>>>  [<ffffffff8101d08d>] cpu_idle+0xdd/0x130
>>>  [<ffffffff8160a3ea>] start_secondary+0xc6/0xcc
>>> --------------------------------------------------------
>>>
>>
>>
>>


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

^ permalink raw reply

* RE: [PATCH 4/8] Thermal: Add Thermal_trip sysfs node
From: R, Durgadoss @ 2012-12-20 18:12 UTC (permalink / raw)
  To: Greg KH
  Cc: Zhang, Rui, linux-pm@vger.kernel.org,
	linux-kernel@vger.kernel.org, hongbo.zhang@linaro.org,
	wni@nvidia.com
In-Reply-To: <20121220175116.GB32061@kroah.com>

> -----Original Message-----
> From: Greg KH [mailto:gregkh@linuxfoundation.org]
> Sent: Thursday, December 20, 2012 11:21 PM
> To: R, Durgadoss
> Cc: Zhang, Rui; linux-pm@vger.kernel.org; linux-kernel@vger.kernel.org;
> hongbo.zhang@linaro.org; wni@nvidia.com
> Subject: Re: [PATCH 4/8] Thermal: Add Thermal_trip sysfs node
> 
> On Thu, Dec 20, 2012 at 04:58:32PM +0000, R, Durgadoss wrote:
> > > -----Original Message-----
> > > From: Greg KH [mailto:gregkh@linuxfoundation.org]
> > > Sent: Thursday, December 20, 2012 10:09 PM
> > > To: R, Durgadoss
> > > Cc: Zhang, Rui; linux-pm@vger.kernel.org; linux-kernel@vger.kernel.org;
> > > hongbo.zhang@linaro.org; wni@nvidia.com
> > > Subject: Re: [PATCH 4/8] Thermal: Add Thermal_trip sysfs node
> > >
> > > On Thu, Dec 20, 2012 at 04:25:32PM +0000, R, Durgadoss wrote:
> > > >
> > > > > -----Original Message-----
> > > > > From: Greg KH [mailto:gregkh@linuxfoundation.org]
> > > > > Sent: Thursday, December 20, 2012 9:42 PM
> > > > > To: R, Durgadoss
> > > > > Cc: Zhang, Rui; linux-pm@vger.kernel.org; linux-
> kernel@vger.kernel.org;
> > > > > hongbo.zhang@linaro.org; wni@nvidia.com
> > > > > Subject: Re: [PATCH 4/8] Thermal: Add Thermal_trip sysfs node
> > > > >
> > > > > On Thu, Dec 20, 2012 at 07:52:03AM +0000, R, Durgadoss wrote:
> > > > > > > On Tue, Dec 18, 2012 at 02:59:33PM +0530, Durgadoss R wrote:
> > > > > > > > This patch adds a thermal_trip directory under
> > > > > > > > /sys/class/thermal/zoneX. This directory contains
> > > > > > > > the trip point values for sensors bound to this
> > > > > > > > zone.
> > > > > > >
> > > > > > > Eeek, you just broke userspace tools that now can no longer see
> > > these
> > > > > > > entries :(
> > > > > > >
> > > > > > > Why do you need to create a subdirectory?  As you found out,
> doing
> > > so
> > > > > > > isn't the easiest, right?  That is on purpose.
> > > > > >
> > > > > > Yes, I observed the complexity.
> > > > > >
> > > > > > >
> > > > > > > I really wouldn't recommend doing this at all, please stick within
> the
> > > > > > > 'struct device' framework here, don't create new kobjects and
> hang
> > > sysfs
> > > > > > > files off of them.
> > > > > >
> > > > > > But, we cannot put all _trip directly under ZoneX directory.
> > > > >
> > > > > Why not?  What is preventing this?
> > > > >
> > > > > > We can remove the thermal_trip directory, and put sensorY_trip
> under
> > > > > > /sys/class/thermal/zoneX/.  But this sensorY_trip needs to be a
> > > > > > directory which has four sysfs nodes named, active, passive, crit,
> > > > > > hot.
> > > > > >
> > > > > > Rui, What do you think about this ?
> > > > > >
> > > > > > The only other way I see, is directly put
> > > > > sensorY_trip_[active/passive/hot/crit]
> > > > > > which will create way too many nodes, under
> > > /sys/class/thermal/zoneX/.
> > > > >
> > > > > What is "too many"?  20000?  50000?  How many are we talking about
> > > here?
> > > >
> > > > Not in 1000's though..
> > > >
> > > > > What is the limiting factor that is preventing this from all going into
> > > > > one directory?
> > > >
> > > > We support a MAX of 12 sensors per zone today, which will lead to
> > > > 12 * 4, 48 nodes under this directory named
> > > > sensorY_trip_[active/passive/hot/crit], besides the other nodes.
> > >
> > > That's fine, we can easily support that many files, have you tried this
> > > already?
> >
> > Yes, in fact, this is sort of what was the old implementation..
> > although with different sysfs nodes.
> 
> What "old" implementation, one that is in-kernel?  Are you changing the
> user interface here?

Sorry, I should have used better wordings ;(
[s/old/existing]
There are other sysfs nodes following the correct convention under
/sys/class/thermal/, which is what I was mentioning.

No, we are not changing the user interface, in these patches.

Thanks,
Durga

^ permalink raw reply

* Re: [PATCH 4/8] Thermal: Add Thermal_trip sysfs node
From: Greg KH @ 2012-12-20 17:51 UTC (permalink / raw)
  To: R, Durgadoss
  Cc: Zhang, Rui, linux-pm@vger.kernel.org,
	linux-kernel@vger.kernel.org, hongbo.zhang@linaro.org,
	wni@nvidia.com
In-Reply-To: <4D68720C2E767A4AA6A8796D42C8EB5924CF67@BGSMSX101.gar.corp.intel.com>

On Thu, Dec 20, 2012 at 04:58:32PM +0000, R, Durgadoss wrote:
> > -----Original Message-----
> > From: Greg KH [mailto:gregkh@linuxfoundation.org]
> > Sent: Thursday, December 20, 2012 10:09 PM
> > To: R, Durgadoss
> > Cc: Zhang, Rui; linux-pm@vger.kernel.org; linux-kernel@vger.kernel.org;
> > hongbo.zhang@linaro.org; wni@nvidia.com
> > Subject: Re: [PATCH 4/8] Thermal: Add Thermal_trip sysfs node
> > 
> > On Thu, Dec 20, 2012 at 04:25:32PM +0000, R, Durgadoss wrote:
> > >
> > > > -----Original Message-----
> > > > From: Greg KH [mailto:gregkh@linuxfoundation.org]
> > > > Sent: Thursday, December 20, 2012 9:42 PM
> > > > To: R, Durgadoss
> > > > Cc: Zhang, Rui; linux-pm@vger.kernel.org; linux-kernel@vger.kernel.org;
> > > > hongbo.zhang@linaro.org; wni@nvidia.com
> > > > Subject: Re: [PATCH 4/8] Thermal: Add Thermal_trip sysfs node
> > > >
> > > > On Thu, Dec 20, 2012 at 07:52:03AM +0000, R, Durgadoss wrote:
> > > > > > On Tue, Dec 18, 2012 at 02:59:33PM +0530, Durgadoss R wrote:
> > > > > > > This patch adds a thermal_trip directory under
> > > > > > > /sys/class/thermal/zoneX. This directory contains
> > > > > > > the trip point values for sensors bound to this
> > > > > > > zone.
> > > > > >
> > > > > > Eeek, you just broke userspace tools that now can no longer see
> > these
> > > > > > entries :(
> > > > > >
> > > > > > Why do you need to create a subdirectory?  As you found out, doing
> > so
> > > > > > isn't the easiest, right?  That is on purpose.
> > > > >
> > > > > Yes, I observed the complexity.
> > > > >
> > > > > >
> > > > > > I really wouldn't recommend doing this at all, please stick within the
> > > > > > 'struct device' framework here, don't create new kobjects and hang
> > sysfs
> > > > > > files off of them.
> > > > >
> > > > > But, we cannot put all _trip directly under ZoneX directory.
> > > >
> > > > Why not?  What is preventing this?
> > > >
> > > > > We can remove the thermal_trip directory, and put sensorY_trip under
> > > > > /sys/class/thermal/zoneX/.  But this sensorY_trip needs to be a
> > > > > directory which has four sysfs nodes named, active, passive, crit,
> > > > > hot.
> > > > >
> > > > > Rui, What do you think about this ?
> > > > >
> > > > > The only other way I see, is directly put
> > > > sensorY_trip_[active/passive/hot/crit]
> > > > > which will create way too many nodes, under
> > /sys/class/thermal/zoneX/.
> > > >
> > > > What is "too many"?  20000?  50000?  How many are we talking about
> > here?
> > >
> > > Not in 1000's though..
> > >
> > > > What is the limiting factor that is preventing this from all going into
> > > > one directory?
> > >
> > > We support a MAX of 12 sensors per zone today, which will lead to
> > > 12 * 4, 48 nodes under this directory named
> > > sensorY_trip_[active/passive/hot/crit], besides the other nodes.
> > 
> > That's fine, we can easily support that many files, have you tried this
> > already?
> 
> Yes, in fact, this is sort of what was the old implementation..
> although with different sysfs nodes.

What "old" implementation, one that is in-kernel?  Are you changing the
user interface here?

thanks,

greg k-h

^ permalink raw reply

* RE: [PATCH 4/8] Thermal: Add Thermal_trip sysfs node
From: R, Durgadoss @ 2012-12-20 16:58 UTC (permalink / raw)
  To: Greg KH
  Cc: Zhang, Rui, linux-pm@vger.kernel.org,
	linux-kernel@vger.kernel.org, hongbo.zhang@linaro.org,
	wni@nvidia.com
In-Reply-To: <20121220163839.GA30120@kroah.com>

> -----Original Message-----
> From: Greg KH [mailto:gregkh@linuxfoundation.org]
> Sent: Thursday, December 20, 2012 10:09 PM
> To: R, Durgadoss
> Cc: Zhang, Rui; linux-pm@vger.kernel.org; linux-kernel@vger.kernel.org;
> hongbo.zhang@linaro.org; wni@nvidia.com
> Subject: Re: [PATCH 4/8] Thermal: Add Thermal_trip sysfs node
> 
> On Thu, Dec 20, 2012 at 04:25:32PM +0000, R, Durgadoss wrote:
> >
> > > -----Original Message-----
> > > From: Greg KH [mailto:gregkh@linuxfoundation.org]
> > > Sent: Thursday, December 20, 2012 9:42 PM
> > > To: R, Durgadoss
> > > Cc: Zhang, Rui; linux-pm@vger.kernel.org; linux-kernel@vger.kernel.org;
> > > hongbo.zhang@linaro.org; wni@nvidia.com
> > > Subject: Re: [PATCH 4/8] Thermal: Add Thermal_trip sysfs node
> > >
> > > On Thu, Dec 20, 2012 at 07:52:03AM +0000, R, Durgadoss wrote:
> > > > > On Tue, Dec 18, 2012 at 02:59:33PM +0530, Durgadoss R wrote:
> > > > > > This patch adds a thermal_trip directory under
> > > > > > /sys/class/thermal/zoneX. This directory contains
> > > > > > the trip point values for sensors bound to this
> > > > > > zone.
> > > > >
> > > > > Eeek, you just broke userspace tools that now can no longer see
> these
> > > > > entries :(
> > > > >
> > > > > Why do you need to create a subdirectory?  As you found out, doing
> so
> > > > > isn't the easiest, right?  That is on purpose.
> > > >
> > > > Yes, I observed the complexity.
> > > >
> > > > >
> > > > > I really wouldn't recommend doing this at all, please stick within the
> > > > > 'struct device' framework here, don't create new kobjects and hang
> sysfs
> > > > > files off of them.
> > > >
> > > > But, we cannot put all _trip directly under ZoneX directory.
> > >
> > > Why not?  What is preventing this?
> > >
> > > > We can remove the thermal_trip directory, and put sensorY_trip under
> > > > /sys/class/thermal/zoneX/.  But this sensorY_trip needs to be a
> > > > directory which has four sysfs nodes named, active, passive, crit,
> > > > hot.
> > > >
> > > > Rui, What do you think about this ?
> > > >
> > > > The only other way I see, is directly put
> > > sensorY_trip_[active/passive/hot/crit]
> > > > which will create way too many nodes, under
> /sys/class/thermal/zoneX/.
> > >
> > > What is "too many"?  20000?  50000?  How many are we talking about
> here?
> >
> > Not in 1000's though..
> >
> > > What is the limiting factor that is preventing this from all going into
> > > one directory?
> >
> > We support a MAX of 12 sensors per zone today, which will lead to
> > 12 * 4, 48 nodes under this directory named
> > sensorY_trip_[active/passive/hot/crit], besides the other nodes.
> 
> That's fine, we can easily support that many files, have you tried this
> already?

Yes, in fact, this is sort of what was the old implementation..
although with different sysfs nodes.

> 
> The main point is, if you use a kobject like you are, userspace tools
> can't "see" these directories and files easily, if at all.  Try it out
> with libudev yourself to verify it, the attributes will not show up as
> owned to that device like you need them to be.

I haven't used libudev exactly, but I realized this sort of thing,
when I was trying to catch UEvents on this device path.

I will give libudev a try..

> 
> So put them all in one directory, we can handle 10's of thousands of
> files quite easily, so 48 is trivial :)

Okay, Will make it this way :-)
Now I can see the implementation getting much simpler !!

Thank you Greg,
Durga
P.S: I should thank you for this file(samples/kobject/kobject-example.c) also,
from where I got how to get this implementation done :-)

^ permalink raw reply

* Re: [PATCH 4/8] Thermal: Add Thermal_trip sysfs node
From: Greg KH @ 2012-12-20 16:38 UTC (permalink / raw)
  To: R, Durgadoss
  Cc: Zhang, Rui, linux-pm@vger.kernel.org,
	linux-kernel@vger.kernel.org, hongbo.zhang@linaro.org,
	wni@nvidia.com
In-Reply-To: <4D68720C2E767A4AA6A8796D42C8EB5924CF1F@BGSMSX101.gar.corp.intel.com>

On Thu, Dec 20, 2012 at 04:25:32PM +0000, R, Durgadoss wrote:
> 
> > -----Original Message-----
> > From: Greg KH [mailto:gregkh@linuxfoundation.org]
> > Sent: Thursday, December 20, 2012 9:42 PM
> > To: R, Durgadoss
> > Cc: Zhang, Rui; linux-pm@vger.kernel.org; linux-kernel@vger.kernel.org;
> > hongbo.zhang@linaro.org; wni@nvidia.com
> > Subject: Re: [PATCH 4/8] Thermal: Add Thermal_trip sysfs node
> > 
> > On Thu, Dec 20, 2012 at 07:52:03AM +0000, R, Durgadoss wrote:
> > > > On Tue, Dec 18, 2012 at 02:59:33PM +0530, Durgadoss R wrote:
> > > > > This patch adds a thermal_trip directory under
> > > > > /sys/class/thermal/zoneX. This directory contains
> > > > > the trip point values for sensors bound to this
> > > > > zone.
> > > >
> > > > Eeek, you just broke userspace tools that now can no longer see these
> > > > entries :(
> > > >
> > > > Why do you need to create a subdirectory?  As you found out, doing so
> > > > isn't the easiest, right?  That is on purpose.
> > >
> > > Yes, I observed the complexity.
> > >
> > > >
> > > > I really wouldn't recommend doing this at all, please stick within the
> > > > 'struct device' framework here, don't create new kobjects and hang sysfs
> > > > files off of them.
> > >
> > > But, we cannot put all _trip directly under ZoneX directory.
> > 
> > Why not?  What is preventing this?
> > 
> > > We can remove the thermal_trip directory, and put sensorY_trip under
> > > /sys/class/thermal/zoneX/.  But this sensorY_trip needs to be a
> > > directory which has four sysfs nodes named, active, passive, crit,
> > > hot.
> > >
> > > Rui, What do you think about this ?
> > >
> > > The only other way I see, is directly put
> > sensorY_trip_[active/passive/hot/crit]
> > > which will create way too many nodes, under /sys/class/thermal/zoneX/.
> > 
> > What is "too many"?  20000?  50000?  How many are we talking about here?
> 
> Not in 1000's though..
> 
> > What is the limiting factor that is preventing this from all going into
> > one directory?
> 
> We support a MAX of 12 sensors per zone today, which will lead to
> 12 * 4, 48 nodes under this directory named
> sensorY_trip_[active/passive/hot/crit], besides the other nodes.

That's fine, we can easily support that many files, have you tried this
already?

The main point is, if you use a kobject like you are, userspace tools
can't "see" these directories and files easily, if at all.  Try it out
with libudev yourself to verify it, the attributes will not show up as
owned to that device like you need them to be.

So put them all in one directory, we can handle 10's of thousands of
files quite easily, so 48 is trivial :)

thanks,

greg k-h

^ permalink raw reply

* RE: [PATCH 4/8] Thermal: Add Thermal_trip sysfs node
From: R, Durgadoss @ 2012-12-20 16:25 UTC (permalink / raw)
  To: Greg KH
  Cc: Zhang, Rui, linux-pm@vger.kernel.org,
	linux-kernel@vger.kernel.org, hongbo.zhang@linaro.org,
	wni@nvidia.com
In-Reply-To: <20121220161202.GA29796@kroah.com>


> -----Original Message-----
> From: Greg KH [mailto:gregkh@linuxfoundation.org]
> Sent: Thursday, December 20, 2012 9:42 PM
> To: R, Durgadoss
> Cc: Zhang, Rui; linux-pm@vger.kernel.org; linux-kernel@vger.kernel.org;
> hongbo.zhang@linaro.org; wni@nvidia.com
> Subject: Re: [PATCH 4/8] Thermal: Add Thermal_trip sysfs node
> 
> On Thu, Dec 20, 2012 at 07:52:03AM +0000, R, Durgadoss wrote:
> > > On Tue, Dec 18, 2012 at 02:59:33PM +0530, Durgadoss R wrote:
> > > > This patch adds a thermal_trip directory under
> > > > /sys/class/thermal/zoneX. This directory contains
> > > > the trip point values for sensors bound to this
> > > > zone.
> > >
> > > Eeek, you just broke userspace tools that now can no longer see these
> > > entries :(
> > >
> > > Why do you need to create a subdirectory?  As you found out, doing so
> > > isn't the easiest, right?  That is on purpose.
> >
> > Yes, I observed the complexity.
> >
> > >
> > > I really wouldn't recommend doing this at all, please stick within the
> > > 'struct device' framework here, don't create new kobjects and hang sysfs
> > > files off of them.
> >
> > But, we cannot put all _trip directly under ZoneX directory.
> 
> Why not?  What is preventing this?
> 
> > We can remove the thermal_trip directory, and put sensorY_trip under
> > /sys/class/thermal/zoneX/.  But this sensorY_trip needs to be a
> > directory which has four sysfs nodes named, active, passive, crit,
> > hot.
> >
> > Rui, What do you think about this ?
> >
> > The only other way I see, is directly put
> sensorY_trip_[active/passive/hot/crit]
> > which will create way too many nodes, under /sys/class/thermal/zoneX/.
> 
> What is "too many"?  20000?  50000?  How many are we talking about here?

Not in 1000's though..

> What is the limiting factor that is preventing this from all going into
> one directory?

We support a MAX of 12 sensors per zone today, which will lead to
12 * 4, 48 nodes under this directory named
sensorY_trip_[active/passive/hot/crit], besides the other nodes.

Thanks,
Durga

^ permalink raw reply

* Re: [PATCH 4/8] Thermal: Add Thermal_trip sysfs node
From: Greg KH @ 2012-12-20 16:12 UTC (permalink / raw)
  To: R, Durgadoss
  Cc: Zhang, Rui, linux-pm@vger.kernel.org,
	linux-kernel@vger.kernel.org, hongbo.zhang@linaro.org,
	wni@nvidia.com
In-Reply-To: <4D68720C2E767A4AA6A8796D42C8EB5924CB6C@BGSMSX101.gar.corp.intel.com>

On Thu, Dec 20, 2012 at 07:52:03AM +0000, R, Durgadoss wrote:
> > On Tue, Dec 18, 2012 at 02:59:33PM +0530, Durgadoss R wrote:
> > > This patch adds a thermal_trip directory under
> > > /sys/class/thermal/zoneX. This directory contains
> > > the trip point values for sensors bound to this
> > > zone.
> > 
> > Eeek, you just broke userspace tools that now can no longer see these
> > entries :(
> > 
> > Why do you need to create a subdirectory?  As you found out, doing so
> > isn't the easiest, right?  That is on purpose.
> 
> Yes, I observed the complexity.
> 
> > 
> > I really wouldn't recommend doing this at all, please stick within the
> > 'struct device' framework here, don't create new kobjects and hang sysfs
> > files off of them.
> 
> But, we cannot put all _trip directly under ZoneX directory.

Why not?  What is preventing this?

> We can remove the thermal_trip directory, and put sensorY_trip under
> /sys/class/thermal/zoneX/.  But this sensorY_trip needs to be a
> directory which has four sysfs nodes named, active, passive, crit,
> hot.
> 
> Rui, What do you think about this ?
> 
> The only other way I see, is directly put sensorY_trip_[active/passive/hot/crit]
> which will create way too many nodes, under /sys/class/thermal/zoneX/.

What is "too many"?  20000?  50000?  How many are we talking about here?
What is the limiting factor that is preventing this from all going into
one directory?

thanks,

greg k-h

^ permalink raw reply

* Re: [RFC PATCH v4 1/9] CPU hotplug: Provide APIs to prevent CPU offline from atomic context
From: Srivatsa S. Bhat @ 2012-12-20 14:06 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: tglx, peterz, paulmck, rusty, mingo, akpm, namhyung,
	vincent.guittot, tj, sbw, amit.kucheria, rostedt, rjw, wangyun,
	xiaoguangrong, nikunj, linux-pm, linux-kernel
In-Reply-To: <20121220134203.GB10813@redhat.com>

On 12/20/2012 07:12 PM, Oleg Nesterov wrote:
> On 12/20, Srivatsa S. Bhat wrote:
>>
>> On 12/20/2012 12:44 AM, Oleg Nesterov wrote:
>>>
>>> We need 2 helpers for writer, the 1st one does synchronize_sched() and the
>>> 2nd one takes rwlock. A generic percpu_write_lock() simply calls them both.
>>>
>>
>> Ah, that's the problem no? Users of reader-writer locks expect to run in
>> atomic context (ie., they don't want to sleep).
> 
> Ah, I misunderstood.
> 
> Sure, percpu_write_lock() should be might_sleep(), and this is not
> symmetric to percpu_read_lock().
> 
>> We can't expose an API that
>> can make the task go to sleep under the covers!
> 
> Why? Just this should be documented. However I would not worry until we
> find another user. Until then we do not even need to add percpu_write_lock
> or try to generalize this code too much.
> 

Hmm.. But considering the disable_nonboot_cpus() case you mentioned below, I'm
only getting farther away from using synchronize_sched() ;-) And that also makes
it easier to expose a generic percpu rwlock API, like Tejun was suggesting.
So I'll give it a shot.

>>> To me, the main question is: can we use synchronize_sched() in cpu_down?
>>> It is slow.
>>>
>>
>> Haha :-) So we don't want smp_mb() in the reader,
> 
> We need mb() + rmb(). Plust cli/sti unless this arch has optimized
> this_cpu_add() like x86 (as you pointed out).
> 
>> *and* also don't want
>> synchronize_sched() in the writer! Sounds like saying we want to have the cake
>> and eat it too ;-) :P
> 
> Personally I'd vote for synchronize_sched() but I am not sure. And I do
> not really understand the problem space.
> 
>> And moreover, since I'm still not convinced about the writer API part if use
>> synchronize_sched(), I'd rather avoid synchronize_sched().)
> 
> Understand.
> 
> And yes, synchronize_sched() adds more problems. For example, where should
> we call it? I do not this _cpu_down() should do this, in this case, say,
> disable_nonboot_cpus() needs num_online_cpus() synchronize_sched's.
> 

Ouch! I should have seen that coming!

> So probably cpu_down() should call it before cpu_maps_update_begin(), this
> makes the locking even less obvious.
> 

True.

> In short. What I am trying to say is, don't ask me I do not know ;)
>

OK then, I'll go with what I believe is a reasonably good way (not necessarily
the best way) to deal with this:

I'll avoid the use of synchronize_sched(), expose a decent-looking percpu
rwlock implementation, use it in CPU hotplug and get rid of stop_machine().
That would certainly be a good starting base, IMHO.

Regards,
Srivatsa S. Bhat


^ permalink raw reply

* Re: [RFC PATCH v4 1/9] CPU hotplug: Provide APIs to prevent CPU offline from atomic context
From: Oleg Nesterov @ 2012-12-20 13:42 UTC (permalink / raw)
  To: Srivatsa S. Bhat
  Cc: tglx, peterz, paulmck, rusty, mingo, akpm, namhyung,
	vincent.guittot, tj, sbw, amit.kucheria, rostedt, rjw, wangyun,
	xiaoguangrong, nikunj, linux-pm, linux-kernel
In-Reply-To: <50D21A5F.4040604@linux.vnet.ibm.com>

On 12/20, Srivatsa S. Bhat wrote:
>
> On 12/20/2012 12:44 AM, Oleg Nesterov wrote:
> >
> > We need 2 helpers for writer, the 1st one does synchronize_sched() and the
> > 2nd one takes rwlock. A generic percpu_write_lock() simply calls them both.
> >
>
> Ah, that's the problem no? Users of reader-writer locks expect to run in
> atomic context (ie., they don't want to sleep).

Ah, I misunderstood.

Sure, percpu_write_lock() should be might_sleep(), and this is not
symmetric to percpu_read_lock().

> We can't expose an API that
> can make the task go to sleep under the covers!

Why? Just this should be documented. However I would not worry until we
find another user. Until then we do not even need to add percpu_write_lock
or try to generalize this code too much.

> > To me, the main question is: can we use synchronize_sched() in cpu_down?
> > It is slow.
> >
>
> Haha :-) So we don't want smp_mb() in the reader,

We need mb() + rmb(). Plust cli/sti unless this arch has optimized
this_cpu_add() like x86 (as you pointed out).

> *and* also don't want
> synchronize_sched() in the writer! Sounds like saying we want to have the cake
> and eat it too ;-) :P

Personally I'd vote for synchronize_sched() but I am not sure. And I do
not really understand the problem space.

> And moreover, since I'm still not convinced about the writer API part if use
> synchronize_sched(), I'd rather avoid synchronize_sched().)

Understand.

And yes, synchronize_sched() adds more problems. For example, where should
we call it? I do not this _cpu_down() should do this, in this case, say,
disable_nonboot_cpus() needs num_online_cpus() synchronize_sched's.

So probably cpu_down() should call it before cpu_maps_update_begin(), this
makes the locking even less obvious.

In short. What I am trying to say is, don't ask me I do not know ;)

Oleg.


^ permalink raw reply

* RE: [PATCH 4/8] Thermal: Add Thermal_trip sysfs node
From: R, Durgadoss @ 2012-12-20  7:52 UTC (permalink / raw)
  To: Greg KH
  Cc: Zhang, Rui, linux-pm@vger.kernel.org,
	linux-kernel@vger.kernel.org, hongbo.zhang@linaro.org,
	wni@nvidia.com
In-Reply-To: <20121220054201.GE28007@kroah.com>

Hi Greg,

Thank you for looking at this.

> -----Original Message-----
> From: Greg KH [mailto:gregkh@linuxfoundation.org]
> Sent: Thursday, December 20, 2012 11:12 AM
> To: R, Durgadoss
> Cc: Zhang, Rui; linux-pm@vger.kernel.org; linux-kernel@vger.kernel.org;
> hongbo.zhang@linaro.org; wni@nvidia.com
> Subject: Re: [PATCH 4/8] Thermal: Add Thermal_trip sysfs node
> 
> On Tue, Dec 18, 2012 at 02:59:33PM +0530, Durgadoss R wrote:
> > This patch adds a thermal_trip directory under
> > /sys/class/thermal/zoneX. This directory contains
> > the trip point values for sensors bound to this
> > zone.
> 
> Eeek, you just broke userspace tools that now can no longer see these
> entries :(
> 
> Why do you need to create a subdirectory?  As you found out, doing so
> isn't the easiest, right?  That is on purpose.

Yes, I observed the complexity.

> 
> I really wouldn't recommend doing this at all, please stick within the
> 'struct device' framework here, don't create new kobjects and hang sysfs
> files off of them.

But, we cannot put all _trip directly under ZoneX directory. We can remove the
thermal_trip directory, and put sensorY_trip under /sys/class/thermal/zoneX/.
But this sensorY_trip needs to be a directory which has four sysfs nodes named,
active, passive, crit, hot.

Rui, What do you think about this ?

The only other way I see, is directly put sensorY_trip_[active/passive/hot/crit]
which will create way too many nodes, under /sys/class/thermal/zoneX/.

Thanks,
Durga

> 
> thanks,
> 
> greg k-h

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox