linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] opp: convert dev_warn() to dev_dbg() for duplicate OPPs
@ 2014-11-17  8:08 Viresh Kumar
  2014-11-17 19:32 ` Stefan Wahren
  2014-11-17 23:39 ` Rafael J. Wysocki
  0 siblings, 2 replies; 14+ messages in thread
From: Viresh Kumar @ 2014-11-17  8:08 UTC (permalink / raw)
  To: linux-arm-kernel

Giving a warning in case we add duplicate OPPs doesn't workout that great. For
example just playing with cpufreq-dt driver as a module results in this:

$ modprobe cpufreq-dt
$ modprobe -r cpufreq-dt
$ modprobe cpufreq-dt

cpu cpu0: dev_pm_opp_add: duplicate OPPs detected. Existing:
freq: 261819000, volt: 1350000, enabled: 1. New: freq: 261819000, volt: 1350000,
enabled: 1
cpu cpu0: dev_pm_opp_add: duplicate OPPs detected. Existing:
freq: 360000000, volt: 1350000, enabled: 1. New: freq: 360000000, volt: 1350000,
enabled: 1
cpu cpu0: dev_pm_opp_add: duplicate OPPs detected. Existing:
freq: 392728000, volt: 1450000, enabled: 1. New: freq: 392728000, volt: 1450000,
enabled: 1
cpu cpu0: dev_pm_opp_add: duplicate OPPs detected. Existing:
freq: 454737000, volt: 1550000, enabled: 1. New: freq: 454737000, volt: 1550000,
enabled: 1

This happens because we don't destroy OPPs (created during ->init()) while
unloading modules.

Now the question is: Should we destroy these OPPs?

Logically kernel drivers *must* free resources they acquired. But in this
particular case, the OPPs are created using a static list present in device
tree. Destroying and then allocating them again isn't of much benefit. The only
benefit of removing OPPs is to save some space if the driver isn't loaded again.

This has its own complications. OPPs can be created either from DT (static) or
platform code (dynamic). Driver should only remove static OPPs and not the
dynamic ones as they are controlled from platform code. But there is no field in
'struct dev_pm_opp' which has this information to distinguish between different
kind of OPPs.

Because of all this, I wasn't sure if drivers should remove static OPPs during
their removal. And so just fixing the reported issue by issuing a dev_dbg()
instead of dev_warn().

Reported-by: Stefan Wahren <stefan.wahren@i2se.com>
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/base/power/opp.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/base/power/opp.c b/drivers/base/power/opp.c
index 89ced95..490e9db 100644
--- a/drivers/base/power/opp.c
+++ b/drivers/base/power/opp.c
@@ -466,9 +466,9 @@ int dev_pm_opp_add(struct device *dev, unsigned long freq, unsigned long u_volt)
 		int ret = opp->available && new_opp->u_volt == opp->u_volt ?
 			0 : -EEXIST;
 
-		dev_warn(dev, "%s: duplicate OPPs detected. Existing: freq: %lu, volt: %lu, enabled: %d. New: freq: %lu, volt: %lu, enabled: %d\n",
-			 __func__, opp->rate, opp->u_volt, opp->available,
-			 new_opp->rate, new_opp->u_volt, new_opp->available);
+		dev_dbg(dev, "%s: duplicate OPPs detected. Existing: freq: %lu, volt: %lu, enabled: %d. New: freq: %lu, volt: %lu, enabled: %d\n",
+			__func__, opp->rate, opp->u_volt, opp->available,
+			new_opp->rate, new_opp->u_volt, new_opp->available);
 		mutex_unlock(&dev_opp_list_lock);
 		kfree(new_opp);
 		return ret;
-- 
2.0.3.693.g996b0fd

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH] opp: convert dev_warn() to dev_dbg() for duplicate OPPs
  2014-11-17  8:08 [PATCH] opp: convert dev_warn() to dev_dbg() for duplicate OPPs Viresh Kumar
@ 2014-11-17 19:32 ` Stefan Wahren
  2014-11-17 23:39 ` Rafael J. Wysocki
  1 sibling, 0 replies; 14+ messages in thread
From: Stefan Wahren @ 2014-11-17 19:32 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Viresh,

> Viresh Kumar <viresh.kumar@linaro.org> hat am 17. November 2014 um 09:08
> geschrieben:
>
> [...]
>
> Now the question is: Should we destroy these OPPs?
>
> Logically kernel drivers *must* free resources they acquired. But in this
> particular case, the OPPs are created using a static list present in device
> tree. Destroying and then allocating them again isn't of much benefit. The
> only
> benefit of removing OPPs is to save some space if the driver isn't loaded
> again.
>
> This has its own complications. OPPs can be created either from DT (static) or
> platform code (dynamic). Driver should only remove static OPPs and not the
> dynamic ones as they are controlled from platform code. But there is no field
> in
> 'struct dev_pm_opp' which has this information to distinguish between
> different
> kind of OPPs.
>
> Because of all this, I wasn't sure if drivers should remove static OPPs during
> their removal. And so just fixing the reported issue by issuing a dev_dbg()
> instead of dev_warn().

how about adding some kind of FIXME?

beside that

Tested-by: Stefan Wahren <stefan.wahren@i2se.com>

Thanks

Stefan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] opp: convert dev_warn() to dev_dbg() for duplicate OPPs
  2014-11-17  8:08 [PATCH] opp: convert dev_warn() to dev_dbg() for duplicate OPPs Viresh Kumar
  2014-11-17 19:32 ` Stefan Wahren
@ 2014-11-17 23:39 ` Rafael J. Wysocki
  2014-11-18  3:08   ` Viresh Kumar
  1 sibling, 1 reply; 14+ messages in thread
From: Rafael J. Wysocki @ 2014-11-17 23:39 UTC (permalink / raw)
  To: linux-arm-kernel

On Monday, November 17, 2014 01:38:00 PM Viresh Kumar wrote:
> Giving a warning in case we add duplicate OPPs doesn't workout that great. For
> example just playing with cpufreq-dt driver as a module results in this:
> 
> $ modprobe cpufreq-dt
> $ modprobe -r cpufreq-dt
> $ modprobe cpufreq-dt
> 
> cpu cpu0: dev_pm_opp_add: duplicate OPPs detected. Existing:
> freq: 261819000, volt: 1350000, enabled: 1. New: freq: 261819000, volt: 1350000,
> enabled: 1
> cpu cpu0: dev_pm_opp_add: duplicate OPPs detected. Existing:
> freq: 360000000, volt: 1350000, enabled: 1. New: freq: 360000000, volt: 1350000,
> enabled: 1
> cpu cpu0: dev_pm_opp_add: duplicate OPPs detected. Existing:
> freq: 392728000, volt: 1450000, enabled: 1. New: freq: 392728000, volt: 1450000,
> enabled: 1
> cpu cpu0: dev_pm_opp_add: duplicate OPPs detected. Existing:
> freq: 454737000, volt: 1550000, enabled: 1. New: freq: 454737000, volt: 1550000,
> enabled: 1
> 
> This happens because we don't destroy OPPs (created during ->init()) while
> unloading modules.
> 
> Now the question is: Should we destroy these OPPs?
> 
> Logically kernel drivers *must* free resources they acquired. But in this
> particular case, the OPPs are created using a static list present in device
> tree. Destroying and then allocating them again isn't of much benefit. The only
> benefit of removing OPPs is to save some space if the driver isn't loaded again.
> 
> This has its own complications. OPPs can be created either from DT (static) or
> platform code (dynamic). Driver should only remove static OPPs and not the
> dynamic ones as they are controlled from platform code. But there is no field in
> 'struct dev_pm_opp' which has this information to distinguish between different
> kind of OPPs.
> 
> Because of all this, I wasn't sure if drivers should remove static OPPs during
> their removal. And so just fixing the reported issue by issuing a dev_dbg()
> instead of dev_warn().
> 
> Reported-by: Stefan Wahren <stefan.wahren@i2se.com>
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
>  drivers/base/power/opp.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/base/power/opp.c b/drivers/base/power/opp.c
> index 89ced95..490e9db 100644
> --- a/drivers/base/power/opp.c
> +++ b/drivers/base/power/opp.c
> @@ -466,9 +466,9 @@ int dev_pm_opp_add(struct device *dev, unsigned long freq, unsigned long u_volt)
>  		int ret = opp->available && new_opp->u_volt == opp->u_volt ?
>  			0 : -EEXIST;
>  
> -		dev_warn(dev, "%s: duplicate OPPs detected. Existing: freq: %lu, volt: %lu, enabled: %d. New: freq: %lu, volt: %lu, enabled: %d\n",
> -			 __func__, opp->rate, opp->u_volt, opp->available,
> -			 new_opp->rate, new_opp->u_volt, new_opp->available);
> +		dev_dbg(dev, "%s: duplicate OPPs detected. Existing: freq: %lu, volt: %lu, enabled: %d. New: freq: %lu, volt: %lu, enabled: %d\n",
> +			__func__, opp->rate, opp->u_volt, opp->available,
> +			new_opp->rate, new_opp->u_volt, new_opp->available);
>  		mutex_unlock(&dev_opp_list_lock);
>  		kfree(new_opp);
>  		return ret;

Don't you think that this may hide real bugs?

Rafael

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] opp: convert dev_warn() to dev_dbg() for duplicate OPPs
  2014-11-17 23:39 ` Rafael J. Wysocki
@ 2014-11-18  3:08   ` Viresh Kumar
  2014-11-18 20:51     ` Rafael J. Wysocki
  0 siblings, 1 reply; 14+ messages in thread
From: Viresh Kumar @ 2014-11-18  3:08 UTC (permalink / raw)
  To: linux-arm-kernel

On 18 November 2014 05:09, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Monday, November 17, 2014 01:38:00 PM Viresh Kumar wrote:
>> diff --git a/drivers/base/power/opp.c b/drivers/base/power/opp.c
>> index 89ced95..490e9db 100644
>> --- a/drivers/base/power/opp.c
>> +++ b/drivers/base/power/opp.c
>> @@ -466,9 +466,9 @@ int dev_pm_opp_add(struct device *dev, unsigned long freq, unsigned long u_volt)
>>               int ret = opp->available && new_opp->u_volt == opp->u_volt ?
>>                       0 : -EEXIST;
>>
>> -             dev_warn(dev, "%s: duplicate OPPs detected. Existing: freq: %lu, volt: %lu, enabled: %d. New: freq: %lu, volt: %lu, enabled: %d\n",
>> -                      __func__, opp->rate, opp->u_volt, opp->available,
>> -                      new_opp->rate, new_opp->u_volt, new_opp->available);
>> +             dev_dbg(dev, "%s: duplicate OPPs detected. Existing: freq: %lu, volt: %lu, enabled: %d. New: freq: %lu, volt: %lu, enabled: %d\n",
>> +                     __func__, opp->rate, opp->u_volt, opp->available,
>> +                     new_opp->rate, new_opp->u_volt, new_opp->available);
>>               mutex_unlock(&dev_opp_list_lock);
>>               kfree(new_opp);
>>               return ret;
>
> Don't you think that this may hide real bugs?

What kind of bugs exactly?

We are allowing addition of duplicate OPPs as a standard thing right now
as cpufreq drivers don't get rid of the OPPs they create with DT. So, that
shouldn't complain, isn't it ?

For example, what Stefan was doing was absolutely normal procedure
and that complained for him..

The only thing we don't allow is when the existing OPP isn't available
and we are trying to add a duplicate one. In that case we do return
-EEXIST and so we will get errors from the upper layer.

Or do we want to destroy OPPs created with help of DT while the
driver unloads ?

--
viresh

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] opp: convert dev_warn() to dev_dbg() for duplicate OPPs
  2014-11-18  3:08   ` Viresh Kumar
@ 2014-11-18 20:51     ` Rafael J. Wysocki
  2014-11-19  7:46       ` Viresh Kumar
  0 siblings, 1 reply; 14+ messages in thread
From: Rafael J. Wysocki @ 2014-11-18 20:51 UTC (permalink / raw)
  To: linux-arm-kernel

On Tuesday, November 18, 2014 08:38:14 AM Viresh Kumar wrote:
> On 18 November 2014 05:09, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > On Monday, November 17, 2014 01:38:00 PM Viresh Kumar wrote:
> >> diff --git a/drivers/base/power/opp.c b/drivers/base/power/opp.c
> >> index 89ced95..490e9db 100644
> >> --- a/drivers/base/power/opp.c
> >> +++ b/drivers/base/power/opp.c
> >> @@ -466,9 +466,9 @@ int dev_pm_opp_add(struct device *dev, unsigned long freq, unsigned long u_volt)
> >>               int ret = opp->available && new_opp->u_volt == opp->u_volt ?
> >>                       0 : -EEXIST;
> >>
> >> -             dev_warn(dev, "%s: duplicate OPPs detected. Existing: freq: %lu, volt: %lu, enabled: %d. New: freq: %lu, volt: %lu, enabled: %d\n",
> >> -                      __func__, opp->rate, opp->u_volt, opp->available,
> >> -                      new_opp->rate, new_opp->u_volt, new_opp->available);
> >> +             dev_dbg(dev, "%s: duplicate OPPs detected. Existing: freq: %lu, volt: %lu, enabled: %d. New: freq: %lu, volt: %lu, enabled: %d\n",
> >> +                     __func__, opp->rate, opp->u_volt, opp->available,
> >> +                     new_opp->rate, new_opp->u_volt, new_opp->available);
> >>               mutex_unlock(&dev_opp_list_lock);
> >>               kfree(new_opp);
> >>               return ret;
> >
> > Don't you think that this may hide real bugs?
> 
> What kind of bugs exactly?
> 
> We are allowing addition of duplicate OPPs as a standard thing right now
> as cpufreq drivers don't get rid of the OPPs they create with DT. So, that
> shouldn't complain, isn't it ?

Is cpufreq the only user of OPP?  I thought there were other users, so what
about them?

> For example, what Stefan was doing was absolutely normal procedure
> and that complained for him..
> 
> The only thing we don't allow is when the existing OPP isn't available
> and we are trying to add a duplicate one. In that case we do return
> -EEXIST and so we will get errors from the upper layer.
> 
> Or do we want to destroy OPPs created with help of DT while the
> driver unloads ?

I'm not sure about that.  If they aren't useful for anything after
that, what's the benefit of keeping them around?

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] opp: convert dev_warn() to dev_dbg() for duplicate OPPs
  2014-11-18 20:51     ` Rafael J. Wysocki
@ 2014-11-19  7:46       ` Viresh Kumar
  2014-11-21 15:58         ` Rafael J. Wysocki
  0 siblings, 1 reply; 14+ messages in thread
From: Viresh Kumar @ 2014-11-19  7:46 UTC (permalink / raw)
  To: linux-arm-kernel

On 19 November 2014 02:21, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Tuesday, November 18, 2014 08:38:14 AM Viresh Kumar wrote:

>> We are allowing addition of duplicate OPPs as a standard thing right now
>> as cpufreq drivers don't get rid of the OPPs they create with DT. So, that
>> shouldn't complain, isn't it ?
>
> Is cpufreq the only user of OPP?  I thought there were other users, so what
> about them?

Probably of CPU OPPs, but I am not sure. Obviously dev OPPs can be used
by others.

> I'm not sure about that.  If they aren't useful for anything after
> that, what's the benefit of keeping them around?

I don't think they are of any use once the driver is gone, unless the driver is
inserted again.

So, this is what we can do to distinguish DT OPPs with other dynamic ones:

diff --git a/drivers/base/power/opp.c b/drivers/base/power/opp.c
index 490e9db..7e25f01 100644
--- a/drivers/base/power/opp.c
+++ b/drivers/base/power/opp.c
@@ -49,6 +49,7 @@
  *             are protected by the dev_opp_list_lock for integrity.
  *             IMPORTANT: the opp nodes should be maintained in increasing
  *             order.
+ * @from_dt:   created from static DT entries.
  * @available: true/false - marks if this OPP as available or not
  * @rate:      Frequency in hertz
  * @u_volt:    Nominal voltage in microvolts corresponding to this OPP
@@ -61,6 +62,7 @@ struct dev_pm_opp {
        struct list_head node;

        bool available;
+       bool from_dt;
        unsigned long rate;
        unsigned long u_volt;



Does this look fine? I can then write of_free_opp_table(), opposite of
of_init_opp_table().

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH] opp: convert dev_warn() to dev_dbg() for duplicate OPPs
  2014-11-19  7:46       ` Viresh Kumar
@ 2014-11-21 15:58         ` Rafael J. Wysocki
  2014-11-24 10:40           ` Viresh Kumar
  0 siblings, 1 reply; 14+ messages in thread
From: Rafael J. Wysocki @ 2014-11-21 15:58 UTC (permalink / raw)
  To: linux-arm-kernel

On Wednesday, November 19, 2014 01:16:24 PM Viresh Kumar wrote:
> On 19 November 2014 02:21, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > On Tuesday, November 18, 2014 08:38:14 AM Viresh Kumar wrote:
> 
> >> We are allowing addition of duplicate OPPs as a standard thing right now
> >> as cpufreq drivers don't get rid of the OPPs they create with DT. So, that
> >> shouldn't complain, isn't it ?
> >
> > Is cpufreq the only user of OPP?  I thought there were other users, so what
> > about them?
> 
> Probably of CPU OPPs, but I am not sure. Obviously dev OPPs can be used
> by others.
> 
> > I'm not sure about that.  If they aren't useful for anything after
> > that, what's the benefit of keeping them around?
> 
> I don't think they are of any use once the driver is gone, unless the driver is
> inserted again.
> 
> So, this is what we can do to distinguish DT OPPs with other dynamic ones:
> 
> diff --git a/drivers/base/power/opp.c b/drivers/base/power/opp.c
> index 490e9db..7e25f01 100644
> --- a/drivers/base/power/opp.c
> +++ b/drivers/base/power/opp.c
> @@ -49,6 +49,7 @@
>   *             are protected by the dev_opp_list_lock for integrity.
>   *             IMPORTANT: the opp nodes should be maintained in increasing
>   *             order.
> + * @from_dt:   created from static DT entries.

What about @dynamic instead of @from_dt?  That may apply to more use cases if
need be.

>   * @available: true/false - marks if this OPP as available or not
>   * @rate:      Frequency in hertz
>   * @u_volt:    Nominal voltage in microvolts corresponding to this OPP
> @@ -61,6 +62,7 @@ struct dev_pm_opp {
>         struct list_head node;
> 
>         bool available;
> +       bool from_dt;
>         unsigned long rate;
>         unsigned long u_volt;
> 
> 
> 
> Does this look fine? I can then write of_free_opp_table(), opposite of
> of_init_opp_table().
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pm" in
> the body of a message to majordomo at vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] opp: convert dev_warn() to dev_dbg() for duplicate OPPs
  2014-11-21 15:58         ` Rafael J. Wysocki
@ 2014-11-24 10:40           ` Viresh Kumar
  2014-11-24 15:09             ` Rafael J. Wysocki
  0 siblings, 1 reply; 14+ messages in thread
From: Viresh Kumar @ 2014-11-24 10:40 UTC (permalink / raw)
  To: linux-arm-kernel

On 21 November 2014 at 21:28, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> What about @dynamic instead of @from_dt?  That may apply to more use cases if
> need be.

@Paul: I am stuck at a point and need help on RCUs :)

File: drivers/base/power/opp.c

We are trying to remove OPPs created from static data present in DT on
cpufreq driver's removal (when configured as module).

opp core uses RCUs internally and it looks like I need to implement:
list_for_each_entry_safe_rcu()

But, I am not sure because of these:
http://linux.derkeiler.com/Mailing-Lists/Kernel/2005-10/6280.html
http://patchwork.ozlabs.org/patch/48989/

So, wanted to ask you if I really need that or the OPP code is
buggy somewhere.

The code removing OPPs is:

        list_for_each_entry_rcu(opp, &dev_opp->opp_list, node) {
                srcu_notifier_call_chain(&dev_opp->head, OPP_EVENT_REMOVE, opp);
                list_del_rcu(&opp->node);

                kfree(opp);
        }

Because we are freeing opp at the end, list_for_each_entry_rcu()
is trying to read the already freed opp to find opp->node.next
and that results in a crash.

What am I doing wrong ?

--
viresh

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] opp: convert dev_warn() to dev_dbg() for duplicate OPPs
  2014-11-24 10:40           ` Viresh Kumar
@ 2014-11-24 15:09             ` Rafael J. Wysocki
  2014-11-24 16:14               ` Paul E. McKenney
  2014-11-25  6:32               ` Viresh Kumar
  0 siblings, 2 replies; 14+ messages in thread
From: Rafael J. Wysocki @ 2014-11-24 15:09 UTC (permalink / raw)
  To: linux-arm-kernel

On Monday, November 24, 2014 04:10:00 PM Viresh Kumar wrote:
> On 21 November 2014 at 21:28, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > What about @dynamic instead of @from_dt?  That may apply to more use cases if
> > need be.
> 
> @Paul: I am stuck at a point and need help on RCUs :)
> 
> File: drivers/base/power/opp.c
> 
> We are trying to remove OPPs created from static data present in DT on
> cpufreq driver's removal (when configured as module).
> 
> opp core uses RCUs internally and it looks like I need to implement:
> list_for_each_entry_safe_rcu()
> 
> But, I am not sure because of these:
> http://linux.derkeiler.com/Mailing-Lists/Kernel/2005-10/6280.html
> http://patchwork.ozlabs.org/patch/48989/
> 
> So, wanted to ask you if I really need that or the OPP code is
> buggy somewhere.
> 
> The code removing OPPs is:
> 
>         list_for_each_entry_rcu(opp, &dev_opp->opp_list, node) {
>                 srcu_notifier_call_chain(&dev_opp->head, OPP_EVENT_REMOVE, opp);
>                 list_del_rcu(&opp->node);
> 
>                 kfree(opp);
>         }
> 
> Because we are freeing opp at the end, list_for_each_entry_rcu()
> is trying to read the already freed opp to find opp->node.next
> and that results in a crash.
> 
> What am I doing wrong ?

I hope that doesn't happen under rcu_read_lock()?

The modification needs to be done under dev_opp_list_lock in the first place
in which case you don't need the _rcu version of list walking, so you simply
can use list_for_each_entry_safe() here. The mutex is sufficient for the
synchronization with other writers (if any).  The freeing, though, has to be
deferred until all readers drop their references to the old entry.  You can
use kfree_rcu() for that.

Rafael

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] opp: convert dev_warn() to dev_dbg() for duplicate OPPs
  2014-11-24 15:09             ` Rafael J. Wysocki
@ 2014-11-24 16:14               ` Paul E. McKenney
  2014-11-25  1:27                 ` Rafael J. Wysocki
  2014-11-25 10:37                 ` Viresh Kumar
  2014-11-25  6:32               ` Viresh Kumar
  1 sibling, 2 replies; 14+ messages in thread
From: Paul E. McKenney @ 2014-11-24 16:14 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Nov 24, 2014 at 04:09:54PM +0100, Rafael J. Wysocki wrote:
> On Monday, November 24, 2014 04:10:00 PM Viresh Kumar wrote:
> > On 21 November 2014 at 21:28, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > > What about @dynamic instead of @from_dt?  That may apply to more use cases if
> > > need be.
> > 
> > @Paul: I am stuck at a point and need help on RCUs :)
> > 
> > File: drivers/base/power/opp.c
> > 
> > We are trying to remove OPPs created from static data present in DT on
> > cpufreq driver's removal (when configured as module).
> > 
> > opp core uses RCUs internally and it looks like I need to implement:
> > list_for_each_entry_safe_rcu()
> > 
> > But, I am not sure because of these:
> > http://linux.derkeiler.com/Mailing-Lists/Kernel/2005-10/6280.html
> > http://patchwork.ozlabs.org/patch/48989/
> > 
> > So, wanted to ask you if I really need that or the OPP code is
> > buggy somewhere.
> > 
> > The code removing OPPs is:
> > 
> >         list_for_each_entry_rcu(opp, &dev_opp->opp_list, node) {
> >                 srcu_notifier_call_chain(&dev_opp->head, OPP_EVENT_REMOVE, opp);
> >                 list_del_rcu(&opp->node);
> > 
> >                 kfree(opp);

As Rafael says, if opp is reachable by RCU readers, you cannot just
immediately kfree() it.  Immediately kfree()ing it like this -will-
cause your RCU readers to see freed memory, which, as you noted, can
cause crashes.

> >         }
> > 
> > Because we are freeing opp at the end, list_for_each_entry_rcu()
> > is trying to read the already freed opp to find opp->node.next
> > and that results in a crash.
> > 
> > What am I doing wrong ?
> 
> I hope that doesn't happen under rcu_read_lock()?
> 
> The modification needs to be done under dev_opp_list_lock in the first place
> in which case you don't need the _rcu version of list walking, so you simply
> can use list_for_each_entry_safe() here. The mutex is sufficient for the
> synchronization with other writers (if any).  The freeing, though, has to be
> deferred until all readers drop their references to the old entry.  You can
> use kfree_rcu() for that.

Except that srcu_notifier_call_chain() involves SRCU readers.  So,
unless I am confused, you instead need something like this:

static void kfree_opp_rcu(struct rcu_head *rhp)
{
	struct device_opp *opp = container_of(rhp, struct device_opp, opp_list);

	kfree(opp);
}

Then replace the above kfree() by:

	call_srcu(&opp->rcu, kfree_opp_rcu);

This will require adding the following to struct device_opp:

	struct rcu_head rcu;

And yes, this would be simpler if there was a kfree_srcu().  If a few
more uses like this show up, I will create one.

All that said, I do not claim to understand the OPP code, so please take
the above suggested changes with a grain of salt.  And if you let me know
where I am confused, I should be able to offer better suggestions.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] opp: convert dev_warn() to dev_dbg() for duplicate OPPs
  2014-11-24 16:14               ` Paul E. McKenney
@ 2014-11-25  1:27                 ` Rafael J. Wysocki
  2014-11-25 10:37                 ` Viresh Kumar
  1 sibling, 0 replies; 14+ messages in thread
From: Rafael J. Wysocki @ 2014-11-25  1:27 UTC (permalink / raw)
  To: linux-arm-kernel

On Monday, November 24, 2014 08:14:35 AM Paul E. McKenney wrote:
> On Mon, Nov 24, 2014 at 04:09:54PM +0100, Rafael J. Wysocki wrote:
> > On Monday, November 24, 2014 04:10:00 PM Viresh Kumar wrote:
> > > On 21 November 2014 at 21:28, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > > > What about @dynamic instead of @from_dt?  That may apply to more use cases if
> > > > need be.
> > > 
> > > @Paul: I am stuck at a point and need help on RCUs :)
> > > 
> > > File: drivers/base/power/opp.c
> > > 
> > > We are trying to remove OPPs created from static data present in DT on
> > > cpufreq driver's removal (when configured as module).
> > > 
> > > opp core uses RCUs internally and it looks like I need to implement:
> > > list_for_each_entry_safe_rcu()
> > > 
> > > But, I am not sure because of these:
> > > http://linux.derkeiler.com/Mailing-Lists/Kernel/2005-10/6280.html
> > > http://patchwork.ozlabs.org/patch/48989/
> > > 
> > > So, wanted to ask you if I really need that or the OPP code is
> > > buggy somewhere.
> > > 
> > > The code removing OPPs is:
> > > 
> > >         list_for_each_entry_rcu(opp, &dev_opp->opp_list, node) {
> > >                 srcu_notifier_call_chain(&dev_opp->head, OPP_EVENT_REMOVE, opp);
> > >                 list_del_rcu(&opp->node);
> > > 
> > >                 kfree(opp);
> 
> As Rafael says, if opp is reachable by RCU readers, you cannot just
> immediately kfree() it.  Immediately kfree()ing it like this -will-
> cause your RCU readers to see freed memory, which, as you noted, can
> cause crashes.
> 
> > >         }
> > > 
> > > Because we are freeing opp at the end, list_for_each_entry_rcu()
> > > is trying to read the already freed opp to find opp->node.next
> > > and that results in a crash.
> > > 
> > > What am I doing wrong ?
> > 
> > I hope that doesn't happen under rcu_read_lock()?
> > 
> > The modification needs to be done under dev_opp_list_lock in the first place
> > in which case you don't need the _rcu version of list walking, so you simply
> > can use list_for_each_entry_safe() here. The mutex is sufficient for the
> > synchronization with other writers (if any).  The freeing, though, has to be
> > deferred until all readers drop their references to the old entry.  You can
> > use kfree_rcu() for that.
> 
> Except that srcu_notifier_call_chain() involves SRCU readers.  So,
> unless I am confused, you instead need something like this:

Correct, that's SRCU.  Sorry for my confusion.


> static void kfree_opp_rcu(struct rcu_head *rhp)
> {
> 	struct device_opp *opp = container_of(rhp, struct device_opp, opp_list);
> 
> 	kfree(opp);
> }
> 
> Then replace the above kfree() by:
> 
> 	call_srcu(&opp->rcu, kfree_opp_rcu);
> 
> This will require adding the following to struct device_opp:
> 
> 	struct rcu_head rcu;
> 
> And yes, this would be simpler if there was a kfree_srcu().  If a few
> more uses like this show up, I will create one.
> 
> All that said, I do not claim to understand the OPP code, so please take
> the above suggested changes with a grain of salt.  And if you let me know
> where I am confused, I should be able to offer better suggestions.


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] opp: convert dev_warn() to dev_dbg() for duplicate OPPs
  2014-11-24 15:09             ` Rafael J. Wysocki
  2014-11-24 16:14               ` Paul E. McKenney
@ 2014-11-25  6:32               ` Viresh Kumar
  1 sibling, 0 replies; 14+ messages in thread
From: Viresh Kumar @ 2014-11-25  6:32 UTC (permalink / raw)
  To: linux-arm-kernel

On 24 November 2014 at 20:39, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> I hope that doesn't happen under rcu_read_lock()?

No. Should it?

> The modification needs to be done under dev_opp_list_lock in the first place

Yeah, its there to protect against other updaters.

> in which case you don't need the _rcu version of list walking, so you simply
> can use list_for_each_entry_safe() here. The mutex is sufficient for the
> synchronization with other writers (if any).  The freeing, though, has to be

Correct.

> deferred until all readers drop their references to the old entry.  You can
> use kfree_rcu() for that.

Cool. I understood the concept atleast. And yes I followed the srcu mail from
paul as well.. Will reply there.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] opp: convert dev_warn() to dev_dbg() for duplicate OPPs
  2014-11-24 16:14               ` Paul E. McKenney
  2014-11-25  1:27                 ` Rafael J. Wysocki
@ 2014-11-25 10:37                 ` Viresh Kumar
  2014-11-25 16:25                   ` Paul E. McKenney
  1 sibling, 1 reply; 14+ messages in thread
From: Viresh Kumar @ 2014-11-25 10:37 UTC (permalink / raw)
  To: linux-arm-kernel

On 24 November 2014 at 21:44, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
> As Rafael says, if opp is reachable by RCU readers, you cannot just
> immediately kfree() it.  Immediately kfree()ing it like this -will-
> cause your RCU readers to see freed memory, which, as you noted, can
> cause crashes.

In order to reply you at some level, I tried going through RCU documentation
today before replying anymore. And yes I understood this part.

> Except that srcu_notifier_call_chain() involves SRCU readers.  So,
> unless I am confused, you instead need something like this:
>
> static void kfree_opp_rcu(struct rcu_head *rhp)
> {
>         struct device_opp *opp = container_of(rhp, struct device_opp, opp_list);
>
>         kfree(opp);
> }
>
> Then replace the above kfree() by:
>
>         call_srcu(&opp->rcu, kfree_opp_rcu);

Correct. But you missed the srcu which should be the first argument here :)

> This will require adding the following to struct device_opp:
>
>         struct rcu_head rcu;

We were freeing struct dev_pm_opp, and so I believe you
wanted me to add it there? Its already there.

> All that said, I do not claim to understand the OPP code, so please take
> the above suggested changes with a grain of salt.  And if you let me know
> where I am confused, I should be able to offer better suggestions.

Thanks for your suggestions. I have sent the patch to list and cc'd you on
the relevant ones. Would be great if you can review the rcu part there.

--
viresh

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] opp: convert dev_warn() to dev_dbg() for duplicate OPPs
  2014-11-25 10:37                 ` Viresh Kumar
@ 2014-11-25 16:25                   ` Paul E. McKenney
  0 siblings, 0 replies; 14+ messages in thread
From: Paul E. McKenney @ 2014-11-25 16:25 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Nov 25, 2014 at 04:07:35PM +0530, Viresh Kumar wrote:
> On 24 November 2014 at 21:44, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> > As Rafael says, if opp is reachable by RCU readers, you cannot just
> > immediately kfree() it.  Immediately kfree()ing it like this -will-
> > cause your RCU readers to see freed memory, which, as you noted, can
> > cause crashes.
> 
> In order to reply you at some level, I tried going through RCU documentation
> today before replying anymore. And yes I understood this part.
> 
> > Except that srcu_notifier_call_chain() involves SRCU readers.  So,
> > unless I am confused, you instead need something like this:
> >
> > static void kfree_opp_rcu(struct rcu_head *rhp)
> > {
> >         struct device_opp *opp = container_of(rhp, struct device_opp, opp_list);
> >
> >         kfree(opp);
> > }
> >
> > Then replace the above kfree() by:
> >
> >         call_srcu(&opp->rcu, kfree_opp_rcu);
> 
> Correct. But you missed the srcu which should be the first argument here :)

Indeed I did!  ;-)

> > This will require adding the following to struct device_opp:
> >
> >         struct rcu_head rcu;
> 
> We were freeing struct dev_pm_opp, and so I believe you
> wanted me to add it there? Its already there.

Fair enough!

> > All that said, I do not claim to understand the OPP code, so please take
> > the above suggested changes with a grain of salt.  And if you let me know
> > where I am confused, I should be able to offer better suggestions.
> 
> Thanks for your suggestions. I have sent the patch to list and cc'd you on
> the relevant ones. Would be great if you can review the rcu part there.

Done!

							Thanx, Paul

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2014-11-25 16:25 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-11-17  8:08 [PATCH] opp: convert dev_warn() to dev_dbg() for duplicate OPPs Viresh Kumar
2014-11-17 19:32 ` Stefan Wahren
2014-11-17 23:39 ` Rafael J. Wysocki
2014-11-18  3:08   ` Viresh Kumar
2014-11-18 20:51     ` Rafael J. Wysocki
2014-11-19  7:46       ` Viresh Kumar
2014-11-21 15:58         ` Rafael J. Wysocki
2014-11-24 10:40           ` Viresh Kumar
2014-11-24 15:09             ` Rafael J. Wysocki
2014-11-24 16:14               ` Paul E. McKenney
2014-11-25  1:27                 ` Rafael J. Wysocki
2014-11-25 10:37                 ` Viresh Kumar
2014-11-25 16:25                   ` Paul E. McKenney
2014-11-25  6:32               ` Viresh Kumar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).