public inbox for linux-pm@vger.kernel.org
 help / color / mirror / Atom feed
* cpuidle regression ?
@ 2019-12-08 10:39 Daniel Lezcano
  2019-12-09  9:26 ` Rafael J. Wysocki
  2019-12-10 11:48 ` [PATCH] cpuidle: Fix cpuidle_driver_state_disabled() Rafael J. Wysocki
  0 siblings, 2 replies; 10+ messages in thread
From: Daniel Lezcano @ 2019-12-08 10:39 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux PM mailing list, NXP Linux Team


Hi Rafael,

the latest linux-next kernelci report indicates a kernel bug for the
imx6 platform. I don't have this board so it is not possible to
investigate it.

https://storage.kernelci.org/next/master/next-20191208/arm/multi_v7_defconfig/gcc-8/lab-collabora/boot-imx6q-sabrelite.html

[ ... ]

[    3.372501] Unable to handle kernel NULL pointer dereference at
virtual address 00000000

[ ... ]

[    3.408898] PC is at _find_first_bit_le+0xc/0x2c
[    3.413785] LR is at cpuidle_driver_state_disabled+0x40/0xa0


Not sure if it is related to the latest changes or not.


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: cpuidle regression ?
  2019-12-08 10:39 cpuidle regression ? Daniel Lezcano
@ 2019-12-09  9:26 ` Rafael J. Wysocki
  2019-12-09 10:32   ` Daniel Lezcano
  2019-12-10 11:48 ` [PATCH] cpuidle: Fix cpuidle_driver_state_disabled() Rafael J. Wysocki
  1 sibling, 1 reply; 10+ messages in thread
From: Rafael J. Wysocki @ 2019-12-09  9:26 UTC (permalink / raw)
  To: Daniel Lezcano; +Cc: Rafael J. Wysocki, Linux PM mailing list, NXP Linux Team

[-- Attachment #1: Type: text/plain, Size: 848 bytes --]

On Sun, Dec 8, 2019 at 11:40 AM Daniel Lezcano
<daniel.lezcano@linaro.org> wrote:
>
>
> Hi Rafael,
>
> the latest linux-next kernelci report indicates a kernel bug for the
> imx6 platform. I don't have this board so it is not possible to
> investigate it.
>
> https://storage.kernelci.org/next/master/next-20191208/arm/multi_v7_defconfig/gcc-8/lab-collabora/boot-imx6q-sabrelite.html
>
> [ ... ]
>
> [    3.372501] Unable to handle kernel NULL pointer dereference at
> virtual address 00000000
>
> [ ... ]
>
> [    3.408898] PC is at _find_first_bit_le+0xc/0x2c
> [    3.413785] LR is at cpuidle_driver_state_disabled+0x40/0xa0
>
>
> Not sure if it is related to the latest changes or not.

It does seem so, in which case the attached patch should address it.

Is there a way to test the patch alone or do I need to push it
somewhere to be tested?

[-- Attachment #2: cpuidle-fix-driver_state_disabled.patch --]
[-- Type: text/x-patch, Size: 825 bytes --]

---
 drivers/cpuidle/driver.c |   10 ++++++++++
 1 file changed, 10 insertions(+)

Index: linux-pm/drivers/cpuidle/driver.c
===================================================================
--- linux-pm.orig/drivers/cpuidle/driver.c
+++ linux-pm/drivers/cpuidle/driver.c
@@ -403,6 +403,13 @@ void cpuidle_driver_state_disabled(struc
 
 	mutex_lock(&cpuidle_lock);
 
+	spin_lock(&cpuidle_driver_lock);
+
+	if (!drv->cpumask) {
+		drv->states[idx].flags |= CPUIDLE_FLAG_UNUSABLE;
+		goto unlock;
+	}
+
 	for_each_cpu(cpu, drv->cpumask) {
 		struct cpuidle_device *dev = per_cpu(cpuidle_devices, cpu);
 
@@ -415,5 +422,8 @@ void cpuidle_driver_state_disabled(struc
 			dev->states_usage[idx].disable &= ~CPUIDLE_STATE_DISABLED_BY_DRIVER;
 	}
 
+unlock:
+	spin_unlock(&cpuidle_driver_lock);
+
 	mutex_unlock(&cpuidle_lock);
 }

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: cpuidle regression ?
  2019-12-09  9:26 ` Rafael J. Wysocki
@ 2019-12-09 10:32   ` Daniel Lezcano
  2019-12-09 10:50     ` Rafael J. Wysocki
  0 siblings, 1 reply; 10+ messages in thread
From: Daniel Lezcano @ 2019-12-09 10:32 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux PM mailing list, NXP Linux Team

On 09/12/2019 10:26, Rafael J. Wysocki wrote:
> On Sun, Dec 8, 2019 at 11:40 AM Daniel Lezcano
> <daniel.lezcano@linaro.org> wrote:
>>
>>
>> Hi Rafael,
>>
>> the latest linux-next kernelci report indicates a kernel bug for the
>> imx6 platform. I don't have this board so it is not possible to
>> investigate it.
>>
>> https://storage.kernelci.org/next/master/next-20191208/arm/multi_v7_defconfig/gcc-8/lab-collabora/boot-imx6q-sabrelite.html
>>
>> [ ... ]
>>
>> [    3.372501] Unable to handle kernel NULL pointer dereference at
>> virtual address 00000000
>>
>> [ ... ]
>>
>> [    3.408898] PC is at _find_first_bit_le+0xc/0x2c
>> [    3.413785] LR is at cpuidle_driver_state_disabled+0x40/0xa0
>>
>>
>> Not sure if it is related to the latest changes or not.
> 
> It does seem so, in which case the attached patch should address it.
> 
> Is there a way to test the patch alone or do I need to push it
> somewhere to be tested?

Is the bleeding-edge branch monitored by kernelci ?




-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: cpuidle regression ?
  2019-12-09 10:32   ` Daniel Lezcano
@ 2019-12-09 10:50     ` Rafael J. Wysocki
  2019-12-09 11:35       ` Daniel Lezcano
  0 siblings, 1 reply; 10+ messages in thread
From: Rafael J. Wysocki @ 2019-12-09 10:50 UTC (permalink / raw)
  To: Daniel Lezcano; +Cc: Rafael J. Wysocki, Linux PM mailing list, NXP Linux Team

On Mon, Dec 9, 2019 at 11:32 AM Daniel Lezcano
<daniel.lezcano@linaro.org> wrote:
>
> On 09/12/2019 10:26, Rafael J. Wysocki wrote:
> > On Sun, Dec 8, 2019 at 11:40 AM Daniel Lezcano
> > <daniel.lezcano@linaro.org> wrote:
> >>
> >>
> >> Hi Rafael,
> >>
> >> the latest linux-next kernelci report indicates a kernel bug for the
> >> imx6 platform. I don't have this board so it is not possible to
> >> investigate it.
> >>
> >> https://storage.kernelci.org/next/master/next-20191208/arm/multi_v7_defconfig/gcc-8/lab-collabora/boot-imx6q-sabrelite.html
> >>
> >> [ ... ]
> >>
> >> [    3.372501] Unable to handle kernel NULL pointer dereference at
> >> virtual address 00000000
> >>
> >> [ ... ]
> >>
> >> [    3.408898] PC is at _find_first_bit_le+0xc/0x2c
> >> [    3.413785] LR is at cpuidle_driver_state_disabled+0x40/0xa0
> >>
> >>
> >> Not sure if it is related to the latest changes or not.
> >
> > It does seem so, in which case the attached patch should address it.
> >
> > Is there a way to test the patch alone or do I need to push it
> > somewhere to be tested?
>
> Is the bleeding-edge branch monitored by kernelci ?

No, it is not right now, AFAICS.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: cpuidle regression ?
  2019-12-09 10:50     ` Rafael J. Wysocki
@ 2019-12-09 11:35       ` Daniel Lezcano
  2019-12-09 11:55         ` Rafael J. Wysocki
  0 siblings, 1 reply; 10+ messages in thread
From: Daniel Lezcano @ 2019-12-09 11:35 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux PM mailing list, NXP Linux Team

On 09/12/2019 11:50, Rafael J. Wysocki wrote:
> On Mon, Dec 9, 2019 at 11:32 AM Daniel Lezcano
> <daniel.lezcano@linaro.org> wrote:
>>
>> On 09/12/2019 10:26, Rafael J. Wysocki wrote:
>>> On Sun, Dec 8, 2019 at 11:40 AM Daniel Lezcano
>>> <daniel.lezcano@linaro.org> wrote:
>>>>
>>>>
>>>> Hi Rafael,
>>>>
>>>> the latest linux-next kernelci report indicates a kernel bug for the
>>>> imx6 platform. I don't have this board so it is not possible to
>>>> investigate it.
>>>>
>>>> https://storage.kernelci.org/next/master/next-20191208/arm/multi_v7_defconfig/gcc-8/lab-collabora/boot-imx6q-sabrelite.html
>>>>
>>>> [ ... ]
>>>>
>>>> [    3.372501] Unable to handle kernel NULL pointer dereference at
>>>> virtual address 00000000
>>>>
>>>> [ ... ]
>>>>
>>>> [    3.408898] PC is at _find_first_bit_le+0xc/0x2c
>>>> [    3.413785] LR is at cpuidle_driver_state_disabled+0x40/0xa0
>>>>
>>>>
>>>> Not sure if it is related to the latest changes or not.
>>>
>>> It does seem so, in which case the attached patch should address it.
>>>
>>> Is there a way to test the patch alone or do I need to push it
>>> somewhere to be tested?
>>
>> Is the bleeding-edge branch monitored by kernelci ?
> 
> No, it is not right now, AFAICS.

I have a imx7, I will try to reproduce the issue on it. Otherwise, I can
test on the 'testing' thermal branch temporarily.



-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: cpuidle regression ?
  2019-12-09 11:35       ` Daniel Lezcano
@ 2019-12-09 11:55         ` Rafael J. Wysocki
  2019-12-09 17:22           ` Daniel Lezcano
  0 siblings, 1 reply; 10+ messages in thread
From: Rafael J. Wysocki @ 2019-12-09 11:55 UTC (permalink / raw)
  To: Daniel Lezcano; +Cc: Rafael J. Wysocki, Linux PM mailing list, NXP Linux Team

On Mon, Dec 9, 2019 at 12:35 PM Daniel Lezcano
<daniel.lezcano@linaro.org> wrote:
>
> On 09/12/2019 11:50, Rafael J. Wysocki wrote:
> > On Mon, Dec 9, 2019 at 11:32 AM Daniel Lezcano
> > <daniel.lezcano@linaro.org> wrote:
> >>
> >> On 09/12/2019 10:26, Rafael J. Wysocki wrote:
> >>> On Sun, Dec 8, 2019 at 11:40 AM Daniel Lezcano
> >>> <daniel.lezcano@linaro.org> wrote:
> >>>>
> >>>>
> >>>> Hi Rafael,
> >>>>
> >>>> the latest linux-next kernelci report indicates a kernel bug for the
> >>>> imx6 platform. I don't have this board so it is not possible to
> >>>> investigate it.
> >>>>
> >>>> https://storage.kernelci.org/next/master/next-20191208/arm/multi_v7_defconfig/gcc-8/lab-collabora/boot-imx6q-sabrelite.html
> >>>>
> >>>> [ ... ]
> >>>>
> >>>> [    3.372501] Unable to handle kernel NULL pointer dereference at
> >>>> virtual address 00000000
> >>>>
> >>>> [ ... ]
> >>>>
> >>>> [    3.408898] PC is at _find_first_bit_le+0xc/0x2c
> >>>> [    3.413785] LR is at cpuidle_driver_state_disabled+0x40/0xa0
> >>>>
> >>>>
> >>>> Not sure if it is related to the latest changes or not.
> >>>
> >>> It does seem so, in which case the attached patch should address it.
> >>>
> >>> Is there a way to test the patch alone or do I need to push it
> >>> somewhere to be tested?
> >>
> >> Is the bleeding-edge branch monitored by kernelci ?
> >
> > No, it is not right now, AFAICS.
>
> I have a imx7, I will try to reproduce the issue on it. Otherwise, I can
> test on the 'testing' thermal branch temporarily.

Thanks, that'll help!

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: cpuidle regression ?
  2019-12-09 11:55         ` Rafael J. Wysocki
@ 2019-12-09 17:22           ` Daniel Lezcano
  2019-12-10  8:06             ` Daniel Lezcano
  0 siblings, 1 reply; 10+ messages in thread
From: Daniel Lezcano @ 2019-12-09 17:22 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux PM mailing list, NXP Linux Team

On 09/12/2019 12:55, Rafael J. Wysocki wrote:
> On Mon, Dec 9, 2019 at 12:35 PM Daniel Lezcano
> <daniel.lezcano@linaro.org> wrote:
>>
>> On 09/12/2019 11:50, Rafael J. Wysocki wrote:
>>> On Mon, Dec 9, 2019 at 11:32 AM Daniel Lezcano
>>> <daniel.lezcano@linaro.org> wrote:
>>>>
>>>> On 09/12/2019 10:26, Rafael J. Wysocki wrote:
>>>>> On Sun, Dec 8, 2019 at 11:40 AM Daniel Lezcano
>>>>> <daniel.lezcano@linaro.org> wrote:
>>>>>>
>>>>>>
>>>>>> Hi Rafael,
>>>>>>
>>>>>> the latest linux-next kernelci report indicates a kernel bug for the
>>>>>> imx6 platform. I don't have this board so it is not possible to
>>>>>> investigate it.
>>>>>>
>>>>>> https://storage.kernelci.org/next/master/next-20191208/arm/multi_v7_defconfig/gcc-8/lab-collabora/boot-imx6q-sabrelite.html
>>>>>>
>>>>>> [ ... ]
>>>>>>
>>>>>> [    3.372501] Unable to handle kernel NULL pointer dereference at
>>>>>> virtual address 00000000
>>>>>>
>>>>>> [ ... ]
>>>>>>
>>>>>> [    3.408898] PC is at _find_first_bit_le+0xc/0x2c
>>>>>> [    3.413785] LR is at cpuidle_driver_state_disabled+0x40/0xa0
>>>>>>
>>>>>>
>>>>>> Not sure if it is related to the latest changes or not.
>>>>>
>>>>> It does seem so, in which case the attached patch should address it.
>>>>>
>>>>> Is there a way to test the patch alone or do I need to push it
>>>>> somewhere to be tested?
>>>>
>>>> Is the bleeding-edge branch monitored by kernelci ?
>>>
>>> No, it is not right now, AFAICS.
>>
>> I have a imx7, I will try to reproduce the issue on it. Otherwise, I can
>> test on the 'testing' thermal branch temporarily.
> 
> Thanks, that'll help!

The imx7 kernel has an issue with cpuidle but not related to your
changes, so I was unable to double check with a local board.

However, kernelci does no longer report a regression on the baseline
with your patch applied on my 'testing' branch.

I can not guarantee the bug is fixed because I can't reproduce it
locally and check the patch is effectively solving the issue.

In order to increase the confidence on the fix, I updated the branch
without your patch, so in a few hours, we should see the if the
regression appears or not.


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: cpuidle regression ?
  2019-12-09 17:22           ` Daniel Lezcano
@ 2019-12-10  8:06             ` Daniel Lezcano
  2019-12-10  8:07               ` Rafael J. Wysocki
  0 siblings, 1 reply; 10+ messages in thread
From: Daniel Lezcano @ 2019-12-10  8:06 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux PM mailing list, NXP Linux Team

On 09/12/2019 18:22, Daniel Lezcano wrote:

[ ... ]

> However, kernelci does no longer report a regression on the baseline
> with your patch applied on my 'testing' branch.
> 
> I can not guarantee the bug is fixed because I can't reproduce it
> locally and check the patch is effectively solving the issue.
> 
> In order to increase the confidence on the fix, I updated the branch
> without your patch, so in a few hours, we should see the if the
> regression appears or not.

The regression came back without the patch.


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: cpuidle regression ?
  2019-12-10  8:06             ` Daniel Lezcano
@ 2019-12-10  8:07               ` Rafael J. Wysocki
  0 siblings, 0 replies; 10+ messages in thread
From: Rafael J. Wysocki @ 2019-12-10  8:07 UTC (permalink / raw)
  To: Daniel Lezcano; +Cc: Rafael J. Wysocki, Linux PM mailing list, NXP Linux Team

On Tue, Dec 10, 2019 at 9:06 AM Daniel Lezcano
<daniel.lezcano@linaro.org> wrote:
>
> On 09/12/2019 18:22, Daniel Lezcano wrote:
>
> [ ... ]
>
> > However, kernelci does no longer report a regression on the baseline
> > with your patch applied on my 'testing' branch.
> >
> > I can not guarantee the bug is fixed because I can't reproduce it
> > locally and check the patch is effectively solving the issue.
> >
> > In order to increase the confidence on the fix, I updated the branch
> > without your patch, so in a few hours, we should see the if the
> > regression appears or not.
>
> The regression came back without the patch.

OK, thanks!

I'll post the patch properly then and queue it up as a regression fix.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH] cpuidle: Fix cpuidle_driver_state_disabled()
  2019-12-08 10:39 cpuidle regression ? Daniel Lezcano
  2019-12-09  9:26 ` Rafael J. Wysocki
@ 2019-12-10 11:48 ` Rafael J. Wysocki
  1 sibling, 0 replies; 10+ messages in thread
From: Rafael J. Wysocki @ 2019-12-10 11:48 UTC (permalink / raw)
  To: Linux PM mailing list; +Cc: Daniel Lezcano, Rafael J. Wysocki, NXP Linux Team

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

It turns out that cpuidle_driver_state_disabled() can be called
before registering the cpufreq driver on some platforms, which
was not expected when it was introduced and which leads to a NULL
pointer dereference when trying to walk the CPUs associated with
the given cpuidle driver.

Fix the problem by making cpuidle_driver_state_disabled() check if
the driver's mask of CPUs associated with it is present and to set
CPUIDLE_FLAG_UNUSABLE for the given idle state in the driver's states
list if that is not the case to cause __cpuidle_register_device() to
set CPUIDLE_STATE_DISABLED_BY_DRIVER for that state for all cpuidle
devices registered by it later.

Fixes: cbda56d5fefc ("cpuidle: Introduce cpuidle_driver_state_disabled() for driver quirks")
Reported-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Tested-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/cpuidle/driver.c |   10 ++++++++++
 1 file changed, 10 insertions(+)

Index: linux-pm/drivers/cpuidle/driver.c
===================================================================
--- linux-pm.orig/drivers/cpuidle/driver.c
+++ linux-pm/drivers/cpuidle/driver.c
@@ -403,6 +403,13 @@ void cpuidle_driver_state_disabled(struc
 
 	mutex_lock(&cpuidle_lock);
 
+	spin_lock(&cpuidle_driver_lock);
+
+	if (!drv->cpumask) {
+		drv->states[idx].flags |= CPUIDLE_FLAG_UNUSABLE;
+		goto unlock;
+	}
+
 	for_each_cpu(cpu, drv->cpumask) {
 		struct cpuidle_device *dev = per_cpu(cpuidle_devices, cpu);
 
@@ -415,5 +422,8 @@ void cpuidle_driver_state_disabled(struc
 			dev->states_usage[idx].disable &= ~CPUIDLE_STATE_DISABLED_BY_DRIVER;
 	}
 
+unlock:
+	spin_unlock(&cpuidle_driver_lock);
+
 	mutex_unlock(&cpuidle_lock);
 }




^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2019-12-10 11:48 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-12-08 10:39 cpuidle regression ? Daniel Lezcano
2019-12-09  9:26 ` Rafael J. Wysocki
2019-12-09 10:32   ` Daniel Lezcano
2019-12-09 10:50     ` Rafael J. Wysocki
2019-12-09 11:35       ` Daniel Lezcano
2019-12-09 11:55         ` Rafael J. Wysocki
2019-12-09 17:22           ` Daniel Lezcano
2019-12-10  8:06             ` Daniel Lezcano
2019-12-10  8:07               ` Rafael J. Wysocki
2019-12-10 11:48 ` [PATCH] cpuidle: Fix cpuidle_driver_state_disabled() Rafael J. Wysocki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox