* [PATCH RESEND] Do not mark ACPI devices as irq safe
@ 2024-08-08 12:14 Breno Leitao
2024-08-08 23:57 ` Andi Shyti
0 siblings, 1 reply; 9+ messages in thread
From: Breno Leitao @ 2024-08-08 12:14 UTC (permalink / raw)
To: Laxman Dewangan, Dmitry Osipenko, Andi Shyti, Thierry Reding,
Jonathan Hunter
Cc: leit, Michael van der Westhuizen,
open list:I2C SUBSYSTEM HOST DRIVERS,
open list:TEGRA ARCHITECTURE SUPPORT, open list
On ACPI machines, the tegra i2c module encounters an issue due to a
mutex being called inside a spinlock. This leads to the following bug:
BUG: sleeping function called from invalid context at kernel/locking/mutex.c:585
in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 1282, name: kssif0010
preempt_count: 0, expected: 0
RCU nest depth: 0, expected: 0
irq event stamp: 0
Call trace:
dump_backtrace+0xf0/0x140
show_stack (./arch/x86/include/asm/current.h:49
arch/x86/kernel/dumpstack.c:312)
dump_stack_lvl (lib/dump_stack.c:89 lib/dump_stack.c:115)
dump_stack (lib/earlycpio.c:61)
__might_resched (./arch/x86/include/asm/current.h:49
kernel/sched/core.c:10297)
__might_sleep (./include/linux/lockdep.h:231
kernel/sched/core.c:10236)
__mutex_lock_common+0x5c/0x2190
mutex_lock_nested (kernel/locking/mutex.c:751)
acpi_subsys_runtime_resume+0xb8/0x160
__rpm_callback+0x1cc/0x4b0
rpm_resume+0xa60/0x1078
__pm_runtime_resume+0xbc/0x130
tegra_i2c_xfer+0x74/0x398
__i2c_transfer (./include/trace/events/i2c.h:122 drivers/i2c/i2c-core-base.c:2258)
The problem arises because during __pm_runtime_resume(), the spinlock
&dev->power.lock is acquired before rpm_resume() is called. Later,
rpm_resume() invokes acpi_subsys_runtime_resume(), which relies on
mutexes, triggering the error.
To address this issue, devices on ACPI are now marked as not IRQ-safe,
considering the dependency of acpi_subsys_runtime_resume() on mutexes.
Co-developed-by: Michael van der Westhuizen <rmikey@meta.com>
Signed-off-by: Michael van der Westhuizen <rmikey@meta.com>
Signed-off-by: Breno Leitao <leitao@debian.org>
---
drivers/i2c/busses/i2c-tegra.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/i2c/busses/i2c-tegra.c b/drivers/i2c/busses/i2c-tegra.c
index 85b31edc558d..6d783ecc3431 100644
--- a/drivers/i2c/busses/i2c-tegra.c
+++ b/drivers/i2c/busses/i2c-tegra.c
@@ -1804,7 +1804,7 @@ static int tegra_i2c_probe(struct platform_device *pdev)
* VI I2C device shouldn't be marked as IRQ-safe because VI I2C won't
* be used for atomic transfers.
*/
- if (!IS_VI(i2c_dev))
+ if (!IS_VI(i2c_dev) && !ACPI_HANDLE(i2c_dev->dev))
pm_runtime_irq_safe(i2c_dev->dev);
pm_runtime_enable(i2c_dev->dev);
--
2.43.5
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH RESEND] Do not mark ACPI devices as irq safe
2024-08-08 12:14 [PATCH RESEND] Do not mark ACPI devices as irq safe Breno Leitao
@ 2024-08-08 23:57 ` Andi Shyti
2024-08-09 11:03 ` Andy Shevchenko
0 siblings, 1 reply; 9+ messages in thread
From: Andi Shyti @ 2024-08-08 23:57 UTC (permalink / raw)
To: Breno Leitao
Cc: Laxman Dewangan, Dmitry Osipenko, Thierry Reding, Jonathan Hunter,
leit, Michael van der Westhuizen,
open list:I2C SUBSYSTEM HOST DRIVERS,
open list:TEGRA ARCHITECTURE SUPPORT, open list, Andy Shevchenko
Hi,
On Thu, Aug 08, 2024 at 05:14:46AM GMT, Breno Leitao wrote:
> On ACPI machines, the tegra i2c module encounters an issue due to a
> mutex being called inside a spinlock. This leads to the following bug:
>
> BUG: sleeping function called from invalid context at kernel/locking/mutex.c:585
> in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 1282, name: kssif0010
> preempt_count: 0, expected: 0
> RCU nest depth: 0, expected: 0
> irq event stamp: 0
>
> Call trace:
> dump_backtrace+0xf0/0x140
> show_stack (./arch/x86/include/asm/current.h:49
> arch/x86/kernel/dumpstack.c:312)
> dump_stack_lvl (lib/dump_stack.c:89 lib/dump_stack.c:115)
> dump_stack (lib/earlycpio.c:61)
> __might_resched (./arch/x86/include/asm/current.h:49
> kernel/sched/core.c:10297)
> __might_sleep (./include/linux/lockdep.h:231
> kernel/sched/core.c:10236)
> __mutex_lock_common+0x5c/0x2190
> mutex_lock_nested (kernel/locking/mutex.c:751)
> acpi_subsys_runtime_resume+0xb8/0x160
> __rpm_callback+0x1cc/0x4b0
> rpm_resume+0xa60/0x1078
> __pm_runtime_resume+0xbc/0x130
> tegra_i2c_xfer+0x74/0x398
> __i2c_transfer (./include/trace/events/i2c.h:122 drivers/i2c/i2c-core-base.c:2258)
>
> The problem arises because during __pm_runtime_resume(), the spinlock
> &dev->power.lock is acquired before rpm_resume() is called. Later,
> rpm_resume() invokes acpi_subsys_runtime_resume(), which relies on
> mutexes, triggering the error.
>
> To address this issue, devices on ACPI are now marked as not IRQ-safe,
> considering the dependency of acpi_subsys_runtime_resume() on mutexes.
>
> Co-developed-by: Michael van der Westhuizen <rmikey@meta.com>
> Signed-off-by: Michael van der Westhuizen <rmikey@meta.com>
> Signed-off-by: Breno Leitao <leitao@debian.org>
> ---
> drivers/i2c/busses/i2c-tegra.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/i2c/busses/i2c-tegra.c b/drivers/i2c/busses/i2c-tegra.c
> index 85b31edc558d..6d783ecc3431 100644
> --- a/drivers/i2c/busses/i2c-tegra.c
> +++ b/drivers/i2c/busses/i2c-tegra.c
> @@ -1804,7 +1804,7 @@ static int tegra_i2c_probe(struct platform_device *pdev)
> * VI I2C device shouldn't be marked as IRQ-safe because VI I2C won't
> * be used for atomic transfers.
> */
> - if (!IS_VI(i2c_dev))
> + if (!IS_VI(i2c_dev) && !ACPI_HANDLE(i2c_dev->dev))
looks good to me, can I have an ack from Andy here?
Thanks,
Andi
> pm_runtime_irq_safe(i2c_dev->dev);
>
> pm_runtime_enable(i2c_dev->dev);
> --
> 2.43.5
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH RESEND] Do not mark ACPI devices as irq safe
2024-08-08 23:57 ` Andi Shyti
@ 2024-08-09 11:03 ` Andy Shevchenko
2024-08-13 13:32 ` Breno Leitao
0 siblings, 1 reply; 9+ messages in thread
From: Andy Shevchenko @ 2024-08-09 11:03 UTC (permalink / raw)
To: Andi Shyti
Cc: Breno Leitao, Laxman Dewangan, Dmitry Osipenko, Thierry Reding,
Jonathan Hunter, leit, Michael van der Westhuizen,
open list:I2C SUBSYSTEM HOST DRIVERS,
open list:TEGRA ARCHITECTURE SUPPORT, open list
On Fri, Aug 9, 2024 at 2:57 AM Andi Shyti <andi.shyti@kernel.org> wrote:
> On Thu, Aug 08, 2024 at 05:14:46AM GMT, Breno Leitao wrote:
> > On ACPI machines, the tegra i2c module encounters an issue due to a
> > mutex being called inside a spinlock. This leads to the following bug:
First of all, ...
> > BUG: sleeping function called from invalid context at kernel/locking/mutex.c:585
> > in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 1282, name: kssif0010
> > preempt_count: 0, expected: 0
> > RCU nest depth: 0, expected: 0
> > irq event stamp: 0
> >
> > Call trace:
> > dump_backtrace+0xf0/0x140
> > show_stack (./arch/x86/include/asm/current.h:49
> > arch/x86/kernel/dumpstack.c:312)
> > dump_stack_lvl (lib/dump_stack.c:89 lib/dump_stack.c:115)
> > dump_stack (lib/earlycpio.c:61)
> > __might_resched (./arch/x86/include/asm/current.h:49
> > kernel/sched/core.c:10297)
> > __might_sleep (./include/linux/lockdep.h:231
> > kernel/sched/core.c:10236)
> > __mutex_lock_common+0x5c/0x2190
> > mutex_lock_nested (kernel/locking/mutex.c:751)
> > acpi_subsys_runtime_resume+0xb8/0x160
> > __rpm_callback+0x1cc/0x4b0
> > rpm_resume+0xa60/0x1078
> > __pm_runtime_resume+0xbc/0x130
> > tegra_i2c_xfer+0x74/0x398
> > __i2c_transfer (./include/trace/events/i2c.h:122 drivers/i2c/i2c-core-base.c:2258)
...read Submitting Patches and make the above to be ~5-6 significant
(useful) lines only.
> > The problem arises because during __pm_runtime_resume(), the spinlock
> > &dev->power.lock is acquired before rpm_resume() is called. Later,
> > rpm_resume() invokes acpi_subsys_runtime_resume(), which relies on
> > mutexes, triggering the error.
> >
> > To address this issue, devices on ACPI are now marked as not IRQ-safe,
> > considering the dependency of acpi_subsys_runtime_resume() on mutexes.
This is a step in the right direction, but somewhere in the replies
here I would like to hear about roadmap to get rid of the
pm_runtime_irq_safe() in all Tegra related code.
...
> > * VI I2C device shouldn't be marked as IRQ-safe because VI I2C won't
> > * be used for atomic transfers.
> > */
Does the comment need an expansion?
> > - if (!IS_VI(i2c_dev))
> > + if (!IS_VI(i2c_dev) && !ACPI_HANDLE(i2c_dev->dev))
>
> looks good to me, can I have an ack from Andy here?
I prefer to see something like
is_acpi_node() / is_acpi_device_node() / is_acpi_data_node() /
has_acpi_companion()
instead depending on the actual ACPI representation of the device.
Otherwise no objections.
Please, Cc me (andy@kernel.org) for the next version.
--
With Best Regards,
Andy Shevchenko
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH RESEND] Do not mark ACPI devices as irq safe
2024-08-09 11:03 ` Andy Shevchenko
@ 2024-08-13 13:32 ` Breno Leitao
2024-08-13 15:28 ` Dmitry Osipenko
0 siblings, 1 reply; 9+ messages in thread
From: Breno Leitao @ 2024-08-13 13:32 UTC (permalink / raw)
To: Andy Shevchenko, dmitry.osipenko
Cc: Andi Shyti, Laxman Dewangan, Dmitry Osipenko, Thierry Reding,
Jonathan Hunter, leit, Michael van der Westhuizen,
open list:I2C SUBSYSTEM HOST DRIVERS,
open list:TEGRA ARCHITECTURE SUPPORT, open list
Hello Andy,
On Fri, Aug 09, 2024 at 02:03:27PM +0300, Andy Shevchenko wrote:
> On Fri, Aug 9, 2024 at 2:57 AM Andi Shyti <andi.shyti@kernel.org> wrote:
> > On Thu, Aug 08, 2024 at 05:14:46AM GMT, Breno Leitao wrote:
> > > The problem arises because during __pm_runtime_resume(), the spinlock
> > > &dev->power.lock is acquired before rpm_resume() is called. Later,
> > > rpm_resume() invokes acpi_subsys_runtime_resume(), which relies on
> > > mutexes, triggering the error.
> > >
> > > To address this issue, devices on ACPI are now marked as not IRQ-safe,
> > > considering the dependency of acpi_subsys_runtime_resume() on mutexes.
>
> This is a step in the right direction
Thanks
> but somewhere in the replies
> here I would like to hear about roadmap to get rid of the
> pm_runtime_irq_safe() in all Tegra related code.
Agree, that seems the right way to go, but this is a question to
maintainers, Laxman and Dmitry.
By the way, looking at lore, I found that the last email from Laxman is
from 2022. And Dmitry seems to be using a different email!? Let me copy
the Dmitry's other email (dmitry.osipenko@collabora.com) here.
> > > + if (!IS_VI(i2c_dev) && !ACPI_HANDLE(i2c_dev->dev))
> >
> > looks good to me, can I have an ack from Andy here?
>
> I prefer to see something like
> is_acpi_node() / is_acpi_device_node() / is_acpi_data_node() /
> has_acpi_companion()
> instead depending on the actual ACPI representation of the device.
>
> Otherwise no objections.
> Please, Cc me (andy@kernel.org) for the next version.
Thanks for the feedback, I agree that leveraging the functions about
should be better. What about something as:
Author: Breno Leitao <leitao@debian.org>
Date: Thu Jun 6 06:27:07 2024 -0700
Do not mark ACPI devices as irq safe
On ACPI machines, the tegra i2c module encounters an issue due to a
mutex being called inside a spinlock. This leads to the following bug:
BUG: sleeping function called from invalid context at kernel/locking/mutex.c:585
in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 1282, name: kssif0010
preempt_count: 0, expected: 0
RCU nest depth: 0, expected: 0
irq event stamp: 0
Call trace:
__might_sleep
__mutex_lock_common
mutex_lock_nested
acpi_subsys_runtime_resume
rpm_resume
tegra_i2c_xfer
The problem arises because during __pm_runtime_resume(), the spinlock
&dev->power.lock is acquired before rpm_resume() is called. Later,
rpm_resume() invokes acpi_subsys_runtime_resume(), which relies on
mutexes, triggering the error.
To address this issue, devices on ACPI are now marked as not IRQ-safe,
considering the dependency of acpi_subsys_runtime_resume() on mutexes.
Co-developed-by: Michael van der Westhuizen <rmikey@meta.com>
Signed-off-by: Michael van der Westhuizen <rmikey@meta.com>
Signed-off-by: Breno Leitao <leitao@debian.org>
diff --git a/drivers/i2c/busses/i2c-tegra.c b/drivers/i2c/busses/i2c-tegra.c
index 85b31edc558d..1df5b4204142 100644
--- a/drivers/i2c/busses/i2c-tegra.c
+++ b/drivers/i2c/busses/i2c-tegra.c
@@ -1802,9 +1802,9 @@ static int tegra_i2c_probe(struct platform_device *pdev)
* domain.
*
* VI I2C device shouldn't be marked as IRQ-safe because VI I2C won't
- * be used for atomic transfers.
+ * be used for atomic transfers. ACPI device is not IRQ safe also.
*/
- if (!IS_VI(i2c_dev))
+ if (!IS_VI(i2c_dev) && !has_acpi_companion(i2c_dev->dev))
pm_runtime_irq_safe(i2c_dev->dev);
pm_runtime_enable(i2c_dev->dev);
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH RESEND] Do not mark ACPI devices as irq safe
2024-08-13 13:32 ` Breno Leitao
@ 2024-08-13 15:28 ` Dmitry Osipenko
2024-08-13 15:52 ` Andy Shevchenko
0 siblings, 1 reply; 9+ messages in thread
From: Dmitry Osipenko @ 2024-08-13 15:28 UTC (permalink / raw)
To: Breno Leitao, Andy Shevchenko, dmitry.osipenko
Cc: Andi Shyti, Laxman Dewangan, Thierry Reding, Jonathan Hunter,
leit, Michael van der Westhuizen,
open list:I2C SUBSYSTEM HOST DRIVERS,
open list:TEGRA ARCHITECTURE SUPPORT, open list
13.08.2024 16:32, Breno Leitao пишет:
> Hello Andy,
>
> On Fri, Aug 09, 2024 at 02:03:27PM +0300, Andy Shevchenko wrote:
>> On Fri, Aug 9, 2024 at 2:57 AM Andi Shyti <andi.shyti@kernel.org> wrote:
>>> On Thu, Aug 08, 2024 at 05:14:46AM GMT, Breno Leitao wrote:
>
>>>> The problem arises because during __pm_runtime_resume(), the spinlock
>>>> &dev->power.lock is acquired before rpm_resume() is called. Later,
>>>> rpm_resume() invokes acpi_subsys_runtime_resume(), which relies on
>>>> mutexes, triggering the error.
>>>>
>>>> To address this issue, devices on ACPI are now marked as not IRQ-safe,
>>>> considering the dependency of acpi_subsys_runtime_resume() on mutexes.
>>
>> This is a step in the right direction
>
> Thanks
>
>> but somewhere in the replies
>> here I would like to hear about roadmap to get rid of the
>> pm_runtime_irq_safe() in all Tegra related code.
>
> Agree, that seems the right way to go, but this is a question to
> maintainers, Laxman and Dmitry.
>
> By the way, looking at lore, I found that the last email from Laxman is
> from 2022. And Dmitry seems to be using a different email!? Let me copy
> the Dmitry's other email (dmitry.osipenko@collabora.com) here.
>
>>>> + if (!IS_VI(i2c_dev) && !ACPI_HANDLE(i2c_dev->dev))
>>>
>>> looks good to me, can I have an ack from Andy here?
>>
>> I prefer to see something like
>> is_acpi_node() / is_acpi_device_node() / is_acpi_data_node() /
>> has_acpi_companion()
>> instead depending on the actual ACPI representation of the device.
>>
>> Otherwise no objections.
>> Please, Cc me (andy@kernel.org) for the next version.
>
> Thanks for the feedback, I agree that leveraging the functions about
> should be better. What about something as:
>
> Author: Breno Leitao <leitao@debian.org>
> Date: Thu Jun 6 06:27:07 2024 -0700
>
> Do not mark ACPI devices as irq safe
>
> On ACPI machines, the tegra i2c module encounters an issue due to a
> mutex being called inside a spinlock. This leads to the following bug:
>
> BUG: sleeping function called from invalid context at kernel/locking/mutex.c:585
> in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 1282, name: kssif0010
> preempt_count: 0, expected: 0
> RCU nest depth: 0, expected: 0
> irq event stamp: 0
>
> Call trace:
> __might_sleep
> __mutex_lock_common
> mutex_lock_nested
> acpi_subsys_runtime_resume
> rpm_resume
> tegra_i2c_xfer
>
> The problem arises because during __pm_runtime_resume(), the spinlock
> &dev->power.lock is acquired before rpm_resume() is called. Later,
> rpm_resume() invokes acpi_subsys_runtime_resume(), which relies on
> mutexes, triggering the error.
>
> To address this issue, devices on ACPI are now marked as not IRQ-safe,
> considering the dependency of acpi_subsys_runtime_resume() on mutexes.
>
> Co-developed-by: Michael van der Westhuizen <rmikey@meta.com>
> Signed-off-by: Michael van der Westhuizen <rmikey@meta.com>
> Signed-off-by: Breno Leitao <leitao@debian.org>
>
> diff --git a/drivers/i2c/busses/i2c-tegra.c b/drivers/i2c/busses/i2c-tegra.c
> index 85b31edc558d..1df5b4204142 100644
> --- a/drivers/i2c/busses/i2c-tegra.c
> +++ b/drivers/i2c/busses/i2c-tegra.c
> @@ -1802,9 +1802,9 @@ static int tegra_i2c_probe(struct platform_device *pdev)
> * domain.
> *
> * VI I2C device shouldn't be marked as IRQ-safe because VI I2C won't
> - * be used for atomic transfers.
> + * be used for atomic transfers. ACPI device is not IRQ safe also.
> */
> - if (!IS_VI(i2c_dev))
> + if (!IS_VI(i2c_dev) && !has_acpi_companion(i2c_dev->dev))
> pm_runtime_irq_safe(i2c_dev->dev);
>
> pm_runtime_enable(i2c_dev->dev);
>
Looks good, thanks
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
> but somewhere in the replies
> here I would like to hear about roadmap to get rid of the
> pm_runtime_irq_safe() in all Tegra related code.
What is the problem with pm_runtime_irq_safe()? There were multiple
problems with RPM for this driver in the past, it wasn't trivial to make
it work for all Tegra HW generations. Don't expect anyone would want to
invest time into doing it all over again.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH RESEND] Do not mark ACPI devices as irq safe
2024-08-13 15:28 ` Dmitry Osipenko
@ 2024-08-13 15:52 ` Andy Shevchenko
2024-08-15 2:48 ` Dmitry Osipenko
0 siblings, 1 reply; 9+ messages in thread
From: Andy Shevchenko @ 2024-08-13 15:52 UTC (permalink / raw)
To: Dmitry Osipenko
Cc: Breno Leitao, dmitry.osipenko, Andi Shyti, Laxman Dewangan,
Thierry Reding, Jonathan Hunter, leit, Michael van der Westhuizen,
open list:I2C SUBSYSTEM HOST DRIVERS,
open list:TEGRA ARCHITECTURE SUPPORT, open list
On Tue, Aug 13, 2024 at 6:28 PM Dmitry Osipenko <digetx@gmail.com> wrote:
>
> 13.08.2024 16:32, Breno Leitao пишет:
> > Hello Andy,
> >
> > On Fri, Aug 09, 2024 at 02:03:27PM +0300, Andy Shevchenko wrote:
> >> On Fri, Aug 9, 2024 at 2:57 AM Andi Shyti <andi.shyti@kernel.org> wrote:
> >>> On Thu, Aug 08, 2024 at 05:14:46AM GMT, Breno Leitao wrote:
> >
> >>>> The problem arises because during __pm_runtime_resume(), the spinlock
> >>>> &dev->power.lock is acquired before rpm_resume() is called. Later,
> >>>> rpm_resume() invokes acpi_subsys_runtime_resume(), which relies on
> >>>> mutexes, triggering the error.
> >>>>
> >>>> To address this issue, devices on ACPI are now marked as not IRQ-safe,
> >>>> considering the dependency of acpi_subsys_runtime_resume() on mutexes.
> >>
> >> This is a step in the right direction
> >
> > Thanks
> >
> >> but somewhere in the replies
> >> here I would like to hear about roadmap to get rid of the
> >> pm_runtime_irq_safe() in all Tegra related code.
> >
> > Agree, that seems the right way to go, but this is a question to
> > maintainers, Laxman and Dmitry.
> >
> > By the way, looking at lore, I found that the last email from Laxman is
> > from 2022. And Dmitry seems to be using a different email!? Let me copy
> > the Dmitry's other email (dmitry.osipenko@collabora.com) here.
> >
> >>>> + if (!IS_VI(i2c_dev) && !ACPI_HANDLE(i2c_dev->dev))
> >>>
> >>> looks good to me, can I have an ack from Andy here?
> >>
> >> I prefer to see something like
> >> is_acpi_node() / is_acpi_device_node() / is_acpi_data_node() /
> >> has_acpi_companion()
> >> instead depending on the actual ACPI representation of the device.
> >>
> >> Otherwise no objections.
> >> Please, Cc me (andy@kernel.org) for the next version.
> >
> > Thanks for the feedback, I agree that leveraging the functions about
> > should be better. What about something as:
> >
> > Author: Breno Leitao <leitao@debian.org>
> > Date: Thu Jun 6 06:27:07 2024 -0700
> >
> > Do not mark ACPI devices as irq safe
> >
> > On ACPI machines, the tegra i2c module encounters an issue due to a
> > mutex being called inside a spinlock. This leads to the following bug:
> >
> > BUG: sleeping function called from invalid context at kernel/locking/mutex.c:585
> > in_atomic(): 0, irqs_disabled(): 1, non_block: 0, pid: 1282, name: kssif0010
> > preempt_count: 0, expected: 0
> > RCU nest depth: 0, expected: 0
> > irq event stamp: 0
> >
> > Call trace:
> > __might_sleep
> > __mutex_lock_common
> > mutex_lock_nested
> > acpi_subsys_runtime_resume
> > rpm_resume
> > tegra_i2c_xfer
> >
> > The problem arises because during __pm_runtime_resume(), the spinlock
> > &dev->power.lock is acquired before rpm_resume() is called. Later,
> > rpm_resume() invokes acpi_subsys_runtime_resume(), which relies on
> > mutexes, triggering the error.
> >
> > To address this issue, devices on ACPI are now marked as not IRQ-safe,
> > considering the dependency of acpi_subsys_runtime_resume() on mutexes.
> >
> > Co-developed-by: Michael van der Westhuizen <rmikey@meta.com>
> > Signed-off-by: Michael van der Westhuizen <rmikey@meta.com>
> > Signed-off-by: Breno Leitao <leitao@debian.org>
> >
> > diff --git a/drivers/i2c/busses/i2c-tegra.c b/drivers/i2c/busses/i2c-tegra.c
> > index 85b31edc558d..1df5b4204142 100644
> > --- a/drivers/i2c/busses/i2c-tegra.c
> > +++ b/drivers/i2c/busses/i2c-tegra.c
> > @@ -1802,9 +1802,9 @@ static int tegra_i2c_probe(struct platform_device *pdev)
> > * domain.
> > *
> > * VI I2C device shouldn't be marked as IRQ-safe because VI I2C won't
> > - * be used for atomic transfers.
> > + * be used for atomic transfers. ACPI device is not IRQ safe also.
> > */
> > - if (!IS_VI(i2c_dev))
> > + if (!IS_VI(i2c_dev) && !has_acpi_companion(i2c_dev->dev))
> > pm_runtime_irq_safe(i2c_dev->dev);
> >
> > pm_runtime_enable(i2c_dev->dev);
> >
>
> Looks good, thanks
>
> Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
LGTM as well, feel free to add
Reviewed-by: Andy Shevchenko <andy@kernel.org>
to the above when sending it formally.
> > but somewhere in the replies
> > here I would like to hear about roadmap to get rid of the
> > pm_runtime_irq_safe() in all Tegra related code.
>
> What is the problem with pm_runtime_irq_safe()?
It's a hack. It has no reasons to stay in the kernel. It also prevents
PM from working properly (in some cases, not Tegra).
> There were multiple
> problems with RPM for this driver in the past, it wasn't trivial to make
> it work for all Tegra HW generations. Don't expect anyone would want to
> invest time into doing it all over again.
You may always refer to the OMAP case, which used to have 12 (IIRC,
but definitely several) calls to this API and now 0. Taking the OMAP
case into consideration I believe it's quite possible to get rid of
this hack and retire the API completely. Yes, this may take months or
even years. But I would like to have this roadmap be documented.
--
With Best Regards,
Andy Shevchenko
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH RESEND] Do not mark ACPI devices as irq safe
2024-08-13 15:52 ` Andy Shevchenko
@ 2024-08-15 2:48 ` Dmitry Osipenko
2024-08-19 9:20 ` Andy Shevchenko
0 siblings, 1 reply; 9+ messages in thread
From: Dmitry Osipenko @ 2024-08-15 2:48 UTC (permalink / raw)
To: Andy Shevchenko
Cc: Breno Leitao, dmitry.osipenko, Andi Shyti, Laxman Dewangan,
Thierry Reding, Jonathan Hunter, leit, Michael van der Westhuizen,
open list:I2C SUBSYSTEM HOST DRIVERS,
open list:TEGRA ARCHITECTURE SUPPORT, open list
13.08.2024 18:52, Andy Shevchenko пишет:
...
>>> but somewhere in the replies
>>> here I would like to hear about roadmap to get rid of the
>>> pm_runtime_irq_safe() in all Tegra related code.
>>
>> What is the problem with pm_runtime_irq_safe()?
>
> It's a hack. It has no reasons to stay in the kernel. It also prevents
> PM from working properly (in some cases, not Tegra).
Why is it a hack? Why it can't be made to work properly for all cases?
>> There were multiple
>> problems with RPM for this driver in the past, it wasn't trivial to make
>> it work for all Tegra HW generations. Don't expect anyone would want to
>> invest time into doing it all over again.
>
> You may always refer to the OMAP case, which used to have 12 (IIRC,
> but definitely several) calls to this API and now 0. Taking the OMAP
> case into consideration I believe it's quite possible to get rid of
> this hack and retire the API completely. Yes, this may take months or
> even years. But I would like to have this roadmap be documented.
There should be alternative to the removed API. Otherwise drivers will
have to have own hacks to work around the RPM limitation, re-invent own
PM, or not do RPM at all.
Looking at the i2c-omap.c, I see it's doing pm_runtime_get_sync() in the
atomic transfer, which should cause a lockup without IRQ-safe RPM,
AFAICT. The OMAP example doesn't look great so far.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH RESEND] Do not mark ACPI devices as irq safe
2024-08-15 2:48 ` Dmitry Osipenko
@ 2024-08-19 9:20 ` Andy Shevchenko
2024-08-20 3:57 ` Tony Lindgren
0 siblings, 1 reply; 9+ messages in thread
From: Andy Shevchenko @ 2024-08-19 9:20 UTC (permalink / raw)
To: Dmitry Osipenko, Tony Lindgren
Cc: Breno Leitao, dmitry.osipenko, Andi Shyti, Laxman Dewangan,
Thierry Reding, Jonathan Hunter, leit, Michael van der Westhuizen,
open list:I2C SUBSYSTEM HOST DRIVERS,
open list:TEGRA ARCHITECTURE SUPPORT, open list
+Cc: Tony
On Thu, Aug 15, 2024 at 5:48 AM Dmitry Osipenko <digetx@gmail.com> wrote:
> 13.08.2024 18:52, Andy Shevchenko пишет:
> ...
> >>> but somewhere in the replies
> >>> here I would like to hear about roadmap to get rid of the
> >>> pm_runtime_irq_safe() in all Tegra related code.
> >>
> >> What is the problem with pm_runtime_irq_safe()?
> >
> > It's a hack. It has no reasons to stay in the kernel. It also prevents
> > PM from working properly (in some cases, not Tegra).
>
> Why is it a hack? Why it can't be made to work properly for all cases?
Because it messes up with the proper power transitions of the parent
devices. Refer to the initial commit c7b61de5b7b1 ("PM / Runtime: Add
synchronous runtime interface for interrupt handlers (v3)") that
pretty much explains the constraints of it. Also note, it was added
quite a while after the main PM machinery had been introduced.
What you have to use is device links to make sure the parent (PM
speaking) may not go away.
FWIW, if I am not mistaken the whole reconsideration of
pm_runtime_irq_safe() had been started with this [1] thread.
If you want to dive more into the history of this API, run `git log -S
pm_runtime_irq_safe`. It gives you also interesting facts of how it
was started being used and in many cases reverted or reworked for a
reason.
> >> There were multiple
> >> problems with RPM for this driver in the past, it wasn't trivial to make
> >> it work for all Tegra HW generations. Don't expect anyone would want to
> >> invest time into doing it all over again.
> >
> > You may always refer to the OMAP case, which used to have 12 (IIRC,
> > but definitely several) calls to this API and now 0. Taking the OMAP
> > case into consideration I believe it's quite possible to get rid of
> > this hack and retire the API completely. Yes, this may take months or
> > even years. But I would like to have this roadmap be documented.
>
> There should be alternative to the removed API. Otherwise drivers will
> have to have own hacks to work around the RPM limitation, re-invent own
> PM, or not do RPM at all.
>
> Looking at the i2c-omap.c, I see it's doing pm_runtime_get_sync() in the
> atomic transfer, which should cause a lockup without IRQ-safe RPM,
> AFAICT. The OMAP example doesn't look great so far.
Bugs may still appear, but it's not a point. I can easily find a
better example with a hint why it's bad to call that API [2][3][4] and
so on.
[1]: https://lore.kernel.org/all/20180515183409.78046-1-andriy.shevchenko@linux.intel.com/T/#u
[2]: https://lore.kernel.org/all/20191114101718.20619-1-peter.ujfalusi@ti.com/
[3]: https://lore.kernel.org/all/20180920193532.7714-1-tony@atomide.com/
[4]: https://lore.kernel.org/all/1463014396-4095-1-git-send-email-tony@atomide.com/
--
With Best Regards,
Andy Shevchenko
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH RESEND] Do not mark ACPI devices as irq safe
2024-08-19 9:20 ` Andy Shevchenko
@ 2024-08-20 3:57 ` Tony Lindgren
0 siblings, 0 replies; 9+ messages in thread
From: Tony Lindgren @ 2024-08-20 3:57 UTC (permalink / raw)
To: Andy Shevchenko
Cc: Dmitry Osipenko, Breno Leitao, dmitry.osipenko, Andi Shyti,
Laxman Dewangan, Thierry Reding, Jonathan Hunter, leit,
Michael van der Westhuizen, open list:I2C SUBSYSTEM HOST DRIVERS,
open list:TEGRA ARCHITECTURE SUPPORT, open list, Kevin Hilman
* Andy Shevchenko <andy.shevchenko@gmail.com> [240819 09:21]:
> +Cc: Tony
>
> On Thu, Aug 15, 2024 at 5:48 AM Dmitry Osipenko <digetx@gmail.com> wrote:
> > 13.08.2024 18:52, Andy Shevchenko пишет:
> > ...
> > >>> but somewhere in the replies
> > >>> here I would like to hear about roadmap to get rid of the
> > >>> pm_runtime_irq_safe() in all Tegra related code.
> > >>
> > >> What is the problem with pm_runtime_irq_safe()?
> > >
> > > It's a hack. It has no reasons to stay in the kernel. It also prevents
> > > PM from working properly (in some cases, not Tegra).
> >
> > Why is it a hack? Why it can't be made to work properly for all cases?
>
> Because it messes up with the proper power transitions of the parent
> devices. Refer to the initial commit c7b61de5b7b1 ("PM / Runtime: Add
> synchronous runtime interface for interrupt handlers (v3)") that
> pretty much explains the constraints of it. Also note, it was added
> quite a while after the main PM machinery had been introduced.
>
> What you have to use is device links to make sure the parent (PM
> speaking) may not go away.
> FWIW, if I am not mistaken the whole reconsideration of
> pm_runtime_irq_safe() had been started with this [1] thread.
>
> If you want to dive more into the history of this API, run `git log -S
> pm_runtime_irq_safe`. It gives you also interesting facts of how it
> was started being used and in many cases reverted or reworked for a
> reason.
Yeah we should remove pm_runtime_irq_safe() completely. Fixing the use
of it in a driver afterwards is always a pain. And so far there has
always been a better solution available for the use cases I've seen.
> > >> There were multiple
> > >> problems with RPM for this driver in the past, it wasn't trivial to make
> > >> it work for all Tegra HW generations. Don't expect anyone would want to
> > >> invest time into doing it all over again.
> > >
> > > You may always refer to the OMAP case, which used to have 12 (IIRC,
> > > but definitely several) calls to this API and now 0. Taking the OMAP
> > > case into consideration I believe it's quite possible to get rid of
> > > this hack and retire the API completely. Yes, this may take months or
> > > even years. But I would like to have this roadmap be documented.
> >
> > There should be alternative to the removed API. Otherwise drivers will
> > have to have own hacks to work around the RPM limitation, re-invent own
> > PM, or not do RPM at all.
> >
> > Looking at the i2c-omap.c, I see it's doing pm_runtime_get_sync() in the
> > atomic transfer, which should cause a lockup without IRQ-safe RPM,
> > AFAICT. The OMAP example doesn't look great so far.
>
> Bugs may still appear, but it's not a point. I can easily find a
> better example with a hint why it's bad to call that API [2][3][4] and
> so on.
Adding Kevin for the i2c-omap.c, sounds like it might depend on the
autosuspend timeout for runtime PM.
For issues where the controller may wake to an interrupt while runtime
idle, there's pm_runtime_get_noresume().
Regards,
Tony
> [1]: https://lore.kernel.org/all/20180515183409.78046-1-andriy.shevchenko@linux.intel.com/T/#u
> [2]: https://lore.kernel.org/all/20191114101718.20619-1-peter.ujfalusi@ti.com/
> [3]: https://lore.kernel.org/all/20180920193532.7714-1-tony@atomide.com/
> [4]: https://lore.kernel.org/all/1463014396-4095-1-git-send-email-tony@atomide.com/
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2024-08-20 4:06 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-08 12:14 [PATCH RESEND] Do not mark ACPI devices as irq safe Breno Leitao
2024-08-08 23:57 ` Andi Shyti
2024-08-09 11:03 ` Andy Shevchenko
2024-08-13 13:32 ` Breno Leitao
2024-08-13 15:28 ` Dmitry Osipenko
2024-08-13 15:52 ` Andy Shevchenko
2024-08-15 2:48 ` Dmitry Osipenko
2024-08-19 9:20 ` Andy Shevchenko
2024-08-20 3:57 ` Tony Lindgren
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).