From: Praveen Talari <praveen.talari@oss.qualcomm.com>
To: Alexey Klimov <alexey.klimov@linaro.org>,
Praveen Talari <quic_ptalari@quicinc.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Jiri Slaby <jirislaby@kernel.org>,
Bryan O'Donoghue <bryan.odonoghue@linaro.org>,
linux-arm-msm@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-serial@vger.kernel.org, psodagud@quicinc.com,
djaggi@quicinc.com, quic_msavaliy@quicinc.com,
quic_vtanuku@quicinc.com, quic_arandive@quicinc.com,
quic_shazhuss@quicinc.com, krzk@kernel.org
Subject: Re: [PATCH v1] serial: qcom-geni: Fix pinctrl deadlock on runtime resume
Date: Mon, 15 Sep 2025 19:55:14 +0530 [thread overview]
Message-ID: <cb96f3cd-7427-4644-b7ca-26b763867db4@oss.qualcomm.com> (raw)
In-Reply-To: <DCT9VWQYD4VM.1NV5FJJCJG4PI@linaro.org>
Hi Alexey,
On 9/15/2025 3:09 PM, Alexey Klimov wrote:
> (removing <quic_mnaresh@quicinc.com> from c/c -- too many mail not delivered)
>
> Hi Praveen,
>
> On Mon Sep 15, 2025 at 7:58 AM BST, Praveen Talari wrote:
>> Hi Alexey,
>>
>> Really appreciate you waiting!
>>
>> On 9/11/2025 2:30 PM, Alexey Klimov wrote:
>>> Hi Praveen,
>>>
>>> On Thu Sep 11, 2025 at 9:34 AM BST, Praveen Talari wrote:
>>>> Hi Alexy,
>>>>
>>>> Thank you for update.
>>>>
>>>> On 9/10/2025 1:35 AM, Alexey Klimov wrote:
>>>>>
>>>>> (adding Krzysztof to c/c)
>>>>>
>>>>> On Mon Sep 8, 2025 at 6:43 PM BST, Alexey Klimov wrote:
>>>>>> On Mon Sep 8, 2025 at 5:45 PM BST, Praveen Talari wrote:
>>>>>>> A deadlock is observed in the qcom_geni_serial driver during runtime
>>>>>>> resume. This occurs when the pinctrl subsystem reconfigures device pins
>>>>>>> via msm_pinmux_set_mux() while the serial device's interrupt is an
>>>>>>> active wakeup source. msm_pinmux_set_mux() calls disable_irq() or
>>>>>>> __synchronize_irq(), conflicting with the active wakeup state and
>>>>>>> causing the IRQ thread to enter an uninterruptible (D-state) sleep,
>>>>>>> leading to system instability.
>>>>>>>
>>>>>>> The critical call trace leading to the deadlock is:
>>>>>>>
>>>>>>> Call trace:
>>>>>>> __switch_to+0xe0/0x120
>>>>>>> __schedule+0x39c/0x978
>>>>>>> schedule+0x5c/0xf8
>>>>>>> __synchronize_irq+0x88/0xb4
>>>>>>> disable_irq+0x3c/0x4c
>>>>>>> msm_pinmux_set_mux+0x508/0x644
>>>>>>> pinmux_enable_setting+0x190/0x2dc
>>>>>>> pinctrl_commit_state+0x13c/0x208
>>>>>>> pinctrl_pm_select_default_state+0x4c/0xa4
>>>>>>> geni_se_resources_on+0xe8/0x154
>>>>>>> qcom_geni_serial_runtime_resume+0x4c/0x88
>>>>>>> pm_generic_runtime_resume+0x2c/0x44
>>>>>>> __genpd_runtime_resume+0x30/0x80
>>>>>>> genpd_runtime_resume+0x114/0x29c
>>>>>>> __rpm_callback+0x48/0x1d8
>>>>>>> rpm_callback+0x6c/0x78
>>>>>>> rpm_resume+0x530/0x750
>>>>>>> __pm_runtime_resume+0x50/0x94
>>>>>>> handle_threaded_wake_irq+0x30/0x94
>>>>>>> irq_thread_fn+0x2c/xa8
>>>>>>> irq_thread+0x160/x248
>>>>>>> kthread+0x110/x114
>>>>>>> ret_from_fork+0x10/x20
>>>>>>>
>>>>>>> To resolve this, explicitly manage the wakeup IRQ state within the
>>>>>>> runtime suspend/resume callbacks. In the runtime resume callback, call
>>>>>>> disable_irq_wake() before enabling resources. This preemptively
>>>>>>> removes the "wakeup" capability from the IRQ, allowing subsequent
>>>>>>> interrupt management calls to proceed without conflict. An error path
>>>>>>> re-enables the wakeup IRQ if resource enablement fails.
>>>>>>>
>>>>>>> Conversely, in runtime suspend, call enable_irq_wake() after resources
>>>>>>> are disabled. This ensures the interrupt is configured as a wakeup
>>>>>>> source only once the device has fully entered its low-power state. An
>>>>>>> error path handles disabling the wakeup IRQ if the suspend operation
>>>>>>> fails.
>>>>>>>
>>>>>>> Fixes: 1afa70632c39 ("serial: qcom-geni: Enable PM runtime for serial driver")
>>>>>>> Signed-off-by: Praveen Talari <praveen.talari@oss.qualcomm.com>
>>>>>>
>>>>>> You forgot:
>>>>>>
>>>>>> Reported-by: Alexey Klimov <alexey.klimov@linaro.org>
>>>>>>
>>>>>> Also, not sure where this change will go, via Greg or Jiri, but ideally
>>>>>> this should be picked for current -rc cycle since regression is
>>>>>> introduced during latest merge window.
>>>>>>
>>>>>> I also would like to test it on qrb2210 rb1 where this regression is
>>>>>> reproduciable.
>>>>>
>>>>> It doesn't seem that it fixes the regression on RB1 board:
>>>>>
>>>>> INFO: task kworker/u16:3:50 blocked for more than 120 seconds.
>>>>> Not tainted 6.17.0-rc5-00018-g9dd1835ecda5-dirty #13
>>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>>> task:kworker/u16:3 state:D stack:0 pid:50 tgid:50 ppid:2 task_flags:0x4208060 flags:0x00000010
>>>>> Workqueue: async async_run_entry_fn
>>>>> Call trace:
>>>>> __switch_to+0xf0/0x1c0 (T)
>>>>> __schedule+0x358/0x99c
>>>>> schedule+0x34/0x11c
>>>>> rpm_resume+0x17c/0x6a0
>>>>> rpm_resume+0x2c4/0x6a0
>>>>> rpm_resume+0x2c4/0x6a0
>>>>> rpm_resume+0x2c4/0x6a0
>>>>> __pm_runtime_resume+0x50/0x9c
>>>>> __driver_probe_device+0x58/0x120
>>>>> driver_probe_device+0x3c/0x154
>>>>> __driver_attach_async_helper+0x4c/0xc0
>>>>> async_run_entry_fn+0x34/0xe0
>>>>> process_one_work+0x148/0x284
>>>>> worker_thread+0x2c4/0x3e0
>>>>> kthread+0x12c/0x210
>>>>> ret_from_fork+0x10/0x20
>>>>> INFO: task irq/92-4a8c000.:79 blocked for more than 120 seconds.
>>>>> Not tainted 6.17.0-rc5-00018-g9dd1835ecda5-dirty #13
>>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>>> task:irq/92-4a8c000. state:D stack:0 pid:79 tgid:79 ppid:2 task_flags:0x208040 flags:0x00000010
>>>>> Call trace:
>>>>> __switch_to+0xf0/0x1c0 (T)
>>>>> __schedule+0x358/0x99c
>>>>> schedule+0x34/0x11c
>>>>> __synchronize_irq+0x90/0xcc
>>>>> disable_irq+0x3c/0x4c
>>>>> msm_pinmux_set_mux+0x3b4/0x45c
>>>>> pinmux_enable_setting+0x1fc/0x2d8
>>>>> pinctrl_commit_state+0xa0/0x260
>>>>> pinctrl_pm_select_default_state+0x4c/0xa0
>>>>> geni_se_resources_on+0xe8/0x154
>>>>> geni_serial_resource_state+0x8c/0xbc
>>>>> qcom_geni_serial_runtime_resume+0x3c/0x88
>>>>> pm_generic_runtime_resume+0x2c/0x44
>>>>> __rpm_callback+0x48/0x1e0
>>>>> rpm_callback+0x74/0x80
>>>>> rpm_resume+0x3bc/0x6a0
>>>>> __pm_runtime_resume+0x50/0x9c
>>>>> handle_threaded_wake_irq+0x30/0x80
>>>>> irq_thread_fn+0x2c/0xb0
>>>>> irq_thread+0x170/0x334
>>>>> kthread+0x12c/0x210
>>>>> ret_from_fork+0x10/0x20
>>>>
>>>> I can see call stack is mostly similar for yours and mine but not
>>>> completely at initial calls.
>>>>
>>>> Yours dump:
>>>> > qcom_geni_serial_runtime_resume+0x3c/0x88
>>>> > pm_generic_runtime_resume+0x2c/0x44
>>>> > __rpm_callback+0x48/0x1e0
>>>> > rpm_callback+0x74/0x80
>>>> > rpm_resume+0x3bc/0x6a0
>>>> > __pm_runtime_resume+0x50/0x9c
>>>> > handle_threaded_wake_irq+0x30/0x80
>>>>
>>>> Mine:
>>>> >>> qcom_geni_serial_runtime_resume+0x4c/0x88
>>>> >>> pm_generic_runtime_resume+0x2c/0x44
>>>> >>> __genpd_runtime_resume+0x30/0x80
>>>> >>> genpd_runtime_resume+0x114/0x29c
>>>> >>> __rpm_callback+0x48/0x1d8
>>>> >>> rpm_callback+0x6c/0x78
>>>> >>> rpm_resume+0x530/0x750
>>>>
>>>>
>>>> Can you please share what is DT file for this Board if possible?
>>>> is there any usecase enabled on this SE instance?
>>>
>>> Well, yeah, sorry, I didn't really compared backtraces line to line and
>>> behaviour was exactly the same. I thought that the purpose was to fix
>>> the regression reported earlier.
>>>
>>> RB1 main dts files are qrb2210-rb1.dts and qcm2290.dtsi.
>>>
>>> The similar board RB2 uses qrb4210-rb2.dts and sm4250.dtsi+sm6115.dtsi,
>>> it is worth checking it as well.
>>> For testing here I didn't use anything extra (the only change was wifi fix
>>> from Loic); I tested -master and linux-next usually.
>>>
>>> If you can tell me what is SE instance I may be able to answer. But
>>> as far as I know it is not a part of any infrastructure or CI machinery.
>>> I just boot the board and see if it works, if it does then I rebuild and
>>> test my changes (audio).
>>
>> I'm actively working on this and experimenting various scenarios with
>> wakeup. I’ll share the updated patch as soon as possible.
>>
>> Should we include fix in V2 or new version(V1) if the fix originates
>> from a different subsystem(pinctrol)?
>
> Wait, I am a bit lost. Are there two regresssions? And is this patch only
> targets one of the them?
I am simulated on different target(SC7280) and it is same issue only.
> Are there two fixes now for different problems?
The problem is same.
> If they are not related (independent) then I'd split it but it not something
> exceptional -- just standard rules should apply.
I am fixing from this issue from pinctrol subsystem.
Please guide me on this.
Should we include fix in V2 or new version(V1) if the fix originates
from a different subsystem(pinctrol)?
Thanks,
Praveen
>
> Thanks,
> Alexey
next prev parent reply other threads:[~2025-09-15 14:25 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-09-08 16:45 [PATCH v1] serial: qcom-geni: Fix pinctrl deadlock on runtime resume Praveen Talari
2025-09-08 17:43 ` Alexey Klimov
2025-09-09 20:05 ` Alexey Klimov
2025-09-11 8:34 ` Praveen Talari
2025-09-11 9:00 ` Alexey Klimov
2025-09-15 6:58 ` Praveen Talari
2025-09-15 9:39 ` Alexey Klimov
2025-09-15 14:25 ` Praveen Talari [this message]
2025-09-16 6:50 ` Praveen Talari
2025-09-16 14:39 ` Jorge Ramirez
2025-09-16 15:07 ` Jorge Ramirez
2025-09-16 17:12 ` Alexey Klimov
2025-09-17 0:05 ` Krzysztof Kozlowski
2025-09-17 0:13 ` Krzysztof Kozlowski
2025-09-17 4:02 ` Praveen Talari
2025-09-17 3:26 ` Praveen Talari
2025-09-16 14:34 ` Jorge Ramirez
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=cb96f3cd-7427-4644-b7ca-26b763867db4@oss.qualcomm.com \
--to=praveen.talari@oss.qualcomm.com \
--cc=alexey.klimov@linaro.org \
--cc=bryan.odonoghue@linaro.org \
--cc=djaggi@quicinc.com \
--cc=gregkh@linuxfoundation.org \
--cc=jirislaby@kernel.org \
--cc=krzk@kernel.org \
--cc=linux-arm-msm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-serial@vger.kernel.org \
--cc=psodagud@quicinc.com \
--cc=quic_arandive@quicinc.com \
--cc=quic_msavaliy@quicinc.com \
--cc=quic_ptalari@quicinc.com \
--cc=quic_shazhuss@quicinc.com \
--cc=quic_vtanuku@quicinc.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox