From: "Alexey Klimov" <alexey.klimov@linaro.org>
To: "Praveen Talari" <praveen.talari@oss.qualcomm.com>,
"Praveen Talari" <quic_ptalari@quicinc.com>
Cc: "Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
"Jiri Slaby" <jirislaby@kernel.org>,
"Bryan O'Donoghue" <bryan.odonoghue@linaro.org>,
<linux-arm-msm@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
<linux-serial@vger.kernel.org>, <psodagud@quicinc.com>,
<djaggi@quicinc.com>, <quic_msavaliy@quicinc.com>,
<quic_vtanuku@quicinc.com>, <quic_arandive@quicinc.com>,
<quic_shazhuss@quicinc.com>, <krzk@kernel.org>
Subject: Re: [PATCH v1] serial: qcom-geni: Fix pinctrl deadlock on runtime resume
Date: Mon, 15 Sep 2025 10:39:40 +0100 [thread overview]
Message-ID: <DCT9VWQYD4VM.1NV5FJJCJG4PI@linaro.org> (raw)
In-Reply-To: <2c5fd01a-543b-4108-ac54-80d1d87b65a3@oss.qualcomm.com>
(removing <quic_mnaresh@quicinc.com> from c/c -- too many mail not delivered)
Hi Praveen,
On Mon Sep 15, 2025 at 7:58 AM BST, Praveen Talari wrote:
> Hi Alexey,
>
> Really appreciate you waiting!
>
> On 9/11/2025 2:30 PM, Alexey Klimov wrote:
>> Hi Praveen,
>>
>> On Thu Sep 11, 2025 at 9:34 AM BST, Praveen Talari wrote:
>>> Hi Alexy,
>>>
>>> Thank you for update.
>>>
>>> On 9/10/2025 1:35 AM, Alexey Klimov wrote:
>>>>
>>>> (adding Krzysztof to c/c)
>>>>
>>>> On Mon Sep 8, 2025 at 6:43 PM BST, Alexey Klimov wrote:
>>>>> On Mon Sep 8, 2025 at 5:45 PM BST, Praveen Talari wrote:
>>>>>> A deadlock is observed in the qcom_geni_serial driver during runtime
>>>>>> resume. This occurs when the pinctrl subsystem reconfigures device pins
>>>>>> via msm_pinmux_set_mux() while the serial device's interrupt is an
>>>>>> active wakeup source. msm_pinmux_set_mux() calls disable_irq() or
>>>>>> __synchronize_irq(), conflicting with the active wakeup state and
>>>>>> causing the IRQ thread to enter an uninterruptible (D-state) sleep,
>>>>>> leading to system instability.
>>>>>>
>>>>>> The critical call trace leading to the deadlock is:
>>>>>>
>>>>>> Call trace:
>>>>>> __switch_to+0xe0/0x120
>>>>>> __schedule+0x39c/0x978
>>>>>> schedule+0x5c/0xf8
>>>>>> __synchronize_irq+0x88/0xb4
>>>>>> disable_irq+0x3c/0x4c
>>>>>> msm_pinmux_set_mux+0x508/0x644
>>>>>> pinmux_enable_setting+0x190/0x2dc
>>>>>> pinctrl_commit_state+0x13c/0x208
>>>>>> pinctrl_pm_select_default_state+0x4c/0xa4
>>>>>> geni_se_resources_on+0xe8/0x154
>>>>>> qcom_geni_serial_runtime_resume+0x4c/0x88
>>>>>> pm_generic_runtime_resume+0x2c/0x44
>>>>>> __genpd_runtime_resume+0x30/0x80
>>>>>> genpd_runtime_resume+0x114/0x29c
>>>>>> __rpm_callback+0x48/0x1d8
>>>>>> rpm_callback+0x6c/0x78
>>>>>> rpm_resume+0x530/0x750
>>>>>> __pm_runtime_resume+0x50/0x94
>>>>>> handle_threaded_wake_irq+0x30/0x94
>>>>>> irq_thread_fn+0x2c/xa8
>>>>>> irq_thread+0x160/x248
>>>>>> kthread+0x110/x114
>>>>>> ret_from_fork+0x10/x20
>>>>>>
>>>>>> To resolve this, explicitly manage the wakeup IRQ state within the
>>>>>> runtime suspend/resume callbacks. In the runtime resume callback, call
>>>>>> disable_irq_wake() before enabling resources. This preemptively
>>>>>> removes the "wakeup" capability from the IRQ, allowing subsequent
>>>>>> interrupt management calls to proceed without conflict. An error path
>>>>>> re-enables the wakeup IRQ if resource enablement fails.
>>>>>>
>>>>>> Conversely, in runtime suspend, call enable_irq_wake() after resources
>>>>>> are disabled. This ensures the interrupt is configured as a wakeup
>>>>>> source only once the device has fully entered its low-power state. An
>>>>>> error path handles disabling the wakeup IRQ if the suspend operation
>>>>>> fails.
>>>>>>
>>>>>> Fixes: 1afa70632c39 ("serial: qcom-geni: Enable PM runtime for serial driver")
>>>>>> Signed-off-by: Praveen Talari <praveen.talari@oss.qualcomm.com>
>>>>>
>>>>> You forgot:
>>>>>
>>>>> Reported-by: Alexey Klimov <alexey.klimov@linaro.org>
>>>>>
>>>>> Also, not sure where this change will go, via Greg or Jiri, but ideally
>>>>> this should be picked for current -rc cycle since regression is
>>>>> introduced during latest merge window.
>>>>>
>>>>> I also would like to test it on qrb2210 rb1 where this regression is
>>>>> reproduciable.
>>>>
>>>> It doesn't seem that it fixes the regression on RB1 board:
>>>>
>>>> INFO: task kworker/u16:3:50 blocked for more than 120 seconds.
>>>> Not tainted 6.17.0-rc5-00018-g9dd1835ecda5-dirty #13
>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>> task:kworker/u16:3 state:D stack:0 pid:50 tgid:50 ppid:2 task_flags:0x4208060 flags:0x00000010
>>>> Workqueue: async async_run_entry_fn
>>>> Call trace:
>>>> __switch_to+0xf0/0x1c0 (T)
>>>> __schedule+0x358/0x99c
>>>> schedule+0x34/0x11c
>>>> rpm_resume+0x17c/0x6a0
>>>> rpm_resume+0x2c4/0x6a0
>>>> rpm_resume+0x2c4/0x6a0
>>>> rpm_resume+0x2c4/0x6a0
>>>> __pm_runtime_resume+0x50/0x9c
>>>> __driver_probe_device+0x58/0x120
>>>> driver_probe_device+0x3c/0x154
>>>> __driver_attach_async_helper+0x4c/0xc0
>>>> async_run_entry_fn+0x34/0xe0
>>>> process_one_work+0x148/0x284
>>>> worker_thread+0x2c4/0x3e0
>>>> kthread+0x12c/0x210
>>>> ret_from_fork+0x10/0x20
>>>> INFO: task irq/92-4a8c000.:79 blocked for more than 120 seconds.
>>>> Not tainted 6.17.0-rc5-00018-g9dd1835ecda5-dirty #13
>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>> task:irq/92-4a8c000. state:D stack:0 pid:79 tgid:79 ppid:2 task_flags:0x208040 flags:0x00000010
>>>> Call trace:
>>>> __switch_to+0xf0/0x1c0 (T)
>>>> __schedule+0x358/0x99c
>>>> schedule+0x34/0x11c
>>>> __synchronize_irq+0x90/0xcc
>>>> disable_irq+0x3c/0x4c
>>>> msm_pinmux_set_mux+0x3b4/0x45c
>>>> pinmux_enable_setting+0x1fc/0x2d8
>>>> pinctrl_commit_state+0xa0/0x260
>>>> pinctrl_pm_select_default_state+0x4c/0xa0
>>>> geni_se_resources_on+0xe8/0x154
>>>> geni_serial_resource_state+0x8c/0xbc
>>>> qcom_geni_serial_runtime_resume+0x3c/0x88
>>>> pm_generic_runtime_resume+0x2c/0x44
>>>> __rpm_callback+0x48/0x1e0
>>>> rpm_callback+0x74/0x80
>>>> rpm_resume+0x3bc/0x6a0
>>>> __pm_runtime_resume+0x50/0x9c
>>>> handle_threaded_wake_irq+0x30/0x80
>>>> irq_thread_fn+0x2c/0xb0
>>>> irq_thread+0x170/0x334
>>>> kthread+0x12c/0x210
>>>> ret_from_fork+0x10/0x20
>>>
>>> I can see call stack is mostly similar for yours and mine but not
>>> completely at initial calls.
>>>
>>> Yours dump:
>>> > qcom_geni_serial_runtime_resume+0x3c/0x88
>>> > pm_generic_runtime_resume+0x2c/0x44
>>> > __rpm_callback+0x48/0x1e0
>>> > rpm_callback+0x74/0x80
>>> > rpm_resume+0x3bc/0x6a0
>>> > __pm_runtime_resume+0x50/0x9c
>>> > handle_threaded_wake_irq+0x30/0x80
>>>
>>> Mine:
>>> >>> qcom_geni_serial_runtime_resume+0x4c/0x88
>>> >>> pm_generic_runtime_resume+0x2c/0x44
>>> >>> __genpd_runtime_resume+0x30/0x80
>>> >>> genpd_runtime_resume+0x114/0x29c
>>> >>> __rpm_callback+0x48/0x1d8
>>> >>> rpm_callback+0x6c/0x78
>>> >>> rpm_resume+0x530/0x750
>>>
>>>
>>> Can you please share what is DT file for this Board if possible?
>>> is there any usecase enabled on this SE instance?
>>
>> Well, yeah, sorry, I didn't really compared backtraces line to line and
>> behaviour was exactly the same. I thought that the purpose was to fix
>> the regression reported earlier.
>>
>> RB1 main dts files are qrb2210-rb1.dts and qcm2290.dtsi.
>>
>> The similar board RB2 uses qrb4210-rb2.dts and sm4250.dtsi+sm6115.dtsi,
>> it is worth checking it as well.
>> For testing here I didn't use anything extra (the only change was wifi fix
>> from Loic); I tested -master and linux-next usually.
>>
>> If you can tell me what is SE instance I may be able to answer. But
>> as far as I know it is not a part of any infrastructure or CI machinery.
>> I just boot the board and see if it works, if it does then I rebuild and
>> test my changes (audio).
>
> I'm actively working on this and experimenting various scenarios with
> wakeup. I’ll share the updated patch as soon as possible.
>
> Should we include fix in V2 or new version(V1) if the fix originates
> from a different subsystem(pinctrol)?
Wait, I am a bit lost. Are there two regresssions? And is this patch only
targets one of the them?
Are there two fixes now for different problems?
If they are not related (independent) then I'd split it but it not something
exceptional -- just standard rules should apply.
Thanks,
Alexey
next prev parent reply other threads:[~2025-09-15 9:39 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-09-08 16:45 [PATCH v1] serial: qcom-geni: Fix pinctrl deadlock on runtime resume Praveen Talari
2025-09-08 17:43 ` Alexey Klimov
2025-09-09 20:05 ` Alexey Klimov
2025-09-11 8:34 ` Praveen Talari
2025-09-11 9:00 ` Alexey Klimov
2025-09-15 6:58 ` Praveen Talari
2025-09-15 9:39 ` Alexey Klimov [this message]
2025-09-15 14:25 ` Praveen Talari
2025-09-16 6:50 ` Praveen Talari
2025-09-16 14:39 ` Jorge Ramirez
2025-09-16 15:07 ` Jorge Ramirez
2025-09-16 17:12 ` Alexey Klimov
2025-09-17 0:05 ` Krzysztof Kozlowski
2025-09-17 0:13 ` Krzysztof Kozlowski
2025-09-17 4:02 ` Praveen Talari
2025-09-17 3:26 ` Praveen Talari
2025-09-16 14:34 ` Jorge Ramirez
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=DCT9VWQYD4VM.1NV5FJJCJG4PI@linaro.org \
--to=alexey.klimov@linaro.org \
--cc=bryan.odonoghue@linaro.org \
--cc=djaggi@quicinc.com \
--cc=gregkh@linuxfoundation.org \
--cc=jirislaby@kernel.org \
--cc=krzk@kernel.org \
--cc=linux-arm-msm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-serial@vger.kernel.org \
--cc=praveen.talari@oss.qualcomm.com \
--cc=psodagud@quicinc.com \
--cc=quic_arandive@quicinc.com \
--cc=quic_msavaliy@quicinc.com \
--cc=quic_ptalari@quicinc.com \
--cc=quic_shazhuss@quicinc.com \
--cc=quic_vtanuku@quicinc.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox