From: Konrad Dybcio <konrad.dybcio@linaro.org>
To: "Rafael J. Wysocki" <rafael@kernel.org>,
Marek Szyprowski <m.szyprowski@samsung.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>,
Linux PM <linux-pm@vger.kernel.org>,
Ulf Hansson <ulf.hansson@linaro.org>, Tejun Heo <tj@kernel.org>,
Nathan Chancellor <nathan@kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>,
Lai Jiangshan <jiangshanlai@gmail.com>,
Naohiro.Aota@wdc.com, kernel-team@meta.com,
Konrad Dybcio <konradybcio@kernel.org>,
Bjorn Andersson <andersson@kernel.org>
Subject: Re: [PATCH v1] PM: sleep: Restore asynchronous device resume optimization
Date: Sat, 10 Feb 2024 15:04:08 +0100
Message-ID: <dde25579-2f6e-4357-bce2-108061460431@linaro.org>
In-Reply-To: <CAJZ5v0g7VNdRtYsjAf-X24KtmnLzSXqgroReXAOeOQ+Myi2_Rw@mail.gmail.com>
On 2/7/24 11:59, Rafael J. Wysocki wrote:
> On Wed, Feb 7, 2024 at 11:31 AM Marek Szyprowski
> <m.szyprowski@samsung.com> wrote:
>>
>> Dear All,
>>
>> On 09.01.2024 17:59, Rafael J. Wysocki wrote:
>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>
>>> Before commit 7839d0078e0d ("PM: sleep: Fix possible deadlocks in core
>>> system-wide PM code"), the resume of devices that were allowed to resume
>>> asynchronously was scheduled before starting the resume of the other
>>> devices, so the former did not have to wait for the latter unless
>>> functional dependencies were present.
>>>
>>> Commit 7839d0078e0d removed that optimization in order to address a
>>> correctness issue, but it can be restored with the help of a new device
>>> power management flag, so do that now.
>>>
>>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>> ---
>>
>> This patch finally landed in linux-next some time ago as 3e999770ac1c
>> ("PM: sleep: Restore asynchronous device resume optimization"). Recently
>> I found that it causes a non-trivial interaction with commit
>> 5797b1c18919 ("workqueue: Implement system-wide nr_active enforcement
>> for unbound workqueues"). Since merge commit 954350a5f8db in linux-next
>> system suspend/resume fails (board doesn't wake up) on my old Samsung
>> Exynos4412-based Odroid-U3 board (ARM 32bit based), which was rock
>> stable for last years.
>>
>> My further investigations confirmed that the mentioned commits are
>> responsible for this issue. Each of them separately (3e999770ac1c and
>> 5797b1c18919) doesn't trigger any problems. Reverting any of them on top
>> of linux-next (with some additional commit due to code dependencies)
>> also fixes/hides the problem.
>>
>> Let me know if You need more information or tests on the hardware. I'm
>> open to help debugging this issue.
>
> So I found this report:
>
> https://lore.kernel.org/lkml/b3d08cd8-d77f-45dd-a2c3-4a4db5a98dfa@kernel.org/
>
> which appears to be about the same issue.
>
> Strictly speaking, the regression is introduced by 5797b1c18919,
> because it is not a mainline commit yet, but the information regarding
> the interaction of it with commit 3e999770ac1c is valuable.
>
> Essentially, what commit 3e999770ac1c does is to schedule the
> execution of all of the async resume callbacks in a given phase
> upfront, so they can run without waiting for the sync ones to complete
> (except when there is a parent-child or supplier-consumer dependency -
> the latter represented by a device link).
>
> What seems to be happening after commit 5797b1c18919 is that one (or
> more) of the async callbacks get blocked in the workqueue for some
> reason.
>
> I would try to replace system_unbound_wq in
> __async_schedule_node_domain() with a dedicated workqueue that is not
> unbound and see what happens.
Thanks Rafael for helping connect the dots!
I did the laziest thing imaginable to switch to a WQ that's
not unbound:
diff --git a/kernel/async.c b/kernel/async.c
index 97f224a5257b..37f1204ab4e9 100644
--- a/kernel/async.c
+++ b/kernel/async.c
@@ -174,7 +174,7 @@ static async_cookie_t __async_schedule_node_domain(async_func_t func,
 	spin_unlock_irqrestore(&async_lock, flags);
 	/* schedule for execution */
-	queue_work_node(node, system_unbound_wq, &entry->work);
+	queue_work_node(node, system_wq, &entry->work);
..and the system can now suspend/resume all day long again!
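For anyone following along, the ordering Rafael describes above (async resume callbacks queued upfront, sync ones handled afterwards) can be modelled roughly in user space as below. This is only a sketch: struct model_dev, its fields and model_resume_order() are made-up names for illustration, not the PM core's, and parent-child / device-link dependencies are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device on the resume list. */
struct model_dev {
	const char *name;
	bool async_ok;   /* device is allowed to resume asynchronously */
	bool scheduled;  /* set once its async resume has been queued */
};

/*
 * Record, in 'order', the indices of 'devs' in the order in which
 * their resume is started. Returns the number of entries written.
 */
size_t model_resume_order(struct model_dev *devs, size_t n, size_t *order)
{
	size_t i, out = 0;

	assert(n == 0 || (devs && order));

	/* Pass 1: queue every async-capable device upfront. */
	for (i = 0; i < n; i++) {
		devs[i].scheduled = devs[i].async_ok;
		if (devs[i].scheduled)
			order[out++] = i;
	}

	/* Pass 2: start the remaining devices synchronously, in list order. */
	for (i = 0; i < n; i++) {
		if (!devs[i].scheduled)
			order[out++] = i;
	}

	return out;
}
```

With a list of sync, async, sync, async devices, the two async ones are started first and the sync ones follow in list order. In the real kernel the async items additionally depend on the workqueue actually running them, which is where the interaction discussed in this thread comes in.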
Konrad