Linux Power Management development
 help / color / mirror / Atom feed
From: "Eric W. Biederman" <ebiederm@xmission.com>
To: Sasha Levin <sashal@kernel.org>
Cc: patches@lists.linux.dev,  stable@vger.kernel.org,
	 Mario Limonciello <mario.limonciello@amd.com>,
	 Nat Wittstock <nat@fardog.io>,
	 Lucian Langa <lucilanga@7pot.org>,
	 "Rafael J . Wysocki" <rafael.j.wysocki@intel.com>,
	rafael@kernel.org,  pavel@ucw.cz,  len.brown@intel.com,
	linux-pm@vger.kernel.org,  kexec@lists.infradead.org
Subject: Re: [PATCH AUTOSEL 6.15 6/8] PM: Restrict swap use to later in the suspend sequence
Date: Tue, 08 Jul 2025 14:13:42 -0500	[thread overview]
Message-ID: <87ikk2wl5l.fsf@email.froward.int.ebiederm.org> (raw)
In-Reply-To: <20250708000215.793090-6-sashal@kernel.org> (Sasha Levin's message of "Mon, 7 Jul 2025 20:02:13 -0400")


Wow!

Sasha I think an impersonator has gotten into your account, and is
just making nonsense up.

This reads like an impassioned plea to backport this change, from
someone who has actually dealt with it.

However reading the justification in detail is an exercise in reading
falehoods.

If this does not come from an impersonator then if this comes
from a human being, I recommend you have a talk with them.

If this comes from a machine I recommend take it out of commission
and rework it.

If I see this kind of baloney again I expect I will just auto-nack
it instead of reading it, as reading it appears to be a waste of
time.  It is a complete waste reading fiction in what little time I have
for kernel development.

Eric


Sasha Levin <sashal@kernel.org> writes:

> **YES**
>
> This commit should be backported to stable kernel trees for the
> following reasons:
>
> ## Critical Bug Fix for Real User Issues
>
> 1. **Fixes Actual Suspend Failures**: The commit addresses real-world
>    suspend failures under memory pressure on systems with AMD discrete
>    GPUs. The linked issues (ROCm/ROCK-Kernel-Driver#174 and
>    freedesktop.org/drm/amd#2362) indicate this affects actual users.

The links in the first paragraph are very distorted.  The links
from the actual change are:

 https://github.com/ROCm/ROCK-Kernel-Driver/issues/174
 https://gitlab.freedesktop.org/drm/amd/-/issues/2362

Those completely distorted links make understanding this justification
much harder then necessary.


> 2. **Regression Fix**: This is effectively a regression fix. The PM
>    subsystem's early swap restriction prevents AMD GPU drivers from
>    properly evicting VRAM during their prepare() callbacks, which is a
>    requirement that has become more critical as GPU VRAM sizes have
>    increased.

That is a justification.   There is no evidence that a kernel change
made this worse.  Thus there is no evidence this is a regression fix.


> ## Small, Contained Change
>
> 3. **Minimal Code Changes**: The fix is remarkably simple - it just
>    moves the `pm_restrict_gfp_mask()` call from early in the suspend
>    sequence to after `dpm_prepare()` completes. The changes are:
>    - Move `pm_restrict_gfp_mask()` from multiple early locations to
>      inside `dpm_suspend_start()` after `dpm_prepare()` succeeds
>    - Add corresponding `pm_restore_gfp_mask()` calls in error paths and
>      resume paths
>    - Remove the now-redundant calls from hibernate.c and suspend.c

Completely wrong.  

> 4. **Low Risk of Regression**: The change maintains the original intent
>    of preventing I/O during the critical suspend phase while allowing it
>    during device preparation. The swap restriction still happens before
>    `dpm_suspend()`, just after `dpm_prepare()`.

This is a fundamental change to a susbsystem that the subsystem
maintainer does not say is low risk.

> ## Follows Stable Rules
>
> 5. **Meets Stable Criteria**:
>    - Fixes a real bug that bothers people (suspend failures)
Addresses a real bug.

>    - Small change (moves function calls, doesn't introduce new logic)

The change is a large change in the logic.

>    - Obviously correct (allows drivers to use swap during their
>      designated preparation phase)

It obviously changes the behavior.  It is not at all obvious
the change is behavior is desirable for all callbacks, and in all
other scenarios.


>    - Already tested by users (Tested-by tags from affected users)
Yes it has Tested-by tags.

> ## Similar to Other Backported Commits
>
> 6. **Pattern Matches**: Looking at the similar commits provided, this
>    follows the same pattern as the AMD GPU eviction commits that were
>    backported. Those commits also addressed the same fundamental issue -
>    ensuring GPU VRAM can be properly evicted during suspend/hibernation.

Which commits that were backported?

> ## Critical Timing

Timing???  There is no race condition.

> 7. **Error Path Handling**: The commit properly handles error paths by
>    adding `pm_restore_gfp_mask()` calls in:
>    - `dpm_resume_end()` for normal resume
>    - `platform_recover()` error path in suspend.c
>    - `pm_restore_gfp_mask()` in kexec_core.c for kexec flows
>
> The commit is well-tested, addresses a real problem affecting users, and
> makes a minimal, obviously correct change to fix suspend failures on
> systems with discrete GPUs under memory pressure.

What evidence is there that this commit has been tested let alone
well-tested.

The entire line of reasoning is completely suspect.

Eric

  parent reply	other threads:[~2025-07-08 19:52 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20250708000215.793090-1-sashal@kernel.org>
2025-07-08  0:02 ` [PATCH AUTOSEL 6.15 6/8] PM: Restrict swap use to later in the suspend sequence Sasha Levin
2025-07-08  6:25   ` Pavel Machek
2025-07-08  6:39   ` Pavel Machek
2025-07-08 19:13   ` Eric W. Biederman [this message]
2025-07-08 19:32   ` Eric W. Biederman
2025-07-08 20:32     ` Sasha Levin
2025-07-08 20:37       ` Pavel Machek
2025-07-08 20:46         ` Willy Tarreau
2025-07-08 20:49           ` Pavel Machek
2025-07-08 21:12           ` Sasha Levin
2025-07-08 21:26             ` Pavel Machek
2025-07-09  5:34             ` Pavel Machek
2025-07-08 20:41       ` Pavel Machek
2025-07-08 21:46       ` Eric W. Biederman
2025-07-08 22:26         ` Sasha Levin
2025-07-09  5:39           ` Pavel Machek
2025-07-09 14:35             ` Mario Limonciello
2025-07-09 16:23           ` Eric W. Biederman
2025-07-09 16:35             ` Mario Limonciello
2025-07-09 16:55               ` Rafael J. Wysocki
2025-07-09 17:37             ` Sasha Levin
2025-07-08 20:38     ` Pavel Machek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87ikk2wl5l.fsf@email.froward.int.ebiederm.org \
    --to=ebiederm@xmission.com \
    --cc=kexec@lists.infradead.org \
    --cc=len.brown@intel.com \
    --cc=linux-pm@vger.kernel.org \
    --cc=lucilanga@7pot.org \
    --cc=mario.limonciello@amd.com \
    --cc=nat@fardog.io \
    --cc=patches@lists.linux.dev \
    --cc=pavel@ucw.cz \
    --cc=rafael.j.wysocki@intel.com \
    --cc=rafael@kernel.org \
    --cc=sashal@kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox