* rtsx_pci_sdmmc aborts suspend when /sys/power/wakeup_count is enabled
@ 2025-11-26 9:08 Tabby Kitten
2025-12-09 15:08 ` Ulf Hansson
0 siblings, 1 reply; 8+ messages in thread
From: Tabby Kitten @ 2025-11-26 9:08 UTC (permalink / raw)
To: ulf.hansson; +Cc: linux-mmc, linux-kernel
On a PC with a Realtek PCI Express SD reader, when you sleep with
`wakeup_count` active (eg. sleeping from KDE's lock screen), the MMC
driver wakes up the system and aborts suspend.
I've found a sleep failure bug in the rtsx_pci and mmc_core drivers.
After userspace writes a number to `/sys/power/wakeup_count` (eg. KDE
Plasma does it to distinguish user wakes from timers and Wake-on-LAN),
if it attempts a mem suspend it will be aborted when
rtsx_pci_runtime_resume() -> mmc_detect_change() emits a
pm_wakeup_ws_event(). This breaks sleep on some hardware and desktop
environments.
The detailed description:
The recently released Plasma 6.5.0 writes to `/sys/power/wakeup_count`
before sleeping. On my computer this caused the sleep attempt to fail
with dmesg error "PM: Some devices failed to suspend, or early wake
event detected". I got this error on both Arch Linux and Fedora, and
replicated it on Fedora with the mainline kernel COPR. KDE is tracking
this error at https://bugs.kde.org/show_bug.cgi?id=510992, and has
disabled writing to wakeup_count in Plasma 6.5.3 to work around this
issue.
I've written a standalone shell script to reproduce this sleep failure
(save as badsleep.sh):
#!/bin/bash
read wakeup_count < /sys/power/wakeup_count
e=$?  # save the status now; the [[ ]] test below would overwrite $?
if [[ $e -ne 0 ]]; then
    echo "Failed to open wakeup_count, suspend may already be in progress"
    exit $e
fi
echo "$wakeup_count" > /sys/power/wakeup_count
e=$?
if [[ $e -ne 0 ]]; then
    echo "Failed to write wakeup_count, wakeup_count may have changed in between"
    exit $e
fi
echo mem > /sys/power/state
Running `sudo ./badsleep.sh` reproduces failed sleeps on my computer.
(sudo is needed to write to `/sys/power/wakeup_count` on Fedora.)
* If I run the script unaltered, the screen turns off and on, and the
terminal outputs
`./badsleep.sh: line 14: echo: write error: Device or resource busy`
indicating the mem sleep failed.
* If I edit the script and comment out `echo $wakeup_count >
/sys/power/wakeup_count`, the sleep succeeds, and waking the computer
skips the lock screen and resumes where I left off.
* If I run `sudo rmmod rtsx_pci_sdmmc` to disable the faulty module, the
sleep succeeds, and waking the computer skips the lock screen and
resumes where I left off.
I think this problem happens in general when a driver spawns a wakeup
event from its suspend callback. On my system, the driver in question
lies in the MMC subsystem.
## Code debugging
If I run `echo 1 > /sys/power/pm_debug_messages` to enable verbose
logging, then attempt a failed sleep, I see output:
PM: Wakeup pending, aborting suspend
PM: active wakeup source: mmc0
PM: suspend of devices aborted after 151.615 msecs
PM: start suspend of devices aborted after 169.797 msecs
PM: Some devices failed to suspend, or early wake event detected
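When comparing logs like the ones above across several machines, it can help to pull out only the wakeup source names. The following is a minimal sketch, not an existing tool: the function name `active_wakeup_sources` is made up for illustration, and it assumes each relevant log line contains the literal text `PM: active wakeup source: ` exactly as in the output above.

```shell
#!/bin/bash
# Hypothetical helper (assumption, not part of any existing tool): print the
# names of wakeup sources the kernel reported as active when suspend aborted.
# Reads kernel log text on stdin, e.g. `sudo dmesg | active_wakeup_sources`.
active_wakeup_sources() {
    # Keep only the text after the marker, then drop duplicate names.
    sed -n 's/^.*PM: active wakeup source: //p' | sort -u
}
```

Fed the log excerpt above, this would print only the name that follows the marker, which makes it easier to see at a glance which device is responsible.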
The "Wakeup pending, aborting suspend" message comes from function
`pm_wakeup_pending()`. This function checks if event checks are enabled,
and if some counters have changed aborts suspend and calls
`pm_print_active_wakeup_sources()`, which prints `wakeup_sources`.
Tracing the code that modifies `wakeup_sources`, I found that
`pm_wakeup_ws_event()` would activate an event and
`wakeup_source_register() → wakeup_source_add()` would add a new one.
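The interaction between `/sys/power/wakeup_count` and a wakeup event raised during suspend can be illustrated with a toy model. This is only a sketch in bash under simplifying assumptions, not kernel code: all function and variable names are hypothetical, and the real kernel also tracks events that are still in progress, which is left out here.

```shell
#!/bin/bash
# Toy model of the wakeup_count handshake (assumption: heavily simplified,
# every name below is invented for illustration).
event_count=0    # wakeup events registered so far
saved_count=0    # value accepted by the last successful write
events_check=0   # 1 once a successful write has armed the check

# Model of reading /sys/power/wakeup_count.
read_wakeup_count() { echo "$event_count"; }

# Model of writing /sys/power/wakeup_count: succeeds only if no event
# was registered since the value was read.
write_wakeup_count() {
    if [[ "$1" -eq "$event_count" ]]; then
        saved_count=$1
        events_check=1
        return 0
    fi
    return 1
}

# Stand-in for a driver reporting a wakeup event.
report_wakeup_event() { event_count=$((event_count + 1)); }

# True when the check is armed and an event arrived after the write.
wakeup_pending() {
    [[ $events_check -eq 1 && $event_count -ne $saved_count ]]
}

# Model of a suspend attempt; "$@" is an optional command that stands in
# for the device suspend callbacks.
try_suspend() {
    "$@"
    local result=suspended
    wakeup_pending && result=aborted
    events_check=0   # the check is one-shot in this model
    echo "$result"
    [[ $result == suspended ]]
}
```

In this model, calling `write_wakeup_count "$(read_wakeup_count)"` and then `try_suspend report_wakeup_event` reports an aborted suspend, while the same `try_suspend` call without the preceding write goes through. That mirrors the observation that the suspend only fails once userspace has written to `wakeup_count`.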
To find who changed wakeup events, I used my stacksnoop fork at
https://github.com/nyanpasu64/bcc/blob/local/examples/tracing/stacksnoop.py
to trace a failed suspend:
nyanpasu64@ryzen ~/code/bcc (local)> sudo ./examples/tracing/stacksnoop.py pm_wakeup_ws_event wakeup_source_register
TIME(s) FUNCTION
7.254676819:
0: ret_from_fork_asm [kernel]
1: ret_from_fork [kernel]
2: kthread [kernel]
3: worker_thread [kernel]
4: process_one_work [kernel]
5: async_run_entry_fn [kernel]
6: async_suspend [kernel]
7: device_suspend [kernel]
8: dpm_run_callback [kernel]
9: mmc_bus_suspend [mmc_core]
10: mmc_blk_suspend [mmc_block]
11: mmc_queue_suspend [mmc_block]
12: __mmc_claim_host [mmc_core]
13: __pm_runtime_resume [kernel]
14: rpm_resume [kernel]
15: rpm_resume [kernel]
16: rpm_callback [kernel]
17: __rpm_callback [kernel]
18: rtsx_pci_runtime_resume [rtsx_pci]
19: mmc_detect_change [mmc_core]
20: pm_wakeup_ws_event [kernel]
On a previous kernel, frames 9-12 of this trace were replaced by a single call to
`pci_pm_suspend`. I've posted my detailed debugging on the older kernel
at https://bugs.kde.org/show_bug.cgi?id=510992#c26. There I found that
`pci_pm_suspend()` wakes PCI(e) devices before sending them into a full
sleep state, but in the process, `_mmc_detect_change()` will "Prevent
system sleep for 5s to allow user space to consume the corresponding
uevent"... which interrupts a system sleep in progress.
On my current kernel, the same logic applies, but reading the source I
can't tell where `__mmc_claim_host()` is actually calling
`__pm_runtime_resume()`. Nonetheless the problem remains that
`rpm_resume()` is called during system suspend, `mmc_detect_change()`
wakes the system when called, and this will abort system sleep when
`/sys/power/wakeup_count` is active.
## Next steps
How would this problem be addressed? Off the top of my head, perhaps you
could not call `__pm_runtime_resume()` on a SD card reader during the
`device_suspend()` process, not call `pm_wakeup_ws_event()` when the SD
card status changes, not call `pm_wakeup_ws_event()` *specifically*
when system suspend is temporarily waking up a SD card reader, or
disable pm_wakeup_ws_event() entirely during the suspend process (does
this defeat the purpose of the function?).
Are there other drivers which cause the same symptoms? I don't know. I
asked on the KDE bug tracker for other users to attempt a failed sleep
with `echo 1 > /sys/power/pm_debug_messages` active, to identify which
driver broke suspend in their system; so far nobody has replied with
logs.
Given that this bug is related to `/sys/power/wakeup_count`
(https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-power), I
was considering CCing Rafael J. Wysocki <rafael@kernel.org> and
linux-pm@vger.kernel.org, but have decided to only message the MMC
maintainers for now. If necessary we may have to forward this message
there to get their attention.
----
System information:
* I have an Intel NUC8i7BEH mini PC, with CPU 8 × Intel® Core™ i7-8559U
CPU @ 2.70GHz.
* uname -mi prints `x86_64 unknown`.
* `lspci -nn` prints
"6e:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. RTS522A PCI Express Card Reader [10ec:522a] (rev 01)".
* I am running kernel 6.18.0-0.rc7.357.vanilla.fc43.x86_64 from the Fedora COPRs
(https://fedoraproject.org/wiki/Kernel_Vanilla_Repositories).
* dmesg at https://gist.github.com/nyanpasu64/ab5d3d1565aafe6c1c08cbcaf074e44a#file-dmesg-2025-11-25-txt
* Fully resolved config at https://gist.github.com/nyanpasu64/ab5d3d1565aafe6c1c08cbcaf074e44a#file-config-6-18-0-0-rc7-357-vanilla-fc43-x86_64,
source at https://download.copr.fedorainfracloud.org/results/@kernel-vanilla/mainline-wo-mergew/fedora-43-x86_64/09831015-mainline-womergew-releases/kernel-6.18.0-0.rc7.357.vanilla.fc43.src.rpm
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: rtsx_pci_sdmmc aborts suspend when /sys/power/wakeup_count is enabled
2025-11-26 9:08 rtsx_pci_sdmmc aborts suspend when /sys/power/wakeup_count is enabled Tabby Kitten
@ 2025-12-09 15:08 ` Ulf Hansson
2026-01-01 5:11 ` Tabby Kitten
[not found] ` <CAL57YxZagMaZF1X1bpx-nB76s=vZMWhUDiVbvB9P3CLiXG-qHQ@mail.gmail.com>
0 siblings, 2 replies; 8+ messages in thread
From: Ulf Hansson @ 2025-12-09 15:08 UTC (permalink / raw)
To: Tabby Kitten; +Cc: linux-mmc, linux-kernel

Hi,

On Wed, 26 Nov 2025 at 10:08, Tabby Kitten <nyanpasu256@gmail.com> wrote:
>
> On a PC with a Realtek PCI Express SD reader, when you sleep with
> `wakeup_count` active (eg. sleeping from KDE's lock screen), the MMC
> driver wakes up the system and aborts suspend.

Okay, that's clearly a problem that needs to be fixed!

> [...]
>
> The "Wakeup pending, aborting suspend" message comes from function
> `pm_wakeup_pending()`. This function checks if event checks are enabled,
> and if some counters have changed aborts suspend and calls
> `pm_print_active_wakeup_sources()`, which prints `wakeup_sources`.
> Tracing the code that modifies `wakeup_sources`, I found that
> `pm_wakeup_ws_event()` would activate an event and
> `wakeup_source_register() → wakeup_source_add()` would add a new one.

Thanks for all the details!

> [...]
>
> On my current kernel, the same logic applies, but reading the source I
> can't tell where `__mmc_claim_host()` is actually calling
> `__pm_runtime_resume()`. Nonetheless the problem remains that
> `rpm_resume()` is called during system suspend, `mmc_detect_change()`
> wakes the system when called, and this will abort system sleep when
> `/sys/power/wakeup_count` is active.

__mmc_claim_host() will call pm_runtime_get_sync() to runtime resume
the mmc host device.

The mmc host device's parent (a pci device) will then be runtime
resumed too. That's the call to rtsx_pci_runtime_resume() we see
above.

The problem is then that rtsx_pci_runtime_resume() invokes a callback
(->card_event()) back into the mmc host driver
(drivers/mmc/host/rtsx_pci_sdmmc.c), which ends up calling
mmc_detect_change() to try to detect whether a card has been
inserted/removed.

>
> ## Next steps
>
> How would this problem be addressed? Off the top of my head, perhaps you
> could not call `__pm_runtime_resume()` on a SD card reader during the
> `device_suspend()` process, not call `pm_wakeup_ws_event()` when the SD
> card status changes, not call `pm_wakeup_ws_event()` *specifically*
> when system suspend is temporarily waking up a SD card reader, or
> disable pm_wakeup_ws_event() entirely during the suspend process (does
> this defeat the purpose of the function?).

Let me think a bit on what makes the best sense here. I will get back
to you in a couple of days.

> [...]

Kind regards
Uffe
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: rtsx_pci_sdmmc aborts suspend when /sys/power/wakeup_count is enabled
2025-12-09 15:08 ` Ulf Hansson
@ 2026-01-01 5:11 ` Tabby Kitten
[not found] ` <CAL57YxZagMaZF1X1bpx-nB76s=vZMWhUDiVbvB9P3CLiXG-qHQ@mail.gmail.com>
1 sibling, 0 replies; 8+ messages in thread
From: Tabby Kitten @ 2026-01-01 5:11 UTC (permalink / raw)
To: Ulf Hansson; +Cc: linux-mmc, linux-kernel

Hi,

It's been a few weeks since you looked into the bug. I think the merge
window is over now, have you had the time to look into resolving this
issue?

Tabby

On Tue, Dec 9, 2025 at 7:09 AM Ulf Hansson <ulf.hansson@linaro.org> wrote:
> [...]
>
> Let me think a bit on what makes the best sense here. I will get back
> to you in a couple of days.
>
> [...]
>
> Kind regards
> Uffe
^ permalink raw reply [flat|nested] 8+ messages in thread
[parent not found: <CAL57YxZagMaZF1X1bpx-nB76s=vZMWhUDiVbvB9P3CLiXG-qHQ@mail.gmail.com>]
* Re: rtsx_pci_sdmmc aborts suspend when /sys/power/wakeup_count is enabled
[not found] ` <CAL57YxZagMaZF1X1bpx-nB76s=vZMWhUDiVbvB9P3CLiXG-qHQ@mail.gmail.com>
@ 2026-01-03 11:12 ` Ulf Hansson
2026-01-05 12:31 ` Adrian Hunter
2026-01-07 22:06 ` Tabby Kitten
0 siblings, 2 replies; 8+ messages in thread
From: Ulf Hansson @ 2026-01-03 11:12 UTC (permalink / raw)
To: Tabby Kitten; +Cc: linux-mmc, linux-kernel, Adrian Hunter

+ Adrian

On Thu, 1 Jan 2026 at 05:58, Tabby Kitten <nyanpasu256@gmail.com> wrote:
>
> Hi,
>
> It's been a few weeks since you looked into the bug. I think the merge
> window is over now, have you had the time to look into resolving this
> issue?

Yes, sorry for the delay. See below for an attached patch. Please try
it out and report back.

Kind regards
Uffe

>
> Tabby
>
> On Tue, Dec 9, 2025 at 7:09 AM Ulf Hansson <ulf.hansson@linaro.org> wrote:
>> [...]

From: Ulf Hansson <ulf.hansson@linaro.org>
Date: Sat, 3 Jan 2026 11:55:44 +0100
Subject: [PATCH] mmc: core: Avoid runtime PM of host in mmc_queue_suspend()

WIP

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---
 drivers/mmc/core/core.c     | 18 ++++++++++++------
 drivers/mmc/core/core.h     | 11 ++++++++---
 drivers/mmc/core/queue.c    |  4 ++--
 drivers/mmc/core/sdio_irq.c |  2 +-
 4 files changed, 23 insertions(+), 12 deletions(-)

diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index 860378bea557..c3923522833a 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -781,6 +781,7 @@ static inline void mmc_ctx_set_claimer(struct mmc_host *host,
  * @ctx: context that claims the host or NULL in which case the default
  * context will be used
  * @abort: whether or not the operation should be aborted
+ * @do_pm: whether to use runtime PM or not
  *
  * Claim a host for a set of operations. If @abort is non null and
  * dereference a non-zero value then this will return prematurely with
@@ -788,7 +789,7 @@ static inline void mmc_ctx_set_claimer(struct mmc_host *host,
  * with the lock held otherwise.
  */
 int __mmc_claim_host(struct mmc_host *host, struct mmc_ctx *ctx,
-		     atomic_t *abort)
+		     atomic_t *abort, bool do_pm)
 {
 	struct task_struct *task = ctx ?
NULL : current; DECLARE_WAITQUEUE(wait, current); @@ -821,7 +822,7 @@ int __mmc_claim_host(struct mmc_host *host, struct mmc_ctx *ctx, spin_unlock_irqrestore(&host->lock, flags); remove_wait_queue(&host->wq, &wait); - if (pm) + if (do_pm && pm) pm_runtime_get_sync(mmc_dev(host)); return stop; @@ -829,13 +830,14 @@ int __mmc_claim_host(struct mmc_host *host, struct mmc_ctx *ctx, EXPORT_SYMBOL(__mmc_claim_host); /** - * mmc_release_host - release a host + * __mmc_release_host - release a host * @host: mmc host to release + * @do_pm: whether to use runtime PM or not * * Release a MMC host, allowing others to claim the host * for their operations. */ -void mmc_release_host(struct mmc_host *host) +void __mmc_release_host(struct mmc_host *host, bool do_pm) { unsigned long flags; @@ -851,6 +853,10 @@ void mmc_release_host(struct mmc_host *host) host->claimer = NULL; spin_unlock_irqrestore(&host->lock, flags); wake_up(&host->wq); + + if (!do_pm) + return; + pm_runtime_mark_last_busy(mmc_dev(host)); if (host->caps & MMC_CAP_SYNC_RUNTIME_PM) pm_runtime_put_sync_suspend(mmc_dev(host)); @@ -858,7 +864,7 @@ void mmc_release_host(struct mmc_host *host) pm_runtime_put_autosuspend(mmc_dev(host)); } } -EXPORT_SYMBOL(mmc_release_host); +EXPORT_SYMBOL(__mmc_release_host); /* * This is a helper function, which fetches a runtime pm reference for the @@ -867,7 +873,7 @@ EXPORT_SYMBOL(mmc_release_host); void mmc_get_card(struct mmc_card *card, struct mmc_ctx *ctx) { pm_runtime_get_sync(&card->dev); - __mmc_claim_host(card->host, ctx, NULL); + __mmc_claim_host(card->host, ctx, NULL, true); } EXPORT_SYMBOL(mmc_get_card); diff --git a/drivers/mmc/core/core.h b/drivers/mmc/core/core.h index a028b48be164..5979c90d3b09 100644 --- a/drivers/mmc/core/core.h +++ b/drivers/mmc/core/core.h @@ -135,8 +135,8 @@ unsigned int mmc_calc_max_discard(struct mmc_card *card); int mmc_set_blocklen(struct mmc_card *card, unsigned int blocklen); int __mmc_claim_host(struct mmc_host *host, struct mmc_ctx *ctx, - 
atomic_t *abort); -void mmc_release_host(struct mmc_host *host); + atomic_t *abort, bool do_pm); +void __mmc_release_host(struct mmc_host *host, bool do_pm); void mmc_get_card(struct mmc_card *card, struct mmc_ctx *ctx); void mmc_put_card(struct mmc_card *card, struct mmc_ctx *ctx); @@ -150,7 +150,12 @@ int mmc_card_alternative_gpt_sector(struct mmc_card *card, sector_t *sector); */ static inline void mmc_claim_host(struct mmc_host *host) { - __mmc_claim_host(host, NULL, NULL); + __mmc_claim_host(host, NULL, NULL, true); +} + +static inline void mmc_release_host(struct mmc_host *host) +{ + __mmc_release_host(host, true); } int mmc_cqe_start_req(struct mmc_host *host, struct mmc_request *mrq); diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c index 284856c8f655..76e83f49ff4e 100644 --- a/drivers/mmc/core/queue.c +++ b/drivers/mmc/core/queue.c @@ -477,8 +477,8 @@ void mmc_queue_suspend(struct mmc_queue *mq) * The host remains claimed while there are outstanding requests, so * simply claiming and releasing here ensures there are none. */ - mmc_claim_host(mq->card->host); - mmc_release_host(mq->card->host); + __mmc_claim_host(mq->card->host, NULL, NULL, false); + __mmc_release_host(mq->card->host, false); } void mmc_queue_resume(struct mmc_queue *mq) diff --git a/drivers/mmc/core/sdio_irq.c b/drivers/mmc/core/sdio_irq.c index 2b24bdf38296..e5d4f8c634c8 100644 --- a/drivers/mmc/core/sdio_irq.c +++ b/drivers/mmc/core/sdio_irq.c @@ -172,7 +172,7 @@ static int sdio_irq_thread(void *_host) * that doesn't require that lock to be held. */ ret = __mmc_claim_host(host, NULL, - &host->sdio_irq_thread_abort); + &host->sdio_irq_thread_abort, true); if (ret) break; ret = process_sdio_pending_irqs(host); -- 2.43.0 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: rtsx_pci_sdmmc aborts suspend when /sys/power/wakeup_count is enabled
2026-01-03 11:12 ` Ulf Hansson
@ 2026-01-05 12:31 ` Adrian Hunter
2026-01-09 2:46 ` Tabby Kitten
2026-01-07 22:06 ` Tabby Kitten
1 sibling, 1 reply; 8+ messages in thread
From: Adrian Hunter @ 2026-01-05 12:31 UTC (permalink / raw)
To: Ulf Hansson, Tabby Kitten, ricky_wu; +Cc: linux-mmc, linux-kernel

On 03/01/2026 13:12, Ulf Hansson wrote:
> + Adrian + Ricky WU <ricky_wu@realtek.com>
>
> On Thu, 1 Jan 2026 at 05:58, Tabby Kitten <nyanpasu256@gmail.com> wrote:
>>
>> Hi,
>>
>> It's been a few weeks since you looked into the bug. I think the merge window is over now, have you had the time to look into resolving this issue?
>
> Yes, sorry for the delay.
>
> See below for an attached patch. Please try it out and report back.
>
> Kind regards
> Uffe
>
>>
>> Tabby
>>
>> On Tue, Dec 9, 2025 at 7:09 AM Ulf Hansson <ulf.hansson@linaro.org> wrote:
>>>
>>> Hi,
>>>
>>> On Wed, 26 Nov 2025 at 10:08, Tabby Kitten <nyanpasu256@gmail.com> wrote:
>>>>
>>>> On a PC with a Realtek PCI Express SD reader, when you sleep with
>>>> `wakeup_count` active (eg. sleeping from KDE's lock screen), the MMC
>>>> driver wakes up the system and aborts suspend.
>>>
>>> Okay, that's clearly a problem that needs to be fixed!
>>>
>>>>
>>>> I've found a sleep failure bug in the rtsx_pci and mmc_core drivers.
>>>> After userspace writes a number to `/sys/power/wakeup_count` (eg. KDE
>>>> Plasma does it to distinguish user wakes from timers and Wake-on-LAN),
>>>> if it attempts a mem suspend it will be aborted when
>>>> rtsx_pci_runtime_resume() -> mmc_detect_change() emits a
>>>> pm_wakeup_ws_event(). This breaks sleep on some hardware and desktop
>>>> environments.
>>>>
>>>> The detailed description:
>>>> The recently released Plasma 6.5.0 writes to `/sys/power/wakeup_count`
>>>> before sleeping. On my computer this caused the sleep attempt to fail
>>>> with dmesg error "PM: Some devices failed to suspend, or early wake
>>>> event detected". I got this error on both Arch Linux and Fedora, and
>>>> replicated it on Fedora with the mainline kernel COPR. KDE is tracking
>>>> this error at https://bugs.kde.org/show_bug.cgi?id=510992, and have
>>>> disabled writing to wakeup_count on Plasma 6.5.3 to work around this
>>>> issue.
>>>>
>>>> I've written a standalone shell script to reproduce this sleep failure
>>>> (save as badsleep.sh):
>>>>
>>>>     #!/bin/bash
>>>>     read wakeup_count < /sys/power/wakeup_count
>>>>     if [[ $? -ne 0 ]]; then
>>>>         e=$?
>>>>         echo "Failed to open wakeup_count, suspend maybe already in progress"
>>>>         exit $e
>>>>     fi
>>>>     echo $wakeup_count > /sys/power/wakeup_count
>>>>     if [[ $? -ne 0 ]]; then
>>>>         e=$?
>>>>         echo "Failed to write wakeup_count, wakeup_count may have changed in between"
>>>>         exit $e
>>>>     fi
>>>>     echo mem > /sys/power/state
>>>>
>>>> Running `sudo ./badsleep.sh` reproduces failed sleeps on my computer.
>>>> (sudo is needed to write to `/sys/power/wakeup_count` on Fedora.)
>>>>
>>>> * If I run the script unaltered, the screen turns off and on, and the
>>>> terminal outputs
>>>> `./badsleep.sh: line 14: echo: write error: Device or resource busy`
>>>> indicating the mem sleep failed.
>>>>
>>>> * If I edit the script and comment out `echo $wakeup_count >
>>>> /sys/power/wakeup_count`, the sleep succeeds, and waking the computer
>>>> skips the lock screen and resumes where I left off.
>>>>
>>>> * If I run `sudo rmmod rtsx_pci_sdmmc` to disable the faulty module, the
>>>> sleep succeeds, and waking the computer skips the lock screen and
>>>> resumes where I left off.
>>>>
>>>> I think this problem happens in general when a driver spawns a wakeup
>>>> event from its suspend callback. On my system, the driver in question
>>>> lies in the MMC subsystem.
>>>>
>>>> ## Code debugging
>>>>
>>>> If I run `echo 1 > /sys/power/pm_debug_messages` to enable verbose
>>>> logging, then attempt a failed sleep, I see output:
>>>>
>>>>     PM: Wakeup pending, aborting suspend
>>>>     PM: active wakeup source: mmc0
>>>>     PM: suspend of devices aborted after 151.615 msecs
>>>>     PM: start suspend of devices aborted after 169.797 msecs
>>>>     PM: Some devices failed to suspend, or early wake event detected
>>>>
>>>> The "Wakeup pending, aborting suspend" message comes from function
>>>> `pm_wakeup_pending()`. This function checks if event checks are enabled,
>>>> and if some counters have changed aborts suspend and calls
>>>> `pm_print_active_wakeup_sources()`, which prints `wakeup_sources`.
>>>> Tracing the code that modifies `wakeup_sources`, I found that
>>>> `pm_wakeup_ws_event()` would activate an event and
>>>> `wakeup_source_register() → wakeup_source_add()` would add a new one.
>>>
>>> Thanks for all the details!
>>>
>>>>
>>>> To find who changed wakeup events, I used my stacksnoop fork at
>>>> https://github.com/nyanpasu64/bcc/blob/local/examples/tracing/stacksnoop.py
>>>> to trace a failed suspend:
>>>>
>>>>     nyanpasu64@ryzen ~/code/bcc (local)> sudo ./examples/tracing/stacksnoop.py pm_wakeup_ws_event wakeup_source_register
>>>>     TIME(s)            FUNCTION
>>>>     7.254676819:
>>>>     0: ret_from_fork_asm [kernel]
>>>>     1: ret_from_fork [kernel]
>>>>     2: kthread [kernel]
>>>>     3: worker_thread [kernel]
>>>>     4: process_one_work [kernel]
>>>>     5: async_run_entry_fn [kernel]
>>>>     6: async_suspend [kernel]
>>>>     7: device_suspend [kernel]
>>>>     8: dpm_run_callback [kernel]
>>>>     9: mmc_bus_suspend [mmc_core]
>>>>     10: mmc_blk_suspend [mmc_block]
>>>>     11: mmc_queue_suspend [mmc_block]
>>>>     12: __mmc_claim_host [mmc_core]
>>>>     13: __pm_runtime_resume [kernel]
>>>>     14: rpm_resume [kernel]
>>>>     15: rpm_resume [kernel]
>>>>     16: rpm_callback [kernel]
>>>>     17: __rpm_callback [kernel]
>>>>     18: rtsx_pci_runtime_resume [rtsx_pci]
>>>>     19: mmc_detect_change [mmc_core]
>>>>     20: pm_wakeup_ws_event [kernel]
>>>>
>>>> On a previous kernel, lines 9-12 were replaced by a single call to
>>>> `pci_pm_suspend`. I've posted my detailed debugging on the older kernel
>>>> at https://bugs.kde.org/show_bug.cgi?id=510992#c26. There I found that
>>>> `pci_pm_suspend()` wakes PCI(e) devices before sending them into a full
>>>> sleep state, but in the process, `_mmc_detect_change()` will "Prevent
>>>> system sleep for 5s to allow user space to consume the\n corresponding
>>>> uevent"... which interrupts a system sleep in progress.
>>>>
>>>> On my current kernel, the same logic applies, but reading the source I
>>>> can't tell where `__mmc_claim_host()` is actually calling
>>>> `__pm_runtime_resume()`. Nonetheless the problem remains that
>>>> `rpm_resume()` is called during system suspend, `mmc_detect_change()`
>>>> wakes the system when called, and this will abort system sleep when
>>>> `/sys/power/wakeup_count` is active.
>>>
>>> __mmc_claim_host() will call pm_runtime_get_sync() to runtime resume
>>> the mmc host device.
>>>
>>> The mmc host device's parent (a pci device) will then be runtime
>>> resumed too. That's the call to rtsx_pci_runtime_resume() we see
>>> above.
>>>
>>> The problem is then that rtsx_pci_runtime_resume() invokes a callback
>>> (->card_event())) back into the mmc host driver
>>> (drivers/mmc/host/rtsx_pci_sdmmc.c), which ends up calling
>>> mmc_detect_change() to try to detect whether a card have been
>>> inserted/removed.
>>>
>>>>
>>>> ## Next steps
>>>>
>>>> How would this problem be addressed? Off the top of my head, perhaps you
>>>> could not call `__pm_runtime_resume()` on a SD card reader during the
>>>> `device_suspend()` process, not call `pm_wakeup_ws_event()` when the SD
>>>> card status changes, not call `pm_wakeup_ws_event()` *specifically*
>>>> when system suspend is temporarily waking up a SD card reader, or
>>>> disable pm_wakeup_ws_event() entirely during the suspend process (does
>>>> this defeat the purpose of the function?).
>>>
>>> Let me think a bit on what makes the best sense here. I will get back
>>> to you in a couple of days.
>>>
>>>>
>>>> Are there other drivers which cause the same symptoms? I don't know. I
>>>> asked on the KDE bug tracker for other users to attempt a failed sleep
>>>> with `echo 1 > /sys/power/pm_debug_messages` active, to identify which
>>>> driver broke suspend in their system; so far nobody has replied with
>>>> logs.
>>>>
>>>> Given that this bug is related to `/sys/power/wakeup_count`
>>>> (https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-power), I
>>>> was considering CCing Rafael J. Wysocki <rafael@kernel.org> and
>>>> linux-pm@vger.kernel.org, but have decided to only message the MMC
>>>> maintainers for now. If necessary we may have to forward this message
>>>> there to get their attention.
>>>>
>>>> ----
>>>>
>>>> System information:
>>>>
>>>> * I have an Intel NUC8i7BEH mini PC, with CPU 8 × Intel® Core™ i7-8559U
>>>> CPU @ 2.70GHz.
>>>>
>>>> * uname -mi prints `x86_64 unknown`.
>>>>
>>>> * `lspci -nn` prints
>>>> "6e:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. RTS522A PCI Express Card Reader [10ec:522a] (rev 01)".
>>>>
>>>> * I am running kernel 6.18.0-0.rc7.357.vanilla.fc43.x86_64 from the Fedora COPRs
>>>> (https://fedoraproject.org/wiki/Kernel_Vanilla_Repositories).
>>>>
>>>> * dmesg at https://gist.github.com/nyanpasu64/ab5d3d1565aafe6c1c08cbcaf074e44a#file-dmesg-2025-11-25-txt
>>>>
>>>> * Fully resolved config at https://gist.github.com/nyanpasu64/ab5d3d1565aafe6c1c08cbcaf074e44a#file-config-6-18-0-0-rc7-357-vanilla-fc43-x86_64,
>>>> source at https://download.copr.fedorainfracloud.org/results/@kernel-vanilla/mainline-wo-mergew/fedora-43-x86_64/09831015-mainline-womergew-releases/kernel-6.18.0-0.rc7.357.vanilla.fc43.src.rpm
>>>
>>> Kind regards
>>> Uffe
>
> From: Ulf Hansson <ulf.hansson@linaro.org>
> Date: Sat, 3 Jan 2026 11:55:44 +0100
> Subject: [PATCH] mmc: core: Avoid runtime PM of host in mmc_queue_suspend()

Seems reasonable, but isn't there also:
	bus_ops->suspend() == mmc_sd_suspend()
		_mmc_sd_suspend()
			mmc_claim_host(host)

In general, it looks difficult to avoid runtime resume on
the suspend path. PCI will usually runtime resume anyway
i.e. from pci_pm_suspend():

	/*
	 * PCI devices suspended at run time may need to be resumed at this
	 * point, because in general it may be necessary to reconfigure them for
	 * system suspend. Namely, if the device is expected to wake up the
	 * system from the sleep state, it may have to be reconfigured for this
	 * purpose, or if the device is not expected to wake up the system from
	 * the sleep state, it should be prevented from signaling wakeup events
	 * going forward.
	 *
	 * Also if the driver of the device does not indicate that its system
	 * suspend callbacks can cope with runtime-suspended devices, it is
	 * better to resume the device from runtime suspend here.
	 */
	if (!dev_pm_smart_suspend(dev) || pci_dev_need_resume(pci_dev)) {
		pm_runtime_resume(dev);

So maybe alter rtsx_pci_runtime_resume() to avoid calling
pcr->slots[RTSX_SD_CARD].card_event() == rtsx_pci_sdmmc_card_event()
when suspending. Perhaps along the lines of the hack below:

static int rtsx_pci_runtime_resume(struct device *device)
{
	struct pci_dev *pcidev = to_pci_dev(device);
	struct pcr_handle *handle = pci_get_drvdata(pcidev);
	struct rtsx_pcr *pcr = handle->pcr;

	dev_dbg(device, "--> %s\n", __func__);

	mutex_lock(&pcr->pcr_mutex);

	rtsx_pci_write_register(pcr, HOST_SLEEP_STATE, 0x03, 0x00);

	rtsx_pci_init_hw(pcr);

	if (pcr->slots[RTSX_SD_CARD].p_dev != NULL) {
+#if IS_ENABLED(CONFIG_SUSPEND)
+		if (pm_suspend_target_state == PM_SUSPEND_ON)
+#endif
		pcr->slots[RTSX_SD_CARD].card_event(
				pcr->slots[RTSX_SD_CARD].p_dev);
	}

	mutex_unlock(&pcr->pcr_mutex);
	return 0;
}

>
> WIP
>
> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> ---
>  drivers/mmc/core/core.c     | 18 ++++++++++++------
>  drivers/mmc/core/core.h     | 11 ++++++++---
>  drivers/mmc/core/queue.c    |  4 ++--
>  drivers/mmc/core/sdio_irq.c |  2 +-
>  4 files changed, 23 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
> index 860378bea557..c3923522833a 100644
> --- a/drivers/mmc/core/core.c
> +++ b/drivers/mmc/core/core.c
> @@ -781,6 +781,7 @@ static inline void mmc_ctx_set_claimer(struct
> mmc_host *host,
>   *	@ctx: context that claims the host or NULL in which case the default
>   *	context will be used
>   *	@abort: whether or not the operation should be aborted
> + *	@do_pm: whether to use runtime PM or not
>   *
>   *	Claim a host for a set of operations. If @abort is non null and
>   *	dereference a non-zero value then this will return prematurely with
> @@ -788,7 +789,7 @@ static inline void mmc_ctx_set_claimer(struct
> mmc_host *host,
>   *	with the lock held otherwise.
>   */
>  int __mmc_claim_host(struct mmc_host *host, struct mmc_ctx *ctx,
> -		     atomic_t *abort)
> +		     atomic_t *abort, bool do_pm)
>  {
>  	struct task_struct *task = ctx ? NULL : current;
>  	DECLARE_WAITQUEUE(wait, current);
> @@ -821,7 +822,7 @@ int __mmc_claim_host(struct mmc_host *host, struct
> mmc_ctx *ctx,
>  	spin_unlock_irqrestore(&host->lock, flags);
>  	remove_wait_queue(&host->wq, &wait);
>
> -	if (pm)
> +	if (do_pm && pm)
>  		pm_runtime_get_sync(mmc_dev(host));
>
>  	return stop;
> @@ -829,13 +830,14 @@ int __mmc_claim_host(struct mmc_host *host,
> struct mmc_ctx *ctx,
>  EXPORT_SYMBOL(__mmc_claim_host);
>
>  /**
> - *	mmc_release_host - release a host
> + *	__mmc_release_host - release a host
>   *	@host: mmc host to release
> + *	@do_pm: whether to use runtime PM or not
>   *
>   *	Release a MMC host, allowing others to claim the host
>   *	for their operations.
>   */
> -void mmc_release_host(struct mmc_host *host)
> +void __mmc_release_host(struct mmc_host *host, bool do_pm)
>  {
>  	unsigned long flags;
>
> @@ -851,6 +853,10 @@ void mmc_release_host(struct mmc_host *host)
>  		host->claimer = NULL;
>  		spin_unlock_irqrestore(&host->lock, flags);
>  		wake_up(&host->wq);
> +
> +		if (!do_pm)
> +			return;
> +
>  		pm_runtime_mark_last_busy(mmc_dev(host));
>  		if (host->caps & MMC_CAP_SYNC_RUNTIME_PM)
>  			pm_runtime_put_sync_suspend(mmc_dev(host));
> @@ -858,7 +864,7 @@ void mmc_release_host(struct mmc_host *host)
>  			pm_runtime_put_autosuspend(mmc_dev(host));
>  	}
>  }
> -EXPORT_SYMBOL(mmc_release_host);
> +EXPORT_SYMBOL(__mmc_release_host);
>
>  /*
>   * This is a helper function, which fetches a runtime pm reference for the
> @@ -867,7 +873,7 @@ EXPORT_SYMBOL(mmc_release_host);
>  void mmc_get_card(struct mmc_card *card, struct mmc_ctx *ctx)
>  {
>  	pm_runtime_get_sync(&card->dev);
> -	__mmc_claim_host(card->host, ctx, NULL);
> +	__mmc_claim_host(card->host, ctx, NULL, true);
>  }
>  EXPORT_SYMBOL(mmc_get_card);
>
> diff --git a/drivers/mmc/core/core.h b/drivers/mmc/core/core.h
> index a028b48be164..5979c90d3b09 100644
> --- a/drivers/mmc/core/core.h
> +++ b/drivers/mmc/core/core.h
> @@ -135,8 +135,8 @@ unsigned int mmc_calc_max_discard(struct mmc_card *card);
>  int mmc_set_blocklen(struct mmc_card *card, unsigned int blocklen);
>
>  int __mmc_claim_host(struct mmc_host *host, struct mmc_ctx *ctx,
> -		     atomic_t *abort);
> -void mmc_release_host(struct mmc_host *host);
> +		     atomic_t *abort, bool do_pm);
> +void __mmc_release_host(struct mmc_host *host, bool do_pm);
>  void mmc_get_card(struct mmc_card *card, struct mmc_ctx *ctx);
>  void mmc_put_card(struct mmc_card *card, struct mmc_ctx *ctx);
>
> @@ -150,7 +150,12 @@ int mmc_card_alternative_gpt_sector(struct
> mmc_card *card, sector_t *sector);
>   */
>  static inline void mmc_claim_host(struct mmc_host *host)
>  {
> -	__mmc_claim_host(host, NULL, NULL);
> +	__mmc_claim_host(host, NULL, NULL, true);
> +}
> +
> +static inline void mmc_release_host(struct mmc_host *host)
> +{
> +	__mmc_release_host(host, true);
>  }
>
>  int mmc_cqe_start_req(struct mmc_host *host, struct mmc_request *mrq);
> diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
> index 284856c8f655..76e83f49ff4e 100644
> --- a/drivers/mmc/core/queue.c
> +++ b/drivers/mmc/core/queue.c
> @@ -477,8 +477,8 @@ void mmc_queue_suspend(struct mmc_queue *mq)
>  	 * The host remains claimed while there are outstanding requests, so
>  	 * simply claiming and releasing here ensures there are none.
>  	 */
> -	mmc_claim_host(mq->card->host);
> -	mmc_release_host(mq->card->host);
> +	__mmc_claim_host(mq->card->host, NULL, NULL, false);
> +	__mmc_release_host(mq->card->host, false);
>  }
>
>  void mmc_queue_resume(struct mmc_queue *mq)
> diff --git a/drivers/mmc/core/sdio_irq.c b/drivers/mmc/core/sdio_irq.c
> index 2b24bdf38296..e5d4f8c634c8 100644
> --- a/drivers/mmc/core/sdio_irq.c
> +++ b/drivers/mmc/core/sdio_irq.c
> @@ -172,7 +172,7 @@ static int sdio_irq_thread(void *_host)
>  		 * that doesn't require that lock to be held.
>  		 */
>  		ret = __mmc_claim_host(host, NULL,
> -				       &host->sdio_irq_thread_abort);
> +				       &host->sdio_irq_thread_abort, true);
>  		if (ret)
>  			break;
>  		ret = process_sdio_pending_irqs(host);

^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: rtsx_pci_sdmmc aborts suspend when /sys/power/wakeup_count is enabled
2026-01-05 12:31 ` Adrian Hunter
@ 2026-01-09 2:46 ` Tabby Kitten
2026-01-12 7:39 ` Adrian Hunter
0 siblings, 1 reply; 8+ messages in thread
From: Tabby Kitten @ 2026-01-09 2:46 UTC (permalink / raw)
To: Adrian Hunter, Ulf Hansson, ricky_wu; +Cc: linux-mmc, linux-kernel

On 1/5/26 4:31 AM, Adrian Hunter wrote:
> Seems reasonable, but isn't there also:
> 	bus_ops->suspend() == mmc_sd_suspend()
> 		_mmc_sd_suspend()
> 			mmc_claim_host(host)
>
> In general, it looks difficult to avoid runtime resume on
> the suspend path. PCI will usually runtime resume anyway
> i.e. from pci_pm_suspend():
>
> 	/*
> 	 * PCI devices suspended at run time may need to be resumed at this
> 	 * point, because in general it may be necessary to reconfigure them for
> 	 * system suspend. Namely, if the device is expected to wake up the
> 	 * system from the sleep state, it may have to be reconfigured for this
> 	 * purpose, or if the device is not expected to wake up the system from
> 	 * the sleep state, it should be prevented from signaling wakeup events
> 	 * going forward.
> 	 *
> 	 * Also if the driver of the device does not indicate that its system
> 	 * suspend callbacks can cope with runtime-suspended devices, it is
> 	 * better to resume the device from runtime suspend here.
> 	 */
> 	if (!dev_pm_smart_suspend(dev) || pci_dev_need_resume(pci_dev)) {
> 		pm_runtime_resume(dev);
>
> So maybe alter rtsx_pci_runtime_resume() to avoid calling
> pcr->slots[RTSX_SD_CARD].card_event() == rtsx_pci_sdmmc_card_event()
> when suspending. Perhaps along the lines of the hack below:
>
> static int rtsx_pci_runtime_resume(struct device *device)
> {
> 	struct pci_dev *pcidev = to_pci_dev(device);
> 	struct pcr_handle *handle = pci_get_drvdata(pcidev);
> 	struct rtsx_pcr *pcr = handle->pcr;
>
> 	dev_dbg(device, "--> %s\n", __func__);
>
> 	mutex_lock(&pcr->pcr_mutex);
>
> 	rtsx_pci_write_register(pcr, HOST_SLEEP_STATE, 0x03, 0x00);
>
> 	rtsx_pci_init_hw(pcr);
>
> 	if (pcr->slots[RTSX_SD_CARD].p_dev != NULL) {
> +#if IS_ENABLED(CONFIG_SUSPEND)
> +		if (pm_suspend_target_state == PM_SUSPEND_ON)
> +#endif
> 		pcr->slots[RTSX_SD_CARD].card_event(
> 				pcr->slots[RTSX_SD_CARD].p_dev);
> 	}
>
> 	mutex_unlock(&pcr->pcr_mutex);
> 	return 0;
> }
>
>> WIP
>>
>> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
>> ---
>>  drivers/mmc/core/core.c     | 18 ++++++++++++------
>>  drivers/mmc/core/core.h     | 11 ++++++++---
>>  drivers/mmc/core/queue.c    |  4 ++--
>>  drivers/mmc/core/sdio_irq.c |  2 +-
>>  4 files changed, 23 insertions(+), 12 deletions(-)
>>
>> ...

Me earlier:

> I'm attempting to manually replicate the changes on Fedora 43's
> kernel-6.18.3 checkout
> (https://docs.fedoraproject.org/en-US/quick-docs/kernel-build-custom/),
> though I'm much less experienced building kernels here than on Arch
> Linux (the Arch SSD is currently in another computer). I will be
> replying back with results once I can build and test these patches.

I've built a test kernel based on Fedora's 6.18.3 along with these two
patches. Now `sudo badsleep.sh` successfully completes on the machine
with the Realtek card reader.

 * Adrian's code would not compile until I edited
   drivers/misc/cardreader/rtsx_pcr.c and added #include <linux/suspend.h>.
 * It looks a bit janky that the inner line of code is tied to a
   different natural indentation level based on a compile-time flag.
   With suspend enabled, the function call is on the same indentation
   level as the if statement!
     o One possibility is to indent the inner code one more level
       (which is an extraneous indentation if the #if is inactive)
     o Another is to move the added condition into the surrounding `if
       (pcr->slots[RTSX_SD_CARD].p_dev != NULL)`, but this prevents us
       from adding code that /doesn't/ check pm_suspend_target_state.

I ran into a possible bug:

 * On my first boot attempt, I tried running badsleep.sh, waking the
   system, and inserted a microSD card. The card was not recognized in
   Dolphin or listed in lsblk. rtsx_pci_sdmmc was present in lsmod, and
   I saw no references to rtsx or mmc in the journal.
 * I could not reproduce this error on subsequent boots. I rebooted the
   computer, then tried badsleep.sh (with or without regular KDE sleep
   beforehand), then inserted the microSD card. At this point it was
   recognized properly. I also tried inserting the card /while/ the
   system was asleep, which worked too. I'm not sure why it failed the
   first time... dirty contacts? random bug?

^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: rtsx_pci_sdmmc aborts suspend when /sys/power/wakeup_count is enabled
2026-01-09 2:46 ` Tabby Kitten
@ 2026-01-12 7:39 ` Adrian Hunter
0 siblings, 0 replies; 8+ messages in thread
From: Adrian Hunter @ 2026-01-12 7:39 UTC (permalink / raw)
To: Tabby Kitten, Ulf Hansson, ricky_wu; +Cc: linux-mmc, linux-kernel

On 09/01/2026 04:46, Tabby Kitten wrote:
> On 1/5/26 4:31 AM, Adrian Hunter wrote:
>> Seems reasonable, but isn't there also:
>>	bus_ops->suspend() == mmc_sd_suspend()
>>		_mmc_sd_suspend()
>>			mmc_claim_host(host)
>>
>> In general, it looks difficult to avoid runtime resume on
>> the suspend path. PCI will usually runtime resume anyway
>> i.e. from pci_pm_suspend():
>>
>>	/*
>>	 * PCI devices suspended at run time may need to be resumed at this
>>	 * point, because in general it may be necessary to reconfigure them for
>>	 * system suspend. Namely, if the device is expected to wake up the
>>	 * system from the sleep state, it may have to be reconfigured for this
>>	 * purpose, or if the device is not expected to wake up the system from
>>	 * the sleep state, it should be prevented from signaling wakeup events
>>	 * going forward.
>>	 *
>>	 * Also if the driver of the device does not indicate that its system
>>	 * suspend callbacks can cope with runtime-suspended devices, it is
>>	 * better to resume the device from runtime suspend here.
>>	 */
>>	if (!dev_pm_smart_suspend(dev) || pci_dev_need_resume(pci_dev)) {
>>		pm_runtime_resume(dev);
>>
>> So maybe alter rtsx_pci_runtime_resume() to avoid calling
>> pcr->slots[RTSX_SD_CARD].card_event() == rtsx_pci_sdmmc_card_event()
>> when suspending. Perhaps along the lines of the hack below:
>>
>> static int rtsx_pci_runtime_resume(struct device *device)
>> {
>> 	struct pci_dev *pcidev = to_pci_dev(device);
>> 	struct pcr_handle *handle = pci_get_drvdata(pcidev);
>> 	struct rtsx_pcr *pcr = handle->pcr;
>>
>> 	dev_dbg(device, "--> %s\n", __func__);
>>
>> 	mutex_lock(&pcr->pcr_mutex);
>>
>> 	rtsx_pci_write_register(pcr, HOST_SLEEP_STATE, 0x03, 0x00);
>>
>> 	rtsx_pci_init_hw(pcr);
>>
>> 	if (pcr->slots[RTSX_SD_CARD].p_dev != NULL) {
>> +#if IS_ENABLED(CONFIG_SUSPEND)
>> +		if (pm_suspend_target_state == PM_SUSPEND_ON)
>> +#endif
>> 		pcr->slots[RTSX_SD_CARD].card_event(
>> 				pcr->slots[RTSX_SD_CARD].p_dev);
>> 	}
>>
>> 	mutex_unlock(&pcr->pcr_mutex);
>> 	return 0;
>> }
>>
>>> WIP
>>>
>>> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
>>> ---
>>>  drivers/mmc/core/core.c     | 18 ++++++++++++------
>>>  drivers/mmc/core/core.h     | 11 ++++++++---
>>>  drivers/mmc/core/queue.c    |  4 ++--
>>>  drivers/mmc/core/sdio_irq.c |  2 +-
>>>  4 files changed, 23 insertions(+), 12 deletions(-)
>>>
>>> ...
>
> Me earlier:
>
>> I'm attempting to manually replicate the changes on Fedora 43's
>> kernel-6.18.3 checkout
>> (https://docs.fedoraproject.org/en-US/quick-docs/kernel-build-custom/),
>> though I'm much less experienced building kernels here than on Arch
>> Linux (the Arch SSD is currently in another computer). I will be
>> replying back with results once I can build and test these patches.
>
> I've built a test kernel based on Fedora's 6.18.3 along with these two
> patches. Now `sudo badsleep.sh` successfully completes on the machine
> with the Realtek card reader.
>
> * Adrian's code would not compile until I edited
>   drivers/misc/cardreader/rtsx_pcr.c and added #include <linux/suspend.h>.
> * It looks a bit janky that the inner line of code is tied to a
>   different natural indentation level based on a compile-time flag.
>   With suspend enabled, the function call is on the same indentation
>   level as the if statement!
>     o One possibility is to indent the inner code one more level
>       (which is an extraneous indentation if the #if is inactive)
>     o Another is to move the added condition into the surrounding `if
>       (pcr->slots[RTSX_SD_CARD].p_dev != NULL)`, but this prevents us
>       from adding code that /doesn't/ check pm_suspend_target_state.

Currently (since v6.5) pm_suspend_target_state gets defined even if
!CONFIG_SUSPEND so, allowing up to 100 cols, it can be

-	if (pcr->slots[RTSX_SD_CARD].p_dev != NULL) {
-		pcr->slots[RTSX_SD_CARD].card_event(
-				pcr->slots[RTSX_SD_CARD].p_dev);
-	}
+	if (pcr->slots[RTSX_SD_CARD].p_dev != NULL && pm_suspend_target_state == PM_SUSPEND_ON)
+		pcr->slots[RTSX_SD_CARD].card_event(pcr->slots[RTSX_SD_CARD].p_dev);

Could use a comment as well, noting that card_event() can call
mmc_detect_change() which prevents system sleep, so it must be
avoided if the system is suspending.

Note, this change should be sufficient to fix the issue by itself.

>
> I ran into a possible bug:
>
> * On my first boot attempt, I tried running badsleep.sh, waking the
>   system, and inserted a microSD card. The card was not recognized in
>   Dolphin or listed in lsblk. rtsx_pci_sdmmc was present in lsmod, and
>   I saw no references to rtsx or mmc in the journal.
> * I could not reproduce this error on subsequent boots. I rebooted the
>   computer, then tried badsleep.sh (with or without regular KDE sleep
>   beforehand), then inserted the microSD card. At this point it was
>   recognized properly. I also tried inserting the card /while/ the
>   system was asleep, which worked too. I'm not sure why it failed the
>   first time... dirty contacts? random bug?

Well, you need not only to reproduce it, but show that it doesn't
also happen with the old code, otherwise there is not enough
information to work with.
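One way to gather the old-versus-new comparison asked for above is to log, per kernel, whether the card shows up after each suspend/resume trial. The following is only a sketch: the helper names `has_mmc_disk` and `record_trial` are made up for illustration, and it assumes the card appears as an `mmcblkN` block device.

```shell
#!/bin/sh
# Succeeds if any device name read from stdin looks like an MMC/SD block
# device (mmcblk0, mmcblk1, ...). Assumption: the card reader exposes the
# card under that naming scheme.
has_mmc_disk() {
    grep -q '^mmcblk[0-9][0-9]*$'
}

# Print one result line for a trial.
#   $1: kernel release string
#   $2: newline-separated list of block device names
# Both are passed in as arguments so the helper can be exercised without
# real hardware.
record_trial() {
    if printf '%s\n' "$2" | has_mmc_disk; then
        echo "$1 detected"
    else
        echo "$1 missing"
    fi
}

# On a real machine, after suspend/resume and inserting the card:
#   record_trial "$(uname -r)" "$(lsblk -dn -o NAME)" >> trials.log
```

Running the same number of trials on the unpatched and the patched kernel, and comparing the counts of "missing" lines, would show whether the detection failure is specific to the new code.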
* Re: rtsx_pci_sdmmc aborts suspend when /sys/power/wakeup_count is enabled
2026-01-03 11:12 ` Ulf Hansson
2026-01-05 12:31 ` Adrian Hunter
@ 2026-01-07 22:06 ` Tabby Kitten
1 sibling, 0 replies; 8+ messages in thread
From: Tabby Kitten @ 2026-01-07 22:06 UTC (permalink / raw)
To: Ulf Hansson, Adrian Hunter; +Cc: linux-mmc, linux-kernel

Unfortunately this patch appears corrupted, requiring me to manually
recreate it. It seems the process of including it as text in the email
body has corrupted the text, including hard-wrapping the @@ lines into
multiple lines and converting tabs into spaces, so the patch no longer
maps to the Linux source code (eg. drivers/mmc/core/core.c). These
errors were present in Thunderbird, lore.kernel.org, and marc.info
(where downloading the raw emails additionally equals-encodes
(Quoted-Printable?) special characters, breaking the patch further).
Additionally, Adrian's email seems to be an informal patch rather than a
machine-readable specification.

I'm attempting to manually replicate the changes on Fedora 43's
kernel-6.18.3 checkout
(https://docs.fedoraproject.org/en-US/quick-docs/kernel-build-custom/),
though I'm much less experienced building kernels here than on Arch
Linux (the Arch SSD is currently in another computer). I will be
replying back with results once I can build and test these patches.

On 1/3/26 3:12 AM, Ulf Hansson wrote:
> + Adrian
>
> On Thu, 1 Jan 2026 at 05:58, Tabby Kitten <nyanpasu256@gmail.com> wrote:
>> Hi,
>>
>> It's been a few weeks since you looked into the bug. I think the merge window is over now, have you had the time to look into resolving this issue?
> Yes, sorry for the delay.
>
> See below for an attached patch. Please try it out and report back.
>
> Kind regards
> Uffe
>
>> Tabby
>>
>> On Tue, Dec 9, 2025 at 7:09 AM Ulf Hansson <ulf.hansson@linaro.org> wrote:
>>> Hi,
>>>
>>> On Wed, 26 Nov 2025 at 10:08, Tabby Kitten <nyanpasu256@gmail.com> wrote:
>>>> On a PC with a Realtek PCI Express SD reader, when you sleep with
>>>> `wakeup_count` active (eg. sleeping from KDE's lock screen), the MMC
>>>> driver wakes up the system and aborts suspend.
>>> Okay, that's clearly a problem that needs to be fixed!
>>>
>>>> I've found a sleep failure bug in the rtsx_pci and mmc_core drivers.
>>>> After userspace writes a number to `/sys/power/wakeup_count` (eg. KDE
>>>> Plasma does it to distinguish user wakes from timers and Wake-on-LAN),
>>>> if it attempts a mem suspend it will be aborted when
>>>> rtsx_pci_runtime_resume() -> mmc_detect_change() emits a
>>>> pm_wakeup_ws_event(). This breaks sleep on some hardware and desktop
>>>> environments.
>>>>
>>>> The detailed description:
>>>> The recently released Plasma 6.5.0 writes to `/sys/power/wakeup_count`
>>>> before sleeping. On my computer this caused the sleep attempt to fail
>>>> with dmesg error "PM: Some devices failed to suspend, or early wake
>>>> event detected". I got this error on both Arch Linux and Fedora, and
>>>> replicated it on Fedora with the mainline kernel COPR. KDE is tracking
>>>> this error at https://bugs.kde.org/show_bug.cgi?id=510992, and have
>>>> disabled writing to wakeup_count on Plasma 6.5.3 to work around this
>>>> issue.
>>>>
>>>> I've written a standalone shell script to reproduce this sleep failure
>>>> (save as badsleep.sh):
>>>>
>>>> #!/bin/bash
>>>> read wakeup_count < /sys/power/wakeup_count
>>>> if [[ $? -ne 0 ]]; then
>>>>     e=$?
>>>>     echo "Failed to open wakeup_count, suspend maybe already in progress"
>>>>     exit $e
>>>> fi
>>>> echo $wakeup_count > /sys/power/wakeup_count
>>>> if [[ $? -ne 0 ]]; then
>>>>     e=$?
>>>>     echo "Failed to write wakeup_count, wakeup_count may have changed in between"
>>>>     exit $e
>>>> fi
>>>> echo mem > /sys/power/state
>>>>
>>>> Running `sudo ./badsleep.sh` reproduces failed sleeps on my computer.
>>>> (sudo is needed to write to `/sys/power/wakeup_count` on Fedora.)
>>>>
>>>> * If I run the script unaltered, the screen turns off and on, and the
>>>> terminal outputs
>>>> `./badsleep.sh: line 14: echo: write error: Device or resource busy`
>>>> indicating the mem sleep failed.
>>>>
>>>> * If I edit the script and comment out `echo $wakeup_count >
>>>> /sys/power/wakeup_count`, the sleep succeeds, and waking the computer
>>>> skips the lock screen and resumes where I left off.
>>>>
>>>> * If I run `sudo rmmod rtsx_pci_sdmmc` to disable the faulty module, the
>>>> sleep succeeds, and waking the computer skips the lock screen and
>>>> resumes where I left off.
>>>>
>>>> I think this problem happens in general when a driver spawns a wakeup
>>>> event from its suspend callback. On my system, the driver in question
>>>> lies in the MMC subsystem.
>>>>
>>>> ## Code debugging
>>>>
>>>> If I run `echo 1 > /sys/power/pm_debug_messages` to enable verbose
>>>> logging, then attempt a failed sleep, I see output:
>>>>
>>>> PM: Wakeup pending, aborting suspend
>>>> PM: active wakeup source: mmc0
>>>> PM: suspend of devices aborted after 151.615 msecs
>>>> PM: start suspend of devices aborted after 169.797 msecs
>>>> PM: Some devices failed to suspend, or early wake event detected
>>>>
>>>> The "Wakeup pending, aborting suspend" message comes from function
>>>> `pm_wakeup_pending()`. This function checks if event checks are enabled,
>>>> and if some counters have changed aborts suspend and calls
>>>> `pm_print_active_wakeup_sources()`, which prints `wakeup_sources`.
>>>> Tracing the code that modifies `wakeup_sources`, I found that
>>>> `pm_wakeup_ws_event()` would activate an event and
>>>> `wakeup_source_register() → wakeup_source_add()` would add a new one.
>>> Thanks for all the details!
>>>
>>>> To find who changed wakeup events, I used my stacksnoop fork at
>>>> https://github.com/nyanpasu64/bcc/blob/local/examples/tracing/stacksnoop.py
>>>> to trace a failed suspend:
>>>>
>>>> nyanpasu64@ryzen ~/code/bcc (local)> sudo ./examples/tracing/stacksnoop.py pm_wakeup_ws_event wakeup_source_register
>>>> TIME(s) FUNCTION
>>>> 7.254676819:
>>>> 0: ret_from_fork_asm [kernel]
>>>> 1: ret_from_fork [kernel]
>>>> 2: kthread [kernel]
>>>> 3: worker_thread [kernel]
>>>> 4: process_one_work [kernel]
>>>> 5: async_run_entry_fn [kernel]
>>>> 6: async_suspend [kernel]
>>>> 7: device_suspend [kernel]
>>>> 8: dpm_run_callback [kernel]
>>>> 9: mmc_bus_suspend [mmc_core]
>>>> 10: mmc_blk_suspend [mmc_block]
>>>> 11: mmc_queue_suspend [mmc_block]
>>>> 12: __mmc_claim_host [mmc_core]
>>>> 13: __pm_runtime_resume [kernel]
>>>> 14: rpm_resume [kernel]
>>>> 15: rpm_resume [kernel]
>>>> 16: rpm_callback [kernel]
>>>> 17: __rpm_callback [kernel]
>>>> 18: rtsx_pci_runtime_resume [rtsx_pci]
>>>> 19: mmc_detect_change [mmc_core]
>>>> 20: pm_wakeup_ws_event [kernel]
>>>>
>>>> On a previous kernel, lines 9-12 were replaced by a single call to
>>>> `pci_pm_suspend`. I've posted my detailed debugging on the older kernel
>>>> at https://bugs.kde.org/show_bug.cgi?id=510992#c26. There I found that
>>>> `pci_pm_suspend()` wakes PCI(e) devices before sending them into a full
>>>> sleep state, but in the process, `_mmc_detect_change()` will "Prevent
>>>> system sleep for 5s to allow user space to consume the\n corresponding
>>>> uevent"... which interrupts a system sleep in progress.
>>>>
>>>> On my current kernel, the same logic applies, but reading the source I
>>>> can't tell where `__mmc_claim_host()` is actually calling
>>>> `__pm_runtime_resume()`. Nonetheless the problem remains that
>>>> `rpm_resume()` is called during system suspend, `mmc_detect_change()`
>>>> wakes the system when called, and this will abort system sleep when
>>>> `/sys/power/wakeup_count` is active.
>>> __mmc_claim_host() will call pm_runtime_get_sync() to runtime resume
>>> the mmc host device.
>>>
>>> The mmc host device's parent (a pci device) will then be runtime
>>> resumed too. That's the call to rtsx_pci_runtime_resume() we see
>>> above.
>>>
>>> The problem is then that rtsx_pci_runtime_resume() invokes a callback
>>> (->card_event()) back into the mmc host driver
>>> (drivers/mmc/host/rtsx_pci_sdmmc.c), which ends up calling
>>> mmc_detect_change() to try to detect whether a card has been
>>> inserted/removed.
>>>
>>>> ## Next steps
>>>>
>>>> How would this problem be addressed? Off the top of my head, perhaps you
>>>> could not call `__pm_runtime_resume()` on a SD card reader during the
>>>> `device_suspend()` process, not call `pm_wakeup_ws_event()` when the SD
>>>> card status changes, not call `pm_wakeup_ws_event()` *specifically*
>>>> when system suspend is temporarily waking up a SD card reader, or
>>>> disable pm_wakeup_ws_event() entirely during the suspend process (does
>>>> this defeat the purpose of the function?).
>>> Let me think a bit on what makes the best sense here. I will get back
>>> to you in a couple of days.
>>>
>>>> Are there other drivers which cause the same symptoms? I don't know. I
>>>> asked on the KDE bug tracker for other users to attempt a failed sleep
>>>> with `echo 1 > /sys/power/pm_debug_messages` active, to identify which
>>>> driver broke suspend in their system; so far nobody has replied with
>>>> logs.
>>>>
>>>> Given that this bug is related to `/sys/power/wakeup_count`
>>>> (https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-power), I
>>>> was considering CCing Rafael J. Wysocki <rafael@kernel.org> and
>>>> linux-pm@vger.kernel.org, but have decided to only message the MMC
>>>> maintainers for now. If necessary we may have to forward this message
>>>> there to get their attention.
>>>>
>>>> ----
>>>>
>>>> System information:
>>>>
>>>> * I have an Intel NUC8i7BEH mini PC, with CPU 8 × Intel® Core™ i7-8559U
>>>> CPU @ 2.70GHz.
>>>>
>>>> * uname -mi prints `x86_64 unknown`.
>>>>
>>>> * `lspci -nn` prints
>>>> "6e:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. RTS522A PCI Express Card Reader [10ec:522a] (rev 01)".
>>>>
>>>> * I am running kernel 6.18.0-0.rc7.357.vanilla.fc43.x86_64 from the Fedora COPRs
>>>> (https://fedoraproject.org/wiki/Kernel_Vanilla_Repositories).
>>>>
>>>> * dmesg at https://gist.github.com/nyanpasu64/ab5d3d1565aafe6c1c08cbcaf074e44a#file-dmesg-2025-11-25-txt
>>>>
>>>> * Fully resolved config at https://gist.github.com/nyanpasu64/ab5d3d1565aafe6c1c08cbcaf074e44a#file-config-6-18-0-0-rc7-357-vanilla-fc43-x86_64,
>>>> source at https://download.copr.fedorainfracloud.org/results/@kernel-vanilla/mainline-wo-mergew/fedora-43-x86_64/09831015-mainline-womergew-releases/kernel-6.18.0-0.rc7.357.vanilla.fc43.src.rpm
>>> Kind regards
>>> Uffe
> From: Ulf Hansson <ulf.hansson@linaro.org>
> Date: Sat, 3 Jan 2026 11:55:44 +0100
> Subject: [PATCH] mmc: core: Avoid runtime PM of host in mmc_queue_suspend()
>
> WIP
>
> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> ---
>  drivers/mmc/core/core.c     | 18 ++++++++++++------
>  drivers/mmc/core/core.h     | 11 ++++++++---
>  drivers/mmc/core/queue.c    |  4 ++--
>  drivers/mmc/core/sdio_irq.c |  2 +-
>  4 files changed, 23 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
> index 860378bea557..c3923522833a 100644
> --- a/drivers/mmc/core/core.c
> +++ b/drivers/mmc/core/core.c
> @@ -781,6 +781,7 @@ static inline void mmc_ctx_set_claimer(struct
> mmc_host *host,
>   * @ctx: context that claims the host or NULL in which case the default
>   * context will be used
>   * @abort: whether or not the operation should be aborted
> + * @do_pm: whether to use runtime PM or not
>   *
>   * Claim a host for a set of operations. If @abort is non null and
>   * dereference a non-zero value then this will return prematurely with
> @@ -788,7 +789,7 @@ static inline void mmc_ctx_set_claimer(struct
> mmc_host *host,
>   * with the lock held otherwise.
>   */
>  int __mmc_claim_host(struct mmc_host *host, struct mmc_ctx *ctx,
> -                     atomic_t *abort)
> +                     atomic_t *abort, bool do_pm)
>  {
>          struct task_struct *task = ctx ? NULL : current;
>          DECLARE_WAITQUEUE(wait, current);
> @@ -821,7 +822,7 @@ int __mmc_claim_host(struct mmc_host *host, struct
> mmc_ctx *ctx,
>          spin_unlock_irqrestore(&host->lock, flags);
>          remove_wait_queue(&host->wq, &wait);
>
> -        if (pm)
> +        if (do_pm && pm)
>                  pm_runtime_get_sync(mmc_dev(host));
>
>          return stop;
> @@ -829,13 +830,14 @@ int __mmc_claim_host(struct mmc_host *host,
> struct mmc_ctx *ctx,
>  EXPORT_SYMBOL(__mmc_claim_host);
>
>  /**
> - * mmc_release_host - release a host
> + * __mmc_release_host - release a host
>   * @host: mmc host to release
> + * @do_pm: whether to use runtime PM or not
>   *
>   * Release a MMC host, allowing others to claim the host
>   * for their operations.
>   */
> -void mmc_release_host(struct mmc_host *host)
> +void __mmc_release_host(struct mmc_host *host, bool do_pm)
>  {
>          unsigned long flags;
>
> @@ -851,6 +853,10 @@ void mmc_release_host(struct mmc_host *host)
>                  host->claimer = NULL;
>                  spin_unlock_irqrestore(&host->lock, flags);
>                  wake_up(&host->wq);
> +
> +                if (!do_pm)
> +                        return;
> +
>                  pm_runtime_mark_last_busy(mmc_dev(host));
>                  if (host->caps & MMC_CAP_SYNC_RUNTIME_PM)
>                          pm_runtime_put_sync_suspend(mmc_dev(host));
> @@ -858,7 +864,7 @@ void mmc_release_host(struct mmc_host *host)
>                          pm_runtime_put_autosuspend(mmc_dev(host));
>          }
>  }
> -EXPORT_SYMBOL(mmc_release_host);
> +EXPORT_SYMBOL(__mmc_release_host);
>
>  /*
>   * This is a helper function, which fetches a runtime pm reference for the
> @@ -867,7 +873,7 @@ EXPORT_SYMBOL(mmc_release_host);
>  void mmc_get_card(struct mmc_card *card, struct mmc_ctx *ctx)
>  {
>          pm_runtime_get_sync(&card->dev);
> -        __mmc_claim_host(card->host, ctx, NULL);
> +        __mmc_claim_host(card->host, ctx, NULL, true);
>  }
>  EXPORT_SYMBOL(mmc_get_card);
>
> diff --git a/drivers/mmc/core/core.h b/drivers/mmc/core/core.h
> index a028b48be164..5979c90d3b09 100644
> --- a/drivers/mmc/core/core.h
> +++ b/drivers/mmc/core/core.h
> @@ -135,8 +135,8 @@ unsigned int mmc_calc_max_discard(struct mmc_card *card);
>  int mmc_set_blocklen(struct mmc_card *card, unsigned int blocklen);
>
>  int __mmc_claim_host(struct mmc_host *host, struct mmc_ctx *ctx,
> -                     atomic_t *abort);
> -void mmc_release_host(struct mmc_host *host);
> +                     atomic_t *abort, bool do_pm);
> +void __mmc_release_host(struct mmc_host *host, bool do_pm);
>  void mmc_get_card(struct mmc_card *card, struct mmc_ctx *ctx);
>  void mmc_put_card(struct mmc_card *card, struct mmc_ctx *ctx);
>
> @@ -150,7 +150,12 @@ int mmc_card_alternative_gpt_sector(struct
> mmc_card *card, sector_t *sector);
>   */
>  static inline void mmc_claim_host(struct mmc_host *host)
>  {
> -        __mmc_claim_host(host, NULL, NULL);
> +        __mmc_claim_host(host, NULL, NULL, true);
> +}
> +
> +static inline void mmc_release_host(struct mmc_host *host)
> +{
> +        __mmc_release_host(host, true);
>  }
>
>  int mmc_cqe_start_req(struct mmc_host *host, struct mmc_request *mrq);
> diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
> index 284856c8f655..76e83f49ff4e 100644
> --- a/drivers/mmc/core/queue.c
> +++ b/drivers/mmc/core/queue.c
> @@ -477,8 +477,8 @@ void mmc_queue_suspend(struct mmc_queue *mq)
>           * The host remains claimed while there are outstanding requests, so
>           * simply claiming and releasing here ensures there are none.
>           */
> -        mmc_claim_host(mq->card->host);
> -        mmc_release_host(mq->card->host);
> +        __mmc_claim_host(mq->card->host, NULL, NULL, false);
> +        __mmc_release_host(mq->card->host, false);
>  }
>
>  void mmc_queue_resume(struct mmc_queue *mq)
> diff --git a/drivers/mmc/core/sdio_irq.c b/drivers/mmc/core/sdio_irq.c
> index 2b24bdf38296..e5d4f8c634c8 100644
> --- a/drivers/mmc/core/sdio_irq.c
> +++ b/drivers/mmc/core/sdio_irq.c
> @@ -172,7 +172,7 @@ static int sdio_irq_thread(void *_host)
>                   * that doesn't require that lock to be held.
>                   */
>                  ret = __mmc_claim_host(host, NULL,
> -                                       &host->sdio_irq_thread_abort);
> +                                       &host->sdio_irq_thread_abort, true);
>                  if (ret)
>                          break;
>                  ret = process_sdio_pending_irqs(host);
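The `pm_debug_messages` output quoted in the report above names the wakeup source that aborted suspend. As a rough sketch, that name can be pulled out of a saved kernel log with a small filter. The function name `active_wakeup_sources` is hypothetical, and the sketch assumes the log lines keep the `PM: active wakeup source: NAME` form shown in the report.

```shell
#!/bin/sh
# Print the unique names of wakeup sources reported as active.
# Reads kernel log text on stdin; assumes lines of the form
# "PM: active wakeup source: NAME" (optionally preceded by a timestamp),
# as in the log excerpt quoted in the report.
active_wakeup_sources() {
    sed -n 's/.*PM: active wakeup source: \([^[:space:]]*\).*/\1/p' | sort -u
}

# On a live system, after a failed suspend with pm_debug_messages enabled
# (reading the kernel log may require root):
#   dmesg | active_wakeup_sources
```

Other users hitting the same abort could run this after a failed sleep to report which driver is responsible on their hardware.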
end of thread, other threads:[~2026-01-12 7:39 UTC | newest]
Thread overview: 8+ messages
2025-11-26 9:08 rtsx_pci_sdmmc aborts suspend when /sys/power/wakeup_count is enabled Tabby Kitten
2025-12-09 15:08 ` Ulf Hansson
2026-01-01 5:11 ` Tabby Kitten
[not found] ` <CAL57YxZagMaZF1X1bpx-nB76s=vZMWhUDiVbvB9P3CLiXG-qHQ@mail.gmail.com>
2026-01-03 11:12 ` Ulf Hansson
2026-01-05 12:31 ` Adrian Hunter
2026-01-09 2:46 ` Tabby Kitten
2026-01-12 7:39 ` Adrian Hunter
2026-01-07 22:06 ` Tabby Kitten