* NVMe suspend failure
@ 2024-02-04 1:01 Bart Van Assche
From: Bart Van Assche @ 2024-02-04 1:01 UTC (permalink / raw)
To: linux-nvme@lists.infradead.org
Hi,
Even after adding nvme_core.default_ps_max_latency_us=0 and
pcie_aspm=off to the kernel command line, the following still
appears in the "dmesg -w" output upon suspend of an x86_64 workstation:
[ 2451.640676] nvme nvme0: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS=0x10
[ 2451.640690] nvme nvme0: Does your device have a faulty power saving mode enabled?
[ 2451.640694] nvme nvme0: Try "nvme_core.default_ps_max_latency_us=0 pcie_aspm=off" and report a bug
Hence this email. From the nvme id-ctrl output:
mn : Samsung SSD 970 EVO Plus 500GB
fr : 2B2QEXM7
Please let me know if you need more information.
Thanks,
Bart.
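(As an aside, one way to confirm that such parameters actually took effect on the running kernel is to check /proc/cmdline; the sketch below uses a sample cmdline string so it is self-contained — on a live system, substitute cmdline=$(cat /proc/cmdline).)

```shell
# Sketch: check whether the workaround parameters are on the kernel command line.
# Sample string; on a live system use: cmdline=$(cat /proc/cmdline)
cmdline='BOOT_IMAGE=/vmlinuz-6.7 root=/dev/nvme0n1p2 nvme_core.default_ps_max_latency_us=0 pcie_aspm=off'

for p in 'nvme_core.default_ps_max_latency_us=0' 'pcie_aspm=off'; do
  case " $cmdline " in
    *" $p "*) echo "present: $p" ;;
    *)        echo "missing: $p" ;;
  esac
done
# The effective value can also be read from sysfs:
#   cat /sys/module/nvme_core/parameters/default_ps_max_latency_us
```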
* Re: NVMe suspend failure
From: Keith Busch @ 2024-02-04 1:18 UTC (permalink / raw)
To: Bart Van Assche; +Cc: linux-nvme@lists.infradead.org
On Sat, Feb 03, 2024 at 05:01:41PM -0800, Bart Van Assche wrote:
> Hi,
>
> Even after having added nvme_core.default_ps_max_latency_us=0
> pcie_aspm=off to the kernel command line, the following still
> appears in dmesg -w output upon suspend of an x86_64 workstation:
>
> [ 2451.640676] nvme nvme0: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS=0x10
> [ 2451.640690] nvme nvme0: Does your device have a faulty power saving mode enabled?
> [ 2451.640694] nvme nvme0: Try "nvme_core.default_ps_max_latency_us=0 pcie_aspm=off" and report a bug
>
> Hence this email. From the nvme id-ctrl output:
>
> mn : Samsung SSD 970 EVO Plus 500GB
> fr : 2B2QEXM7
>
> Please let me know if you need more information.
And this is happening during a suspend? What kind of suspend? Like an
S3/S4, or idle suspend?
* Re: NVMe suspend failure
From: Bart Van Assche @ 2024-02-04 1:44 UTC (permalink / raw)
To: Keith Busch; +Cc: linux-nvme@lists.infradead.org
On 2/3/24 17:18, Keith Busch wrote:
> And this is happening during a suspend? What kind of suspend? Like an
> S3/S4, or idle suspend?
I'm not sure how to determine this. This is what I found in the logs:
Feb 03 09:02:20 asus systemd-logind[1208]: The system will suspend now!
Feb 03 09:02:20 asus systemd-logind[1208]: Unit suspend.target is masked, refusing operation.
Bart.
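(For what it's worth, the suspend variant can usually be read from sysfs: /sys/power/state lists the supported states, and /sys/power/mem_sleep marks the active mode for "mem" in brackets — "s2idle" being idle suspend and "deep" being S3. A small sketch, using a sample string so it is self-contained:)

```shell
# Sketch: extract the active mem-sleep mode from /sys/power/mem_sleep.
# Sample string; on a live system use: mem_sleep=$(cat /sys/power/mem_sleep)
mem_sleep='[s2idle] deep'
# The bracketed entry is the mode the kernel will use for "mem" suspend.
active=$(printf '%s\n' "$mem_sleep" | grep -o '\[[^]]*\]' | tr -d '[]')
echo "active mode: $active"
```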
* Re: NVMe suspend failure
From: Keith Busch @ 2024-02-05 17:58 UTC (permalink / raw)
To: Bart Van Assche; +Cc: linux-nvme@lists.infradead.org
On Sat, Feb 03, 2024 at 05:44:03PM -0800, Bart Van Assche wrote:
> On 2/3/24 17:18, Keith Busch wrote:
> > And this is happening during a suspend? What kind of suspend? Like an
> > S3/S4, or idle suspend?
>
> I'm not sure how to determine this. This is what I found in the logs:
>
> Feb 03 09:02:20 asus systemd-logind[1208]: The system will suspend now!
> Feb 03 09:02:20 asus systemd-logind[1208]: Unit suspend.target is masked, refusing operation.
I am not sure what the "suspend now!" message means to the driver. Your
initial report with the "CSTS=0xffffffff" comes from the nvme request
timeout handler, so I'd want to confirm which command is timing out and
which path dispatched it: was the command dispatched from the
nvme_suspend() path, or is it coming from somewhere else? If somewhere
else, was it dispatched before or after nvme_suspend()?
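(One way to answer this is the nvme tracepoints, assuming a kernel with tracing enabled and tracefs mounted at the usual location; the sample trace line parsed below is hypothetical, shown only to illustrate pulling out the queue and command IDs:)

```shell
# Sketch: enable NVMe command tracing before reproducing the suspend/resume
# (run as root; paths assume tracefs at /sys/kernel/tracing):
#   echo 1 > /sys/kernel/tracing/events/nvme/nvme_setup_cmd/enable
#   echo 1 > /sys/kernel/tracing/events/nvme/nvme_complete_rq/enable
#   cat /sys/kernel/tracing/trace_pipe
# A setup entry with no matching completion is a candidate for the timeout.
# Hypothetical sample output line, parsed here to extract qid and cmdid:
line='nvme_setup_cmd: nvme0: disk=nvme0n1, qid=4, cmdid=97, ...'
qid=$(printf '%s\n' "$line" | grep -o 'qid=[0-9]*')
cmdid=$(printf '%s\n' "$line" | grep -o 'cmdid=[0-9]*')
echo "$qid $cmdid"
```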
* Re: NVMe suspend failure
From: Bart Van Assche @ 2024-02-05 18:44 UTC (permalink / raw)
To: Keith Busch; +Cc: linux-nvme@lists.infradead.org
On 2/5/24 09:58, Keith Busch wrote:
> On Sat, Feb 03, 2024 at 05:44:03PM -0800, Bart Van Assche wrote:
>> On 2/3/24 17:18, Keith Busch wrote:
>>> And this is happening during a suspend? What kind of suspend? Like an
>>> S3/S4, or idle suspend?
>>
>> I'm not sure how to determine this. This is what I found in the logs:
>>
>> Feb 03 09:02:20 asus systemd-logind[1208]: The system will suspend now!
>> Feb 03 09:02:20 asus systemd-logind[1208]: Unit suspend.target is masked, refusing operation.
>
> I am not sure what the "suspend now!" message means to the driver. Your
> initial report with the "CSTS=0xffffffff" comes from the nvme request
> timeout handler, so I'd want to confirm which command is timing out and
> which path dispatched it: was the command dispatched from the
> nvme_suspend() path, or is it coming from somewhere else? If somewhere
> else, was it dispatched before or after nvme_suspend()?
Hi Keith,
I think the requests that timed out were submitted after resume. User space
software is paused before nvme_suspend() is called, and nvme_suspend() waits
for pending requests to complete. Hence, the requests that timed out must
have been submitted after user space processes were resumed. An additional
indication is that, after resume, I saw crashes in user space software that
writes periodically to block storage devices (an email client and a web
browser).
Thanks,
Bart.