Linux block layer
 help / color / mirror / Atom feed
* nvme-pci timeout issue
@ 2024-01-11  3:54 Chaitanya Kulkarni
  2024-01-23 22:19 ` Chaitanya Kulkarni
  0 siblings, 1 reply; 4+ messages in thread
From: Chaitanya Kulkarni @ 2024-01-11  3:54 UTC (permalink / raw)
  To: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org
  Cc: kbusch@kernel.org, Sagi Grimberg, Christoph Hellwig, Jens Axboe

Hi all,

After running the test that triggers the nvme timeout for nvme-pci, the
device under test is lingering in an inconsistent state. Here are steps :-

1. Load the driver.
2. Trigger nvme_timeout.
3. After timeout handler gets triggered it fails the application
    with I/O error.
4. lsblk and nvme listdoesn't show the device anymore.
5. ls does show the device.
6. Any write to it fails (e.g. dd) since device has 0 capacity.

Is this accepted behavior? if it is then a malfunctioning device is
lingering in the system and applications end up accessing it as if
it is functioning properly. Can we avoid this scenario?

How about we remove the device from the system? If you're all okay with
it, I'll send a patch for the nvme-pci timeout to remove the device that
has zero capacity. Otherwise please suggest how to deal with this scenario.

With this confusing behavior I'm not entirely sure what is the expected
scenario to pass timeout testcase.

Please have a look at detailed test log below.

-ck

Test Log :-

blktests (master) # lsblk | grep nvme
nvme0n1 259:0    0    1G  0 disk
blktests (master) # nvme list
Node                  Generic               SN 
Model                                    Namespace 
Usage                      Format           FW Rev
--------------------- --------------------- -------------------- 
---------------------------------------- --------- 
-------------------------- ---------------- --------
/dev/nvme0n1          /dev/ng0n1            foo QEMU NVMe 
Ctrl                           1           1.07  GB / 1.07  GB    512   
B +  0 B   1.0
blktests (master) # dd if=/dev/zero of=/dev/nvme0n1 bs=4k count=1 
oflag=direct
1+0 records in
1+0 records out
4096 bytes (4.1 kB, 4.0 KiB) copied, 0.000555306 s, 7.4 MB/s
blktests (master) #
blktests (master) #
blktests (master) #
blktests (master) #  nvme_trtype=pci ./check nvme/050
nvme/050 (test nvme-pci timeout with fio jobs.) [passed]
     runtime  0.093s  ...  90.901s
blktests (master) # dmesg  -c
[  263.714133] run blktests nvme/050 at 2024-01-10 19:16:57
[  269.104834] nvme 0000:00:04.0: vgaarb: pci_notify
[  269.126931] pci 0000:00:04.0: vgaarb: pci_notify
[  274.111859] nvme 0000:00:04.0: vgaarb: pci_notify
[  274.111878] nvme 0000:00:04.0: runtime IRQ mapping not provided by arch
[  274.112099] nvme nvme0: pci function 0000:00:04.0
[  274.145615] nvme 0000:00:04.0: enabling bus mastering
[  274.146125] nvme 0000:00:04.0: save config 0x00: 0x00101b36
[  274.146132] nvme 0000:00:04.0: save config 0x04: 0x00100507
[  274.146138] nvme 0000:00:04.0: save config 0x08: 0x01080202
[  274.146143] nvme 0000:00:04.0: save config 0x0c: 0x00000000
[  274.146149] nvme 0000:00:04.0: save config 0x10: 0xfebd0004
[  274.146154] nvme 0000:00:04.0: save config 0x14: 0x00000000
[  274.146167] nvme 0000:00:04.0: save config 0x18: 0x00000000
[  274.146173] nvme 0000:00:04.0: save config 0x1c: 0x00000000
[  274.146180] nvme 0000:00:04.0: save config 0x20: 0x00000000
[  274.146185] nvme 0000:00:04.0: save config 0x24: 0x00000000
[  274.146191] nvme 0000:00:04.0: save config 0x28: 0x00000000
[  274.146196] nvme 0000:00:04.0: save config 0x2c: 0x11001af4
[  274.146201] nvme 0000:00:04.0: save config 0x30: 0x00000000
[  274.146206] nvme 0000:00:04.0: save config 0x34: 0x00000040
[  274.146212] nvme 0000:00:04.0: save config 0x38: 0x00000000
[  274.146217] nvme 0000:00:04.0: save config 0x3c: 0x0000010b
[  274.158737] nvme nvme0: 48/0/0 default/read/poll queues
[  274.165837] nvme nvme0: Ignoring bogus Namespace Identifiers
[  274.169187] nvme 0000:00:04.0: vgaarb: pci_notify
[  302.150515] run blktests nvme/050 at 2024-01-10 19:17:35
[  302.391621] FAULT_INJECTION: forcing a failure.
                name fail_io_timeout, interval 10, probability 99, space 
0, times -1

[...]

[  302.402142] FAULT_INJECTION: forcing a failure.
                name fail_io_timeout, interval 10, probability 99, space 
0, times -1
[  302.402649] FAULT_INJECTION: forcing a failure.
                name fail_io_timeout, interval 10, probability 99, space 
0, times -1
[  332.561584] nvme nvme0: I/O tag 0 (e000) opcode 0x2 (Read) QID 1 
timeout, aborting req_op:READ(0) size:4096
[  332.561631] nvme nvme0: I/O tag 832 (4340) opcode 0x2 (Read) QID 2 
timeout, aborting req_op:READ(0) size:4096
[  332.561685] nvme nvme0: I/O tag 896 (f380) opcode 0x2 (Read) QID 3 
timeout, aborting req_op:READ(0) size:4096
[  332.561703] nvme nvme0: Abort status: 0x0
[  332.561707] nvme nvme0: Abort status: 0x0
[  332.561720] nvme nvme0: I/O tag 768 (d300) opcode 0x2 (Read) QID 4 
timeout, aborting req_op:READ(0) size:4096
[  332.561739] nvme nvme0: Abort status: 0x0
[  332.561744] nvme nvme0: I/O tag 64 (5040) opcode 0x2 (Read) QID 5 
timeout, aborting req_op:READ(0) size:4096

[...]

[  332.563064] nvme nvme0: Abort status: 0x0
[  332.563067] nvme nvme0: I/O tag 0 (1000) opcode 0x2 (Read) QID 45 
timeout, aborting req_op:READ(0) size:4096
[  332.563088] nvme nvme0: I/O tag 704 (32c0) opcode 0x2 (Read) QID 46 
timeout, aborting req_op:READ(0) size:4096
[  332.563117] nvme nvme0: Abort status: 0x0
[  332.563124] nvme nvme0: Abort status: 0x0
[  332.563127] nvme nvme0: I/O tag 384 (b180) opcode 0x2 (Read) QID 47 
timeout, aborting req_op:READ(0) size:4096
[  332.563147] nvme nvme0: I/O tag 320 (8140) opcode 0x2 (Read) QID 48 
timeout, aborting req_op:READ(0) size:4096
[  332.563170] nvme nvme0: Abort status: 0x0
[  332.563179] nvme nvme0: Abort status: 0x0
[  362.769789] nvme nvme0: I/O tag 0 (e000) opcode 0x2 (Read) QID 1 
timeout, reset controller
[  362.871902] nvme 0000:00:04.0: enabling bus mastering
[  362.872350] nvme 0000:00:04.0: save config 0x00: 0x00101b36
[  362.872356] nvme 0000:00:04.0: save config 0x04: 0x00100507
[  362.872362] nvme 0000:00:04.0: save config 0x08: 0x01080202
[  362.872367] nvme 0000:00:04.0: save config 0x0c: 0x00000000
[  362.872372] nvme 0000:00:04.0: save config 0x10: 0xfebd0004
[  362.872378] nvme 0000:00:04.0: save config 0x14: 0x00000000
[  362.872383] nvme 0000:00:04.0: save config 0x18: 0x00000000
[  362.872388] nvme 0000:00:04.0: save config 0x1c: 0x00000000
[  362.872394] nvme 0000:00:04.0: save config 0x20: 0x00000000
[  362.872399] nvme 0000:00:04.0: save config 0x24: 0x00000000
[  362.872404] nvme 0000:00:04.0: save config 0x28: 0x00000000
[  362.872417] nvme 0000:00:04.0: save config 0x2c: 0x11001af4
[  362.872423] nvme 0000:00:04.0: save config 0x30: 0x00000000
[  362.872428] nvme 0000:00:04.0: save config 0x34: 0x00000040
[  362.872433] nvme 0000:00:04.0: save config 0x38: 0x00000000
[  362.872438] nvme 0000:00:04.0: save config 0x3c: 0x0000010b
[  362.883064] nvme nvme0: 48/0/0 default/read/poll queues
[  362.884880] FAULT_INJECTION: forcing a failure.
                name fail_io_timeout, interval 10, probability 99, space 
0, times -1
[  362.884907] FAULT_INJECTION: forcing a failure.
                name fail_io_timeout, interval 10, probability 99, space 
0, times -1
[  362.884970] FAULT_INJECTION: forcing a failure.
                name fail_io_timeout, interval 10, probability 99, space 
0, times -1
[  362.885972] FAULT_INJECTION: forcing a failure.
                name fail_io_timeout, interval 10, probability 99, space 
0, times -1
[  392.973015] nvme nvme0: I/O tag 960 (a3c0) QID 32 timeout, disable 
controller
[  393.038995] I/O error, dev nvme0n1, sector 84856 op 0x0:(READ) flags 
0x800 phys_seg 1 prio class 2
[  393.038995] I/O error, dev nvme0n1, sector 1345216 op 0x0:(READ) 
flags 0x800 phys_seg 1 prio class 2
[  393.039000] I/O error, dev nvme0n1, sector 1104936 op 0x0:(READ) 
flags 0x800 phys_seg 1 prio class 2
[  393.039017] I/O error, dev nvme0n1, sector 1270088 op 0x0:(READ) 
flags 0x800 phys_seg 1 prio class 2
[  393.045604] nvme nvme0: failed to mark controller live state
[  393.045611] nvme nvme0: Disabling device after reset failure: -19
blktests (master) #
blktests (master) #
blktests (master) # lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda      8:0    0   50G  0 disk
├─sda1   8:1    0    1G  0 part /boot
└─sda2   8:2    0   49G  0 part /home
sdb      8:16   0  100G  0 disk /mnt/data
sr0     11:0    1 1024M  0 rom
vda    252:0    0    5G  0 disk
blktests (master) #
blktests (master) # nvme list
Node                  Generic               SN 
Model                                    Namespace 
Usage                      Format           FW Rev
--------------------- --------------------- -------------------- 
---------------------------------------- --------- 
-------------------------- ---------------- --------
blktests (master) #
blktests (master) # ls /dev/nvme0
/dev/nvme0
blktests (master) # ls /dev/nvme0n1
/dev/nvme0n1
blktests (master) # dd if=/dev/zero of=/dev/nvme0n1 bs=4k count=1 
oflag=direct
dd: error writing '/dev/nvme0n1': No space left on device
1+0 records in
0+0 records out
0 bytes copied, 0.000204128 s, 0.0 kB/s



^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: nvme-pci timeout issue
  2024-01-11  3:54 nvme-pci timeout issue Chaitanya Kulkarni
@ 2024-01-23 22:19 ` Chaitanya Kulkarni
  2024-01-24  9:52   ` Christoph Hellwig
  0 siblings, 1 reply; 4+ messages in thread
From: Chaitanya Kulkarni @ 2024-01-23 22:19 UTC (permalink / raw)
  To: Sagi Grimberg, kbusch@kernel.org, Christoph Hellwig, Jens Axboe
  Cc: linux-block@vger.kernel.org, linux-nvme@lists.infradead.org,
	Shin'ichiro Kawasaki

Sagi/Christoph/Jens/Keith,

On 1/10/24 19:54, Chaitanya Kulkarni wrote:
> Hi all,
>
> After running the test that triggers the nvme timeout for nvme-pci, the
> device under test is lingering in an inconsistent state. Here are steps :-
>
> 1. Load the driver.
> 2. Trigger nvme_timeout.
> 3. After timeout handler gets triggered it fails the application
>      with I/O error.
> 4. lsblk and nvme listdoesn't show the device anymore.
> 5. ls does show the device.
> 6. Any write to it fails (e.g. dd) since device has 0 capacity.
>
> Is this accepted behavior? if it is then a malfunctioning device is
> lingering in the system and applications end up accessing it as if
> it is functioning properly. Can we avoid this scenario?
>
> How about we remove the device from the system? If you're all okay with
> it, I'll send a patch for the nvme-pci timeout to remove the device that
> has zero capacity. Otherwise please suggest how to deal with this scenario.
>
> With this confusing behavior I'm not entirely sure what is the expected
> scenario to pass timeout testcase.
>
> Please have a look at detailed test log below.
>
> -ck
>

Can someone please provide an insight on this behavior so we can merge 
testcase into blktests? Please note that Shinichiro also observed the 
same behavior.

-ck



^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: nvme-pci timeout issue
  2024-01-23 22:19 ` Chaitanya Kulkarni
@ 2024-01-24  9:52   ` Christoph Hellwig
  2024-01-24 10:05     ` Chaitanya Kulkarni
  0 siblings, 1 reply; 4+ messages in thread
From: Christoph Hellwig @ 2024-01-24  9:52 UTC (permalink / raw)
  To: Chaitanya Kulkarni
  Cc: Sagi Grimberg, kbusch@kernel.org, Christoph Hellwig, Jens Axboe,
	linux-block@vger.kernel.org, linux-nvme@lists.infradead.org,
	Shin'ichiro Kawasaki

On Tue, Jan 23, 2024 at 10:19:28PM +0000, Chaitanya Kulkarni wrote:
> Can someone please provide an insight on this behavior so we can merge 
> testcase into blktests? Please note that Shinichiro also observed the 
> same behavior.

Can you provide the test case?


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: nvme-pci timeout issue
  2024-01-24  9:52   ` Christoph Hellwig
@ 2024-01-24 10:05     ` Chaitanya Kulkarni
  0 siblings, 0 replies; 4+ messages in thread
From: Chaitanya Kulkarni @ 2024-01-24 10:05 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Sagi Grimberg, kbusch@kernel.org, Jens Axboe,
	linux-block@vger.kernel.org, linux-nvme@lists.infradead.org,
	Shin'ichiro Kawasaki

On 1/24/2024 1:52 AM, Christoph Hellwig wrote:
> On Tue, Jan 23, 2024 at 10:19:28PM +0000, Chaitanya Kulkarni wrote:
>> Can someone please provide an insight on this behavior so we can merge
>> testcase into blktests? Please note that Shinichiro also observed the
>> same behavior.
> 
> Can you provide the test case?
> 


Here is latest version of blktests patch for the testcase :-

 From d3ab5bb3d22fcc3593e4da7599523e013239720e Mon Sep 17 00:00:00 2001
From: Chaitanya Kulkarni <kch@nvidia.com>
Date: Tue, 23 Jan 2024 14:30:28 -0800
Subject: [PATCH blktests V3] nvme: add nvme pci timeout testcase

Trigger and test nvme-pci timeout with concurrent fio jobs.

Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
---
V3:-

1. Add CAN_BE_ZONED.
2. Add FAULT_INJECTION_DEBUG_FS check in requires.
3. Remove _require_nvme_trtype pci in requires().
4. Remove device_requires().
5. Store fio output in FULL.
6. Revmoe shellcheck and use grep I/O error value to pass/fail testcase.

---
  tests/nvme/050     | 69 ++++++++++++++++++++++++++++++++++++++++++++++
  tests/nvme/050.out |  2 ++
  2 files changed, 71 insertions(+)
  create mode 100755 tests/nvme/050
  create mode 100644 tests/nvme/050.out

diff --git a/tests/nvme/050 b/tests/nvme/050
new file mode 100755
index 0000000..cacaba6
--- /dev/null
+++ b/tests/nvme/050
@@ -0,0 +1,69 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-3.0+
+# Copyright (C) 2024 Chaitanya Kulkarni
+#
+# Test NVMe-PCI timeout with FIO jobs by triggering the nvme_timeout 
function.
+#
+
+. tests/nvme/rc
+
+DESCRIPTION="test nvme-pci timeout with fio jobs"
+CAN_BE_ZONED=1
+
+sysfs_path="/sys/kernel/debug/fail_io_timeout/"
+#restrict test to nvme-pci only
+nvme_trtype=pci
+
+# fault injection config array
+declare -A fi_array
+
+requires() {
+	_have_fio
+	_nvme_requires
+	_have_kernel_option FAIL_IO_TIMEOUT
+	_have_kernel_option FAULT_INJECTION_DEBUG_FS
+}
+
+save_fi_settings() {
+	for fi_attr in probability interval times space verbose
+	do
+		fi_array["${fi_attr}"]=$(cat "${sysfs_path}/${fi_attr}")
+	done
+}
+
+restore_fi_settings() {
+	for fi_attr in probability interval times space verbose
+	do
+		echo "${fi_array["${fi_attr}"]}" > "${sysfs_path}/${fi_attr}"
+	done
+}
+
+test_device() {
+	local nvme_ns
+	local io_fimeout_fail
+
+	echo "Running ${TEST_NAME}"
+
+	nvme_ns="$(basename "${TEST_DEV}")"
+	io_fimeout_fail="$(cat /sys/block/"${nvme_ns}"/io-timeout-fail)"
+	save_fi_settings
+	echo 1 > /sys/block/"${nvme_ns}"/io-timeout-fail
+
+	echo 100 > /sys/kernel/debug/fail_io_timeout/probability
+	echo   1 > /sys/kernel/debug/fail_io_timeout/interval
+	echo  -1 > /sys/kernel/debug/fail_io_timeout/times
+	echo   0 > /sys/kernel/debug/fail_io_timeout/space
+	echo   1 > /sys/kernel/debug/fail_io_timeout/verbose
+
+	fio --bs=4k --rw=randread --norandommap --numjobs="$(nproc)" \
+	    --name=reads --direct=1 --filename="${TEST_DEV}" --group_reporting \
+	    --time_based --runtime=1m >& "$FULL"
+
+	if grep -q "Input/output error" "$FULL"; then
+		echo "Test complete"
+	else
+		echo "Test failed"
+	fi
+	restore_fi_settings
+	echo "${io_fimeout_fail}" > /sys/block/"${nvme_ns}"/io-timeout-fail
+}
diff --git a/tests/nvme/050.out b/tests/nvme/050.out
new file mode 100644
index 0000000..b78b05f
--- /dev/null
+++ b/tests/nvme/050.out
@@ -0,0 +1,2 @@
+Running nvme/050
+Test complete
-- 
2.40.0



It is also posted here on linux-nvme just in case :-

https://lists.infradead.org/pipermail/linux-nvme/2024-January/044562.html

-ck


^ permalink raw reply related	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2024-01-24 10:05 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-01-11  3:54 nvme-pci timeout issue Chaitanya Kulkarni
2024-01-23 22:19 ` Chaitanya Kulkarni
2024-01-24  9:52   ` Christoph Hellwig
2024-01-24 10:05     ` Chaitanya Kulkarni

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox