* [PATCH] nvme: test nvmet-wq sysfs interface
@ 2024-11-04 19:29 Chaitanya Kulkarni
2024-11-05 7:02 ` Daniel Wagner
2024-11-08 5:21 ` Shinichiro Kawasaki
0 siblings, 2 replies; 7+ messages in thread
From: Chaitanya Kulkarni @ 2024-11-04 19:29 UTC (permalink / raw)
To: kanie
Cc: linux-nvme, linux-block, shinichiro.kawasaki, dwagner,
Chaitanya Kulkarni
Add a test that randomly sets the cpumask from the available CPUs for
nvmet-wq while running a fio workload. This patch has been
tested on the nvme-loop and nvme-tcp transports.
Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
---
tests/nvme/055 | 99 ++++++++++++++++++++++++++++++++++++++++++++++
tests/nvme/055.out | 3 ++
2 files changed, 102 insertions(+)
create mode 100755 tests/nvme/055
create mode 100644 tests/nvme/055.out
diff --git a/tests/nvme/055 b/tests/nvme/055
new file mode 100755
index 0000000..9fe27a3
--- /dev/null
+++ b/tests/nvme/055
@@ -0,0 +1,99 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0+
+# Copyright (C) 2024 Chaitanya Kulkarni
+#
+# Test nvmet-wq cpumask sysfs attribute with NVMe-oF and fio workload
+#
+
+. tests/nvme/rc
+
+DESCRIPTION="Test nvmet-wq cpumask sysfs attribute with fio on NVMe-oF device"
+TIMED=1
+
+requires() {
+ _nvme_requires
+ _have_fio && _have_loop
+ _require_nvme_trtype_is_fabrics
+}
+
+cleanup_setup() {
+ _nvme_disconnect_subsys
+ _nvmet_target_cleanup
+}
+
+test() {
+ local cpumask_path="/sys/devices/virtual/workqueue/nvmet-wq/cpumask"
+ local original_cpumask
+ local min_cpus
+ local max_cpus
+ local numbers
+ local idx
+ local ns
+
+ echo "Running ${TEST_NAME}"
+
+ _setup_nvmet
+ _nvmet_target_setup
+ _nvme_connect_subsys
+
+ if [ ! -f "$cpumask_path" ]; then
+ SKIP_REASONS+=("nvmet_wq cpumask sysfs attribute not found.")
+ cleanup_setup
+ return 1
+ fi
+
+ ns=$(_find_nvme_ns "${def_subsys_uuid}")
+
+ original_cpumask=$(cat "$cpumask_path")
+
+ num_cpus=$(nproc)
+ max_cpus=$(( num_cpus < 20 ? num_cpus : 20 ))
+ min_cpus=0
+ #shellcheck disable=SC2207
+ numbers=($(seq $min_cpus $max_cpus))
+
+ _run_fio_rand_io --filename="/dev/${ns}" --time_based --runtime=130s \
+ --iodepth=8 --size="${NVME_IMG_SIZE}" &> "$FULL" &
+
+ # Let the fio settle down else we will break in the loop for fio check
+ sleep 1
+ for ((i = 0; i < max_cpus; i++)); do
+ if ! pgrep -x fio &> /dev/null ; then
+ break
+ fi
+
+ if [[ ${#numbers[@]} -eq 0 ]]; then
+ break
+ fi
+
+ idx=$((RANDOM % ${#numbers[@]}))
+
+ #shellcheck disable=SC2004
+ cpu_mask=$(printf "%X" $((1 << ${numbers[idx]})))
+ echo "$cpu_mask" > "$cpumask_path"
+ if [[ $(cat "$cpumask_path") =~ ^[0,]*${cpu_mask}\n$ ]]; then
+ echo "Test Failed: cpumask was not set correctly "
+ echo "Expected ${cpu_mask} found $(cat "$cpumask_path")"
+ cleanup_setup
+ return 1
+ fi
+ sleep 3
+ # Remove the selected number
+ numbers=("${numbers[@]:0:$idx}" "${numbers[@]:$((idx + 1))}")
+ done
+
+ killall fio &> /dev/null
+
+ # Restore original cpumask
+ echo "$original_cpumask" > "$cpumask_path"
+ restored_cpumask=$(cat "$cpumask_path")
+
+ if [[ "$restored_cpumask" != "$original_cpumask" ]]; then
+ echo "Failed to restore original cpumask."
+ cleanup_setup
+ return 1
+ fi
+
+ cleanup_setup
+ echo "Test complete"
+}
diff --git a/tests/nvme/055.out b/tests/nvme/055.out
new file mode 100644
index 0000000..427dfee
--- /dev/null
+++ b/tests/nvme/055.out
@@ -0,0 +1,3 @@
+Running nvme/055
+disconnected 1 controller(s)
+Test complete
--
2.40.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] nvme: test nvmet-wq sysfs interface
2024-11-04 19:29 [PATCH] nvme: test nvmet-wq sysfs interface Chaitanya Kulkarni
@ 2024-11-05 7:02 ` Daniel Wagner
2024-11-06 3:46 ` Chaitanya Kulkarni
2024-11-08 5:21 ` Shinichiro Kawasaki
1 sibling, 1 reply; 7+ messages in thread
From: Daniel Wagner @ 2024-11-05 7:02 UTC (permalink / raw)
To: Chaitanya Kulkarni; +Cc: kanie, linux-nvme, linux-block, shinichiro.kawasaki
On Mon, Nov 04, 2024 at 11:29:07AM GMT, Chaitanya Kulkarni wrote:
> +# Test nvmet-wq cpumask sysfs attribute with NVMe-oF and fio workload
What do you want to test here? What's the objective? I don't see any
block layer related parts, or am I missing something?
* Re: [PATCH] nvme: test nvmet-wq sysfs interface
2024-11-05 7:02 ` Daniel Wagner
@ 2024-11-06 3:46 ` Chaitanya Kulkarni
2024-11-06 12:13 ` Daniel Wagner
0 siblings, 1 reply; 7+ messages in thread
From: Chaitanya Kulkarni @ 2024-11-06 3:46 UTC (permalink / raw)
To: Daniel Wagner
Cc: kanie@linux.alibaba.com, linux-nvme@lists.infradead.org,
linux-block@vger.kernel.org, shinichiro.kawasaki@wdc.com,
Chaitanya Kulkarni
On 11/4/24 23:02, Daniel Wagner wrote:
> On Mon, Nov 04, 2024 at 11:29:07AM GMT, Chaitanya Kulkarni wrote:
>> +# Test nvmet-wq cpumask sysfs attribute with NVMe-oF and fio workload
> What do you want to test here? What's the objective? I don't see any
> block layer related parts or do I miss something?
For the NVMe-oF target, we exported nvmet-wq to userspace with a
recent patch. This behavior is new, and I don't think any existing test
runs a fio workload while randomly changing the cpumask to ensure the
stability of nvmet-wq while an application is using the block layer.
Hence we need this test for the latest patch we have added for 6.13.
-ck
* Re: [PATCH] nvme: test nvmet-wq sysfs interface
2024-11-06 3:46 ` Chaitanya Kulkarni
@ 2024-11-06 12:13 ` Daniel Wagner
2024-11-08 5:00 ` Shinichiro Kawasaki
0 siblings, 1 reply; 7+ messages in thread
From: Daniel Wagner @ 2024-11-06 12:13 UTC (permalink / raw)
To: Chaitanya Kulkarni
Cc: kanie@linux.alibaba.com, linux-nvme@lists.infradead.org,
linux-block@vger.kernel.org, shinichiro.kawasaki@wdc.com
On Wed, Nov 06, 2024 at 03:46:26AM GMT, Chaitanya Kulkarni wrote:
> On 11/4/24 23:02, Daniel Wagner wrote:
> > On Mon, Nov 04, 2024 at 11:29:07AM GMT, Chaitanya Kulkarni wrote:
> >> +# Test nvmet-wq cpumask sysfs attribute with NVMe-oF and fio workload
> > What do you want to test here? What's the objective? I don't see any
> > block layer related parts or do I miss something?
>
> For NVMeOF target we have exported nvmet-wq to userspace with
> recent patch. This behavior is new and I don't think there is any test
> exists that runs the fio workload while changing the CPUMASK randomly
> to ensure the stability for nvmet-wq when application is using the block
> layer. Hence we need the test for latest patch we have added to the
> 6.13.
Though this is not a performance measurement, just functional testing of
whether the scheduler works. I don't think we should add such tests,
because they will add to the overall runtime for little benefit. I am
pretty sure there are already tests (e.g. kunit) which do more elaborate
scheduler tests.
* Re: [PATCH] nvme: test nvmet-wq sysfs interface
2024-11-06 12:13 ` Daniel Wagner
@ 2024-11-08 5:00 ` Shinichiro Kawasaki
0 siblings, 0 replies; 7+ messages in thread
From: Shinichiro Kawasaki @ 2024-11-08 5:00 UTC (permalink / raw)
To: Daniel Wagner
Cc: Chaitanya Kulkarni, kanie@linux.alibaba.com,
linux-nvme@lists.infradead.org, linux-block@vger.kernel.org
On Nov 06, 2024 / 13:13, Daniel Wagner wrote:
> On Wed, Nov 06, 2024 at 03:46:26AM GMT, Chaitanya Kulkarni wrote:
> > On 11/4/24 23:02, Daniel Wagner wrote:
> > > On Mon, Nov 04, 2024 at 11:29:07AM GMT, Chaitanya Kulkarni wrote:
> > >> +# Test nvmet-wq cpumask sysfs attribute with NVMe-oF and fio workload
> > > What do you want to test here? What's the objective? I don't see any
> > > block layer related parts or do I miss something?
> >
> > For NVMeOF target we have exported nvmet-wq to userspace with
> > recent patch. This behavior is new and I don't think there is any test
> > exists that runs the fio workload while changing the CPUMASK randomly
> > to ensure the stability for nvmet-wq when application is using the block
> > layer. Hence we need the test for latest patch we have added to the
> > 6.13.
>
> Though this is not a performance measurement just a functional testing if
> the scheduler works. I don't think we should add such tests
> because it will add to the overall runtime for little benefit. I am
> pretty sure there already tests (e.g. kunit) which do more elaborate
> scheduler tests.
When I ran the test case on my test system using the kernel v6.12-rc6 and the
patch "nvmet: make nvmet_wq visible in sysfs", I observed a kernel INFO
splat [1]. This INFO looks like it indicates a bug in the nvmet driver. Is this
a known issue? The INFO is stably recreated with the loop, rdma and tcp
transports on my test system. If this is a new, unknown issue, this test case
is a good one to exercise the nvmet driver, and it looks worth adding to
blktests.
As for the runtime cost, I think the test case can be modified to reflect the
TIMEOUT variable users set. This will allow users to control the runtime cost to
some extent.
[1]
[ 87.365354][ T963] run blktests nvme/055 at 2024-11-08 09:32:42
[ 87.433572][ T1011] loop0: detected capacity change from 0 to 2097152
[ 87.452511][ T1014] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
[ 87.478018][ T1019] nvmet_tcp: enabling port 0 (127.0.0.1:4420)
[ 87.565565][ T226] nvmet: creating nvm controller 1 for subsystem blktests-subsystem-1 for NQN nqn.20.
[ 87.570700][ T1026] nvme nvme1: Please enable CONFIG_NVME_MULTIPATH for full support of multi-port dev.
[ 87.573064][ T1026] nvme nvme1: creating 4 I/O queues.
[ 87.577041][ T1026] nvme nvme1: mapped 4/0/0 default/read/poll queues.
[ 87.580437][ T1026] nvme nvme1: new ctrl: NQN "blktests-subsystem-1", addr 127.0.0.1:4420, hostnqn: nq9
[ 101.337422][ T1073] nvme nvme1: Removing ctrl: NQN "blktests-subsystem-1"
[ 101.383814][ T104] INFO: trying to register non-static key.
[ 101.384456][ T104] The code is fine but needs lockdep annotation, or maybe
[ 101.385136][ T104] you didn't initialize this object before use?
[ 101.385758][ T104] turning off the locking correctness validator.
[ 101.386392][ T104] CPU: 1 UID: 0 PID: 104 Comm: kworker/u16:4 Not tainted 6.12.0-rc6+ #361
[ 101.387197][ T104] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-3.fc41 04/01/204
[ 101.388109][ T104] Workqueue: nvmet-wq nvmet_tcp_release_queue_work [nvmet_tcp]
[ 101.388868][ T104] Call Trace:
[ 101.389203][ T104] <TASK>
[ 101.389484][ T104] dump_stack_lvl+0x6a/0x90
[ 101.389927][ T104] register_lock_class+0xe2a/0x10a0
[ 101.390444][ T104] ? __lock_acquire+0xd1b/0x5f20
[ 101.390921][ T104] ? __pfx_register_lock_class+0x10/0x10
[ 101.391479][ T104] __lock_acquire+0x81e/0x5f20
[ 101.391939][ T104] ? lock_is_held_type+0xd5/0x130
[ 101.392434][ T104] ? find_held_lock+0x2d/0x110
[ 101.392891][ T104] ? __pfx___lock_acquire+0x10/0x10
[ 101.393399][ T104] ? lock_release+0x460/0x7a0
[ 101.393853][ T104] ? __pfx_lock_release+0x10/0x10
[ 101.394351][ T104] lock_acquire.part.0+0x12d/0x360
[ 101.394841][ T104] ? xa_erase+0xd/0x30
[ 101.395255][ T104] ? __pfx_lock_acquire.part.0+0x10/0x10
[ 101.395793][ T104] ? rcu_is_watching+0x11/0xb0
[ 101.396271][ T104] ? trace_lock_acquire+0x12f/0x1a0
[ 101.396770][ T104] ? __pfx___flush_work+0x10/0x10
[ 101.397266][ T104] ? xa_erase+0xd/0x30
[ 101.397658][ T104] ? lock_acquire+0x2d/0xc0
[ 101.398089][ T104] ? xa_erase+0xd/0x30
[ 101.398501][ T104] _raw_spin_lock+0x2f/0x40
[ 101.398933][ T104] ? xa_erase+0xd/0x30
[ 101.400167][ T104] xa_erase+0xd/0x30
[ 101.401359][ T104] nvmet_ctrl_destroy_pr+0x10e/0x1c0 [nvmet]
[ 101.402767][ T104] ? __pfx_nvmet_ctrl_destroy_pr+0x10/0x10 [nvmet]
[ 101.404235][ T104] ? __pfx___might_resched+0x10/0x10
[ 101.405558][ T104] nvmet_ctrl_free+0x2f0/0x830 [nvmet]
[ 101.406900][ T104] ? lockdep_hardirqs_on+0x78/0x100
[ 101.408192][ T104] ? __cancel_work+0x166/0x230
[ 101.409398][ T104] ? __pfx_nvmet_ctrl_free+0x10/0x10 [nvmet]
[ 101.410726][ T104] ? rcu_is_watching+0x11/0xb0
[ 101.411938][ T104] ? kfree+0x13e/0x4a0
[ 101.413073][ T104] ? lockdep_hardirqs_on+0x78/0x100
[ 101.414318][ T104] nvmet_sq_destroy+0x1f2/0x3a0 [nvmet]
[ 101.415556][ T104] nvmet_tcp_release_queue_work+0x4c0/0xe40 [nvmet_tcp]
[ 101.416912][ T104] process_one_work+0x85a/0x1460
[ 101.418064][ T104] ? __pfx_lock_acquire.part.0+0x10/0x10
[ 101.419266][ T104] ? __pfx_process_one_work+0x10/0x10
[ 101.420427][ T104] ? assign_work+0x16c/0x240
[ 101.421486][ T104] ? lock_is_held_type+0xd5/0x130
[ 101.422555][ T104] worker_thread+0x5e2/0xfc0
[ 101.423574][ T104] ? __pfx_worker_thread+0x10/0x10
[ 101.424639][ T104] kthread+0x2d1/0x3a0
[ 101.425587][ T104] ? _raw_spin_unlock_irq+0x24/0x50
[ 101.426644][ T104] ? __pfx_kthread+0x10/0x10
[ 101.427637][ T104] ret_from_fork+0x30/0x70
[ 101.428596][ T104] ? __pfx_kthread+0x10/0x10
[ 101.429562][ T104] ret_from_fork_asm+0x1a/0x30
[ 101.430535][ T104] </TASK>
* Re: [PATCH] nvme: test nvmet-wq sysfs interface
2024-11-04 19:29 [PATCH] nvme: test nvmet-wq sysfs interface Chaitanya Kulkarni
2024-11-05 7:02 ` Daniel Wagner
@ 2024-11-08 5:21 ` Shinichiro Kawasaki
2024-11-12 0:58 ` Chaitanya Kulkarni
1 sibling, 1 reply; 7+ messages in thread
From: Shinichiro Kawasaki @ 2024-11-08 5:21 UTC (permalink / raw)
To: Chaitanya Kulkarni
Cc: kanie@linux.alibaba.com, linux-nvme@lists.infradead.org,
linux-block@vger.kernel.org, dwagner@suse.de
On Nov 04, 2024 / 11:29, Chaitanya Kulkarni wrote:
> Add a test that randomly sets the cpumask from available CPUs for
> the nvmet-wq while running the fio workload. This patch has been
> tested on nvme-loop and nvme-tcp transport.
>
> Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
Thanks. As I noted in another mail, this test case generates a kernel
INFO message and looks like it catches a new bug, so I think this
test case is worth adding. Please find my review comments inline.
> ---
> tests/nvme/055 | 99 ++++++++++++++++++++++++++++++++++++++++++++++
> tests/nvme/055.out | 3 ++
> 2 files changed, 102 insertions(+)
> create mode 100755 tests/nvme/055
> create mode 100644 tests/nvme/055.out
>
> diff --git a/tests/nvme/055 b/tests/nvme/055
> new file mode 100755
> index 0000000..9fe27a3
> --- /dev/null
> +++ b/tests/nvme/055
> @@ -0,0 +1,99 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-2.0+
> +# Copyright (C) 2024 Chaitanya Kulkarni
> +#
> +# Test nvmet-wq cpumask sysfs attribute with NVMe-oF and fio workload
> +#
> +
> +. tests/nvme/rc
> +
> +DESCRIPTION="Test nvmet-wq cpumask sysfs attribute with fio on NVMe-oF device"
> +TIMED=1
> +
> +requires() {
> + _nvme_requires
> + _have_fio && _have_loop
Nit: an unnecessary space.
> + _require_nvme_trtype_is_fabrics
> +}
I suggest adding set_conditions() here as below. Without it, the test case is
skipped when users do not set the NVMET_TRTYPES variable. If we add
set_conditions(), the _have_loop call in requires() can be removed.
set_conditions() {
_set_nvme_trtype "$@"
}
> +
> +cleanup_setup() {
> + _nvme_disconnect_subsys
> + _nvmet_target_cleanup
> +}
> +
> +test() {
> + local cpumask_path="/sys/devices/virtual/workqueue/nvmet-wq/cpumask"
> + local original_cpumask
> + local min_cpus
> + local max_cpus
> + local numbers
> + local idx
> + local ns
> +
> + echo "Running ${TEST_NAME}"
> +
> + _setup_nvmet
> + _nvmet_target_setup
> + _nvme_connect_subsys
> +
> + if [ ! -f "$cpumask_path" ]; then
> + SKIP_REASONS+=("nvmet_wq cpumask sysfs attribute not found.")
> + cleanup_setup
> + return 1
> + fi
> +
> + ns=$(_find_nvme_ns "${def_subsys_uuid}")
> +
> + original_cpumask=$(cat "$cpumask_path")
> +
> + num_cpus=$(nproc)
> + max_cpus=$(( num_cpus < 20 ? num_cpus : 20 ))
> + min_cpus=0
> + #shellcheck disable=SC2207
> + numbers=($(seq $min_cpus $max_cpus))
Nit: the shellcheck error can be avoided with this:
read -a numbers -d '' < <(seq $min_cpus $max_cpus)
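A minimal standalone sketch of this idiom (the bounds here are illustrative,
not taken from a real host):

```shell
# Populate a bash array from seq without triggering SC2207.
min_cpus=0
max_cpus=4
# read returns non-zero at EOF because no NUL delimiter is found,
# so guard it when running under "set -e".
read -r -a numbers -d '' < <(seq "$min_cpus" "$max_cpus") || true
echo "${#numbers[@]}"   # 5
echo "${numbers[*]}"    # 0 1 2 3 4
```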
> +
> + _run_fio_rand_io --filename="/dev/${ns}" --time_based --runtime=130s \
The --time_based and --runtime=130s options are not required, because the fio
helper bash functions add them.
Instead, let's add this line before the fio command.
: ${TIMEOUT:=60}
The fio helper functions will reflect this TIMEOUT value in the --runtime
option. This will allow users to control the runtime cost, as TIMED=1 indicates.
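For reference, a small sketch of the ${VAR:=default} expansion this relies on:
an unset TIMEOUT falls back to 60, while a caller-provided value is kept
untouched (the values 60 and 30 are just illustrative):

```shell
# Unset variable: the := expansion assigns the default.
unset TIMEOUT
: "${TIMEOUT:=60}"
echo "$TIMEOUT"   # 60

# Caller-provided value: the := expansion leaves it alone.
TIMEOUT=30
: "${TIMEOUT:=60}"
echo "$TIMEOUT"   # 30
```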
> + --iodepth=8 --size="${NVME_IMG_SIZE}" &> "$FULL" &
> +
> + # Let the fio settle down else we will break in the loop for fio check
> + sleep 1
> + for ((i = 0; i < max_cpus; i++)); do
> + if ! pgrep -x fio &> /dev/null ; then
The pgrep command is not in GNU coreutils. I suggest keeping the fio process id
in a variable and checking it with the "kill -0" command. The test case
block/005 does this.
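A sketch of the block/005-style liveness check; here a short sleep stands in
for the fio workload:

```shell
# Start the background workload and remember its PID.
sleep 5 &
fio_pid=$!

# kill -0 sends no signal; it only tests whether the PID is alive.
if kill -0 "$fio_pid" 2>/dev/null; then
	alive=yes
else
	alive=no
fi
echo "$alive"   # yes

# Clean up the background job; wait returns non-zero for a killed child.
kill "$fio_pid" 2>/dev/null || true
wait "$fio_pid" 2>/dev/null || true
```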
> + break
> + fi
> +
> + if [[ ${#numbers[@]} -eq 0 ]]; then
> + break
> + fi
> +
> + idx=$((RANDOM % ${#numbers[@]}))
> +
> + #shellcheck disable=SC2004
> + cpu_mask=$(printf "%X" $((1 << ${numbers[idx]})))
> + echo "$cpu_mask" > "$cpumask_path"
When I ran this test case, I observed an error at this line:
echo: write error: Value too large for defined data type
I think the cpu_mask calculation is wrong, and it should be like this:
cpu_mask=0
for ((n = 0; n < numbers[idx]; n++)); do
cpu_mask=$((cpu_mask + (1 << n)))
done
cpu_mask=$(printf "%X" $((cpu_mask)))
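For a concrete value, the suggested loop produces an all-lower-bits mask,
i.e. 2^k - 1 for a chosen count k; the k below is an illustrative stand-in
for ${numbers[idx]}:

```shell
# Sum 1<<n for n in 0..k-1, setting every bit below k.
k=4
cpu_mask=0
for ((n = 0; n < k; n++)); do
	cpu_mask=$((cpu_mask + (1 << n)))
done
printf '%X\n' "$cpu_mask"   # F (CPUs 0-3)
```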
> + if [[ $(cat "$cpumask_path") =~ ^[0,]*${cpu_mask}\n$ ]]; then
> + echo "Test Failed: cpumask was not set correctly "
> + echo "Expected ${cpu_mask} found $(cat "$cpumask_path")"
> + cleanup_setup
> + return 1
> + fi
> + sleep 3
> + # Remove the selected number
> + numbers=("${numbers[@]:0:$idx}" "${numbers[@]:$((idx + 1))}")
> + done
> +
> + killall fio &> /dev/null
killall is not in GNU coreutils either (blktests already uses it in other
places though...). Does "kill -9" work instead?
> +
> + # Restore original cpumask
> + echo "$original_cpumask" > "$cpumask_path"
> + restored_cpumask=$(cat "$cpumask_path")
> +
> + if [[ "$restored_cpumask" != "$original_cpumask" ]]; then
> + echo "Failed to restore original cpumask."
> + cleanup_setup
> + return 1
> + fi
> +
> + cleanup_setup
> + echo "Test complete"
> +}
> diff --git a/tests/nvme/055.out b/tests/nvme/055.out
> new file mode 100644
> index 0000000..427dfee
> --- /dev/null
> +++ b/tests/nvme/055.out
> @@ -0,0 +1,3 @@
> +Running nvme/055
> +disconnected 1 controller(s)
> +Test complete
> --
> 2.40.0
>
* Re: [PATCH] nvme: test nvmet-wq sysfs interface
2024-11-08 5:21 ` Shinichiro Kawasaki
@ 2024-11-12 0:58 ` Chaitanya Kulkarni
0 siblings, 0 replies; 7+ messages in thread
From: Chaitanya Kulkarni @ 2024-11-12 0:58 UTC (permalink / raw)
To: Shinichiro Kawasaki, Chaitanya Kulkarni
Cc: kanie@linux.alibaba.com, linux-nvme@lists.infradead.org,
linux-block@vger.kernel.org, dwagner@suse.de
On 11/7/24 21:21, Shinichiro Kawasaki wrote:
> On Nov 04, 2024 / 11:29, Chaitanya Kulkarni wrote:
>> Add a test that randomly sets the cpumask from available CPUs for
>> the nvmet-wq while running the fio workload. This patch has been
>> tested on nvme-loop and nvme-tcp transport.
>>
>> Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
>
> Thanks. As I noted in another mail, this test case generates a kernel
> INFO message, and it looks like catching a new bug. So I think this
> test case worth adding. Please find my review comments in line.
>
>> ---
>> tests/nvme/055 | 99 ++++++++++++++++++++++++++++++++++++++++++++++
>> tests/nvme/055.out | 3 ++
>> 2 files changed, 102 insertions(+)
>> create mode 100755 tests/nvme/055
>> create mode 100644 tests/nvme/055.out
>>
>> diff --git a/tests/nvme/055 b/tests/nvme/055
>> new file mode 100755
>> index 0000000..9fe27a3
>> --- /dev/null
>> +++ b/tests/nvme/055
>> @@ -0,0 +1,99 @@
>> +#!/bin/bash
>> +# SPDX-License-Identifier: GPL-2.0+
>> +# Copyright (C) 2024 Chaitanya Kulkarni
>> +#
>> +# Test nvmet-wq cpumask sysfs attribute with NVMe-oF and fio workload
>> +#
>> +
>> +. tests/nvme/rc
>> +
>> +DESCRIPTION="Test nvmet-wq cpumask sysfs attribute with fio on NVMe-oF device"
>> +TIMED=1
>> +
>> +requires() {
>> + _nvme_requires
>> + _have_fio && _have_loop
>
> Nit: an unnecessary space.
>
>> + _require_nvme_trtype_is_fabrics
>> +}
>
done.
> I suggest to add set_conditions() here as below. Without it, the test case is
> skipped when users do not set the NVMET_TRTYPES variable. If we add set_conditions(),
> _have_loop call in requires() can be removed.
>
> set_conditions() {
> _set_nvme_trtype "$@"
> }
>
done.
>> +
>> +cleanup_setup() {
>> + _nvme_disconnect_subsys
>> + _nvmet_target_cleanup
>> +}
>> +
>> +test() {
>> + local cpumask_path="/sys/devices/virtual/workqueue/nvmet-wq/cpumask"
>> + local original_cpumask
>> + local min_cpus
>> + local max_cpus
>> + local numbers
>> + local idx
>> + local ns
>> +
>> + echo "Running ${TEST_NAME}"
>> +
>> + _setup_nvmet
>> + _nvmet_target_setup
>> + _nvme_connect_subsys
>> +
>> + if [ ! -f "$cpumask_path" ]; then
>> + SKIP_REASONS+=("nvmet_wq cpumask sysfs attribute not found.")
>> + cleanup_setup
>> + return 1
>> + fi
>> +
>> + ns=$(_find_nvme_ns "${def_subsys_uuid}")
>> +
>> + original_cpumask=$(cat "$cpumask_path")
>> +
>> + num_cpus=$(nproc)
>> + max_cpus=$(( num_cpus < 20 ? num_cpus : 20 ))
>> + min_cpus=0
>> + #shellcheck disable=SC2207
>> + numbers=($(seq $min_cpus $max_cpus))
>
> Nit: the shellcheck error can be avoided with this:
>
> read -a numbers -d '' < <(seq $min_cpus $max_cpus)
>
done.
>> +
>> + _run_fio_rand_io --filename="/dev/${ns}" --time_based --runtime=130s \
>
> The --time_based and --runtime=130s options are not required, because fio helper
> bash functions will add them.
>
> Instead, let's add this line before the fio command.
>
> : ${TIMEOUT:=60}
>
> The fio helper functions will reflect this TIMEOUT value in the --runtime
> option. This will allow users to control runtime cost as TIMED=1 indicates.
>
done.
>> + --iodepth=8 --size="${NVME_IMG_SIZE}" &> "$FULL" &
>> +
>> + # Let the fio settle down else we will break in the loop for fio check
>> + sleep 1
>> + for ((i = 0; i < max_cpus; i++)); do
>> + if ! pgrep -x fio &> /dev/null ; then
>
> pgrep command is not in the GNU coreutils. I suggest to keep the fio process id
> in a variable and check with "kill -0" command. The test case block/005 does it.
>
Thanks for the pointer; done, and it looks much cleaner.
>> + break
>> + fi
>> +
>> + if [[ ${#numbers[@]} -eq 0 ]]; then
>> + break
>> + fi
>> +
>> + idx=$((RANDOM % ${#numbers[@]}))
>> +
>> + #shellcheck disable=SC2004
>> + cpu_mask=$(printf "%X" $((1 << ${numbers[idx]})))
>> + echo "$cpu_mask" > "$cpumask_path"
>
> When I ran this test case, I observed an error at this line:
>
> echo: write error: Value too large for defined data type
>
> I think the cpu_mask calculation is wrong, and it should be like this:
>
> cpu_mask=0
> for ((n = 0; n < numbers[idx]; n++)); do
> cpu_mask=$((cpu_mask + (1 << n)))
> done
> cpu_mask=$(printf "%X" $((cpu_mask)))
>
>
I didn't see that error in my runs, but done. This version tried to
use only one CPU at a time; with your suggestion it will now consider
more than one CPU, which I think is useful too. I modified the
calculation to keep it simple for now.
>> + if [[ $(cat "$cpumask_path") =~ ^[0,]*${cpu_mask}\n$ ]]; then
>> + echo "Test Failed: cpumask was not set correctly "
>> + echo "Expected ${cpu_mask} found $(cat "$cpumask_path")"
>> + cleanup_setup
>> + return 1
>> + fi
>> + sleep 3
>> + # Remove the selected number
>> + numbers=("${numbers[@]:0:$idx}" "${numbers[@]:$((idx + 1))}")
>> + done
>> +
>> + killall fio &> /dev/null
>
> killall is not in GNU coreutils either (blktests already uses it at other
> places though...) Does "kill -9" work instead?
>
done.
Thanks for the review comments, I'll send the V2 soon.
-ck