* [PATCH blktests 0/2] Add scsi-stress-remove to blktests
@ 2018-12-12 23:09 Dennis Zhou
2018-12-12 23:09 ` [PATCH blktests 1/2] blktests: split out cgroup2 controller and file check Dennis Zhou
2018-12-12 23:09 ` [PATCH blktests 2/2] blktests: add Ming Lei's scsi-stress-remove Dennis Zhou
0 siblings, 2 replies; 11+ messages in thread
From: Dennis Zhou @ 2018-12-12 23:09 UTC (permalink / raw)
To: Omar Sandoval, Ming Lei; +Cc: kernel-team, linux-block, Dennis Zhou
Hi,
Ming Lei's scsi-stress-remove test found a bug in blkg destruction [1]
where bios being created when the request_queue was being cleaned up
threw a NPE in blkg association. The fix is currently being discussed in
[2]. To make this test more accessible, I've ported it to blktests with
Ming Lei's copyright. I've tested this in my qemu instance and verified
we do not see the NPE on a fixed kernel.
Ming, please let me know if you have any objections.
[1] https://lore.kernel.org/linux-block/CACVXFVO_QXipD3cmPvpLyBYSiEcWPN_ThQ=0pO9AwLqN-Lv93w@mail.gmail.com
[2] https://lore.kernel.org/lkml/20181211230308.66276-1-dennis@kernel.org/
This patchset is on top of osandov#josef ad08c1fe0d9f.
diffstats below:
Dennis Zhou (2):
blktests: split out cgroup2 controller and file check
blktests: add Ming Lei's scsi-stress-remove
common/cgroup | 18 ++++++---
tests/block/022 | 96 +++++++++++++++++++++++++++++++++++++++++++++
tests/block/022.out | 2 +
3 files changed, 111 insertions(+), 5 deletions(-)
create mode 100755 tests/block/022
create mode 100644 tests/block/022.out
Thanks,
Dennis
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH blktests 1/2] blktests: split out cgroup2 controller and file check
2018-12-12 23:09 [PATCH blktests 0/2] Add scsi-stress-remove to blktests Dennis Zhou
@ 2018-12-12 23:09 ` Dennis Zhou
2018-12-19 18:34 ` Omar Sandoval
2018-12-12 23:09 ` [PATCH blktests 2/2] blktests: add Ming Lei's scsi-stress-remove Dennis Zhou
1 sibling, 1 reply; 11+ messages in thread
From: Dennis Zhou @ 2018-12-12 23:09 UTC (permalink / raw)
To: Omar Sandoval, Ming Lei; +Cc: kernel-team, linux-block, Dennis Zhou
This is a prep patch for a new test that will race blkg association and
request_queue cleanup. As blkg association is an underlying cgroup io
controller feature, we need the ability to check if the controller is
available.
Signed-off-by: Dennis Zhou <dennis@kernel.org>
---
common/cgroup | 18 +++++++++++++-----
1 file changed, 13 insertions(+), 5 deletions(-)
diff --git a/common/cgroup b/common/cgroup
index d445093..3481458 100644
--- a/common/cgroup
+++ b/common/cgroup
@@ -37,19 +37,27 @@ _have_cgroup2()
return 0
}
-_have_cgroup2_controller_file()
+_have_cgroup2_controller()
{
- _have_cgroup2 || return 1
-
local controller="$1"
- local file="$2"
- local dir
+
+ _have_cgroup2 || return 1
dir="$(_cgroup2_base_dir)"
+
if ! grep -q "$controller" "$dir/cgroup.controllers"; then
SKIP_REASON="no support for $controller cgroup controller; if it is enabled, you may need to boot with cgroup_no_v1=$controller"
return 1
fi
+}
+
+_have_cgroup2_controller_file()
+{
+ local controller="$1"
+ local file="$2"
+ local dir
+
+ _have_cgroup_2_controller "$controller" || return 1
mkdir "$dir/blktests"
echo "+$controller" > "$dir/cgroup.subtree_control"
--
2.17.1
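
For readers who want to try the controller check outside blktests, here
is a minimal sketch of the same idea. The helper name have_controller
and its explicit file argument are hypothetical and not part of
common/cgroup. It also uses grep -w, which is a deliberate difference
from the patch: with plain grep -q, "cpu" would also match "cpuset".
The demo runs against a fake cgroup.controllers file, so no cgroup2
mount is required.

```shell
#!/bin/bash
# Hypothetical stand-alone variant of the controller check. The helper name
# and the explicit file argument are illustrative, not blktests API.
have_controller() {
	local controllers_file="$1"
	local controller="$2"

	# -w matches whole words only, so "cpu" does not match "cpuset".
	grep -qw -- "$controller" "$controllers_file"
}

# Demo against a fake cgroup.controllers file (assumed contents).
demo_file="$(mktemp)"
echo "cpuset cpu io memory pids" > "$demo_file"

for c in io cpu blkio; do
	if have_controller "$demo_file" "$c"; then
		echo "$c: available"
	else
		echo "$c: missing"
	fi
done

rm -f "$demo_file"
```

In blktests itself the file would be "$dir/cgroup.controllers" under the
cgroup2 base directory, as in the patch above.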
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH blktests 1/2] blktests: split out cgroup2 controller and file check
2018-12-12 23:09 ` [PATCH blktests 1/2] blktests: split out cgroup2 controller and file check Dennis Zhou
@ 2018-12-19 18:34 ` Omar Sandoval
0 siblings, 0 replies; 11+ messages in thread
From: Omar Sandoval @ 2018-12-19 18:34 UTC (permalink / raw)
To: Dennis Zhou; +Cc: Omar Sandoval, Ming Lei, kernel-team, linux-block
On Wed, Dec 12, 2018 at 06:09:58PM -0500, Dennis Zhou wrote:
> This is a prep patch for a new test that will race blkg association and
> request_queue cleanup. As blkg association is a underlying cgroup io
> controller feature, we need the ability to check if the controller is
> available.
>
> Signed-off-by: Dennis Zhou <dennis@kernel.org>
> ---
> common/cgroup | 18 +++++++++++++-----
> 1 file changed, 13 insertions(+), 5 deletions(-)
>
> diff --git a/common/cgroup b/common/cgroup
> index d445093..3481458 100644
> --- a/common/cgroup
> +++ b/common/cgroup
> @@ -37,19 +37,27 @@ _have_cgroup2()
> return 0
> }
>
> -_have_cgroup2_controller_file()
> +_have_cgroup2_controller()
> {
> - _have_cgroup2 || return 1
> -
> local controller="$1"
> - local file="$2"
> - local dir
> +
> + _have_cgroup2 || return 1
>
> dir="$(_cgroup2_base_dir)"
> +
> if ! grep -q "$controller" "$dir/cgroup.controllers"; then
> SKIP_REASON="no support for $controller cgroup controller; if it is enabled, you may need to boot with cgroup_no_v1=$controller"
> return 1
> fi
> +}
> +
> +_have_cgroup2_controller_file()
> +{
> + local controller="$1"
> + local file="$2"
> + local dir
> +
> + _have_cgroup_2_controller "$controller" || return 1
This should be _have_cgroup2_controller. I'll fix it when I apply it.
>
> mkdir "$dir/blktests"
> echo "+$controller" > "$dir/cgroup.subtree_control"
> --
> 2.17.1
>
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH blktests 2/2] blktests: add Ming Lei's scsi-stress-remove
2018-12-12 23:09 [PATCH blktests 0/2] Add scsi-stress-remove to blktests Dennis Zhou
2018-12-12 23:09 ` [PATCH blktests 1/2] blktests: split out cgroup2 controller and file check Dennis Zhou
@ 2018-12-12 23:09 ` Dennis Zhou
2018-12-13 1:24 ` Ming Lei
2018-12-13 18:28 ` [PATCH blktests v2 " Dennis Zhou
1 sibling, 2 replies; 11+ messages in thread
From: Dennis Zhou @ 2018-12-12 23:09 UTC (permalink / raw)
To: Omar Sandoval, Ming Lei; +Cc: kernel-team, linux-block, Dennis Zhou
This test exposed a race condition with shutting down a request_queue
and the new blkg association. The issue ended up being that while the
request_queue will just start failing requests, blkg destruction sets
the q->root_blkg to %NULL. This caused a NPE when trying to reference
it. So to help prevent this from happening again, integrate Ming's test
into blktests so that it can more easily be run.
Signed-off-by: Dennis Zhou <dennis@kernel.org>
Cc: Ming Lei <ming.lei@redhat.com>
---
tests/block/022 | 96 +++++++++++++++++++++++++++++++++++++++++++++
tests/block/022.out | 2 +
2 files changed, 98 insertions(+)
create mode 100755 tests/block/022
create mode 100644 tests/block/022.out
diff --git a/tests/block/022 b/tests/block/022
new file mode 100755
index 0000000..45bfff7
--- /dev/null
+++ b/tests/block/022
@@ -0,0 +1,96 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-3.0+
+# Copyright (C) 2018 Ming Lei
+#
+# Regression test for patch "blkcg: handle dying request_queue when associating
+# a blkg"
+#
+# This tries to expose the race condition between blkg association and
+# request_queue shutdown. When a request_queue is shutdown, the corresponding
+# blkgs are destroyed. Any further associations should fail gracefully and not
+# cause a kernel panic.
+
+. tests/block/rc
+. common/scsi_debug
+. common/cgroup
+
+DESCRIPTION="test graceful shutdown of scsi_debug devices with running fio jobs"
+QUICK=1
+
+requires() {
+ _have_cgroup2_controller io && _have_scsi_debug && _have_fio
+}
+
+scsi_debug_stress_remove() {
+ scsi_debug_path="/sys/bus/pseudo/drivers/scsi_debug"
+ count=21
+
+ runtime=12
+ nr_fio_jobs=8
+ scsi_dbg_ndelay=10000
+
+ # set higher aio limit
+ echo 524288 > /proc/sys/fs/aio-max-nr
+
+ #figure out the CAN_QUEUE
+ can_queue=$(((count + 1) * (count / 2) / 2))
+
+ rmmod scsi_debug > /dev/null 2>&1
+ modprobe scsi_debug virtual_gb=128 max_luns=$count \
+ ndelay=$scsi_dbg_ndelay max_queue=$can_queue
+
+ # figure out scsi_debug disks
+ hosts=$(ls -d /sys/bus/pseudo/drivers/scsi_debug/adapter0/host*)
+ hostname=$(basename "$hosts")
+ host=$(echo "$hostname" | grep -o -E '[0-9]+')
+
+ sdisks=$(ls -d $scsi_debug_path/adapter*/"$hostname"/target*/*/block/*)
+ disks=""
+ for sd in $sdisks; do
+ disks+="/dev/"$(basename "$sd")
+ disks+=" "
+ done
+
+ use_mq=$(cat /sys/module/scsi_mod/parameters/use_blk_mq)
+ if [[ $use_mq = "Y" ]]; then
+ scheds=("none" "mq-deadline" "kyber")
+ else
+ scheds=("noop" "deadline" "cfq")
+ fi
+
+ fio_jobs=""
+ cnt=0
+ for sd in $disks; do
+ cnt=$((cnt+1))
+ fio_jobs=$fio_jobs" --name=job1 --filename=$sd: "
+ dev_name=$(basename "$sd")
+ q_path=/sys/block/$dev_name/queue
+
+ sched_idx=$((cnt % ${#scheds[@]}))
+ echo "${scheds[$sched_idx]}" > "$q_path/scheduler"
+ echo $cnt > "$q_path/../device/queue_depth"
+ done
+
+ fio --rw=randread --size=128G --direct=1 --ioengine=libaio \
+ --iodepth=2048 --numjobs=$nr_fio_jobs --bs=4k \
+ --group_reporting=1 --group_reporting=1 --runtime=$runtime \
+ --loops=10000 "$fio_jobs" > "$FULL" 2>&1 &
+
+ sleep 7
+ for sd in $disks; do
+ dev_name=$(basename "$sd")
+ dpath=/sys/block/$dev_name/device
+ [ -f "$dpath/delete" ] && echo 1 > "$dpath/delete"
+ done
+
+ wait
+}
+
+
+test() {
+ echo "Running ${TEST_NAME}"
+
+ scsi_debug_stress_remove
+
+ echo "Test complete"
+}
diff --git a/tests/block/022.out b/tests/block/022.out
new file mode 100644
index 0000000..14d43cb
--- /dev/null
+++ b/tests/block/022.out
@@ -0,0 +1,2 @@
+Running block/022
+Test complete
--
2.17.1
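
The CAN_QUEUE expression in the test relies on bash integer arithmetic,
so count / 2 truncates before the multiplication. A small worked sketch
with the test's own count=21 is below. The variable half is introduced
only for illustration and does not appear in the patch.

```shell
#!/bin/bash
# Worked example of the can_queue computation from the test, count=21.
count=21

# Integer division truncates: 21 / 2 -> 10.
half=$((count / 2))

# Same value as $(((count + 1) * (count / 2) / 2)): 22 * 10 / 2.
can_queue=$(((count + 1) * half / 2))

echo "count=$count half=$half can_queue=$can_queue"
```

With these inputs the script prints can_queue=110, which is the value
passed to scsi_debug as max_queue.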
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH blktests 2/2] blktests: add Ming Lei's scsi-stress-remove
2018-12-12 23:09 ` [PATCH blktests 2/2] blktests: add Ming Lei's scsi-stress-remove Dennis Zhou
@ 2018-12-13 1:24 ` Ming Lei
2018-12-13 18:21 ` Dennis Zhou
2018-12-13 18:28 ` [PATCH blktests v2 " Dennis Zhou
1 sibling, 1 reply; 11+ messages in thread
From: Ming Lei @ 2018-12-13 1:24 UTC (permalink / raw)
To: Dennis Zhou; +Cc: Omar Sandoval, kernel-team, linux-block
On Wed, Dec 12, 2018 at 06:09:59PM -0500, Dennis Zhou wrote:
> This test exposed a race condition with shutting down a request_queue
> and the new blkg association. The issue ended up being that while the
> request_queue will just start failing requests, blkg destruction sets
> the q->root_blkg to %NULL. This caused a NPE when trying to reference
> it. So to help prevent this from happening again, integrate Ming's test
> into blktests so that it can more easily be ran.
>
> Signed-off-by: Dennis Zhou <dennis@kernel.org>
> Cc: Ming Lei <ming.lei@redhat.com>
> ---
> tests/block/022 | 96 +++++++++++++++++++++++++++++++++++++++++++++
> tests/block/022.out | 2 +
> 2 files changed, 98 insertions(+)
> create mode 100755 tests/block/022
> create mode 100644 tests/block/022.out
>
> diff --git a/tests/block/022 b/tests/block/022
> new file mode 100755
> index 0000000..45bfff7
> --- /dev/null
> +++ b/tests/block/022
> @@ -0,0 +1,96 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-3.0+
> +# Copyright (C) 2018 Ming Lei
> +#
> +# Regression test for patch "blkcg: handle dying request_queue when associating
> +# a blkg"
> +#
> +# This tries to expose the race condition between blkg association and
> +# request_queue shutdown. When a request_queue is shutdown, the corresponding
> +# blkgs are destroyed. Any further associations should fail gracefully and not
> +# cause a kernel panic.
> +
> +. tests/block/rc
> +. common/scsi_debug
> +. common/cgroup
> +
> +DESCRIPTION="test graceful shutdown of scsi_debug devices with running fio jobs"
> +QUICK=1
> +
> +requires() {
> + _have_cgroup2_controller io && _have_scsi_debug && _have_fio
> +}
> +
> +scsi_debug_stress_remove() {
> + scsi_debug_path="/sys/bus/pseudo/drivers/scsi_debug"
> + count=21
> +
> + runtime=12
> + nr_fio_jobs=8
> + scsi_dbg_ndelay=10000
> +
> + # set higher aio limit
> + echo 524288 > /proc/sys/fs/aio-max-nr
> +
> + #figure out the CAN_QUEUE
> + can_queue=$(((count + 1) * (count / 2) / 2))
> +
> + rmmod scsi_debug > /dev/null 2>&1
> + modprobe scsi_debug virtual_gb=128 max_luns=$count \
> + ndelay=$scsi_dbg_ndelay max_queue=$can_queue
> +
> + # figure out scsi_debug disks
> + hosts=$(ls -d /sys/bus/pseudo/drivers/scsi_debug/adapter0/host*)
> + hostname=$(basename "$hosts")
> + host=$(echo "$hostname" | grep -o -E '[0-9]+')
> +
> + sdisks=$(ls -d $scsi_debug_path/adapter*/"$hostname"/target*/*/block/*)
> + disks=""
> + for sd in $sdisks; do
> + disks+="/dev/"$(basename "$sd")
> + disks+=" "
> + done
> +
> + use_mq=$(cat /sys/module/scsi_mod/parameters/use_blk_mq)
> + if [[ $use_mq = "Y" ]]; then
> + scheds=("none" "mq-deadline" "kyber")
> + else
> + scheds=("noop" "deadline" "cfq")
> + fi
You can use the following to figure out all supported io schedulers,
especially since we have removed all the legacy io schedulers.
IOSCHEDS=`sed 's/[][]//g' $Q_PATH/scheduler`
Thanks,
Ming
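
A minimal sketch of how the suggested sed expression behaves. The sample
line stands in for /sys/block/<dev>/queue/scheduler and is an assumption
for illustration; the real contents depend on the kernel configuration.
The round-robin index mirrors the cnt-based selection in the test loop.

```shell
#!/bin/bash
# Sample scheduler file contents (assumed). The active scheduler is the
# one shown in brackets.
sample="[mq-deadline] kyber bfq none"

# The bracket expression [][] matches a literal ']' or '[', so sed strips
# both and leaves a plain space-separated list.
read -r -a scheds <<< "$(sed 's/[][]//g' <<< "$sample")"

# Pick a scheduler round-robin by device index, as the test loop does.
for cnt in 1 2 3 4 5; do
	idx=$((cnt % ${#scheds[@]}))
	echo "disk $cnt -> ${scheds[$idx]}"
done
```

Reading the list per device avoids hard-coding scheduler names that may
not exist on a given kernel.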
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH blktests 2/2] blktests: add Ming Lei's scsi-stress-remove
2018-12-13 1:24 ` Ming Lei
@ 2018-12-13 18:21 ` Dennis Zhou
0 siblings, 0 replies; 11+ messages in thread
From: Dennis Zhou @ 2018-12-13 18:21 UTC (permalink / raw)
To: Ming Lei; +Cc: Dennis Zhou, Omar Sandoval, kernel-team, linux-block
On Thu, Dec 13, 2018 at 09:24:09AM +0800, Ming Lei wrote:
> On Wed, Dec 12, 2018 at 06:09:59PM -0500, Dennis Zhou wrote:
> > This test exposed a race condition with shutting down a request_queue
> > and the new blkg association. The issue ended up being that while the
> > request_queue will just start failing requests, blkg destruction sets
> > the q->root_blkg to %NULL. This caused a NPE when trying to reference
> > it. So to help prevent this from happening again, integrate Ming's test
> > into blktests so that it can more easily be ran.
> >
> > Signed-off-by: Dennis Zhou <dennis@kernel.org>
> > Cc: Ming Lei <ming.lei@redhat.com>
> > ---
> > tests/block/022 | 96 +++++++++++++++++++++++++++++++++++++++++++++
> > tests/block/022.out | 2 +
> > 2 files changed, 98 insertions(+)
> > create mode 100755 tests/block/022
> > create mode 100644 tests/block/022.out
> >
> > diff --git a/tests/block/022 b/tests/block/022
> > new file mode 100755
> > index 0000000..45bfff7
> > --- /dev/null
> > +++ b/tests/block/022
> > @@ -0,0 +1,96 @@
> > +#!/bin/bash
> > +# SPDX-License-Identifier: GPL-3.0+
> > +# Copyright (C) 2018 Ming Lei
> > +#
> > +# Regression test for patch "blkcg: handle dying request_queue when associating
> > +# a blkg"
> > +#
> > +# This tries to expose the race condition between blkg association and
> > +# request_queue shutdown. When a request_queue is shutdown, the corresponding
> > +# blkgs are destroyed. Any further associations should fail gracefully and not
> > +# cause a kernel panic.
> > +
> > +. tests/block/rc
> > +. common/scsi_debug
> > +. common/cgroup
> > +
> > +DESCRIPTION="test graceful shutdown of scsi_debug devices with running fio jobs"
> > +QUICK=1
> > +
> > +requires() {
> > + _have_cgroup2_controller io && _have_scsi_debug && _have_fio
> > +}
> > +
> > +scsi_debug_stress_remove() {
> > + scsi_debug_path="/sys/bus/pseudo/drivers/scsi_debug"
> > + count=21
> > +
> > + runtime=12
> > + nr_fio_jobs=8
> > + scsi_dbg_ndelay=10000
> > +
> > + # set higher aio limit
> > + echo 524288 > /proc/sys/fs/aio-max-nr
> > +
> > + #figure out the CAN_QUEUE
> > + can_queue=$(((count + 1) * (count / 2) / 2))
> > +
> > + rmmod scsi_debug > /dev/null 2>&1
> > + modprobe scsi_debug virtual_gb=128 max_luns=$count \
> > + ndelay=$scsi_dbg_ndelay max_queue=$can_queue
> > +
> > + # figure out scsi_debug disks
> > + hosts=$(ls -d /sys/bus/pseudo/drivers/scsi_debug/adapter0/host*)
> > + hostname=$(basename "$hosts")
> > + host=$(echo "$hostname" | grep -o -E '[0-9]+')
> > +
> > + sdisks=$(ls -d $scsi_debug_path/adapter*/"$hostname"/target*/*/block/*)
> > + disks=""
> > + for sd in $sdisks; do
> > + disks+="/dev/"$(basename "$sd")
> > + disks+=" "
> > + done
> > +
> > + use_mq=$(cat /sys/module/scsi_mod/parameters/use_blk_mq)
> > + if [[ $use_mq = "Y" ]]; then
> > + scheds=("none" "mq-deadline" "kyber")
> > + else
> > + scheds=("noop" "deadline" "cfq")
> > + fi
>
> You may use the following to figure out all supported io schedulers,
> especially we have removed all legacy io schedulers.
>
> IOSCHEDS=`sed 's/[][]//g' $Q_PATH/scheduler`
>
>
> Thanks,
> Ming
Thanks Ming! I'll post a v2 update for this patch.
Thanks,
Dennis
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH blktests v2 2/2] blktests: add Ming Lei's scsi-stress-remove
2018-12-12 23:09 ` [PATCH blktests 2/2] blktests: add Ming Lei's scsi-stress-remove Dennis Zhou
2018-12-13 1:24 ` Ming Lei
@ 2018-12-13 18:28 ` Dennis Zhou
2018-12-14 0:31 ` Ming Lei
2018-12-19 22:49 ` Omar Sandoval
1 sibling, 2 replies; 11+ messages in thread
From: Dennis Zhou @ 2018-12-13 18:28 UTC (permalink / raw)
To: Omar Sandoval, Ming Lei; +Cc: kernel-team, linux-block, Dennis Zhou
This test exposed a race condition with shutting down a request_queue
and the new blkg association. The issue ended up being that while the
request_queue will just start failing requests, blkg destruction sets
the q->root_blkg to %NULL. This caused a NPE when trying to reference
it. So to help prevent this from happening again, integrate Ming's test
into blktests so that it can more easily be run.
Signed-off-by: Dennis Zhou <dennis@kernel.org>
Cc: Ming Lei <ming.lei@redhat.com>
---
v2:
- Change scheduler retrieving logic based on Ming's comment
tests/block/022 | 90 +++++++++++++++++++++++++++++++++++++++++++++
tests/block/022.out | 2 +
2 files changed, 92 insertions(+)
create mode 100755 tests/block/022
create mode 100644 tests/block/022.out
diff --git a/tests/block/022 b/tests/block/022
new file mode 100755
index 0000000..84336e0
--- /dev/null
+++ b/tests/block/022
@@ -0,0 +1,90 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-3.0+
+# Copyright (C) 2018 Ming Lei
+#
+# Regression test for patch "blkcg: handle dying request_queue when associating
+# a blkg"
+#
+# This tries to expose the race condition between blkg association and
+# request_queue shutdown. When a request_queue is shutdown, the corresponding
+# blkgs are destroyed. Any further associations should fail gracefully and not
+# cause a kernel panic.
+
+. tests/block/rc
+. common/scsi_debug
+. common/cgroup
+
+DESCRIPTION="test graceful shutdown of scsi_debug devices with running fio jobs"
+QUICK=1
+
+requires() {
+ _have_cgroup2_controller io && _have_scsi_debug && _have_fio
+}
+
+scsi_debug_stress_remove() {
+ scsi_debug_path="/sys/bus/pseudo/drivers/scsi_debug"
+ count=21
+
+ runtime=12
+ nr_fio_jobs=8
+ scsi_dbg_ndelay=10000
+
+ # set higher aio limit
+ echo 524288 > /proc/sys/fs/aio-max-nr
+
+ #figure out the CAN_QUEUE
+ can_queue=$(((count + 1) * (count / 2) / 2))
+
+ rmmod scsi_debug > /dev/null 2>&1
+ modprobe scsi_debug virtual_gb=128 max_luns=$count \
+ ndelay=$scsi_dbg_ndelay max_queue=$can_queue
+
+ # figure out scsi_debug disks
+ hosts=$(ls -d /sys/bus/pseudo/drivers/scsi_debug/adapter0/host*)
+ hostname=$(basename "$hosts")
+ host=$(echo "$hostname" | grep -o -E '[0-9]+')
+
+ sdisks=$(ls -d $scsi_debug_path/adapter*/"$hostname"/target*/*/block/*)
+ disks=""
+ for sd in $sdisks; do
+ disks+="/dev/"$(basename "$sd")
+ disks+=" "
+ done
+
+ fio_jobs=""
+ cnt=0
+ for sd in $disks; do
+ cnt=$((cnt+1))
+ fio_jobs=$fio_jobs" --name=job1 --filename=$sd: "
+ dev_name=$(basename "$sd")
+ q_path=/sys/block/$dev_name/queue
+
+ scheds=($(sed 's/[][]//g' "$q_path/scheduler"))
+ sched_idx=$((cnt % ${#scheds[@]}))
+ echo "${scheds[$sched_idx]}" > "$q_path/scheduler"
+ echo $cnt > "$q_path/../device/queue_depth"
+ done
+
+ fio --rw=randread --size=128G --direct=1 --ioengine=libaio \
+ --iodepth=2048 --numjobs=$nr_fio_jobs --bs=4k \
+ --group_reporting=1 --group_reporting=1 --runtime=$runtime \
+ --loops=10000 "$fio_jobs" > "$FULL" 2>&1 &
+
+ sleep 7
+ for sd in $disks; do
+ dev_name=$(basename "$sd")
+ dpath=/sys/block/$dev_name/device
+ [ -f "$dpath/delete" ] && echo 1 > "$dpath/delete"
+ done
+
+ wait
+}
+
+
+test() {
+ echo "Running ${TEST_NAME}"
+
+ scsi_debug_stress_remove
+
+ echo "Test complete"
+}
diff --git a/tests/block/022.out b/tests/block/022.out
new file mode 100644
index 0000000..14d43cb
--- /dev/null
+++ b/tests/block/022.out
@@ -0,0 +1,2 @@
+Running block/022
+Test complete
--
2.17.1
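
A note on the fio invocation in the test: because "$fio_jobs" is quoted,
bash hands the whole accumulated string to fio as a single argument
rather than as separate --name/--filename options. One possible way to
keep the quoting and still pass separate options is a bash array. The
sketch below is only an illustration under that assumption, not the
code that was applied. The device names are hypothetical, and the
trailing ':' from the original --filename value is dropped here.

```shell
#!/bin/bash
# Hypothetical device list; in the test it comes from the scsi_debug
# sysfs walk.
disks="/dev/sda /dev/sdb"

# Collect each option as its own array element.
fio_jobs=()
for sd in $disks; do
	fio_jobs+=("--name=job1" "--filename=$sd")
done

# "${fio_jobs[@]}" expands to one word per element, so a command given
# this expansion receives each option separately. printf shows them one
# per line.
printf '%s\n' "${fio_jobs[@]}"
echo "argument count: ${#fio_jobs[@]}"
```

The fio command line would then end in "${fio_jobs[@]}" instead of
"$fio_jobs".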
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH blktests v2 2/2] blktests: add Ming Lei's scsi-stress-remove
2018-12-13 18:28 ` [PATCH blktests v2 " Dennis Zhou
@ 2018-12-14 0:31 ` Ming Lei
2018-12-19 22:49 ` Omar Sandoval
1 sibling, 0 replies; 11+ messages in thread
From: Ming Lei @ 2018-12-14 0:31 UTC (permalink / raw)
To: Dennis Zhou; +Cc: Omar Sandoval, kernel-team, linux-block
On Thu, Dec 13, 2018 at 01:28:44PM -0500, Dennis Zhou wrote:
> This test exposed a race condition with shutting down a request_queue
> and the new blkg association. The issue ended up being that while the
> request_queue will just start failing requests, blkg destruction sets
> the q->root_blkg to %NULL. This caused a NPE when trying to reference
> it. So to help prevent this from happening again, integrate Ming's test
> into blktests so that it can more easily be ran.
>
> Signed-off-by: Dennis Zhou <dennis@kernel.org>
> Cc: Ming Lei <ming.lei@redhat.com>
> ---
> v2:
> - Change scheduler retrieving logic based on Ming's comment
>
> tests/block/022 | 90 +++++++++++++++++++++++++++++++++++++++++++++
> tests/block/022.out | 2 +
> 2 files changed, 92 insertions(+)
> create mode 100755 tests/block/022
> create mode 100644 tests/block/022.out
>
> diff --git a/tests/block/022 b/tests/block/022
> new file mode 100755
> index 0000000..84336e0
> --- /dev/null
> +++ b/tests/block/022
> @@ -0,0 +1,90 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-3.0+
> +# Copyright (C) 2018 Ming Lei
> +#
> +# Regression test for patch "blkcg: handle dying request_queue when associating
> +# a blkg"
> +#
> +# This tries to expose the race condition between blkg association and
> +# request_queue shutdown. When a request_queue is shutdown, the corresponding
> +# blkgs are destroyed. Any further associations should fail gracefully and not
> +# cause a kernel panic.
> +
> +. tests/block/rc
> +. common/scsi_debug
> +. common/cgroup
> +
> +DESCRIPTION="test graceful shutdown of scsi_debug devices with running fio jobs"
> +QUICK=1
> +
> +requires() {
> + _have_cgroup2_controller io && _have_scsi_debug && _have_fio
> +}
> +
> +scsi_debug_stress_remove() {
> + scsi_debug_path="/sys/bus/pseudo/drivers/scsi_debug"
> + count=21
> +
> + runtime=12
> + nr_fio_jobs=8
> + scsi_dbg_ndelay=10000
> +
> + # set higher aio limit
> + echo 524288 > /proc/sys/fs/aio-max-nr
> +
> + #figure out the CAN_QUEUE
> + can_queue=$(((count + 1) * (count / 2) / 2))
> +
> + rmmod scsi_debug > /dev/null 2>&1
> + modprobe scsi_debug virtual_gb=128 max_luns=$count \
> + ndelay=$scsi_dbg_ndelay max_queue=$can_queue
> +
> + # figure out scsi_debug disks
> + hosts=$(ls -d /sys/bus/pseudo/drivers/scsi_debug/adapter0/host*)
> + hostname=$(basename "$hosts")
> + host=$(echo "$hostname" | grep -o -E '[0-9]+')
> +
> + sdisks=$(ls -d $scsi_debug_path/adapter*/"$hostname"/target*/*/block/*)
> + disks=""
> + for sd in $sdisks; do
> + disks+="/dev/"$(basename "$sd")
> + disks+=" "
> + done
> +
> + fio_jobs=""
> + cnt=0
> + for sd in $disks; do
> + cnt=$((cnt+1))
> + fio_jobs=$fio_jobs" --name=job1 --filename=$sd: "
> + dev_name=$(basename "$sd")
> + q_path=/sys/block/$dev_name/queue
> +
> + scheds=($(sed 's/[][]//g' "$q_path/scheduler"))
> + sched_idx=$((cnt % ${#scheds[@]}))
> + echo "${scheds[$sched_idx]}" > "$q_path/scheduler"
> + echo $cnt > "$q_path/../device/queue_depth"
> + done
> +
> + fio --rw=randread --size=128G --direct=1 --ioengine=libaio \
> + --iodepth=2048 --numjobs=$nr_fio_jobs --bs=4k \
> + --group_reporting=1 --group_reporting=1 --runtime=$runtime \
> + --loops=10000 "$fio_jobs" > "$FULL" 2>&1 &
> +
> + sleep 7
> + for sd in $disks; do
> + dev_name=$(basename "$sd")
> + dpath=/sys/block/$dev_name/device
> + [ -f "$dpath/delete" ] && echo 1 > "$dpath/delete"
> + done
> +
> + wait
> +}
> +
> +
> +test() {
> + echo "Running ${TEST_NAME}"
> +
> + scsi_debug_stress_remove
> +
> + echo "Test complete"
> +}
> diff --git a/tests/block/022.out b/tests/block/022.out
> new file mode 100644
> index 0000000..14d43cb
> --- /dev/null
> +++ b/tests/block/022.out
> @@ -0,0 +1,2 @@
> +Running block/022
> +Test complete
> --
> 2.17.1
>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
thanks,
Ming
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH blktests v2 2/2] blktests: add Ming Lei's scsi-stress-remove
2018-12-13 18:28 ` [PATCH blktests v2 " Dennis Zhou
2018-12-14 0:31 ` Ming Lei
@ 2018-12-19 22:49 ` Omar Sandoval
2018-12-19 22:57 ` Dennis Zhou
1 sibling, 1 reply; 11+ messages in thread
From: Omar Sandoval @ 2018-12-19 22:49 UTC (permalink / raw)
To: Dennis Zhou; +Cc: Omar Sandoval, Ming Lei, kernel-team, linux-block
On Thu, Dec 13, 2018 at 01:28:44PM -0500, Dennis Zhou wrote:
> This test exposed a race condition with shutting down a request_queue
> and the new blkg association. The issue ended up being that while the
> request_queue will just start failing requests, blkg destruction sets
> the q->root_blkg to %NULL. This caused a NPE when trying to reference
> it. So to help prevent this from happening again, integrate Ming's test
> into blktests so that it can more easily be ran.
>
> Signed-off-by: Dennis Zhou <dennis@kernel.org>
> Cc: Ming Lei <ming.lei@redhat.com>
> ---
> v2:
> - Change scheduler retrieving logic based on Ming's comment
>
> tests/block/022 | 90 +++++++++++++++++++++++++++++++++++++++++++++
> tests/block/022.out | 2 +
> 2 files changed, 92 insertions(+)
> create mode 100755 tests/block/022
> create mode 100644 tests/block/022.out
>
> diff --git a/tests/block/022 b/tests/block/022
> new file mode 100755
> index 0000000..84336e0
> --- /dev/null
> +++ b/tests/block/022
> @@ -0,0 +1,90 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-3.0+
> +# Copyright (C) 2018 Ming Lei
> +#
> +# Regression test for patch "blkcg: handle dying request_queue when associating
> +# a blkg"
> +#
> +# This tries to expose the race condition between blkg association and
> +# request_queue shutdown. When a request_queue is shutdown, the corresponding
> +# blkgs are destroyed. Any further associations should fail gracefully and not
> +# cause a kernel panic.
> +
> +. tests/block/rc
> +. common/scsi_debug
> +. common/cgroup
> +
> +DESCRIPTION="test graceful shutdown of scsi_debug devices with running fio jobs"
> +QUICK=1
> +
> +requires() {
> + _have_cgroup2_controller io && _have_scsi_debug && _have_fio
> +}
> +
> +scsi_debug_stress_remove() {
> + scsi_debug_path="/sys/bus/pseudo/drivers/scsi_debug"
> + count=21
> +
> + runtime=12
> + nr_fio_jobs=8
> + scsi_dbg_ndelay=10000
> +
> + # set higher aio limit
> + echo 524288 > /proc/sys/fs/aio-max-nr
> +
> + #figure out the CAN_QUEUE
> + can_queue=$(((count + 1) * (count / 2) / 2))
> +
> + rmmod scsi_debug > /dev/null 2>&1
> + modprobe scsi_debug virtual_gb=128 max_luns=$count \
> + ndelay=$scsi_dbg_ndelay max_queue=$can_queue
> +
> + # figure out scsi_debug disks
> + hosts=$(ls -d /sys/bus/pseudo/drivers/scsi_debug/adapter0/host*)
> + hostname=$(basename "$hosts")
> + host=$(echo "$hostname" | grep -o -E '[0-9]+')
> +
> + sdisks=$(ls -d $scsi_debug_path/adapter*/"$hostname"/target*/*/block/*)
> + disks=""
> + for sd in $sdisks; do
> + disks+="/dev/"$(basename "$sd")
> + disks+=" "
> + done
blktests has _init_scsi_debug which does all of this for you. And,
block/001 is very similar to this test, just without the fio workload or
changing schedulers. Could you please rework this to be based on
block/001?
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH blktests v2 2/2] blktests: add Ming Lei's scsi-stress-remove
2018-12-19 22:49 ` Omar Sandoval
@ 2018-12-19 22:57 ` Dennis Zhou
0 siblings, 0 replies; 11+ messages in thread
From: Dennis Zhou @ 2018-12-19 22:57 UTC (permalink / raw)
To: Omar Sandoval
Cc: Dennis Zhou, Omar Sandoval, Ming Lei, kernel-team, linux-block
On Wed, Dec 19, 2018 at 02:49:42PM -0800, Omar Sandoval wrote:
> On Thu, Dec 13, 2018 at 01:28:44PM -0500, Dennis Zhou wrote:
> > This test exposed a race condition with shutting down a request_queue
> > and the new blkg association. The issue ended up being that while the
> > request_queue will just start failing requests, blkg destruction sets
> > the q->root_blkg to %NULL. This caused a NPE when trying to reference
> > it. So to help prevent this from happening again, integrate Ming's test
> > into blktests so that it can more easily be ran.
> >
> > Signed-off-by: Dennis Zhou <dennis@kernel.org>
> > Cc: Ming Lei <ming.lei@redhat.com>
> > ---
> > v2:
> > - Change scheduler retrieving logic based on Ming's comment
> >
> > tests/block/022 | 90 +++++++++++++++++++++++++++++++++++++++++++++
> > tests/block/022.out | 2 +
> > 2 files changed, 92 insertions(+)
> > create mode 100755 tests/block/022
> > create mode 100644 tests/block/022.out
> >
> > diff --git a/tests/block/022 b/tests/block/022
> > new file mode 100755
> > index 0000000..84336e0
> > --- /dev/null
> > +++ b/tests/block/022
> > @@ -0,0 +1,90 @@
> > +#!/bin/bash
> > +# SPDX-License-Identifier: GPL-3.0+
> > +# Copyright (C) 2018 Ming Lei
> > +#
> > +# Regression test for patch "blkcg: handle dying request_queue when associating
> > +# a blkg"
> > +#
> > +# This tries to expose the race condition between blkg association and
> > +# request_queue shutdown. When a request_queue is shutdown, the corresponding
> > +# blkgs are destroyed. Any further associations should fail gracefully and not
> > +# cause a kernel panic.
> > +
> > +. tests/block/rc
> > +. common/scsi_debug
> > +. common/cgroup
> > +
> > +DESCRIPTION="test graceful shutdown of scsi_debug devices with running fio jobs"
> > +QUICK=1
> > +
> > +requires() {
> > + _have_cgroup2_controller io && _have_scsi_debug && _have_fio
> > +}
> > +
> > +scsi_debug_stress_remove() {
> > + scsi_debug_path="/sys/bus/pseudo/drivers/scsi_debug"
> > + count=21
> > +
> > + runtime=12
> > + nr_fio_jobs=8
> > + scsi_dbg_ndelay=10000
> > +
> > + # set higher aio limit
> > + echo 524288 > /proc/sys/fs/aio-max-nr
> > +
> > + #figure out the CAN_QUEUE
> > + can_queue=$(((count + 1) * (count / 2) / 2))
> > +
> > + rmmod scsi_debug > /dev/null 2>&1
> > + modprobe scsi_debug virtual_gb=128 max_luns=$count \
> > + ndelay=$scsi_dbg_ndelay max_queue=$can_queue
> > +
> > + # figure out scsi_debug disks
> > + hosts=$(ls -d /sys/bus/pseudo/drivers/scsi_debug/adapter0/host*)
> > + hostname=$(basename "$hosts")
> > + host=$(echo "$hostname" | grep -o -E '[0-9]+')
> > +
> > + sdisks=$(ls -d $scsi_debug_path/adapter*/"$hostname"/target*/*/block/*)
> > + disks=""
> > + for sd in $sdisks; do
> > + disks+="/dev/"$(basename "$sd")
> > + disks+=" "
> > + done
>
> blktests has _init_scsi_debug which does all of this for you. And,
> block/001 is very similar to this test, just without the fio workload or
> changing schedulers. Could you please rework this to be based on
> block/001?
Yeah I can do that. Sorry for not looking at block/001 more closely.
Thanks,
Dennis
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH blktests v2 0/2] Add scsi-stress-remove to blktests
@ 2018-12-20 18:18 Dennis Zhou
2018-12-20 18:18 ` [PATCH blktests 1/2] blktests: split out cgroup2 controller and file check Dennis Zhou
0 siblings, 1 reply; 11+ messages in thread
From: Dennis Zhou @ 2018-12-20 18:18 UTC (permalink / raw)
To: Omar Sandoval, Ming Lei; +Cc: kernel-team, linux-block, Dennis Zhou
Hi,
v2:
There was a minor typo in 0001 pointed out by Omar which I fixed here.
0002 was refactored to use common/scsi_debug. I also rebased onto the
top of osandov#josef changing the test from block/022 -> block/027.
From v1:
Ming Lei's scsi-stress-remove test found a bug in blkg destruction [1]
where bios being created when the request_queue was being cleaned up
threw a NPE in blkg association. The fix is currently being discussed in
[2]. To make this test more accessible, I've ported it to blktests with
Ming Lei's copyright. I've tested this in my qemu instance and verified
we do not see the NPE on a fixed kernel.
Ming, please let me know if you have any objections.
[1] https://lore.kernel.org/linux-block/CACVXFVO_QXipD3cmPvpLyBYSiEcWPN_ThQ=0pO9AwLqN-Lv93w@mail.gmail.com
[2] https://lore.kernel.org/lkml/20181211230308.66276-1-dennis@kernel.org/
This patchset is on top of osandov#josef 98db2e11c97f.
diffstats below:
Dennis Zhou (2):
blktests: split out cgroup2 controller and file check
blktests: add Ming Lei's scsi-stress-remove
common/cgroup | 18 +++++++----
tests/block/027 | 73 +++++++++++++++++++++++++++++++++++++++++++++
tests/block/027.out | 2 ++
3 files changed, 88 insertions(+), 5 deletions(-)
create mode 100755 tests/block/027
create mode 100644 tests/block/027.out
Thanks,
Dennis
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH blktests 1/2] blktests: split out cgroup2 controller and file check
2018-12-20 18:18 [PATCH blktests v2 0/2] Add scsi-stress-remove to blktests Dennis Zhou
@ 2018-12-20 18:18 ` Dennis Zhou
0 siblings, 0 replies; 11+ messages in thread
From: Dennis Zhou @ 2018-12-20 18:18 UTC (permalink / raw)
To: Omar Sandoval, Ming Lei; +Cc: kernel-team, linux-block, Dennis Zhou
This is a prep patch for a new test that will race blkg association and
request_queue cleanup. As blkg association is an underlying cgroup io
controller feature, we need the ability to check if the controller is
available.
Signed-off-by: Dennis Zhou <dennis@kernel.org>
---
v2:
- fixed minor typo.
common/cgroup | 18 +++++++++++++-----
1 file changed, 13 insertions(+), 5 deletions(-)
diff --git a/common/cgroup b/common/cgroup
index d445093..48e546f 100644
--- a/common/cgroup
+++ b/common/cgroup
@@ -37,19 +37,27 @@ _have_cgroup2()
return 0
}
-_have_cgroup2_controller_file()
+_have_cgroup2_controller()
{
- _have_cgroup2 || return 1
-
local controller="$1"
- local file="$2"
- local dir
+
+ _have_cgroup2 || return 1
dir="$(_cgroup2_base_dir)"
+
if ! grep -q "$controller" "$dir/cgroup.controllers"; then
SKIP_REASON="no support for $controller cgroup controller; if it is enabled, you may need to boot with cgroup_no_v1=$controller"
return 1
fi
+}
+
+_have_cgroup2_controller_file()
+{
+ local controller="$1"
+ local file="$2"
+ local dir
+
+ _have_cgroup2_controller "$controller" || return 1
mkdir "$dir/blktests"
echo "+$controller" > "$dir/cgroup.subtree_control"
--
2.17.1
^ permalink raw reply related [flat|nested] 11+ messages in thread
end of thread, other threads:[~2018-12-20 18:18 UTC | newest]
Thread overview: 11+ messages
2018-12-12 23:09 [PATCH blktests 0/2] Add scsi-stress-remove to blktests Dennis Zhou
2018-12-12 23:09 ` [PATCH blktests 1/2] blktests: split out cgroup2 controller and file check Dennis Zhou
2018-12-19 18:34 ` Omar Sandoval
2018-12-12 23:09 ` [PATCH blktests 2/2] blktests: add Ming Lei's scsi-stress-remove Dennis Zhou
2018-12-13 1:24 ` Ming Lei
2018-12-13 18:21 ` Dennis Zhou
2018-12-13 18:28 ` [PATCH blktests v2 " Dennis Zhou
2018-12-14 0:31 ` Ming Lei
2018-12-19 22:49 ` Omar Sandoval
2018-12-19 22:57 ` Dennis Zhou
-- strict thread matches above, loose matches on Subject: below --
2018-12-20 18:18 [PATCH blktests v2 0/2] Add scsi-stress-remove to blktests Dennis Zhou
2018-12-20 18:18 ` [PATCH blktests 1/2] blktests: split out cgroup2 controller and file check Dennis Zhou