From: Uday Shankar <ushankar@purestorage.com>
To: Ming Lei <ming.lei@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>, linux-block@vger.kernel.org
Subject: Re: [PATCH 2/4] ublk: refactor recovery configuration flag helpers
Date: Thu, 27 Jun 2024 11:09:15 -0600 [thread overview]
Message-ID: <Zn2cuwpM+/dK/682@dev-ushankar.dev.purestorage.com> (raw)
In-Reply-To: <Zny9vr/2iHIkc2bC@fedora>
When I say "behavior A + 2," I mean behavior A and behavior 2 at the
same time on the same ublk device. I still think this is not supported
with current ublk_drv, see below.
> > the ublk server can "handle" the I/O error because during this time,
> > there is no ublk server and all decisions on how to handle I/O are made
> > by ublk_drv directly (based on configuration flags specified when the
> > device was created).
> >
> > If the ublk server created the device with UBLK_F_USER_RECOVERY, then
> > when the ublk server has crashed (and not restarted yet), I/Os issued by
> > the application will queue/hang until the ublk server comes back and
> > recovers the device, because the underlying request_queue is left in a
> > quiesced state. So in this case, behavior A is not possible.
>
> When ublk server is crashed, ublk_abort_requests() will be called to fail
> queued inflight requests. Meantime ubq->canceling is set to requeue
> new request instead of forwarding it to ublk server.
>
> So behavior A should be supported easily by failing request in
> ublk_queue_rq() if ubq->canceling is set.
This argument only works for devices created without
UBLK_F_USER_RECOVERY. If UBLK_F_USER_RECOVERY is set, then the
request_queue for the device is left in a quiesced state and so I/Os
will not even get to ublk_queue_rq. See the following as proof (using a
build of ublksrv master):
# ./ublk add -t loop -f file -r 1
dev id 0: nr_hw_queues 1 queue_depth 128 block size 4096 dev_capacity 2097152
max rq size 524288 daemon pid 244608 flags 0x4a state LIVE
ublkc: 240:0 ublkb: 259:0 owner: 0:0
queue 0: tid 244610 affinity(0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 )
target {"backing_file":"file","dev_size":1073741824,"direct_io":1,"name":"loop","type":1}
# kill -9 244608
# dd if=/dev/urandom of=/dev/ublkb0 count=1 bs=4096 oflag=direct
(hung)
# ps aux | grep " D"
root 244626 0.0 0.0 5620 1880 pts/0 D+ 16:57 0:00 dd if=/dev/urandom of=/dev/ublkb0 count=1 bs=4096 oflag=direct
root 244656 0.0 0.0 6408 2188 pts/1 S+ 16:58 0:00 grep --color=auto D
# cat /proc/244626/stack
[<0>] submit_bio_wait+0x63/0x90
[<0>] __blkdev_direct_IO_simple+0xd9/0x1e0
[<0>] blkdev_write_iter+0x1b4/0x230
[<0>] vfs_write+0x2ae/0x3d0
[<0>] ksys_write+0x4f/0xc0
[<0>] do_syscall_64+0x5d/0x160
[<0>] entry_SYSCALL_64_after_hwframe+0x4b/0x53
# cat /sys/kernel/debug/block/ublkb0/state
SAME_COMP|NONROT|IO_STAT|INIT_DONE|STATS|REGISTERED|QUIESCED|NOWAIT|SQ_SCHED
Therefore, in order to obtain behavior A with current ublk_drv, one must
not set UBLK_F_USER_RECOVERY.
> >
> > If the ublk server created the device without UBLK_F_USER_RECOVERY, then
> > when the ublk server has crashed (and not restarted yet), I/Os issued by
> > the application will immediately error (since in this case, ublk will
> > call del_gendisk). However, when the ublk server restarts, it cannot
> > recover the existing ublk device - the disk has been deleted and the
> > ublk device is in state UBLK_S_DEV_DEAD from which recovery is not
> > permitted. So in this case, behavior 2 is not possible.
>
> UBLK_F_USER_RECOVERY is supposed for supporting to recover device, and
> if this flag isn't enabled, we don't support the feature simply, so
> looks behavior 2 isn't one valid case, is it?
Sure, so we're in agreement that recovery is impossible if
UBLK_F_USER_RECOVERY is not set.
So:
- To get behavior A, UBLK_F_USER_RECOVERY must be unset
- To get behavior 2, UBLK_F_USER_RECOVERY must be set
Hence, having behavior A and behavior 2, at the same time, on the same
device, would require UBLK_F_USER_RECOVERY to be both set and unset when
that device is created. Obviously that's impossible.
next prev parent reply other threads:[~2024-06-27 17:09 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-06-17 19:44 [PATCH 0/4] ublk: support device recovery without I/O queueing Uday Shankar
2024-06-17 19:44 ` [PATCH 1/4] ublk: check recovery flags for validity Uday Shankar
2024-06-18 2:00 ` Ming Lei
2024-07-23 19:32 ` Uday Shankar
2024-06-17 19:44 ` [PATCH 2/4] ublk: refactor recovery configuration flag helpers Uday Shankar
2024-06-18 2:11 ` Ming Lei
2024-06-26 17:22 ` Uday Shankar
2024-06-27 1:17 ` Ming Lei
2024-06-27 17:09 ` Uday Shankar [this message]
2024-06-30 13:56 ` Ming Lei
2024-07-01 21:02 ` Uday Shankar
2024-07-02 4:07 ` Ming Lei
2024-07-02 11:09 ` Ming Lei
2024-09-17 0:26 ` Uday Shankar
2024-06-17 19:44 ` [PATCH 3/4] ublk: merge stop_work and quiesce_work Uday Shankar
2024-07-02 13:31 ` Ming Lei
2024-06-17 19:44 ` [PATCH 4/4] ublk: support device recovery without I/O queueing Uday Shankar
2024-07-02 13:46 ` Ming Lei
2024-09-17 0:29 ` Uday Shankar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Zn2cuwpM+/dK/682@dev-ushankar.dev.purestorage.com \
--to=ushankar@purestorage.com \
--cc=axboe@kernel.dk \
--cc=linux-block@vger.kernel.org \
--cc=ming.lei@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.