* Re: Re: [PATCH] virtio-blk: don't start dataplane during the stop of dataplane
@ 2023-10-19 7:28 lv.mengzhao
2023-10-19 12:15 ` Stefan Hajnoczi
2023-10-23 9:13 ` Kevin Wolf
0 siblings, 2 replies; 4+ messages in thread
From: lv.mengzhao @ 2023-10-19 7:28 UTC (permalink / raw)
To: stefanha; +Cc: mst, kwolf, hreitz, qemu-devel, hu.jian
On Tue, Oct 17, 2023 at 10:04PM +0800, stefanha@redhat.com wrote:
> > From: hujian <hu.jian@zte.com.cn>
> >
> > During the stop of dataplane for virtio-blk, virtio_bus_cleanup_host_notifier() is
> > called at the end to clean up the notifier; if the ioeventfd was polled,
> > virtio_blk_handle_output() is invoked to handle the I/O request. But because
> > s->dataplane_disabled is false, it returns directly, which drops the I/O request.
> > Backtrace:
> > ->virtio_blk_data_plane_stop
> > ->virtio_bus_cleanup_host_notifier
> > ->virtio_queue_host_notifier_read
> > ->virtio_queue_notify_vq
> > ->vq->handle_output
> > ->virtio_blk_handle_output
> > ->if (s->dataplane && !s->dataplane_started)
> > ->if (!s->dataplane_disabled)
> > ->return *
> > ->virtio_blk_handle_output_do
> > The above problem can occur when the "virsh reset" command is used to reset the
> > guest while the guest is doing I/O.
> > To fix this problem, don't try to start the dataplane if s->stopping is true; the
> > I/O will then be handled by virtio_blk_handle_vq().
> >
> > Signed-off-by: hujian <hu.jian@zte.com.cn>
> > ---
> > hw/block/virtio-blk.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
>
> I have dropped this patch again after Fiona pointed out it does not
> compile and Kevin noticed that handling requests from the main loop
> thread while the I/O is still being processed in the IOThread is going
> to cause thread-safety issues.
>
> Can you explain the problem you are seeing in more detail? You run
> "virsh reset" while the guest is doing I/O. Then what happens?
>
> Stefan
1 Compilation issues
I'm sorry; I was in such a hurry to submit the patch that I forgot to compile it locally.
A compilable patch is at the bottom.
2 Troubleshooting
We ran a lifecycle test for the VM (QEMU version: 4.1.0, host kernel version: 4.18)
that executes this loop: virsh create -> virsh suspend -> virsh resume -> virsh reset ->
virsh shutdown, with an I/O stress test running inside the virtual machine. After the
loop has run about 200 times, once "virsh reset" is executed the virtual machine goes
into emergency mode and fails to start normally. In theory "virsh reset" may cause the
virtual machine to lose data, but not enough to put it into emergency mode.
Coincidentally, I happen to be fixing a different fault with the same symptom, caused
by one of our private patches. That patch calls virtio_blk_data_plane_[stop|start] to
operate the dataplane; if QEMU is processing I/O requests at the same time, I/O
requests may be lost.
Analyzing virtio_blk_data_plane_stop(): virtio_bus_cleanup_host_notifier() is used to
clean up the notifiers, and my understanding is that it should handle the remaining I/O
requests. The stack is shown below. I added a print at the starred line and found that
virtio_blk_handle_output() returned directly instead of going on to call
virtio_blk_handle_vq() to handle the I/O. So I modified the code here so that it does
not return during the stop of dataplane, and with that change our internal private
patch works normally.
Backtrace:
->virtio_blk_data_plane_stop
->virtio_bus_cleanup_host_notifier
->virtio_queue_host_notifier_read
->virtio_queue_notify_vq
->vq->handle_output
->virtio_blk_handle_output
->if (s->dataplane && !s->dataplane_started)
->if (!s->dataplane_disabled)
->return *
->virtio_blk_handle_vq
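The function in question looks roughly like this (simplified from
hw/block/virtio-blk.c; the exact code varies by QEMU version, and the starred
early return above corresponds to the return below):

static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
{
    VirtIOBlock *s = (VirtIOBlock *)vdev;

    if (s->dataplane && !s->dataplane_started) {
        /* Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start
         * dataplane here instead of waiting for .set_status().
         */
        virtio_device_start_ioeventfd(vdev);
        if (!s->dataplane_disabled) {
            return; /* the starred line: the kick is left unhandled here */
        }
    }
    virtio_blk_handle_vq(s, vq);
}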
Back to the problem caused by virsh reset: libvirt sends the "system_reset" QEMU
monitor command to QEMU, QEMU calls qmp_system_reset(), and that eventually calls
virtio_blk_data_plane_[stop|start] to reset the devices. I suspect that I/O loss will
also occur if QEMU still has I/O to process during the stop of dataplane.
3 Thread-safety issues
virtio_blk_data_plane_stop() calls blk_set_aio_context() to switch the bs back to the
QEMU main loop only after virtio_bus_cleanup_host_notifier(), so the remaining I/O
requests are still handled by the iothread (if one is configured). I'm a little
confused as to why there would be thread-safety issues.
Lastly, please CC cv11411@126.com; this is my private email address, so that I can
stay in contact with you in my free time. Thanks.
4 Compilable patch
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 39e7f23..d3a719c 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -1165,8 +1165,9 @@ void virtio_blk_handle_vq(VirtIOBlock *s, VirtQueue *vq)
 static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
 {
     VirtIOBlock *s = (VirtIOBlock *)vdev;
+    VirtIOBlockDataPlane *dp = s->dataplane;
 
-    if (s->dataplane && !s->dataplane_started) {
+    if (s->dataplane && !s->dataplane_started && !dp->stopping) {
         /* Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start
          * dataplane here instead of waiting for .set_status().
          */
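With the patch applied, the whole function would read roughly as follows (my
reconstruction from the diff plus the unchanged surrounding code; a sketch,
not a tested build):

static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
{
    VirtIOBlock *s = (VirtIOBlock *)vdev;
    VirtIOBlockDataPlane *dp = s->dataplane;

    if (s->dataplane && !s->dataplane_started && !dp->stopping) {
        /* Some guests kick before setting VIRTIO_CONFIG_S_DRIVER_OK so start
         * dataplane here instead of waiting for .set_status().
         */
        virtio_device_start_ioeventfd(vdev);
        if (!s->dataplane_disabled) {
            return;
        }
    }
    /* While dp->stopping is true, control now falls through to here: */
    virtio_blk_handle_vq(s, vq);
}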
* Re: Re: [PATCH] virtio-blk: don't start dataplane during the stop of dataplane
2023-10-19 7:28 Re: [PATCH] " lv.mengzhao
@ 2023-10-19 12:15 ` Stefan Hajnoczi
2023-10-23 9:13 ` Kevin Wolf
1 sibling, 0 replies; 4+ messages in thread
From: Stefan Hajnoczi @ 2023-10-19 12:15 UTC (permalink / raw)
To: lv.mengzhao; +Cc: stefanha, mst, kwolf, hreitz, qemu-devel, hu.jian, cv11411
On Thu, 19 Oct 2023 at 00:31, <lv.mengzhao@zte.com.cn> wrote:
>
> On Tue, Oct 17, 2023 at 10:04PM +0800, stefanha@redhat.com wrote:
> > > [...]
> >
> > I have dropped this patch again after Fiona pointed out it does not
> > compile and Kevin noticed that handling requests from the main loop
> > thread while the I/O is still being processed in the IOThread is going
> > to cause thread-safety issues.
> >
> > Can you explain the problem you are seeing in more detail? You run
> > "virsh reset" while the guest is doing I/O. Then what happens?
> >
> > Stefan
>
> 1 Compilation issues
> I'm sorry; I was in such a hurry to submit the patch that I forgot to
> compile it locally. A compilable patch is at the bottom.
>
> 2 Troubleshooting
I won't be able to reply until November 2nd. Maybe Kevin or Hanna can
discuss this with you in the meantime.
Stefan
> [...]
* Re: [PATCH] virtio-blk: don't start dataplane during the stop of dataplane
@ 2023-10-23 2:12 lv.mengzhao
0 siblings, 0 replies; 4+ messages in thread
From: lv.mengzhao @ 2023-10-23 2:12 UTC (permalink / raw)
To: stefanha, kwolf, hreitz; +Cc: stefanha, mst, qemu-devel, hu.jian, cv11411
>> I won't be able to reply until November 2nd. Maybe Kevin or Hanna can
>> discuss this with you in the meantime.
Thanks.
@Kevin, @Hanna, do you have any opinions on this issue and my solution?
Original Mail
Sender: Stefan Hajnoczi
To: lv mengzhao10286442;
CC: stefanha@redhat.com; mst@redhat.com; kwolf@redhat.com; hreitz@redhat.com; qemu-devel@nongnu.org; hu jian10269307; cv11411@126.com;
Date: 2023/10/19 20:16
Subject: Re: Re: [PATCH] virtio-blk: don't start dataplane during the stop of dataplane
[...]
* Re: Re: [PATCH] virtio-blk: don't start dataplane during the stop of dataplane
2023-10-19 7:28 Re: [PATCH] " lv.mengzhao
2023-10-19 12:15 ` Stefan Hajnoczi
@ 2023-10-23 9:13 ` Kevin Wolf
1 sibling, 0 replies; 4+ messages in thread
From: Kevin Wolf @ 2023-10-23 9:13 UTC (permalink / raw)
To: lv.mengzhao; +Cc: stefanha, mst, hreitz, qemu-devel, hu.jian, cv11411
On 19.10.2023 at 09:28, lv.mengzhao@zte.com.cn wrote:
> On Tue, Oct 17, 2023 at 10:04PM +0800, stefanha@redhat.com wrote:
> > > From: hujian <hu.jian@zte.com.cn>
> > >
> > > During the stop of dataplane for virtio-blk, virtio_bus_cleanup_host_notifier() is
> > > called at the end to clean up the notifier; if the ioeventfd was polled,
> > > virtio_blk_handle_output() is invoked to handle the I/O request. But because
> > > s->dataplane_disabled is false, it returns directly, which drops the I/O request.
> > > Backtrace:
> > > ->virtio_blk_data_plane_stop
> > > ->virtio_bus_cleanup_host_notifier
> > > ->virtio_queue_host_notifier_read
> > > ->virtio_queue_notify_vq
> > > ->vq->handle_output
> > > ->virtio_blk_handle_output
> > > ->if (s->dataplane && !s->dataplane_started)
> > > ->if (!s->dataplane_disabled)
> > > ->return *
> > > ->virtio_blk_handle_output_do
> > > The above problem can occur when the "virsh reset" command is used to reset the
> > > guest while the guest is doing I/O.
> > > To fix this problem, don't try to start the dataplane if s->stopping is true; the
> > > I/O will then be handled by virtio_blk_handle_vq().
> > >
> > > Signed-off-by: hujian <hu.jian@zte.com.cn>
> > > ---
> > > hw/block/virtio-blk.c | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > I have dropped this patch again after Fiona pointed out it does not
> > compile and Kevin noticed that handling requests from the main loop
> > thread while the I/O is still being processed in the IOThread is going
> > to cause thread-safety issues.
> >
> > Can you explain the problem you are seeing in more detail? You run
> > "virsh reset" while the guest is doing I/O. Then what happens?
> >
> > Stefan
>
> 1 Compilation issues
> I'm sorry; I was in such a hurry to submit the patch that I forgot to
> compile it locally. A compilable patch is at the bottom.
The more worrying part is that without building it, you also can't have
tested it. How much testing did the new, compilable patch undergo?
> 2 Troubleshooting
> We ran a lifecycle test for the VM (QEMU version: 4.1.0, host kernel
> version: 4.18) that executes this loop: virsh create -> virsh suspend ->
> virsh resume -> virsh reset -> virsh shutdown, with an I/O stress test
> running inside the virtual machine. After the loop has run about 200
> times, once "virsh reset" is executed the virtual machine goes into
> emergency mode and fails to start normally. In theory "virsh reset" may
> cause the virtual machine to lose data, but not enough to put it into
> emergency mode.
QEMU 4.1.0 is quite old. Do you happen to know if the problem is still
reproducible on current git master (or 8.1.0)?
Though from your description of the bug and what the code looks like
today, I would expect so.
> Coincidentally, I happen to be fixing a different fault with the same
> symptom, caused by one of our private patches. That patch calls
> virtio_blk_data_plane_[stop|start] to operate the dataplane; if QEMU is
> processing I/O requests at the same time, I/O requests may be lost.
>
> Analyzing virtio_blk_data_plane_stop(): virtio_bus_cleanup_host_notifier()
> is used to clean up the notifiers, and my understanding is that it should
> handle the remaining I/O requests.
Well, as you found out, it doesn't actually do that in practice. I'm
not sure what the idea behind it was. Handling the remaining requests in
our case would be a problem because we're not in the right state to
handle requests.
> The stack is shown below. I added a print at the starred line and found
> that virtio_blk_handle_output() returned directly instead of going on to
> call virtio_blk_handle_vq() to handle the I/O. So I modified the code here
> so that it does not return during the stop of dataplane, and with that
> change our internal private patch works normally.
>
> Backtrace:
> ->virtio_blk_data_plane_stop
> ->virtio_bus_cleanup_host_notifier
> ->virtio_queue_host_notifier_read
> ->virtio_queue_notify_vq
> ->vq->handle_output
> ->virtio_blk_handle_output
> ->if (s->dataplane && !s->dataplane_started)
> ->if (!s->dataplane_disabled)
> ->return *
> ->virtio_blk_handle_vq
>
> Back to the problem caused by virsh reset: libvirt sends the
> "system_reset" QEMU monitor command to QEMU, QEMU calls
> qmp_system_reset(), and that eventually calls
> virtio_blk_data_plane_[stop|start] to reset the devices. I suspect that
> I/O loss will also occur if QEMU still has I/O to process during the stop
> of dataplane.
Hm... But why can the guest expect that requests are processed during a
hard reset? Intuitively dropping remaining requests after a certain
point where you do a reset doesn't sound like a problem.
> 3 Thread-safety issues
> virtio_blk_data_plane_stop() calls blk_set_aio_context() to switch the bs
> back to the QEMU main loop only after virtio_bus_cleanup_host_notifier(),
> so the remaining I/O requests are still handled by the iothread (if one is
> configured). I'm a little confused as to why there would be thread-safety
> issues.
blk_set_aio_context() only tells the BlockBackend which iothread to
expect requests from. The device needs to take care to actually send
requests from the same thread that it promised to the BlockBackend. In
your stack trace, we still promise that requests come from the iothread,
but we're actually trying to process them in the main thread:
virtio_blk_data_plane_stop() runs in the main thread and indirectly
calls virtio_blk_handle_output(), so with your fix, they would be
handled in the main thread before blk_set_aio_context() was called. This
isn't right.
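To make the ordering concrete, the stop path looks roughly like this (a
simplified sketch following the description in this thread; the exact order
and code differ between QEMU versions):

static void virtio_blk_data_plane_stop(VirtIODevice *vdev)
{
    /* This whole function runs in the QEMU main loop thread. */
    s->stopping = true;

    for (i = 0; i < nvqs; i++) {
        virtio_bus_set_host_notifier(bus, i, false);
        /* A still-pending ioeventfd is drained here; with your fix this
         * would call virtio_blk_handle_vq() in the main loop thread. */
        virtio_bus_cleanup_host_notifier(bus, i);
    }

    /* Only at this point does the BlockBackend stop expecting requests
     * from the iothread: */
    blk_drain(s->conf->conf.blk);
    blk_set_aio_context(s->conf->conf.blk, qemu_get_aio_context(), NULL);

    s->stopping = false;
}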
We would have to delay processing requests until after we have
completely switched back to the non-dataplane mode. And after thinking a
bit about this, this seems to be exactly what the current code is doing:
In fact, returning from virtio_blk_handle_output() doesn't drop the
request. It just means that we're not processing it right now; it still
remains queued (there is no virtqueue_pop() involved when we return
early).
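For reference, the pop only happens inside the handler, roughly like this
(simplified from hw/block/virtio-blk.c; exact shape varies by version):

void virtio_blk_handle_vq(VirtIOBlock *s, VirtQueue *vq)
{
    VirtIOBlockReq *req;
    MultiReqBuffer mrb = {};

    /* Requests are only removed from the vring here, via virtqueue_pop()
     * inside virtio_blk_get_request(). If virtio_blk_handle_output()
     * returns before reaching this loop, the requests stay queued. */
    while ((req = virtio_blk_get_request(s, vq))) {
        virtio_blk_handle_request(req, &mrb);
    }

    if (mrb.num_reqs) {
        virtio_blk_submit_multireq(s->blk, &mrb);
    }
}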
So I think returning there early is the right thing to do.
I'm not entirely sure where your problem happens. What would make most
sense is that virtio_blk_reset() throws away the requests. But throwing
away unprocessed requests is exactly what I would expect from a reset,
so I don't really see a problem in that.
We probably need to dig a bit deeper to figure out why your guest has a
problem with this. Can you give some details about the guest? Like which
OS, filesystem, other storage related configuration etc.? At this point
it sounds a bit like a guest side bug.
> Lastly, please CC cv11411@126.com; this is my private email address, so
> that I can stay in contact with you in my free time. Thanks.
Ok. What I usually do in such cases is that I already CC myself with the
alternative address to make sure that nobody forgets to add it while
replying.
Kevin