From: Stefan Hajnoczi <stefanha@redhat.com>
To: qemu-devel@nongnu.org
Cc: Peter Maydell <peter.maydell@linaro.org>,
Stefan Hajnoczi <stefanha@redhat.com>,
Bin Wu <wu.wubin@huawei.com>, Paolo Bonzini <pbonzini@redhat.com>
Subject: [Qemu-devel] [PULL 39/65] nbd: fix the co_queue multi-adding bug
Date: Fri, 13 Feb 2015 16:24:35 +0000
Message-ID: <1423844701-21041-40-git-send-email-stefanha@redhat.com>
In-Reply-To: <1423844701-21041-1-git-send-email-stefanha@redhat.com>
From: Bin Wu <wu.wubin@huawei.com>
When we tested VM migration between different hosts with NBD
devices, we found that sending a cancel command right after
drive_mirror had started triggered a coroutine re-enter error. The
stack was as follows:
(gdb) bt
00) 0x00007fdfc744d885 in raise () from /lib64/libc.so.6
01) 0x00007fdfc744ee61 in abort () from /lib64/libc.so.6
02) 0x00007fdfca467cc5 in qemu_coroutine_enter (co=0x7fdfcaedb400, opaque=0x0)
at qemu-coroutine.c:118
03) 0x00007fdfca467f6c in qemu_co_queue_run_restart (co=0x7fdfcaedb400) at
qemu-coroutine-lock.c:59
04) 0x00007fdfca467be5 in coroutine_swap (from=0x7fdfcaf3c4e8,
to=0x7fdfcaedb400) at qemu-coroutine.c:96
05) 0x00007fdfca467cea in qemu_coroutine_enter (co=0x7fdfcaedb400, opaque=0x0)
at qemu-coroutine.c:123
06) 0x00007fdfca467f6c in qemu_co_queue_run_restart (co=0x7fdfcaedbdc0) at
qemu-coroutine-lock.c:59
07) 0x00007fdfca467be5 in coroutine_swap (from=0x7fdfcaf3c4e8,
to=0x7fdfcaedbdc0) at qemu-coroutine.c:96
08) 0x00007fdfca467cea in qemu_coroutine_enter (co=0x7fdfcaedbdc0, opaque=0x0)
at qemu-coroutine.c:123
09) 0x00007fdfca4a1fa4 in nbd_recv_coroutines_enter_all (s=0x7fdfcaef7dd0) at
block/nbd-client.c:41
10) 0x00007fdfca4a1ff9 in nbd_teardown_connection (client=0x7fdfcaef7dd0) at
block/nbd-client.c:50
11) 0x00007fdfca4a20f0 in nbd_reply_ready (opaque=0x7fdfcaef7dd0) at
block/nbd-client.c:92
12) 0x00007fdfca45ed80 in aio_dispatch (ctx=0x7fdfcae15e90) at aio-posix.c:144
13) 0x00007fdfca45ef1b in aio_poll (ctx=0x7fdfcae15e90, blocking=false) at
aio-posix.c:222
14) 0x00007fdfca448c34 in aio_ctx_dispatch (source=0x7fdfcae15e90, callback=0x0,
user_data=0x0) at async.c:212
15) 0x00007fdfc8f2f69a in g_main_context_dispatch () from
/usr/lib64/libglib-2.0.so.0
16) 0x00007fdfca45c391 in glib_pollfds_poll () at main-loop.c:190
17) 0x00007fdfca45c489 in os_host_main_loop_wait (timeout=1483677098) at
main-loop.c:235
18) 0x00007fdfca45c57b in main_loop_wait (nonblocking=0) at main-loop.c:484
19) 0x00007fdfca25f403 in main_loop () at vl.c:2249
20) 0x00007fdfca266fc2 in main (argc=42, argv=0x7ffff517d638,
envp=0x7ffff517d790) at vl.c:4814
We found that the nbd_recv_coroutines_enter_all function (triggered by a cancel
command or a broken network connection) enters a coroutine that is
waiting for the send lock. If the lock is still held by another coroutine,
the entered coroutine is added to the co_queue again. Later, when the
lock is released, the doubly queued coroutine triggers a re-enter error.
This bug can be fixed simply by delaying the setting of recv_coroutine, as
suggested by Paolo. With this patch applied, we ran the cancel operation
during the mirror phase in a loop for more than 5 hours without any problem.
Without this patch, a coroutine re-enter error occurs within 5 minutes.
Signed-off-by: Bin Wu <wu.wubin@huawei.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1423552846-3896-1-git-send-email-wu.wubin@huawei.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
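The ordering problem described in the commit message can be illustrated with a small stand-alone model. This is only a sketch under simplifying assumptions, not QEMU code: a "coroutine" is reduced to an integer id, the send lock to an owner field plus a wait queue, and every `toy_*` name and `TOY_*` constant is hypothetical. It shows why registering the receiver before taking the lock lets the teardown path queue a waiter a second time, and why registering after taking the lock avoids that.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_REQUESTS 4   /* hypothetical stand-in for MAX_NBD_REQUESTS */
#define TOY_QUEUE_CAP    8   /* capacity of the toy wait queue */

/* Toy session: a "coroutine" is just an integer id (assumption). */
typedef struct {
    int owner;                        /* id holding the send lock, -1 if free */
    int waiters[TOY_QUEUE_CAP];       /* ids parked on the lock's wait queue */
    int n_waiters;
    int recv_slot[TOY_MAX_REQUESTS];  /* registered receivers, -1 if empty */
} ToySession;

static void toy_init(ToySession *s)
{
    s->owner = -1;
    s->n_waiters = 0;
    for (int i = 0; i < TOY_MAX_REQUESTS; i++) {
        s->recv_slot[i] = -1;
    }
}

/* Claim the first free receive slot, as the loop in the patch does. */
static void toy_register(ToySession *s, int id)
{
    int i;
    for (i = 0; i < TOY_MAX_REQUESTS; i++) {
        if (s->recv_slot[i] < 0) {
            s->recv_slot[i] = id;
            break;
        }
    }
    assert(i < TOY_MAX_REQUESTS);
}

/* Take the lock if it is free; otherwise park the caller on the queue. */
static bool toy_lock(ToySession *s, int id)
{
    if (s->owner < 0) {
        s->owner = id;
        return true;
    }
    assert(s->n_waiters < TOY_QUEUE_CAP);
    s->waiters[s->n_waiters++] = id;
    return false;
}

/* Start a request with either the old or the patched ordering. */
static bool toy_start_request(ToySession *s, int id, bool register_before_lock)
{
    if (register_before_lock) {
        toy_register(s, id);          /* old: visible while still waiting */
    }
    if (!toy_lock(s, id)) {
        return false;                 /* parked on the wait queue */
    }
    if (!register_before_lock) {
        toy_register(s, id);          /* patched: visible only with the lock */
    }
    return true;
}

/* Teardown model: wake every registered receiver that does not hold the
 * lock. A woken waiter retries the lock and is parked again without its
 * earlier queue entry having been removed. */
static void toy_enter_all(ToySession *s)
{
    for (int i = 0; i < TOY_MAX_REQUESTS; i++) {
        int id = s->recv_slot[i];
        if (id >= 0 && id != s->owner) {
            toy_lock(s, id);
        }
    }
}

/* Return how many wait-queue entries coroutine 2 ends up with. */
int toy_run_scenario(bool register_before_lock)
{
    ToySession s;
    int count = 0;

    toy_init(&s);
    toy_start_request(&s, 1, register_before_lock);  /* takes the lock */
    toy_start_request(&s, 2, register_before_lock);  /* has to wait */
    toy_enter_all(&s);                               /* cancel / teardown */

    for (int i = 0; i < s.n_waiters; i++) {
        if (s.waiters[i] == 2) {
            count++;
        }
    }
    return count;
}
```

In this model `toy_run_scenario(true)` returns 2 (the waiter is queued twice, the situation that later aborts in QEMU), while `toy_run_scenario(false)` returns 1. The model deliberately ignores in_flight accounting and real coroutine switching.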
block/nbd-client.c | 25 +++++++++++++------------
1 file changed, 13 insertions(+), 12 deletions(-)
diff --git a/block/nbd-client.c b/block/nbd-client.c
index a01a37f..259f5a3 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -108,11 +108,22 @@ static int nbd_co_send_request(BlockDriverState *bs,
 {
     NbdClientSession *s = nbd_get_client_session(bs);
     AioContext *aio_context;
-    int rc, ret;
+    int rc, ret, i;
     qemu_co_mutex_lock(&s->send_mutex);
+
+    for (i = 0; i < MAX_NBD_REQUESTS; i++) {
+        if (s->recv_coroutine[i] == NULL) {
+            s->recv_coroutine[i] = qemu_coroutine_self();
+            break;
+        }
+    }
+
+    assert(i < MAX_NBD_REQUESTS);
+    request->handle = INDEX_TO_HANDLE(s, i);
     s->send_coroutine = qemu_coroutine_self();
     aio_context = bdrv_get_aio_context(bs);
+
     aio_set_fd_handler(aio_context, s->sock,
                        nbd_reply_ready, nbd_restart_write, bs);
     if (qiov) {
@@ -168,8 +179,6 @@ static void nbd_co_receive_reply(NbdClientSession *s,
 static void nbd_coroutine_start(NbdClientSession *s,
     struct nbd_request *request)
 {
-    int i;
-
     /* Poor man semaphore. The free_sema is locked when no other request
      * can be accepted, and unlocked after receiving one reply. */
     if (s->in_flight >= MAX_NBD_REQUESTS - 1) {
@@ -178,15 +187,7 @@ static void nbd_coroutine_start(NbdClientSession *s,
     }
     s->in_flight++;
-    for (i = 0; i < MAX_NBD_REQUESTS; i++) {
-        if (s->recv_coroutine[i] == NULL) {
-            s->recv_coroutine[i] = qemu_coroutine_self();
-            break;
-        }
-    }
-
-    assert(i < MAX_NBD_REQUESTS);
-    request->handle = INDEX_TO_HANDLE(s, i);
+    /* s->recv_coroutine[i] is set as soon as we get the send_lock. */
 }
 static void nbd_coroutine_end(NbdClientSession *s,
--
2.1.0