From: Lukas Straub <lukasstraub2@web.de>
To: "Zhang, Chen" <chen.zhang@intel.com>
Cc: "Marc-André Lureau" <marcandre.lureau@redhat.com>,
"Jason Wang" <jasowang@redhat.com>,
qemu-devel <qemu-devel@nongnu.org>,
"Li Zhijian" <lizhijian@cn.fujitsu.com>,
"Paolo Bonzini" <pbonzini@redhat.com>
Subject: Re: [PATCH 3/3] net/colo-compare.c: Fix deadlock
Date: Thu, 23 Apr 2020 16:03:25 +0200
Message-ID: <20200423160325.205248ae@luklap>
In-Reply-To: <43db66be4b41426e97cf086b0a5d4784@intel.com>
On Wed, 22 Apr 2020 08:40:40 +0000
"Zhang, Chen" <chen.zhang@intel.com> wrote:
> > -----Original Message-----
> > From: Lukas Straub <lukasstraub2@web.de>
> > Sent: Thursday, April 9, 2020 2:34 AM
> > To: qemu-devel <qemu-devel@nongnu.org>
> > Cc: Zhang, Chen <chen.zhang@intel.com>; Li Zhijian
> > <lizhijian@cn.fujitsu.com>; Jason Wang <jasowang@redhat.com>; Marc-
> > André Lureau <marcandre.lureau@redhat.com>; Paolo Bonzini
> > <pbonzini@redhat.com>
> > Subject: [PATCH 3/3] net/colo-compare.c: Fix deadlock
> >
> > The chr_out chardev is connected to a filter-redirector running in the main
> > loop. qemu_chr_fe_write_all might block here in compare_chr_send if the
> > (socket-)buffer is full.
> > If another filter-redirector in the main loop wants to send data to chr_pri_in
> > it might also block if the buffer is full. This leads to a deadlock because both
> > event loops get blocked.
> >
> > Fix this by converting compare_chr_send to a coroutine and returning an
> > error if it is in use.
> >
>
> I have tested this series; it is running fine so far.
> Can you share performance data after this patch?
>
> Thanks
> Zhang Chen
Hello,
Here are the results (using iperf3):
Client-to-server TCP:
  without patch: ~64.2 Mbit/s
  with patch:    ~28.9 Mbit/s
Server-to-client TCP:
  without patch: 360 Kbit/s (when it doesn't deadlock :)
  with patch:    220 Kbit/s
Yeah, it hurts performance somewhat, but the deadlock happens often with lots
of server-to-client traffic. (It deadlocked in 2 of 4 runs)
Do you have a better idea to solve this issue?
Regards,
Lukas Straub
> > Signed-off-by: Lukas Straub <lukasstraub2@web.de>
> > ---
> > net/colo-compare.c | 82 +++++++++++++++++++++++++++++++++++++++-------
> > 1 file changed, 71 insertions(+), 11 deletions(-)
> >
> > diff --git a/net/colo-compare.c b/net/colo-compare.c
> > index 1de4220fe2..82787d3055 100644
> > --- a/net/colo-compare.c
> > +++ b/net/colo-compare.c
> > @@ -32,6 +32,9 @@
> > #include "migration/migration.h"
> > #include "util.h"
> >
> > +#include "block/aio-wait.h"
> > +#include "qemu/coroutine.h"
> > +
> > #define TYPE_COLO_COMPARE "colo-compare"
> > #define COLO_COMPARE(obj) \
> > OBJECT_CHECK(CompareState, (obj), TYPE_COLO_COMPARE)
> > @@ -77,6 +80,17 @@ static int event_unhandled_count;
> > * |packet | |packet + |packet | |packet +
> > * +--------+ +--------+ +--------+ +--------+
> > */
> > +
> > +typedef struct SendCo {
> > + Coroutine *co;
> > + uint8_t *buf;
> > + uint32_t size;
> > + uint32_t vnet_hdr_len;
> > + bool notify_remote_frame;
> > + bool done;
> > + int ret;
> > +} SendCo;
> > +
> > typedef struct CompareState {
> > Object parent;
> >
> > @@ -91,6 +105,7 @@ typedef struct CompareState {
> > SocketReadState pri_rs;
> > SocketReadState sec_rs;
> > SocketReadState notify_rs;
> > + SendCo sendco;
> > bool vnet_hdr;
> > uint32_t compare_timeout;
> > uint32_t expired_scan_cycle;
> > @@ -699,19 +714,17 @@ static void colo_compare_connection(void *opaque, void *user_data)
> > }
> > }
> >
> > -static int compare_chr_send(CompareState *s,
> > - const uint8_t *buf,
> > - uint32_t size,
> > - uint32_t vnet_hdr_len,
> > - bool notify_remote_frame)
> > +static void coroutine_fn _compare_chr_send(void *opaque)
> > {
> > + CompareState *s = opaque;
> > + SendCo *sendco = &s->sendco;
> > + const uint8_t *buf = sendco->buf;
> > + uint32_t size = sendco->size;
> > + uint32_t vnet_hdr_len = sendco->vnet_hdr_len;
> > + bool notify_remote_frame = sendco->notify_remote_frame;
> > int ret = 0;
> > uint32_t len = htonl(size);
> >
> > - if (!size) {
> > - return 0;
> > - }
> > -
> > if (notify_remote_frame) {
> > ret = qemu_chr_fe_write_all(&s->chr_notify_dev,
> > (uint8_t *)&len,
> > @@ -754,10 +767,50 @@ static int compare_chr_send(CompareState *s,
> > goto err;
> > }
> >
> > - return 0;
> > + sendco->ret = 0;
> > + goto out;
> >
> > err:
> > - return ret < 0 ? ret : -EIO;
> > + sendco->ret = ret < 0 ? ret : -EIO;
> > +out:
> > + sendco->co = NULL;
> > + g_free(sendco->buf);
> > + sendco->buf = NULL;
> > + sendco->done = true;
> > + aio_wait_kick();
> > +}
> > +
> > +static int compare_chr_send(CompareState *s,
> > + const uint8_t *buf,
> > + uint32_t size,
> > + uint32_t vnet_hdr_len,
> > + bool notify_remote_frame)
> > +{
> > + SendCo *sendco = &s->sendco;
> > +
> > + if (!size) {
> > + return 0;
> > + }
> > +
> > + if (sendco->done) {
> > + sendco->co = qemu_coroutine_create(_compare_chr_send, s);
> > + sendco->buf = g_malloc(size);
> > + sendco->size = size;
> > + sendco->vnet_hdr_len = vnet_hdr_len;
> > + sendco->notify_remote_frame = notify_remote_frame;
> > + sendco->done = false;
> > + memcpy(sendco->buf, buf, size);
> > + qemu_coroutine_enter(sendco->co);
> > + if (sendco->done) {
> > + /* report early errors */
> > + return sendco->ret;
> > + } else {
> > + /* else assume success */
> > + return 0;
> > + }
> > + }
> > +
> > + return -ENOBUFS;
> > }
> >
> > static int compare_chr_can_read(void *opaque)
> > @@ -1146,6 +1199,8 @@ static void colo_compare_complete(UserCreatable *uc, Error **errp)
> > CompareState *s = COLO_COMPARE(uc);
> > Chardev *chr;
> >
> > + s->sendco.done = true;
> > +
> > if (!s->pri_indev || !s->sec_indev || !s->outdev || !s->iothread) {
> > error_setg(errp, "colo compare needs 'primary_in' ,"
> > "'secondary_in','outdev','iothread' property set");
> > @@ -1281,6 +1336,11 @@ static void colo_compare_finalize(Object *obj)
> > CompareState *s = COLO_COMPARE(obj);
> > CompareState *tmp = NULL;
> >
> > + AioContext *ctx = iothread_get_aio_context(s->iothread);
> > + aio_context_acquire(ctx);
> > + AIO_WAIT_WHILE(ctx, !s->sendco.done);
> > + aio_context_release(ctx);
> > +
> > qemu_chr_fe_deinit(&s->chr_pri_in, false);
> > qemu_chr_fe_deinit(&s->chr_sec_in, false);
> > qemu_chr_fe_deinit(&s->chr_out, false);
> > --
> > 2.20.1
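
For readers following the thread, here is a minimal standalone sketch of the "single in-flight send, reject while busy" bookkeeping that the patch builds around SendCo. It is not the QEMU code: there is no coroutine and no chardev, the actual write is left to the caller, and the names SendSlot, send_slot_init, send_slot_submit and send_slot_complete are hypothetical. The -ENOMEM path is also an assumption of this sketch (the patch uses g_malloc, which aborts on failure).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the SendCo state in the patch. */
typedef struct SendSlot {
    uint8_t *buf;   /* private copy of the payload while a send is in flight */
    uint32_t size;  /* payload length in bytes */
    bool done;      /* true when the slot is free */
    int ret;        /* result of the last completed send */
} SendSlot;

/* Mark the slot as free, like "s->sendco.done = true" at setup time. */
static void send_slot_init(SendSlot *slot)
{
    memset(slot, 0, sizeof(*slot));
    slot->done = true;
}

/*
 * Try to start a send. Returns 0 if the payload was accepted (or was
 * empty), -ENOBUFS if a previous send is still in flight, and -ENOMEM
 * if the copy could not be allocated (assumption of this sketch).
 */
static int send_slot_submit(SendSlot *slot, const uint8_t *buf, uint32_t size)
{
    if (!size) {
        return 0;
    }
    if (!slot->done) {
        return -ENOBUFS;
    }
    slot->buf = malloc(size);
    if (!slot->buf) {
        return -ENOMEM;
    }
    /* Copy the payload so the caller may reuse its buffer right away. */
    memcpy(slot->buf, buf, size);
    slot->size = size;
    slot->done = false;
    return 0;
}

/* Called once the (modelled) write has finished; frees the slot. */
static void send_slot_complete(SendSlot *slot, int ret)
{
    assert(!slot->done);
    free(slot->buf);
    slot->buf = NULL;
    slot->ret = ret;
    slot->done = true;
}
```

A caller would invoke send_slot_submit() for every packet and treat -ENOBUFS as "sender busy", then invoke send_slot_complete() when the pending write finishes. In the patch itself the completion step runs at the end of the coroutine, followed by aio_wait_kick().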