All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michael Roth <mdroth@linux.vnet.ibm.com>
To: Liu Ping Fan <qemulist@gmail.com>, qemu-devel@nongnu.org
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	Stefan Hajnoczi <stefanha@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v1 1/5] util: introduce gsource event abstraction
Date: Wed, 14 Aug 2013 14:26:20 -0500	[thread overview]
Message-ID: <20130814192620.16563.45392@loki> (raw)
In-Reply-To: <20130808210330.20939.61326@loki>

Quoting Michael Roth (2013-08-08 16:03:30)
> Quoting Liu Ping Fan (2013-08-08 01:26:07)
> > Introduce struct EventsGSource. It will ease the usage of GSource
> > associated with a group of files, which are dynamically allocated
> > and release, ex, slirp.
> > 
> > Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
> > ---
> >  util/Makefile.objs   |  1 +
> >  util/event_gsource.c | 94 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  util/event_gsource.h | 37 +++++++++++++++++++++
> >  3 files changed, 132 insertions(+)
> >  create mode 100644 util/event_gsource.c
> >  create mode 100644 util/event_gsource.h
> > 
> > diff --git a/util/Makefile.objs b/util/Makefile.objs
> > index dc72ab0..eec55bd 100644
> > --- a/util/Makefile.objs
> > +++ b/util/Makefile.objs
> > @@ -11,3 +11,4 @@ util-obj-y += iov.o aes.o qemu-config.o qemu-sockets.o uri.o notify.o
> >  util-obj-y += qemu-option.o qemu-progress.o
> >  util-obj-y += hexdump.o
> >  util-obj-y += crc32c.o
> > +util-obj-y += event_gsource.o
> > diff --git a/util/event_gsource.c b/util/event_gsource.c
> > new file mode 100644
> > index 0000000..4b9fa89
> > --- /dev/null
> > +++ b/util/event_gsource.c
> > @@ -0,0 +1,94 @@
> > +/*
> > + *  Copyright (C) 2013 IBM
> > + *
> > + *  This program is free software; you can redistribute it and/or modify
> > + *  it under the terms of the GNU General Public License as published by
> > + *  the Free Software Foundation; under version 2 of the License.
> > + *
> > + *  This program is distributed in the hope that it will be useful,
> > + *  but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > + *  GNU General Public License for more details.
> > + *
> > + *  You should have received a copy of the GNU General Public License
> > + *  along with this program; if not, see <http://www.gnu.org/licenses/>.
> > + */
> > +
> > +#include "event_gsource.h"
> > +#include "qemu/bitops.h"
> > +
> > +GPollFD *events_source_add_pollfd(EventsGSource *src, int fd)
> > +{
> > +    GPollFD *retfd;
> > +
> > +    retfd = g_slice_alloc(sizeof(GPollFD));
> > +    retfd->events = 0;
> > +    retfd->fd = fd;
> > +    src->pollfds_list = g_list_append(src->pollfds_list, retfd);
> 
> I think moving to a GSource to simplify our mainloop implementation is
> worthwhile even if we still rely on the global mutex and don't initially
> support running those GSources outside the main iothread. But since being
> able to eventually offload NetClient backends to seperate events loops to
> support things like virtio-net dataplane is (I assume) still one of the
> eventual goals, we should consider how to deal with concurrent
> access to EventsGSource members.
> 
> For instance, In the case of slirp your dispatch callback may modify
> src->pollfds_lists via
> slirp_handler->tcp_input->socreate->events_source_add_pollfd(), while
> another thread attempts to call socreate() via something like
> net_slirp_hostfwd_add from the monitor (that's being driven by a separate
> main loop).
> 
> So events_source_add_pollfd() and the various prepare/check/dispatch
> functions should take a lock on pollfds_lists.
> 
> Dispatch is tricky though, since dispatch() invoke callbacks that may in
> turn try to call events_source_add_pollfd(), as is the case above, in which
> case you can deadlock.
> 
> I've been looking at the situation with regard to moving
> qemu_set_fd_handler* to a GSource.
> 
> GLib has to deal with the same issue with regard to calls to
> g_source_attach() while it's iterating through it's list of GSources. It
> uses the GMainContext lock protect that list, and actually doesn't disallow
> use of functions like g_source_attach() within a dispatch function. In
> g_main_context_dispatch(), to work around the potential deadlock issue, it
> actually builds up a separate list of dispatch cb functions and callback data,
> then drops the GMainContext lock before iterating through that list and
> calling the dispatch cb functions for all the GSources that fired.
> This new list it builds up is safe from concurrent modification since
> only the main loop thread can access it.
> 
> AFAIK there's 3 ways to deal with this type of concurrency:
> 
> a) Move to a 1-to-1 mapping between GPollFDs and EventGSources: we can then
>    let GLib handle managing our list of GPollFDs for us. We may still need a
>    mutex for other members of EventsGSource, but that lock can be taken from
>    within our prepare/check/dispatch functions and held for the duration of
>    the calls without any strange deadlock issues.
> 
>    The major downside here is potentially performance. Currently we do an
>    O(n) lookup for stuff like qemu_set_fd_handler, where n is the number of
>    IOHandlers. If we move to 1-to-1 fd-to-GSource mapping, it's O(m), where
>    m is all GSources attached to the GMainContext. I'm not sure what the
>    performance penalty would be, but it will get worse as the number of
>    GSources increases. Not sure if this penalty is applicable for slirp,
>    as it doesn't seem like we need to do any sort of per-socket/fd lookup,
>    since we have a direct pointer to the GPollFD (if you take the approach
>    I mentioned above where you pass a GPollFD* to event_source_add_pollfd())
> 
> b) Stick with this many-to-1 mapping between GPollFDs and EventGSources: we
>    can then introduce variants of events_source_{add,remove}_pollfd
>    that don't take the EventGSource mutex so you can call them inside the
>    dispatch function, which is nasty, because in the case of slirp you'll then
>    end up with similar variants for things like socreate(), or:
> 
> c) Stick with the many-to-1 mapping, but do what glib does and build a list
>    of dispatch callbacks, then drop the EventGSource lock before calling them.
> 
>    (I know for EventGSource we currently have 1 cb for all the FDs, but the
>    requirements are the same, you're just pushing synchronization concerns
>    higher up the stack to
>    slirp_event_source_add_pollfd/slirp_prepare/slirp_handler, and will run
>    into the same recursive locking issue in slirp_handler as a result. I think
>    it's better to handle it all in EventGSource so non-slirp users don't need
>    to implement the same trick, but the approach should be applicable either
>    way)
> 
>    One concern here is that we might remove an event via
>    sofree()->slirp_event_source_remove_pollfd() just after
>    EventGSource->dispatch() drops EventGSource->mutex, so it might still end up
>    dispatching cb for that pfd even though we've deleted it. I think we can
>    have EventGSource set a flag in this case indicating it's in the middle of
>    dispatch, so that event_source_{add,remove}_pfd can wait on a condition
>    variable like EventGSource->cond_event_dispatch_complete before completing.
> 
> I'll be looking at c) WRT to the qemu_set_fd_handler stuff. Perhaps we can
> re-use the same GSourceFuncs here as well, but don't let that bottleneck you,
> just wanted to bring it up for discussion.

Here's the c) approach I was referring to:

https://github.com/mdroth/qemu/blob/9a749a2a1ae93ac1b7d6a1216edaf0ca96ff1edb/iohandler.c#L110

It was actually quite a bit more straightforward: we set a flag during dispatch
that tells registration/de-registration that they cannot modify the event list
until the dispatch_complete condition is issued by GSource's dispatch function
unless that thread owns the GMainContext (which we can easily check via
g_main_context_is_owner() due to the requirement that callers of
g_main_context_dispatch must call g_main_context_acquire)

As a result, we can drop the GSource's mutex prior to walking the list of event
callbacks, and don't even need to build up a second list. The special
consideration is use the Q*_FOREACH__SAFE macros while walking, since callbacks
might modify it while we do so.

Anthony wasn't too enthusiastic about it and after talking with him a bit I
decided to look into a lockless approach for fd-based events, but hopefully it
at least provides a reference for a possible approach to the issue if
lockless isn't a viable option for the GSource, or we don't care about
lock contention due to rapid de/re-registration of events/pfds/fds for a
particular GSource.

> 
> > +    if (fd >= 0) {
> > +        g_source_add_poll(&src->source, retfd);
> > +    }
> > +
> > +    return retfd;
> > +}
> > +
> > +void events_source_remove_pollfd(EventsGSource *src, GPollFD *pollfd)
> > +{
> > +    g_source_remove_poll(&src->source, pollfd);
> > +    src->pollfds_list = g_list_remove(src->pollfds_list, pollfd);
> > +    g_slice_free(GPollFD, pollfd);
> > +}
> > +
> > +static gboolean events_source_check(GSource *src)
> > +{
> > +    EventsGSource *nsrc = (EventsGSource *)src;
> > +    GList *cur;
> > +    GPollFD *gfd;
> > +
> > +    cur = nsrc->pollfds_list;
> > +    while (cur) {
> > +        gfd = cur->data;
> > +        if (gfd->fd >= 0 && (gfd->revents & gfd->events)) {
> > +            return true;
> > +        }
> > +        cur = g_list_next(cur);
> > +    }
> > +
> > +    return false;
> > +}
> > +
> > +static gboolean events_source_dispatch(GSource *src, GSourceFunc cb,
> > +    gpointer data)
> > +{
> > +    gboolean ret = false;
> > +
> > +    if (cb) {
> > +        ret = cb(data);
> > +    }
> > +    return ret;
> > +}
> > +
> > +EventsGSource *events_source_new(GPrepare prepare, GSourceFunc dispatch_cb,
> > +    void *opaque)
> > +{
> > +    EventsGSource *src;
> > +    GSourceFuncs *gfuncs = g_new0(GSourceFuncs, 1);
> > +    gfuncs->prepare = prepare;
> > +    gfuncs->check = events_source_check,
> > +    gfuncs->dispatch = events_source_dispatch,
> > +
> > +    src = (EventsGSource *)g_source_new(gfuncs, sizeof(EventsGSource));
> > +    src->gfuncs = gfuncs;
> > +    src->pollfds_list = NULL;
> > +    src->opaque = opaque;
> > +    g_source_set_callback(&src->source, dispatch_cb, src, NULL);
> > +
> > +    return src;
> > +}
> > +
> > +void events_source_release(EventsGSource *src)
> > +{
> > +    assert(!src->pollfds_list);
> > +    g_free(src->gfuncs);
> > +    g_source_destroy(&src->source);
> > +}
> > diff --git a/util/event_gsource.h b/util/event_gsource.h
> > new file mode 100644
> > index 0000000..8755952
> > --- /dev/null
> > +++ b/util/event_gsource.h
> > @@ -0,0 +1,37 @@
> > +/*
> > + *  Copyright (C) 2013 IBM
> > + *
> > + *  This program is free software; you can redistribute it and/or modify
> > + *  it under the terms of the GNU General Public License as published by
> > + *  the Free Software Foundation; under version 2 of the License.
> > + *
> > + *  This program is distributed in the hope that it will be useful,
> > + *  but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > + *  GNU General Public License for more details.
> > + *
> > + *  You should have received a copy of the GNU General Public License
> > + *  along with this program; if not, see <http://www.gnu.org/licenses/>.
> > + */
> > +
> > +#ifndef EVENT_GSOURCE_H
> > +#define EVENT_GSOURCE_H
> > +#include "qemu-common.h"
> > +
> > +typedef gboolean (*GPrepare)(GSource *source, gint *timeout_);
> > +
> > +/* multi fd drive GSource*/
> > +typedef struct EventsGSource {
> > +    GSource source;
> > +    /* a group of GPollFD which dynamically join or leave the GSource */
> > +    GList *pollfds_list;
> > +    GSourceFuncs *gfuncs;
> > +    void *opaque;
> > +} EventsGSource;
> > +
> > +EventsGSource *events_source_new(GPrepare prepare, GSourceFunc dispatch_cb,
> > +    void *opaque);
> > +void events_source_release(EventsGSource *src);
> > +GPollFD *events_source_add_pollfd(EventsGSource *src, int fd);
> > +void events_source_remove_pollfd(EventsGSource *src, GPollFD *pollfd);
> > +#endif
> > -- 
> > 1.8.1.4

  parent reply	other threads:[~2013-08-14 19:26 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-08-08  6:26 [Qemu-devel] [PATCH v1 0/5] make slirp subsystem self-contained Liu Ping Fan
2013-08-08  6:26 ` [Qemu-devel] [PATCH v1 1/5] util: introduce gsource event abstraction Liu Ping Fan
2013-08-08 16:29   ` Michael Roth
2013-08-09  7:10     ` liu ping fan
2013-08-08 21:03   ` Michael Roth
2013-08-08 21:12     ` Michael Roth
2013-08-14 19:26     ` Michael Roth [this message]
2013-08-08  6:26 ` [Qemu-devel] [PATCH v1 2/5] slirp: make timeout local Liu Ping Fan
2013-08-09  8:06   ` Paolo Bonzini
2013-08-09  8:48     ` liu ping fan
2013-08-09  9:32       ` Paolo Bonzini
2013-08-08  6:26 ` [Qemu-devel] [PATCH v1 3/5] slirp: make slirp event dispatch based on slirp instance Liu Ping Fan
2013-08-08  6:26 ` [Qemu-devel] [PATCH v1 4/5] slirp: decouple timeout for gpoll Liu Ping Fan
2013-08-08  6:26 ` [Qemu-devel] [PATCH v1 5/5] slirp: fold curtime into slirp instance Liu Ping Fan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130814192620.16563.45392@loki \
    --to=mdroth@linux.vnet.ibm.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=qemulist@gmail.com \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.