Date: Tue, 7 Jul 2015 16:08:50 +0100
From: Stefan Hajnoczi
To: Fam Zheng
Cc: Kevin Wolf, pbonzini@redhat.com, qemu-devel@nongnu.org, qemu-block@nongnu.org
Subject: Re: [Qemu-devel] [PATCH RFC 4/4] aio-posix: Use epoll in aio_poll
Message-ID: <20150707150850.GG28673@stefanha-thinkpad.redhat.com>
In-Reply-To: <1435670385-625-5-git-send-email-famz@redhat.com>
References: <1435670385-625-1-git-send-email-famz@redhat.com> <1435670385-625-5-git-send-email-famz@redhat.com>

On Tue, Jun 30, 2015 at 09:19:45PM +0800, Fam Zheng wrote:
> =====================================================================
>   # of scsi-disks  |        master         |         epoll
>                    |   rd     wr   randrw  |   rd     wr   randrw
> ---------------------------------------------------------------------
>         1          |   103    96     49    |   105    99     49
>         4          |   92     96     48    |   103    98     49
>         8          |   96     94     46    |   101    97     50
>         16         |   91     91     45    |   101    95     48
>         32         |   84     83     40    |   95     95     48
>         64         |   75     73     35    |   91     90     44
>         128        |   54     53     26    |   79     80     39
>         256        |   41     39     19    |   63     62     30
> =====================================================================

Nice results!

> @@ -44,6 +47,12 @@ static AioHandler *find_aio_handler(AioContext *ctx, int fd)
>  
>  void aio_context_setup(AioContext *ctx, Error **errp)
>  {
> +#ifdef CONFIG_EPOLL
> +    ctx->epollfd = epoll_create1(EPOLL_CLOEXEC);
> +    if (ctx->epollfd < 0) {
> +        error_setg(errp, "Failed to create epoll fd: %s", strerror(errno));

Slightly more concise:

  error_setg_errno(errp, errno, "Failed to create epoll fd");

> -/* These thread-local variables are used only in a small part of aio_poll
> +#ifdef CONFIG_EPOLL
> +QEMU_BUILD_BUG_ON((int)G_IO_IN != EPOLLIN);
> +QEMU_BUILD_BUG_ON((int)G_IO_OUT != EPOLLOUT);
> +QEMU_BUILD_BUG_ON((int)G_IO_PRI != EPOLLPRI);
> +QEMU_BUILD_BUG_ON((int)G_IO_ERR != EPOLLERR);
> +QEMU_BUILD_BUG_ON((int)G_IO_HUP != EPOLLHUP);

I guess this assumption is okay, but maybe the compiler optimizes:

  event.events = (node->pfd.events & G_IO_IN ? EPOLLIN : 0) |
                 (node->pfd.events & G_IO_OUT ? EPOLLOUT : 0) |
                 (node->pfd.events & G_IO_PRI ? EPOLLPRI : 0) |
                 (node->pfd.events & G_IO_ERR ? EPOLLERR : 0) |
                 (node->pfd.events & G_IO_HUP ? EPOLLHUP : 0);

into:

  event.events = node->pfd.events &
                 (EPOLLIN | EPOLLOUT | EPOLLPRI | EPOLLERR | EPOLLHUP);

which is just an AND instruction, so it's effectively free, and the source
no longer assumes that these constants have the same values.

> +
> +#define EPOLL_BATCH 128
> +static bool aio_poll_epoll(AioContext *ctx, bool blocking)
> +{
> +    AioHandler *node;
> +    bool was_dispatching;
> +    int i, ret;
> +    bool progress;
> +    int64_t timeout;
> +    struct epoll_event events[EPOLL_BATCH];
> +
> +    aio_context_acquire(ctx);
> +    was_dispatching = ctx->dispatching;
> +    progress = false;
> +
> +    /* aio_notify can avoid the expensive event_notifier_set if
> +     * everything (file descriptors, bottom halves, timers) will
> +     * be re-evaluated before the next blocking poll().  This is
> +     * already true when aio_poll is called with blocking == false;
> +     * if blocking == true, it is only true after poll() returns.
> +     *
> +     * If we're in a nested event loop, ctx->dispatching might be true.
> +     * In that case we can restore it just before returning, but we
> +     * have to clear it now.
> +     */
> +    aio_set_dispatching(ctx, !blocking);
> +
> +    ctx->walking_handlers++;
> +
> +    timeout = blocking ? aio_compute_timeout(ctx) : 0;
> +
> +    if (timeout > 0) {
> +        timeout = DIV_ROUND_UP(timeout, 1000000);
> +    }

I think you already posted the timerfd code in an earlier series.  Why
degrade to millisecond precision?  It needs to be fixed up anyway if the
main loop uses aio_poll() in the future.
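
For anyone following the thread, here is a minimal standalone sketch of two points from the review above: translating the event mask flag by flag, and what rounding a nanosecond timeout up to milliseconds costs. It is not QEMU code. The POLL* constants from <poll.h> stand in for GLib's G_IO_* so that it builds without GLib, and the names poll_to_epoll_events(), ns_to_epoll_timeout_ms() and example() are hypothetical. It assumes Linux, where <sys/epoll.h> is available.

```c
#include <assert.h>
#include <limits.h>
#include <poll.h>
#include <stdint.h>
#include <sys/epoll.h>

/*
 * Translate poll(2)-style flags to epoll flags one bit at a time.
 * Assumption: POLL* is used here in place of G_IO_* (no GLib needed).
 * Nothing in this function relies on the two sets of constants having
 * equal values; when they do, the compiler may reduce it to one AND.
 */
static uint32_t poll_to_epoll_events(short pev)
{
    return (uint32_t)((pev & POLLIN  ? EPOLLIN  : 0) |
                      (pev & POLLOUT ? EPOLLOUT : 0) |
                      (pev & POLLPRI ? EPOLLPRI : 0) |
                      (pev & POLLERR ? EPOLLERR : 0) |
                      (pev & POLLHUP ? EPOLLHUP : 0));
}

/*
 * Convert a nanosecond timeout to the millisecond value epoll_wait(2)
 * takes. Negative means "block forever" and maps to -1. Positive values
 * are rounded up so the wait never ends before the deadline, which is
 * exactly where the precision is lost.
 */
static int ns_to_epoll_timeout_ms(int64_t timeout_ns)
{
    if (timeout_ns < 0) {
        return -1;
    }
    /* Divide first, then add the remainder bit, to avoid overflow. */
    int64_t ms = timeout_ns / 1000000 + (timeout_ns % 1000000 != 0);
    return ms > INT_MAX ? INT_MAX : (int)ms;
}

/* Usage example: the values a caller would see. */
static void example(void)
{
    assert(poll_to_epoll_events(POLLIN | POLLHUP) ==
           (uint32_t)(EPOLLIN | EPOLLHUP));
    assert(ns_to_epoll_timeout_ms(1) == 1);        /* 1 ns waits 1 ms */
    assert(ns_to_epoll_timeout_ms(1000001) == 2);  /* just over 1 ms */
}
```

With this rounding, a timer due in 1 ns turns into a 1 ms wait, which is the loss of precision the last question in the review is about.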