From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:54905)
 by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1XEg2F-00050E-5Q for qemu-devel@nongnu.org;
 Tue, 05 Aug 2014 10:48:23 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from ) id 1XEg2A-0006KO-Jt for qemu-devel@nongnu.org;
 Tue, 05 Aug 2014 10:48:19 -0400
Received: from mx1.redhat.com ([209.132.183.28]:18337)
 by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1XEg2A-0006KB-CV for qemu-devel@nongnu.org;
 Tue, 05 Aug 2014 10:48:14 -0400
Date: Tue, 5 Aug 2014 16:47:28 +0200
From: Kevin Wolf
Message-ID: <20140805144728.GH4391@noname.str.redhat.com>
References: <1407209598-2572-1-git-send-email-ming.lei@canonical.com>
 <20140805094844.GF4391@noname.str.redhat.com>
 <20140805134815.GD12251@stefanha-thinkpad.redhat.com>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
 protocol="application/pgp-signature"; boundary="Y5rl02BVI9TCfPar"
Content-Disposition: inline
In-Reply-To: <20140805134815.GD12251@stefanha-thinkpad.redhat.com>
Subject: Re: [Qemu-devel] [PATCH v1 00/17] dataplane: optimization and
 multi virtqueue support
To: Stefan Hajnoczi
Cc: Peter Maydell, Fam Zheng, "Michael S. Tsirkin", Ming Lei,
 qemu-devel, Paolo Bonzini

--Y5rl02BVI9TCfPar
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: 7bit

On 05.08.2014 at 15:48, Stefan Hajnoczi wrote:
> On Tue, Aug 05, 2014 at 06:00:22PM +0800, Ming Lei wrote:
> > On Tue, Aug 5, 2014 at 5:48 PM, Kevin Wolf wrote:
> > > On 05.08.2014 at 05:33, Ming Lei wrote:
> > >> Hi,
> > >>
> > >> These patches bring up below 4 changes:
> > >>         - introduce object allocation pool and apply it to
> > >>         virtio-blk dataplane for improving its performance
> > >>
> > >>         - introduce selective coroutine bypass mechanism
> > >>         for improving performance of virtio-blk dataplane with
> > >>         raw format image
> > >
> > > Before applying any bypassing patches, I think we should understand in
> > > detail where we are losing performance with coroutines enabled.
> >
> > From the below profiling data, CPU becomes slow to run instructions
> > with coroutine, and CPU dcache miss is increased so it is very
> > likely caused by switching stack frequently.
> >
> > http://marc.info/?l=qemu-devel&m=140679721126306&w=2
> >
> > http://pastebin.com/ae0vnQ6V
>
> I have been wondering how to prove that the root cause is the ucontext
> coroutine mechanism (stack switching). Here is an idea:
>
> Hack your "bypass" code path to run the request inside a coroutine.
> That way you can compare "bypass without coroutine" against "bypass with
> coroutine".
>
> Right now I think there are doubts because the bypass code path is
> indeed a different (and not 100% correct) code path. So this approach
> might prove that the coroutines are adding the overhead and not
> something that you bypassed.

My doubts aren't only that the overhead might not come from the
coroutines, but also whether any coroutine-related overhead is really
unavoidable. If we can optimise coroutines, I'd strongly prefer to do
just that instead of introducing additional code paths.

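One way to look at the raw cost of stack switching in isolation, outside
QEMU, might be a small ping-pong microbenchmark along the following
lines. This is only a sketch under stated assumptions: it uses plain
POSIX ucontext (getcontext/makecontext/swapcontext) rather than QEMU's
coroutine implementation, and every name in it (work, co_entry,
run_direct, run_switched, time_ns, report_switch_cost) is made up for
illustration. It compares n direct calls against the same n calls made
behind a switch into a second stack and back.

```c
/* Sketch only, not QEMU code: compares a direct function call with the
 * same work done behind a ucontext stack switch. All names are
 * hypothetical. */
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>
#include <ucontext.h>

#define CO_STACK_SIZE (64 * 1024)

static ucontext_t main_ctx, co_ctx;
static char co_stack[CO_STACK_SIZE];
static volatile uint64_t counter;   /* stands in for per-request work */

static void work(void)
{
    counter++;
}

/* Coroutine body: do one unit of work, then yield back to the caller. */
static void co_entry(void)
{
    for (;;) {
        work();
        swapcontext(&co_ctx, &main_ctx);
    }
}

/* Baseline: n units of work without any stack switch. */
static uint64_t run_direct(uint64_t n)
{
    counter = 0;
    for (uint64_t i = 0; i < n; i++) {
        work();
    }
    return counter;
}

/* Same work, but each unit costs two switches (enter and yield). */
static uint64_t run_switched(uint64_t n)
{
    int rc = getcontext(&co_ctx);
    assert(rc == 0);
    (void)rc;
    co_ctx.uc_stack.ss_sp = co_stack;
    co_ctx.uc_stack.ss_size = sizeof(co_stack);
    co_ctx.uc_link = &main_ctx;
    makecontext(&co_ctx, co_entry, 0);

    counter = 0;
    for (uint64_t i = 0; i < n; i++) {
        swapcontext(&main_ctx, &co_ctx);
    }
    return counter;
}

/* Wall-clock nanoseconds spent in fn(n). */
static uint64_t time_ns(uint64_t (*fn)(uint64_t), uint64_t n)
{
    struct timespec a, b;

    clock_gettime(CLOCK_MONOTONIC, &a);
    fn(n);
    clock_gettime(CLOCK_MONOTONIC, &b);
    int64_t ns = (int64_t)(b.tv_sec - a.tv_sec) * 1000000000LL
                 + (b.tv_nsec - a.tv_nsec);
    return (uint64_t)ns;
}

/* Print both timings for n units of work. */
void report_switch_cost(uint64_t n)
{
    printf("direct:   %llu ns\n",
           (unsigned long long)time_ns(run_direct, n));
    printf("switched: %llu ns\n",
           (unsigned long long)time_ns(run_switched, n));
}
```

Calling report_switch_cost() from a main() with a large n prints the two
timings. The numbers should be read with care: glibc's swapcontext()
also saves and restores the signal mask on each switch, so this sketch
probably overstates what a setjmp/longjmp-style switch costs, and it
says nothing about cache effects of a real request path.
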
Another thought I had was this: If the performance difference is indeed
caused only by coroutines, then it is completely inside the block layer
and we don't actually need a VM to test it. We could instead use
something like a simple qemu-img based benchmark and should observe the
same difference there.

I played a bit with the following; I hope it's not too naive. I couldn't
see a difference with your patches, but at least one reason for this is
probably that my laptop SSD isn't fast enough to make the CPU the
bottleneck. I haven't tried a ramdisk yet; that would probably be the
next step. (I actually wrote the patch just for some profiling of my
own, not for comparing throughput, but it should be usable for that as
well.)

Kevin


diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
index d029609..ae64b3d 100644
--- a/qemu-img-cmds.hx
+++ b/qemu-img-cmds.hx
@@ -9,6 +9,12 @@ STEXI
 @table @option
 ETEXI
 
+DEF("bench", img_bench,
+    "bench [-q] [-f fmt] [-n] [-t cache] filename")
+STEXI
+@item bench [-q] [-f @var{fmt}] [-n] [-t @var{cache}] filename
+ETEXI
+
 DEF("check", img_check,
     "check [-q] [-f fmt] [--output=ofmt] [-r [leaks | all]] filename")
 STEXI
diff --git a/qemu-img.c b/qemu-img.c
index d4518e7..92e9529 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -2789,6 +2789,132 @@ out:
     return 0;
 }
 
+typedef struct BenchData {
+    BlockDriverState *bs;
+    int bufsize;
+    int nrreq;
+    int n;
+    uint8_t *buf;
+    QEMUIOVector *qiov;
+
+    int in_flight;
+    uint64_t sector;
+} BenchData;
+
+static void bench_cb(void *opaque, int ret)
+{
+    BenchData *b = opaque;
+    BlockDriverAIOCB *acb;
+
+    if (ret < 0) {
+        error_report("Failed request: %s", strerror(-ret));
+        exit(EXIT_FAILURE);
+    }
+    if (b->in_flight > 0) {
+        b->n--;
+        b->in_flight--;
+    }
+
+    while (b->n > b->in_flight && b->in_flight < b->nrreq) {
+        acb = bdrv_aio_readv(b->bs, b->sector, b->qiov,
+                             b->bufsize >> BDRV_SECTOR_BITS,
+                             bench_cb, b);
+        if (!acb) {
+            error_report("Failed to issue request");
+            exit(EXIT_FAILURE);
+        }
+        b->in_flight++;
+        b->sector += b->bufsize >> BDRV_SECTOR_BITS;
+        b->sector %= b->bs->total_sectors;
+    }
+}
+
+static int img_bench(int argc, char **argv)
+{
+    int c, ret = 0;
+    const char *fmt = NULL, *filename;
+    bool quiet = false;
+    BlockDriverState *bs = NULL;
+    int flags = BDRV_O_FLAGS;
+    int i;
+
+    for (;;) {
+        c = getopt(argc, argv, "hf:nqt:");
+        if (c == -1) {
+            break;
+        }
+
+        switch (c) {
+        case 'h':
+        case '?':
+            help();
+            break;
+        case 'f':
+            fmt = optarg;
+            break;
+        case 'n':
+            flags |= BDRV_O_NATIVE_AIO;
+            break;
+        case 'q':
+            quiet = true;
+            break;
+        case 't':
+            ret = bdrv_parse_cache_flags(optarg, &flags);
+            if (ret < 0) {
+                error_report("Invalid cache mode");
+                ret = -1;
+                goto out;
+            }
+            break;
+        }
+    }
+
+    if (optind != argc - 1) {
+        error_exit("Expecting one image file name");
+    }
+    filename = argv[argc - 1];
+
+    bs = bdrv_new_open("image", filename, fmt, flags, true, quiet);
+    if (!bs) {
+        error_report("Could not open image '%s'", filename);
+        ret = -1;
+        goto out;
+    }
+
+    BenchData data = {
+        .bs = bs,
+        .bufsize = 0x1000,
+        .nrreq = 64,
+        .n = 75000,
+    };
+
+    data.buf = qemu_blockalign(bs, data.nrreq * data.bufsize);
+    data.qiov = g_new(QEMUIOVector, data.nrreq);
+    for (i = 0; i < data.nrreq; i++) {
+        qemu_iovec_init(&data.qiov[i], 1);
+        qemu_iovec_add(&data.qiov[i],
+                       data.buf + i * data.bufsize, data.bufsize);
+    }
+
+    bench_cb(&data, 0);
+
+    while (data.n > 0) {
+        main_loop_wait(false);
+    }
+
+out:
+    qemu_vfree(data.buf);
+    if (bs) {
+        bdrv_unref(bs);
+    }
+
+    if (ret) {
+        return 1;
+    }
+    return 0;
+}
+
+
 static const img_cmd_t img_cmds[] = {
 #define DEF(option, callback, arg_string)        \
     { option, callback },

--Y5rl02BVI9TCfPar
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)

iQIcBAEBAgAGBQJT4O6AAAoJEH8JsnLIjy/WnqAP/imO9nIH+xjvKaUUoTkr1ZpR
T8V3uS1BaoD2jwSKUEyqyq6VvoE5uYWI754TUYW+fGEsOFvihn20E8th07LsbCGF
QwTr5T4de3dwQrjPmuf14LVxxF1AXb6V+kzBO3cD7QcTqsnbXlZkGxXF9XarmcE4
8roRLxRTlRWi27qvLY4GO6B335FymadUIXAL5yhZzjs2psmf9xcFZykoT68X75Ed
tNb2MVojEkCZkbsJXhhoh8ySeU2lv0t/dA6bBgfRkIFklCK7FJBXPpXQcYjBLfaf
MUNvmVuP95G9Gfo8JoKLWTnr643aoQj30eNJ+/2beY80d1qGfWpDZ0syST3Ix5lq
LUMZsv80gIGFaZTuVI9b+FPWhv9VbNYZ5m4IKLtGKxLBphMnMRNLFtPds/bR0oJj
+L5JKvkPf7d8Gji8ya2HbV6dd0NNM9nQnGIJMq0G+HEzNL/ISkK+CjcClBfML19A
jDjim3PYp9qfQWgFBi7g8lxJrw8obthds1INjjC53vWfCvrCaP9AAMamWqwAs0UJ
Ehp9z25j7z4UxwCiT5kikvyIbfPWjdMYvOqTzdB7rnYmMJV1FDqBIkkSBeX9l/3i
YZvTXLOxp7+Yl/0k0ARVqNp+WZUCQWqgPzj4MxiQKv7HHsg9nodLRS0XlZgMlcC0
AOGr7lmmXJdszIXD5R2t
=7Pdv
-----END PGP SIGNATURE-----

--Y5rl02BVI9TCfPar--