From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:42775)
 by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1bJ170-00087r-IZ for qemu-devel@nongnu.org;
 Fri, 01 Jul 2016 12:16:15 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from ) id 1bJ16v-0002aG-3L for qemu-devel@nongnu.org;
 Fri, 01 Jul 2016 12:16:13 -0400
Received: from mail-wm0-x22e.google.com ([2a00:1450:400c:c09::22e]:38083)
 by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
 id 1bJ16u-0002aB-I1 for qemu-devel@nongnu.org;
 Fri, 01 Jul 2016 12:16:09 -0400
Received: by mail-wm0-x22e.google.com with SMTP id r201so35795996wme.1
 for ; Fri, 01 Jul 2016 09:16:08 -0700 (PDT)
From: =?UTF-8?q?Alex=20Benn=C3=A9e?=
Date: Fri, 1 Jul 2016 17:16:08 +0100
Message-Id: <1467389770-9738-1-git-send-email-alex.bennee@linaro.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Subject: [Qemu-devel] [PATCH 0/2] Reduce lock contention on TCG hot-path
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: mttcg@listserver.greensocs.com, qemu-devel@nongnu.org,
 fred.konrad@greensocs.com, a.rigo@virtualopensystems.com,
 serge.fdrv@gmail.com, cota@braap.org, bobby.prani@gmail.com,
 rth@twiddle.net
Cc: mark.burton@greensocs.com, pbonzini@redhat.com,
 jan.kiszka@siemens.com, peter.maydell@linaro.org,
 claudio.fontana@huawei.com, =?UTF-8?q?Alex=20Benn=C3=A9e?=

These patches have been on the list before in my base enabling patches
series [1]. However, while looking at some user-space workloads I
realised there is no particular reason to hold them back until the MTTCG
work is complete. I fixed one missing atomic_set in Sergey's patch and
addressed his review comments from the last posting.
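For readers who have not followed the earlier series, here is a rough
sketch of the general pattern: check a per-index cache with an atomic
load and no lock held, and only take the lock on a miss. This is not the
QEMU code. Every name below (Block, jmp_cache, table_lock, find_fast,
find_slow) is made up for the illustration, and the C11 atomics stand in
for whatever primitives the real patches use.

```c
/* Illustrative sketch only: simplified, hypothetical names, not QEMU code. */
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define JMP_CACHE_SIZE 256
#define TABLE_SIZE     1024

typedef struct Block {
    uint32_t pc;                /* guest address this block was built for */
} Block;

/* Direct-mapped cache, read without the lock on the fast path. */
static _Atomic(Block *) jmp_cache[JMP_CACHE_SIZE];

/* Authoritative table, only touched with table_lock held. */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static Block table[TABLE_SIZE];
static size_t table_used;
static unsigned slow_path_hits; /* counts how often the lock was taken */

static size_t cache_index(uint32_t pc)
{
    return pc % JMP_CACHE_SIZE;
}

/* Slow path: look up or create the block under the lock, then publish it. */
static Block *find_slow(uint32_t pc)
{
    Block *b = NULL;

    pthread_mutex_lock(&table_lock);
    slow_path_hits++;
    for (size_t i = 0; i < table_used; i++) {
        if (table[i].pc == pc) {
            b = &table[i];
            break;
        }
    }
    if (!b) {
        assert(table_used < TABLE_SIZE);
        b = &table[table_used++];
        b->pc = pc;
    }
    /* Release store: the block is fully initialised before it is visible. */
    atomic_store_explicit(&jmp_cache[cache_index(pc)], b,
                          memory_order_release);
    pthread_mutex_unlock(&table_lock);
    return b;
}

/* Fast path: a single atomic load, no lock unless the cache misses. */
static Block *find_fast(uint32_t pc)
{
    Block *b = atomic_load_explicit(&jmp_cache[cache_index(pc)],
                                    memory_order_acquire);
    if (b && b->pc == pc) {
        return b;
    }
    return find_slow(pc);
}
```

In this toy version a repeated lookup of the same pc never touches the
mutex, which is the effect the series is after on the hot path.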

For a simple parallel user-mode test it gives a ~5% speed boost:

Before:

retry.py called with ['./arm-linux-user/qemu-arm', './pigz.armhf', '-c', '-9', 'source.tar']
Source code is @ pull-target-arm-20160627-153-g1b756f1 or heads/master
run 1: ret=0 (PASS), time=4.755824 (1/1)
run 2: ret=0 (PASS), time=4.756076 (2/2)
run 3: ret=0 (PASS), time=4.755916 (3/3)
run 4: ret=0 (PASS), time=4.755853 (4/4)
run 5: ret=0 (PASS), time=4.755929 (5/5)
Results summary:
0: 5 times (100.00%), avg time 4.755920 (0.000000 deviation)

After:

retry.py called with ['./arm-linux-user/qemu-arm', './pigz.armhf', '-c', '-9', 'source.tar']
Source code is @ pull-target-arm-20160627-155-g579ffd4 or heads/tcg/hot-path-cleanups
run 1: ret=0 (PASS), time=4.505735 (1/1)
run 2: ret=0 (PASS), time=4.505683 (2/2)
run 3: ret=0 (PASS), time=4.505666 (3/3)
run 4: ret=0 (PASS), time=4.505578 (4/4)
run 5: ret=0 (PASS), time=4.505544 (5/5)
Results summary:
0: 5 times (100.00%), avg time 4.505641 (0.000000 deviation)

For system-mode the change is in the noise, despite the fact that by
dropping the CONFIG_USER_ONLY specific stuff we will run
tb_find_physical twice on the first lookup for any given block.

Before:

retry.py called with ['/home/alex/lsrc/qemu/qemu.git/arm-softmmu/qemu-system-arm', '-machine', 'type=virt', '-display', 'none', '-smp', '1', '-m', '4096', '-cpu', 'cortex-a15', '-serial', 'telnet:127.0.0.1:4444', '-monitor', 'stdio', '-netdev', 'user,id=unet,hostfwd=tcp::2222-:22', '-device', 'virtio-net-device,netdev=unet', '-drive', 'file=/home/alex/lsrc/qemu/images/jessie-arm32.qcow2,id=myblock,index=0,if=none', '-device', 'virtio-blk-device,drive=myblock', '-append', 'console=ttyAMA0 systemd.unit=benchmark.service root=/dev/vda1', '-kernel', '/home/alex/lsrc/qemu/images/aarch32-current-linux-kernel-only.img']
Source code is @ pull-target-arm-20160627-153-g1b756f1 or heads/master
run 1: ret=0 (PASS), time=10.262175 (1/1)
run 2: ret=0 (PASS), time=10.262821 (2/2)
run 3: ret=0 (PASS), time=9.762559 (3/3)
run 4: ret=0 (PASS), time=9.762108 (4/4)
run 5: ret=0 (PASS), time=10.262576 (5/5)
Results summary:
0: 5 times (100.00%), avg time 10.062448 (0.060046 deviation)
Ran command 5 times, 5 passes

After:

retry.py called with ['/home/alex/lsrc/qemu/qemu.git/arm-softmmu/qemu-system-arm', '-machine', 'type=virt', '-display', 'none', '-smp', '1', '-m', '4096', '-cpu', 'cortex-a15', '-serial', 'telnet:127.0.0.1:4444', '-monitor', 'stdio', '-netdev', 'user,id=unet,hostfwd=tcp::2222-:22', '-device', 'virtio-net-device,netdev=unet', '-drive', 'file=/home/alex/lsrc/qemu/images/jessie-arm32.qcow2,id=myblock,index=0,if=none', '-device', 'virtio-blk-device,drive=myblock', '-append', 'console=ttyAMA0 systemd.unit=benchmark.service root=/dev/vda1', '-kernel', '/home/alex/lsrc/qemu/images/aarch32-current-linux-kernel-only.img']
Source code is @ pull-target-arm-20160627-155-g579ffd4 or heads/tcg/hot-path-cleanups
run 1: ret=0 (PASS), time=9.761559 (1/1)
run 2: ret=0 (PASS), time=9.511616 (2/2)
run 3: ret=0 (PASS), time=9.761713 (3/3)
run 4: ret=0 (PASS), time=10.262504 (4/4)
run 5: ret=0 (PASS), time=9.762059 (5/5)
Results summary:
0: 5 times (100.00%), avg time 9.811890 (0.060150 deviation)
Ran command 5 times, 5 passes

[1] https://www.mail-archive.com/qemu-devel@nongnu.org/msg375023.html

Alex Bennée (1):
  cpu-exec: remove tb_lock from the hot-path

Sergey Fedorov (1):
  tcg: Ensure safe tb_jmp_cache lookup out of 'tb_lock'

 cpu-exec.c      | 58 ++++++++++++++++++++++++++++----------------------------
 translate-all.c |  7 ++++++-
 2 files changed, 35 insertions(+), 30 deletions(-)

-- 
2.7.4