From: Fam Zheng
Date: Thu, 11 Sep 2014 18:16:37 +0800
Message-Id: <1410430599-27540-1-git-send-email-famz@redhat.com>
Subject: [Qemu-devel] [PATCH 0/2] virtio-scsi: Optimizing request allocation
To: qemu-devel@nongnu.org
Cc: Paolo Bonzini

Zeroing is relatively expensive since we have big request structures.
VirtQueueElement (> 4k) and sense_buf (256 bytes) are two points to look at.

This series visibly reduces the overhead of request handling when testing with
the unmerged "null" driver and virtio-scsi dataplane.

Before the series, the issue is very obvious in perf top:

    perf top -G -p `pidof qemu-system-x86_64`
    -----------------------------------------
    +  16.50%  libc-2.17.so        [.] __memset_sse2
    +   2.28%  libc-2.17.so        [.] _int_malloc
    +   2.25%  [vdso]              [.] 0x0000000000000cd1
    +   2.02%  [kernel]            [k] _raw_spin_lock_irqsave
    +   1.97%  libpthread-2.17.so  [.] pthread_mutex_lock
    +   1.87%  libpthread-2.17.so  [.] pthread_mutex_unlock
    +   1.81%  [kernel]            [k] fget_light
    +   1.70%  libc-2.17.so        [.] malloc

After the series, the high __memset_sse2 and _int_malloc overhead is gone:

    perf top -G -p `pidof qemu-system-x86_64`
    -----------------------------------------
    +   4.20%  [kernel]            [k] vcpu_enter_guest
    +   3.97%  [kernel]            [k] vmx_vcpu_run
    +   2.63%  [kernel]            [k] _raw_spin_lock_irqsave
    +   1.72%  [kernel]            [k] native_read_msr_safe
    +   1.65%  [kernel]            [k] __srcu_read_lock
    +   1.64%  [kernel]            [k] _raw_spin_unlock_irqrestore
    +   1.57%  [vdso]              [.] 0x00000000000008d8
    +   1.49%  libc-2.17.so        [.] _int_malloc
    +   1.29%  libpthread-2.17.so  [.] pthread_mutex_unlock
    +   1.26%  [kernel]            [k] native_write_msr_safe

See the commit message of patch 2 for some fio test data.

Thanks,
Fam

Fam Zheng (2):
  scsi: Optimize scsi_req_alloc
  virtio-scsi: Optimize virtio_scsi_init_req

 hw/scsi/scsi-bus.c              |  7 +++++--
 hw/scsi/virtio-scsi.c           | 17 ++++++++++-------
 include/hw/scsi/scsi.h          | 21 ++++++++++++++-------
 include/hw/virtio/virtio-scsi.h |  1 +
 4 files changed, 30 insertions(+), 16 deletions(-)

-- 
1.9.3
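
For readers who want a concrete picture of why zeroing a large request
structure shows up in the profile, here is a minimal sketch. It is not the
code from the patches. DemoReq and the demo_req_* helpers are hypothetical
names, the layout is invented for illustration, and the sketch assumes the
large buffers are always written before they are read, so only the small
leading fields need to be cleared.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a request with large trailing buffers.
 * This is NOT the actual QEMU structure layout. */
typedef struct DemoReq {
    /* Small control fields that must start out as zero. */
    int      status;
    unsigned sense_len;
    void    *opaque;
    /* Large buffers, assumed to be filled in before they are read,
     * so they do not need to be cleared at allocation time. */
    unsigned char sense[256];
    unsigned char elem[4096];
} DemoReq;

/* Baseline: clear the whole structure on every allocation. */
static DemoReq *demo_req_alloc_full(void)
{
    return calloc(1, sizeof(DemoReq));
}

/* Number of bytes cleared by the partial variant: everything laid
 * out before the first large buffer. */
static size_t demo_req_zeroed_bytes(void)
{
    return offsetof(DemoReq, sense);
}

/* Sketch: allocate without clearing, then zero only the leading
 * control fields. The buffers are left uninitialized on purpose. */
static DemoReq *demo_req_alloc_partial(void)
{
    DemoReq *req = malloc(sizeof(*req));

    assert(req != NULL);
    memset(req, 0, demo_req_zeroed_bytes());
    return req;
}
```

With this layout the partial variant clears only the few bytes in front of
sense instead of more than 4k per request. The trade-off is that fields must
be ordered so everything that needs zeroing comes first, and any code that
reads the buffers before writing them would see garbage.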