* [Qemu-devel] [PATCH v3 0/2] virtio-scsi: Optimizing request allocation
@ 2014-09-16  7:20 Fam Zheng
From: Fam Zheng @ 2014-09-16  7:20 UTC
  To: qemu-devel; +Cc: Paolo Bonzini

v3: Small tweak on "cmd" in 1/2 and "sreq" in 2/2.

Zeroing is relatively expensive because the request structures are large.
VirtQueueElement (>48k!) and sense_buf (256 bytes) are the two places to look at.
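
To illustrate the general idea (this is not the code from the series), here is
a minimal sketch of clearing only the small leading part of a request instead
of the whole structure. DemoReq, its fields, and the demo_req_init_* helpers
are hypothetical names made up for this example; only the ~48k size is taken
from the figure above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a request that embeds a large buffer. */
#define DEMO_PAYLOAD_SIZE (48 * 1024)

typedef struct DemoReq {
    /* Small header fields that must start out as zero. */
    uint32_t tag;
    uint32_t lun;
    int status;
    /*
     * Assumption for this sketch: everything from here on is written
     * before it is read, so it need not be cleared for every request.
     */
    uint8_t payload[DEMO_PAYLOAD_SIZE];
} DemoReq;

/* Baseline: clear the whole structure, including the large payload. */
static void demo_req_init_full(DemoReq *req)
{
    assert(req != NULL);
    memset(req, 0, sizeof(*req));
}

/* Cheaper variant: clear only the bytes in front of the payload. */
static void demo_req_init_partial(DemoReq *req)
{
    assert(req != NULL);
    memset(req, 0, offsetof(DemoReq, payload));
}
```

The partial variant is only safe if every field placed after the cut-off is
really initialized before use, so the field order in the structure becomes
part of the contract.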

This visibly reduces the overhead of request handling when testing with the
unmerged "null" driver and virtio-scsi dataplane. Before the series, the
problem is obvious in perf top:

perf top -G -p `pidof qemu-system-x86_64`
-----------------------------------------
+  16.50%  libc-2.17.so             [.] __memset_sse2
+   2.28%  libc-2.17.so             [.] _int_malloc
+   2.25%  [vdso]                   [.] 0x0000000000000cd1
+   2.02%  [kernel]                 [k] _raw_spin_lock_irqsave
+   1.97%  libpthread-2.17.so       [.] pthread_mutex_lock
+   1.87%  libpthread-2.17.so       [.] pthread_mutex_unlock
+   1.81%  [kernel]                 [k] fget_light
+   1.70%  libc-2.17.so             [.] malloc

After the series, __memset_sse2 no longer shows up near the top and
_int_malloc drops from 2.28% to 1.49%:

perf top -G -p `pidof qemu-system-x86_64`
-----------------------------------------
+   4.20%  [kernel]                 [k] vcpu_enter_guest
+   3.97%  [kernel]                 [k] vmx_vcpu_run
+   2.63%  [kernel]                 [k] _raw_spin_lock_irqsave
+   1.72%  [kernel]                 [k] native_read_msr_safe
+   1.65%  [kernel]                 [k] __srcu_read_lock
+   1.64%  [kernel]                 [k] _raw_spin_unlock_irqrestore
+   1.57%  [vdso]                   [.] 0x00000000000008d8
+   1.49%  libc-2.17.so             [.] _int_malloc
+   1.29%  libpthread-2.17.so       [.] pthread_mutex_unlock
+   1.26%  [kernel]                 [k] native_write_msr_safe

See the commit message of patch 2 for some fio test data.

Thanks,
Fam


Fam Zheng (2):
  scsi: Optimize scsi_req_alloc
  virtio-scsi: Optimize virtio_scsi_init_req

 hw/scsi/scsi-bus.c     |  8 +++++---
 hw/scsi/virtio-scsi.c  | 24 +++++++++++++++++-------
 include/hw/scsi/scsi.h | 21 ++++++++++++++-------
 3 files changed, 36 insertions(+), 17 deletions(-)

-- 
1.9.3
