* [Qemu-devel] [PATCH v3 0/2] virtio-scsi: Optimizing request allocation
@ 2014-09-16  7:20 Fam Zheng
  2014-09-16  7:20 ` [Qemu-devel] [PATCH v3 1/2] scsi: Optimize scsi_req_alloc Fam Zheng
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Fam Zheng @ 2014-09-16  7:20 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini

v3: Small tweak on "cmd" in 1/2 and "sreq" in 2/2.

Zeroing is relatively expensive because our request structures are big.
VirtQueueElement (>48k!) and sense_buf (256 bytes) are the two places to look at.
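
To illustrate the idea, here is a minimal sketch, not the code from these
patches. It assumes a hypothetical DemoReq structure whose small fields come
first and whose large buffers come last, so the allocator can clear only the
leading fields and leave the buffers uninitialized. All names (DemoReq,
demo_req_alloc, DEMO_ELEM_LEN, ...) are made up for the example; the sizes
only mirror the figures quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sizes, chosen to mirror the figures quoted above. */
#define DEMO_SENSE_LEN 256
#define DEMO_ELEM_LEN  (48 * 1024)

/* Hypothetical request layout: small fields first, big buffers last. */
typedef struct DemoReq {
    int tag;
    int status;
    size_t sense_len;   /* 0 means "sense[] holds no valid data yet" */
    /* Everything from here on is left uninitialized by demo_req_alloc(). */
    unsigned char sense[DEMO_SENSE_LEN];
    unsigned char elem[DEMO_ELEM_LEN];
} DemoReq;

/* Baseline: calloc() zeroes the whole structure on every request. */
DemoReq *demo_req_alloc_zeroed(int tag)
{
    DemoReq *req = calloc(1, sizeof(*req));
    if (req) {
        req->tag = tag;
    }
    return req;
}

/* Cheaper variant: malloc(), then clear only the bytes before the buffers. */
DemoReq *demo_req_alloc(int tag)
{
    DemoReq *req = malloc(sizeof(*req));
    if (!req) {
        return NULL;
    }
    memset(req, 0, offsetof(DemoReq, sense));
    req->tag = tag;
    return req;
}

/* Number of bytes the cheaper variant actually clears per request. */
size_t demo_zeroed_bytes(void)
{
    return offsetof(DemoReq, sense);
}
```

The catch with this pattern is that nothing may read sense[] or elem[] before
writing them, which is why the sketch keeps an explicit sense_len that starts
at zero.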

This visibly reduces the overhead of request handling when testing with the
unmerged "null" driver and virtio-scsi dataplane. Before the change, the issue
is very obvious in perf top:

perf top -G -p `pidof qemu-system-x86_64`
-----------------------------------------
+  16.50%  libc-2.17.so             [.] __memset_sse2
+   2.28%  libc-2.17.so             [.] _int_malloc
+   2.25%  [vdso]                   [.] 0x0000000000000cd1
+   2.02%  [kernel]                 [k] _raw_spin_lock_irqsave
+   1.97%  libpthread-2.17.so       [.] pthread_mutex_lock
+   1.87%  libpthread-2.17.so       [.] pthread_mutex_unlock
+   1.81%  [kernel]                 [k] fget_light
+   1.70%  libc-2.17.so             [.] malloc

After, __memset_sse2 is gone from the top entries and _int_malloc drops from
2.28% to 1.49%:

perf top -G -p `pidof qemu-system-x86_64`
-----------------------------------------
+   4.20%  [kernel]                 [k] vcpu_enter_guest
+   3.97%  [kernel]                 [k] vmx_vcpu_run
+   2.63%  [kernel]                 [k] _raw_spin_lock_irqsave
+   1.72%  [kernel]                 [k] native_read_msr_safe
+   1.65%  [kernel]                 [k] __srcu_read_lock
+   1.64%  [kernel]                 [k] _raw_spin_unlock_irqrestore
+   1.57%  [vdso]                   [.] 0x00000000000008d8
+   1.49%  libc-2.17.so             [.] _int_malloc
+   1.29%  libpthread-2.17.so       [.] pthread_mutex_unlock
+   1.26%  [kernel]                 [k] native_write_msr_safe

See the commit message of patch 2 for some fio test data.

Thanks,
Fam


Fam Zheng (2):
  scsi: Optimize scsi_req_alloc
  virtio-scsi: Optimize virtio_scsi_init_req

 hw/scsi/scsi-bus.c     |  8 +++++---
 hw/scsi/virtio-scsi.c  | 24 +++++++++++++++++-------
 include/hw/scsi/scsi.h | 21 ++++++++++++++-------
 3 files changed, 36 insertions(+), 17 deletions(-)

-- 
1.9.3


Thread overview: 4+ messages
2014-09-16  7:20 [Qemu-devel] [PATCH v3 0/2] virtio-scsi: Optimizing request allocation Fam Zheng
2014-09-16  7:20 ` [Qemu-devel] [PATCH v3 1/2] scsi: Optimize scsi_req_alloc Fam Zheng
2014-09-16  7:20 ` [Qemu-devel] [PATCH v3 2/2] virtio-scsi: Optimize virtio_scsi_init_req Fam Zheng
2014-09-16  8:19 ` [Qemu-devel] [PATCH v3 0/2] virtio-scsi: Optimizing request allocation Paolo Bonzini
