From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:51100)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1XTnz3-0000iV-0i for qemu-devel@nongnu.org;
	Tue, 16 Sep 2014 04:19:39 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1XTnyw-0001S8-KA for qemu-devel@nongnu.org;
	Tue, 16 Sep 2014 04:19:32 -0400
Received: from mx1.redhat.com ([209.132.183.28]:5444)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1XTnyw-0001Rj-BU for qemu-devel@nongnu.org;
	Tue, 16 Sep 2014 04:19:26 -0400
Received: from int-mx13.intmail.prod.int.phx2.redhat.com
	(int-mx13.intmail.prod.int.phx2.redhat.com [10.5.11.26])
	by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id s8G8JMaV014126
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL)
	for ; Tue, 16 Sep 2014 04:19:22 -0400
Message-ID: <5417F287.6080303@redhat.com>
Date: Tue, 16 Sep 2014 10:19:19 +0200
From: Paolo Bonzini
MIME-Version: 1.0
References: <1410852018-18025-1-git-send-email-famz@redhat.com>
In-Reply-To: <1410852018-18025-1-git-send-email-famz@redhat.com>
Content-Type: text/plain; charset=iso-8859-15
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH v3 0/2] virtio-scsi: Optimizing request allocation
To: Fam Zheng , qemu-devel@nongnu.org

On 16/09/2014 09:20, Fam Zheng wrote:
> v3: Small tweak on "cmd" in 1/2 and "sreq" in 2/2.
>
> Zeroing is relatively expensive since we have big request structures.
> VirtQueueElement (>48k!) and sense_buf (256 bytes) are two points to look at.
>
> This visibly reduces overhead of request handling when testing with the
> unmerged "null" driver and virtio-scsi dataplane. Before, the issue is very
> obvious with perf top:
>
>   perf top -G -p `pidof qemu-system-x86_64`
>   -----------------------------------------
>   +  16.50%  libc-2.17.so        [.] __memset_sse2
>   +   2.28%  libc-2.17.so        [.] _int_malloc
>   +   2.25%  [vdso]              [.] 0x0000000000000cd1
>   +   2.02%  [kernel]            [k] _raw_spin_lock_irqsave
>   +   1.97%  libpthread-2.17.so  [.] pthread_mutex_lock
>   +   1.87%  libpthread-2.17.so  [.] pthread_mutex_unlock
>   +   1.81%  [kernel]            [k] fget_light
>   +   1.70%  libc-2.17.so        [.] malloc
>
> After, the high __memset_sse2 and _int_malloc is gone:
>
>   perf top -G -p `pidof qemu-system-x86_64`
>   -----------------------------------------
>   +   4.20%  [kernel]            [k] vcpu_enter_guest
>   +   3.97%  [kernel]            [k] vmx_vcpu_run
>   +   2.63%  [kernel]            [k] _raw_spin_lock_irqsave
>   +   1.72%  [kernel]            [k] native_read_msr_safe
>   +   1.65%  [kernel]            [k] __srcu_read_lock
>   +   1.64%  [kernel]            [k] _raw_spin_unlock_irqrestore
>   +   1.57%  [vdso]              [.] 0x00000000000008d8
>   +   1.49%  libc-2.17.so        [.] _int_malloc
>   +   1.29%  libpthread-2.17.so  [.] pthread_mutex_unlock
>   +   1.26%  [kernel]            [k] native_write_msr_safe
>
> See the commit message of patch 2 for some fio test data.

Thanks, applied to scsi-next.

Paolo