Date: Tue, 11 Oct 2016 14:57:49 +0800
From: Yuanhan Liu
Message-ID: <20161011065749.GO1597@yliu-dev.sh.intel.com>
Subject: Re: [Qemu-devel] [PATCH 1/2] vhost: enable any layout feature
To: "Michael S. Tsirkin"
Cc: "Wang, Zhihong", Maxime Coquelin, Stephen Hemminger, "dev@dpdk.org", "qemu-devel@nongnu.org"

On Mon, Oct 10, 2016 at 07:39:59AM +0300, Michael S. Tsirkin wrote:
> > > > > > 1. why is this memset expensive?
> > > > >
> > > > > I don't have the exact answer, just some rough thoughts:
> > > > >
> > > > > It's an external C library function: there is a call involved, and the
> > > > > instruction pointer will bounce back and forth.
> > > >
> > > > for memset 0?  gcc 5.3.1 on fedora happily inlines it.
> > >
> > > Good to know!
> > >
> > > > > overkill to use that for resetting a 14-byte structure.
> > > > >
> > > > > Some trick like
> > > > > *(struct virtio_net_hdr *)hdr = {0, };
> > > > >
> > > > > Or even
> > > > > hdr->xxx = 0;
> > > > > hdr->yyy = 0;
> > > > >
> > > > > should behave better.
> > > > >
> > > > > There was an example: the vhost enqueue optimization patchset from
> > > > > Zhihong [0] uses memset, and it introduced a drop of more than 15%
> > > > > (IIRC) on my Ivy Bridge server; there was no such issue on his
> > > > > server, though.
> > > > >
> > > > > [0]: http://dpdk.org/ml/archives/dev/2016-August/045272.html
> > > >
> > > > I'd say that's weird. What's your config? Any chance you
> > > > are using an old compiler?
> > >
> > > Not really, it's gcc 5.3.1. Maybe Zhihong could explain more. IIRC,
> > > he said the memset is not well optimized for Ivy Bridge.
> >
> > The dst is remote in that case. It's fine on Haswell, but has a
> > complication on Ivy Bridge which (wasn't supposed to, but) causes a
> > serious frontend issue.
> >
> > I don't think gcc inlined it there. I'm using fc24 gcc 6.1.1.
>
> So try something like this then:

Yes, I saw that memset is inlined when this diff is applied.

So, would you mind sending a formal patch? You might want to try building
it first, though: as posted, it doesn't build.

> Generally, pointer chasing through vq->hw->vtnet_hdr_size can't be good
> for performance. Move the fields used on the data path into vq
> and use them from there to avoid the indirection?

Good suggestion!

	--yliu
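
To make the inlining discussion concrete, here is a minimal, self-contained
sketch of the three ways of clearing the header mentioned above. The struct
layout is the legacy virtio_net_hdr from the virtio spec and is shown only
for illustration; whether variant 1 really turns into a few inline stores
still depends on the compiler and flags.

    #include <stdint.h>
    #include <string.h>

    /* Legacy virtio net header layout, for illustration only. */
    struct virtio_net_hdr {
        uint8_t  flags;
        uint8_t  gso_type;
        uint16_t hdr_len;
        uint16_t gso_size;
        uint16_t csum_start;
        uint16_t csum_offset;
    };

    /* 1. Library call: may compile to a few stores, or to a real call
     *    to memset, depending on compiler and flags. */
    static inline void clear_hdr_memset(struct virtio_net_hdr *hdr)
    {
        memset(hdr, 0, sizeof(*hdr));
    }

    /* 2. Assignment from a zeroed compound literal: the standard-C
     *    spelling of the "*(struct virtio_net_hdr *)hdr = {0, };"
     *    trick quoted above. */
    static inline void clear_hdr_assign(struct virtio_net_hdr *hdr)
    {
        *hdr = (struct virtio_net_hdr){ 0 };
    }

    /* 3. Explicit per-field stores. */
    static inline void clear_hdr_fields(struct virtio_net_hdr *hdr)
    {
        hdr->flags       = 0;
        hdr->gso_type    = 0;
        hdr->hdr_len     = 0;
        hdr->gso_size    = 0;
        hdr->csum_start  = 0;
        hdr->csum_offset = 0;
    }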
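
Likewise, a rough sketch of the "move fields used on the data path into vq"
suggestion. The structure and field names below are simplified assumptions,
not the actual DPDK virtio PMD definitions:

    #include <stdint.h>

    /* Hypothetical, simplified structures -- not the real DPDK ones. */
    struct virtio_hw {
        uint16_t vtnet_hdr_size;
        /* ... many other fields ... */
    };

    struct virtqueue {
        struct virtio_hw *hw;
        uint16_t hdr_size;      /* cached copy of hw->vtnet_hdr_size */
        /* ... */
    };

    /* Done once at queue setup time: pay for the hw dereference here ... */
    static void virtqueue_cache_hdr_size(struct virtqueue *vq)
    {
        vq->hdr_size = vq->hw->vtnet_hdr_size;
    }

    /* ... so the hot enqueue/dequeue path reads a field of vq directly,
     * with no vq->hw indirection. */
    static inline uint16_t virtqueue_hdr_size(const struct virtqueue *vq)
    {
        return vq->hdr_size;
    }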