From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:37849)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from )
	id 1Zzlll-0001Jw-P6
	for qemu-devel@nongnu.org; Fri, 20 Nov 2015 08:30:30 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from )
	id 1Zzllh-0001cJ-LH
	for qemu-devel@nongnu.org; Fri, 20 Nov 2015 08:30:29 -0500
Received: from mx1.redhat.com ([209.132.183.28]:45744)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from )
	id 1Zzllh-0001cC-FO
	for qemu-devel@nongnu.org; Fri, 20 Nov 2015 08:30:25 -0500
Date: Fri, 20 Nov 2015 15:30:22 +0200
From: "Michael S. Tsirkin"
Message-ID: <20151120152425-mutt-send-email-mst@redhat.com>
References: <1448016301-20944-1-git-send-email-caoj.fnst@cn.fujitsu.com>
 <20151120124452-mutt-send-email-mst@redhat.com>
 <564EFE27.5040203@cn.fujitsu.com>
 <20151120132622-mutt-send-email-mst@redhat.com>
 <564F0AC9.3030601@cn.fujitsu.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <564F0AC9.3030601@cn.fujitsu.com>
Subject: Re: [Qemu-devel] [PATCH] PCI: minor performance optimization
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Cao jin
Cc: qemu-devel@nongnu.org

On Fri, Nov 20, 2015 at 07:58:01PM +0800, Cao jin wrote:
>
>
> On 11/20/2015 07:26 PM, Michael S. Tsirkin wrote:
> >On Fri, Nov 20, 2015 at 07:04:07PM +0800, Cao jin wrote:
> >>
> >>
> >>On 11/20/2015 06:45 PM, Michael S. Tsirkin wrote:
> >>>On Fri, Nov 20, 2015 at 06:45:01PM +0800, Cao jin wrote:
> >>>
> >>>>2. As spec says, each capability must be DWORD aligned, so an optimization can
> >>>>   be done via Loop Unrolling.
> >>>
> >>>Why do we want to optimize it?
> >>>
> >>
> >>For tiny performance improvement via less loop. take pcie express
> >>capability(60 bytes at most) for example, it may loop 60 times, now we just
> >>need 15 times, a quarter of before.
> >
> >But who cares?
> >This is not a data path operation.
>
> It is tiny thing I found when browsing code. When found there are several
> places looks like this, I think maybe it does good to qemu to do this and
> CCed to you because it don`t look like a simple trivial patch.
>
> So, hey Michael, if you don`t like this kind of optimization, that`t ok,
> forget it. But I think it make me little confused when determine which kind
> of patch should be CCed to you.

Optimization patches should normally include performance numbers
if they are to be merged. Try to come up with a benchmark
and you will realize that the speed of this function has no effect
under even halfway realistic conditions.

> >
> >>>>
> >>>>Signed-off-by: Cao jin
> >>>>---
> >>>>  hw/pci/pci.c | 12 ++++++++----
> >>>>  1 file changed, 8 insertions(+), 4 deletions(-)
> >>>>
> >>>>diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> >>>>index 168b9cc..1e99603 100644
> >>>>--- a/hw/pci/pci.c
> >>>>+++ b/hw/pci/pci.c
> >>>>@@ -1924,13 +1924,15 @@ PCIDevice *pci_create_simple(PCIBus *bus, int devfn, const char *name)
> >>>>  static uint8_t pci_find_space(PCIDevice *pdev, uint8_t size)
> >>>>  {
> >>>>      int offset = PCI_CONFIG_HEADER_SIZE;
> >>>>-    int i;
> >>>>-    for (i = PCI_CONFIG_HEADER_SIZE; i < PCI_CONFIG_SPACE_SIZE; ++i) {
> >>>>+    int i = PCI_CONFIG_HEADER_SIZE;;
> >>>>+
> >>>>+    for (; i < PCI_CONFIG_SPACE_SIZE; i = i + 4) {
> >>>>          if (pdev->used[i])
> >>>>-            offset = i + 1;
> >>>>-        else if (i - offset + 1 == size)
> >>>>+            offset = i + 4;
> >>>>+        else if (i - offset >= size)
> >>>>              return offset;
> >>>>      }
> >>>>+
> >>>>      return 0;
> >>>>  }
> >>>>
> >>>>@@ -2144,6 +2146,8 @@ int pci_add_capability2(PCIDevice *pdev, uint8_t cap_id,
> >>>>      uint8_t *config;
> >>>>      int i, overlapping_cap;
> >>>>
> >>>>+    assert(size > 0);
> >>>>+
> >>>>      if (!offset) {
> >>>>          offset = pci_find_space(pdev, size);
> >>>>          if (!offset) {
> >>>>--
> >>>>2.1.0
> >>>.
> >>>
> >>
> >>--
> >>Yours Sincerely,
> >>
> >>Cao Jin
> >.
> >
>
> --
> Yours Sincerely,
>
> Cao Jin
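
For anyone who wants to try the benchmark suggested above, here is a minimal standalone sketch in C. It is not QEMU code. FakePCIDevice, find_space_bytewise, find_space_dword and seconds_per_call are hypothetical names made up for illustration, and the two constants are assumed to match the conventional PCI config space layout (64-byte header in a 256-byte space). The two scan functions mirror the pre-patch and post-patch loops from the quoted diff.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Assumed values: 64-byte header inside a 256-byte config space. */
#define PCI_CONFIG_HEADER_SIZE 0x40
#define PCI_CONFIG_SPACE_SIZE  0x100

/* Hypothetical stand-in for the one PCIDevice field the scan reads. */
typedef struct {
    uint8_t used[PCI_CONFIG_SPACE_SIZE];
} FakePCIDevice;

/* Byte-wise scan, mirroring the loop before the patch. */
static uint8_t find_space_bytewise(const FakePCIDevice *pdev, uint8_t size)
{
    int offset = PCI_CONFIG_HEADER_SIZE;
    int i;

    assert(size > 0);
    for (i = PCI_CONFIG_HEADER_SIZE; i < PCI_CONFIG_SPACE_SIZE; ++i) {
        if (pdev->used[i]) {
            offset = i + 1;
        } else if (i - offset + 1 == size) {
            return offset;
        }
    }
    return 0;
}

/* DWORD-stride scan, mirroring the loop proposed in the patch. */
static uint8_t find_space_dword(const FakePCIDevice *pdev, uint8_t size)
{
    int offset = PCI_CONFIG_HEADER_SIZE;
    int i = PCI_CONFIG_HEADER_SIZE;

    assert(size > 0);
    for (; i < PCI_CONFIG_SPACE_SIZE; i = i + 4) {
        if (pdev->used[i]) {
            offset = i + 4;
        } else if (i - offset >= size) {
            return offset;
        }
    }
    return 0;
}

typedef uint8_t (*find_fn)(const FakePCIDevice *, uint8_t);

/* Average CPU time per call in seconds, measured with clock(). */
static double seconds_per_call(find_fn fn, const FakePCIDevice *pdev,
                               uint8_t size, long iterations)
{
    volatile unsigned sink = 0; /* keeps the calls from being optimized out */
    clock_t start = clock();

    for (long n = 0; n < iterations; n++) {
        sink += fn(pdev, size);
    }
    clock_t end = clock();
    (void)sink;
    return (double)(end - start) / CLOCKS_PER_SEC / (double)iterations;
}
```

Calling seconds_per_call(find_space_bytewise, &dev, 60, 1000000) and the same for find_space_dword on a zeroed FakePCIDevice gives a per-call cost that can be set against how rarely capabilities are added during device setup. In this sketch the two scans do not always return the same offset: the stride variant only accepts a gap after seeing one more free DWORD past it, so it is worth comparing results before comparing timings.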