From mboxrd@z Thu Jan  1 00:00:00 1970
From: Avi Kivity
Subject: Re: [PATCH 3/3] KVM: MMU: Separate trivial NULL check out from rmap_get_next()
Date: Thu, 15 Mar 2012 14:01:11 +0200
Message-ID: <4F61DA07.4010200@redhat.com>
References: <20120315181856.2ced0a28.yoshikawa.takuya@oss.ntt.co.jp>
 <20120315182127.07600c50.yoshikawa.takuya@oss.ntt.co.jp>
 <4F61BB1D.6050209@redhat.com>
 <20120315191553.32d6fc36.yoshikawa.takuya@oss.ntt.co.jp>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: mtosatti@redhat.com, kvm@vger.kernel.org
To: Takuya Yoshikawa
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:59737 "EHLO mx1.redhat.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932159Ab2COMBQ
 (ORCPT ); Thu, 15 Mar 2012 08:01:16 -0400
In-Reply-To: <20120315191553.32d6fc36.yoshikawa.takuya@oss.ntt.co.jp>
Sender: kvm-owner@vger.kernel.org
List-ID:

On 03/15/2012 12:15 PM, Takuya Yoshikawa wrote:
> Avi Kivity wrote:
>
> > > Although using "inline" like this does not look clean, we could see
> > > measurable performance improvements: get_dirty_log for 1GB dirty memory
> > > became faster by more than 10% on my test box.
> > >
> >
> > WOW.  I'd have assumed the processor deals better with this; it should
> > be 100% predicted branches.
> >
> > But I won't argue with cold data.
>
> What I checked was:
>
>     original    with-patch2    with-patch3
>     8.7ms       8.5ms          7.5ms

What are the per-call numbers?

> I assumed that without "inline" only __rmap_get_next() would be inlined
> into rmap_get_next() so did like this.
>
> I thought the improvement was just from removing one function call for
> each rmap_write_protect.  Not sure if anything was changed with branch
> predictions.

What I mean is, modern cpus effectively inline simple function calls by
predicting the call, the branches within the function, and the return, so
they don't have to stop their pipelines at any of these points.
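
For readers without the patch at hand, the pattern under discussion can be
sketched roughly as follows.  This is only an illustration, not the code
from the patch: struct rmap_node, walk_chain(), count_sptes() and
slow_path_calls are hypothetical names, the chain layout is not the
kernel's, and the noinline attribute assumes GCC or Clang.  The point is
that the trivial NULL check lives in an inline wrapper, so an empty chain
never pays for a function call.

```c
#include <stddef.h>

/* Hypothetical stand-in for a reverse-map chain; not the kernel's layout. */
struct rmap_node {
	unsigned long spte;
	struct rmap_node *next;
};

/* Counts how often the out-of-line slow path is actually entered. */
static unsigned long slow_path_calls;

/*
 * Out-of-line slow path: walks the chain and returns its length.
 * noinline (a GCC/Clang attribute) keeps it a real function call,
 * so the wrapper below is the only part that gets inlined.
 */
static __attribute__((noinline)) unsigned long
walk_chain(const struct rmap_node *head)
{
	unsigned long n = 0;

	slow_path_calls++;
	for (; head != NULL; head = head->next)
		n++;
	return n;
}

/*
 * Inline wrapper: the trivial NULL check is done in the caller,
 * and the function call happens only for a non-empty chain.
 */
static inline unsigned long count_sptes(const struct rmap_node *head)
{
	if (head == NULL)
		return 0;
	return walk_chain(head);
}
```

Calling count_sptes(NULL) returns 0 without touching slow_path_calls,
while a two-entry chain returns 2 and bumps the counter once.  Whether
this beats a well-predicted call and return on a given cpu is exactly
what has to be measured.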

But again, the numbers talk louder than speculation about cpu architecture.

-- 
error compiling committee.c: too many arguments to function