From: Paolo Bonzini <pbonzini@redhat.com>
Date: Thu, 07 Nov 2013 20:12:27 +0100
Subject: Re: [Qemu-devel] [PATCH 0/2] exec: alternative fix for master abort woes
To: "Michael S. Tsirkin" <mst@redhat.com>
Cc: lcapitulino@redhat.com, qemu-devel@nongnu.org, marcel.a@redhat.com
Message-ID: <527BE61B.3010309@redhat.com>
In-Reply-To: <20131107185413.GA4974@redhat.com>

On 07/11/2013 19:54, Michael S. Tsirkin wrote:
> On Thu, Nov 07, 2013 at 06:29:40PM +0100, Paolo Bonzini wrote:
>> On 07/11/2013 17:47, Michael S. Tsirkin wrote:
>>> That's on kvm with 52 bit address.
>>> But where I would be concerned is systems with e.g. 36 bit address
>>> space where we are doubling the cost of the lookup.
>>> E.g. try i386 and not x86_64.
>>
>> Tried now...
>>
>>                P_L2_LEVELS
>>             pre-patch  post-patch
>>    i386         3          6
>>    x86_64       4          6
>>
>> I timed the inl_from_qemu test of vmexit.flat with both KVM and TCG.
>> With TCG there's indeed a visible penalty of 20 cycles for i386 and 10
>> for x86_64 (you can extrapolate to 30 cycles for
>> TARGET_PHYS_ADDR_SPACE_BITS=32 targets).  These can be more or less
>> entirely ascribed to phys_page_find:
>>
>>                                     TCG              |        KVM
>>                              pre-patch  post-patch   | pre-patch  post-patch
>>  phys_page_find(i386)           13%        25%       |   0.6%        1%
>>  inl_from_qemu cycles(i386)     153        173       |  ~12000     ~12000
>
> I'm a bit confused by the numbers above.  The % of phys_page_find has
> grown from 13% to 25% (almost double, which is kind of expected given
> we have twice the # of levels).

Yes.

> But overhead in # of cycles only went from 153 to 173?

That is expected too, because phys_page_find is only part of the cost:

   new cycles / old cycles = 173 / 153 = 113%

   % outside phys_page_find + % in phys_page_find * 2 = 87% + 13% * 2 = 113%

> Maybe the test is a bit wrong for tcg - how about unrolling the
> loop in kvm unit test?

Done that already. :)
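For reference, the kind of unrolled timing loop in question looks roughly
like the sketch below.  The port number and iteration count are purely
illustrative, not the values vmexit.flat actually uses, and it has to run
at ring 0 inside the guest (as the kvm-unit-tests do) for the port access
to be allowed:

#include <stdio.h>

#define PORT 0xe0     /* illustrative; not the port the real test uses */
#define N    100000   /* illustrative iteration count */

static inline unsigned inl(unsigned short port)
{
    unsigned val;
    asm volatile("inl %1, %0" : "=a"(val) : "Nd"(port));
    return val;
}

static inline unsigned long long rdtsc(void)
{
    unsigned lo, hi;
    asm volatile("rdtsc" : "=a"(lo), "=d"(hi));
    return ((unsigned long long)hi << 32) | lo;
}

int main(void)
{
    unsigned long long start = rdtsc();
    int i;

    /* Unrolled 4x so that loop control contributes as little as
       possible to the measured per-inl cycle count. */
    for (i = 0; i < N; i += 4) {
        inl(PORT);
        inl(PORT);
        inl(PORT);
        inl(PORT);
    }
    printf("cycles per inl: %llu\n", (rdtsc() - start) / N);
    return 0;
}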
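And in case the level counts in the first table look magic, they fall out
of the P_L2_LEVELS arithmetic.  A minimal sketch, assuming the radix tree
resolves 10 bits per level above a 12-bit page offset (P_L2_BITS = 10,
TARGET_PAGE_BITS = 12) and that post-patch the tree is sized for a full
64-bit physical address space:

#include <stdio.h>

#define P_L2_BITS        10   /* bits resolved per tree level */
#define TARGET_PAGE_BITS 12   /* 4K pages */

/* Levels needed to cover the given physical address width. */
static int p_l2_levels(int phys_addr_space_bits)
{
    return (phys_addr_space_bits - TARGET_PAGE_BITS - 1) / P_L2_BITS + 1;
}

int main(void)
{
    /* Pre-patch the tree covers the target's physical address space
       (36 bits for i386, 52 for x86_64, per the numbers above);
       post-patch it covers 64 bits on both. */
    printf("i386:   %d -> %d\n", p_l2_levels(36), p_l2_levels(64));  /* 3 -> 6 */
    printf("x86_64: %d -> %d\n", p_l2_levels(52), p_l2_levels(64));  /* 4 -> 6 */
    return 0;
}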
>> Also, compiling with "-fstack-protector" instead of "-fstack-protector-all",
>> as suggested a while ago by rth, is already giving a savings of 20 cycles.
>
> Is it true that with TCG this affects more than just MMIO
> as phys_page_find will also sometimes run on CPU accesses to memory?

Yes.  I tried benchmarking the boot of a RHEL guest with perf, and
phys_page_find comes out as follows:

             TCG              |        KVM
      pre-patch  post-patch   | pre-patch  post-patch
         3%        5.8%       |   0.9%       1.7%

This is actually higher than usual for KVM because there are many VGA
accesses during GRUB.

>> And of course, if this were a realistic test, KVM's 60x penalty would
>> be a severe problem---but it isn't, because this is not a realistic
>> setting.
>
> Well, for this argument to carry the day we'd need to design
> a realistic test which isn't easy :)

Yes, I guess the number that matters is the extra 2% penalty for TCG
(the part that doesn't come from MMIO).

Paolo