From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([208.118.235.92]:56473)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <yeongkyoon.lee@samsung.com>) id 1TLwyU-0007Zb-Mn
	for qemu-devel@nongnu.org; Wed, 10 Oct 2012 10:09:34 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <yeongkyoon.lee@samsung.com>) id 1TLwyL-000470-0v
	for qemu-devel@nongnu.org; Wed, 10 Oct 2012 10:09:26 -0400
Received: from mailout1.samsung.com ([203.254.224.24]:18014)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <yeongkyoon.lee@samsung.com>) id 1TLwyK-00045p-D8
	for qemu-devel@nongnu.org; Wed, 10 Oct 2012 10:09:16 -0400
Received: from epcpsbgm1.samsung.com (epcpsbgm1 [203.254.230.26])
	by mailout1.samsung.com
	(Oracle Communications Messaging Server 7u4-24.01(7.0.4.24.0) 64bit
	(built Nov
	17 2011)) with ESMTP id <0MBO00MFAKN5T820@mailout1.samsung.com> for
	qemu-devel@nongnu.org; Wed, 10 Oct 2012 23:09:12 +0900 (KST)
Received: from [172.21.111.108] ([182.198.1.3])
	by mmp2.samsung.com (Oracle Communications Messaging Server 7u4-24.01
	(7.0.4.24.0) 64bit (built Nov 17 2011))
	with ESMTPA id <0MBO00IG8KNCMOB0@mmp2.samsung.com> for
	qemu-devel@nongnu.org; Wed, 10 Oct 2012 23:09:12 +0900 (KST)
Date: Wed, 10 Oct 2012 23:09:12 +0900
From: Yeongkyoon Lee <yeongkyoon.lee@samsung.com>
In-reply-to: <50754F3A.5030602@samsung.com>
Message-id: <50758188.3000904@samsung.com>
MIME-version: 1.0
Content-type: text/plain; charset=UTF-8; format=flowed
Content-transfer-encoding: QUOTED-PRINTABLE
References: <1349786252-12343-1-git-send-email-yeongkyoon.lee@samsung.com>
	<20121009142610.GA14078@ohm.aurel32.net>
	<20121009161956.GG14078@ohm.aurel32.net> <5074571E.60700@redhat.com>
	<20121009170923.GF9643@ohm.aurel32.net> <5074F6E0.1090206@samsung.com>
	<20121010064517.GJ9643@ohm.aurel32.net> <50754F3A.5030602@samsung.com>
Subject: Re: [Qemu-devel] [PATCH v5 0/3] tcg: enhance code generation
 quality for qemu_ld/st IRs
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Aurelien Jarno <aurelien@aurel32.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>, qemu-devel@nongnu.org

On Oct 10, 2012 19:34, Yeongkyoon Lee wrote:
> On Oct 10, 2012 15:45, Aurelien Jarno wrote:
>> On Wed, Oct 10, 2012 at 01:17:36PM +0900, Yeongkyoon Lee wrote:
>>> On Oct 10, 2012 02:09, Aurelien Jarno wrote:
>>>> On Tue, Oct 09, 2012 at 06:55:58PM +0200, Paolo Bonzini wrote:
>>>>> Il 09/10/2012 18:19, Aurelien Jarno ha scritto:
>>>>>>>> Instead of calling the MMU helper with an additional argument
>>>>>>>> (7), and then jump back (8) to the next code (4), what about
>>>>>>>> pushing the address of the next code (4) on the stack and use
>>>>>>>> a jmp instead of the call. In that case you don't need the
>>>>>>>> extra argument to the helpers.
>>>>>>>>
>>>>>> Maybe it wasn't very clear. This is based on the fact that call is
>>>>>> basically push %rip + jmp. Therefore we can fake the return
>>>>>> address by putting the value we want, here the address of the
>>>>>> next code. This means that we don't need to pass the extra
>>>>>> argument to the helper for the return address, as GETPC() would
>>>>>> work correctly (it basically reads the return address on the
>>>>>> stack).
>>>>>>
>>>>>> For other architectures, it might not be a push, but rather a
>>>>>> move to link register, basically put the return address where
>>>>>> the calling convention asks for.
>>>>>>
>>>>>> OTOH I just realized it only works if the end of the slow path
>>>>>> (moving the value from the return address to the correct
>>>>>> register). It might be something doable.
>>>>> Branch predictors will not like oldschool tricks like this one. :)
>>>>>
>>>> Given it is only used in the slow path (ie the exception more than
>>>> the rule), branch prediction isn't that important there.
>>>>
>>> I had already considered the approach of using jmp and removing
>>> the extra argument of the helper call.
>>> However, the problem is that the helper needs the generated code
>>> address used by tb_find_pc() and cpu_restore_state().
>>> That means the code address the helper needs is really the address
>>> corresponding to the qemu_ld/st IR rather than the return address.
>>> In my LDST optimization, the helper call site is not in the code of
>>> the IR but at the end of the TB.
>> GETPC() uses the return address to determine the call place, and as
>> long as the code at the end of the TB sets a return address
>> corresponding to the one of the fast path instructions, tb_find_pc()
>> will be able to find the correct instruction.
>>
>> That implies that at least one instruction at the end of the
>> generated code is shared between the slow path and the fast path,
>> but on the other hand it avoids having two different kinds of mmu
>> helpers.
>>
>
> How about a nop instruction at the end of the fast path as the return
> address of the helper?
> That means changing "call helper" to "push addr of nop" and
> "jmp helper".
> Although I need to check the feasibility, it is expected to avoid
> helper fragmentation and to keep performance degradation to a minimum.
>
>

I've run some tests to measure the performance degradation when a nop
instruction is inserted into the qemu_ld/st fast path.
The result is OK: I did not find any notable performance degradation.
I'll post a new version of the patch soon, without the change to the
MMU helper's description.
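
To make the idea concrete, here is a toy C model of the "push addr of
nop" + "jmp helper" sequence. It is only a sketch under stated
assumptions, not QEMU code: InsnRange, ToyCpu, slow_path() and
find_guest_insn() are hypothetical names, the stack is simulated, and
find_guest_insn() merely stands in for the lookup that tb_find_pc() and
cpu_restore_state() perform on the real return address.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names exist in QEMU. */
enum { STACK_SLOTS = 8 };

/* Host-code range generated for one guest instruction.
 * Assumption: the nop closing the fast path is the last byte of the
 * range, i.e. it sits at host_end - 1. */
typedef struct {
    size_t host_start;  /* first host offset of this guest insn */
    size_t host_end;    /* one past the last host offset */
    int guest_insn;     /* index of the guest instruction */
} InsnRange;

typedef struct {
    size_t stack[STACK_SLOTS];  /* simulated host stack */
    size_t sp;                  /* number of used slots */
} ToyCpu;

/* "push addr of nop": store the return address a call would have pushed. */
static void push_return_address(ToyCpu *cpu, size_t addr)
{
    assert(cpu->sp < STACK_SLOTS);
    cpu->stack[cpu->sp++] = addr;
}

/* What a GETPC()-like macro does inside the helper: read the return
 * address currently on top of the stack. */
static size_t helper_return_address(const ToyCpu *cpu)
{
    assert(cpu->sp > 0);
    return cpu->stack[cpu->sp - 1];
}

/* Stand-in for the tb_find_pc()-style lookup: find the guest
 * instruction whose host-code range contains host_addr. */
static int find_guest_insn(const InsnRange *ranges, size_t n, size_t host_addr)
{
    for (size_t i = 0; i < n; i++) {
        if (host_addr >= ranges[i].host_start && host_addr < ranges[i].host_end) {
            return ranges[i].guest_insn;
        }
    }
    return -1;  /* address is outside every fast path, e.g. at the end of the TB */
}

/* Slow path placed at the end of the TB for access number `which`.
 * Returns the guest instruction the helper resolved, and stores in
 * *resume_at the host offset where execution continues after "ret". */
static int slow_path(ToyCpu *cpu, const InsnRange *ranges, size_t n,
                     size_t which, size_t *resume_at)
{
    const InsnRange *r = &ranges[which];

    push_return_address(cpu, r->host_end - 1);  /* push addr of nop */
    /* "jmp helper": unlike call, no new return address is pushed here */
    int insn = find_guest_insn(ranges, n, helper_return_address(cpu));
    *resume_at = cpu->stack[--cpu->sp];         /* helper's "ret" */
    return insn;
}
```

Because the pushed address lies inside the fast path's own code range,
the lookup resolves to the right guest instruction even though the jump
to the helper is issued from the end of the TB, and the helper needs no
extra argument.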