Re: [PATCH 2/2] KVM: PPC: Book3E: Get vcpu's last instruction for emulation

From: Scott Wood <scottwood@freescale.com>
To: Alexander Graf <agraf@suse.de>
Cc: Mihai Caraman <mihai.caraman@freescale.com>,
	kvm-ppc@vger.kernel.org, kvm@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH 2/2] KVM: PPC: Book3E: Get vcpu's last instruction for emulation
Date: Tue, 09 Jul 2013 18:46:45 +0000
Message-ID: <1373395605.8183.198@snotra>
In-Reply-To: <51DC4C00.70509@suse.de> (from agraf@suse.de on Tue Jul  9 12:44:32 2013)

On 07/09/2013 12:44:32 PM, Alexander Graf wrote:
> On 07/09/2013 07:13 PM, Scott Wood wrote:
>> On 07/08/2013 08:39:05 AM, Alexander Graf wrote:
>>> 
>>> On 28.06.2013, at 11:20, Mihai Caraman wrote:
>>> 
>>> > lwepx faults need to be handled by KVM, and this implies additional
>>> > code in the DO_KVM macro to identify exceptions that originated from
>>> > host context. This requires checking the Exception Syndrome Register
>>> > (ESR[EPID]) and the External PID Load Context Register (EPLC[EGS])
>>> > for DTB_MISS, DSI and LRAT exceptions, which is too intrusive for the
>>> > host.
>>> >
>>> > Get rid of lwepx and acquire the last instruction in
>>> > kvmppc_handle_exit() by searching for the physical address and
>>> > kmap it. This fixes an infinite loop
>>> 
>>> What's the difference in speed for this?
>>> 
>>> Also, could we call lwepx later in host code, when  
>>> kvmppc_get_last_inst() gets invoked?
>> 
>> Any use of lwepx is problematic unless we want to add overhead to  
>> the main Linux TLB miss handler.
> 
> What exactly would be missing?

If lwepx faults, it goes to the normal host TLB miss handler.  Without  
adding code to it to recognize that it's an external-PID fault, it will  
try to search the normal Linux page tables and insert a normal host  
entry.  If it thinks it has succeeded, it will retry the instruction  
rather than search for an exception handler.  The instruction will  
fault again, and you get a hang.
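
To make the failure mode concrete, here is a toy model in plain C.  It is
not kernel code, and every name in it (toy_tlb, naive_handler,
epid_aware_handler, toy_lwepx, the TOY_* results) is hypothetical.  It
only illustrates the control flow described above: a handler that ignores
the external-PID context installs a host entry, reports success, and the
load is retried forever.  The retry budget exists only so the model
terminates; real hardware has no such cap.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in the kernel. */
enum fault_ctx { CTX_HOST, CTX_EXTERNAL_PID };
enum toy_result { TOY_OK, TOY_HANG, TOY_FIXUP };

struct toy_tlb {
	bool host_entry;	/* translation in the host address space */
	bool guest_entry;	/* translation in the external-PID (guest) space */
};

typedef bool (*miss_handler_t)(struct toy_tlb *tlb, enum fault_ctx ctx);

/* Ignores the context: installs a host entry and reports success. */
static bool naive_handler(struct toy_tlb *tlb, enum fault_ctx ctx)
{
	(void)ctx;
	tlb->host_entry = true;
	return true;		/* caller retries the faulting instruction */
}

/* Recognizes an external-PID fault and declines to handle it. */
static bool epid_aware_handler(struct toy_tlb *tlb, enum fault_ctx ctx)
{
	if (ctx == CTX_EXTERNAL_PID)
		return false;	/* caller falls back to an exception fixup */
	tlb->host_entry = true;
	return true;
}

/* Models an lwepx-like load that needs a guest translation to complete. */
static enum toy_result toy_lwepx(struct toy_tlb *tlb, miss_handler_t handler,
				 int max_retries)
{
	assert(max_retries >= 0);

	for (int retries = 0; !tlb->guest_entry; retries++) {
		if (retries == max_retries)
			return TOY_HANG;	/* budget exhausted: the hang */
		if (!handler(tlb, CTX_EXTERNAL_PID))
			return TOY_FIXUP;	/* fault reported to the caller */
	}
	return TOY_OK;
}
```

With naive_handler the load never completes, because the entry that gets
installed is not the one the load needs.  With epid_aware_handler the
fault is reported instead, but that extra check sits in the host's miss
path, which is the overhead being objected to.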

> I'd also still like to see some performance benchmarks on this to  
> make sure we're not walking into a bad direction.

I doubt it'll be significantly different.  There's overhead involved in  
setting up for lwepx as well.  It doesn't hurt to test, though this is  
a functional correctness issue, so I'm not sure what better  
alternatives we have.  I don't want to slow down non-KVM TLB misses for  
this.

>>> > +    addr = (mas7_mas3 & (~0ULL << psize_shift)) |
>>> > +           (geaddr & ((1ULL << psize_shift) - 1ULL));
>>> > +
>>> > +    /* Map a page and get guest's instruction */
>>> > +    page = pfn_to_page(addr >> PAGE_SHIFT);
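
The address arithmetic in the hunk above can be checked in isolation.
The sketch below is a standalone user-space rendering under stated
assumptions: toy_insn_paddr and toy_pfn are hypothetical helpers, not
functions from the patch, and TOY_PAGE_SHIFT is assumed to be 12 (4 KiB
pages).  It keeps the real-page bits of the combined MAS7/MAS3 value
above psize_shift and takes the in-page offset from the guest effective
address.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12	/* assumption: 4 KiB host pages */

/*
 * Hypothetical helper mirroring the quoted expression: the bits at and
 * above psize_shift come from mas7_mas3 (which also drops the low
 * attribute bits of MAS3), the bits below it come from geaddr.
 */
static uint64_t toy_insn_paddr(uint64_t mas7_mas3, uint64_t geaddr,
			       unsigned int psize_shift)
{
	assert(psize_shift < 64);	/* a shift by 64 or more is undefined */

	uint64_t page_mask = ~0ULL << psize_shift;

	/* ~page_mask is the same value as (1ULL << psize_shift) - 1ULL */
	return (mas7_mas3 & page_mask) | (geaddr & ~page_mask);
}

/* Page frame number, as passed to pfn_to_page() in the quoted hunk. */
static uint64_t toy_pfn(uint64_t addr)
{
	return addr >> TOY_PAGE_SHIFT;
}
```

For a 4 KiB mapping (psize_shift of 12), a mas7_mas3 value of 0x12345603f
and a geaddr of 0x10000abc give the physical address 0x123456abc, whose
page frame number is 0x123456.
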
>>> 
>>> So it seems to me like you're jumping through a lot of hoops to  
>>> make sure this works for LRAT and non-LRAT at the same time. Can't  
>>> we just treat them as the different things they are?
>>> 
>>> What if we have different MMU backends for LRAT and non-LRAT? The  
>>> non-LRAT case could then try lwepx, if that fails, fall back to  
>>> read the shadow TLB. For the LRAT case, we'd do lwepx, if that  
>>> fails fall back to this logic.
>> 
>> This isn't about LRAT; it's about hardware threads.  It also fixes  
>> the handling of execute-only pages on current chips.
> 
> On non-LRAT systems we could always check our shadow copy of the  
> guest's TLB, no? I'd really like to know what the performance  
> difference would be for the 2 approaches.

I suspect that tlbsx is faster, or at worst similar.  And unlike  
comparing tlbsx to lwepx (not counting a fix for the threading  
problem), we don't already have code to search the guest TLB, so  
testing would be more work.

-Scott
