* [PATCH][RFC] Replaced tlbilx with tlbwe in the initialization code
@ 2013-02-14 12:56 Diana Craciun
2013-02-15 0:11 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 7+ messages in thread
From: Diana Craciun @ 2013-02-14 12:56 UTC (permalink / raw)
To: linuxppc-dev; +Cc: Diana Craciun
From: Diana Craciun <Diana.Craciun@freescale.com>
On Freescale e6500 cores, EPCR[DGTMI] controls whether guest supervisor
state can execute TLB management instructions. If EPCR[DGTMI]=0,
tlbwe and tlbilx are allowed to execute normally in guest state.
A hypervisor may choose to virtualize TLB1, and for this purpose it
may use IPROT to protect the entries from being invalidated by the
guest. However, because the execution of tlbwe and tlbilx in guest
state is controlled by the same bit, it is not possible to have a
scenario where tlbwe is allowed to execute in guest state while tlbilx
traps. When TLB management instructions are allowed to execute in guest
state, the guest cannot use tlbilx to invalidate TLB1 guest entries.
Linux uses tlbilx in the boot code to invalidate the temporary
entries it creates when initializing the MMU. This patch replaces
the use of tlbilx in the initialization code with a tlbwe that has the
VALID bit cleared.
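As a sanity check on the mask change in the diff, here is a small C
sketch that models the mask rlwinm builds from its MB and ME operands.
The helper names (rlwinm_mask, rlwinm_sh0) are hypothetical and exist
only for this illustration; the two MAS1 constants are assumed to match
the kernel's MAS1_VALID and MAS1_IPROT definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match MAS1_VALID / MAS1_IPROT in the kernel headers. */
#define MAS1_VALID 0x80000000u
#define MAS1_IPROT 0x40000000u

/*
 * Build the rlwinm mask for MB..ME using IBM bit numbering
 * (bit 0 is the most significant bit of the 32-bit word).
 * When MB > ME the mask wraps around from bit 31 back to bit 0.
 */
static uint32_t rlwinm_mask(unsigned mb, unsigned me)
{
	assert(mb < 32 && me < 32);

	uint32_t from_mb = 0xFFFFFFFFu >> mb;         /* bits mb..31 */
	uint32_t to_me = 0xFFFFFFFFu << (31 - me);    /* bits 0..me  */

	return (mb <= me) ? (from_mb & to_me) : (from_mb | to_me);
}

/* Model of "rlwinm rD,rS,0,MB,ME": rotate by zero, then AND with the mask. */
static uint32_t rlwinm_sh0(uint32_t rs, unsigned mb, unsigned me)
{
	return rs & rlwinm_mask(mb, me);
}
```

Under these assumptions, rlwinm_mask(2, 0) keeps every bit except
MAS1_IPROT (the old code), while rlwinm_mask(2, 31) drops both
MAS1_VALID and MAS1_IPROT (the new code), so the following tlbwe writes
back an invalid, unprotected entry.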
Linux also uses tlbilx in other contexts (such as huge pages or
indirect entries), but removing tlbilx from the initialization code
makes it possible to run configurations under a hypervisor that do
not use huge pages or indirect entries.
Signed-off-by: Diana Craciun <Diana.Craciun@freescale.com>
---
arch/powerpc/kernel/exceptions-64e.S | 10 ++--------
1 file changed, 2 insertions(+), 8 deletions(-)
diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
index 4684e33..1f0ae33 100644
--- a/arch/powerpc/kernel/exceptions-64e.S
+++ b/arch/powerpc/kernel/exceptions-64e.S
@@ -1010,12 +1010,9 @@ skpinv: addi r6,r6,1 /* Increment */
mtspr SPRN_MAS0,r3
tlbre
mfspr r6,SPRN_MAS1
- rlwinm r6,r6,0,2,0 /* clear IPROT */
+ rlwinm r6,r6,0,2,31 /* clear IPROT and VALID */
mtspr SPRN_MAS1,r6
tlbwe
-
- /* Invalidate TLB1 */
- PPC_TLBILX_ALL(0,R0)
sync
isync
@@ -1069,12 +1066,9 @@ skpinv: addi r6,r6,1 /* Increment */
mtspr SPRN_MAS0,r4
tlbre
mfspr r5,SPRN_MAS1
- rlwinm r5,r5,0,2,0 /* clear IPROT */
+ rlwinm r5,r5,0,2,31 /* clear IPROT and VALID */
mtspr SPRN_MAS1,r5
tlbwe
-
- /* Invalidate TLB1 */
- PPC_TLBILX_ALL(0,R0)
sync
isync
--
1.7.11.7
* Re: [PATCH][RFC] Replaced tlbilx with tlbwe in the initialization code
2013-02-14 12:56 [PATCH][RFC] Replaced tlbilx with tlbwe in the initialization code Diana Craciun
@ 2013-02-15 0:11 ` Benjamin Herrenschmidt
2013-02-15 15:16 ` Diana Craciun
0 siblings, 1 reply; 7+ messages in thread
From: Benjamin Herrenschmidt @ 2013-02-15 0:11 UTC (permalink / raw)
To: Diana Craciun; +Cc: linuxppc-dev
On Thu, 2013-02-14 at 14:56 +0200, Diana Craciun wrote:
> From: Diana Craciun <Diana.Craciun@freescale.com>
>
> On Freescale e6500 cores EPCR[DGTMI] controls whether guest supervisor
> state can execute TLB management instructions. If EPCR[DGTMI]=0
> tlbwe and tlbilx are allowed to execute normally in the guest state.
>
> A hypervisor may choose to virtualize TLB1 and for this purpose it
> may use IPROT to protect the entries for being invalidated by the
> guest. However, because tlbwe and tlbilx execution in the guest state
> are sharing the same bit, it is not possible to have a scenario where
> tlbwe is allowed to be executed in guest state and tlbilx traps. When
> guest TLB management instructions are allowed to be executed in guest
> state the guest cannot use tlbilx to invalidate TLB1 guest entries.
Sorry, I don't understand the explanation... can you be more detailed ?
> Linux is using tlbilx in the boot code to invalidate the temporary
> entries it creates when initializing the MMU. The patch is replacing
> the usage of tlbilx in initialization code with tlbwe with VALID bit
> cleared.
>
> Linux is also using tlbilx in other contexts (like huge pages or
> indirect entries) but removing the tlbilx from the initialization code
> offers the possibility to have scenarios under hypervisor which are
> not using huge pages or indirect entries.
>
> Signed-off-by: Diana Craciun <Diana.Craciun@freescale.com>
> ---
> arch/powerpc/kernel/exceptions-64e.S | 10 ++--------
> 1 file changed, 2 insertions(+), 8 deletions(-)
>
> diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
> index 4684e33..1f0ae33 100644
> --- a/arch/powerpc/kernel/exceptions-64e.S
> +++ b/arch/powerpc/kernel/exceptions-64e.S
> @@ -1010,12 +1010,9 @@ skpinv: addi r6,r6,1 /* Increment */
> mtspr SPRN_MAS0,r3
> tlbre
> mfspr r6,SPRN_MAS1
> - rlwinm r6,r6,0,2,0 /* clear IPROT */
> + rlwinm r6,r6,0,2,31 /* clear IPROT and VALID */
> mtspr SPRN_MAS1,r6
> tlbwe
> -
> - /* Invalidate TLB1 */
> - PPC_TLBILX_ALL(0,R0)
> sync
> isync
>
> @@ -1069,12 +1066,9 @@ skpinv: addi r6,r6,1 /* Increment */
> mtspr SPRN_MAS0,r4
> tlbre
> mfspr r5,SPRN_MAS1
> - rlwinm r5,r5,0,2,0 /* clear IPROT */
> + rlwinm r5,r5,0,2,31 /* clear IPROT and VALID */
> mtspr SPRN_MAS1,r5
> tlbwe
> -
> - /* Invalidate TLB1 */
> - PPC_TLBILX_ALL(0,R0)
> sync
> isync
>
* Re: [PATCH][RFC] Replaced tlbilx with tlbwe in the initialization code
2013-02-15 0:11 ` Benjamin Herrenschmidt
@ 2013-02-15 15:16 ` Diana Craciun
2013-02-19 19:47 ` Scott Wood
0 siblings, 1 reply; 7+ messages in thread
From: Diana Craciun @ 2013-02-15 15:16 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: linuxppc-dev
On 02/15/2013 02:11 AM, Benjamin Herrenschmidt wrote:
> On Thu, 2013-02-14 at 14:56 +0200, Diana Craciun wrote:
>> From: Diana Craciun <Diana.Craciun@freescale.com>
>>
>> On Freescale e6500 cores EPCR[DGTMI] controls whether guest supervisor
>> state can execute TLB management instructions. If EPCR[DGTMI]=0
>> tlbwe and tlbilx are allowed to execute normally in the guest state.
>>
>> A hypervisor may choose to virtualize TLB1 and for this purpose it
>> may use IPROT to protect the entries for being invalidated by the
>> guest. However, because tlbwe and tlbilx execution in the guest state
>> are sharing the same bit, it is not possible to have a scenario where
>> tlbwe is allowed to be executed in guest state and tlbilx traps. When
>> guest TLB management instructions are allowed to be executed in guest
>> state the guest cannot use tlbilx to invalidate TLB1 guest entries.
> Sorry, I don't understand the explanation... can you be more detailed ?
TLB1 supports huge page sizes. The guest may see its memory as
contiguous, but what it sees is the guest physical memory as presented
by the hypervisor. In reality, the real physical memory may be
fragmented. In this case the hypervisor can add more than one TLB1 entry
for one guest request, and the hypervisor keeps track of all fragments.
When the guest performs a tlbilx, the hypervisor will correctly
invalidate all the corresponding fragments, because both tlbwe and
tlbilx trap and the hypervisor has full control of the TLB management
instructions targeting TLB1.
For e6500, a single bit controls whether tlbwe and tlbilx trap to the
hypervisor. tlbwe targeting TLB1 always traps. But if we want to use
LRAT for TLB0, we have to configure tlbwe (targeting TLB0) to execute
directly in the guest. In this case tlbilx (which targets both TLBs)
will never trap.
If tlbilx does not trap, the guest can invalidate only one of
(possibly more) fragments, and furthermore the synchronization between
the entries the hypervisor thinks are in TLB1 and the entries that are
actually there is lost.
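To make the fragment problem concrete, here is a toy C model. It is not
hypervisor code: the table size, the structures and every function name
are assumptions made up for this sketch. It only tracks, per entry,
whether the hardware holds it and whether the hypervisor believes it
holds it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_TLB1_ENTRIES 8	/* toy size, not the real e6500 TLB1 size */

struct toy_entry {
	bool hw_valid;		/* entry is live in the hardware TLB1 */
	bool shadow_valid;	/* hypervisor believes the entry is live */
	int guest_map;		/* guest mapping this fragment backs */
};

struct toy_tlb1 {
	struct toy_entry e[TOY_TLB1_ENTRIES];
};

/* Trapped tlbwe: the hypervisor installs one entry per fragment. */
static int hv_trapped_map(struct toy_tlb1 *t, int guest_map, int nfrags)
{
	int installed = 0;

	for (size_t i = 0; i < TOY_TLB1_ENTRIES && installed < nfrags; i++) {
		if (t->e[i].hw_valid || t->e[i].shadow_valid)
			continue;
		t->e[i] = (struct toy_entry){ true, true, guest_map };
		installed++;
	}
	return installed;
}

/* Trapped invalidation: the hypervisor drops every fragment it tracks. */
static void hv_trapped_invalidate(struct toy_tlb1 *t, int guest_map)
{
	for (size_t i = 0; i < TOY_TLB1_ENTRIES; i++) {
		if (t->e[i].shadow_valid && t->e[i].guest_map == guest_map) {
			t->e[i].hw_valid = false;
			t->e[i].shadow_valid = false;
		}
	}
}

/* Untrapped invalidation: one hardware entry goes, the shadow is stale. */
static void guest_untrapped_invalidate(struct toy_tlb1 *t, int guest_map)
{
	for (size_t i = 0; i < TOY_TLB1_ENTRIES; i++) {
		if (t->e[i].hw_valid && t->e[i].guest_map == guest_map) {
			t->e[i].hw_valid = false;
			return;
		}
	}
}

/* Count fragments of a mapping in hardware (hw=true) or in the shadow. */
static int count_frags(const struct toy_tlb1 *t, int guest_map, bool hw)
{
	int n = 0;

	for (size_t i = 0; i < TOY_TLB1_ENTRIES; i++) {
		bool live = hw ? t->e[i].hw_valid : t->e[i].shadow_valid;

		if (live && t->e[i].guest_map == guest_map)
			n++;
	}
	return n;
}
```

In this model, mapping one guest page as three fragments and then
calling guest_untrapped_invalidate() leaves two stale fragments in
hardware while the hypervisor still counts three, which is the loss of
synchronization described above. hv_trapped_invalidate() removes all of
them and keeps both views consistent.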
Diana
* Re: [PATCH][RFC] Replaced tlbilx with tlbwe in the initialization code
2013-02-15 15:16 ` Diana Craciun
@ 2013-02-19 19:47 ` Scott Wood
2013-02-20 9:22 ` Diana Craciun
2013-02-20 14:22 ` Stuart Yoder
0 siblings, 2 replies; 7+ messages in thread
From: Scott Wood @ 2013-02-19 19:47 UTC (permalink / raw)
To: Diana Craciun; +Cc: linuxppc-dev
On 02/15/2013 09:16:15 AM, Diana Craciun wrote:
> On 02/15/2013 02:11 AM, Benjamin Herrenschmidt wrote:
>> On Thu, 2013-02-14 at 14:56 +0200, Diana Craciun wrote:
>>> From: Diana Craciun <Diana.Craciun@freescale.com>
>>> From: Diana Craciun <Diana.Craciun@freescale.com>
>>>
>>> On Freescale e6500 cores EPCR[DGTMI] controls whether guest supervisor
>>> state can execute TLB management instructions. If EPCR[DGTMI]=0
>>> tlbwe and tlbilx are allowed to execute normally in the guest state.
>>>
>>> A hypervisor may choose to virtualize TLB1 and for this purpose it
>>> may use IPROT to protect the entries for being invalidated by the
>>> guest. However, because tlbwe and tlbilx execution in the guest state
>>> are sharing the same bit, it is not possible to have a scenario where
>>> tlbwe is allowed to be executed in guest state and tlbilx traps. When
>>> guest TLB management instructions are allowed to be executed in guest
>>> state the guest cannot use tlbilx to invalidate TLB1 guest entries.
>> Sorry, I don't understand the explanation... can you be more detailed ?
>
> TLB1 supports huge page sizes. The guest may see the memory as
> contiguous but it sees the guest physical memory as presented by the
> hypervisor. In reality the real physical memory may be fragmented. In
> this case the hypervisor can add more than one TLB1 entry for one
> guest request and the hypervisor will keep track of all fragments.
> When the guest performs a tlbilx, the hypervisor will correctly
> invalidate all the corresponding fragments because both tlbwe and
> tlbilx trap and has full control of tlb management instructions
> targeting TLB1.
>
> For e6500 a single bit controls if tlbwe and tlbilx trap to the
> Hypervisor. tlbwe targeting TLB1 always traps. But if we want to use
> LRAT for TLB0, we have to configure tlbwe (targeting TLB 0) to go
> directly to the guest. But in this case tlbilx (which is targeting
> both TLBs) will never trap.
>
> If the tlbilx does not trap, the guest can invalidate only one of
> (possible more) fragments and furthermore the synchronization between
> what entries the hypervisor thinks there are in the TLB1 and what are
> the actual entries is lost.
This patch addresses boot-time invalidations only. How will you handle
hugetlb invalidations (or indirect entry invalidations, once that
becomes supported)?
-Scott
* Re: [PATCH][RFC] Replaced tlbilx with tlbwe in the initialization code
2013-02-19 19:47 ` Scott Wood
@ 2013-02-20 9:22 ` Diana Craciun
2013-02-20 14:22 ` Stuart Yoder
1 sibling, 0 replies; 7+ messages in thread
From: Diana Craciun @ 2013-02-20 9:22 UTC (permalink / raw)
To: Scott Wood; +Cc: Yoder Stuart-B08248, linuxppc-dev
On 02/19/2013 09:47 PM, Scott Wood wrote:
> On 02/15/2013 09:16:15 AM, Diana Craciun wrote:
>> On 02/15/2013 02:11 AM, Benjamin Herrenschmidt wrote:
>>> On Thu, 2013-02-14 at 14:56 +0200, Diana Craciun wrote:
>>>> From: Diana Craciun <Diana.Craciun@freescale.com>
>>>>
>>>> On Freescale e6500 cores EPCR[DGTMI] controls whether guest
>>>> supervisor
>>>> state can execute TLB management instructions. If EPCR[DGTMI]=0
>>>> tlbwe and tlbilx are allowed to execute normally in the guest state.
>>>>
>>>> A hypervisor may choose to virtualize TLB1 and for this purpose it
>>>> may use IPROT to protect the entries for being invalidated by the
>>>> guest. However, because tlbwe and tlbilx execution in the guest
>>>> state
>>>> are sharing the same bit, it is not possible to have a scenario
>>>> where
>>>> tlbwe is allowed to be executed in guest state and tlbilx traps.
>>>> When
>>>> guest TLB management instructions are allowed to be executed in
>>>> guest
>>>> state the guest cannot use tlbilx to invalidate TLB1 guest entries.
>>> Sorry, I don't understand the explanation... can you be more
>>> detailed ?
>> TLB1 supports huge page sizes. The guest may see the memory as
>> contiguous but it sees the guest physical memory as presented by the
>> hypervisor. In reality the real physical memory may be fragmented. In
>> this case the hypervisor can add more than one TLB1 entry for one
>> guest request and the hypervisor will keep track of all fragments.
>> When the guest performs a tlbilx, the hypervisor will correctly
>> invalidate all the corresponding fragments because both tlbwe and
>> tlbilx trap and has full control of tlb management instructions
>> targeting TLB1.
>>
>> For e6500 a single bit controls if tlbwe and tlbilx trap to the
>> Hypervisor. tlbwe targeting TLB1 always traps. But if we want to use
>> LRAT for TLB0, we have to configure tlbwe (targeting TLB 0) to go
>> directly to the guest. But in this case tlbilx (which is targeting
>> both TLBs) will never trap.
>>
>> If the tlbilx does not trap, the guest can invalidate only one of
>> (possible more) fragments and furthermore the synchronization between
>> what entries the hypervisor thinks there are in the TLB1 and what are
>> the actual entries is lost.
> This patch addresses boot-time invalidations only. How will you handle
> hugetlb invalidations (or indirect entry invalidations, once that
> becomes supported)?
>
> -Scott
I will not handle them. This patch makes it possible to run Linux
under a hypervisor without using hugetlb or indirect entries. (This
matters only when TLB management instructions are configured to execute
in the guest; otherwise it already works.)
If indirect entries are supported, we will most likely configure tlbilx
and tlbwe to trap. In that case LRAT will still be used through the page
table walk mechanism.
Diana
* Re: [PATCH][RFC] Replaced tlbilx with tlbwe in the initialization code
2013-02-19 19:47 ` Scott Wood
2013-02-20 9:22 ` Diana Craciun
@ 2013-02-20 14:22 ` Stuart Yoder
2013-02-20 14:31 ` Diana Craciun
1 sibling, 1 reply; 7+ messages in thread
From: Stuart Yoder @ 2013-02-20 14:22 UTC (permalink / raw)
To: Scott Wood; +Cc: Diana Craciun, linuxppc-dev
On Tue, Feb 19, 2013 at 1:47 PM, Scott Wood <scottwood@freescale.com> wrote:
>
> This patch addresses boot-time invalidations only. How will you handle
> hugetlb invalidations (or indirect entry invalidations, once that becomes
> supported)?
We do envision that "direct guest TLB management" is an opt-in option
that a guest can enable.
If LRAT is on, with TLB management handled directly by guests, the only
mechanism we have for TLB1 invalidates is tlbwe. That is our only option
as far as I know. So, hugetlb and indirect entries will each need to be
addressed separately. The kernel code that handles these needs to be
either A) modified to unconditionally do all invalidates by tlbwe, or
B) modified to conditionally use tlbwe depending on whether this is a
guest that has enabled direct TLB management.
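The two options can be written down as a small decision function. This
is only a sketch of the choice being discussed: the enum and function
names are hypothetical, and nothing like them is claimed to exist in
the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration only. */
enum tlb1_inval_policy { POLICY_A_ALWAYS_TLBWE, POLICY_B_CONDITIONAL };
enum tlb1_inval_method { TLB1_INVAL_TLBILX, TLB1_INVAL_TLBWE };

/*
 * Option A: every TLB1 invalidate goes through tlbwe.
 * Option B: tlbwe is used only when the guest has opted in to
 * direct TLB management; otherwise tlbilx is kept.
 */
static enum tlb1_inval_method
pick_tlb1_inval(enum tlb1_inval_policy policy, bool guest_direct_tlb_mgmt)
{
	assert(policy == POLICY_A_ALWAYS_TLBWE ||
	       policy == POLICY_B_CONDITIONAL);

	if (policy == POLICY_A_ALWAYS_TLBWE)
		return TLB1_INVAL_TLBWE;

	return guest_direct_tlb_mgmt ? TLB1_INVAL_TLBWE : TLB1_INVAL_TLBILX;
}
```

Under option B the flag would have to be known wherever the kernel
invalidates hugetlb or indirect entries; how a guest would discover or
enable it is not settled in this thread.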
Stuart
* Re: [PATCH][RFC] Replaced tlbilx with tlbwe in the initialization code
2013-02-20 14:22 ` Stuart Yoder
@ 2013-02-20 14:31 ` Diana Craciun
0 siblings, 0 replies; 7+ messages in thread
From: Diana Craciun @ 2013-02-20 14:31 UTC (permalink / raw)
To: Stuart Yoder; +Cc: Scott Wood, linuxppc-dev
On 02/20/2013 04:22 PM, Stuart Yoder wrote:
> On Tue, Feb 19, 2013 at 1:47 PM, Scott Wood <scottwood@freescale.com> wrote:
>> This patch addresses boot-time invalidations only. How will you handle
>> hugetlb invalidations (or indirect entry invalidations, once that becomes
>> supported)?
> We do envision that "direct guest TLB management" is an opt-in option
> that a guest can enable.
>
> If LRAT is on, with TLB management directly handled by guests, the only
> mechanism we have to do TLB1 invalidates is tlbwe. That is our only option
> as far as I know. So, hugetlb and indirect entries will each need to be
> addressed separately. The kernel code that handles these either needs
> to be A) modified to unconditionally do all invalidates by tlbwe or B)
> conditionally
> use tlbwe depending on whether this is a guest that has enabled direct
> TLB management.
>
> Stuart
>
In the case of indirect entries, I think we can configure tlbwe and
tlbilx to trap to the hypervisor. The guest should not mix tlbwe (for
TLB0) with the hardware page table walk, so we can support this scenario
without modifying the guest.
Diana