* Help on kvm_tlb_flush_vmid_ipa usage
@ 2014-12-18 19:27 Mario Smarduch
2014-12-18 19:38 ` Marc Zyngier
0 siblings, 1 reply; 7+ messages in thread
From: Mario Smarduch @ 2014-12-18 19:27 UTC (permalink / raw)
To: linux-arm-kernel
When this function is called, an IPA is passed in.
Looking at the HYP implementation, it uses the IPA
directly in the tlbi instruction. But according to the TLB maintenance
instruction syntax, bits [35:0] should be set to
IPA[47:12]. I traced the source code but don't see
the adjustment. I must be missing something,
given that this function is fundamental to the KVM MMU.
Thanks,
Mario
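
The encoding in question can be written out as a small C sketch. It only restates the rule quoted above (operand bits [35:0] carry IPA[47:12]); the helper name `tlbi_ipa_operand` is hypothetical and is not a kernel function, and clearing everything above bit 35 is an assumption of this sketch.

```c
#include <stdint.h>

/* Width of the address field in the TLBI operand, per the rule quoted above. */
#define TLBI_IPA_FIELD_BITS 36

/*
 * Hypothetical helper (not kernel code): build the operand for a
 * by-IPA Stage-2 invalidation from a full intermediate physical address.
 * The low 12 bits (offset within a 4K page) are dropped, and the result
 * is limited to the 36-bit field that holds IPA[47:12].
 */
static inline uint64_t tlbi_ipa_operand(uint64_t ipa)
{
    const uint64_t field_mask = (UINT64_C(1) << TLBI_IPA_FIELD_BITS) - 1;

    return (ipa >> 12) & field_mask;
}
```

With this helper, any address inside the same 4K page yields the same operand, which is the adjustment the question says is missing from the HYP code.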
^ permalink raw reply [flat|nested] 7+ messages in thread
* Help on kvm_tlb_flush_vmid_ipa usage
2014-12-18 19:27 Help on kvm_tlb_flush_vmid_ipa usage Mario Smarduch
@ 2014-12-18 19:38 ` Marc Zyngier
2014-12-18 19:55 ` Mario Smarduch
2014-12-18 22:25 ` Mario Smarduch
0 siblings, 2 replies; 7+ messages in thread
From: Marc Zyngier @ 2014-12-18 19:38 UTC (permalink / raw)
To: linux-arm-kernel
On 18/12/14 19:27, Mario Smarduch wrote:
> When this function is called IPA address is used. Looking at the HYP
> implementation it uses the IPA directly in tlbi instructions. But
> reading the TLB maintnance instruction syntax, bit [35:0] should be
> set to IPA[47:12]. I traced the source code but don't see the
> adjustment. I must be missing something given this function is
> fundamental to KVM MMU.
Ermmm... Someone (that is, I) needs a brown paper bag again.
diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
index b72aa9f..a767f6a 100644
--- a/arch/arm64/kvm/hyp.S
+++ b/arch/arm64/kvm/hyp.S
@@ -1014,6 +1014,7 @@ ENTRY(__kvm_tlb_flush_vmid_ipa)
 	 * Instead, we invalidate Stage-2 for this IPA, and the
 	 * whole of Stage-1. Weep...
 	 */
+	lsr	x1, x1, #12
 	tlbi	ipas2e1is, x1
 	/*
 	 * We have to ensure completion of the invalidation at Stage-2,
M.
--
Jazz is not dead. It just smells funny...
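
To make the effect of the one-line fix concrete, here is a rough C model of how the operand is interpreted with and without the shift. It assumes only what the thread states (operand bits [35:0] are read as IPA[47:12]); all function names are hypothetical and nothing here is taken from the kernel sources.

```c
#include <stdint.h>

#define TLBI_IPA_FIELD_MASK ((UINT64_C(1) << 36) - 1)

/* IPA that an operand value names, if bits [35:0] are read as IPA[47:12]. */
static inline uint64_t ipa_named_by_operand(uint64_t operand)
{
    return (operand & TLBI_IPA_FIELD_MASK) << 12;
}

/* Before the fix: the raw IPA was handed to the instruction unchanged. */
static inline uint64_t operand_before_fix(uint64_t ipa)
{
    return ipa;
}

/* After the fix: models "lsr x1, x1, #12" ahead of the tlbi. */
static inline uint64_t operand_after_fix(uint64_t ipa)
{
    return ipa >> 12;
}
```

Under this model, a guest address such as 0x80000000 passed without the shift names IPA 0x80000000000 instead, so the entry that was meant to be invalidated would be left alone.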
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Help on kvm_tlb_flush_vmid_ipa usage
2014-12-18 19:38 ` Marc Zyngier
@ 2014-12-18 19:55 ` Mario Smarduch
2014-12-19 9:35 ` Marc Zyngier
2014-12-18 22:25 ` Mario Smarduch
1 sibling, 1 reply; 7+ messages in thread
From: Mario Smarduch @ 2014-12-18 19:55 UTC (permalink / raw)
To: linux-arm-kernel
On 12/18/2014 11:38 AM, Marc Zyngier wrote:
> On 18/12/14 19:27, Mario Smarduch wrote:
>> When this function is called IPA address is used. Looking at the HYP
>> implementation it uses the IPA directly in tlbi instructions. But
>> reading the TLB maintnance instruction syntax, bit [35:0] should be
>> set to IPA[47:12]. I traced the source code but don't see the
>> adjustment. I must be missing something given this function is
>> fundamental to KVM MMU.
>
> Ermmm... Someone (that is, I) needs a brown paper back again.
>
> diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
> index b72aa9f..a767f6a 100644
> --- a/arch/arm64/kvm/hyp.S
> +++ b/arch/arm64/kvm/hyp.S
> @@ -1014,6 +1014,7 @@ ENTRY(__kvm_tlb_flush_vmid_ipa)
> * Instead, we invalidate Stage-2 for this IPA, and the
> * whole of Stage-1. Weep...
> */
> + lsr x1, x1, #12
> tlbi ipas2e1is, x1
> /*
> * We have to ensure completion of the invalidation at Stage-2,
>
> M.
>
Marc,
thanks.
Another question: how do you handle a huge page?
Do you need to zero out any PMD/PUD mask
bits, or just pass the address in as is? The manual says
the MMU can figure out which bits are treated as
0 for 16/64KB pages; can the same thing be assumed
for huge pages?
- Mario
^ permalink raw reply [flat|nested] 7+ messages in thread
* Help on kvm_tlb_flush_vmid_ipa usage
2014-12-18 19:38 ` Marc Zyngier
2014-12-18 19:55 ` Mario Smarduch
@ 2014-12-18 22:25 ` Mario Smarduch
2014-12-19 9:31 ` Marc Zyngier
1 sibling, 1 reply; 7+ messages in thread
From: Mario Smarduch @ 2014-12-18 22:25 UTC (permalink / raw)
To: linux-arm-kernel
On 12/18/2014 11:38 AM, Marc Zyngier wrote:
> On 18/12/14 19:27, Mario Smarduch wrote:
>> When this function is called IPA address is used. Looking at the HYP
>> implementation it uses the IPA directly in tlbi instructions. But
>> reading the TLB maintnance instruction syntax, bit [35:0] should be
>> set to IPA[47:12]. I traced the source code but don't see the
>> adjustment. I must be missing something given this function is
>> fundamental to KVM MMU.
>
> Ermmm... Someone (that is, I) needs a brown paper back again.
>
> diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
> index b72aa9f..a767f6a 100644
> --- a/arch/arm64/kvm/hyp.S
> +++ b/arch/arm64/kvm/hyp.S
> @@ -1014,6 +1014,7 @@ ENTRY(__kvm_tlb_flush_vmid_ipa)
> * Instead, we invalidate Stage-2 for this IPA, and the
> * whole of Stage-1. Weep...
> */
> + lsr x1, x1, #12
> tlbi ipas2e1is, x1
> /*
> * We have to ensure completion of the invalidation at Stage-2,
>
> M.
>
Hi Marc,
fwiw I re-ran the test that halts the host (on the Foundation
Model) with a guest booted, and the panic has gone away.
BUG: Bad page state in process K20nfsserver pfn:ff818
page:ffff7c7fc37e4540 count:-3 mapcount:0 mapping: (null) index:0x0
flags: 0x0()
page dumped because: nonzero _count
Modules linked in:
CPU: 1 PID: 761 Comm: K20nfsserver Not tainted 3.18.0-rc2+ #55
Call trace:
[<ffff800000087244>] dump_backtrace+0x0/0x12c
[<ffff800000087380>] show_stack+0x10/0x1c
[<ffff8000003ce804>] dump_stack+0x74/0x98
[<ffff80000011bdc4>] bad_page+0xdc/0x12c
[<ffff80000011ecc4>] get_page_from_freelist+0x4b4/0x600
[<ffff80000011eeec>] __alloc_pages_nodemask+0xdc/0x780
[<ffff80000011f5a4>] __get_free_pages+0x14/0x5c
[<ffff80000011f5fc>] get_zeroed_page+0x10/0x1c
[<ffff8000000903fc>] pgd_alloc+0xc/0x18
[<ffff8000000a13a0>] mm_init+0xcc/0x12c
[<ffff8000000a17f8>] mm_alloc+0x44/0x54
[<ffff800000166cbc>] do_execve+0x1a8/0x49c
[<ffff8000001671bc>] SyS_execve+0x1c/0x2c
^ permalink raw reply [flat|nested] 7+ messages in thread
* Help on kvm_tlb_flush_vmid_ipa usage
2014-12-18 22:25 ` Mario Smarduch
@ 2014-12-19 9:31 ` Marc Zyngier
2014-12-19 17:40 ` Mario Smarduch
0 siblings, 1 reply; 7+ messages in thread
From: Marc Zyngier @ 2014-12-19 9:31 UTC (permalink / raw)
To: linux-arm-kernel
On 18/12/14 22:25, Mario Smarduch wrote:
> On 12/18/2014 11:38 AM, Marc Zyngier wrote:
>> On 18/12/14 19:27, Mario Smarduch wrote:
>>> When this function is called IPA address is used. Looking at the HYP
>>> implementation it uses the IPA directly in tlbi instructions. But
>>> reading the TLB maintnance instruction syntax, bit [35:0] should be
>>> set to IPA[47:12]. I traced the source code but don't see the
>>> adjustment. I must be missing something given this function is
>>> fundamental to KVM MMU.
>>
>> Ermmm... Someone (that is, I) needs a brown paper back again.
>>
>> diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
>> index b72aa9f..a767f6a 100644
>> --- a/arch/arm64/kvm/hyp.S
>> +++ b/arch/arm64/kvm/hyp.S
>> @@ -1014,6 +1014,7 @@ ENTRY(__kvm_tlb_flush_vmid_ipa)
>> * Instead, we invalidate Stage-2 for this IPA, and the
>> * whole of Stage-1. Weep...
>> */
>> + lsr x1, x1, #12
>> tlbi ipas2e1is, x1
>> /*
>> * We have to ensure completion of the invalidation at Stage-2,
>>
>> M.
>>
>
> Hi Marc,
> fwiw I re-ran the test that halts the host (on foundation
> model) with a guest booted, panic has gone away.
>
> BUG: Bad page state in process K20nfsserver pfn:ff818
> page:ffff7c7fc37e4540 count:-3 mapcount:0 mapping: (null) index:0x0
> flags: 0x0()
> page dumped because: nonzero _count
> Modules linked in:
> CPU: 1 PID: 761 Comm: K20nfsserver Not tainted 3.18.0-rc2+ #55
> Call trace:
> [<ffff800000087244>] dump_backtrace+0x0/0x12c
> [<ffff800000087380>] show_stack+0x10/0x1c
> [<ffff8000003ce804>] dump_stack+0x74/0x98
> [<ffff80000011bdc4>] bad_page+0xdc/0x12c
> [<ffff80000011ecc4>] get_page_from_freelist+0x4b4/0x600
> [<ffff80000011eeec>] __alloc_pages_nodemask+0xdc/0x780
> [<ffff80000011f5a4>] __get_free_pages+0x14/0x5c
> [<ffff80000011f5fc>] get_zeroed_page+0x10/0x1c
> [<ffff8000000903fc>] pgd_alloc+0xc/0x18
> [<ffff8000000a13a0>] mm_init+0xcc/0x12c
> [<ffff8000000a17f8>] mm_alloc+0x44/0x54
> [<ffff800000166cbc>] do_execve+0x1a8/0x49c
> [<ffff8000001671bc>] SyS_execve+0x1c/0x2c
>
Absolutely amazing that we managed to run for so long with such a bug. I
suppose we rarely update page table entries, and most implementations
don't use split TLBs (where this instruction is useful).
Thanks a lot for reporting this bug, I'll repost a proper fix today.
M.
--
Jazz is not dead. It just smells funny...
^ permalink raw reply [flat|nested] 7+ messages in thread
* Help on kvm_tlb_flush_vmid_ipa usage
2014-12-18 19:55 ` Mario Smarduch
@ 2014-12-19 9:35 ` Marc Zyngier
0 siblings, 0 replies; 7+ messages in thread
From: Marc Zyngier @ 2014-12-19 9:35 UTC (permalink / raw)
To: linux-arm-kernel
On 18/12/14 19:55, Mario Smarduch wrote:
> On 12/18/2014 11:38 AM, Marc Zyngier wrote:
>> On 18/12/14 19:27, Mario Smarduch wrote:
>>> When this function is called IPA address is used. Looking at the HYP
>>> implementation it uses the IPA directly in tlbi instructions. But
>>> reading the TLB maintnance instruction syntax, bit [35:0] should be
>>> set to IPA[47:12]. I traced the source code but don't see the
>>> adjustment. I must be missing something given this function is
>>> fundamental to KVM MMU.
>>
>> Ermmm... Someone (that is, I) needs a brown paper back again.
>>
>> diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
>> index b72aa9f..a767f6a 100644
>> --- a/arch/arm64/kvm/hyp.S
>> +++ b/arch/arm64/kvm/hyp.S
>> @@ -1014,6 +1014,7 @@ ENTRY(__kvm_tlb_flush_vmid_ipa)
>> * Instead, we invalidate Stage-2 for this IPA, and the
>> * whole of Stage-1. Weep...
>> */
>> + lsr x1, x1, #12
>> tlbi ipas2e1is, x1
>> /*
>> * We have to ensure completion of the invalidation at Stage-2,
>>
>> M.
>>
>
> Marc,
> thanks.
>
> Another question, is how do you handle a huge tlb do you need to zero
> out any PMD/PUD mask bits or just pass it in as is. The manual says
> MMU can figure out which bits are treated as 0 for 16/64KB pages, can
> same thing be assumed for huge pages?
Indeed. The MMU will simply try to match the address with the TLBs,
taking into account how much the actual entry is covering (1G, 2M, 4K).
Thanks,
M.
--
Jazz is not dead. It just smells funny...
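
The matching behaviour described above can be illustrated with a toy model. This is only a sketch of the idea that each entry knows how much it covers; `struct tlb_entry` and `tlb_entry_matches` are hypothetical names, and real TLB hardware is not implemented this way.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy TLB entry: the start of the mapping and how much it covers. */
struct tlb_entry {
    uint64_t base; /* IPA of the start of the mapping, aligned to size */
    uint64_t size; /* bytes covered: 4K, 2M or 1G (a power of two) */
};

/*
 * An address hits the entry if it falls inside the covered range.
 * The low bits are masked according to the entry's own size, so the
 * caller does not need to align the address to a PMD or PUD boundary.
 */
static bool tlb_entry_matches(const struct tlb_entry *e, uint64_t ipa)
{
    return (ipa & ~(e->size - 1)) == e->base;
}
```

In this model, an unaligned address inside a 2M block still hits the 2M entry, which is why passing the IPA in as is would be enough.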
^ permalink raw reply [flat|nested] 7+ messages in thread
* Help on kvm_tlb_flush_vmid_ipa usage
2014-12-19 9:31 ` Marc Zyngier
@ 2014-12-19 17:40 ` Mario Smarduch
0 siblings, 0 replies; 7+ messages in thread
From: Mario Smarduch @ 2014-12-19 17:40 UTC (permalink / raw)
To: linux-arm-kernel
On 12/19/2014 01:31 AM, Marc Zyngier wrote:
> On 18/12/14 22:25, Mario Smarduch wrote:
>> On 12/18/2014 11:38 AM, Marc Zyngier wrote:
>>> On 18/12/14 19:27, Mario Smarduch wrote:
>>>> When this function is called IPA address is used. Looking at the HYP
>>>> implementation it uses the IPA directly in tlbi instructions. But
>>>> reading the TLB maintnance instruction syntax, bit [35:0] should be
>>>> set to IPA[47:12]. I traced the source code but don't see the
>>>> adjustment. I must be missing something given this function is
>>>> fundamental to KVM MMU.
>>>
>>> Ermmm... Someone (that is, I) needs a brown paper back again.
>>>
>>> diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
>>> index b72aa9f..a767f6a 100644
>>> --- a/arch/arm64/kvm/hyp.S
>>> +++ b/arch/arm64/kvm/hyp.S
>>> @@ -1014,6 +1014,7 @@ ENTRY(__kvm_tlb_flush_vmid_ipa)
>>> * Instead, we invalidate Stage-2 for this IPA, and the
>>> * whole of Stage-1. Weep...
>>> */
>>> + lsr x1, x1, #12
>>> tlbi ipas2e1is, x1
>>> /*
>>> * We have to ensure completion of the invalidation at Stage-2,
>>>
>>> M.
>>>
>>
>> Hi Marc,
>> fwiw I re-ran the test that halts the host (on foundation
>> model) with a guest booted, panic has gone away.
>>
>> BUG: Bad page state in process K20nfsserver pfn:ff818
>> page:ffff7c7fc37e4540 count:-3 mapcount:0 mapping: (null) index:0x0
>> flags: 0x0()
>> page dumped because: nonzero _count
>> Modules linked in:
>> CPU: 1 PID: 761 Comm: K20nfsserver Not tainted 3.18.0-rc2+ #55
>> Call trace:
>> [<ffff800000087244>] dump_backtrace+0x0/0x12c
>> [<ffff800000087380>] show_stack+0x10/0x1c
>> [<ffff8000003ce804>] dump_stack+0x74/0x98
>> [<ffff80000011bdc4>] bad_page+0xdc/0x12c
>> [<ffff80000011ecc4>] get_page_from_freelist+0x4b4/0x600
>> [<ffff80000011eeec>] __alloc_pages_nodemask+0xdc/0x780
>> [<ffff80000011f5a4>] __get_free_pages+0x14/0x5c
>> [<ffff80000011f5fc>] get_zeroed_page+0x10/0x1c
>> [<ffff8000000903fc>] pgd_alloc+0xc/0x18
>> [<ffff8000000a13a0>] mm_init+0xcc/0x12c
>> [<ffff8000000a17f8>] mm_alloc+0x44/0x54
>> [<ffff800000166cbc>] do_execve+0x1a8/0x49c
>> [<ffff8000001671bc>] SyS_execve+0x1c/0x2c
>>
>
> Absolutely amazing that we managed to run for so long with such a bug. I
> suppose we rarely update page table entries, and most implementations
> don't use split TLBs (where this instruction is useful).
Yes, I'm no expert on MMUs, but those bugs tend to stay concealed for a
long time. Way back, Itanium had an ASID wrap-around bug; the kernel ran
for a couple of years like that before it was fixed.
Also, a slight correction: the host kernel didn't panic, it continued on.
- Mario
>
> Thanks a lot for reporting this bug, I'll repost a proper fix today.
>
> M.
>
^ permalink raw reply [flat|nested] 7+ messages in thread