Linux KVM/arm64 development list
 help / color / mirror / Atom feed
From: Marc Zyngier <maz@kernel.org>
To: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: kvmarm@lists.cs.columbia.edu, shan.gavin@gmail.com
Subject: Re: [PATCH] arm64/kvm: Fix zapping stage2 page table wrongly
Date: Wed, 02 Sep 2020 13:04:00 +0100	[thread overview]
Message-ID: <4fb3f8e03e1fb3bb93113037fa62fceb@kernel.org> (raw)
In-Reply-To: <772ab66f-1e17-275f-f65d-08d8f67a90f9@arm.com>

On 2020-09-02 12:53, Alexandru Elisei wrote:
> Hi Marc,
> 
> On 9/2/20 12:10 PM, Marc Zyngier wrote:
>> On 2020-09-02 11:59, Alexandru Elisei wrote:
>>> Hi,
>>> 
>>> On 8/22/20 3:44 AM, Gavin Shan wrote:
>>>> Depending on the kernel configuration, PUD_SIZE could be equal to
>>>> PMD_SIZE. For example, both of them are 512MB with the following
>>>> kernel configuration. In this case, both PUD and PMD are folded
>>>> to PGD.
>>>> 
>>>>    CONFIG_ARM64_64K_PAGES   y
>>>>    CONFIG_ARM64_VA_BITS     42
>>>>    CONFIG_PGTABLE_LEVELS    2
>>>> 
>>>> With the above configuration, the stage2 PUD is used to backup the
>>>> 512MB huge page when the stage2 mapping is built. During the 
>>>> mapping,
>>>> the PUD and its subordinate levels of page table entries are 
>>>> unmapped
>>>> if the PUD is present and not huge page sensitive in 
>>>> stage2_set_pud_huge().
>>>> Unfornately, the @addr isn't aligned to S2_PUD_SIZE and wrong page 
>>>> table
>>>> entries are zapped. It eventually leads to PUD's present bit can't 
>>>> be
>>>> cleared successfully and infinite loop in stage2_set_pud_huge().
>>>> 
>>>> This fixes the issue by checking with S2_{PUD, PMD}_SIZE instead of
>>>> {PUD, PMD}_SIZE to determine if stage2 PUD or PMD is used to back 
>>>> the
>>>> huge page. For this particular case, the stage2 PMD entry should be
>>>> used to backup the 512MB huge page with stage2_set_pmd_huge().
>>> 
>>> I can reproduce this on my rockpro64 using kvmtool.
>>> 
>>> I see two issues here: first, PUD_SIZE = 512MB, but S2_PUD_SIZE = 4TB 
>>> (checked
>>> using printk), and second, stage2_set_pud_huge() hangs. I'm working 
>>> on
>>> debugging them.
>> 
>> I have this as an immediate fix for the set_pud_huge hang, tested
>> on Seattle with 64k/42bits.
>> 
>> I can't wait to see the back of this code...
> 
> The problem is in stage2_set_pud_huge(), because kvm_stage2_has_pmd() 
> returns
> false (CONFIG_PGTABLE_LEVELS = 2):
> 
>     pudp = stage2_get_pud(mmu, cache, addr);
>     VM_BUG_ON(!pudp);
> 
>     old_pud = *pudp;
> 
>     [..]
> 
>     // Returns 1 because !kvm_stage2_has_pmd()
>     if (stage2_pud_present(kvm, old_pud)) {
>         /*
>          * If we already have table level mapping for this block, unmap
>          * the range for this block and retry.
>          */
>         if (!stage2_pud_huge(kvm, old_pud)) { // Always true because
> !kvm_stage2_has_pmd()
>             unmap_stage2_range(mmu, addr & S2_PUD_MASK, S2_PUD_SIZE);
>             goto retry;
>         }
> 
> And we end up jumping back to retry forever. IMO, in user_mem_abort(),
> if PUD_SIZE
> == PMD_SIZE, we should try to map PMD_SIZE instead of PUD_SIZE. Maybe 
> something
> like this?

Err... If PUD_SIZE == PMD_SIZE, what difference does it make to map
one or the other?

> 
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index ba00bcc0c884..178267dec511 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -1886,8 +1886,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu,
> phys_addr_t fault_ipa,
>          * As for PUD huge maps, we must make sure that we have at 
> least
>          * 3 levels, i.e, PMD is not folded.
>          */
> -       if (vma_pagesize == PMD_SIZE ||
> -           (vma_pagesize == PUD_SIZE && kvm_stage2_has_pmd(kvm)))
> +       if (vma_pagesize == PUD_SIZE && !kvm_stage2_has_pmd(kvm))
> +               vma_pagesize = PMD_SIZE;
> +
> +       if (vma_pagesize == PUD_SIZE || vma_pagesize == PUD_SIZE)
>                 gfn = (fault_ipa & huge_page_mask(hstate_vma(vma))) >>
> PAGE_SHIFT;
>         mmap_read_unlock(current->mm);

I don't think this solves anything in the 2 level case. The gist of
the issue is that if we go on the PUD path, we end-up computing the
wrong offset for the entry and either loop or do something silly.
In either case, we read/write something we have no business touching.

Gavin sidesteps the problem by using a fantasy PUD size, while I
catch it by explicitly checking for the PMD == PUD case.

         M.
-- 
Jazz is not dead. It just smells funny...
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

  parent reply	other threads:[~2020-09-02 12:04 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-22  2:44 [PATCH] arm64/kvm: Fix zapping stage2 page table wrongly Gavin Shan
2020-08-22 10:01 ` Marc Zyngier
2020-08-22 23:59   ` Gavin Shan
2020-09-01 16:50     ` Marc Zyngier
2020-09-02 10:59 ` Alexandru Elisei
2020-09-02 11:10   ` Marc Zyngier
2020-09-02 11:53     ` Alexandru Elisei
2020-09-02 11:56       ` Alexandru Elisei
2020-09-02 12:04       ` Marc Zyngier [this message]
2020-09-02 13:58         ` Alexandru Elisei
2020-09-02 17:38     ` Auger Eric
2020-09-02 23:55     ` Gavin Shan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4fb3f8e03e1fb3bb93113037fa62fceb@kernel.org \
    --to=maz@kernel.org \
    --cc=alexandru.elisei@arm.com \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=shan.gavin@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox