All of lore.kernel.org
 help / color / mirror / Atom feed
From: Marc Zyngier <maz@kernel.org>
To: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: kvmarm@lists.cs.columbia.edu, shan.gavin@gmail.com
Subject: Re: [PATCH] arm64/kvm: Fix zapping stage2 page table wrongly
Date: Wed, 02 Sep 2020 13:04:00 +0100	[thread overview]
Message-ID: <4fb3f8e03e1fb3bb93113037fa62fceb@kernel.org> (raw)
In-Reply-To: <772ab66f-1e17-275f-f65d-08d8f67a90f9@arm.com>

On 2020-09-02 12:53, Alexandru Elisei wrote:
> Hi Marc,
> 
> On 9/2/20 12:10 PM, Marc Zyngier wrote:
>> On 2020-09-02 11:59, Alexandru Elisei wrote:
>>> Hi,
>>> 
>>> On 8/22/20 3:44 AM, Gavin Shan wrote:
>>>> Depending on the kernel configuration, PUD_SIZE could be equal to
>>>> PMD_SIZE. For example, both of them are 512MB with the following
>>>> kernel configuration. In this case, both PUD and PMD are folded
>>>> to PGD.
>>>> 
>>>>    CONFIG_ARM64_64K_PAGES   y
>>>>    CONFIG_ARM64_VA_BITS     42
>>>>    CONFIG_PGTABLE_LEVELS    2
>>>> 
>>>> With the above configuration, the stage2 PUD is used to backup the
>>>> 512MB huge page when the stage2 mapping is built. During the 
>>>> mapping,
>>>> the PUD and its subordinate levels of page table entries are 
>>>> unmapped
>>>> if the PUD is present and not huge page sensitive in 
>>>> stage2_set_pud_huge().
>>>> Unfornately, the @addr isn't aligned to S2_PUD_SIZE and wrong page 
>>>> table
>>>> entries are zapped. It eventually leads to PUD's present bit can't 
>>>> be
>>>> cleared successfully and infinite loop in stage2_set_pud_huge().
>>>> 
>>>> This fixes the issue by checking with S2_{PUD, PMD}_SIZE instead of
>>>> {PUD, PMD}_SIZE to determine if stage2 PUD or PMD is used to back 
>>>> the
>>>> huge page. For this particular case, the stage2 PMD entry should be
>>>> used to backup the 512MB huge page with stage2_set_pmd_huge().
>>> 
>>> I can reproduce this on my rockpro64 using kvmtool.
>>> 
>>> I see two issues here: first, PUD_SIZE = 512MB, but S2_PUD_SIZE = 4TB 
>>> (checked
>>> using printk), and second, stage2_set_pud_huge() hangs. I'm working 
>>> on
>>> debugging them.
>> 
>> I have this as an immediate fix for the set_pud_huge hang, tested
>> on Seattle with 64k/42bits.
>> 
>> I can't wait to see the back of this code...
> 
> The problem is in stage2_set_pud_huge(), because kvm_stage2_has_pmd() 
> returns
> false (CONFIG_PGTABLE_LEVELS = 2):
> 
>     pudp = stage2_get_pud(mmu, cache, addr);
>     VM_BUG_ON(!pudp);
> 
>     old_pud = *pudp;
> 
>     [..]
> 
>     // Returns 1 because !kvm_stage2_has_pmd()
>     if (stage2_pud_present(kvm, old_pud)) {
>         /*
>          * If we already have table level mapping for this block, unmap
>          * the range for this block and retry.
>          */
>         if (!stage2_pud_huge(kvm, old_pud)) { // Always true because
> !kvm_stage2_has_pmd()
>             unmap_stage2_range(mmu, addr & S2_PUD_MASK, S2_PUD_SIZE);
>             goto retry;
>         }
> 
> And we end up jumping back to retry forever. IMO, in user_mem_abort(),
> if PUD_SIZE
> == PMD_SIZE, we should try to map PMD_SIZE instead of PUD_SIZE. Maybe 
> something
> like this?

Err... If PUD_SIZE == PMD_SIZE, what difference does it make to map
one or the other?

> 
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index ba00bcc0c884..178267dec511 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -1886,8 +1886,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu,
> phys_addr_t fault_ipa,
>          * As for PUD huge maps, we must make sure that we have at 
> least
>          * 3 levels, i.e, PMD is not folded.
>          */
> -       if (vma_pagesize == PMD_SIZE ||
> -           (vma_pagesize == PUD_SIZE && kvm_stage2_has_pmd(kvm)))
> +       if (vma_pagesize == PUD_SIZE && !kvm_stage2_has_pmd(kvm))
> +               vma_pagesize = PMD_SIZE;
> +
> +       if (vma_pagesize == PUD_SIZE || vma_pagesize == PUD_SIZE)
>                 gfn = (fault_ipa & huge_page_mask(hstate_vma(vma))) >>
> PAGE_SHIFT;
>         mmap_read_unlock(current->mm);

I don't think this solves anything in the 2 level case. The gist of
the issue is that if we go on the PUD path, we end-up computing the
wrong offset for the entry and either loop or do something silly.
In either case, we read/write something we have no business touching.

Gavin sidesteps the problem by using a fantasy PUD size, while I
catch it by explicitly checking for the PMD == PUD case.

         M.
-- 
Jazz is not dead. It just smells funny...
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

  parent reply	other threads:[~2020-09-02 12:04 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-22  2:44 [PATCH] arm64/kvm: Fix zapping stage2 page table wrongly Gavin Shan
2020-08-22 10:01 ` Marc Zyngier
2020-08-22 23:59   ` Gavin Shan
2020-09-01 16:50     ` Marc Zyngier
2020-09-02 10:59 ` Alexandru Elisei
2020-09-02 11:10   ` Marc Zyngier
2020-09-02 11:53     ` Alexandru Elisei
2020-09-02 11:56       ` Alexandru Elisei
2020-09-02 12:04       ` Marc Zyngier [this message]
2020-09-02 13:58         ` Alexandru Elisei
2020-09-02 17:38     ` Auger Eric
2020-09-02 23:55     ` Gavin Shan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4fb3f8e03e1fb3bb93113037fa62fceb@kernel.org \
    --to=maz@kernel.org \
    --cc=alexandru.elisei@arm.com \
    --cc=kvmarm@lists.cs.columbia.edu \
    --cc=shan.gavin@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.