Re: [PATCH 13/15] arm64: kvm: Rewrite fake pgd handling

From: "Suzuki K. Poulose" <Suzuki.Poulose@arm.com>
To: Christoffer Dall <christoffer.dall@linaro.org>
Cc: linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, Catalin.Marinas@arm.com,
	Will.Deacon@arm.com, Mark.Rutland@arm.com, Marc.Zyngier@arm.com,
	kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org,
	ard.biesheuvel@linaro.org
Subject: Re: [PATCH 13/15] arm64: kvm: Rewrite fake pgd handling
Date: Mon, 12 Oct 2015 10:55:24 +0100
Message-ID: <561B838C.5090008@arm.com>
In-Reply-To: <20151010145227.GB29128@cbox>

On 10/10/15 15:52, Christoffer Dall wrote:
> Hi Suzuki,

Hi Christoffer,

Thanks for being patient enough to review the code :-) without many
comments to go on. I now realise it needs much more documentation than
what I have put in so far. I am taking care of this in the next
revision.

> I had to refresh my mind a fair bit to be able to review this, so I
> thought it may be useful to just remind us all what the constraints of
> this whole thing is, and make sure we agree on this:
>
> 1. We fix the IPA max width to 40 bits
> 2. We don't support systems with a PARange smaller than 40 bits (do we
>     check this anywhere or document this anywhere?)

AFAICT, no, we don't check it anywhere. Maybe we should. We could plug this
into my CPU feature infrastructure[1] and let is_hyp_mode_available()
use the info to decide if we can support a 40-bit IPA?
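
A minimal sketch of what such a check could look like, assuming the raw
ID_AA64MMFR0_EL1 value has already been read by the caller. The helper
names parange_to_bits() and pa_range_covers_ipa() are hypothetical and
are not part of the kernel or of this series; only the PARange field
position (bits [3:0]) and its encodings come from the architecture.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not an existing kernel function: decode the
 * PARange field (bits [3:0]) of an ID_AA64MMFR0_EL1 value into a
 * physical address width in bits. Returns 0 for unknown encodings.
 */
static unsigned int parange_to_bits(uint64_t mmfr0)
{
	/* 0b0110 (52 bits) only exists on later architecture revisions. */
	static const unsigned int widths[] = { 32, 36, 40, 42, 44, 48, 52 };
	unsigned int field = (unsigned int)(mmfr0 & 0xf);

	if (field >= sizeof(widths) / sizeof(widths[0]))
		return 0;
	return widths[field];
}

/* Hypothetical check: can the CPU back an IPA space of ipa_bits? */
static bool pa_range_covers_ipa(uint64_t mmfr0, unsigned int ipa_bits)
{
	assert(ipa_bits != 0);	/* a zero-width IPA space is a caller bug */
	return parange_to_bits(mmfr0) >= ipa_bits;
}
```

With this sketch, a PARange encoding of 0b0001 (36 bits) would fail the
check for a 40-bit IPA, while 0b0010 (40 bits) and above would pass.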

> 3. We always assume we are running on a system with PARange of 40 bits
>     and we are therefore constrained to use concatenation.
>
> As an implication of (3) above, this code will attempt to allocate 256K
> of physically contiguous memory for each VM on the system.  That is
> probably ok, but I just wanted to point it out in case it raises any
> eyebrows for other people following this thread.

Right, I will document this in a comment.

>> level:  0       1         2         3
>> bits : [47] [46 - 36] [35 - 25] [24 - 14] [13 - 0]
>>           ^       ^     ^
>>           |       |     |
>>     host entry    |     x---- stage-2 entry
>>                   |
>>          IPA -----x
>
> Isn't the stage-2 entry using bits [39:25], because you resolve
> more than 11 bits on the initial level of lookup when you concatenate
> tables?

Yes, the stage-2 entry is just supposed to show the entry level (2).

>>
>> The following conditions hold true for all cases(with 40bit IPA)
>> 1) The stage-2 entry level <= 2
>> 2) Number of fake page-table entries is in the inclusive range [0, 2].
>
> nit: Number of fake levels of page tables

Correct, I have fixed it already.


>> +/*
>> + * At stage-2 entry level, upto 16 tables can be concatenated and
>
> nit: Can you rewrite the first part of this comment to be in line with
> the ARM ARM, such as: "The stage-2 page tables can concatenate up to 16
> tables at the initial level"?

Yes, will do it.

>
>
>> + * the hardware expects us to use concatenation, whenever possible.
>
> I think the 'hardware expects us' is a bit vague.  At least I find this
> whole part of the architecture incredibly confusing already, so it would
> help me in the future if we put something like:
>
> "The hardware requires that we use concatenation depending on the
> supported PARange and page size.  We always assume the hardware's PASize
> is maximum 40 bits in this context, and with a fixed IPA width of 40
> bits, we concatenate 2 tables for 4K pages, 16 tables for 16K pages, and
> do not use concatenation for 64K pages."
>
> Did I get this right?

You are right. The rule is simple: up to 16 tables can be concatenated at
the stage-2 entry level.
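
The numbers quoted above (2 tables for 4K, 16 for 16K, none for 64K, and
the 256K contiguous allocation for 16K pages) can be reproduced with a
short sketch. It assumes 8-byte table entries and that the entry level is
chosen so that concatenation is used whenever it saves a level, as the
patch comment describes. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Levels needed when up to 16 tables (4 extra bits) may be concatenated. */
static unsigned int s2_levels(unsigned int ipa_bits, unsigned int page_shift)
{
	unsigned int bits_per_level = page_shift - 3;	/* 8-byte entries */

	assert(page_shift > 3 && ipa_bits > page_shift + 4);
	/* ceil((ipa_bits - page_shift - 4) / bits_per_level) */
	return (ipa_bits - page_shift - 4 + bits_per_level - 1) / bits_per_level;
}

/* Number of tables concatenated at the entry level (1 means none). */
static unsigned int s2_concat_tables(unsigned int ipa_bits, unsigned int page_shift)
{
	unsigned int bits_per_level = page_shift - 3;
	unsigned int resolved = s2_levels(ipa_bits, page_shift) * bits_per_level;
	unsigned int needed = ipa_bits - page_shift;

	return needed > resolved ? 1u << (needed - resolved) : 1;
}

/* Physically contiguous bytes needed for the entry-level tables. */
static size_t s2_entry_bytes(unsigned int ipa_bits, unsigned int page_shift)
{
	return (size_t)s2_concat_tables(ipa_bits, page_shift) << page_shift;
}
```

For a 40-bit IPA, s2_concat_tables() gives 2, 16 and 1 for page shifts of
12, 14 and 16, and s2_entry_bytes(40, 14) gives 16 * 16K = 256K.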

>
>> + * So, number of page table levels for KVM_PHYS_SHIFT is always
>> + * the number of normal page table levels for (KVM_PHYS_SHIFT - 4).
>> + */
>> +#define HYP_PGTABLE_LEVELS	ARM64_HW_PGTABLE_LEVELS(KVM_PHYS_SHIFT - 4)
>
> I see the math lines up, but I don't think it's intuitive, as I don't
> understand why it's obvious that it's the 'normal' page table for
> KVM_PHYS_SHIFT - 4.

Because we can concatenate up to 16 page tables, which accounts for 4 bits
(16 = 2^4). With the current set of page sizes the above 'magic' formula
works out. But yes, the following suggestion makes more sense.

>
> I see this as an architectural limitation given in the ARM ARM, and we
> should just refer to that, and do:
>
> #if PAGE_SHIFT == 12
> #define S2_PGTABLE_LEVELS	3
> #else
> #define S2_PGTABLE_LEVELS	2
> #endif

OK, we could do that.
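
As a sanity check that the two approaches agree, here is a small sketch
comparing the formula with the explicit per-page-size values. It assumes
ARM64_HW_PGTABLE_LEVELS(va_bits) expands to (va_bits - 4) / (PAGE_SHIFT - 3);
hw_pgtable_levels() below is a stand-in that takes the page shift as a
parameter, and the other function names are hypothetical.

```c
#include <assert.h>

#define KVM_PHYS_SHIFT	40	/* fixed IPA width discussed in this thread */

/* Assumed to mirror ARM64_HW_PGTABLE_LEVELS() from earlier in the series. */
static unsigned int hw_pgtable_levels(unsigned int va_bits, unsigned int page_shift)
{
	assert(page_shift > 3 && va_bits > 4);
	return (va_bits - 4) / (page_shift - 3);
}

/* The 'magic' formula: normal levels for (KVM_PHYS_SHIFT - 4) bits. */
static unsigned int hyp_levels_formula(unsigned int page_shift)
{
	return hw_pgtable_levels(KVM_PHYS_SHIFT - 4, page_shift);
}

/* The explicit alternative: 3 levels for 4K pages, 2 otherwise. */
static unsigned int s2_levels_explicit(unsigned int page_shift)
{
	return page_shift == 12 ? 3 : 2;
}
```

Under these assumptions both functions return 3 for 4K pages and 2 for
16K and 64K pages, so the explicit form loses nothing for a 40-bit IPA.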

>
>> +/* Number of bits normally addressed by HYP_PGTABLE_LEVELS */
>> +#define HYP_PGTABLE_SHIFT	ARM64_HW_PGTABLE_LEVEL_SHIFT(HYP_PGTABLE_LEVELS + 1)
>> +#define HYP_PGDIR_SHIFT		ARM64_HW_PGTABLE_LEVEL_SHIFT(HYP_PGTABLE_LEVELS)
>> +#define HYP_PGTABLE_ENTRY_LEVEL	(4 - HYP_PGTABLE_LEVELS)
>
> We are introducing a huge number of defines here, which are all more or
> less opaque to anyone coming back to this code.
>
> I may be extraordinarily stupid, but I really need each define explained
> in a comment to be able to follow this code (those above and the
> S2_ENTRY_TABLES below).

No, you are right. I need to document all of the above properly, which is
something I am in the middle of.
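
To make the quoted defines concrete, the sketch below evaluates them for a
given page size. It assumes ARM64_HW_PGTABLE_LEVEL_SHIFT(n) works out to
(PAGE_SHIFT - 3) * n + 3 in this series, which is inferred from the 16K
diagram above rather than taken from the patch itself. The struct and
function names are hypothetical.

```c
#include <assert.h>

struct hyp_layout {
	unsigned int levels;		/* HYP_PGTABLE_LEVELS */
	unsigned int pgtable_shift;	/* HYP_PGTABLE_SHIFT */
	unsigned int pgdir_shift;	/* HYP_PGDIR_SHIFT */
	unsigned int entry_level;	/* HYP_PGTABLE_ENTRY_LEVEL */
};

/* Assumed level-shift rule: each level resolves (page_shift - 3) bits. */
static unsigned int level_shift(unsigned int page_shift, unsigned int n)
{
	return (page_shift - 3) * n + 3;
}

static struct hyp_layout hyp_layout_for(unsigned int page_shift, unsigned int ipa_bits)
{
	struct hyp_layout l;

	assert(page_shift > 3 && ipa_bits > 8);
	/* Normal levels for (ipa_bits - 4), i.e. with 16-way concatenation. */
	l.levels = (ipa_bits - 4 - 4) / (page_shift - 3);
	/* Bits normally addressed by that many levels. */
	l.pgtable_shift = level_shift(page_shift, l.levels + 1);
	/* Lowest address bit indexed by the entry-level table. */
	l.pgdir_shift = level_shift(page_shift, l.levels);
	/* Architectural level number at which the walk starts. */
	l.entry_level = 4 - l.levels;
	return l;
}
```

For 16K pages and a 40-bit IPA this yields 2 levels, a table shift of 36,
a pgdir shift of 25 and entry level 2, which matches the diagram: the
entry level normally covers bits [35:25], and concatenation extends the
index to bits [39:25].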

>
> I actually wonder from looking at this whole patch if we even want to go
> here.  Maybe this is really the time to say that we should get rid of
> the dependency between the host page table layout and the stage-2 page
> table layout.
>
> Since the rest of this series looks pretty good, I'm wondering if you
> should just disable KVM in the config system if 16K pages is selected,
> and then you can move ahead with this series while we fix KVM properly?

I can send an updated version (which is in the test furnace) soon, so that
you can take a look?

Suzuki
