From: "Suzuki K. Poulose" <Suzuki.Poulose@arm.com>
To: Christoffer Dall <christoffer.dall@linaro.org>
Cc: linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, Catalin.Marinas@arm.com,
Will.Deacon@arm.com, Mark.Rutland@arm.com, Marc.Zyngier@arm.com,
kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org,
ard.biesheuvel@linaro.org
Subject: Re: [PATCH 13/15] arm64: kvm: Rewrite fake pgd handling
Date: Mon, 12 Oct 2015 10:55:24 +0100 [thread overview]
Message-ID: <561B838C.5090008@arm.com> (raw)
In-Reply-To: <20151010145227.GB29128@cbox>
On 10/10/15 15:52, Christoffer Dall wrote:
> Hi Suzuki,
Hi Christoffer,
Thanks for being patient enough to review the code :-) without much of
the comments. I now realise there needs much more documentation than
what I have put in already. I am taking care of this in the next
revision already.
> I had to refresh my mind a fair bit to be able to review this, so I
> thought it may be useful to just remind us all what the constraints of
> this whole thing is, and make sure we agree on this:
>
> 1. We fix the IPA max width to 40 bits
> 2. We don't support systems with a PARange smaller than 40 bits (do we
> check this anywhere or document this anywhere?)
AFAIT, no we don't check it anywhere. May be we should. We could plug this
into my CPU feature infrastructure[1] and let the is_hype_mode_available()
use the info to decide if we can support 40bit IPA ?
> 3. We always assume we are running on a system with PARange of 40 bits
> and we are therefore constrained to use concatination.
>
> As an implication of (3) above, this code will attempt to allocate 256K
> of physically contiguous memory for each VM on the system. That is
> probably ok, but I just wanted to point it out in case it raises any
> eyebrows for other people following this thread.
Right, I will document this in a comment.
>> level: 0 1 2 3
>> bits : [47] [46 - 36] [35 - 25] [24 - 14] [13 - 0]
>> ^ ^ ^
>> | | |
>> host entry | x---- stage-2 entry
>> |
>> IPA -----x
>
> Isn't the stage-2 entry using bits [39:25], because you resolve
> more than 11 bits on the initial level of lookup when you concatenate
> tables?
Yes, the stage-2 entry is just supposed to show the entry level (2).
>>
>> The following conditions hold true for all cases(with 40bit IPA)
>> 1) The stage-2 entry level <= 2
>> 2) Number of fake page-table entries is in the inclusive range [0, 2].
>
> nit: Number of fake levels of page tables
Correct, I have fixed it already.
>> +/*
>> + * At stage-2 entry level, upto 16 tables can be concatenated and
>
> nit: Can you rewrite the first part of this comment to be in line with
> the ARM ARM, such as: "The stage-2 page tables can concatenate up to 16
> tables at the inital level" ?
Yes, will do it.
>
>
>> + * the hardware expects us to use concatenation, whenever possible.
>
> I think the 'hardware expects us' is a bit vague. At least I find this
> whole part of the architecture incredibly confusing already, so it would
> help me in the future if we put something like:
>
> "The hardware requires that we use concatenation depending on the
> supported PARange and page size. We always assume the hardware's PASize
> is maximum 40 bits in this context, and with a fixed IPA width of 40
> bits, we concatenate 2 tables for 4K pages, 16 tables for 16K pages, and
> do not use concatenation for 64K pages."
>
> Did I get this right?
You are right. The rule is simple. Upto 16 tables can be concatenated at
the stage-2 entry level.
>
>> + * So, number of page table levels for KVM_PHYS_SHIFT is always
>> + * the number of normal page table levels for (KVM_PHYS_SHIFT - 4).
>> + */
>> +#define HYP_PGTABLE_LEVELS ARM64_HW_PGTABLE_LEVELS(KVM_PHYS_SHIFT - 4)
>
> I see the math lines up, but I don't think it's intuitive, as I don't
> understand why it's obvious that it's the 'normal' page table for
> KVM_PHYS_SHIFT - 4.
Because, we can concatenate upto 16 page table entries. With the current
set of page sizes the above 'magic' formula works out. But yes, the following
suggestion makes more sense.
>
> I see this as an architectural limitation given in the ARM ARM, and we
> should just refer to that, and do:
>
> #if PAGE_SHIFT == 12
> #define S2_PGTABLE_LEVELS 3
> #else
> #define S2_PGTABLE_LEVELS 2
> #endif
OK, we could do that.
>
>> +/* Number of bits normally addressed by HYP_PGTABLE_LEVELS */
>> +#define HYP_PGTABLE_SHIFT ARM64_HW_PGTABLE_LEVEL_SHIFT(HYP_PGTABLE_LEVELS + 1)
>> +#define HYP_PGDIR_SHIFT ARM64_HW_PGTABLE_LEVEL_SHIFT(HYP_PGTABLE_LEVELS)
>> +#define HYP_PGTABLE_ENTRY_LEVEL (4 - HYP_PGTABLE_LEVELS)
>
> We are introducing a huge number of defines here, which are all more or
> less opaque to anyone coming back to this code.
>
> I may be extraordinarily stupid, but I really need each define explained
> in a comment to be able to follow this code (those above and the
> S2_ENTRY_TABLES below).
No, you right. I need to document all the above properly, which I is something
I am in the middle of.
>
> I actually wonder from looking at this whole patch if we even want to go
> here. Maybe this is really the time to say that we should get rid of
> the dependency between the host page table layout and the stage-2 page
> table layout.
>
> Since the rest of this series looks pretty good, I'm wondering if you
> should just disable KVM in the config system if 16K pages is selected,
> and then you can move ahead with this series while we fix KVM properly?
I can send an updated version (which is in the test furnace) soon, so that
you can take a look ?
Suzuki
next prev parent reply other threads:[~2015-10-12 9:55 UTC|newest]
Thread overview: 43+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-09-15 15:41 [PATCHv2 00/15] arm64: 16K translation granule support Suzuki K. Poulose
2015-09-15 15:41 ` [PATCH 01/15] arm64: Move swapper pagetable definitions Suzuki K. Poulose
2015-09-15 15:41 ` [PATCH 02/15] arm64: Handle section maps for swapper/idmap Suzuki K. Poulose
2015-09-15 15:41 ` [PATCH 03/15] arm64: Introduce helpers for page table levels Suzuki K. Poulose
2015-10-07 8:26 ` Christoffer Dall
2015-10-07 9:26 ` Marc Zyngier
2015-10-07 9:48 ` Suzuki K. Poulose
2015-10-08 14:45 ` Christoffer Dall
2015-10-08 17:22 ` Suzuki K. Poulose
2015-10-08 17:28 ` Catalin Marinas
2015-10-09 9:22 ` Suzuki K. Poulose
2015-10-07 9:51 ` Suzuki K. Poulose
2015-09-15 15:41 ` [PATCH 04/15] arm64: Calculate size for idmap_pg_dir at compile time Suzuki K. Poulose
2015-09-15 15:41 ` [PATCH 05/15] arm64: Handle 4 level page table for swapper Suzuki K. Poulose
2015-09-15 15:41 ` [PATCH 06/15] arm64: Clean config usages for page size Suzuki K. Poulose
2015-09-15 15:41 ` [PATCH 07/15] arm64: Kconfig: Fix help text about AArch32 support with 64K pages Suzuki K. Poulose
2015-09-15 15:41 ` [PATCH 08/15] arm64: Check for selected granule support Suzuki K. Poulose
2015-09-15 15:41 ` [PATCH 09/15] arm64: Add page size to the kernel image header Suzuki K. Poulose
2015-10-02 15:49 ` Catalin Marinas
2015-10-02 16:31 ` Catalin Marinas
2015-10-02 16:50 ` Marc Zyngier
2015-10-05 15:43 ` Christoffer Dall
2015-10-05 13:02 ` Suzuki K. Poulose
2015-10-05 13:22 ` Ard Biesheuvel
2015-10-10 17:22 ` Christoffer Dall
2015-09-15 15:41 ` [PATCH 10/15] arm64: kvm: Fix {V}TCR_EL2_TG0 mask Suzuki K. Poulose
2015-10-08 15:17 ` Christoffer Dall
2015-09-15 15:41 ` [PATCH 11/15] arm64: Cleanup VTCR_EL2 computation Suzuki K. Poulose
2015-10-07 10:11 ` Marc Zyngier
2015-10-07 10:23 ` Suzuki K. Poulose
2015-10-10 17:22 ` Christoffer Dall
2015-09-15 15:41 ` [PATCH 12/15] arm: kvm: Move fake PGD handling to arch specific files Suzuki K. Poulose
2015-10-07 10:23 ` Marc Zyngier
2015-10-10 17:22 ` Christoffer Dall
2015-09-15 15:41 ` [PATCH 13/15] arm64: kvm: Rewrite fake pgd handling Suzuki K. Poulose
2015-10-07 11:13 ` Marc Zyngier
2015-10-07 12:21 ` Suzuki K. Poulose
2015-10-10 14:52 ` Christoffer Dall
2015-10-12 9:55 ` Suzuki K. Poulose [this message]
2015-10-13 15:39 ` Christoffer Dall
2015-10-13 16:04 ` Suzuki K. Poulose
2015-09-15 15:41 ` [PATCH 14/15] arm64: Add 16K page size support Suzuki K. Poulose
2015-09-15 15:41 ` [PATCH 15/15] arm64: 36 bit VA Suzuki K. Poulose
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=561B838C.5090008@arm.com \
--to=suzuki.poulose@arm.com \
--cc=Catalin.Marinas@arm.com \
--cc=Marc.Zyngier@arm.com \
--cc=Mark.Rutland@arm.com \
--cc=Will.Deacon@arm.com \
--cc=ard.biesheuvel@linaro.org \
--cc=christoffer.dall@linaro.org \
--cc=kvm@vger.kernel.org \
--cc=kvmarm@lists.cs.columbia.edu \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).