From: "Suzuki K. Poulose" <Suzuki.Poulose@arm.com>
To: Christoffer Dall <christoffer.dall@linaro.org>
Cc: linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, Catalin.Marinas@arm.com,
Will.Deacon@arm.com, Mark.Rutland@arm.com, Marc.Zyngier@arm.com,
kvmarm@lists.cs.columbia.edu, kvm@vger.kernel.org,
ard.biesheuvel@linaro.org
Subject: Re: [PATCH 13/15] arm64: kvm: Rewrite fake pgd handling
Date: Mon, 12 Oct 2015 10:55:24 +0100 [thread overview]
Message-ID: <561B838C.5090008@arm.com> (raw)
In-Reply-To: <20151010145227.GB29128@cbox>
On 10/10/15 15:52, Christoffer Dall wrote:
> Hi Suzuki,
Hi Christoffer,
Thanks for being patient enough to review the code :-) without much of
the comments. I now realise there needs much more documentation than
what I have put in already. I am taking care of this in the next
revision already.
> I had to refresh my mind a fair bit to be able to review this, so I
> thought it may be useful to just remind us all what the constraints of
> this whole thing is, and make sure we agree on this:
>
> 1. We fix the IPA max width to 40 bits
> 2. We don't support systems with a PARange smaller than 40 bits (do we
> check this anywhere or document this anywhere?)
AFAIT, no we don't check it anywhere. May be we should. We could plug this
into my CPU feature infrastructure[1] and let the is_hype_mode_available()
use the info to decide if we can support 40bit IPA ?
> 3. We always assume we are running on a system with PARange of 40 bits
> and we are therefore constrained to use concatination.
>
> As an implication of (3) above, this code will attempt to allocate 256K
> of physically contiguous memory for each VM on the system. That is
> probably ok, but I just wanted to point it out in case it raises any
> eyebrows for other people following this thread.
Right, I will document this in a comment.
>> level: 0 1 2 3
>> bits : [47] [46 - 36] [35 - 25] [24 - 14] [13 - 0]
>> ^ ^ ^
>> | | |
>> host entry | x---- stage-2 entry
>> |
>> IPA -----x
>
> Isn't the stage-2 entry using bits [39:25], because you resolve
> more than 11 bits on the initial level of lookup when you concatenate
> tables?
Yes, the stage-2 entry is just supposed to show the entry level (2).
>>
>> The following conditions hold true for all cases(with 40bit IPA)
>> 1) The stage-2 entry level <= 2
>> 2) Number of fake page-table entries is in the inclusive range [0, 2].
>
> nit: Number of fake levels of page tables
Correct, I have fixed it already.
>> +/*
>> + * At stage-2 entry level, upto 16 tables can be concatenated and
>
> nit: Can you rewrite the first part of this comment to be in line with
> the ARM ARM, such as: "The stage-2 page tables can concatenate up to 16
> tables at the inital level" ?
Yes, will do it.
>
>
>> + * the hardware expects us to use concatenation, whenever possible.
>
> I think the 'hardware expects us' is a bit vague. At least I find this
> whole part of the architecture incredibly confusing already, so it would
> help me in the future if we put something like:
>
> "The hardware requires that we use concatenation depending on the
> supported PARange and page size. We always assume the hardware's PASize
> is maximum 40 bits in this context, and with a fixed IPA width of 40
> bits, we concatenate 2 tables for 4K pages, 16 tables for 16K pages, and
> do not use concatenation for 64K pages."
>
> Did I get this right?
You are right. The rule is simple. Upto 16 tables can be concatenated at
the stage-2 entry level.
>
>> + * So, number of page table levels for KVM_PHYS_SHIFT is always
>> + * the number of normal page table levels for (KVM_PHYS_SHIFT - 4).
>> + */
>> +#define HYP_PGTABLE_LEVELS ARM64_HW_PGTABLE_LEVELS(KVM_PHYS_SHIFT - 4)
>
> I see the math lines up, but I don't think it's intuitive, as I don't
> understand why it's obvious that it's the 'normal' page table for
> KVM_PHYS_SHIFT - 4.
Because, we can concatenate upto 16 page table entries. With the current
set of page sizes the above 'magic' formula works out. But yes, the following
suggestion makes more sense.
>
> I see this as an architectural limitation given in the ARM ARM, and we
> should just refer to that, and do:
>
> #if PAGE_SHIFT == 12
> #define S2_PGTABLE_LEVELS 3
> #else
> #define S2_PGTABLE_LEVELS 2
> #endif
OK, we could do that.
>
>> +/* Number of bits normally addressed by HYP_PGTABLE_LEVELS */
>> +#define HYP_PGTABLE_SHIFT ARM64_HW_PGTABLE_LEVEL_SHIFT(HYP_PGTABLE_LEVELS + 1)
>> +#define HYP_PGDIR_SHIFT ARM64_HW_PGTABLE_LEVEL_SHIFT(HYP_PGTABLE_LEVELS)
>> +#define HYP_PGTABLE_ENTRY_LEVEL (4 - HYP_PGTABLE_LEVELS)
>
> We are introducing a huge number of defines here, which are all more or
> less opaque to anyone coming back to this code.
>
> I may be extraordinarily stupid, but I really need each define explained
> in a comment to be able to follow this code (those above and the
> S2_ENTRY_TABLES below).
No, you right. I need to document all the above properly, which I is something
I am in the middle of.
>
> I actually wonder from looking at this whole patch if we even want to go
> here. Maybe this is really the time to say that we should get rid of
> the dependency between the host page table layout and the stage-2 page
> table layout.
>
> Since the rest of this series looks pretty good, I'm wondering if you
> should just disable KVM in the config system if 16K pages is selected,
> and then you can move ahead with this series while we fix KVM properly?
I can send an updated version (which is in the test furnace) soon, so that
you can take a look ?
Suzuki
WARNING: multiple messages have this Message-ID (diff)
From: Suzuki.Poulose@arm.com (Suzuki K. Poulose)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH 13/15] arm64: kvm: Rewrite fake pgd handling
Date: Mon, 12 Oct 2015 10:55:24 +0100 [thread overview]
Message-ID: <561B838C.5090008@arm.com> (raw)
In-Reply-To: <20151010145227.GB29128@cbox>
On 10/10/15 15:52, Christoffer Dall wrote:
> Hi Suzuki,
Hi Christoffer,
Thanks for being patient enough to review the code :-) without much of
the comments. I now realise there needs much more documentation than
what I have put in already. I am taking care of this in the next
revision already.
> I had to refresh my mind a fair bit to be able to review this, so I
> thought it may be useful to just remind us all what the constraints of
> this whole thing is, and make sure we agree on this:
>
> 1. We fix the IPA max width to 40 bits
> 2. We don't support systems with a PARange smaller than 40 bits (do we
> check this anywhere or document this anywhere?)
AFAIT, no we don't check it anywhere. May be we should. We could plug this
into my CPU feature infrastructure[1] and let the is_hype_mode_available()
use the info to decide if we can support 40bit IPA ?
> 3. We always assume we are running on a system with PARange of 40 bits
> and we are therefore constrained to use concatination.
>
> As an implication of (3) above, this code will attempt to allocate 256K
> of physically contiguous memory for each VM on the system. That is
> probably ok, but I just wanted to point it out in case it raises any
> eyebrows for other people following this thread.
Right, I will document this in a comment.
>> level: 0 1 2 3
>> bits : [47] [46 - 36] [35 - 25] [24 - 14] [13 - 0]
>> ^ ^ ^
>> | | |
>> host entry | x---- stage-2 entry
>> |
>> IPA -----x
>
> Isn't the stage-2 entry using bits [39:25], because you resolve
> more than 11 bits on the initial level of lookup when you concatenate
> tables?
Yes, the stage-2 entry is just supposed to show the entry level (2).
>>
>> The following conditions hold true for all cases(with 40bit IPA)
>> 1) The stage-2 entry level <= 2
>> 2) Number of fake page-table entries is in the inclusive range [0, 2].
>
> nit: Number of fake levels of page tables
Correct, I have fixed it already.
>> +/*
>> + * At stage-2 entry level, upto 16 tables can be concatenated and
>
> nit: Can you rewrite the first part of this comment to be in line with
> the ARM ARM, such as: "The stage-2 page tables can concatenate up to 16
> tables at the inital level" ?
Yes, will do it.
>
>
>> + * the hardware expects us to use concatenation, whenever possible.
>
> I think the 'hardware expects us' is a bit vague. At least I find this
> whole part of the architecture incredibly confusing already, so it would
> help me in the future if we put something like:
>
> "The hardware requires that we use concatenation depending on the
> supported PARange and page size. We always assume the hardware's PASize
> is maximum 40 bits in this context, and with a fixed IPA width of 40
> bits, we concatenate 2 tables for 4K pages, 16 tables for 16K pages, and
> do not use concatenation for 64K pages."
>
> Did I get this right?
You are right. The rule is simple. Upto 16 tables can be concatenated at
the stage-2 entry level.
>
>> + * So, number of page table levels for KVM_PHYS_SHIFT is always
>> + * the number of normal page table levels for (KVM_PHYS_SHIFT - 4).
>> + */
>> +#define HYP_PGTABLE_LEVELS ARM64_HW_PGTABLE_LEVELS(KVM_PHYS_SHIFT - 4)
>
> I see the math lines up, but I don't think it's intuitive, as I don't
> understand why it's obvious that it's the 'normal' page table for
> KVM_PHYS_SHIFT - 4.
Because, we can concatenate upto 16 page table entries. With the current
set of page sizes the above 'magic' formula works out. But yes, the following
suggestion makes more sense.
>
> I see this as an architectural limitation given in the ARM ARM, and we
> should just refer to that, and do:
>
> #if PAGE_SHIFT == 12
> #define S2_PGTABLE_LEVELS 3
> #else
> #define S2_PGTABLE_LEVELS 2
> #endif
OK, we could do that.
>
>> +/* Number of bits normally addressed by HYP_PGTABLE_LEVELS */
>> +#define HYP_PGTABLE_SHIFT ARM64_HW_PGTABLE_LEVEL_SHIFT(HYP_PGTABLE_LEVELS + 1)
>> +#define HYP_PGDIR_SHIFT ARM64_HW_PGTABLE_LEVEL_SHIFT(HYP_PGTABLE_LEVELS)
>> +#define HYP_PGTABLE_ENTRY_LEVEL (4 - HYP_PGTABLE_LEVELS)
>
> We are introducing a huge number of defines here, which are all more or
> less opaque to anyone coming back to this code.
>
> I may be extraordinarily stupid, but I really need each define explained
> in a comment to be able to follow this code (those above and the
> S2_ENTRY_TABLES below).
No, you right. I need to document all the above properly, which I is something
I am in the middle of.
>
> I actually wonder from looking at this whole patch if we even want to go
> here. Maybe this is really the time to say that we should get rid of
> the dependency between the host page table layout and the stage-2 page
> table layout.
>
> Since the rest of this series looks pretty good, I'm wondering if you
> should just disable KVM in the config system if 16K pages is selected,
> and then you can move ahead with this series while we fix KVM properly?
I can send an updated version (which is in the test furnace) soon, so that
you can take a look ?
Suzuki
next prev parent reply other threads:[~2015-10-12 9:55 UTC|newest]
Thread overview: 121+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-09-15 15:41 [PATCHv2 00/15] arm64: 16K translation granule support Suzuki K. Poulose
2015-09-15 15:41 ` Suzuki K. Poulose
2015-09-15 15:41 ` Suzuki K. Poulose
2015-09-15 15:41 ` [PATCH 01/15] arm64: Move swapper pagetable definitions Suzuki K. Poulose
2015-09-15 15:41 ` Suzuki K. Poulose
2015-09-15 15:41 ` Suzuki K. Poulose
2015-09-15 15:41 ` [PATCH 02/15] arm64: Handle section maps for swapper/idmap Suzuki K. Poulose
2015-09-15 15:41 ` Suzuki K. Poulose
2015-09-15 15:41 ` [PATCH 03/15] arm64: Introduce helpers for page table levels Suzuki K. Poulose
2015-09-15 15:41 ` Suzuki K. Poulose
2015-09-15 15:41 ` Suzuki K. Poulose
2015-10-07 8:26 ` Christoffer Dall
2015-10-07 8:26 ` Christoffer Dall
2015-10-07 8:26 ` Christoffer Dall
2015-10-07 9:26 ` Marc Zyngier
2015-10-07 9:26 ` Marc Zyngier
2015-10-07 9:26 ` Marc Zyngier
2015-10-07 9:48 ` Suzuki K. Poulose
2015-10-07 9:48 ` Suzuki K. Poulose
2015-10-07 9:48 ` Suzuki K. Poulose
2015-10-08 14:45 ` Christoffer Dall
2015-10-08 14:45 ` Christoffer Dall
2015-10-08 14:45 ` Christoffer Dall
2015-10-08 17:22 ` Suzuki K. Poulose
2015-10-08 17:22 ` Suzuki K. Poulose
2015-10-08 17:28 ` Catalin Marinas
2015-10-08 17:28 ` Catalin Marinas
2015-10-09 9:22 ` Suzuki K. Poulose
2015-10-09 9:22 ` Suzuki K. Poulose
2015-10-09 9:22 ` Suzuki K. Poulose
2015-10-07 9:51 ` Suzuki K. Poulose
2015-10-07 9:51 ` Suzuki K. Poulose
2015-10-07 9:51 ` Suzuki K. Poulose
2015-09-15 15:41 ` [PATCH 04/15] arm64: Calculate size for idmap_pg_dir at compile time Suzuki K. Poulose
2015-09-15 15:41 ` Suzuki K. Poulose
2015-09-15 15:41 ` Suzuki K. Poulose
2015-09-15 15:41 ` [PATCH 05/15] arm64: Handle 4 level page table for swapper Suzuki K. Poulose
2015-09-15 15:41 ` Suzuki K. Poulose
2015-09-15 15:41 ` [PATCH 06/15] arm64: Clean config usages for page size Suzuki K. Poulose
2015-09-15 15:41 ` Suzuki K. Poulose
2015-09-15 15:41 ` Suzuki K. Poulose
2015-09-15 15:41 ` [PATCH 07/15] arm64: Kconfig: Fix help text about AArch32 support with 64K pages Suzuki K. Poulose
2015-09-15 15:41 ` Suzuki K. Poulose
2015-09-15 15:41 ` Suzuki K. Poulose
2015-09-15 15:41 ` [PATCH 08/15] arm64: Check for selected granule support Suzuki K. Poulose
2015-09-15 15:41 ` Suzuki K. Poulose
2015-09-15 15:41 ` Suzuki K. Poulose
2015-09-15 15:41 ` [PATCH 09/15] arm64: Add page size to the kernel image header Suzuki K. Poulose
2015-09-15 15:41 ` Suzuki K. Poulose
2015-09-15 15:41 ` Suzuki K. Poulose
2015-10-02 15:49 ` Catalin Marinas
2015-10-02 15:49 ` Catalin Marinas
2015-10-02 15:49 ` Catalin Marinas
2015-10-02 16:31 ` Catalin Marinas
2015-10-02 16:31 ` Catalin Marinas
2015-10-02 16:50 ` Marc Zyngier
2015-10-02 16:50 ` Marc Zyngier
2015-10-02 16:50 ` Marc Zyngier
2015-10-05 15:43 ` Christoffer Dall
2015-10-05 15:43 ` Christoffer Dall
2015-10-05 13:02 ` Suzuki K. Poulose
2015-10-05 13:02 ` Suzuki K. Poulose
2015-10-05 13:02 ` Suzuki K. Poulose
2015-10-05 13:22 ` Ard Biesheuvel
2015-10-05 13:22 ` Ard Biesheuvel
2015-10-05 13:22 ` Ard Biesheuvel
2015-10-10 17:22 ` Christoffer Dall
2015-10-10 17:22 ` Christoffer Dall
2015-10-10 17:22 ` Christoffer Dall
2015-09-15 15:41 ` [PATCH 10/15] arm64: kvm: Fix {V}TCR_EL2_TG0 mask Suzuki K. Poulose
2015-09-15 15:41 ` Suzuki K. Poulose
2015-09-15 15:41 ` Suzuki K. Poulose
2015-10-08 15:17 ` Christoffer Dall
2015-10-08 15:17 ` Christoffer Dall
2015-10-08 15:17 ` Christoffer Dall
2015-09-15 15:41 ` [PATCH 11/15] arm64: Cleanup VTCR_EL2 computation Suzuki K. Poulose
2015-09-15 15:41 ` Suzuki K. Poulose
2015-09-15 15:41 ` Suzuki K. Poulose
2015-10-07 10:11 ` Marc Zyngier
2015-10-07 10:11 ` Marc Zyngier
2015-10-07 10:23 ` Suzuki K. Poulose
2015-10-07 10:23 ` Suzuki K. Poulose
2015-10-07 10:23 ` Suzuki K. Poulose
2015-10-10 17:22 ` Christoffer Dall
2015-10-10 17:22 ` Christoffer Dall
2015-10-10 17:22 ` Christoffer Dall
2015-09-15 15:41 ` [PATCH 12/15] arm: kvm: Move fake PGD handling to arch specific files Suzuki K. Poulose
2015-09-15 15:41 ` Suzuki K. Poulose
2015-09-15 15:41 ` Suzuki K. Poulose
2015-10-07 10:23 ` Marc Zyngier
2015-10-07 10:23 ` Marc Zyngier
2015-10-07 10:23 ` Marc Zyngier
2015-10-10 17:22 ` Christoffer Dall
2015-10-10 17:22 ` Christoffer Dall
2015-10-10 17:22 ` Christoffer Dall
2015-09-15 15:41 ` [PATCH 13/15] arm64: kvm: Rewrite fake pgd handling Suzuki K. Poulose
2015-09-15 15:41 ` Suzuki K. Poulose
2015-09-15 15:41 ` Suzuki K. Poulose
2015-10-07 11:13 ` Marc Zyngier
2015-10-07 11:13 ` Marc Zyngier
2015-10-07 11:13 ` Marc Zyngier
2015-10-07 12:21 ` Suzuki K. Poulose
2015-10-07 12:21 ` Suzuki K. Poulose
2015-10-07 12:21 ` Suzuki K. Poulose
2015-10-10 14:52 ` Christoffer Dall
2015-10-10 14:52 ` Christoffer Dall
2015-10-10 14:52 ` Christoffer Dall
2015-10-12 9:55 ` Suzuki K. Poulose [this message]
2015-10-12 9:55 ` Suzuki K. Poulose
2015-10-13 15:39 ` Christoffer Dall
2015-10-13 15:39 ` Christoffer Dall
2015-10-13 15:39 ` Christoffer Dall
2015-10-13 16:04 ` Suzuki K. Poulose
2015-10-13 16:04 ` Suzuki K. Poulose
2015-10-13 16:04 ` Suzuki K. Poulose
2015-09-15 15:41 ` [PATCH 14/15] arm64: Add 16K page size support Suzuki K. Poulose
2015-09-15 15:41 ` Suzuki K. Poulose
2015-09-15 15:41 ` Suzuki K. Poulose
2015-09-15 15:41 ` [PATCH 15/15] arm64: 36 bit VA Suzuki K. Poulose
2015-09-15 15:41 ` Suzuki K. Poulose
2015-09-15 15:41 ` Suzuki K. Poulose
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=561B838C.5090008@arm.com \
--to=suzuki.poulose@arm.com \
--cc=Catalin.Marinas@arm.com \
--cc=Marc.Zyngier@arm.com \
--cc=Mark.Rutland@arm.com \
--cc=Will.Deacon@arm.com \
--cc=ard.biesheuvel@linaro.org \
--cc=christoffer.dall@linaro.org \
--cc=kvm@vger.kernel.org \
--cc=kvmarm@lists.cs.columbia.edu \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.