public inbox for u-boot@lists.denx.de
 help / color / mirror / Atom feed
From: Stephen Warren <swarren@wwwdotorg.org>
To: u-boot@lists.denx.de
Subject: [U-Boot] [PATCH 02/10] arm64: Make full va map code more dynamic
Date: Mon, 29 Feb 2016 09:52:21 -0700	[thread overview]
Message-ID: <56D47745.50905@wwwdotorg.org> (raw)
In-Reply-To: <56D191FE.80801@suse.de>

On 02/27/2016 05:09 AM, Alexander Graf wrote:
>
>
> On 26.02.16 20:25, Stephen Warren wrote:
>> On 02/25/2016 09:36 AM, Alexander Graf wrote:
>>>
>>>
>>> On 24.02.16 19:14, Stephen Warren wrote:
>>>> On 02/24/2016 05:11 AM, Alexander Graf wrote:
>>>>> The idea to generate our pages tables from an array of memory ranges
>>>>> is very sound. However, instead of hard coding the code to create up
>>>>> to 2 levels of 64k granule page tables, we really should just create
>>>>> normal 4k page tables that allow us to set caching attributes on 2M
>>>>> or 4k level later on.
>>>>>
>>>>> So this patch moves the full_va mapping code to 4k page size and
>>>>> makes it fully flexible to dynamically create as many levels as
>>>>> necessary for a map (including dynamic 1G/2M pages). It also adds
>>>>> support to dynamically split a large map into smaller ones when
>>>>> some code wants to set dcache attributes.
>>>>>
>>>>> With all this in place, there is very little reason to create your
>>>>> own page tables in board specific files.
>>>>>
>>>>> Signed-off-by: Alexander Graf <agraf@suse.de>
>>>>
>>>>> +/*
>>>>> + * This is a recursively called function to count the number of
>>>>> + * page tables we need to cover a particular PTE range. If you
>>>>> + * call this with level = -1 you basically get the full 48 bit
>>>>> + * coverage.
>>>>> + */
>>>>> +static int count_required_pts(u64 addr, int level, u64 maxaddr)
>>>>
>>>> I think this looks correct now. Nits below if a respin is needed for
>>>> other reasons.
>>>>
>>>>> +{
>>>>> +    int levelshift = level2shift(level);
>>>>> +    u64 levelsize = 1ULL << levelshift;
>>>>> +    u64 levelmask = levelsize - 1;
>>>>> +    u64 levelend = addr + levelsize;
>>>>> +    int r = 0;
>>>>> +    int i;
>>>>> +    bool is_level = false;
>>>>
>>>> I might suggest renaming that is_level_needed. It's not obvious to me
>>>> exactly what the name "is_level" is intended to represent; the name
>>>> seems to represent whether something (unspecified) is a level or not.
>>>
>>> We're basically asking the function whether a PTE for address <addr> at
>>> level <level> would be an inval/block/level PTE. is_level marks it as a
>>> level pte.
>>>
>>> I could maybe rename this into pte_type and create an enum that is
>>> either PTE_INVAL, PTE_BLOCK or PTE_LEVEL.
>>>
>>>>
>>>>> +
>>>>> +    for (i = 0; i < ARRAY_SIZE(mem_map); i++) {
>>>>> +        struct mm_region *map = &mem_map[i];
>>>>> +        u64 start = map->base;
>>>>> +        u64 end = start + map->size;
>>>>> +
>>>>> +        /* Check if the PTE would overlap with the map */
>>>>> +        if (max(addr, start) <= min(levelend, end)) {
>>>>> +            start = max(addr, start);
>>>>> +            end = min(levelend, end);
>>>>> +
>>>>> +            /* We need a sub-pt for this level */
>>>>> +            if ((start & levelmask) || (end & levelmask)) {
>>>>> +                is_level = true;
>>>>> +                break;
>>>>> +            }
>>>>> +
>>>>> +            /* Lv0 can not do block PTEs, so do levels here too */
>>>>> +            if (level <= 0) {
>>>>> +                is_level = true;
>>>>> +                break;
>>>>> +            }
>>>>> +        }
>>>>> +    }
>>>>> +
>>>>> +    /*
>>>>> +     * Block PTEs at this level are already covered by the parent page
>>>>> +     * table, so we only need to count sub page tables.
>>>>> +     */
>>>>> +    if (is_level) {
>>>>> +        int sublevel = level + 1;
>>>>> +        u64 sublevelsize = 1ULL << level2shift(sublevel);
>>>>> +
>>>>> +        /* Account for the new sub page table ... */
>>>>> +        r = 1;
>>>>
>>>> "Account for the page table at /this/ level"? This may represent the
>>>> top-level (0/1) page table, not just sub page tables. The sub-level
>>>> accounting is via the recursive call to count_required_pts() I believe.
>>>
>>> I think the easiest way to visualize what's going on is if we start with
>>> level -1.
>>>
>>> We basically ask the function at level -1 whether a PTE at level -1 (48
>>> bits) would fit into a block PTE at level -1 (so the PTE spans all 48
>>> bits) or whether we need to create a new level entry plus page table
>>> for it.
>>
>> Hmm, I had overlooked that the call to count_required_pts() passed
>> "start_level - 1" not "start_level". Now that I see that, your
>> explanation makes more sense to me.
>>
>> It'd be nice if some/all of this explanation were comments in the code
>> to aid readers.
>>
>> That said, I would have expected a slightly more direct calculation; why
>> not:
>>
>> int start_level = 0; // or 1 if small VA space
>> total_pts = count_required_pts(start_level);
>
> We need to account for the count twice, to have an (immutable) copy of
> our page tables around when we split them.

I think that's just a hard-coded "* 2" there, unless I'm missing your point?

>> total_pts += 4; // "random" number to account for later splits
>>
>> int count_required_pts(int level, ...) {
>
> ... includes the address, as I have it today, I guess?
>
>>      int npts = 1; // the table at this level
>>      for each pte slot at this level:
>>          if a mem_map entry starts/stops within the pte slot:
>>              npts += count_required_pts(level + 1, ...);
>
> This means we would count some levels multiple times. Imagine two
> entries from
>
> 	1G - 1.25G
> 	1.25G - 1.5G
>
> With your pseudo-code, we would count for both cases while checking for
> 1G page table entries. Hence we need to jump out of the loop here and do
> the counting outside.

I don't think so; "if a mem_map entry starts/stops within the pte slot" 
is a single decision for that PT entry at the current level, not a loop 
that recurses once per memory block. Hence, you'd only ever recurse 
once, so there's no double-counting.

>
> At which point this is basically my code, right? :)
>
>
> Alex
>
>>          else:
>>              // nothing; pte is either a block or inval at this level
>>      return npts;
>> }
>>
>> Still, the current code appears to work find, and can be ammended later
>> if we want.
>

  reply	other threads:[~2016-02-29 16:52 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-02-24 12:11 [U-Boot] [PATCH 00/10] arm64: Unify MMU code Alexander Graf
2016-02-24 12:11 ` [U-Boot] [PATCH 01/10] thunderx: Calculate TCR dynamically Alexander Graf
2016-02-24 13:37   ` Mark Rutland
2016-02-24 17:39     ` Alexander Graf
2016-02-25 11:58       ` Mark Rutland
2016-02-24 12:11 ` [U-Boot] [PATCH 02/10] arm64: Make full va map code more dynamic Alexander Graf
2016-02-24 14:52   ` Mark Rutland
2016-02-24 18:14   ` Stephen Warren
2016-02-25 16:36     ` Alexander Graf
2016-02-26 19:25       ` Stephen Warren
2016-02-27 12:09         ` Alexander Graf
2016-02-29 16:52           ` Stephen Warren [this message]
2016-02-29 21:05             ` Alexander Graf
2016-02-24 12:11 ` [U-Boot] [PATCH 03/10] thunderx: Move mmu table into board file Alexander Graf
2016-02-24 12:11 ` [U-Boot] [PATCH 04/10] zymqmp: Replace home grown mmu code with generic table approach Alexander Graf
2016-02-24 12:11 ` [U-Boot] [PATCH 05/10] tegra: " Alexander Graf
2016-02-24 17:49   ` Stephen Warren
2016-02-24 12:11 ` [U-Boot] [PATCH 06/10] vexpress64: Add MMU tables Alexander Graf
2016-02-24 12:11 ` [U-Boot] [PATCH 07/10] dwmmc: Increase retry timeout Alexander Graf
2016-02-24 12:11 ` [U-Boot] [PATCH 08/10] hikey: Add MMU tables Alexander Graf
2016-02-24 12:11 ` [U-Boot] [PATCH 09/10] arm64: Remove non-full-va map code Alexander Graf
2016-02-24 12:11 ` [U-Boot] [PATCH 10/10] arm64: Only allow dcache disabled in SPL builds Alexander Graf
2016-02-24 12:19 ` [U-Boot] [PATCH 00/10] arm64: Unify MMU code Michal Simek
2016-02-24 13:36   ` Alexander Graf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=56D47745.50905@wwwdotorg.org \
    --to=swarren@wwwdotorg.org \
    --cc=u-boot@lists.denx.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox