From mboxrd@z Thu Jan 1 00:00:00 1970 From: Stephen Warren Date: Fri, 26 Feb 2016 12:25:52 -0700 Subject: [U-Boot] [PATCH 02/10] arm64: Make full va map code more dynamic In-Reply-To: <56CF2D7B.9090101@suse.de> References: <1456315904-113924-1-git-send-email-agraf@suse.de> <1456315904-113924-3-git-send-email-agraf@suse.de> <56CDF305.1020302@wwwdotorg.org> <56CF2D7B.9090101@suse.de> Message-ID: <56D0A6C0.6040404@wwwdotorg.org> List-Id: MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit To: u-boot@lists.denx.de On 02/25/2016 09:36 AM, Alexander Graf wrote: > > > On 24.02.16 19:14, Stephen Warren wrote: >> On 02/24/2016 05:11 AM, Alexander Graf wrote: >>> The idea to generate our pages tables from an array of memory ranges >>> is very sound. However, instead of hard coding the code to create up >>> to 2 levels of 64k granule page tables, we really should just create >>> normal 4k page tables that allow us to set caching attributes on 2M >>> or 4k level later on. >>> >>> So this patch moves the full_va mapping code to 4k page size and >>> makes it fully flexible to dynamically create as many levels as >>> necessary for a map (including dynamic 1G/2M pages). It also adds >>> support to dynamically split a large map into smaller ones when >>> some code wants to set dcache attributes. >>> >>> With all this in place, there is very little reason to create your >>> own page tables in board specific files. >>> >>> Signed-off-by: Alexander Graf >> >>> +/* >>> + * This is a recursively called function to count the number of >>> + * page tables we need to cover a particular PTE range. If you >>> + * call this with level = -1 you basically get the full 48 bit >>> + * coverage. >>> + */ >>> +static int count_required_pts(u64 addr, int level, u64 maxaddr) >> >> I think this looks correct now. Nits below if a respin is needed for >> other reasons. >> >>> +{ >>> + int levelshift = level2shift(level); >>> + u64 levelsize = 1ULL << levelshift; >>> + u64 levelmask = levelsize - 1; >>> + u64 levelend = addr + levelsize; >>> + int r = 0; >>> + int i; >>> + bool is_level = false; >> >> I might suggest renaming that is_level_needed. It's not obvious to me >> exactly what the name "is_level" is intended to represent; the name >> seems to represent whether something (unspecified) is a level or not. > > We're basically asking the function whether a PTE for address at > level would be an inval/block/level PTE. is_level marks it as a > level pte. > > I could maybe rename this into pte_type and create an enum that is > either PTE_INVAL, PTE_BLOCK or PTE_LEVEL. > >> >>> + >>> + for (i = 0; i < ARRAY_SIZE(mem_map); i++) { >>> + struct mm_region *map = &mem_map[i]; >>> + u64 start = map->base; >>> + u64 end = start + map->size; >>> + >>> + /* Check if the PTE would overlap with the map */ >>> + if (max(addr, start) <= min(levelend, end)) { >>> + start = max(addr, start); >>> + end = min(levelend, end); >>> + >>> + /* We need a sub-pt for this level */ >>> + if ((start & levelmask) || (end & levelmask)) { >>> + is_level = true; >>> + break; >>> + } >>> + >>> + /* Lv0 can not do block PTEs, so do levels here too */ >>> + if (level <= 0) { >>> + is_level = true; >>> + break; >>> + } >>> + } >>> + } >>> + >>> + /* >>> + * Block PTEs at this level are already covered by the parent page >>> + * table, so we only need to count sub page tables. >>> + */ >>> + if (is_level) { >>> + int sublevel = level + 1; >>> + u64 sublevelsize = 1ULL << level2shift(sublevel); >>> + >>> + /* Account for the new sub page table ... */ >>> + r = 1; >> >> "Account for the page table at /this/ level"? This may represent the >> top-level (0/1) page table, not just sub page tables. The sub-level >> accounting is via the recursive call to count_required_pts() I believe. > > I think the easiest way to visualize what's going on is if we start with > level -1. > > We basically ask the function at level -1 whether a PTE at level -1 (48 > bits) would fit into a block PTE at level -1 (so the PTE spans all 48 > bits) or whether we need to create a new level entry plus page table for it. Hmm, I had overlooked that the call to count_required_pts() passed "start_level - 1" not "start_level". Now that I see that, your explanation makes more sense to me. It'd be nice if some/all of this explanation were comments in the code to aid readers. That said, I would have expected a slightly more direct calculation; why not: int start_level = 0; // or 1 if small VA space total_pts = count_required_pts(start_level); total_pts += 4; // "random" number to account for later splits int count_required_pts(int level, ...) { int npts = 1; // the table at this level for each pte slot at this level: if a mem_map entry starts/stops within the pte slot: npts += count_required_pts(level + 1, ...); else: // nothing; pte is either a block or inval at this level return npts; } Still, the current code appears to work find, and can be ammended later if we want. > Obviously for all entries this resolves into a "yes, create a level PTE > entry and a new page table to point to". So at level=-1 we account for > the root page table which really is a "sub page table" here. > > Then we recursively go through the address space that we cover, checking > whether a page fits into inval/block PTEs or we need to create page > tables for it as well. > > So at level=0, we really are checking for PTEs that you'd have at > level=0 (38 bits). If the combination at level=0 results > in "level", we need to create a level=1 page table and walk through that > one again. > > I agree that the logic seems backwards, but it's the only thing that I > could come up with that has low memory footprint (required to run from > SRAM).