From: Alexander Graf <agraf@suse.de>
To: u-boot@lists.denx.de
Subject: [U-Boot] [PATCH 0/9] arm64: Unify MMU code
Date: Mon, 22 Feb 2016 21:09:41 +0100
Message-ID: <56CB6B05.5050400@suse.de>
In-Reply-To: <AM4PR0401MB1732C839BDED7CB929273E7C9AA30@AM4PR0401MB1732.eurprd04.prod.outlook.com>



On 22.02.16 20:52, york sun wrote:
> On 02/22/2016 11:42 AM, Alexander Graf wrote:
>>
>>
>> On 22.02.16 19:39, york sun wrote:
>>> On 02/22/2016 10:31 AM, Alexander Graf wrote:
>>>>
>>>> On Feb 22, 2016, at 7:12 PM, york sun <york.sun@nxp.com> wrote:
>>>>
>>>>> On 02/22/2016 10:02 AM, Alexander Graf wrote:
>>>>>>
>>>>>>
>>>>>>> Am 22.02.2016 um 18:37 schrieb york sun <york.sun@nxp.com>:
>>>>>>>
>>>>>>>> On 02/21/2016 05:57 PM, Alexander Graf wrote:
>>>>>>>> Howdy,
>>>>>>>>
>>>>>>>> Currently on arm64 there is a big pile of mess when it comes to MMU
>>>>>>>> support and page tables. Each board does its own little thing and the
>>>>>>>> generic code is pretty dumb and nobody actually uses it.
>>>>>>>>
>>>>>>>> This patch set tries to clean that up. After this series is applied,
>>>>>>>> all boards except for the FSL Layerscape ones are converted to the
>>>>>>>> new generic page table logic and have icache+dcache enabled.
>>>>>>>>
>>>>>>>> The new code always uses 4k page size. It dynamically allocates 1G or
>>>>>>>> 2M pages for ranges that fit. When a dcache attribute request comes in
>>>>>>>> that requires a smaller granularity than our previous allocation could
>>>>>>>> fulfill, pages get automatically split.
>>>>>>>>
>>>>>>>> I have tested and verified the code works on HiKey (bare metal),
>>>>>>>> vexpress64 (Foundation Model) and zynqmp (QEMU). The TX1 target is
>>>>>>>> untested, but given the simplicity of the maps I doubt it'll break.
>>>>>>>> ThunderX in theory should also work, but I haven't tested it. I would
>>>>>>>> be very happy if people with access to those systems could give the patch
>>>>>>>> set a try.
>>>>>>>>
>>>>>>>> With this we're a big step closer to a good base line for EFI payload
>>>>>>>> support, since we can now just require that all boards always have dcache
>>>>>>>> enabled.
>>>>>>>>
>>>>>>>> I would also be incredibly happy if some Freescale people could look
>>>>>>>> at their MMU code and try to unify it into the now cleaned up generic
>>>>>>>> code. I don't think we're far off here.
>>>>>>>
>>>>>>> Alex,
>>>>>>>
>>>>>>> Unified MMU will be great for all of us. The reason we started with our own MMU
>>>>>>> table was size and performance. I don't know much about other ARMv8 SoCs. For
>>>>>>> our use, we enable cache very early to speed up running, especially for
>>>>>>> pre-silicon development on emulators. We don't have DDR to use for the early
>>>>>>> stage and we have very limited on-chip SRAM. I believe we can use the unified
>>>>>>> structure for our 2nd stage MMU when DDR is up.
>>>>>>
>>>>>> Yup, and I think it should be fairly doable to move the early generation into the same table format - maybe even fully reuse the generic code.
>>>>>
>>>>> What's the size for the MMU tables? I think it may be simpler to use static
>>>>> tables for our early stage.
>>>>
>>>> The size is determined dynamically from the memory map using some code that (as Stephen found) is not 100% sound, but works well enough so far :).
>>>
>>> That's the part I can't live with. Since we have very limited on-chip RAM, we
>>> have to limit the size. But again, I do see the benefit of using a unified
>>> structure for the 2nd stage.
>>
>> I'm not quite sure I see how your current code works any differently.
>> While the code to determine the page table pool size is dynamic, the
>> outcome is static depending on your memory map. So the same memory map
>> always means the same page table pool size.
>>
>> We could also just hard code the size for the early phase for you I guess.
> 
> We can definitely try.
> 
>>
>>>
>>>>
>>>>>
>>>>>>
>>>>>> The thing that I tripped over while attempting conversion was that you don't always map phys==virt, unlike other boards, and I didn't fully understand why.
>>>>>>
>>>>> True. We have some complication on the address mapping. For compatibility, each
>>>>> device is mapped (partially) under 32-bit space. If the device is too large to
>>>>
>>>> Compatibility with what? Do we really need this in an AArch64 world?
>>>
>>> It's not up to me. The SoC was designed this way. By the way, this SoC can work
>>> in AArch32 mode.
>>
>> I think I'm slowly grasping what the problem is.
>>
>> The fact that the SoC can run in AArch32 mode doesn't actually make a
>> difference here though, since we're talking about U-Boot internal memory
>> maps. The only reason to keep things mapped reachable from 32bits is if
>> you want to run 32bit code with the U-Boot maps. I don't think you'd
>> want to do that, no? :)
> 
> I don't really want to run 32-bit code. My point is the SoC was designed that
> way. We have DDR under 32-bit space, and in high region. We have the same for
> flash controller where NOR is connected. Explained later below.
>>
>>>
>>>>
>>>> For 32bit code I can definitely understand why you'd want to have phys != virt. But in a pure 64bit world (which this target really is, no?) I see little benefit in it.
>>>>
>>>>> fit, the rest is mapped to high regions. I remember one particular case off the
>>>>> top of my head. It is the NOR flash we use for environment variables. U-Boot uses
>>>>> that address for saving, but also uses it for loading during boot. In our
>>>>> case, the NOR flash doesn't fit well in the low region, so it is remapped to the
>>>>> high region after booting. To make the environment variables accessible during
>>>>> boot, we mapped the high region phys to a different virt, so U-Boot doesn't have
>>>>> to know the low region address.
>>>>
>>>> I might be missing the obvious, but why can't the environment variables live in high regions?
>>>>
>>>
>>> It is in high region. But as I tried to explain, the default physical mapping of
>>> NOR flash (not MMU) is in low region out of reset.
>>
>> I see. So the problem is during the transitioning phase from uncached to
>> MMU enabled, where we'd end up at a different address.
> 
> Not exactly. We enable cache very early for a performance boost on emulators. It
> may sound trivial but it makes a big difference when debugging software on
> emulators. Since we still use emulators for new products, I am not ready to drop
> the early MMU approach.

I'm surprised it is that slow for you. Running the Foundation model
(which doesn't do early mmu FWIW) seemed to be fast enough.

> But you get the idea, the difference is before and after relocation. After
> U-Boot relocates itself into DDR, we remap the flash controller's physical
> address to the high region.
> 
>>
>> Could we just configure NOR to be in high memory in early asm init code,
>> then always use the high physical NOR address range and jump to it from
>> asm very early on? Then we could ignore the 32bit map and everything
>> could just stay 1:1 mapped.
>>
> 
> Out of reset, if booting from NOR flash, the flash controller is pre-configured
> to use low region address. We can only reprogram the controller when u-boot is
> not running on it.

I see, so you keep the low map alive until you make the switch-over to
DDR. Makes a lot of sense.

I guess I can give the conversion another stab now whenever I get a free
night :). If I understand you correctly we'd only need to do non-1:1
maps for the early code, right?

> I see you are trying to maintain the 1:1 mapping for MMU. Why so? I think the
> framework should allow different mapping.

Mostly for the sake of simplicity. It wouldn't be very difficult to
extend the logic to support setting of va != pa, but I find code vastly
easier to debug and understand if the address I see is the address I access.
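
To make the va != pa point concrete, here is a minimal sketch in C. It is
not the code from this series: struct map_region, pick_block_size() and
is_identity() are hypothetical names made up for illustration. It assumes
a 4k granule with 1G and 2M block mappings, as described earlier in the
thread, and shows that a non-1:1 map mainly adds an alignment check on the
physical address when choosing how large a block to map.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_4K (UINT64_C(1) << 12)
#define SZ_2M (UINT64_C(1) << 21)
#define SZ_1G (UINT64_C(1) << 30)

/* Hypothetical region descriptor; field names are illustrative only. */
struct map_region {
	uint64_t virt;	/* virtual start address */
	uint64_t phys;	/* physical start address */
	uint64_t size;	/* length in bytes */
};

/* A region is 1:1 mapped when the address you see is the address you access. */
static int is_identity(const struct map_region *r)
{
	return r->virt == r->phys;
}

/*
 * Return the largest block (1G, 2M or 4K) that can map the region at
 * offset 'off'. Both the virtual and the physical address have to be
 * aligned to the block size, and enough of the region must remain.
 * Returns 0 if not even a 4K page fits (caller should reject the region).
 */
static uint64_t pick_block_size(const struct map_region *r, uint64_t off)
{
	static const uint64_t sizes[] = { SZ_1G, SZ_2M, SZ_4K };
	uint64_t va, pa, left;
	unsigned int i;

	assert(off < r->size);
	va = r->virt + off;
	pa = r->phys + off;
	left = r->size - off;

	for (i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++) {
		uint64_t s = sizes[i];

		if (!(va & (s - 1)) && !(pa & (s - 1)) && left >= s)
			return s;
	}
	return 0;
}
```

With virt == phys the two alignment checks collapse into one, which is
part of why the 1:1 case is simpler to reason about. A region whose
physical base is only 2M aligned falls back to 2M blocks even if its
virtual base is 1G aligned.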


Alex
