From: Florian Fainelli <florian.fainelli@broadcom.com>
To: Elad Nachman <enachman@marvell.com>, Will Deacon <will@kernel.org>
Cc: "catalin.marinas@arm.com" <catalin.marinas@arm.com>,
"thunder.leizhen@huawei.com" <thunder.leizhen@huawei.com>,
"bhe@redhat.com" <bhe@redhat.com>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"yajun.deng@linux.dev" <yajun.deng@linux.dev>,
"chris.zjh@huawei.com" <chris.zjh@huawei.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-arm-kernel@lists.infradead.org"
<linux-arm-kernel@lists.infradead.org>
Subject: Re: [EXT] Re: [PATCH] arm64: mm: Fix SOCs with DDR starting above zero
Date: Thu, 4 Jan 2024 10:28:38 -0800 [thread overview]
Message-ID: <cfbfe416-5499-4a4f-9cf2-eacf68359072@broadcom.com> (raw)
In-Reply-To: <BN9PR18MB4251A62F3C57A8C6910C00F5DB672@BN9PR18MB4251.namprd18.prod.outlook.com>
[-- Attachment #1.1: Type: text/plain, Size: 7709 bytes --]
On 1/4/24 04:38, Elad Nachman wrote:
>
>
>> -----Original Message-----
>> From: Florian Fainelli <florian.fainelli@broadcom.com>
>> Sent: Wednesday, January 3, 2024 8:32 PM
>> To: Will Deacon <will@kernel.org>; Elad Nachman
>> <enachman@marvell.com>
>> Cc: catalin.marinas@arm.com; thunder.leizhen@huawei.com;
>> bhe@redhat.com; akpm@linux-foundation.org; yajun.deng@linux.dev;
>> chris.zjh@huawei.com; linux-kernel@vger.kernel.org; linux-arm-
>> kernel@lists.infradead.org
>> Subject: [EXT] Re: [PATCH] arm64: mm: Fix SOCs with DDR starting above
>> zero
>>
>> External Email
>>
>> ----------------------------------------------------------------------
>> On 1/3/24 09:45, Will Deacon wrote:
>>> On Wed, Jan 03, 2024 at 07:00:02PM +0200, Elad Nachman wrote:
>>>> From: Elad Nachman <enachman@marvell.com>
>>>>
>>>> Some SOCs, like the Marvell AC5/X/IM, have a combination
>>>> of DDR starting at 0x2_0000_0000 coupled with DMA controllers
>>>> limited to 31 and 32 bit of addressing.
>>>> This requires to properly arrange ZONE_DMA and ZONE_DMA32 for
>>>> these SOCs, so swiotlb and coherent DMA allocation would work
>>>> properly.
>>>> Change initialization so device tree dma zone bits are taken as
>>>> function of offset from DRAM start, and when calculating the
>>>> maximal zone physical RAM address for physical DDR starting above
>>>> 32-bit, combine the physical address start plus the zone mask
>>>> passed as parameter.
>>>> This creates the proper zone splitting for these SOCs:
>>>> 0..2GB for ZONE_DMA
>>>> 2GB..4GB for ZONE_DMA32
>>>> 4GB..8GB for ZONE_NORMAL
>>>>
>>>> Signed-off-by: Elad Nachman <enachman@marvell.com>
>>>> ---
>>>> arch/arm64/mm/init.c | 20 +++++++++++++++-----
>>>> 1 file changed, 15 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>>>> index 74c1db8ce271..8288c778916e 100644
>>>> --- a/arch/arm64/mm/init.c
>>>> +++ b/arch/arm64/mm/init.c
>>>> @@ -115,20 +115,21 @@ static void __init
>> arch_reserve_crashkernel(void)
>>>>
>>>> /*
>>>> * Return the maximum physical address for a zone accessible by the
>> given bits
>>>> - * limit. If DRAM starts above 32-bit, expand the zone to the maximum
>>>> - * available memory, otherwise cap it at 32-bit.
>>>> + * limit. If DRAM starts above 32-bit, expand the zone to the available
>> memory
>>>> + * start limited by the zone bits mask, otherwise cap it at 32-bit.
>>>> */
>>>> static phys_addr_t __init max_zone_phys(unsigned int zone_bits)
>>>> {
>>>> phys_addr_t zone_mask = DMA_BIT_MASK(zone_bits);
>>>> phys_addr_t phys_start = memblock_start_of_DRAM();
>>>> + phys_addr_t phys_end = memblock_end_of_DRAM();
>>>>
>>>> if (phys_start > U32_MAX)
>>>> - zone_mask = PHYS_ADDR_MAX;
>>>> + zone_mask = phys_start | zone_mask;
>>>> else if (phys_start > zone_mask)
>>>> zone_mask = U32_MAX;
>>>>
>>>> - return min(zone_mask, memblock_end_of_DRAM() - 1) + 1;
>>>> + return min(zone_mask, phys_end - 1) + 1;
>>>> }
>>>>
>>>> static void __init zone_sizes_init(void)
>>>> @@ -140,7 +141,16 @@ static void __init zone_sizes_init(void)
>>>>
>>>> #ifdef CONFIG_ZONE_DMA
>>>> acpi_zone_dma_bits =
>> fls64(acpi_iort_dma_get_max_cpu_address());
>>>> - dt_zone_dma_bits = fls64(of_dma_get_max_cpu_address(NULL));
>>>> + /*
>>>> + * When calculating the dma zone bits from the device tree, subtract
>>>> + * the DRAM start address, in case it does not start from address
>>>> + * zero. This way. we pass only the zone size related bits to
>>>> + * max_zone_phys(), which will add them to the base of the DRAM.
>>>> + * This prevents miscalculations on arm64 SOCs which combines
>>>> + * DDR starting above 4GB with memory controllers limited to
>>>> + * 32-bits or less:
>>>> + */
>>>> + dt_zone_dma_bits = fls64(of_dma_get_max_cpu_address(NULL) -
>> memblock_start_of_DRAM());
>>>> zone_dma_bits = min3(32U, dt_zone_dma_bits,
>> acpi_zone_dma_bits);
>>>> arm64_dma_phys_limit = max_zone_phys(zone_dma_bits);
>>>> max_zone_pfns[ZONE_DMA] = PFN_DOWN(arm64_dma_phys_limit);
>>>
>>> Hmm, I'm a bit worried this could regress other platforms since you seem
>>> to be assuming that DMA address 0 corresponds to the physical start of
>>> DRAM. What if that isn't the case?
>>
>> All of our most recent Set-top-box SoCs map DRAM starting at PA
>> 0x4000_0000 FWIW. We have the following memory maps:
>
> That is below 4GB
Right, I was just making sure that there would be no regressions with
non-zero base addresses for DRAM, not that there should be, but getting
Tested-by tags for such changes is always a good thing IMHO.
>
>>
>> 1. DRAM mapped at PA 0x0000_0000
>> 2. DRAM mapped at PA 0x4000_0000 with another memory controller
>> starting
>> at 0x3_0000_0000
>> 3. DRAM mapped at PA 0x4000_0000 with a single memory controller.
>>
>> Here is the before/after diff with debugging prints introduced to print
>> start, end, min, mask and the dt_zone_dma_bits value:
>>
>> Memory map 1. with 2GB -> no change in output
>>
>> Memory map 2. with 2+2GB -> no change in output
>>
>> Memory map 3. with 4GB:
>>
>> @@ -39,7 +39,7 @@
>> [ 0.000000] OF: reserved mem:
>> 0x00000000fdfff000..0x00000000fdffffff (4 KiB) nomap non-reusable
>> NWMBOX
>> [ 0.000000] OF: reserved mem:
>> 0x00000000fe000000..0x00000000ffffffff (32768 KiB) nomap non-reusable
>> SRR@fe000000
>> [ 0.000000] max_zone_phys: start: 0x0000000040000000, end:
>> 0x0000000140000000, min: 0x0000000100000000, mask: 0x00000000ffffffff
>> -[ 0.000000] zone_sizes_init: dt_zone_dma_bits: 0x00000021
>> +[ 0.000000] zone_sizes_init: dt_zone_dma_bits: 0x00000020
>> [ 0.000000] max_zone_phys: start: 0x0000000040000000, end:
>> 0x0000000140000000, min: 0x0000000100000000, mask: 0x00000000ffffffff
>> [ 0.000000] Zone ranges:
>> [ 0.000000] DMA [mem 0x0000000040000000-0x00000000ffffffff]
>> @@ -88,243 +88,243 @@
>>
>> Memory map 3. with 2GB:
>>
>> @@ -39,11 +39,11 @@
>> [ 0.000000] OF: reserved mem:
>> 0x00000000bdfff000..0x00000000bdffffff (4 KiB) nomap non-reusable
>> NWMBOX
>> [ 0.000000] OF: reserved mem:
>> 0x00000000be000000..0x00000000bfffffff (32768 KiB) nomap non-reusable
>> SRR@be000000
>> [ 0.000000] max_zone_phys: start: 0x0000000040000000, end:
>> 0x00000000c0000000, min: 0x00000000c0000000, mask: 0x00000000ffffffff
>> -[ 0.000000] zone_sizes_init: dt_zone_dma_bits: 0x00000020
>> -[ 0.000000] max_zone_phys: start: 0x0000000040000000, end:
>> 0x00000000c0000000, min: 0x00000000c0000000, mask: 0x00000000ffffffff
>> +[ 0.000000] zone_sizes_init: dt_zone_dma_bits: 0x0000001f
>> +[ 0.000000] max_zone_phys: start: 0x0000000040000000, end:
>> 0x00000000c0000000, min: 0x0000000080000000, mask: 0x000000007fffffff
>> [ 0.000000] Zone ranges:
>> -[ 0.000000] DMA [mem 0x0000000040000000-0x00000000bfffffff]
>> -[ 0.000000] DMA32 empty
>> +[ 0.000000] DMA [mem 0x0000000040000000-0x000000007fffffff]
>> +[ 0.000000] DMA32 [mem 0x0000000080000000-0x00000000bfffffff]
>> [ 0.000000] Normal empty
>> [ 0.000000] Movable zone start for each node
>> [ 0.000000] Early memory node ranges
>> @@ -54,7 +54,7 @@
>> --
>> Florian
>
> Do you have an example where this works when DDR memory starts above 4GB?
> The code in max_zone_phys() has separate if clauses for memory starting above 4GB,
> and for memory starting above zone mask but below 4GB...
I am afraid I don't have such a system handy. The one with 2+2GB with
the second 2GB starting at 12GB might be a tad difficult to start
without the first 2GB due to a number of critical reserved memory
regions being there.
--
Florian
[-- Attachment #1.2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4221 bytes --]
[-- Attachment #2: Type: text/plain, Size: 176 bytes --]
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
WARNING: multiple messages have this Message-ID (diff)
From: Florian Fainelli <florian.fainelli@broadcom.com>
To: Elad Nachman <enachman@marvell.com>, Will Deacon <will@kernel.org>
Cc: "catalin.marinas@arm.com" <catalin.marinas@arm.com>,
"thunder.leizhen@huawei.com" <thunder.leizhen@huawei.com>,
"bhe@redhat.com" <bhe@redhat.com>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"yajun.deng@linux.dev" <yajun.deng@linux.dev>,
"chris.zjh@huawei.com" <chris.zjh@huawei.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"linux-arm-kernel@lists.infradead.org"
<linux-arm-kernel@lists.infradead.org>
Subject: Re: [EXT] Re: [PATCH] arm64: mm: Fix SOCs with DDR starting above zero
Date: Thu, 4 Jan 2024 10:28:38 -0800 [thread overview]
Message-ID: <cfbfe416-5499-4a4f-9cf2-eacf68359072@broadcom.com> (raw)
In-Reply-To: <BN9PR18MB4251A62F3C57A8C6910C00F5DB672@BN9PR18MB4251.namprd18.prod.outlook.com>
[-- Attachment #1: Type: text/plain, Size: 7709 bytes --]
On 1/4/24 04:38, Elad Nachman wrote:
>
>
>> -----Original Message-----
>> From: Florian Fainelli <florian.fainelli@broadcom.com>
>> Sent: Wednesday, January 3, 2024 8:32 PM
>> To: Will Deacon <will@kernel.org>; Elad Nachman
>> <enachman@marvell.com>
>> Cc: catalin.marinas@arm.com; thunder.leizhen@huawei.com;
>> bhe@redhat.com; akpm@linux-foundation.org; yajun.deng@linux.dev;
>> chris.zjh@huawei.com; linux-kernel@vger.kernel.org; linux-arm-
>> kernel@lists.infradead.org
>> Subject: [EXT] Re: [PATCH] arm64: mm: Fix SOCs with DDR starting above
>> zero
>>
>> External Email
>>
>> ----------------------------------------------------------------------
>> On 1/3/24 09:45, Will Deacon wrote:
>>> On Wed, Jan 03, 2024 at 07:00:02PM +0200, Elad Nachman wrote:
>>>> From: Elad Nachman <enachman@marvell.com>
>>>>
>>>> Some SOCs, like the Marvell AC5/X/IM, have a combination
>>>> of DDR starting at 0x2_0000_0000 coupled with DMA controllers
>>>> limited to 31 and 32 bit of addressing.
>>>> This requires to properly arrange ZONE_DMA and ZONE_DMA32 for
>>>> these SOCs, so swiotlb and coherent DMA allocation would work
>>>> properly.
>>>> Change initialization so device tree dma zone bits are taken as
>>>> function of offset from DRAM start, and when calculating the
>>>> maximal zone physical RAM address for physical DDR starting above
>>>> 32-bit, combine the physical address start plus the zone mask
>>>> passed as parameter.
>>>> This creates the proper zone splitting for these SOCs:
>>>> 0..2GB for ZONE_DMA
>>>> 2GB..4GB for ZONE_DMA32
>>>> 4GB..8GB for ZONE_NORMAL
>>>>
>>>> Signed-off-by: Elad Nachman <enachman@marvell.com>
>>>> ---
>>>> arch/arm64/mm/init.c | 20 +++++++++++++++-----
>>>> 1 file changed, 15 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>>>> index 74c1db8ce271..8288c778916e 100644
>>>> --- a/arch/arm64/mm/init.c
>>>> +++ b/arch/arm64/mm/init.c
>>>> @@ -115,20 +115,21 @@ static void __init
>> arch_reserve_crashkernel(void)
>>>>
>>>> /*
>>>> * Return the maximum physical address for a zone accessible by the
>> given bits
>>>> - * limit. If DRAM starts above 32-bit, expand the zone to the maximum
>>>> - * available memory, otherwise cap it at 32-bit.
>>>> + * limit. If DRAM starts above 32-bit, expand the zone to the available
>> memory
>>>> + * start limited by the zone bits mask, otherwise cap it at 32-bit.
>>>> */
>>>> static phys_addr_t __init max_zone_phys(unsigned int zone_bits)
>>>> {
>>>> phys_addr_t zone_mask = DMA_BIT_MASK(zone_bits);
>>>> phys_addr_t phys_start = memblock_start_of_DRAM();
>>>> + phys_addr_t phys_end = memblock_end_of_DRAM();
>>>>
>>>> if (phys_start > U32_MAX)
>>>> - zone_mask = PHYS_ADDR_MAX;
>>>> + zone_mask = phys_start | zone_mask;
>>>> else if (phys_start > zone_mask)
>>>> zone_mask = U32_MAX;
>>>>
>>>> - return min(zone_mask, memblock_end_of_DRAM() - 1) + 1;
>>>> + return min(zone_mask, phys_end - 1) + 1;
>>>> }
>>>>
>>>> static void __init zone_sizes_init(void)
>>>> @@ -140,7 +141,16 @@ static void __init zone_sizes_init(void)
>>>>
>>>> #ifdef CONFIG_ZONE_DMA
>>>> acpi_zone_dma_bits =
>> fls64(acpi_iort_dma_get_max_cpu_address());
>>>> - dt_zone_dma_bits = fls64(of_dma_get_max_cpu_address(NULL));
>>>> + /*
>>>> + * When calculating the dma zone bits from the device tree, subtract
>>>> + * the DRAM start address, in case it does not start from address
>>>> + * zero. This way. we pass only the zone size related bits to
>>>> + * max_zone_phys(), which will add them to the base of the DRAM.
>>>> + * This prevents miscalculations on arm64 SOCs which combines
>>>> + * DDR starting above 4GB with memory controllers limited to
>>>> + * 32-bits or less:
>>>> + */
>>>> + dt_zone_dma_bits = fls64(of_dma_get_max_cpu_address(NULL) -
>> memblock_start_of_DRAM());
>>>> zone_dma_bits = min3(32U, dt_zone_dma_bits,
>> acpi_zone_dma_bits);
>>>> arm64_dma_phys_limit = max_zone_phys(zone_dma_bits);
>>>> max_zone_pfns[ZONE_DMA] = PFN_DOWN(arm64_dma_phys_limit);
>>>
>>> Hmm, I'm a bit worried this could regress other platforms since you seem
>>> to be assuming that DMA address 0 corresponds to the physical start of
>>> DRAM. What if that isn't the case?
>>
>> All of our most recent Set-top-box SoCs map DRAM starting at PA
>> 0x4000_0000 FWIW. We have the following memory maps:
>
> That is below 4GB
Right, I was just making sure that there would be no regressions with
non-zero base addresses for DRAM, not that there should be, but getting
Tested-by tags for such changes is always a good thing IMHO.
>
>>
>> 1. DRAM mapped at PA 0x0000_0000
>> 2. DRAM mapped at PA 0x4000_0000 with another memory controller
>> starting
>> at 0x3_0000_0000
>> 3. DRAM mapped at PA 0x4000_0000 with a single memory controller.
>>
>> Here is the before/after diff with debugging prints introduced to print
>> start, end, min, mask and the dt_zone_dma_bits value:
>>
>> Memory map 1. with 2GB -> no change in output
>>
>> Memory map 2. with 2+2GB -> no change in output
>>
>> Memory map 3. with 4GB:
>>
>> @@ -39,7 +39,7 @@
>> [ 0.000000] OF: reserved mem:
>> 0x00000000fdfff000..0x00000000fdffffff (4 KiB) nomap non-reusable
>> NWMBOX
>> [ 0.000000] OF: reserved mem:
>> 0x00000000fe000000..0x00000000ffffffff (32768 KiB) nomap non-reusable
>> SRR@fe000000
>> [ 0.000000] max_zone_phys: start: 0x0000000040000000, end:
>> 0x0000000140000000, min: 0x0000000100000000, mask: 0x00000000ffffffff
>> -[ 0.000000] zone_sizes_init: dt_zone_dma_bits: 0x00000021
>> +[ 0.000000] zone_sizes_init: dt_zone_dma_bits: 0x00000020
>> [ 0.000000] max_zone_phys: start: 0x0000000040000000, end:
>> 0x0000000140000000, min: 0x0000000100000000, mask: 0x00000000ffffffff
>> [ 0.000000] Zone ranges:
>> [ 0.000000] DMA [mem 0x0000000040000000-0x00000000ffffffff]
>> @@ -88,243 +88,243 @@
>>
>> Memory map 3. with 2GB:
>>
>> @@ -39,11 +39,11 @@
>> [ 0.000000] OF: reserved mem:
>> 0x00000000bdfff000..0x00000000bdffffff (4 KiB) nomap non-reusable
>> NWMBOX
>> [ 0.000000] OF: reserved mem:
>> 0x00000000be000000..0x00000000bfffffff (32768 KiB) nomap non-reusable
>> SRR@be000000
>> [ 0.000000] max_zone_phys: start: 0x0000000040000000, end:
>> 0x00000000c0000000, min: 0x00000000c0000000, mask: 0x00000000ffffffff
>> -[ 0.000000] zone_sizes_init: dt_zone_dma_bits: 0x00000020
>> -[ 0.000000] max_zone_phys: start: 0x0000000040000000, end:
>> 0x00000000c0000000, min: 0x00000000c0000000, mask: 0x00000000ffffffff
>> +[ 0.000000] zone_sizes_init: dt_zone_dma_bits: 0x0000001f
>> +[ 0.000000] max_zone_phys: start: 0x0000000040000000, end:
>> 0x00000000c0000000, min: 0x0000000080000000, mask: 0x000000007fffffff
>> [ 0.000000] Zone ranges:
>> -[ 0.000000] DMA [mem 0x0000000040000000-0x00000000bfffffff]
>> -[ 0.000000] DMA32 empty
>> +[ 0.000000] DMA [mem 0x0000000040000000-0x000000007fffffff]
>> +[ 0.000000] DMA32 [mem 0x0000000080000000-0x00000000bfffffff]
>> [ 0.000000] Normal empty
>> [ 0.000000] Movable zone start for each node
>> [ 0.000000] Early memory node ranges
>> @@ -54,7 +54,7 @@
>> --
>> Florian
>
> Do you have an example where this works when DDR memory starts above 4GB?
> The code in max_zone_phys() has separate if clauses for memory starting above 4GB,
> and for memory starting above zone mask but below 4GB...
I am afraid I don't have such a system handy. The one with 2+2GB with
the second 2GB starting at 12GB might be a tad difficult to start
without the first 2GB due to a number of critical reserved memory
regions being there.
--
Florian
[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4221 bytes --]
next prev parent reply other threads:[~2024-01-04 18:29 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-01-03 17:00 [PATCH] arm64: mm: Fix SOCs with DDR starting above zero Elad Nachman
2024-01-03 17:45 ` Will Deacon
2024-01-03 17:45 ` Will Deacon
2024-01-03 18:32 ` Florian Fainelli
2024-01-03 18:32 ` Florian Fainelli
2024-01-04 12:38 ` [EXT] " Elad Nachman
2024-01-04 12:38 ` Elad Nachman
2024-01-04 18:28 ` Florian Fainelli [this message]
2024-01-04 18:28 ` Florian Fainelli
2024-01-04 18:34 ` Elad Nachman
2024-01-04 18:34 ` Elad Nachman
2024-01-04 18:37 ` Florian Fainelli
2024-01-04 18:37 ` Florian Fainelli
2024-01-03 18:47 ` Baruch Siach
2024-01-03 18:47 ` Baruch Siach
2024-01-05 19:21 ` Catalin Marinas
2024-01-05 19:21 ` Catalin Marinas
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=cfbfe416-5499-4a4f-9cf2-eacf68359072@broadcom.com \
--to=florian.fainelli@broadcom.com \
--cc=akpm@linux-foundation.org \
--cc=bhe@redhat.com \
--cc=catalin.marinas@arm.com \
--cc=chris.zjh@huawei.com \
--cc=enachman@marvell.com \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=thunder.leizhen@huawei.com \
--cc=will@kernel.org \
--cc=yajun.deng@linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.