From: Baruch Siach <baruch@tkos.co.il>
To: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: "Christoph Hellwig" <hch@lst.de>,
	"Catalin Marinas" <catalin.marinas@arm.com>,
	"Will Deacon" <will@kernel.org>,
	"Robin Murphy" <robin.murphy@arm.com>,
	iommu@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
	linux-s390@vger.kernel.org, "Petr Tesařík" <petr@tesarici.cz>,
	"Ramon Fried" <ramon@neureality.ai>,
	"Elad Nachman" <enachman@marvell.com>,
	linux-rockchip@lists.infradead.org
Subject: Re: [PATCH v6 RESED 1/2] dma: replace zone_dma_bits by zone_dma_limit
Date: Tue, 27 Aug 2024 07:52:27 +0300
Message-ID: <87mskyva7o.fsf@tarshish>
In-Reply-To: <53d988b1-bdce-422a-ae4e-158f305ad703@samsung.com> (Marek Szyprowski's message of "Mon, 26 Aug 2024 21:28:29 +0200")

Hi Marek,

Thanks for your report.

On Mon, Aug 26 2024, Marek Szyprowski wrote:
> On 11.08.2024 09:09, Baruch Siach wrote:
>> From: Catalin Marinas <catalin.marinas@arm.com>
>>
>> Hardware DMA limit might not be power of 2. When RAM range starts above
>> 0, say 4GB, DMA limit of 30 bits should end at 5GB. A single high bit
>> can not encode this limit.
>>
>> Use plain address for DMA zone limit.
>>
>> Since DMA zone can now potentially span beyond 4GB physical limit of
>> DMA32, make sure to use DMA zone for GFP_DMA32 allocations in that case.
>>
>> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
>> Co-developed-by: Baruch Siach <baruch@tkos.co.il>
>> Signed-off-by: Baruch Siach <baruch@tkos.co.il>
>> ---
>
> This patch landed recently in linux-next as commit ba0fb44aed47 
> ("dma-mapping: replace zone_dma_bits by zone_dma_limit"). During my 
> tests I found that it introduces the following warning on ARM64/Rockchip 
> based Odroid M1 board (arch/arm64/boot/dts/rockchip/rk3568-odroid-m1.dts):

Does this warning go away if you revert both 3be9b846896d and ba0fb44aed47?

Upstream Rockchip DTs have no dma-ranges property. Is that the case for
your platform as well?

Can you share the kernel log lines reporting the DMA zones and swiotlb?
On my platform I get:

[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x0000000800000000-0x000000083fffffff]
[    0.000000]   DMA32    empty
[    0.000000]   Normal   [mem 0x0000000840000000-0x0000000fffffffff]
...
[    0.000000] software IO TLB: area num 8.
[    0.000000] software IO TLB: mapped [mem 0x000000083be38000-0x000000083fe38000] (64MB)

What do you get at your end?
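
As an aside, the zone layout above shows why the patch stores a plain
address rather than a bit count. The sketch below is a userspace
illustration only, not kernel code: dma_bit_mask() and fls64_sketch() are
stand-ins I am assuming for the kernel's DMA_BIT_MASK() and fls64(), and
limit_via_bits() is a hypothetical helper that round-trips a limit
through a bit count.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for DMA_BIT_MASK(n); valid for n in 1..64. */
static uint64_t dma_bit_mask(unsigned int n)
{
	assert(n >= 1 && n <= 64);
	return n == 64 ? ~0ULL : (1ULL << n) - 1;
}

/* Userspace stand-in for fls64(): 1-based index of the highest set bit,
 * or 0 when no bit is set. */
static unsigned int fls64_sketch(uint64_t x)
{
	unsigned int bits = 0;

	while (x) {
		bits++;
		x >>= 1;
	}
	return bits;
}

/* Hypothetical helper: the limit that survives being stored as a bit
 * count and expanded back into a mask. */
static uint64_t limit_via_bits(uint64_t limit)
{
	if (limit == 0)
		return 0;
	return dma_bit_mask(fls64_sketch(limit));
}
```

With the DMA zone end from my log, 0x83fffffff, the bit count comes out
as 36 and the reconstructed limit is 0xfffffffff, which would also cover
the whole Normal zone. A limit that is already of the form 2^n - 1, such
as 0xffffffff, survives the round trip unchanged.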

> ------------[ cut here ]------------
> dwmmc_rockchip fe2b0000.mmc: swiotlb addr 0x00000001faf00000+4096 
> overflow (mask ffffffff, bus limit 0).
> WARNING: CPU: 3 PID: 1 at kernel/dma/swiotlb.c:1594 swiotlb_map+0x2f0/0x308
> Modules linked in:
> CPU: 3 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.11.0-rc4+ #15278
> Hardware name: Hardkernel ODROID-M1 (DT)
> pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> pc : swiotlb_map+0x2f0/0x308
> lr : swiotlb_map+0x2f0/0x308
> ...
> Call trace:
>   swiotlb_map+0x2f0/0x308
>   dma_direct_map_sg+0x9c/0x2e4
>   __dma_map_sg_attrs+0x28/0x94
>   dma_map_sg_attrs+0x10/0x24
>   dw_mci_pre_dma_transfer+0xb8/0xf4
>   dw_mci_pre_req+0x50/0x68
>   mmc_blk_mq_issue_rq+0x3e0/0x964
>   mmc_mq_queue_rq+0x118/0x2b4
>   blk_mq_dispatch_rq_list+0x21c/0x714
>   __blk_mq_sched_dispatch_requests+0x490/0x58c
>   blk_mq_sched_dispatch_requests+0x30/0x6c
>   blk_mq_run_hw_queue+0x284/0x40c
>   blk_mq_flush_plug_list.part.0+0x190/0x974
>   blk_mq_flush_plug_list+0x1c/0x2c
>   __blk_flush_plug+0xe4/0x140
>   blk_finish_plug+0x38/0x4c
>   __ext4_get_inode_loc+0x22c/0x654
>   __ext4_get_inode_loc_noinmem+0x40/0xa8
>   __ext4_iget+0x154/0xcc0
>   ext4_get_journal_inode+0x30/0x110
>   ext4_load_and_init_journal+0x9c/0xaf0
>   ext4_fill_super+0x1fec/0x2d90
>   get_tree_bdev+0x140/0x1d8
>   ext4_get_tree+0x18/0x24
>   vfs_get_tree+0x28/0xe8
>   path_mount+0x3e8/0xb7c
>   init_mount+0x68/0xac
>   do_mount_root+0x108/0x1dc
>   mount_root_generic+0x100/0x330
>   mount_root+0x160/0x2d0
>   initrd_load+0x1f0/0x2a0
>   prepare_namespace+0x4c/0x29c
>   kernel_init_freeable+0x4b4/0x50c
>   kernel_init+0x20/0x1d8
>   ret_from_fork+0x10/0x20
> irq event stamp: 1305682
> hardirqs last  enabled at (1305681): [<ffff8000800e332c>] 
> console_unlock+0x124/0x130
> hardirqs last disabled at (1305682): [<ffff80008124e684>] el1_dbg+0x24/0x8c
> softirqs last  enabled at (1305678): [<ffff80008005be1c>] 
> handle_softirqs+0x4cc/0x4e4
> softirqs last disabled at (1305665): [<ffff8000800105b0>] 
> __do_softirq+0x14/0x20
> ---[ end trace 0000000000000000 ]---
>
> This "bus limit 0" seems to be a bit suspicious to me as well as the 
> fact that swiotlb is used for the MMC DMA. I will investigate this 
> further tomorrow. The board boots fine though.

Looking at the code, I guess that bus_dma_limit set to 0 means no bus
limit. But the dma_mask for your device indicates a 32-bit device limit,
which can't work with the swiotlb address above 4GB shown in the warning
(0x00000001faf00000). For some reason the DMA code tries to allocate
from a higher address. This is most likely the reason dma_capable()
returns false.
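
To make that reasoning concrete, here is a simplified userspace model of
the range check as I understand it. It is a sketch under assumptions, not
the kernel implementation: min_not_zero_u64() models the kernel's
min_not_zero() macro, range_is_dma_capable() is a hypothetical name, and
the real dma_capable() performs additional checks that are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Smaller of two limits, where 0 means "no limit set". */
static uint64_t min_not_zero_u64(uint64_t a, uint64_t b)
{
	if (a == 0)
		return b;
	if (b == 0)
		return a;
	return a < b ? a : b;
}

/* Hypothetical simplified check: the last byte of the buffer must not
 * exceed the effective limit derived from the device mask and the bus
 * limit. */
static bool range_is_dma_capable(uint64_t addr, size_t size,
				 uint64_t dma_mask, uint64_t bus_dma_limit)
{
	uint64_t end;

	assert(size > 0);	/* avoid underflow in the end computation */
	end = addr + size - 1;
	return end <= min_not_zero_u64(dma_mask, bus_dma_limit);
}
```

Plugging in the values from the warning (addr 0x1faf00000, size 4096,
mask 0xffffffff, bus limit 0), the effective limit is 0xffffffff and the
buffer ends at 0x1faf00fff, so the check fails. The same buffer placed
below 4GB would pass.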

Thanks,
baruch

>>   arch/arm64/mm/init.c       | 30 +++++++++++++++---------------
>>   arch/powerpc/mm/mem.c      |  5 ++++-
>>   arch/s390/mm/init.c        |  2 +-
>>   include/linux/dma-direct.h |  2 +-
>>   kernel/dma/direct.c        |  6 +++---
>>   kernel/dma/pool.c          |  4 ++--
>>   kernel/dma/swiotlb.c       |  6 +++---
>>   7 files changed, 29 insertions(+), 26 deletions(-)
>>
>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>> index 9b5ab6818f7f..c45e2152ca9e 100644
>> --- a/arch/arm64/mm/init.c
>> +++ b/arch/arm64/mm/init.c
>> @@ -115,35 +115,35 @@ static void __init arch_reserve_crashkernel(void)
>>   }
>>   
>>   /*
>> - * Return the maximum physical address for a zone accessible by the given bits
>> - * limit. If DRAM starts above 32-bit, expand the zone to the maximum
>> + * Return the maximum physical address for a zone given its limit.
>> + * If DRAM starts above 32-bit, expand the zone to the maximum
>>    * available memory, otherwise cap it at 32-bit.
>>    */
>> -static phys_addr_t __init max_zone_phys(unsigned int zone_bits)
>> +static phys_addr_t __init max_zone_phys(phys_addr_t zone_limit)
>>   {
>> -	phys_addr_t zone_mask = DMA_BIT_MASK(zone_bits);
>>   	phys_addr_t phys_start = memblock_start_of_DRAM();
>>   
>>   	if (phys_start > U32_MAX)
>> -		zone_mask = PHYS_ADDR_MAX;
>> -	else if (phys_start > zone_mask)
>> -		zone_mask = U32_MAX;
>> +		zone_limit = PHYS_ADDR_MAX;
>> +	else if (phys_start > zone_limit)
>> +		zone_limit = U32_MAX;
>>   
>> -	return min(zone_mask, memblock_end_of_DRAM() - 1) + 1;
>> +	return min(zone_limit, memblock_end_of_DRAM() - 1) + 1;
>>   }
>>   
>>   static void __init zone_sizes_init(void)
>>   {
>>   	unsigned long max_zone_pfns[MAX_NR_ZONES]  = {0};
>> -	unsigned int __maybe_unused acpi_zone_dma_bits;
>> -	unsigned int __maybe_unused dt_zone_dma_bits;
>> -	phys_addr_t __maybe_unused dma32_phys_limit = max_zone_phys(32);
>> +	phys_addr_t __maybe_unused acpi_zone_dma_limit;
>> +	phys_addr_t __maybe_unused dt_zone_dma_limit;
>> +	phys_addr_t __maybe_unused dma32_phys_limit =
>> +		max_zone_phys(DMA_BIT_MASK(32));
>>   
>>   #ifdef CONFIG_ZONE_DMA
>> -	acpi_zone_dma_bits = fls64(acpi_iort_dma_get_max_cpu_address());
>> -	dt_zone_dma_bits = fls64(of_dma_get_max_cpu_address(NULL));
>> -	zone_dma_bits = min3(32U, dt_zone_dma_bits, acpi_zone_dma_bits);
>> -	arm64_dma_phys_limit = max_zone_phys(zone_dma_bits);
>> +	acpi_zone_dma_limit = acpi_iort_dma_get_max_cpu_address();
>> +	dt_zone_dma_limit = of_dma_get_max_cpu_address(NULL);
>> +	zone_dma_limit = min(dt_zone_dma_limit, acpi_zone_dma_limit);
>> +	arm64_dma_phys_limit = max_zone_phys(zone_dma_limit);
>>   	max_zone_pfns[ZONE_DMA] = PFN_DOWN(arm64_dma_phys_limit);
>>   #endif
>>   #ifdef CONFIG_ZONE_DMA32
>> diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
>> index d325217ab201..05b7f702b3f7 100644
>> --- a/arch/powerpc/mm/mem.c
>> +++ b/arch/powerpc/mm/mem.c
>> @@ -216,7 +216,7 @@ static int __init mark_nonram_nosave(void)
>>    * everything else. GFP_DMA32 page allocations automatically fall back to
>>    * ZONE_DMA.
>>    *
>> - * By using 31-bit unconditionally, we can exploit zone_dma_bits to inform the
>> + * By using 31-bit unconditionally, we can exploit zone_dma_limit to inform the
>>    * generic DMA mapping code.  32-bit only devices (if not handled by an IOMMU
>>    * anyway) will take a first dip into ZONE_NORMAL and get otherwise served by
>>    * ZONE_DMA.
>> @@ -230,6 +230,7 @@ void __init paging_init(void)
>>   {
>>   	unsigned long long total_ram = memblock_phys_mem_size();
>>   	phys_addr_t top_of_ram = memblock_end_of_DRAM();
>> +	int zone_dma_bits;
>>   
>>   #ifdef CONFIG_HIGHMEM
>>   	unsigned long v = __fix_to_virt(FIX_KMAP_END);
>> @@ -256,6 +257,8 @@ void __init paging_init(void)
>>   	else
>>   		zone_dma_bits = 31;
>>   
>> +	zone_dma_limit = DMA_BIT_MASK(zone_dma_bits);
>> +
>>   #ifdef CONFIG_ZONE_DMA
>>   	max_zone_pfns[ZONE_DMA]	= min(max_low_pfn,
>>   				      1UL << (zone_dma_bits - PAGE_SHIFT));
>> diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
>> index ddcd39ef4346..91fc2b91adfc 100644
>> --- a/arch/s390/mm/init.c
>> +++ b/arch/s390/mm/init.c
>> @@ -97,7 +97,7 @@ void __init paging_init(void)
>>   
>>   	vmem_map_init();
>>   	sparse_init();
>> -	zone_dma_bits = 31;
>> +	zone_dma_limit = DMA_BIT_MASK(31);
>>   	memset(max_zone_pfns, 0, sizeof(max_zone_pfns));
>>   	max_zone_pfns[ZONE_DMA] = virt_to_pfn(MAX_DMA_ADDRESS);
>>   	max_zone_pfns[ZONE_NORMAL] = max_low_pfn;
>> diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
>> index edbe13d00776..d7e30d4f7503 100644
>> --- a/include/linux/dma-direct.h
>> +++ b/include/linux/dma-direct.h
>> @@ -12,7 +12,7 @@
>>   #include <linux/mem_encrypt.h>
>>   #include <linux/swiotlb.h>
>>   
>> -extern unsigned int zone_dma_bits;
>> +extern u64 zone_dma_limit;
>>   
>>   /*
>>    * Record the mapping of CPU physical to DMA addresses for a given region.
>> diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
>> index 4480a3cd92e0..f2ba074a6a54 100644
>> --- a/kernel/dma/direct.c
>> +++ b/kernel/dma/direct.c
>> @@ -20,7 +20,7 @@
>>    * it for entirely different regions. In that case the arch code needs to
>>    * override the variable below for dma-direct to work properly.
>>    */
>> -unsigned int zone_dma_bits __ro_after_init = 24;
>> +u64 zone_dma_limit __ro_after_init = DMA_BIT_MASK(24);
>>   
>>   static inline dma_addr_t phys_to_dma_direct(struct device *dev,
>>   		phys_addr_t phys)
>> @@ -59,7 +59,7 @@ static gfp_t dma_direct_optimal_gfp_mask(struct device *dev, u64 *phys_limit)
>>   	 * zones.
>>   	 */
>>   	*phys_limit = dma_to_phys(dev, dma_limit);
>> -	if (*phys_limit <= DMA_BIT_MASK(zone_dma_bits))
>> +	if (*phys_limit <= zone_dma_limit)
>>   		return GFP_DMA;
>>   	if (*phys_limit <= DMA_BIT_MASK(32))
>>   		return GFP_DMA32;
>> @@ -580,7 +580,7 @@ int dma_direct_supported(struct device *dev, u64 mask)
>>   	 * part of the check.
>>   	 */
>>   	if (IS_ENABLED(CONFIG_ZONE_DMA))
>> -		min_mask = min_t(u64, min_mask, DMA_BIT_MASK(zone_dma_bits));
>> +		min_mask = min_t(u64, min_mask, zone_dma_limit);
>>   	return mask >= phys_to_dma_unencrypted(dev, min_mask);
>>   }
>>   
>> diff --git a/kernel/dma/pool.c b/kernel/dma/pool.c
>> index d10613eb0f63..7b04f7575796 100644
>> --- a/kernel/dma/pool.c
>> +++ b/kernel/dma/pool.c
>> @@ -70,9 +70,9 @@ static bool cma_in_zone(gfp_t gfp)
>>   	/* CMA can't cross zone boundaries, see cma_activate_area() */
>>   	end = cma_get_base(cma) + size - 1;
>>   	if (IS_ENABLED(CONFIG_ZONE_DMA) && (gfp & GFP_DMA))
>> -		return end <= DMA_BIT_MASK(zone_dma_bits);
>> +		return end <= zone_dma_limit;
>>   	if (IS_ENABLED(CONFIG_ZONE_DMA32) && (gfp & GFP_DMA32))
>> -		return end <= DMA_BIT_MASK(32);
>> +		return end <= max(DMA_BIT_MASK(32), zone_dma_limit);
>>   	return true;
>>   }
>>   
>> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
>> index df68d29740a0..abcf3fa63a56 100644
>> --- a/kernel/dma/swiotlb.c
>> +++ b/kernel/dma/swiotlb.c
>> @@ -450,9 +450,9 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask,
>>   	if (!remap)
>>   		io_tlb_default_mem.can_grow = true;
>>   	if (IS_ENABLED(CONFIG_ZONE_DMA) && (gfp_mask & __GFP_DMA))
>> -		io_tlb_default_mem.phys_limit = DMA_BIT_MASK(zone_dma_bits);
>> +		io_tlb_default_mem.phys_limit = zone_dma_limit;
>>   	else if (IS_ENABLED(CONFIG_ZONE_DMA32) && (gfp_mask & __GFP_DMA32))
>> -		io_tlb_default_mem.phys_limit = DMA_BIT_MASK(32);
>> +		io_tlb_default_mem.phys_limit = max(DMA_BIT_MASK(32), zone_dma_limit);
>>   	else
>>   		io_tlb_default_mem.phys_limit = virt_to_phys(high_memory - 1);
>>   #endif
>> @@ -629,7 +629,7 @@ static struct page *swiotlb_alloc_tlb(struct device *dev, size_t bytes,
>>   	}
>>   
>>   	gfp &= ~GFP_ZONEMASK;
>> -	if (phys_limit <= DMA_BIT_MASK(zone_dma_bits))
>> +	if (phys_limit <= zone_dma_limit)
>>   		gfp |= __GFP_DMA;
>>   	else if (phys_limit <= DMA_BIT_MASK(32))
>>   		gfp |= __GFP_DMA32;
>
> Best regards

-- 
                                                     ~. .~   Tk Open Systems
=}------------------------------------------------ooO--U--Ooo------------{=
   - baruch@tkos.co.il - tel: +972.52.368.4656, http://www.tkos.co.il -
