Re: [PATCH v3 2/3] dma-mapping: replace zone_dma_bits by zone_dma_limit

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Catalin Marinas <catalin.marinas@arm.com>
To: Baruch Siach <baruch@tkos.co.il>
Cc: "Christoph Hellwig" <hch@lst.de>,
	"Marek Szyprowski" <m.szyprowski@samsung.com>,
	"Will Deacon" <will@kernel.org>,
	"Robin Murphy" <robin.murphy@arm.com>,
	iommu@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
	linux-s390@vger.kernel.org, "Petr Tesařík" <petr@tesarici.cz>,
	"Ramon Fried" <ramon@neureality.ai>,
	"Elad Nachman" <enachman@marvell.com>
Subject: Re: [PATCH v3 2/3] dma-mapping: replace zone_dma_bits by zone_dma_limit
Date: Wed, 31 Jul 2024 18:26:20 +0100	[thread overview]
Message-ID: <ZqpzvHXBHXMt9IAZ@arm.com> (raw)
In-Reply-To: <053fa4806a2c63efcde80caca473a8b670a2701c.1722249878.git.baruch@tkos.co.il>

On Mon, Jul 29, 2024 at 01:51:25PM +0300, Baruch Siach wrote:
> From: Catalin Marinas <catalin.marinas@arm.com>
> 
> Hardware DMA limit might not be power of 2. When RAM range starts above
> 0, say 4GB, DMA limit of 30 bits should end at 5GB. A single high bit
> can not encode this limit.
> 
> Use direct phys_addr_t limit address for DMA zone limit.
> 
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> Signed-off-by: Baruch Siach <baruch@tkos.co.il>

You should add your Co-developed-by line, the patch evolved a bit from
initial my partial diff.

> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 9b5ab6818f7f..870fd967c610 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -114,36 +114,28 @@ static void __init arch_reserve_crashkernel(void)
>  				    low_size, high);
>  }
>  
> -/*
> - * Return the maximum physical address for a zone accessible by the given bits
> - * limit. If DRAM starts above 32-bit, expand the zone to the maximum
> - * available memory, otherwise cap it at 32-bit.
> - */
> -static phys_addr_t __init max_zone_phys(unsigned int zone_bits)
> +static phys_addr_t __init max_zone_phys(phys_addr_t zone_limit)
>  {
> -	phys_addr_t zone_mask = DMA_BIT_MASK(zone_bits);
> -	phys_addr_t phys_start = memblock_start_of_DRAM();
> -
> -	if (phys_start > U32_MAX)
> -		zone_mask = PHYS_ADDR_MAX;
> -	else if (phys_start > zone_mask)
> -		zone_mask = U32_MAX;
> +	/* We have RAM in low 32-bit area, keep DMA zone there */
> +	if (memblock_start_of_DRAM() < U32_MAX)
> +		zone_limit = min(U32_MAX, zone_limit);

Does this matter anymore? Or is it to keep ZONE_DMA below (or equal to)
the ZONE_DMA32 limit?

Anyway, since this patch is about replacing zone_dma_bits with
zone_dma_limit, we should not introduce functional changes. AFAICT, we
need zone_limit to be set to memblock_end_of_DRAM() when phys_start is
above U32_MAX. You can do the functional change in a subsequent patch
once all the other refactoring has been handled.

>  
> -	return min(zone_mask, memblock_end_of_DRAM() - 1) + 1;
> +	return min(zone_limit, memblock_end_of_DRAM() - 1) + 1;
>  }
[...]
> diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
> index edbe13d00776..98b7d8015043 100644
> --- a/include/linux/dma-direct.h
> +++ b/include/linux/dma-direct.h
> @@ -12,7 +12,7 @@
>  #include <linux/mem_encrypt.h>
>  #include <linux/swiotlb.h>
>  
> -extern unsigned int zone_dma_bits;
> +extern phys_addr_t zone_dma_limit;
>  
>  /*
>   * Record the mapping of CPU physical to DMA addresses for a given region.
> diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> index 3b4be4ca3b08..3dbc0b89d6fb 100644
> --- a/kernel/dma/direct.c
> +++ b/kernel/dma/direct.c
> @@ -20,7 +20,7 @@
>   * it for entirely different regions. In that case the arch code needs to
>   * override the variable below for dma-direct to work properly.
>   */
> -unsigned int zone_dma_bits __ro_after_init = 24;
> +phys_addr_t zone_dma_limit __ro_after_init = DMA_BIT_MASK(24);
>  
>  static inline dma_addr_t phys_to_dma_direct(struct device *dev,
>  		phys_addr_t phys)
> @@ -580,7 +580,7 @@ int dma_direct_supported(struct device *dev, u64 mask)
>  	 * part of the check.
>  	 */
>  	if (IS_ENABLED(CONFIG_ZONE_DMA))
> -		min_mask = min_t(u64, min_mask, DMA_BIT_MASK(zone_dma_bits));
> +		min_mask = min_t(u64, min_mask, zone_dma_limit);
>  	return mask >= phys_to_dma_unencrypted(dev, min_mask);
>  }
>  
> diff --git a/kernel/dma/pool.c b/kernel/dma/pool.c
> index d10613eb0f63..410a7b40e496 100644
> --- a/kernel/dma/pool.c
> +++ b/kernel/dma/pool.c
> @@ -70,7 +70,7 @@ static bool cma_in_zone(gfp_t gfp)
>  	/* CMA can't cross zone boundaries, see cma_activate_area() */
>  	end = cma_get_base(cma) + size - 1;
>  	if (IS_ENABLED(CONFIG_ZONE_DMA) && (gfp & GFP_DMA))
> -		return end <= DMA_BIT_MASK(zone_dma_bits);
> +		return end <= zone_dma_limit;
>  	if (IS_ENABLED(CONFIG_ZONE_DMA32) && (gfp & GFP_DMA32))
>  		return end <= DMA_BIT_MASK(32);
>  	return true;

I haven't got to the third patch yet but with this series we can have
zone_dma_limit above DMA_BIT_MASK(32). The above function could return
false for GFP_DMA32 when 'end' is perfectly valid within ZONE_DMA (which
implies safe for GFP_DMA32).

> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index df68d29740a0..dfd83e5ee0b3 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -450,7 +450,7 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask,
>  	if (!remap)
>  		io_tlb_default_mem.can_grow = true;
>  	if (IS_ENABLED(CONFIG_ZONE_DMA) && (gfp_mask & __GFP_DMA))
> -		io_tlb_default_mem.phys_limit = DMA_BIT_MASK(zone_dma_bits);
> +		io_tlb_default_mem.phys_limit = zone_dma_limit;
>  	else if (IS_ENABLED(CONFIG_ZONE_DMA32) && (gfp_mask & __GFP_DMA32))
>  		io_tlb_default_mem.phys_limit = DMA_BIT_MASK(32);
>  	else
> @@ -629,7 +629,7 @@ static struct page *swiotlb_alloc_tlb(struct device *dev, size_t bytes,
>  	}
>  
>  	gfp &= ~GFP_ZONEMASK;
> -	if (phys_limit <= DMA_BIT_MASK(zone_dma_bits))
> +	if (phys_limit <= zone_dma_limit)
>  		gfp |= __GFP_DMA;
>  	else if (phys_limit <= DMA_BIT_MASK(32))
>  		gfp |= __GFP_DMA32;

I think this has the same issue as dma_direct_optimal_gfp_mask(). If
the requested limit is strictly smaller than DMA_BIT_MASK(32) (but
potentially bigger than zone_dma_limit), we should go for __GFP_DMA
rather than __GFP_DMA32. You should probably fix this in the first patch
as well.

As above, with this series we can end up with zone_dma_limit above
DMA_BIT_MASK(32) and all these checks become confusing. Even the
swiotlb_init_late() above if called with GFP_DMA32 will set a phys_limit
that may not be accessible at all if the DRAM starts above 4GB.
Similarly, cma_in_zone() could return false with GFP_DMA32 in similar
hardware configurations.

So we either introduce a zone_dma32_limit variable and allow a 32-bit
range above the start of DRAM or sanitise these sites to make sure
passing GFP_DMA32 is safe - i.e. assume we only have ZONE_DMA if
zone_dma_limit is above 32-bit. I prefer the latter without introducing
a zone_dma32_limit.

-- 
Catalin

WARNING: multiple messages have this Message-ID (diff)

From: Catalin Marinas <catalin.marinas@arm.com>
To: Baruch Siach <baruch@tkos.co.il>
Cc: linux-s390@vger.kernel.org, "Ramon Fried" <ramon@neureality.ai>,
	"Petr Tesařík" <petr@tesarici.cz>,
	"Will Deacon" <will@kernel.org>,
	linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org,
	iommu@lists.linux.dev, "Elad Nachman" <enachman@marvell.com>,
	"Robin Murphy" <robin.murphy@arm.com>,
	"Christoph Hellwig" <hch@lst.de>,
	linux-arm-kernel@lists.infradead.org,
	"Marek Szyprowski" <m.szyprowski@samsung.com>
Subject: Re: [PATCH v3 2/3] dma-mapping: replace zone_dma_bits by zone_dma_limit
Date: Wed, 31 Jul 2024 18:26:20 +0100	[thread overview]
Message-ID: <ZqpzvHXBHXMt9IAZ@arm.com> (raw)
In-Reply-To: <053fa4806a2c63efcde80caca473a8b670a2701c.1722249878.git.baruch@tkos.co.il>

On Mon, Jul 29, 2024 at 01:51:25PM +0300, Baruch Siach wrote:
> From: Catalin Marinas <catalin.marinas@arm.com>
> 
> Hardware DMA limit might not be power of 2. When RAM range starts above
> 0, say 4GB, DMA limit of 30 bits should end at 5GB. A single high bit
> can not encode this limit.
> 
> Use direct phys_addr_t limit address for DMA zone limit.
> 
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> Signed-off-by: Baruch Siach <baruch@tkos.co.il>

You should add your Co-developed-by line, the patch evolved a bit from
initial my partial diff.

> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 9b5ab6818f7f..870fd967c610 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -114,36 +114,28 @@ static void __init arch_reserve_crashkernel(void)
>  				    low_size, high);
>  }
>  
> -/*
> - * Return the maximum physical address for a zone accessible by the given bits
> - * limit. If DRAM starts above 32-bit, expand the zone to the maximum
> - * available memory, otherwise cap it at 32-bit.
> - */
> -static phys_addr_t __init max_zone_phys(unsigned int zone_bits)
> +static phys_addr_t __init max_zone_phys(phys_addr_t zone_limit)
>  {
> -	phys_addr_t zone_mask = DMA_BIT_MASK(zone_bits);
> -	phys_addr_t phys_start = memblock_start_of_DRAM();
> -
> -	if (phys_start > U32_MAX)
> -		zone_mask = PHYS_ADDR_MAX;
> -	else if (phys_start > zone_mask)
> -		zone_mask = U32_MAX;
> +	/* We have RAM in low 32-bit area, keep DMA zone there */
> +	if (memblock_start_of_DRAM() < U32_MAX)
> +		zone_limit = min(U32_MAX, zone_limit);

Does this matter anymore? Or is it to keep ZONE_DMA below (or equal to)
the ZONE_DMA32 limit?

Anyway, since this patch is about replacing zone_dma_bits with
zone_dma_limit, we should not introduce functional changes. AFAICT, we
need zone_limit to be set to memblock_end_of_DRAM() when phys_start is
above U32_MAX. You can do the functional change in a subsequent patch
once all the other refactoring has been handled.

>  
> -	return min(zone_mask, memblock_end_of_DRAM() - 1) + 1;
> +	return min(zone_limit, memblock_end_of_DRAM() - 1) + 1;
>  }
[...]
> diff --git a/include/linux/dma-direct.h b/include/linux/dma-direct.h
> index edbe13d00776..98b7d8015043 100644
> --- a/include/linux/dma-direct.h
> +++ b/include/linux/dma-direct.h
> @@ -12,7 +12,7 @@
>  #include <linux/mem_encrypt.h>
>  #include <linux/swiotlb.h>
>  
> -extern unsigned int zone_dma_bits;
> +extern phys_addr_t zone_dma_limit;
>  
>  /*
>   * Record the mapping of CPU physical to DMA addresses for a given region.
> diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> index 3b4be4ca3b08..3dbc0b89d6fb 100644
> --- a/kernel/dma/direct.c
> +++ b/kernel/dma/direct.c
> @@ -20,7 +20,7 @@
>   * it for entirely different regions. In that case the arch code needs to
>   * override the variable below for dma-direct to work properly.
>   */
> -unsigned int zone_dma_bits __ro_after_init = 24;
> +phys_addr_t zone_dma_limit __ro_after_init = DMA_BIT_MASK(24);
>  
>  static inline dma_addr_t phys_to_dma_direct(struct device *dev,
>  		phys_addr_t phys)
> @@ -580,7 +580,7 @@ int dma_direct_supported(struct device *dev, u64 mask)
>  	 * part of the check.
>  	 */
>  	if (IS_ENABLED(CONFIG_ZONE_DMA))
> -		min_mask = min_t(u64, min_mask, DMA_BIT_MASK(zone_dma_bits));
> +		min_mask = min_t(u64, min_mask, zone_dma_limit);
>  	return mask >= phys_to_dma_unencrypted(dev, min_mask);
>  }
>  
> diff --git a/kernel/dma/pool.c b/kernel/dma/pool.c
> index d10613eb0f63..410a7b40e496 100644
> --- a/kernel/dma/pool.c
> +++ b/kernel/dma/pool.c
> @@ -70,7 +70,7 @@ static bool cma_in_zone(gfp_t gfp)
>  	/* CMA can't cross zone boundaries, see cma_activate_area() */
>  	end = cma_get_base(cma) + size - 1;
>  	if (IS_ENABLED(CONFIG_ZONE_DMA) && (gfp & GFP_DMA))
> -		return end <= DMA_BIT_MASK(zone_dma_bits);
> +		return end <= zone_dma_limit;
>  	if (IS_ENABLED(CONFIG_ZONE_DMA32) && (gfp & GFP_DMA32))
>  		return end <= DMA_BIT_MASK(32);
>  	return true;

I haven't got to the third patch yet but with this series we can have
zone_dma_limit above DMA_BIT_MASK(32). The above function could return
false for GFP_DMA32 when 'end' is perfectly valid within ZONE_DMA (which
implies safe for GFP_DMA32).

> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
> index df68d29740a0..dfd83e5ee0b3 100644
> --- a/kernel/dma/swiotlb.c
> +++ b/kernel/dma/swiotlb.c
> @@ -450,7 +450,7 @@ int swiotlb_init_late(size_t size, gfp_t gfp_mask,
>  	if (!remap)
>  		io_tlb_default_mem.can_grow = true;
>  	if (IS_ENABLED(CONFIG_ZONE_DMA) && (gfp_mask & __GFP_DMA))
> -		io_tlb_default_mem.phys_limit = DMA_BIT_MASK(zone_dma_bits);
> +		io_tlb_default_mem.phys_limit = zone_dma_limit;
>  	else if (IS_ENABLED(CONFIG_ZONE_DMA32) && (gfp_mask & __GFP_DMA32))
>  		io_tlb_default_mem.phys_limit = DMA_BIT_MASK(32);
>  	else
> @@ -629,7 +629,7 @@ static struct page *swiotlb_alloc_tlb(struct device *dev, size_t bytes,
>  	}
>  
>  	gfp &= ~GFP_ZONEMASK;
> -	if (phys_limit <= DMA_BIT_MASK(zone_dma_bits))
> +	if (phys_limit <= zone_dma_limit)
>  		gfp |= __GFP_DMA;
>  	else if (phys_limit <= DMA_BIT_MASK(32))
>  		gfp |= __GFP_DMA32;

I think this has the same issue as dma_direct_optimal_gfp_mask(). If
the requested limit is strictly smaller than DMA_BIT_MASK(32) (but
potentially bigger than zone_dma_limit), we should go for __GFP_DMA
rather than __GFP_DMA32. You should probably fix this in the first patch
as well.

As above, with this series we can end up with zone_dma_limit above
DMA_BIT_MASK(32) and all these checks become confusing. Even the
swiotlb_init_late() above if called with GFP_DMA32 will set a phys_limit
that may not be accessible at all if the DRAM starts above 4GB.
Similarly, cma_in_zone() could return false with GFP_DMA32 in similar
hardware configurations.

So we either introduce a zone_dma32_limit variable and allow a 32-bit
range above the start of DRAM or sanitise these sites to make sure
passing GFP_DMA32 is safe - i.e. assume we only have ZONE_DMA if
zone_dma_limit is above 32-bit. I prefer the latter without introducing
a zone_dma32_limit.

-- 
Catalin

next prev parent reply	other threads:[~2024-07-31 17:26 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-29 10:51 [PATCH v3 0/3] dma: support DMA zone starting above 4GB Baruch Siach
2024-07-29 10:51 ` Baruch Siach
2024-07-29 10:51 ` [PATCH v3 1/3] dma-mapping: improve DMA zone selection Baruch Siach
2024-07-29 10:51   ` Baruch Siach
2024-07-29 10:51 ` [PATCH v3 2/3] dma-mapping: replace zone_dma_bits by zone_dma_limit Baruch Siach
2024-07-29 10:51   ` Baruch Siach
2024-07-29 20:20   ` kernel test robot
2024-07-29 20:20     ` kernel test robot
2024-07-30  2:12     ` Nathan Chancellor
2024-07-30  2:12       ` Nathan Chancellor
2024-07-30 15:34       ` Christoph Hellwig
2024-07-30 15:34         ` Christoph Hellwig
2024-08-01  1:24         ` Nathan Chancellor
2024-08-01  1:24           ` Nathan Chancellor
2024-08-01 13:44           ` Christoph Hellwig
2024-08-01 13:44             ` Christoph Hellwig
2024-08-01 15:22             ` Nathan Chancellor
2024-08-01 15:22               ` Nathan Chancellor
2024-07-31 17:26   ` Catalin Marinas [this message]
2024-07-31 17:26     ` Catalin Marinas
2024-07-29 10:51 ` [PATCH v3 3/3] dma-direct: use RAM start to offset zone_dma_limit Baruch Siach
2024-07-29 10:51   ` Baruch Siach
2024-07-31 17:33   ` Catalin Marinas
2024-07-31 17:33     ` Catalin Marinas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZqpzvHXBHXMt9IAZ@arm.com \
    --to=catalin.marinas@arm.com \
    --cc=baruch@tkos.co.il \
    --cc=enachman@marvell.com \
    --cc=hch@lst.de \
    --cc=iommu@lists.linux.dev \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=m.szyprowski@samsung.com \
    --cc=petr@tesarici.cz \
    --cc=ramon@neureality.ai \
    --cc=robin.murphy@arm.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.