linux-block.vger.kernel.org archive mirror
* [Regression] b1a000d3b8ec ("block: relax direct io memory alignment")
@ 2024-10-15 16:40 Ming Lei
  2024-10-16  8:04 ` Christoph Hellwig
  0 siblings, 1 reply; 12+ messages in thread
From: Ming Lei @ 2024-10-15 16:40 UTC (permalink / raw)
  To: linux-block, Keith Busch, Christoph Hellwig; +Cc: Jens Axboe

[-- Attachment #1: Type: text/plain, Size: 395 bytes --]

Hello Guys,

It turns out that the host controller's DMA alignment is often too relaxed,
so two DMA buffers can easily share the same cache line and trigger the warning
"cacheline tracking EEXIST, overlapping mappings aren't supported".

The attached test code triggers the warning immediately with CONFIG_DMA_API_DEBUG
enabled when reading from a SCSI disk whose queue DMA alignment is 3.

Thanks,
Ming

[-- Attachment #2: dma.c --]
[-- Type: text/plain, Size: 1434 bytes --]

#define _GNU_SOURCE
#include <stdio.h>
#include <fcntl.h>
#include <string.h>
#include <stdlib.h>
#include <time.h>
#include <libaio.h>
#include <errno.h>
#include <unistd.h>

#define NR	4

int main(int argc, char *argv[])
{
	const char *outputfile;
	io_context_t ctx;
	int output_fd;
	struct iocb _io[NR];
	struct iocb *io[NR] = {
		&_io[0],
		&_io[1],
		&_io[2],
		&_io[3],
	};
	struct io_event e[NR];
	struct timespec timeout;
	int ret, done = 0;
	char *content;
	unsigned size = 2 * 1024 * 1024;

	if (argc < 2) {
		printf("usage: %s <disk>\n", argv[0]);
		return -1;
	}
	outputfile = argv[1];

	if (posix_memalign((void **)&content, 4096, NR * size + 512) != 0) {
		printf("posix_memalign error\n");
		return -1;
	}

	memset(&ctx, 0, sizeof(ctx));
	if (io_setup(10, &ctx) != 0) {
		printf("io_setup error\n");
		free(content);
		return -1;
	}

	if ((output_fd = open(outputfile, O_RDONLY | O_DIRECT)) < 0) {
		perror("open error");
		io_destroy(ctx);
		free(content);
		return -1;
	}

	/*
	 * The first buffer starts at content + 4, so it is only 4-byte
	 * aligned and its last 4 bytes overlap the start of the second
	 * buffer: the two mappings share a cache line.
	 */
	io_prep_pread(io[0], output_fd, content + 4, size, 0);
	io_prep_pread(io[1], output_fd, content + size, size, size * 2);
	io_prep_pread(io[2], output_fd, content + size * 2, size, size * 4);
	io_prep_pread(io[3], output_fd, content + size * 3, size, size * 8);

	ret = io_submit(ctx, NR, io);
	if (ret != NR) {
		printf("io_submit error %d\n", ret);
		close(output_fd);
		io_destroy(ctx);
		free(content);
		return -1;
	}

	/* Count completions across calls: a timed-out call may return fewer than NR. */
	while (done < NR) {
		timeout.tv_sec = 0;
		timeout.tv_nsec = 500000000;
		ret = io_getevents(ctx, 1, NR - done, e, &timeout);
		if (ret < 0) {
			printf("io_getevents error %d\n", ret);
			break;
		}
		done += ret;
		if (done < NR) {
			printf("haven't done\n");
			sleep(1);
		}
	}
	close(output_fd);
	io_destroy(ctx);
	free(content);
	return 0;
}
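
Why the first two reads in the test collide can be checked with plain address
arithmetic. The following is a minimal userspace sketch, not kernel code: the
64-byte line size and the helper names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size for illustration; real hardware varies. */
#define CACHELINE_BYTES 64u

/* Index of the cache line that contains address addr. */
static uint64_t cacheline_of(uint64_t addr)
{
	return addr / CACHELINE_BYTES;
}

/*
 * Return true if buffers [a, a + a_len) and [b, b + b_len) touch at
 * least one common cache line. Both lengths must be non-zero.
 */
static bool buffers_share_cacheline(uint64_t a, size_t a_len,
				    uint64_t b, size_t b_len)
{
	uint64_t a_first, a_last, b_first, b_last;

	assert(a_len > 0 && b_len > 0);
	a_first = cacheline_of(a);
	a_last = cacheline_of(a + a_len - 1);
	b_first = cacheline_of(b);
	b_last = cacheline_of(b + b_len - 1);

	/* Two ranges of line indices intersect iff neither ends before the other starts. */
	return a_first <= b_last && b_first <= a_last;
}
```

With a page-aligned base, the buffer at base + 4 of length size ends in the
cache line where the buffer at base + size begins, so the check reports a
shared line for the first pair of reads and none for the later, aligned pairs.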



* Re: [Regression] b1a000d3b8ec ("block: relax direct io memory alignment")
  2024-10-15 16:40 [Regression] b1a000d3b8ec ("block: relax direct io memory alignment") Ming Lei
@ 2024-10-16  8:04 ` Christoph Hellwig
  2024-10-16  8:31   ` Ming Lei
  0 siblings, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2024-10-16  8:04 UTC (permalink / raw)
  To: Ming Lei; +Cc: linux-block, Keith Busch, Christoph Hellwig, Jens Axboe

On Wed, Oct 16, 2024 at 12:40:13AM +0800, Ming Lei wrote:
> Hello Guys,
> 
> Turns out host controller's DMA alignment is often too relax, so two DMA
> buffers may cross same cache line easily, and trigger the warning of
> "cacheline tracking EEXIST, overlapping mappings aren't supported".
> 
> The attached test code can trigger the warning immediately with CONFIG_DMA_API_DEBUG
> enabled when reading from one scsi disk which queue DMA alignment is 3.
> 

Indeed, we should not allow alignment smaller than a cache line on
architectures that are not cache coherent.



* Re: [Regression] b1a000d3b8ec ("block: relax direct io memory alignment")
  2024-10-16  8:04 ` Christoph Hellwig
@ 2024-10-16  8:31   ` Ming Lei
  2024-10-16 12:31     ` Christoph Hellwig
                       ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Ming Lei @ 2024-10-16  8:31 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-block, Keith Busch, Jens Axboe, ming.lei

On Wed, Oct 16, 2024 at 10:04:19AM +0200, Christoph Hellwig wrote:
> On Wed, Oct 16, 2024 at 12:40:13AM +0800, Ming Lei wrote:
> > Hello Guys,
> > 
> > Turns out host controller's DMA alignment is often too relax, so two DMA
> > buffers may cross same cache line easily, and trigger the warning of
> > "cacheline tracking EEXIST, overlapping mappings aren't supported".
> > 
> > The attached test code can trigger the warning immediately with CONFIG_DMA_API_DEBUG
> > enabled when reading from one scsi disk which queue DMA alignment is 3.
> > 
> 
> We should not allow smaller than cache line alignment on architectures
> that are not cache coherent indeed.

Yes, something like the following change:

diff --git a/block/blk-settings.c b/block/blk-settings.c
index a446654ddee5..26bd0e72c68e 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -348,7 +348,9 @@ static int blk_validate_limits(struct queue_limits *lim)
 	 */
 	if (!lim->dma_alignment)
 		lim->dma_alignment = SECTOR_SIZE - 1;
-	if (WARN_ON_ONCE(lim->dma_alignment > PAGE_SIZE))
+	else if (lim->dma_alignment < L1_CACHE_BYTES - 1)
+		lim->dma_alignment = L1_CACHE_BYTES - 1;
+	else if (WARN_ON_ONCE(lim->dma_alignment > PAGE_SIZE))
 		return -EINVAL;
 
 	if (lim->alignment_offset) {
 


Thanks,
Ming
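
The effect of the change above on a given alignment mask can be modelled in
userspace. This is only a sketch under stated assumptions: the MODEL_* constants
are illustrative stand-ins for SECTOR_SIZE, L1_CACHE_BYTES and PAGE_SIZE, and
model_validate_dma_alignment() is a made-up name, not a kernel function.

```c
#include <assert.h>
#include <errno.h>

/* Illustrative stand-ins for the kernel constants; values are assumptions. */
#define MODEL_SECTOR_SIZE	512u
#define MODEL_L1_CACHE_BYTES	64u
#define MODEL_PAGE_SIZE		4096u

/*
 * Userspace model of the proposed check. *mask is an alignment mask
 * (alignment - 1), like dma_alignment in struct queue_limits.
 * Returns 0 on success or -EINVAL if the mask is too large.
 */
static int model_validate_dma_alignment(unsigned int *mask)
{
	assert(mask);

	if (!*mask)
		*mask = MODEL_SECTOR_SIZE - 1;		/* unset: default to sector alignment */
	else if (*mask < MODEL_L1_CACHE_BYTES - 1)
		*mask = MODEL_L1_CACHE_BYTES - 1;	/* raise to cache line alignment */
	else if (*mask > MODEL_PAGE_SIZE)
		return -EINVAL;
	return 0;
}
```

Under these assumed values a mask of 3 is raised to 63, so a buffer that is
only 4-byte aligned, like content + 4 in the test program, would no longer be
accepted for direct I/O.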



* Re: [Regression] b1a000d3b8ec ("block: relax direct io memory alignment")
  2024-10-16  8:31   ` Ming Lei
@ 2024-10-16 12:31     ` Christoph Hellwig
  2024-10-22  1:21       ` Ming Lei
  2024-10-22  2:15     ` Jens Axboe
  2024-10-22 10:24     ` Catalin Marinas
  2 siblings, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2024-10-16 12:31 UTC (permalink / raw)
  To: Ming Lei
  Cc: Christoph Hellwig, linux-block, Keith Busch, Jens Axboe,
	Catalin Marinas

On Wed, Oct 16, 2024 at 04:31:45PM +0800, Ming Lei wrote:
> On Wed, Oct 16, 2024 at 10:04:19AM +0200, Christoph Hellwig wrote:
> > On Wed, Oct 16, 2024 at 12:40:13AM +0800, Ming Lei wrote:
> > > Hello Guys,
> > > 
> > > Turns out host controller's DMA alignment is often too relax, so two DMA
> > > buffers may cross same cache line easily, and trigger the warning of
> > > "cacheline tracking EEXIST, overlapping mappings aren't supported".
> > > 
> > > The attached test code can trigger the warning immediately with CONFIG_DMA_API_DEBUG
> > > enabled when reading from one scsi disk which queue DMA alignment is 3.
> > > 
> > 
> > We should not allow smaller than cache line alignment on architectures
> > that are not cache coherent indeed.
> 
> Yes, something like the following change:

We only really need this if the architecture supports cache-incoherent
DMA.  Maybe even as a runtime setting.



* Re: [Regression] b1a000d3b8ec ("block: relax direct io memory alignment")
  2024-10-16 12:31     ` Christoph Hellwig
@ 2024-10-22  1:21       ` Ming Lei
  2024-10-22  7:25         ` Christoph Hellwig
  0 siblings, 1 reply; 12+ messages in thread
From: Ming Lei @ 2024-10-22  1:21 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-block, Keith Busch, Jens Axboe, Catalin Marinas,
	Robin Murphy

On Wed, Oct 16, 2024 at 02:31:53PM +0200, Christoph Hellwig wrote:
> On Wed, Oct 16, 2024 at 04:31:45PM +0800, Ming Lei wrote:
> > On Wed, Oct 16, 2024 at 10:04:19AM +0200, Christoph Hellwig wrote:
> > > On Wed, Oct 16, 2024 at 12:40:13AM +0800, Ming Lei wrote:
> > > > Hello Guys,
> > > > 
> > > > Turns out host controller's DMA alignment is often too relax, so two DMA
> > > > buffers may cross same cache line easily, and trigger the warning of
> > > > "cacheline tracking EEXIST, overlapping mappings aren't supported".
> > > > 
> > > > The attached test code can trigger the warning immediately with CONFIG_DMA_API_DEBUG
> > > > enabled when reading from one scsi disk which queue DMA alignment is 3.
> > > > 
> > > 
> > > We should not allow smaller than cache line alignment on architectures
> > > that are not cache coherent indeed.
> > 
> > Yes, something like the following change:
> 
> We only really need this if the architecture support cache incoherent
> DMA.  Maybe even as a runtime setting.

Can you take coherent DMA into account in kernel/dma/debug.c first?
Otherwise the warning may still be triggered on coherent DMA.


thanks, 
Ming



* Re: [Regression] b1a000d3b8ec ("block: relax direct io memory alignment")
  2024-10-16  8:31   ` Ming Lei
  2024-10-16 12:31     ` Christoph Hellwig
@ 2024-10-22  2:15     ` Jens Axboe
  2024-10-22 10:24     ` Catalin Marinas
  2 siblings, 0 replies; 12+ messages in thread
From: Jens Axboe @ 2024-10-22  2:15 UTC (permalink / raw)
  To: Ming Lei, Christoph Hellwig; +Cc: linux-block, Keith Busch

On 10/16/24 2:31 AM, Ming Lei wrote:
> On Wed, Oct 16, 2024 at 10:04:19AM +0200, Christoph Hellwig wrote:
>> On Wed, Oct 16, 2024 at 12:40:13AM +0800, Ming Lei wrote:
>>> Hello Guys,
>>>
>>> Turns out host controller's DMA alignment is often too relax, so two DMA
>>> buffers may cross same cache line easily, and trigger the warning of
>>> "cacheline tracking EEXIST, overlapping mappings aren't supported".
>>>
>>> The attached test code can trigger the warning immediately with CONFIG_DMA_API_DEBUG
>>> enabled when reading from one scsi disk which queue DMA alignment is 3.
>>>
>>
>> We should not allow smaller than cache line alignment on architectures
>> that are not cache coherent indeed.
> 
> Yes, something like the following change:
> 
> diff --git a/block/blk-settings.c b/block/blk-settings.c
> index a446654ddee5..26bd0e72c68e 100644
> --- a/block/blk-settings.c
> +++ b/block/blk-settings.c
> @@ -348,7 +348,9 @@ static int blk_validate_limits(struct queue_limits *lim)
>  	 */
>  	if (!lim->dma_alignment)
>  		lim->dma_alignment = SECTOR_SIZE - 1;
> -	if (WARN_ON_ONCE(lim->dma_alignment > PAGE_SIZE))
> +	else if (lim->dma_alignment < L1_CACHE_BYTES - 1)
> +		lim->dma_alignment = L1_CACHE_BYTES - 1;
> +	else if (WARN_ON_ONCE(lim->dma_alignment > PAGE_SIZE))
>  		return -EINVAL;
>  
>  	if (lim->alignment_offset) {

This will break existing applications running on architectures
that are cache coherent.

-- 
Jens Axboe



* Re: [Regression] b1a000d3b8ec ("block: relax direct io memory alignment")
  2024-10-22  1:21       ` Ming Lei
@ 2024-10-22  7:25         ` Christoph Hellwig
  0 siblings, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2024-10-22  7:25 UTC (permalink / raw)
  To: Ming Lei
  Cc: Christoph Hellwig, linux-block, Keith Busch, Jens Axboe,
	Catalin Marinas, Robin Murphy

On Tue, Oct 22, 2024 at 09:21:21AM +0800, Ming Lei wrote:
> Can you take coherent DMA into account on kernel/dma/debug.c first?

We intentionally don't, to force people to write portable code.

> Otherwise the warning still may be triggered on coherent DMA.

Well, it is a valid warning for the above reason.



* Re: [Regression] b1a000d3b8ec ("block: relax direct io memory alignment")
  2024-10-16  8:31   ` Ming Lei
  2024-10-16 12:31     ` Christoph Hellwig
  2024-10-22  2:15     ` Jens Axboe
@ 2024-10-22 10:24     ` Catalin Marinas
  2024-10-23  0:50       ` Ming Lei
  2024-10-23  6:12       ` Christoph Hellwig
  2 siblings, 2 replies; 12+ messages in thread
From: Catalin Marinas @ 2024-10-22 10:24 UTC (permalink / raw)
  To: Ming Lei
  Cc: Christoph Hellwig, linux-block, Keith Busch, Jens Axboe,
	Robin Murphy

On Wed, Oct 16, 2024 at 04:31:45PM +0800, Ming Lei wrote:
> On Wed, Oct 16, 2024 at 10:04:19AM +0200, Christoph Hellwig wrote:
> > On Wed, Oct 16, 2024 at 12:40:13AM +0800, Ming Lei wrote:
> > > Turns out host controller's DMA alignment is often too relax, so two DMA
> > > buffers may cross same cache line easily, and trigger the warning of
> > > "cacheline tracking EEXIST, overlapping mappings aren't supported".
> > > 
> > > The attached test code can trigger the warning immediately with CONFIG_DMA_API_DEBUG
> > > enabled when reading from one scsi disk which queue DMA alignment is 3.
> > 
> > We should not allow smaller than cache line alignment on architectures
> > that are not cache coherent indeed.

Even on architectures that are not fully coherent, the coherency is a
property of the device. You may need to somehow pass this information in
struct queue_limits if you want it to be optimal.

> Yes, something like the following change:
> 
> diff --git a/block/blk-settings.c b/block/blk-settings.c
> index a446654ddee5..26bd0e72c68e 100644
> --- a/block/blk-settings.c
> +++ b/block/blk-settings.c
> @@ -348,7 +348,9 @@ static int blk_validate_limits(struct queue_limits *lim)
>  	 */
>  	if (!lim->dma_alignment)
>  		lim->dma_alignment = SECTOR_SIZE - 1;
> -	if (WARN_ON_ONCE(lim->dma_alignment > PAGE_SIZE))
> +	else if (lim->dma_alignment < L1_CACHE_BYTES - 1)
> +		lim->dma_alignment = L1_CACHE_BYTES - 1;
> +	else if (WARN_ON_ONCE(lim->dma_alignment > PAGE_SIZE))
>  		return -EINVAL;

L1_CACHE_BYTES is not the right check here since a level 2/3 cache may
have a larger cache line than level 1 (and we have such configurations
on arm64 where ARCH_DMA_MINALIGN is 128 and L1_CACHE_BYTES is 64). Use
dma_get_cache_alignment() instead. On fully coherent architectures like
x86 it should return 1.
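
The difference between the two constants can be illustrated with a small
userspace sketch. The 64 and 128 values are the ones quoted above for some
arm64 configurations; the MODEL_* names and helpers are made up for
illustration and are not kernel interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values quoted in the thread for some arm64 configurations. */
#define MODEL_L1_CACHE_BYTES	 64u
#define MODEL_ARCH_DMA_MINALIGN	128u

/* True if addr satisfies the alignment mask (alignment - 1). */
static bool is_aligned(uint64_t addr, unsigned int mask)
{
	return (addr & mask) == 0;
}

/* True if addresses a and b fall into the same line of line_bytes bytes. */
static bool same_line(uint64_t a, uint64_t b, unsigned int line_bytes)
{
	assert(line_bytes != 0);
	return a / line_bytes == b / line_bytes;
}
```

Two buffers at 0x1000 and 0x1040 both satisfy a mask of
MODEL_L1_CACHE_BYTES - 1 and sit in different 64-byte lines, yet they fall
into the same 128-byte line, which is the case an L1_CACHE_BYTES check would
miss.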

That said, the DMA debug code also uses the static L1_CACHE_SHIFT and it
will trigger the warning anyway. Some discussion around the DMA API
debug came up during the small ARCH_KMALLOC_MINALIGN changes (I don't
remember whether it was in private with Robin or on the list). Now kmalloc() can
return a small buffer (less than a cache line) that won't be bounced if
the device is coherent (see dma_kmalloc_safe()) but the DMA API debug
code only checks for direction == DMA_TO_DEVICE, not
dev_is_dma_coherent(). For arm64 I did not want to disable small
ARCH_KMALLOC_MINALIGN if CONFIG_DMA_API_DEBUG is enabled as this would
skew the testing by forcing all allocations to be ARCH_DMA_MINALIGN
aligned.

Maybe I'm missing something in those checks but I'm surprised that the
DMA API debug code doesn't complain about small kmalloc() buffers on x86
(which never had any bouncing for this specific case since it's fully
coherent). I suspect people just don't enable DMA debugging on x86 for
such devices (typically USB drivers have this issue).

So maybe the DMA API debug should have two modes: a generic one that
catches alignments irrespective of the coherency of the device and
another that's specific to the device/architecture coherency properties.
The former, if enabled, should also force a higher minimum kmalloc()
alignment and a dma_get_cache_alignment() > 1.
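
As a rough illustration of the two-mode idea, the decision could look like
the sketch below. Everything here is hypothetical: neither the mode names nor
model_should_warn() exist in the DMA API debug code, and the real check would
have more inputs than these two flags.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical modes; neither name exists in the kernel. */
enum model_debug_mode {
	MODEL_MODE_GENERIC,		/* flag sharing regardless of coherency */
	MODEL_MODE_DEVICE_AWARE,	/* flag sharing only for non-coherent devices */
};

/*
 * Decide whether a mapping that shares a cache line with another active
 * mapping should produce a warning in the given mode.
 */
static bool model_should_warn(enum model_debug_mode mode,
			      bool shares_cacheline, bool dev_coherent)
{
	assert(mode == MODEL_MODE_GENERIC || mode == MODEL_MODE_DEVICE_AWARE);

	if (!shares_cacheline)
		return false;
	if (mode == MODEL_MODE_GENERIC)
		return true;
	return !dev_coherent;
}
```

In the generic mode a coherent x86 device still gets the warning, which keeps
the portability pressure; in the device-aware mode the same mapping is silent.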

-- 
Catalin


* Re: [Regression] b1a000d3b8ec ("block: relax direct io memory alignment")
  2024-10-22 10:24     ` Catalin Marinas
@ 2024-10-23  0:50       ` Ming Lei
  2024-10-23  6:12       ` Christoph Hellwig
  1 sibling, 0 replies; 12+ messages in thread
From: Ming Lei @ 2024-10-23  0:50 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Christoph Hellwig, linux-block, Keith Busch, Jens Axboe,
	Robin Murphy

On Tue, Oct 22, 2024 at 11:24:31AM +0100, Catalin Marinas wrote:
> On Wed, Oct 16, 2024 at 04:31:45PM +0800, Ming Lei wrote:
> > On Wed, Oct 16, 2024 at 10:04:19AM +0200, Christoph Hellwig wrote:
> > > On Wed, Oct 16, 2024 at 12:40:13AM +0800, Ming Lei wrote:
> > > > Turns out host controller's DMA alignment is often too relax, so two DMA
> > > > buffers may cross same cache line easily, and trigger the warning of
> > > > "cacheline tracking EEXIST, overlapping mappings aren't supported".
> > > > 
> > > > The attached test code can trigger the warning immediately with CONFIG_DMA_API_DEBUG
> > > > enabled when reading from one scsi disk which queue DMA alignment is 3.
> > > 
> > > We should not allow smaller than cache line alignment on architectures
> > > that are not cache coherent indeed.
> 
> Even on architectures that are not fully coherent, the coherency is a
> property of the device. You may need to somehow pass this information in
> struct queue_limits if you want it to be optimal.

Yeah, it looks like the issue has to be fixed on the driver side, since only
the driver has the 'struct device' info.

> 
> > Yes, something like the following change:
> > 
> > diff --git a/block/blk-settings.c b/block/blk-settings.c
> > index a446654ddee5..26bd0e72c68e 100644
> > --- a/block/blk-settings.c
> > +++ b/block/blk-settings.c
> > @@ -348,7 +348,9 @@ static int blk_validate_limits(struct queue_limits *lim)
> >  	 */
> >  	if (!lim->dma_alignment)
> >  		lim->dma_alignment = SECTOR_SIZE - 1;
> > -	if (WARN_ON_ONCE(lim->dma_alignment > PAGE_SIZE))
> > +	else if (lim->dma_alignment < L1_CACHE_BYTES - 1)
> > +		lim->dma_alignment = L1_CACHE_BYTES - 1;
> > +	else if (WARN_ON_ONCE(lim->dma_alignment > PAGE_SIZE))
> >  		return -EINVAL;
> 
> L1_CACHE_BYTES is not the right check here since a level 2/3 cache may
> have a larger cache line than level 1 (and we have such configurations
> on arm64 where ARCH_DMA_MINALIGN is 128 and L1_CACHE_BYTES is 64). Use
> dma_get_cache_alignment() instead. On fully coherent architectures like
> x86 it should return 1.
> 
> That said, the DMA debug code also uses the static L1_CACHE_SHIFT and it
> will trigger the warning anyway. Some discussion around the DMA API
> debug came up during the small ARCH_KMALLOC_MINALIGN changes (don't
> remember it was in private with Robin or on the list). Now kmalloc() can
> return a small buffer (less than a cache line) that won't be bounced if
> the device is coherent (see dma_kmalloc_safe()) but the DMA API debug
> code only checks for direction == DMA_TO_DEVICE, not
> dev_is_dma_coherent(). For arm64 I did not want to disable small
> ARCH_KMALLOC_MINALIGN if CONFIG_DMA_API_DEBUG is enabled as this would
> skew the testing by forcing all allocations to be ARCH_DMA_MINALIGN
> aligned.
> 
> Maybe I'm missing something in those checks but I'm surprised that the
> DMA API debug code doesn't complain about small kmalloc() buffers on x86
> (which never had any bouncing for this specific case since it's fully
> coherent). I suspect people just don't enable DMA debugging on x86 for
> such devices (typically USB drivers have this issue).

I have seen reports of the "cacheline tracking EEXIST, overlapping mappings
aren't supported" warning on USB several times; it is often treated as the same
issue as this one.

> 
> So maybe the DMA API debug should have two modes: a generic one that
> catches alignments irrespective of the coherency of the device and
> another that's specific to the device/architecture coherency properties.
> The former, if enabled, should also force a higher minimum kmalloc()
> alignment and a dma_get_cache_alignment() > 1.

Or the DMA debug log needs to be improved to show that the warning is just a
hint.


Thanks,
Ming



* Re: [Regression] b1a000d3b8ec ("block: relax direct io memory alignment")
  2024-10-22 10:24     ` Catalin Marinas
  2024-10-23  0:50       ` Ming Lei
@ 2024-10-23  6:12       ` Christoph Hellwig
  2024-10-23  8:14         ` Ming Lei
  1 sibling, 1 reply; 12+ messages in thread
From: Christoph Hellwig @ 2024-10-23  6:12 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: Ming Lei, Christoph Hellwig, linux-block, Keith Busch, Jens Axboe,
	Robin Murphy

On Tue, Oct 22, 2024 at 11:24:31AM +0100, Catalin Marinas wrote:
> > > We should not allow smaller than cache line alignment on architectures
> > > that are not cache coherent indeed.
> 
> Even on architectures that are not fully coherent, the coherency is a
> property of the device. You may need to somehow pass this information in
> struct queue_limits if you want it to be optimal.

Well, devices set the queue limits.  So this would be a fix in the
drivers that set the queue limits.  SCSI already does this in the
midlayer code, so the main places to fix are nvme and ublk.

I can take care of nvme by copying the scsi pattern.

> That said, the DMA debug code also uses the static L1_CACHE_SHIFT and it
> will trigger the warning anyway. Some discussion around the DMA API
> debug came up during the small ARCH_KMALLOC_MINALIGN changes (don't
> remember it was in private with Robin or on the list). Now kmalloc() can
> return a small buffer (less than a cache line) that won't be bounced if
> the device is coherent (see dma_kmalloc_safe()) but the DMA API debug
> code only checks for direction == DMA_TO_DEVICE, not
> dev_is_dma_coherent(). For arm64 I did not want to disable small
> ARCH_KMALLOC_MINALIGN if CONFIG_DMA_API_DEBUG is enabled as this would
> skew the testing by forcing all allocations to be ARCH_DMA_MINALIGN
> aligned.
> 
> Maybe I'm missing something in those checks but I'm surprised that the
> DMA API debug code doesn't complain about small kmalloc() buffers on x86
> (which never had any bouncing for this specific case since it's fully
> coherent). I suspect people just don't enable DMA debugging on x86 for
> such devices (typically USB drivers have this issue).

I don't think there are too many of these indeed.

> So maybe the DMA API debug should have two modes: a generic one that
> catches alignments irrespective of the coherency of the device and
> another that's specific to the device/architecture coherency properties.
> The former, if enabled, should also force a higher minimum kmalloc()
> alignment and a dma_get_cache_alignment() > 1.

Sounds reasonable.



* Re: [Regression] b1a000d3b8ec ("block: relax direct io memory alignment")
  2024-10-23  6:12       ` Christoph Hellwig
@ 2024-10-23  8:14         ` Ming Lei
  2024-10-23 12:23           ` Christoph Hellwig
  0 siblings, 1 reply; 12+ messages in thread
From: Ming Lei @ 2024-10-23  8:14 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Catalin Marinas, linux-block, Keith Busch, Jens Axboe,
	Robin Murphy

On Wed, Oct 23, 2024 at 08:12:33AM +0200, Christoph Hellwig wrote:
> On Tue, Oct 22, 2024 at 11:24:31AM +0100, Catalin Marinas wrote:
> > > > We should not allow smaller than cache line alignment on architectures
> > > > that are not cache coherent indeed.
> > 
> > Even on architectures that are not fully coherent, the coherency is a
> > property of the device. You may need to somehow pass this information in
> > struct queue_limits if you want it to be optimal.
> 
> Well, devices set the queue limits.  So this would be a fix in the
> drivers that set the queue limits.  SCSI already does this in the
> midlayer code,

I guess it isn't true:

[linux]# cat /sys/block/sda/queue/dma_alignment
3
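
A buffer can be checked against this value from userspace. The following is a
minimal sketch, assuming the sysfs file holds the alignment mask (alignment
minus one) as a decimal number, as the output above suggests; the helper names
are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Read an alignment mask from a file such as
 * /sys/block/sda/queue/dma_alignment. Returns 0 on success, -1 on error.
 */
static int read_dma_alignment_mask(const char *path, unsigned int *mask)
{
	FILE *f;
	int ok;

	assert(path && mask);
	f = fopen(path, "r");
	if (!f)
		return -1;
	ok = fscanf(f, "%u", mask) == 1;
	fclose(f);
	return ok ? 0 : -1;
}

/* True if buf satisfies the mask (mask = required alignment - 1). */
static bool buffer_meets_mask(const void *buf, unsigned int mask)
{
	return ((uintptr_t)buf & mask) == 0;
}
```

With the mask of 3 shown above, a buffer such as content + 4 from the test
program passes this check even though it is not cache line aligned.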

> so the main places to fix are nvme und ublk.
> 
> I cant take care of nvme by copying the scsi pattern.
> 
> > That said, the DMA debug code also uses the static L1_CACHE_SHIFT and it
> > will trigger the warning anyway. Some discussion around the DMA API
> > debug came up during the small ARCH_KMALLOC_MINALIGN changes (don't
> > remember it was in private with Robin or on the list). Now kmalloc() can
> > return a small buffer (less than a cache line) that won't be bounced if
> > the device is coherent (see dma_kmalloc_safe()) but the DMA API debug
> > code only checks for direction == DMA_TO_DEVICE, not
> > dev_is_dma_coherent(). For arm64 I did not want to disable small
> > ARCH_KMALLOC_MINALIGN if CONFIG_DMA_API_DEBUG is enabled as this would
> > skew the testing by forcing all allocations to be ARCH_DMA_MINALIGN
> > aligned.
> > 
> > Maybe I'm missing something in those checks but I'm surprised that the
> > DMA API debug code doesn't complain about small kmalloc() buffers on x86
> > (which never had any bouncing for this specific case since it's fully
> > coherent). I suspect people just don't enable DMA debugging on x86 for
> > such devices (typically USB drivers have this issue).
> 
> I don't think there's too many of these indeed.

Usually it is assumed that it is safe to DMA to a kmalloc() buffer...

thanks,
Ming



* Re: [Regression] b1a000d3b8ec ("block: relax direct io memory alignment")
  2024-10-23  8:14         ` Ming Lei
@ 2024-10-23 12:23           ` Christoph Hellwig
  0 siblings, 0 replies; 12+ messages in thread
From: Christoph Hellwig @ 2024-10-23 12:23 UTC (permalink / raw)
  To: Ming Lei
  Cc: Christoph Hellwig, Catalin Marinas, linux-block, Keith Busch,
	Jens Axboe, Robin Murphy

On Wed, Oct 23, 2024 at 04:14:34PM +0800, Ming Lei wrote:
> On Wed, Oct 23, 2024 at 08:12:33AM +0200, Christoph Hellwig wrote:
> > On Tue, Oct 22, 2024 at 11:24:31AM +0100, Catalin Marinas wrote:
> > > > > We should not allow smaller than cache line alignment on architectures
> > > > > that are not cache coherent indeed.
> > > 
> > > Even on architectures that are not fully coherent, the coherency is a
> > > property of the device. You may need to somehow pass this information in
> > > struct queue_limits if you want it to be optimal.
> > 
> > Well, devices set the queue limits.  So this would be a fix in the
> > drivers that set the queue limits.  SCSI already does this in the
> > midlayer code,
> 
> I guess it isn't true:
> 
> [linux]# cat /sys/block/sda/queue/dma_alignment

Is that a SCSI HBA on a bus that is not DMA coherent?  If not, that
is expected.
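
One way to picture why 3 is expected on a coherent bus is the sketch below.
It is an assumed model, not the midlayer code: it supposes the effective mask
is the larger of the host's own mask and the mask derived from the DMA cache
alignment, which is 1 on fully coherent platforms. The function name is
hypothetical.

```c
#include <assert.h>

/*
 * Assumed model: take the larger of the host's own alignment mask and
 * the mask derived from the DMA cache alignment (1 when fully coherent).
 */
static unsigned int model_scsi_dma_alignment(unsigned int host_mask,
					     unsigned int cache_alignment)
{
	unsigned int cache_mask;

	assert(cache_alignment >= 1);
	cache_mask = cache_alignment - 1;
	return host_mask > cache_mask ? host_mask : cache_mask;
}
```

Under this model a host mask of 3 stays at 3 when the cache alignment is 1,
and becomes 127 when the cache alignment is 128.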



end of thread, other threads:[~2024-10-23 12:23 UTC | newest]

Thread overview: 12+ messages
2024-10-15 16:40 [Regression] b1a000d3b8ec ("block: relax direct io memory alignment") Ming Lei
2024-10-16  8:04 ` Christoph Hellwig
2024-10-16  8:31   ` Ming Lei
2024-10-16 12:31     ` Christoph Hellwig
2024-10-22  1:21       ` Ming Lei
2024-10-22  7:25         ` Christoph Hellwig
2024-10-22  2:15     ` Jens Axboe
2024-10-22 10:24     ` Catalin Marinas
2024-10-23  0:50       ` Ming Lei
2024-10-23  6:12       ` Christoph Hellwig
2024-10-23  8:14         ` Ming Lei
2024-10-23 12:23           ` Christoph Hellwig
