* Bug in add_dma_entry()'s debugging code @ 2023-11-27 16:02 Alan Stern 2023-11-27 16:07 ` Christoph Hellwig 0 siblings, 1 reply; 16+ messages in thread From: Alan Stern @ 2023-11-27 16:02 UTC (permalink / raw) To: Hamza Mahfooz, Dan Williams, Christoph Hellwig Cc: Marek Szyprowski, Andrew, Ferry Toth, Andy Shevchenko, Thorsten Leemhuis, iommu, Kernel development list, USB mailing list Among other things, add_dma_entry() in kernel/dma/debug.c prints an error message when it sees two overlapping FROM_DEVICE DMA mappings. The actual overlap detection is performed by a separate routine, active_cacheline_insert(). But the criterion this routine uses is wrong. All it looks for is mappings that start on the same cache line. However on architectures that have cache-coherent DMA (such as x86), touching the same cache line does not mean that two DMA mappings will interfere with each other. To truly overlap, they would have to touch the same _bytes_. The routine does not check for this, and consequently we get error messages about overlapping mappings when in fact there is no overlap. This bug has been reported in https://bugzilla.kernel.org/show_bug.cgi?id=215740 How should this be fixed? Since the check done in add_dma_entry() is completely invalid for x86 and similar architectures, should it simply be removed for them? Or should the check be enhanced to look for byte-granularity overlap? Alan Stern PS: As a separate issue, active_cacheline_insert() fails to detect overlap in situations where a mapping occupies more than one cache line. For example, if mapping A uses lines N and N+1 and mapping B uses line N+1, no overlap will be detected because the radix-tree keys for A and B will be different (N vs. N+1). ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Bug in add_dma_entry()'s debugging code 2023-11-27 16:02 Bug in add_dma_entry()'s debugging code Alan Stern @ 2023-11-27 16:07 ` Christoph Hellwig 2023-11-27 16:51 ` Alan Stern 0 siblings, 1 reply; 16+ messages in thread From: Christoph Hellwig @ 2023-11-27 16:07 UTC (permalink / raw) To: Alan Stern Cc: Hamza Mahfooz, Dan Williams, Christoph Hellwig, Marek Szyprowski, Andrew, Ferry Toth, Andy Shevchenko, Thorsten Leemhuis, iommu, Kernel development list, USB mailing list On Mon, Nov 27, 2023 at 11:02:20AM -0500, Alan Stern wrote: > All it looks for is mappings that start on the same cache line. However > on architectures that have cache-coherent DMA (such as x86), touching > the same cache line does not mean that two DMA mappings will interfere > with each other. To truly overlap, they would have to touch the same > _bytes_. But that is a special case that does not matter. Linux drivers need to be written in a portable way, and that means we have to care about platforms that are not DMA coherent. > How should this be fixed? Since the check done in add_dma_entry() is > completely invalid for x86 and similar architectures, should it simply > be removed for them? Or should the check be enhanced to look for > byte-granularity overlap? The check is anything but "completely invalid". It points out that you violate the Linux DMA API requirements. This might not have an effect on the particular platform you are currently running on, but it is still wrong. Note that there have been various mumblings about using nosnoop dma on x86, in which case you'll have the issue there as well. > PS: As a separate issue, active_cacheline_insert() fails to detect > overlap in situations where a mapping occupies more than one cache line. > For example, if mapping A uses lines N and N+1 and mapping B uses line > N+1, no overlap will be detected because the radix-tree keys for A and B > will be different (N vs. N+1). Fixes welcome. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Bug in add_dma_entry()'s debugging code 2023-11-27 16:07 ` Christoph Hellwig @ 2023-11-27 16:51 ` Alan Stern 2023-11-28 13:37 ` Christoph Hellwig 0 siblings, 1 reply; 16+ messages in thread From: Alan Stern @ 2023-11-27 16:51 UTC (permalink / raw) To: Christoph Hellwig Cc: Hamza Mahfooz, Dan Williams, Marek Szyprowski, Andrew, Ferry Toth, Andy Shevchenko, Thorsten Leemhuis, iommu, Kernel development list, USB mailing list On Mon, Nov 27, 2023 at 05:07:59PM +0100, Christoph Hellwig wrote: > On Mon, Nov 27, 2023 at 11:02:20AM -0500, Alan Stern wrote: > > All it looks for is mappings that start on the same cache line. However > > on architectures that have cache-coherent DMA (such as x86), touching > > the same cache line does not mean that two DMA mappings will interfere > > with each other. To truly overlap, they would have to touch the same > > _bytes_. > > But that is a special case that does not matter. Linux drivers need > to be written in a portable way, and that means we have to care about > platforms that are not DMA coherent. The buffers in the bug report were allocated by kmalloc(). Doesn't kmalloc() guarantee that on architectures with non-cache-coherent DMA, allocations will be aligned on cache-line boundaries (or larger)? Isn't this what ARCH_DMA_MINALIGN and ARCH_KMALLOC_MINALIGN are supposed to take care of in include/linux/slab.h? > > How should this be fixed? Since the check done in add_dma_entry() is > > completely invalid for x86 and similar architectures, should it simply > > be removed for them? Or should the check be enhanced to look for > > byte-granularity overlap? > > The patch is every but "completely invalid". It points out that you > violate the Linux DMA API requirements. Since when does the DMA API require that mappings on x86 must be to separate cache lines? Is this documented anywhere? For that matter, Documentation/core-api/dma-api-howto.rst explicitly says: If you acquired your memory via the page allocator (i.e. 
__get_free_page*()) or the generic memory allocators (i.e. kmalloc() or kmem_cache_alloc()) then you may DMA to/from that memory using the addresses returned from those routines. It also says: Architectures must ensure that kmalloc'ed buffer is DMA-safe. Drivers and subsystems depend on it. If an architecture isn't fully DMA-coherent (i.e. hardware doesn't ensure that data in the CPU cache is identical to data in main memory), ARCH_DMA_MINALIGN must be set so that the memory allocator makes sure that kmalloc'ed buffer doesn't share a cache line with the others. See arch/arm/include/asm/cache.h as an example. It says nothing about avoiding more than one DMA operation at a time to prevent overlap. Is the documentation wrong? > This might not have an > effect on the particular platform you are currently running on, but it > is still wrong. Who decides what is right and what is wrong in this area? Clearly you have a different opinion from David S. Miller, Richard Henderson, and Jakub Jelinek (the authors of that documentation file). > Note that there have been various mumblings about > using nosnoop dma on x86, in which case you'll have the issue there > as well. Unless the people who implement nosnoop DMA also modify kmalloc() or ARCH_DMA_MINALIGN. I guess the real question boils down to where the responsibility should lie. Should kmalloc() guarantee that the memory regions it provides will always be suitable for DMA, with no overlap issues? Or should all the innumerable users of kmalloc() be responsible for jumping through hoops to request arch-specific minimum alignment for their DMA buffers? Alan Stern ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Bug in add_dma_entry()'s debugging code 2023-11-27 16:51 ` Alan Stern @ 2023-11-28 13:37 ` Christoph Hellwig 2023-11-28 15:18 ` Alan Stern 0 siblings, 1 reply; 16+ messages in thread From: Christoph Hellwig @ 2023-11-28 13:37 UTC (permalink / raw) To: Alan Stern Cc: Christoph Hellwig, Hamza Mahfooz, Dan Williams, Marek Szyprowski, Andrew, Ferry Toth, Andy Shevchenko, Thorsten Leemhuis, iommu, Kernel development list, USB mailing list, Catalin Marinas, Robin Murphy, Will Deacon On Mon, Nov 27, 2023 at 11:51:21AM -0500, Alan Stern wrote: > The buffers in the bug report were allocated by kmalloc(). Doesn't > kmalloc() guarantee that on architectures with non-cache-coherent DMA, > allocations will be aligned on cache-line boundaries (or larger)? Isn't > this what ARCH_DMA_MINALIGN and ARCH_KMALLOC_MINALIGN are supposed to > take care of in include/linux/slab.h? Oh. Yes, the variable alignment from kmalloc makes things complicated. > Architectures must ensure that kmalloc'ed buffer is > DMA-safe. Drivers and subsystems depend on it. If an architecture > isn't fully DMA-coherent (i.e. hardware doesn't ensure that data in > the CPU cache is identical to data in main memory), > ARCH_DMA_MINALIGN must be set so that the memory allocator > makes sure that kmalloc'ed buffer doesn't share a cache line with > the others. See arch/arm/include/asm/cache.h as an example. > > It says nothing about avoiding more than one DMA operation at a time to > prevent overlap. Is the documentation wrong? The documentation is a bit lacking, unfortunately. Again, assuming you actually have non-coherent mappings, you'd easily break the fragile cache line ownership if you did. That doesn't apply to x86 as-is, but we really try to keep drivers portable. > > This might not have an > > effect on the particular platform you are currently running on, but it > > is still wrong. > > Who decides what is right and what is wrong in this area? Clearly you > have a different opinion from David S. 
Miller, Richard Henderson, and > Jakub Jelinek (the authors of that documentation file). I don't think this is about anyone's opinion. It's a fundamental property of how to manage caches in the face of non-coherent DMA. > > > Note that there have been various mumblings about > > using nosnoop dma on x86, in which case you'll have the issue there > > as well. > > Unless the people who implement nosnoop DMA also modify kmalloc() or > ARCH_DMA_MINALIGN. Or don't do it on kmalloc buffers. > I guess the real question boils down to where the responsibility should > lie. Should kmalloc() guarantee that the memory regions it provides > will always be suitable for DMA, with no overlap issues? Or should all > the innumerable users of kmalloc() be responsible for jumping through > hoops to request arch-specific minimum alignment for their DMA buffers? I'd actually go one step back: 1) for not cache coherent DMA you can't do overlapping operations inside a cache line 2) dma-debug is intended to find DMA API misuses, even if they don't have bad effects on your particular system 3) the fact that the kmalloc implementation returns differently aligned memory depending on the platform breaks how dma-debug works currently The logical conclusion from that would be that IFF dma-debug is enabled on any platform we need to set ARCH_DMA_MINALIGN to the cache line size. BUT: we're actually reducing our dependency on ARCH_DMA_MINALIGN by moving to bounce buffering unaligned memory for non-coherent architectures, which makes this even more complicated. Right now I don't have a good idea how to actually deal with having the cache line sanity checks with that, but I'm Ccing some of the usual suspects if they have a better idea. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Bug in add_dma_entry()'s debugging code 2023-11-28 13:37 ` Christoph Hellwig @ 2023-11-28 15:18 ` Alan Stern 2023-11-28 16:31 ` Robin Murphy 2023-11-28 17:44 ` Catalin Marinas 0 siblings, 2 replies; 16+ messages in thread From: Alan Stern @ 2023-11-28 15:18 UTC (permalink / raw) To: Christoph Hellwig Cc: Hamza Mahfooz, Dan Williams, Marek Szyprowski, Andrew, Ferry Toth, Andy Shevchenko, Thorsten Leemhuis, iommu, Kernel development list, USB mailing list, Catalin Marinas, Robin Murphy, Will Deacon On Tue, Nov 28, 2023 at 02:37:02PM +0100, Christoph Hellwig wrote: > I'd actually go one step back: > > 1) for not cache coherent DMA you can't do overlapping operations inside > a cache line Rephrasing slightly: You mustn't perform multiple non-cache-coherent DMA operations that touch the same cache line concurrently. (The word "overlapping" is a little ambiguous in this context.) (Right now dma-debug considers only DMA-IN operations. In theory this restriction should apply even when some of the concurrent operations are DMA-OUT, provided that at least one of them is DMA-IN. Minor point...) > 2) dma-debug is intended to find DMA API misuses, even if they don't > have bad effects on your particular system > 3) the fact that the kmalloc implementation returns differently aligned > memory depending on the platform breaks how dma-debug works currently Exactly. That's essentially what Bugzilla #215740 is about. > The logical conclusion from that would be that IFF dma-debug is enabled on > any platform we need to set ARCH_DMA_MINALIGN to the cache line size. (IF, not IFF.) And tell distributions that CONFIG_DMA_API_DEBUG is not meant for production systems but rather for kernel testing, right? > BUT: we're actually reducing our dependency on ARCH_DMA_MINALIGN by > moving to bounce buffering unaligned memory for non-coherent > architectures, What's the reason for this? To allow the minimum allocation size to be smaller than the cache line size? 
Does the savings in memory make up for the extra overhead of bounce buffering? Or is this just to allow people to be more careless about how they allocate their DMA buffers (which doesn't seem to make sense)? > which makes this even more complicated. Right now I > don't have a good idea how to actually deal with having the cachline > sanity checks with that, but I'm Ccing some of the usual suspects if > they have a better idea. I get the impression that you would really like to have two different versions of kmalloc() and friends: one for buffers that will be used in DMA (and hence require cache-line alignment) and one for buffers that won't be. Alan Stern ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Bug in add_dma_entry()'s debugging code 2023-11-28 15:18 ` Alan Stern @ 2023-11-28 16:31 ` Robin Murphy 2023-11-28 16:34 ` Andy Shevchenko 2023-11-28 17:44 ` Catalin Marinas 1 sibling, 1 reply; 16+ messages in thread From: Robin Murphy @ 2023-11-28 16:31 UTC (permalink / raw) To: Alan Stern, Christoph Hellwig Cc: Hamza Mahfooz, Dan Williams, Marek Szyprowski, Andrew, Ferry Toth, Andy Shevchenko, Thorsten Leemhuis, iommu, Kernel development list, USB mailing list, Catalin Marinas, Will Deacon On 28/11/2023 3:18 pm, Alan Stern wrote: > On Tue, Nov 28, 2023 at 02:37:02PM +0100, Christoph Hellwig wrote: >> I'd actually go one step back: >> >> 1) for not cache coherent DMA you can't do overlapping operations inside >> a cache line > > Rephrasing slightly: You mustn't perform multiple non-cache-coherent DMA > operations that touch the same cache line concurrently. (The word > "overlapping" is a a little ambiguous in this context.) > > (Right now dma-debug considers only DMA-IN operations. In theory this > restriction should apply even when some of the concurrent operations are > DMA-OUT, provided that at least one of them is DMA-IN. Minor point...) > >> 2) dma-debug is intended to find DMA API misuses, even if they don't >> have bad effects on your particular system >> 3) the fact that the kmalloc implementation returns differently aligned >> memory depending on the platform breaks how dma-debug works currently > > Exactly. That's essentially what Bugzilla #215740 is about. > >> The logical confcusion from that would be that IFF dma-debug is enabled on >> any platform we need to set ARCH_DMA_MINALIGN to the cache line size. > > (IF, not IFF.) And tell distributions that CONFIG_DMA_API_DEBUG is not > meant for production systems but rather for kernel testing, right? Yikes, I'd hope that distros are heeding the warning that the Kconfig calls out already. 
It's perhaps somewhat understated, as I'd describe the performance impact to large modern systems with high-bandwidth I/O as somewhere between "severe" and "crippling". >> BUT: we're actually reduzing our dependency on ARCH_DMA_MINALIGN by >> moving to bounce buffering unaligned memory for non-coherent >> architectures, > > What's the reason for this? To allow the minimum allocation size to be > smaller than the cache line size? Does the savings in memory make up > for the extra overhead of bounce buffering? Yes, on systems where non-coherent streaming DMA is expected to be rare, or at least not performance-critical, not having to allocate 128 bytes every time we want just 8 or so soon adds up. > Or is this just to allow people to be more careless about how they > allocate their DMA buffers (which doesn't seem to make sense)? > >> which makes this even more complicated. Right now I >> don't have a good idea how to actually deal with having the cachline >> sanity checks with that, but I'm Ccing some of the usual suspects if >> they have a better idea. > > I get the impression that you would really like to have two different > versions of kmalloc() and friends: one for buffers that will be used in > DMA (and hence require cache-line alignment) and one for buffers that > won't be. That approach was mooted, but still has potentially-fiddly corner cases like when the DMA buffer is a member of a larger struct, so we settled on the bounce-buffering solution as the most robust compromise. Thanks, Robin. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Bug in add_dma_entry()'s debugging code 2023-11-28 16:31 ` Robin Murphy @ 2023-11-28 16:34 ` Andy Shevchenko 2023-11-28 16:54 ` Robin Murphy 0 siblings, 1 reply; 16+ messages in thread From: Andy Shevchenko @ 2023-11-28 16:34 UTC (permalink / raw) To: Robin Murphy Cc: Alan Stern, Christoph Hellwig, Hamza Mahfooz, Dan Williams, Marek Szyprowski, Andrew, Ferry Toth, Thorsten Leemhuis, iommu, Kernel development list, USB mailing list, Catalin Marinas, Will Deacon On Tue, Nov 28, 2023 at 6:31 PM Robin Murphy <robin.murphy@arm.com> wrote: > On 28/11/2023 3:18 pm, Alan Stern wrote: > > On Tue, Nov 28, 2023 at 02:37:02PM +0100, Christoph Hellwig wrote: ... > >> The logical conclusion from that would be that IFF dma-debug is enabled on > >> any platform we need to set ARCH_DMA_MINALIGN to the cache line size. > > > > (IF, not IFF.) And tell distributions that CONFIG_DMA_API_DEBUG is not > > meant for production systems but rather for kernel testing, right? > > Yikes, I'd hope that distros are heeding the warning that the Kconfig > calls out already. It's perhaps somewhat understated, as I'd describe > the performance impact to large modern systems with high-bandwidth I/O > as somewhere between "severe" and "crippling". Independently of the distro configurations, the (false positive) messages will make it difficult to debug the actual issues; shouldn't we make our code robust in any case? -- With Best Regards, Andy Shevchenko ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Bug in add_dma_entry()'s debugging code 2023-11-28 16:34 ` Andy Shevchenko @ 2023-11-28 16:54 ` Robin Murphy 0 siblings, 0 replies; 16+ messages in thread From: Robin Murphy @ 2023-11-28 16:54 UTC (permalink / raw) To: Andy Shevchenko Cc: Alan Stern, Christoph Hellwig, Hamza Mahfooz, Dan Williams, Marek Szyprowski, Andrew, Ferry Toth, Thorsten Leemhuis, iommu, Kernel development list, USB mailing list, Catalin Marinas, Will Deacon On 28/11/2023 4:34 pm, Andy Shevchenko wrote: > On Tue, Nov 28, 2023 at 6:31 PM Robin Murphy <robin.murphy@arm.com> wrote: >> On 28/11/2023 3:18 pm, Alan Stern wrote: >>> On Tue, Nov 28, 2023 at 02:37:02PM +0100, Christoph Hellwig wrote: > > ... > >>>> The logical confcusion from that would be that IFF dma-debug is enabled on >>>> any platform we need to set ARCH_DMA_MINALIGN to the cache line size. >>> >>> (IF, not IFF.) And tell distributions that CONFIG_DMA_API_DEBUG is not >>> meant for production systems but rather for kernel testing, right? >> >> Yikes, I'd hope that distros are heeding the warning that the Kconfig >> calls out already. It's perhaps somewhat understated, as I'd describe >> the performance impact to large modern systems with high-bandwidth I/O >> as somewhere between "severe" and "crippling". > > Independently on the distros configurations the (false positive) > message(s) will make difficult to debug the actual things, shouldn't > we have our code robust in any case? Sure, I have no objection to making dma-debug more robust and useful for its intended purpose, I was just commenting on the fact that any potential change in behaviour from this should be of less concern to distros than the significant change in behaviour that enabling it *already* poses (i.e. globally serialising DMA operations and doing inherently slow stuff in what are normally expected to be fast paths). Thanks, Robin. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Bug in add_dma_entry()'s debugging code 2023-11-28 15:18 ` Alan Stern 2023-11-28 16:31 ` Robin Murphy @ 2023-11-28 17:44 ` Catalin Marinas 2023-11-30 20:08 ` Ferry Toth 1 sibling, 1 reply; 16+ messages in thread From: Catalin Marinas @ 2023-11-28 17:44 UTC (permalink / raw) To: Alan Stern Cc: Christoph Hellwig, Hamza Mahfooz, Dan Williams, Marek Szyprowski, Andrew, Ferry Toth, Andy Shevchenko, Thorsten Leemhuis, iommu, Kernel development list, USB mailing list, Robin Murphy, Will Deacon On Tue, Nov 28, 2023 at 10:18:19AM -0500, Alan Stern wrote: > On Tue, Nov 28, 2023 at 02:37:02PM +0100, Christoph Hellwig wrote: > > I'd actually go one step back: > > > > 1) for not cache coherent DMA you can't do overlapping operations inside > > a cache line > > Rephrasing slightly: You mustn't perform multiple non-cache-coherent DMA > operations that touch the same cache line concurrently. (The word > "overlapping" is a a little ambiguous in this context.) The problem is worse. I'd say you should not perform even a single non-cache-coherent DMA (usually from-device or bidirectional) operation if the cache line is shared with anything else modifying it. It doesn't need to be another DMA operation. But that's more difficult to add to the DMA API debug code (maybe something like the bouncing logic in dma_kmalloc_needs_bounce()). > > The logical confcusion from that would be that IFF dma-debug is enabled on > > any platform we need to set ARCH_DMA_MINALIGN to the cache line size. 
Or just force the kmalloc() min align to cache_line_size(), something like: diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h index 4a658de44ee9..3ece20367636 100644 --- a/include/linux/dma-mapping.h +++ b/include/linux/dma-mapping.h @@ -543,6 +543,8 @@ static inline int dma_get_cache_alignment(void) #ifdef ARCH_HAS_DMA_MINALIGN return ARCH_DMA_MINALIGN; #endif + if (IS_ENABLED(CONFIG_DMA_API_DEBUG)) + return cache_line_size(); return 1; } #endif diff --git a/mm/slab_common.c b/mm/slab_common.c index 8d431193c273..d0b21d6e9328 100644 --- a/mm/slab_common.c +++ b/mm/slab_common.c @@ -879,7 +879,7 @@ static unsigned int __kmalloc_minalign(void) unsigned int minalign = dma_get_cache_alignment(); if (IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) && - is_swiotlb_allocated()) + is_swiotlb_allocated() && !IS_ENABLED(CONFIG_DMA_API_DEBUG)) minalign = ARCH_KMALLOC_MINALIGN; return max(minalign, arch_slab_minalign()); Also note that to_cacheline_number() in kernel/dma/debug.c only takes into account the L1_CACHE_SHIFT. On arm64 for example, cache_line_size() returns the maximum line of all the cache levels (and we've seen hardware where the L1 is 64-byte, L2 is 128). > > BUT: we're actually reduzing our dependency on ARCH_DMA_MINALIGN by > > moving to bounce buffering unaligned memory for non-coherent > > architectures, > > What's the reason for this? To allow the minimum allocation size to be > smaller than the cache line size? Does the savings in memory make up > for the extra overhead of bounce buffering? > > Or is this just to allow people to be more careless about how they > allocate their DMA buffers (which doesn't seem to make sense)? It's the former and it does make a difference with lots of small structure or string allocations. [...] 
> I get the impression that you would really like to have two different > versions of kmalloc() and friends: one for buffers that will be used in > DMA (and hence require cache-line alignment) and one for buffers that > won't be. We've been there for the past 2-3 years (and a few other options). It's hard to guess in a generic way because the allocation place may not necessarily know how the device is going to access that memory (PIO, DMA). The conclusion was that for those cases where the buffer is small, we just do a bounce. If it's performance critical, the driver can use a kmem_cache_create(SLAB_HWCACHE_ALIGN) and avoid the bouncing. -- Catalin ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: Bug in add_dma_entry()'s debugging code 2023-11-28 17:44 ` Catalin Marinas @ 2023-11-30 20:08 ` Ferry Toth 2023-12-01 11:08 ` Catalin Marinas 0 siblings, 1 reply; 16+ messages in thread From: Ferry Toth @ 2023-11-30 20:08 UTC (permalink / raw) To: Catalin Marinas, Alan Stern Cc: Christoph Hellwig, Hamza Mahfooz, Dan Williams, Marek Szyprowski, Andrew, Ferry Toth, Andy Shevchenko, Thorsten Leemhuis, iommu, Kernel development list, USB mailing list, Robin Murphy, Will Deacon Hi, Op 28-11-2023 om 18:44 schreef Catalin Marinas: > On Tue, Nov 28, 2023 at 10:18:19AM -0500, Alan Stern wrote: >> On Tue, Nov 28, 2023 at 02:37:02PM +0100, Christoph Hellwig wrote: >>> I'd actually go one step back: >>> >>> 1) for not cache coherent DMA you can't do overlapping operations inside >>> a cache line >> >> Rephrasing slightly: You mustn't perform multiple non-cache-coherent DMA >> operations that touch the same cache line concurrently. (The word >> "overlapping" is a a little ambiguous in this context.) > > The problem is worse. I'd say you should not perform even a single > non-cache-coherent DMA (usually from-device or bidirectional) operation > if the cache line is shared with anything else modifying it. It doesn't > need to be another DMA operation. But that's more difficult to add to > the DMA API debug code (maybe something like the bouncing logic in > dma_kmalloc_needs_bounce()). > >>> The logical confcusion from that would be that IFF dma-debug is enabled on >>> any platform we need to set ARCH_DMA_MINALIGN to the cache line size. 
> > Or just force the kmalloc() min align to cache_line_size(), something > like: > > diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h > index 4a658de44ee9..3ece20367636 100644 > --- a/include/linux/dma-mapping.h > +++ b/include/linux/dma-mapping.h > @@ -543,6 +543,8 @@ static inline int dma_get_cache_alignment(void) > #ifdef ARCH_HAS_DMA_MINALIGN > return ARCH_DMA_MINALIGN; > #endif > + if (IS_ENABLED(CONFIG_DMA_API_DEBUG)) > + return cache_line_size(); > return 1; > } > #endif > diff --git a/mm/slab_common.c b/mm/slab_common.c > index 8d431193c273..d0b21d6e9328 100644 > --- a/mm/slab_common.c > +++ b/mm/slab_common.c > @@ -879,7 +879,7 @@ static unsigned int __kmalloc_minalign(void) > unsigned int minalign = dma_get_cache_alignment(); > > if (IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) && > - is_swiotlb_allocated()) > + is_swiotlb_allocated() && !IS_ENABLED(CONFIG_DMA_API_DEBUG)) > minalign = ARCH_KMALLOC_MINALIGN; > > return max(minalign, arch_slab_minalign()); With above suggestion "force the kmalloc() min align to cache_line_size()" + Alan's debug code: root@yuna:~# journalctl -k | grep hub kernel: usbcore: registered new interface driver hub kernel: hub 1-0:1.0: USB hub found kernel: usb usb1: hub buffer at 71c7180, status at 71c71c0 kernel: hub 1-0:1.0: 1 port detected kernel: hub 2-0:1.0: USB hub found kernel: usb usb2: hub buffer at 71c79c0, status at 71c7a00 kernel: hub 2-0:1.0: 1 port detected kernel: hub 1-1:1.0: USB hub found kernel: usb 1-1: hub buffer at 65b36c0, status at 6639340 kernel: hub 1-1:1.0: 7 ports detected and the stack trace indeed goes away. IOW also the 2 root hub kmalloc() are now also aligned to the cache line size, even though these never triggered the stack trace. Strange: hub status is aligned far away from hub buffer, kmalloc mysteries. This still did not land for me: are we detecting a false alarm here as the 2 DMA operations can never happen on the same cache line on non-cache-coherent platforms? 
If so, shouldn't we fix up the dma debug code to not detect a false alarm? Instead of changing the alignment? Or, is this a bonafide warning (for non-cache-coherent platforms)? Then we should not silence it by force aligning it, but issue a WARN (on a cache coherent platform) that is more useful (i.e. here we have not an overlap but a shared cache line). On a non-cache coherent platform something stronger than a WARN might be appropriate? > Also note that to_cacheline_number() in kernel/dma/debug.c only takes > into account the L1_CACHE_SHIFT. On arm64 for example, cache_line_size() > returns the maximum line of all the cache levels (and we've seen > hardware where the L1 is 64-byte, L2 is 128). > >>> BUT: we're actually reduzing our dependency on ARCH_DMA_MINALIGN by >>> moving to bounce buffering unaligned memory for non-coherent >>> architectures, >> >> What's the reason for this? To allow the minimum allocation size to be >> smaller than the cache line size? Does the savings in memory make up >> for the extra overhead of bounce buffering? >> >> Or is this just to allow people to be more careless about how they >> allocate their DMA buffers (which doesn't seem to make sense)? > > It's the former and it does make a difference with lots of small > structure or string allocations. > > [...] >> I get the impression that you would really like to have two different >> versions of kmalloc() and friends: one for buffers that will be used in >> DMA (and hence require cache-line alignment) and one for buffers that >> won't be. > > We've been there for the past 2-3 years (and a few other options). It's > hard to guess in a generic way because the allocation place may not > necessarily know how the device is going to access that memory (PIO, > DMA). The conclusion was that for those cases where the buffer is small, > we just do a bounce. If it's performance critical, the driver can use a > kmem_cache_create(SLAB_HWCACHE_ALIGN) and avoid the bouncing. 
> ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Bug in add_dma_entry()'s debugging code
  2023-11-30 20:08 ` Ferry Toth
@ 2023-12-01 11:08 ` Catalin Marinas
  2023-12-01 12:17   ` Ferry Toth
  0 siblings, 1 reply; 16+ messages in thread
From: Catalin Marinas @ 2023-12-01 11:08 UTC (permalink / raw)
To: Ferry Toth
Cc: Alan Stern, Christoph Hellwig, Hamza Mahfooz, Dan Williams,
    Marek Szyprowski, Andrew, Ferry Toth, Andy Shevchenko,
    Thorsten Leemhuis, iommu, Kernel development list,
    USB mailing list, Robin Murphy, Will Deacon

On Thu, Nov 30, 2023 at 09:08:23PM +0100, Ferry Toth wrote:
> Op 28-11-2023 om 18:44 schreef Catalin Marinas:
> > Or just force the kmalloc() min align to cache_line_size(), something
> > like:
> >
> > diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
> > index 4a658de44ee9..3ece20367636 100644
> > --- a/include/linux/dma-mapping.h
> > +++ b/include/linux/dma-mapping.h
> > @@ -543,6 +543,8 @@ static inline int dma_get_cache_alignment(void)
> >  #ifdef ARCH_HAS_DMA_MINALIGN
> >  	return ARCH_DMA_MINALIGN;
> >  #endif
> > +	if (IS_ENABLED(CONFIG_DMA_API_DEBUG))
> > +		return cache_line_size();
> >  	return 1;
> >  }
> >  #endif
> > diff --git a/mm/slab_common.c b/mm/slab_common.c
> > index 8d431193c273..d0b21d6e9328 100644
> > --- a/mm/slab_common.c
> > +++ b/mm/slab_common.c
> > @@ -879,7 +879,7 @@ static unsigned int __kmalloc_minalign(void)
> >  	unsigned int minalign = dma_get_cache_alignment();
> >
> >  	if (IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) &&
> > -	    is_swiotlb_allocated())
> > +	    is_swiotlb_allocated() && !IS_ENABLED(CONFIG_DMA_API_DEBUG))
> >  		minalign = ARCH_KMALLOC_MINALIGN;
> >
> >  	return max(minalign, arch_slab_minalign());
>
> With the above suggestion "force the kmalloc() min align to
> cache_line_size()" + Alan's debug code:
>
> root@yuna:~# journalctl -k | grep hub
> kernel: usbcore: registered new interface driver hub
> kernel: hub 1-0:1.0: USB hub found
> kernel: usb usb1: hub buffer at 71c7180, status at 71c71c0
> kernel: hub 1-0:1.0: 1 port detected
> kernel: hub 2-0:1.0: USB hub found
> kernel: usb usb2: hub buffer at 71c79c0, status at 71c7a00
> kernel: hub 2-0:1.0: 1 port detected
> kernel: hub 1-1:1.0: USB hub found
> kernel: usb 1-1: hub buffer at 65b36c0, status at 6639340
> kernel: hub 1-1:1.0: 7 ports detected
>
> and the stack trace indeed goes away.
>
> IOW the two root hub kmalloc() buffers are now also aligned to the
> cache line size, even though these never triggered the stack trace.
> Strange: hub status is aligned far away from hub buffer, kmalloc
> mysteries.

They are 64 bytes apart in all cases other than the last one, where I
guess the status had to go to a different slab page.

> This still did not land for me: are we detecting a false alarm here
> because the two DMA operations can never happen on the same cache line
> on non-cache-coherent platforms? If so, shouldn't we fix up the DMA
> debug code to not detect a false alarm, instead of changing the
> alignment?

It's a false alarm indeed on this hardware since the DMA is
cache-coherent. I think Christoph mentioned earlier in this thread that
he'd like the DMA API debug code to report potential problems even if
the hardware it is running on is safe.

> Or is this a bona fide warning (for non-cache-coherent platforms)?
> Then we should not silence it by force-aligning, but issue a WARN (on
> a cache-coherent platform) that is more useful (i.e. here we have not
> an overlap but a shared cache line). On a non-cache-coherent platform
> something stronger than a WARN might be appropriate?

A non-cache-coherent platform would either have the kmalloc()
allocations aligned or it would bounce those small, unaligned buffers.
So it doesn't seem right to issue a warning on x86, where the kmalloc()
minimum alignment is 8, and not get the warning on a non-coherent
platform which forces the kmalloc() alignment.

If we consider the kmalloc() aspect already covered by bouncing or
forced alignment, the DMA API debug code can still detect other cache
line sharing situations.

-- 
Catalin

^ permalink raw reply	[flat|nested] 16+ messages in thread
* Re: Bug in add_dma_entry()'s debugging code
  2023-12-01 11:08 ` Catalin Marinas
@ 2023-12-01 12:17 ` Ferry Toth
  2023-12-01 16:21   ` Alan Stern
  0 siblings, 1 reply; 16+ messages in thread
From: Ferry Toth @ 2023-12-01 12:17 UTC (permalink / raw)
To: Catalin Marinas, Ferry Toth
Cc: Alan Stern, Christoph Hellwig, Hamza Mahfooz, Dan Williams,
    Marek Szyprowski, Andrew, Andy Shevchenko, Thorsten Leemhuis,
    iommu, Kernel development list, USB mailing list, Robin Murphy,
    Will Deacon

Hi

On 01-12-2023 12:08, Catalin Marinas wrote:
> On Thu, Nov 30, 2023 at 09:08:23PM +0100, Ferry Toth wrote:
>> Op 28-11-2023 om 18:44 schreef Catalin Marinas:
>>> Or just force the kmalloc() min align to cache_line_size(), something
>>> like:
>>>
>>> diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
>>> index 4a658de44ee9..3ece20367636 100644
>>> --- a/include/linux/dma-mapping.h
>>> +++ b/include/linux/dma-mapping.h
>>> @@ -543,6 +543,8 @@ static inline int dma_get_cache_alignment(void)
>>>  #ifdef ARCH_HAS_DMA_MINALIGN
>>>  	return ARCH_DMA_MINALIGN;
>>>  #endif
>>> +	if (IS_ENABLED(CONFIG_DMA_API_DEBUG))
>>> +		return cache_line_size();
>>>  	return 1;
>>>  }
>>>  #endif
>>> diff --git a/mm/slab_common.c b/mm/slab_common.c
>>> index 8d431193c273..d0b21d6e9328 100644
>>> --- a/mm/slab_common.c
>>> +++ b/mm/slab_common.c
>>> @@ -879,7 +879,7 @@ static unsigned int __kmalloc_minalign(void)
>>>  	unsigned int minalign = dma_get_cache_alignment();
>>>
>>>  	if (IS_ENABLED(CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC) &&
>>> -	    is_swiotlb_allocated())
>>> +	    is_swiotlb_allocated() && !IS_ENABLED(CONFIG_DMA_API_DEBUG))
>>>  		minalign = ARCH_KMALLOC_MINALIGN;
>>>
>>>  	return max(minalign, arch_slab_minalign());
>>
>> With the above suggestion "force the kmalloc() min align to
>> cache_line_size()" + Alan's debug code:
>>
>> root@yuna:~# journalctl -k | grep hub
>> kernel: usbcore: registered new interface driver hub
>> kernel: hub 1-0:1.0: USB hub found
>> kernel: usb usb1: hub buffer at 71c7180, status at 71c71c0
>> kernel: hub 1-0:1.0: 1 port detected
>> kernel: hub 2-0:1.0: USB hub found
>> kernel: usb usb2: hub buffer at 71c79c0, status at 71c7a00
>> kernel: hub 2-0:1.0: 1 port detected
>> kernel: hub 1-1:1.0: USB hub found
>> kernel: usb 1-1: hub buffer at 65b36c0, status at 6639340
>> kernel: hub 1-1:1.0: 7 ports detected
>>
>> and the stack trace indeed goes away.
>>
>> IOW the two root hub kmalloc() buffers are now also aligned to the
>> cache line size, even though these never triggered the stack trace.
>> Strange: hub status is aligned far away from hub buffer, kmalloc
>> mysteries.
> They are 64 bytes apart in all cases other than the last one, where I
> guess the status had to go to a different slab page.
>
>> This still did not land for me: are we detecting a false alarm here
>> because the two DMA operations can never happen on the same cache
>> line on non-cache-coherent platforms? If so, shouldn't we fix up the
>> DMA debug code to not detect a false alarm, instead of changing the
>> alignment?
> It's a false alarm indeed on this hardware since the DMA is
> cache-coherent. I think Christoph mentioned earlier in this thread
> that he'd like the DMA API debug code to report potential problems
> even if the hardware it is running on is safe.

I agree with Christoph. So my question is "are we detecting a false
alarm", full stop, not merely "a false alarm on this hardware".

>> Or is this a bona fide warning (for non-cache-coherent platforms)?
>> Then we should not silence it by force-aligning, but issue a WARN (on
>> a cache-coherent platform) that is more useful (i.e. here we have not
>> an overlap but a shared cache line). On a non-cache-coherent platform
>> something stronger than a WARN might be appropriate?
> A non-cache-coherent platform would either have the kmalloc()
> allocations aligned or it would bounce those small, unaligned buffers.

It would? Or it always has?

> So it doesn't seem right to issue a warning on x86, where the kmalloc()
> minimum alignment is 8, and not get the warning on a non-coherent
> platform which forces the kmalloc() alignment.

If *all* non-coherent platforms implement either correct alignment or
bounce buffers, and on (coherent) x86 we get a WARN, then it is a false
alarm, right? That is exactly my question (because I have no idea which
platforms have non-coherent caches).

> If we consider the kmalloc() aspect already covered by bouncing or
> forced alignment, the DMA API debug code can still detect other cache
> line sharing situations.

Consider? Or know?

If we know, and we cannot easily prevent false WARNs on x86, it could
be sufficient to change the message to "possible cacheline sharing
detected" and add a line saying that DMA_API_DEBUG should be disabled
on production systems.

And while at it, we might be able to fix the missed real cacheline
sharing mentioned by Alan:

"As a separate issue, active_cacheline_insert() fails to detect overlap
in situations where a mapping occupies more than one cache line. For
example, if mapping A uses lines N and N+1 and mapping B uses line N+1,
no overlap will be detected because the radix-tree keys for A and B
will be different (N vs. N+1)."

^ permalink raw reply	[flat|nested] 16+ messages in thread
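The detection gap quoted above — a mapping spanning lines N and N+1 keyed only by N — can be modeled in a few lines of userspace code (a hypothetical Python sketch of the radix-tree keying, not the kernel's actual code; the 64-byte cacheline size and all helper names are assumptions):

```python
CACHELINE = 64  # assumed cacheline size, purely illustrative

def start_line(addr):
    """Cacheline index of a mapping's start address (the radix-tree key)."""
    return addr // CACHELINE

active = {}  # start cacheline -> mapping, mimicking the radix tree

def insert(name, addr, size):
    """Return True if no overlap is reported, mimicking keying by start line."""
    key = start_line(addr)
    if key in active:
        return False        # overlap reported
    active[key] = (name, addr, size)
    return True             # no overlap reported

# Mapping A covers lines 0 and 1; mapping B covers line 1 only.
assert insert("A", 0, 2 * CACHELINE)       # keyed under line 0
assert insert("B", CACHELINE, CACHELINE)   # keyed under line 1: real overlap, missed

# A line-granular check over every line a mapping touches would catch it:
def lines(addr, size):
    return set(range(start_line(addr), start_line(addr + size - 1) + 1))

assert lines(0, 2 * CACHELINE) & lines(CACHELINE, CACHELINE)  # overlap on line 1
```

Keying by the start line only is why mapping B sails through; tracking every line a mapping occupies (as `lines()` does) is one way the debug code could close the gap.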
* Re: Bug in add_dma_entry()'s debugging code
  2023-12-01 12:17 ` Ferry Toth
@ 2023-12-01 16:21 ` Alan Stern
  2023-12-01 17:42   ` Catalin Marinas
  0 siblings, 1 reply; 16+ messages in thread
From: Alan Stern @ 2023-12-01 16:21 UTC (permalink / raw)
To: Ferry Toth
Cc: Catalin Marinas, Ferry Toth, Christoph Hellwig, Hamza Mahfooz,
    Dan Williams, Marek Szyprowski, Andrew, Andy Shevchenko,
    Thorsten Leemhuis, iommu, Kernel development list,
    USB mailing list, Robin Murphy, Will Deacon

On Fri, Dec 01, 2023 at 01:17:43PM +0100, Ferry Toth wrote:
> > A non-cache-coherent platform would either have the kmalloc()
> > allocations aligned or it would bounce those small, unaligned
> > buffers.
>
> It would? Or it always has?
>
> > So it doesn't seem right to issue a warning on x86, where the
> > kmalloc() minimum alignment is 8, and not get the warning on a
> > non-coherent platform which forces the kmalloc() alignment.
>
> If *all* non-coherent platforms implement either correct alignment or
> bounce buffers, and on (coherent) x86 we get a WARN, then it is a
> false alarm, right?
>
> That is exactly my question (because I have no idea which platforms
> have non-coherent caches).

Don't forget, not all DMA buffers are allocated by kmalloc(). A buffer
allocated by some other means might not be aligned properly and might
share a cache line with another buffer.

Or you might have a single data structure that was allocated by
kmalloc() and then create separate DMA mappings for two members of that
structure. If the two members are in the same cache line, that would be
an error.

Alan Stern

> > If we consider the kmalloc() aspect already covered by bouncing or
> > forced alignment, the DMA API debug code can still detect other
> > cache line sharing situations.
>
> Consider? Or know?
>
> If we know, and we cannot easily prevent false WARNs on x86, it could
> be sufficient to change the message to "possible cacheline sharing
> detected" and add a line saying that DMA_API_DEBUG should be disabled
> on production systems.
>
> And while at it, we might be able to fix the missed real cacheline
> sharing mentioned by Alan:
>
> "As a separate issue, active_cacheline_insert() fails to detect
> overlap in situations where a mapping occupies more than one cache
> line. For example, if mapping A uses lines N and N+1 and mapping B
> uses line N+1, no overlap will be detected because the radix-tree keys
> for A and B will be different (N vs. N+1)."

^ permalink raw reply	[flat|nested] 16+ messages in thread
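Alan's struct-member scenario — two separately mapped members of one kmalloc() allocation that never share a byte but do share a cache line — can be illustrated with a toy model (hypothetical sketch; the 64-byte cacheline and the member offsets are assumptions, not taken from any real driver):

```python
CACHELINE = 64  # assumed cacheline size, purely illustrative

def lines(addr, size):
    """Set of cacheline indices touched by the range [addr, addr + size)."""
    return set(range(addr // CACHELINE, (addr + size - 1) // CACHELINE + 1))

def bytes_overlap(a, b):
    """True if the (addr, size) ranges a and b share at least one byte."""
    return a[0] < b[0] + b[1] and b[0] < a[0] + a[1]

# Hypothetical structure: an 8-byte 'status' member at offset 0 and a
# 16-byte 'data' member at offset 8, each given its own DMA mapping.
status = (0, 8)
data = (8, 16)

assert not bytes_overlap(status, data)   # byte-disjoint, so no byte-level error...
assert lines(*status) & lines(*data)     # ...yet both sit in cacheline 0
```

This is the case where a byte-granularity check would stay silent while the mapping is still an error on non-coherent hardware: a writeback of the dirty line holding `status` can clobber DMA-written `data`, or vice versa.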
* Re: Bug in add_dma_entry()'s debugging code
  2023-12-01 16:21 ` Alan Stern
@ 2023-12-01 17:42 ` Catalin Marinas
  2023-12-05 18:28   ` Alan Stern
  0 siblings, 1 reply; 16+ messages in thread
From: Catalin Marinas @ 2023-12-01 17:42 UTC (permalink / raw)
To: Alan Stern
Cc: Ferry Toth, Ferry Toth, Christoph Hellwig, Hamza Mahfooz,
    Dan Williams, Marek Szyprowski, Andrew, Andy Shevchenko,
    Thorsten Leemhuis, iommu, Kernel development list,
    USB mailing list, Robin Murphy, Will Deacon

Replying to both here.

On Fri, Dec 01, 2023 at 11:21:40AM -0500, Alan Stern wrote:
> On Fri, Dec 01, 2023 at 01:17:43PM +0100, Ferry Toth wrote:
> > > A non-cache-coherent platform would either have the kmalloc()
> > > allocations aligned or it would bounce those small, unaligned
> > > buffers.
> >
> > It would? Or it always has?

It depends on the configuration. arm64 does the bouncing because it
opted in to CONFIG_DMA_BOUNCE_UNALIGNED_KMALLOC.

> > > So it doesn't seem right to issue a warning on x86, where the
> > > kmalloc() minimum alignment is 8, and not get the warning on a
> > > non-coherent platform which forces the kmalloc() alignment.
> >
> > If *all* non-coherent platforms implement either correct alignment
> > or bounce buffers, and on (coherent) x86 we get a WARN, then it is a
> > false alarm, right?
> >
> > That is exactly my question (because I have no idea which platforms
> > have non-coherent caches).
>
> Don't forget, not all DMA buffers are allocated by kmalloc(). A buffer
> allocated by some other means might not be aligned properly and might
> share a cache line with another buffer.
>
> Or you might have a single data structure that was allocated by
> kmalloc() and then create separate DMA mappings for two members of
> that structure. If the two members are in the same cache line, that
> would be an error.

Indeed, so to be sure we don't trip over other false alarms, we'd also
need to force ARCH_DMA_MINALIGN to be at least a cache line in size.
That macro is used in some structures to force a static alignment of
various members that take DMA transfers. After that, anything reported
might actually be a potential issue, not a false alarm.

However, I wonder whether we'd actually hide some genuine problems.
Let's say x86 gets some DMA corruption when it tries to DMA 8 bytes
into two adjacent buffers because of some DMA buffer overflow, nothing
to do with cache lines. You enable the DMA API debugging to see if it
reports anything and it suddenly starts working because of the forced
minimum alignment.

-- 
Catalin

^ permalink raw reply	[flat|nested] 16+ messages in thread
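Catalin's concern — that forcing a larger alignment under debug can make a real overflow bug disappear — can be sketched with illustrative numbers (a hypothetical model of back-to-back allocation placement; this is not how the slab allocator actually lays out memory, and all names here are made up):

```python
def alloc_layout(sizes, align):
    """Start offsets for back-to-back allocations rounded up to 'align'."""
    offs, cur = [], 0
    for s in sizes:
        cur = (cur + align - 1) // align * align  # round up to alignment
        offs.append(cur)
        cur += s
    return offs

def overflow_corrupts_next(sizes, align, overflow=8):
    """Does an 'overflow'-byte overrun of the first buffer reach the second?"""
    a, b = alloc_layout(sizes, align)
    return a + sizes[0] + overflow > b

sizes = (8, 8)
assert overflow_corrupts_next(sizes, align=8)        # bug visible at minalign 8
assert not overflow_corrupts_next(sizes, align=64)   # hidden at cacheline align
```

With the default 8-byte minimum alignment the 8-byte overrun lands in the neighbouring buffer; once the debug option forces cacheline alignment the buffers move 64 bytes apart and the corruption — the very thing being debugged — silently stops reproducing.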
* Re: Bug in add_dma_entry()'s debugging code
  2023-12-01 17:42 ` Catalin Marinas
@ 2023-12-05 18:28 ` Alan Stern
  [not found]   ` <1e4df825-08fa-40cf-a565-9c0d285c9b73@gmail.com>
  0 siblings, 1 reply; 16+ messages in thread
From: Alan Stern @ 2023-12-05 18:28 UTC (permalink / raw)
To: Catalin Marinas
Cc: Ferry Toth, Ferry Toth, Christoph Hellwig, Hamza Mahfooz,
    Dan Williams, Marek Szyprowski, Andrew, Andy Shevchenko,
    Thorsten Leemhuis, iommu, Kernel development list,
    USB mailing list, Robin Murphy, Will Deacon

On Fri, Dec 01, 2023 at 05:42:52PM +0000, Catalin Marinas wrote:
> Indeed, so to be sure we don't trip over other false alarms, we'd also
> need to force ARCH_DMA_MINALIGN to be at least a cache-line size.
> That's used in some structures to force a static alignment of various
> members that take DMA transfers. After that, anything reported might
> actually be a potential issue, not a false alarm.
>
> However, I wonder whether we'd actually hide some genuine problems.
> Let's say x86 gets some DMA corruption when it tries to DMA 8 bytes
> into two adjacent buffers because of some DMA buffer overflow, nothing
> to do with cache lines. You enable the DMA API debugging to see if it
> reports anything and it suddenly starts working because of the forced
> minimum alignment.

In the long run, it may turn out that trying to detect memory usage
patterns that could cause problems for architectures other than the one
currently running is not workable. Certainly it is a bad idea to have a
debugging infrastructure that changes the behavior of other parts of
the system -- particularly when those other parts may be the ones
you're trying to debug.

You may end up needing to make a choice here. Which evil is lesser?

Alan Stern

^ permalink raw reply	[flat|nested] 16+ messages in thread
[parent not found: <1e4df825-08fa-40cf-a565-9c0d285c9b73@gmail.com>]
* Re: Bug in add_dma_entry()'s debugging code
  [not found] ` <1e4df825-08fa-40cf-a565-9c0d285c9b73@gmail.com>
@ 2023-12-06 16:21 ` Alan Stern
  0 siblings, 0 replies; 16+ messages in thread
From: Alan Stern @ 2023-12-06 16:21 UTC (permalink / raw)
To: Ferry Toth
Cc: Catalin Marinas, Christoph Hellwig, Hamza Mahfooz, Dan Williams,
    Marek Szyprowski, Andrew, Andy Shevchenko, Thorsten Leemhuis,
    iommu, Kernel development list, USB mailing list, Robin Murphy,
    Will Deacon

On Wed, Dec 06, 2023 at 05:12:40PM +0100, Ferry Toth wrote:
> Hi
>
> On 05-12-2023 19:28, Alan Stern wrote:
> > In the long run, it may turn out that trying to detect memory usage
> > patterns that could cause problems for architectures other than the
> > one currently running is not workable. Certainly it is a bad idea to
> > have a
>
> Maybe. But while debugging code on your platform it is a good thing to
> get warnings for potential issues on other platforms.

Oh, absolutely! It's definitely a good thing. I'm just saying that
doing it may not be practical -- there may be too many false positives
(as in this bug report) and false negatives.

> In this case we (I) got misled by the warning message itself. That
> should be easy enough to improve.

I don't think so. Issuing incorrect warnings that should be ignored is
a very bad idea, no matter what the wording is. After seeing a few
messages like that, people learn to expect them -- and then they ignore
the valid warnings along with the incorrect ones.

Alan Stern

^ permalink raw reply	[flat|nested] 16+ messages in thread
end of thread, other threads:[~2023-12-06 16:21 UTC | newest]
Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-11-27 16:02 Bug in add_dma_entry()'s debugging code Alan Stern
2023-11-27 16:07 ` Christoph Hellwig
2023-11-27 16:51 ` Alan Stern
2023-11-28 13:37 ` Christoph Hellwig
2023-11-28 15:18 ` Alan Stern
2023-11-28 16:31 ` Robin Murphy
2023-11-28 16:34 ` Andy Shevchenko
2023-11-28 16:54 ` Robin Murphy
2023-11-28 17:44 ` Catalin Marinas
2023-11-30 20:08 ` Ferry Toth
2023-12-01 11:08 ` Catalin Marinas
2023-12-01 12:17 ` Ferry Toth
2023-12-01 16:21 ` Alan Stern
2023-12-01 17:42 ` Catalin Marinas
2023-12-05 18:28 ` Alan Stern
[not found] ` <1e4df825-08fa-40cf-a565-9c0d285c9b73@gmail.com>
2023-12-06 16:21 ` Alan Stern
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox