From: Ming Lei <ming.lei@redhat.com>
To: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>,
	linux-block@vger.kernel.org, Keith Busch <kbusch@kernel.org>,
	Jens Axboe <axboe@kernel.dk>, Robin Murphy <robin.murphy@arm.com>
Subject: Re: [Regression] b1a000d3b8ec ("block: relax direct io memory alignment")
Date: Wed, 23 Oct 2024 08:50:17 +0800
Message-ID: <ZxhISRqy8q2Cp8f8@fedora>
In-Reply-To: <Zxd9XyqqA604F1Rn@arm.com>

On Tue, Oct 22, 2024 at 11:24:31AM +0100, Catalin Marinas wrote:
> On Wed, Oct 16, 2024 at 04:31:45PM +0800, Ming Lei wrote:
> > On Wed, Oct 16, 2024 at 10:04:19AM +0200, Christoph Hellwig wrote:
> > > On Wed, Oct 16, 2024 at 12:40:13AM +0800, Ming Lei wrote:
> > > > Turns out host controller's DMA alignment is often too relaxed, so two DMA
> > > > buffers may easily share the same cache line, and trigger the warning of
> > > > "cacheline tracking EEXIST, overlapping mappings aren't supported".
> > > > 
> > > > The attached test code can trigger the warning immediately with CONFIG_DMA_API_DEBUG
> > > > enabled when reading from a SCSI disk whose queue DMA alignment is 3.
> > > 
> > > We should not allow smaller than cache line alignment on architectures
> > > that are not cache coherent indeed.
> 
> Even on architectures that are not fully coherent, the coherency is a
> property of the device. You may need to somehow pass this information in
> struct queue_limits if you want it to be optimal.

Yeah, it looks like the issue has to be fixed on the driver side, since only
the driver has the 'struct device' info.
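
To make the driver-side idea concrete, here is a minimal user-space sketch of
the decision a driver could make before filling in its queue limits. It is not
a patch and not what this thread settled on. `struct model_device` and
`model_dma_alignment_mask()` are hypothetical names; the two fields only stand
in for what dev_is_dma_coherent() and dma_get_cache_alignment() would report
in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-device facts that only a driver knows. */
struct model_device {
	bool dma_coherent;		/* models dev_is_dma_coherent() */
	unsigned int cache_align;	/* models dma_get_cache_alignment(), power of two */
};

/*
 * Return the DMA alignment mask to use for the queue.
 * A coherent device keeps the mask the controller asked for; a non-coherent
 * device gets it raised to at least (cache_align - 1), so that no two
 * buffers accepted by the queue start in the same cache line.
 */
static unsigned int model_dma_alignment_mask(const struct model_device *dev,
					     unsigned int requested_mask)
{
	unsigned int min_mask;

	/* The mask arithmetic below only works for a power-of-two alignment. */
	assert(dev->cache_align != 0 &&
	       (dev->cache_align & (dev->cache_align - 1)) == 0);

	if (dev->dma_coherent)
		return requested_mask;

	min_mask = dev->cache_align - 1;
	return requested_mask < min_mask ? min_mask : requested_mask;
}
```

With a requested mask of 3 and a 128-byte cache alignment, the coherent case
keeps 3 while the non-coherent case is raised to 127.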

> 
> > Yes, something like the following change:
> > 
> > diff --git a/block/blk-settings.c b/block/blk-settings.c
> > index a446654ddee5..26bd0e72c68e 100644
> > --- a/block/blk-settings.c
> > +++ b/block/blk-settings.c
> > @@ -348,7 +348,9 @@ static int blk_validate_limits(struct queue_limits *lim)
> >  	 */
> >  	if (!lim->dma_alignment)
> >  		lim->dma_alignment = SECTOR_SIZE - 1;
> > -	if (WARN_ON_ONCE(lim->dma_alignment > PAGE_SIZE))
> > +	else if (lim->dma_alignment < L1_CACHE_BYTES - 1)
> > +		lim->dma_alignment = L1_CACHE_BYTES - 1;
> > +	else if (WARN_ON_ONCE(lim->dma_alignment > PAGE_SIZE))
> >  		return -EINVAL;
> 
> L1_CACHE_BYTES is not the right check here since a level 2/3 cache may
> have a larger cache line than level 1 (and we have such configurations
> on arm64 where ARCH_DMA_MINALIGN is 128 and L1_CACHE_BYTES is 64). Use
> dma_get_cache_alignment() instead. On fully coherent architectures like
> x86 it should return 1.
> 
> That said, the DMA debug code also uses the static L1_CACHE_SHIFT and it
> will trigger the warning anyway. Some discussion around the DMA API
> debug came up during the small ARCH_KMALLOC_MINALIGN changes (I don't
> remember whether it was in private with Robin or on the list). Now kmalloc() can
> return a small buffer (less than a cache line) that won't be bounced if
> the device is coherent (see dma_kmalloc_safe()) but the DMA API debug
> code only checks for direction == DMA_TO_DEVICE, not
> dev_is_dma_coherent(). For arm64 I did not want to disable small
> ARCH_KMALLOC_MINALIGN if CONFIG_DMA_API_DEBUG is enabled as this would
> skew the testing by forcing all allocations to be ARCH_DMA_MINALIGN
> aligned.
> 
> Maybe I'm missing something in those checks but I'm surprised that the
> DMA API debug code doesn't complain about small kmalloc() buffers on x86
> (which never had any bouncing for this specific case since it's fully
> coherent). I suspect people just don't enable DMA debugging on x86 for
> such devices (typically USB drivers have this issue).

I have seen reports of the "cacheline tracking EEXIST, overlapping mappings
aren't supported" warning on USB several times, and they are often treated as
the same issue as this one.
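
For reference, the arithmetic behind that warning can be sketched in user
space as below. This only illustrates when two buffers touch a common cache
line; it is not the dma-debug implementation, and `buffers_share_cache_line()`
is a hypothetical helper. The `line_shift` parameter plays the role of
L1_CACHE_SHIFT, and a shift of 6 (64-byte lines) is an assumption used in the
example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true if buffers [a, a + a_len) and [b, b + b_len) touch at least
 * one common cache line of size (1 << line_shift) bytes.
 */
static bool buffers_share_cache_line(uintptr_t a, size_t a_len,
				     uintptr_t b, size_t b_len,
				     unsigned int line_shift)
{
	uintptr_t a_first, a_last, b_first, b_last;

	/* Zero-length buffers do not occupy any cache line. */
	assert(a_len > 0 && b_len > 0);

	/* Index of the first and last cache line each buffer touches. */
	a_first = a >> line_shift;
	a_last = (a + a_len - 1) >> line_shift;
	b_first = b >> line_shift;
	b_last = (b + b_len - 1) >> line_shift;

	/* Two closed index ranges intersect iff each starts before the other ends. */
	return a_first <= b_last && b_first <= a_last;
}
```

With 64-byte lines, two 4-byte buffers at 0x1000 and 0x1004 share a line,
which is what a queue DMA alignment of 3 permits, while two 64-byte buffers at
0x1000 and 0x1040 do not.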

> 
> So maybe the DMA API debug should have two modes: a generic one that
> catches alignments irrespective of the coherency of the device and
> another that's specific to the device/architecture coherency properties.
> The former, if enabled, should also force a higher minimum kmalloc()
> alignment and a dma_get_cache_alignment() > 1.

Or the DMA debug log needs to be improved to show that the warning is just a
hint.


Thanks,
Ming


