All of lore.kernel.org
 help / color / mirror / Atom feed
From: Alexander Lobakin <aleksander.lobakin@intel.com>
To: Marek Szyprowski <m.szyprowski@samsung.com>,
	Christoph Hellwig <hch@lst.de>
Cc: Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>,
	Robin Murphy <robin.murphy@arm.com>,
	Joerg Roedel <joro@8bytes.org>, "Will Deacon" <will@kernel.org>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	"Magnus Karlsson" <magnus.karlsson@intel.com>,
	<nex.sw.ncis.osdt.itp.upstreaming@intel.com>,
	<bpf@vger.kernel.org>, <netdev@vger.kernel.org>,
	<iommu@lists.linux.dev>, <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v6 2/7] dma: avoid redundant calls for sync operations
Date: Thu, 9 May 2024 13:59:41 +0200	[thread overview]
Message-ID: <3dce41a3-e5a9-43e7-b918-ecb8d688ea1c@intel.com> (raw)
In-Reply-To: <b4632761-3ec6-4070-a60e-b74c1bfdd579@intel.com>

From: Alexander Lobakin <aleksander.lobakin@intel.com>
Date: Thu, 9 May 2024 13:44:37 +0200

> From: Marek Szyprowski <m.szyprowski@samsung.com>
> Date: Thu, 9 May 2024 13:41:16 +0200
> 
>> Dear All,
>>
>> On 07.05.2024 13:20, Alexander Lobakin wrote:
>>> Quite often, devices do not need dma_sync operations on x86_64 at least.
>>> Indeed, when dev_is_dma_coherent(dev) is true and
>>> dev_use_swiotlb(dev) is false, iommu_dma_sync_single_for_cpu()
>>> and friends do nothing.
>>>
>>> However, indirectly calling them when CONFIG_RETPOLINE=y consumes about
>>> 10% of cycles on a cpu receiving packets from softirq at ~100Gbit rate.
>>> Even if/when CONFIG_RETPOLINE is not set, there is a cost of about 3%.
>>>
>>> Add dev->need_dma_sync boolean and turn it off during the device
>>> initialization (dma_set_mask()) depending on the setup:
>>> dev_is_dma_coherent() for the direct DMA, !(sync_single_for_device ||
>>> sync_single_for_cpu) or the new dma_map_ops flag, %DMA_F_CAN_SKIP_SYNC,
>>> advertised for non-NULL DMA ops.
>>> Then later, if/when swiotlb is used for the first time, the flag
>>> is reset back to on, from swiotlb_tbl_map_single().
>>>
>>> On iavf, the UDP trafficgen with XDP_DROP in skb mode test shows
>>> +3-5% increase for direct DMA.
>>>
>>> Suggested-by: Christoph Hellwig <hch@lst.de> # direct DMA shortcut
>>> Co-developed-by: Eric Dumazet <edumazet@google.com>
>>> Signed-off-by: Eric Dumazet <edumazet@google.com>
>>> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
>>> ---
>>>   include/linux/device.h      |  4 +++
>>>   include/linux/dma-map-ops.h | 12 ++++++++
>>>   include/linux/dma-mapping.h | 53 +++++++++++++++++++++++++++++++----
>>>   kernel/dma/mapping.c        | 55 +++++++++++++++++++++++++++++--------
>>>   kernel/dma/swiotlb.c        |  6 ++++
>>>   5 files changed, 113 insertions(+), 17 deletions(-)
>>
>>
>> This patch landed in today's linux-next as commit f406c8e4b770 ("dma: 
>> avoid redundant calls for sync operations"). Unfortunately I found that 
>> it breaks some of the ARM 32bit boards by forcing skipping DMA sync 
>> operations on non-coherent systems. This happens because this patch 
>> hooks dma_need_sync=true initialization into set_dma_mask(), but 
>> set_dma_mask() is not called from all device drivers, especially from 
>> those which operates properly with the default 32bit dma mask (like most 
>> of the platform devices created by the OF layer).
>>
>> Frankly speaking I have no idea how this should be fixed. I expect that 
>> there are lots of broken devices after this change, because I don't 
>> remember that calling set_dma_mask() is mandatory for device drivers.
>>
>> After adding dma_set_mask(dev, DMA_BIT_MASK(32)) to the drivers relevant 
>> for my boards the issues are gone, but I'm not sure this is the right 
>> approach...
> 
> If I remember correctly, *all* device drivers which use DMA *must* call
> dma_set_*mask() on probe. That's why we added it there and didn't care.
> Alternatively, if it really breaks a lot of drivers, we can set
> dma_need_sync = true by default before the driver probing. I thought of

Or invert the flag, so that false would mean "it needs sync" and it
would be the default if dma_*mask*() wasn't called.

Chris, what do you think?

> this, but the correct approach would be to call dma_set_*mask() from the
> respective drivers.
> 
>>
>>
>>> ...
>>
>> Best regards

Thanks,
Olek

  reply	other threads:[~2024-05-09 12:01 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-07 11:20 [PATCH v6 0/7] dma: skip calling no-op sync ops when possible Alexander Lobakin
2024-05-07 11:20 ` [PATCH v6 1/7] dma: compile-out DMA sync op calls when not used Alexander Lobakin
2024-05-07 11:20 ` [PATCH v6 2/7] dma: avoid redundant calls for sync operations Alexander Lobakin
2024-05-09 11:41   ` Marek Szyprowski
2024-05-09 11:44     ` Alexander Lobakin
2024-05-09 11:59       ` Alexander Lobakin [this message]
2024-05-09 12:02         ` Christoph Hellwig
2024-05-09 12:01       ` Christoph Hellwig
2024-05-09 13:43   ` Steven Price
2024-05-09 13:49     ` Christoph Hellwig
2024-05-09 14:33     ` Robin Murphy
2024-05-09 14:43       ` Alexander Lobakin
2024-05-07 11:20 ` [PATCH v6 3/7] iommu/dma: avoid expensive indirect " Alexander Lobakin
2024-05-07 11:20 ` [PATCH v6 4/7] page_pool: make sure frag API fields don't span between cachelines Alexander Lobakin
2024-05-07 11:20 ` [PATCH v6 5/7] page_pool: don't use driver-set flags field directly Alexander Lobakin
2024-05-07 11:20 ` [PATCH v6 6/7] page_pool: check for DMA sync shortcut earlier Alexander Lobakin
2024-05-07 11:20 ` [PATCH v6 7/7] xsk: use generic DMA sync shortcut instead of a custom one Alexander Lobakin
2024-05-07 11:30 ` [PATCH v6 0/7] dma: skip calling no-op sync ops when possible Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3dce41a3-e5a9-43e7-b918-ecb8d688ea1c@intel.com \
    --to=aleksander.lobakin@intel.com \
    --cc=bpf@vger.kernel.org \
    --cc=edumazet@google.com \
    --cc=hch@lst.de \
    --cc=iommu@lists.linux.dev \
    --cc=joro@8bytes.org \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=m.szyprowski@samsung.com \
    --cc=magnus.karlsson@intel.com \
    --cc=netdev@vger.kernel.org \
    --cc=nex.sw.ncis.osdt.itp.upstreaming@intel.com \
    --cc=rafael@kernel.org \
    --cc=robin.murphy@arm.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.