Linux IOMMU Development
From: John Garry via iommu <iommu@lists.linux-foundation.org>
To: Robin Murphy <robin.murphy@arm.com>, <will@kernel.org>,
	<joro@8bytes.org>
Cc: iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org,
	hch@lst.de
Subject: Re: [PATCH v2] iommu/dma: Add config for PCI SAC address trick
Date: Thu, 9 Jun 2022 19:09:53 +0100
Message-ID: <b46fd053-aaee-a384-0e5a-e7a5a011c71a@huawei.com>
In-Reply-To: <3f06994f9f370f9d35b2630ab75171ecd2065621.1654782107.git.robin.murphy@arm.com>

On 09/06/2022 16:12, Robin Murphy wrote:
> For devices stuck behind a conventional PCI bus, saving extra cycles at
> 33MHz is probably fairly significant. However since native PCI Express
> is now the norm for high-performance devices, the optimisation to always
> prefer 32-bit addresses for the sake of avoiding DAC is starting to look
> rather anachronistic. Technically 32-bit addresses do have shorter TLPs
> on PCIe, but unless the device is saturating its link bandwidth with
> small transfers it seems unlikely that the difference is appreciable.
> 
> What definitely is appreciable, however, is that the IOVA allocator
> doesn't behave all that well once the 32-bit space starts getting full.
> As DMA working sets get bigger, this optimisation increasingly backfires
> and adds considerable overhead to the dma_map path for use-cases like
> high-bandwidth networking. We've increasingly bandaged the allocator
> in attempts to mitigate this, but it remains fundamentally at odds with
> other valid requirements to try as hard as possible to satisfy a request
> within the given limit; what we really need is to just avoid this odd
> notion of a speculative allocation when it isn't beneficial anyway.
> 
> Unfortunately that's where things get awkward... Having been present on
> x86 for 15 years or so now, it turns out there are systems which fail to
> properly define the upper limit of usable IOVA space for certain devices
> and this trick was the only thing letting them work OK. I had a similar
> ulterior motive for a couple of early arm64 systems when originally
> adding it to iommu-dma, but those really should be fixed with proper
> firmware bindings by now. Let's be brave and default it to off in the
> hope that CI systems and developers will find and fix those bugs, but
> expect that desktop-focused distro configs are likely to want to turn
> it back on for maximum compatibility.
> 
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>

FWIW,
Reviewed-by: John Garry <john.garry@huawei.com>

If we're not enabling this by default for x86, then doesn't Joerg have
some XHCI issue which we would now need to quirk? I don't remember
exactly which device. Alternatively, we could simply ask him to enable
this new config.
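
Going by the Kconfig help text in the patch, either of the following
should bring the old behaviour back. This is only a sketch of the two
knobs the patch itself names, not something tested here:

```
# Build time, in the kernel .config:
CONFIG_IOMMU_DMA_PCI_SAC=y

# Or at boot time, on the kernel command line:
iommu.forcedac=0
```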


> ---
> 
> v2: Tweak wording to clarify that it's not really an optimisation in
>      general, remove "default X86".
> 
>   drivers/iommu/Kconfig     | 26 ++++++++++++++++++++++++++
>   drivers/iommu/dma-iommu.c |  2 +-
>   2 files changed, 27 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index c79a0df090c0..5a225b48dd00 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -144,6 +144,32 @@ config IOMMU_DMA
>   	select IRQ_MSI_IOMMU
>   	select NEED_SG_DMA_LENGTH
>   
> +config IOMMU_DMA_PCI_SAC
> +	bool "Enable 64-bit legacy PCI optimisation by default"
> +	depends on IOMMU_DMA
> +	help
> +	  Enable by default an IOMMU optimisation for 64-bit legacy PCI devices,
> +	  wherein the DMA API layer will always first try to allocate a 32-bit
> +	  DMA address suitable for a single address cycle, before falling back
> +	  to allocating from the device's full usable address range. If your
> +	  system has 64-bit legacy PCI devices in 32-bit slots where using dual
> +	  address cycles reduces DMA throughput significantly, this may be
> +	  beneficial to overall performance.
> +
> +	  If you have a modern PCI Express based system, this feature mostly just
> +	  represents extra overhead in the allocation path for no practical
> +	  benefit, and it should usually be preferable to say "n" here.
> +
> +	  However, beware that this feature has also historically papered over
> +	  bugs where the IOMMU address width and/or device DMA mask is not set
> +	  correctly. If device DMA problems and IOMMU faults start occurring
> +	  after disabling this option, it is almost certainly indicative of a
> +	  latent driver or firmware/BIOS bug, which would previously have only
> +	  manifested with several gigabytes worth of concurrent DMA mappings.
> +
> +	  If this option is not set, the feature can still be re-enabled at
> +	  boot time with the "iommu.forcedac=0" command-line argument.
> +
>   # Shared Virtual Addressing
>   config IOMMU_SVA
>   	bool
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index f90251572a5d..9f9d9ba7f376 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -67,7 +67,7 @@ struct iommu_dma_cookie {
>   };
>   
>   static DEFINE_STATIC_KEY_FALSE(iommu_deferred_attach_enabled);
> -bool iommu_dma_forcedac __read_mostly;
> +bool iommu_dma_forcedac __read_mostly = !IS_ENABLED(CONFIG_IOMMU_DMA_PCI_SAC);
>   
>   static int __init iommu_dma_forcedac_setup(char *str)
>   {
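
The behaviour described in the help text (try a 32-bit address first,
then fall back to the device's full usable range) can be sketched in
plain userspace C. This is a toy model under stated assumptions, not the
kernel code: toy_domain, toy_alloc and sketch_alloc_iova are made-up
names, the allocator is a trivial top-down bump allocator, and the
forcedac flag is passed in as a parameter instead of being the global
that the patch initialises.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MASK_32BIT 0xffffffffULL
#define FOUR_GIB   0x100000000ULL

/* Hypothetical toy address space: two regions, each handed out top-down. */
struct toy_domain {
	uint64_t low_top;	/* exclusive top of free space below 4 GiB */
	uint64_t high_top;	/* exclusive top of free space above 4 GiB */
};

/*
 * Hypothetical toy allocator: returns an address whose last byte is at or
 * below 'limit', or 0 on failure. Prefers the highest region under the limit.
 */
static uint64_t toy_alloc(struct toy_domain *d, uint64_t size, uint64_t limit)
{
	assert(size > 0);
	assert(limit >= MASK_32BIT);	/* the toy only models limits of 32 bits or more */

	/* Space above 4 GiB is only eligible if the limit allows it. */
	if (limit > MASK_32BIT && d->high_top >= size + FOUR_GIB &&
	    d->high_top - 1 <= limit) {
		d->high_top -= size;
		return d->high_top;
	}

	/* Otherwise use space below 4 GiB; address 0 is reserved to mean failure. */
	if (d->low_top > size) {
		d->low_top -= size;
		return d->low_top;
	}

	return 0;
}

/*
 * Sketch of the SAC trick: a PCI device with a wider DMA limit first gets a
 * speculative 32-bit allocation, unless forcedac says to skip that step.
 */
static uint64_t sketch_alloc_iova(struct toy_domain *d, uint64_t size,
				  uint64_t dma_limit, bool dev_is_pci,
				  bool forcedac)
{
	uint64_t iova = 0;

	if (dma_limit > MASK_32BIT && !forcedac && dev_is_pci)
		iova = toy_alloc(d, size, MASK_32BIT);

	/* Fall back to (or go straight to) the device's full range. */
	if (!iova)
		iova = toy_alloc(d, size, dma_limit);

	return iova;
}
```

In this model, forcedac = false corresponds to CONFIG_IOMMU_DMA_PCI_SAC=y
(or iommu.forcedac=0), and forcedac = true to the new default. Once the
space below 4 GiB is exhausted, every mapping with the trick enabled pays
for a failed first attempt before the fallback, which is the overhead the
commit message is concerned with.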


Thread overview: 9+ messages
2022-06-09 15:12 [PATCH v2] iommu/dma: Add config for PCI SAC address trick Robin Murphy
2022-06-09 18:09 ` John Garry via iommu [this message]
2022-06-10  6:01 ` Christoph Hellwig
2022-06-22 12:59 ` Joerg Roedel
2022-06-22 13:12   ` Robin Murphy
2022-06-23 11:33     ` Joerg Roedel
2022-06-23 11:41       ` Robin Murphy
2022-06-24 13:28         ` Joerg Roedel
2022-06-24 14:49           ` Robin Murphy
