Linux ARM-MSM sub-architecture
 help / color / mirror / Atom feed
* Re: [PATCH v2 5/5] iommu/arm-smmu: Convert to domain_alloc_paging()
       [not found] <5-v2-c86cc8c2230e+160bb-smmu_newapi_jgg@nvidia.com>
@ 2024-02-09 20:05 ` Dmitry Baryshkov
  2024-02-09 22:23   ` Jason Gunthorpe
  0 siblings, 1 reply; 9+ messages in thread
From: Dmitry Baryshkov @ 2024-02-09 20:05 UTC (permalink / raw)
  To: jgg
  Cc: iommu, joro, linux-arm-kernel, nicolinc, robin.murphy, will,
	linux-arm-msm

On Tue, 17 Oct 2023 Jason Gunthorpe <jgg@nvidia.com> wrote:
> Now that the BLOCKED and IDENTITY behaviors are managed with their own
> domains change to the domain_alloc_paging() op.
> 
> The check for using_legacy_binding is now redundant,
> arm_smmu_def_domain_type() always returns IOMMU_DOMAIN_IDENTITY for this
> mode, so the core code will never attempt to create a DMA domain in the
> first place.
> 
> Since commit a4fdd9762272 ("iommu: Use flush queue capability") the core
> code only passes in IDENTITY/BLOCKED/UNMANAGED/DMA domain types. It will
> not pass in IDENTITY or BLOCKED if the global statics exist, so the test
> for DMA is also redundant now too.
> 
> Call arm_smmu_init_domain_context() early if a dev is available.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/iommu/arm/arm-smmu/arm-smmu.c | 21 +++++++++++++++------
>  1 file changed, 15 insertions(+), 6 deletions(-)

For some reason this patch breaks booting of the APQ8096 Dragonboard820c
(qcom/apq8096-db820c.dts). Dispbling display subsystem (mdss) and venus
devices makes the board boot in most of the cases. Most frequently the
last parts of the log loog in a following way:

arm-smmu b40000.iommu: probing hardware configuration...
arm-smmu b40000.iommu: SMMUv2 with:
arm-smmu b40000.iommu: 	stage 1 translation
arm-smmu b40000.iommu: 	address translation ops
arm-smmu b40000.iommu: 	non-coherent table walk
arm-smmu b40000.iommu: 	(IDR0.CTTW overridden by FW configuration)
arm-smmu b40000.iommu: 	stream matching with 2 register groups
arm-smmu b40000.iommu: 	2 context banks (0 stage-2 only)
arm-smmu b40000.iommu: 	Supported page sizes: 0x63315000
arm-smmu b40000.iommu: 	Stage-1: 48-bit VA -> 36-bit IPA
arm-smmu b40000.iommu: 	preserved 0 boot mappings
arm-smmu d00000.iommu: probing hardware configuration...
arm-smmu d00000.iommu: SMMUv2 with:
arm-smmu d00000.iommu: 	stage 1 translation
arm-smmu d00000.iommu: 	address translation ops
arm-smmu d00000.iommu: 	non-coherent table walk
arm-smmu d00000.iommu: 	(IDR0.CTTW overridden by FW configuration)
arm-smmu d00000.iommu: 	stream matching with 2 register groups
arm-smmu d00000.iommu: 	2 context banks (0 stage-2 only)
arm-smmu d00000.iommu: 	Supported page sizes: 0x63315000
arm-smmu d00000.iommu: 	Stage-1: 32-bit VA -> 36-bit IPA
arm-smmu d00000.iommu: 	preserved 0 boot mappings
arm-smmu d40000.iommu: probing hardware configuration...
arm-smmu d40000.iommu: SMMUv2 with:
arm-smmu d40000.iommu: 	stage 1 translation
arm-smmu d40000.iommu: 	address translation ops
arm-smmu d40000.iommu: 	non-coherent table walk
arm-smmu d40000.iommu: 	(IDR0.CTTW overridden by FW configuration)
arm-smmu d40000.iommu: 	stream matching with 42 register groups
arm-smmu d40000.iommu: 	7 context banks (0 stage-2 only)
arm-smmu d40000.iommu: 	Supported page sizes: 0x63315000
arm-smmu d40000.iommu: 	Stage-1: 32-bit VA -> 36-bit IPA
arm-smmu d40000.iommu: 	preserved 0 boot mappings
arm-smmu da0000.iommu: probing hardware configuration...
arm-smmu da0000.iommu: SMMUv2 with:
arm-smmu da0000.iommu: 	stage 1 translation
arm-smmu da0000.iommu: 	address translation ops
arm-smmu da0000.iommu: 	non-coherent table walk
arm-smmu da0000.iommu: 	(IDR0.CTTW overridden by FW configuration)
arm-smmu da0000.iommu: 	stream matching with 4 register groups
arm-smmu da0000.iommu: 	2 context banks (0 stage-2 only)
arm-smmu da0000.iommu: 	Supported page sizes: 0x63315000
arm-smmu da0000.iommu: 	Stage-1: 32-bit VA -> 36-bit IPA
arm-smmu da0000.iommu: 	preserved 0 boot mappings
arm-smmu 1600000.iommu: probing hardware configuration...
arm-smmu 1600000.iommu: SMMUv2 with:
arm-smmu 1600000.iommu: 	stage 1 translation
arm-smmu 1600000.iommu: 	address translation ops
arm-smmu 1600000.iommu: 	non-coherent table walk
arm-smmu 1600000.iommu: 	(IDR0.CTTW overridden by FW configuration)
arm-smmu 1600000.iommu: 	stream matching with 15 register groups
arm-smmu 1600000.iommu: 	12 context banks (0 stage-2 only)
arm-smmu 1600000.iommu: 	Supported page sizes: 0x63315000
arm-smmu 1600000.iommu: 	Stage-1: 36-bit VA -> 36-bit IPA
arm-smmu 1600000.iommu: 	preserved 0 boot mappings
adreno b00000.gpu: Adding to iommu group 0
Bluetooth: hci0: QCA Product ID   :0x00000008
Bluetooth: hci0: QCA SOC Version  :0x00000044
Bluetooth: hci0: QCA ROM Version  :0x00000302
Bluetooth: hci0: QCA Patch Version:0x00000111
Bluetooth: hci0: QCA controller version 0x00440302
platform 9a0000.hdmi-tx: Fixed dependency cycle(s) with /soc@0/display-subsystem@900000/display-controller@901000/ports/port@0/endpoint
Bluetooth: hci0: QCA Downloading qca/rampatch_00440302.bin
spi_qup 7575000.spi: IN:block:16, fifo:64, OUT:block:16, fifo:64
spi_qup 75ba000.spi: IN:block:16, fifo:64, OUT:block:16, fifo:64
phy phy-7411000.phy.5: QUSB2PHY pll lock failed: status reg = 0
phy phy-7411000.phy.5: phy init failed --> -16
dwc3 6a00000.usb: error -EBUSY: failed to initialize core
dwc3: probe of 6a00000.usb failed with error -16
phy phy-7412000.phy.6: QUSB2PHY pll lock failed: status reg = 0
phy phy-7412000.phy.6: phy init failed --> -16
dwc3 7600000.usb: error -EBUSY: failed to initialize core
dwc3: probe of 7600000.usb failed with error -16
i2c_qup 7577000.i2c: using default clock-frequency 100000
i2c_qup 75b5000.i2c: using default clock-frequency 100000
sdhci_msm 74a4900.mmc: Got CD GPIO
scsi host0: ufshcd
ufshcd-qcom 624000.ufshc: ufs_qcom_host_reset: reset control not set
remoteproc remoteproc0: 2080000.remoteproc is available
remoteproc remoteproc1: 9300000.remoteproc is available
mmc0: SDHCI controller on 74a4900.mmc [74a4900.mmc] using ADMA 64-bit
qcom-pcie 600000.pcie: host bridge /soc@0/bus@0/pcie@600000 ranges:
qcom-pcie 608000.pcie: supply vddpe-3v3 not found, using dummy regulator

[ The board resets to the bootloader ]


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 5/5] iommu/arm-smmu: Convert to domain_alloc_paging()
  2024-02-09 20:05 ` [PATCH v2 5/5] iommu/arm-smmu: Convert to domain_alloc_paging() Dmitry Baryshkov
@ 2024-02-09 22:23   ` Jason Gunthorpe
  2024-02-12 23:18     ` Dmitry Baryshkov
  2024-02-13  7:51     ` Dmitry Baryshkov
  0 siblings, 2 replies; 9+ messages in thread
From: Jason Gunthorpe @ 2024-02-09 22:23 UTC (permalink / raw)
  To: Dmitry Baryshkov
  Cc: iommu, joro, linux-arm-kernel, nicolinc, robin.murphy, will,
	linux-arm-msm

On Fri, Feb 09, 2024 at 10:05:38PM +0200, Dmitry Baryshkov wrote:
> On Tue, 17 Oct 2023 Jason Gunthorpe <jgg@nvidia.com> wrote:
> > Now that the BLOCKED and IDENTITY behaviors are managed with their own
> > domains change to the domain_alloc_paging() op.
> > 
> > The check for using_legacy_binding is now redundant,
> > arm_smmu_def_domain_type() always returns IOMMU_DOMAIN_IDENTITY for this
> > mode, so the core code will never attempt to create a DMA domain in the
> > first place.
> > 
> > Since commit a4fdd9762272 ("iommu: Use flush queue capability") the core
> > code only passes in IDENTITY/BLOCKED/UNMANAGED/DMA domain types. It will
> > not pass in IDENTITY or BLOCKED if the global statics exist, so the test
> > for DMA is also redundant now too.
> > 
> > Call arm_smmu_init_domain_context() early if a dev is available.
> > 
> > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> > ---
> >  drivers/iommu/arm/arm-smmu/arm-smmu.c | 21 +++++++++++++++------
> >  1 file changed, 15 insertions(+), 6 deletions(-)
> 
> For some reason this patch breaks booting of the APQ8096 Dragonboard820c
> (qcom/apq8096-db820c.dts). Dispbling display subsystem (mdss) and venus
> devices makes the board boot in most of the cases. Most frequently the
> last parts of the log loog in a following way:

It is surprising we tested this patch on some tegra systems with this
iommu and didn't hit anything..

The only real functional thing this changes is to move the domain
initialization up in time, potentially a lot in time in some
cases. That function does alot of things including touching HW so
possibly there is some surprising interaction with something else.

So, I would expect this to not WARN_ON and to make it work the same as
before the patch:

--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -875,7 +875,9 @@ static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev)
        mutex_init(&smmu_domain->init_mutex);
        spin_lock_init(&smmu_domain->cb_lock);
 
-       if (dev) {
+       WARN_ON(using_legacy_binding);
+
+/*     if (dev) {
                struct arm_smmu_master_cfg *cfg = dev_iommu_priv_get(dev);
 
                if (arm_smmu_init_domain_context(smmu_domain, cfg->smmu, dev)) {
@@ -883,7 +885,7 @@ static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev)
                        return NULL;
                }
        }
-
+*/
        return &smmu_domain->domain;
 }

Then I'd ask you to remove the comment and do:

@@ -878,7 +878,9 @@ static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev)
        if (dev) {
                struct arm_smmu_master_cfg *cfg = dev_iommu_priv_get(dev);
 
+               WARN_ON(true);
                if (arm_smmu_init_domain_context(smmu_domain, cfg->smmu, dev)) {
+                       printk("Allocation failure in arm_smmu_domain_alloc_paging()\n");
                        kfree(smmu_domain);
                        return NULL;
                }


And then we may get a clue from the backtraces it generates. I only
saw one iommu group reported in your log so I'd expect one trace?

Thanks,
Jason

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 5/5] iommu/arm-smmu: Convert to domain_alloc_paging()
  2024-02-09 22:23   ` Jason Gunthorpe
@ 2024-02-12 23:18     ` Dmitry Baryshkov
  2024-02-13  0:19       ` Jason Gunthorpe
  2024-02-13  7:51     ` Dmitry Baryshkov
  1 sibling, 1 reply; 9+ messages in thread
From: Dmitry Baryshkov @ 2024-02-12 23:18 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, joro, linux-arm-kernel, nicolinc, robin.murphy, will,
	linux-arm-msm

On Sat, 10 Feb 2024 at 00:23, Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Fri, Feb 09, 2024 at 10:05:38PM +0200, Dmitry Baryshkov wrote:
> > On Tue, 17 Oct 2023 Jason Gunthorpe <jgg@nvidia.com> wrote:
> > > Now that the BLOCKED and IDENTITY behaviors are managed with their own
> > > domains change to the domain_alloc_paging() op.
> > >
> > > The check for using_legacy_binding is now redundant,
> > > arm_smmu_def_domain_type() always returns IOMMU_DOMAIN_IDENTITY for this
> > > mode, so the core code will never attempt to create a DMA domain in the
> > > first place.
> > >
> > > Since commit a4fdd9762272 ("iommu: Use flush queue capability") the core
> > > code only passes in IDENTITY/BLOCKED/UNMANAGED/DMA domain types. It will
> > > not pass in IDENTITY or BLOCKED if the global statics exist, so the test
> > > for DMA is also redundant now too.
> > >
> > > Call arm_smmu_init_domain_context() early if a dev is available.
> > >
> > > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> > > ---
> > >  drivers/iommu/arm/arm-smmu/arm-smmu.c | 21 +++++++++++++++------
> > >  1 file changed, 15 insertions(+), 6 deletions(-)
> >
> > For some reason this patch breaks booting of the APQ8096 Dragonboard820c
> > (qcom/apq8096-db820c.dts). Dispbling display subsystem (mdss) and venus
> > devices makes the board boot in most of the cases. Most frequently the
> > last parts of the log loog in a following way:
>
> It is surprising we tested this patch on some tegra systems with this
> iommu and didn't hit anything..
>
> The only real functional thing this changes is to move the domain
> initialization up in time, potentially a lot in time in some
> cases. That function does alot of things including touching HW so
> possibly there is some surprising interaction with something else.
>
> So, I would expect this to not WARN_ON and to make it work the same as
> before the patch:
>
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> @@ -875,7 +875,9 @@ static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev)
>         mutex_init(&smmu_domain->init_mutex);
>         spin_lock_init(&smmu_domain->cb_lock);
>
> -       if (dev) {
> +       WARN_ON(using_legacy_binding);
> +
> +/*     if (dev) {
>                 struct arm_smmu_master_cfg *cfg = dev_iommu_priv_get(dev);
>
>                 if (arm_smmu_init_domain_context(smmu_domain, cfg->smmu, dev)) {
> @@ -883,7 +885,7 @@ static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev)
>                         return NULL;
>                 }
>         }
> -
> +*/
>         return &smmu_domain->domain;
>  }

With the full device tree, this crashed during the IOMMU probe, no warnings:

[   15.343477] bam-dma-engine 644000.dma-controller: num-channels
unspecified in dt
[   15.343675] bam-dma-engine 644000.dma-controller: num-ees unspecified in dt
[   15.394653] msm_serial 7570000.serial: msm_serial: detected port #2
[   15.395170] msm_serial 7570000.serial: uartclk = 19200000
[   15.401983] 7570000.serial: ttyMSM2 at MMIO 0x7570000 (irq = 37,
base_baud = 1200000) is a MSM
[   15.407549] serial serial0: tty port ttyMSM2 registered
[   15.421567] Bluetooth: hci0: setting up ROME/QCA6390
[   15.421728] arm-smmu b40000.iommu: probing hardware configuration...
[   15.425866] arm-smmu b40000.iommu: SMMUv2 with:
[   15.432153] arm-smmu b40000.iommu: stage 1 translation
[   15.436407] arm-smmu b40000.iommu: address translation ops
[   15.441580] arm-smmu b40000.iommu: non-coherent table walk
[   15.447136] arm-smmu b40000.iommu: (IDR0.CTTW overridden by FW configuration)
[   15.452750] arm-smmu b40000.iommu: stream matching with 2 register groups
[   15.460038] arm-smmu b40000.iommu: 2 context banks (0 stage-2 only)
[   15.466908] arm-smmu b40000.iommu: Supported page sizes: 0x63315000
[   15.473367] arm-smmu b40000.iommu: Stage-1: 48-bit VA -> 36-bit IPA
[   15.481638] arm-smmu b40000.iommu: preserved 0 boot mappings
[   15.491783] arm-smmu d00000.iommu: probing hardware configuration...
[   15.491955] arm-smmu d00000.iommu: SMMUv2 with:
[
Format: Log Type - Time(microsec) - Message - Optional Info
Log Type: B - Since Boot(Power On Reset),  D - Delta,  S - Statistic
S - QC_IMAGE_VERSION_STRING=BOOT.XF.1.0-00301


>
> Then I'd ask you to remove the comment and do:
>
> @@ -878,7 +878,9 @@ static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev)
>         if (dev) {
>                 struct arm_smmu_master_cfg *cfg = dev_iommu_priv_get(dev);
>
> +               WARN_ON(true);
>                 if (arm_smmu_init_domain_context(smmu_domain, cfg->smmu, dev)) {
> +                       printk("Allocation failure in arm_smmu_domain_alloc_paging()\n");
>                         kfree(smmu_domain);
>                         return NULL;
>                 }
>
>
> And then we may get a clue from the backtraces it generates. I only
> saw one iommu group reported in your log so I'd expect one trace?

With the full device tree, same result:

[   12.319303] bam-dma-engine 644000.dma-controller: num-channels
unspecified in dt
[   12.319502] bam-dma-engine 644000.dma-controller: num-ees unspecified in dt
[   12.370379] msm_serial 7570000.serial: msm_serial: detected port #2
[   12.370895] msm_serial 7570000.serial: uartclk = 19200000
[   12.377773] 7570000.serial: ttyMSM2 at MMIO 0x7570000 (irq = 37,
base_baud = 1200000) is a MSM
[   12.383228] serial serial0: tty port ttyMSM2 registered
[   12.397263] arm-smmu b40000.iommu: probing hardware configuration...
[   12.397441] arm-smmu b40000.iommu: SMMUv2 with:
[   12.398072] Bluetooth: hci0: setting up ROME/QCA6390
[   12.402779] arm-smmu b40000.iommu: stage 1 translation
[   12.402817] arm-smmu b40000.iommu: address translation ops
[   12.402832] arm-smmu b40000.iommu: non-coherent table walk
[   12.402848] arm-smmu b40000.iommu: (IDR0.CTTW overridden by FW configuration)
[   12.402881] arm-smmu b40000.iommu: stream matching with 2 register groups
[   12.402943] arm-smmu b40000.iommu: 2 context banks (0 stage-2 only)
[   12.402987] arm-smmu b40000.iommu: Supported page sizes: 0x63315000
[   12.403004] arm-smmu b40000.iommu: Stage-1: 48-bit VA -> 36-bit IPA
[   12.404941] arm-smmu b40000.iommu: preserved 0 boot mappings
[   12.467015] arm-smmu d00000.iommu: probing hardware configuration...
[   12.467318] arm-smmu d0
Format: Log Type - Time(microsec) - Message - Optional Info
Log Type: B - Since Boot(Power On Reset),  D - Delta,  S - Statistic
S - QC_IMAGE_VERSION_STRING=BOOT.XF.1.0-00301

Even with disabling display-subsystem and venus, it crashes at the
same place, I don't seem to be able to get past it anymore.

-- 
With best wishes
Dmitry

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 5/5] iommu/arm-smmu: Convert to domain_alloc_paging()
  2024-02-12 23:18     ` Dmitry Baryshkov
@ 2024-02-13  0:19       ` Jason Gunthorpe
  0 siblings, 0 replies; 9+ messages in thread
From: Jason Gunthorpe @ 2024-02-13  0:19 UTC (permalink / raw)
  To: Dmitry Baryshkov
  Cc: iommu, joro, linux-arm-kernel, nicolinc, robin.murphy, will,
	linux-arm-msm

On Tue, Feb 13, 2024 at 01:18:24AM +0200, Dmitry Baryshkov wrote:
> > > For some reason this patch breaks booting of the APQ8096 Dragonboard820c
> > > (qcom/apq8096-db820c.dts). Dispbling display subsystem (mdss) and venus
> > > devices makes the board boot in most of the cases. Most frequently the
> > > last parts of the log loog in a following way:
> >
> > It is surprising we tested this patch on some tegra systems with this
> > iommu and didn't hit anything..
> >
> > The only real functional thing this changes is to move the domain
> > initialization up in time, potentially a lot in time in some
> > cases. That function does alot of things including touching HW so
> > possibly there is some surprising interaction with something else.
> >
> > So, I would expect this to not WARN_ON and to make it work the same as
> > before the patch:
> >
> > --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> > @@ -875,7 +875,9 @@ static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev)
> >         mutex_init(&smmu_domain->init_mutex);
> >         spin_lock_init(&smmu_domain->cb_lock);
> >
> > -       if (dev) {
> > +       WARN_ON(using_legacy_binding);
> > +
> > +/*     if (dev) {
> >                 struct arm_smmu_master_cfg *cfg = dev_iommu_priv_get(dev);
> >
> >                 if (arm_smmu_init_domain_context(smmu_domain, cfg->smmu, dev)) {
> > @@ -883,7 +885,7 @@ static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev)
> >                         return NULL;
> >                 }
> >         }
> > -
> > +*/
> >         return &smmu_domain->domain;
> >  }
> 
> With the full device tree, this crashed during the IOMMU probe, no warnings:

The above reverts nearly all the functional elements of the patch you
said caused the problem, are you certain of your bisection?

> > And then we may get a clue from the backtraces it generates. I only
> > saw one iommu group reported in your log so I'd expect one trace?
> 
> With the full device tree, same result:

This adds basically an unconditional WARN_ON on all the probe paths,
and nothing printed? That is even more surprising.

Those two together suggest that arm_smmu_domain_alloc_paging() isn't
even being called. But the caller is:

	if (alloc_type == IOMMU_DOMAIN_IDENTITY && ops->identity_domain)
		return ops->identity_domain;
	else if (alloc_type == IOMMU_DOMAIN_BLOCKED && ops->blocked_domain)
		return ops->blocked_domain;
	else if (type & __IOMMU_DOMAIN_PAGING && ops->domain_alloc_paging)
		domain = ops->domain_alloc_paging(dev);
	else if (ops->domain_alloc)
		domain = ops->domain_alloc(alloc_type);
	else
		return ERR_PTR(-EOPNOTSUPP);

Which, suggest that alloc_type is somehow garbage for your system? I
don't see how that can happen, but this patch will also cause a
garbage type to be rejected by the core code.

Does this reveal anything for you?

@@ -2118,6 +2118,7 @@ static struct iommu_domain *__iommu_domain_alloc(const struct iommu_ops *ops,
        struct iommu_domain *domain;
        unsigned int alloc_type = type & IOMMU_DOMAIN_ALLOC_FLAGS;
 
+       WARN(true, " __iommu_domain_alloc %u %u", alloc_type, type);
        if (alloc_type == IOMMU_DOMAIN_IDENTITY && ops->identity_domain)
                return ops->identity_domain;
        else if (alloc_type == IOMMU_DOMAIN_BLOCKED && ops->blocked_domain)
@@ -2126,8 +2127,10 @@ static struct iommu_domain *__iommu_domain_alloc(const struct iommu_ops *ops,
                domain = ops->domain_alloc_paging(dev);
        else if (ops->domain_alloc)
                domain = ops->domain_alloc(alloc_type);
-       else
+       else {
+               printk("Returning failure from __iommu_domain_alloc()\n");
                return ERR_PTR(-EOPNOTSUPP);
+       }
 
        /*
         * Many domain_alloc ops now return ERR_PTR, make things easier for the

It must print, something is wrong with the debugging process if this
doesn't generate back traces :\

Jason

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 5/5] iommu/arm-smmu: Convert to domain_alloc_paging()
  2024-02-09 22:23   ` Jason Gunthorpe
  2024-02-12 23:18     ` Dmitry Baryshkov
@ 2024-02-13  7:51     ` Dmitry Baryshkov
  2024-02-13 10:20       ` Robin Murphy
  1 sibling, 1 reply; 9+ messages in thread
From: Dmitry Baryshkov @ 2024-02-13  7:51 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: iommu, joro, linux-arm-kernel, nicolinc, robin.murphy, will,
	linux-arm-msm

On Sat, 10 Feb 2024 at 00:23, Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Fri, Feb 09, 2024 at 10:05:38PM +0200, Dmitry Baryshkov wrote:
> > On Tue, 17 Oct 2023 Jason Gunthorpe <jgg@nvidia.com> wrote:
> > > Now that the BLOCKED and IDENTITY behaviors are managed with their own
> > > domains change to the domain_alloc_paging() op.
> > >
> > > The check for using_legacy_binding is now redundant,
> > > arm_smmu_def_domain_type() always returns IOMMU_DOMAIN_IDENTITY for this
> > > mode, so the core code will never attempt to create a DMA domain in the
> > > first place.
> > >
> > > Since commit a4fdd9762272 ("iommu: Use flush queue capability") the core
> > > code only passes in IDENTITY/BLOCKED/UNMANAGED/DMA domain types. It will
> > > not pass in IDENTITY or BLOCKED if the global statics exist, so the test
> > > for DMA is also redundant now too.
> > >
> > > Call arm_smmu_init_domain_context() early if a dev is available.
> > >
> > > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> > > ---
> > >  drivers/iommu/arm/arm-smmu/arm-smmu.c | 21 +++++++++++++++------
> > >  1 file changed, 15 insertions(+), 6 deletions(-)
> >
> > For some reason this patch breaks booting of the APQ8096 Dragonboard820c
> > (qcom/apq8096-db820c.dts). Dispbling display subsystem (mdss) and venus
> > devices makes the board boot in most of the cases. Most frequently the
> > last parts of the log loog in a following way:
>
> It is surprising we tested this patch on some tegra systems with this
> iommu and didn't hit anything..
>
> The only real functional thing this changes is to move the domain
> initialization up in time, potentially a lot in time in some
> cases. That function does alot of things including touching HW so
> possibly there is some surprising interaction with something else.

I should not be debugging strange platforms at 1 a.m. I forgot that
there was another patch to revert. So after reverting the MPM patch,
I'm getting the following results:

>
> So, I would expect this to not WARN_ON and to make it work the same as
> before the patch:

No warnings, the platform now boots up to the point of actually
bringing up the venus device:


[   11.906514] ath10k_pci 0000:01:00.0: qca6174 hw3.2 target
0x05030000 chip_id 0x00340aff sub 0000:0000
[   11.907119] ath10k_pci 0000:01:00.0: kconfig debug 1 debugfs 0
tracing 0 dfs 0 testmode 0
[   11.915881] ath10k_pci 0000:01:00.0: firmware ver
WLAN.RM.4.4.1-00288- api 6 features wowlan,ignore-otp,mfp crc32
bf907c7c
[   11.979972] Console: switching to colour frame buffer device 320x90
[   11.990756] ath10k_pci 0000:01:00.0: board_file api 2 bmi_id 0:1
crc32 d2863f91
[   12.060834] msm_mdp 901000.display-controller: [drm] fb0: msmdrmfb
frame buffer device
[   12.096203] qcom-pcie 608000.pcie: Phy link never came up
[   12.103785] qcom-pcie 608000.pcie: PCI host bridge to bus 0001:00
[   12.103970] qcom-venus c00000.video-codec: Adding to iommu group 3

Format: Log Type - Time(microsec) - Message - Optional Info
Log Type: B - Since Boot(Power On Reset),  D - Delta,  S - Statistic
S - QC_IMAGE_VERSION_STRING=BOOT.XF.1.0-00301
S - IMAGE_VARIANT_STRING=M8996LAB
S - OEM_IMAGE_VERSION_STRING=crm-ubuntu68
S - Boot Interface: UFS

>
> Then I'd ask you to remove the comment and do:
>
> @@ -878,7 +878,9 @@ static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev)
>         if (dev) {
>                 struct arm_smmu_master_cfg *cfg = dev_iommu_priv_get(dev);
>
> +               WARN_ON(true);
>                 if (arm_smmu_init_domain_context(smmu_domain, cfg->smmu, dev)) {
> +                       printk("Allocation failure in arm_smmu_domain_alloc_paging()\n");
>                         kfree(smmu_domain);
>                         return NULL;
>                 }
>
>
> And then we may get a clue from the backtraces it generates. I only
> saw one iommu group reported in your log so I'd expect one trace?

I added dev_info + mdelays() around the arm_smmu_init_domain_context()
and I can see that it crashes within that function.

[   29.819624] qcom-venus c00000.video-codec: Adding to iommu group 1
[   29.833181] ------------[ cut here ]------------
[   29.839198] WARNING: CPU: 1 PID: 35 at
drivers/iommu/arm/arm-smmu/arm-smmu.c:883
arm_smmu_domain_alloc_paging+0x80/0x174
[   29.843980] Modules linked in:
[   29.854824] CPU: 1 PID: 35 Comm: kworker/u18:0 Tainted: G     U
        6.8.0-rc3-next-20240208-05495-g20708c29957d-dirty #1739
[   29.857694] Hardware name: Qualcomm Technologies, Inc. DB820c (DT)
[   29.869410] Workqueue: events_unbound deferred_probe_work_func
[   29.875658] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[   29.881474] pc : arm_smmu_domain_alloc_paging+0x80/0x174
[   29.888331] lr : arm_smmu_domain_alloc_paging+0x68/0x174
[   29.893885] sp : ffff8000830338c0
[   29.899179] x29: ffff8000830338c0 x28: 0000000000000000 x27: ffff800081e72000
[   29.902396] x26: ffff00008034ee48 x25: ffff000080b24810 x24: 0000000000000000
[   29.909513] x23: ffff800081e73000 x22: ffff000080b24810 x21: ffff800082e23258
[   29.916633] x20: ffff00008389a700 x19: ffff00008034f600 x18: ffffffffffffffff
[   29.918788] usb 1-1: new high-speed USB device number 2 using xhci-hcd
[   29.923746] x17: 0000000c0000000b x16: 0000000900000008 x15: 0000000000000000
[   29.923765] x14: 000000000000b0af x13: 0000000000000000 x12: 0000000000000166
[   29.923783] x11: 0000000000000001 x10: 0000000000001410 x9 : 0000000000000000
[   29.923801] x8 : ffff00008034f800 x7 : 0000000000000000 x6 : 0000000000000000
[   29.923819] x5 : 0000000000000000 x4 : 0000000000000002 x3 : 0000000000000000
[   29.923837] x2 : ffff800082e23290 x1 : dead4ead00000000 x0 : 0000000000000000
[   29.923855] Call trace:
[   29.923861]  arm_smmu_domain_alloc_paging+0x80/0x174
[   29.923872]  __iommu_domain_alloc+0xcc/0xf4
[   29.923884]  iommu_setup_default_domain+0x294/0x554
[   29.938567] Bluetooth: hci0: Frame reassembly failed (-84)
[   29.944494]  __iommu_probe_device+0x418/0x43c
[   29.944508]  iommu_probe_device+0x3c/0x80
[   29.944519]  of_iommu_configure+0x124/0x1b4
[   29.944529]  of_dma_configure_id+0x170/0x2f4
[   29.969874] mmc0: new ultra high speed SDR104 SDHC card at address 5048
[   29.972966]  platform_dma_configure+0xa8/0xb4
[   29.972983]  really_probe+0x70/0x2ac
[   29.972992]  __driver_probe_device+0x78/0x12c
[   29.973001]  driver_probe_device+0xd8/0x160
[   29.973010]  __device_attach_driver+0xb8/0x138
[   29.973019]  bus_for_each_drv+0x80/0xdc
[   29.973027]  __device_attach+0x9c/0x188
[   29.973037]  device_initial_probe+0x14/0x20
[   29.973046]  bus_probe_device+0xac/0xb0
[   29.973055]  deferred_probe_work_func+0x8c/0xc8
[   29.973064]  process_one_work+0x210/0x5e4
[   29.983596] mmcblk0: mmc0:5048 SD32G 28.8 GiB
[   29.987546]  worker_thread+0x1bc/0x38c
[   29.987558]  kthread+0x120/0x124
[   29.987568]  ret_from_fork+0x10/0x20
[   29.987579] irq event stamp: 109977
[   29.987584] hardirqs last  enabled at (109977):
[<ffff800080fbbc48>] _raw_spin_unlock_irqrestore+0x6c/0x70
[   29.987600] hardirqs last disabled at (109976):
[<ffff800080fbb0a8>] _raw_spin_lock_irqsave+0x84/0x88
[   29.987610] softirqs last  enabled at (109966):
[<ffff800080090680>] __do_softirq+0x498/0x4e0
[   29.987619] softirqs last disabled at (109961):
[<ffff800080096184>] ____do_softirq+0x10/0x1c
[   30.006747]  mmcblk0: p1
[   30.010291] ---[ end trace 0000000000000000 ]---
[   30.018630] remoteproc remoteproc1: remote processor
9300000.remoteproc is now up
[   30.024525] qcom-pcie 600000.pcie: iATU: unroll F, 32 ob, 8 ib,
align 4K, limit 4G
[   30.044747] qcom,apr remoteproc1:smd-edge.apr_audio_svc.-1.-1:
Adding APR/GPR dev: aprsvc:service:4:3
[   30.046118] qcom-pcie 600000.pcie: Invalid eDMA IRQs found
[   30.051718] qcom,apr remoteproc1:smd-edge.apr_audio_svc.-1.-1:
Adding APR/GPR dev: aprsvc:service:4:4
[   30.066435] Bluetooth: hci0: QCA Downloading qca/nvm_00440302.bin
[   30.130736] hub 1-1:1.0: USB hub found
[   30.150390] qcom-pcie 600000.pcie: PCIe Gen.1 x1 link up
[   30.156394] hub 1-1:1.0: 4 ports detected
[   30.161837] qcom-pcie 600000.pcie: PCI host bridge to bus 0000:00
[   30.189583] pci_bus 0000:00: root bus resource [bus 00-ff]
[   30.195652] pci_bus 0000:00: root bus resource [io  0x0000-0xfffff]
[   30.201035] pci_bus 0000:00: root bus resource [mem 0x0c300000-0x0cffffff]
[   30.205424] Bluetooth: hci0: QCA setup on UART is completed
[   30.207262] pci 0000:00:00.0: [17cb:0104] type 01 class 0x060400
PCIe Root Port
[   30.214380] usb 2-1: new SuperSpeed USB device number 2 using xhci-hcd
[   30.219636] qcom-venus c00000.video-codec: Allocating domain
[   30.221503] pci 0000:00:00.0: BAR 0 [mem 0x00000000-0x00000fff]
[   30.221680] pci 0000:00:00.0: PCI bridge to [bus 01-ff]
[   30.221772] pci 0000:00:00.0:   bridge window [io  0x0000-0x0fff]
[   30.221832] pci 0000:00:00.0:   bridge window [mem 0x00000000-0x000fffff]
[   30.221945] pci 0000:00:00.0:   bridge window [mem
0x00000000-0x000fffff 64bit pref]
[   30.222617] pci 0000:00:00.0: PME# supported from D0 D3hot
[   30.273673] hub 2-1:1.0: USB hub found
[   30.276567] hub 2-1:1.0: 4 ports detected

Format: Log Type - Time(microsec) - Message - Optional Info
Log Type: B - Since Boot(Power On Reset),  D - Delta,  S - Statistic
S - QC_IMAGE_VERSION_STRING=BOOT.XF.1.0-00301
S - IMAGE_VARIANT_STRING=M8996LAB
S - OEM_IMAGE_VERSION_STRING=crm-ubuntu68
S - Boot Interface: UFS

I traced this further, it crashes during arm_smmu_write_context_bank().



--
With best wishes
Dmitry

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 5/5] iommu/arm-smmu: Convert to domain_alloc_paging()
  2024-02-13  7:51     ` Dmitry Baryshkov
@ 2024-02-13 10:20       ` Robin Murphy
  2024-02-13 10:55         ` Dmitry Baryshkov
  0 siblings, 1 reply; 9+ messages in thread
From: Robin Murphy @ 2024-02-13 10:20 UTC (permalink / raw)
  To: Dmitry Baryshkov, Jason Gunthorpe
  Cc: iommu, joro, linux-arm-kernel, nicolinc, will, linux-arm-msm

On 2024-02-13 7:51 am, Dmitry Baryshkov wrote:
> On Sat, 10 Feb 2024 at 00:23, Jason Gunthorpe <jgg@nvidia.com> wrote:
>>
>> On Fri, Feb 09, 2024 at 10:05:38PM +0200, Dmitry Baryshkov wrote:
>>> On Tue, 17 Oct 2023 Jason Gunthorpe <jgg@nvidia.com> wrote:
>>>> Now that the BLOCKED and IDENTITY behaviors are managed with their own
>>>> domains change to the domain_alloc_paging() op.
>>>>
>>>> The check for using_legacy_binding is now redundant,
>>>> arm_smmu_def_domain_type() always returns IOMMU_DOMAIN_IDENTITY for this
>>>> mode, so the core code will never attempt to create a DMA domain in the
>>>> first place.
>>>>
>>>> Since commit a4fdd9762272 ("iommu: Use flush queue capability") the core
>>>> code only passes in IDENTITY/BLOCKED/UNMANAGED/DMA domain types. It will
>>>> not pass in IDENTITY or BLOCKED if the global statics exist, so the test
>>>> for DMA is also redundant now too.
>>>>
>>>> Call arm_smmu_init_domain_context() early if a dev is available.
>>>>
>>>> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
>>>> ---
>>>>   drivers/iommu/arm/arm-smmu/arm-smmu.c | 21 +++++++++++++++------
>>>>   1 file changed, 15 insertions(+), 6 deletions(-)
>>>
>>> For some reason this patch breaks booting of the APQ8096 Dragonboard820c
>>> (qcom/apq8096-db820c.dts). Dispbling display subsystem (mdss) and venus
>>> devices makes the board boot in most of the cases. Most frequently the
>>> last parts of the log loog in a following way:
>>
>> It is surprising we tested this patch on some tegra systems with this
>> iommu and didn't hit anything..
>>
>> The only real functional thing this changes is to move the domain
>> initialization up in time, potentially a lot in time in some
>> cases. That function does alot of things including touching HW so
>> possibly there is some surprising interaction with something else.
> 
> I should not be debugging strange platforms at 1 a.m. I forgot that
> there was another patch to revert. So after reverting the MPM patch,
> I'm getting the following results:
> 
>>
>> So, I would expect this to not WARN_ON and to make it work the same as
>> before the patch:
> 
> No warnings, the platform now boots up to the point of actually
> bringing up the venus device:
> 
> 
> [   11.906514] ath10k_pci 0000:01:00.0: qca6174 hw3.2 target
> 0x05030000 chip_id 0x00340aff sub 0000:0000
> [   11.907119] ath10k_pci 0000:01:00.0: kconfig debug 1 debugfs 0
> tracing 0 dfs 0 testmode 0
> [   11.915881] ath10k_pci 0000:01:00.0: firmware ver
> WLAN.RM.4.4.1-00288- api 6 features wowlan,ignore-otp,mfp crc32
> bf907c7c
> [   11.979972] Console: switching to colour frame buffer device 320x90
> [   11.990756] ath10k_pci 0000:01:00.0: board_file api 2 bmi_id 0:1
> crc32 d2863f91
> [   12.060834] msm_mdp 901000.display-controller: [drm] fb0: msmdrmfb
> frame buffer device
> [   12.096203] qcom-pcie 608000.pcie: Phy link never came up
> [   12.103785] qcom-pcie 608000.pcie: PCI host bridge to bus 0001:00
> [   12.103970] qcom-venus c00000.video-codec: Adding to iommu group 3
> 
> Format: Log Type - Time(microsec) - Message - Optional Info
> Log Type: B - Since Boot(Power On Reset),  D - Delta,  S - Statistic
> S - QC_IMAGE_VERSION_STRING=BOOT.XF.1.0-00301
> S - IMAGE_VARIANT_STRING=M8996LAB
> S - OEM_IMAGE_VERSION_STRING=crm-ubuntu68
> S - Boot Interface: UFS
> 
>>
>> Then I'd ask you to remove the comment and do:
>>
>> @@ -878,7 +878,9 @@ static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev)
>>          if (dev) {
>>                  struct arm_smmu_master_cfg *cfg = dev_iommu_priv_get(dev);
>>
>> +               WARN_ON(true);
>>                  if (arm_smmu_init_domain_context(smmu_domain, cfg->smmu, dev)) {
>> +                       printk("Allocation failure in arm_smmu_domain_alloc_paging()\n");
>>                          kfree(smmu_domain);
>>                          return NULL;
>>                  }
>>
>>
>> And then we may get a clue from the backtraces it generates. I only
>> saw one iommu group reported in your log so I'd expect one trace?
> 
> I added dev_info + mdelays() around the arm_smmu_init_domain_context()
> and I can see that it crashes within that function.

Yeah, this is totally broken. We can't just call the unmodified 
arm_smmu_init_domain_context() at domain allocation because half of what 
it's doing belongs to the attach operation. We should not be allocating 
context banks, IRQs, etc. for a not-yet-attached domain, and we 
certainly shouldn't be touching hardware there outside of RPM.

Thanks,
Robin.

> 
> [   29.819624] qcom-venus c00000.video-codec: Adding to iommu group 1
> [   29.833181] ------------[ cut here ]------------
> [   29.839198] WARNING: CPU: 1 PID: 35 at
> drivers/iommu/arm/arm-smmu/arm-smmu.c:883
> arm_smmu_domain_alloc_paging+0x80/0x174
> [   29.843980] Modules linked in:
> [   29.854824] CPU: 1 PID: 35 Comm: kworker/u18:0 Tainted: G     U
>          6.8.0-rc3-next-20240208-05495-g20708c29957d-dirty #1739
> [   29.857694] Hardware name: Qualcomm Technologies, Inc. DB820c (DT)
> [   29.869410] Workqueue: events_unbound deferred_probe_work_func
> [   29.875658] pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [   29.881474] pc : arm_smmu_domain_alloc_paging+0x80/0x174
> [   29.888331] lr : arm_smmu_domain_alloc_paging+0x68/0x174
> [   29.893885] sp : ffff8000830338c0
> [   29.899179] x29: ffff8000830338c0 x28: 0000000000000000 x27: ffff800081e72000
> [   29.902396] x26: ffff00008034ee48 x25: ffff000080b24810 x24: 0000000000000000
> [   29.909513] x23: ffff800081e73000 x22: ffff000080b24810 x21: ffff800082e23258
> [   29.916633] x20: ffff00008389a700 x19: ffff00008034f600 x18: ffffffffffffffff
> [   29.918788] usb 1-1: new high-speed USB device number 2 using xhci-hcd
> [   29.923746] x17: 0000000c0000000b x16: 0000000900000008 x15: 0000000000000000
> [   29.923765] x14: 000000000000b0af x13: 0000000000000000 x12: 0000000000000166
> [   29.923783] x11: 0000000000000001 x10: 0000000000001410 x9 : 0000000000000000
> [   29.923801] x8 : ffff00008034f800 x7 : 0000000000000000 x6 : 0000000000000000
> [   29.923819] x5 : 0000000000000000 x4 : 0000000000000002 x3 : 0000000000000000
> [   29.923837] x2 : ffff800082e23290 x1 : dead4ead00000000 x0 : 0000000000000000
> [   29.923855] Call trace:
> [   29.923861]  arm_smmu_domain_alloc_paging+0x80/0x174
> [   29.923872]  __iommu_domain_alloc+0xcc/0xf4
> [   29.923884]  iommu_setup_default_domain+0x294/0x554
> [   29.938567] Bluetooth: hci0: Frame reassembly failed (-84)
> [   29.944494]  __iommu_probe_device+0x418/0x43c
> [   29.944508]  iommu_probe_device+0x3c/0x80
> [   29.944519]  of_iommu_configure+0x124/0x1b4
> [   29.944529]  of_dma_configure_id+0x170/0x2f4
> [   29.969874] mmc0: new ultra high speed SDR104 SDHC card at address 5048
> [   29.972966]  platform_dma_configure+0xa8/0xb4
> [   29.972983]  really_probe+0x70/0x2ac
> [   29.972992]  __driver_probe_device+0x78/0x12c
> [   29.973001]  driver_probe_device+0xd8/0x160
> [   29.973010]  __device_attach_driver+0xb8/0x138
> [   29.973019]  bus_for_each_drv+0x80/0xdc
> [   29.973027]  __device_attach+0x9c/0x188
> [   29.973037]  device_initial_probe+0x14/0x20
> [   29.973046]  bus_probe_device+0xac/0xb0
> [   29.973055]  deferred_probe_work_func+0x8c/0xc8
> [   29.973064]  process_one_work+0x210/0x5e4
> [   29.983596] mmcblk0: mmc0:5048 SD32G 28.8 GiB
> [   29.987546]  worker_thread+0x1bc/0x38c
> [   29.987558]  kthread+0x120/0x124
> [   29.987568]  ret_from_fork+0x10/0x20
> [   29.987579] irq event stamp: 109977
> [   29.987584] hardirqs last  enabled at (109977):
> [<ffff800080fbbc48>] _raw_spin_unlock_irqrestore+0x6c/0x70
> [   29.987600] hardirqs last disabled at (109976):
> [<ffff800080fbb0a8>] _raw_spin_lock_irqsave+0x84/0x88
> [   29.987610] softirqs last  enabled at (109966):
> [<ffff800080090680>] __do_softirq+0x498/0x4e0
> [   29.987619] softirqs last disabled at (109961):
> [<ffff800080096184>] ____do_softirq+0x10/0x1c
> [   30.006747]  mmcblk0: p1
> [   30.010291] ---[ end trace 0000000000000000 ]---
> [   30.018630] remoteproc remoteproc1: remote processor
> 9300000.remoteproc is now up
> [   30.024525] qcom-pcie 600000.pcie: iATU: unroll F, 32 ob, 8 ib,
> align 4K, limit 4G
> [   30.044747] qcom,apr remoteproc1:smd-edge.apr_audio_svc.-1.-1:
> Adding APR/GPR dev: aprsvc:service:4:3
> [   30.046118] qcom-pcie 600000.pcie: Invalid eDMA IRQs found
> [   30.051718] qcom,apr remoteproc1:smd-edge.apr_audio_svc.-1.-1:
> Adding APR/GPR dev: aprsvc:service:4:4
> [   30.066435] Bluetooth: hci0: QCA Downloading qca/nvm_00440302.bin
> [   30.130736] hub 1-1:1.0: USB hub found
> [   30.150390] qcom-pcie 600000.pcie: PCIe Gen.1 x1 link up
> [   30.156394] hub 1-1:1.0: 4 ports detected
> [   30.161837] qcom-pcie 600000.pcie: PCI host bridge to bus 0000:00
> [   30.189583] pci_bus 0000:00: root bus resource [bus 00-ff]
> [   30.195652] pci_bus 0000:00: root bus resource [io  0x0000-0xfffff]
> [   30.201035] pci_bus 0000:00: root bus resource [mem 0x0c300000-0x0cffffff]
> [   30.205424] Bluetooth: hci0: QCA setup on UART is completed
> [   30.207262] pci 0000:00:00.0: [17cb:0104] type 01 class 0x060400
> PCIe Root Port
> [   30.214380] usb 2-1: new SuperSpeed USB device number 2 using xhci-hcd
> [   30.219636] qcom-venus c00000.video-codec: Allocating domain
> [   30.221503] pci 0000:00:00.0: BAR 0 [mem 0x00000000-0x00000fff]
> [   30.221680] pci 0000:00:00.0: PCI bridge to [bus 01-ff]
> [   30.221772] pci 0000:00:00.0:   bridge window [io  0x0000-0x0fff]
> [   30.221832] pci 0000:00:00.0:   bridge window [mem 0x00000000-0x000fffff]
> [   30.221945] pci 0000:00:00.0:   bridge window [mem
> 0x00000000-0x000fffff 64bit pref]
> [   30.222617] pci 0000:00:00.0: PME# supported from D0 D3hot
> [   30.273673] hub 2-1:1.0: USB hub found
> [   30.276567] hub 2-1:1.0: 4 ports detected
> 
> Format: Log Type - Time(microsec) - Message - Optional Info
> Log Type: B - Since Boot(Power On Reset),  D - Delta,  S - Statistic
> S - QC_IMAGE_VERSION_STRING=BOOT.XF.1.0-00301
> S - IMAGE_VARIANT_STRING=M8996LAB
> S - OEM_IMAGE_VERSION_STRING=crm-ubuntu68
> S - Boot Interface: UFS
> 
> I traced this further, it crashes during arm_smmu_write_context_bank().
> 
> 
> 
> --
> With best wishes
> Dmitry

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 5/5] iommu/arm-smmu: Convert to domain_alloc_paging()
  2024-02-13 10:20       ` Robin Murphy
@ 2024-02-13 10:55         ` Dmitry Baryshkov
  2024-02-13 11:16           ` Will Deacon
  0 siblings, 1 reply; 9+ messages in thread
From: Dmitry Baryshkov @ 2024-02-13 10:55 UTC (permalink / raw)
  To: Robin Murphy
  Cc: Jason Gunthorpe, iommu, joro, linux-arm-kernel, nicolinc, will,
	linux-arm-msm

On Tue, 13 Feb 2024 at 12:20, Robin Murphy <robin.murphy@arm.com> wrote:
>
> On 2024-02-13 7:51 am, Dmitry Baryshkov wrote:
> > On Sat, 10 Feb 2024 at 00:23, Jason Gunthorpe <jgg@nvidia.com> wrote:
> >>
> >> On Fri, Feb 09, 2024 at 10:05:38PM +0200, Dmitry Baryshkov wrote:
> >>> On Tue, 17 Oct 2023 Jason Gunthorpe <jgg@nvidia.com> wrote:
> >>>> Now that the BLOCKED and IDENTITY behaviors are managed with their own
> >>>> domains change to the domain_alloc_paging() op.
> >>>>
> >>>> The check for using_legacy_binding is now redundant,
> >>>> arm_smmu_def_domain_type() always returns IOMMU_DOMAIN_IDENTITY for this
> >>>> mode, so the core code will never attempt to create a DMA domain in the
> >>>> first place.
> >>>>
> >>>> Since commit a4fdd9762272 ("iommu: Use flush queue capability") the core
> >>>> code only passes in IDENTITY/BLOCKED/UNMANAGED/DMA domain types. It will
> >>>> not pass in IDENTITY or BLOCKED if the global statics exist, so the test
> >>>> for DMA is also redundant now too.
> >>>>
> >>>> Call arm_smmu_init_domain_context() early if a dev is available.
> >>>>
> >>>> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> >>>> ---
> >>>>   drivers/iommu/arm/arm-smmu/arm-smmu.c | 21 +++++++++++++++------
> >>>>   1 file changed, 15 insertions(+), 6 deletions(-)
> >>>
> >>> For some reason this patch breaks booting of the APQ8096 Dragonboard820c
> >>> (qcom/apq8096-db820c.dts). Dispbling display subsystem (mdss) and venus
> >>> devices makes the board boot in most of the cases. Most frequently the
> >>> last parts of the log loog in a following way:
> >>
> >> It is surprising we tested this patch on some tegra systems with this
> >> iommu and didn't hit anything..
> >>
> >> The only real functional thing this changes is to move the domain
> >> initialization up in time, potentially a lot in time in some
> >> cases. That function does alot of things including touching HW so
> >> possibly there is some surprising interaction with something else.
> >
> > I should not be debugging strange platforms at 1 a.m. I forgot that
> > there was another patch to revert. So after reverting the MPM patch,
> > I'm getting the following results:
> >
> >>
> >> So, I would expect this to not WARN_ON and to make it work the same as
> >> before the patch:
> >
> > No warnings, the platform now boots up to the point of actually
> > bringing up the venus device:
> >
> >
> > [   11.906514] ath10k_pci 0000:01:00.0: qca6174 hw3.2 target
> > 0x05030000 chip_id 0x00340aff sub 0000:0000
> > [   11.907119] ath10k_pci 0000:01:00.0: kconfig debug 1 debugfs 0
> > tracing 0 dfs 0 testmode 0
> > [   11.915881] ath10k_pci 0000:01:00.0: firmware ver
> > WLAN.RM.4.4.1-00288- api 6 features wowlan,ignore-otp,mfp crc32
> > bf907c7c
> > [   11.979972] Console: switching to colour frame buffer device 320x90
> > [   11.990756] ath10k_pci 0000:01:00.0: board_file api 2 bmi_id 0:1
> > crc32 d2863f91
> > [   12.060834] msm_mdp 901000.display-controller: [drm] fb0: msmdrmfb
> > frame buffer device
> > [   12.096203] qcom-pcie 608000.pcie: Phy link never came up
> > [   12.103785] qcom-pcie 608000.pcie: PCI host bridge to bus 0001:00
> > [   12.103970] qcom-venus c00000.video-codec: Adding to iommu group 3
> >
> > Format: Log Type - Time(microsec) - Message - Optional Info
> > Log Type: B - Since Boot(Power On Reset),  D - Delta,  S - Statistic
> > S - QC_IMAGE_VERSION_STRING=BOOT.XF.1.0-00301
> > S - IMAGE_VARIANT_STRING=M8996LAB
> > S - OEM_IMAGE_VERSION_STRING=crm-ubuntu68
> > S - Boot Interface: UFS
> >
> >>
> >> Then I'd ask you to remove the comment and do:
> >>
> >> @@ -878,7 +878,9 @@ static struct iommu_domain *arm_smmu_domain_alloc_paging(struct device *dev)
> >>          if (dev) {
> >>                  struct arm_smmu_master_cfg *cfg = dev_iommu_priv_get(dev);
> >>
> >> +               WARN_ON(true);
> >>                  if (arm_smmu_init_domain_context(smmu_domain, cfg->smmu, dev)) {
> >> +                       printk("Allocation failure in arm_smmu_domain_alloc_paging()\n");
> >>                          kfree(smmu_domain);
> >>                          return NULL;
> >>                  }
> >>
> >>
> >> And then we may get a clue from the backtraces it generates. I only
> >> saw one iommu group reported in your log so I'd expect one trace?
> >
> > I added dev_info + mdelays() around the arm_smmu_init_domain_context()
> > and I can see that it crashes within that function.
>
> Yeah, this is totally broken. We can't just call the unmodified
> arm_smmu_init_domain_context() at domain allocation because half of what
> it's doing belongs to the attach operation. We should not be allocating
> context banks, IRQs, etc. for a not-yet-attached domain, and we
> certainly shouldn't be touching hardware there outside of RPM.

Should I send a revert?

>
> Thanks,
> Robin.



-- 
With best wishes
Dmitry

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 5/5] iommu/arm-smmu: Convert to domain_alloc_paging()
  2024-02-13 10:55         ` Dmitry Baryshkov
@ 2024-02-13 11:16           ` Will Deacon
  2024-02-13 11:54             ` Jason Gunthorpe
  0 siblings, 1 reply; 9+ messages in thread
From: Will Deacon @ 2024-02-13 11:16 UTC (permalink / raw)
  To: Dmitry Baryshkov
  Cc: Robin Murphy, Jason Gunthorpe, iommu, joro, linux-arm-kernel,
	nicolinc, linux-arm-msm

On Tue, Feb 13, 2024 at 12:55:38PM +0200, Dmitry Baryshkov wrote:
> On Tue, 13 Feb 2024 at 12:20, Robin Murphy <robin.murphy@arm.com> wrote:
> > On 2024-02-13 7:51 am, Dmitry Baryshkov wrote:
> > > On Sat, 10 Feb 2024 at 00:23, Jason Gunthorpe <jgg@nvidia.com> wrote:
> > >> And then we may get a clue from the backtraces it generates. I only
> > >> saw one iommu group reported in your log so I'd expect one trace?
> > >
> > > I added dev_info + mdelays() around the arm_smmu_init_domain_context()
> > > and I can see that it crashes within that function.
> >
> > Yeah, this is totally broken. We can't just call the unmodified
> > arm_smmu_init_domain_context() at domain allocation because half of what
> > it's doing belongs to the attach operation. We should not be allocating
> > context banks, IRQs, etc. for a not-yet-attached domain, and we
> > certainly shouldn't be touching hardware there outside of RPM.
> 
> Should I send a revert?

If reverting the patch fixes the issue for you, then yes please!

Hopefully you can help Jason test a reworked verson for the future, as
it's evident that Tegra doesn't tickle the power management side of things
in the same way.

Will

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2 5/5] iommu/arm-smmu: Convert to domain_alloc_paging()
  2024-02-13 11:16           ` Will Deacon
@ 2024-02-13 11:54             ` Jason Gunthorpe
  0 siblings, 0 replies; 9+ messages in thread
From: Jason Gunthorpe @ 2024-02-13 11:54 UTC (permalink / raw)
  To: Will Deacon
  Cc: Dmitry Baryshkov, Robin Murphy, iommu, joro, linux-arm-kernel,
	nicolinc, linux-arm-msm

On Tue, Feb 13, 2024 at 11:16:23AM +0000, Will Deacon wrote:
> On Tue, Feb 13, 2024 at 12:55:38PM +0200, Dmitry Baryshkov wrote:
> > On Tue, 13 Feb 2024 at 12:20, Robin Murphy <robin.murphy@arm.com> wrote:
> > > On 2024-02-13 7:51 am, Dmitry Baryshkov wrote:
> > > > On Sat, 10 Feb 2024 at 00:23, Jason Gunthorpe <jgg@nvidia.com> wrote:
> > > >> And then we may get a clue from the backtraces it generates. I only
> > > >> saw one iommu group reported in your log so I'd expect one trace?
> > > >
> > > > I added dev_info + mdelays() around the arm_smmu_init_domain_context()
> > > > and I can see that it crashes within that function.
> > >
> > > Yeah, this is totally broken. We can't just call the unmodified
> > > arm_smmu_init_domain_context() at domain allocation because half of what
> > > it's doing belongs to the attach operation. We should not be allocating
> > > context banks, IRQs, etc. for a not-yet-attached domain, and we
> > > certainly shouldn't be touching hardware there outside of RPM.
> > 
> > Should I send a revert?
> 
> If reverting the patch fixes the issue for you, then yes please!

Not the whole thing though, just remove the 'if (dev)' like you
tested, thanks.

If you want I will send it

> Hopefully you can help Jason test a reworked verson for the future, as
> it's evident that Tegra doesn't tickle the power management side of things
> in the same way.

It can stay, as long as it uses the alloc_domain_paging() that is
enough for the core code to move forward.

I included it only because we were able to test it, most of the other
drivers I did not try to move their "finalize".

Jason

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2024-02-13 11:54 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <5-v2-c86cc8c2230e+160bb-smmu_newapi_jgg@nvidia.com>
2024-02-09 20:05 ` [PATCH v2 5/5] iommu/arm-smmu: Convert to domain_alloc_paging() Dmitry Baryshkov
2024-02-09 22:23   ` Jason Gunthorpe
2024-02-12 23:18     ` Dmitry Baryshkov
2024-02-13  0:19       ` Jason Gunthorpe
2024-02-13  7:51     ` Dmitry Baryshkov
2024-02-13 10:20       ` Robin Murphy
2024-02-13 10:55         ` Dmitry Baryshkov
2024-02-13 11:16           ` Will Deacon
2024-02-13 11:54             ` Jason Gunthorpe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox