From: Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org>
To: Ray Jui <ray.jui-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Cc: sunil.goutham-YGCgFSpz5w/QT0dZR+AlfA@public.gmane.org,
iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
linu.cherian-YGCgFSpz5w/QT0dZR+AlfA@public.gmane.org,
linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org
Subject: Re: [PATCH 0/8] io-pgtable lock removal
Date: Thu, 6 Jul 2017 16:08:39 +0100 [thread overview]
Message-ID: <20170706150838.GB15574@arm.com> (raw)
In-Reply-To: <5149280b-a214-249c-c5e2-3712b1f941d2-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Hi Ray,
Thanks for testing this, and sorry it didn't help.
On Wed, Jul 05, 2017 at 04:24:22PM -0700, Ray Jui wrote:
> On 7/5/17 1:41 AM, Will Deacon wrote:
> > On Tue, Jul 04, 2017 at 06:45:17PM -0700, Ray Jui wrote:
> >> Has anything functionally changed between PATCH v2 and v1? I'm seeing a
> >> very different L2 throughput with v2 (in general a lot worse with v2 vs.
> >> v1); however, I'm currently unable to reproduce the TLB sync timed out
> >> issue with v2 (without the patch from Will's email).
> >>
> >> It could also be something else that has changed in my setup, but so far
> >> I have not yet been able to spot anything wrong in the setup.
> >
> > There were fixes, and that initially involved a DSB that was found to be
> > expensive. The patches queued in -next should have that addressed, so please
> > use those (or my for-joerg/arm-smmu/updates branch).
> >
>
> That was my bad yesterday. I was in a rush and the setup was incorrect.
>
> I redo my Ethernet performance test with both PATCH v1 and v2 today, and
> can confirm the performance is consistent between v1 and v2 as expected.
>
> I also made sure the following message can still be reproduced with
> patch set v2:
>
> arm-smmu 64000000.mmu: TLB sync timed out -- SMMU may be deadlocked
>
> Then I proceeded to apply your patch that attempt to fix the deadlock
> issue.
[...]
> However, with the fix patch, I can still see the deadlock message when I
> have > 32 iperf TX threads active in the system:
Damn. We've been going over this today and the only plausible theory seems
to be that concurrent TLB syncs are causing the completion to be pushed out,
resulting in timeouts.
Can you try this patch below, instead of the one I sent before, please?
Thanks,
Will
--->8
>From bbf3737c29e3d18f539998a66f42878ac91cde97 Mon Sep 17 00:00:00 2001
From: Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org>
Date: Thu, 6 Jul 2017 15:55:48 +0100
Subject: [PATCH] iommu/arm-smmu: Reintroduce locking around TLB sync
operations
Commit 523d7423e21b ("iommu/arm-smmu: Remove io-pgtable spinlock")
removed the locking used to serialise map/unmap calls into the io-pgtable
code from the ARM SMMU driver. This is good for performance, but opens
us up to a nasty race with TLB syncs because the TLB sync register is
shared within a context bank (or even globally for stage-2 on SMMUv1).
There are two cases to consider:
1. A CPU can be spinning on the completion of a TLB sync, take an
interrupt which issues a subsequent TLB sync, and then report a
timeout on return from the interrupt.
2. A CPU can be spinning on the completion of a TLB sync, but other
CPUs can continuously issue additional TLB syncs in such a way that
the backoff logic reports a timeout.
Rather than fix this by spinning for completion of prior TLB syncs before
issuing a new one (which may suffer from fairness issues on large systems),
instead reintroduce locking around TLB sync operations in the ARM SMMU
driver.
Fixes: 523d7423e21b ("iommu/arm-smmu: Remove io-pgtable spinlock")
Cc: Robin Murphy <robin.murphy-5wv7dgnIgG8@public.gmane.org>
Reported-by: Ray Jui <ray.jui-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
Signed-off-by: Will Deacon <will.deacon-5wv7dgnIgG8@public.gmane.org>
---
drivers/iommu/arm-smmu.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index b446183b3015..770abd247f40 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -400,6 +400,8 @@ struct arm_smmu_device {
u32 cavium_id_base; /* Specific to Cavium */
+ spinlock_t global_sync_lock;
+
/* IOMMU core code handle */
struct iommu_device iommu;
};
@@ -436,7 +438,7 @@ struct arm_smmu_domain {
struct arm_smmu_cfg cfg;
enum arm_smmu_domain_stage stage;
struct mutex init_mutex; /* Protects smmu pointer */
- spinlock_t cb_lock; /* Serialises ATS1* ops */
+ spinlock_t cb_lock; /* Serialises ATS1* ops and TLB syncs */
struct iommu_domain domain;
};
@@ -602,9 +604,12 @@ static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu,
static void arm_smmu_tlb_sync_global(struct arm_smmu_device *smmu)
{
void __iomem *base = ARM_SMMU_GR0(smmu);
+ unsigned long flags;
+ spin_lock_irqsave(&smmu->global_sync_lock, flags);
__arm_smmu_tlb_sync(smmu, base + ARM_SMMU_GR0_sTLBGSYNC,
base + ARM_SMMU_GR0_sTLBGSTATUS);
+ spin_unlock_irqrestore(&smmu->global_sync_lock, flags);
}
static void arm_smmu_tlb_sync_context(void *cookie)
@@ -612,9 +617,12 @@ static void arm_smmu_tlb_sync_context(void *cookie)
struct arm_smmu_domain *smmu_domain = cookie;
struct arm_smmu_device *smmu = smmu_domain->smmu;
void __iomem *base = ARM_SMMU_CB(smmu, smmu_domain->cfg.cbndx);
+ unsigned long flags;
+ spin_lock_irqsave(&smmu_domain->cb_lock, flags);
__arm_smmu_tlb_sync(smmu, base + ARM_SMMU_CB_TLBSYNC,
base + ARM_SMMU_CB_TLBSTATUS);
+ spin_unlock_irqrestore(&smmu_domain->cb_lock, flags);
}
static void arm_smmu_tlb_sync_vmid(void *cookie)
@@ -1925,6 +1933,7 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
smmu->num_mapping_groups = size;
mutex_init(&smmu->stream_map_mutex);
+ spin_lock_init(&smmu->global_sync_lock);
if (smmu->version < ARM_SMMU_V2 || !(id & ID0_PTFS_NO_AARCH32)) {
smmu->features |= ARM_SMMU_FEAT_FMT_AARCH32_L;
--
2.1.4
WARNING: multiple messages have this Message-ID (diff)
From: will.deacon@arm.com (Will Deacon)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH 0/8] io-pgtable lock removal
Date: Thu, 6 Jul 2017 16:08:39 +0100 [thread overview]
Message-ID: <20170706150838.GB15574@arm.com> (raw)
In-Reply-To: <5149280b-a214-249c-c5e2-3712b1f941d2@broadcom.com>
Hi Ray,
Thanks for testing this, and sorry it didn't help.
On Wed, Jul 05, 2017 at 04:24:22PM -0700, Ray Jui wrote:
> On 7/5/17 1:41 AM, Will Deacon wrote:
> > On Tue, Jul 04, 2017 at 06:45:17PM -0700, Ray Jui wrote:
> >> Has anything functionally changed between PATCH v2 and v1? I'm seeing a
> >> very different L2 throughput with v2 (in general a lot worse with v2 vs.
> >> v1); however, I'm currently unable to reproduce the TLB sync timed out
> >> issue with v2 (without the patch from Will's email).
> >>
> >> It could also be something else that has changed in my setup, but so far
> >> I have not yet been able to spot anything wrong in the setup.
> >
> > There were fixes, and that initially involved a DSB that was found to be
> > expensive. The patches queued in -next should have that addressed, so please
> > use those (or my for-joerg/arm-smmu/updates branch).
> >
>
> That was my bad yesterday. I was in a rush and the setup was incorrect.
>
> I redo my Ethernet performance test with both PATCH v1 and v2 today, and
> can confirm the performance is consistent between v1 and v2 as expected.
>
> I also made sure the following message can still be reproduced with
> patch set v2:
>
> arm-smmu 64000000.mmu: TLB sync timed out -- SMMU may be deadlocked
>
> Then I proceeded to apply your patch that attempt to fix the deadlock
> issue.
[...]
> However, with the fix patch, I can still see the deadlock message when I
> have > 32 iperf TX threads active in the system:
Damn. We've been going over this today and the only plausible theory seems
to be that concurrent TLB syncs are causing the completion to be pushed out,
resulting in timeouts.
Can you try this patch below, instead of the one I sent before, please?
Thanks,
Will
--->8
>From bbf3737c29e3d18f539998a66f42878ac91cde97 Mon Sep 17 00:00:00 2001
From: Will Deacon <will.deacon@arm.com>
Date: Thu, 6 Jul 2017 15:55:48 +0100
Subject: [PATCH] iommu/arm-smmu: Reintroduce locking around TLB sync
operations
Commit 523d7423e21b ("iommu/arm-smmu: Remove io-pgtable spinlock")
removed the locking used to serialise map/unmap calls into the io-pgtable
code from the ARM SMMU driver. This is good for performance, but opens
us up to a nasty race with TLB syncs because the TLB sync register is
shared within a context bank (or even globally for stage-2 on SMMUv1).
There are two cases to consider:
1. A CPU can be spinning on the completion of a TLB sync, take an
interrupt which issues a subsequent TLB sync, and then report a
timeout on return from the interrupt.
2. A CPU can be spinning on the completion of a TLB sync, but other
CPUs can continuously issue additional TLB syncs in such a way that
the backoff logic reports a timeout.
Rather than fix this by spinning for completion of prior TLB syncs before
issuing a new one (which may suffer from fairness issues on large systems),
instead reintroduce locking around TLB sync operations in the ARM SMMU
driver.
Fixes: 523d7423e21b ("iommu/arm-smmu: Remove io-pgtable spinlock")
Cc: Robin Murphy <robin.murphy@arm.com>
Reported-by: Ray Jui <ray.jui@broadcom.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
drivers/iommu/arm-smmu.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index b446183b3015..770abd247f40 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -400,6 +400,8 @@ struct arm_smmu_device {
u32 cavium_id_base; /* Specific to Cavium */
+ spinlock_t global_sync_lock;
+
/* IOMMU core code handle */
struct iommu_device iommu;
};
@@ -436,7 +438,7 @@ struct arm_smmu_domain {
struct arm_smmu_cfg cfg;
enum arm_smmu_domain_stage stage;
struct mutex init_mutex; /* Protects smmu pointer */
- spinlock_t cb_lock; /* Serialises ATS1* ops */
+ spinlock_t cb_lock; /* Serialises ATS1* ops and TLB syncs */
struct iommu_domain domain;
};
@@ -602,9 +604,12 @@ static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu,
static void arm_smmu_tlb_sync_global(struct arm_smmu_device *smmu)
{
void __iomem *base = ARM_SMMU_GR0(smmu);
+ unsigned long flags;
+ spin_lock_irqsave(&smmu->global_sync_lock, flags);
__arm_smmu_tlb_sync(smmu, base + ARM_SMMU_GR0_sTLBGSYNC,
base + ARM_SMMU_GR0_sTLBGSTATUS);
+ spin_unlock_irqrestore(&smmu->global_sync_lock, flags);
}
static void arm_smmu_tlb_sync_context(void *cookie)
@@ -612,9 +617,12 @@ static void arm_smmu_tlb_sync_context(void *cookie)
struct arm_smmu_domain *smmu_domain = cookie;
struct arm_smmu_device *smmu = smmu_domain->smmu;
void __iomem *base = ARM_SMMU_CB(smmu, smmu_domain->cfg.cbndx);
+ unsigned long flags;
+ spin_lock_irqsave(&smmu_domain->cb_lock, flags);
__arm_smmu_tlb_sync(smmu, base + ARM_SMMU_CB_TLBSYNC,
base + ARM_SMMU_CB_TLBSTATUS);
+ spin_unlock_irqrestore(&smmu_domain->cb_lock, flags);
}
static void arm_smmu_tlb_sync_vmid(void *cookie)
@@ -1925,6 +1933,7 @@ static int arm_smmu_device_cfg_probe(struct arm_smmu_device *smmu)
smmu->num_mapping_groups = size;
mutex_init(&smmu->stream_map_mutex);
+ spin_lock_init(&smmu->global_sync_lock);
if (smmu->version < ARM_SMMU_V2 || !(id & ID0_PTFS_NO_AARCH32)) {
smmu->features |= ARM_SMMU_FEAT_FMT_AARCH32_L;
--
2.1.4
next prev parent reply other threads:[~2017-07-06 15:08 UTC|newest]
Thread overview: 50+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-06-08 11:51 [PATCH 0/8] io-pgtable lock removal Robin Murphy
2017-06-08 11:51 ` Robin Murphy
2017-06-08 11:52 ` [PATCH 1/8] iommu/io-pgtable-arm-v7s: Check table PTEs more precisely Robin Murphy
2017-06-08 11:52 ` Robin Murphy
2017-06-08 11:52 ` [PATCH 2/8] iommu/io-pgtable-arm: Improve split_blk_unmap Robin Murphy
2017-06-08 11:52 ` Robin Murphy
2017-06-08 11:52 ` [PATCH 3/8] iommu/io-pgtable-arm-v7s: Refactor split_blk_unmap Robin Murphy
2017-06-08 11:52 ` Robin Murphy
2017-06-08 11:52 ` [PATCH 4/8] iommu/io-pgtable: Introduce explicit coherency Robin Murphy
2017-06-08 11:52 ` Robin Murphy
2017-06-08 11:52 ` [PATCH 5/8] iommu/io-pgtable-arm: Support lockless operation Robin Murphy
2017-06-08 11:52 ` Robin Murphy
2017-06-08 11:52 ` [PATCH 6/8] iommu/io-pgtable-arm-v7s: " Robin Murphy
2017-06-08 11:52 ` Robin Murphy
2017-06-08 11:52 ` [PATCH 7/8] iommu/arm-smmu: Remove io-pgtable spinlock Robin Murphy
2017-06-08 11:52 ` Robin Murphy
2017-06-08 11:52 ` [PATCH 8/8] iommu/arm-smmu-v3: " Robin Murphy
2017-06-08 11:52 ` Robin Murphy
[not found] ` <cover.1496921366.git.robin.murphy-5wv7dgnIgG8@public.gmane.org>
2017-06-09 19:28 ` [PATCH 0/8] io-pgtable lock removal Nate Watterson
2017-06-09 19:28 ` Nate Watterson
[not found] ` <458ad41d-6679-eeca-3c0f-13ccb6c933b6-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
2017-06-15 0:40 ` Ray Jui via iommu
2017-06-15 0:40 ` Ray Jui
[not found] ` <b7830be1-9a78-e29f-a29c-4798aaa28c0a-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
2017-06-15 12:25 ` John Garry
2017-06-15 12:25 ` John Garry
2017-06-20 13:37 ` Robin Murphy
2017-06-20 13:37 ` Robin Murphy
[not found] ` <cdc1799b-f142-09ed-a7e5-d7fd2e70268f-5wv7dgnIgG8@public.gmane.org>
2017-06-27 16:43 ` Ray Jui via iommu
2017-06-27 16:43 ` Ray Jui
[not found] ` <e43ba1fe-696e-fabb-a800-52fadaa2fa93-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
2017-06-28 11:46 ` Will Deacon
2017-06-28 11:46 ` Will Deacon
[not found] ` <20170628114609.GD11053-5wv7dgnIgG8@public.gmane.org>
2017-06-28 17:02 ` Ray Jui via iommu
2017-06-28 17:02 ` Ray Jui
[not found] ` <87d53115-3d80-5a3d-6632-c31986cb7018-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
2017-07-04 17:31 ` Will Deacon
2017-07-04 17:31 ` Will Deacon
[not found] ` <20170704173155.GN22175-5wv7dgnIgG8@public.gmane.org>
2017-07-04 17:39 ` Ray Jui via iommu
2017-07-04 17:39 ` Ray Jui
[not found] ` <6814b246-22f0-bfaa-5002-a269b2735116-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
2017-07-05 1:45 ` Ray Jui via iommu
2017-07-05 1:45 ` Ray Jui
[not found] ` <2d5f5ef3-32b1-76c6-6869-ff980557f8e8-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
2017-07-05 8:41 ` Will Deacon
2017-07-05 8:41 ` Will Deacon
[not found] ` <20170705084143.GA9378-5wv7dgnIgG8@public.gmane.org>
2017-07-05 23:24 ` Ray Jui via iommu
2017-07-05 23:24 ` Ray Jui
[not found] ` <5149280b-a214-249c-c5e2-3712b1f941d2-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
2017-07-06 15:08 ` Will Deacon [this message]
2017-07-06 15:08 ` Will Deacon
[not found] ` <20170706150838.GB15574-5wv7dgnIgG8@public.gmane.org>
2017-07-06 18:14 ` Ray Jui via iommu
2017-07-06 18:14 ` Ray Jui
[not found] ` <94ba5d4a-0dae-9394-79ef-90da86e49c86-dY08KVG/lbpWk0Htik3J/w@public.gmane.org>
2017-07-07 12:46 ` Will Deacon
2017-07-07 12:46 ` Will Deacon
2017-06-21 15:47 ` Joerg Roedel
2017-06-21 15:47 ` Joerg Roedel
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170706150838.GB15574@arm.com \
--to=will.deacon-5wv7dgnigg8@public.gmane.org \
--cc=iommu-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org \
--cc=linu.cherian-YGCgFSpz5w/QT0dZR+AlfA@public.gmane.org \
--cc=linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org \
--cc=ray.jui-dY08KVG/lbpWk0Htik3J/w@public.gmane.org \
--cc=sunil.goutham-YGCgFSpz5w/QT0dZR+AlfA@public.gmane.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.