From: linu.cherian@cavium.com (Linu Cherian)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v2 5/8] iommu/io-pgtable-arm: Support lockless operation
Date: Fri, 23 Jun 2017 11:23:26 +0530
Message-ID: <20170623055326.GA2949@virtx40>
In-Reply-To: <af61929e3984a31588cab08ed84a237602b5263e.1498145008.git.robin.murphy@arm.com>
Robin,
I was trying to understand the new changes and had a few questions on
arm_lpae_install_table.
On Thu Jun 22, 2017 at 04:53:54PM +0100, Robin Murphy wrote:
> For parallel I/O with multiple concurrent threads servicing the same
> device (or devices, if several share a domain), serialising page table
> updates becomes a massive bottleneck. On reflection, though, we don't
> strictly need to do that - for valid IOMMU API usage, there are in fact
> only two races that we need to guard against: multiple map requests for
> different blocks within the same region, when the intermediate-level
> table for that region does not yet exist; and multiple unmaps of
> different parts of the same block entry. Both of those are fairly easily
> solved by using a cmpxchg to install the new table, such that if we then
> find that someone else's table got there first, we can simply free ours
> and continue.
>
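As a reading aid, the install race described above can be sketched in
userspace with C11 atomics. This is only an analogy, not the kernel code:
slot_t, ENTRIES and get_or_install_table() are hypothetical names, calloc()
stands in for the page allocator, and the atomics only order CPU threads
against each other, not against a hardware table walker.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

#define ENTRIES 512 /* hypothetical table size for this sketch */

typedef _Atomic uintptr_t slot_t; /* stands in for one parent-level PTE */

/*
 * Return the next-level table for *slot, allocating and installing one
 * if the slot is still empty. If another thread installs its table
 * first, free ours and continue with theirs. Returns NULL only when
 * the allocation fails.
 */
static uint64_t *get_or_install_table(slot_t *slot)
{
	uintptr_t expected = 0;
	uintptr_t cur;
	uint64_t *table;

	assert(slot != NULL);

	cur = atomic_load_explicit(slot, memory_order_acquire);
	if (cur)
		return (uint64_t *)cur;

	/* Fully initialise the new table before trying to publish it. */
	table = calloc(ENTRIES, sizeof(*table));
	if (!table)
		return NULL;

	/* Only one thread can move the slot from empty to non-empty. */
	if (atomic_compare_exchange_strong_explicit(slot, &expected,
						    (uintptr_t)table,
						    memory_order_acq_rel,
						    memory_order_acquire))
		return table;

	/* Lost the race: 'expected' now holds the winner's table. */
	free(table);
	return (uint64_t *)expected;
}
```

Two threads mapping different blocks in the same region would both call
get_or_install_table() on the same slot and end up walking the same table,
whichever of them won the compare-and-exchange.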
> Make the requisite changes such that we can withstand being called
> without the caller maintaining a lock. In theory, this opens up a few
> corners in which wildly misbehaving callers making nonsensical
> overlapping requests might lead to crashes instead of just unpredictable
> results, but correct code really does not deserve to pay a significant
> performance cost for the sake of masking bugs in theoretical broken code.
>
> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
> ---
>
> v2:
> - Fix barriers in install_table
> - Make a few more PTE accesses with {READ,WRITE}_ONCE() just in case
> - Minor cosmetics
>
> drivers/iommu/io-pgtable-arm.c | 72 +++++++++++++++++++++++++++++++++---------
> 1 file changed, 57 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> index 6334f51912ea..52700fa958c2 100644
> --- a/drivers/iommu/io-pgtable-arm.c
> +++ b/drivers/iommu/io-pgtable-arm.c
> @@ -20,6 +20,7 @@
>
> #define pr_fmt(fmt) "arm-lpae io-pgtable: " fmt
>
> +#include <linux/atomic.h>
> #include <linux/iommu.h>
> #include <linux/kernel.h>
> #include <linux/sizes.h>
> @@ -99,6 +100,8 @@
> #define ARM_LPAE_PTE_ATTR_HI_MASK (((arm_lpae_iopte)6) << 52)
> #define ARM_LPAE_PTE_ATTR_MASK (ARM_LPAE_PTE_ATTR_LO_MASK | \
> ARM_LPAE_PTE_ATTR_HI_MASK)
> +/* Software bit for solving coherency races */
> +#define ARM_LPAE_PTE_SW_SYNC (((arm_lpae_iopte)1) << 55)
>
> /* Stage-1 PTE */
> #define ARM_LPAE_PTE_AP_UNPRIV (((arm_lpae_iopte)1) << 6)
> @@ -249,15 +252,20 @@ static void __arm_lpae_free_pages(void *pages, size_t size,
> free_pages_exact(pages, size);
> }
>
> +static void __arm_lpae_sync_pte(arm_lpae_iopte *ptep,
> + struct io_pgtable_cfg *cfg)
> +{
> + dma_sync_single_for_device(cfg->iommu_dev, __arm_lpae_dma_addr(ptep),
> + sizeof(*ptep), DMA_TO_DEVICE);
> +}
> +
> static void __arm_lpae_set_pte(arm_lpae_iopte *ptep, arm_lpae_iopte pte,
> struct io_pgtable_cfg *cfg)
> {
> *ptep = pte;
>
> if (!(cfg->quirks & IO_PGTABLE_QUIRK_NO_DMA))
> - dma_sync_single_for_device(cfg->iommu_dev,
> - __arm_lpae_dma_addr(ptep),
> - sizeof(pte), DMA_TO_DEVICE);
> + __arm_lpae_sync_pte(ptep, cfg);
> }
>
> static int __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,
> @@ -314,16 +322,30 @@ static int arm_lpae_init_pte(struct arm_lpae_io_pgtable *data,
>
> static arm_lpae_iopte arm_lpae_install_table(arm_lpae_iopte *table,
> arm_lpae_iopte *ptep,
> + arm_lpae_iopte curr,
> struct io_pgtable_cfg *cfg)
> {
> - arm_lpae_iopte new;
> + arm_lpae_iopte old, new;
>
> new = __pa(table) | ARM_LPAE_PTE_TYPE_TABLE;
> if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_NS)
> new |= ARM_LPAE_PTE_NSTABLE;
>
> - __arm_lpae_set_pte(ptep, new, cfg);
> - return new;
> + /* Ensure the table itself is visible before its PTE can be */
> + wmb();
Could you please give more hints on why this is required?
> +
> + old = cmpxchg64_relaxed(ptep, curr, new);
> +
> + if ((cfg->quirks & IO_PGTABLE_QUIRK_NO_DMA) ||
> + (old & ARM_LPAE_PTE_SW_SYNC))
> + return old;
> +
> + /* Even if it's not ours, there's no point waiting; just kick it */
> + __arm_lpae_sync_pte(ptep, cfg);
> + if (old == curr)
> + WRITE_ONCE(*ptep, new | ARM_LPAE_PTE_SW_SYNC);
How is it ensured that the cache operations have completed before we flag the
PTE with ARM_LPAE_PTE_SW_SYNC? The previous version had a wmb() after the sync
operation.
> +
> + return old;
> }
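To make the ordering questions concrete, here is a minimal single-threaded
model of the function as I read it, written with C11 atomics. It is a sketch
under assumptions, not the kernel implementation: fake_sync_pte() is a
hypothetical stand-in for __arm_lpae_sync_pte() that only counts calls, the
NO_DMA quirk is left out, and the release fences are userspace stand-ins for
wmb(). Whether the second fence is needed in the kernel depends on what the
real sync operation guarantees, which this model does not capture.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define PTE_SW_SYNC (UINT64_C(1) << 55) /* same bit position as the patch */

static int sync_calls; /* number of simulated cache maintenance operations */

/* Hypothetical stand-in for __arm_lpae_sync_pte(); it only counts calls. */
static void fake_sync_pte(_Atomic uint64_t *ptep)
{
	(void)ptep;
	sync_calls++;
}

/* Try to replace curr with new_pte; return the value seen before the attempt. */
static uint64_t install_table(_Atomic uint64_t *ptep, uint64_t curr,
			      uint64_t new_pte)
{
	uint64_t old = curr;

	/* Order the table's contents before the PTE that points at it. */
	atomic_thread_fence(memory_order_release);

	/* On failure, 'old' is overwritten with the value actually found. */
	atomic_compare_exchange_strong_explicit(ptep, &old, new_pte,
						memory_order_relaxed,
						memory_order_relaxed);

	if (old & PTE_SW_SYNC)
		return old; /* the installer has already synced this PTE */

	/* Even if the PTE is not ours, sync it rather than wait. */
	fake_sync_pte(ptep);

	if (old == curr) {
		/* Assumption: the flag must not become visible before the sync. */
		atomic_thread_fence(memory_order_release);
		atomic_store_explicit(ptep, new_pte | PTE_SW_SYNC,
				      memory_order_relaxed);
	}

	return old;
}
```

In this model the first caller installs its PTE, syncs it once and then sets
the flag; a later caller that finds the flag set returns without syncing
again.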
>
Thanks.
--
Linu Cherian