From: Kunkun Jiang <jiangkunkun@huawei.com>
To: Robin Murphy <robin.murphy@arm.com>, Will Deacon <will@kernel.org>
Cc: wanghaibin.wang@huawei.com, iommu@lists.linux-foundation.org,
zhukeqian1@huawei.com
Subject: Re: Some questions about arm_lpae_install_table
Date: Wed, 4 Nov 2020 15:17:58 +0800 [thread overview]
Message-ID: <6ad9732c-caa4-0049-3fa2-c193e5dc6a88@huawei.com> (raw)
In-Reply-To: <e5772a9c-df32-e865-3d0b-90f24ff233c0@arm.com>
Hi Will and Robin,
Sorry for the late reply.
On 2020/11/3 18:21, Robin Murphy wrote:
> On 2020-11-03 09:11, Will Deacon wrote:
>> On Tue, Nov 03, 2020 at 11:00:27AM +0800, Kunkun Jiang wrote:
>>> Recently, I have read and learned the code related to io-pgtable-arm.c.
>>> There
>>> are two question on arm_lpae_install_table.
>>>
>>> 1、the first
>>>
>>>> static arm_lpae_iopte arm_lpae_install_table(arm_lpae_iopte *table,
>>>> arm_lpae_iopte *ptep,
>>>> arm_lpae_iopte curr,
>>>> struct io_pgtable_cfg
>>>> *cfg)
>>>> {
>>>> arm_lpae_iopte old, new;
>>>>
>>>> new = __pa(table) | ARM_LPAE_PTE_TYPE_TABLE;
>>>> if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_NS)
>>>> new |= ARM_LPAE_PTE_NSTABLE;
>>>>
>>>> /*
>>>> * Ensure the table itself is visible before its PTE can be.
>>>> * Whilst we could get away with cmpxchg64_release below,
>>>> this
>>>> * doesn't have any ordering semantics when !CONFIG_SMP.
>>>> */
>>>> dma_wmb();
>>>>
>>>> old = cmpxchg64_relaxed(ptep, curr, new);
>>>>
>>>> if (cfg->coherent_walk || (old & ARM_LPAE_PTE_SW_SYNC))
>>>> return old;
>>>>
>>>> /* Even if it's not ours, there's no point waiting; just
>>>> kick it
>>>> */
>>>> __arm_lpae_sync_pte(ptep, cfg);
>>>> if (old == curr)
>>>> WRITE_ONCE(*ptep, new | ARM_LPAE_PTE_SW_SYNC);
>>>>
>>>> return old;
>>>> }
>>>
>>> If another thread changes the ptep between cmpxchg64_relaxed and
>>> WRITE_ONCE(*ptep, new | ARM_LPAE_PTE_SW_SYNC), the operation
>>> WRITE_ONCE will overwrite the change.
>>
>> Can you please provide an example of a code path where this happens? The
>> idea is that CPUs can race on the cmpxchg(), but there will only be one
>> winner.
>
> The only way a table entry can suddenly disappear is in a race that
> involves mapping or unmapping a whole block/table-sized region, while
> simultaneously mapping a page *within* that region. Yes, if someone
> uses the API in a nonsensical and completely invalid way that cannot
> have a predictable outcome, they get an unpredictable outcome. Whoop
> de do...
>
Yes, as Robin mentioned, there will be an unpredictable outcome in extreme
cases. Now, we have two APIs mapping and unmapping to alter a block/page
descriptor. I worry about that it may be increasingly difficult to add
API in
the future.
>>> 2、the second
>>>
>>>> for (i = 0; i < tablesz / sizeof(pte); i++, blk_paddr += split_sz) {
>>>> /* Unmap! */
>>>> if (i == unmap_idx)
>>>> continue;
>>>>
>>>> __arm_lpae_init_pte(data, blk_paddr, pte, lvl,
>>>> &tablep[i]);
>>>> }
>>>>
>>>> pte = arm_lpae_install_table(tablep, ptep, blk_pte, cfg);
>>>
>>> When altering a translation table descriptor include split a block into
>>> constituent granules, the Armv8-A and SMMUv3 architectures require
>>> a break-before-make procedure. But in the function
>>> arm_lpae_split_blk_unmap,
>>> it changes a block descriptor to an equivalent span of page
>>> translations
>>> directly. Is it appropriate to do so?
>>
>> Break-before-make doesn't really work for the SMMU because faults are
>> generally fatal.
>>
>> Are you seeing problems in practice with this code?
>
> TBH I do still wonder if we shouldn't just get rid of split_blk_unmap
> and all its complexity. Other drivers treat an unmap of a page from a
> block entry as simply unmapping the whole block, and that's the
> behaviour VFIO seems to expect. My only worry is that it's been around
> long enough that there might be some horrible out-of-tree code that
> *is* relying on it, despite the fact that it's impossible to implement
> in a way that's definitely 100% safe :/
>
> Robin.
> .
I have not encountered a problem. I mean SMMU has supported BBML feature
in SMMUv3.2. Maybe we can use this feature to update translation table
without
break-before-make.
Look forward to your reply.
Thanks,
Kunkun Jiang
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
next prev parent reply other threads:[~2020-11-04 7:18 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-11-03 3:00 Some questions about arm_lpae_install_table Kunkun Jiang
2020-11-03 9:11 ` Will Deacon
2020-11-03 10:21 ` Robin Murphy
2020-11-04 7:17 ` Kunkun Jiang [this message]
2020-11-04 11:15 ` Robin Murphy
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=6ad9732c-caa4-0049-3fa2-c193e5dc6a88@huawei.com \
--to=jiangkunkun@huawei.com \
--cc=iommu@lists.linux-foundation.org \
--cc=robin.murphy@arm.com \
--cc=wanghaibin.wang@huawei.com \
--cc=will@kernel.org \
--cc=zhukeqian1@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox