From: Robin Murphy <robin.murphy@arm.com>
To: Kunkun Jiang <jiangkunkun@huawei.com>, Will Deacon <will@kernel.org>
Cc: wanghaibin.wang@huawei.com, iommu@lists.linux-foundation.org,
zhukeqian1@huawei.com
Subject: Re: Some questions about arm_lpae_install_table
Date: Wed, 4 Nov 2020 11:15:19 +0000
Message-ID: <4fbe4ad5-2eef-c5a1-b60c-d00a274b5e35@arm.com>
In-Reply-To: <6ad9732c-caa4-0049-3fa2-c193e5dc6a88@huawei.com>

On 2020-11-04 07:17, Kunkun Jiang wrote:
> Hi Will and Robin,
>
> Sorry for the late reply.
>
> On 2020/11/3 18:21, Robin Murphy wrote:
>> On 2020-11-03 09:11, Will Deacon wrote:
>>> On Tue, Nov 03, 2020 at 11:00:27AM +0800, Kunkun Jiang wrote:
>>>> I have recently been reading the code in io-pgtable-arm.c, and I have
>>>> two questions about arm_lpae_install_table.
>>>>
>>>> 1. The first
>>>>
>>>>> static arm_lpae_iopte arm_lpae_install_table(arm_lpae_iopte *table,
>>>>>                                              arm_lpae_iopte *ptep,
>>>>>                                              arm_lpae_iopte curr,
>>>>>                                              struct io_pgtable_cfg *cfg)
>>>>> {
>>>>>         arm_lpae_iopte old, new;
>>>>>
>>>>>         new = __pa(table) | ARM_LPAE_PTE_TYPE_TABLE;
>>>>>         if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_NS)
>>>>>                 new |= ARM_LPAE_PTE_NSTABLE;
>>>>>
>>>>>         /*
>>>>>          * Ensure the table itself is visible before its PTE can be.
>>>>>          * Whilst we could get away with cmpxchg64_release below, this
>>>>>          * doesn't have any ordering semantics when !CONFIG_SMP.
>>>>>          */
>>>>>         dma_wmb();
>>>>>
>>>>>         old = cmpxchg64_relaxed(ptep, curr, new);
>>>>>
>>>>>         if (cfg->coherent_walk || (old & ARM_LPAE_PTE_SW_SYNC))
>>>>>                 return old;
>>>>>
>>>>>         /* Even if it's not ours, there's no point waiting; just kick it */
>>>>>         __arm_lpae_sync_pte(ptep, cfg);
>>>>>         if (old == curr)
>>>>>                 WRITE_ONCE(*ptep, new | ARM_LPAE_PTE_SW_SYNC);
>>>>>
>>>>>         return old;
>>>>> }
>>>>
>>>> If another thread changes *ptep between cmpxchg64_relaxed and
>>>> WRITE_ONCE(*ptep, new | ARM_LPAE_PTE_SW_SYNC), the WRITE_ONCE
>>>> will overwrite that change.
>>>
>>> Can you please provide an example of a code path where this happens? The
>>> idea is that CPUs can race on the cmpxchg(), but there will only be one
>>> winner.
>>
>> The only way a table entry can suddenly disappear is in a race that
>> involves mapping or unmapping a whole block/table-sized region, while
>> simultaneously mapping a page *within* that region. Yes, if someone
>> uses the API in a nonsensical and completely invalid way that cannot
>> have a predictable outcome, they get an unpredictable outcome. Whoop
>> de do...
>>
> Yes, as Robin mentioned, the outcome will be unpredictable in extreme
> cases. At present we have two APIs, mapping and unmapping, that alter a
> block/page descriptor. I worry that it may become increasingly difficult
> to add new APIs in the future.

I still don't understand your concern. If two threads try to
simultaneously create a block mapping and a page mapping *for the same
virtual address*, the result is that one of those mappings will end up
in place, depending on who managed to write to the PTE last. This code
is self-consistent - it doesn't crash, nor end up with any mapping that
*wasn't* requested - it just doesn't go far out of its way to
accommodate obviously invalid usage. By comparison, say instead those
two threads simultaneously call kfree() on the same pointer; the outcome
of that is considerably worse.

What kind of additional new API are you imagining where such
deliberately racy behaviour would be anything other than broken nonsense?

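The "only one winner" property of the cmpxchg() discussed above can be illustrated in user space. What follows is a minimal sketch using C11 atomics, not the kernel code: pte_t, PTE_TYPE_TABLE and install_table() are hypothetical stand-ins for arm_lpae_iopte, ARM_LPAE_PTE_TYPE_TABLE and arm_lpae_install_table(), and the non-coherent SW_SYNC path is deliberately left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types and constants. */
typedef uint64_t pte_t;
#define PTE_TYPE_TABLE 0x3ULL

/*
 * Try to install new_pte in *ptep, but only if *ptep still holds curr.
 * Returns the value found in *ptep before the attempt: if that equals
 * curr, this caller won; otherwise it is the entry someone else installed.
 */
static pte_t install_table(_Atomic pte_t *ptep, pte_t curr, pte_t new_pte)
{
    pte_t old = curr;

    assert(ptep != NULL);

    /* On failure, 'old' is overwritten with the value actually in *ptep. */
    atomic_compare_exchange_strong_explicit(ptep, &old, new_pte,
                                            memory_order_relaxed,
                                            memory_order_relaxed);
    return old;
}
```

If two callers both pass curr == 0 for the same empty entry, the first gets 0 back and its table is installed; the second gets the first caller's entry back, so it knows it lost and can drop its own table and walk the winner's instead.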
>>>> 2. The second
>>>>
>>>>>         for (i = 0; i < tablesz / sizeof(pte); i++, blk_paddr += split_sz) {
>>>>>                 /* Unmap! */
>>>>>                 if (i == unmap_idx)
>>>>>                         continue;
>>>>>
>>>>>                 __arm_lpae_init_pte(data, blk_paddr, pte, lvl, &tablep[i]);
>>>>>         }
>>>>>
>>>>>         pte = arm_lpae_install_table(tablep, ptep, blk_pte, cfg);
>>>>
>>>> When altering a translation table descriptor, which includes splitting
>>>> a block into its constituent granules, the Armv8-A and SMMUv3
>>>> architectures require a break-before-make procedure. But
>>>> arm_lpae_split_blk_unmap changes a block descriptor directly into an
>>>> equivalent span of page translations. Is it appropriate to do so?
>>>
>>> Break-before-make doesn't really work for the SMMU because faults are
>>> generally fatal.
>>>
>>> Are you seeing problems in practice with this code?
>>
>> TBH I do still wonder if we shouldn't just get rid of split_blk_unmap
>> and all its complexity. Other drivers treat an unmap of a page from a
>> block entry as simply unmapping the whole block, and that's the
>> behaviour VFIO seems to expect. My only worry is that it's been around
>> long enough that there might be some horrible out-of-tree code that
>> *is* relying on it, despite the fact that it's impossible to implement
>> in a way that's definitely 100% safe :/
>>
>> Robin.
>>
> I have not encountered a problem. What I mean is that the SMMU has
> supported the BBML feature since SMMUv3.2. Maybe we can use this feature
> to update the translation table without break-before-make.

We could. But then what do we do for all the existing hardware without
BBML, or on x86 or other drivers which don't split blocks anyway?

Robin.
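
For readers following the second question, the table-filling loop quoted from arm_lpae_split_blk_unmap can be modelled in user space roughly as below. This is only a sketch under stated assumptions: pte_t, PTE_TYPE_PAGE and split_block() are hypothetical names, the attribute bits that the kernel copies from the block entry are ignored, and the step that installs the new table over the old block entry is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types and constants. */
typedef uint64_t pte_t;
#define PTE_TYPE_PAGE 0x3ULL

/*
 * Fill 'table' with nents page entries that together cover the block
 * starting at blk_paddr, each entry mapping split_sz bytes. The entry
 * at unmap_idx is left invalid (zero), which is what unmaps that page.
 * Returns the number of valid entries written.
 */
static size_t split_block(pte_t *table, size_t nents, uint64_t blk_paddr,
                          uint64_t split_sz, size_t unmap_idx)
{
    size_t written = 0;

    assert(table != NULL);

    for (size_t i = 0; i < nents; i++, blk_paddr += split_sz) {
        if (i == unmap_idx) {
            table[i] = 0;   /* the page being unmapped stays invalid */
            continue;
        }
        table[i] = blk_paddr | PTE_TYPE_PAGE;
        written++;
    }
    return written;
}
```

The point of contention in the thread is what happens next: the filled table replaces the live block entry in a single step, with no intermediate invalid entry, which is exactly where break-before-make would otherwise apply.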