From: Oleksandr Tyshchenko <olekstysh@gmail.com>
To: Julien Grall <julien.grall@arm.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>,
Oleksandr Andrushchenko <andr2000@gmail.com>,
Andrew Cooper <andrew.cooper3@citrix.com>,
Andrii Anisov <andrii.anisov@gmail.com>,
Volodymyr Babchuk <vlad.babchuk@gmail.com>,
al1img <al1img@gmail.com>,
xen-devel@lists.xenproject.org,
Artem Mygaiev <joculator@gmail.com>
Subject: Re: Shattering superpages impact on IOMMU in Xen
Date: Mon, 3 Apr 2017 20:53:45 +0300
Message-ID: <CAPD2p-mg2KOUWzVyxAQS9uRm2WmKGMr55Kw4zWZriwtSUeWyfA@mail.gmail.com>
In-Reply-To: <CAPD2p-mmkausPp10PabKFaQvRoAh=vTbQyt=JFm+usVzJ1zJZA@mail.gmail.com>
On Mon, Apr 3, 2017 at 8:39 PM, Oleksandr Tyshchenko
<olekstysh@gmail.com> wrote:
> Hi, Julien.
>
> On Mon, Apr 3, 2017 at 8:02 PM, Julien Grall <julien.grall@arm.com> wrote:
>> Hi Andrew,
>>
>>
>> On 03/04/17 17:42, Andrew Cooper wrote:
>>>
>>> On 03/04/17 17:24, Oleksandr Tyshchenko wrote:
>>>>
>>>> Hi, all.
>>>>
>>>> Playing with non-shared IOMMU in Xen on ARM I faced one interesting
>>>> thing. I found out that the superpages were shattered during domain
>>>> life cycle.
>>>> This is the result of mapping foreign pages, ballooning memory, or
>>>> even the domain mapping Xen shared pages, etc.
>>>> I am not concerned about memory fragmentation at the moment. But
>>>> shattering bothers me from the IOMMU point of view.
>>>> As Xen owns the IOMMU, it might manipulate the IOMMU page tables while
>>>> a passed-through/protected device is doing DMA in Linux. It is hard to
>>>> detect when no DMA transaction is in progress in order to prevent this
>>>> race. So, if we have an in-flight transaction from a device when
>>>> changing the IOMMU mapping, we might get into trouble. Unfortunately,
>>>> the faulting transaction cannot be restarted in all cases. The chance
>>>> of hitting the problem increases during shattering.
>>>>
>>>> I ran the following test:
>>>> The dom0 on my setup contains an ethernet IP that is protected by the
>>>> IOMMU. What is more, as the IOMMU I am playing with supports
>>>> superpages (2M, 1G), the IOMMU driver takes these capabilities into
>>>> account when building page tables. As I gave 256 MB to dom0, the IOMMU
>>>> mapping was built from 2M memory blocks only. As I am using NFS for
>>>> both dom0 and domU, the ethernet IP performs DMA transactions almost
>>>> all the time.
>>>> Sometimes, I see IOMMU page faults while creating a guest domain. I
>>>> think it happens while Xen is shattering 2M mappings into 4K mappings
>>>> (it unmaps dom0 pages one 4K page at a time, then maps domU pages there
>>>> for copying domU images).
>>>> But I don't see any page faults when the IOMMU page table is built
>>>> from 4K pages only.
>>>>
>>>> I had a talk with Julien on IRC and we came to the conclusion that the
>>>> safest way would be to use 4K pages to prevent shattering, so the
>>>> IOMMU shouldn't report superpage capability.
>>>> On the other hand, if we build the IOMMU page tables from 4K pages we
>>>> will have a performance drop (when building and walking page tables),
>>>> TLB pressure, etc.
>>>> Another possible solution Julien suggested is to always balloon with
>>>> 2M or 1G, and never with 4K. That would help us to prevent the
>>>> shattering effect.
>>>> The discussion was moved to the ML since it seems to be a generic
>>>> issue and the right solution should be thought through.
>>>>
>>>> What do you think is the right way to follow? Use 4K pages and not
>>>> bother with shattering, or try to optimize? And if the idea of making
>>>> the balloon mechanism smarter makes sense, how do we teach the balloon
>>>> to do so?
>>>> Thank you.
>>>
>>>
>>> Ballooning and foreign mappings are terrible for trying to retain
>>> superpage mappings. No OS, not even Linux, can sensibly provide victim
>>> pages in a useful way to avoid shattering.
>>>
>>> If you care about performance, don't ever balloon. Foreign mappings in
>>> translated guests should start from the top of RAM, and work upwards.
>>
>>
>> I am not sure I understand this. Can you expand?
>>
>>>
>>>
>>> As for the IOMMU specifically, things are rather easier. It is the
>>> guest's responsibility to ensure that frames offered up for ballooning
>>> or foreign mappings are unused. Therefore, if anything cares about the
>>> specific 4K region becoming non-present in the IOMMU mappings, it is the
>>> guest kernel's fault for offering up a frame already in use.
>>>
>>> For the shattering, however, it is Xen's responsibility to ensure that
>>> all other mappings stay valid at all points. The correct way to do this
>>> is to construct a new L1 table, mirroring the L2 superpage but lacking
>>> the specific 4K mapping in question, then atomically replace the L2
>>> superpage entry with the new L1 table, then issue an IOMMU TLB
>>> invalidation to remove any cached mappings.
>>>
>>> By following that procedure, all DMA within the 2M region, but not
>>> hitting the 4K frame, won't observe any interim lack of mappings. It
>>> appears from your description that Xen isn't following the procedure.
>>
>>
>> Xen is following what the ARM ARM mandates. For shattering a page table,
>> we have to follow the break-before-make sequence, i.e.:
>> - Invalidate the L2 entry
>> - Flush the TLBs
>> - Add the new L1 table
>>
>> See D4-1816 in ARM DDI 0487A.k_iss10775 for details. So we end up with a
>> small window where there is no valid mapping. It is easy to trap a data
>> abort from the processor and restart it, but not for device memory
>> transactions.
>>
>> Xen by default shares the stage-2 page tables between the IOMMU and the
>> MMU. However, from the discussion I had with Oleksandr, they are not
>> sharing page tables and still see the problem. I am not sure how they are
>> updating the page table here. Oleksandr, can you provide more details?
>
> Yes, the IOMMU is an IPMMU-VMSA that doesn't share page tables with the
> CPU. It uses a stage-1 page table. So, the IOMMU driver builds its own
> page table and feeds it to the HW IP.
> For this reason I ported the ARM LPAE page table allocator [1] and
> modified it to work inside Xen [2]. I hope that I didn't break or
> change the allocation/updating/removing logic.
>
> What we have during shattering:
> In order to unmap a single 4K page within a 2M memory block, the IOMMU
> driver has to split the memory block into 512 4K pages and map them at
> the next level, except for the one page we are trying to unmap. Then
> it has to replace the old block entry with the new table entry and
> perform a cache flush.
>
> [1] http://lxr.free-electrons.com/source/drivers/iommu/io-pgtable-arm.c
> [2] https://github.com/otyshchenko1/xen/blob/ipmmu_ml/xen/drivers/passthrough/arm/io-pgtable-arm.c
>
>>
>> Cheers,
>>
>> --
>> Julien Grall
>
>
>
> --
> Regards,
>
> Oleksandr Tyshchenko
Julien,
Did you mean how I update the IOMMU mapping from the P2M code?
If so, please see:
https://www.mail-archive.com/xen-devel@lists.xen.org/msg100470.html
As the IOMMU supports the same page granularities as the CPU on ARM (1G,
2M, 4K), I think that the IOMMU stage-1 page table ends up containing the
same superpages as the CPU stage-2 page table.
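
To make the splitting step quoted above more concrete, here is a minimal
sketch in C of how a 2M block could be turned into a next-level table of
512 4K entries with one hole. This is not the code from io-pgtable-arm.c:
the function name, the SKETCH_* constants and the flat attrs argument are
assumptions made purely for illustration, and the descriptor encoding is
simplified. Replacing the old block entry and the subsequent flush are
deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants only (4K granule, 512 entries per table). */
#define SKETCH_PAGE_SHIFT   12
#define SKETCH_PTES         512
/* Simplified "valid page" descriptor type bits for the last level. */
#define SKETCH_PTE_PAGE     0x3ULL

/*
 * Fill 'table' so that it mirrors the 2M block starting at 'block_pa',
 * leaving entry 'hole_idx' (the 4K page being unmapped) invalid.
 * Returns the number of entries that were mapped.
 */
static unsigned int sketch_split_block(uint64_t block_pa, uint64_t attrs,
                                       unsigned int hole_idx,
                                       uint64_t table[SKETCH_PTES])
{
    unsigned int i, mapped = 0;

    assert(hole_idx < SKETCH_PTES);

    for (i = 0; i < SKETCH_PTES; i++) {
        if (i == hole_idx) {
            /* The page being unmapped stays non-present. */
            table[i] = 0;
            continue;
        }
        /* Every other 4K page keeps the same output address and attrs. */
        table[i] = (block_pa + ((uint64_t)i << SKETCH_PAGE_SHIFT))
                   | attrs | SKETCH_PTE_PAGE;
        mapped++;
    }

    return mapped;
}
```

Only after such a table is fully populated would the old block entry be
replaced with a table entry pointing to it, followed by the flush.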
--
Regards,
Oleksandr Tyshchenko
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel