From: Oleksandr Tyshchenko <olekstysh@gmail.com>
To: Stefano Stabellini <sstabellini@kernel.org>
Cc: Oleksandr Andrushchenko <andr2000@gmail.com>,
Andrew Cooper <andrew.cooper3@citrix.com>,
al1img <al1img@gmail.com>,
Andrii Anisov <andrii.anisov@gmail.com>,
Volodymyr Babchuk <vlad.babchuk@gmail.com>,
Julien Grall <julien.grall@arm.com>,
xen-devel@lists.xenproject.org,
Artem Mygaiev <joculator@gmail.com>
Subject: Re: Shattering superpages impact on IOMMU in Xen
Date: Tue, 4 Apr 2017 12:28:52 +0300
Message-ID: <CAPD2p-kGppx8fSmr6Pt_ePpmnz6vtLt3xyBbj2LMKfR8FNwMUw@mail.gmail.com>
In-Reply-To: <alpine.DEB.2.10.1704031323170.3039@sstabellini-ThinkPad-X260>
Hi, Stefano.
On Mon, Apr 3, 2017 at 11:33 PM, Stefano Stabellini
<sstabellini@kernel.org> wrote:
> On Mon, 3 Apr 2017, Oleksandr Tyshchenko wrote:
>> On Mon, Apr 3, 2017 at 9:06 PM, Julien Grall <julien.grall@arm.com> wrote:
>> > Hi Andrew,
>> >
>> >
>> > On 03/04/17 18:16, Andrew Cooper wrote:
>> >>
>> >> On 03/04/17 18:02, Julien Grall wrote:
>> >>>
>> >>> Hi Andrew,
>> >>>
>> >>> On 03/04/17 17:42, Andrew Cooper wrote:
>> >>>>
>> >>>> On 03/04/17 17:24, Oleksandr Tyshchenko wrote:
>> >>>>>
>> >>>>> Hi, all.
>> >>>>>
>> >>>>> Playing with non-shared IOMMU in Xen on ARM I faced one interesting
>> >>>>> thing. I found out that the superpages were shattered during domain
>> >>>>> life cycle.
>> >>>>> This is the result of mapping of foreign pages, ballooning memory,
>> >>>>> even if domain maps Xen shared pages, etc.
>> >>>>> I am not concerned about memory fragmentation at the moment, but
>> >>>>> shattering bothers me from the IOMMU point of view.
>> >>>>> As Xen owns the IOMMU, it might manipulate IOMMU page tables while a
>> >>>>> passed-through/protected device is doing DMA in Linux. It is hard to
>> >>>>> detect when no DMA transaction is in progress
>> >>>>> in order to prevent this race. So, if we have an in-flight transaction
>> >>>>> from a device when changing the IOMMU mapping, we might get into trouble.
>> >>>>> Unfortunately, the faulting transaction cannot be restarted in all
>> >>>>> cases. The chance of hitting the problem
>> >>>>> increases during shattering.
>> >>>>>
>> >>>>> I ran the following test:
>> >>>>> Dom0 on my setup contains an Ethernet IP that is protected by the IOMMU.
>> >>>>> What is more, as the IOMMU I am playing with supports superpages (2M,
>> >>>>> 1G), the IOMMU driver
>> >>>>> takes these capabilities into account when building page tables. As I
>> >>>>> gave 256 MB to dom0, the IOMMU mapping was built from 2M memory blocks
>> >>>>> only. As I am using NFS for both dom0 and domU, the Ethernet IP
>> >>>>> performs DMA transactions almost all the time.
>> >>>>> Sometimes I see IOMMU page faults while creating a guest domain. I
>> >>>>> think this happens while Xen is shattering 2M mappings into 4K mappings (it
>> >>>>> unmaps dom0 pages one 4K page at a time, then maps domU pages there
>> >>>>> for copying domU images).
>> >>>>> But I don't see any page faults when the IOMMU page table is built
>> >>>>> from 4K pages only.
>> >>>>>
>> >>>>> I had a talk with Julien on IRC and we came to the conclusion that the
>> >>>>> safest way would be to use 4K pages to prevent shattering, so the
>> >>>>> IOMMU shouldn't report superpage capability.
>> >>>>> On the other hand, if we build IOMMU page tables from 4K pages we will have
>> >>>>> a performance drop (while building and walking page tables), TLB pressure,
>> >>>>> etc.
>> >>>>> Another possible solution Julien suggested is to always
>> >>>>> balloon with 2M or 1G, and never 4K. That would help us to
>> >>>>> prevent the shattering effect.
>> >>>>> The discussion was moved to the ML since it seems to be a generic
>> >>>>> issue and the right solution should be thought through.
>> >>>>>
>> >>>>> What do you think is the right way to go? Use 4K pages and not
>> >>>>> bother with shattering, or try to optimize? And if the idea of making
>> >>>>> the balloon mechanism smarter makes sense, how do we teach the balloon to do so?
>> >>>>> Thank you.
>> >>>>
>> >>>>
>> >>>> Ballooning and foreign mappings are terrible for trying to retain
>> >>>> superpage mappings. No OS, not even Linux, can sensibly provide victim
>> >>>> pages in a useful way to avoid shattering.
>> >>>>
>> >>>> If you care about performance, don't ever balloon. Foreign mappings in
>> >>>> translated guests should start from the top of RAM, and work upwards.
>> >>>
>> >>>
>> >>> I am not sure I understand this. Can you elaborate?
>> >>
>> >>
>> >> I am not sure what is unclear. Handing random frames of RAM back to the
>> >> hypervisor is what exacerbates host superpage fragmentation, and all
>> >> balloon drivers currently do it.
>> >>
>> >> If you want to avoid host superpage fragmentation, don't use a
>> >> scattergun approach of handing frames back to Xen. However, because
>> >> even Linux doesn't provide enough hooks into the physical memory
>> >> management logic, the only solution is to not balloon at all, and to use
>> >> already-unoccupied frames for foreign mappings.
>> >
>> >
>> > Do you have any pointer in the Linux code?
>> >
>> >
>> >>
>> >>>
>> >>>>
>> >>>>
>> >>>> As for the IOMMU specifically, things are rather easier. It is the
>> >>>> guest's responsibility to ensure that frames offered up for ballooning or
>> >>>> foreign mappings are unused. Therefore, if anything cares about the
>> >>>> specific 4K region becoming non-present in the IOMMU mappings, it is the
>> >>>> guest kernel's fault for offering up a frame already in use.
>> >>>>
>> >>>> For the shattering, however, it is Xen's responsibility to ensure that
>> >>>> all other mappings stay valid at all points. The correct way to do this
>> >>>> is to construct a new L1 table, mirroring the L2 superpage but lacking
>> >>>> the specific 4K mapping in question, then atomically replace the L2
>> >>>> superpage entry with the new L1 table, then issue an IOMMU TLB
>> >>>> invalidation to remove any cached mappings.
>> >>>>
>> >>>> By following that procedure, all DMA within the 2M region, but not
>> >>>> hitting the 4K frame, won't observe any interim lack of mappings. It
>> >>>> appears from your description that Xen isn't following the procedure.
>> >>>
>> >>>
>> >>> Xen is following what the ARM ARM mandates. For shattering page
>> >>> tables, we have to follow the break-before-make sequence, i.e.:
>> >>> - Invalidate the L2 entry
>> >>> - Flush the TLBs
>> >>> - Add the new L1 table
>> >>> See D4-1816 in ARM DDI 0487A.k_iss10775 for details. So we end up with a
>> >>> small window where there is no valid mapping. It is easy to trap a data
>> >>> abort from the processor and restart the access, but not for device memory
>> >>> transactions.
>> >>>
>> >>> Xen by default shares stage-2 page tables between the IOMMU
>> >>> and the MMU. However, from the discussion I had with Oleksandr, they
>> >>> are not sharing page tables and still see the problem. I am not sure
>> >>> how they are updating the page table here. Oleksandr, can you provide
>> >>> more details?
>> >>
>> >>
>> >> Are you saying that ARM has no way of making atomic updates to the IOMMU
>> >> mappings? (How do I get access to that document? Google gets me to
>> >>
>> >> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.subset.architecture.reference/index.html,
>> >> but
>> >> http://infocenter.arm.com/help/topic/com.arm.doc.ddi0487a.k/index.html
>> >> which looks like the document you specified results in 404.)
>> >
>> >
>> > Below is a link; I am not sure why Google does not index it:
>> >
>> > http://infocenter.arm.com/help/topic/com.arm.doc.ddi0487a.k_10775/index.html
>> >
>> >>
>> >> If so, that is an architecture bug IMO. By design, the IOMMU is out of
>> >> control of guest software, and the hypervisor should be able to make
>> >> atomic modifications without guest cooperation.
>> >
>> >
>> > I think you misread what I meant: the IOMMU supports atomic operations. However,
>> > if you share the page table we have to apply Break-Before-Make when
>> > shattering a superpage. This is mandatory if you want to get Xen running on
>> > all the micro-architectures.
>> >
>> > Some IOMMUs may cope with BBM, some may not. I haven't seen any issue so far
>> > (which does not mean there are none).
>> >
>> > The IOMMU used by Oleksandr (e.g VMSA-IPMMU) is an IP from Renesas which I
>> > never used myself. In his case he needs different page tables because the
>> > layouts are not the same.
>> >
>> > Oleksandr, looking at the code you provided, the superpages are split the
>> > way Andrew said, i.e.:
>> > 1) allocating level 3 table minus the 4K mapping
>> > 2) replace level 2 entry with the new table
>> >
>> > Am I right?
>>
>> It seems so, yes. Walking down the page table when trying to unmap, we
>> bump into a leaf entry (2M mapping),
>> so the 2M-4K mappings are inserted at the next level and after that the
>> page table entry is replaced.
>
> Let me start by saying that Andrew rightly pointed out what the right
> approach to dealing with this issue should be. However, if we have to use
> break-before-make for IOMMU pagetables, then it means we cannot do
> atomic updates to IOMMU mappings, like Andrew wrote. Therefore, we
> have to make a choice: we either disable superpage IOMMU mappings or
> ballooning. I would disable IOMMU superpage mappings, on the ground that
> supporting superpage mappings without supporting atomic shattering or
> restartable transactions is not really supporting superpage mappings.
Sounds reasonable. As Julien also mentioned, "using 4K pages only" is
the safest way, at least until I find out why DMA faults occur even
though the shattering is done atomically.
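For reference, here is a minimal sketch of the atomic shattering
procedure Andrew described above (build the next-level table first, then
swap it in with a single store, then invalidate). It is not the actual
driver code. The descriptor layout, the PTE_* bits, the name
shatter_and_unmap() and the empty iommu_tlb_flush() stub are all
assumptions made for illustration and do not reflect the IPMMU format.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical descriptor layout, for illustration only (not the IPMMU format). */
#define L3_ENTRIES 512u                    /* 4K entries covered by one 2M block */
#define SZ_4K      0x1000ull
#define SZ_2M      (SZ_4K * L3_ENTRIES)
#define PTE_VALID  0x1ull                  /* assumed "present" bit */
#define PTE_TABLE  0x2ull                  /* assumed "points to next level" bit */

typedef _Atomic uint64_t pte_t;

/* Hypothetical stand-in for the driver's IOMMU TLB invalidation. */
static void iommu_tlb_flush(void) { }

/*
 * Replace the 2M block entry *l2e with a table of 4K entries that mirrors
 * it, except for entry 'hole', which is left invalid (unmapped).
 * Returns the new table, or NULL on allocation failure.
 */
static pte_t *shatter_and_unmap(pte_t *l2e, unsigned int hole)
{
    assert(hole < L3_ENTRIES);

    uint64_t block = atomic_load(l2e);
    uint64_t base  = block & ~(SZ_2M - 1);   /* output address of the block */

    pte_t *l3 = aligned_alloc(SZ_4K, L3_ENTRIES * sizeof(pte_t));
    if (!l3)
        return NULL;

    /* 1) Build the complete table while it is still invisible to the IOMMU. */
    for (unsigned int i = 0; i < L3_ENTRIES; i++)
        atomic_init(&l3[i], i == hole ? 0 : (base + i * SZ_4K) | PTE_VALID);

    /* 2) Publish it with one 64-bit store: a walk sees the old block or the new table. */
    atomic_store_explicit(l2e, (uint64_t)(uintptr_t)l3 | PTE_VALID | PTE_TABLE,
                          memory_order_release);

    /* 3) Drop any cached copy of the old 2M translation. */
    iommu_tlb_flush();
    return l3;
}
```

With this ordering the level 2 entry is never invalid, so DMA to the
other 511 pages of the 2M region should always find a valid translation.
The sketch does not copy permission or attribute bits from the block
entry, which a real implementation would have to do.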
>
> However, you are not doing break-before-make here. I would investigate
> if break-before-make is required by VMSA-IPMMU. If it is not required,
> why are you seeing DMA faults?
Unfortunately, I can't say anything about the break-before-make sequence
for the IPMMU at the moment. The TRM says nothing about it.
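For comparison, the break-before-make sequence Julien listed earlier in
the thread can be sketched as below. This is only an illustration under
assumptions: the names bbm_replace() and tlb_flush() are made up, and the
flush stub does no invalidation at all; it merely records what a
concurrent table walk would read at that point.

```c
#include <stdatomic.h>
#include <stdint.h>

typedef _Atomic uint64_t pte_t;

/* What a concurrent table walk would have read at flush time (illustration). */
static uint64_t entry_seen_at_flush;

/* Hypothetical stand-in for the real TLB invalidation; it only records the entry. */
static void tlb_flush(pte_t *l2e)
{
    entry_seen_at_flush = atomic_load(l2e);
}

/* Break-before-make: the entry is invalid between step 1 and step 3. */
static void bbm_replace(pte_t *l2e, uint64_t new_entry)
{
    atomic_store(l2e, 0);          /* 1) break: invalidate the old entry */
    tlb_flush(l2e);                /* 2) flush the TLBs                  */
    atomic_store(l2e, new_entry);  /* 3) make: install the new entry     */
}
```

Any device access that needs a table walk between steps 1 and 3 finds no
valid mapping, which is the window Julien described. A CPU access can be
trapped and restarted, but a device transaction often cannot.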
--
Regards,
Oleksandr Tyshchenko
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel