* Question on IOMMU identity mapping for intel-iommu
@ 2018-02-26 23:01 Alexander Duyck
From: Alexander Duyck @ 2018-02-26 23:01 UTC (permalink / raw)
To: David Woodhouse, Joerg Roedel; +Cc: open list:INTEL IOMMU (VT-d)
I am interested in adding a new memory mapping option that establishes
one identity-mapped region for all DMA_TO_DEVICE mappings and creates
a new dynamic mapping for any DMA_FROM_DEVICE and DMA_BIDIRECTIONAL
mappings. My thought is that it should allow for a compromise between
security and performance (in the case of networking), in that many of
the server NIC drivers these days are running with mostly pinned or
reused pages for Rx. By using an identity mapping for the Tx packets
we should be able to significantly cut down on the IOMMU overhead for
the device. The other advantage, if this works, is that we could
possibly use this to do something like dirty page tracking in the case
of an emulated version of the IOMMU.
I was originally thinking I could get away with just reusing the
identity mapping code but it looks like that would end up merging
everything into one domain if I am understanding correctly. Do I have
that right?
Would I be correct in assuming that I will need to have a separate
domain per device, each domain containing the one TO_DEVICE
identity-mapped region, and then whatever other mappings are needed to
handle the FROM and BIDIRECTIONAL mappings?
Thanks.
- Alex
* Re: Question on IOMMU identity mapping for intel-iommu
@ 2018-02-27 7:47 David Woodhouse
From: David Woodhouse @ 2018-02-27 7:47 UTC (permalink / raw)
To: Alexander Duyck, Joerg Roedel; +Cc: open list:INTEL IOMMU (VT-d)
On Mon, 2018-02-26 at 15:01 -0800, Alexander Duyck wrote:
> I am interested in adding a new memory mapping option that
> establishes
> one identity-mapped region for all DMA_TO_DEVICE mappings and creates
> a new dynamic mapping for any DMA_FROM_DEVICE and DMA_BIDIRECTIONAL
> mappings. My thought is it should allow for a compromise between
> security and performance (in the case of networking) in that many of
> the server NIC drivers these days are running with mostly pinned or
> reused pages for Rx. By using an identity mapping for the Tx packets
> we should be able to significantly cut down on the IOMMU overhead for
> the device. The other advantage if this works is that we could use
> this to possibly do something like dirty page tracking in the case of
> an emulated version of the IOMMU.
>
> I was originally thinking I could get away with just reusing the
> identity mapping code but it looks like that would end up merging
> everything into one domain if I am understanding correctly. Do I have
> that right?
>
> Would I be correct in assuming that I will need to have a separate
> domain per device, each domain containing the 1 TO_DEVICE identity
> mapped region, and then whatever other mappings are needed to handle
> the FROM and BIDIRECTIONAL mappings?
In the normal model where we explicitly map every RX and TX buffer, you
have a domain per device anyway; that's not a new requirement for your
model.
It sounds like an interesting idea; I agree that it's a reasonable
compromise between security and performance. The device can *read* all
of memory, but it can't write anywhere that isn't explicitly mapped.
In addition, we're mapping buffers for RX some time in advance of them
being needed (replenishing the tail of the RX ring), where the latency
of the map operation hopefully shouldn't be quite so much of an issue,
while packets for TX can go straight to the device with no latency.
Overall, I think it might work really well.
You don't want the existing identity mapping code; that will give you a
RW mapping which you don't want — you really do want read-only or this
whole exercise is pointless, right? And you're right, it would have put
the domain into the single identity domain.
You could probably start by mocking this up with the IOMMU API. Create
a domain with the 1:1 read-only mapping of all memory, add your device
to it, and then do your writeable mappings on top (at IOVAs higher than
the top of physical memory). That's probably a quick way to assess
performance and prove the concept (although you don't get deferred
unmap of RX packets that way, which might mess things up a bit).
When we expose this through the DMA API, I'd quite like this *not* to
be Intel-specific. It could reasonably live in a higher layer and be
usable with all kinds of IOMMU implementations.
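The mock-up suggested above could be sketched roughly as below. This is
an untested, illustrative fragment against the stock IOMMU API, not
working code: error handling is omitted, and a real version would walk
the memblock/resource list rather than blindly mapping a flat range.

```c
/* Untested sketch of the suggested mock-up: a per-device domain with a
 * read-only 1:1 map of all RAM; writable mappings go above top of RAM. */
#include <linux/iommu.h>
#include <linux/sizes.h>

static struct iommu_domain *ro_identity_domain(struct device *dev,
					       phys_addr_t top_of_ram)
{
	struct iommu_domain *domain = iommu_domain_alloc(dev->bus);
	phys_addr_t pa;

	/* 1:1 read-only mapping of all of memory; real code must skip
	 * holes and reserved regions rather than map a flat range. */
	for (pa = 0; pa < top_of_ram; pa += SZ_2M)
		iommu_map(domain, pa, pa, SZ_2M, IOMMU_READ);

	iommu_attach_device(domain, dev);
	return domain;
}

/* Writable (FROM_DEVICE/BIDIRECTIONAL) buffers would then get dynamic
 * mappings at IOVAs above top_of_ram, along the lines of:
 *
 *	iommu_map(domain, iova, page_to_phys(page), PAGE_SIZE,
 *		  IOMMU_READ | IOMMU_WRITE);
 */
```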
* RE: Question on IOMMU identity mapping for intel-iommu
@ 2018-03-01 3:15 Tian, Kevin
From: Tian, Kevin @ 2018-03-01 3:15 UTC (permalink / raw)
To: David Woodhouse, Alexander Duyck, Joerg Roedel
Cc: open list:INTEL IOMMU (VT-d)
> From: David Woodhouse
> Sent: Tuesday, February 27, 2018 3:48 PM
>
> On Mon, 2018-02-26 at 15:01 -0800, Alexander Duyck wrote:
> > I am interested in adding a new memory mapping option that
> > establishes
> > one identity-mapped region for all DMA_TO_DEVICE mappings and
> creates
> > a new dynamic mapping for any DMA_FROM_DEVICE and
> DMA_BIDIRECTIONAL
> > mappings. My thought is it should allow for a compromise between
> > security and performance (in the case of networking) in that many of
> > the server NIC drivers these days are running with mostly pinned or
> > reused pages for Rx. By using an identity mapping for the Tx packets
Rx packets?
> > we should be able to significantly cut down on the IOMMU overhead for
> > the device. The other advantage if this works is that we could use
> > this to possibly do something like dirty page tracking in the case of
> > an emulated version of the IOMMU.
I didn't quite get this part. Dirty-page tracking on an emulated IOMMU
doesn't rely on the changes that you are proposing - just monitor
invalidation requests and then count all changed pages as dirty.
Of course that would be overkill, since even RO mapping changes are
also counted. In that sense your proposal is still an optimization
rather than a prerequisite for dirty-page tracking, correct?
> >
> > I was originally thinking I could get away with just reusing the
> > identity mapping code but it looks like that would end up merging
> > everything into one domain if I am understanding correctly. Do I have
> > that right?
> >
> > Would I be correct in assuming that I will need to have a separate
> > domain per device, each domain containing the 1 TO_DEVICE identity
> > mapped region, and then whatever other mappings are needed to handle
> > the FROM and BIDIRECTIONAL mappings?
>
> In the normal model where we explicitly map every RX and TX buffer, you
> have a domain per device anyway; that's not a new requirement for your
> model.
>
> It sounds like an interesting idea; I agree that it's a reasonable
> compromise between security and performance. The device can *read* all
> of memory, but it can't write anywhere that isn't explicitly mapped.
>
> In addition, we're mapping buffers for RX some time in advance of them
> being needed (replenishing the tail of the RX ring), where the latency
> of the map operation hopefully shouldn't be quite so much of an issue,
> while packets for TX can go straight to the device with no latency.
> Overall, I think it might work really well.
>
> You don't want the existing identity mapping code; that will give you a
> RW mapping which you don't want — you really do want read-only or this
> whole exercise is pointless, right? And you're right, it would have put
> the domain into the single identity domain.
>
> You could probably start by mocking this up with the IOMMU API. Create
> a domain with the 1:1 read-only mapping of all memory, add your device
> to it, and then do your writeable mappings on top (at IOVAs higher than
> the top of physical memory). That's probably a quick way to assess
> performance and prove the concept (although you don't get deferred
> unmap of RX packets that way, which might mess things up a bit).
Curious question. Do you think the above could be a final solution or
just a proof-of-concept? If the latter, would the cleaner option be to
allow the device to bind to multiple domains, e.g. RO and RW, selected
based on the DMA mapping direction? (Not sure whether that would be an
intrusive change to the iommu driver, which today likely assumes a
single binding.)
>
> When we expose this through the DMA API, I'd quite like this *not* to
> be Intel-specific. It could reasonably live in a higher layer and be
> usable with all kinds of IOMMU implementations.
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
* Re: Question on IOMMU identity mapping for intel-iommu
@ 2018-03-01 5:05 Alexander Duyck
From: Alexander Duyck @ 2018-03-01 5:05 UTC (permalink / raw)
To: Tian, Kevin; +Cc: David Woodhouse, open list:INTEL IOMMU (VT-d)
On Wed, Feb 28, 2018 at 7:15 PM, Tian, Kevin <kevin.tian@intel.com> wrote:
>> From: David Woodhouse
>> Sent: Tuesday, February 27, 2018 3:48 PM
>>
>> On Mon, 2018-02-26 at 15:01 -0800, Alexander Duyck wrote:
>> > I am interested in adding a new memory mapping option that
>> > establishes
>> > one identity-mapped region for all DMA_TO_DEVICE mappings and
>> creates
>> > a new dynamic mapping for any DMA_FROM_DEVICE and
>> DMA_BIDIRECTIONAL
>> > mappings. My thought is it should allow for a compromise between
>> > security and performance (in the case of networking) in that many of
>> > the server NIC drivers these days are running with mostly pinned or
>> > reused pages for Rx. By using an identity mapping for the Tx packets
>
> Rx packets?
Yes, Rx packets. In the case of drivers like ixgbevf, the driver has a
page reuse mechanism in place that allows it to reuse the same page
multiple times for each Rx buffer. As a result we often end up
processing millions of packets without having to allocate and/or
map/unmap a page while doing so. As such we normally don't have to
invalidate or create any new mappings in the IOMMU for those packets,
which greatly reduces the overhead.
>> > we should be able to significantly cut down on the IOMMU overhead for
>> > the device. The other advantage if this works is that we could use
>> > this to possibly do something like dirty page tracking in the case of
>> > an emulated version of the IOMMU.
>
> I didn't quite get this part. dirty-page tracking on emulated IOMMU
> doesn't rely on the changes that you are proposing - just monitor
> invalidation requests and then count all changed pages as dirty.
> Of course it would be overkill since even RO mapping changes are
> also counted. In that manner your proposal is still an optimization
> instead of a must to enable dirty-page tracking, correct?
Yes, this is meant to be an optimization. One of the theoretical issues
with trying to use the IOMMU for dirty page tracking is that we are
currently having to update a paravirtual IOMMU to create and
invalidate mappings for every packet. With that being the case, the
argument could be made that we might as well just be using a
paravirtual network interface, since many of the benefits of SR-IOV
are likely lost.
I feel that the design of drivers like ixgbevf already solves half the
issue, since we almost never map/unmap Rx buffers. If we identity
mapped the Tx packets, then that is the other half of the issue solved.
Essentially what you end up with is that the SR-IOV device, in this
case a VF running under something like ixgbevf, has a means to
provide the hypervisor with a list of pages that the device is
writing to, so they could theoretically either be avoided during live
migration, or at least tracked so that when the device is halted or
invalidates the mapping we could mark the pages as dirty and migrate
them.
The added bonus with all this is that we could probably use it as a
half-way point between the current iommu=pt solution that is often
used to get performance and the standard setup, which gives you a bit
more in the way of security. At least with this we would probably see
just about the same throughput for most NICs, but we would have the
added security of a device that is unable to write outside of where we
have permitted it to.
>> >
>> > I was originally thinking I could get away with just reusing the
>> > identity mapping code but it looks like that would end up merging
>> > everything into one domain if I am understanding correctly. Do I have
>> > that right?
>> >
>> > Would I be correct in assuming that I will need to have a separate
>> > domain per device, each domain containing the 1 TO_DEVICE identity
>> > mapped region, and then whatever other mappings are needed to handle
>> > the FROM and BIDIRECTIONAL mappings?
>>
>> In the normal model where we explicitly map every RX and TX buffer, you
>> have a domain per device anyway; that's not a new requirement for your
>> model.
>>
>> It sounds like an interesting idea; I agree that it's a reasonable
>> compromise between security and performance. The device can *read* all
>> of memory, but it can't write anywhere that isn't explicitly mapped.
>>
>> In addition, we're mapping buffers for RX some time in advance of them
>> being needed (replenishing the tail of the RX ring), where the latency
>> of the map operation hopefully shouldn't be quite so much of an issue,
>> while packets for TX can go straight to the device with no latency.
>> Overall, I think it might work really well.
>>
>> You don't want the existing identity mapping code; that will give you a
>> RW mapping which you don't want — you really do want read-only or this
>> whole exercise is pointless, right? And you're right, it would have put
>> the domain into the single identity domain.
>>
>> You could probably start by mocking this up with the IOMMU API. Create
>> a domain with the 1:1 read-only mapping of all memory, add your device
>> to it, and then do your writeable mappings on top (at IOVAs higher than
>> the top of physical memory). That's probably a quick way to assess
>> performance and prove the concept (although you don't get deferred
>> unmap of RX packets that way, which might mess things up a bit).
>
> Curious question. Do you think the above could be a final solution or
> just a proof-of-concept? If the latter, would the cleaner option be to
> allow the device to bind to multiple domains, e.g. RO and RW, selected
> based on the DMA mapping direction? (Not sure whether that would be an
> intrusive change to the iommu driver, which today likely assumes a
> single binding.)
>
>>
>> When we expose this through the DMA API, I'd quite like this *not* to
>> be Intel-specific. It could reasonably live in a higher layer and be
>> usable with all kinds of IOMMU implementations.