From: dwmw2@infradead.org (David Woodhouse)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v4 2/7] iommu/core: split mapping to page sizes as supported by the hardware
Date: Thu, 10 Nov 2011 15:28:50 +0000 [thread overview]
Message-ID: <1320938930.22195.17.camel@i7.infradead.org> (raw)
In-Reply-To: <CANqQZNH1YRSgYRVxceicQ3szD=t7zeJjhJKEQ30PeKsxtj5V+A@mail.gmail.com>
On Thu, 2011-11-10 at 14:17 +0800, Kai Huang wrote:
> And another question: have we considered the IOTLB flush operation? I
> think we need to implement similar logic when flush the DVMA range.
> Intel VT-d's manual says software needs to specify the appropriate
> mask value to flush large pages, but it does not say we need to
> exactly match the page size as it is mapped. I guess it's not
> necessary for Intel IOMMU, but other vendor's IOMMU may have such
> limitation (or some other limitations). In my understanding current
> implementation does not provide page size information for particular
> DVMA ranges that has been mapped, and it's not flexible to implement
> IOTLB flush code (ex, we may need to walk through page table to find
> out actual page size). Maybe we can also add iommu_ops->flush_iotlb ?
Which brings me to another question I have been pondering... do we even
have a consensus on exactly *when* the IOTLB should be flushed?
Even just for the Intel IOMMU, we have three different behaviours:
- For DMA API users by default, we do 'batched unmap', so a mapping
may be active for a period of time after the driver has requested
that it be unmapped.
- ... unless booted with 'intel_iommu=strict', in which case we do the
unmap and IOTLB flush immediately before returning to the driver.
- But the IOMMU API for virtualisation is different. In fact that
doesn't seem to flush the IOTLB at all. Which is probably a bug.
What is acceptable, though? That batched unmap is quite important for
performance, because it means that we don't have to bash on the hardware
and wait for a flush to complete in the fast path of network driver RX,
for example.
If we move to a model where we have a separate ->flush_iotlb() call, we
need to be careful that we still allow necessary optimisations to
happen.
Since I have the right people on Cc and the iommu list is still down,
and it's vaguely tangentially related...
I'm looking at fixing performance issues in the Intel IOMMU code, with
its virtual address space allocation (the rbtree-based one in iova.c
that nobody else uses, which has a single spinlock that *all* CPUs bash
on when they need to allocate).
The plan is, vaguely, to allocate large chunks of space to each CPU, and
then for each CPU to allocate from its own region first, thus ensuring
that the common case doesn't bounce locks between CPUs. It'll be rare
for one CPU to have to touch a subregion 'belonging' to another CPU, so
lock contention should be drastically reduced.
Should I be planning to drop the DMA API support from intel-iommu.c
completely, and have the new allocator just call into the IOMMU API
functions instead? Other people have been looking at that, haven't they?
Is there any code? Or special platform-specific requirements for such a
generic wrapper that I might not have thought of? Details about when to
flush the IOTLB are one such thing which might need special handling for
certain hardware...
--
dwmw2
next prev parent reply other threads:[~2011-11-10 15:28 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-10-17 11:27 [PATCH v4 0/7] iommu: split mapping to page sizes as supported by the hardware Ohad Ben-Cohen
2011-10-17 11:27 ` [PATCH v4 1/7] iommu/core: stop converting bytes to page order back and forth Ohad Ben-Cohen
2011-10-17 11:27 ` [PATCH v4 2/7] iommu/core: split mapping to page sizes as supported by the hardware Ohad Ben-Cohen
2011-11-10 6:17 ` Kai Huang
2011-11-10 7:31 ` Ohad Ben-Cohen
2011-11-10 12:16 ` cody
2011-11-10 13:08 ` Joerg Roedel
2011-11-10 14:35 ` cody
2011-11-10 14:51 ` Joerg Roedel
2011-11-10 15:28 ` David Woodhouse [this message]
2011-11-10 17:09 ` Joerg Roedel
2011-11-10 19:28 ` David Woodhouse
2011-11-11 12:58 ` Joerg Roedel
2011-11-11 13:27 ` David Woodhouse
2011-11-11 14:18 ` Joerg Roedel
[not found] ` <20111111131728.GG13213@amd.com>
2011-11-24 12:52 ` Changing IOMMU-API for generic DMA-mapping " Marek Szyprowski
2011-11-24 15:27 ` 'Joerg Roedel'
2011-11-10 21:12 ` [PATCH v4 2/7] iommu/core: split mapping to page sizes as " Stepan Moskovchenko
2011-11-11 13:24 ` Joerg Roedel
2011-11-12 2:04 ` Stepan Moskovchenko
2011-11-13 1:43 ` KyongHo Cho
2011-10-17 11:27 ` [PATCH v4 3/7] iommu/omap: announce supported page sizes Ohad Ben-Cohen
2011-10-17 11:27 ` [PATCH v4 4/7] iommu/msm: " Ohad Ben-Cohen
2011-10-17 11:27 ` [PATCH v4 5/7] iommu/amd: " Ohad Ben-Cohen
2011-10-17 11:27 ` [PATCH v4 6/7] iommu/intel: " Ohad Ben-Cohen
2011-10-17 11:27 ` [PATCH v4 7/7] iommu/core: remove the temporary pgsize settings Ohad Ben-Cohen
2011-11-08 12:57 ` [PATCH v4 0/7] iommu: split mapping to page sizes as supported by the hardware Ohad Ben-Cohen
2011-11-08 14:01 ` Joerg Roedel
2011-11-08 14:03 ` Ohad Ben-Cohen
2011-11-08 16:23 ` Joerg Roedel
2011-11-08 16:43 ` Ohad Ben-Cohen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1320938930.22195.17.camel@i7.infradead.org \
--to=dwmw2@infradead.org \
--cc=linux-arm-kernel@lists.infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).