linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: David Woodhouse <dwmw2@infradead.org>
To: Joerg Roedel <joro@8bytes.org>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
	iommu@lists.linux-foundation.org,
	Sudeep Dutt <sudeep.dutt@intel.com>
Subject: Re: [RFC PATCH] iommu/vt-d: Add IOTLB flush support for kernel addresses
Date: Fri, 23 Oct 2015 11:33:33 +0100	[thread overview]
Message-ID: <1445596413.4113.175.camel@infradead.org> (raw)
In-Reply-To: <20151023102043.GZ27420@8bytes.org>

[-- Attachment #1: Type: text/plain, Size: 5388 bytes --]

On Fri, 2015-10-23 at 12:20 +0200, Joerg Roedel wrote:
> On Tue, Oct 20, 2015 at 05:17:04PM +0100, David Woodhouse wrote:
> > Can we assume that only one type of SVM-capable IOMMU will be present
> > in the system at a time? Perhaps we could just register a single
> > function (intel_iommu_flush_kernel_pasid in the VT-d case) to be used
> > as a notifier callback from tlb_flush_kernel_range()? Rather than the
> > overhead of a *list* of notifiers?
> 
> Yes, a single notifier is certainly preferable to a list. It is just
> too easy for others to attach to this list silently and adding more
> overhead to kernel TLB flushing.

Yeah. It's easy enough to add that in the x86 tlb_flush_kernel_range()
but I think we actually want it to be cross-platform.

Which means I'm pondering *renaming* tlb_flush_kernel_range() to
something like arch_tlb_flush_kernel_range() everywhere, then having a
tlb_flush_kernel_range() inline function which optionally calls
iommu_flush_kernel_range() first.

Or I could reduce the churn by adding explicit calls to
iommu_flush_kernel_range() at every location that calls
tlb_flush_kernel_range(), but that's going to lead to some callers
missing the IOMMU flush.

> > But... that's because the PASID-space is currently per-IOMMU. The plan
> > is to have a *single* PASID-space system-wide, And then I still want to
> > retain the property that there can be only *one* kernel PASID.
> 
> That makes a lot of sense. Then we can check in the call-back simply if
> this pasid has users and bail out early when not.
> 
> > I have forbidden the use of a given PASID to access *both* kernel and
> > user addresses. I'm hoping we can get away with putting that
> > restriction into the generic SVM APIs.
> 
> We have to, having kernel-pasids already nullifies all protection the
> IOMMU provides, giving kernel-access to a process-pasid is security wise
> equivalent to accessing /dev/mem.

Not entirely. The device still gets to specify whether it's doing
supervisor or user mode access, for each request it makes. It doesn't
open the door to users just using kernel addresses and getting away
with it!

Sure, we need to trust the *device* — but we need to trust it to
provide the correct PASID too. Which basically means in the VFIO case
where the user gets *full* control of the device, we have to ensure
that it gets its own PASID table with only the *one* PASID in it, and
*that* PASID can't have supervisor mode.

But in the general case, apart from the fact that it makes life hard
for us, there's no fundamental security reason why we couldn't set the
bit which allows supervisor mode access to happen in *any* PASID.

> > So yeah, perhaps we can set the notifier pointer to NULL when there's
> > no kernel PASID assigned, and only set it to point to
> > ${MY_IOMMU}_flush_kernel_pasid() if/when there *is* one?
> 
> That sounds like it needs some clever locking. Instead of checking the
> function pointer it is probably easier to put the check for pasid-users
> into an inline function and just do the real flush-call only when
> necessary.

The locking isn't hard; it's just RCU. Which in the VT-d case is just
the same as the handling of the kernel PASID structure which tells us
if we have any work to do anyway.

What I have in my working tree right now (but will probably throw away)
looks something like...

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 6df2029..c9e0b6c 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -7,6 +7,22 @@
 #include <asm/processor.h>
 #include <asm/special_insns.h>
 
+#ifdef CONFIG_IOMMU_TLB_FLUSH
+#include <linux/rcupdate.h>
+typedef void (*iommu_flush_ktlb_fn)(unsigned long start, unsigned long end);
+extern iommu_flush_ktlb_fn __rcu iommu_flush_ktlb;
+
+static inline void do_iommu_flush_ktlb(unsigned long start, unsigned long end)
+{
+	iommu_flush_ktlb_fn *fn;
+	rcu_read_lock();
+	fn = rcu_dereference(iommu_flush_ktlb);
+	if (fn)
+		(*fn)(start, end);
+	rcu_read_unlock();
+}
+#endif
+
 #ifdef CONFIG_PARAVIRT
 #include <asm/paravirt.h>
 #else
@@ -223,6 +239,9 @@ static inline void reset_lazy_tlbstate(void)
 static inline void flush_tlb_kernel_range(unsigned long start,
 					  unsigned long end)
 {
+#ifdef CONFIG_IOMMU_FLUSH_TLB
+	do_iommu_flush_ktlb(start, end);
+#endif
 	flush_tlb_all();
 }
 
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 8ddb5d0..4122b49 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -266,6 +267,9 @@ static void do_kernel_range_flush(void *info)
 
 void flush_tlb_kernel_range(unsigned long start, unsigned long end)
 {
+#ifdef CONFIG_IOMMU_FLUSH_TLB
+	do_iommu_flush_ktlb(start, end);
+#endif
 
 	/* Balance as user space task's flush, a bit conservative */
 	if (end == TLB_FLUSH_ALL ||


Maybe we could keep it simple and just declare that once the function pointer is set, it may never be cleared? But I think we really do want to avoid the out-of-line function call altogether in the case where kernel PASIDs are not being used. Or at *least* the case where SVM isn't being used at all.

-- 
-- 
David Woodhouse                            Open Source Technology Centre
David.Woodhouse@intel.com                              Intel Corporation


[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 5691 bytes --]

  reply	other threads:[~2015-10-23 10:33 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-10-20 15:52 [RFC PATCH] iommu/vt-d: Add IOTLB flush support for kernel addresses David Woodhouse
2015-10-20 16:03 ` Joerg Roedel
2015-10-20 16:17   ` David Woodhouse
2015-10-23 10:20     ` Joerg Roedel
2015-10-23 10:33       ` David Woodhouse [this message]
2015-10-23 11:03         ` Joerg Roedel
2015-10-23 11:37           ` David Woodhouse
2015-10-23 12:42             ` Joerg Roedel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1445596413.4113.175.camel@infradead.org \
    --to=dwmw2@infradead.org \
    --cc=iommu@lists.linux-foundation.org \
    --cc=joro@8bytes.org \
    --cc=linux-mm@kvack.org \
    --cc=sudeep.dutt@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).