From: James Gowans <jgowans@amazon.com>
To: <linux-kernel@vger.kernel.org>
Cc: Eric Biederman <ebiederm@xmission.com>,
<kexec@lists.infradead.org>, "Joerg Roedel" <joro@8bytes.org>,
Will Deacon <will@kernel.org>, <iommu@lists.linux.dev>,
Alexander Viro <viro@zeniv.linux.org.uk>,
"Christian Brauner" <brauner@kernel.org>,
<linux-fsdevel@vger.kernel.org>,
Paolo Bonzini <pbonzini@redhat.com>,
Sean Christopherson <seanjc@google.com>, <kvm@vger.kernel.org>,
Andrew Morton <akpm@linux-foundation.org>, <linux-mm@kvack.org>,
Alexander Graf <graf@amazon.com>,
David Woodhouse <dwmw@amazon.co.uk>,
"Jan H . Schoenherr" <jschoenh@amazon.de>,
Usama Arif <usama.arif@bytedance.com>,
Anthony Yznaga <anthony.yznaga@oracle.com>,
Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>,
<madvenka@linux.microsoft.com>, <steven.sistare@oracle.com>,
<yuleixzhang@tencent.com>
Subject: [RFC 08/18] iommu: Add allocator for pgtables from persistent region
Date: Mon, 5 Feb 2024 12:01:53 +0000
Message-ID: <20240205120203.60312-9-jgowans@amazon.com>
In-Reply-To: <20240205120203.60312-1-jgowans@amazon.com>
The specific IOMMU drivers will need the ability to allocate pages from a
pkernfs IOMMU pgtable file for their pgtables. The IOMMU drivers will
also need the ability to consistently get the same page for the root PGD
page, so add a specific function to get this PGD "root" page. This is
different from allocating regular pgtable pages because the exact same
page needs to be *restored* after kexec into the pgd pointer on the
IOMMU domain struct.
To support this sort of allocation the pkernfs region is treated as an
array of 512 4 KiB pages, the first of which is an allocation bitmap.
---
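Editorial note, not part of the patch: the region layout described above
can be sketched in userspace C as follows. This is a minimal illustration
under stated assumptions, not the kernel implementation. Every sketch_*
name is hypothetical, the byte-wise bit numbering is chosen only to keep
the example portable, and reserving bit 0 for the bitmap page is my
reading of the comment in pgtable_alloc.c. The sketch returns -1 when the
region is full, a case the posted allocator does not report.

```c
/* Userspace sketch only; all sketch_* names are hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)	/* 4 KiB */
#define SKETCH_NR_PAGES   512				/* 2 MiB region */

/* Page 0 holds the bitmap: bit N set means page N is in use. */
static int sketch_test_bit(const uint8_t *bitmap, unsigned int nr)
{
	return (bitmap[nr / 8] >> (nr % 8)) & 1;
}

static void sketch_set_bit(uint8_t *bitmap, unsigned int nr)
{
	bitmap[nr / 8] |= (uint8_t)(1u << (nr % 8));
}

/* Zero the bitmap page and mark page 0 (the bitmap itself) as in use. */
static void sketch_region_init(uint8_t *region)
{
	memset(region, 0, SKETCH_PAGE_SIZE);
	sketch_set_bit(region, 0);
}

/* Claim the lowest free page; return its byte offset, or -1 when full. */
static long sketch_alloc_page(uint8_t *region)
{
	unsigned int idx;

	assert(sketch_test_bit(region, 0));	/* bitmap page must be reserved */
	for (idx = 0; idx < SKETCH_NR_PAGES; idx++) {
		if (!sketch_test_bit(region, idx)) {
			sketch_set_bit(region, idx);
			return (long)idx << SKETCH_PAGE_SHIFT;
		}
	}
	return -1;
}
```

On a freshly initialised region the first call returns offset 4096,
i.e. page 1. In the patch that page is the root page, so a real caller
would be expected to claim the root page before allocating regular
pgtable pages.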
drivers/iommu/Makefile | 1 +
drivers/iommu/pgtable_alloc.c | 36 +++++++++++++++++++++++++++++++++++
drivers/iommu/pgtable_alloc.h | 9 +++++++++
3 files changed, 46 insertions(+)
create mode 100644 drivers/iommu/pgtable_alloc.c
create mode 100644 drivers/iommu/pgtable_alloc.h
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 769e43d780ce..cadebabe9581 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -1,5 +1,6 @@
# SPDX-License-Identifier: GPL-2.0
obj-y += amd/ intel/ arm/ iommufd/
+obj-y += pgtable_alloc.o
obj-$(CONFIG_IOMMU_API) += iommu.o
obj-$(CONFIG_IOMMU_API) += iommu-traces.o
obj-$(CONFIG_IOMMU_API) += iommu-sysfs.o
diff --git a/drivers/iommu/pgtable_alloc.c b/drivers/iommu/pgtable_alloc.c
new file mode 100644
index 000000000000..f0c2e12f8a8b
--- /dev/null
+++ b/drivers/iommu/pgtable_alloc.c
@@ -0,0 +1,36 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include "pgtable_alloc.h"
+#include <linux/mm.h>
+
+/*
+ * The first 4 KiB is the bitmap - set the first bit in the bitmap.
+ * Scan the bitmap to find the next free bit - that is the next free page.
+ */
+
+void iommu_alloc_page_from_region(struct pkernfs_region *region, void **vaddr, unsigned long *paddr)
+{
+ int page_idx;
+
+ page_idx = bitmap_find_free_region(region->vaddr, 512, 0);
+ *vaddr = region->vaddr + (page_idx << PAGE_SHIFT);
+ if (paddr)
+ *paddr = region->paddr + (page_idx << PAGE_SHIFT);
+}
+
+
+void *pgtable_get_root_page(struct pkernfs_region *region, bool liveupdate)
+{
+ /*
+ * The page immediately after the bitmap is the root page.
+ * It would be wrong for the page to be allocated if we're
+ * NOT doing a liveupdate, or for a liveupdate to happen
+ * with no allocated page. Detect this mismatch.
+ */
+ if (test_bit(1, region->vaddr) ^ liveupdate) {
+ pr_err("%sdoing a liveupdate but root pg bit incorrect\n",
+ liveupdate ? "" : "NOT ");
+ }
+ set_bit(1, region->vaddr);
+ return region->vaddr + PAGE_SIZE;
+}
diff --git a/drivers/iommu/pgtable_alloc.h b/drivers/iommu/pgtable_alloc.h
new file mode 100644
index 000000000000..c1666a7be3d3
--- /dev/null
+++ b/drivers/iommu/pgtable_alloc.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#include <linux/types.h>
+#include <linux/pkernfs.h>
+
+void iommu_alloc_page_from_region(struct pkernfs_region *region,
+ void **vaddr, unsigned long *paddr);
+
+void *pgtable_get_root_page(struct pkernfs_region *region, bool liveupdate);
--
2.40.1
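
Editorial note, not part of the patch: the root page consistency check in
pgtable_get_root_page() boils down to "the root page bit must already be
set if and only if this boot is a liveupdate". A minimal userspace sketch
of that rule follows. The sketch_* names and the byte-wise bitmap are
assumptions made for illustration. Unlike the posted function, the sketch
returns the result of the check instead of logging it.

```c
/* Userspace sketch only; all sketch_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_ROOT_BIT 1u	/* page 1, right after the bitmap page */

/* The state is consistent when the bit is set exactly on liveupdate. */
static bool sketch_root_state_ok(bool root_bit_set, bool liveupdate)
{
	return root_bit_set == liveupdate;
}

/* Claim the root page; report whether the prior state was consistent. */
static bool sketch_get_root(uint8_t *bitmap, bool liveupdate)
{
	bool was_set = (bitmap[0] >> SKETCH_ROOT_BIT) & 1;

	bitmap[0] |= (uint8_t)(1u << SKETCH_ROOT_BIT);
	return sketch_root_state_ok(was_set, liveupdate);
}

/* Usage example covering both the consistent and the mismatch cases. */
static void sketch_root_selftest(void)
{
	uint8_t bitmap[64] = { 0x01 };	/* bit 0: bitmap page in use */

	assert(sketch_get_root(bitmap, false));		/* cold boot, bit clear */
	assert(sketch_get_root(bitmap, true));		/* liveupdate, bit set */
	assert(!sketch_get_root(bitmap, false));	/* cold boot, bit set */
}
```

The fourth combination, a liveupdate with the bit still clear, is also a
mismatch; in both mismatch cases the posted code logs an error and still
returns the page.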