From: Jason Gunthorpe <jgg@nvidia.com>
To: "Tian, Kevin" <kevin.tian@intel.com>
Cc: "bpf@vger.kernel.org" <bpf@vger.kernel.org>,
Jonathan Corbet <corbet@lwn.net>,
David Woodhouse <dwmw2@infradead.org>,
"iommu@lists.linux.dev" <iommu@lists.linux.dev>,
Joerg Roedel <joro@8bytes.org>,
"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>,
"linux-kselftest@vger.kernel.org"
<linux-kselftest@vger.kernel.org>,
"llvm@lists.linux.dev" <llvm@lists.linux.dev>,
Nathan Chancellor <nathan@kernel.org>,
Nick Desaulniers <ndesaulniers@google.com>,
Miguel Ojeda <ojeda@kernel.org>,
Robin Murphy <robin.murphy@arm.com>,
Shuah Khan <shuah@kernel.org>,
Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>,
Tom Rix <trix@redhat.com>, Will Deacon <will@kernel.org>,
Alex Williamson <alex.williamson@redhat.com>,
Lu Baolu <baolu.lu@linux.intel.com>,
Chaitanya Kulkarni <chaitanyak@nvidia.com>,
Cornelia Huck <cohuck@redhat.com>,
Daniel Jordan <daniel.m.jordan@oracle.com>,
David Gibson <david@gibson.dropbear.id.au>,
Eric Auger <eric.auger@redhat.com>,
Eric Farman <farman@linux.ibm.com>,
Jason Wang <jasowang@redhat.com>,
Jean-Philippe Brucker <jean-philippe@linaro.org>,
"Martins, Joao" <joao.m.martins@oracle.com>,
"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
Matthew Rosato <mjrosato@linux.ibm.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
Nicolin Chen <nicolinc@nvidia.com>,
Niklas Schnelle <schnelle@linux.ibm.com>,
Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>,
"Liu, Yi L" <yi.l.liu@intel.com>,
Keqian Zhu <zhukeqian1@huawei.com>
Subject: Re: [PATCH v4 08/17] iommufd: Algorithms for PFN storage
Date: Mon, 14 Nov 2022 14:02:38 -0400 [thread overview]
Message-ID: <Y3KCvnLOZpGXAGhU@nvidia.com> (raw)
In-Reply-To: <BN9PR11MB52762E5ACAAE7D7B398730D78C059@BN9PR11MB5276.namprd11.prod.outlook.com>
On Mon, Nov 14, 2022 at 05:50:50AM +0000, Tian, Kevin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Tuesday, November 8, 2022 8:49 AM
> >
> > @@ -171,7 +183,7 @@ static struct iopt_area *iopt_pages_find_domain_area(struct iopt_pages *pages,
> > */
> > struct pfn_batch {
> > unsigned long *pfns;
> > - u16 *npfns;
> > + u32 *npfns;
>
> why not making it u32 and removing later FIXME directly in patch7?
Rebase error, I fixed it
> > static bool batch_add_pfn(struct pfn_batch *batch, unsigned long pfn)
> > {
> > - /* FIXME: U16 is too small */
> > + const unsigned int MAX_NPFNS = type_max(typeof(*batch->npfns));
>
> use lowercase i.e. max_npfns.
It is customary to use caps for constants, e.g. enum values and the like.
> > +static void __iopt_area_unfill_domain(struct iopt_area *area,
> > + struct iopt_pages *pages,
> > + struct iommu_domain *domain,
> > + unsigned long last_index)
> > +{
> > + struct interval_tree_double_span_iter span;
> > + unsigned long start_index = iopt_area_index(area);
> > + unsigned long unmapped_end_index = start_index;
> > + u64 backup[BATCH_BACKUP_SIZE];
> > + struct pfn_batch batch;
> > +
> > + lockdep_assert_held(&pages->mutex);
> > +
> > + batch_init_backup(&batch, last_index + 1, backup, sizeof(backup));
> > + interval_tree_for_each_double_span(&span, &pages->domains_itree,
> > + &pages->access_itree, start_index,
> > + last_index) {
> > + if (span.is_used) {
> > + batch_skip_carry(&batch,
> > + span.last_used - span.start_used + 1);
> > + continue;
> > + }
> > + iopt_area_unpin_domain(&batch, area, pages, domain,
> > + span.start_hole, span.last_hole,
> > + &unmapped_end_index, last_index);
> > + }
> > + if (unmapped_end_index != last_index + 1)
> > + iopt_area_unmap_domain_range(area, domain, unmapped_end_index,
> > + last_index);
>
> a comment marking that it's for the last trailing used span of which
> the pages are not contiguous to previous span.
/*
 * If the range ends in an access then we do the residual unmap without
 * any unpins.
 */
> btw it is not easy to understand how this func plus unpin_domain()
> actually work. more comments are welcomed to help readability.
/*
* For security we must not unpin something that is still DMA mapped,
* so this must unmap any IOVA before we go ahead and unpin the pages.
* This creates a complexity where we need to skip over unpinning pages
* held in the xarray, but continue to unmap from the domain.
*
* The domain unmap cannot stop in the middle of a contiguous range of
* PFNs. To solve this problem the unpinning step will read ahead to the
* end of any contiguous span, unmap that whole span, and then only
* unpin the leading part that does not have any accesses. The residual
* PFNs that were unmapped but not unpinned are called a "carry" in the
* batch as they are moved to the front of the PFN list and continue on
* to the next iteration(s).
*/
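To make the "carry" a bit more concrete, here is a toy model of the batch
bookkeeping (illustrative only - this is not the pages.c code, and the real
struct pfn_batch also tracks a per-run npfns array):

	#include <string.h>

	/*
	 * After unpinning the leading 'unpinned' PFNs of the batch, slide the
	 * still-pinned tail (already unmapped from the domain) to the front of
	 * the PFN list so the next iteration can skip over it.
	 */
	static void toy_carry(unsigned long *pfns, unsigned int *npfns,
			      unsigned int unpinned)
	{
		unsigned int carry = *npfns - unpinned;

		memmove(pfns, pfns + unpinned, carry * sizeof(*pfns));
		*npfns = carry;
	}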
> > +/*
> > + * This can do everything and is fully coherent with what a iommu_domain would
> > + * see.
> > + */
> > +static int iopt_pages_rw_slow(struct iopt_pages *pages,
>
> Can you elaborate what guarantees coherency in this function and how it
> becomes different in other rw variations?
/*
* This uses the pfn_reader instead of taking a shortcut by using the mm. It can
* do every scenario and is fully consistent with what an iommu_domain would
* see.
*/
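Shape-wise the slow path is just a pfn_reader walk, roughly like this (sketch
from memory, the helper signatures are approximations of this series, not
copy-paste from pages.c):

	struct pfn_reader pfns;
	int rc;

	rc = pfn_reader_first(&pfns, pages, start_index, last_index);
	if (rc)
		return rc;
	while (!pfn_reader_done(&pfns)) {
		/* copy data to/from the PFNs currently held in the batch */
		rc = pfn_reader_next(&pfns);
		if (rc)
			break;
	}
	pfn_reader_destroy(&pfns);
	return rc;

Since the PFNs come from the same storage the domains and xarray are filled
from, the copy is consistent with what an attached iommu_domain would observe.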
> > + * iopt_pages_remove_access() - Release an in-kernel access for PFNs
> > + * @area: The source of PFNs
> > + * @start_index: First page index
> > + * @last_index: Inclusive last page index
> > + *
> > + * Undo iopt_pages_add_access() and unpin the pages if necessary. The caller
> > + * must stop using the PFNs before calling this.
> > + */
> > +void iopt_pages_remove_access(struct iopt_area *area, unsigned long start_index,
> > + unsigned long last_index)
>
> this is called iopt_pages_xxx() but the first parameter is iopt_area.
>
> also it's not balanced with iopt_pages_add_access() which requires the
> caller to hold pages->mutex and populate area->num_accesses.
OK, see below
Thanks,
Jason
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -565,7 +565,7 @@ void iommufd_access_unpin_pages(struct iommufd_access *access,
down_read(&iopt->iova_rwsem);
iopt_for_each_contig_area(&iter, area, iopt, iova, last_iova)
- iopt_pages_remove_access(
+ iopt_area_remove_access(
area, iopt_area_iova_to_index(area, iter.cur_iova),
iopt_area_iova_to_index(
area,
@@ -650,15 +650,10 @@ int iommufd_access_pin_pages(struct iommufd_access *access, unsigned long iova,
goto err_remove;
}
- mutex_lock(&area->pages->mutex);
- rc = iopt_pages_add_access(area->pages, index, last_index,
- out_pages, flags);
- if (rc) {
- mutex_unlock(&area->pages->mutex);
+ rc = iopt_pages_add_access(area, index, last_index, out_pages,
+ flags);
+ if (rc)
goto err_remove;
- }
- area->num_accesses++;
- mutex_unlock(&area->pages->mutex);
out_pages += last_index - index + 1;
}
if (!iopt_area_contig_done(&iter)) {
@@ -673,7 +668,7 @@ int iommufd_access_pin_pages(struct iommufd_access *access, unsigned long iova,
if (iova < iter.cur_iova) {
last_iova = iter.cur_iova - 1;
iopt_for_each_contig_area(&iter, area, iopt, iova, last_iova)
- iopt_pages_remove_access(
+ iopt_area_remove_access(
area,
iopt_area_iova_to_index(area, iter.cur_iova),
iopt_area_iova_to_index(
diff --git a/drivers/iommu/iommufd/io_pagetable.h b/drivers/iommu/iommufd/io_pagetable.h
index 3b85fa344f6be3..68bc3957534dd7 100644
--- a/drivers/iommu/iommufd/io_pagetable.h
+++ b/drivers/iommu/iommufd/io_pagetable.h
@@ -221,10 +221,10 @@ int iopt_pages_fill_xarray(struct iopt_pages *pages, unsigned long start,
void iopt_pages_unfill_xarray(struct iopt_pages *pages, unsigned long start,
unsigned long last);
-int iopt_pages_add_access(struct iopt_pages *pages, unsigned long start,
- unsigned long last, struct page **out_pages,
- unsigned int flags);
-void iopt_pages_remove_access(struct iopt_area *area, unsigned long start,
+int iopt_area_add_access(struct iopt_area *area, unsigned long start,
+ unsigned long last, struct page **out_pages,
+ unsigned int flags);
+void iopt_area_remove_access(struct iopt_area *area, unsigned long start,
unsigned long last);
int iopt_pages_rw_access(struct iopt_pages *pages, unsigned long start_byte,
void *data, unsigned long length, unsigned int flags);
diff --git a/drivers/iommu/iommufd/pages.c b/drivers/iommu/iommufd/pages.c
index 01d2447eac4ede..e5f267d9e2b491 100644
--- a/drivers/iommu/iommufd/pages.c
+++ b/drivers/iommu/iommufd/pages.c
@@ -1807,8 +1826,8 @@ iopt_pages_get_exact_access(struct iopt_pages *pages, unsigned long index,
}
/**
- * iopt_pages_add_access() - Record an in-knerel access for PFNs
- * @pages: The source of PFNs
+ * iopt_area_add_access() - Record an in-kernel access for PFNs
+ * @area: The source of PFNs
* @start_index: First page index
* @last_index: Inclusive last page index
* @out_pages: Output list of struct page's representing the PFNs
@@ -1819,40 +1838,49 @@ iopt_pages_get_exact_access(struct iopt_pages *pages, unsigned long index,
*
* This should be undone through a matching call to iopt_pages_remove_access()
*/
-int iopt_pages_add_access(struct iopt_pages *pages, unsigned long start_index,
+int iopt_area_add_access(struct iopt_area *area, unsigned long start_index,
unsigned long last_index, struct page **out_pages,
unsigned int flags)
{
+ struct iopt_pages *pages = area->pages;
struct iopt_pages_access *access;
int rc;
if ((flags & IOMMUFD_ACCESS_RW_WRITE) && !pages->writable)
return -EPERM;
+ mutex_lock(&pages->mutex);
access = iopt_pages_get_exact_access(pages, start_index, last_index);
if (access) {
refcount_inc(&access->refcount);
iopt_pages_fill_from_xarray(pages, start_index, last_index,
out_pages);
+ mutex_unlock(&pages->mutex);
return 0;
}
access = kzalloc(sizeof(*access), GFP_KERNEL_ACCOUNT);
- if (!access)
- return -ENOMEM;
+ if (!access) {
+ rc = -ENOMEM;
+ goto err_unlock;
+ }
rc = iopt_pages_fill_xarray(pages, start_index, last_index, out_pages);
if (rc)
- goto out_free;
+ goto err_free;
access->node.start = start_index;
access->node.last = last_index;
refcount_set(&access->refcount, 1);
+ area->num_accesses++;
interval_tree_insert(&access->node, &pages->access_itree);
+ mutex_unlock(&pages->mutex);
return 0;
-out_free:
+err_free:
kfree(access);
+err_unlock:
+ mutex_unlock(&pages->mutex);
return rc;
}
@@ -1865,11 +1893,11 @@ int iopt_pages_add_access(struct iopt_pages *pages, unsigned long start_index,
* Undo iopt_pages_add_access() and unpin the pages if necessary. The caller
* must stop using the PFNs before calling this.
*/
-void iopt_pages_remove_access(struct iopt_area *area, unsigned long start_index,
- unsigned long last_index)
+void iopt_area_remove_access(struct iopt_area *area, unsigned long start_index,
+ unsigned long last_index)
{
- struct iopt_pages_access *access;
struct iopt_pages *pages = area->pages;
+ struct iopt_pages_access *access;
mutex_lock(&pages->mutex);
access = iopt_pages_get_exact_access(pages, start_index, last_index);