From: Jason Gunthorpe <jgg@nvidia.com>
To: Robin Murphy <robin.murphy@arm.com>
Cc: Lu Baolu <baolu.lu@linux.intel.com>,
Joerg Roedel <joro@8bytes.org>, Kevin Tian <kevin.tian@intel.com>,
Matthew Rosato <mjrosato@linux.ibm.com>,
Alex Williamson <alex.williamson@redhat.com>,
ath10k@lists.infradead.org, ath11k@lists.infradead.org,
Christian Borntraeger <borntraeger@linux.ibm.com>,
dri-devel@lists.freedesktop.org, iommu@lists.linux.dev,
kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
linux-arm-msm@vger.kernel.org, linux-media@vger.kernel.org,
linux-rdma@vger.kernel.org, linux-remoteproc@vger.kernel.org,
linux-s390@vger.kernel.org,
linux-stm32@st-md-mailman.stormreply.com,
linux-tegra@vger.kernel.org, linux-wireless@vger.kernel.org,
netdev@vger.kernel.org, nouveau@lists.freedesktop.org,
Niklas Schnelle <schnelle@linux.ibm.com>,
virtualization@lists.linux-foundation.org
Subject: Re: [PATCH v2 04/10] iommu/dma: Use the gfp parameter in __iommu_dma_alloc_noncontiguous()
Date: Fri, 20 Jan 2023 16:38:18 -0400
Message-ID: <Y8r7ujD8BVgWiIH/@nvidia.com>
In-Reply-To: <f24fcba7-2fcb-ed43-05da-60763dbb07bf@arm.com>
On Fri, Jan 20, 2023 at 07:28:19PM +0000, Robin Murphy wrote:
> On 2023-01-18 18:00, Jason Gunthorpe wrote:
> > Change the sg_alloc_table_from_pages() allocation that was hardwired to
> > GFP_KERNEL to use the gfp parameter like the other allocations in this
> > function.
> >
> > Auditing says this is never called from an atomic context, so it is safe
> > as is, but reads wrong.
>
> I think the point may have been that the sgtable metadata is a
> logically-distinct allocation from the buffer pages themselves. Much like
> the allocation of the pages array itself further down in
> __iommu_dma_alloc_pages().
That makes sense, and it is a good reason to mask off the allocation
policy flags from the gfp.

On the other hand it also makes sense to continue to pass in things
like NOWAIT|NOWARN to all the allocations, even to the iommu driver.

So I'd prefer to change this to mask those flags off and make all the
following calls consistently use the input gfp.
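
As a rough illustration of the masking idea only (this is not the actual
patch): the sketch below strips the flags that describe where the buffer
pages should come from, and keeps behaviour modifiers such as
NOWAIT|NOWARN for the supporting metadata allocations. The DEMO_* names,
the demo_gfp_t type, and the bit values are all made up for this
userspace example and are not the kernel's GFP definitions.

```c
/*
 * Hypothetical stand-ins for gfp flags. The names and bit values are
 * invented for this sketch; they do NOT match the kernel's definitions.
 */
typedef unsigned int demo_gfp_t;

#define DEMO_GFP_DMA      0x01u  /* placement: low DMA zone */
#define DEMO_GFP_HIGHMEM  0x02u  /* placement: highmem allowed */
#define DEMO_GFP_DMA32    0x04u  /* placement: 32-bit addressable zone */
#define DEMO_GFP_NOWAIT   0x10u  /* behaviour: do not sleep */
#define DEMO_GFP_NOWARN   0x20u  /* behaviour: no warning on failure */

/* Flags that only make sense for the buffer pages themselves. */
#define DEMO_GFP_PLACEMENT_MASK \
	(DEMO_GFP_DMA | DEMO_GFP_HIGHMEM | DEMO_GFP_DMA32)

/*
 * Derive the flags to use for supporting allocations (for example the
 * sgtable metadata) from the caller's flags: drop the placement bits,
 * keep everything else so NOWAIT/NOWARN still reach every allocation.
 */
static inline demo_gfp_t demo_metadata_gfp(demo_gfp_t gfp)
{
	return gfp & ~DEMO_GFP_PLACEMENT_MASK;
}
```

With this shape the buffer page allocation would keep using the caller's
unmodified flags, while every later call in the function would take the
masked value.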
> I'd say the more confusing thing about this particular context is why we're
> using iommu_map_sg_atomic() further down - that seems to have been an
> oversight in 781ca2de89ba, since this particular path has never supported
> being called in atomic context.
Huh. I had fixed that in v1; this patch was supposed to have that
hunk. That was the main point of making this patch, actually.
> Overall I'm starting to wonder if it might not be better to stick a "use
> GFP_KERNEL_ACCOUNT if you allocate" flag in the domain for any level of the
> API internals to pick up as appropriate, rather than propagate per-call gfp
> flags everywhere.
We might get to something like that, but it requires more parts that
are not ready yet. Most likely this would take the form of some kind
of 'this is an iommufd created domain' indication. This happens
naturally as part of the nesting patches.
Right now I want to get people to start testing with this because the
charge from the IOPTEs is far and away the largest memory draw. Parts
like fixing the iommu drivers to actually use gfp are necessary to
make it work.
If we flip the two places using KERNEL_ACCOUNT to something else later
it doesn't really matter. I think the removal of the two _atomic
wrappers is still appropriate stand-alone.
> As it stands we're still missing potential pagetable and other
> domain-related allocations by drivers in .attach_dev and even (in
Yes, I plan to get to those when we add an alloc_domain_iommufd() or
whatever op. The driver will know the calling context and can set the
gfp flags for any allocations under alloc_domain at that time.
Then we can go and figure out if there are other allocations and if
all or only some drivers need a flag - eg at attach time. Though this
is less worrying because you can only scale attach up to num_pasids *
num open vfios.
iommufd will let userspace create and populate an unlimited number of
iommu_domains, so everything linked to an unattached iommu_domain
should be charged.
> probably-shouldn't-really-happen cases) .unmap_pages...
Gah, unmap_pages isn't allowed to fail. There is no way to recover from
this. iommufd will spew a warn and then have a small race where
userspace can UAF kernel memory.
I'd call such a driver implementation broken. Why would you need to do
this?? :(
Thanks,
Jason
Thread overview: 28+ messages
2023-01-18 18:00 [PATCH v2 00/10] Let iommufd charge IOPTE allocations to the memory cgroup Jason Gunthorpe
2023-01-18 18:00 ` [PATCH v2 01/10] iommu: Add a gfp parameter to iommu_map() Jason Gunthorpe
2023-01-19 3:45 ` Tian, Kevin
2023-01-18 18:00 ` [PATCH v2 02/10] iommu: Remove iommu_map_atomic() Jason Gunthorpe
2023-01-19 3:45 ` Tian, Kevin
2023-01-18 18:00 ` [PATCH v2 03/10] iommu: Add a gfp parameter to iommu_map_sg() Jason Gunthorpe
2023-01-19 3:46 ` Tian, Kevin
2023-01-18 18:00 ` [PATCH v2 04/10] iommu/dma: Use the gfp parameter in __iommu_dma_alloc_noncontiguous() Jason Gunthorpe
2023-01-19 3:47 ` Tian, Kevin
2023-01-20 19:28 ` Robin Murphy
2023-01-20 20:38 ` Jason Gunthorpe [this message]
2023-01-23 13:58 ` Jason Gunthorpe
2023-01-18 18:00 ` [PATCH v2 05/10] iommufd: Use GFP_KERNEL_ACCOUNT for iommu_map() Jason Gunthorpe
2023-01-19 3:48 ` Tian, Kevin
2023-01-18 18:00 ` [PATCH v2 06/10] iommu/intel: Add a gfp parameter to alloc_pgtable_page() Jason Gunthorpe
2023-01-19 3:49 ` Tian, Kevin
2023-01-19 11:47 ` Baolu Lu
2023-01-18 18:00 ` [PATCH v2 07/10] iommu/intel: Support the gfp argument to the map_pages op Jason Gunthorpe
2023-01-19 3:50 ` Tian, Kevin
2023-01-19 11:57 ` Baolu Lu
2023-01-19 11:59 ` Baolu Lu
2023-01-18 18:00 ` [PATCH v2 08/10] iommu/intel: Use GFP_KERNEL in sleepable contexts Jason Gunthorpe
2023-01-19 3:52 ` Tian, Kevin
2023-01-19 12:00 ` Baolu Lu
2023-01-18 18:00 ` [PATCH v2 09/10] iommu/s390: Push the gfp parameter to the kmem_cache_alloc()'s Jason Gunthorpe
2023-01-19 21:56 ` Matthew Rosato
2023-01-18 18:00 ` [PATCH v2 10/10] iommu/s390: Use GFP_KERNEL in sleepable contexts Jason Gunthorpe
2023-01-19 21:56 ` Matthew Rosato