From: Jason Gunthorpe <jgg@nvidia.com>
To: Steven Sistare <steven.sistare@oracle.com>
Cc: Alex Williamson <alex.williamson@redhat.com>,
kvm@vger.kernel.org, Cornelia Huck <cohuck@redhat.com>,
Kevin Tian <kevin.tian@intel.com>
Subject: Re: [PATCH V7 2/7] vfio/type1: prevent underflow of locked_vm via exec()
Date: Mon, 9 Jan 2023 09:52:30 -0400 [thread overview]
Message-ID: <Y7wcHg0d0ebC6h+3@nvidia.com> (raw)
In-Reply-To: <3ee416e7-f997-60b0-e35f-b610e974bb97@oracle.com>
On Fri, Jan 06, 2023 at 10:14:57AM -0500, Steven Sistare wrote:
> On 1/3/2023 2:20 PM, Jason Gunthorpe wrote:
> > On Tue, Jan 03, 2023 at 01:12:53PM -0500, Steven Sistare wrote:
> >> On 1/3/2023 10:20 AM, Jason Gunthorpe wrote:
> >>> On Tue, Dec 20, 2022 at 12:39:20PM -0800, Steve Sistare wrote:
> >>>> When a vfio container is preserved across exec, the task does not change,
> >>>> but it gets a new mm with locked_vm=0, and loses the count from existing
> >>>> dma mappings. If the user later unmaps a dma mapping, locked_vm underflows
> >>>> to a large unsigned value, and a subsequent dma map request fails with
> >>>> ENOMEM in __account_locked_vm.
> >>>>
> >>>> To avoid underflow, grab and save the mm at the time a dma is mapped.
> >>>> Use that mm when adjusting locked_vm, rather than re-acquiring the saved
> >>>> task's mm, which may have changed. If the saved mm is dead, do nothing.
> >>>>
> >>>> locked_vm is incremented for existing mappings in a subsequent patch.
> >>>>
> >>>> Fixes: 73fa0d10d077 ("vfio: Type1 IOMMU implementation")
> >>>> Cc: stable@vger.kernel.org
> >>>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> >>>> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> >>>> ---
> >>>> drivers/vfio/vfio_iommu_type1.c | 27 +++++++++++----------------
> >>>> 1 file changed, 11 insertions(+), 16 deletions(-)
> >>>>
> >>>> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> >>>> index 144f5bb..71f980b 100644
> >>>> --- a/drivers/vfio/vfio_iommu_type1.c
> >>>> +++ b/drivers/vfio/vfio_iommu_type1.c
> >>>> @@ -100,6 +100,7 @@ struct vfio_dma {
> >>>>  	struct task_struct *task;
> >>>>  	struct rb_root pfn_list;	/* Ex-user pinned pfn list */
> >>>>  	unsigned long *bitmap;
> >>>> +	struct mm_struct *mm;
> >>>>  };
> >>>>
> >>>>  struct vfio_batch {
> >>>> @@ -420,8 +421,8 @@ static int vfio_lock_acct(struct vfio_dma *dma, long npage, bool async)
> >>>>  	if (!npage)
> >>>>  		return 0;
> >>>>
> >>>> -	mm = async ? get_task_mm(dma->task) : dma->task->mm;
> >>>> -	if (!mm)
> >>>> +	mm = dma->mm;
> >>>> +	if (async && !mmget_not_zero(mm))
> >>>>  		return -ESRCH; /* process exited */
> >>>
> >>> Just delete the async; vfio_lock_acct always acts on the dma, which
> >>> always has a singular mm.
> >>>
> >>> Fix the few callers that need it to do the mmget_not_zero() before
> >>> calling in.
> >>
> >> Most of the callers pass async=true:
> >>
> >>     ret = vfio_lock_acct(dma, lock_acct, false);
> >>     vfio_lock_acct(dma, locked - unlocked, true);
> >>     ret = vfio_lock_acct(dma, 1, true);
> >>     vfio_lock_acct(dma, -unlocked, true);
> >>     vfio_lock_acct(dma, -1, true);
> >>     vfio_lock_acct(dma, -unlocked, true);
> >>     ret = mm_lock_acct(task, mm, lock_cap, npage, false);
> >>     mm_lock_acct(dma->task, dma->mm, dma->lock_cap, -npage, true);
> >>     vfio_lock_acct(dma, locked - unlocked, true);
> >
> > Seems like if you make a lock_sub_acct() function that does the -1*
> > and does the mmget it will be OK?
>
> Do you mean, provide two versions of vfio_lock_acct? Simplified:
>
>     vfio_lock_acct()
>     {
>         mm_lock_acct()
>         dma->locked_vm += npage;
>     }
>
>     vfio_lock_acct_async()
>     {
>         mmget_not_zero(dma->mm)
>
>         mm_lock_acct()
>         dma->locked_vm += npage;
>
>         mmput(dma->mm);
>     }
I was thinking more like

    vfio_lock_acct_subtract()
        mmget_not_zero(dma->mm)
        mm->locked_vm -= npage

Jason
Thread overview: 22+ messages
2022-12-20 20:39 [PATCH V7 0/7] fixes for virtual address update Steve Sistare
2022-12-20 20:39 ` [PATCH V7 1/7] vfio/type1: exclude mdevs from VFIO_UPDATE_VADDR Steve Sistare
2022-12-20 20:39 ` [PATCH V7 2/7] vfio/type1: prevent underflow of locked_vm via exec() Steve Sistare
2023-01-03 15:20 ` Jason Gunthorpe
2023-01-03 18:12 ` Steven Sistare
2023-01-03 19:20 ` Jason Gunthorpe
2023-01-06 15:14 ` Steven Sistare
2023-01-09 13:52 ` Jason Gunthorpe [this message]
2023-01-09 15:54 ` Steven Sistare
2023-01-09 21:16 ` Steven Sistare
2023-01-10 15:02 ` Jason Gunthorpe
2022-12-20 20:39 ` [PATCH V7 3/7] vfio/type1: track locked_vm per dma Steve Sistare
2023-01-03 15:21 ` Jason Gunthorpe
2023-01-03 18:13 ` Steven Sistare
2023-01-09 21:24 ` Steven Sistare
2023-01-10 0:31 ` Jason Gunthorpe
2022-12-20 20:39 ` [PATCH V7 4/7] vfio/type1: restore locked_vm Steve Sistare
2023-01-03 15:22 ` Jason Gunthorpe
2023-01-03 18:12 ` Steven Sistare
2022-12-20 20:39 ` [PATCH V7 5/7] vfio/type1: revert "block on invalid vaddr" Steve Sistare
2022-12-20 20:39 ` [PATCH V7 6/7] vfio/type1: revert "implement notify callback" Steve Sistare
2022-12-20 20:39 ` [PATCH V7 7/7] vfio: revert "iommu driver " Steve Sistare