From: Steven Sistare <steven.sistare@oracle.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: kvm@vger.kernel.org, Cornelia Huck <cohuck@redhat.com>
Subject: Re: [PATCH V1 2/2] vfio/type1: prevent locked_vm underflow
Date: Tue, 13 Dec 2022 13:21:15 -0500
Message-ID: <43ad3256-e485-b358-6445-35645d943b7b@oracle.com>
In-Reply-To: <69e68902-eed9-748a-887a-549c717ebe01@oracle.com>
On 12/13/2022 1:17 PM, Steven Sistare wrote:
> On 12/13/2022 1:02 PM, Alex Williamson wrote:
>> On Tue, 13 Dec 2022 07:46:56 -0800
>> Steve Sistare <steven.sistare@oracle.com> wrote:
>>
>>> When a vfio container is preserved across exec using the VFIO_UPDATE_VADDR
>>> interfaces, locked_vm of the new mm becomes 0. If the user later unmaps a
>>> dma mapping, locked_vm underflows to a large unsigned value, and a
>>> subsequent dma map request fails with ENOMEM in __account_locked_vm.
>>>
>>> To fix, when VFIO_DMA_MAP_FLAG_VADDR is used and the dma's mm has changed,
>>> add the mapping's pinned page count to the new mm->locked_vm, subject to
>>> the rlimit. Now that mediated devices are excluded when using
>>> VFIO_UPDATE_VADDR, the amount of pinned memory equals the size of the
>>> mapping.
>>>
>>> Underflow will not occur when all dma mappings are invalidated before exec.
>>> An attempt to unmap before updating the vaddr with VFIO_DMA_MAP_FLAG_VADDR
>>> will fail with EINVAL because the mapping is in the vaddr_invalid state.
>>
>> Where is this enforced?
>
> In vfio_dma_do_unmap:
> 	if (invalidate_vaddr) {
> 		if (dma->vaddr_invalid) {
> 			...
> 			ret = -EINVAL;

My bad, this is a different case: that check only rejects a second invalidate
of a mapping that is already in the vaddr_invalid state, so my comment in the
commit message is incorrect. I should also test mm != dma->mm during unmap,
and suppress the locked_vm deduction there.

- Steve

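A minimal user-space sketch of the underflow and of the unmap-side check described above. It is not kernel code: `mm_model`, `dma_model`, `charge_locked_vm`, `unmap_unguarded`, and `unmap_guarded` are hypothetical names, and the model assumes the deduction is applied to the caller's current mm.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the locked_vm counter of a struct mm_struct. */
struct mm_model {
	unsigned long locked_vm;	/* pages accounted as locked */
};

/* Hypothetical stand-in for struct vfio_dma. */
struct dma_model {
	struct mm_model *mm;		/* mm charged when the mapping was created */
	unsigned long npages;		/* pinned pages backing the mapping */
};

/* Charge pages, refusing anything that would exceed the limit (in pages). */
static int charge_locked_vm(struct mm_model *mm, unsigned long pages,
			    unsigned long limit)
{
	if (mm->locked_vm + pages > limit)
		return -ENOMEM;
	mm->locked_vm += pages;
	return 0;
}

/*
 * Unguarded unmap: always deducts from the caller's mm, so the unsigned
 * counter wraps around when that mm was never charged for the mapping.
 */
static void unmap_unguarded(const struct dma_model *dma, struct mm_model *cur)
{
	assert(dma != NULL && cur != NULL);
	cur->locked_vm -= dma->npages;
}

/* Guarded unmap: deduct only if the caller's mm is the one that was charged. */
static void unmap_guarded(const struct dma_model *dma, struct mm_model *cur)
{
	assert(dma != NULL && cur != NULL);
	if (cur == dma->mm)
		cur->locked_vm -= dma->npages;
}
```

In this model, unmapping a 16-page mapping against a fresh mm with locked_vm == 0 wraps the counter in the unguarded case, and a later charge_locked_vm() call returns -ENOMEM; the guarded variant leaves the counter at 0.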
>>> Underflow may still occur in a buggy application that fails to invalidate
>>> all before exec.
>>>
>>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>>> ---
>>> drivers/vfio/vfio_iommu_type1.c | 11 +++++++++++
>>> 1 file changed, 11 insertions(+)
>>>
>>> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
>>> index f81e925..e5a02f8 100644
>>> --- a/drivers/vfio/vfio_iommu_type1.c
>>> +++ b/drivers/vfio/vfio_iommu_type1.c
>>> @@ -100,6 +100,7 @@ struct vfio_dma {
>>>  	struct task_struct *task;
>>>  	struct rb_root pfn_list; /* Ex-user pinned pfn list */
>>>  	unsigned long *bitmap;
>>> +	struct mm_struct *mm;
>>>  };
>>>
>>> struct vfio_batch {
>>> @@ -1174,6 +1175,7 @@ static void vfio_remove_dma(struct vfio_iommu *iommu, struct vfio_dma *dma)
>>>  	vfio_unmap_unpin(iommu, dma, true);
>>>  	vfio_unlink_dma(iommu, dma);
>>>  	put_task_struct(dma->task);
>>> +	mmdrop(dma->mm);
>>>  	vfio_dma_bitmap_free(dma);
>>>  	if (dma->vaddr_invalid) {
>>>  		iommu->vaddr_invalid_count--;
>>> @@ -1622,6 +1624,13 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
>>>  			dma->vaddr = vaddr;
>>>  			dma->vaddr_invalid = false;
>>>  			iommu->vaddr_invalid_count--;
>>> +			if (current->mm != dma->mm) {
>>> +				mmdrop(dma->mm);
>>> +				dma->mm = current->mm;
>>> +				mmgrab(dma->mm);
>>> +				ret = vfio_lock_acct(dma, size >> PAGE_SHIFT,
>>> +						     0);
>>
>> What does it actually mean if this fails? The pages are still pinned.
>> locked_vm doesn't get updated. Underflow can still occur. Thanks,
>
> If this fails, the user has locked additional memory after exec and before
> making this call -- more than was locked before exec -- and the rlimit is
> exceeded. That is a misbehaving application, which will only hurt itself.
>
> However, I should reorder these, and check ret before changing the other state.
>
> - Steve
>
>>> +			}
>>>  			wake_up_all(&iommu->vaddr_wait);
>>>  		}
>>>  		goto out_unlock;
>>> @@ -1679,6 +1688,8 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
>>>  	get_task_struct(current->group_leader);
>>>  	dma->task = current->group_leader;
>>>  	dma->lock_cap = capable(CAP_IPC_LOCK);
>>> +	dma->mm = dma->task->mm;
>>> +	mmgrab(dma->mm);
>>>
>>>  	dma->pfn_list = RB_ROOT;
>>>
>>
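
The reordering mentioned in the quoted reply (check ret before changing the other state) could look roughly like the following user-space sketch. It is only a model under stated assumptions: `mm_model`, `dma_model`, `charge_locked_vm`, and `update_vaddr` are hypothetical names, the limit is expressed in pages, and the mmdrop()/mmgrab() reference handling is reduced to a pointer assignment.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the locked_vm counter of a struct mm_struct. */
struct mm_model {
	unsigned long locked_vm;	/* pages accounted as locked */
};

/* Hypothetical stand-in for the relevant fields of struct vfio_dma. */
struct dma_model {
	struct mm_model *mm;		/* mm currently charged for the mapping */
	unsigned long vaddr;
	unsigned long npages;		/* pinned pages backing the mapping */
	bool vaddr_invalid;
};

/* Charge pages, refusing anything that would exceed the limit (in pages). */
static int charge_locked_vm(struct mm_model *mm, unsigned long pages,
			    unsigned long limit)
{
	if (mm->locked_vm + pages > limit)
		return -ENOMEM;
	mm->locked_vm += pages;
	return 0;
}

/*
 * Update the vaddr of an invalidated mapping. If the caller's mm differs
 * from the one recorded in the mapping, charge the new mm first and bail
 * out on failure, before any mapping state has been modified.
 */
static int update_vaddr(struct dma_model *dma, struct mm_model *cur,
			unsigned long vaddr, unsigned long limit)
{
	assert(dma != NULL && cur != NULL);

	if (cur != dma->mm) {
		int ret = charge_locked_vm(cur, dma->npages, limit);

		if (ret)
			return ret;	/* mapping is left untouched */
		dma->mm = cur;		/* kernel code would mmdrop/mmgrab here */
	}
	dma->vaddr = vaddr;
	dma->vaddr_invalid = false;
	return 0;
}
```

On failure the mapping in this model stays in the vaddr_invalid state with its original mm, so the caller can retry once it is back under the limit.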