From: Alex Williamson <alex.williamson@redhat.com>
To: Kirti Wankhede <kwankhede@nvidia.com>
Cc: <pbonzini@redhat.com>, <kraxel@redhat.com>, <cjia@nvidia.com>,
	<qemu-devel@nongnu.org>, <kvm@vger.kernel.org>,
	<kevin.tian@intel.com>, <jike.song@intel.com>,
	<bjsdjshi@linux.vnet.ibm.com>, <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v11 09/22] vfio iommu type1: Add task structure to vfio_dma
Date: Tue, 8 Nov 2016 09:43:38 -0700
Message-ID: <20161108094338.7eccc64f@t450s.home>
In-Reply-To: <71e24995-1678-7e43-90fa-7798cfcdebbc@nvidia.com>

On Tue, 8 Nov 2016 19:43:25 +0530
Kirti Wankhede <kwankhede@nvidia.com> wrote:

> On 11/8/2016 2:33 AM, Alex Williamson wrote:
> > On Sat, 5 Nov 2016 02:40:43 +0530
> > Kirti Wankhede <kwankhede@nvidia.com> wrote:
> >   
> 
> ...
> 
> >>  static int vfio_dma_do_map(struct vfio_iommu *iommu,
> >>  			   struct vfio_iommu_type1_dma_map *map)
> >>  {
> >>  	dma_addr_t iova = map->iova;
> >>  	unsigned long vaddr = map->vaddr;
> >>  	size_t size = map->size;
> >> -	long npage;
> >>  	int ret = 0, prot = 0;
> >>  	uint64_t mask;
> >>  	struct vfio_dma *dma;
> >> -	unsigned long pfn;
> >> +	struct vfio_addr_space *addr_space;
> >> +	struct mm_struct *mm;
> >> +	bool free_addr_space_on_err = false;
> >>  
> >>  	/* Verify that none of our __u64 fields overflow */
> >>  	if (map->size != size || map->vaddr != vaddr || map->iova != iova)
> >> @@ -608,47 +685,56 @@ static int vfio_dma_do_map(struct vfio_iommu *iommu,
> >>  	mutex_lock(&iommu->lock);
> >>  
> >>  	if (vfio_find_dma(iommu, iova, size)) {
> >> -		mutex_unlock(&iommu->lock);
> >> -		return -EEXIST;
> >> +		ret = -EEXIST;
> >> +		goto do_map_err;
> >> +	}
> >> +
> >> +	mm = get_task_mm(current);
> >> +	if (!mm) {
> >> +		ret = -ENODEV;  
> > 
> > -EFAULT?
> >  
> 
> The -ENODEV return comes from the original code in vfio_pin_pages():
>         if (!current->mm)
>                 return -ENODEV;
> 
> I considered changing it to -EFAULT, but went back to -ENODEV to stay
> consistent with the original error code.
> 
> Should I still change this return to -EFAULT?

Let's keep ENODEV for less code churn, I guess.
 
> >> +		goto do_map_err;
> >> +	}
> >> +
> >> +	addr_space = vfio_find_addr_space(iommu, mm);
> >> +	if (addr_space) {
> >> +		atomic_inc(&addr_space->ref_count);
> >> +		mmput(mm);
> >> +	} else {
> >> +		addr_space = kzalloc(sizeof(*addr_space), GFP_KERNEL);
> >> +		if (!addr_space) {
> >> +			ret = -ENOMEM;
> >> +			goto do_map_err;
> >> +		}
> >> +		addr_space->mm = mm;
> >> +		atomic_set(&addr_space->ref_count, 1);
> >> +		list_add(&addr_space->next, &iommu->addr_space_list);
> >> +		free_addr_space_on_err = true;
> >>  	}
> >>  
> >>  	dma = kzalloc(sizeof(*dma), GFP_KERNEL);
> >>  	if (!dma) {
> >> -		mutex_unlock(&iommu->lock);
> >> -		return -ENOMEM;
> >> +		if (free_addr_space_on_err) {
> >> +			mmput(mm);
> >> +			list_del(&addr_space->next);
> >> +			kfree(addr_space);
> >> +		}
> >> +		ret = -ENOMEM;
> >> +		goto do_map_err;
> >>  	}
> >>  
> >>  	dma->iova = iova;
> >>  	dma->vaddr = vaddr;
> >>  	dma->prot = prot;
> >> +	dma->addr_space = addr_space;
> >> +	get_task_struct(current);
> >> +	dma->task = current;
> >> +	dma->mlock_cap = capable(CAP_IPC_LOCK);  
> > 
> > 
> > How do you reason we can cache this?  Does the fact that the process
> > had this capability at the time that it did a DMA_MAP imply that it
> > necessarily still has this capability when an external user (vendor
> > driver) tries to pin pages?  I don't see how we can make that
> > assumption.
> > 
> >   
> 
> Will a process change its MEMLOCK limit at runtime? I think it shouldn't,
> correct me if I'm wrong. QEMU doesn't do that, right?

What QEMU does or doesn't do isn't relevant; the question is whether a
process could change its CAP_IPC_LOCK capability at runtime.  It seems
plausible to me.

> The function capable() checks the current task's capability. But a call
> to vfio_pin_pages() could come from another task, while the pages are
> pinned from the address space of the task that mapped them. So we can't
> use capable() in vfio_pin_pages().
> 
> If this capability shouldn't be cached, we have to use has_capability()
> with dma->task as the argument in vfio_pin_pages():
> 
>  bool has_capability(struct task_struct *t, int cap)

Yep, that sounds better.  Thanks,

Alex
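
To make the caching concern concrete, here is a minimal userspace sketch
of the two approaches discussed above.  It is only a model: struct
mock_task, struct mock_dma and every mock_*/may_skip_* helper are
hypothetical stand-ins invented for illustration, not the kernel's
task_struct, vfio_dma, capable() or has_capability().  It assumes the
capability can be reduced to a single boolean per task.

```c
/* Userspace model only: all names here are hypothetical stand-ins. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct mock_task {
	bool cap_ipc_lock;	/* does the task hold CAP_IPC_LOCK right now? */
};

struct mock_dma {
	struct mock_task *task;	/* task that issued DMA_MAP */
	bool mlock_cap;		/* snapshot taken at map time (v11 approach) */
};

/* Models has_capability(t, CAP_IPC_LOCK): query a specific task, now. */
static bool mock_has_ipc_lock(const struct mock_task *t)
{
	return t->cap_ipc_lock;
}

/* Models the DMA_MAP path: record the mapping task, snapshot its cap. */
static void mock_dma_map(struct mock_dma *dma, struct mock_task *mapper)
{
	assert(dma != NULL && mapper != NULL);
	dma->task = mapper;
	dma->mlock_cap = mock_has_ipc_lock(mapper);
}

/* Pin-time decision based on the cached snapshot. */
static bool may_skip_limit_cached(const struct mock_dma *dma)
{
	return dma->mlock_cap;
}

/* Pin-time decision based on a live query of the mapping task. */
static bool may_skip_limit_live(const struct mock_dma *dma)
{
	return mock_has_ipc_lock(dma->task);
}
```

If the mapping task clears cap_ipc_lock after mock_dma_map(), the cached
helper keeps answering with the stale snapshot while the live helper
reflects the change.  In the kernel, the live variant would correspond to
calling has_capability() with dma->task at pin time, as proposed in the
quoted text.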
