From: Alex Williamson <alex.williamson@redhat.com>
To: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: linuxppc-dev@lists.ozlabs.org, kvm@vger.kernel.org,
Paul Mackerras <paulus@samba.org>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v5 29/29] vfio: powerpc/spapr: Support Dynamic DMA windows
Date: Tue, 10 Mar 2015 19:10:06 -0600
Message-ID: <1426036206.25026.108.camel@redhat.com>
In-Reply-To: <1425910045-26167-30-git-send-email-aik@ozlabs.ru>
On Tue, 2015-03-10 at 01:07 +1100, Alexey Kardashevskiy wrote:
> This adds create/remove window ioctls to create and remove DMA windows.
> sPAPR defines a Dynamic DMA windows capability which allows
> para-virtualized guests to create additional DMA windows on a PCI bus.
> The existing linux kernels use this new window to map the entire guest
> memory and switch to the direct DMA operations saving time on map/unmap
> requests which would normally happen in a big amounts.
>
> This adds 2 ioctl handlers - VFIO_IOMMU_SPAPR_TCE_CREATE and
> VFIO_IOMMU_SPAPR_TCE_REMOVE - to create and remove windows.
> Up to 2 windows are supported now by the hardware and by this driver.
>
> This changes VFIO_IOMMU_SPAPR_TCE_GET_INFO handler to return additional
> information such as a number of supported windows and maximum number
> levels of TCE tables.
>
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
> Changes:
> v4:
> * moved code to tce_iommu_create_window()/tce_iommu_remove_window()
> helpers
> * added docs
> ---
> Documentation/vfio.txt | 19 +++++
> arch/powerpc/include/asm/iommu.h | 2 +-
> drivers/vfio/vfio_iommu_spapr_tce.c | 165 +++++++++++++++++++++++++++++++++++-
> include/uapi/linux/vfio.h | 24 +++++-
> 4 files changed, 207 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/vfio.txt b/Documentation/vfio.txt
> index 791e85c..61ce393 100644
> --- a/Documentation/vfio.txt
> +++ b/Documentation/vfio.txt
> @@ -446,6 +446,25 @@ the memory block.
> The user space is not expected to call these often and the block descriptors
> are stored in a linked list in the kernel.
>
> +6) sPAPR specification allows guests to have an ddditional DMA window(s) on
> +a PCI bus with a variable page size. Two ioctls have been added to support
> +this: VFIO_IOMMU_SPAPR_TCE_CREATE and VFIO_IOMMU_SPAPR_TCE_REMOVE.
> +The platform has to support the functionality or error will be returned to
> +the userspace. The existing hardware supports up to 2 DMA windows, one is
> +2GB long, uses 4K pages and called "default 32bit window"; the other can
> +be as big as entire RAM, use different page size, it is optional - guests
> +create those in run-time if the guest driver supports 64bit DMA.
> +
> +VFIO_IOMMU_SPAPR_TCE_CREATE receives a page shift, a DMA window size and
> +a number of TCE table levels (if a TCE table is going to be big enough and
> +the kernel may not be able to allocate enough of physicall contiguous memory).
> +It creates a new window in the available slot and returns the bus address where
> +the new window starts. Due to hardware limitation, the user space cannot choose
> +the location of DMA windows.
> +
> +VFIO_IOMMU_SPAPR_TCE_REMOVE receives the bus start address of the window
> +and removes it.
> +
> -------------------------------------------------------------------------------
>
> [1] VFIO was originally an acronym for "Virtual Function I/O" in its
> diff --git a/arch/powerpc/include/asm/iommu.h b/arch/powerpc/include/asm/iommu.h
> index 04f72ac..de82b61 100644
> --- a/arch/powerpc/include/asm/iommu.h
> +++ b/arch/powerpc/include/asm/iommu.h
> @@ -138,7 +138,7 @@ extern void iommu_free_table(struct iommu_table *tbl, const char *node_name);
> extern struct iommu_table *iommu_init_table(struct iommu_table * tbl,
> int nid);
>
> -#define IOMMU_TABLE_GROUP_MAX_TABLES 1
> +#define IOMMU_TABLE_GROUP_MAX_TABLES 2
>
> struct iommu_table_group;
>
> diff --git a/drivers/vfio/vfio_iommu_spapr_tce.c b/drivers/vfio/vfio_iommu_spapr_tce.c
> index 3a0b5fe..7aa4141b 100644
> --- a/drivers/vfio/vfio_iommu_spapr_tce.c
> +++ b/drivers/vfio/vfio_iommu_spapr_tce.c
> @@ -96,6 +96,7 @@ struct tce_container {
> struct list_head mem_list;
> struct iommu_table tables[IOMMU_TABLE_GROUP_MAX_TABLES];
> struct list_head group_list;
> + bool v2;
> };
>
> struct tce_iommu_group {
> @@ -333,6 +334,20 @@ static struct iommu_table *spapr_tce_find_table(
> return ret;
> }
>
> +static int spapr_tce_find_free_table(struct tce_container *container)
> +{
> + int i;
> +
> + for (i = 0; i < IOMMU_TABLE_GROUP_MAX_TABLES; ++i) {
> + struct iommu_table *tbl = &container->tables[i];
> +
> + if (!tbl->it_size)
> + return i;
> + }
> +
> + return -1;
Why not use a real errno here?
> +}
> +
> static int tce_iommu_enable(struct tce_container *container)
> {
> int ret = 0;
> @@ -432,6 +447,8 @@ static void *tce_iommu_open(unsigned long arg)
> INIT_LIST_HEAD_RCU(&container->mem_list);
> INIT_LIST_HEAD_RCU(&container->group_list);
>
> + container->v2 = arg == VFIO_SPAPR_TCE_v2_IOMMU;
> +
Ah, here v2 actually provides some enforced differentiation, right? ...
oh wait, nobody ever uses this.
> return container;
> }
>
> @@ -605,11 +622,90 @@ static long tce_iommu_build(struct tce_container *container,
> return ret;
> }
>
> +static long tce_iommu_create_window(struct tce_container *container,
> + __u32 page_shift, __u64 window_size, __u32 levels,
> + __u64 *start_addr)
> +{
> + struct iommu_table_group *table_group;
> + struct tce_iommu_group *tcegrp;
> + int num;
> + long ret;
> +
> + num = spapr_tce_find_free_table(container);
> + if (num < 0)
> + return -ENOSYS;
Wouldn't something like ENOSPC be more appropriate (returned from the
function, not invented here)?
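To illustrate the two comments above, here is a minimal userspace sketch, not the driver code. The structures are simplified stand-ins (only it_size is modelled), MAX_TABLES stands in for IOMMU_TABLE_GROUP_MAX_TABLES, and the function names are shortened for the example. The helper returns -ENOSPC itself and the caller just passes it through.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_TABLES 2	/* stand-in for IOMMU_TABLE_GROUP_MAX_TABLES */

/* Simplified stand-ins for the kernel structures; not the real layouts. */
struct table { unsigned long it_size; };
struct container { struct table tables[MAX_TABLES]; };

/* Return the index of the first unused slot, or -ENOSPC if all are taken. */
static int find_free_table(const struct container *c)
{
	int i;

	for (i = 0; i < MAX_TABLES; ++i)
		if (!c->tables[i].it_size)
			return i;

	return -ENOSPC;
}

/* The caller passes the helper's errno through instead of inventing one. */
static long create_window(struct container *c, unsigned long size)
{
	int num = find_free_table(c);

	if (num < 0)
		return num;

	c->tables[num].it_size = size;	/* stands in for the real table setup */
	return 0;
}
```

With two slots, a third create_window() call on the same container would get -ENOSPC straight from the helper.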
> +
> + tcegrp = list_first_entry(&container->group_list,
> + struct tce_iommu_group, next);
> + table_group = iommu_group_get_iommudata(tcegrp->grp);
> +
> + ret = table_group->ops->create_table(table_group, num,
> + page_shift, window_size, levels,
> + &container->tables[num]);
> + if (ret)
> + return ret;
> +
> + list_for_each_entry(tcegrp, &container->group_list, next) {
> + struct iommu_table_group *table_group_tmp =
> + iommu_group_get_iommudata(tcegrp->grp);
> +
> + if (WARN_ON_ONCE(table_group_tmp->ops != table_group->ops))
> + return -EFAULT;
EFAULT doesn't seem appropriate either. What "bad address" did the user
provide?
> +
> + ret = table_group->ops->set_window(table_group_tmp, num,
> + &container->tables[num]);
> + if (ret)
> + return ret;
I admit I'm getting lost in the details here, but it seems we have a
number of cases where we've set something up and we're just bailing on
errors with no sign that we're undoing any previous operations.
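Roughly the shape I would expect, as a userspace sketch under stated assumptions: struct group, set_window(), unset_window() and the table_allocated flag are hypothetical stand-ins for the group list, the ops callbacks and the table created by create_table(). The point is only the unwind order, not the real API.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for one attached group; not a kernel structure. */
struct group {
	int fail_set;	/* make set_window() fail, for demonstration */
	int window_set;	/* 1 while the window is programmed */
};

static int set_window(struct group *g)
{
	if (g->fail_set)
		return -EIO;
	g->window_set = 1;
	return 0;
}

static void unset_window(struct group *g)
{
	g->window_set = 0;
}

/*
 * Program the window into every group. On failure, unset the window in
 * the groups already programmed and release the table before returning.
 */
static long create_window(struct group *groups, size_t n, int *table_allocated)
{
	size_t i;
	long ret = 0;

	*table_allocated = 1;		/* stands in for ops->create_table() */

	for (i = 0; i < n; ++i) {
		ret = set_window(&groups[i]);
		if (ret)
			goto unwind;
	}
	return 0;

unwind:
	/* i is the index that failed; walk back over the ones that succeeded */
	while (i--)
		unset_window(&groups[i]);
	*table_allocated = 0;		/* stands in for freeing the table */
	return ret;
}
```

If the third of three groups fails, the first two end up with no window set and the table is released, so the container looks the same as before the call.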
> + }
> +
> + *start_addr = container->tables[num].it_offset <<
> + container->tables[num].it_page_shift;
> +
> + return 0;
> +}
> +
> +static long tce_iommu_remove_window(struct tce_container *container,
> + __u64 start_addr)
> +{
> + struct iommu_table_group *table_group = NULL;
> + struct iommu_table *tbl;
> + struct tce_iommu_group *tcegrp;
> + int num;
> +
> + tbl = spapr_tce_find_table(container, start_addr);
> + if (!tbl)
> + return -EINVAL;
> +
> + /* Detach groups from IOMMUs */
> + num = tbl - container->tables;
> + list_for_each_entry(tcegrp, &container->group_list, next) {
> + table_group = iommu_group_get_iommudata(tcegrp->grp);
> + if (!table_group->ops || !table_group->ops->unset_window)
> + return -EFAULT;
Is this valid? Why would we let the user set_window on something that
doesn't have an unset? What state does this leave the system in to
fault here? What's the "bad address" the user passed to get here?
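If the check has to stay at all, one option is to validate every group before touching any of them, so a failure cannot leave the window half-detached. A sketch with hypothetical, trimmed-down types follows; the -EINVAL is only a placeholder, since picking the right errno is exactly the open question here.

```c
#include <errno.h>
#include <stddef.h>

struct group;

/* Hypothetical, trimmed-down ops table; only unset_window matters here. */
struct group_ops {
	void (*unset_window)(struct group *g);
};

struct group {
	const struct group_ops *ops;
	int window_set;
};

static void do_unset(struct group *g)
{
	g->window_set = 0;
}

/*
 * First pass: make sure every group can be detached. Second pass: detach.
 * A failure in the first pass leaves every window untouched.
 */
static long remove_window(struct group *groups, size_t n)
{
	size_t i;

	for (i = 0; i < n; ++i)
		if (!groups[i].ops || !groups[i].ops->unset_window)
			return -EINVAL;	/* placeholder errno, see above */

	for (i = 0; i < n; ++i)
		groups[i].ops->unset_window(&groups[i]);

	return 0;
}
```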
> + if (container->tables[num].it_size)
> + table_group->ops->unset_window(table_group, num);
> + }
> +
> + /* Free table */
> + tcegrp = list_first_entry(&container->group_list,
> + struct tce_iommu_group, next);
> + table_group = iommu_group_get_iommudata(tcegrp->grp);
> +
> + tce_iommu_clear(container, tbl,
> + tbl->it_offset, tbl->it_size);
> + if (tbl->it_ops->free)
> + tbl->it_ops->free(tbl);
> +
> + memset(tbl, 0, sizeof(*tbl));
> +
> + return 0;
> +}
> +
> static long tce_iommu_ioctl(void *iommu_data,
> unsigned int cmd, unsigned long arg)
> {
> struct tce_container *container = iommu_data;
> - unsigned long minsz;
> + unsigned long minsz, ddwsz;
> long ret;
>
> switch (cmd) {
> @@ -652,6 +748,16 @@ static long tce_iommu_ioctl(void *iommu_data,
>
> info.dma32_window_start = table_group->tce32_start;
> info.dma32_window_size = table_group->tce32_size;
> + info.max_dynamic_windows_supported =
> + table_group->max_dynamic_windows_supported;
> + info.levels = table_group->max_levels;
> + info.flags = table_group->flags;
> +
> + ddwsz = offsetofend(struct vfio_iommu_spapr_tce_info,
> + levels);
> +
> + if (info.argsz == ddwsz)
> + minsz = ddwsz;
>
> if (copy_to_user((void __user *)arg, &info, minsz))
> return -EFAULT;
> @@ -823,6 +929,63 @@ static long tce_iommu_ioctl(void *iommu_data,
> return ret;
> }
>
> + case VFIO_IOMMU_SPAPR_TCE_CREATE: {
> + struct vfio_iommu_spapr_tce_create create;
> +
> + if (!tce_preregistered(container))
> + return -EPERM;
> +
> + minsz = offsetofend(struct vfio_iommu_spapr_tce_create,
> + start_addr);
> +
> + if (copy_from_user(&create, (void __user *)arg, minsz))
> + return -EFAULT;
> +
> + if (create.argsz < minsz)
> + return -EINVAL;
> +
> + if (create.flags)
> + return -EINVAL;
> +
> + mutex_lock(&container->lock);
> +
> + ret = tce_iommu_create_window(container, create.page_shift,
> + create.window_size, create.levels,
> + &create.start_addr);
> +
> + if (!ret && copy_to_user((void __user *)arg, &create, minsz))
> + return -EFAULT;
> +
> + mutex_unlock(&container->lock);
Too bad that above return doesn't unlock the mutex too.
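Something along these lines would keep a single exit past the unlock. This is a userspace sketch only: struct fake_mutex, fake_create_window() and fake_copy_to_user() are hypothetical stand-ins for container->lock, tce_iommu_create_window() and copy_to_user().

```c
#include <errno.h>

/* Hypothetical stand-in for the container mutex; just tracks ownership. */
struct fake_mutex { int held; };

static void fake_lock(struct fake_mutex *m)   { m->held = 1; }
static void fake_unlock(struct fake_mutex *m) { m->held = 0; }

/* Hypothetical stand-ins: each returns nonzero on failure. */
static int fake_create_window(void)       { return 0; }
static int fake_copy_to_user(int fail)    { return fail; }

/* Every path after the lock is taken leaves through the single unlock. */
static long create_ioctl(struct fake_mutex *lock, int copy_fails)
{
	long ret;

	fake_lock(lock);

	ret = fake_create_window();
	if (!ret && fake_copy_to_user(copy_fails))
		ret = -EFAULT;	/* record the error, do not return yet */

	fake_unlock(lock);

	return ret;
}
```

The failed copy still reports -EFAULT, but the lock is released first.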
> +
> + return ret;
> + }
> + case VFIO_IOMMU_SPAPR_TCE_REMOVE: {
> + struct vfio_iommu_spapr_tce_remove remove;
> +
> + if (!tce_preregistered(container))
> + return -EPERM;
> +
> + minsz = offsetofend(struct vfio_iommu_spapr_tce_remove,
> + start_addr);
> +
> + if (copy_from_user(&remove, (void __user *)arg, minsz))
> + return -EFAULT;
> +
> + if (remove.argsz < minsz)
> + return -EINVAL;
> +
> + if (remove.flags)
> + return -EINVAL;
> +
> + mutex_lock(&container->lock);
> +
> + ret = tce_iommu_remove_window(container, remove.start_addr);
> +
> + mutex_unlock(&container->lock);
> +
> + return ret;
> + }
> }
>
> return -ENOTTY;
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> index fbc5286..150f418 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -457,9 +457,11 @@ struct vfio_iommu_type1_dma_unmap {
> */
> struct vfio_iommu_spapr_tce_info {
> __u32 argsz;
> - __u32 flags; /* reserved for future use */
> + __u32 flags;
> __u32 dma32_window_start; /* 32 bit window start (bytes) */
> __u32 dma32_window_size; /* 32 bit window size (bytes) */
> + __u32 max_dynamic_windows_supported;
> + __u32 levels;
How does the user know these extra fields are there? flags is a return
value here that could be used to indicate features.
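For example, userspace could key off a feature bit before reading the new fields. In the sketch below INFO_FLAG_DDW is a made-up name for illustration (the patch as posted defines no such flag), and struct tce_info is a local copy of the extended layout using stdint types rather than the uapi header.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bit; the patch under review defines no such flag. */
#define INFO_FLAG_DDW (1u << 0)

/* Local copy of the extended layout from the patch, using stdint types. */
struct tce_info {
	uint32_t argsz;
	uint32_t flags;
	uint32_t dma32_window_start;
	uint32_t dma32_window_size;
	uint32_t max_dynamic_windows_supported;
	uint32_t levels;
};

/*
 * Return how many dynamic windows the caller may rely on: the new fields
 * are trusted only when the kernel set the feature bit in flags.
 */
static uint32_t usable_dynamic_windows(const struct tce_info *info)
{
	if (!(info->flags & INFO_FLAG_DDW))
		return 0;
	return info->max_dynamic_windows_supported;
}
```

An older kernel that never sets the bit leaves the trailing fields as whatever userspace put there, and this check makes the caller ignore them.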
> };
>
> #define VFIO_IOMMU_SPAPR_TCE_GET_INFO _IO(VFIO_TYPE, VFIO_BASE + 12)
> @@ -520,6 +522,26 @@ struct vfio_iommu_spapr_register_memory {
> */
> #define VFIO_IOMMU_SPAPR_UNREGISTER_MEMORY _IO(VFIO_TYPE, VFIO_BASE + 18)
>
> +struct vfio_iommu_spapr_tce_create {
> + __u32 argsz;
> + __u32 flags;
> + /* in */
> + __u32 page_shift;
> + __u64 window_size;
> + __u32 levels;
> + /* out */
> + __u64 start_addr;
> +};
> +#define VFIO_IOMMU_SPAPR_TCE_CREATE _IO(VFIO_TYPE, VFIO_BASE + 19)
> +
> +struct vfio_iommu_spapr_tce_remove {
> + __u32 argsz;
> + __u32 flags;
> + /* in */
> + __u64 start_addr;
> +};
> +#define VFIO_IOMMU_SPAPR_TCE_REMOVE _IO(VFIO_TYPE, VFIO_BASE + 20)
> +
Comments are lacking here compared to the rest of the interfaces.
> /* ***************************************************************** */
>
> #endif /* _UAPIVFIO_H */