* [PATCH rdma-rc 1/4] RDMA/core: Enforce requirement to hold lists_rwsem semaphore
  2017-12-24 11:54 UTC, Leon Romanovsky
  To: Doug Ledford, Jason Gunthorpe
  Cc: linux-rdma, Daniel Jurgens, Majd Dibbiny, Mark Bloch, Moni Shoua,
      Yishai Hadas, Leon Romanovsky, stable

From: Leon Romanovsky <leonro@mellanox.com>

Add comment and run time check to the __ib_device_get_by_index()
function to remind that the caller should hold lists_rwsem semaphore.

Cc: <stable@vger.kernel.org> # v4.13
Fixes: ecc82c53f9a4 ("RDMA/core: Add and expose static device index")
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/core/device.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index 30914f3baa5f..3aeaf11d0a83 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -134,10 +134,15 @@ static int ib_device_check_mandatory(struct ib_device *device)
 	return 0;
 }

+/*
+ * Caller to this function should hold lists_rwsem
+ */
 struct ib_device *__ib_device_get_by_index(u32 index)
 {
 	struct ib_device *device;

+	WARN_ON_ONCE(!rwsem_is_locked(&lists_rwsem));
+
 	list_for_each_entry(device, &device_list, core_list)
 		if (device->index == index)
 			return device;
@@ -526,8 +531,8 @@ int ib_register_device(struct ib_device *device,
 		if (!add_client_context(device, client) && client->add)
 			client->add(device);

-	device->index = __dev_new_index();
 	down_write(&lists_rwsem);
+	device->index = __dev_new_index();
 	list_add_tail(&device->core_list, &device_list);
 	up_write(&lists_rwsem);
 	mutex_unlock(&device_mutex);
--
2.15.1
* Re: [PATCH rdma-rc 1/4] RDMA/core: Enforce requirement to hold lists_rwsem semaphore
  2017-12-27 21:49 UTC, Jason Gunthorpe
  To: Leon Romanovsky
  Cc: Doug Ledford, linux-rdma, Daniel Jurgens, Majd Dibbiny, Mark Bloch,
      Moni Shoua, Yishai Hadas, Leon Romanovsky, stable

On Sun, Dec 24, 2017 at 01:54:55PM +0200, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@mellanox.com>
>
> Add comment and run time check to the __ib_device_get_by_index()
> function to remind that the caller should hold lists_rwsem semaphore.

Upon closer inspection, this is not entirely right. There is no bug
here; the locking is clearly explained in the comment for
device_mutex.

lists_rwsem's down_write must only be done while within the
device_mutex. Therefore holding the device_mutex implies there can be
no concurrent writers, and any read lock of lists_rwsem is not
necessary.

>  struct ib_device *__ib_device_get_by_index(u32 index)
>  {
>  	struct ib_device *device;
>
> +	WARN_ON_ONCE(!rwsem_is_locked(&lists_rwsem));

So this is wrong, it needs to consider the device_mutex too.

> @@ -526,8 +531,8 @@ int ib_register_device(struct ib_device *device,
>  		if (!add_client_context(device, client) && client->add)
>  			client->add(device);
>
> -	device->index = __dev_new_index();
>  	down_write(&lists_rwsem);
> +	device->index = __dev_new_index();
>  	list_add_tail(&device->core_list, &device_list);
>  	up_write(&lists_rwsem);

And this sequence was OK before - the only thing that needs to be
protected by the write lock is the actual list manipulation, not the
search.

The locking here has become a bit nutzo, and the sequencing is wrong
too. Below is a draft of tidying at least some of this. Can you work
from here? Will drop this patch.

* Get rid of going_down; we can use list_del and list_splice held
  under the locks to prevent access to the ib_client_data struct
* This allows simplifying the removal locking to only hold write
  locks while doing the list edit
* Correct call ordering of removal - client remove should be called
  before we break set/get_client_data, otherwise the client has no
  control over when those calls start to fail.
* Client remove must prevent other threads from calling
  set/get_client_data - so safety checks now become WARN_ON
* The kfree doesn't need locking since the list manipulations have no
  dangling pointer anymore.
* Add ASSERT_LISTS_READ_LOCKED in all the right places

Jason

diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index 3aeaf11d0a83b7..9e973483b7b91d 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -52,12 +52,15 @@ MODULE_DESCRIPTION("core kernel InfiniBand API");
 MODULE_LICENSE("Dual BSD/GPL");

 struct ib_client_data {
+	/*
+	 * list uses a dual locking scheme, readers can either grab the global
+	 * read lists_rwsem or the per-device client_data_lock spin
+	 * lock. writers must grab both the write lists_rwsem and the spin
+	 * lock.
+	 */
 	struct list_head list;
 	struct ib_client *client;
 	void *            data;
-	/* The device or client is going down. Do not call client or device
-	 * callbacks other than remove(). */
-	bool		  going_down;
 };

 struct workqueue_struct *ib_comp_wq;
@@ -84,6 +87,16 @@ static LIST_HEAD(client_list);
 static DEFINE_MUTEX(device_mutex);
 static DECLARE_RWSEM(lists_rwsem);

+/*
+ * Used to indicate the function is going to read from
+ * client_data_list/list or device_list/core_list.
+ */
+static void ASSERT_LISTS_READ_LOCKED(void)
+{
+	WARN_ON_ONCE(!rwsem_is_locked(&lists_rwsem) &&
+		     !mutex_is_locked(&device_mutex));
+}
+
 static int ib_security_change(struct notifier_block *nb, unsigned long event,
 			      void *lsm_data);
 static void ib_policy_change_task(struct work_struct *work);
@@ -141,7 +154,7 @@ struct ib_device *__ib_device_get_by_index(u32 index)
 {
 	struct ib_device *device;

-	WARN_ON_ONCE(!rwsem_is_locked(&lists_rwsem));
+	ASSERT_LISTS_READ_LOCKED();

 	list_for_each_entry(device, &device_list, core_list)
 		if (device->index == index)
@@ -154,6 +167,8 @@ static struct ib_device *__ib_device_get_by_name(const char *name)
 {
 	struct ib_device *device;

+	ASSERT_LISTS_READ_LOCKED();
+
 	list_for_each_entry(device, &device_list, core_list)
 		if (!strncmp(name, device->name, IB_DEVICE_NAME_MAX))
 			return device;
@@ -172,6 +187,8 @@ static int alloc_name(char *name)
 	if (!inuse)
 		return -ENOMEM;

+	ASSERT_LISTS_READ_LOCKED();
+
 	list_for_each_entry(device, &device_list, core_list) {
 		if (!sscanf(device->name, name, &i))
 			continue;
@@ -292,7 +309,6 @@ static int add_client_context(struct ib_device *device, struct ib_client *client

 	context->client = client;
 	context->data   = NULL;
-	context->going_down = false;

 	down_write(&lists_rwsem);
 	spin_lock_irqsave(&device->client_data_lock, flags);
@@ -531,8 +547,8 @@ int ib_register_device(struct ib_device *device,
 		if (!add_client_context(device, client) && client->add)
 			client->add(device);

-	down_write(&lists_rwsem);
 	device->index = __dev_new_index();
+	down_write(&lists_rwsem);
 	list_add_tail(&device->core_list, &device_list);
 	up_write(&lists_rwsem);
 	mutex_unlock(&device_mutex);
@@ -559,23 +575,27 @@ void ib_unregister_device(struct ib_device *device)
 {
 	struct ib_client_data *context, *tmp;
 	unsigned long flags;
+	struct list_head client_data_tmp;
+
+	INIT_LIST_HEAD(&client_data_tmp);

 	mutex_lock(&device_mutex);

+	list_for_each_entry(context, &device->client_data_list, list)
+		if (context->client->remove)
+			context->client->remove(device, context->data);
+
 	down_write(&lists_rwsem);
-	list_del(&device->core_list);

 	spin_lock_irqsave(&device->client_data_lock, flags);
-	list_for_each_entry_safe(context, tmp, &device->client_data_list, list)
-		context->going_down = true;
+	list_splice_init(&device->client_data_list, &client_data_tmp);
 	spin_unlock_irqrestore(&device->client_data_lock, flags);
-	downgrade_write(&lists_rwsem);

-	list_for_each_entry_safe(context, tmp, &device->client_data_list,
-				 list) {
-		if (context->client->remove)
-			context->client->remove(device, context->data);
-	}
-	up_read(&lists_rwsem);
+	list_for_each_entry_safe(context, tmp, &client_data_tmp, list)
+		kfree(context);
+
+	list_del(&device->core_list);
+	up_write(&lists_rwsem);

 	ib_device_unregister_rdmacg(device);
 	ib_device_unregister_sysfs(device);
@@ -587,13 +607,6 @@ void ib_unregister_device(struct ib_device *device)
 	ib_security_destroy_port_pkey_list(device);
 	kfree(device->port_pkey_list);

-	down_write(&lists_rwsem);
-	spin_lock_irqsave(&device->client_data_lock, flags);
-	list_for_each_entry_safe(context, tmp, &device->client_data_list, list)
-		kfree(context);
-	spin_unlock_irqrestore(&device->client_data_lock, flags);
-	up_write(&lists_rwsem);
-
 	device->reg_state = IB_DEV_UNREGISTERED;
 }
 EXPORT_SYMBOL(ib_unregister_device);
@@ -617,6 +630,8 @@ int ib_register_client(struct ib_client *client)

 	mutex_lock(&device_mutex);

+	ASSERT_LISTS_READ_LOCKED();
+
 	list_for_each_entry(device, &device_list, core_list)
 		if (!add_client_context(device, client) && client->add)
 			client->add(device);
@@ -654,33 +669,27 @@ void ib_unregister_client(struct ib_client *client)
 	list_for_each_entry(device, &device_list, core_list) {
 		struct ib_client_data *found_context = NULL;

-		down_write(&lists_rwsem);
-		spin_lock_irqsave(&device->client_data_lock, flags);
-		list_for_each_entry_safe(context, tmp, &device->client_data_list, list)
+		list_for_each_entry_safe(context, tmp, &device->client_data_list, list) {
 			if (context->client == client) {
-				context->going_down = true;
 				found_context = context;
 				break;
 			}
-		spin_unlock_irqrestore(&device->client_data_lock, flags);
-		up_write(&lists_rwsem);
+		}

-		if (client->remove)
-			client->remove(device, found_context ?
-					       found_context->data : NULL);
+		if (found_context) {
+			if (client->remove)
+				client->remove(device, found_context->data);

-		if (!found_context) {
-			pr_warn("No client context found for %s/%s\n",
-				device->name, client->name);
-			continue;
-		}
+			down_write(&lists_rwsem);
+			spin_lock_irqsave(&device->client_data_lock, flags);
+			list_del(&context->list);
+			spin_unlock_irqrestore(&device->client_data_lock,
+					       flags);
+			up_write(&lists_rwsem);

-		down_write(&lists_rwsem);
-		spin_lock_irqsave(&device->client_data_lock, flags);
-		list_del(&found_context->list);
-		kfree(found_context);
-		spin_unlock_irqrestore(&device->client_data_lock, flags);
-		up_write(&lists_rwsem);
+			kfree(found_context);
+		} else
+			WARN_ON(!found_context);
 	}
 	mutex_unlock(&device_mutex);
@@ -735,9 +744,7 @@ void ib_set_client_data(struct ib_device *device, struct ib_client *client,
 		goto out;
 	}

-	pr_warn("No client context found for %s/%s\n",
-		device->name, client->name);
-
+	WARN_ON(true);
 out:
 	spin_unlock_irqrestore(&device->client_data_lock, flags);
 }
@@ -1133,9 +1140,6 @@ struct net_device *ib_get_net_dev_by_params(struct ib_device *dev,
 	list_for_each_entry(context, &dev->client_data_list, list) {
 		struct ib_client *client = context->client;

-		if (context->going_down)
-			continue;
-
 		if (client->get_net_dev_by_params) {
 			net_dev = client->get_net_dev_by_params(dev, port,
 								pkey, gid, addr,
* [PATCH rdma-rc 2/4] IB/mlx5: Serialize access to the VMA list
  2017-12-24 11:54 UTC, Leon Romanovsky
  To: Doug Ledford, Jason Gunthorpe
  Cc: linux-rdma, Daniel Jurgens, Majd Dibbiny, Mark Bloch, Moni Shoua,
      Yishai Hadas, stable

From: Majd Dibbiny <majd@mellanox.com>

User-space applications can do mmap and munmap directly. Since the
VMA list is not protected with a mutex, concurrent accesses to the
VMA list from the mmap and munmap can cause data corruption.

Cc: <stable@vger.kernel.org> # v4.7
Fixes: 7c2344c3bbf9 ("IB/mlx5: Implements disassociate_ucontext API")
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
---
 drivers/infiniband/hw/mlx5/main.c    | 8 ++++++++
 drivers/infiniband/hw/mlx5/mlx5_ib.h | 6 ++++++
 2 files changed, 14 insertions(+)

diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index b4ef4d9b6ce5..8ac50de2b242 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -1463,6 +1463,7 @@ static struct ib_ucontext *mlx5_ib_alloc_ucontext(struct ib_device *ibdev,
 	}

 	INIT_LIST_HEAD(&context->vma_private_list);
+	mutex_init(&context->vma_private_list_mutex);
 	INIT_LIST_HEAD(&context->db_page_list);
 	mutex_init(&context->db_page_mutex);

@@ -1624,7 +1625,9 @@ static void mlx5_ib_vma_close(struct vm_area_struct *area)
 	 * mlx5_ib_disassociate_ucontext().
 	 */
 	mlx5_ib_vma_priv_data->vma = NULL;
+	mutex_lock(mlx5_ib_vma_priv_data->vma_private_list_mutex);
 	list_del(&mlx5_ib_vma_priv_data->list);
+	mutex_unlock(mlx5_ib_vma_priv_data->vma_private_list_mutex);
 	kfree(mlx5_ib_vma_priv_data);
 }

@@ -1644,10 +1647,13 @@ static int mlx5_ib_set_vma_data(struct vm_area_struct *vma,
 		return -ENOMEM;

 	vma_prv->vma = vma;
+	vma_prv->vma_private_list_mutex = &ctx->vma_private_list_mutex;
 	vma->vm_private_data = vma_prv;
 	vma->vm_ops =  &mlx5_ib_vm_ops;

+	mutex_lock(&ctx->vma_private_list_mutex);
 	list_add(&vma_prv->list, vma_head);
+	mutex_unlock(&ctx->vma_private_list_mutex);

 	return 0;
 }
@@ -1690,6 +1696,7 @@ static void mlx5_ib_disassociate_ucontext(struct ib_ucontext *ibcontext)
 	 * mlx5_ib_vma_close.
 	 */
 	down_write(&owning_mm->mmap_sem);
+	mutex_lock(&context->vma_private_list_mutex);
 	list_for_each_entry_safe(vma_private, n, &context->vma_private_list,
 				 list) {
 		vma = vma_private->vma;
@@ -1704,6 +1711,7 @@ static void mlx5_ib_disassociate_ucontext(struct ib_ucontext *ibcontext)
 		list_del(&vma_private->list);
 		kfree(vma_private);
 	}
+	mutex_unlock(&context->vma_private_list_mutex);
 	up_write(&owning_mm->mmap_sem);
 	mmput(owning_mm);
 	put_task_struct(owning_process);
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index 6dd8cac78de2..dcf30e18409e 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -115,6 +115,9 @@ enum {
 struct mlx5_ib_vma_private_data {
 	struct list_head list;
 	struct vm_area_struct *vma;
+	/* protect vma_private_list add/del
+	 */
+	struct mutex *vma_private_list_mutex;
 };

 struct mlx5_ib_ucontext {
@@ -129,6 +132,9 @@ struct mlx5_ib_ucontext {
 	/* Transport Domain number */
 	u32 tdn;
 	struct list_head vma_private_list;
+	/* protect vma_private_list add/del
+	 */
+	struct mutex vma_private_list_mutex;
 	unsigned long upd_xlt_page;
 	/* protect ODP/KSM */
--
2.15.1
* [PATCH rdma-rc 3/4] IB/uverbs: Fix command checking as part of ib_uverbs_ex_modify_qp()
  2017-12-24 11:54 UTC, Leon Romanovsky
  To: Doug Ledford, Jason Gunthorpe
  Cc: linux-rdma, Daniel Jurgens, Majd Dibbiny, Mark Bloch, Moni Shoua,
      Yishai Hadas, stable

From: Moni Shoua <monis@mellanox.com>

If the input command length is larger than the kernel supports, an
error should be returned when the unsupported bytes are not cleared.

Cc: <stable@vger.kernel.org> # v4.10
Fixes: 189aba99e700 ("IB/uverbs: Extend modify_qp and support packet pacing")
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
---
 drivers/infiniband/core/uverbs_cmd.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index d0202bb176a4..840b24096690 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -2074,8 +2074,8 @@ int ib_uverbs_ex_modify_qp(struct ib_uverbs_file *file,
 		return -EOPNOTSUPP;

 	if (ucore->inlen > sizeof(cmd)) {
-		if (ib_is_udata_cleared(ucore, sizeof(cmd),
-					ucore->inlen - sizeof(cmd)))
+		if (!ib_is_udata_cleared(ucore, sizeof(cmd),
+					 ucore->inlen - sizeof(cmd)))
 			return -EOPNOTSUPP;
 	}
--
2.15.1
* [PATCH rdma-rc 4/4] IB/core: Verify that QP is security enabled in create and destroy
  2017-12-24 11:54 UTC, Leon Romanovsky
  To: Doug Ledford, Jason Gunthorpe
  Cc: linux-rdma, Daniel Jurgens, Majd Dibbiny, Mark Bloch, Moni Shoua,
      Yishai Hadas, stable

From: Moni Shoua <monis@mellanox.com>

The XRC target QP create flow does some security-related work.
However, security is enabled only when the link layer is InfiniBand,
so before doing this work the driver should check that the security
context is initialized. The same applies for the destroy flow.

Cc: <stable@vger.kernel.org> # v4.12
Fixes: d291f1a65232 ("IB/core: Enforce PKey security on QPs")
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
---
 drivers/infiniband/core/security.c | 3 +++
 drivers/infiniband/core/verbs.c    | 3 ++-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/core/security.c b/drivers/infiniband/core/security.c
index feafdb961c48..59b2f96d986a 100644
--- a/drivers/infiniband/core/security.c
+++ b/drivers/infiniband/core/security.c
@@ -386,6 +386,9 @@ int ib_open_shared_qp_security(struct ib_qp *qp, struct ib_device *dev)
 	if (ret)
 		return ret;

+	if (!qp->qp_sec)
+		return 0;
+
 	mutex_lock(&real_qp->qp_sec->mutex);
 	ret = check_qp_port_pkey_settings(real_qp->qp_sec->ports_pkeys,
 					  qp->qp_sec);
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 3fb8fb6cc824..e36d27ed4daa 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -1438,7 +1438,8 @@ int ib_close_qp(struct ib_qp *qp)
 	spin_unlock_irqrestore(&real_qp->device->event_handler_lock, flags);

 	atomic_dec(&real_qp->usecnt);
-	ib_close_shared_qp_security(qp->qp_sec);
+	if (qp->qp_sec)
+		ib_close_shared_qp_security(qp->qp_sec);
 	kfree(qp);

 	return 0;
--
2.15.1