* [PATCH v2 hmm 01/11] mm/hmm: fix use after free with struct hmm in the mmu notifiers
[not found] ` <20190606184438.31646-1-jgg-uk2M96/98Pc@public.gmane.org>
@ 2019-06-06 18:44 ` Jason Gunthorpe
[not found] ` <20190606184438.31646-2-jgg-uk2M96/98Pc@public.gmane.org>
2019-06-06 18:44 ` [PATCH v2 hmm 02/11] mm/hmm: Use hmm_mirror not mm as an argument for hmm_range_register Jason Gunthorpe
` (11 subsequent siblings)
12 siblings, 1 reply; 79+ messages in thread
From: Jason Gunthorpe @ 2019-06-06 18:44 UTC (permalink / raw)
To: Jerome Glisse, Ralph Campbell, John Hubbard,
Felix.Kuehling-5C7GfCeVMHo
Cc: Andrea Arcangeli, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jason Gunthorpe,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
From: Jason Gunthorpe <jgg@mellanox.com>
mmu_notifier_unregister_no_release() is not a fence, and the mmu_notifier
system will continue to reference hmm->mn until the srcu grace period
expires. This results in use after free races like the following:
     CPU0                                      CPU1
                                               __mmu_notifier_invalidate_range_start()
                                                 srcu_read_lock()
                                                 hlist_for_each()
                                                   // mn == hmm->mn
     hmm_mirror_unregister()
       hmm_put()
         hmm_free()
           mmu_notifier_unregister_no_release()
             hlist_del_init_rcu(hmm->mn->list)
                                                 mn->ops->invalidate_range_start(mn, range);
                                                   mm_get_hmm()
           mm->hmm = NULL;
           kfree(hmm)
                                                   mutex_lock(&hmm->lock);
Use SRCU to kfree the hmm memory so that the notifiers can rely on hmm
existing. Get the now-safe hmm struct through container_of and directly
check kref_get_unless_zero to lock it against free.
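Reduced to a sketch, the fixed pattern looks like this (illustrative only;
example_invalidate() is a made-up callback name, while hmm_put() and the
struct hmm fields are as in the patch below):

    static int example_invalidate(struct mmu_notifier *mn,
                                  const struct mmu_notifier_range *range)
    {
            /* The SRCU grace period keeps this memory valid ... */
            struct hmm *hmm = container_of(mn, struct hmm, mmu_notifier);

            /* ... but the hmm may already be on its way to kfree() */
            if (!kref_get_unless_zero(&hmm->kref))
                    return 0;

            /* hmm is now locked against free until hmm_put() */
            /* ... do the invalidation work ... */
            hmm_put(hmm);
            return 0;
    }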
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
---
v2:
- Spell 'free' properly (Jerome/Ralph)
---
include/linux/hmm.h | 1 +
mm/hmm.c | 25 +++++++++++++++++++------
2 files changed, 20 insertions(+), 6 deletions(-)
diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index 092f0234bfe917..688c5ca7068795 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -102,6 +102,7 @@ struct hmm {
struct mmu_notifier mmu_notifier;
struct rw_semaphore mirrors_sem;
wait_queue_head_t wq;
+ struct rcu_head rcu;
long notifiers;
bool dead;
};
diff --git a/mm/hmm.c b/mm/hmm.c
index 8e7403f081f44a..547002f56a163d 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -113,6 +113,11 @@ static struct hmm *hmm_get_or_create(struct mm_struct *mm)
return NULL;
}
+static void hmm_free_rcu(struct rcu_head *rcu)
+{
+ kfree(container_of(rcu, struct hmm, rcu));
+}
+
static void hmm_free(struct kref *kref)
{
struct hmm *hmm = container_of(kref, struct hmm, kref);
@@ -125,7 +130,7 @@ static void hmm_free(struct kref *kref)
mm->hmm = NULL;
spin_unlock(&mm->page_table_lock);
- kfree(hmm);
+ mmu_notifier_call_srcu(&hmm->rcu, hmm_free_rcu);
}
static inline void hmm_put(struct hmm *hmm)
@@ -153,10 +158,14 @@ void hmm_mm_destroy(struct mm_struct *mm)
static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
{
- struct hmm *hmm = mm_get_hmm(mm);
+ struct hmm *hmm = container_of(mn, struct hmm, mmu_notifier);
struct hmm_mirror *mirror;
struct hmm_range *range;
+ /* hmm is in progress to free */
+ if (!kref_get_unless_zero(&hmm->kref))
+ return;
+
/* Report this HMM as dying. */
hmm->dead = true;
@@ -194,13 +203,15 @@ static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
static int hmm_invalidate_range_start(struct mmu_notifier *mn,
const struct mmu_notifier_range *nrange)
{
- struct hmm *hmm = mm_get_hmm(nrange->mm);
+ struct hmm *hmm = container_of(mn, struct hmm, mmu_notifier);
struct hmm_mirror *mirror;
struct hmm_update update;
struct hmm_range *range;
int ret = 0;
- VM_BUG_ON(!hmm);
+ /* hmm is in progress to free */
+ if (!kref_get_unless_zero(&hmm->kref))
+ return 0;
update.start = nrange->start;
update.end = nrange->end;
@@ -245,9 +256,11 @@ static int hmm_invalidate_range_start(struct mmu_notifier *mn,
static void hmm_invalidate_range_end(struct mmu_notifier *mn,
const struct mmu_notifier_range *nrange)
{
- struct hmm *hmm = mm_get_hmm(nrange->mm);
+ struct hmm *hmm = container_of(mn, struct hmm, mmu_notifier);
- VM_BUG_ON(!hmm);
+ /* hmm is in progress to free */
+ if (!kref_get_unless_zero(&hmm->kref))
+ return;
mutex_lock(&hmm->lock);
hmm->notifiers--;
--
2.21.0
* [PATCH v2 hmm 02/11] mm/hmm: Use hmm_mirror not mm as an argument for hmm_range_register
[not found] ` <20190606184438.31646-1-jgg-uk2M96/98Pc@public.gmane.org>
2019-06-06 18:44 ` [PATCH v2 hmm 01/11] mm/hmm: fix use after free with struct hmm in the mmu notifiers Jason Gunthorpe
@ 2019-06-06 18:44 ` Jason Gunthorpe
2019-06-07 22:33 ` Ira Weiny
[not found] ` <20190606184438.31646-3-jgg-uk2M96/98Pc@public.gmane.org>
2019-06-06 18:44 ` [PATCH v2 hmm 03/11] mm/hmm: Hold a mmgrab from hmm to mm Jason Gunthorpe
` (10 subsequent siblings)
12 siblings, 2 replies; 79+ messages in thread
From: Jason Gunthorpe @ 2019-06-06 18:44 UTC (permalink / raw)
To: Jerome Glisse, Ralph Campbell, John Hubbard,
Felix.Kuehling-5C7GfCeVMHo
Cc: Andrea Arcangeli, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jason Gunthorpe,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
From: Jason Gunthorpe <jgg@mellanox.com>
Ralph observes that hmm_range_register() can only be called by a driver
while a mirror is registered. Make this clear in the API by passing in the
mirror structure as a parameter.
This also simplifies understanding the lifetime model for struct hmm, as
the hmm pointer must be valid as part of a registered mirror so all we
need in hmm_range_register() is a simple kref_get.
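From a driver's point of view the change is mechanical; a hedged sketch
(example_driver and example_fault are illustrative names, the call shape
matches the nouveau hunk below):

    struct example_driver {
            struct hmm_mirror mirror;   /* registered => mirror.hmm is valid */
    };

    static int example_fault(struct example_driver *drv, struct hmm_range *range)
    {
            /* Previously this passed range->vma->vm_mm and the core had
             * to look up (or create) the struct hmm; now the registered
             * mirror already pins it, so registration is a plain kref_get. */
            return hmm_range_register(range, &drv->mirror,
                                      range->start, range->end, PAGE_SHIFT);
    }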
Suggested-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
---
v2
- Include the oneline patch to nouveau_svm.c
---
drivers/gpu/drm/nouveau/nouveau_svm.c | 2 +-
include/linux/hmm.h | 7 ++++---
mm/hmm.c | 15 ++++++---------
3 files changed, 11 insertions(+), 13 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c
index 93ed43c413f0bb..8c92374afcf227 100644
--- a/drivers/gpu/drm/nouveau/nouveau_svm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -649,7 +649,7 @@ nouveau_svm_fault(struct nvif_notify *notify)
range.values = nouveau_svm_pfn_values;
range.pfn_shift = NVIF_VMM_PFNMAP_V0_ADDR_SHIFT;
again:
- ret = hmm_vma_fault(&range, true);
+ ret = hmm_vma_fault(&svmm->mirror, &range, true);
if (ret == 0) {
mutex_lock(&svmm->mutex);
if (!hmm_vma_range_done(&range)) {
diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index 688c5ca7068795..2d519797cb134a 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -505,7 +505,7 @@ static inline bool hmm_mirror_mm_is_alive(struct hmm_mirror *mirror)
* Please see Documentation/vm/hmm.rst for how to use the range API.
*/
int hmm_range_register(struct hmm_range *range,
- struct mm_struct *mm,
+ struct hmm_mirror *mirror,
unsigned long start,
unsigned long end,
unsigned page_shift);
@@ -541,7 +541,8 @@ static inline bool hmm_vma_range_done(struct hmm_range *range)
}
/* This is a temporary helper to avoid merge conflict between trees. */
-static inline int hmm_vma_fault(struct hmm_range *range, bool block)
+static inline int hmm_vma_fault(struct hmm_mirror *mirror,
+ struct hmm_range *range, bool block)
{
long ret;
@@ -554,7 +555,7 @@ static inline int hmm_vma_fault(struct hmm_range *range, bool block)
range->default_flags = 0;
range->pfn_flags_mask = -1UL;
- ret = hmm_range_register(range, range->vma->vm_mm,
+ ret = hmm_range_register(range, mirror,
range->start, range->end,
PAGE_SHIFT);
if (ret)
diff --git a/mm/hmm.c b/mm/hmm.c
index 547002f56a163d..8796447299023c 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -925,13 +925,13 @@ static void hmm_pfns_clear(struct hmm_range *range,
* Track updates to the CPU page table see include/linux/hmm.h
*/
int hmm_range_register(struct hmm_range *range,
- struct mm_struct *mm,
+ struct hmm_mirror *mirror,
unsigned long start,
unsigned long end,
unsigned page_shift)
{
unsigned long mask = ((1UL << page_shift) - 1UL);
- struct hmm *hmm;
+ struct hmm *hmm = mirror->hmm;
range->valid = false;
range->hmm = NULL;
@@ -945,15 +945,12 @@ int hmm_range_register(struct hmm_range *range,
range->start = start;
range->end = end;
- hmm = hmm_get_or_create(mm);
- if (!hmm)
- return -EFAULT;
-
/* Check if hmm_mm_destroy() was call. */
- if (hmm->mm == NULL || hmm->dead) {
- hmm_put(hmm);
+ if (hmm->mm == NULL || hmm->dead)
return -EFAULT;
- }
+
+ range->hmm = hmm;
+ kref_get(&hmm->kref);
/* Initialize range to track CPU page table updates. */
mutex_lock(&hmm->lock);
--
2.21.0
* Re: [PATCH v2 hmm 02/11] mm/hmm: Use hmm_mirror not mm as an argument for hmm_range_register
2019-06-06 18:44 ` [PATCH v2 hmm 02/11] mm/hmm: Use hmm_mirror not mm as an argument for hmm_range_register Jason Gunthorpe
@ 2019-06-07 22:33 ` Ira Weiny
[not found] ` <20190606184438.31646-3-jgg-uk2M96/98Pc@public.gmane.org>
1 sibling, 0 replies; 79+ messages in thread
From: Ira Weiny @ 2019-06-07 22:33 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Andrea Arcangeli, Ralph Campbell, linux-rdma, John Hubbard,
Felix.Kuehling, dri-devel, linux-mm, Jerome Glisse,
Jason Gunthorpe, amd-gfx
On Thu, Jun 06, 2019 at 03:44:29PM -0300, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
>
> Ralph observes that hmm_range_register() can only be called by a driver
> while a mirror is registered. Make this clear in the API by passing in the
> mirror structure as a parameter.
>
> This also simplifies understanding the lifetime model for struct hmm, as
> the hmm pointer must be valid as part of a registered mirror so all we
> need in hmm_range_register() is a simple kref_get.
>
> Suggested-by: Ralph Campbell <rcampbell@nvidia.com>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> ---
> v2
> - Include the oneline patch to nouveau_svm.c
> ---
> drivers/gpu/drm/nouveau/nouveau_svm.c | 2 +-
> include/linux/hmm.h | 7 ++++---
> mm/hmm.c | 15 ++++++---------
> 3 files changed, 11 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c
> index 93ed43c413f0bb..8c92374afcf227 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_svm.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
> @@ -649,7 +649,7 @@ nouveau_svm_fault(struct nvif_notify *notify)
> range.values = nouveau_svm_pfn_values;
> range.pfn_shift = NVIF_VMM_PFNMAP_V0_ADDR_SHIFT;
> again:
> - ret = hmm_vma_fault(&range, true);
> + ret = hmm_vma_fault(&svmm->mirror, &range, true);
> if (ret == 0) {
> mutex_lock(&svmm->mutex);
> if (!hmm_vma_range_done(&range)) {
> diff --git a/include/linux/hmm.h b/include/linux/hmm.h
> index 688c5ca7068795..2d519797cb134a 100644
> --- a/include/linux/hmm.h
> +++ b/include/linux/hmm.h
> @@ -505,7 +505,7 @@ static inline bool hmm_mirror_mm_is_alive(struct hmm_mirror *mirror)
> * Please see Documentation/vm/hmm.rst for how to use the range API.
> */
> int hmm_range_register(struct hmm_range *range,
> - struct mm_struct *mm,
> + struct hmm_mirror *mirror,
> unsigned long start,
> unsigned long end,
> unsigned page_shift);
> @@ -541,7 +541,8 @@ static inline bool hmm_vma_range_done(struct hmm_range *range)
> }
>
> /* This is a temporary helper to avoid merge conflict between trees. */
> -static inline int hmm_vma_fault(struct hmm_range *range, bool block)
> +static inline int hmm_vma_fault(struct hmm_mirror *mirror,
> + struct hmm_range *range, bool block)
> {
> long ret;
>
> @@ -554,7 +555,7 @@ static inline int hmm_vma_fault(struct hmm_range *range, bool block)
> range->default_flags = 0;
> range->pfn_flags_mask = -1UL;
>
> - ret = hmm_range_register(range, range->vma->vm_mm,
> + ret = hmm_range_register(range, mirror,
> range->start, range->end,
> PAGE_SHIFT);
> if (ret)
> diff --git a/mm/hmm.c b/mm/hmm.c
> index 547002f56a163d..8796447299023c 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -925,13 +925,13 @@ static void hmm_pfns_clear(struct hmm_range *range,
> * Track updates to the CPU page table see include/linux/hmm.h
> */
> int hmm_range_register(struct hmm_range *range,
> - struct mm_struct *mm,
> + struct hmm_mirror *mirror,
> unsigned long start,
> unsigned long end,
> unsigned page_shift)
> {
> unsigned long mask = ((1UL << page_shift) - 1UL);
> - struct hmm *hmm;
> + struct hmm *hmm = mirror->hmm;
>
> range->valid = false;
> range->hmm = NULL;
> @@ -945,15 +945,12 @@ int hmm_range_register(struct hmm_range *range,
> range->start = start;
> range->end = end;
>
> - hmm = hmm_get_or_create(mm);
> - if (!hmm)
> - return -EFAULT;
> -
> /* Check if hmm_mm_destroy() was call. */
> - if (hmm->mm == NULL || hmm->dead) {
> - hmm_put(hmm);
> + if (hmm->mm == NULL || hmm->dead)
> return -EFAULT;
> - }
> +
> + range->hmm = hmm;
I don't think you need this assignment here. In the code below (right after
the mutex_lock()) it is set already, and it looks like it remains that way
after the end of the series.
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
> + kref_get(&hmm->kref);
>
> /* Initialize range to track CPU page table updates. */
> mutex_lock(&hmm->lock);
> --
> 2.21.0
>
* Re: [PATCH v2 hmm 02/11] mm/hmm: Use hmm_mirror not mm as an argument for hmm_range_register
[not found] ` <20190606184438.31646-3-jgg-uk2M96/98Pc@public.gmane.org>
@ 2019-06-07 2:36 ` John Hubbard
2019-06-07 18:24 ` Ralph Campbell
2019-06-08 8:54 ` Christoph Hellwig
2 siblings, 0 replies; 79+ messages in thread
From: John Hubbard @ 2019-06-07 2:36 UTC (permalink / raw)
To: Jason Gunthorpe, Jerome Glisse, Ralph Campbell,
Felix.Kuehling-5C7GfCeVMHo
Cc: Andrea Arcangeli, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jason Gunthorpe,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
On 6/6/19 11:44 AM, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
>
> Ralph observes that hmm_range_register() can only be called by a driver
> while a mirror is registered. Make this clear in the API by passing in the
> mirror structure as a parameter.
>
> This also simplifies understanding the lifetime model for struct hmm, as
> the hmm pointer must be valid as part of a registered mirror so all we
> need in hmm_range_register() is a simple kref_get.
>
> Suggested-by: Ralph Campbell <rcampbell@nvidia.com>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> ---
> v2
> - Include the oneline patch to nouveau_svm.c
> ---
> drivers/gpu/drm/nouveau/nouveau_svm.c | 2 +-
> include/linux/hmm.h | 7 ++++---
> mm/hmm.c | 15 ++++++---------
> 3 files changed, 11 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c
> index 93ed43c413f0bb..8c92374afcf227 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_svm.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
> @@ -649,7 +649,7 @@ nouveau_svm_fault(struct nvif_notify *notify)
> range.values = nouveau_svm_pfn_values;
> range.pfn_shift = NVIF_VMM_PFNMAP_V0_ADDR_SHIFT;
> again:
> - ret = hmm_vma_fault(&range, true);
> + ret = hmm_vma_fault(&svmm->mirror, &range, true);
> if (ret == 0) {
> mutex_lock(&svmm->mutex);
> if (!hmm_vma_range_done(&range)) {
> diff --git a/include/linux/hmm.h b/include/linux/hmm.h
> index 688c5ca7068795..2d519797cb134a 100644
> --- a/include/linux/hmm.h
> +++ b/include/linux/hmm.h
> @@ -505,7 +505,7 @@ static inline bool hmm_mirror_mm_is_alive(struct hmm_mirror *mirror)
> * Please see Documentation/vm/hmm.rst for how to use the range API.
> */
> int hmm_range_register(struct hmm_range *range,
> - struct mm_struct *mm,
> + struct hmm_mirror *mirror,
> unsigned long start,
> unsigned long end,
> unsigned page_shift);
> @@ -541,7 +541,8 @@ static inline bool hmm_vma_range_done(struct hmm_range *range)
> }
>
> /* This is a temporary helper to avoid merge conflict between trees. */
> -static inline int hmm_vma_fault(struct hmm_range *range, bool block)
> +static inline int hmm_vma_fault(struct hmm_mirror *mirror,
> + struct hmm_range *range, bool block)
> {
> long ret;
>
> @@ -554,7 +555,7 @@ static inline int hmm_vma_fault(struct hmm_range *range, bool block)
> range->default_flags = 0;
> range->pfn_flags_mask = -1UL;
>
> - ret = hmm_range_register(range, range->vma->vm_mm,
> + ret = hmm_range_register(range, mirror,
> range->start, range->end,
> PAGE_SHIFT);
> if (ret)
> diff --git a/mm/hmm.c b/mm/hmm.c
> index 547002f56a163d..8796447299023c 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -925,13 +925,13 @@ static void hmm_pfns_clear(struct hmm_range *range,
> * Track updates to the CPU page table see include/linux/hmm.h
> */
> int hmm_range_register(struct hmm_range *range,
> - struct mm_struct *mm,
> + struct hmm_mirror *mirror,
> unsigned long start,
> unsigned long end,
> unsigned page_shift)
> {
> unsigned long mask = ((1UL << page_shift) - 1UL);
> - struct hmm *hmm;
> + struct hmm *hmm = mirror->hmm;
>
> range->valid = false;
> range->hmm = NULL;
> @@ -945,15 +945,12 @@ int hmm_range_register(struct hmm_range *range,
> range->start = start;
> range->end = end;
>
> - hmm = hmm_get_or_create(mm);
> - if (!hmm)
> - return -EFAULT;
> -
> /* Check if hmm_mm_destroy() was call. */
> - if (hmm->mm == NULL || hmm->dead) {
> - hmm_put(hmm);
> + if (hmm->mm == NULL || hmm->dead)
> return -EFAULT;
> - }
> +
> + range->hmm = hmm;
> + kref_get(&hmm->kref);
>
> /* Initialize range to track CPU page table updates. */
> mutex_lock(&hmm->lock);
>
I'm not a qualified Nouveau reviewer, but this looks obviously correct,
so:
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
thanks,
--
John Hubbard
NVIDIA
* Re: [PATCH v2 hmm 02/11] mm/hmm: Use hmm_mirror not mm as an argument for hmm_range_register
[not found] ` <20190606184438.31646-3-jgg-uk2M96/98Pc@public.gmane.org>
2019-06-07 2:36 ` John Hubbard
@ 2019-06-07 18:24 ` Ralph Campbell
2019-06-07 22:39 ` Ralph Campbell
2019-06-08 8:54 ` Christoph Hellwig
2 siblings, 1 reply; 79+ messages in thread
From: Ralph Campbell @ 2019-06-07 18:24 UTC (permalink / raw)
To: Jason Gunthorpe, Jerome Glisse, John Hubbard,
Felix.Kuehling-5C7GfCeVMHo
Cc: Andrea Arcangeli, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jason Gunthorpe,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
On 6/6/19 11:44 AM, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
>
> Ralph observes that hmm_range_register() can only be called by a driver
> while a mirror is registered. Make this clear in the API by passing in the
> mirror structure as a parameter.
>
> This also simplifies understanding the lifetime model for struct hmm, as
> the hmm pointer must be valid as part of a registered mirror so all we
> need in hmm_range_register() is a simple kref_get.
>
> Suggested-by: Ralph Campbell <rcampbell@nvidia.com>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
You might CC Ben for the nouveau part.
CC: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
> ---
> v2
> - Include the oneline patch to nouveau_svm.c
> ---
> drivers/gpu/drm/nouveau/nouveau_svm.c | 2 +-
> include/linux/hmm.h | 7 ++++---
> mm/hmm.c | 15 ++++++---------
> 3 files changed, 11 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c
> index 93ed43c413f0bb..8c92374afcf227 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_svm.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
> @@ -649,7 +649,7 @@ nouveau_svm_fault(struct nvif_notify *notify)
> range.values = nouveau_svm_pfn_values;
> range.pfn_shift = NVIF_VMM_PFNMAP_V0_ADDR_SHIFT;
> again:
> - ret = hmm_vma_fault(&range, true);
> + ret = hmm_vma_fault(&svmm->mirror, &range, true);
> if (ret == 0) {
> mutex_lock(&svmm->mutex);
> if (!hmm_vma_range_done(&range)) {
> diff --git a/include/linux/hmm.h b/include/linux/hmm.h
> index 688c5ca7068795..2d519797cb134a 100644
> --- a/include/linux/hmm.h
> +++ b/include/linux/hmm.h
> @@ -505,7 +505,7 @@ static inline bool hmm_mirror_mm_is_alive(struct hmm_mirror *mirror)
> * Please see Documentation/vm/hmm.rst for how to use the range API.
> */
> int hmm_range_register(struct hmm_range *range,
> - struct mm_struct *mm,
> + struct hmm_mirror *mirror,
> unsigned long start,
> unsigned long end,
> unsigned page_shift);
> @@ -541,7 +541,8 @@ static inline bool hmm_vma_range_done(struct hmm_range *range)
> }
>
> /* This is a temporary helper to avoid merge conflict between trees. */
> -static inline int hmm_vma_fault(struct hmm_range *range, bool block)
> +static inline int hmm_vma_fault(struct hmm_mirror *mirror,
> + struct hmm_range *range, bool block)
> {
> long ret;
>
> @@ -554,7 +555,7 @@ static inline int hmm_vma_fault(struct hmm_range *range, bool block)
> range->default_flags = 0;
> range->pfn_flags_mask = -1UL;
>
> - ret = hmm_range_register(range, range->vma->vm_mm,
> + ret = hmm_range_register(range, mirror,
> range->start, range->end,
> PAGE_SHIFT);
> if (ret)
> diff --git a/mm/hmm.c b/mm/hmm.c
> index 547002f56a163d..8796447299023c 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -925,13 +925,13 @@ static void hmm_pfns_clear(struct hmm_range *range,
> * Track updates to the CPU page table see include/linux/hmm.h
> */
> int hmm_range_register(struct hmm_range *range,
> - struct mm_struct *mm,
> + struct hmm_mirror *mirror,
> unsigned long start,
> unsigned long end,
> unsigned page_shift)
> {
> unsigned long mask = ((1UL << page_shift) - 1UL);
> - struct hmm *hmm;
> + struct hmm *hmm = mirror->hmm;
>
> range->valid = false;
> range->hmm = NULL;
> @@ -945,15 +945,12 @@ int hmm_range_register(struct hmm_range *range,
> range->start = start;
> range->end = end;
>
> - hmm = hmm_get_or_create(mm);
> - if (!hmm)
> - return -EFAULT;
> -
> /* Check if hmm_mm_destroy() was call. */
> - if (hmm->mm == NULL || hmm->dead) {
> - hmm_put(hmm);
> + if (hmm->mm == NULL || hmm->dead)
> return -EFAULT;
> - }
> +
> + range->hmm = hmm;
> + kref_get(&hmm->kref);
>
> /* Initialize range to track CPU page table updates. */
> mutex_lock(&hmm->lock);
>
* Re: [PATCH v2 hmm 02/11] mm/hmm: Use hmm_mirror not mm as an argument for hmm_range_register
2019-06-07 18:24 ` Ralph Campbell
@ 2019-06-07 22:39 ` Ralph Campbell
[not found] ` <e460ddf5-9ed3-7f3b-98ce-526c12fdb8b1-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 79+ messages in thread
From: Ralph Campbell @ 2019-06-07 22:39 UTC (permalink / raw)
To: Jason Gunthorpe, Jerome Glisse, John Hubbard, Felix.Kuehling
Cc: Andrea Arcangeli, linux-rdma, amd-gfx, linux-mm, Jason Gunthorpe,
dri-devel
On 6/7/19 11:24 AM, Ralph Campbell wrote:
>
> On 6/6/19 11:44 AM, Jason Gunthorpe wrote:
>> From: Jason Gunthorpe <jgg@mellanox.com>
>>
>> Ralph observes that hmm_range_register() can only be called by a driver
>> while a mirror is registered. Make this clear in the API by passing in
>> the
>> mirror structure as a parameter.
>>
>> This also simplifies understanding the lifetime model for struct hmm, as
>> the hmm pointer must be valid as part of a registered mirror so all we
>> need in hmm_range_register() is a simple kref_get.
>>
>> Suggested-by: Ralph Campbell <rcampbell@nvidia.com>
>> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
>
> You might CC Ben for the nouveau part.
> CC: Ben Skeggs <bskeggs@redhat.com>
>
> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
>
>
>> ---
>> v2
>> - Include the oneline patch to nouveau_svm.c
>> ---
>> drivers/gpu/drm/nouveau/nouveau_svm.c | 2 +-
>> include/linux/hmm.h | 7 ++++---
>> mm/hmm.c | 15 ++++++---------
>> 3 files changed, 11 insertions(+), 13 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c
>> b/drivers/gpu/drm/nouveau/nouveau_svm.c
>> index 93ed43c413f0bb..8c92374afcf227 100644
>> --- a/drivers/gpu/drm/nouveau/nouveau_svm.c
>> +++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
>> @@ -649,7 +649,7 @@ nouveau_svm_fault(struct nvif_notify *notify)
>> range.values = nouveau_svm_pfn_values;
>> range.pfn_shift = NVIF_VMM_PFNMAP_V0_ADDR_SHIFT;
>> again:
>> - ret = hmm_vma_fault(&range, true);
>> + ret = hmm_vma_fault(&svmm->mirror, &range, true);
>> if (ret == 0) {
>> mutex_lock(&svmm->mutex);
>> if (!hmm_vma_range_done(&range)) {
>> diff --git a/include/linux/hmm.h b/include/linux/hmm.h
>> index 688c5ca7068795..2d519797cb134a 100644
>> --- a/include/linux/hmm.h
>> +++ b/include/linux/hmm.h
>> @@ -505,7 +505,7 @@ static inline bool hmm_mirror_mm_is_alive(struct
>> hmm_mirror *mirror)
>> * Please see Documentation/vm/hmm.rst for how to use the range API.
>> */
>> int hmm_range_register(struct hmm_range *range,
>> - struct mm_struct *mm,
>> + struct hmm_mirror *mirror,
>> unsigned long start,
>> unsigned long end,
>> unsigned page_shift);
>> @@ -541,7 +541,8 @@ static inline bool hmm_vma_range_done(struct
>> hmm_range *range)
>> }
>> /* This is a temporary helper to avoid merge conflict between trees. */
>> -static inline int hmm_vma_fault(struct hmm_range *range, bool block)
>> +static inline int hmm_vma_fault(struct hmm_mirror *mirror,
>> + struct hmm_range *range, bool block)
>> {
>> long ret;
>> @@ -554,7 +555,7 @@ static inline int hmm_vma_fault(struct hmm_range
>> *range, bool block)
>> range->default_flags = 0;
>> range->pfn_flags_mask = -1UL;
>> - ret = hmm_range_register(range, range->vma->vm_mm,
>> + ret = hmm_range_register(range, mirror,
>> range->start, range->end,
>> PAGE_SHIFT);
>> if (ret)
>> diff --git a/mm/hmm.c b/mm/hmm.c
>> index 547002f56a163d..8796447299023c 100644
>> --- a/mm/hmm.c
>> +++ b/mm/hmm.c
>> @@ -925,13 +925,13 @@ static void hmm_pfns_clear(struct hmm_range *range,
>> * Track updates to the CPU page table see include/linux/hmm.h
>> */
>> int hmm_range_register(struct hmm_range *range,
>> - struct mm_struct *mm,
>> + struct hmm_mirror *mirror,
>> unsigned long start,
>> unsigned long end,
>> unsigned page_shift)
>> {
>> unsigned long mask = ((1UL << page_shift) - 1UL);
>> - struct hmm *hmm;
>> + struct hmm *hmm = mirror->hmm;
>> range->valid = false;
>> range->hmm = NULL;
>> @@ -945,15 +945,12 @@ int hmm_range_register(struct hmm_range *range,
>> range->start = start;
>> range->end = end;
>> - hmm = hmm_get_or_create(mm);
>> - if (!hmm)
>> - return -EFAULT;
>> -
>> /* Check if hmm_mm_destroy() was call. */
>> - if (hmm->mm == NULL || hmm->dead) {
>> - hmm_put(hmm);
>> + if (hmm->mm == NULL || hmm->dead)
>> return -EFAULT;
>> - }
>> +
>> + range->hmm = hmm;
>> + kref_get(&hmm->kref);
>> /* Initialize range to track CPU page table updates. */
>> mutex_lock(&hmm->lock);
>>
I forgot to add that I think you can delete the duplicate
"range->hmm = hmm;"
here between the mutex_lock/unlock.
* Re: [PATCH v2 hmm 02/11] mm/hmm: Use hmm_mirror not mm as an argument for hmm_range_register
[not found] ` <20190606184438.31646-3-jgg-uk2M96/98Pc@public.gmane.org>
2019-06-07 2:36 ` John Hubbard
2019-06-07 18:24 ` Ralph Campbell
@ 2019-06-08 8:54 ` Christoph Hellwig
[not found] ` <20190608085425.GB32185-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
2 siblings, 1 reply; 79+ messages in thread
From: Christoph Hellwig @ 2019-06-08 8:54 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Andrea Arcangeli, Ralph Campbell,
linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard,
Felix.Kuehling-5C7GfCeVMHo,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse, Jason Gunthorpe,
amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
FYI, I very much disagree with the direction this is moving.
struct hmm_mirror literally is a trivial duplication of the
mmu_notifiers. All these drivers should just use the mmu_notifiers
directly for the mirroring part instead of building a thin wrapper
that adds nothing but helping to manage the lifetime of struct hmm,
which shouldn't exist to start with.
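For contrast, the direct approach looks roughly like this (a hedged sketch
with made-up example_* names; example_invalidate_device_tlb() is a
hypothetical driver helper, not code from any real driver):

    struct example_mirror {
            struct mmu_notifier mn;     /* embedded directly, no struct hmm */
    };

    static int example_invalidate_range_start(struct mmu_notifier *mn,
                    const struct mmu_notifier_range *range)
    {
            struct example_mirror *em =
                    container_of(mn, struct example_mirror, mn);

            /* hypothetical helper: shoot down device mappings in the range */
            example_invalidate_device_tlb(em, range->start, range->end);
            return 0;
    }

    static const struct mmu_notifier_ops example_ops = {
            .invalidate_range_start = example_invalidate_range_start,
    };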
* [PATCH v2 hmm 03/11] mm/hmm: Hold a mmgrab from hmm to mm
[not found] ` <20190606184438.31646-1-jgg-uk2M96/98Pc@public.gmane.org>
2019-06-06 18:44 ` [PATCH v2 hmm 01/11] mm/hmm: fix use after free with struct hmm in the mmu notifiers Jason Gunthorpe
2019-06-06 18:44 ` [PATCH v2 hmm 02/11] mm/hmm: Use hmm_mirror not mm as an argument for hmm_range_register Jason Gunthorpe
@ 2019-06-06 18:44 ` Jason Gunthorpe
[not found] ` <20190606184438.31646-4-jgg-uk2M96/98Pc@public.gmane.org>
2019-06-07 22:38 ` Ira Weiny
2019-06-06 18:44 ` [PATCH v2 hmm 04/11] mm/hmm: Simplify hmm_get_or_create and make it reliable Jason Gunthorpe
` (9 subsequent siblings)
12 siblings, 2 replies; 79+ messages in thread
From: Jason Gunthorpe @ 2019-06-06 18:44 UTC (permalink / raw)
To: Jerome Glisse, Ralph Campbell, John Hubbard,
Felix.Kuehling-5C7GfCeVMHo
Cc: Andrea Arcangeli, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jason Gunthorpe,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
From: Jason Gunthorpe <jgg@mellanox.com>
So long a a struct hmm pointer exists, so should the struct mm it is
linked to. Hold the mmgrab() as soon as a hmm is created, and mmdrop() it
once the hmm refcount goes to zero.
Since mmdrop() (ie a 0 kref on struct mm) is now impossible with a !NULL
mm->hmm, delete hmm_mm_destroy().
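The resulting lifetime rule, in sketch form (example_* names are
illustrative; note that mmgrab() only keeps the mm_struct allocated, it does
not pin the address space the way mmget() does):

    #include <linux/sched/mm.h>     /* mmgrab(), mmdrop() */
    #include <linux/slab.h>

    static struct hmm *example_hmm_create(struct mm_struct *mm)
    {
            struct hmm *hmm = kzalloc(sizeof(*hmm), GFP_KERNEL);

            if (!hmm)
                    return NULL;
            hmm->mm = mm;
            mmgrab(mm);     /* mm_struct stays allocated while hmm lives */
            return hmm;
    }

    static void example_hmm_destroy(struct hmm *hmm) /* kref release side */
    {
            mmdrop(hmm->mm);        /* paired with the mmgrab() above */
            kfree(hmm);
    }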
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
---
v2:
- Fix error unwind paths in hmm_get_or_create (Jerome/Jason)
---
include/linux/hmm.h | 3 ---
kernel/fork.c | 1 -
mm/hmm.c | 22 ++++------------------
3 files changed, 4 insertions(+), 22 deletions(-)
diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index 2d519797cb134a..4ee3acabe5ed22 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -586,14 +586,11 @@ static inline int hmm_vma_fault(struct hmm_mirror *mirror,
}
/* Below are for HMM internal use only! Not to be used by device driver! */
-void hmm_mm_destroy(struct mm_struct *mm);
-
static inline void hmm_mm_init(struct mm_struct *mm)
{
mm->hmm = NULL;
}
#else /* IS_ENABLED(CONFIG_HMM_MIRROR) */
-static inline void hmm_mm_destroy(struct mm_struct *mm) {}
static inline void hmm_mm_init(struct mm_struct *mm) {}
#endif /* IS_ENABLED(CONFIG_HMM_MIRROR) */
diff --git a/kernel/fork.c b/kernel/fork.c
index b2b87d450b80b5..588c768ae72451 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -673,7 +673,6 @@ void __mmdrop(struct mm_struct *mm)
WARN_ON_ONCE(mm == current->active_mm);
mm_free_pgd(mm);
destroy_context(mm);
- hmm_mm_destroy(mm);
mmu_notifier_mm_destroy(mm);
check_mm(mm);
put_user_ns(mm->user_ns);
diff --git a/mm/hmm.c b/mm/hmm.c
index 8796447299023c..cc7c26fda3300e 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -29,6 +29,7 @@
#include <linux/swapops.h>
#include <linux/hugetlb.h>
#include <linux/memremap.h>
+#include <linux/sched/mm.h>
#include <linux/jump_label.h>
#include <linux/dma-mapping.h>
#include <linux/mmu_notifier.h>
@@ -82,6 +83,7 @@ static struct hmm *hmm_get_or_create(struct mm_struct *mm)
hmm->notifiers = 0;
hmm->dead = false;
hmm->mm = mm;
+ mmgrab(hmm->mm);
spin_lock(&mm->page_table_lock);
if (!mm->hmm)
@@ -109,6 +111,7 @@ static struct hmm *hmm_get_or_create(struct mm_struct *mm)
mm->hmm = NULL;
spin_unlock(&mm->page_table_lock);
error:
+ mmdrop(hmm->mm);
kfree(hmm);
return NULL;
}
@@ -130,6 +133,7 @@ static void hmm_free(struct kref *kref)
mm->hmm = NULL;
spin_unlock(&mm->page_table_lock);
+ mmdrop(hmm->mm);
mmu_notifier_call_srcu(&hmm->rcu, hmm_free_rcu);
}
@@ -138,24 +142,6 @@ static inline void hmm_put(struct hmm *hmm)
kref_put(&hmm->kref, hmm_free);
}
-void hmm_mm_destroy(struct mm_struct *mm)
-{
- struct hmm *hmm;
-
- spin_lock(&mm->page_table_lock);
- hmm = mm_get_hmm(mm);
- mm->hmm = NULL;
- if (hmm) {
- hmm->mm = NULL;
- hmm->dead = true;
- spin_unlock(&mm->page_table_lock);
- hmm_put(hmm);
- return;
- }
-
- spin_unlock(&mm->page_table_lock);
-}
-
static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
{
struct hmm *hmm = container_of(mn, struct hmm, mmu_notifier);
--
2.21.0
* Re: [PATCH v2 hmm 03/11] mm/hmm: Hold a mmgrab from hmm to mm
[not found] ` <20190606184438.31646-4-jgg-uk2M96/98Pc@public.gmane.org>
@ 2019-06-07 2:44 ` John Hubbard
[not found] ` <48fcaa19-6ac3-59d0-cd51-455abeca7cdb-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
2019-06-07 18:41 ` Ralph Campbell
1 sibling, 1 reply; 79+ messages in thread
From: John Hubbard @ 2019-06-07 2:44 UTC (permalink / raw)
To: Jason Gunthorpe, Jerome Glisse, Ralph Campbell,
Felix.Kuehling-5C7GfCeVMHo
Cc: Andrea Arcangeli, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jason Gunthorpe,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
On 6/6/19 11:44 AM, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
>
> So long a a struct hmm pointer exists, so should the struct mm it is
> linked to. Hold the mmgrab() as soon as a hmm is created, and mmdrop() it
> once the hmm refcount goes to zero.
>
> Since mmdrop() (ie a 0 kref on struct mm) is now impossible with a !NULL
> mm->hmm, delete hmm_mm_destroy().
>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
> ---
> v2:
> - Fix error unwind paths in hmm_get_or_create (Jerome/Jason)
> ---
> include/linux/hmm.h | 3 ---
> kernel/fork.c | 1 -
> mm/hmm.c | 22 ++++------------------
> 3 files changed, 4 insertions(+), 22 deletions(-)
>
> diff --git a/include/linux/hmm.h b/include/linux/hmm.h
> index 2d519797cb134a..4ee3acabe5ed22 100644
> --- a/include/linux/hmm.h
> +++ b/include/linux/hmm.h
> @@ -586,14 +586,11 @@ static inline int hmm_vma_fault(struct hmm_mirror *mirror,
> }
>
> /* Below are for HMM internal use only! Not to be used by device driver! */
> -void hmm_mm_destroy(struct mm_struct *mm);
> -
> static inline void hmm_mm_init(struct mm_struct *mm)
> {
> mm->hmm = NULL;
> }
> #else /* IS_ENABLED(CONFIG_HMM_MIRROR) */
> -static inline void hmm_mm_destroy(struct mm_struct *mm) {}
> static inline void hmm_mm_init(struct mm_struct *mm) {}
> #endif /* IS_ENABLED(CONFIG_HMM_MIRROR) */
>
> diff --git a/kernel/fork.c b/kernel/fork.c
> index b2b87d450b80b5..588c768ae72451 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -673,7 +673,6 @@ void __mmdrop(struct mm_struct *mm)
> WARN_ON_ONCE(mm == current->active_mm);
> mm_free_pgd(mm);
> destroy_context(mm);
> - hmm_mm_destroy(mm);
This is particularly welcome, not to have an "HMM is special" case
in such a core part of process/mm code.
> mmu_notifier_mm_destroy(mm);
> check_mm(mm);
> put_user_ns(mm->user_ns);
> diff --git a/mm/hmm.c b/mm/hmm.c
> index 8796447299023c..cc7c26fda3300e 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -29,6 +29,7 @@
> #include <linux/swapops.h>
> #include <linux/hugetlb.h>
> #include <linux/memremap.h>
> +#include <linux/sched/mm.h>
> #include <linux/jump_label.h>
> #include <linux/dma-mapping.h>
> #include <linux/mmu_notifier.h>
> @@ -82,6 +83,7 @@ static struct hmm *hmm_get_or_create(struct mm_struct *mm)
> hmm->notifiers = 0;
> hmm->dead = false;
> hmm->mm = mm;
> + mmgrab(hmm->mm);
>
> spin_lock(&mm->page_table_lock);
> if (!mm->hmm)
> @@ -109,6 +111,7 @@ static struct hmm *hmm_get_or_create(struct mm_struct *mm)
> mm->hmm = NULL;
> spin_unlock(&mm->page_table_lock);
> error:
> + mmdrop(hmm->mm);
> kfree(hmm);
> return NULL;
> }
> @@ -130,6 +133,7 @@ static void hmm_free(struct kref *kref)
> mm->hmm = NULL;
> spin_unlock(&mm->page_table_lock);
>
> + mmdrop(hmm->mm);
> mmu_notifier_call_srcu(&hmm->rcu, hmm_free_rcu);
> }
>
> @@ -138,24 +142,6 @@ static inline void hmm_put(struct hmm *hmm)
> kref_put(&hmm->kref, hmm_free);
> }
>
> -void hmm_mm_destroy(struct mm_struct *mm)
> -{
> - struct hmm *hmm;
> -
> - spin_lock(&mm->page_table_lock);
> - hmm = mm_get_hmm(mm);
> - mm->hmm = NULL;
> - if (hmm) {
> - hmm->mm = NULL;
> - hmm->dead = true;
> - spin_unlock(&mm->page_table_lock);
> - hmm_put(hmm);
> - return;
> - }
> -
> - spin_unlock(&mm->page_table_lock);
> -}
> -
> static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
> {
> struct hmm *hmm = container_of(mn, struct hmm, mmu_notifier);
>
Failed to find any problems with this. :)
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
thanks,
--
John Hubbard
NVIDIA
* Re: [PATCH v2 hmm 03/11] mm/hmm: Hold a mmgrab from hmm to mm
[not found] ` <20190606184438.31646-4-jgg-uk2M96/98Pc@public.gmane.org>
2019-06-07 2:44 ` John Hubbard
@ 2019-06-07 18:41 ` Ralph Campbell
[not found] ` <605172dc-5c66-123f-61a3-8e6880678aef-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
1 sibling, 1 reply; 79+ messages in thread
From: Ralph Campbell @ 2019-06-07 18:41 UTC (permalink / raw)
To: Jason Gunthorpe, Jerome Glisse, John Hubbard,
Felix.Kuehling-5C7GfCeVMHo
Cc: Andrea Arcangeli, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jason Gunthorpe,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
On 6/6/19 11:44 AM, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
>
> So long a a struct hmm pointer exists, so should the struct mm it is
s/a a/as a/
> linked to. Hold the mmgrab() as soon as a hmm is created, and mmdrop() it
> once the hmm refcount goes to zero.
>
> Since mmdrop() (ie a 0 kref on struct mm) is now impossible with a !NULL
> mm->hmm, delete hmm_mm_destroy().
>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
> ---
> v2:
> - Fix error unwind paths in hmm_get_or_create (Jerome/Jason)
> ---
> include/linux/hmm.h | 3 ---
> kernel/fork.c | 1 -
> mm/hmm.c | 22 ++++------------------
> 3 files changed, 4 insertions(+), 22 deletions(-)
>
> diff --git a/include/linux/hmm.h b/include/linux/hmm.h
> index 2d519797cb134a..4ee3acabe5ed22 100644
> --- a/include/linux/hmm.h
> +++ b/include/linux/hmm.h
> @@ -586,14 +586,11 @@ static inline int hmm_vma_fault(struct hmm_mirror *mirror,
> }
>
> /* Below are for HMM internal use only! Not to be used by device driver! */
> -void hmm_mm_destroy(struct mm_struct *mm);
> -
> static inline void hmm_mm_init(struct mm_struct *mm)
> {
> mm->hmm = NULL;
> }
> #else /* IS_ENABLED(CONFIG_HMM_MIRROR) */
> -static inline void hmm_mm_destroy(struct mm_struct *mm) {}
> static inline void hmm_mm_init(struct mm_struct *mm) {}
> #endif /* IS_ENABLED(CONFIG_HMM_MIRROR) */
>
> diff --git a/kernel/fork.c b/kernel/fork.c
> index b2b87d450b80b5..588c768ae72451 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -673,7 +673,6 @@ void __mmdrop(struct mm_struct *mm)
> WARN_ON_ONCE(mm == current->active_mm);
> mm_free_pgd(mm);
> destroy_context(mm);
> - hmm_mm_destroy(mm);
> mmu_notifier_mm_destroy(mm);
> check_mm(mm);
> put_user_ns(mm->user_ns);
> diff --git a/mm/hmm.c b/mm/hmm.c
> index 8796447299023c..cc7c26fda3300e 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -29,6 +29,7 @@
> #include <linux/swapops.h>
> #include <linux/hugetlb.h>
> #include <linux/memremap.h>
> +#include <linux/sched/mm.h>
> #include <linux/jump_label.h>
> #include <linux/dma-mapping.h>
> #include <linux/mmu_notifier.h>
> @@ -82,6 +83,7 @@ static struct hmm *hmm_get_or_create(struct mm_struct *mm)
> hmm->notifiers = 0;
> hmm->dead = false;
> hmm->mm = mm;
> + mmgrab(hmm->mm);
>
> spin_lock(&mm->page_table_lock);
> if (!mm->hmm)
> @@ -109,6 +111,7 @@ static struct hmm *hmm_get_or_create(struct mm_struct *mm)
> mm->hmm = NULL;
> spin_unlock(&mm->page_table_lock);
> error:
> + mmdrop(hmm->mm);
> kfree(hmm);
> return NULL;
> }
> @@ -130,6 +133,7 @@ static void hmm_free(struct kref *kref)
> mm->hmm = NULL;
> spin_unlock(&mm->page_table_lock);
>
> + mmdrop(hmm->mm);
> mmu_notifier_call_srcu(&hmm->rcu, hmm_free_rcu);
> }
>
> @@ -138,24 +142,6 @@ static inline void hmm_put(struct hmm *hmm)
> kref_put(&hmm->kref, hmm_free);
> }
>
> -void hmm_mm_destroy(struct mm_struct *mm)
> -{
> - struct hmm *hmm;
> -
> - spin_lock(&mm->page_table_lock);
> - hmm = mm_get_hmm(mm);
> - mm->hmm = NULL;
> - if (hmm) {
> - hmm->mm = NULL;
> - hmm->dead = true;
> - spin_unlock(&mm->page_table_lock);
> - hmm_put(hmm);
> - return;
> - }
> -
> - spin_unlock(&mm->page_table_lock);
> -}
> -
> static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
> {
> struct hmm *hmm = container_of(mn, struct hmm, mmu_notifier);
>
* Re: [PATCH v2 hmm 03/11] mm/hmm: Hold a mmgrab from hmm to mm
2019-06-06 18:44 ` [PATCH v2 hmm 03/11] mm/hmm: Hold a mmgrab from hmm to mm Jason Gunthorpe
[not found] ` <20190606184438.31646-4-jgg-uk2M96/98Pc@public.gmane.org>
@ 2019-06-07 22:38 ` Ira Weiny
1 sibling, 0 replies; 79+ messages in thread
From: Ira Weiny @ 2019-06-07 22:38 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Andrea Arcangeli, Ralph Campbell, linux-rdma, John Hubbard,
Felix.Kuehling, dri-devel, linux-mm, Jerome Glisse,
Jason Gunthorpe, amd-gfx
On Thu, Jun 06, 2019 at 03:44:30PM -0300, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
>
> So long a a struct hmm pointer exists, so should the struct mm it is
> linked to. Hold the mmgrab() as soon as a hmm is created, and mmdrop() it
> once the hmm refcount goes to zero.
>
> Since mmdrop() (ie a 0 kref on struct mm) is now impossible with a !NULL
> mm->hmm, delete hmm_mm_destroy().
>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
> ---
> v2:
> - Fix error unwind paths in hmm_get_or_create (Jerome/Jason)
> ---
> include/linux/hmm.h | 3 ---
> kernel/fork.c | 1 -
> mm/hmm.c | 22 ++++------------------
> 3 files changed, 4 insertions(+), 22 deletions(-)
>
> diff --git a/include/linux/hmm.h b/include/linux/hmm.h
> index 2d519797cb134a..4ee3acabe5ed22 100644
> --- a/include/linux/hmm.h
> +++ b/include/linux/hmm.h
> @@ -586,14 +586,11 @@ static inline int hmm_vma_fault(struct hmm_mirror *mirror,
> }
>
> /* Below are for HMM internal use only! Not to be used by device driver! */
> -void hmm_mm_destroy(struct mm_struct *mm);
> -
> static inline void hmm_mm_init(struct mm_struct *mm)
> {
> mm->hmm = NULL;
> }
> #else /* IS_ENABLED(CONFIG_HMM_MIRROR) */
> -static inline void hmm_mm_destroy(struct mm_struct *mm) {}
> static inline void hmm_mm_init(struct mm_struct *mm) {}
> #endif /* IS_ENABLED(CONFIG_HMM_MIRROR) */
>
> diff --git a/kernel/fork.c b/kernel/fork.c
> index b2b87d450b80b5..588c768ae72451 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -673,7 +673,6 @@ void __mmdrop(struct mm_struct *mm)
> WARN_ON_ONCE(mm == current->active_mm);
> mm_free_pgd(mm);
> destroy_context(mm);
> - hmm_mm_destroy(mm);
> mmu_notifier_mm_destroy(mm);
> check_mm(mm);
> put_user_ns(mm->user_ns);
> diff --git a/mm/hmm.c b/mm/hmm.c
> index 8796447299023c..cc7c26fda3300e 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -29,6 +29,7 @@
> #include <linux/swapops.h>
> #include <linux/hugetlb.h>
> #include <linux/memremap.h>
> +#include <linux/sched/mm.h>
> #include <linux/jump_label.h>
> #include <linux/dma-mapping.h>
> #include <linux/mmu_notifier.h>
> @@ -82,6 +83,7 @@ static struct hmm *hmm_get_or_create(struct mm_struct *mm)
> hmm->notifiers = 0;
> hmm->dead = false;
> hmm->mm = mm;
> + mmgrab(hmm->mm);
>
> spin_lock(&mm->page_table_lock);
> if (!mm->hmm)
> @@ -109,6 +111,7 @@ static struct hmm *hmm_get_or_create(struct mm_struct *mm)
> mm->hmm = NULL;
> spin_unlock(&mm->page_table_lock);
> error:
> + mmdrop(hmm->mm);
> kfree(hmm);
> return NULL;
> }
> @@ -130,6 +133,7 @@ static void hmm_free(struct kref *kref)
> mm->hmm = NULL;
> spin_unlock(&mm->page_table_lock);
>
> + mmdrop(hmm->mm);
> mmu_notifier_call_srcu(&hmm->rcu, hmm_free_rcu);
> }
>
> @@ -138,24 +142,6 @@ static inline void hmm_put(struct hmm *hmm)
> kref_put(&hmm->kref, hmm_free);
> }
>
> -void hmm_mm_destroy(struct mm_struct *mm)
> -{
> - struct hmm *hmm;
> -
> - spin_lock(&mm->page_table_lock);
> - hmm = mm_get_hmm(mm);
> - mm->hmm = NULL;
> - if (hmm) {
> - hmm->mm = NULL;
> - hmm->dead = true;
> - spin_unlock(&mm->page_table_lock);
> - hmm_put(hmm);
> - return;
> - }
> -
> - spin_unlock(&mm->page_table_lock);
> -}
> -
> static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
> {
> struct hmm *hmm = container_of(mn, struct hmm, mmu_notifier);
> --
> 2.21.0
>
* [PATCH v2 hmm 04/11] mm/hmm: Simplify hmm_get_or_create and make it reliable
[not found] ` <20190606184438.31646-1-jgg-uk2M96/98Pc@public.gmane.org>
` (2 preceding siblings ...)
2019-06-06 18:44 ` [PATCH v2 hmm 03/11] mm/hmm: Hold a mmgrab from hmm to mm Jason Gunthorpe
@ 2019-06-06 18:44 ` Jason Gunthorpe
2019-06-07 2:54 ` John Hubbard
` (2 more replies)
2019-06-06 18:44 ` [PATCH v2 hmm 05/11] mm/hmm: Remove duplicate condition test before wait_event_timeout Jason Gunthorpe
` (8 subsequent siblings)
12 siblings, 3 replies; 79+ messages in thread
From: Jason Gunthorpe @ 2019-06-06 18:44 UTC (permalink / raw)
To: Jerome Glisse, Ralph Campbell, John Hubbard,
Felix.Kuehling-5C7GfCeVMHo
Cc: Andrea Arcangeli, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jason Gunthorpe,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
From: Jason Gunthorpe <jgg@mellanox.com>
As coded this function can false-fail in various racy situations. Make it
reliable by running only under the write side of the mmap_sem and avoiding
the false-failing compare/exchange pattern.
Also make the locking very easy to understand by only ever reading or
writing mm->hmm while holding the write side of the mmap_sem.
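The locking rule reduces to a sketch like this (illustrative only; the real
function below additionally allocates the new hmm, registers the notifier,
and publishes it in mm->hmm):

    static struct hmm *example_get_or_create(struct mm_struct *mm)
    {
            /* mm->hmm is only ever read or written under this lock */
            lockdep_assert_held_exclusive(&mm->mmap_sem);

            if (mm->hmm) {
                    if (kref_get_unless_zero(&mm->hmm->kref))
                            return mm->hmm;
                    /* kref hit zero: the free is pending an RCU grace
                     * period, but holding mmap_sem lets us clear the
                     * stale pointer right away */
                    mm->hmm = NULL;
            }
            /* ... allocate a new hmm and publish it in mm->hmm ... */
            return NULL;    /* sketch only */
    }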
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
---
v2:
- Fix error unwind of mmgrab (Jerome)
- Use hmm local instead of 2nd container_of (Jerome)
---
mm/hmm.c | 80 ++++++++++++++++++++------------------------------------
1 file changed, 29 insertions(+), 51 deletions(-)
diff --git a/mm/hmm.c b/mm/hmm.c
index cc7c26fda3300e..dc30edad9a8a02 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -40,16 +40,6 @@
#if IS_ENABLED(CONFIG_HMM_MIRROR)
static const struct mmu_notifier_ops hmm_mmu_notifier_ops;
-static inline struct hmm *mm_get_hmm(struct mm_struct *mm)
-{
- struct hmm *hmm = READ_ONCE(mm->hmm);
-
- if (hmm && kref_get_unless_zero(&hmm->kref))
- return hmm;
-
- return NULL;
-}
-
/**
* hmm_get_or_create - register HMM against an mm (HMM internal)
*
@@ -64,11 +54,20 @@ static inline struct hmm *mm_get_hmm(struct mm_struct *mm)
*/
static struct hmm *hmm_get_or_create(struct mm_struct *mm)
{
- struct hmm *hmm = mm_get_hmm(mm);
- bool cleanup = false;
+ struct hmm *hmm;
- if (hmm)
- return hmm;
+ lockdep_assert_held_exclusive(&mm->mmap_sem);
+
+ if (mm->hmm) {
+ if (kref_get_unless_zero(&mm->hmm->kref))
+ return mm->hmm;
+ /*
+ * The hmm is being freed by some other CPU and is pending a
+ * RCU grace period, but this CPU can NULL now it since we
+ * have the mmap_sem.
+ */
+ mm->hmm = NULL;
+ }
hmm = kmalloc(sizeof(*hmm), GFP_KERNEL);
if (!hmm)
@@ -83,57 +82,36 @@ static struct hmm *hmm_get_or_create(struct mm_struct *mm)
hmm->notifiers = 0;
hmm->dead = false;
hmm->mm = mm;
- mmgrab(hmm->mm);
-
- spin_lock(&mm->page_table_lock);
- if (!mm->hmm)
- mm->hmm = hmm;
- else
- cleanup = true;
- spin_unlock(&mm->page_table_lock);
- if (cleanup)
- goto error;
-
- /*
- * We should only get here if hold the mmap_sem in write mode ie on
- * registration of first mirror through hmm_mirror_register()
- */
hmm->mmu_notifier.ops = &hmm_mmu_notifier_ops;
- if (__mmu_notifier_register(&hmm->mmu_notifier, mm))
- goto error_mm;
+ if (__mmu_notifier_register(&hmm->mmu_notifier, mm)) {
+ kfree(hmm);
+ return NULL;
+ }
+ mmgrab(hmm->mm);
+ mm->hmm = hmm;
return hmm;
-
-error_mm:
- spin_lock(&mm->page_table_lock);
- if (mm->hmm == hmm)
- mm->hmm = NULL;
- spin_unlock(&mm->page_table_lock);
-error:
- mmdrop(hmm->mm);
- kfree(hmm);
- return NULL;
}
static void hmm_free_rcu(struct rcu_head *rcu)
{
- kfree(container_of(rcu, struct hmm, rcu));
+ struct hmm *hmm = container_of(rcu, struct hmm, rcu);
+
+ down_write(&hmm->mm->mmap_sem);
+ if (hmm->mm->hmm == hmm)
+ hmm->mm->hmm = NULL;
+ up_write(&hmm->mm->mmap_sem);
+ mmdrop(hmm->mm);
+
+ kfree(hmm);
}
static void hmm_free(struct kref *kref)
{
struct hmm *hmm = container_of(kref, struct hmm, kref);
- struct mm_struct *mm = hmm->mm;
-
- mmu_notifier_unregister_no_release(&hmm->mmu_notifier, mm);
- spin_lock(&mm->page_table_lock);
- if (mm->hmm == hmm)
- mm->hmm = NULL;
- spin_unlock(&mm->page_table_lock);
-
- mmdrop(hmm->mm);
+ mmu_notifier_unregister_no_release(&hmm->mmu_notifier, hmm->mm);
mmu_notifier_call_srcu(&hmm->rcu, hmm_free_rcu);
}
--
2.21.0
* Re: [PATCH v2 hmm 04/11] mm/hmm: Simplify hmm_get_or_create and make it reliable
2019-06-06 18:44 ` [PATCH v2 hmm 04/11] mm/hmm: Simplify hmm_get_or_create and make it reliable Jason Gunthorpe
@ 2019-06-07 2:54 ` John Hubbard
[not found] ` <20190606184438.31646-5-jgg-uk2M96/98Pc@public.gmane.org>
2019-06-07 22:44 ` Ira Weiny
2 siblings, 0 replies; 79+ messages in thread
From: John Hubbard @ 2019-06-07 2:54 UTC (permalink / raw)
To: Jason Gunthorpe, Jerome Glisse, Ralph Campbell, Felix.Kuehling
Cc: Andrea Arcangeli, linux-rdma, amd-gfx, linux-mm, Jason Gunthorpe,
dri-devel
On 6/6/19 11:44 AM, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
>
> As coded this function can false-fail in various racy situations. Make it
> reliable by running only under the write side of the mmap_sem and avoiding
> the false-failing compare/exchange pattern.
>
> Also make the locking very easy to understand by only ever reading or
> writing mm->hmm while holding the write side of the mmap_sem.
>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> ---
> v2:
> - Fix error unwind of mmgrab (Jerome)
> - Use hmm local instead of 2nd container_of (Jerome)
> ---
> mm/hmm.c | 80 ++++++++++++++++++++------------------------------------
> 1 file changed, 29 insertions(+), 51 deletions(-)
>
> diff --git a/mm/hmm.c b/mm/hmm.c
> index cc7c26fda3300e..dc30edad9a8a02 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -40,16 +40,6 @@
> #if IS_ENABLED(CONFIG_HMM_MIRROR)
> static const struct mmu_notifier_ops hmm_mmu_notifier_ops;
>
> -static inline struct hmm *mm_get_hmm(struct mm_struct *mm)
> -{
> - struct hmm *hmm = READ_ONCE(mm->hmm);
> -
> - if (hmm && kref_get_unless_zero(&hmm->kref))
> - return hmm;
> -
> - return NULL;
> -}
> -
> /**
> * hmm_get_or_create - register HMM against an mm (HMM internal)
> *
> @@ -64,11 +54,20 @@ static inline struct hmm *mm_get_hmm(struct mm_struct *mm)
> */
> static struct hmm *hmm_get_or_create(struct mm_struct *mm)
> {
> - struct hmm *hmm = mm_get_hmm(mm);
> - bool cleanup = false;
> + struct hmm *hmm;
>
> - if (hmm)
> - return hmm;
> + lockdep_assert_held_exclusive(&mm->mmap_sem);
> +
> + if (mm->hmm) {
> + if (kref_get_unless_zero(&mm->hmm->kref))
> + return mm->hmm;
> + /*
> + * The hmm is being freed by some other CPU and is pending a
> + * RCU grace period, but this CPU can NULL now it since we
> + * have the mmap_sem.
> + */
> + mm->hmm = NULL;
> + }
>
> hmm = kmalloc(sizeof(*hmm), GFP_KERNEL);
> if (!hmm)
> @@ -83,57 +82,36 @@ static struct hmm *hmm_get_or_create(struct mm_struct *mm)
> hmm->notifiers = 0;
> hmm->dead = false;
> hmm->mm = mm;
> - mmgrab(hmm->mm);
> -
> - spin_lock(&mm->page_table_lock);
> - if (!mm->hmm)
> - mm->hmm = hmm;
> - else
> - cleanup = true;
> - spin_unlock(&mm->page_table_lock);
>
> - if (cleanup)
> - goto error;
> -
> - /*
> - * We should only get here if hold the mmap_sem in write mode ie on
> - * registration of first mirror through hmm_mirror_register()
> - */
> hmm->mmu_notifier.ops = &hmm_mmu_notifier_ops;
> - if (__mmu_notifier_register(&hmm->mmu_notifier, mm))
> - goto error_mm;
> + if (__mmu_notifier_register(&hmm->mmu_notifier, mm)) {
> + kfree(hmm);
> + return NULL;
> + }
>
> + mmgrab(hmm->mm);
> + mm->hmm = hmm;
> return hmm;
> -
> -error_mm:
> - spin_lock(&mm->page_table_lock);
> - if (mm->hmm == hmm)
> - mm->hmm = NULL;
> - spin_unlock(&mm->page_table_lock);
> -error:
> - mmdrop(hmm->mm);
> - kfree(hmm);
> - return NULL;
> }
>
> static void hmm_free_rcu(struct rcu_head *rcu)
> {
> - kfree(container_of(rcu, struct hmm, rcu));
> + struct hmm *hmm = container_of(rcu, struct hmm, rcu);
> +
> + down_write(&hmm->mm->mmap_sem);
> + if (hmm->mm->hmm == hmm)
> + hmm->mm->hmm = NULL;
> + up_write(&hmm->mm->mmap_sem);
> + mmdrop(hmm->mm);
> +
> + kfree(hmm);
> }
>
> static void hmm_free(struct kref *kref)
> {
> struct hmm *hmm = container_of(kref, struct hmm, kref);
> - struct mm_struct *mm = hmm->mm;
> -
> - mmu_notifier_unregister_no_release(&hmm->mmu_notifier, mm);
>
> - spin_lock(&mm->page_table_lock);
> - if (mm->hmm == hmm)
> - mm->hmm = NULL;
> - spin_unlock(&mm->page_table_lock);
> -
> - mmdrop(hmm->mm);
> + mmu_notifier_unregister_no_release(&hmm->mmu_notifier, hmm->mm);
> mmu_notifier_call_srcu(&hmm->rcu, hmm_free_rcu);
> }
>
>
Yes.
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
thanks,
--
John Hubbard
NVIDIA
* Re: [PATCH v2 hmm 04/11] mm/hmm: Simplify hmm_get_or_create and make it reliable
[not found] ` <20190606184438.31646-5-jgg-uk2M96/98Pc@public.gmane.org>
@ 2019-06-07 18:52 ` Ralph Campbell
0 siblings, 0 replies; 79+ messages in thread
From: Ralph Campbell @ 2019-06-07 18:52 UTC (permalink / raw)
To: Jason Gunthorpe, Jerome Glisse, John Hubbard,
Felix.Kuehling-5C7GfCeVMHo
Cc: Andrea Arcangeli, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jason Gunthorpe,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
On 6/6/19 11:44 AM, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
>
> As coded this function can false-fail in various racy situations. Make it
> reliable by running only under the write side of the mmap_sem and avoiding
> the false-failing compare/exchange pattern.
>
> Also make the locking very easy to understand by only ever reading or
> writing mm->hmm while holding the write side of the mmap_sem.
>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
> ---
> v2:
> - Fix error unwind of mmgrab (Jerome)
> - Use hmm local instead of 2nd container_of (Jerome)
> ---
> mm/hmm.c | 80 ++++++++++++++++++++------------------------------------
> 1 file changed, 29 insertions(+), 51 deletions(-)
>
> diff --git a/mm/hmm.c b/mm/hmm.c
> index cc7c26fda3300e..dc30edad9a8a02 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -40,16 +40,6 @@
> #if IS_ENABLED(CONFIG_HMM_MIRROR)
> static const struct mmu_notifier_ops hmm_mmu_notifier_ops;
>
> -static inline struct hmm *mm_get_hmm(struct mm_struct *mm)
> -{
> - struct hmm *hmm = READ_ONCE(mm->hmm);
> -
> - if (hmm && kref_get_unless_zero(&hmm->kref))
> - return hmm;
> -
> - return NULL;
> -}
> -
> /**
> * hmm_get_or_create - register HMM against an mm (HMM internal)
> *
> @@ -64,11 +54,20 @@ static inline struct hmm *mm_get_hmm(struct mm_struct *mm)
> */
> static struct hmm *hmm_get_or_create(struct mm_struct *mm)
> {
> - struct hmm *hmm = mm_get_hmm(mm);
> - bool cleanup = false;
> + struct hmm *hmm;
>
> - if (hmm)
> - return hmm;
> + lockdep_assert_held_exclusive(&mm->mmap_sem);
> +
> + if (mm->hmm) {
> + if (kref_get_unless_zero(&mm->hmm->kref))
> + return mm->hmm;
> + /*
> + * The hmm is being freed by some other CPU and is pending an
> + * RCU grace period, but this CPU can NULL it now since we
> + * have the mmap_sem.
> + */
> + mm->hmm = NULL;
> + }
>
> hmm = kmalloc(sizeof(*hmm), GFP_KERNEL);
> if (!hmm)
> @@ -83,57 +82,36 @@ static struct hmm *hmm_get_or_create(struct mm_struct *mm)
> hmm->notifiers = 0;
> hmm->dead = false;
> hmm->mm = mm;
> - mmgrab(hmm->mm);
> -
> - spin_lock(&mm->page_table_lock);
> - if (!mm->hmm)
> - mm->hmm = hmm;
> - else
> - cleanup = true;
> - spin_unlock(&mm->page_table_lock);
>
> - if (cleanup)
> - goto error;
> -
> - /*
> - * We should only get here if hold the mmap_sem in write mode ie on
> - * registration of first mirror through hmm_mirror_register()
> - */
> hmm->mmu_notifier.ops = &hmm_mmu_notifier_ops;
> - if (__mmu_notifier_register(&hmm->mmu_notifier, mm))
> - goto error_mm;
> + if (__mmu_notifier_register(&hmm->mmu_notifier, mm)) {
> + kfree(hmm);
> + return NULL;
> + }
>
> + mmgrab(hmm->mm);
> + mm->hmm = hmm;
> return hmm;
> -
> -error_mm:
> - spin_lock(&mm->page_table_lock);
> - if (mm->hmm == hmm)
> - mm->hmm = NULL;
> - spin_unlock(&mm->page_table_lock);
> -error:
> - mmdrop(hmm->mm);
> - kfree(hmm);
> - return NULL;
> }
>
> static void hmm_free_rcu(struct rcu_head *rcu)
> {
> - kfree(container_of(rcu, struct hmm, rcu));
> + struct hmm *hmm = container_of(rcu, struct hmm, rcu);
> +
> + down_write(&hmm->mm->mmap_sem);
> + if (hmm->mm->hmm == hmm)
> + hmm->mm->hmm = NULL;
> + up_write(&hmm->mm->mmap_sem);
> + mmdrop(hmm->mm);
> +
> + kfree(hmm);
> }
>
> static void hmm_free(struct kref *kref)
> {
> struct hmm *hmm = container_of(kref, struct hmm, kref);
> - struct mm_struct *mm = hmm->mm;
> -
> - mmu_notifier_unregister_no_release(&hmm->mmu_notifier, mm);
>
> - spin_lock(&mm->page_table_lock);
> - if (mm->hmm == hmm)
> - mm->hmm = NULL;
> - spin_unlock(&mm->page_table_lock);
> -
> - mmdrop(hmm->mm);
> + mmu_notifier_unregister_no_release(&hmm->mmu_notifier, hmm->mm);
> mmu_notifier_call_srcu(&hmm->rcu, hmm_free_rcu);
> }
>
>
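
A compact way to see the simplification Ralph is reviewing: every read and
write of the mm -> hmm pointer now happens under one write lock, so the
old compare/exchange-style cleanup path disappears. Below is a sketch with
hypothetical owner/priv structures standing in for mm_struct and hmm; the
assumption, as in the patch, is that the caller holds the lock for write:

#include <linux/kref.h>
#include <linux/lockdep.h>
#include <linux/rwsem.h>
#include <linux/slab.h>

struct priv {
        struct kref kref;
};

struct owner {
        struct rw_semaphore sem;        /* plays the role of mmap_sem */
        struct priv *priv;              /* plays the role of mm->hmm */
};

static struct priv *priv_get_or_create(struct owner *o)
{
        struct priv *p;

        lockdep_assert_held_exclusive(&o->sem);

        if (o->priv) {
                if (kref_get_unless_zero(&o->priv->kref))
                        return o->priv;
                /* the object is dying but its RCU free has not run;
                 * we hold the lock for write, so clear the pointer */
                o->priv = NULL;
        }

        p = kzalloc(sizeof(*p), GFP_KERNEL);
        if (!p)
                return NULL;
        kref_init(&p->kref);
        o->priv = p;    /* no cmpxchg or cleanup path needed */
        return p;
}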
* Re: [PATCH v2 hmm 04/11] mm/hmm: Simplify hmm_get_or_create and make it reliable
2019-06-06 18:44 ` [PATCH v2 hmm 04/11] mm/hmm: Simplify hmm_get_or_create and make it reliable Jason Gunthorpe
2019-06-07 2:54 ` John Hubbard
[not found] ` <20190606184438.31646-5-jgg-uk2M96/98Pc@public.gmane.org>
@ 2019-06-07 22:44 ` Ira Weiny
2 siblings, 0 replies; 79+ messages in thread
From: Ira Weiny @ 2019-06-07 22:44 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Andrea Arcangeli, Ralph Campbell, linux-rdma, John Hubbard,
Felix.Kuehling, dri-devel, linux-mm, Jerome Glisse,
Jason Gunthorpe, amd-gfx
On Thu, Jun 06, 2019 at 03:44:31PM -0300, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
>
> As coded this function can false-fail in various racy situations. Make it
> reliable by running only under the write side of the mmap_sem and avoiding
> the false-failing compare/exchange pattern.
>
> Also make the locking very easy to understand by only ever reading or
> writing mm->hmm while holding the write side of the mmap_sem.
>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
> ---
> v2:
> - Fix error unwind of mmgrab (Jerome)
> - Use hmm local instead of 2nd container_of (Jerome)
> ---
> mm/hmm.c | 80 ++++++++++++++++++++------------------------------------
> 1 file changed, 29 insertions(+), 51 deletions(-)
>
> diff --git a/mm/hmm.c b/mm/hmm.c
> index cc7c26fda3300e..dc30edad9a8a02 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -40,16 +40,6 @@
> #if IS_ENABLED(CONFIG_HMM_MIRROR)
> static const struct mmu_notifier_ops hmm_mmu_notifier_ops;
>
> -static inline struct hmm *mm_get_hmm(struct mm_struct *mm)
> -{
> - struct hmm *hmm = READ_ONCE(mm->hmm);
> -
> - if (hmm && kref_get_unless_zero(&hmm->kref))
> - return hmm;
> -
> - return NULL;
> -}
> -
> /**
> * hmm_get_or_create - register HMM against an mm (HMM internal)
> *
> @@ -64,11 +54,20 @@ static inline struct hmm *mm_get_hmm(struct mm_struct *mm)
> */
> static struct hmm *hmm_get_or_create(struct mm_struct *mm)
> {
> - struct hmm *hmm = mm_get_hmm(mm);
> - bool cleanup = false;
> + struct hmm *hmm;
>
> - if (hmm)
> - return hmm;
> + lockdep_assert_held_exclusive(&mm->mmap_sem);
> +
> + if (mm->hmm) {
> + if (kref_get_unless_zero(&mm->hmm->kref))
> + return mm->hmm;
> + /*
> + * The hmm is being freed by some other CPU and is pending an
> + * RCU grace period, but this CPU can NULL it now since we
> + * have the mmap_sem.
> + */
> + mm->hmm = NULL;
> + }
>
> hmm = kmalloc(sizeof(*hmm), GFP_KERNEL);
> if (!hmm)
> @@ -83,57 +82,36 @@ static struct hmm *hmm_get_or_create(struct mm_struct *mm)
> hmm->notifiers = 0;
> hmm->dead = false;
> hmm->mm = mm;
> - mmgrab(hmm->mm);
> -
> - spin_lock(&mm->page_table_lock);
> - if (!mm->hmm)
> - mm->hmm = hmm;
> - else
> - cleanup = true;
> - spin_unlock(&mm->page_table_lock);
>
> - if (cleanup)
> - goto error;
> -
> - /*
> - * We should only get here if hold the mmap_sem in write mode ie on
> - * registration of first mirror through hmm_mirror_register()
> - */
> hmm->mmu_notifier.ops = &hmm_mmu_notifier_ops;
> - if (__mmu_notifier_register(&hmm->mmu_notifier, mm))
> - goto error_mm;
> + if (__mmu_notifier_register(&hmm->mmu_notifier, mm)) {
> + kfree(hmm);
> + return NULL;
> + }
>
> + mmgrab(hmm->mm);
> + mm->hmm = hmm;
> return hmm;
> -
> -error_mm:
> - spin_lock(&mm->page_table_lock);
> - if (mm->hmm == hmm)
> - mm->hmm = NULL;
> - spin_unlock(&mm->page_table_lock);
> -error:
> - mmdrop(hmm->mm);
> - kfree(hmm);
> - return NULL;
> }
>
> static void hmm_free_rcu(struct rcu_head *rcu)
> {
> - kfree(container_of(rcu, struct hmm, rcu));
> + struct hmm *hmm = container_of(rcu, struct hmm, rcu);
> +
> + down_write(&hmm->mm->mmap_sem);
> + if (hmm->mm->hmm == hmm)
> + hmm->mm->hmm = NULL;
> + up_write(&hmm->mm->mmap_sem);
> + mmdrop(hmm->mm);
> +
> + kfree(hmm);
> }
>
> static void hmm_free(struct kref *kref)
> {
> struct hmm *hmm = container_of(kref, struct hmm, kref);
> - struct mm_struct *mm = hmm->mm;
> -
> - mmu_notifier_unregister_no_release(&hmm->mmu_notifier, mm);
>
> - spin_lock(&mm->page_table_lock);
> - if (mm->hmm == hmm)
> - mm->hmm = NULL;
> - spin_unlock(&mm->page_table_lock);
> -
> - mmdrop(hmm->mm);
> + mmu_notifier_unregister_no_release(&hmm->mmu_notifier, hmm->mm);
> mmu_notifier_call_srcu(&hmm->rcu, hmm_free_rcu);
> }
>
> --
> 2.21.0
>
* [PATCH v2 hmm 05/11] mm/hmm: Remove duplicate condition test before wait_event_timeout
[not found] ` <20190606184438.31646-1-jgg-uk2M96/98Pc@public.gmane.org>
` (3 preceding siblings ...)
2019-06-06 18:44 ` [PATCH v2 hmm 04/11] mm/hmm: Simplify hmm_get_or_create and make it reliable Jason Gunthorpe
@ 2019-06-06 18:44 ` Jason Gunthorpe
2019-06-07 3:06 ` John Hubbard
[not found] ` <20190606184438.31646-6-jgg-uk2M96/98Pc@public.gmane.org>
2019-06-06 18:44 ` [PATCH v2 hmm 06/11] mm/hmm: Hold on to the mmget for the lifetime of the range Jason Gunthorpe
` (7 subsequent siblings)
12 siblings, 2 replies; 79+ messages in thread
From: Jason Gunthorpe @ 2019-06-06 18:44 UTC (permalink / raw)
To: Jerome Glisse, Ralph Campbell, John Hubbard,
Felix.Kuehling-5C7GfCeVMHo
Cc: Andrea Arcangeli, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jason Gunthorpe,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
From: Jason Gunthorpe <jgg@mellanox.com>
The wait_event_timeout macro already tests the condition as its first
action, so there is no reason to open code another version of this, all
that does is skip the might_sleep() debugging in common cases, which is
not helpful.
Further, based on prior patches, we can no simplify the required condition
test:
- If range is valid memory then so is range->hmm
- If hmm_release() has run then range->valid is set to false
at the same time as dead, so no reason to check both.
- A valid hmm has a valid hmm->mm.
Also, add the READ_ONCE for range->valid as there is no lock held here.
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
---
include/linux/hmm.h | 12 ++----------
1 file changed, 2 insertions(+), 10 deletions(-)
diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index 4ee3acabe5ed22..2ab35b40992b24 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -218,17 +218,9 @@ static inline unsigned long hmm_range_page_size(const struct hmm_range *range)
static inline bool hmm_range_wait_until_valid(struct hmm_range *range,
unsigned long timeout)
{
- /* Check if mm is dead ? */
- if (range->hmm == NULL || range->hmm->dead || range->hmm->mm == NULL) {
- range->valid = false;
- return false;
- }
- if (range->valid)
- return true;
- wait_event_timeout(range->hmm->wq, range->valid || range->hmm->dead,
+ wait_event_timeout(range->hmm->wq, range->valid,
msecs_to_jiffies(timeout));
- /* Return current valid status just in case we get lucky */
- return range->valid;
+ return READ_ONCE(range->valid);
}
/*
--
2.21.0
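
To see why the removed pre-check was redundant, it helps to recall the
rough shape of the macro itself. This is a heavily simplified sketch of
wait_event_timeout() (see include/linux/wait.h for the real definition);
the condition is evaluated before any sleep, so the only thing an
open-coded early return ever skipped was the might_sleep() check:

#define wait_event_timeout_sketch(wq, cond, timeout)                    \
({                                                                      \
        long __ret = (timeout);                                         \
        might_sleep();                                                  \
        if (!(cond))            /* fast path: already true, no sleep */ \
                __ret = __wait_event_timeout(wq, cond, timeout);        \
        __ret;                                                          \
})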
* Re: [PATCH v2 hmm 05/11] mm/hmm: Remove duplicate condition test before wait_event_timeout
2019-06-06 18:44 ` [PATCH v2 hmm 05/11] mm/hmm: Remove duplicate condition test before wait_event_timeout Jason Gunthorpe
@ 2019-06-07 3:06 ` John Hubbard
[not found] ` <86962e22-88b1-c1bf-d704-d5a5053fa100-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
[not found] ` <20190606184438.31646-6-jgg-uk2M96/98Pc@public.gmane.org>
1 sibling, 1 reply; 79+ messages in thread
From: John Hubbard @ 2019-06-07 3:06 UTC (permalink / raw)
To: Jason Gunthorpe, Jerome Glisse, Ralph Campbell, Felix.Kuehling
Cc: Andrea Arcangeli, linux-rdma, amd-gfx, linux-mm, Jason Gunthorpe,
dri-devel
On 6/6/19 11:44 AM, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
>
> The wait_event_timeout macro already tests the condition as its first
> action, so there is no reason to open code another version of this; all
> that does is skip the might_sleep() debugging in common cases, which is
> not helpful.
>
> Further, based on prior patches, we can no simplify the required condition
"now simplify"
> test:
> - If range is valid memory then so is range->hmm
> - If hmm_release() has run then range->valid is set to false
> at the same time as dead, so no reason to check both.
> - A valid hmm has a valid hmm->mm.
>
> Also, add the READ_ONCE for range->valid as there is no lock held here.
>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
> ---
> include/linux/hmm.h | 12 ++----------
> 1 file changed, 2 insertions(+), 10 deletions(-)
>
> diff --git a/include/linux/hmm.h b/include/linux/hmm.h
> index 4ee3acabe5ed22..2ab35b40992b24 100644
> --- a/include/linux/hmm.h
> +++ b/include/linux/hmm.h
> @@ -218,17 +218,9 @@ static inline unsigned long hmm_range_page_size(const struct hmm_range *range)
> static inline bool hmm_range_wait_until_valid(struct hmm_range *range,
> unsigned long timeout)
> {
> - /* Check if mm is dead ? */
> - if (range->hmm == NULL || range->hmm->dead || range->hmm->mm == NULL) {
> - range->valid = false;
> - return false;
> - }
> - if (range->valid)
> - return true;
> - wait_event_timeout(range->hmm->wq, range->valid || range->hmm->dead,
> + wait_event_timeout(range->hmm->wq, range->valid,
> msecs_to_jiffies(timeout));
> - /* Return current valid status just in case we get lucky */
> - return range->valid;
> + return READ_ONCE(range->valid);
Just to ensure that I actually understand the model: I'm assuming that the
READ_ONCE is there solely to ensure that range->valid is read *after* the
wait_event_timeout() returns. Is that correct?
> }
>
> /*
>
In any case, it looks good, so:
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
thanks,
--
John Hubbard
NVIDIA
* Re: [PATCH v2 hmm 05/11] mm/hmm: Remove duplicate condition test before wait_event_timeout
[not found] ` <20190606184438.31646-6-jgg-uk2M96/98Pc@public.gmane.org>
@ 2019-06-07 19:01 ` Ralph Campbell
[not found] ` <6833be96-12a3-1a1c-1514-c148ba2dd87b-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
0 siblings, 1 reply; 79+ messages in thread
From: Ralph Campbell @ 2019-06-07 19:01 UTC (permalink / raw)
To: Jason Gunthorpe, Jerome Glisse, John Hubbard,
Felix.Kuehling-5C7GfCeVMHo
Cc: Andrea Arcangeli, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jason Gunthorpe,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
On 6/6/19 11:44 AM, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
>
> The wait_event_timeout macro already tests the condition as its first
> action, so there is no reason to open code another version of this; all
> that does is skip the might_sleep() debugging in common cases, which is
> not helpful.
>
> Further, based on prior patches, we can no simplify the required condition
> test:
> - If range is valid memory then so is range->hmm
> - If hmm_release() has run then range->valid is set to false
> at the same time as dead, so no reason to check both.
> - A valid hmm has a valid hmm->mm.
>
> Also, add the READ_ONCE for range->valid as there is no lock held here.
>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
> ---
> include/linux/hmm.h | 12 ++----------
> 1 file changed, 2 insertions(+), 10 deletions(-)
>
> diff --git a/include/linux/hmm.h b/include/linux/hmm.h
> index 4ee3acabe5ed22..2ab35b40992b24 100644
> --- a/include/linux/hmm.h
> +++ b/include/linux/hmm.h
> @@ -218,17 +218,9 @@ static inline unsigned long hmm_range_page_size(const struct hmm_range *range)
> static inline bool hmm_range_wait_until_valid(struct hmm_range *range,
> unsigned long timeout)
> {
> - /* Check if mm is dead ? */
> - if (range->hmm == NULL || range->hmm->dead || range->hmm->mm == NULL) {
> - range->valid = false;
> - return false;
> - }
> - if (range->valid)
> - return true;
> - wait_event_timeout(range->hmm->wq, range->valid || range->hmm->dead,
> + wait_event_timeout(range->hmm->wq, range->valid,
> msecs_to_jiffies(timeout));
> - /* Return current valid status just in case we get lucky */
> - return range->valid;
> + return READ_ONCE(range->valid);
> }
>
> /*
>
Since we are simplifying things, perhaps we should consider merging
hmm_range_wait_until_valid() into hmm_range_register() and
removing hmm_range_wait_until_valid() since the pattern
is to always call the two together.
In any case, this looks OK to me so you can add
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
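
Purely as an illustration of that suggestion, a merged helper might look
roughly like the sketch below. The argument list is approximated from
this series (where hmm_range_register() takes the mirror and a page
shift), and the helper name is hypothetical:

static int hmm_range_register_and_wait(struct hmm_range *range,
                                       struct hmm_mirror *mirror,
                                       unsigned long start,
                                       unsigned long end,
                                       unsigned long timeout_ms)
{
        int ret;

        ret = hmm_range_register(range, mirror, start, end, PAGE_SHIFT);
        if (ret)
                return ret;
        if (!hmm_range_wait_until_valid(range, timeout_ms)) {
                hmm_range_unregister(range);
                return -EBUSY;
        }
        return 0;
}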
* [PATCH v2 hmm 06/11] mm/hmm: Hold on to the mmget for the lifetime of the range
[not found] ` <20190606184438.31646-1-jgg-uk2M96/98Pc@public.gmane.org>
` (4 preceding siblings ...)
2019-06-06 18:44 ` [PATCH v2 hmm 05/11] mm/hmm: Remove duplicate condition test before wait_event_timeout Jason Gunthorpe
@ 2019-06-06 18:44 ` Jason Gunthorpe
2019-06-07 3:15 ` John Hubbard
[not found] ` <20190606184438.31646-7-jgg-uk2M96/98Pc@public.gmane.org>
2019-06-06 18:44 ` [PATCH v2 hmm 07/11] mm/hmm: Use lockdep instead of comments Jason Gunthorpe
` (6 subsequent siblings)
12 siblings, 2 replies; 79+ messages in thread
From: Jason Gunthorpe @ 2019-06-06 18:44 UTC (permalink / raw)
To: Jerome Glisse, Ralph Campbell, John Hubbard,
Felix.Kuehling-5C7GfCeVMHo
Cc: Andrea Arcangeli, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jason Gunthorpe,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
From: Jason Gunthorpe <jgg@mellanox.com>
Range functions like hmm_range_snapshot() and hmm_range_fault() call
find_vma, which requires holding the mmget() and the mmap_sem for the mm.
Make this simpler for the callers by holding the mmget() inside the range
for the lifetime of the range. Other functions that accept a range should
only be called if the range is registered.
This has the side effect of directly preventing hmm_release() from
happening while a range is registered. That means range->dead cannot be
true during the lifetime of the range, so remove dead and
hmm_mirror_mm_is_alive() entirely.
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
---
v2:
- Use Jerome's idea of just holding the mmget() for the range lifetime,
rework the patch to use that as a simplification to remove dead in
one step
---
include/linux/hmm.h | 26 --------------------------
mm/hmm.c | 28 ++++++++++------------------
2 files changed, 10 insertions(+), 44 deletions(-)
diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index 2ab35b40992b24..0e20566802967a 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -91,7 +91,6 @@
* @mirrors_sem: read/write semaphore protecting the mirrors list
* @wq: wait queue for user waiting on a range invalidation
* @notifiers: count of active mmu notifiers
- * @dead: is the mm dead ?
*/
struct hmm {
struct mm_struct *mm;
@@ -104,7 +103,6 @@ struct hmm {
wait_queue_head_t wq;
struct rcu_head rcu;
long notifiers;
- bool dead;
};
/*
@@ -469,30 +467,6 @@ struct hmm_mirror {
int hmm_mirror_register(struct hmm_mirror *mirror, struct mm_struct *mm);
void hmm_mirror_unregister(struct hmm_mirror *mirror);
-/*
- * hmm_mirror_mm_is_alive() - test if mm is still alive
- * @mirror: the HMM mm mirror for which we want to lock the mmap_sem
- * Return: false if the mm is dead, true otherwise
- *
- * This is an optimization, it will not always accurately return false if the
- * mm is dead; i.e., there can be false negatives (process is being killed but
- * HMM is not yet informed of that). It is only intended to be used to optimize
- * out cases where the driver is about to do something time consuming and it
- * would be better to skip it if the mm is dead.
- */
-static inline bool hmm_mirror_mm_is_alive(struct hmm_mirror *mirror)
-{
- struct mm_struct *mm;
-
- if (!mirror || !mirror->hmm)
- return false;
- mm = READ_ONCE(mirror->hmm->mm);
- if (mirror->hmm->dead || !mm)
- return false;
-
- return true;
-}
-
/*
* Please see Documentation/vm/hmm.rst for how to use the range API.
*/
diff --git a/mm/hmm.c b/mm/hmm.c
index dc30edad9a8a02..f67ba32983d9f1 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -80,7 +80,6 @@ static struct hmm *hmm_get_or_create(struct mm_struct *mm)
mutex_init(&hmm->lock);
kref_init(&hmm->kref);
hmm->notifiers = 0;
- hmm->dead = false;
hmm->mm = mm;
hmm->mmu_notifier.ops = &hmm_mmu_notifier_ops;
@@ -124,20 +123,17 @@ static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
{
struct hmm *hmm = container_of(mn, struct hmm, mmu_notifier);
struct hmm_mirror *mirror;
- struct hmm_range *range;
/* hmm is in progress to free */
if (!kref_get_unless_zero(&hmm->kref))
return;
- /* Report this HMM as dying. */
- hmm->dead = true;
-
- /* Wake-up everyone waiting on any range. */
mutex_lock(&hmm->lock);
- list_for_each_entry(range, &hmm->ranges, list)
- range->valid = false;
- wake_up_all(&hmm->wq);
+ /*
+ * Since hmm_range_register() holds the mmget() lock, hmm_release() is
+ * prevented as long as a range exists.
+ */
+ WARN_ON(!list_empty(&hmm->ranges));
mutex_unlock(&hmm->lock);
down_write(&hmm->mirrors_sem);
@@ -909,8 +905,8 @@ int hmm_range_register(struct hmm_range *range,
range->start = start;
range->end = end;
- /* Check if hmm_mm_destroy() was call. */
- if (hmm->mm == NULL || hmm->dead)
+ /* Prevent hmm_release() from running while the range is valid */
+ if (!mmget_not_zero(hmm->mm))
return -EFAULT;
range->hmm = hmm;
@@ -955,6 +951,7 @@ void hmm_range_unregister(struct hmm_range *range)
/* Drop reference taken by hmm_range_register() */
range->valid = false;
+ mmput(hmm->mm);
hmm_put(hmm);
range->hmm = NULL;
}
@@ -982,10 +979,7 @@ long hmm_range_snapshot(struct hmm_range *range)
struct vm_area_struct *vma;
struct mm_walk mm_walk;
- /* Check if hmm_mm_destroy() was call. */
- if (hmm->mm == NULL || hmm->dead)
- return -EFAULT;
-
+ lockdep_assert_held(&hmm->mm->mmap_sem);
do {
/* If range is no longer valid force retry. */
if (!range->valid)
@@ -1080,9 +1074,7 @@ long hmm_range_fault(struct hmm_range *range, bool block)
struct mm_walk mm_walk;
int ret;
- /* Check if hmm_mm_destroy() was call. */
- if (hmm->mm == NULL || hmm->dead)
- return -EFAULT;
+ lockdep_assert_held(&hmm->mm->mmap_sem);
do {
/* If range is no longer valid force retry. */
--
2.21.0
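
The ownership rule this patch establishes reduces to a simple pairing.
A sketch with hypothetical driver_* helpers, with the hmm->ranges list
bookkeeping elided:

static int driver_start_mirroring(struct hmm_range *range, struct hmm *hmm)
{
        /* fails once the process has exited, replacing the old
         * hmm->dead checks */
        if (!mmget_not_zero(hmm->mm))
                return -EFAULT;
        range->hmm = hmm;
        /* ... link range into hmm->ranges under hmm->lock ... */
        return 0;
}

static void driver_stop_mirroring(struct hmm_range *range)
{
        /* ... unlink from hmm->ranges under hmm->lock ... */

        /* dropping the last mm user is what re-enables __mmput(),
         * and with it hmm_release() */
        mmput(range->hmm->mm);
}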
* Re: [PATCH v2 hmm 06/11] mm/hmm: Hold on to the mmget for the lifetime of the range
2019-06-06 18:44 ` [PATCH v2 hmm 06/11] mm/hmm: Hold on to the mmget for the lifetime of the range Jason Gunthorpe
@ 2019-06-07 3:15 ` John Hubbard
[not found] ` <20190606184438.31646-7-jgg-uk2M96/98Pc@public.gmane.org>
1 sibling, 0 replies; 79+ messages in thread
From: John Hubbard @ 2019-06-07 3:15 UTC (permalink / raw)
To: Jason Gunthorpe, Jerome Glisse, Ralph Campbell, Felix.Kuehling
Cc: Andrea Arcangeli, linux-rdma, amd-gfx, linux-mm, Jason Gunthorpe,
dri-devel
On 6/6/19 11:44 AM, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
>
> Range functions like hmm_range_snapshot() and hmm_range_fault() call
> find_vma, which requires holding the mmget() and the mmap_sem for the mm.
>
> Make this simpler for the callers by holding the mmget() inside the range
> for the lifetime of the range. Other functions that accept a range should
> only be called if the range is registered.
>
> This has the side effect of directly preventing hmm_release() from
> happening while a range is registered. That means range->dead cannot be
> true during the lifetime of the range, so remove dead and
> hmm_mirror_mm_is_alive() entirely.
>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> ---
> v2:
> - Use Jerome's idea of just holding the mmget() for the range lifetime,
> rework the patch to use that as a simplification to remove dead in
> one step
> ---
> include/linux/hmm.h | 26 --------------------------
> mm/hmm.c | 28 ++++++++++------------------
> 2 files changed, 10 insertions(+), 44 deletions(-)
>
> diff --git a/include/linux/hmm.h b/include/linux/hmm.h
> index 2ab35b40992b24..0e20566802967a 100644
> --- a/include/linux/hmm.h
> +++ b/include/linux/hmm.h
> @@ -91,7 +91,6 @@
> * @mirrors_sem: read/write semaphore protecting the mirrors list
> * @wq: wait queue for user waiting on a range invalidation
> * @notifiers: count of active mmu notifiers
> - * @dead: is the mm dead ?
> */
> struct hmm {
> struct mm_struct *mm;
> @@ -104,7 +103,6 @@ struct hmm {
> wait_queue_head_t wq;
> struct rcu_head rcu;
> long notifiers;
> - bool dead;
> };
>
> /*
> @@ -469,30 +467,6 @@ struct hmm_mirror {
> int hmm_mirror_register(struct hmm_mirror *mirror, struct mm_struct *mm);
> void hmm_mirror_unregister(struct hmm_mirror *mirror);
>
> -/*
> - * hmm_mirror_mm_is_alive() - test if mm is still alive
> - * @mirror: the HMM mm mirror for which we want to lock the mmap_sem
> - * Return: false if the mm is dead, true otherwise
> - *
> - * This is an optimization, it will not always accurately return false if the
> - * mm is dead; i.e., there can be false negatives (process is being killed but
> - * HMM is not yet informed of that). It is only intended to be used to optimize
> - * out cases where the driver is about to do something time consuming and it
> - * would be better to skip it if the mm is dead.
> - */
> -static inline bool hmm_mirror_mm_is_alive(struct hmm_mirror *mirror)
> -{
> - struct mm_struct *mm;
> -
> - if (!mirror || !mirror->hmm)
> - return false;
> - mm = READ_ONCE(mirror->hmm->mm);
> - if (mirror->hmm->dead || !mm)
> - return false;
> -
> - return true;
> -}
> -
> /*
> * Please see Documentation/vm/hmm.rst for how to use the range API.
> */
> diff --git a/mm/hmm.c b/mm/hmm.c
> index dc30edad9a8a02..f67ba32983d9f1 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -80,7 +80,6 @@ static struct hmm *hmm_get_or_create(struct mm_struct *mm)
> mutex_init(&hmm->lock);
> kref_init(&hmm->kref);
> hmm->notifiers = 0;
> - hmm->dead = false;
> hmm->mm = mm;
>
> hmm->mmu_notifier.ops = &hmm_mmu_notifier_ops;
> @@ -124,20 +123,17 @@ static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
> {
> struct hmm *hmm = container_of(mn, struct hmm, mmu_notifier);
> struct hmm_mirror *mirror;
> - struct hmm_range *range;
>
> /* hmm is in progress to free */
> if (!kref_get_unless_zero(&hmm->kref))
> return;
>
> - /* Report this HMM as dying. */
> - hmm->dead = true;
> -
> - /* Wake-up everyone waiting on any range. */
> mutex_lock(&hmm->lock);
> - list_for_each_entry(range, &hmm->ranges, list)
> - range->valid = false;
> - wake_up_all(&hmm->wq);
> + /*
> + * Since hmm_range_register() holds the mmget() lock, hmm_release() is
> + * prevented as long as a range exists.
> + */
> + WARN_ON(!list_empty(&hmm->ranges));
> mutex_unlock(&hmm->lock);
>
> down_write(&hmm->mirrors_sem);
> @@ -909,8 +905,8 @@ int hmm_range_register(struct hmm_range *range,
> range->start = start;
> range->end = end;
>
> - /* Check if hmm_mm_destroy() was call. */
> - if (hmm->mm == NULL || hmm->dead)
> + /* Prevent hmm_release() from running while the range is valid */
> + if (!mmget_not_zero(hmm->mm))
> return -EFAULT;
>
> range->hmm = hmm;
> @@ -955,6 +951,7 @@ void hmm_range_unregister(struct hmm_range *range)
>
> /* Drop reference taken by hmm_range_register() */
> range->valid = false;
> + mmput(hmm->mm);
> hmm_put(hmm);
> range->hmm = NULL;
> }
> @@ -982,10 +979,7 @@ long hmm_range_snapshot(struct hmm_range *range)
> struct vm_area_struct *vma;
> struct mm_walk mm_walk;
>
> - /* Check if hmm_mm_destroy() was call. */
> - if (hmm->mm == NULL || hmm->dead)
> - return -EFAULT;
> -
> + lockdep_assert_held(&hmm->mm->mmap_sem);
> do {
> /* If range is no longer valid force retry. */
> if (!range->valid)
> @@ -1080,9 +1074,7 @@ long hmm_range_fault(struct hmm_range *range, bool block)
> struct mm_walk mm_walk;
> int ret;
>
> - /* Check if hmm_mm_destroy() was call. */
> - if (hmm->mm == NULL || hmm->dead)
> - return -EFAULT;
> + lockdep_assert_held(&hmm->mm->mmap_sem);
>
> do {
> /* If range is no longer valid force retry. */
>
Nice cleanup.
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
thanks,
--
John Hubbard
NVIDIA
* Re: [PATCH v2 hmm 06/11] mm/hmm: Hold on to the mmget for the lifetime of the range
[not found] ` <20190606184438.31646-7-jgg-uk2M96/98Pc@public.gmane.org>
@ 2019-06-07 20:29 ` Ralph Campbell
0 siblings, 0 replies; 79+ messages in thread
From: Ralph Campbell @ 2019-06-07 20:29 UTC (permalink / raw)
To: Jason Gunthorpe, Jerome Glisse, John Hubbard,
Felix.Kuehling-5C7GfCeVMHo
Cc: Andrea Arcangeli, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jason Gunthorpe,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
On 6/6/19 11:44 AM, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
>
> Range functions like hmm_range_snapshot() and hmm_range_fault() call
> find_vma, which requires holding the mmget() and the mmap_sem for the mm.
>
> Make this simpler for the callers by holding the mmget() inside the range
> for the lifetime of the range. Other functions that accept a range should
> only be called if the range is registered.
>
> This has the side effect of directly preventing hmm_release() from
> happening while a range is registered. That means range->dead cannot be
> true during the lifetime of the range, so remove dead and
> hmm_mirror_mm_is_alive() entirely.
>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Looks good to me.
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
> ---
> v2:
> - Use Jerome's idea of just holding the mmget() for the range lifetime,
> rework the patch to use that as a simplification to remove dead in
> one step
> ---
> include/linux/hmm.h | 26 --------------------------
> mm/hmm.c | 28 ++++++++++------------------
> 2 files changed, 10 insertions(+), 44 deletions(-)
>
> diff --git a/include/linux/hmm.h b/include/linux/hmm.h
> index 2ab35b40992b24..0e20566802967a 100644
> --- a/include/linux/hmm.h
> +++ b/include/linux/hmm.h
> @@ -91,7 +91,6 @@
> * @mirrors_sem: read/write semaphore protecting the mirrors list
> * @wq: wait queue for user waiting on a range invalidation
> * @notifiers: count of active mmu notifiers
> - * @dead: is the mm dead ?
> */
> struct hmm {
> struct mm_struct *mm;
> @@ -104,7 +103,6 @@ struct hmm {
> wait_queue_head_t wq;
> struct rcu_head rcu;
> long notifiers;
> - bool dead;
> };
>
> /*
> @@ -469,30 +467,6 @@ struct hmm_mirror {
> int hmm_mirror_register(struct hmm_mirror *mirror, struct mm_struct *mm);
> void hmm_mirror_unregister(struct hmm_mirror *mirror);
>
> -/*
> - * hmm_mirror_mm_is_alive() - test if mm is still alive
> - * @mirror: the HMM mm mirror for which we want to lock the mmap_sem
> - * Return: false if the mm is dead, true otherwise
> - *
> - * This is an optimization, it will not always accurately return false if the
> - * mm is dead; i.e., there can be false negatives (process is being killed but
> - * HMM is not yet informed of that). It is only intended to be used to optimize
> - * out cases where the driver is about to do something time consuming and it
> - * would be better to skip it if the mm is dead.
> - */
> -static inline bool hmm_mirror_mm_is_alive(struct hmm_mirror *mirror)
> -{
> - struct mm_struct *mm;
> -
> - if (!mirror || !mirror->hmm)
> - return false;
> - mm = READ_ONCE(mirror->hmm->mm);
> - if (mirror->hmm->dead || !mm)
> - return false;
> -
> - return true;
> -}
> -
> /*
> * Please see Documentation/vm/hmm.rst for how to use the range API.
> */
> diff --git a/mm/hmm.c b/mm/hmm.c
> index dc30edad9a8a02..f67ba32983d9f1 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -80,7 +80,6 @@ static struct hmm *hmm_get_or_create(struct mm_struct *mm)
> mutex_init(&hmm->lock);
> kref_init(&hmm->kref);
> hmm->notifiers = 0;
> - hmm->dead = false;
> hmm->mm = mm;
>
> hmm->mmu_notifier.ops = &hmm_mmu_notifier_ops;
> @@ -124,20 +123,17 @@ static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
> {
> struct hmm *hmm = container_of(mn, struct hmm, mmu_notifier);
> struct hmm_mirror *mirror;
> - struct hmm_range *range;
>
> /* hmm is in progress to free */
> if (!kref_get_unless_zero(&hmm->kref))
> return;
>
> - /* Report this HMM as dying. */
> - hmm->dead = true;
> -
> - /* Wake-up everyone waiting on any range. */
> mutex_lock(&hmm->lock);
> - list_for_each_entry(range, &hmm->ranges, list)
> - range->valid = false;
> - wake_up_all(&hmm->wq);
> + /*
> + * Since hmm_range_register() holds the mmget() lock, hmm_release() is
> + * prevented as long as a range exists.
> + */
> + WARN_ON(!list_empty(&hmm->ranges));
> mutex_unlock(&hmm->lock);
>
> down_write(&hmm->mirrors_sem);
> @@ -909,8 +905,8 @@ int hmm_range_register(struct hmm_range *range,
> range->start = start;
> range->end = end;
>
> - /* Check if hmm_mm_destroy() was call. */
> - if (hmm->mm == NULL || hmm->dead)
> + /* Prevent hmm_release() from running while the range is valid */
> + if (!mmget_not_zero(hmm->mm))
> return -EFAULT;
>
> range->hmm = hmm;
> @@ -955,6 +951,7 @@ void hmm_range_unregister(struct hmm_range *range)
>
> /* Drop reference taken by hmm_range_register() */
> range->valid = false;
> + mmput(hmm->mm);
> hmm_put(hmm);
> range->hmm = NULL;
> }
> @@ -982,10 +979,7 @@ long hmm_range_snapshot(struct hmm_range *range)
> struct vm_area_struct *vma;
> struct mm_walk mm_walk;
>
> - /* Check if hmm_mm_destroy() was call. */
> - if (hmm->mm == NULL || hmm->dead)
> - return -EFAULT;
> -
> + lockdep_assert_held(&hmm->mm->mmap_sem);
> do {
> /* If range is no longer valid force retry. */
> if (!range->valid)
> @@ -1080,9 +1074,7 @@ long hmm_range_fault(struct hmm_range *range, bool block)
> struct mm_walk mm_walk;
> int ret;
>
> - /* Check if hmm_mm_destroy() was call. */
> - if (hmm->mm == NULL || hmm->dead)
> - return -EFAULT;
> + lockdep_assert_held(&hmm->mm->mmap_sem);
>
> do {
> /* If range is no longer valid force retry. */
>
* [PATCH v2 hmm 07/11] mm/hmm: Use lockdep instead of comments
[not found] ` <20190606184438.31646-1-jgg-uk2M96/98Pc@public.gmane.org>
` (5 preceding siblings ...)
2019-06-06 18:44 ` [PATCH v2 hmm 06/11] mm/hmm: Hold on to the mmget for the lifetime of the range Jason Gunthorpe
@ 2019-06-06 18:44 ` Jason Gunthorpe
2019-06-07 3:19 ` John Hubbard
` (2 more replies)
2019-06-06 18:44 ` [PATCH v2 hmm 08/11] mm/hmm: Remove racy protection against double-unregistration Jason Gunthorpe
` (5 subsequent siblings)
12 siblings, 3 replies; 79+ messages in thread
From: Jason Gunthorpe @ 2019-06-06 18:44 UTC (permalink / raw)
To: Jerome Glisse, Ralph Campbell, John Hubbard,
Felix.Kuehling-5C7GfCeVMHo
Cc: Andrea Arcangeli, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jason Gunthorpe,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
From: Jason Gunthorpe <jgg@mellanox.com>
So we can check locking at runtime.
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
---
v2
- Fix missing & in lockdeps (Jason)
---
mm/hmm.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/hmm.c b/mm/hmm.c
index f67ba32983d9f1..c702cd72651b53 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -254,11 +254,11 @@ static const struct mmu_notifier_ops hmm_mmu_notifier_ops = {
*
* To start mirroring a process address space, the device driver must register
* an HMM mirror struct.
- *
- * THE mm->mmap_sem MUST BE HELD IN WRITE MODE !
*/
int hmm_mirror_register(struct hmm_mirror *mirror, struct mm_struct *mm)
{
+ lockdep_assert_held_exclusive(&mm->mmap_sem);
+
/* Sanity check */
if (!mm || !mirror || !mirror->ops)
return -EINVAL;
--
2.21.0
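
A sketch of what the runtime check buys a caller, using a hypothetical
driver_attach() and assuming CONFIG_LOCKDEP=y: a call made without
mmap_sem held for write now produces a lockdep splat instead of silently
racing past an all-caps comment.

static int driver_attach(struct hmm_mirror *mirror, struct mm_struct *mm)
{
        int ret;

        down_write(&mm->mmap_sem);
        ret = hmm_mirror_register(mirror, mm);  /* lockdep is satisfied */
        up_write(&mm->mmap_sem);
        return ret;
}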
* Re: [PATCH v2 hmm 07/11] mm/hmm: Use lockdep instead of comments
2019-06-06 18:44 ` [PATCH v2 hmm 07/11] mm/hmm: Use lockdep instead of comments Jason Gunthorpe
@ 2019-06-07 3:19 ` John Hubbard
[not found] ` <20190606184438.31646-8-jgg-uk2M96/98Pc@public.gmane.org>
2019-06-07 22:16 ` Souptick Joarder
2 siblings, 0 replies; 79+ messages in thread
From: John Hubbard @ 2019-06-07 3:19 UTC (permalink / raw)
To: Jason Gunthorpe, Jerome Glisse, Ralph Campbell, Felix.Kuehling
Cc: Andrea Arcangeli, linux-rdma, amd-gfx, linux-mm, Jason Gunthorpe,
dri-devel
On 6/6/19 11:44 AM, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
>
> So we can check locking at runtime.
>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
> ---
> v2
> - Fix missing & in lockdeps (Jason)
> ---
> mm/hmm.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/mm/hmm.c b/mm/hmm.c
> index f67ba32983d9f1..c702cd72651b53 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -254,11 +254,11 @@ static const struct mmu_notifier_ops hmm_mmu_notifier_ops = {
> *
> * To start mirroring a process address space, the device driver must register
> * an HMM mirror struct.
> - *
> - * THE mm->mmap_sem MUST BE HELD IN WRITE MODE !
> */
> int hmm_mirror_register(struct hmm_mirror *mirror, struct mm_struct *mm)
> {
> + lockdep_assert_held_exclusive(&mm->mmap_sem);
> +
> /* Sanity check */
> if (!mm || !mirror || !mirror->ops)
> return -EINVAL;
>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
thanks,
--
John Hubbard
NVIDIA
* Re: [PATCH v2 hmm 07/11] mm/hmm: Use lockdep instead of comments
[not found] ` <20190606184438.31646-8-jgg-uk2M96/98Pc@public.gmane.org>
@ 2019-06-07 20:31 ` Ralph Campbell
0 siblings, 0 replies; 79+ messages in thread
From: Ralph Campbell @ 2019-06-07 20:31 UTC (permalink / raw)
To: Jason Gunthorpe, Jerome Glisse, John Hubbard,
Felix.Kuehling-5C7GfCeVMHo
Cc: Andrea Arcangeli, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jason Gunthorpe,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
On 6/6/19 11:44 AM, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
>
> So we can check locking at runtime.
>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
> ---
> v2
> - Fix missing & in lockdeps (Jason)
> ---
> mm/hmm.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/mm/hmm.c b/mm/hmm.c
> index f67ba32983d9f1..c702cd72651b53 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -254,11 +254,11 @@ static const struct mmu_notifier_ops hmm_mmu_notifier_ops = {
> *
> * To start mirroring a process address space, the device driver must register
> * an HMM mirror struct.
> - *
> - * THE mm->mmap_sem MUST BE HELD IN WRITE MODE !
> */
> int hmm_mirror_register(struct hmm_mirror *mirror, struct mm_struct *mm)
> {
> + lockdep_assert_held_exclusive(&mm->mmap_sem);
> +
> /* Sanity check */
> if (!mm || !mirror || !mirror->ops)
> return -EINVAL;
>
* Re: [PATCH v2 hmm 07/11] mm/hmm: Use lockdep instead of comments
2019-06-06 18:44 ` [PATCH v2 hmm 07/11] mm/hmm: Use lockdep instead of comments Jason Gunthorpe
2019-06-07 3:19 ` John Hubbard
[not found] ` <20190606184438.31646-8-jgg-uk2M96/98Pc@public.gmane.org>
@ 2019-06-07 22:16 ` Souptick Joarder
2 siblings, 0 replies; 79+ messages in thread
From: Souptick Joarder @ 2019-06-07 22:16 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Andrea Arcangeli, Ralph Campbell, linux-rdma, John Hubbard,
Felix.Kuehling, dri-devel, Linux-MM, Jerome Glisse,
Jason Gunthorpe, amd-gfx
On Fri, Jun 7, 2019 at 12:15 AM Jason Gunthorpe <jgg@ziepe.ca> wrote:
>
> From: Jason Gunthorpe <jgg@mellanox.com>
>
> So we can check locking at runtime.
A little more descriptive changelog would be helpful.
Acked-by: Souptick Joarder <jrdr.linux@gmail.com>
>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
> ---
> v2
> - Fix missing & in lockdeps (Jason)
> ---
> mm/hmm.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/mm/hmm.c b/mm/hmm.c
> index f67ba32983d9f1..c702cd72651b53 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -254,11 +254,11 @@ static const struct mmu_notifier_ops hmm_mmu_notifier_ops = {
> *
> * To start mirroring a process address space, the device driver must register
> * an HMM mirror struct.
> - *
> - * THE mm->mmap_sem MUST BE HELD IN WRITE MODE !
> */
> int hmm_mirror_register(struct hmm_mirror *mirror, struct mm_struct *mm)
> {
> + lockdep_assert_held_exclusive(&mm->mmap_sem);
> +
> /* Sanity check */
> if (!mm || !mirror || !mirror->ops)
> return -EINVAL;
> --
> 2.21.0
>
* [PATCH v2 hmm 08/11] mm/hmm: Remove racy protection against double-unregistration
[not found] ` <20190606184438.31646-1-jgg-uk2M96/98Pc@public.gmane.org>
` (6 preceding siblings ...)
2019-06-06 18:44 ` [PATCH v2 hmm 07/11] mm/hmm: Use lockdep instead of comments Jason Gunthorpe
@ 2019-06-06 18:44 ` Jason Gunthorpe
2019-06-07 3:29 ` John Hubbard
[not found] ` <20190606184438.31646-9-jgg-uk2M96/98Pc@public.gmane.org>
2019-06-06 18:44 ` [PATCH v2 hmm 09/11] mm/hmm: Poison hmm_range during unregister Jason Gunthorpe
` (4 subsequent siblings)
12 siblings, 2 replies; 79+ messages in thread
From: Jason Gunthorpe @ 2019-06-06 18:44 UTC (permalink / raw)
To: Jerome Glisse, Ralph Campbell, John Hubbard,
Felix.Kuehling-5C7GfCeVMHo
Cc: Andrea Arcangeli, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jason Gunthorpe,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
From: Jason Gunthorpe <jgg@mellanox.com>
No other register/unregister kernel API attempts to provide this kind of
protection as it is inherently racy, so just drop it.
Callers should provide their own protection; it appears nouveau already
does, but just in case, drop in a debugging POISON.
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
---
mm/hmm.c | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)
diff --git a/mm/hmm.c b/mm/hmm.c
index c702cd72651b53..6802de7080d172 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -284,18 +284,13 @@ EXPORT_SYMBOL(hmm_mirror_register);
*/
void hmm_mirror_unregister(struct hmm_mirror *mirror)
{
- struct hmm *hmm = READ_ONCE(mirror->hmm);
-
- if (hmm == NULL)
- return;
+ struct hmm *hmm = mirror->hmm;
down_write(&hmm->mirrors_sem);
list_del_init(&mirror->list);
- /* To protect us against double unregister ... */
- mirror->hmm = NULL;
up_write(&hmm->mirrors_sem);
-
hmm_put(hmm);
+ memset(&mirror->hmm, POISON_INUSE, sizeof(mirror->hmm));
}
EXPORT_SYMBOL(hmm_mirror_unregister);
--
2.21.0
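
The poison idiom itself, reduced to a sketch with hypothetical
structures: after teardown, the stale back-pointer is overwritten so
that any later use-after-unregister faults on an obvious 0x5a5a...
pattern instead of racing on freed memory.

#include <linux/poison.h>
#include <linux/rwsem.h>
#include <linux/string.h>

struct owner_s {
        struct rw_semaphore sem;
};

struct mirror_s {
        struct owner_s *owner;
};

static void mirror_teardown(struct mirror_s *m)
{
        /* ... unlink under m->owner->sem and drop the reference ... */

        /* POISON_INUSE (0x5a) makes a stale dereference fail loudly */
        memset(&m->owner, POISON_INUSE, sizeof(m->owner));
}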
* Re: [PATCH v2 hmm 08/11] mm/hmm: Remove racy protection against double-unregistration
2019-06-06 18:44 ` [PATCH v2 hmm 08/11] mm/hmm: Remove racy protection against double-unregistration Jason Gunthorpe
@ 2019-06-07 3:29 ` John Hubbard
[not found] ` <88400de9-e1ae-509b-718f-c6b0f726b14c-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
[not found] ` <20190606184438.31646-9-jgg-uk2M96/98Pc@public.gmane.org>
1 sibling, 1 reply; 79+ messages in thread
From: John Hubbard @ 2019-06-07 3:29 UTC (permalink / raw)
To: Jason Gunthorpe, Jerome Glisse, Ralph Campbell, Felix.Kuehling
Cc: Andrea Arcangeli, linux-rdma, amd-gfx, linux-mm, Jason Gunthorpe,
dri-devel
On 6/6/19 11:44 AM, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
>
> No other register/unregister kernel API attempts to provide this kind of
> protection as it is inherently racy, so just drop it.
>
> Callers should provide their own protection; it appears nouveau already
> does, but just in case, drop in a debugging POISON.
>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
> ---
> mm/hmm.c | 9 ++-------
> 1 file changed, 2 insertions(+), 7 deletions(-)
>
> diff --git a/mm/hmm.c b/mm/hmm.c
> index c702cd72651b53..6802de7080d172 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -284,18 +284,13 @@ EXPORT_SYMBOL(hmm_mirror_register);
> */
> void hmm_mirror_unregister(struct hmm_mirror *mirror)
> {
> - struct hmm *hmm = READ_ONCE(mirror->hmm);
> -
> - if (hmm == NULL)
> - return;
> + struct hmm *hmm = mirror->hmm;
>
> down_write(&hmm->mirrors_sem);
> list_del_init(&mirror->list);
> - /* To protect us against double unregister ... */
> - mirror->hmm = NULL;
> up_write(&hmm->mirrors_sem);
> -
> hmm_put(hmm);
> + memset(&mirror->hmm, POISON_INUSE, sizeof(mirror->hmm));
I hadn't thought of POISON_* for these types of cases; it's a
good technique to remember.
I noticed that this is now done outside of the lock, but that
follows directly from your commit description, so that all looks
correct.
> }
> EXPORT_SYMBOL(hmm_mirror_unregister);
>
>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
thanks,
--
John Hubbard
NVIDIA
* Re: [PATCH v2 hmm 08/11] mm/hmm: Remove racy protection against double-unregistration
[not found] ` <20190606184438.31646-9-jgg-uk2M96/98Pc@public.gmane.org>
@ 2019-06-07 20:33 ` Ralph Campbell
0 siblings, 0 replies; 79+ messages in thread
From: Ralph Campbell @ 2019-06-07 20:33 UTC (permalink / raw)
To: Jason Gunthorpe, Jerome Glisse, John Hubbard,
Felix.Kuehling-5C7GfCeVMHo
Cc: Andrea Arcangeli, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jason Gunthorpe,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
On 6/6/19 11:44 AM, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
>
> No other register/unregister kernel API attempts to provide this kind of
> protection as it is inherently racy, so just drop it.
>
> Callers should provide their own protection; it appears nouveau already
> does, but just in case, drop in a debugging POISON.
>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
> ---
> mm/hmm.c | 9 ++-------
> 1 file changed, 2 insertions(+), 7 deletions(-)
>
> diff --git a/mm/hmm.c b/mm/hmm.c
> index c702cd72651b53..6802de7080d172 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -284,18 +284,13 @@ EXPORT_SYMBOL(hmm_mirror_register);
> */
> void hmm_mirror_unregister(struct hmm_mirror *mirror)
> {
> - struct hmm *hmm = READ_ONCE(mirror->hmm);
> -
> - if (hmm == NULL)
> - return;
> + struct hmm *hmm = mirror->hmm;
>
> down_write(&hmm->mirrors_sem);
> list_del_init(&mirror->list);
> - /* To protect us against double unregister ... */
> - mirror->hmm = NULL;
> up_write(&hmm->mirrors_sem);
> -
> hmm_put(hmm);
> + memset(&mirror->hmm, POISON_INUSE, sizeof(mirror->hmm));
> }
> EXPORT_SYMBOL(hmm_mirror_unregister);
>
>
* [PATCH v2 hmm 09/11] mm/hmm: Poison hmm_range during unregister
[not found] ` <20190606184438.31646-1-jgg-uk2M96/98Pc@public.gmane.org>
` (7 preceding siblings ...)
2019-06-06 18:44 ` [PATCH v2 hmm 08/11] mm/hmm: Remove racy protection against double-unregistration Jason Gunthorpe
@ 2019-06-06 18:44 ` Jason Gunthorpe
2019-06-07 3:37 ` John Hubbard
` (2 more replies)
2019-06-06 18:44 ` [PATCH v2 hmm 10/11] mm/hmm: Do not use list*_rcu() for hmm->ranges Jason Gunthorpe
` (3 subsequent siblings)
12 siblings, 3 replies; 79+ messages in thread
From: Jason Gunthorpe @ 2019-06-06 18:44 UTC (permalink / raw)
To: Jerome Glisse, Ralph Campbell, John Hubbard,
Felix.Kuehling-5C7GfCeVMHo
Cc: Andrea Arcangeli, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jason Gunthorpe,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
From: Jason Gunthorpe <jgg@mellanox.com>
Trying to misuse a range outside its lifetime is a kernel bug. Use WARN_ON
and poison bytes to detect this condition.
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
---
v2
- Keep range start/end valid after unregistration (Jerome)
---
mm/hmm.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/mm/hmm.c b/mm/hmm.c
index 6802de7080d172..c2fecb3ecb11e1 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -937,7 +937,7 @@ void hmm_range_unregister(struct hmm_range *range)
struct hmm *hmm = range->hmm;
/* Sanity check this really should not happen. */
- if (hmm == NULL || range->end <= range->start)
+ if (WARN_ON(range->end <= range->start))
return;
mutex_lock(&hmm->lock);
@@ -948,7 +948,10 @@ void hmm_range_unregister(struct hmm_range *range)
range->valid = false;
mmput(hmm->mm);
hmm_put(hmm);
- range->hmm = NULL;
+
+ /* The range is now invalid, leave it poisoned. */
+ range->valid = false;
+ memset(&range->hmm, POISON_INUSE, sizeof(range->hmm));
}
EXPORT_SYMBOL(hmm_range_unregister);
--
2.21.0
* Re: [PATCH v2 hmm 09/11] mm/hmm: Poison hmm_range during unregister
2019-06-06 18:44 ` [PATCH v2 hmm 09/11] mm/hmm: Poison hmm_range during unregister Jason Gunthorpe
@ 2019-06-07 3:37 ` John Hubbard
[not found] ` <c00da0f2-b4b8-813b-0441-a50d4de9d8be-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
2019-06-07 20:46 ` Ralph Campbell
2019-06-07 23:01 ` Ira Weiny
2 siblings, 1 reply; 79+ messages in thread
From: John Hubbard @ 2019-06-07 3:37 UTC (permalink / raw)
To: Jason Gunthorpe, Jerome Glisse, Ralph Campbell, Felix.Kuehling
Cc: Andrea Arcangeli, linux-rdma, amd-gfx, linux-mm, Jason Gunthorpe,
dri-devel
On 6/6/19 11:44 AM, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
>
> Trying to misuse a range outside its lifetime is a kernel bug. Use WARN_ON
> and poison bytes to detect this condition.
>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
> ---
> v2
> - Keep range start/end valid after unregistration (Jerome)
> ---
> mm/hmm.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/mm/hmm.c b/mm/hmm.c
> index 6802de7080d172..c2fecb3ecb11e1 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -937,7 +937,7 @@ void hmm_range_unregister(struct hmm_range *range)
> struct hmm *hmm = range->hmm;
>
> /* Sanity check this really should not happen. */
That comment can also be deleted, as it has the same meaning as
the WARN_ON() that you just added.
> - if (hmm == NULL || range->end <= range->start)
> + if (WARN_ON(range->end <= range->start))
> return;
>
> mutex_lock(&hmm->lock);
> @@ -948,7 +948,10 @@ void hmm_range_unregister(struct hmm_range *range)
> range->valid = false;
> mmput(hmm->mm);
> hmm_put(hmm);
> - range->hmm = NULL;
> +
> + /* The range is now invalid, leave it poisoned. */
To be precise, we are poisoning the range's back pointer to its
owning hmm instance. Maybe this is clearer:
/*
* The range is now invalid, so poison its hmm pointer.
* Leave other range-> fields in place, for the caller's use.
*/
...or something like that?
> + range->valid = false;
> + memset(&range->hmm, POISON_INUSE, sizeof(range->hmm));
> }
> EXPORT_SYMBOL(hmm_range_unregister);
>
>
The above are very minor documentation points, so:
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
thanks,
--
John Hubbard
NVIDIA
* Re: [PATCH v2 hmm 09/11] mm/hmm: Poison hmm_range during unregister
2019-06-06 18:44 ` [PATCH v2 hmm 09/11] mm/hmm: Poison hmm_range during unregister Jason Gunthorpe
2019-06-07 3:37 ` John Hubbard
@ 2019-06-07 20:46 ` Ralph Campbell
2019-06-07 20:49 ` Jason Gunthorpe
2019-06-07 23:01 ` Ira Weiny
2 siblings, 1 reply; 79+ messages in thread
From: Ralph Campbell @ 2019-06-07 20:46 UTC (permalink / raw)
To: Jason Gunthorpe, Jerome Glisse, John Hubbard, Felix.Kuehling
Cc: Andrea Arcangeli, linux-rdma, amd-gfx, linux-mm, Jason Gunthorpe,
dri-devel
On 6/6/19 11:44 AM, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
>
> Trying to misuse a range outside its lifetime is a kernel bug. Use WARN_ON
> and poison bytes to detect this condition.
>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
> ---
> v2
> - Keep range start/end valid after unregistration (Jerome)
> ---
> mm/hmm.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/mm/hmm.c b/mm/hmm.c
> index 6802de7080d172..c2fecb3ecb11e1 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -937,7 +937,7 @@ void hmm_range_unregister(struct hmm_range *range)
> struct hmm *hmm = range->hmm;
>
> /* Sanity check this really should not happen. */
> - if (hmm == NULL || range->end <= range->start)
> + if (WARN_ON(range->end <= range->start))
> return;
WARN_ON() is definitely better than silent return but I wonder how
useful it is since the caller shouldn't be modifying the hmm_range
once it is registered. Other fields could be changed too...
> mutex_lock(&hmm->lock);
> @@ -948,7 +948,10 @@ void hmm_range_unregister(struct hmm_range *range)
> range->valid = false;
> mmput(hmm->mm);
> hmm_put(hmm);
> - range->hmm = NULL;
> +
> + /* The range is now invalid, leave it poisoned. */
> + range->valid = false;
> + memset(&range->hmm, POISON_INUSE, sizeof(range->hmm));
> }
> EXPORT_SYMBOL(hmm_range_unregister);
>
>
* Re: [PATCH v2 hmm 09/11] mm/hmm: Poison hmm_range during unregister
2019-06-07 20:46 ` Ralph Campbell
@ 2019-06-07 20:49 ` Jason Gunthorpe
0 siblings, 0 replies; 79+ messages in thread
From: Jason Gunthorpe @ 2019-06-07 20:49 UTC (permalink / raw)
To: Ralph Campbell
Cc: Andrea Arcangeli, linux-rdma, John Hubbard, Felix.Kuehling,
dri-devel, linux-mm, Jerome Glisse, amd-gfx
On Fri, Jun 07, 2019 at 01:46:30PM -0700, Ralph Campbell wrote:
>
> On 6/6/19 11:44 AM, Jason Gunthorpe wrote:
> > From: Jason Gunthorpe <jgg@mellanox.com>
> >
> > Trying to misuse a range outside its lifetime is a kernel bug. Use WARN_ON
> > and poison bytes to detect this condition.
> >
> > Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> > Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
>
> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
>
> > v2
> > - Keep range start/end valid after unregistration (Jerome)
> > mm/hmm.c | 7 +++++--
> > 1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/mm/hmm.c b/mm/hmm.c
> > index 6802de7080d172..c2fecb3ecb11e1 100644
> > +++ b/mm/hmm.c
> > @@ -937,7 +937,7 @@ void hmm_range_unregister(struct hmm_range *range)
> > struct hmm *hmm = range->hmm;
> > /* Sanity check this really should not happen. */
> > - if (hmm == NULL || range->end <= range->start)
> > + if (WARN_ON(range->end <= range->start))
> > return;
>
> WARN_ON() is definitely better than silent return but I wonder how
> useful it is since the caller shouldn't be modifying the hmm_range
> once it is registered. Other fields could be changed too...
I deleted the WARN_ON() (see the other thread), but I'm confused by your
"shouldn't be modified" statement.
The only thing that needs to be set and remain unchanged for register
is the virtual start/end address. Everything else should be done once
it is clear to proceed based on the collision-retry locking scheme
this uses.
Basically the range register only sets up a 'detector' for colliding
invalidations. The other stuff in the struct is just random temporary
storage for the API.
AFAICS at least..
Jason
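To make the collision-retry shape concrete, a rough driver-side sketch:
hmm_range_wait_until_valid() and hmm_range_fault() are the API used by
this series, while the device mutex, DRV_TIMEOUT and the error handling
are assumptions:

    /* pagefault side: retry until no invalidation collided */
again:
    hmm_range_wait_until_valid(&range, DRV_TIMEOUT);
    ret = hmm_range_fault(&range, true);    /* snapshot the CPU PTEs */
    if (ret < 0)
        return ret;

    mutex_lock(&drv->device_lock);
    if (!range.valid) {
        /* a parallel invalidation fired; the snapshot is stale */
        mutex_unlock(&drv->device_lock);
        goto again;
    }
    /* program the device page tables from the snapshot */
    mutex_unlock(&drv->device_lock);

On this reading, everything in hmm_range besides start/end is only
meaningful between a successful snapshot and the validity check taken
under the device lock.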
* Re: [PATCH v2 hmm 09/11] mm/hmm: Poison hmm_range during unregister
2019-06-06 18:44 ` [PATCH v2 hmm 09/11] mm/hmm: Poison hmm_range during unregister Jason Gunthorpe
2019-06-07 3:37 ` John Hubbard
2019-06-07 20:46 ` Ralph Campbell
@ 2019-06-07 23:01 ` Ira Weiny
2 siblings, 0 replies; 79+ messages in thread
From: Ira Weiny @ 2019-06-07 23:01 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Andrea Arcangeli, Ralph Campbell, linux-rdma, John Hubbard,
Felix.Kuehling, dri-devel, linux-mm, Jerome Glisse,
Jason Gunthorpe, amd-gfx
On Thu, Jun 06, 2019 at 03:44:36PM -0300, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
>
> Trying to misuse a range outside its lifetime is a kernel bug. Use WARN_ON
> and poison bytes to detect this condition.
>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
> ---
> v2
> - Keep range start/end valid after unregistration (Jerome)
> ---
> mm/hmm.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/mm/hmm.c b/mm/hmm.c
> index 6802de7080d172..c2fecb3ecb11e1 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -937,7 +937,7 @@ void hmm_range_unregister(struct hmm_range *range)
> struct hmm *hmm = range->hmm;
>
> /* Sanity check this really should not happen. */
> - if (hmm == NULL || range->end <= range->start)
> + if (WARN_ON(range->end <= range->start))
> return;
>
> mutex_lock(&hmm->lock);
> @@ -948,7 +948,10 @@ void hmm_range_unregister(struct hmm_range *range)
> range->valid = false;
> mmput(hmm->mm);
> hmm_put(hmm);
> - range->hmm = NULL;
> +
> + /* The range is now invalid, leave it poisoned. */
> + range->valid = false;
No need to set valid false again as you just did this 5 lines above.
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
> + memset(&range->hmm, POISON_INUSE, sizeof(range->hmm));
> }
> EXPORT_SYMBOL(hmm_range_unregister);
>
> --
> 2.21.0
>
* [PATCH v2 hmm 10/11] mm/hmm: Do not use list*_rcu() for hmm->ranges
[not found] ` <20190606184438.31646-1-jgg-uk2M96/98Pc@public.gmane.org>
` (8 preceding siblings ...)
2019-06-06 18:44 ` [PATCH v2 hmm 09/11] mm/hmm: Poison hmm_range during unregister Jason Gunthorpe
@ 2019-06-06 18:44 ` Jason Gunthorpe
2019-06-07 3:40 ` John Hubbard
` (3 more replies)
2019-06-06 18:44 ` [PATCH v2 hmm 11/11] mm/hmm: Remove confusing comment and logic from hmm_release Jason Gunthorpe
` (2 subsequent siblings)
12 siblings, 4 replies; 79+ messages in thread
From: Jason Gunthorpe @ 2019-06-06 18:44 UTC (permalink / raw)
To: Jerome Glisse, Ralph Campbell, John Hubbard,
Felix.Kuehling-5C7GfCeVMHo
Cc: Andrea Arcangeli, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jason Gunthorpe,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
From: Jason Gunthorpe <jgg@mellanox.com>
This list is always read and written while holding hmm->lock so there is
no need for the confusing _rcu annotations.
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
---
mm/hmm.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/hmm.c b/mm/hmm.c
index c2fecb3ecb11e1..709d138dd49027 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -911,7 +911,7 @@ int hmm_range_register(struct hmm_range *range,
mutex_lock(&hmm->lock);
range->hmm = hmm;
- list_add_rcu(&range->list, &hmm->ranges);
+ list_add(&range->list, &hmm->ranges);
/*
* If there are any concurrent notifiers we have to wait for them for
@@ -941,7 +941,7 @@ void hmm_range_unregister(struct hmm_range *range)
return;
mutex_lock(&hmm->lock);
- list_del_rcu(&range->list);
+ list_del(&range->list);
mutex_unlock(&hmm->lock);
/* Drop reference taken by hmm_range_register() */
--
2.21.0
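For contrast, the _rcu list primitives are only warranted when readers
walk the list under rcu_read_lock() rather than under the lock; a
minimal sketch of the two patterns (list head and entry names
hypothetical):

    /* RCU pattern: lockless readers, so writers need _rcu + a grace period */
    rcu_read_lock();
    list_for_each_entry_rcu(range, &head, list)
            use_range(range);
    rcu_read_unlock();

    /* hmm->ranges pattern: every reader and writer holds hmm->lock, so
     * the plain primitives are correct and say what they mean */
    mutex_lock(&hmm->lock);
    list_add(&range->list, &hmm->ranges);
    mutex_unlock(&hmm->lock);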
* Re: [PATCH v2 hmm 10/11] mm/hmm: Do not use list*_rcu() for hmm->ranges
2019-06-06 18:44 ` [PATCH v2 hmm 10/11] mm/hmm: Do not use list*_rcu() for hmm->ranges Jason Gunthorpe
@ 2019-06-07 3:40 ` John Hubbard
2019-06-07 20:49 ` Ralph Campbell
` (2 subsequent siblings)
3 siblings, 0 replies; 79+ messages in thread
From: John Hubbard @ 2019-06-07 3:40 UTC (permalink / raw)
To: Jason Gunthorpe, Jerome Glisse, Ralph Campbell, Felix.Kuehling
Cc: Andrea Arcangeli, linux-rdma, amd-gfx, linux-mm, Jason Gunthorpe,
dri-devel
On 6/6/19 11:44 AM, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
>
> This list is always read and written while holding hmm->lock so there is
> no need for the confusing _rcu annotations.
>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
> ---
> mm/hmm.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/mm/hmm.c b/mm/hmm.c
> index c2fecb3ecb11e1..709d138dd49027 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -911,7 +911,7 @@ int hmm_range_register(struct hmm_range *range,
> mutex_lock(&hmm->lock);
>
> range->hmm = hmm;
> - list_add_rcu(&range->list, &hmm->ranges);
> + list_add(&range->list, &hmm->ranges);
>
> /*
> * If there are any concurrent notifiers we have to wait for them for
> @@ -941,7 +941,7 @@ void hmm_range_unregister(struct hmm_range *range)
> return;
>
> mutex_lock(&hmm->lock);
> - list_del_rcu(&range->list);
> + list_del(&range->list);
> mutex_unlock(&hmm->lock);
>
> /* Drop reference taken by hmm_range_register() */
>
Good point. I'd overlooked this many times.
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
thanks,
--
John Hubbard
NVIDIA
* Re: [PATCH v2 hmm 10/11] mm/hmm: Do not use list*_rcu() for hmm->ranges
2019-06-06 18:44 ` [PATCH v2 hmm 10/11] mm/hmm: Do not use list*_rcu() for hmm->ranges Jason Gunthorpe
2019-06-07 3:40 ` John Hubbard
@ 2019-06-07 20:49 ` Ralph Campbell
2019-06-07 22:11 ` Souptick Joarder
2019-06-07 23:02 ` Ira Weiny
3 siblings, 0 replies; 79+ messages in thread
From: Ralph Campbell @ 2019-06-07 20:49 UTC (permalink / raw)
To: Jason Gunthorpe, Jerome Glisse, John Hubbard, Felix.Kuehling
Cc: Andrea Arcangeli, linux-rdma, amd-gfx, linux-mm, Jason Gunthorpe,
dri-devel
On 6/6/19 11:44 AM, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
>
> This list is always read and written while holding hmm->lock so there is
> no need for the confusing _rcu annotations.
>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
> ---
> mm/hmm.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/mm/hmm.c b/mm/hmm.c
> index c2fecb3ecb11e1..709d138dd49027 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -911,7 +911,7 @@ int hmm_range_register(struct hmm_range *range,
> mutex_lock(&hmm->lock);
>
> range->hmm = hmm;
> - list_add_rcu(&range->list, &hmm->ranges);
> + list_add(&range->list, &hmm->ranges);
>
> /*
> * If there are any concurrent notifiers we have to wait for them for
> @@ -941,7 +941,7 @@ void hmm_range_unregister(struct hmm_range *range)
> return;
>
> mutex_lock(&hmm->lock);
> - list_del_rcu(&range->list);
> + list_del(&range->list);
> mutex_unlock(&hmm->lock);
>
> /* Drop reference taken by hmm_range_register() */
>
* Re: [PATCH v2 hmm 10/11] mm/hmm: Do not use list*_rcu() for hmm->ranges
2019-06-06 18:44 ` [PATCH v2 hmm 10/11] mm/hmm: Do not use list*_rcu() for hmm->ranges Jason Gunthorpe
2019-06-07 3:40 ` John Hubbard
2019-06-07 20:49 ` Ralph Campbell
@ 2019-06-07 22:11 ` Souptick Joarder
2019-06-07 23:02 ` Ira Weiny
3 siblings, 0 replies; 79+ messages in thread
From: Souptick Joarder @ 2019-06-07 22:11 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Andrea Arcangeli, Ralph Campbell, linux-rdma, John Hubbard,
Felix.Kuehling, dri-devel, Linux-MM, Jerome Glisse,
Jason Gunthorpe, amd-gfx
On Fri, Jun 7, 2019 at 12:15 AM Jason Gunthorpe <jgg@ziepe.ca> wrote:
>
> From: Jason Gunthorpe <jgg@mellanox.com>
>
> This list is always read and written while holding hmm->lock so there is
> no need for the confusing _rcu annotations.
>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
Acked-by: Souptick Joarder <jrdr.linux@gmail.com>
> ---
> mm/hmm.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/mm/hmm.c b/mm/hmm.c
> index c2fecb3ecb11e1..709d138dd49027 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -911,7 +911,7 @@ int hmm_range_register(struct hmm_range *range,
> mutex_lock(&hmm->lock);
>
> range->hmm = hmm;
> - list_add_rcu(&range->list, &hmm->ranges);
> + list_add(&range->list, &hmm->ranges);
>
> /*
> * If there are any concurrent notifiers we have to wait for them for
> @@ -941,7 +941,7 @@ void hmm_range_unregister(struct hmm_range *range)
> return;
>
> mutex_lock(&hmm->lock);
> - list_del_rcu(&range->list);
> + list_del(&range->list);
> mutex_unlock(&hmm->lock);
>
> /* Drop reference taken by hmm_range_register() */
> --
> 2.21.0
>
* Re: [PATCH v2 hmm 10/11] mm/hmm: Do not use list*_rcu() for hmm->ranges
2019-06-06 18:44 ` [PATCH v2 hmm 10/11] mm/hmm: Do not use list*_rcu() for hmm->ranges Jason Gunthorpe
` (2 preceding siblings ...)
2019-06-07 22:11 ` Souptick Joarder
@ 2019-06-07 23:02 ` Ira Weiny
3 siblings, 0 replies; 79+ messages in thread
From: Ira Weiny @ 2019-06-07 23:02 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Andrea Arcangeli, Ralph Campbell, linux-rdma, John Hubbard,
Felix.Kuehling, dri-devel, linux-mm, Jerome Glisse,
Jason Gunthorpe, amd-gfx
On Thu, Jun 06, 2019 at 03:44:37PM -0300, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
>
> This list is always read and written while holding hmm->lock so there is
> no need for the confusing _rcu annotations.
>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
Reviewed-by: Ira Weiny <iweiny@intel.com>
> ---
> mm/hmm.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/mm/hmm.c b/mm/hmm.c
> index c2fecb3ecb11e1..709d138dd49027 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -911,7 +911,7 @@ int hmm_range_register(struct hmm_range *range,
> mutex_lock(&hmm->lock);
>
> range->hmm = hmm;
> - list_add_rcu(&range->list, &hmm->ranges);
> + list_add(&range->list, &hmm->ranges);
>
> /*
> * If there are any concurrent notifiers we have to wait for them for
> @@ -941,7 +941,7 @@ void hmm_range_unregister(struct hmm_range *range)
> return;
>
> mutex_lock(&hmm->lock);
> - list_del_rcu(&range->list);
> + list_del(&range->list);
> mutex_unlock(&hmm->lock);
>
> /* Drop reference taken by hmm_range_register() */
> --
> 2.21.0
>
* [PATCH v2 hmm 11/11] mm/hmm: Remove confusing comment and logic from hmm_release
[not found] ` <20190606184438.31646-1-jgg-uk2M96/98Pc@public.gmane.org>
` (9 preceding siblings ...)
2019-06-06 18:44 ` [PATCH v2 hmm 10/11] mm/hmm: Do not use list*_rcu() for hmm->ranges Jason Gunthorpe
@ 2019-06-06 18:44 ` Jason Gunthorpe
2019-06-07 3:47 ` John Hubbard
2019-06-07 21:37 ` Ralph Campbell
2019-06-07 16:05 ` [PATCH v2 12/11] mm/hmm: Fix error flows in hmm_invalidate_range_start Jason Gunthorpe
2019-06-11 19:48 ` [PATCH v2 hmm 00/11] Various revisions from a locking/code review Jason Gunthorpe
12 siblings, 2 replies; 79+ messages in thread
From: Jason Gunthorpe @ 2019-06-06 18:44 UTC (permalink / raw)
To: Jerome Glisse, Ralph Campbell, John Hubbard,
Felix.Kuehling-5C7GfCeVMHo
Cc: Andrea Arcangeli, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jason Gunthorpe,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
From: Jason Gunthorpe <jgg@mellanox.com>
hmm_release() is called exactly once per hmm. ops->release() cannot
accidentally trigger any action that would recurse back onto
hmm->mirrors_sem.
This fixes a use-after-free race of the form:

        CPU0                                CPU1
                                        hmm_release()
                                          up_write(&hmm->mirrors_sem);
 hmm_mirror_unregister(mirror)
  down_write(&hmm->mirrors_sem);
  up_write(&hmm->mirrors_sem);
  kfree(mirror)
                                          mirror->ops->release(mirror)
The only user we have today for ops->release is an empty function, so this
is unambiguously safe.
As a consequence of plugging this race drivers are not allowed to
register/unregister mirrors from within a release op.
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
---
mm/hmm.c | 28 +++++++++-------------------
1 file changed, 9 insertions(+), 19 deletions(-)
diff --git a/mm/hmm.c b/mm/hmm.c
index 709d138dd49027..3a45dd3d778248 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -136,26 +136,16 @@ static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
WARN_ON(!list_empty(&hmm->ranges));
mutex_unlock(&hmm->lock);
- down_write(&hmm->mirrors_sem);
- mirror = list_first_entry_or_null(&hmm->mirrors, struct hmm_mirror,
- list);
- while (mirror) {
- list_del_init(&mirror->list);
- if (mirror->ops->release) {
- /*
- * Drop mirrors_sem so the release callback can wait
- * on any pending work that might itself trigger a
- * mmu_notifier callback and thus would deadlock with
- * us.
- */
- up_write(&hmm->mirrors_sem);
+ down_read(&hmm->mirrors_sem);
+ list_for_each_entry(mirror, &hmm->mirrors, list) {
+ /*
+ * Note: The driver is not allowed to trigger
+ * hmm_mirror_unregister() from this thread.
+ */
+ if (mirror->ops->release)
mirror->ops->release(mirror);
- down_write(&hmm->mirrors_sem);
- }
- mirror = list_first_entry_or_null(&hmm->mirrors,
- struct hmm_mirror, list);
}
- up_write(&hmm->mirrors_sem);
+ up_read(&hmm->mirrors_sem);
hmm_put(hmm);
}
@@ -287,7 +277,7 @@ void hmm_mirror_unregister(struct hmm_mirror *mirror)
struct hmm *hmm = mirror->hmm;
down_write(&hmm->mirrors_sem);
- list_del_init(&mirror->list);
+ list_del(&mirror->list);
up_write(&hmm->mirrors_sem);
hmm_put(hmm);
memset(&mirror->hmm, POISON_INUSE, sizeof(mirror->hmm));
--
2.21.0
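A pattern that stays within this rule (and, per the discussion further
down, roughly what amdgpu_mn.c already does) is to defer teardown from
->release to a work item; a hypothetical sketch:

    struct drv_dev {
            struct hmm_mirror mirror;
            struct work_struct teardown_work;   /* runs the real teardown */
    };

    static void drv_release(struct hmm_mirror *mirror)
    {
            struct drv_dev *dev = container_of(mirror, struct drv_dev, mirror);

            /*
             * Called with hmm->mirrors_sem held for read; calling
             * hmm_mirror_unregister() from here would deadlock, so
             * punt all teardown to process context.
             */
            schedule_work(&dev->teardown_work);
    }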
* Re: [PATCH v2 hmm 11/11] mm/hmm: Remove confusing comment and logic from hmm_release
2019-06-06 18:44 ` [PATCH v2 hmm 11/11] mm/hmm: Remove confusing comment and logic from hmm_release Jason Gunthorpe
@ 2019-06-07 3:47 ` John Hubbard
[not found] ` <3edc47bd-e8f6-0e65-5844-d16901890637-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
2019-06-07 21:37 ` Ralph Campbell
1 sibling, 1 reply; 79+ messages in thread
From: John Hubbard @ 2019-06-07 3:47 UTC (permalink / raw)
To: Jason Gunthorpe, Jerome Glisse, Ralph Campbell, Felix.Kuehling
Cc: Andrea Arcangeli, linux-rdma, amd-gfx, linux-mm, Jason Gunthorpe,
dri-devel
On 6/6/19 11:44 AM, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
>
> hmm_release() is called exactly once per hmm. ops->release() cannot
> accidentally trigger any action that would recurse back onto
> hmm->mirrors_sem.
>
> This fixes a use-after-free race of the form:
>
>         CPU0                                CPU1
>                                         hmm_release()
>                                           up_write(&hmm->mirrors_sem);
>  hmm_mirror_unregister(mirror)
>   down_write(&hmm->mirrors_sem);
>   up_write(&hmm->mirrors_sem);
>   kfree(mirror)
>                                           mirror->ops->release(mirror)
>
> The only user we have today for ops->release is an empty function, so this
> is unambiguously safe.
>
> As a consequence of plugging this race drivers are not allowed to
> register/unregister mirrors from within a release op.
>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> ---
> mm/hmm.c | 28 +++++++++-------------------
> 1 file changed, 9 insertions(+), 19 deletions(-)
>
> diff --git a/mm/hmm.c b/mm/hmm.c
> index 709d138dd49027..3a45dd3d778248 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -136,26 +136,16 @@ static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
> WARN_ON(!list_empty(&hmm->ranges));
> mutex_unlock(&hmm->lock);
>
> - down_write(&hmm->mirrors_sem);
> - mirror = list_first_entry_or_null(&hmm->mirrors, struct hmm_mirror,
> - list);
> - while (mirror) {
> - list_del_init(&mirror->list);
> - if (mirror->ops->release) {
> - /*
> - * Drop mirrors_sem so the release callback can wait
> - * on any pending work that might itself trigger a
> - * mmu_notifier callback and thus would deadlock with
> - * us.
> - */
> - up_write(&hmm->mirrors_sem);
> + down_read(&hmm->mirrors_sem);
This is cleaner and simpler, but I suspect it is leading to the deadlock
that Ralph Campbell is seeing in his driver testing. (And in general, holding
a lock during a driver callback usually leads to deadlocks.)
Ralph, is this the one? It's the only place in this patchset where I can
see a lock around a callback to driver code, that wasn't there before. So
I'm pretty sure it is the one...
thanks,
--
John Hubbard
NVIDIA
* Re: [PATCH v2 hmm 11/11] mm/hmm: Remove confusing comment and logic from hmm_release
2019-06-06 18:44 ` [PATCH v2 hmm 11/11] mm/hmm: Remove confusing comment and logic from hmm_release Jason Gunthorpe
2019-06-07 3:47 ` John Hubbard
@ 2019-06-07 21:37 ` Ralph Campbell
2019-06-08 2:12 ` Jason Gunthorpe
[not found] ` <61ea869d-43d2-d1e5-dc00-cf5e3e139169-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
1 sibling, 2 replies; 79+ messages in thread
From: Ralph Campbell @ 2019-06-07 21:37 UTC (permalink / raw)
To: Jason Gunthorpe, Jerome Glisse, John Hubbard, Felix.Kuehling
Cc: Andrea Arcangeli, linux-rdma, amd-gfx, linux-mm, Jason Gunthorpe,
dri-devel
On 6/6/19 11:44 AM, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
>
> hmm_release() is called exactly once per hmm. ops->release() cannot
> accidentally trigger any action that would recurse back onto
> hmm->mirrors_sem.
>
> This fixes a use-after-free race of the form:
>
>         CPU0                                CPU1
>                                         hmm_release()
>                                           up_write(&hmm->mirrors_sem);
>  hmm_mirror_unregister(mirror)
>   down_write(&hmm->mirrors_sem);
>   up_write(&hmm->mirrors_sem);
>   kfree(mirror)
>                                           mirror->ops->release(mirror)
>
> The only user we have today for ops->release is an empty function, so this
> is unambiguously safe.
>
> As a consequence of plugging this race drivers are not allowed to
> register/unregister mirrors from within a release op.
>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
I agree with the analysis above but I'm not sure that release() will
always be an empty function. It might be more efficient to write back
all data migrated to a device "in one pass" instead of relying
on unmap_vmas() calling hmm_start_range_invalidate() per VMA.
I think the bigger issue is potential deadlocks while calling
sync_cpu_device_pagetables() and tasks calling hmm_mirror_unregister():
Say you have three threads:
- Thread A is in try_to_unmap(), either without holding mmap_sem or with
mmap_sem held for read.
- Thread B has some unrelated driver calling hmm_mirror_unregister().
This doesn't require mmap_sem.
- Thread C is about to call migrate_vma().
 Thread A                    Thread B                  Thread C
 try_to_unmap                hmm_mirror_unregister     migrate_vma
 --------------------------  ------------------------  --------------------------
 hmm_invalidate_range_start
 down_read(mirrors_sem)
                             down_write(mirrors_sem)
                             // Blocked on A
                                                       device_lock
 device_lock
 // Blocked on C
                                                       migrate_vma()
                                                       hmm_invalidate_range_s
                                                       down_read(mirrors_sem)
                                                       // Blocked on B
                                                       // Deadlock
Perhaps we should consider using SRCU for walking the mirror->list?
> ---
> mm/hmm.c | 28 +++++++++-------------------
> 1 file changed, 9 insertions(+), 19 deletions(-)
>
> diff --git a/mm/hmm.c b/mm/hmm.c
> index 709d138dd49027..3a45dd3d778248 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -136,26 +136,16 @@ static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
> WARN_ON(!list_empty(&hmm->ranges));
> mutex_unlock(&hmm->lock);
>
> - down_write(&hmm->mirrors_sem);
> - mirror = list_first_entry_or_null(&hmm->mirrors, struct hmm_mirror,
> - list);
> - while (mirror) {
> - list_del_init(&mirror->list);
> - if (mirror->ops->release) {
> - /*
> - * Drop mirrors_sem so the release callback can wait
> - * on any pending work that might itself trigger a
> - * mmu_notifier callback and thus would deadlock with
> - * us.
> - */
> - up_write(&hmm->mirrors_sem);
> + down_read(&hmm->mirrors_sem);
> + list_for_each_entry(mirror, &hmm->mirrors, list) {
> + /*
> + * Note: The driver is not allowed to trigger
> + * hmm_mirror_unregister() from this thread.
> + */
> + if (mirror->ops->release)
> mirror->ops->release(mirror);
> - down_write(&hmm->mirrors_sem);
> - }
> - mirror = list_first_entry_or_null(&hmm->mirrors,
> - struct hmm_mirror, list);
> }
> - up_write(&hmm->mirrors_sem);
> + up_read(&hmm->mirrors_sem);
>
> hmm_put(hmm);
> }
> @@ -287,7 +277,7 @@ void hmm_mirror_unregister(struct hmm_mirror *mirror)
> struct hmm *hmm = mirror->hmm;
>
> down_write(&hmm->mirrors_sem);
> - list_del_init(&mirror->list);
> + list_del(&mirror->list);
> up_write(&hmm->mirrors_sem);
> hmm_put(hmm);
> memset(&mirror->hmm, POISON_INUSE, sizeof(mirror->hmm));
>
* Re: [PATCH v2 hmm 11/11] mm/hmm: Remove confusing comment and logic from hmm_release
2019-06-07 21:37 ` Ralph Campbell
@ 2019-06-08 2:12 ` Jason Gunthorpe
[not found] ` <61ea869d-43d2-d1e5-dc00-cf5e3e139169-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
1 sibling, 0 replies; 79+ messages in thread
From: Jason Gunthorpe @ 2019-06-08 2:12 UTC (permalink / raw)
To: Ralph Campbell
Cc: Andrea Arcangeli, linux-rdma, John Hubbard, Felix.Kuehling,
dri-devel, linux-mm, Jerome Glisse, amd-gfx
On Fri, Jun 07, 2019 at 02:37:07PM -0700, Ralph Campbell wrote:
>
> On 6/6/19 11:44 AM, Jason Gunthorpe wrote:
> > From: Jason Gunthorpe <jgg@mellanox.com>
> >
> > hmm_release() is called exactly once per hmm. ops->release() cannot
> > accidentally trigger any action that would recurse back onto
> > hmm->mirrors_sem.
> >
> > This fixes a use-after-free race of the form:
> >
> >         CPU0                                CPU1
> >                                         hmm_release()
> >                                           up_write(&hmm->mirrors_sem);
> >  hmm_mirror_unregister(mirror)
> >   down_write(&hmm->mirrors_sem);
> >   up_write(&hmm->mirrors_sem);
> >   kfree(mirror)
> >                                           mirror->ops->release(mirror)
> >
> > The only user we have today for ops->release is an empty function, so this
> > is unambiguously safe.
> >
> > As a consequence of plugging this race drivers are not allowed to
> > register/unregister mirrors from within a release op.
> >
> > Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
>
> I agree with the analysis above but I'm not sure that release() will
> always be an empty function. It might be more efficient to write back
> all data migrated to a device "in one pass" instead of relying
> on unmap_vmas() calling hmm_start_range_invalidate() per VMA.
Sure, but it should not be allowed to recurse back to
hmm_mirror_unregister.
> I think the bigger issue is potential deadlocks while calling
> sync_cpu_device_pagetables() and tasks calling hmm_mirror_unregister():
>
> Say you have three threads:
> - Thread A is in try_to_unmap(), either without holding mmap_sem or with
> mmap_sem held for read.
> - Thread B has some unrelated driver calling hmm_mirror_unregister().
> This doesn't require mmap_sem.
> - Thread C is about to call migrate_vma().
>
>  Thread A                    Thread B                  Thread C
>  try_to_unmap                hmm_mirror_unregister     migrate_vma
>  hmm_invalidate_range_start
>  down_read(mirrors_sem)
>                              down_write(mirrors_sem)
>                              // Blocked on A
>                                                        device_lock
>  device_lock
>  // Blocked on C
>                                                        migrate_vma()
>                                                        hmm_invalidate_range_s
>                                                        down_read(mirrors_sem)
>                                                        // Blocked on B
>                                                        // Deadlock
Oh... you know, I didn't know this about rwsems in Linux: they have
a fairness policy where a queued write blocks future reads.
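Schematically, the rwsem behaviour in question:

    /*
     * T1: down_read(&sem);     succeeds
     * T2: down_write(&sem);    queued, waits for T1
     * T3: down_read(&sem);     blocks behind the queued writer T2,
     *                          even though only readers hold the lock
     */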
Still, at least as things are designed, the driver cannot hold a lock
it obtains under sync_cpu_device_pagetables() and nest other things in
that lock. It certainly can't recurse back into any mmu notifiers
while holding that lock. (as you point out)
The lock in sync_cpu_device_pagetables() needs to be very narrowly
focused on updating device state only.
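That is, roughly this shape, where the callback takes a driver lock
only long enough to flip device state (the drv_* names are
hypothetical):

    static int drv_sync_cpu_device_pagetables(struct hmm_mirror *mirror,
                                              const struct hmm_update *update)
    {
            struct drv_dev *dev = container_of(mirror, struct drv_dev, mirror);

            spin_lock(&dev->pt_lock);
            /* touch only device state; take no other locks in here */
            drv_clear_device_ptes(dev, update->start, update->end);
            spin_unlock(&dev->pt_lock);
            return 0;
    }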
So, my first reaction is that the driver in thread C is wrong, and
needs a different locking scheme. I think you'd have to make a really
good case that there is no alternative for a driver..
> Perhaps we should consider using SRCU for walking the mirror->list?
It means the driver has to deal with races like in this patch
description. At that point there is almost no reason to insert hmm
here, just use mmu notifiers directly.
Drivers won't get this right, it is too hard.
Jason
[parent not found: <61ea869d-43d2-d1e5-dc00-cf5e3e139169-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>]
* Re: [PATCH v2 hmm 11/11] mm/hmm: Remove confusing comment and logic from hmm_release
[not found] ` <61ea869d-43d2-d1e5-dc00-cf5e3e139169-DDmLM1+adcrQT0dZR+AlfA@public.gmane.org>
@ 2019-06-10 16:02 ` Jason Gunthorpe
2019-06-10 22:03 ` Ralph Campbell
0 siblings, 1 reply; 79+ messages in thread
From: Jason Gunthorpe @ 2019-06-10 16:02 UTC (permalink / raw)
To: Ralph Campbell
Cc: Andrea Arcangeli,
linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, John Hubbard,
Felix.Kuehling-5C7GfCeVMHo@public.gmane.org,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org,
linux-mm-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org, Jerome Glisse,
amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW@public.gmane.org
On Fri, Jun 07, 2019 at 02:37:07PM -0700, Ralph Campbell wrote:
>
> On 6/6/19 11:44 AM, Jason Gunthorpe wrote:
> > From: Jason Gunthorpe <jgg@mellanox.com>
> >
> > hmm_release() is called exactly once per hmm. ops->release() cannot
> > accidentally trigger any action that would recurse back onto
> > hmm->mirrors_sem.
> >
> > This fixes a use-after-free race of the form:
> >
> >         CPU0                                CPU1
> >                                         hmm_release()
> >                                           up_write(&hmm->mirrors_sem);
> >  hmm_mirror_unregister(mirror)
> >   down_write(&hmm->mirrors_sem);
> >   up_write(&hmm->mirrors_sem);
> >   kfree(mirror)
> >                                           mirror->ops->release(mirror)
> >
> > The only user we have today for ops->release is an empty function, so this
> > is unambiguously safe.
> >
> > As a consequence of plugging this race drivers are not allowed to
> > register/unregister mirrors from within a release op.
> >
> > Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
>
> I agree with the analysis above but I'm not sure that release() will
> always be an empty function. It might be more efficient to write back
> all data migrated to a device "in one pass" instead of relying
> on unmap_vmas() calling hmm_start_range_invalidate() per VMA.
I think we have to focus on the *current* kernel - and we have two
users of release: nouveau_svm.c's is empty and amdgpu_mn.c's does
schedule_work() - so I believe we should go ahead with this simple
solution to the actual race today that both of those will suffer from.
If we find a need for a more complex version then it can be debated
and justified with proper context...
Ok?
Jason
* Re: [PATCH v2 hmm 11/11] mm/hmm: Remove confusing comment and logic from hmm_release
2019-06-10 16:02 ` Jason Gunthorpe
@ 2019-06-10 22:03 ` Ralph Campbell
0 siblings, 0 replies; 79+ messages in thread
From: Ralph Campbell @ 2019-06-10 22:03 UTC (permalink / raw)
To: Jason Gunthorpe
Cc: Andrea Arcangeli, linux-rdma@vger.kernel.org, John Hubbard,
Felix.Kuehling@amd.com, dri-devel@lists.freedesktop.org,
linux-mm@kvack.org, Jerome Glisse, amd-gfx@lists.freedesktop.org
On 6/10/19 9:02 AM, Jason Gunthorpe wrote:
> On Fri, Jun 07, 2019 at 02:37:07PM -0700, Ralph Campbell wrote:
>>
>> On 6/6/19 11:44 AM, Jason Gunthorpe wrote:
>>> From: Jason Gunthorpe <jgg@mellanox.com>
>>>
>>> hmm_release() is called exactly once per hmm. ops->release() cannot
>>> accidentally trigger any action that would recurse back onto
>>> hmm->mirrors_sem.
>>>
>>> This fixes a use-after-free race of the form:
>>>
>>>         CPU0                                CPU1
>>>                                         hmm_release()
>>>                                           up_write(&hmm->mirrors_sem);
>>>  hmm_mirror_unregister(mirror)
>>>   down_write(&hmm->mirrors_sem);
>>>   up_write(&hmm->mirrors_sem);
>>>   kfree(mirror)
>>>                                           mirror->ops->release(mirror)
>>>
>>> The only user we have today for ops->release is an empty function, so this
>>> is unambiguously safe.
>>>
>>> As a consequence of plugging this race drivers are not allowed to
>>> register/unregister mirrors from within a release op.
>>>
>>> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
>>
>> I agree with the analysis above but I'm not sure that release() will
>> always be an empty function. It might be more efficient to write back
>> all data migrated to a device "in one pass" instead of relying
>> on unmap_vmas() calling hmm_start_range_invalidate() per VMA.
>
> I think we have to focus on the *current* kernel - and we have two
> users of release, nouveau_svm.c is empty and amdgpu_mn.c does
> schedule_work() - so I believe we should go ahead with this simple
> solution to the actual race today that both of those will suffer from.
>
> If we find a need for a more complex version then it can be debated
> and justified with proper context...
>
> Ok?
>
> Jason
OK.
I guess we have enough on the plate already :-)
* [PATCH v2 12/11] mm/hmm: Fix error flows in hmm_invalidate_range_start
[not found] ` <20190606184438.31646-1-jgg-uk2M96/98Pc@public.gmane.org>
` (10 preceding siblings ...)
2019-06-06 18:44 ` [PATCH v2 hmm 11/11] mm/hmm: Remove confusing comment and logic from hmm_release Jason Gunthorpe
@ 2019-06-07 16:05 ` Jason Gunthorpe
2019-06-07 23:52 ` Ralph Campbell
2019-06-11 19:48 ` [PATCH v2 hmm 00/11] Various revisions from a locking/code review Jason Gunthorpe
12 siblings, 1 reply; 79+ messages in thread
From: Jason Gunthorpe @ 2019-06-07 16:05 UTC (permalink / raw)
To: Jerome Glisse, Ralph Campbell, John Hubbard,
Felix.Kuehling-5C7GfCeVMHo
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Andrea Arcangeli,
amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
If the trylock on the hmm->mirrors_sem fails the function returns
without decrementing the notifiers that were previously incremented. Since
the caller will not call invalidate_range_end() on EAGAIN, the notifier
count remains permanently incremented, leading to deadlock.
If sync_cpu_device_pagetables() required blocking, the function will
not return EAGAIN even though the device continues to touch the
pages. This is a violation of the mmu notifier contract.
Switch, and rename, the ranges_lock to a spin lock so we can reliably
obtain it without blocking during error unwind.
The error unwind is necessary since the notifiers count must be held
incremented across the call to sync_cpu_device_pagetables() as we cannot
allow the range to become marked valid by a parallel
invalidate_start/end() pair while doing sync_cpu_device_pagetables().
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
---
include/linux/hmm.h | 2 +-
mm/hmm.c | 77 +++++++++++++++++++++++++++------------------
2 files changed, 48 insertions(+), 31 deletions(-)
I almost lost this patch - it is part of the series, hasn't been
posted before, and wasn't sent with the rest, sorry.
diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index bf013e96525771..0fa8ea34ccef6d 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -86,7 +86,7 @@
struct hmm {
struct mm_struct *mm;
struct kref kref;
- struct mutex lock;
+ spinlock_t ranges_lock;
struct list_head ranges;
struct list_head mirrors;
struct mmu_notifier mmu_notifier;
diff --git a/mm/hmm.c b/mm/hmm.c
index 4215edf737ef5b..10103a24e9b7b3 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -68,7 +68,7 @@ static struct hmm *hmm_get_or_create(struct mm_struct *mm)
init_rwsem(&hmm->mirrors_sem);
hmm->mmu_notifier.ops = NULL;
INIT_LIST_HEAD(&hmm->ranges);
- mutex_init(&hmm->lock);
+ spin_lock_init(&hmm->ranges_lock);
kref_init(&hmm->kref);
hmm->notifiers = 0;
hmm->mm = mm;
@@ -114,18 +114,19 @@ static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
{
struct hmm *hmm = container_of(mn, struct hmm, mmu_notifier);
struct hmm_mirror *mirror;
+ unsigned long flags;
/* Bail out if hmm is in the process of being freed */
if (!kref_get_unless_zero(&hmm->kref))
return;
- mutex_lock(&hmm->lock);
+ spin_lock_irqsave(&hmm->ranges_lock, flags);
/*
* Since hmm_range_register() holds the mmget() lock hmm_release() is
* prevented as long as a range exists.
*/
WARN_ON(!list_empty(&hmm->ranges));
- mutex_unlock(&hmm->lock);
+ spin_unlock_irqrestore(&hmm->ranges_lock, flags);
down_read(&hmm->mirrors_sem);
list_for_each_entry(mirror, &hmm->mirrors, list) {
@@ -141,6 +142,23 @@ static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
hmm_put(hmm);
}
+static void notifiers_decrement(struct hmm *hmm)
+{
+ lockdep_assert_held(&hmm->ranges_lock);
+
+ hmm->notifiers--;
+ if (!hmm->notifiers) {
+ struct hmm_range *range;
+
+ list_for_each_entry(range, &hmm->ranges, list) {
+ if (range->valid)
+ continue;
+ range->valid = true;
+ }
+ wake_up_all(&hmm->wq);
+ }
+}
+
static int hmm_invalidate_range_start(struct mmu_notifier *mn,
const struct mmu_notifier_range *nrange)
{
@@ -148,6 +166,7 @@ static int hmm_invalidate_range_start(struct mmu_notifier *mn,
struct hmm_mirror *mirror;
struct hmm_update update;
struct hmm_range *range;
+ unsigned long flags;
int ret = 0;
if (!kref_get_unless_zero(&hmm->kref))
@@ -158,12 +177,7 @@ static int hmm_invalidate_range_start(struct mmu_notifier *mn,
update.event = HMM_UPDATE_INVALIDATE;
update.blockable = mmu_notifier_range_blockable(nrange);
- if (mmu_notifier_range_blockable(nrange))
- mutex_lock(&hmm->lock);
- else if (!mutex_trylock(&hmm->lock)) {
- ret = -EAGAIN;
- goto out;
- }
+ spin_lock_irqsave(&hmm->ranges_lock, flags);
hmm->notifiers++;
list_for_each_entry(range, &hmm->ranges, list) {
if (update.end < range->start || update.start >= range->end)
@@ -171,7 +185,7 @@ static int hmm_invalidate_range_start(struct mmu_notifier *mn,
range->valid = false;
}
- mutex_unlock(&hmm->lock);
+ spin_unlock_irqrestore(&hmm->ranges_lock, flags);
if (mmu_notifier_range_blockable(nrange))
down_read(&hmm->mirrors_sem);
@@ -179,16 +193,26 @@ static int hmm_invalidate_range_start(struct mmu_notifier *mn,
ret = -EAGAIN;
goto out;
}
+
list_for_each_entry(mirror, &hmm->mirrors, list) {
- int ret;
+ int rc;
- ret = mirror->ops->sync_cpu_device_pagetables(mirror, &update);
- if (!update.blockable && ret == -EAGAIN)
+ rc = mirror->ops->sync_cpu_device_pagetables(mirror, &update);
+ if (rc) {
+ if (WARN_ON(update.blockable || rc != -EAGAIN))
+ continue;
+ ret = -EAGAIN;
break;
+ }
}
up_read(&hmm->mirrors_sem);
out:
+ if (ret) {
+ spin_lock_irqsave(&hmm->ranges_lock, flags);
+ notifiers_decrement(hmm);
+ spin_unlock_irqrestore(&hmm->ranges_lock, flags);
+ }
hmm_put(hmm);
return ret;
}
@@ -197,23 +221,14 @@ static void hmm_invalidate_range_end(struct mmu_notifier *mn,
const struct mmu_notifier_range *nrange)
{
struct hmm *hmm = container_of(mn, struct hmm, mmu_notifier);
+ unsigned long flags;
if (!kref_get_unless_zero(&hmm->kref))
return;
- mutex_lock(&hmm->lock);
- hmm->notifiers--;
- if (!hmm->notifiers) {
- struct hmm_range *range;
-
- list_for_each_entry(range, &hmm->ranges, list) {
- if (range->valid)
- continue;
- range->valid = true;
- }
- wake_up_all(&hmm->wq);
- }
- mutex_unlock(&hmm->lock);
+ spin_lock_irqsave(&hmm->ranges_lock, flags);
+ notifiers_decrement(hmm);
+ spin_unlock_irqrestore(&hmm->ranges_lock, flags);
hmm_put(hmm);
}
@@ -866,6 +881,7 @@ int hmm_range_register(struct hmm_range *range,
{
unsigned long mask = ((1UL << page_shift) - 1UL);
struct hmm *hmm = mirror->hmm;
+ unsigned long flags;
range->valid = false;
range->hmm = NULL;
@@ -887,7 +903,7 @@ int hmm_range_register(struct hmm_range *range,
kref_get(&hmm->kref);
/* Initialize range to track CPU page table updates. */
- mutex_lock(&hmm->lock);
+ spin_lock_irqsave(&hmm->ranges_lock, flags);
range->hmm = hmm;
list_add(&range->list, &hmm->ranges);
@@ -898,7 +914,7 @@ int hmm_range_register(struct hmm_range *range,
*/
if (!hmm->notifiers)
range->valid = true;
- mutex_unlock(&hmm->lock);
+ spin_unlock_irqrestore(&hmm->ranges_lock, flags);
return 0;
}
@@ -914,13 +930,14 @@ EXPORT_SYMBOL(hmm_range_register);
void hmm_range_unregister(struct hmm_range *range)
{
struct hmm *hmm = range->hmm;
+ unsigned long flags;
if (WARN_ON(range->end <= range->start))
return;
- mutex_lock(&hmm->lock);
+ spin_lock_irqsave(&hmm->ranges_lock, flags);
list_del(&range->list);
- mutex_unlock(&hmm->lock);
+ spin_unlock_irqrestore(&hmm->ranges_lock, flags);
/* Drop reference taken by hmm_range_register() */
range->valid = false;
--
2.21.0
* Re: [PATCH v2 12/11] mm/hmm: Fix error flows in hmm_invalidate_range_start
2019-06-07 16:05 ` [PATCH v2 12/11] mm/hmm: Fix error flows in hmm_invalidate_range_start Jason Gunthorpe
@ 2019-06-07 23:52 ` Ralph Campbell
2019-06-08 1:35 ` Jason Gunthorpe
0 siblings, 1 reply; 79+ messages in thread
From: Ralph Campbell @ 2019-06-07 23:52 UTC (permalink / raw)
To: Jason Gunthorpe, Jerome Glisse, John Hubbard, Felix.Kuehling
Cc: linux-rdma, linux-mm, Andrea Arcangeli, amd-gfx, dri-devel
On 6/7/19 9:05 AM, Jason Gunthorpe wrote:
> If the trylock on the hmm->mirrors_sem fails the function returns
> without decrementing the notifiers that were previously incremented. Since
> the caller will not call invalidate_range_end() on EAGAIN, the notifier
> count remains permanently incremented, leading to deadlock.
>
> If sync_cpu_device_pagetables() required blocking, the function will
> not return EAGAIN even though the device continues to touch the
> pages. This is a violation of the mmu notifier contract.
>
> Switch, and rename, the ranges_lock to a spin lock so we can reliably
> obtain it without blocking during error unwind.
>
> The error unwind is necessary since the notifiers count must be held
> incremented across the call to sync_cpu_device_pagetables() as we cannot
> allow the range to become marked valid by a parallel
> invalidate_start/end() pair while doing sync_cpu_device_pagetables().
>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
> ---
> include/linux/hmm.h | 2 +-
> mm/hmm.c | 77 +++++++++++++++++++++++++++------------------
> 2 files changed, 48 insertions(+), 31 deletions(-)
>
> I almost lost this patch - it is part of the series, hasn't been
> posted before, and wasn't sent with the rest, sorry.
>
> diff --git a/include/linux/hmm.h b/include/linux/hmm.h
> index bf013e96525771..0fa8ea34ccef6d 100644
> --- a/include/linux/hmm.h
> +++ b/include/linux/hmm.h
> @@ -86,7 +86,7 @@
> struct hmm {
> struct mm_struct *mm;
> struct kref kref;
> - struct mutex lock;
> + spinlock_t ranges_lock;
> struct list_head ranges;
> struct list_head mirrors;
> struct mmu_notifier mmu_notifier;
> diff --git a/mm/hmm.c b/mm/hmm.c
> index 4215edf737ef5b..10103a24e9b7b3 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -68,7 +68,7 @@ static struct hmm *hmm_get_or_create(struct mm_struct *mm)
> init_rwsem(&hmm->mirrors_sem);
> hmm->mmu_notifier.ops = NULL;
> INIT_LIST_HEAD(&hmm->ranges);
> - mutex_init(&hmm->lock);
> + spin_lock_init(&hmm->ranges_lock);
> kref_init(&hmm->kref);
> hmm->notifiers = 0;
> hmm->mm = mm;
> @@ -114,18 +114,19 @@ static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
> {
> struct hmm *hmm = container_of(mn, struct hmm, mmu_notifier);
> struct hmm_mirror *mirror;
> + unsigned long flags;
>
> /* Bail out if hmm is in the process of being freed */
> if (!kref_get_unless_zero(&hmm->kref))
> return;
>
> - mutex_lock(&hmm->lock);
> + spin_lock_irqsave(&hmm->ranges_lock, flags);
> /*
> * Since hmm_range_register() holds the mmget() lock hmm_release() is
> * prevented as long as a range exists.
> */
> WARN_ON(!list_empty(&hmm->ranges));
> - mutex_unlock(&hmm->lock);
> + spin_unlock_irqrestore(&hmm->ranges_lock, flags);
>
> down_read(&hmm->mirrors_sem);
> list_for_each_entry(mirror, &hmm->mirrors, list) {
> @@ -141,6 +142,23 @@ static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
> hmm_put(hmm);
> }
>
> +static void notifiers_decrement(struct hmm *hmm)
> +{
> + lockdep_assert_held(&hmm->ranges_lock);
> +
> + hmm->notifiers--;
> + if (!hmm->notifiers) {
> + struct hmm_range *range;
> +
> + list_for_each_entry(range, &hmm->ranges, list) {
> + if (range->valid)
> + continue;
> + range->valid = true;
> + }
This just effectively sets all ranges to valid.
I'm not sure that is best.
Shouldn't hmm_range_register() start with range.valid = true and
then hmm_invalidate_range_start() set affected ranges to false?
Then this becomes just wake_up_all() if --notifiers == 0 and
hmm_range_wait_until_valid() should wait for notifiers == 0.
Otherwise, range.valid doesn't really mean it's valid.
> + wake_up_all(&hmm->wq);
> + }
> +}
> +
> static int hmm_invalidate_range_start(struct mmu_notifier *mn,
> const struct mmu_notifier_range *nrange)
> {
> @@ -148,6 +166,7 @@ static int hmm_invalidate_range_start(struct mmu_notifier *mn,
> struct hmm_mirror *mirror;
> struct hmm_update update;
> struct hmm_range *range;
> + unsigned long flags;
> int ret = 0;
>
> if (!kref_get_unless_zero(&hmm->kref))
> @@ -158,12 +177,7 @@ static int hmm_invalidate_range_start(struct mmu_notifier *mn,
> update.event = HMM_UPDATE_INVALIDATE;
> update.blockable = mmu_notifier_range_blockable(nrange);
>
> - if (mmu_notifier_range_blockable(nrange))
> - mutex_lock(&hmm->lock);
> - else if (!mutex_trylock(&hmm->lock)) {
> - ret = -EAGAIN;
> - goto out;
> - }
> + spin_lock_irqsave(&hmm->ranges_lock, flags);
> hmm->notifiers++;
> list_for_each_entry(range, &hmm->ranges, list) {
> if (update.end < range->start || update.start >= range->end)
> @@ -171,7 +185,7 @@ static int hmm_invalidate_range_start(struct mmu_notifier *mn,
>
> range->valid = false;
> }
> - mutex_unlock(&hmm->lock);
> + spin_unlock_irqrestore(&hmm->ranges_lock, flags);
>
> if (mmu_notifier_range_blockable(nrange))
> down_read(&hmm->mirrors_sem);
> @@ -179,16 +193,26 @@ static int hmm_invalidate_range_start(struct mmu_notifier *mn,
> ret = -EAGAIN;
> goto out;
> }
> +
> list_for_each_entry(mirror, &hmm->mirrors, list) {
> - int ret;
> + int rc;
>
> - ret = mirror->ops->sync_cpu_device_pagetables(mirror, &update);
> - if (!update.blockable && ret == -EAGAIN)
> + rc = mirror->ops->sync_cpu_device_pagetables(mirror, &update);
> + if (rc) {
> + if (WARN_ON(update.blockable || rc != -EAGAIN))
> + continue;
> + ret = -EAGAIN;
> break;
> + }
> }
> up_read(&hmm->mirrors_sem);
>
> out:
> + if (ret) {
> + spin_lock_irqsave(&hmm->ranges_lock, flags);
> + notifiers_decrement(hmm);
> + spin_unlock_irqrestore(&hmm->ranges_lock, flags);
> + }
> hmm_put(hmm);
> return ret;
> }
> @@ -197,23 +221,14 @@ static void hmm_invalidate_range_end(struct mmu_notifier *mn,
> const struct mmu_notifier_range *nrange)
> {
> struct hmm *hmm = container_of(mn, struct hmm, mmu_notifier);
> + unsigned long flags;
>
> if (!kref_get_unless_zero(&hmm->kref))
> return;
>
> - mutex_lock(&hmm->lock);
> - hmm->notifiers--;
> - if (!hmm->notifiers) {
> - struct hmm_range *range;
> -
> - list_for_each_entry(range, &hmm->ranges, list) {
> - if (range->valid)
> - continue;
> - range->valid = true;
> - }
> - wake_up_all(&hmm->wq);
> - }
> - mutex_unlock(&hmm->lock);
> + spin_lock_irqsave(&hmm->ranges_lock, flags);
> + notifiers_decrement(hmm);
> + spin_unlock_irqrestore(&hmm->ranges_lock, flags);
>
> hmm_put(hmm);
> }
> @@ -866,6 +881,7 @@ int hmm_range_register(struct hmm_range *range,
> {
> unsigned long mask = ((1UL << page_shift) - 1UL);
> struct hmm *hmm = mirror->hmm;
> + unsigned long flags;
>
> range->valid = false;
> range->hmm = NULL;
> @@ -887,7 +903,7 @@ int hmm_range_register(struct hmm_range *range,
> kref_get(&hmm->kref);
>
> /* Initialize range to track CPU page table updates. */
> - mutex_lock(&hmm->lock);
> + spin_lock_irqsave(&hmm->ranges_lock, flags);
>
> range->hmm = hmm;
> list_add(&range->list, &hmm->ranges);
> @@ -898,7 +914,7 @@ int hmm_range_register(struct hmm_range *range,
> */
> if (!hmm->notifiers)
> range->valid = true;
> - mutex_unlock(&hmm->lock);
> + spin_unlock_irqrestore(&hmm->ranges_lock, flags);
>
> return 0;
> }
> @@ -914,13 +930,14 @@ EXPORT_SYMBOL(hmm_range_register);
> void hmm_range_unregister(struct hmm_range *range)
> {
> struct hmm *hmm = range->hmm;
> + unsigned long flags;
>
> if (WARN_ON(range->end <= range->start))
> return;
>
> - mutex_lock(&hmm->lock);
> + spin_lock_irqsave(&hmm->ranges_lock, flags);
> list_del(&range->list);
> - mutex_unlock(&hmm->lock);
> + spin_unlock_irqrestore(&hmm->ranges_lock, flags);
>
> /* Drop reference taken by hmm_range_register() */
> range->valid = false;
>
* Re: [PATCH v2 12/11] mm/hmm: Fix error flows in hmm_invalidate_range_start
2019-06-07 23:52 ` Ralph Campbell
@ 2019-06-08 1:35 ` Jason Gunthorpe
0 siblings, 0 replies; 79+ messages in thread
From: Jason Gunthorpe @ 2019-06-08 1:35 UTC (permalink / raw)
To: Ralph Campbell
Cc: Andrea Arcangeli, linux-rdma, John Hubbard, Felix.Kuehling,
dri-devel, linux-mm, Jerome Glisse, amd-gfx
On Fri, Jun 07, 2019 at 04:52:58PM -0700, Ralph Campbell wrote:
> > @@ -141,6 +142,23 @@ static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
> > hmm_put(hmm);
> > }
> > +static void notifiers_decrement(struct hmm *hmm)
> > +{
> > + lockdep_assert_held(&hmm->ranges_lock);
> > +
> > + hmm->notifiers--;
> > + if (!hmm->notifiers) {
> > + struct hmm_range *range;
> > +
> > + list_for_each_entry(range, &hmm->ranges, list) {
> > + if (range->valid)
> > + continue;
> > + range->valid = true;
> > + }
>
> This just effectively sets all ranges to valid.
> I'm not sure that is best.
This is a trade-off: it would be much more expensive to have a precise
'valid = true' - instead this algorithm is precise about 'valid =
false' and lazy about 'valid = true', which is much less costly to
calculate.
> Shouldn't hmm_range_register() start with range.valid = true and
> then hmm_invalidate_range_start() set affected ranges to false?
It kind of does, except when it doesn't, right? :)
> Then this becomes just wake_up_all() if --notifiers == 0 and
> hmm_range_wait_until_valid() should wait for notifiers == 0.
Almost.. but it is more tricky than that.
This scheme is a collision-retry algorithm. The pagefault side runs to
completion if no parallel invalidate start/end happens.
If a parallel invalidation happens then the pagefault retries.
Seeing notifiers == 0 means there is absolutely no current parallel
invalidation.
Seeing range->valid == true (under the device lock)
means this range doesn't intersect with a parallel invalidate.
So.. hmm_range_wait_until_valid() checks the per-range valid because
it doesn't want to sleep if *this range* is not involved in a parallel
invalidation - but once it becomes involved, then yes, valid == true
implies notifiers == 0.
It is easier/safer to use unlocked variable reads if there is only one
variable, thus the weird construction.
It is unclear to me if this micro optimization is really
worthwhile. It is very expensive to manage all this tracking, and no
other mmu notifier implementation really does something like
this. Eliminating the per-range tracking and using the notifier count
as a global lock would be much simpler...
> Otherwise, range.valid doesn't really mean it's valid.
Right, it doesn't really mean 'valid'
It is tracking possible colliding invalidates such that valid == true
(under the device lock) means that there was no colliding invalidate.
I still think this implementation doesn't quite work, as I described
here:
https://lore.kernel.org/linux-mm/20190527195829.GB18019@mellanox.com/
But the idea is basically sound and matches what other mmu notifier
users do, just using a seqcount like scheme, not a boolean.
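For reference, that seqcount-like shape (the drv fields are
hypothetical): the boolean becomes a counter, so any collision during
the snapshot is always observable:

    /* invalidate side */
    spin_lock(&drv->lock);
    drv->notifier_seq++;            /* record that a collision happened */
    spin_unlock(&drv->lock);

    /* fault side */
again:
    seq = READ_ONCE(drv->notifier_seq);
    /* ... snapshot the CPU page tables ... */
    spin_lock(&drv->lock);
    if (seq != drv->notifier_seq) { /* collided; retry */
            spin_unlock(&drv->lock);
            goto again;
    }
    /* program the device page tables */
    spin_unlock(&drv->lock);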
Jason
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
* Re: [PATCH v2 hmm 00/11] Various revisions from a locking/code review
[not found] ` <20190606184438.31646-1-jgg-uk2M96/98Pc@public.gmane.org>
` (11 preceding siblings ...)
2019-06-07 16:05 ` [PATCH v2 12/11] mm/hmm: Fix error flows in hmm_invalidate_range_start Jason Gunthorpe
@ 2019-06-11 19:48 ` Jason Gunthorpe
2019-06-12 17:54 ` Kuehling, Felix
12 siblings, 1 reply; 79+ messages in thread
From: Jason Gunthorpe @ 2019-06-11 19:48 UTC (permalink / raw)
To: Felix.Kuehling-5C7GfCeVMHo, Deucher, Alexander
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA,
linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW
On Thu, Jun 06, 2019 at 03:44:27PM -0300, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
>
> For hmm.git:
>
> This patch series arose out of discussions with Jerome when looking at the
> ODP changes, particularly informed by use-after-free races we have already
> found and fixed in the ODP code (thanks to syzkaller) working with mmu
> notifiers, and the discussion with Ralph on how to resolve the lifetime model.
>
> Overall this brings in a simplified locking scheme and easy to explain
> lifetime model:
>
> If a hmm_range is valid, then the hmm is valid; if a hmm is valid, then the mm
> is allocated memory.
>
> If the mm needs to still be alive (ie to lock the mmap_sem, find a vma, etc)
> then the mmget must be obtained via mmget_not_zero().
>
> Locking of mm->hmm is shifted to use the mmap_sem consistently for all
> read/write and unlocked accesses are removed.
>
> The use of unlocked reads on 'hmm->dead' is also eliminated in favour of using
> standard mmget() locking to prevent the mm from being released. Many of the
> debugging checks of !range->hmm and !hmm->mm are dropped in favour of poison -
> which is much clearer as to the lifetime intent.
>
> The trailing patches are just some random cleanups I noticed when reviewing
> this code.
>
> This v2 incorporates a lot of the good off-list changes & feedback Jerome had,
> and all the on-list comments too. However, now that we have the shared git I
> have kept the one-line change to nouveau_svm.c rather than the compat
> functions.
>
> I believe we can resolve this merge in the DRM tree now and keep the core
> mm/hmm.c clean. DRM maintainers, please correct me if I'm wrong.
>
> It is on top of hmm.git, and I have a git tree of this series to ease testing
> here:
>
> https://github.com/jgunthorpe/linux/tree/hmm
>
> There are still some open locking issues, as I think this remains unaddressed:
>
> https://lore.kernel.org/linux-mm/20190527195829.GB18019@mellanox.com/
>
> I'm looking for some more acks, reviews and tests so this can move ahead to
> hmm.git.
AMD Folks, this is looking pretty good now, can you please give at
least a Tested-by for the new driver code using this that I see in
linux-next?
Thanks,
Jason
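A sketch of the lifetime rule quoted above, as a driver would apply it
(the errno choice is illustrative):

    if (!mmget_not_zero(hmm->mm))
            return -ESRCH;          /* the mm has already exited */
    down_read(&hmm->mm->mmap_sem);
    /* ... find the vma, walk or fault pages ... */
    up_read(&hmm->mm->mmap_sem);
    mmput(hmm->mm);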
* Re: [PATCH v2 hmm 00/11] Various revisions from a locking/code review
2019-06-11 19:48 ` [PATCH v2 hmm 00/11] Various revisions from a locking/code review Jason Gunthorpe
@ 2019-06-12 17:54 ` Kuehling, Felix
[not found] ` <5d3b0ae2-3662-cab2-5e6c-82912f32356a-5C7GfCeVMHo@public.gmane.org>
0 siblings, 1 reply; 79+ messages in thread
From: Kuehling, Felix @ 2019-06-12 17:54 UTC (permalink / raw)
To: Jason Gunthorpe, Deucher, Alexander, Yang, Philip
Cc: linux-rdma@vger.kernel.org, linux-mm@kvack.org,
amd-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org
[+Philip]
Hi Jason,
I'm out of the office this week.
Hi Philip, can you give this a go? Not sure how much you've been
following this patch series review. Message or call me on Skype to
discuss any questions.
Thanks,
Felix
On 2019-06-11 12:48, Jason Gunthorpe wrote:
> On Thu, Jun 06, 2019 at 03:44:27PM -0300, Jason Gunthorpe wrote:
>> From: Jason Gunthorpe <jgg@mellanox.com>
>>
>> For hmm.git:
>>
>> This patch series arose out of discussions with Jerome when looking at the
>> ODP changes, particularly informed by use-after-free races we have already
>> found and fixed in the ODP code (thanks to syzkaller) working with mmu
>> notifiers, and the discussion with Ralph on how to resolve the lifetime model.
>>
>> Overall this brings in a simplified locking scheme and easy to explain
>> lifetime model:
>>
>> If a hmm_range is valid, then the hmm is valid; if a hmm is valid, then the mm
>> is allocated memory.
>>
>> If the mm needs to still be alive (ie to lock the mmap_sem, find a vma, etc)
>> then the mmget must be obtained via mmget_not_zero().
>>
>> Locking of mm->hmm is shifted to use the mmap_sem consistently for all
>> read/write and unlocked accesses are removed.
>>
>> The use of unlocked reads on 'hmm->dead' is also eliminated in favour of using
>> standard mmget() locking to prevent the mm from being released. Many of the
>> debugging checks of !range->hmm and !hmm->mm are dropped in favour of poison -
>> which is much clearer as to the lifetime intent.
>>
>> The trailing patches are just some random cleanups I noticed when reviewing
>> this code.
>>
>> This v2 incorporates a lot of the good off-list changes & feedback Jerome had,
>> and all the on-list comments too. However, now that we have the shared git I
>> have kept the one-line change to nouveau_svm.c rather than the compat
>> functions.
>>
>> I believe we can resolve this merge in the DRM tree now and keep the core
>> mm/hmm.c clean. DRM maintainers, please correct me if I'm wrong.
>>
>> It is on top of hmm.git, and I have a git tree of this series to ease testing
>> here:
>>
>> https://github.com/jgunthorpe/linux/tree/hmm
>>
>> There are still some open locking issues, as I think this remains unaddressed:
>>
>> https://lore.kernel.org/linux-mm/20190527195829.GB18019@mellanox.com/
>>
>> I'm looking for some more acks, reviews and tests so this can move ahead to
>> hmm.git.
> AMD Folks, this is looking pretty good now, can you please give at
> least a Tested-by for the new driver code using this that I see in
> linux-next?
>
> Thanks,
> Jason