* [xiaolong.ye@intel.com: [mm] 0331ab667f: kernel BUG at mm/mmap.c:327!]
@ 2016-09-20 13:46 Andrea Arcangeli
2016-09-20 13:55 ` Andrea Arcangeli
2016-09-21 0:49 ` Michel Lespinasse
0 siblings, 2 replies; 4+ messages in thread
From: Andrea Arcangeli @ 2016-09-20 13:46 UTC (permalink / raw)
To: Michel Lespinasse, Hugh Dickins, Andrew Morton, Rik van Riel; +Cc: linux-mm
Hello Michel,
I altered the vma_adjust code and it's triggering what looks like a
false positive in vma_rb_erase->validate_mm_rb with
CONFIG_DEBUG_VM_RB=y.
Normally remove_next == 1 or == 2: the code sets vma->vm_end to
next->vm_end, then calls validate_mm_rb(next), which passes, and then
unlinks "next" (removing it from vm_next/prev and the rbtree).
To fix a bug I introduced a new case, remove_next == 3, that actually
removes "vma" and sets next->vm_start = vma->vm_start.
So the old code was always doing:
vma->vm_end = next->vm_end
vma_rb_erase(next) // in __vma_unlink
vma->vm_next = next->vm_next // in __vma_unlink
next = vma->vm_next
vma_gap_update(next)
The new code still does the above for remove_next == 1 and 2, but for
remove_next == 3 it now does:
next->vm_start = vma->vm_start
vma_rb_erase(vma) // in __vma_unlink
vma_gap_update(next)
However it bugs out in vma_rb_erase(vma) because next->vm_start was
reduced. I still tend to think what I'm executing is correct.
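The ordering problem can be illustrated with a minimal userspace model.
This is not kernel code: it assumes a sorted doubly linked list with one
cached gap per node in place of the augmented rbtree, and every toy_*
name is hypothetical. It replays the remove_next == 3 sequence and
counts stale cached gaps with "vma" ignored versus "next" ignored:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for struct vm_area_struct (assumption: a list
 * with a per-node cached gap instead of the augmented rbtree). */
struct toy_vma {
	unsigned long vm_start, vm_end;
	unsigned long cached_gap;	/* plays the role of rb_subtree_gap */
	struct toy_vma *vm_prev, *vm_next;
};

/* Gap between this vma and its predecessor (unsigned, may wrap). */
static unsigned long toy_compute_gap(const struct toy_vma *vma)
{
	unsigned long prev_end = vma->vm_prev ? vma->vm_prev->vm_end : 0;

	return vma->vm_start - prev_end;
}

static void toy_gap_update(struct toy_vma *vma)
{
	vma->cached_gap = toy_compute_gap(vma);
}

/* Count nodes, other than "ignore", whose cached gap is stale. */
static int toy_validate(const struct toy_vma *head,
			const struct toy_vma *ignore)
{
	int bad = 0;

	for (const struct toy_vma *v = head; v; v = v->vm_next)
		if (v != ignore && v->cached_gap != toy_compute_gap(v))
			bad++;
	return bad;
}

static void toy_unlink(struct toy_vma **head, struct toy_vma *vma)
{
	if (vma->vm_prev)
		vma->vm_prev->vm_next = vma->vm_next;
	else
		*head = vma->vm_next;
	if (vma->vm_next)
		vma->vm_next->vm_prev = vma->vm_prev;
}

/* Replay the remove_next == 3 ordering and report the stale count seen
 * with each choice of "ignore" just before the erase, and at the end. */
static void toy_case3(int *bad_ignoring_vma, int *bad_ignoring_next,
		      int *bad_final)
{
	struct toy_vma vma = { 0x1000, 0x2000, 0, NULL, NULL };
	struct toy_vma next = { 0x2000, 0x3000, 0, &vma, NULL };
	struct toy_vma *head = &vma;

	vma.vm_next = &next;
	toy_gap_update(&vma);
	toy_gap_update(&next);

	next.vm_start = vma.vm_start;	/* next->vm_start = vma->vm_start */
	*bad_ignoring_vma = toy_validate(head, &vma);
	*bad_ignoring_next = toy_validate(head, &next);

	toy_unlink(&head, &vma);	/* the erase of "vma" */
	toy_gap_update(&next);		/* vma_gap_update(next) */
	*bad_final = toy_validate(head, NULL);
}
```

In this model the only stale node before the erase is "next", so the
check trips when "vma" is the ignored node and passes when "next" is,
and a full check with nothing ignored passes once the gap of "next" has
been refreshed.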
It's pointless to call vma_gap_update before I can call vma_rb_erase
anyway, so I certainly can't fix it that way. I'm forced to remove
"vma" from the rbtree before I can call vma_gap_update(next).
So I did other tests:
diff --git a/mm/mmap.c b/mm/mmap.c
index 27f0509..a38c8a0 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -400,15 +400,9 @@ static inline void vma_rb_insert(struct vm_area_struct *vma,
rb_insert_augmented(&vma->vm_rb, root, &vma_gap_callbacks);
}
-static void vma_rb_erase(struct vm_area_struct *vma, struct rb_root *root)
+static void __vma_rb_erase(struct vm_area_struct *vma, struct rb_root *root)
{
/*
- * All rb_subtree_gap values must be consistent prior to erase,
- * with the possible exception of the vma being erased.
- */
- validate_mm_rb(root, vma);
-
- /*
* Note rb_erase_augmented is a fairly large inline function,
* so make sure we instantiate it only once with our desired
* augmented rbtree callbacks.
@@ -416,6 +410,18 @@ static void vma_rb_erase(struct vm_area_struct *vma, struct rb_root *root)
rb_erase_augmented(&vma->vm_rb, root, &vma_gap_callbacks);
}
+static __always_inline void vma_rb_erase(struct vm_area_struct *vma,
+ struct rb_root *root)
+{
+ /*
+ * All rb_subtree_gap values must be consistent prior to erase,
+ * with the possible exception of the vma being erased.
+ */
+ validate_mm_rb(root, vma);
+
+ __vma_rb_erase(vma, root);
+}
+
/*
* vma has some anon_vma assigned, and is already inserted on that
* anon_vma's interval trees.
@@ -606,7 +612,10 @@ static __always_inline void __vma_unlink_common(struct mm_struct *mm,
{
struct vm_area_struct *next;
- vma_rb_erase(vma, &mm->mm_rb);
+ if (has_prev)
+ vma_rb_erase(vma, &mm->mm_rb);
+ else
+ __vma_rb_erase(vma, &mm->mm_rb);
next = vma->vm_next;
if (has_prev)
prev->vm_next = next;
@@ -892,9 +901,11 @@ again:
end = next->vm_end;
goto again;
}
- else if (next)
+ else if (next) {
vma_gap_update(next);
- else
+ if (remove_next == 3)
+ validate_mm_rb(&mm->mm_rb, next);
+ } else
mm->highest_vm_end = end;
}
if (insert && file)
The above shifts the validate_mm_rb(next) for the remove_next == 3
case from before the rb removal of "vma" to after vma_gap_update is
called on "next". This works fine.
So if you agree this is a false positive of CONFIG_DEBUG_VM_RB and
there was no actual bug, I suggest shutting off the warning by
telling validate_mm_rb to ignore not the vma that is being removed but
the next one, if next->vm_start was reduced to overlap the vma that is
being removed.
This shut off the warning just fine for me and it leaves the
validation in place and always enabled. It just skips the check on the
next vma that was updated, instead of the one that is being removed,
when it was the next one that had next->vm_start reduced.
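One way the "ignore" idea could be expressed is sketched below. It is a
sketch under assumptions, not the submitted patch: the types and the
validate/erase bodies are stubs that only record their arguments,
raw_vma_rb_erase stands in for __vma_rb_erase from the diff above, and
the body of vma_rb_erase_ignore (a helper named later in this thread)
is assumed, since its definition does not appear here.

```c
#include <assert.h>
#include <stddef.h>

/* Stub types: just enough state to observe the calls. */
struct vm_area_struct { int id; };
struct rb_root {
	const struct vm_area_struct *validated_ignoring;
	const struct vm_area_struct *erased;
};

/* Stub: the real check walks the rbtree; this only records "ignore". */
static void validate_mm_rb(struct rb_root *root,
			   const struct vm_area_struct *ignore)
{
	root->validated_ignoring = ignore;
}

/* Stub standing in for __vma_rb_erase(): erase without validation. */
static void raw_vma_rb_erase(struct vm_area_struct *vma,
			     struct rb_root *root)
{
	root->erased = vma;
}

/* Assumed shape: validate with a caller-chosen exception, then erase. */
static void vma_rb_erase_ignore(struct vm_area_struct *vma,
				struct rb_root *root,
				const struct vm_area_struct *ignore)
{
	validate_mm_rb(root, ignore);
	raw_vma_rb_erase(vma, root);
}

/* Existing behaviour: the vma being erased is itself the exception. */
static void vma_rb_erase(struct vm_area_struct *vma, struct rb_root *root)
{
	vma_rb_erase_ignore(vma, root, vma);
}
```

With this shape the remove_next == 3 path would pass "next" as the
exception while every other caller keeps passing the vma being erased,
so the validation itself stays enabled in all cases.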
On a side note I also noticed "mm->highest_vm_end = end" is erroneous,
it should be VM_WARN_ON(mm->highest_vm_end != end) but that's
offtopic.
So this would be the patch I'd suggest to shut off the false positive,
it's a noop when CONFIG_DEBUG_VM_RB=n.
* Re: [xiaolong.ye@intel.com: [mm] 0331ab667f: kernel BUG at mm/mmap.c:327!]
2016-09-20 13:46 [xiaolong.ye@intel.com: [mm] 0331ab667f: kernel BUG at mm/mmap.c:327!] Andrea Arcangeli
@ 2016-09-20 13:55 ` Andrea Arcangeli
2016-09-21 0:49 ` Michel Lespinasse
1 sibling, 0 replies; 4+ messages in thread
From: Andrea Arcangeli @ 2016-09-20 13:55 UTC (permalink / raw)
To: Michel Lespinasse, Hugh Dickins, Andrew Morton, Rik van Riel; +Cc: linux-mm
On Tue, Sep 20, 2016 at 03:46:38PM +0200, Andrea Arcangeli wrote:
>
> - vma_rb_erase(vma, &mm->mm_rb);
> + if (has_prev)
> + vma_rb_erase_ignore(vma, &mm->mm_rb, ignore);
> + else
> + vma_rb_erase_ignore(vma, &mm->mm_rb, ignore);
> next = vma->vm_next;
> if (has_prev)
Once this is confirmed as a false positive, the above can get a no-op
cleanup before merging, or I can do a proper resubmit with this bit
cleaned up:
diff --git a/mm/mmap.c b/mm/mmap.c
index c682dee..0c5f6f7 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -627,10 +627,7 @@ static __always_inline void __vma_unlink_common(struct mm_struct *mm,
{
struct vm_area_struct *next;
- if (has_prev)
- vma_rb_erase_ignore(vma, &mm->mm_rb, ignore);
- else
- vma_rb_erase_ignore(vma, &mm->mm_rb, ignore);
+ vma_rb_erase_ignore(vma, &mm->mm_rb, ignore);
next = vma->vm_next;
if (has_prev)
prev->vm_next = next;
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: [xiaolong.ye@intel.com: [mm] 0331ab667f: kernel BUG at mm/mmap.c:327!]
2016-09-20 13:46 [xiaolong.ye@intel.com: [mm] 0331ab667f: kernel BUG at mm/mmap.c:327!] Andrea Arcangeli
2016-09-20 13:55 ` Andrea Arcangeli
@ 2016-09-21 0:49 ` Michel Lespinasse
2016-09-21 16:13 ` Andrea Arcangeli
1 sibling, 1 reply; 4+ messages in thread
From: Michel Lespinasse @ 2016-09-21 0:49 UTC (permalink / raw)
To: Andrea Arcangeli; +Cc: Hugh Dickins, Andrew Morton, Rik van Riel, linux-mm
On Tue, Sep 20, 2016 at 6:46 AM, Andrea Arcangeli <aarcange@redhat.com>
wrote:
> Hello Michel,
>
Hi Andrea, nice hearing from you :)
> I altered the vma_adjust code and it's triggering what looks like to
> be a false positive in vma_rb_erase->validate_mm_rb with
> CONFIG_DEBUG_VM_RB=y.
>
> So what happens is normally remove_next == 1 or == 2, and set
> vma->vm_end to next->vm_end and then call validate_mm_rb(next) and it
> passes and then unlink "next" (removed from vm_next/prev and rbtree).
>
> I introduced a new case to fix a bug remove_next == 3 that actually
> removes "vma" and sets next->vm_start = vma->vm_start.
>
> So the old code was always doing:
>
> vma->vm_end = next->vm_end
> vma_rb_erase(next) // in __vma_unlink
> vma->vm_next = next->vm_next // in __vma_unlink
> next = vma->vm_next
> vma_gap_update(next)
>
> The new code still does the above for remove_next == 1 and 2, but for
> remove_next ==3 it has been changed and it does:
>
> next->vm_start = vma->vm_start
> vma_rb_erase(vma) // in __vma_unlink
> vma_gap_update(next)
>
> However it bugs out in vma_rb_erase(vma) because next->vm_start was
> reduced. However I tend to think what I'm executing is correct.
>
It sounds like the gaps get temporarily out of sync, which is not an actual
problem as long as they get fixed before releasing the appropriate locks
(which you can verify by checking if the validate_mm() call at the end of
vma_adjust() still passes).
I'm guessing that for the update you're doing, the validate_mm_rb call
within vma_rb_erase may need to ignore vma->next rather than vma itself.
> It's pointless to call vma_gap_update before I can call vm_rb_erase
> anyway so certainly I can't fix it that way. I'm forced to remove
> "vma" from the rbtree before I can call vma_gap_update(next).
>
>
> So I did other tests:
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 27f0509..a38c8a0 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -400,15 +400,9 @@ static inline void vma_rb_insert(struct vm_area_struct *vma,
> rb_insert_augmented(&vma->vm_rb, root, &vma_gap_callbacks);
> }
>
> -static void vma_rb_erase(struct vm_area_struct *vma, struct rb_root *root)
> +static void __vma_rb_erase(struct vm_area_struct *vma, struct rb_root *root)
> {
> /*
> - * All rb_subtree_gap values must be consistent prior to erase,
> - * with the possible exception of the vma being erased.
> - */
> - validate_mm_rb(root, vma);
> -
> - /*
> * Note rb_erase_augmented is a fairly large inline function,
> * so make sure we instantiate it only once with our desired
> * augmented rbtree callbacks.
> @@ -416,6 +410,18 @@ static void vma_rb_erase(struct vm_area_struct *vma, struct rb_root *root)
> rb_erase_augmented(&vma->vm_rb, root, &vma_gap_callbacks);
> }
>
> +static __always_inline void vma_rb_erase(struct vm_area_struct *vma,
> + struct rb_root *root)
> +{
> + /*
> + * All rb_subtree_gap values must be consistent prior to erase,
> + * with the possible exception of the vma being erased.
> + */
> + validate_mm_rb(root, vma);
> +
> + __vma_rb_erase(vma, root);
> +}
> +
> /*
> * vma has some anon_vma assigned, and is already inserted on that
> * anon_vma's interval trees.
> @@ -606,7 +612,10 @@ static __always_inline void __vma_unlink_common(struct mm_struct *mm,
> {
> struct vm_area_struct *next;
>
> - vma_rb_erase(vma, &mm->mm_rb);
> + if (has_prev)
> + vma_rb_erase(vma, &mm->mm_rb);
> + else
> + __vma_rb_erase(vma, &mm->mm_rb);
> next = vma->vm_next;
> if (has_prev)
> prev->vm_next = next;
> @@ -892,9 +901,11 @@ again:
> end = next->vm_end;
> goto again;
> }
> - else if (next)
> + else if (next) {
> vma_gap_update(next);
> - else
> + if (remove_next == 3)
> + validate_mm_rb(&mm->mm_rb, next);
> + } else
> mm->highest_vm_end = end;
> }
> if (insert && file)
>
>
> The above shifts the validate_mm_rb(next) for the remove_next == 3
> case from before the rb_removal of "vma" to after vma_gap_update is
> called on "next". This works fine.
>
> So if you agree this is a false positive of CONFIG_DEBUG_VM_RB and
> there was no actual bug, I just suggest to shut off the warning by
> telling validate_mm_rb not to ignore the vma that is being removed but
> the next one, if the next->vm_start was reduced to overlap over the
> vma that is being removed.
>
I haven't looked in enough detail, but this seems workable. The important
part is that validate_mm must pass at the end of the update. Any other
intermediate checks are secondary - don't feel bad about overriding them if
they get in the way :)
> This shut off the warning just fine for me and it leaves the
> validation in place and always enabled. Just it skips the check on the
> next vma that was updated instead of the one that is being removed if
> it was the next one that had next->vm_start reduced.
>
> On a side note I also noticed "mm->highest_vm_end = end" is erroneous,
> it should be VM_WARN_ON(mm->highest_vm_end != end) but that's
> offtopic.
>
> So this would be the patch I'd suggest to shut off the false positive,
> it's a noop when CONFIG_DEBUG_VM_RB=n.
>
> From fc256d7f71cd6295a5258387c0cb2af9134d16a2 Mon Sep 17 00:00:00 2001
> From: Andrea Arcangeli <aarcange@redhat.com>
> Date: Tue, 20 Sep 2016 15:01:33 +0200
> Subject: [PATCH 1/1] mm: vma_merge: correct false positive from
> __vma_unlink->validate_mm_rb
>
> The old code was always doing:
>
> vma->vm_end = next->vm_end
> vma_rb_erase(next) // in __vma_unlink
> vma->vm_next = next->vm_next // in __vma_unlink
> next = vma->vm_next
> vma_gap_update(next)
>
> The new code still does the above for remove_next == 1 and 2, but for
> remove_next == 3 it has been changed and it does:
>
> next->vm_start = vma->vm_start
> vma_rb_erase(vma) // in __vma_unlink
> vma_gap_update(next)
>
> In the latter case, while unlinking "vma", validate_mm_rb() is told to
> ignore "vma" that is being removed, but next->vm_start was reduced
> instead. So for the new case, to avoid the false positive from
> validate_mm_rb, it should be "next" that is ignored when "vma" is
> being unlinked.
>
> "vma" and "next" in the above comment, considered pre-swap().
>
> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
>
Still confused by some parts of the proposed patch:
> @@ -600,11 +620,15 @@ static void __insert_vm_struct(struct mm_struct *mm, struct vm_area_struct *vma)
> static __always_inline void __vma_unlink_common(struct mm_struct *mm,
> struct vm_area_struct *vma,
> struct vm_area_struct *prev,
> - bool has_prev)
> + bool has_prev,
> + struct vm_area_struct *ignore)
> {
> struct vm_area_struct *next;
>
> - vma_rb_erase(vma, &mm->mm_rb);
> + if (has_prev)
> + vma_rb_erase_ignore(vma, &mm->mm_rb, ignore);
> + else
> + vma_rb_erase_ignore(vma, &mm->mm_rb, ignore);
> next = vma->vm_next;
> if (has_prev)
> prev->vm_next = next;
>
You seem to have the same function call on both sides of the if ???
> @@ -626,13 +650,7 @@ static inline void __vma_unlink_prev(struct mm_struct *mm,
> struct vm_area_struct *vma,
> struct vm_area_struct *prev)
> {
> - __vma_unlink_common(mm, vma, prev, true);
> -}
> -
> -static inline void __vma_unlink(struct mm_struct *mm,
> - struct vm_area_struct *vma)
> -{
> - __vma_unlink_common(mm, vma, NULL, false);
> + __vma_unlink_common(mm, vma, prev, true, vma);
> }
>
> /*
>
I'm confused as to why some of the __vma_unlink_common parameters change,
other than just adding the ignore parameter.
Sorry this is not a full review - but I do agree on the general principle
of working around the intermediate checks in any way you need as long as
validate_mm passes when you're done modifying the vma structures :)
Hope this helps,
* Re: [xiaolong.ye@intel.com: [mm] 0331ab667f: kernel BUG at mm/mmap.c:327!]
2016-09-21 0:49 ` Michel Lespinasse
@ 2016-09-21 16:13 ` Andrea Arcangeli
0 siblings, 0 replies; 4+ messages in thread
From: Andrea Arcangeli @ 2016-09-21 16:13 UTC (permalink / raw)
To: Michel Lespinasse; +Cc: Hugh Dickins, Andrew Morton, Rik van Riel, linux-mm
On Tue, Sep 20, 2016 at 05:49:01PM -0700, Michel Lespinasse wrote:
> Hi Andrea, nice hearing from you :)
Same from my part :)
> It sounds like the gaps get temporarily out of sync, which is not an actual
> problem as long as they get fixed before releasing the appropriate locks
> (which you can verify by checking if the validate_mm() call at the end of
> vma_adjust() still passes).
Ok I did this change to test it. It reports zero problems with the
patch applied that skips "next" instead of "vma" in the case that sets
next->vm_start = vma->vm_start.
diff --git a/mm/mmap.c b/mm/mmap.c
index 0c5f6f7..62b7273 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -915,9 +915,10 @@ again:
end = next->vm_end;
goto again;
}
- else if (next)
+ else if (next) {
vma_gap_update(next);
- else
+ validate_mm(mm);
+ } else
mm->highest_vm_end = end;
}
if (insert && file)
With this, validate_mm is always executed in case 8, which removes
"vma" instead of "next".
So I think this is definitive confirmation that there was no bug and
this was a false positive from DEBUG_VM_RB, which is fully corrected by
the incremental patch I sent yesterday.
> I'm guessing that for the update you're doing, the validate_mm_rb call
> within vma_rb_erase may need to ignore vma->next rather than vma itself.
Exactly, that's what the patch below does. Because vma->next->vm_start
was reduced to vma->vm_start and vma is still in the tree (I'm calling
the vma_rb_erase precisely to remove "vma").
> I haven't looked in enough detail, but this seems workable. The important
> part is that validate_mm must pass at the end of the update. Any other
> intermediate checks are secondary - don't feel bad about overriding them if
> they get in the way :)
I didn't shut off any check to correct the validation code after my
changes: I only shifted the "ignore" parameter from "vma" to "next"
like you suggested above.
> > struct vm_area_struct *next;
> >
> > - vma_rb_erase(vma, &mm->mm_rb);
> > + if (has_prev)
> > + vma_rb_erase_ignore(vma, &mm->mm_rb, ignore);
> > + else
> > + vma_rb_erase_ignore(vma, &mm->mm_rb, ignore);
> > next = vma->vm_next;
> > if (has_prev)
> > prev->vm_next = next;
> >
>
> You seem to have the same function call on both sides of the if ???
Never mind, that was a leftover, but the code was still correct. I
already sent a cleanup follow up patch to deduplicate the above if.
>
>
> > @@ -626,13 +650,7 @@ static inline void __vma_unlink_prev(struct mm_struct *mm,
> > struct vm_area_struct *vma,
> > struct vm_area_struct *prev)
> > {
> > - __vma_unlink_common(mm, vma, prev, true);
> > -}
> > -
> > -static inline void __vma_unlink(struct mm_struct *mm,
> > - struct vm_area_struct *vma)
> > -{
> > - __vma_unlink_common(mm, vma, NULL, false);
> > + __vma_unlink_common(mm, vma, prev, true, vma);
> > }
> >
> > /*
> >
>
> confused as to why some of the __vma_unlink_common parameters change, other
> than just adding the ignore parameter
That changes __vma_unlink_prev, it's just the patch that is
confusing. I dropped __vma_unlink entirely and I call
__vma_unlink_common directly now, in order to pass the different
"ignore" parameter to it.
The real change to __vma_unlink_prev is this:
> > - __vma_unlink_common(mm, vma, prev, true);
> > + __vma_unlink_common(mm, vma, prev, true, vma);
Which only adds the "same" ignore parameter.
In case 8, when I remove "vma" instead of "next", I have no prev for
vma, and vma->vm_prev may in fact be null. So I can't call
__vma_unlink_prev; I have to call the common version directly, which is
capable of doing an unlink without a prev guaranteed non-null.
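The list side of such an unlink can be sketched as follows. This is a
userspace model under assumptions: toy_vma, toy_mm and
toy_unlink_common are hypothetical names, the rbtree erase and the
"ignore" argument are left out, and the else branch is an assumption
about what an unlink without a guaranteed prev has to do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_vma {
	struct toy_vma *vm_prev, *vm_next;
};

struct toy_mm {
	struct toy_vma *mmap;	/* head of the sorted vma list */
};

/*
 * Unlink "vma" from the list. With has_prev the caller vouches that
 * "prev" is the non-NULL predecessor; without it the predecessor is
 * looked up and may be NULL, in which case the list head moves.
 */
static void toy_unlink_common(struct toy_mm *mm, struct toy_vma *vma,
			      struct toy_vma *prev, bool has_prev)
{
	struct toy_vma *next = vma->vm_next;

	if (has_prev) {
		prev->vm_next = next;
	} else {
		prev = vma->vm_prev;
		if (prev)
			prev->vm_next = next;
		else
			mm->mmap = next;
	}
	if (next)
		next->vm_prev = prev;
}
```

Calling it with has_prev == false on the first vma of the list moves
the head to its successor, which is the situation a prev-only helper
cannot handle.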
> Sorry this is not a full review - but I do agree on the general principle
> of working around the intermediate checks in any way you need as long as
> validate_mm passes when you're done modifying the vma structures :)
Thanks a lot for the quick review, and yes validate_mm passes if put
immediately after the vma_gap_update(next) as shown at the top of the
email, so it should be all good with this change that passes "next" as
"ignore" parameter, instead of "vma" when next->vm_start is reduced
(instead of vma->vm_end increased in all other cases).
And so there is no bug in the fix in -mm, this was just a false
positive debug check that needed an update to the validation code to
cope with the new code.
Andrea