From: Robin Holt <holt@sgi.com>
To: Andrea Arcangeli <aarcange@redhat.com>
Cc: linux-mm@kvack.org
Subject: Re: Question/problem with mmu_notifier_unregister.
Date: Wed, 16 Jan 2013 14:00:18 -0600 [thread overview]
Message-ID: <20130116200018.GA3460@sgi.com> (raw)
In-Reply-To: <20130115162956.GH3438@sgi.com>
On Tue, Jan 15, 2013 at 10:29:56AM -0600, Robin Holt wrote:
> Andrea,
>
> On a community or upcoming distro based kernel, I have one XPMEM test
> which is failing. The following patch (with lots of context) fixes it.
> I do not yet understand the cause of the race condition. I did hope it
> would be obvious to you and save me the time of investigating further.
>
> The first test finds mn->hlist is on the chain. The new second test
> does not. I have verified that neither xpmem nor gru is calling to
> unregister the same mmu_notifier twice. It is very difficult to find the
> problem as the test case requires a very long time to trip. To increase
> the likelihood, I run many copies in parallel. Each is 2 processes each
> with one pthread. I run two per socket on an 8 core per socket system
> with 256 socket system. When one job hits the null pointer deref, I can
> have many thousands of lines of debug before the other jobs stop running.
> Each process has /dev/gru mapped into their address space. They have
> used xpmem to attach part of the address space of the other process.
> The tasks are exiting at approximately the same time. The race seems to
> be with one pthread hitting the final mmput at about the same time as the
> other task is getting into the filp_close->flush() callout.
>
> I have a couple other things to work on today, and will likely try to
> chase this failure more tomorrow. With this patch, I have not been able
> to trigger any other issues in many hours of testing. The test likewise
> runs for many hours on a RHEL 6.3 based system and a SLES11 SP2 based
> system so it might have something to do with the change in srcu locking.
Using code inspection, I think I understand this issue some more.
Assume two tasks, one calling mmu_notifier_unregister() as a result
of a filp_close() ->flush() callout (task A), and the other calling
mmu_notifier_release() from an mmput() (task B).
A B
t1 srcu_read_lock()
t2 if (!hlist_unhashed())
t3 srcu_read_lock()
t4 srcu_read_unlock()
t5 hlist_del...()
t6 synchronize_srcu()
t7 srcu_read_unlock()
t8 hlist_del_rcu() <--- NULL pointer deref.
Does this seem plausible? Am I missing something?
My earlier patch feels wrong as I think we will see two ->release()
calls are made, but I am not sure of that.
Thanks,
Robin
> diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
> index 8a5ac8c..e2c9827 100644
> --- a/mm/mmu_notifier.c
> +++ b/mm/mmu_notifier.c
> @@ -297,37 +299,38 @@ void mmu_notifier_unregister(struct mmu_notifier *mn, struct mm_struct *mm)
> if (!hlist_unhashed(&mn->hlist)) {
> /*
> * SRCU here will force exit_mmap to wait ->release to finish
> * before freeing the pages.
> */
> int id;
>
> id = srcu_read_lock(&srcu);
> /*
> * exit_mmap will block in mmu_notifier_release to
> * guarantee ->release is called before freeing the
> * pages.
> */
> if (mn->ops->release)
> mn->ops->release(mn, mm);
> srcu_read_unlock(&srcu, id);
>
> spin_lock(&mm->mmu_notifier_mm->lock);
> - hlist_del_rcu(&mn->hlist);
> + if (!hlist_unhashed(&mn->hlist))
> + hlist_del_rcu(&mn->hlist);
> spin_unlock(&mm->mmu_notifier_mm->lock);
> }
>
> /*
> * Wait any running method to finish, of course including
> * ->release if it was run by mmu_notifier_relase instead of us.
> */
> synchronize_srcu(&srcu);
>
> BUG_ON(atomic_read(&mm->mm_count) <= 0);
>
> mmdrop(mm);
> }
> EXPORT_SYMBOL_GPL(mmu_notifier_unregister);
>
> static int __init mmu_notifier_init(void)
> {
> return init_srcu_struct(&srcu);
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org. For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2013-01-16 20:00 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-01-15 16:29 Question/problem with mmu_notifier_unregister Robin Holt
2013-01-16 20:00 ` Robin Holt [this message]
2013-01-16 21:01 ` [PATCH] [Patch] mmu_notifier_unregister NULL Pointer deref fix Robin Holt
2013-01-17 2:45 ` Xiao Guangrong
2013-01-17 11:12 ` Robin Holt
2013-01-17 12:19 ` Xiao Guangrong
2013-01-17 13:45 ` Robin Holt
2013-01-18 2:42 ` Xiao Guangrong
2013-01-18 2:48 ` Robin Holt
2013-01-18 3:04 ` Xiao Guangrong
2013-01-18 12:14 ` Robin Holt
2013-01-18 12:51 ` Xiao Guangrong
2013-01-18 15:14 ` Robin Holt
2013-01-23 2:00 ` Wanpeng Li
2013-01-23 2:00 ` Wanpeng Li
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130116200018.GA3460@sgi.com \
--to=holt@sgi.com \
--cc=aarcange@redhat.com \
--cc=linux-mm@kvack.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.