From: Balbir Singh <balbir@linux.vnet.ibm.com>
To: Paul Menage <menage@google.com>
Cc: Pavel Emelianov <xemul@openvz.org>,
Hugh Dickins <hugh@veritas.com>,
Sudhir Kumar <skumar@linux.vnet.ibm.com>,
YAMAMOTO Takashi <yamamoto@valinux.co.jp>,
lizf@cn.fujitsu.com, linux-kernel@vger.kernel.org,
taka@valinux.co.jp, linux-mm@kvack.org,
David Rientjes <rientjes@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Subject: Re: [-mm] Add an owner to the mm_struct (v2)
Date: Fri, 28 Mar 2008 23:40:44 +0530 [thread overview]
Message-ID: <47ED34A4.70604@linux.vnet.ibm.com> (raw)
In-Reply-To: <6599ad830803280838s19ffc366w1a950ebb12e2907b@mail.gmail.com>
Paul Menage wrote:
> On Fri, Mar 28, 2008 at 7:52 AM, Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
>> mm->owner_lock is there to protect mm->owner field from changing simultaneously
>> as tasks fork/exit.
>>
>
> But the *hardware* already does that for you - individual writes to
> pointers are already atomic operations and so will be serialized.
> Using a lock to guard something only does anything useful if at least
> one of the critical regions that takes the lock consists of more than
> a single atomic operation, or if you have a mixture of read sections
> and write sections. Now it's true that your critical region in
> mm_fork_init_owner() is more than a single atomic op, but I'm arguing
> below that it's a no-op. So that just leaves the single region
>
> spin_lock(&mm->owner_lock);
> mm->owner = new_owner;
> spin_unlock(&mm->owner_lock);
>
> which isn't observably different if you remove the spinlock.
>
At fork time, we can have do_fork() run in parallel and we need to protect
mm->owner, if several threads are created at the same time. We don't want to
overwrite mm->owner for each thread that is created.
>> Oh! yes.. my bad again. The check should have been p == p->thread_group, but
>> that is not required either. The check should now ideally be
>>
>> if (!(clone_flags & CLONE_VM))
>>
>
> OK, so if the new thread has its own mm (and hence will already have
> mm->owner set up to point to p in mm_init()) then we do:
>
>> + if (mm->owner != p)
>> + rcu_assign_pointer(mm->owner, p->group_leader);
>
> which is a no-op since we know mm->owner == p.
>
>> Yes.. I think we need to call it earlier.
>>
>
> No, I think we need to call it later - after we've cleared current->mm
> (from within task_lock(current)) - so we can't rely on p->mm in this
> function, we have to pass it in. If we call it before while
> current->mm == mm, then we risk a race where the (new or existing)
> owner exits and passes it back to us *after* we've done a check to see
> if we need to find a new owner. If we ensure that current->mm != mm
> before we call mm_update_next_owner(), then we know we're not a
> candidate for receiving the ownership if we don't have it already.
>
Yes and we could also check for flags & PF_EXITING
>> But there is no way to guarantee that, what is the new_owner exec's after we've
>> done the check and assigned. Won't we end up breaking the invariant? How about
>> we have mm_update_new_owner() call in exec_mmap() as well? That way, we can
>> still use owner_lock and keep the invariant.
>>
>
> Oops, I thought that exit_mm() already got called in the execve()
> path, but you're right, it doesn't.
>
> Yes, exit_mmap() should call mm_update_next_owner() after the call to
> task_unlock(), i.e. after it's set its new mm.
>
> So I need to express the invariant more carefully.
>
> What we need to preserve is that, for every mm at all times, mm->owner
> points to a valid task. So either:
>
> 1) mm->owner->mm == mm AND mm->owner will check to see whether it
> needs to pass ownership before it exits or execs.
>
> OR
>
> 2) mm->owner is the last user of mm and is about to free mm.
>
> OR
>
> 3) mm->owner is currently searching for another user of mm to pass the
> ownership to.
>
> In order to get from state 3 to state 1 safely we have to hold
> task_lock(new_owner). Otherwise we can race with an exit or exec in
> new_owner, resulting in a process that has already passed the point of
> checking current->mm->owner.
>
No.. like you said if we do it after current->mm has changed and is different
from mm, then it's safe to find a new owner. I still don't see why we need
task_lock(new_owner). Even if we have task_lock(new_owner), it can still exit or
exec later.
> I don't see why we need mm->owner_lock to maintain this invariant.
> (But am quite prepared to be proven wrong).
>
Why mix task_lock() to protect mm->owner? owner_lock can provide the protection
you are talking about.
--
Warm Regards,
Balbir Singh
Linux Technology Center
IBM, ISTL
WARNING: multiple messages have this Message-ID (diff)
From: Balbir Singh <balbir@linux.vnet.ibm.com>
To: Paul Menage <menage@google.com>
Cc: Pavel Emelianov <xemul@openvz.org>,
Hugh Dickins <hugh@veritas.com>,
Sudhir Kumar <skumar@linux.vnet.ibm.com>,
YAMAMOTO Takashi <yamamoto@valinux.co.jp>,
lizf@cn.fujitsu.com, linux-kernel@vger.kernel.org,
taka@valinux.co.jp, linux-mm@kvack.org,
David Rientjes <rientjes@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Subject: Re: [-mm] Add an owner to the mm_struct (v2)
Date: Fri, 28 Mar 2008 23:40:44 +0530 [thread overview]
Message-ID: <47ED34A4.70604@linux.vnet.ibm.com> (raw)
In-Reply-To: <6599ad830803280838s19ffc366w1a950ebb12e2907b@mail.gmail.com>
Paul Menage wrote:
> On Fri, Mar 28, 2008 at 7:52 AM, Balbir Singh <balbir@linux.vnet.ibm.com> wrote:
>> mm->owner_lock is there to protect mm->owner field from changing simultaneously
>> as tasks fork/exit.
>>
>
> But the *hardware* already does that for you - individual writes to
> pointers are already atomic operations and so will be serialized.
> Using a lock to guard something only does anything useful if at least
> one of the critical regions that takes the lock consists of more than
> a single atomic operation, or if you have a mixture of read sections
> and write sections. Now it's true that your critical region in
> mm_fork_init_owner() is more than a single atomic op, but I'm arguing
> below that it's a no-op. So that just leaves the single region
>
> spin_lock(&mm->owner_lock);
> mm->owner = new_owner;
> spin_unlock(&mm->owner_lock);
>
> which isn't observably different if you remove the spinlock.
>
At fork time, we can have do_fork() run in parallel and we need to protect
mm->owner, if several threads are created at the same time. We don't want to
overwrite mm->owner for each thread that is created.
>> Oh! yes.. my bad again. The check should have been p == p->thread_group, but
>> that is not required either. The check should now ideally be
>>
>> if (!(clone_flags & CLONE_VM))
>>
>
> OK, so if the new thread has its own mm (and hence will already have
> mm->owner set up to point to p in mm_init()) then we do:
>
>> + if (mm->owner != p)
>> + rcu_assign_pointer(mm->owner, p->group_leader);
>
> which is a no-op since we know mm->owner == p.
>
>> Yes.. I think we need to call it earlier.
>>
>
> No, I think we need to call it later - after we've cleared current->mm
> (from within task_lock(current)) - so we can't rely on p->mm in this
> function, we have to pass it in. If we call it before while
> current->mm == mm, then we risk a race where the (new or existing)
> owner exits and passes it back to us *after* we've done a check to see
> if we need to find a new owner. If we ensure that current->mm != mm
> before we call mm_update_next_owner(), then we know we're not a
> candidate for receiving the ownership if we don't have it already.
>
Yes and we could also check for flags & PF_EXITING
>> But there is no way to guarantee that, what is the new_owner exec's after we've
>> done the check and assigned. Won't we end up breaking the invariant? How about
>> we have mm_update_new_owner() call in exec_mmap() as well? That way, we can
>> still use owner_lock and keep the invariant.
>>
>
> Oops, I thought that exit_mm() already got called in the execve()
> path, but you're right, it doesn't.
>
> Yes, exit_mmap() should call mm_update_next_owner() after the call to
> task_unlock(), i.e. after it's set its new mm.
>
> So I need to express the invariant more carefully.
>
> What we need to preserve is that, for every mm at all times, mm->owner
> points to a valid task. So either:
>
> 1) mm->owner->mm == mm AND mm->owner will check to see whether it
> needs to pass ownership before it exits or execs.
>
> OR
>
> 2) mm->owner is the last user of mm and is about to free mm.
>
> OR
>
> 3) mm->owner is currently searching for another user of mm to pass the
> ownership to.
>
> In order to get from state 3 to state 1 safely we have to hold
> task_lock(new_owner). Otherwise we can race with an exit or exec in
> new_owner, resulting in a process that has already passed the point of
> checking current->mm->owner.
>
No.. like you said if we do it after current->mm has changed and is different
from mm, then it's safe to find a new owner. I still don't see why we need
task_lock(new_owner). Even if we have task_lock(new_owner), it can still exit or
exec later.
> I don't see why we need mm->owner_lock to maintain this invariant.
> (But am quite prepared to be proven wrong).
>
Why mix task_lock() to protect mm->owner? owner_lock can provide the protection
you are talking about.
--
Warm Regards,
Balbir Singh
Linux Technology Center
IBM, ISTL
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2008-03-28 18:14 UTC|newest]
Thread overview: 46+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-03-28 8:23 [-mm] Add an owner to the mm_struct (v2) Balbir Singh
2008-03-28 8:23 ` Balbir Singh
2008-03-28 9:41 ` Jiri Slaby
2008-03-28 9:41 ` Jiri Slaby
2008-03-28 9:43 ` Jiri Slaby
2008-03-28 9:43 ` Jiri Slaby
2008-03-28 10:11 ` Balbir Singh
2008-03-28 10:11 ` Balbir Singh
2008-03-28 10:48 ` KAMEZAWA Hiroyuki
2008-03-28 10:48 ` KAMEZAWA Hiroyuki
2008-03-28 10:51 ` Balbir Singh
2008-03-28 10:51 ` Balbir Singh
2008-03-28 11:06 ` KAMEZAWA Hiroyuki
2008-03-28 11:06 ` KAMEZAWA Hiroyuki
2008-03-28 10:55 ` KAMEZAWA Hiroyuki
2008-03-28 10:55 ` KAMEZAWA Hiroyuki
2008-03-28 10:52 ` Balbir Singh
2008-03-28 10:52 ` Balbir Singh
2008-03-28 11:04 ` Paul Menage
2008-03-28 11:04 ` Paul Menage
2008-03-28 11:15 ` KAMEZAWA Hiroyuki
2008-03-28 11:15 ` KAMEZAWA Hiroyuki
2008-03-28 11:21 ` KAMEZAWA Hiroyuki
2008-03-28 11:21 ` KAMEZAWA Hiroyuki
2008-03-28 11:01 ` Paul Menage
2008-03-28 11:01 ` Paul Menage
2008-03-28 12:36 ` Balbir Singh
2008-03-28 12:36 ` Balbir Singh
2008-03-28 12:54 ` Balbir Singh
2008-03-28 12:54 ` Balbir Singh
2008-03-28 14:06 ` Paul Menage
2008-03-28 14:06 ` Paul Menage
2008-03-28 14:05 ` Paul Menage
2008-03-28 14:05 ` Paul Menage
2008-03-28 14:52 ` Balbir Singh
2008-03-28 14:52 ` Balbir Singh
2008-03-28 15:38 ` Paul Menage
2008-03-28 15:38 ` Paul Menage
2008-03-28 18:10 ` Balbir Singh [this message]
2008-03-28 18:10 ` Balbir Singh
2008-03-28 18:52 ` Paul Menage
2008-03-28 18:52 ` Paul Menage
2008-03-29 1:02 ` Balbir Singh
2008-03-29 1:02 ` Balbir Singh
2008-03-29 5:46 ` Balbir Singh
2008-03-29 5:46 ` Balbir Singh
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=47ED34A4.70604@linux.vnet.ibm.com \
--to=balbir@linux.vnet.ibm.com \
--cc=akpm@linux-foundation.org \
--cc=hugh@veritas.com \
--cc=kamezawa.hiroyu@jp.fujitsu.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=lizf@cn.fujitsu.com \
--cc=menage@google.com \
--cc=rientjes@google.com \
--cc=skumar@linux.vnet.ibm.com \
--cc=taka@valinux.co.jp \
--cc=xemul@openvz.org \
--cc=yamamoto@valinux.co.jp \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.