All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
To: Linus Torvalds
	<torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
Cc: akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org,
	fweisbec-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org,
	rjw-KKrjLPT3xs0@public.gmane.org,
	linux-pm-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
	paul-inf54ven1CmVyaH7bEyXVA@public.gmane.org
Subject: Re: [PATCH UPDATED 03/10] threadgroup: extend threadgroup_lock() to cover exit and exec
Date: Sun, 27 Nov 2011 13:25:58 -0800	[thread overview]
Message-ID: <20111127212558.GE4266@google.com> (raw)
In-Reply-To: <20111127192155.GB4266-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>

Hello, Linus.

On Sun, Nov 27, 2011 at 11:21:55AM -0800, Tejun Heo wrote:
> On Thu, Nov 24, 2011 at 08:02:18PM -0800, Linus Torvalds wrote:
> >  - do you even *need* two separate locks at all? If you extend the
> > threadgroup_fork_lock to cover exec/exit too (and rename it), then why
> > does that separate cred_guard_mutex make sense at all?
> 
> That was mostly choosing the path of least resistance as both locks
> already existed and exec exclusion was added later on, but yeah there
> isn't much to lose by merging those two locks (do_exit nesting inside
> exec will need some care tho).  I'll try to unify the two locks.

Hmmm... I tried this but it seems more difficult than originally
expected.  The biggest obstacle is that we don't have _interruptible
and _killable rwsem ops.

As proposed, what we have is,

* fork and exit paths are read-protected by group_rwsem.  ie. forks
  and exits don't block each other.

* exec path is protected by cred_guard_mutex.  cred_guard_mutex also
  synchronizes parallel exec attempts.  (It's probably better to
  rename it to exec_mutex)

* thread_group_lock() makes the thread group stable by write-locking
  group_rwsem and grabbing cred_guard_mutex.  The former excludes
  forks and exits and the latter exec.

group_rwsem and cred_guard_mutex can be merged by,

* fork and exit paths read-lock group_rwsem as it currently does.

* exec path write-locks group_rwsem.  Note that it'll need to unlock
  in de_thread() while waiting for other threads to exit as exit path
  can't proceed while group_rwsem is write-locked.
  prepare_bprm_creds() should be updated to check signal_pending()
  after acquiring group_rwsem.

  This introduces synchronization between exec and fork/exit paths,
  but I don't think this is something we need to worry about as both
  fork-exec and exit-exec scalabilities don't matter.

* thread_group_lock() write-locks group_rwsem.

The problem is that cred_guard_mutex uses _interruptible/_killable
operations and rwsem doesn't have them, so cred_guard_mutex can't be
easily replaced with write-locking group_rwsem.

If the two locks can't be merged, under the proposed scheme, while not
exactly pretty, both fork/exit and exec paths go through single
locking and only the ones which want stable threadgroup need to grab
both locks, so IMHO it is at least reasonable.

Any better ideas?

Thanks.

-- 
tejun

WARNING: multiple messages have this Message-ID (diff)
From: Tejun Heo <tj@kernel.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: paul@paulmenage.org, rjw@sisk.pl, lizf@cn.fujitsu.com,
	linux-pm@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org,
	containers@lists.linux-foundation.org, fweisbec@gmail.com,
	matthltc@us.ibm.com, akpm@linux-foundation.org, oleg@redhat.com,
	kamezawa.hiroyu@jp.fujitsu.com
Subject: Re: [PATCH UPDATED 03/10] threadgroup: extend threadgroup_lock() to cover exit and exec
Date: Sun, 27 Nov 2011 13:25:58 -0800	[thread overview]
Message-ID: <20111127212558.GE4266@google.com> (raw)
In-Reply-To: <20111127192155.GB4266@google.com>

Hello, Linus.

On Sun, Nov 27, 2011 at 11:21:55AM -0800, Tejun Heo wrote:
> On Thu, Nov 24, 2011 at 08:02:18PM -0800, Linus Torvalds wrote:
> >  - do you even *need* two separate locks at all? If you extend the
> > threadgroup_fork_lock to cover exec/exit too (and rename it), then why
> > does that separate cred_guard_mutex make sense at all?
> 
> That was mostly choosing the path of least resistance as both locks
> already existed and exec exclusion was added later on, but yeah there
> isn't much to lose by merging those two locks (do_exit nesting inside
> exec will need some care tho).  I'll try to unify the two locks.

Hmmm... I tried this but it seems more difficult than originally
expected.  The biggest obstacle is that we don't have _interruptible
and _killable rwsem ops.

As proposed, what we have is,

* fork and exit paths are read-protected by group_rwsem.  ie. forks
  and exits don't block each other.

* exec path is protected by cred_guard_mutex.  cred_guard_mutex also
  synchronizes parallel exec attempts.  (It's probably better to
  rename it to exec_mutex)

* thread_group_lock() makes the thread group stable by write-locking
  group_rwsem and grabbing cred_guard_mutex.  The former excludes
  forks and exits and the latter exec.

group_rwsem and cred_guard_mutex can be merged by,

* fork and exit paths read-lock group_rwsem as it currently does.

* exec path write-locks group_rwsem.  Note that it'll need to unlock
  in de_thread() while waiting for other threads to exit as exit path
  can't proceed while group_rwsem is write-locked.
  prepare_bprm_creds() should be updated to check signal_pending()
  after acquiring group_rwsem.

  This introduces synchronization between exec and fork/exit paths,
  but I don't think this is something we need to worry about as both
  fork-exec and exit-exec scalabilities don't matter.

* thread_group_lock() write-locks group_rwsem.

The problem is that cred_guard_mutex uses _interruptible/_killable
operations and rwsem doesn't have them, so cred_guard_mutex can't be
easily replaced with write-locking group_rwsem.

If the two locks can't be merged, under the proposed scheme, while not
exactly pretty, both fork/exit and exec paths go through single
locking and only the ones which want stable threadgroup need to grab
both locks, so IMHO it is at least reasonable.

Any better ideas?

Thanks.

-- 
tejun

  parent reply	other threads:[~2011-11-27 21:25 UTC|newest]

Thread overview: 167+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-11-01 23:46 [PATCHSET] cgroup: stable threadgroup during attach & subsys methods consolidation Tejun Heo
2011-11-01 23:46 ` Tejun Heo
2011-11-01 23:46 ` [PATCH 01/10] cgroup: add cgroup_root_mutex Tejun Heo
2011-11-01 23:46   ` Tejun Heo
     [not found]   ` <1320191193-8110-2-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2011-11-04  8:38     ` KAMEZAWA Hiroyuki
2011-11-04  8:38   ` KAMEZAWA Hiroyuki
2011-11-04  8:38   ` KAMEZAWA Hiroyuki
2011-11-01 23:46 ` [PATCH 02/10] threadgroup: rename signal->threadgroup_fork_lock to ->group_rwsem Tejun Heo
2011-11-01 23:46 ` [PATCH 03/10] threadgroup: extend threadgroup_lock() to cover exit and exec Tejun Heo
2011-11-01 23:46 ` Tejun Heo
2011-11-04  8:45   ` KAMEZAWA Hiroyuki
2011-11-04  8:45   ` KAMEZAWA Hiroyuki
2011-11-13 16:44   ` Frederic Weisbecker
     [not found]   ` <1320191193-8110-4-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2011-11-04  8:45     ` KAMEZAWA Hiroyuki
2011-11-13 16:44     ` Frederic Weisbecker
2011-11-13 16:44       ` Frederic Weisbecker
2011-11-14 13:54       ` Frederic Weisbecker
2011-11-14 13:54         ` Frederic Weisbecker
2011-11-21 22:03         ` Tejun Heo
2011-11-21 22:03           ` Tejun Heo
     [not found]           ` <20111121220326.GM25776-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2011-11-23 14:34             ` Frederic Weisbecker
2011-11-23 14:34               ` Frederic Weisbecker
2011-11-23 14:34           ` Frederic Weisbecker
2011-11-21 22:03         ` Tejun Heo
2011-11-14 13:54       ` Frederic Weisbecker
2011-11-21 21:58       ` Tejun Heo
2011-11-21 21:58         ` Tejun Heo
     [not found]         ` <20111121215839.GL25776-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2011-11-23 14:02           ` Frederic Weisbecker
2011-11-23 14:02             ` Frederic Weisbecker
2011-11-24 21:22             ` Tejun Heo
2011-11-24 21:22               ` Tejun Heo
     [not found]             ` <20111123140139.GB10669-oHC15RC7JGTpAmv0O++HtFaTQe2KTcn/@public.gmane.org>
2011-11-24 21:22               ` Tejun Heo
2011-11-23 14:02         ` Frederic Weisbecker
2011-11-21 21:58       ` Tejun Heo
2011-11-13 18:20     ` Frederic Weisbecker
2011-11-24 22:50     ` [PATCH UPDATED " Tejun Heo
2011-11-13 18:20   ` [PATCH " Frederic Weisbecker
2011-11-13 18:20     ` Frederic Weisbecker
2011-11-24 22:50   ` [PATCH UPDATED " Tejun Heo
2011-11-24 22:50     ` Tejun Heo
2011-11-25  4:02     ` Linus Torvalds
2011-11-25  4:02       ` Linus Torvalds
2011-11-27 19:21       ` Tejun Heo
     [not found]       ` <CA+55aFxq1wztMzYhKaY5RHazLBDz4pSXUgiGzTj2wA6EJcDbAw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-11-27 19:21         ` Tejun Heo
2011-11-27 19:21           ` Tejun Heo
2011-11-27 21:25           ` Tejun Heo
     [not found]           ` <20111127192155.GB4266-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2011-11-27 21:25             ` Tejun Heo [this message]
2011-11-27 21:25               ` Tejun Heo
     [not found]               ` <20111127212558.GE4266-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2011-12-01 19:29                 ` Tejun Heo
2011-12-01 19:29               ` Tejun Heo
2011-12-01 19:29                 ` Tejun Heo
2011-11-25 14:01     ` Frederic Weisbecker
2011-11-25 14:01       ` Frederic Weisbecker
     [not found]       ` <20111125140136.GC23307-oHC15RC7JGTpAmv0O++HtFaTQe2KTcn/@public.gmane.org>
2011-11-27 19:30         ` Tejun Heo
2011-11-27 19:30           ` Tejun Heo
     [not found]           ` <20111127193001.GC4266-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2011-12-02 16:28             ` Frederic Weisbecker
2011-12-02 16:28           ` Frederic Weisbecker
2011-12-02 16:28             ` Frederic Weisbecker
     [not found]             ` <20111202162753.GA19752-oHC15RC7JGTpAmv0O++HtFaTQe2KTcn/@public.gmane.org>
2011-12-05 18:43               ` Tejun Heo
2011-12-05 18:43                 ` Tejun Heo
     [not found]                 ` <20111205184315.GJ627-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2011-12-07 15:30                   ` Frederic Weisbecker
2011-12-07 15:30                 ` Frederic Weisbecker
2011-12-07 15:30                   ` Frederic Weisbecker
2011-12-07 18:22                   ` Tejun Heo
2011-12-07 18:22                     ` Tejun Heo
     [not found]                     ` <20111207182214.GA7610-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2011-12-08 20:50                       ` [PATCH UPDATED AGAIN " Tejun Heo
2011-12-08 20:50                     ` Tejun Heo
2011-12-08 20:50                       ` Tejun Heo
2011-12-09 23:42                       ` Frederic Weisbecker
     [not found]                       ` <20111208205055.GB12108-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2011-12-09 23:42                         ` Frederic Weisbecker
2011-12-09 23:42                           ` Frederic Weisbecker
2011-12-13  1:33                           ` Tejun Heo
2011-12-13  1:33                           ` Tejun Heo
2011-12-13  1:33                           ` Tejun Heo
2011-12-13  2:17                             ` Tejun Heo
     [not found]                             ` <20111213013334.GC25802-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2011-12-13  2:17                               ` Tejun Heo
2011-12-13  2:17                                 ` Tejun Heo
     [not found]                   ` <20111207153046.GC13252-oHC15RC7JGTpAmv0O++HtFaTQe2KTcn/@public.gmane.org>
2011-12-07 18:22                     ` [PATCH UPDATED " Tejun Heo
2011-12-05 18:43             ` Tejun Heo
2011-11-27 19:30       ` Tejun Heo
     [not found]     ` <20111124225054.GA14828-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2011-11-25  4:02       ` Linus Torvalds
2011-11-25 14:01       ` Frederic Weisbecker
2011-11-01 23:46 ` [PATCH 04/10] cgroup: always lock threadgroup during migration Tejun Heo
2011-11-01 23:46   ` Tejun Heo
2011-11-04  8:54   ` KAMEZAWA Hiroyuki
2011-11-04  8:54   ` KAMEZAWA Hiroyuki
2011-11-04 15:21     ` Tejun Heo
     [not found]     ` <20111104175413.30afaf8e.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
2011-11-04 15:21       ` Tejun Heo
2011-11-04 15:21         ` Tejun Heo
2011-11-14 18:46   ` Frederic Weisbecker
     [not found]   ` <1320191193-8110-5-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2011-11-04  8:54     ` KAMEZAWA Hiroyuki
2011-11-14 18:46     ` Frederic Weisbecker
2011-11-14 18:46       ` Frederic Weisbecker
2011-11-14 18:52       ` Frederic Weisbecker
2011-11-14 18:52       ` Frederic Weisbecker
2011-11-14 18:52         ` Frederic Weisbecker
2011-11-21 22:05         ` Tejun Heo
2011-11-21 22:05         ` Tejun Heo
2011-11-21 22:05           ` Tejun Heo
2011-11-01 23:46 ` [PATCH 05/10] cgroup: subsys->attach_task() should be called after migration Tejun Heo
     [not found]   ` <1320191193-8110-6-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2011-11-14 20:06     ` Frederic Weisbecker
2011-11-14 20:06   ` Frederic Weisbecker
2011-11-14 20:06     ` Frederic Weisbecker
2011-11-21 22:04     ` Tejun Heo
2011-11-21 22:04       ` Tejun Heo
2011-11-21 22:04     ` Tejun Heo
2011-11-01 23:46 ` Tejun Heo
2011-11-01 23:46 ` [PATCH 06/10] cgroup: improve old cgroup handling in cgroup_attach_proc() Tejun Heo
     [not found] ` <1320191193-8110-1-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2011-11-01 23:46   ` [PATCH 01/10] cgroup: add cgroup_root_mutex Tejun Heo
2011-11-01 23:46   ` [PATCH 02/10] threadgroup: rename signal->threadgroup_fork_lock to ->group_rwsem Tejun Heo
2011-11-01 23:46     ` Tejun Heo
     [not found]     ` <1320191193-8110-3-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2011-11-04  8:40       ` KAMEZAWA Hiroyuki
2011-11-04  8:40     ` KAMEZAWA Hiroyuki
2011-11-04 15:16       ` Tejun Heo
2011-11-04 15:16         ` Tejun Heo
     [not found]       ` <20111104174032.e0c4fc11.kamezawa.hiroyu-+CUm20s59erQFUHtdCDX3A@public.gmane.org>
2011-11-04 15:16         ` Tejun Heo
2011-11-04  8:40     ` KAMEZAWA Hiroyuki
2011-11-01 23:46   ` [PATCH 03/10] threadgroup: extend threadgroup_lock() to cover exit and exec Tejun Heo
2011-11-01 23:46   ` [PATCH 04/10] cgroup: always lock threadgroup during migration Tejun Heo
2011-11-01 23:46   ` [PATCH 05/10] cgroup: subsys->attach_task() should be called after migration Tejun Heo
2011-11-01 23:46   ` [PATCH 06/10] cgroup: improve old cgroup handling in cgroup_attach_proc() Tejun Heo
2011-11-01 23:46     ` Tejun Heo
2011-11-14 20:37     ` Frederic Weisbecker
     [not found]     ` <1320191193-8110-7-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2011-11-14 20:37       ` Frederic Weisbecker
2011-11-14 20:37         ` Frederic Weisbecker
2011-11-01 23:46   ` [PATCH 07/10] cgroup: introduce cgroup_taskset and use it in subsys->can_attach(), cancel_attach() and attach() Tejun Heo
2011-11-01 23:46     ` Tejun Heo
     [not found]     ` <1320191193-8110-8-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2011-11-14 21:16       ` Frederic Weisbecker
2011-11-14 21:16         ` Frederic Weisbecker
2011-11-14 21:16     ` Frederic Weisbecker
2011-11-01 23:46   ` [PATCH 08/10] cgroup: don't use subsys->can_attach_task() or ->attach_task() Tejun Heo
2011-11-01 23:46   ` [PATCH 09/10] cgroup, cpuset: don't use ss->pre_attach() Tejun Heo
2011-11-01 23:46     ` Tejun Heo
2011-11-15  0:51     ` Frederic Weisbecker
     [not found]     ` <1320191193-8110-10-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2011-11-15  0:51       ` Frederic Weisbecker
2011-11-15  0:51         ` Frederic Weisbecker
2011-11-01 23:46   ` [PATCH 10/10] cgroup: kill subsys->can_attach_task(), pre_attach() and attach_task() Tejun Heo
2011-11-01 23:46     ` Tejun Heo
     [not found]     ` <1320191193-8110-11-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2011-11-04  9:10       ` KAMEZAWA Hiroyuki
2011-11-15  0:54       ` Frederic Weisbecker
2011-11-15  0:54         ` Frederic Weisbecker
2011-11-04  9:10     ` KAMEZAWA Hiroyuki
2011-11-04  9:10     ` KAMEZAWA Hiroyuki
2011-11-15  0:54     ` Frederic Weisbecker
2011-11-21 22:07   ` [PATCHSET] cgroup: stable threadgroup during attach & subsys methods consolidation Tejun Heo
2011-11-21 22:07     ` Tejun Heo
2011-11-22  2:27     ` Li Zefan
     [not found]     ` <20111121220719.GP25776-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2011-11-22  2:27       ` Li Zefan
2011-11-22  2:27         ` Li Zefan
     [not found]         ` <4ECB089C.3080208-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
2011-11-22 16:20           ` Tejun Heo
2011-11-22 16:20             ` Tejun Heo
2011-11-22 16:20         ` Tejun Heo
2011-11-24 22:51   ` Tejun Heo
2011-11-01 23:46 ` [PATCH 07/10] cgroup: introduce cgroup_taskset and use it in subsys->can_attach(), cancel_attach() and attach() Tejun Heo
2011-11-01 23:46 ` [PATCH 08/10] cgroup: don't use subsys->can_attach_task() or ->attach_task() Tejun Heo
2011-11-01 23:46   ` Tejun Heo
2011-11-04  9:08   ` KAMEZAWA Hiroyuki
2011-11-04  9:08   ` KAMEZAWA Hiroyuki
     [not found]   ` <1320191193-8110-9-git-send-email-tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2011-11-04  9:08     ` KAMEZAWA Hiroyuki
2011-11-14 23:54     ` Frederic Weisbecker
2011-11-14 23:54       ` Frederic Weisbecker
2011-11-14 23:54   ` Frederic Weisbecker
2011-11-01 23:46 ` [PATCH 09/10] cgroup, cpuset: don't use ss->pre_attach() Tejun Heo
2011-11-01 23:46 ` [PATCH 10/10] cgroup: kill subsys->can_attach_task(), pre_attach() and attach_task() Tejun Heo
2011-11-21 22:07 ` [PATCHSET] cgroup: stable threadgroup during attach & subsys methods consolidation Tejun Heo
2011-11-24 22:51 ` Tejun Heo
2011-11-24 22:51   ` Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20111127212558.GE4266@google.com \
    --to=tj-dgejt+ai2ygdnm+yrofe0a@public.gmane.org \
    --cc=akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org \
    --cc=containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org \
    --cc=fweisbec-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org \
    --cc=linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-pm-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org \
    --cc=oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
    --cc=paul-inf54ven1CmVyaH7bEyXVA@public.gmane.org \
    --cc=rjw-KKrjLPT3xs0@public.gmane.org \
    --cc=torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.