All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Serge E. Hallyn" <serue@us.ibm.com>
To: Benjamin Blum <bblum@google.com>
Cc: linux-kernel@vger.kernel.org,
	containers@lists.linux-foundation.org, akpm@linux-foundation.org,
	lizf@cn.fujitsu.com, menage@google.com
Subject: Re: [PATCH 6/6] Makes procs file writable to move all threads by tgid at once
Date: Mon, 3 Aug 2009 13:55:56 -0500	[thread overview]
Message-ID: <20090803185556.GA8469@us.ibm.com> (raw)
In-Reply-To: <2f86c2480908031113y525b6cbdhe418b8a0364c7760@mail.gmail.com>

Quoting Benjamin Blum (bblum@google.com):
> On Mon, Aug 3, 2009 at 1:54 PM, Serge E. Hallyn<serue@us.ibm.com> wrote:
> > Quoting Ben Blum (bblum@google.com):
> > ...
> >> +static int cgroup_task_migrate(struct cgroup *cgrp, struct cgroup *oldcgrp,
> >> +                            struct task_struct *tsk, int guarantee)
> >> +{
> >> +     struct css_set *oldcg;
> >> +     struct css_set *newcg;
> >> +
> >> +     /*
> >> +      * get old css_set. we need to take task_lock and refcount it, because
> >> +      * an exiting task can change its css_set to init_css_set and drop its
> >> +      * old one without taking cgroup_mutex.
> >> +      */
> >> +     task_lock(tsk);
> >> +     oldcg = tsk->cgroups;
> >> +     get_css_set(oldcg);
> >> +     task_unlock(tsk);
> >> +     /*
> >> +      * locate or allocate a new css_set for this task. 'guarantee' tells
> >> +      * us whether or not we are sure that a new css_set already exists;
> >> +      * in that case, we are not allowed to fail, as we won't need malloc.
> >> +      */
> >> +     if (guarantee) {
> >> +             /*
> >> +              * our caller promises us that the css_set we want already
> >> +              * exists, so we use find_existing_css_set directly.
> >> +              */
> >> +             struct cgroup_subsys_state *template[CGROUP_SUBSYS_COUNT];
> >> +             read_lock(&css_set_lock);
> >> +             newcg = find_existing_css_set(oldcg, cgrp, template);
> >> +             BUG_ON(!newcg);
> >> +             get_css_set(newcg);
> >> +             read_unlock(&css_set_lock);
> >> +     } else {
> >> +             might_sleep();
> >
> > So cgroup_task_migrate() might sleep, but
> >
> > ...
> >
> >
> >> +     down_write(&leader->cgroup_fork_mutex);
> >> +     rcu_read_lock();
> >> +     list_for_each_entry_rcu(tsk, &leader->thread_group, thread_group) {
> >> +             /* leave current thread as it is if it's already there */
> >> +             oldcgrp = task_cgroup(tsk, subsys_id);
> >> +             if (cgrp == oldcgrp)
> >> +                     continue;
> >> +             /* we don't care whether these threads are exiting */
> >> +             retval = cgroup_task_migrate(cgrp, oldcgrp, tsk, 1);
> >
> > Here it is called under rcu_read_lock().
> 
> You'll notice the fourth argument, which tells cgroup_task_migrate

Hmmm, in my defense one would notice it more readily if the caller used a
meaningful #define instead of '1'.

> whether the css_set is guaranteed or not. If we say we've already got
> it covered, the might_sleep section doesn't happen.
> 
> >> -void cgroup_fork(struct task_struct *child)
> >> +void cgroup_fork(struct task_struct *child, int clone_flags)
> >>  {
> >> +     if (clone_flags & CLONE_THREAD)
> >> +             down_read(&current->group_leader->cgroup_fork_mutex);
> >> +     else
> >> +             init_rwsem(&child->cgroup_fork_mutex);
> >
> > I'm also worried about the overhead here on what should be a
> > fast case, CLONE_THREAD.  Have you done any benchmarking of
> > one thread spawning a bunch of others?
> 
> Should be strictly better as this is making the rwsem local to the
> threadgroup - at least in comparison to the previous edition of this
> patch which had it as a global lock.
> 
> > What *exactly* is it we are protecting with cgroup_fork_mutex?
> > 'fork' (as the name implies) is not a good answer, since we should be
> > protecting data, not code.  If it is solely tsk->cgroups, then perhaps
> > we should in fact try switching to (s?)rcu.  Then cgroup_fork() could
> > just do rcu_read_lock, while cgroup_task_migrate() would make the change
> > under a spinlock (to protect against concurrent cgroup_task_migrate()s),
> > and using rcu_assign_pointer to let cgroup_fork() see consistent data
> > either before or after the update...  That might mean that any checks done
> > before completing the migrate which involve the # of tasks might become
> > invalidated before the migration completes?  Seems acceptable (since
> > it'll be a small overcharge at most and can be quickly remedied).
> 
> You'll notice where the rwsem is released - not until cgroup_post_fork
> or cgroup_fork_failed. It doesn't just protect the tsk->cgroups
> pointer, but rather guarantees atomicity between adjusting
> tsk->cgroups and attaching it to the cgroups lists with respect to the
> critical section in attach_proc. If you've a better name for the lock
> for such a race condition, do suggest.

No the name is pretty accurate - it's the lock itself I'm objecting
to.  Maybe it's the best we can do, though.

thanks,
-serge

  parent reply	other threads:[~2009-08-03 18:55 UTC|newest]

Thread overview: 100+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-07-31  1:51 [PATCH v2 0/6] CGroups: cgroup memberlist enhancement+fix Ben Blum
2009-07-31  1:51 ` [PATCH 1/6] Adds a read-only "procs" file similar to "tasks" that shows only unique tgids Ben Blum
     [not found] ` <20090731012908.27908.62208.stgit-/yCBOHwbXCxd3OlUiQof+WCaruZE5nAUZeezCHUQhQ4@public.gmane.org>
2009-07-31  1:51   ` Ben Blum
2009-07-31  1:51   ` [PATCH 2/6] Ensures correct concurrent opening/reading of pidlists across pid namespaces Ben Blum
2009-07-31  1:51   ` [PATCH 3/6] Quick vmalloc vs kmalloc fix to the case where array size is too large Ben Blum
2009-07-31  1:51     ` Ben Blum
2009-07-31  1:51   ` [PATCH 4/6] Changes css_set freeing mechanism to be under RCU Ben Blum
2009-07-31  1:51   ` [PATCH 5/6] Lets ss->can_attach and ss->attach do whole threadgroups at a time Ben Blum
2009-07-31  1:51   ` [PATCH 6/6] Makes procs file writable to move all threads by tgid at once Ben Blum
2009-07-31  1:51 ` [PATCH 2/6] Ensures correct concurrent opening/reading of pidlists across pid namespaces Ben Blum
2009-07-31  1:51 ` [PATCH 4/6] Changes css_set freeing mechanism to be under RCU Ben Blum
2009-07-31  1:51 ` [PATCH 5/6] Lets ss->can_attach and ss->attach do whole threadgroups at a time Ben Blum
2009-08-03  2:22   ` Li Zefan
2009-08-04  0:35     ` Benjamin Blum
     [not found]     ` <4A7649E1.4000200-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
2009-08-04  0:35       ` Benjamin Blum
     [not found]   ` <20090731015149.27908.25403.stgit-/yCBOHwbXCxd3OlUiQof+WCaruZE5nAUZeezCHUQhQ4@public.gmane.org>
2009-08-03  2:22     ` Li Zefan
2009-07-31  1:51 ` [PATCH 6/6] Makes procs file writable to move all threads by tgid at once Ben Blum
2009-08-03  3:00   ` Li Zefan
     [not found]     ` <4A7652E7.4020206-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
2009-08-04  0:56       ` Benjamin Blum
2009-08-04  0:56     ` Benjamin Blum
     [not found]       ` <2f86c2480908031756j557e7aebmbf7951da6a1aadb0-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2009-08-04  1:05         ` Paul Menage
2009-08-04  1:09         ` Li Zefan
2009-08-04  1:09           ` Li Zefan
     [not found]           ` <4A778A49.6040302-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
2009-08-04  1:19             ` Benjamin Blum
2009-08-04  1:19           ` Benjamin Blum
     [not found]             ` <2f86c2480908031819h2513cdb4tac3d6def3e0aa320-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2009-08-04  1:45               ` Li Zefan
2009-08-04  1:45             ` Li Zefan
2009-08-04  1:55               ` Paul Menage
     [not found]               ` <4A7792C4.5010504-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
2009-08-04  1:55                 ` Paul Menage
2009-08-04  1:05       ` Paul Menage
2009-08-04  1:11         ` Benjamin Blum
     [not found]         ` <6599ad830908031805y31136eceqeff0bab455100d6c-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2009-08-04  1:11           ` Benjamin Blum
     [not found]   ` <20090731015154.27908.9639.stgit-/yCBOHwbXCxd3OlUiQof+WCaruZE5nAUZeezCHUQhQ4@public.gmane.org>
2009-08-03  3:00     ` Li Zefan
2009-08-03 17:54     ` Serge E. Hallyn
2009-08-03 17:54   ` Serge E. Hallyn
     [not found]     ` <20090803175452.GA5481-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-08-03 18:07       ` Paul Menage
2009-08-03 18:13       ` Benjamin Blum
2009-08-03 18:07     ` Paul Menage
2009-08-03 18:13     ` Benjamin Blum
     [not found]       ` <2f86c2480908031113y525b6cbdhe418b8a0364c7760-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2009-08-03 18:55         ` Serge E. Hallyn
2009-08-03 18:55       ` Serge E. Hallyn [this message]
2009-08-03 19:45         ` Serge E. Hallyn
2009-08-03 19:55           ` Paul Menage
     [not found]             ` <6599ad830908031255j68ce047x7165bfefa62ed53c-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2009-08-04 14:01               ` Serge E. Hallyn
2009-08-04 21:40               ` Matt Helsley
2009-08-04 14:01             ` Serge E. Hallyn
2009-08-04 21:40             ` Matt Helsley
2009-08-04 21:40             ` Matt Helsley
     [not found]           ` <20090803194555.GA10158-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-08-03 19:55             ` Paul Menage
2009-08-04 18:48             ` Paul Menage
2009-08-04 18:48               ` Paul Menage
     [not found]               ` <6599ad830908041148h6d3f3e9bxfef9f3eedec0ab6d-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2009-08-04 19:01                 ` Serge E. Hallyn
2009-08-04 19:01                   ` Serge E. Hallyn
2009-08-04 19:14                 ` Benjamin Blum
2009-08-04 19:14                   ` Benjamin Blum
     [not found]                   ` <2f86c2480908041214r1f23c1b7q9a25b04e26c92a1a-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2009-08-04 19:28                     ` Paul Menage
2009-08-04 19:28                       ` Paul Menage
2009-08-05 10:20                       ` Louis Rilling
     [not found]                         ` <20090805102057.GT29252-Hu8+6S1rdjywhHL9vcZdMVaTQe2KTcn/@public.gmane.org>
2009-08-05 16:11                           ` Paul Menage
2009-08-05 16:11                             ` Paul Menage
2009-08-05 16:42                             ` Louis Rilling
2009-08-05 16:53                               ` Peter Zijlstra
     [not found]                               ` <20090805164218.GB26446-Hu8+6S1rdjywhHL9vcZdMVaTQe2KTcn/@public.gmane.org>
2009-08-05 16:53                                 ` Peter Zijlstra
2009-08-06  0:01                                 ` Benjamin Blum
2009-08-06  0:01                               ` Benjamin Blum
     [not found]                                 ` <2f86c2480908051701s57120404q475edbedb58cdca1-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2009-08-06  9:58                                   ` Louis Rilling
2009-08-06  9:58                                 ` Louis Rilling
2009-08-06 10:04                                   ` Louis Rilling
     [not found]                                   ` <20090806095854.GD26446-Hu8+6S1rdjywhHL9vcZdMVaTQe2KTcn/@public.gmane.org>
2009-08-06 10:04                                     ` Louis Rilling
2009-08-06 10:28                                     ` Paul Menage
2009-08-06 10:28                                       ` Paul Menage
     [not found]                                       ` <6599ad830908060328y21a008c1pc5ed5c27e0ec905d-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2009-08-06 10:34                                         ` Peter Zijlstra
2009-08-06 10:34                                           ` Peter Zijlstra
2009-08-06 10:42                                           ` Paul Menage
2009-08-06 10:42                                           ` Paul Menage
2009-08-06 11:02                                             ` Peter Zijlstra
2009-08-06 11:24                                               ` Paul Menage
2009-08-06 11:24                                                 ` Paul Menage
     [not found]                                                 ` <6599ad830908060424r72e1aa12g2b246785e7bc039c-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2009-08-06 11:39                                                   ` Peter Zijlstra
2009-08-06 11:39                                                 ` Peter Zijlstra
2009-08-06 15:19                                                   ` Paul E. McKenney
2009-08-06 15:19                                                     ` Paul E. McKenney
2009-08-06 15:24                                                     ` Peter Zijlstra
2009-08-06 15:37                                                       ` Paul E. McKenney
2009-08-06 15:37                                                       ` Paul E. McKenney
     [not found]                                                     ` <20090806151922.GB6747-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
2009-08-06 15:24                                                       ` Peter Zijlstra
     [not found]                                             ` <6599ad830908060342m1fc8cdd2me25af248a8e0e183-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2009-08-06 11:02                                               ` Peter Zijlstra
2009-08-06 11:24                                         ` Louis Rilling
2009-08-06 11:24                                           ` Louis Rilling
2009-08-06 11:40                                           ` Paul Menage
2009-08-06 14:54                                             ` Louis Rilling
     [not found]                                             ` <6599ad830908060440g2f6cbed6xdc54c7096cd3745e-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2009-08-06 14:54                                               ` Louis Rilling
2009-08-08  1:41                                               ` Benjamin Blum
2009-08-08  1:41                                             ` Benjamin Blum
     [not found]                                               ` <2f86c2480908071841h13009856hd8fcae167b1fadbf-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2009-08-08  1:51                                                 ` Benjamin Blum
2009-08-08  1:51                                               ` Benjamin Blum
     [not found]                                           ` <20090806112450.GF26446-Hu8+6S1rdjywhHL9vcZdMVaTQe2KTcn/@public.gmane.org>
2009-08-06 11:40                                             ` Paul Menage
     [not found]                             ` <6599ad830908050911t6f23f810i65fe8fe17f3ee698-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2009-08-05 16:42                               ` Louis Rilling
     [not found]                       ` <6599ad830908041228w67bc6f7fh57e28f244e1923b3-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2009-08-05 10:20                         ` Louis Rilling
     [not found]         ` <20090803185556.GA8469-r/Jw6+rmf7HQT0dZR+AlfA@public.gmane.org>
2009-08-03 19:45           ` Serge E. Hallyn

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090803185556.GA8469@us.ibm.com \
    --to=serue@us.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=bblum@google.com \
    --cc=containers@lists.linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lizf@cn.fujitsu.com \
    --cc=menage@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.