Re: [patch 3/4] Container Freezer: Implement freezer cgroup subsystem

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Matt Helsley <matthltc@us.ibm.com>
To: Paul Menage <menage@google.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>, Pavel Machek <pavel@ucw.cz>,
	Linux-Kernel <linux-kernel@vger.kernel.org>,
	linux-pm@lists.linux-foundation.org,
	Linux Containers <containers@lists.linux-foundation.org>,
	Cedric Le Goater <clg@fr.ibm.com>
Subject: Re: [patch 3/4] Container Freezer: Implement freezer cgroup subsystem
Date: Mon, 07 Jul 2008 15:42:08 -0700	[thread overview]
Message-ID: <1215470528.9023.6.camel@localhost.localdomain> (raw)
In-Reply-To: <6599ad830806241427n4e174994oce21e93a034b7051@mail.gmail.com>


On Tue, 2008-06-24 at 14:27 -0700, Paul Menage wrote: 
> On Tue, Jun 24, 2008 at 6:58 AM, Matt Helsley <matthltc@us.ibm.com> wrote:
> > From: Cedric Le Goater <clg@fr.ibm.com>
> > Subject: [patch 3/4] Container Freezer: Implement freezer cgroup subsystem
> >
> > This patch implements a new freezer subsystem for Paul Menage's
> > control groups framework.
> 
> You can s/Paul Menage's// now that it's in mainline.

OK. Incidentally sorry for the delayed reply. Got so caught up in making
changes in response to your email that I neglected to reply sooner. I'll
be posting the changes shortly but first I want to address your earlier
mail.

> > +static const char *freezer_state_strs[] = {
> > +       "RUNNING",
> > +       "FREEZING",
> > +       "FROZEN",
> > +};
> > +
> > +/* Check and update whenever adding new freezer states. Currently is:
> > +   strlen("FREEZING") */
> > +#define STATE_MAX_STRLEN 8
> > +
> 
> That's a bit nasty ...
> 
> But hopefully it could go away when the write_string() method is
> available in cgroups? (See my patchset from earlier this week).

I've looked at this and I like it. I've changed the patches to use this
interface.

> > +
> > +struct cgroup_subsys freezer_subsys;
> > +
> > +/* Locking and lock ordering:
> > + *
> > + * can_attach(), cgroup_frozen():
> > + * rcu (task->cgroup, freezer->state)
> > + *
> > + * freezer_fork():
> > + * rcu (task->cgroup, freezer->state)
> > + *  freezer->lock
> > + *   task_lock
> > + *    sighand->siglock
> > + *
> > + * freezer_read():
> > + * rcu (freezer->state)
> > + *  freezer->lock (upgrade to write)
> > + *   read_lock css_set_lock
> > + *
> > + * freezer_write()
> > + * cgroup_lock
> > + *  rcu
> > + *   freezer->lock
> > + *    read_lock css_set_lock
> > + *     task_lock
> > + *      sighand->siglock
> > + *
> > + * freezer_create(), freezer_destroy():
> > + * cgroup_lock [ by cgroup core ]
> > + */
> 
> 
> > +static int freezer_can_attach(struct cgroup_subsys *ss,
> > +                             struct cgroup *new_cgroup,
> > +                             struct task_struct *task)
> > +{
> > +       struct freezer *freezer;
> > +       int retval = 0;
> > +
> > +       /*
> > +        * The call to cgroup_lock() in the freezer.state write method prevents
> > +        * a write to that file racing against an attach, and hence the
> > +        * can_attach() result will remain valid until the attach completes.
> > +        */
> > +       rcu_read_lock();
> > +       freezer = cgroup_freezer(new_cgroup);
> > +       if (freezer->state == STATE_FROZEN)
> > +               retval = -EBUSY;
> 
> Is it meant to be OK to move a task into a cgroup that's currently in
> the FREEZING state but not yet fully frozen?

Yes.

> > +       struct freezer *freezer;
> > +
> > +       rcu_read_lock(); /* needed to fetch task's cgroup
> > +                           can't use task_lock() here because
> > +                           freeze_task() grabs that */
> 
> I'm not sure that RCU is the right thing for this. All that the RCU
> lock will guarantee is that the freezer structure you get a pointer to
> doesn't go away. It doesn't guarantee that the task doesn't move
> cgroup, or that the cgroup doesn't get a freeze request via a write.
> But in this case, the fork callback is called before the task is added
> to the task_list/pidhash, or to its cgroups' linked lists. So it
> shouldn't be able to change groups. Racing against a concurrent write
> to the cgroup's freeze file may be more of an issue.

I think you're right. The problem is it could change state between the
test of the state and the call to freeze_task(). If we're changing from
FROZEN to running that would leave us with a frozen task even though
we're in the running state. Thanks for spotting this one.

> Can you add a __freeze_task() that has to be called with task_lock(p)
> already held?

task_lock() is no longer acquired in freeze_task(). So I've updated the
patches to drop RCU in favor of acquiring the task_lock() here. It's
still taken in thaw_process() however, so something like this is still
needed.

> > +       freezer = task_freezer(task);
> 
> Maybe BUG_ON(freezer->state == STATE_FROZEN) ?

Seems appropriate.

> > +
> > +static ssize_t freezer_read(struct cgroup *cgroup,
> > +                           struct cftype *cft,
> > +                           struct file *file, char __user *buf,
> > +                           size_t nbytes, loff_t *ppos)
> > +{
> > +       struct freezer *freezer;
> > +       enum freezer_state state;
> > +
> > +       rcu_read_lock();
> > +       freezer = cgroup_freezer(cgroup);
> > +       state = freezer->state;
> > +       if (state == STATE_FREEZING) {
> > +               /* We change from FREEZING to FROZEN lazily if the cgroup was
> > +                * only partially frozen when we exitted write. */
> > +               spin_lock_irq(&freezer->lock);
> > +               if (freezer_check_if_frozen(cgroup)) {
> > +                       freezer->state = STATE_FROZEN;
> > +                       state = STATE_FROZEN;
> > +               }
> > +               spin_unlock_irq(&freezer->lock);
> > +       }
> > +       rcu_read_unlock();
> > +
> > +       return simple_read_from_buffer(buf, nbytes, ppos,
> > +                                      freezer_state_strs[state],
> > +                                      strlen(freezer_state_strs[state]));
> > +}
> 
> Technically this could return weird results if someone read it
> byte-by-byte and the status changed between reads. If you used
> read_seq_string rather than read you'd avoid that.

Good point. I've made that change as well. 

> > +               return -EIO;
> > +
> > +       cgroup_lock();
> 
> If you're taking cgroup_lock() here in freezer_write(), there's no
> need for the rcu_read_lock() in freezer_freeze()

Yup. Fixed since I'll no longer be using RCU.

Cheers,
	-Matt Helsley

next prev parent reply	other threads:[~2008-07-07 22:42 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-06-24 13:58 [patch 0/4] Container Freezer: Reuse Suspend Freezer Matt Helsley
2008-06-24 13:58 ` [patch 1/4] Container Freezer: Add TIF_FREEZE flag to all architectures Matt Helsley
2008-06-24 19:24   ` Pavel Machek
2008-06-24 13:58 ` [patch 2/4] Container Freezer: Make refrigerator always available Matt Helsley
2008-06-24 13:58 ` [patch 3/4] Container Freezer: Implement freezer cgroup subsystem Matt Helsley
2008-06-24 21:27   ` Paul Menage
2008-07-07 22:42     ` Matt Helsley [this message]
2008-06-24 13:58 ` [patch 4/4] Container Freezer: Skip frozen cgroups during power management resume Matt Helsley
  -- strict thread matches above, loose matches on Subject: below --
2008-07-07 22:58 [PATCH 0/4] Container Freezer: Reuse Suspend Freezer Matt Helsley
2008-07-07 22:58 ` [PATCH 3/4] Container Freezer: Implement freezer cgroup subsystem Matt Helsley

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1215470528.9023.6.camel@localhost.localdomain \
    --to=matthltc@us.ibm.com \
    --cc=clg@fr.ibm.com \
    --cc=containers@lists.linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@lists.linux-foundation.org \
    --cc=menage@google.com \
    --cc=pavel@ucw.cz \
    --cc=rjw@sisk.pl \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox