Re: [PATCH v9 4/9] cgroup: cgroup v2 freezer

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Oleg Nesterov <oleg@redhat.com>
To: Roman Gushchin <guroan@gmail.com>
Cc: Tejun Heo <tj@kernel.org>, Roman Gushchin <guro@fb.com>,
	kernel-team@fb.com, cgroups@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v9 4/9] cgroup: cgroup v2 freezer
Date: Tue, 2 Apr 2019 18:02:41 +0200	[thread overview]
Message-ID: <20190402160241.GA10425@redhat.com> (raw)
In-Reply-To: <20190316175812.6787-5-guro@fb.com>

Hi Roman,

let me apologize again for the huge delay.

I see nothing really wrong in this version, so no objections from me.

However, 4/9 doesn't apply, so it seems you will need to make v10 anyway
to adapt these changes to the recent changes in kernel/signal.c ;)

Just a couple of minor nits below...

On 03/16, Roman Gushchin wrote:
>
> + * If always_leave is not set, and the cgroup is freezing,
> + * we're racing with the cgroup freezing. In this case, we don't
> + * drop the frozen counter to avoid a transient switch to
> + * the unfrozen state. To make sure that the task won't go
> + * to the userspace before reaching the signal handler loop,
> + * let's set TIF_SIGPENDING flag.
> + */
> +void cgroup_leave_frozen(bool always_leave)
> +{
> +	struct cgroup *cgrp;
> +
> +	spin_lock_irq(&css_set_lock);
> +	cgrp = task_dfl_cgroup(current);
> +	if (always_leave || !test_bit(CGRP_FREEZE, &cgrp->flags)) {
> +		cgroup_dec_frozen_cnt(cgrp);
> +		cgroup_update_frozen(cgrp);
> +		WARN_ON_ONCE(!current->frozen);
> +		current->frozen = false;
> +	} else {
> +		set_tsk_thread_flag(current, TIF_SIGPENDING);

The setting of TIF_SIGPENDING looks unnecessary and even not correct; because
this flag must not be updated without ->siglock held (even if "set" is more
or less safe).

If JOBCTL_TRAP_FREEZE is already set, then TIF_SIGPENDING must be set too.

Otherwise set_tsk_thread_flag(TIF_SIGPENDING) can't help because the task can
do recalc_sigpending() at any moment.

In particular, get_signal() does dequeue_signal()->recalc_sigpending() right
after cgroup_leave_frozen(), so I fail to understand why do we need to set
TIF_SIGPENDING.


> @@ -912,6 +912,10 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node)
>  	tsk->fail_nth = 0;
>  #endif
>
> +#ifdef CONFIG_CGROUPS
> +	tsk->frozen = 0;
> +#endif

Hmm, do we really need this? How can a cgroup_task_frozen() task call
copy_process() ?

> +static void do_freezer_trap(void)
> +	__releases(&current->sighand->siglock)
> +{
> +	/*
> +	 * If a fatal signal is pending, there is no way back for the process,
> +	 * so let it escape from the freezer trap and exit.
> +	 * If the task has been frozen, cgroup_leave_frozen() will be invoked
> +	 * to update the cgroup state, if necessary.
> +	 */
> +	if (fatal_signal_pending(current)) {
> +		current->jobctl &= ~JOBCTL_TRAP_FREEZE;
> +		spin_unlock_irq(&current->sighand->siglock);
> +		return;
> +	}
> +
> +	/*
> +	 * If there are other trap bits pending except JOBCTL_TRAP_FREEZE,
> +	 * let's make another loop to give it a chance to be handled.
> +	 * In any case, we'll return back.
> +	 */
> +	if (((current->jobctl & (JOBCTL_PENDING_MASK | JOBCTL_TRAP_FREEZE)) !=
> +	     JOBCTL_TRAP_FREEZE) || fatal_signal_pending(current)) {
                                    ^^^^^^^^^^^^^^^^^^^^

We have already checked fatal_signal_pending() at the start?

And in fact, you can probably remove fatal_signal_pending() altogether...
Note that with recent changes get_signal() does

	if (signal_group_exit(signal)) {
		ksig->info.si_signo = signr = SIGKILL;
		sigdelset(&current->pending.signal, SIGKILL);
		recalc_sigpending();
		goto fatal;
	}

before the main loop, so afaics fatal_signal_pending() == T in do_freezer_trap()
is simply impossible. This means that you can't clear JOBCTL_TRAP_FREEZE, but
this is probably fine... if not, you can add jobctl &= ~JOBCTL_TRAP_FREEZE into
the "if (signal_group_exit(signal))" above.

> @@ -2401,12 +2453,27 @@ bool get_signal(struct ksignal *ksig)
>  		    do_signal_stop(0))
>  			goto relock;
>  
> -		if (unlikely(current->jobctl & JOBCTL_TRAP_MASK)) {
> -			do_jobctl_trap();
> -			spin_unlock_irq(&sighand->siglock);
> +		if (unlikely(current->jobctl &
> +			     (JOBCTL_TRAP_MASK | JOBCTL_TRAP_FREEZE))) {
> +			if (current->jobctl & JOBCTL_TRAP_MASK) {
> +				do_jobctl_trap();
> +				spin_unlock_irq(&sighand->siglock);
> +			} else if (current->jobctl & JOBCTL_TRAP_FREEZE)
> +				do_freezer_trap();
> +
>  			goto relock;
>  		}
>  
> +		/*
> +		 * If the task is leaving the frozen state, let's update
> +		 * cgroup counters and reset the frozen bit.
> +		 */
> +		if (unlikely(cgroup_task_frozen(current))) {
> +			spin_unlock_irq(&sighand->siglock);
> +			cgroup_leave_frozen(true);
> +			spin_lock_irq(&sighand->siglock);

I'd suggest to do "goto relock" rather than spin_lock_irq(&sighand->siglock).
To ensure we can't miss SIGKILL which can come right after we drop siglock,
note again the new signal_group_exit() check above.

Oleg.

next prev parent reply	other threads:[~2019-04-02 16:02 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-16 17:58 [PATCH v9 0/9] freezer for cgroup v2 Roman Gushchin
2019-03-16 17:58 ` [PATCH v9 1/9] cgroup: rename freezer.c into legacy_freezer.c Roman Gushchin
2019-03-16 17:58 ` [PATCH v9 2/9] cgroup: implement __cgroup_task_count() helper Roman Gushchin
2019-03-16 17:58 ` [PATCH v9 3/9] cgroup: protect cgroup->nr_(dying_)descendants by css_set_lock Roman Gushchin
2019-03-16 17:58 ` [PATCH v9 4/9] cgroup: cgroup v2 freezer Roman Gushchin
2019-04-02 16:02   ` Oleg Nesterov [this message]
2019-04-02 22:08     ` Roman Gushchin
2019-03-16 17:58 ` [PATCH v9 5/9] kselftests: cgroup: don't fail on cg_kill_all() error in cg_destroy() Roman Gushchin
2019-03-16 17:58   ` Roman Gushchin
2019-03-16 17:58   ` guroan
2019-03-16 17:58 ` [PATCH v9 6/9] kselftests: cgroup: add freezer controller self-tests Roman Gushchin
2019-03-16 17:58   ` Roman Gushchin
2019-03-16 17:58   ` guroan
2019-03-16 17:58 ` [PATCH v9 7/9] cgroup: make TRACE_CGROUP_PATH irq-safe Roman Gushchin
2019-03-16 17:58 ` [PATCH v9 8/9] cgroup: add tracing points for cgroup v2 freezer Roman Gushchin
2019-03-16 17:58 ` [PATCH v9 9/9] cgroup: document cgroup v2 freezer interface Roman Gushchin
2019-03-25 16:59 ` [PATCH v9 0/9] freezer for cgroup v2 Roman Gushchin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190402160241.GA10425@redhat.com \
    --to=oleg@redhat.com \
    --cc=cgroups@vger.kernel.org \
    --cc=guro@fb.com \
    --cc=guroan@gmail.com \
    --cc=kernel-team@fb.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.