From: Oleg Nesterov <oleg@redhat.com>
To: Tejun Heo <tj@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
David Rientjes <rientjes@google.com>,
David Laight <David.Laight@ACULAB.COM>,
Geert Uytterhoeven <geert@linux-m68k.org>,
Ingo Molnar <mingo@kernel.org>,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: oom-kill && frozen()
Date: Wed, 13 Nov 2013 18:07:24 +0100
Message-ID: <20131113170724.GA17739@redhat.com>
In-Reply-To: <20131113032053.GA19394@mtj.dyndns.org>
On 11/13, Tejun Heo wrote:
>
> Hello,
>
> On Tue, Nov 12, 2013 at 05:56:43PM +0100, Oleg Nesterov wrote:
> > On 11/12, Oleg Nesterov wrote:
> > > I am also wondering if it makes any sense to turn PF_FROZEN into
> > > TASK_FROZEN, something like (incomplete, probably racy) patch below.
> > > Note that it actually adds the new state, not the qualifier.
> >
> > As for the current usage of PF_FROZEN... David, it seems that
> > oom_scan_process_thread()->__thaw_task() is dead? Probably this
> > was fine before, when __thaw_task() cleared the "need to freeze"
> > condition, iirc it was PF_FROZEN.
> >
> > But today __thaw_task() can't help, no? the task will simply
> > schedule() in D state again.
>
> Yeah, it'll have to be actively excluded using e.g. PF_FREEZER_SKIP,
> which, BTW, can usually only be manipulated by the task itself.
Oh, yes, yes, yes, I agree. PF_FREEZER_SKIP and the growing number of
freezable_schedule() callers make this all more confusing.
In fact I was thinking about something like
1. Add the new __TASK_FREEZABLE qualifier
2. Turn freezable_schedule() into
	void freezable_schedule(void)
	{
		spin_lock_irq(&current->pi_lock);
		if (current->state)
			current->state |= __TASK_FREEZABLE;
		spin_unlock_irq(&current->pi_lock);
		schedule();
		try_to_freeze();
	}
3. Kill PF_FREEZER_SKIP/freezer_do_not_count/count/should_skip
4. Change freeze_task() and fake_signal_wake_up()
- wake_up_state(p, TASK_INTERRUPTIBLE);
+ wake_up_state(p, TASK_INTERRUPTIBLE | __TASK_FREEZABLE);
Unfortunately, this can only work if the caller can tolerate the
false wakeup. We can even fix wait_for_vfork_done(), but say
ptrace_stop() can't work this way.
And even if we can make this work, the very fact that freezable_schedule()
does schedule() twice does not look right.
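The false-wakeup problem with this first variant can be illustrated with a small user-space model. This is only a sketch, not kernel code: struct sim_task, sim_mark_freezable() and sim_wake_up_state() are hypothetical names, the bit values are made up for illustration, and the model assumes that wake_up_state() wakes a task whenever its state shares any bit with the given mask.

```c
#include <stdbool.h>

/* Illustrative bit values only; the real kernel constants differ. */
#define TASK_RUNNING		0x000u
#define TASK_INTERRUPTIBLE	0x001u
#define TASK_UNINTERRUPTIBLE	0x002u
#define __TASK_FREEZABLE	0x100u	/* hypothetical qualifier from the proposal */

/* Hypothetical stand-in for the ->state field of task_struct. */
struct sim_task {
	unsigned int state;
};

/* Models step 2: a sleeping task marks its sleep as freezable. */
static void sim_mark_freezable(struct sim_task *p)
{
	if (p->state)
		p->state |= __TASK_FREEZABLE;
}

/*
 * Models wake_up_state(): the task is woken if any bit of its state
 * is present in the mask, whatever the other bits say.
 */
static bool sim_wake_up_state(struct sim_task *p, unsigned int mask)
{
	if (!(p->state & mask))
		return false;
	p->state = TASK_RUNNING;
	return true;
}
```

In this model a task sleeping in TASK_UNINTERRUPTIBLE | __TASK_FREEZABLE ignores a plain TASK_INTERRUPTIBLE wakeup, but the freezer's wakeup with TASK_INTERRUPTIBLE | __TASK_FREEZABLE makes it fully runnable even though the event it was waiting for has not happened. That is the false wakeup the caller would have to tolerate.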
_Perhaps_ we can do something like "selective wakeup"? IOW, ignoring the
races/details,
1. Add __TASK_FROZEN qualifier _and_ state
2. Change frozen(),
	static inline bool frozen(struct task_struct *p)
	{
		return p->state & __TASK_FROZEN;
	}
2. Change freezable_schedule(),
	void freezable_schedule(void)
	{
		spin_lock_irq(&current->pi_lock);
		if (current->state)
			current->state |= __TASK_FROZEN;
		spin_unlock_irq(&current->pi_lock);
		schedule();
	}
3. Change __refrigerator() to use saved_state | __TASK_FROZEN
too.
4. Finally, change try_to_wake_up() path to do
- p->state = TASK_WAKING;
+ p->state &= ~state;
+ if (p->state & ~(TASK_DEAD | TASK_WAKEKILL | TASK_PARKED))
+ return;
+ else
+ p->state = TASK_WAKING;
IOW, if the task sleeps in, say, TASK_INTERRUPTIBLE | __TASK_FROZEN
then it needs both try_to_wake_up(TASK_INTERRUPTIBLE) and
try_to_wake_up(__TASK_FROZEN) to wake up.
5. Kill PF_FREEZER_SKIP / etc.
Unfortunately, 4. obviously needs more changes, although at first glance
nothing really nontrivial... we need a common helper for try_to_wake_up()
and ttwu_remote() which checks/changes ->state and we need to avoid "stat"
if we do not actually wake up.
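The "selective wakeup" rule can be modelled in user space to see how the bits interact. Again this is only a sketch under stated assumptions: struct sim_task, sim_frozen() and sim_try_to_wake_up() are hypothetical names, the bit values are illustrative rather than the kernel's, and locking, ttwu_remote() and the races mentioned above are ignored entirely.

```c
#include <stdbool.h>

/* Illustrative bit values only; the real kernel constants differ. */
#define TASK_RUNNING		0x000u
#define TASK_INTERRUPTIBLE	0x001u
#define TASK_UNINTERRUPTIBLE	0x002u
#define TASK_DEAD		0x040u
#define TASK_WAKEKILL		0x080u
#define TASK_WAKING		0x100u
#define TASK_PARKED		0x200u
#define __TASK_FROZEN		0x400u	/* hypothetical bit from the proposal */

/* Hypothetical stand-in for the ->state field of task_struct. */
struct sim_task {
	unsigned int state;
};

/* Models the proposed frozen(): the bit lives in ->state, not ->flags. */
static bool sim_frozen(const struct sim_task *p)
{
	return (p->state & __TASK_FROZEN) != 0;
}

/*
 * Models the changed try_to_wake_up() path: clear only the bits this
 * waker is responsible for, and make the task runnable only when no
 * other "sleep reason" bits remain. Returns true on a real wakeup.
 */
static bool sim_try_to_wake_up(struct sim_task *p, unsigned int state)
{
	if (!(p->state & state))
		return false;		/* not sleeping in a matching state */

	p->state &= ~state;
	if (p->state & ~(TASK_DEAD | TASK_WAKEKILL | TASK_PARKED))
		return false;		/* another waker is still needed */

	p->state = TASK_WAKING;
	return true;
}
```

With a task in TASK_INTERRUPTIBLE | __TASK_FROZEN, the first sim_try_to_wake_up(TASK_INTERRUPTIBLE) only strips that bit and leaves the task asleep; the later sim_try_to_wake_up(__TASK_FROZEN) from the thaw side completes the wakeup. The order of the two calls does not matter in this model.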
Hmm. And this all makes me think that at least s/PF_FROZEN/TASK_FROZEN/ as
a first step actually makes some sense... Note the "qualifier _and_ state"
above.
Tejun, Peter, do you think this makes any sense? I am just curious, but
"selective wakeup" looks potentially useful.
And what about oom_scan_process_thread()? Should we simply kill this
dead frozen/__thaw_task code, or should we change freezing() to respect
TIF_MEMDIE?
Oleg.