From: Peter Zijlstra <peterz@infradead.org>
To: Oleg Nesterov <oleg@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>,
linux-rt-users <linux-rt-users@vger.kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
LKML <linux-kernel@vger.kernel.org>,
Miklos Szeredi <miklos@szeredi.hu>, mingo <mingo@redhat.com>
Subject: Re: rt14: strace -> migrate_disable_atomic imbalance
Date: Thu, 22 Sep 2011 17:13:08 +0200
Message-ID: <1316704389.31429.24.camel@twins>
In-Reply-To: <20110922145257.GA13960@redhat.com>
On Thu, 2011-09-22 at 16:52 +0200, Oleg Nesterov wrote:
> On 09/22, Peter Zijlstra wrote:
> >
> > +static void wait_task_inactive_sched_in(struct preempt_notifier *n, int cpu)
> > +{
> > +	struct task_struct *p;
> > +	struct wait_task_inactive_blocked *blocked =
> > +		container_of(n, struct wait_task_inactive_blocked, notifier);
> > +
> > +	hlist_del(&n->link);
> > +
> > +	p = ACCESS_ONCE(blocked->waiter);
> > +	blocked->waiter = NULL;
> > +	wake_up_process(p);
> > +}
> > ...
> > +static void
> > +wait_task_inactive_sched_out(struct preempt_notifier *n, struct task_struct *next)
> > +{
> > +	if (current->on_rq) /* we're not inactive yet */
> > +		return;
> > +
> > +	hlist_del(&n->link);
> > +	n->ops = &wait_task_inactive_ops_post;
> > +	hlist_add_head(&n->link, &next->preempt_notifiers);
> > +}
>
> Tricky ;) Yes, the first ->sched_out() is not enough.
Not enough isn't the problem; it's run with rq->lock held and IRQs
disabled, so you simply cannot do ttwu() from there.

If we could, the subsequent task_rq_lock() in wait_task_inactive() would
be enough to serialize against the still in-flight context switch.

One of the problems with doing it from the next sched_in notifier is
that next can be idle, and then we do an A -> idle -> B switch, which is
of course sub-optimal.
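To make the two-stage handoff concrete, here is a toy userspace model of it. Everything named toy_* is hypothetical and only mimics the shape of the quoted patch: the sched_out step (which in the kernel runs under rq->lock) only moves the notifier onto the next task, and the wake-up happens from that task's sched_in step. Real locking, hlists and ttwu() are not modelled.

```c
#include <assert.h>
#include <stddef.h>

struct toy_task;

/* Hypothetical stand-in for struct preempt_notifier plus the waiter pointer. */
struct toy_notifier {
	struct toy_notifier *next;	/* singly linked; stands in for the hlist */
	void (*sched_in)(struct toy_notifier *n, struct toy_task *self);
	struct toy_task *waiter;	/* task to wake once the switch is done */
};

/* Hypothetical stand-in for the few task_struct fields that matter here. */
struct toy_task {
	int on_rq;			/* still runnable? */
	int woken;			/* set by the toy wake-up */
	struct toy_notifier *notifiers;	/* per-task notifier list */
};

static void toy_add(struct toy_task *t, struct toy_notifier *n)
{
	n->next = t->notifiers;
	t->notifiers = n;
}

static void toy_del(struct toy_task *t, struct toy_notifier *n)
{
	struct toy_notifier **pp = &t->notifiers;

	while (*pp && *pp != n)
		pp = &(*pp)->next;
	if (*pp)
		*pp = n->next;
	n->next = NULL;
}

/* Second stage: runs in the context of the task that was switched in. */
static void toy_post_sched_in(struct toy_notifier *n, struct toy_task *self)
{
	struct toy_task *w = n->waiter;

	toy_del(self, n);
	n->waiter = NULL;
	if (w)
		w->woken = 1;		/* stands in for wake_up_process() */
}

/* First stage: "rq->lock held", so it may only move the notifier. */
static void toy_sched_out(struct toy_notifier *n, struct toy_task *prev,
			  struct toy_task *next)
{
	if (prev->on_rq)		/* merely preempted, not inactive yet */
		return;
	toy_del(prev, n);
	n->sched_in = toy_post_sched_in;
	toy_add(next, n);
}

/* Toy context switch: sched_out on prev, then sched_in callbacks on next. */
static void toy_switch(struct toy_task *prev, struct toy_task *next)
{
	struct toy_notifier *n, *tmp;

	for (n = prev->notifiers; n; n = tmp) {
		tmp = n->next;
		toy_sched_out(n, prev, next);
	}
	for (n = next->notifiers; n; n = tmp) {
		tmp = n->next;
		if (n->sched_in)
			n->sched_in(n, next);
	}
}

/* Returns whether waiter B was woken by an A -> idle switch. */
static int toy_demo(int a_blocks)
{
	struct toy_task a = { 1, 0, NULL };
	struct toy_task idle = { 1, 0, NULL };
	struct toy_task b = { 0, 0, NULL };
	struct toy_notifier n = { NULL, NULL, &b };

	toy_add(&a, &n);
	if (a_blocks)
		a.on_rq = 0;		/* A really goes to sleep */
	toy_switch(&a, &idle);
	return b.woken;
}
```

Calling toy_demo(1) from a main() models A blocking: B gets woken from idle's context, so B only runs after one more switch, which is the A -> idle -> B detour. toy_demo(0) models a plain preemption of A, where the notifier stays put and nothing is woken.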
> > unsigned long wait_task_inactive(struct task_struct *p, long match_state)
> > {
> > ...
> > +	rq = task_rq_lock(p, &flags);
> > +	trace_sched_wait_task(p);
> > +	if (!p->on_rq) /* we're already blocked */
> > +		goto done;
>
> This doesn't look right. schedule() clears ->on_rq a long before
> __switch_to/etc.
Oh, bugger, yes, it's before we can drop rq->lock for idle balance and
nonsense like that. (!p->on_rq && !p->on_cpu) should suffice, I think.
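A small sketch of why the extra ->on_cpu test matters, using a hypothetical toy_state struct in place of task_struct. It assumes the ordering described above: ->on_rq is cleared when the task is dequeued, while ->on_cpu (an SMP-only field) stays set until the context switch has finished.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the two task_struct fields under discussion. */
struct toy_state {
	int on_rq;	/* cleared early in schedule(), at dequeue time */
	int on_cpu;	/* cleared only once the switch away has completed */
};

/* The check as posted in the patch: true as soon as the task is dequeued. */
static bool toy_blocked_on_rq_only(const struct toy_state *p)
{
	return !p->on_rq;
}

/* The check suggested in this reply: also require the task to be off the CPU. */
static bool toy_blocked_on_rq_and_cpu(const struct toy_state *p)
{
	return !p->on_rq && !p->on_cpu;
}
```

For a snapshot { .on_rq = 0, .on_cpu = 1 }, i.e. dequeued but still mid-switch, the first helper already reports the task as blocked while the second does not; both agree once ->on_cpu drops to 0.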
> And it seems that we check ->on_cpu above, this is not UP friendly.
True, but it's what the old code did... and I was seeing performance
suckage compared to the unpatched kernel (not that the p->on_cpu busy
wait fixed it)...
> >
> > - set_current_state(TASK_UNINTERRUPTIBLE);
> > - schedule_hrtimeout(&to, HRTIMER_MODE_REL);
> > - continue;
> > - }
> > + hlist_add_head(&blocked.notifier.link, &p->preempt_notifiers);
> > + task_rq_unlock(rq, p, &flags);
>
> I thought about reimplementing wait_task_inactive() too, but afaics there
> is a problem: why we can't race with p doing register_preempt_notifier() ?
> I guess register_ needs rq->lock too.
We can race, actually, now that you mention it... ->pi_lock would be
sufficient and less expensive to acquire.