From: Max Krasnyansky <maxk@qualcomm.com>
To: Oleg Nesterov <oleg@tv-sign.ru>, Peter Zijlstra <peterz@infradead.org>
Cc: mingo@elte.hu, Andrew Morton <akpm@linux-foundation.org>,
	David Rientjes <rientjes@google.com>, Paul Jackson <pj@sgi.com>,
	menage@google.com, linux-kernel@vger.kernel.org,
	Mark Hounschell <dmarkh@cfl.rr.com>,
	Nick Piggin <nickpiggin@yahoo.com.au>
Subject: Re: workqueue cpu affinity
Date: Wed, 11 Jun 2008 13:44:33 -0700	[thread overview]
Message-ID: <48503931.3050600@qualcomm.com> (raw)
In-Reply-To: <20080611160815.GA150@tv-sign.ru>

Previous emails were very long :). So here is an executive summary of the
discussions so far:

----
Workqueue kthread starvation by non-blocking user RT threads.

Starving workqueue threads on the isolated cpus does not seem like a big
deal. All current mainline users of the schedule_on_cpu() kind of API can live
with it. Starvation of the workqueue threads is an issue for the -rt kernels.
See http://marc.info/?l=linux-kernel&m=121316707117552&w=2 for more info.

If absolutely necessary, moving workqueue threads from the isolated cpus is
also not a big deal, even for cpu hotplug. It's certainly _not_ encouraged in
general, but at the same time it is not strictly prohibited either, because
nothing fundamental breaks (that's what my current isolation solution does).
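
For context, the mainline schedule_on_each_cpu() path looks roughly like the
sketch below (simplified from kernel/workqueue.c of that era, not buildable
as-is); the final flush is what hangs if the events/N kthread on an isolated
cpu never gets to run:

	/* Queue func on every online cpu's events/N kthread, then wait
	 * for all of them to run it.  If an RT hog on some cpu never lets
	 * events/N schedule, the flush_workqueue() below waits forever. */
	int schedule_on_each_cpu(work_func_t func)
	{
		int cpu;
		struct work_struct *works = alloc_percpu(struct work_struct);

		if (!works)
			return -ENOMEM;

		for_each_online_cpu(cpu) {
			struct work_struct *work = per_cpu_ptr(works, cpu);

			INIT_WORK(work, func);
			schedule_work_on(cpu, work);	/* queued on that cpu */
		}
		flush_workqueue(keventd_wq);	/* blocks until every cpu ran it */
		free_percpu(works);
		return 0;
	}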

----
Optimize workqueue flush.

The current flush_workqueue() implementation is an issue for the starvation
case mentioned above, and in general it is not very efficient because it has
to schedule on each online cpu.

Peter suggested rewriting flush logic to avoid scheduling on each online cpu.

Oleg suggested converting existing users of flush_workqueue() to
cancel_work_sync(work), which provides fine-grained flushing and does not
schedule on each cpu.

Both suggestions would improve overall performance and address the case where
the machine gets stuck due to workqueue thread starvation.
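
As a hedged illustration of Oleg's suggestion (driver and work names
hypothetical): a teardown path that currently does a global flush can instead
cancel just its own work item, so only the cpu where that item is queued or
running is ever touched:

	/* Before: inserts a barrier on every cpu's queue and waits for
	 * each events/N kthread to reach it -- a starved thread on any
	 * cpu stalls the caller. */
	static void my_teardown_old(struct my_dev *dev)
	{
		flush_workqueue(my_wq);
	}

	/* After: waits only for dev->work itself, wherever it happens to
	 * be queued or running; unrelated cpus are never involved. */
	static void my_teardown_new(struct my_dev *dev)
	{
		cancel_work_sync(&dev->work);
	}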

----
Timer or IPI based Oprofile.

Currently oprofile collects samples by using schedule_work_on_cpu(), which
means that if workqueue threads are starved on, or moved from, cpuX, oprofile
fails to collect samples on that cpu.

It seems that it can be easily converted to use a per-CPU timer or IPI.
This might be useful in general (i.e. less expensive) and would take care of
the issue described above.
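
A hedged sketch of the timer alternative (SAMPLE_PERIOD_NS and the setup
helper are hypothetical names): a per-cpu hrtimer fires in interrupt context
and records a sample directly, so no kthread ever needs to be scheduled:

	static DEFINE_PER_CPU(struct hrtimer, sample_timer);

	static enum hrtimer_restart sample_fn(struct hrtimer *t)
	{
		/* runs in hardirq context, immune to kthread starvation */
		oprofile_add_sample(get_irq_regs(), 0);
		hrtimer_forward_now(t, ns_to_ktime(SAMPLE_PERIOD_NS));
		return HRTIMER_RESTART;		/* re-arm on this cpu */
	}

	static void start_sampling_on(int cpu)
	{
		struct hrtimer *t = &per_cpu(sample_timer, cpu);

		hrtimer_init(t, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
		t->function = sample_fn;
		hrtimer_start(t, ns_to_ktime(SAMPLE_PERIOD_NS),
			      HRTIMER_MODE_REL);
	}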

----
Optimize pagevec drain.

The current pagevec drain logic on NUMA boxes schedules a workqueue on each
online cpu. It's not an issue for CPU isolation per se, but it can be improved
in general.
Peter suggested keeping a cpumask of cpus with non-empty pagevecs, which
would avoid scheduling work on each cpu.
I wonder if there is something on that front in Nick's latest patches.
CC'ing Nick.

----

Did I miss anything?


Max

Thread overview: 31+ messages
2008-06-05 19:57 [patch] sched: prevent bound kthreads from changing cpus_allowed David Rientjes
2008-06-05 20:29 ` Paul Jackson
2008-06-05 21:12   ` David Rientjes
2008-06-09 20:59     ` Max Krasnyanskiy
2008-06-09 22:07       ` David Rientjes
2008-06-10  4:23         ` Max Krasnyansky
2008-06-10 17:04           ` David Rientjes
2008-06-10 16:30         ` cpusets and kthreads, inconsistent behaviour Max Krasnyansky
2008-06-10 18:47           ` David Rientjes
2008-06-10 20:44             ` Max Krasnyansky
2008-06-10 20:54               ` David Rientjes
2008-06-10 21:15                 ` Max Krasnyansky
2008-06-10  6:44       ` [patch] sched: prevent bound kthreads from changing cpus_allowed Peter Zijlstra
2008-06-10 15:38         ` Max Krasnyansky
2008-06-10 17:00           ` Oleg Nesterov
2008-06-10 17:19             ` Peter Zijlstra
2008-06-10 20:24               ` workqueue cpu affinity Max Krasnyansky
2008-06-11  6:49                 ` Peter Zijlstra
2008-06-11 19:02                   ` Max Krasnyansky
2008-06-12 18:44                     ` Peter Zijlstra
2008-06-12 19:10                       ` Max Krasnyanskiy
2008-06-11 16:08                 ` Oleg Nesterov
2008-06-11 19:21                   ` Max Krasnyansky
2008-06-11 19:21                   ` Max Krasnyansky
2008-06-12 16:35                     ` Oleg Nesterov
2008-06-11 20:44                   ` Max Krasnyansky [this message]
2008-06-10 18:00             ` [patch] sched: prevent bound kthreads from changing cpus_allowed Max Krasnyansky
2008-06-05 20:52 ` Daniel Walker
2008-06-05 21:47 ` Paul Jackson
2008-06-10 10:28 ` Ingo Molnar
2008-06-10 17:47 ` Oleg Nesterov
