dm-devel.redhat.com archive mirror
 help / color / mirror / Atom feed
From: Mandeep Singh Baines <msb@chromium.org>
To: Tejun Heo <tj@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Mikulas Patocka <mpatocka@redhat.com>,
	Mandeep Singh Baines <msb@chromium.org>,
	linux-kernel@vger.kernel.org, dm-devel@redhat.com,
	Alasdair G Kergon <agk@redhat.com>,
	Will Drewry <wad@chromium.org>,
	Elly Jones <ellyjones@chromium.org>,
	Milan Broz <mbroz@redhat.com>,
	Olof Johansson <olofj@chromium.org>,
	Steffen Klassert <steffen.klassert@secunet.com>,
	Rusty Russell <rusty@rustcorp.com.au>
Subject: Re: workqueues and percpu (was: [PATCH] dm: remake of the verity target)
Date: Fri, 9 Mar 2012 13:15:12 -0800	[thread overview]
Message-ID: <20120309211512.GL27051@google.com> (raw)
In-Reply-To: <20120308231521.GA2968@htj.dyndns.org>

Tejun Heo (tj@kernel.org) wrote:
> Hello,
> 
> On Thu, Mar 08, 2012 at 02:39:09PM -0800, Andrew Morton wrote:
> > > I looked at it --- and using percpu variables in workqueues isn't safe 
> > > because the workqueue can change CPU if the CPU is hot-unplugged.
> 
> Generally, I don't think removing preemption enable/disable around
> percpu variable access is a worthwhile optimization unles it's on
> really really really hot path.  We'll eventually add debug annotations
> to percpu accessors and the ones used outside proper preemption
> protections would need to be updated anyway.
> 

In this case, I need the per-cpu data for the duration of calculating
a cryptographics hash on a 4K page of data. That's a long time to disable
pre-emption.

I could fix the bug temporarily by adding get/put for the per_cpu data
but would that be acceptable? I'm not sure what the OK limit is for how
long one can disable preemption. An alternative fix would be not allow
CONFIG_VERITY when CONFIG_HOTPLUG_CPU. Once workqueues are fixed, I could
remove that restriction.

Thoughts?

> > > dm-crypt has the same bug --- it also uses workqueue with per-cpu 
> > > variables and assumes that the CPU doesn't change for a single work item.
> > > 
> > > This program shows that work executed in a workqueue can be switched to a 
> > > different CPU.
> > > 
> > > I'm wondering how much other kernel code assumes that workqueues are bound 
> > > to a specific CPU, which isn't true if we unplug that CPU.
> > 
> > ugh.
> > 
> > We really don't want to have to avoid using workqueues because of some
> > daft issue with CPU hot-unplug.
> 
> Using or not using wq is orthogonal tho.  Using kthreads directly
> requires hooking into CPU hotplug callbacks and one might as well call
> flush_work_sync() from there instead of shutting down kthread.
> 
> > And yes, there are assumptions in various work handlers that they
> > will be pinned to a single CPU.  Finding and fixing those
> > assumptions would be painful.
> >
> > Heck, even debug_smp_processor_id() can be wrong in the presence of the
> > cpu-unplug thing.
> 
> Yeah, that's a generic problem with cpu unplug.
> 
> > I'm not sure what we can do about it really, apart from blocking unplug
> > until all the target CPU's workqueues have been cleared.  And/or refusing
> > to unplug a CPU until all pinned-to-that-cpu kernel threads have been
> > shut down or pinned elsewhere (which is the same thing, only more
> > general).
> > 
> > Tejun, is this new behaviour?  I do recall that a long time ago we
> > wrestled with unplug-vs-worker-threads and I ended up OK with the
> > result, but I forget what it was.  IIRC Rusty was involved.
> 
> Unfortunately, yes, this is a new behavior.  Before, we could have
> unbound delays during unplug from work items.  Now, we have CPU
> affinity assumption breakage.  The behavior change was primarily to
> allow long running work items to use regular workqueues without
> worrying about inducing delay across cpu hotplug operations, which is
> important as it's also used on suspend / hibernation, especially on
> mobile platforms.
> 
> During the cmwq conversion, I ended up auditing a lot of (I think I
> went through most of them) workqueue users and IIRC there weren't too
> many which required stable affinity.
> 
> > That being said, I don't think it's worth compromising the DM code
> > because of this workqueue wart: lots of other code has the same wart,
> > and we should find a centralised fix for it.
> 
> Probably the best way to solve this is introducing pinned attribute to
> workqueues and have them drained automatically on cpu hotplug events.
> It'll require auditing workqueue users but I guess we'll just have to
> do it given that we need to actually distinguish the ones need to be
> pinned.  Or maybe we can use explicit queue_work_on() to distinguish
> the ones which require pinning.
> 
> Another approach would be requiring all workqueues to be drained on
> cpu offlining and requiring any work item which may stall to use
> unbound wq.  IMHO, picking out the ones which may stall would be much
> less obvious than the ones which require cpu pinning.
> 
> Better ideas?
> 
> Thanks.
> 
> -- 
> tejun

  parent reply	other threads:[~2012-03-09 21:15 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-03-02  0:33 [PATCH] dm: verity target Mandeep Singh Baines
2012-03-02 16:08 ` Mandeep Singh Baines
2012-03-04 19:18 ` [PATCH] dm: remake of the " Mikulas Patocka
2012-03-04 19:35   ` userspace hashing utility for dm-verity Mikulas Patocka
2012-03-06 21:59   ` [PATCH] dm: remake of the verity target Mandeep Singh Baines
2012-03-08 22:21     ` workqueues and percpu (was: [PATCH] dm: remake of the verity target) Mikulas Patocka
2012-03-08 22:39       ` Andrew Morton
2012-03-08 23:15         ` Tejun Heo
2012-03-08 23:30           ` Andrew Morton
2012-03-09  0:33             ` Tejun Heo
2012-03-09  0:51               ` Tejun Heo
2012-03-09 21:15           ` Mandeep Singh Baines [this message]
2012-03-09 21:20             ` Tejun Heo
2012-03-09 22:06               ` Mandeep Singh Baines
2012-08-14 17:54                 ` Tejun Heo
2012-03-13 22:20     ` [PATCH] dm: remake of the verity target Mikulas Patocka
2012-03-14 21:13       ` Will Drewry
2012-03-17  1:16         ` Mikulas Patocka
2012-03-17  3:06           ` Will Drewry
2012-03-14 21:43       ` Mandeep Singh Baines
2012-03-20 15:41       ` Mandeep Singh Baines
2012-03-21  0:54         ` Mikulas Patocka
2012-03-21  3:03           ` Mandeep Singh Baines
2012-03-21  3:11           ` [dm-devel] " Mikulas Patocka
2012-03-21  3:30             ` Mandeep Singh Baines
2012-03-21  3:44               ` Mikulas Patocka
2012-03-21  3:49                 ` Mandeep Singh Baines
2012-03-21 17:08                   ` Mikulas Patocka
2012-03-21 17:09                     ` Mikulas Patocka
2012-03-22 17:41                     ` [dm-devel] " Mandeep Singh Baines
2012-03-22 21:52                       ` Mikulas Patocka
2012-03-23  3:15                         ` Mandeep Singh Baines
2012-03-24  3:48                           ` Mikulas Patocka
2012-03-21  1:10         ` Mikulas Patocka

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120309211512.GL27051@google.com \
    --to=msb@chromium.org \
    --cc=agk@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=dm-devel@redhat.com \
    --cc=ellyjones@chromium.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mbroz@redhat.com \
    --cc=mpatocka@redhat.com \
    --cc=olofj@chromium.org \
    --cc=rusty@rustcorp.com.au \
    --cc=steffen.klassert@secunet.com \
    --cc=tj@kernel.org \
    --cc=wad@chromium.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).