linux-nfs.vger.kernel.org archive mirror
From: Tejun Heo <tj@kernel.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>, Jeff Layton <jlayton@redhat.com>,
	"Myklebust, Trond" <Trond.Myklebust@netapp.com>,
	Mandeep Singh Baines <msb@chromium.org>,
	Ming Lei <ming.lei@canonical.com>,
	"J. Bruce Fields" <bfields@fieldses.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	"linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
	"Rafael J. Wysocki" <rjw@sisk.pl>,
	Andrew Morton <akpm@linux-foundation.org>,
	Ingo Molnar <mingo@redhat.com>, Al Viro <viro@zeniv.linux.org.uk>
Subject: Re: LOCKDEP: 3.9-rc1: mount.nfs/4272 still has locks held!
Date: Wed, 6 Mar 2013 13:24:52 -0800
Message-ID: <20130306212452.GO1227@htj.dyndns.org>
In-Reply-To: <CA+55aFwDogteVd=vwGHXDSASnga-nZZnKaQz9aO1yBU2CPKSbA@mail.gmail.com>

Hello, Linus.

On Wed, Mar 06, 2013 at 01:00:02PM -0800, Linus Torvalds wrote:
> > Oh yeah, we don't need another signal.  We just need sigpending state
> > and a wakeup.  I wasn't really going into details.  The important
> > point is that for code paths outside signal/ptrace, freezing could
> > look and behave about the same as signal delivery.
> 
> Don't we already do that? The whole "try_to_freeze()" in
> get_signal_to_deliver() is about exactly this. See
> fake_signal_wake_up().

Yeap, that was what I had in mind too.  Maybe we'll need to modify it
slightly but we already have most of the basic stuff.

> You still have kernel threads (that don't do signals) to worry about,
> so it doesn't make things go away. And you still have issues with
> latency of disk wait, which is, I think, the reason for that
> "freezable_schedule()" in the NFS code to begin with.

I haven't thought about it for quite some time so things are hazy, but
here's what I can recall now.

With syscall paths out of the way, the surface is reduced a lot.
Another part is converting most freezable kthread users to freezable
workqueues, which provide natural resource boundaries (the duration of
work item execution).  With kthreads it is already difficult to get
the synchronization completely right, and a significant number of
freezable + should_stop users were subtly broken the last time I went
over the freezer users.  I think we would be much better off
converting most of them over to freezable workqueues, which are easier
to get right and likely to be less expensive.  Freezing happens at
work item boundaries, which in most cases could be made to coincide
with the original freezer check point.
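
[Editorial sketch, not part of the original message.]  The "freezing
happens at work item boundary" idea can be illustrated with a small
userspace model.  This is not kernel code: every name below (model_wq,
model_queue_work, model_run_pending, demo_work) is hypothetical and
only loosely mirrors what a workqueue allocated with WQ_FREEZABLE is
described as doing above.  The assumption modeled is that a freeze
request never interrupts a running work item and only prevents the
next one from starting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; none of these names are kernel APIs. */
#define MODEL_WQ_SLOTS 8

typedef void (*model_work_fn)(void *arg);

struct model_work {
	model_work_fn fn;
	void *arg;
};

struct model_wq {
	struct model_work items[MODEL_WQ_SLOTS];
	size_t head;	/* next item to run */
	size_t tail;	/* next free slot */
	bool freezing;	/* may be set at any time by the "freezer" */
};

/* Append a work item; returns false when the fixed-size queue is full. */
bool model_queue_work(struct model_wq *wq, model_work_fn fn, void *arg)
{
	assert(fn != NULL);
	if (wq->tail == MODEL_WQ_SLOTS)
		return false;
	wq->items[wq->tail].fn = fn;
	wq->items[wq->tail].arg = arg;
	wq->tail++;
	return true;
}

/*
 * Run pending items until the queue is empty or a freeze is requested.
 * The freezing flag is checked only between items, so the item boundary
 * is the single freeze point.  Returns the number of items executed.
 */
size_t model_run_pending(struct model_wq *wq)
{
	size_t ran = 0;

	while (wq->head < wq->tail && !wq->freezing) {
		struct model_work *w = &wq->items[wq->head++];

		w->fn(w->arg);	/* runs to completion, even if freezing gets set */
		ran++;
	}
	return ran;
}

/* Demo work item: a freeze request arrives in the middle of execution. */
struct demo_ctx {
	struct model_wq *wq;
	int steps;
};

void demo_work(void *arg)
{
	struct demo_ctx *ctx = arg;

	ctx->steps++;			/* first half of the work */
	ctx->wq->freezing = true;	/* freezer kicks in mid-item */
	ctx->steps++;			/* second half still completes */
}
```

Queueing demo_work twice and calling model_run_pending() once runs
only the first item, to completion; the second item stays queued until
the freezing flag is cleared and model_run_pending() is called again.
The work item never has to contain a freezer check of its own, which
is the property that makes this easier to get right than an open-coded
kthread loop.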

There could be kthreads which can't be converted to workqueue for
whatever reason (there shouldn't be many at this point) but most
freezer usages in kthreads are pretty simple.  It's usually a single or
a couple of freezer check points in the main loop.  While we may still
need special handling for them, I don't think they're likely to have
implications on issues like this.

We probably would want to handle restart for freezable kthreads
calling syscalls.  Haven't thought about this one too much yet.  Maybe
freezable kthreads doing syscalls just need to be ready for
-ERESTARTSYS?

I'm not sure I follow the disk wait latency part.  Are you saying that
switching to a jobctl trap based freezer implementation wouldn't help
them?  If so, right, it doesn't in itself.  It's just changing the
infrastructure used for freezing and can't make the underlying
synchronization issues just disappear, but at least it becomes the
same problem as being responsive to SIGKILL rather than a completely
separate problem.

Thanks.

-- 
tejun

