From: "Rafael J. Wysocki" <rjw@sisk.pl>
To: Nigel Cunningham <ncunningham@crca.org.au>
Cc: Alan Stern <stern@rowland.harvard.edu>,
"linux-kernel" <linux-kernel@vger.kernel.org>,
Jens Axboe <jens.axboe@oracle.com>,
Andrew Morton <akpm@linux-foundation.org>,
"linux-pm" <linux-pm@lists.linux-foundation.org>,
Matt Reimer <mattjreimer@gmail.com>
Subject: Re: [linux-pm] Is it supposed to be ok to call del_gendisk while userspace is frozen?
Date: Tue, 18 May 2010 21:43:27 +0200
Message-ID: <201005182143.28004.rjw@sisk.pl>
In-Reply-To: <4BF1C86B.8000203@crca.org.au>
On Tuesday 18 May 2010, Nigel Cunningham wrote:
> Hi.
>
> On 18/05/10 06:35, Rafael J. Wysocki wrote:
> > On Monday 17 May 2010, Nigel Cunningham wrote:
> >> On 17/05/10 12:22, Alan Stern wrote:
> >>> On Mon, 17 May 2010, Nigel Cunningham wrote:
> >>>>>> I object to the patch.
> >>>>>>
> >>>>>> Tell the thread it ought to exit once thawed, by all means.
> >>>>>
> >>>>> I'm not sure what you mean. Care to explain?
> >>>>
> >>>> I mean "Set up some sort of flag that it can look at once thawed at
> >>>> resume time, and use that to tell it to exit at that point."
> >>>
> >>> Doesn't the patch do exactly that? The "flag" is set by virtue of the
> >>> fact that this is part of del_gendisk -- which means the disk is being
> >>> unregistered and hence the writeback thread will exit shortly.
> >>>
> >>>>>> Make the thread unfreezeable to begin with, by all means.
> >>>>>
> >>>>> That wouldn't work.
> >>>>
> >>>> Why not?
> >>>
> >>> It would be nice to know exactly why. Perhaps the underlying problem
> >>> can be fixed.
> >>>
> >>>>>> If you know a disk is going to be unregistered during resume,
> >>>>>
> >>>>> How do we check that, exactly?
> >>>>
> >>>> Well, if you can figure out that you need to go down this path at this
> >>>> point in the process, you must be able to apply the same logic to come
> >>>> to the same conclusion earlier in the process.
> >>>
> >>> That's not true. You don't know that a device is going to be unplugged
> >>> until it actually _is_ unplugged.
> >>
> >> Sorry - I got unregistered during suspend (instead of resume) in my
> >> head. That said, I'd argue that we should be...
> >>
> >> 1) Syncing all the data at the start of the suspend/hibernate, so
> >> there's nothing for the workthread to do if we do del_gendisk.
> >> 2) Telling things to exit if we do find the device is gone away at
> >> resume time, but not relying on the going-away happening until post
> >> process thaw, for a couple of reasons:
> >> - Potential for races/confusion/mess etc in having $random process
> >> thawing other processes. Only the thread doing the suspend/hibernate
> >> should be freezing/thawing.
> >
> > I don't see a problem here, as far as kernel threads are concerned. In this
> > particular case this is a subsystem thawing a thread that belongs to it. No
> > problem.
> >
> >> - We're dealing with the symptom, not the cause. Almost always a bad idea.
> >
> > I very much prefer to have a fix for a symptom than no fix at all, which is the
> > realistic alternative in this case.
> >
> > So, I think we should merge the patch and if someone finds the root cause
> > at one point in future, then we can just use the *right* approach instead of
> > the present one.
> >
> > The problem is real and people in the field are affected by it, so if you don't
> > have a working alternative patch, please just let go.
>
> I'm not denying that the problem is real. What I am concerned about is
> finding a real solution, not just putting a sticky plaster over the
> wound. It seems to me to be much wiser to deal with the issue properly
> now instead of doing extra work later to diagnose what might be a harder
> to reproduce symptom of the same problem. I'd happily put the time in
> now myself, but I simply don't have the time this week.
>
> Would it be possible to apply the patch, adding some sort of new tag
> that can be used to say "This needs further attention", perhaps
> including an enduring reference to this conversation.

Yeah, /* FIXME: */ is for that. With some comment why we're doing this. :-)

> Later, the 'real' fix could include another special tag that says
> "Proper fix for the symptom addressed in commit 5e94f810"?

Thanks,
Rafael
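
For illustration only, the two ideas discussed above (an exit flag that the
worker checks once it is thawed, and a FIXME comment marking the thaw as a
stopgap) can be modeled in plain userspace C. This is a hypothetical sketch
under stated assumptions, not the patch under discussion: struct wb_worker,
wb_worker_step() and disk_unregister() are invented names, and none of them
are kernel APIs.

```c
#include <stdbool.h>

/* Hypothetical userspace model of a freezable writeback worker. */
struct wb_worker {
	bool frozen;      /* parked by the (simulated) freezer */
	bool should_exit; /* set when the disk is being unregistered */
	bool exited;      /* worker has left its main loop */
	int passes;       /* writeback passes completed */
};

/* One iteration of the worker loop. A frozen worker makes no progress. */
static void wb_worker_step(struct wb_worker *w)
{
	if (w->exited || w->frozen)
		return;

	/* First thing checked after a thaw: were we told to go away? */
	if (w->should_exit) {
		w->exited = true;
		return;
	}

	w->passes++;
}

/* Models the unregister path that needs the worker to exit. */
static void disk_unregister(struct wb_worker *w)
{
	w->should_exit = true;

	/*
	 * FIXME: thawing the worker here treats the symptom, not the
	 * cause. It is done so that unregistering does not wait forever
	 * on a frozen worker. See the linux-pm thread "Is it supposed to
	 * be ok to call del_gendisk while userspace is frozen?" (May 2010).
	 */
	w->frozen = false;
}
```

In this model a worker that starts out frozen does nothing until
disk_unregister() runs; its next step then sees should_exit and leaves the
loop without doing any further writeback.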