[GIT PULL] libata fixes for v3.13-rc5

Linux ATA/IDE development
 help / color / mirror / Atom feed

* [GIT PULL] libata fixes for v3.13-rc5
@ 2013-12-24 14:21 Tejun Heo
  2013-12-24 21:55 ` Alan Stern
  0 siblings, 1 reply; 16+ messages in thread
From: Tejun Heo @ 2013-12-24 14:21 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel, linux-ide, Rafael J. Wysocki, linux-pm

Hello, Linus.

libata fixes for v3.13-rc5.  There's one interseting commit - "libata,
freezer: avoid block device removal while system is frozen".  It's an
ugly hack working around a deadlock condition between driver core
resume and block layer device removal paths through freezer which was
made more reproducible by writeback being converted to workqueue some
releases ago.  The bug has nothing to do with libata but it's just an
workaround which is easy to backport.  After discussion, Rafael and I
seem to agree that we don't really need kernel freezables - both
kthread and workqueue.  There are few specific workqueues which
constitute PM operations and require freezing, which will be converted
to use workqueue_set_max_active() instead.  All other kernel freezer
uses are planned to be removed, followed by the removal of kthread and
workqueue freezer support, hopefully.

Others are device-specific fixes.  The most notable is the addition of
NO_NCQ_TRIM which is used to disable queued TRIM commands to Micro
M500 SSDs which otherwise suffers data corruption.

The following changes since commit c5700766975c69d27150256444db63fbfd103791:

  ATA: Fix port removal ordering (2013-11-27 13:55:16 -0500)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata.git for-3.13-fixes

for you to fetch changes up to 85fbd722ad0f5d64d1ad15888cd1eb2188bfb557:

  libata, freezer: avoid block device removal while system is frozen (2013-12-19 13:50:32 -0500)

Thanks.
----------------------------------------------------------------
Marc Carino (1):
      libata: implement ATA_HORKAGE_NO_NCQ_TRIM and apply it to Micro M500 SSDs

Marek Vasut (1):
      ahci: imx: Explicitly clear IMX6Q_GPR13_SATA_MPLL_CLK_EN

Michele Baldessari (1):
      libata: add ATA_HORKAGE_BROKEN_FPDMA_AA quirk for Seagate Momentus SpinPoint M8

Paul Bolle (1):
      ahci: bail out on ICH6 before using AHCI BAR

Robin H. Johnson (1):
      libata: disable a disk via libata.force params

Tejun Heo (1):
      libata, freezer: avoid block device removal while system is frozen

 Documentation/kernel-parameters.txt |  2 ++
 drivers/ata/ahci.c                  | 18 +++++++++---------
 drivers/ata/ahci_imx.c              |  3 ++-
 drivers/ata/libata-core.c           | 19 +++++++++++++++++--
 drivers/ata/libata-scsi.c           | 21 +++++++++++++++++++++
 include/linux/libata.h              |  1 +
 kernel/freezer.c                    |  6 ++++++
 7 files changed, 58 insertions(+), 12 deletions(-)

-- 
tejun

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [GIT PULL] libata fixes for v3.13-rc5
  2013-12-24 14:21 [GIT PULL] libata fixes for v3.13-rc5 Tejun Heo
@ 2013-12-24 21:55 ` Alan Stern
  2013-12-25 14:45   ` Rafael J. Wysocki
  0 siblings, 1 reply; 16+ messages in thread
From: Alan Stern @ 2013-12-24 21:55 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Linus Torvalds, linux-kernel, linux-ide, Rafael J. Wysocki,
	linux-pm

On Tue, 24 Dec 2013, Tejun Heo wrote:

> Hello, Linus.
> 
> libata fixes for v3.13-rc5.  There's one interseting commit - "libata,
> freezer: avoid block device removal while system is frozen".  It's an
> ugly hack working around a deadlock condition between driver core
> resume and block layer device removal paths through freezer which was
> made more reproducible by writeback being converted to workqueue some
> releases ago.  The bug has nothing to do with libata but it's just an
> workaround which is easy to backport.  After discussion, Rafael and I
> seem to agree that we don't really need kernel freezables - both
> kthread and workqueue.  There are few specific workqueues which
> constitute PM operations and require freezing, which will be converted
> to use workqueue_set_max_active() instead.  All other kernel freezer
> uses are planned to be removed, followed by the removal of kthread and
> workqueue freezer support, hopefully.

Wait a minute.  I don't recall anybody mentioning this earlier.  What 
about khubd?  There isn't any plan to remove _it_.

Alan Stern


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [GIT PULL] libata fixes for v3.13-rc5
  2013-12-24 21:55 ` Alan Stern
@ 2013-12-25 14:45   ` Rafael J. Wysocki
  2013-12-25 14:57     ` Alan Stern
  0 siblings, 1 reply; 16+ messages in thread
From: Rafael J. Wysocki @ 2013-12-25 14:45 UTC (permalink / raw)
  To: Alan Stern; +Cc: Tejun Heo, Linus Torvalds, linux-kernel, linux-ide, linux-pm

On Tuesday, December 24, 2013 04:55:46 PM Alan Stern wrote:
> On Tue, 24 Dec 2013, Tejun Heo wrote:
> 
> > Hello, Linus.
> > 
> > libata fixes for v3.13-rc5.  There's one interseting commit - "libata,
> > freezer: avoid block device removal while system is frozen".  It's an
> > ugly hack working around a deadlock condition between driver core
> > resume and block layer device removal paths through freezer which was
> > made more reproducible by writeback being converted to workqueue some
> > releases ago.  The bug has nothing to do with libata but it's just an
> > workaround which is easy to backport.  After discussion, Rafael and I
> > seem to agree that we don't really need kernel freezables - both
> > kthread and workqueue.  There are few specific workqueues which
> > constitute PM operations and require freezing, which will be converted
> > to use workqueue_set_max_active() instead.  All other kernel freezer
> > uses are planned to be removed, followed by the removal of kthread and
> > workqueue freezer support, hopefully.
> 
> Wait a minute.  I don't recall anybody mentioning this earlier.  What 
> about khubd?  There isn't any plan to remove _it_.

No, but we are going to replace the freezing of kernel stuff with something
more direct, like "suspend" routines called from the system suspend code path
and causing things to stop (and corresponding "resume" starting them again).

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [GIT PULL] libata fixes for v3.13-rc5
  2013-12-25 14:45   ` Rafael J. Wysocki
@ 2013-12-25 14:57     ` Alan Stern
  2013-12-25 22:12       ` Rafael J. Wysocki
  0 siblings, 1 reply; 16+ messages in thread
From: Alan Stern @ 2013-12-25 14:57 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Tejun Heo, Linus Torvalds, linux-kernel, linux-ide, linux-pm

On Wed, 25 Dec 2013, Rafael J. Wysocki wrote:

> On Tuesday, December 24, 2013 04:55:46 PM Alan Stern wrote:
> > On Tue, 24 Dec 2013, Tejun Heo wrote:
> > 
> > > Hello, Linus.
> > > 
> > > libata fixes for v3.13-rc5.  There's one interseting commit - "libata,
> > > freezer: avoid block device removal while system is frozen".  It's an
> > > ugly hack working around a deadlock condition between driver core
> > > resume and block layer device removal paths through freezer which was
> > > made more reproducible by writeback being converted to workqueue some
> > > releases ago.  The bug has nothing to do with libata but it's just an
> > > workaround which is easy to backport.  After discussion, Rafael and I
> > > seem to agree that we don't really need kernel freezables - both
> > > kthread and workqueue.  There are few specific workqueues which
> > > constitute PM operations and require freezing, which will be converted
> > > to use workqueue_set_max_active() instead.  All other kernel freezer
> > > uses are planned to be removed, followed by the removal of kthread and
> > > workqueue freezer support, hopefully.
> > 
> > Wait a minute.  I don't recall anybody mentioning this earlier.  What 
> > about khubd?  There isn't any plan to remove _it_.
> 
> No, but we are going to replace the freezing of kernel stuff with something
> more direct, like "suspend" routines called from the system suspend code path
> and causing things to stop (and corresponding "resume" starting them again).

Is this discussed in more detail somewhere (an email thread, for
example)?

Alan Stern


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [GIT PULL] libata fixes for v3.13-rc5
  2013-12-25 14:57     ` Alan Stern
@ 2013-12-25 22:12       ` Rafael J. Wysocki
  2013-12-26  3:29         ` No freezing of kernel threads (was: Re: [GIT PULL] libata fixes for v3.13-rc5) Alan Stern
  0 siblings, 1 reply; 16+ messages in thread
From: Rafael J. Wysocki @ 2013-12-25 22:12 UTC (permalink / raw)
  To: Alan Stern; +Cc: Tejun Heo, Linus Torvalds, linux-kernel, linux-ide, linux-pm

On Wednesday, December 25, 2013 09:57:09 AM Alan Stern wrote:
> On Wed, 25 Dec 2013, Rafael J. Wysocki wrote:
> 
> > On Tuesday, December 24, 2013 04:55:46 PM Alan Stern wrote:
> > > On Tue, 24 Dec 2013, Tejun Heo wrote:
> > > 
> > > > Hello, Linus.
> > > > 
> > > > libata fixes for v3.13-rc5.  There's one interseting commit - "libata,
> > > > freezer: avoid block device removal while system is frozen".  It's an
> > > > ugly hack working around a deadlock condition between driver core
> > > > resume and block layer device removal paths through freezer which was
> > > > made more reproducible by writeback being converted to workqueue some
> > > > releases ago.  The bug has nothing to do with libata but it's just an
> > > > workaround which is easy to backport.  After discussion, Rafael and I
> > > > seem to agree that we don't really need kernel freezables - both
> > > > kthread and workqueue.  There are few specific workqueues which
> > > > constitute PM operations and require freezing, which will be converted
> > > > to use workqueue_set_max_active() instead.  All other kernel freezer
> > > > uses are planned to be removed, followed by the removal of kthread and
> > > > workqueue freezer support, hopefully.
> > > 
> > > Wait a minute.  I don't recall anybody mentioning this earlier.  What 
> > > about khubd?  There isn't any plan to remove _it_.
> > 
> > No, but we are going to replace the freezing of kernel stuff with something
> > more direct, like "suspend" routines called from the system suspend code path
> > and causing things to stop (and corresponding "resume" starting them again).
> 
> Is this discussed in more detail somewhere (an email thread, for
> example)?

This one, more or less: https://lkml.org/lkml/2013/12/13/402

Thanks,
Rafael


^ permalink raw reply	[flat|nested] 16+ messages in thread

* No freezing of kernel threads (was: Re: [GIT PULL] libata fixes for v3.13-rc5)
  2013-12-25 22:12       ` Rafael J. Wysocki
@ 2013-12-26  3:29         ` Alan Stern
  2013-12-26  4:18           ` Tejun Heo
  0 siblings, 1 reply; 16+ messages in thread
From: Alan Stern @ 2013-12-26  3:29 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Tejun Heo, Linus Torvalds, Kernel development list, linux-ide,
	Linux-pm mailing list

n Wed, 25 Dec 2013, Rafael J. Wysocki wrote:

> > Is this discussed in more detail somewhere (an email thread, for
> > example)?
> 
> This one, more or less: https://lkml.org/lkml/2013/12/13/402

Thanks.  As I understand it (correct me if this is wrong), the problem
is that some subsystems wait for a freezable work queue or kthread to
carry out some job, and they do this as part of their resume pathway.  
Obviously this leads to deadlock.

One possible answer is to fix the subsystems -- make them not do that
any more.  Easier said than done, I suppose.  In any case, this is a
class of errors that is not easily detectable at the moment (could
lockdep be taught to recognize it?).

But I don't see how swinging to the other extreme (i.e., making no
kernel threads or work queues freezable) really solves anything.  
Those things are freezable for real reasons; they do stuff that must
not happen while the system is in the middle of a sleep transition.

Thus...  If a subsystem's resume pathway depends on something happening 
which must not happen during a sleep transition, then something is 
fundamentally broken.

Perhaps the problem could be solved by a finer-grained approach.  For
example, maybe some of these work queues or kthreads need to be frozen
only while the system is suspending, so they can safely be thawed when
the resume begins.  Would that fix the problem that began this 
discussion?

I know that in the case of khubd, we really do want it to remain frozen 
throughout the entire sleep transition.

Alan Stern

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: No freezing of kernel threads (was: Re: [GIT PULL] libata fixes for v3.13-rc5)
  2013-12-26  3:29         ` No freezing of kernel threads (was: Re: [GIT PULL] libata fixes for v3.13-rc5) Alan Stern
@ 2013-12-26  4:18           ` Tejun Heo
  2013-12-26 15:05             ` Rafael J. Wysocki
  0 siblings, 1 reply; 16+ messages in thread
From: Tejun Heo @ 2013-12-26  4:18 UTC (permalink / raw)
  To: Alan Stern
  Cc: Rafael J. Wysocki, Linus Torvalds, Kernel development list,
	linux-ide, Linux-pm mailing list

Hello, Alan.

On Wed, Dec 25, 2013 at 10:29:30PM -0500, Alan Stern wrote:
> Thanks.  As I understand it (correct me if this is wrong), the problem
> is that some subsystems wait for a freezable work queue or kthread to
> carry out some job, and they do this as part of their resume pathway.  
> Obviously this leads to deadlock.

It's not even that.  Freezer is too big and ends up causing deadlock
which doesn't have much to do with resuming in itself.  The observed
deadlock is between driver core resume and block layer removal paths.
While the removal operation is kicked off by bus rescan during resume,
it isn't an inherent part of libata resume.  ie. any block device
removal can race against driver core freezer path.

> But I don't see how swinging to the other extreme (i.e., making no
> kernel threads or work queues freezable) really solves anything.  
> Those things are freezable for real reasons; they do stuff that must
> not happen while the system is in the middle of a sleep transition.

Kernel freezer has a lot of similarities with BKL.  I wouldn't call
its eventual removal an extreme choice.

> Thus...  If a subsystem's resume pathway depends on something happening 
> which must not happen during a sleep transition, then something is 
> fundamentally broken.

Again, the deadlock doesn't have much to do with "a subsystem's resume
pathway".  It's just freezer behaving as too big a lock used by too
many subsystems which don't even need them eventually leading to messy
deadlock.

> Perhaps the problem could be solved by a finer-grained approach.  For

Yeah, we seem to agree on the fact that it needs to be finer grained

> example, maybe some of these work queues or kthreads need to be frozen
> only while the system is suspending, so they can safely be thawed when
> the resume begins.  Would that fix the problem that began this 
> discussion?

but not how to achieve it.  Kernel freezer's problem is that it's too
big a mechanism for the given problem.  What we need is a mechanism to
plug a handful (from what I've seen, there aren't too many legit
users) of kthreads / workqueues, but what we have is a mechanism which
is integral part of the whole task and workqueue machineries.

>From the beginning and still, it's used as a voodoo mechanism which
somehow makes the machine "mostly stopped" before performing PM
operations.  That "mostly stopped" has never been well defined and
always been a source of confusions causing people to just believe it's
somehow all quiet and good without actually pondering what should
happen - freezing of kthreads isn't atomic and doesn't have any
inherent ordering among different kthreads / wqs && freezing kthread /
wq doesn't mean the layer is actually empty or inactive.  irq / bh
paths aren't blocked at all.

A common misconception seems to be that by marking the workers in the
upper layers freezable, somehow things are blocked from all the
sources and thus the actual device, which needs to be plugged across
suspend, is safe.  It's damaging because it'd work in many, if not
most, cases but can never be made fully reliable.

> I know that in the case of khubd, we really do want it to remain frozen 
> throughout the entire sleep transition.

I could be wrong but the only legitimate use cases of kernel freezer
that I've seen are drivers using it as a fast way to implement
suspend/resume callbacks and khubd seems to fall in that category.  I
think it'd be actually healthier to implement it as a part of the
usual PM notification / callback mechanism - a natural benefit would
be helping consolidate system and runtime PM paths.  The conversion
should be fairly simple and we definitely can add helpers and whatnot
if we end up having to convert more than a handful.

The danger is that because we've been plugging so many sources, we're
likely to have lower level queues which aren't properly plugged.
Those are broken but could have been working most of the time.
Removing freezing from the upper layers could expose those issues, so
the conversion as a whole may have some challanges in that department.
It has been quite a while since I went through PM paths of many
different drivers so I'd defer the judgement to Rafael.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: No freezing of kernel threads (was: Re: [GIT PULL] libata fixes for v3.13-rc5)
  2013-12-26  4:18           ` Tejun Heo
@ 2013-12-26 15:05             ` Rafael J. Wysocki
  2013-12-26 16:05               ` Tejun Heo
  0 siblings, 1 reply; 16+ messages in thread
From: Rafael J. Wysocki @ 2013-12-26 15:05 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Alan Stern, Linus Torvalds, Kernel development list, linux-ide,
	Linux-pm mailing list

Hi Tejun,

On Wednesday, December 25, 2013 11:18:56 PM Tejun Heo wrote:
> Hello, Alan.
> 
> On Wed, Dec 25, 2013 at 10:29:30PM -0500, Alan Stern wrote:
> > Thanks.  As I understand it (correct me if this is wrong), the problem
> > is that some subsystems wait for a freezable work queue or kthread to
> > carry out some job, and they do this as part of their resume pathway.  
> > Obviously this leads to deadlock.
> 
> It's not even that.  Freezer is too big and ends up causing deadlock
> which doesn't have much to do with resuming in itself.  The observed
> deadlock is between driver core resume and block layer removal paths.
> While the removal operation is kicked off by bus rescan during resume,
> it isn't an inherent part of libata resume.  ie. any block device
> removal can race against driver core freezer path.

It looks like we're talking about two problems somewhat related to each other.

First, there is a removal-vs-resume deadlock which technically is related to
the freezing, but it is not entirely clear to me that using a more fine grained
mechanism instead of the freezing will actually help here.  It may turn out to
be necessary to defer the removal until the resume is complete anyway.

Second, there is the problem with the freezer being a huge sledgehammer which
sometimes is used too eagerly and without enough understanding.  I agree that
that is a problem and probably the best way to address it would be to make
people use more specific things instead of the freezer for workqueues/kthreads.

> > But I don't see how swinging to the other extreme (i.e., making no
> > kernel threads or work queues freezable) really solves anything.  
> > Those things are freezable for real reasons; they do stuff that must
> > not happen while the system is in the middle of a sleep transition.
> 
> Kernel freezer has a lot of similarities with BKL.  I wouldn't call
> its eventual removal an extreme choice.
> 
> > Thus...  If a subsystem's resume pathway depends on something happening 
> > which must not happen during a sleep transition, then something is 
> > fundamentally broken.
> 
> Again, the deadlock doesn't have much to do with "a subsystem's resume
> pathway".  It's just freezer behaving as too big a lock used by too
> many subsystems which don't even need them eventually leading to messy
> deadlock.
> 
> > Perhaps the problem could be solved by a finer-grained approach.  For
> 
> Yeah, we seem to agree on the fact that it needs to be finer grained
> 
> > example, maybe some of these work queues or kthreads need to be frozen
> > only while the system is suspending, so they can safely be thawed when
> > the resume begins.  Would that fix the problem that began this 
> > discussion?
> 
> but not how to achieve it.  Kernel freezer's problem is that it's too
> big a mechanism for the given problem.  What we need is a mechanism to
> plug a handful (from what I've seen, there aren't too many legit
> users) of kthreads / workqueues, but what we have is a mechanism which
> is integral part of the whole task and workqueue machineries.
> 
> From the beginning and still, it's used as a voodoo mechanism which
> somehow makes the machine "mostly stopped" before performing PM
> operations.  That "mostly stopped" has never been well defined and
> always been a source of confusions causing people to just believe it's
> somehow all quiet and good without actually pondering what should
> happen - freezing of kthreads isn't atomic and doesn't have any
> inherent ordering among different kthreads / wqs && freezing kthread /
> wq doesn't mean the layer is actually empty or inactive.  irq / bh
> paths aren't blocked at all.

Well, we agreed long ago that making the freezer any more "complete" (in
terms of stopping more things) would be harmful, but then we thought that
it was not necessary to rework the subsystems using it already.  Perhaps
that was a mistake.

> A common misconception seems to be that by marking the workers in the
> upper layers freezable, somehow things are blocked from all the
> sources and thus the actual device, which needs to be plugged across
> suspend, is safe.  It's damaging because it'd work in many, if not
> most, cases but can never be made fully reliable.
> 
> > I know that in the case of khubd, we really do want it to remain frozen 
> > throughout the entire sleep transition.
> 
> I could be wrong but the only legitimate use cases of kernel freezer
> that I've seen are drivers using it as a fast way to implement
> suspend/resume callbacks and khubd seems to fall in that category.  I
> think it'd be actually healthier to implement it as a part of the
> usual PM notification / callback mechanism - a natural benefit would
> be helping consolidate system and runtime PM paths.  The conversion
> should be fairly simple and we definitely can add helpers and whatnot
> if we end up having to convert more than a handful.

I actually think that khubd is somewhat analogous to the PM workqueue, so
it needs to be stopped before we start calling suspend callbacks or we'd
have to deal with a lot more complexity than really necessary.  There may
be more things like that, but that's hard to tell without reviewing all
users of freezable kthreads and workqueues and analysing them all.  So I
guess that's the next step if we want to go into that direction.

> The danger is that because we've been plugging so many sources, we're
> likely to have lower level queues which aren't properly plugged.
> Those are broken but could have been working most of the time.
> Removing freezing from the upper layers could expose those issues, so
> the conversion as a whole may have some challanges in that department.
> It has been quite a while since I went through PM paths of many
> different drivers so I'd defer the judgement to Rafael.

Well, it looks like we don't really know why things are done the way they
are done at least in some cases, so in my personal view it would be good to
go through all of the kernel freezer users just for this reason alone.  We
can't really say which of them are legitimate without that and how difficult
it would be for them to switch over to using something more fine grained than
the freezer.

I'm not worried about workqueues, becuase I see no reason why they can't use
workqueue_set_max_active() the way we discussed (after the two patches of yours
that haven't been applied yet), but kthreads are a somewhat different matter.

Thanks,
Rafael


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: No freezing of kernel threads (was: Re: [GIT PULL] libata fixes for v3.13-rc5)
  2013-12-26 15:05             ` Rafael J. Wysocki
@ 2013-12-26 16:05               ` Tejun Heo
  2013-12-26 18:42                 ` Alan Stern
                                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Tejun Heo @ 2013-12-26 16:05 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Alan Stern, Linus Torvalds, Kernel development list, linux-ide,
	Linux-pm mailing list

Hello, Rafael.

On Thu, Dec 26, 2013 at 04:05:53PM +0100, Rafael J. Wysocki wrote:
> First, there is a removal-vs-resume deadlock which technically is related to
> the freezing, but it is not entirely clear to me that using a more fine grained
> mechanism instead of the freezing will actually help here.  It may turn out to
> be necessary to defer the removal until the resume is complete anyway.
> 
> Second, there is the problem with the freezer being a huge sledgehammer which
> sometimes is used too eagerly and without enough understanding.  I agree that
> that is a problem and probably the best way to address it would be to make
> people use more specific things instead of the freezer for workqueues/kthreads.

I don't think the two points are separate.  There's no reason for
anything above device drivers (themselves or whatever midlayer
implementing command queue) to be participating in the freezer and
this deadlock occurred only because writeback is freezable for no good
reason that I can see.  If we make "freezing" specific to the places
where the actual PM operations are necessary, these lockups cannot
happen, or rather, if a deadlock happens, the blame would likely be
clearly on the device driver or its subsystem implementation.

> Well, we agreed long ago that making the freezer any more "complete" (in
> terms of stopping more things) would be harmful, but then we thought that
> it was not necessary to rework the subsystems using it already.  Perhaps
> that was a mistake.

Hmmm... it doesn't look like we have too many freezable users in the
kernel at this point.  Hopefully doing a sweep through what's
remaining wouldn't be too hard?

> I actually think that khubd is somewhat analogous to the PM workqueue, so
> it needs to be stopped before we start calling suspend callbacks or we'd
> have to deal with a lot more complexity than really necessary.  There may
> be more things like that, but that's hard to tell without reviewing all
> users of freezable kthreads and workqueues and analysing them all.  So I
> guess that's the next step if we want to go into that direction.

I see.  I'm kinda curious how that jives with runtime PM.  One thing
which bothers me about the freezer is that it's essentially a separate
entry point for suspend/resume implementation, and not a particularly
well designed one at that.  Things which depend on freezer for PM ops
would need completely separate paths for runtime PM.  They probably
need some deviations anyway but freezer would push it unnecessarily.

> Well, it looks like we don't really know why things are done the way they
> are done at least in some cases, so in my personal view it would be good to
> go through all of the kernel freezer users just for this reason alone.  We
> can't really say which of them are legitimate without that and how difficult
> it would be for them to switch over to using something more fine grained than
> the freezer.

I'm a bit worried about things which may not be explicit.
ie. something which is broken but sorta working because things like
writeback and jbd are frozen.  I think I worry about that because I
remember one argument for kernel freezer from way way back, that it's
too hard to implement proper suspend/resume for all drivers and by
freezing most kthreads things should mostly work, which sounded pretty
crazy to me even back then.  Hopefully, we don't have much left
depending on such magic.

> I'm not worried about workqueues, becuase I see no reason why they can't use
> workqueue_set_max_active() the way we discussed (after the two patches of yours
> that haven't been applied yet), but kthreads are a somewhat different matter.

Well, converting kthreads to workqueues are pretty easy and usually
beneficial anyway, especially given that people often get things not
completely right when mixing kthread_should_stop() and freezing
condition checks (it can be surprisingly tricky and likely to work
most of the time even when slightly broken) and workqueue is a lot
easier to get right on that respect.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: No freezing of kernel threads (was: Re: [GIT PULL] libata fixes for v3.13-rc5)
  2013-12-26 16:05               ` Tejun Heo
@ 2013-12-26 18:42                 ` Alan Stern
  2013-12-26 19:01                   ` Tejun Heo
  2013-12-26 23:49                 ` Rafael J. Wysocki
  2014-01-11 21:15                 ` Pavel Machek
  2 siblings, 1 reply; 16+ messages in thread
From: Alan Stern @ 2013-12-26 18:42 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Rafael J. Wysocki, Linus Torvalds, Kernel development list,
	linux-ide, Linux-pm mailing list

On Thu, 26 Dec 2013, Tejun Heo wrote:

> Hello, Rafael.
> 
> On Thu, Dec 26, 2013 at 04:05:53PM +0100, Rafael J. Wysocki wrote:
> > First, there is a removal-vs-resume deadlock which technically is related to
> > the freezing, but it is not entirely clear to me that using a more fine grained
> > mechanism instead of the freezing will actually help here.  It may turn out to
> > be necessary to defer the removal until the resume is complete anyway.

In fact, some places may be using the freezer as a convenient way to 
implement just such a deferral.

> > Second, there is the problem with the freezer being a huge sledgehammer which
> > sometimes is used too eagerly and without enough understanding.  I agree that
> > that is a problem and probably the best way to address it would be to make
> > people use more specific things instead of the freezer for workqueues/kthreads.
> 
> I don't think the two points are separate.  There's no reason for
> anything above device drivers (themselves or whatever midlayer
> implementing command queue) to be participating in the freezer and
> this deadlock occurred only because writeback is freezable for no good
> reason that I can see.

I don't know why writeback was made freezable -- but that would be a
reasonable way to avoid changing on-disk data during hibernation.

>  If we make "freezing" specific to the places
> where the actual PM operations are necessary, these lockups cannot
> happen, or rather, if a deadlock happens, the blame would likely be
> clearly on the device driver or its subsystem implementation.

In the case of hibernation, it's not so simple.  We do need to perform 
I/O, in order to save the memory image.  But we also need to avoid 
unnecessary I/O, in order to keep the on-disk data consistent with the 
data in the memory image.  You probably can't accomplish this at the 
device driver or subsystem level.

> > I actually think that khubd is somewhat analogous to the PM workqueue, so
> > it needs to be stopped before we start calling suspend callbacks or we'd
> > have to deal with a lot more complexity than really necessary.  There may
> > be more things like that, but that's hard to tell without reviewing all
> > users of freezable kthreads and workqueues and analysing them all.  So I
> > guess that's the next step if we want to go into that direction.
> 
> I see.  I'm kinda curious how that jives with runtime PM.  One thing

There are some very significant differences between system sleep and 
runtime PM.  What is appropriate for one may not be appropriate for the 
other.  In particular, the freezer is not appropriate for runtime PM.

> which bothers me about the freezer is that it's essentially a separate
> entry point for suspend/resume implementation, and not a particularly
> well designed one at that.  Things which depend on freezer for PM ops
> would need completely separate paths for runtime PM.  They probably
> need some deviations anyway but freezer would push it unnecessarily.

Maybe it's the other way around: The separate paths are necessary, and 
the freezer _simplifies_ the system sleep ops.

> Well, converting kthreads to workqueues are pretty easy and usually
> beneficial anyway, especially given that people often get things not
> completely right when mixing kthread_should_stop() and freezing
> condition checks (it can be surprisingly tricky and likely to work
> most of the time even when slightly broken) and workqueue is a lot
> easier to get right on that respect.

Taking khubd as an example, I have to agree that converting it to a
workqueue would be a big simplification overall.  And yet there are
some things khubd does which are (as far as I know) rather difficult to
accomplish with workqueues.  One example in drivers/usb/core/hub.c:  
kick_khubd() calls usb_autopm_get_interface_no_resume() if and only if
it added the hub to the event list (and it does so before releasing the
list's lock).  How can you do that with a workqueue?

Alan Stern

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: No freezing of kernel threads (was: Re: [GIT PULL] libata fixes for v3.13-rc5)
  2013-12-26 18:42                 ` Alan Stern
@ 2013-12-26 19:01                   ` Tejun Heo
  2013-12-26 23:23                     ` Rafael J. Wysocki
  2013-12-27  2:14                     ` Alan Stern
  0 siblings, 2 replies; 16+ messages in thread
From: Tejun Heo @ 2013-12-26 19:01 UTC (permalink / raw)
  To: Alan Stern
  Cc: Rafael J. Wysocki, Linus Torvalds, Kernel development list,
	linux-ide, Linux-pm mailing list

Hey,

On Thu, Dec 26, 2013 at 01:42:29PM -0500, Alan Stern wrote:
> In the case of hibernation, it's not so simple.  We do need to perform 
> I/O, in order to save the memory image.  But we also need to avoid 
> unnecessary I/O, in order to keep the on-disk data consistent with the 
> data in the memory image.  You probably can't accomplish this at the 
> device driver or subsystem level.

That was what I assumed too but Rafael tells me it has nothing to do
with hibernation.

> > which bothers me about the freezer is that it's essentially a separate
> > entry point for suspend/resume implementation, and not a particularly
> > well designed one at that.  Things which depend on freezer for PM ops
> > would need completely separate paths for runtime PM.  They probably
> > need some deviations anyway but freezer would push it unnecessarily.
> 
> Maybe it's the other way around: The separate paths are necessary, and 
> the freezer _simplifies_ the system sleep ops.

Again, the point is it's too big a tool given the problem with history
of abuse.  It sure is "convenient" to have tools at that level for
that particular user - not because the task at hand fits such solution
but because a lot more is being paid elsewhere.  It just is out of
proportion and isn't a good design in larger sense.

As for autopm vs. system pm, there sure are necessary differences
between the two but they also can share a lot.  At least, it looks
that way from libata side.  I really don't think having two separate
paradigms in implementing PM is a good idea even if the two paths have
to deviate in significant ways.

> Taking khubd as an example, I have to agree that converting it to a
> workqueue would be a big simplification overall.  And yet there are
> some things khubd does which are (as far as I know) rather difficult to
> accomplish with workqueues.  One example in drivers/usb/core/hub.c:  
> kick_khubd() calls usb_autopm_get_interface_no_resume() if and only if
> it added the hub to the event list (and it does so before releasing the
> list's lock).  How can you do that with a workqueue?

Do the same thing and just replace wake_up() with queue_work()?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: No freezing of kernel threads (was: Re: [GIT PULL] libata fixes for v3.13-rc5)
  2013-12-26 19:01                   ` Tejun Heo
@ 2013-12-26 23:23                     ` Rafael J. Wysocki
  2013-12-27  2:14                     ` Alan Stern
  1 sibling, 0 replies; 16+ messages in thread
From: Rafael J. Wysocki @ 2013-12-26 23:23 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Alan Stern, Linus Torvalds, Kernel development list, linux-ide,
	Linux-pm mailing list

On Thursday, December 26, 2013 02:01:20 PM Tejun Heo wrote:
> Hey,
> 
> On Thu, Dec 26, 2013 at 01:42:29PM -0500, Alan Stern wrote:
> > In the case of hibernation, it's not so simple.  We do need to perform 
> > I/O, in order to save the memory image.  But we also need to avoid 
> > unnecessary I/O, in order to keep the on-disk data consistent with the 
> > data in the memory image.  You probably can't accomplish this at the 
> > device driver or subsystem level.
> 
> That was what I assumed too but Rafael tells me it has nothing to do
> with hibernation.

It doesn't prevent on-disk data corruption from happening in case of a failing
hibernation.  In case of a successful hibernation it is key to keep on-disk
data in sync with the contents of the image, but relying on the freezing
for that is rather not a winning strategy, so to speak.  It would be better
to freeze filesystems instead (which has been discussed in another thread
recently).

Thanks,
Rafael


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: No freezing of kernel threads (was: Re: [GIT PULL] libata fixes for v3.13-rc5)
  2013-12-26 16:05               ` Tejun Heo
  2013-12-26 18:42                 ` Alan Stern
@ 2013-12-26 23:49                 ` Rafael J. Wysocki
  2014-01-11 21:15                 ` Pavel Machek
  2 siblings, 0 replies; 16+ messages in thread
From: Rafael J. Wysocki @ 2013-12-26 23:49 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Alan Stern, Linus Torvalds, Kernel development list, linux-ide,
	Linux-pm mailing list

On Thursday, December 26, 2013 11:05:17 AM Tejun Heo wrote:
> Hello, Rafael.

Hi Tejun,

> On Thu, Dec 26, 2013 at 04:05:53PM +0100, Rafael J. Wysocki wrote:
> > First, there is a removal-vs-resume deadlock which technically is related to
> > the freezing, but it is not entirely clear to me that using a more fine grained
> > mechanism instead of the freezing will actually help here.  It may turn out to
> > be necessary to defer the removal until the resume is complete anyway.
> > 
> > Second, there is the problem with the freezer being a huge sledgehammer which
> > sometimes is used too eagerly and without enough understanding.  I agree that
> > that is a problem and probably the best way to address it would be to make
> > people use more specific things instead of the freezer for workqueues/kthreads.
> 
> I don't think the two points are separate.  There's no reason for
> anything above device drivers (themselves or whatever midlayer
> implementing command queue) to be participating in the freezer and
> this deadlock occurred only because writeback is freezable for no good
> reason that I can see.  If we make "freezing" specific to the places
> where the actual PM operations are necessary, these lockups cannot
> happen, or rather, if a deadlock happens, the blame would likely be
> clearly on the device driver or its subsystem implementation.

The rule is that system suspend (or hibernation) can happen at any time and
certain things that may be running at that point need to be stopped, this way
or another.  The stopping may be done by individual drivers (through PM
callbacks) or at a higher level (e.g. through the freezer) and what makes more
sense depends on the specific case in my opinion.

Runtime PM has a mechanism by which you can say "do not run PM callbacks for
that device now" which makes things quite a bit different.

> > Well, we agreed long ago that making the freezer any more "complete" (in
> > terms of stopping more things) would be harmful, but then we thought that
> > it was not necessary to rework the subsystems using it already.  Perhaps
> > that was a mistake.
> 
> Hmmm... it doesn't look like we have too many freezable users in the
> kernel at this point.  Hopefully doing a sweep through what's
> remaining wouldn't be too hard?
> 
> > I actually think that khubd is somewhat analogous to the PM workqueue, so
> > it needs to be stopped before we start calling suspend callbacks or we'd
> > have to deal with a lot more complexity than really necessary.  There may
> > be more things like that, but that's hard to tell without reviewing all
> > users of freezable kthreads and workqueues and analysing them all.  So I
> > guess that's the next step if we want to go into that direction.
> 
> I see.  I'm kinda curious how that jives with runtime PM.  One thing
> which bothers me about the freezer is that it's essentially a separate
> entry point for suspend/resume implementation, and not a particularly
> well designed one at that.  Things which depend on freezer for PM ops
> would need completely separate paths for runtime PM.  They probably
> need some deviations anyway but freezer would push it unnecessarily.

As I said above, using the freezer is just one of possible ways to stop things
that need to be stopped.  Whoever choses the freezer for that should better
know why exactly or it most likely is a bug.

> > Well, it looks like we don't really know why things are done the way they
> > are done at least in some cases, so in my personal view it would be good to
> > go through all of the kernel freezer users just for this reason alone.  We
> > can't really say which of them are legitimate without that and how difficult
> > it would be for them to switch over to using something more fine grained than
> > the freezer.
> 
> I'm a bit worried about things which may not be explicit.
> ie. something which is broken but sorta working because things like
> writeback and jbd are frozen.  I think I worry about that because I
> remember one argument for kernel freezer from way way back, that it's
> too hard to implement proper suspend/resume for all drivers and by
> freezing most kthreads things should mostly work, which sounded pretty
> crazy to me even back then.  Hopefully, we don't have much left
> depending on such magic.

I *hope* we don't.  In any case, we certainly need to know *why* things are
done in a specific way, whatever that way is.  In this particular case, if
some kthread/workqueue is frozen over suspend/resume, we should know why
exactly (i.e. what problem the freezing is supposed to address) or all that
becomes voodoo programming.

> > I'm not worried about workqueues, becuase I see no reason why they can't use
> > workqueue_set_max_active() the way we discussed (after the two patches of yours
> > that haven't been applied yet), but kthreads are a somewhat different matter.
> 
> Well, converting kthreads to workqueues are pretty easy and usually
> beneficial anyway, especially given that people often get things not
> completely right when mixing kthread_should_stop() and freezing
> condition checks (it can be surprisingly tricky and likely to work
> most of the time even when slightly broken) and workqueue is a lot
> easier to get right on that respect.

Well, agreed. :-)

Thanks,
Rafael


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: No freezing of kernel threads (was: Re: [GIT PULL] libata fixes for v3.13-rc5)
  2013-12-26 19:01                   ` Tejun Heo
  2013-12-26 23:23                     ` Rafael J. Wysocki
@ 2013-12-27  2:14                     ` Alan Stern
  2013-12-27  2:21                       ` Tejun Heo
  1 sibling, 1 reply; 16+ messages in thread
From: Alan Stern @ 2013-12-27  2:14 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Rafael J. Wysocki, Linus Torvalds, Kernel development list,
	linux-ide, Linux-pm mailing list

Hello,

On Thu, 26 Dec 2013, Tejun Heo wrote:

> > Maybe it's the other way around: The separate paths are necessary, and 
> > the freezer _simplifies_ the system sleep ops.
> 
> Again, the point is it's too big a tool given the problem with history
> of abuse.  It sure is "convenient" to have tools at that level for
> that particular user - not because the task at hand fits such solution
> but because a lot more is being paid elsewhere.  It just is out of
> proportion and isn't a good design in larger sense.

I can't disagree with this.  But the design may well be perfectly
adequate for some use cases.  Given a workqueue or kthread which should
not operate during system sleep, we have to:

	Tell the wq/thread to stop running because a sleep is about
	to start, and

	Provide a function the wq/thread can call to put itself on 
	hold for the duration of the sleep.

The freezer does both these things pretty efficiently.  Problems may
arise, though, because of workqueues or kthreads which need to continue 
some but not all of their operations, or which need to be put on hold 
in a specific sequence with respect to other threads.  The freezer is 
not suited for such things.

(There is a similar problem when userspace gets frozen.  Tasks 
implementing FUSE filesystems, in particular, have a deplorable 
tendency to get frozen at inconvenient times.)

> As for autopm vs. system pm, there sure are necessary differences
> between the two but they also can share a lot.  At least, it looks
> that way from libata side.  I really don't think having two separate
> paradigms in implementing PM is a good idea even if the two paths have
> to deviate in significant ways.

Put it the other way around: Implementing two significantly different
kinds of PM is okay because the two paths can share a lot.

Really.  Although system PM and runtime PM may appear similar from some
points of view, underneath they are quite different in several
important ways.  Most of what they have in common is the idea of 
putting devices into or out of low-power states.  But when and how to 
do these things...

> > Taking khubd as an example, I have to agree that converting it to a
> > workqueue would be a big simplification overall.  And yet there are
> > some things khubd does which are (as far as I know) rather difficult to
> > accomplish with workqueues.  One example in drivers/usb/core/hub.c:  
> > kick_khubd() calls usb_autopm_get_interface_no_resume() if and only if
> > it added the hub to the event list (and it does so before releasing the
> > list's lock).  How can you do that with a workqueue?
> 
> Do the same thing and just replace wake_up() with queue_work()?

So you're suggesting changing the kthread to a workqueue thread, but
keeping the existing list of scheduled events instead of relying on the
workqueue's own queue of work items?  What's the advantage?  Making
such a change wouldn't simplify anything.

Alan Stern

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: No freezing of kernel threads (was: Re: [GIT PULL] libata fixes for v3.13-rc5)
  2013-12-27  2:14                     ` Alan Stern
@ 2013-12-27  2:21                       ` Tejun Heo
  0 siblings, 0 replies; 16+ messages in thread
From: Tejun Heo @ 2013-12-27  2:21 UTC (permalink / raw)
  To: Alan Stern
  Cc: Rafael J. Wysocki, Linus Torvalds, Kernel development list,
	linux-ide, Linux-pm mailing list

Hello,

On Thu, Dec 26, 2013 at 09:14:49PM -0500, Alan Stern wrote:
> I can't disagree with this.  But the design may well be perfectly
> adequate for some use cases.  Given a workqueue or kthread which should
> not operate during system sleep, we have to:
> 
> 	Tell the wq/thread to stop running because a sleep is about
> 	to start, and
> 
> 	Provide a function the wq/thread can call to put itself on 
> 	hold for the duration of the sleep.
> 
> The freezer does both these things pretty efficiently.  Problems may

I don't even like the interface itself.  It's too implicit and spread
all over the place - we basically had to spread it all over the wait
interfaces, and the implementation is far more involved than called
for.  I don't know how you're defining "efficiently" but that isn't a
word I'd use to describe the freezer.

> So you're suggesting changing the kthread to a workqueue thread, but
> keeping the existing list of scheduled events instead of relying on the
> workqueue's own queue of work items?  What's the advantage?  Making
> such a change wouldn't simplify anything.

Well, that'd be an easy first step which does away with the dedicated
kthread, likely reduces cache footprint and allows use of wq
synchronization / freezing constructs which are easier to deal with.
More involved conversion would of course be doable but even simple
conversion seems like a win.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: No freezing of kernel threads (was: Re: [GIT PULL] libata fixes for v3.13-rc5)
  2013-12-26 16:05               ` Tejun Heo
  2013-12-26 18:42                 ` Alan Stern
  2013-12-26 23:49                 ` Rafael J. Wysocki
@ 2014-01-11 21:15                 ` Pavel Machek
  2 siblings, 0 replies; 16+ messages in thread
From: Pavel Machek @ 2014-01-11 21:15 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Rafael J. Wysocki, Alan Stern, Linus Torvalds,
	Kernel development list, linux-ide, Linux-pm mailing list

Hi!

> > Well, it looks like we don't really know why things are done the way they
> > are done at least in some cases, so in my personal view it would be good to
> > go through all of the kernel freezer users just for this reason alone.  We
> > can't really say which of them are legitimate without that and how difficult
> > it would be for them to switch over to using something more fine grained than
> > the freezer.
> 
> I'm a bit worried about things which may not be explicit.
> ie. something which is broken but sorta working because things like
> writeback and jbd are frozen.  I think I worry about that because I
> remember one argument for kernel freezer from way way back, that it's
> too hard to implement proper suspend/resume for all drivers and by
> freezing most kthreads things should mostly work, which sounded pretty
> crazy to me even back then.  Hopefully, we don't have much left
> depending on such magic.

Careful there. Hibernation depends on data on disk not changing after
freeze.

So you definitely do _not_ want writeback/jbd running while uswsusp
does its image writing.

(I wonder what happens if uswsusp needs to trigger writeback to free
memory. But I'm too scared to check :-).
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2014-01-11 21:15 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-12-24 14:21 [GIT PULL] libata fixes for v3.13-rc5 Tejun Heo
2013-12-24 21:55 ` Alan Stern
2013-12-25 14:45   ` Rafael J. Wysocki
2013-12-25 14:57     ` Alan Stern
2013-12-25 22:12       ` Rafael J. Wysocki
2013-12-26  3:29         ` No freezing of kernel threads (was: Re: [GIT PULL] libata fixes for v3.13-rc5) Alan Stern
2013-12-26  4:18           ` Tejun Heo
2013-12-26 15:05             ` Rafael J. Wysocki
2013-12-26 16:05               ` Tejun Heo
2013-12-26 18:42                 ` Alan Stern
2013-12-26 19:01                   ` Tejun Heo
2013-12-26 23:23                     ` Rafael J. Wysocki
2013-12-27  2:14                     ` Alan Stern
2013-12-27  2:21                       ` Tejun Heo
2013-12-26 23:49                 ` Rafael J. Wysocki
2014-01-11 21:15                 ` Pavel Machek

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox