public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: "Rafael J. Wysocki" <rjw@sisk.pl>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Parag Warudkar <parag.lkml@gmail.com>,
	Matt Carlson <mcarlson@broadcom.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: What should PCI core do during suspend-resume? (was: Re: 2.6.29-rc3: tg3 dead after resume)
Date: Sat, 31 Jan 2009 22:42:16 +0100	[thread overview]
Message-ID: <200901312242.17273.rjw@sisk.pl> (raw)
In-Reply-To: <200901312208.20850.rjw@sisk.pl>

On Saturday 31 January 2009, Rafael J. Wysocki wrote:
> On Saturday 31 January 2009, Linus Torvalds wrote:
> > 
> > On Sat, 31 Jan 2009, Rafael J. Wysocki wrote:
> > > 
> > > Can you test the patch below, please?
> > 
> > Rafael, you're making this _way_ too difficult.
> > 
> > Don't make it use the new PM infrastructure, because that one is certainly 
> > broken: pci_pm_default_suspend_generic() is total crap.
> 
> Oh my.  I beg to differ.
> 
> > It's saving the disabled state. No WAY is that correct.
> 
> So why does it work, actually?
> 
> > That "pci_disable_enabled_device()" should be removed, but even then 
> > that's wrong, because if the driver suspend disabled it, you're now 
> > (again) saving the disabled state.
> > 
> > But all of that is only called if you use the new PM infrastructure. So 
> > the thing is, when you're trying to move the PCI-E drive to the new pm 
> > infrastructure, you're making things _worse_.
> 
> I don't really think so (see below).
> 
> > The legacy PM infrastructure at least does the whole
> > 
> > 	pci_dev->state_saved = false;
> > 	i = drv->suspend(pci_dev, state);
> > 	..
> > 	if (pci_dev->state_saved)
> > 		goto Fixup;
> > 
> > thing, which will avoid overwriting the state if it was already saved.
> > 
> > HOWEVER. The problem here (I think) is that PCI-E actually does the state 
> > save late, so it won't ever see the "state_saved" in the early ->suspend. 
> > I think a patch like the one below at least simplifies this all, and lets 
> > the PCI layer itself do all the core restore stuff.
> > 
> > The new PM infrastructure gets this totally wrong, and
> >  (a) disables the device before saving state
> 
> pci_disable_device() does really only one thing: it clears the bus master bit.
> Yes, it also calls pcibios_disable_device(), but on x86 this is a NOP.
> 
> I don't think it is SO bad, is it?
> 
> > and
> >  (b) overwrites the (now corrupted) saved state that the driver perhaps 
> >      already saved, after the driver may even have put it to sleep.
> 
> The driver using the new model is not supposed to save the state and power
> off the device.  Still, it's probably a good idea not to trust the drivers. :-)
> 
> > So let's not use the new PM infrastructure - I don't think it's ready yet.
> >
> > Let's start simplifying first. Start off by getting rid of the 
> > suspend_early/resume_late, since the PCI layer now does it for us.
> > 
> > I don't see why we don't resume with IO/MEM on, though. The legacy suspend 
> > sequence shouldn't disable them, afaik.
> 
> No, it shouldn't.
> 
> However, the patch below actually may help and it really is not too different
> from my "new infrastructure" patch.  It leaves the pcie_portdrv_restore_config()
> in the PCIe port driver's ->resume(), but that shouldn't change things,
> pci_enable_device() in there shouldn't do anything and the bus master bit
> should already be set.
> 
> The "new infrastracture" patch makes pci_disable_enabled_device() be called in
> the suspend code path, but that only disables the bus master bit, and
> pci_reenable_device() be called in the resume code path, but that only sets the
> bus master bit back again.

I should have said "in this particular case", because it actually makes a
difference for devices using interrupt pins.  Namely, pcibios_enable_device()
additionally enables PCI resources (that may make a difference, but everything
should have been restored already) and sets up an interrupt link for the
device.  If the link has been set up already, which often is the case, it
increases a reference count and exits.

Now, this reference count is not used for anything, but it was supposed to be
used for disabling the interrupt links that are no longer needed.  This
mechanism is currently disabled, but if we enable it at one point (which may
be necessary to fix suspend-resume on some boxes), it won't work if
the "enable" calls are not balanced with "disable" ones.  IOW, for every
pcibios_enable_device() call there should be a complementary
pcibios_disable_device() call.  That's why I decided to put the
disable/enable things into the PCI core's suspend/resume code paths (also,
because pci_reenable_device() has been already there for driverless devices).
This might not be the right choice, but I don't really think it does break
things.

Anyway, from what I can tell reading your messages in this thread so far,
you seem to want the PCI core to:
(1) save the state of devices during suspend (avoid doing that if the driver has
    already saved the state),
(2) put devices into D0 during resume (early),
(3) restore the state of devices during resume (early).
Still, you don't want the core to disable devices during suspend and to enable
(or reenable) them during resume.

What about putting devices into low power states?  [Note that ACPI may be
necessary for this purpose.]

What about devices with no drivers and/or without suspend/resume support?
Do you want them to be disabled during suspend and enabled during resume by
the core?  [I guess you do, they have no interrupt handlers that may break
after all.]

Thanks,
Rafael

  reply	other threads:[~2009-01-31 21:43 UTC|newest]

Thread overview: 77+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-01-29  0:14 2.6.29-rc3: tg3 dead after resume Parag Warudkar
2009-01-29  1:09 ` Linus Torvalds
2009-01-29  1:49   ` Parag Warudkar
2009-01-29  2:10     ` Linus Torvalds
2009-01-29  2:19       ` Matt Carlson
2009-01-29 22:22         ` Rafael J. Wysocki
2009-01-29 18:42     ` Matt Carlson
2009-01-29 22:06       ` Parag Warudkar
2009-01-29 22:22         ` Matt Carlson
2009-01-29 22:35           ` Parag Warudkar
2009-01-29 23:10             ` Rafael J. Wysocki
2009-01-30 18:40             ` Matt Carlson
2009-01-30 22:50               ` Parag Warudkar
2009-01-30 23:06                 ` Linus Torvalds
2009-01-30 23:33                   ` Linus Torvalds
2009-01-30 23:45                   ` Parag Warudkar
2009-01-30 23:57                     ` Linus Torvalds
2009-01-30 23:59                     ` Rafael J. Wysocki
2009-01-31  0:28                       ` Parag Warudkar
2009-01-31  0:38                         ` Rafael J. Wysocki
2009-01-31  0:44                           ` Ingo Molnar
2009-01-31  0:47                             ` Rafael J. Wysocki
2009-01-31  1:21                           ` Parag Warudkar
2009-01-31  1:37                             ` Rafael J. Wysocki
2009-01-31  1:42                               ` Parag Warudkar
2009-02-03  9:29                                 ` Rafael J. Wysocki
2009-02-03 21:27                                   ` Parag Warudkar
2009-02-03 22:15                                     ` Rafael J. Wysocki
2009-02-03 23:50                                       ` WARNING: at drivers/pci/pci-driver.c:368 Parag Warudkar
2009-02-04  0:10                                         ` Rafael J. Wysocki
2009-02-04  0:17                                           ` Parag Warudkar
2009-02-04  0:19                                             ` Parag Warudkar
2009-02-04  0:38                                       ` 2.6.29-rc3: tg3 dead after resume Parag Warudkar
2009-02-04  0:41                                         ` Rafael J. Wysocki
2009-02-07  3:00                                           ` Linus Torvalds
2009-02-07 18:03                                             ` Jesse Barnes
2009-01-31  1:46                             ` Linus Torvalds
2009-01-31  1:54                               ` Parag Warudkar
2009-01-31  2:25                                 ` Linus Torvalds
2009-01-31  2:40                                   ` Parag Warudkar
2009-01-31 18:51                                     ` Rafael J. Wysocki
2009-01-31  2:19                               ` Linus Torvalds
2009-01-31 20:45                                 ` Rafael J. Wysocki
2009-01-31  1:41                           ` Linus Torvalds
2009-01-31 21:08                             ` Rafael J. Wysocki
2009-01-31 21:42                               ` Rafael J. Wysocki [this message]
2009-01-31 21:59                                 ` What should PCI core do during suspend-resume? (was: Re: 2.6.29-rc3: tg3 dead after resume) Linus Torvalds
2009-01-31 23:08                                   ` Rafael J. Wysocki
2009-01-31 23:27                                     ` Linus Torvalds
2009-01-31 23:39                                       ` Linus Torvalds
2009-02-01  0:36                                         ` Rafael J. Wysocki
2009-02-01  1:06                                           ` Linus Torvalds
2009-02-01  1:13                                             ` Linus Torvalds
2009-02-01  1:20                                             ` Arjan van de Ven
2009-02-01  1:24                                             ` Rafael J. Wysocki
2009-02-07  9:21                                 ` Pavel Machek
2009-01-31 21:47                               ` 2.6.29-rc3: tg3 dead after resume Linus Torvalds
2009-01-31 22:46                                 ` Rafael J. Wysocki
2009-01-31 23:01                                   ` Linus Torvalds
2009-02-01  0:11                                     ` Rafael J. Wysocki
2009-02-01  0:32                                       ` Linus Torvalds
2009-02-01  0:41                                         ` Rafael J. Wysocki
2009-02-01  0:51                                         ` Linus Torvalds
2009-02-07  3:27                                           ` Benjamin Herrenschmidt
2009-02-07  3:26                                     ` Benjamin Herrenschmidt
2009-01-29 23:03         ` Rafael J. Wysocki
2009-01-29 23:41           ` Matt Carlson
2009-01-30  0:10             ` Rafael J. Wysocki
2009-01-30 22:31               ` Parag Warudkar
2009-01-30 22:36                 ` Linus Torvalds
2009-01-30 22:54                 ` Rafael J. Wysocki
2009-01-30 23:07                   ` Linus Torvalds
2009-01-30 23:13                   ` Parag Warudkar
2009-01-30 23:31                     ` Rafael J. Wysocki
2009-01-30 23:51                       ` Linus Torvalds
2009-01-31  0:07                         ` Rafael J. Wysocki
2009-01-31  0:34                         ` Rafael J. Wysocki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=200901312242.17273.rjw@sisk.pl \
    --to=rjw@sisk.pl \
    --cc=akpm@linux-foundation.org \
    --cc=davem@davemloft.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mcarlson@broadcom.com \
    --cc=netdev@vger.kernel.org \
    --cc=parag.lkml@gmail.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox