* device_suspend() levels [was Re: [patch] ACPI work on aic7xxx] [not found] ` <40FD4CFA.6070603@optonline.net> @ 2004-07-20 17:46 ` Pavel Machek 2004-07-20 18:10 ` Nathan Bryant 0 siblings, 1 reply; 13+ messages in thread From: Pavel Machek @ 2004-07-20 17:46 UTC (permalink / raw) To: Nathan Bryant; +Cc: linux-scsi, random1, Luben Tuikov, benh, kernel list Hi! > >Can you kill #if 0 code? > > Yes. This is a work in progress. Interestingly, the ifdef'd-out code was > pasted from mptbase.c in the MPT Fusion driver. If it's broken here, > it's probably broken there -- seems the state parameter passed to the > pci resume callback is intended to be a PCI D state, not an ACPI S > state. Can somebody confirm or deny? The kernel is actually passing > state 2 (D2) to the driver when I enter ACPI S3, so presumably the same > failure could happen to fusion. I'm no longer sure what should be passed there... We'll probably need to turn it into enum... Actually swsusp code in -mm actually passes value from enum, and mainline swsusp code passes 0/3. Pavel -- Horseback riding is like software... ...vgf orggre jura vgf serr. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: device_suspend() levels [was Re: [patch] ACPI work on aic7xxx] 2004-07-20 17:46 ` device_suspend() levels [was Re: [patch] ACPI work on aic7xxx] Pavel Machek @ 2004-07-20 18:10 ` Nathan Bryant 2004-07-20 18:25 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 13+ messages in thread From: Nathan Bryant @ 2004-07-20 18:10 UTC (permalink / raw) To: Pavel Machek; +Cc: linux-scsi, random1, Luben Tuikov, benh, kernel list Pavel Machek wrote: > Hi! > > >>>Can you kill #if 0 code? >> >>Yes. This is a work in progress. Interestingly, the ifdef'd-out code was >>pasted from mptbase.c in the MPT Fusion driver. If it's broken here, >>it's probably broken there -- seems the state parameter passed to the >>pci resume callback is intended to be a PCI D state, not an ACPI S >>state. Can somebody confirm or deny? The kernel is actually passing >>state 2 (D2) to the driver when I enter ACPI S3, so presumably the same >>failure could happen to fusion. > > > I'm no longer sure what should be passed there... We'll probably need > to turn it into enum... Actually swsusp code in -mm actually passes > value from enum, and mainline swsusp code passes 0/3. > > Pavel Seems to me, aside from whether it's an enum or not, it should represent a D-state not an ACPI S-state. Some platforms (Power Mac?) probably implement PCI power management but not in an ACPI way. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: device_suspend() levels [was Re: [patch] ACPI work on aic7xxx] 2004-07-20 18:10 ` Nathan Bryant @ 2004-07-20 18:25 ` Benjamin Herrenschmidt 2004-07-20 18:34 ` Nathan Bryant 0 siblings, 1 reply; 13+ messages in thread From: Benjamin Herrenschmidt @ 2004-07-20 18:25 UTC (permalink / raw) To: Nathan Bryant Cc: Pavel Machek, linux-scsi, random1, Luben Tuikov, Linux Kernel list > Seems to me, aside from whether it's an enum or not, it should represent > a D-state not an ACPI S-state. Some platforms (Power Mac?) probably > implement PCI power management but not in an ACPI way. 2 comments here: - The low level bus state (PCI D state for example) and the "linux" state should be 2 different entities. - For PCI, we probably want a hook so the arch can implement it's own version of pci_set_power_state() so that ACPI can use it's own trickery there. Ben. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: device_suspend() levels [was Re: [patch] ACPI work on aic7xxx] 2004-07-20 18:25 ` Benjamin Herrenschmidt @ 2004-07-20 18:34 ` Nathan Bryant 2004-07-20 19:10 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 13+ messages in thread From: Nathan Bryant @ 2004-07-20 18:34 UTC (permalink / raw) To: Benjamin Herrenschmidt Cc: Pavel Machek, linux-scsi, random1, Luben Tuikov, Linux Kernel list Benjamin Herrenschmidt wrote: > 2 comments here: > > - The low level bus state (PCI D state for example) and the "linux" > state should be 2 different entities. > > - For PCI, we probably want a hook so the arch can implement it's own > version of pci_set_power_state() so that ACPI can use it's own trickery > there. Ok, so the takeaway message for driver writers is to treat the pci_dev->suspend() state parameter as an opaque value as far as possible, and just pass it on to the other layers ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: device_suspend() levels [was Re: [patch] ACPI work on aic7xxx] 2004-07-20 18:34 ` Nathan Bryant @ 2004-07-20 19:10 ` Benjamin Herrenschmidt 2004-07-20 19:23 ` Pavel Machek [not found] ` <40FD82B1.8030704@optonline.net> 0 siblings, 2 replies; 13+ messages in thread From: Benjamin Herrenschmidt @ 2004-07-20 19:10 UTC (permalink / raw) To: Nathan Bryant Cc: Pavel Machek, linux-scsi, random1, Luben Tuikov, Linux Kernel list On Tue, 2004-07-20 at 14:34, Nathan Bryant wrote: > Benjamin Herrenschmidt wrote: > > > 2 comments here: > > > > - The low level bus state (PCI D state for example) and the "linux" > > state should be 2 different entities. > > > > - For PCI, we probably want a hook so the arch can implement it's own > > version of pci_set_power_state() so that ACPI can use it's own trickery > > there. > > Ok, so the takeaway message for driver writers is to treat the > pci_dev->suspend() state parameter as an opaque value as far as > possible, and just pass it on to the other layers NO ! The exact opposite in fact. I'll work on cleaning that up and write some doco this week with Pavel. Ben. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: device_suspend() levels [was Re: [patch] ACPI work on aic7xxx] 2004-07-20 19:10 ` Benjamin Herrenschmidt @ 2004-07-20 19:23 ` Pavel Machek [not found] ` <40FD82B1.8030704@optonline.net> 1 sibling, 0 replies; 13+ messages in thread From: Pavel Machek @ 2004-07-20 19:23 UTC (permalink / raw) To: Benjamin Herrenschmidt Cc: Nathan Bryant, linux-scsi, random1, Luben Tuikov, Linux Kernel list Hi! > > > 2 comments here: > > > > > > - The low level bus state (PCI D state for example) and the "linux" > > > state should be 2 different entities. > > > > > > - For PCI, we probably want a hook so the arch can implement it's own > > > version of pci_set_power_state() so that ACPI can use it's own trickery > > > there. > > > > Ok, so the takeaway message for driver writers is to treat the > > pci_dev->suspend() state parameter as an opaque value as far as > > possible, and just pass it on to the other layers > > NO ! The exact opposite in fact. I'll work on cleaning that up and > write some doco this week with Pavel. Agreed. I do not have strong opinion about Sx or Dx, but it should be enum or equivalent so it is hard to get wrong. [And not "really easy to get wrong because noone knows for sure", as it is now]. Pavel -- Horseback riding is like software... ...vgf orggre jura vgf serr. ^ permalink raw reply [flat|nested] 13+ messages in thread
[parent not found: <40FD82B1.8030704@optonline.net>]
* Re: device_suspend() levels [was Re: [patch] ACPI work on aic7xxx] [not found] ` <40FD82B1.8030704@optonline.net> @ 2004-07-20 20:41 ` Benjamin Herrenschmidt 2004-07-20 20:50 ` Nathan Bryant 0 siblings, 1 reply; 13+ messages in thread From: Benjamin Herrenschmidt @ 2004-07-20 20:41 UTC (permalink / raw) To: Nathan Bryant; +Cc: Linux Kernel list On Tue, 2004-07-20 at 16:38, Nathan Bryant wrote: > Benjamin Herrenschmidt wrote: > > NO ! The exact opposite in fact. I'll work on cleaning that up and > > write some doco this week with Pavel. > > Did more code reading. I assume the value passed to pci_dev->suspend is > going to be one of: > > enum { > PM_SUSPEND_ON, > PM_SUSPEND_STANDBY, > PM_SUSPEND_MEM, > PM_SUSPEND_DISK, > PM_SUSPEND_MAX, > }; Yes something around those lines > And the value passed to pci_set_power_state will continue to be 0..3 > (except enumerated)? We could keep those unenumerated, which power state is to be used by the device is under driver control, most of the time, they'll do D3 though, for suspend-to-RAM, and they can probably just not do anything special for suspend-to-disk (except shutting down the driver itself of course) Note regarding aix7xxx, we also need proper hooks in the SCSI stack to block the queue correctly etc... in the same way we do on IDE. I didn't have time to look into this yet. Ben. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: device_suspend() levels [was Re: [patch] ACPI work on aic7xxx] 2004-07-20 20:41 ` Benjamin Herrenschmidt @ 2004-07-20 20:50 ` Nathan Bryant 2004-07-20 21:02 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 13+ messages in thread From: Nathan Bryant @ 2004-07-20 20:50 UTC (permalink / raw) To: Benjamin Herrenschmidt; +Cc: Linux Kernel list Benjamin Herrenschmidt wrote: > Note regarding aix7xxx, we also need proper hooks in the SCSI stack to > block the queue correctly etc... in the same way we do on IDE. I didn't > have time to look into this yet. Here's what we currently do, aic7xxx_core.c - looks like it attempts to quiesce, and then refuse to suspend if we happen to be busy. This is a little messy because it's done in the suspend call rather than the save_state call, therefore resume will still be called if this routine returns an error code, which will reinitialize the device when we didn't really need to. int ahc_suspend(struct ahc_softc *ahc) { ahc_pause_and_flushwork(ahc); if (LIST_FIRST(&ahc->pending_scbs) != NULL) { ahc_unpause(ahc); return (EBUSY); } #ifdef AHC_TARGET_MODE /* * XXX What about ATIOs that have not yet been serviced? * Perhaps we should just refuse to be suspended if we * are acting in a target role. */ if (ahc->pending_device != NULL) { ahc_unpause(ahc); return (EBUSY); } #endif ahc_shutdown(ahc); return (0); } ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: device_suspend() levels [was Re: [patch] ACPI work on aic7xxx] 2004-07-20 20:50 ` Nathan Bryant @ 2004-07-20 21:02 ` Benjamin Herrenschmidt 2004-07-24 15:31 ` Nathan Bryant 0 siblings, 1 reply; 13+ messages in thread From: Benjamin Herrenschmidt @ 2004-07-20 21:02 UTC (permalink / raw) To: Nathan Bryant; +Cc: Linux Kernel list On Tue, 2004-07-20 at 16:50, Nathan Bryant wrote: > Benjamin Herrenschmidt wrote: > > > Note regarding aix7xxx, we also need proper hooks in the SCSI stack to > > block the queue correctly etc... in the same way we do on IDE. I didn't > > have time to look into this yet. > > Here's what we currently do, aic7xxx_core.c - looks like it attempts to > quiesce, and then refuse to suspend if we happen to be busy. This is a > little messy because it's done in the suspend call rather than the > save_state call, therefore resume will still be called if this routine > returns an error code, which will reinitialize the device when we didn't > really need to. save_state is a nonsense, didn't we kill it ? queue quiescing must be done by the upper level, which is a bit nasty with things like md & multipath... basically, the low level driver must have a way to notify it's functional parent (as opposed to it's bus parent) that it's going to sleep, and any path using this low level driver must then be quiesced, the parent must resume only when all the drivers it relies on are back up. > int > ahc_suspend(struct ahc_softc *ahc) > { > > ahc_pause_and_flushwork(ahc); > > if (LIST_FIRST(&ahc->pending_scbs) != NULL) { > ahc_unpause(ahc); > return (EBUSY); > } > > #ifdef AHC_TARGET_MODE > /* > * XXX What about ATIOs that have not yet been serviced? > * Perhaps we should just refuse to be suspended if we > * are acting in a target role. > */ > if (ahc->pending_device != NULL) { > ahc_unpause(ahc); > return (EBUSY); > } > #endif > ahc_shutdown(ahc); > return (0); > } -- Benjamin Herrenschmidt <benh@kernel.crashing.org> ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: device_suspend() levels [was Re: [patch] ACPI work on aic7xxx] 2004-07-20 21:02 ` Benjamin Herrenschmidt @ 2004-07-24 15:31 ` Nathan Bryant 2004-07-24 16:00 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 13+ messages in thread From: Nathan Bryant @ 2004-07-24 15:31 UTC (permalink / raw) To: Benjamin Herrenschmidt; +Cc: Linux Kernel list, Pavel Machek Benjamin Herrenschmidt wrote: >save_state is a nonsense, didn't we kill it ? queue quiescing must be >done by the upper level, which is a bit nasty with things like md & >multipath... basically, the low level driver must have a way to notify >it's functional parent (as opposed to it's bus parent) that it's going >to sleep, and any path using this low level driver must then be >quiesced, the parent must resume only when all the drivers it relies >on are back up. > Isn't sysfs supposed to take care of this for us? IOW, I shouldn't have to call up to the SCSI midlayer from pcidev->suspend in order to notify it of a suspend, the midlayer should call the driver before we ever try to suspend. This may become important some day when the upper layers need to decide which order to bring pci devices down Looking in /sys/devices shows that sysfs already knows that 'host0' is a child of a SCSI PCI device. $ ls /sys/devices/pci0000\:00/0000\:00\:1e.0/0000\:02\:02.0/host0/0\:0\:0\:0/ block detach_state model queue_depth rev state type delete device_blocked power rescan scsi_level timeout vendor ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: device_suspend() levels [was Re: [patch] ACPI work on aic7xxx] 2004-07-24 15:31 ` Nathan Bryant @ 2004-07-24 16:00 ` Benjamin Herrenschmidt 2004-07-24 16:45 ` Nathan Bryant 0 siblings, 1 reply; 13+ messages in thread From: Benjamin Herrenschmidt @ 2004-07-24 16:00 UTC (permalink / raw) To: Nathan Bryant; +Cc: Linux Kernel list, Pavel Machek On Sat, 2004-07-24 at 11:31, Nathan Bryant wrote: > Benjamin Herrenschmidt wrote: > > >save_state is a nonsense, didn't we kill it ? sysfs only takes care about the bus hierarchy as far as suspend/resume is concerned (which is the only sane way to do it imho) > queue quiescing must be > >done by the upper level, which is a bit nasty with things like md & > >multipath... basically, the low level driver must have a way to notify > >it's functional parent (as opposed to it's bus parent) that it's going > >to sleep, and any path using this low level driver must then be > >quiesced, the parent must resume only when all the drivers it relies > >on are back up. > > > Isn't sysfs supposed to take care of this for us? IOW, I shouldn't have > to call up to the SCSI midlayer from pcidev->suspend in order to notify > it of a suspend, the midlayer should call the driver before we ever try > to suspend. This may become important some day when the upper layers > need to decide which order to bring pci devices down No, the ordering cannot be dictated by the upper layer, but by the physical bus hierarchy. The low level driver gets the suspend callback and need to notify the parent. The md/multipath must keep track that one of the device it relies on is going away and thus block the queues. That is at least for machine suspend/resume. > Looking in /sys/devices shows that sysfs already knows that 'host0' is a > child of a SCSI PCI device. Yes, but the PM herarchy is the bus hierarchy, I don't see a simple way of going through both in this case ... > $ ls > /sys/devices/pci0000\:00/0000\:00\:1e.0/0000\:02\:02.0/host0/0\:0\:0\:0/ > block detach_state model queue_depth rev state type > delete device_blocked power rescan scsi_level timeout vendor -- Benjamin Herrenschmidt <benh@kernel.crashing.org> ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: device_suspend() levels [was Re: [patch] ACPI work on aic7xxx] 2004-07-24 16:00 ` Benjamin Herrenschmidt @ 2004-07-24 16:45 ` Nathan Bryant 2004-07-24 18:35 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 13+ messages in thread From: Nathan Bryant @ 2004-07-24 16:45 UTC (permalink / raw) To: Benjamin Herrenschmidt; +Cc: Linux Kernel list, Pavel Machek Benjamin Herrenschmidt wrote: > sysfs only takes care about the bus hierarchy as far as suspend/resume > is concerned (which is the only sane way to do it imho) I saw comments in one of the PCI IDE driver pcidev->suspend routines that say "we don't need to iterate over the list of drives, sysfs does that for us." > No, the ordering cannot be dictated by the upper layer, but by the > physical bus hierarchy. The low level driver gets the suspend callback > and need to notify the parent. The md/multipath must keep track that one > of the device it relies on is going away and thus block the queues. > > That is at least for machine suspend/resume. We're talking past each other. I'm saying you take into consideration the physical bus hierarchy: PCI bus x is a parent of SCSI bus y which is a parent of SCSI disk drive z. Suspend disk z, with involvement from the block layer and scsi midlayer, before even calling the actual pcidev->suspend routine on the SCSI bus adapter. Shouldn't require more than minimal LLD involvement. >>Looking in /sys/devices shows that sysfs already knows that 'host0' is a >>child of a SCSI PCI device. > > > Yes, but the PM herarchy is the bus hierarchy, I don't see a simple way > of going through both in this case ... In the case of IDE, IDE is registered as a bus_type and has generic suspend code for the whole bus that is unrelated to the pcidev. The PIIX IDE (Intel chipsets) PCI pcidev struct doesn't even have suspend and resume callbacks filled in, but it works fine! ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: device_suspend() levels [was Re: [patch] ACPI work on aic7xxx] 2004-07-24 16:45 ` Nathan Bryant @ 2004-07-24 18:35 ` Benjamin Herrenschmidt 0 siblings, 0 replies; 13+ messages in thread From: Benjamin Herrenschmidt @ 2004-07-24 18:35 UTC (permalink / raw) To: Nathan Bryant; +Cc: Linux Kernel list, Pavel Machek On Sat, 2004-07-24 at 12:45, Nathan Bryant wrote: > Benjamin Herrenschmidt wrote: > > sysfs only takes care about the bus hierarchy as far as suspend/resume > > is concerned (which is the only sane way to do it imho) > > I saw comments in one of the PCI IDE driver pcidev->suspend routines > that say "we don't need to iterate over the list of drives, sysfs does > that for us." That's different, because the disks are actually registered as "struct device" childs of the bus, and thus get proper suspend/resume callbacks. > > No, the ordering cannot be dictated by the upper layer, but by the > > physical bus hierarchy. The low level driver gets the suspend callback > > and need to notify the parent. The md/multipath must keep track that one > > of the device it relies on is going away and thus block the queues. > > > > That is at least for machine suspend/resume. > > We're talking past each other. I'm saying you take into consideration > the physical bus hierarchy: PCI bus x is a parent of SCSI bus y which is > a parent of SCSI disk drive z. Suspend disk z, with involvement from the > block layer and scsi midlayer, before even calling the actual > pcidev->suspend routine on the SCSI bus adapter. Shouldn't require more > than minimal LLD involvement. Oh sure, the disks are in the loop, the problem happens with multipath and such which "breaks" the bus hiearchy somewhat. The queue management is part of the "functional" hierarchy (read: block layer) on top of SCSI disks, thus the disks will be the one getting the suspend callback, but they have to "notify" their functional parent (block layer, md, ...) to properly get the queues stopped. IDE sort-of does that internally, by generating a special request that goes down the queue (in order to be properly ordered with whatever is pending in the queue, including pending tagged commands if any), and the "toplevel" IDE handling will stop processing the queue once that request got past, but it's a hackery that at this point is quite specific to drivers/ide/ > >>Looking in /sys/devices shows that sysfs already knows that 'host0' is a > >>child of a SCSI PCI device. > > > > > > Yes, but the PM herarchy is the bus hierarchy, I don't see a simple way > > of going through both in this case ... > > In the case of IDE, IDE is registered as a bus_type and has generic > suspend code for the whole bus that is unrelated to the pcidev. The PIIX > IDE (Intel chipsets) PCI pcidev struct doesn't even have suspend and > resume callbacks filled in, but it works fine! Well... not exactly. The pci_dev is the parent of the IDE bus in the bus hierarchy. PIIX may lack the proper callbacks (it probably need some stuff there too). For a good working example, ide/ppc/pmac.c, it is a macio_dev (or a pci_dev, depending on the ASIC model). The suspend request first reaches the disk(s), which does the queue processing I mentioned earlier & stanby's the disks. Once that's done, the parent (ide pmac) gets it's suspend call and does some suspend work on the controller HW. Resume is the opposite, the controller gets resumed first, then the child(s) (disk(s)) -- Benjamin Herrenschmidt <benh@kernel.crashing.org> ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2004-07-24 18:44 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <40FD38A0.3000603@optonline.net>
[not found] ` <20040720155928.GC10921@atrey.karlin.mff.cuni.cz>
[not found] ` <40FD4CFA.6070603@optonline.net>
2004-07-20 17:46 ` device_suspend() levels [was Re: [patch] ACPI work on aic7xxx] Pavel Machek
2004-07-20 18:10 ` Nathan Bryant
2004-07-20 18:25 ` Benjamin Herrenschmidt
2004-07-20 18:34 ` Nathan Bryant
2004-07-20 19:10 ` Benjamin Herrenschmidt
2004-07-20 19:23 ` Pavel Machek
[not found] ` <40FD82B1.8030704@optonline.net>
2004-07-20 20:41 ` Benjamin Herrenschmidt
2004-07-20 20:50 ` Nathan Bryant
2004-07-20 21:02 ` Benjamin Herrenschmidt
2004-07-24 15:31 ` Nathan Bryant
2004-07-24 16:00 ` Benjamin Herrenschmidt
2004-07-24 16:45 ` Nathan Bryant
2004-07-24 18:35 ` Benjamin Herrenschmidt
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox