* Re: Re: [PATCH] Remove process freezer from suspend to RAM pathway
From: Alan Stern @ 2007-07-04 15:12 UTC
To: Paul Mackerras
Cc: Matthew Garrett, Kernel development list, Pavel Machek, Johannes Berg, Linux-pm mailing list

On Wed, 4 Jul 2007, Paul Mackerras wrote:

> Whether or not to resume a suspended device when an I/O request comes
> in is a policy decision, and there could be cases where the user wants
> I/O requests to be blocked, or to fail, or to be dropped while the
> device is suspended, even for runtime power management. For example,
> a sound card could be suspended due to a low-battery condition, and in
> that case you would want the driver to just drop any data that
> userspace tries to write to the soundcard.

We have provisions for that (my earlier description was somewhat
incomplete).

> > Yes, the code could be changed to keep track of the reason for a device
> > suspend. But that just raises the old problem of what to do when
> > there's an I/O request for a suspended device during STR.
>
> Is this actually a real problem? I would think the policy would be
> "block" for block devices (pun not intended :), "drop" for network
> devices, etc.

It is indeed a real problem, or at least, it can be.

> > Consider a particularly troublesome case: During STR, a non-frozen task
> > writes to /sys/bus/BBB/drivers/DDD/bind. The sysfs core grabs the
> > device semaphore and calls the driver's probe routine. If the driver
> > isn't PM-aware it simply tries to initialize the device and fails
> > because the device is already suspended. That's no good; it isn't
> > transparent.
>
> How did the device get suspended if it didn't have a driver? If it
> did have a driver, why didn't the bind attempt fail?

Bus subsystems can suspend devices with no drivers.

> Suppose the device-model core code simply blocked all bind and unbind
> requests while suspend is under way, until resume is finished.
> Wouldn't that solve the problem?

It would help. It would help even more if the sysfs core also blocked
all I/O while suspend is under way. (Although this might be tricky,
considering that the suspend is initiated by a sysfs write...)

The fact remains that lots of drivers would still need to be changed.
In the read and write methods someone would have to add code amounting
to this:

	if (suspend_is_under_way()) {
		mutex_unlock(...);
		block_until_resume();
		goto restart;
	}

Freezing userspace is a small amount of code by comparison.

Alan Stern

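[Editorial sketch] The fragment Alan describes can be tried out in userspace. The following is a minimal sketch, assuming POSIX threads stand in for kernel locking. The names suspend_is_under_way() and block_until_resume() are taken from the message; demo_suspend(), demo_resume(), struct demo_dev and demo_read() are hypothetical and are not kernel APIs. It only illustrates the "drop the lock, wait, restart" shape that every read and write method would need.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical global "suspend gate"; this is not a kernel interface. */
static pthread_mutex_t gate_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t gate_resumed = PTHREAD_COND_INITIALIZER;
static bool gate_suspending;

bool suspend_is_under_way(void)
{
    pthread_mutex_lock(&gate_lock);
    bool suspending = gate_suspending;
    pthread_mutex_unlock(&gate_lock);
    return suspending;
}

/* Sleep until demo_resume() clears the flag. */
void block_until_resume(void)
{
    pthread_mutex_lock(&gate_lock);
    while (gate_suspending)
        pthread_cond_wait(&gate_resumed, &gate_lock);
    pthread_mutex_unlock(&gate_lock);
}

void demo_suspend(void)
{
    pthread_mutex_lock(&gate_lock);
    gate_suspending = true;
    pthread_mutex_unlock(&gate_lock);
}

void demo_resume(void)
{
    pthread_mutex_lock(&gate_lock);
    gate_suspending = false;
    pthread_cond_broadcast(&gate_resumed);  /* wake every blocked reader */
    pthread_mutex_unlock(&gate_lock);
}

/* Hypothetical device with a per-device I/O lock. */
struct demo_dev {
    pthread_mutex_t io_lock;
    int value;
};

/* Read method following the pattern from the message. */
int demo_read(struct demo_dev *dev)
{
    int v;

    assert(dev != NULL);
restart:
    pthread_mutex_lock(&dev->io_lock);
    if (suspend_is_under_way()) {
        /* Never sleep holding io_lock: the suspend path may need it. */
        pthread_mutex_unlock(&dev->io_lock);
        block_until_resume();
        goto restart;
    }
    v = dev->value;
    pthread_mutex_unlock(&dev->io_lock);
    return v;
}
```

A reader that calls demo_read() while the flag is set sleeps in block_until_resume() and retries from the top once demo_resume() runs. A real suspend path would also have to take io_lock to wait for in-flight operations, which this sketch leaves out.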
* Re: Re: [PATCH] Remove process freezer from suspend to RAM pathway
From: Paul Mackerras @ 2007-07-05 0:35 UTC
To: Alan Stern
Cc: Matthew Garrett, Kernel development list, Pavel Machek, Johannes Berg, Linux-pm mailing list

Alan Stern writes:

> > > Yes, the code could be changed to keep track of the reason for a device
> > > suspend. But that just raises the old problem of what to do when
> > > there's an I/O request for a suspended device during STR.
> >
> > Is this actually a real problem? I would think the policy would be
> > "block" for block devices (pun not intended :), "drop" for network
> > devices, etc.
>
> It is indeed a real problem, or at least, it can be.

How so? Can you give me an example?

> > How did the device get suspended if it didn't have a driver? If it
> > did have a driver, why didn't the bind attempt fail?
>
> Bus subsystems can suspend devices with no drivers.

Interesting. I assume this is for buses for which there is a
bus-specific but device-independent suspend procedure defined.

It would seem sensible to me that the PM core should get the bus to
resume such a device before calling a driver probe routine. The
resume should be blocked or deferred while a system suspend is
underway. In fact I think that all driver bind/unbind and probe
operations should be deferred while the system is suspending (i.e. put
on a list to be done after the system resumes).

> It would help. It would help even more if the sysfs core also blocked
> all I/O while suspend is under way. (Although this might be tricky,
> considering that the suspend is initiated by a sysfs write...)

I didn't think sysfs got involved at all in normal read and write
requests, so I don't know how it would block them...

> The fact remains that lots of drivers would still need to be changed.
> In the read and write methods someone would have to add code amounting
> to this:
>
> 	if (suspend_is_under_way()) {
> 		mutex_unlock(...);
> 		block_until_resume();
> 		goto restart;
> 	}
>
> Freezing userspace is a small amount of code by comparison.

Normally devices have some sort of queue of pending operations. So
all that is required on suspend is to stop processing the queue and
wait for any currently-underway operations to complete. The blocking
then happens naturally using the normal I/O wait mechanisms.

Paul.

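[Editorial sketch] Paul's suggestion that bind, unbind and probe operations be "put on a list to be done after the system resumes" can be illustrated with a small deferral list. This is a sketch under stated assumptions, not driver-core code: request_bind(), begin_suspend(), finish_resume(), demo_probe() and the fixed-size array are all hypothetical names chosen for the example, and locking is ignored entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEFERRED 16

typedef void (*bind_fn)(int dev_id);

/* One postponed bind/probe request. */
struct deferred_op {
    bind_fn fn;
    int dev_id;
};

static bool system_suspending;
static struct deferred_op deferred[MAX_DEFERRED];
static size_t n_deferred;

/* Bookkeeping for the demo probe routine below. */
static int probe_count;
static int last_probed = -1;

/* Hypothetical probe routine: just records that it ran. */
void demo_probe(int dev_id)
{
    probe_count++;
    last_probed = dev_id;
}

void begin_suspend(void)
{
    system_suspending = true;
}

/*
 * Returns 0 if the operation ran immediately, 1 if it was deferred
 * until resume, and -1 if the deferral list is full.
 */
int request_bind(bind_fn fn, int dev_id)
{
    assert(fn != NULL);
    if (!system_suspending) {
        fn(dev_id);
        return 0;
    }
    if (n_deferred == MAX_DEFERRED)
        return -1;
    deferred[n_deferred].fn = fn;
    deferred[n_deferred].dev_id = dev_id;
    n_deferred++;
    return 1;
}

/* Run the deferred operations in arrival order; returns how many ran. */
size_t finish_resume(void)
{
    size_t n = n_deferred;

    system_suspending = false;
    for (size_t i = 0; i < n; i++)
        deferred[i].fn(deferred[i].dev_id);
    n_deferred = 0;
    return n;
}
```

Calling begin_suspend(), then request_bind(demo_probe, 2), leaves probe_count unchanged until finish_resume() runs the postponed call. The sketch says nothing about where the check should live, which is a separate question in this thread.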
* removing refrigerator does not help with s2ram vs. fuse deadlocks (was Re: Re: [PATCH] Remove process freezer from suspend to RAM pathway)
From: Pavel Machek @ 2007-07-05 9:15 UTC
To: Paul Mackerras
Cc: Matthew Garrett, Kernel development list, Johannes Berg, Linux-pm mailing list

Hi!

> > The fact remains that lots of drivers would still need to be changed.
> > In the read and write methods someone would have to add code amounting
> > to this:
> >
> > 	if (suspend_is_under_way()) {
> > 		mutex_unlock(...);
> > 		block_until_resume();
> > 		goto restart;
> > 	}
> >
> > Freezing userspace is a small amount of code by comparison.
>
> Normally devices have some sort of queue of pending operations. So
> all that is required on suspend is to stop processing the queue and
> wait for any currently-underway operations to complete. The blocking
> then happens naturally using the normal I/O wait mechanisms.

So... instead of one big freezer (we know it is problematic), you have
100 small freezers, problematic in same way :-(.

Let's take current FUSE problems, and see if they have problem on PPC,
ok?

Let's say FUSE thread touches one of those "blocking" devices. It is
now in D state, somewhere in kernel.... exactly same way refrigerator
works.

Now, if kernel needs FUSE services for some reason (that's the problem
we hit in s2ram case, right?), we have a deadlock.

So main problem still seems to be "kernel should not depend on
userland services during suspend", refrigerator or not.

								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

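[Editorial sketch] The deadlock Pavel describes can be read as a cycle of waits: the suspend path needs the FUSE daemon, the FUSE daemon is blocked on a suspended device, and the device only unblocks once the system resumes. The sketch below models that reading as a tiny wait-for table and checks whether a chain of waits ever terminates. The actor names and the helper is_stuck() are assumptions made for illustration; nothing here corresponds to actual kernel data structures.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical actors in the s2ram vs. FUSE scenario. */
enum {
    SUSPEND_PATH,     /* kernel code carrying out the suspend   */
    FUSE_DAEMON,      /* userspace filesystem server            */
    BLOCKED_DEVICE,   /* device that blocks I/O until resume    */
    N_ACTORS
};

#define WAITS_FOR_NOTHING (-1)

/*
 * waits_for[i] is the actor that i is waiting on, or WAITS_FOR_NOTHING.
 * With n actors, an acyclic chain ends within n hops, so a chain that
 * is still going after n hops must have entered a cycle.
 */
bool is_stuck(const int waits_for[], int n, int start)
{
    int cur = start;

    assert(start >= 0 && start < n);
    for (int hops = 0; hops < n; hops++) {
        cur = waits_for[cur];
        if (cur == WAITS_FOR_NOTHING)
            return false;   /* chain ends: someone can make progress */
    }
    return true;            /* no end found: the waits form a cycle */
}
```

With the three waits from the scenario filled in, is_stuck() reports a cycle for every actor; if the FUSE daemon waits on nothing, the chain ends and the suspend path can proceed. Under this model the cycle does not depend on whether the freezer or a per-device block put the daemon to sleep, which is the point being argued.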
* Re: removing refrigerator does not help with s2ram vs. fuse deadlocks (was Re: Re: [PATCH] Remove process freezer from suspend to RAM pathway)
From: Matthew Garrett @ 2007-07-05 13:57 UTC
To: Pavel Machek
Cc: Kernel development list, Johannes Berg, Linux-pm mailing list

On Thu, Jul 05, 2007 at 11:15:26AM +0200, Pavel Machek wrote:

> Now, if kernel needs FUSE services for some reason (that's the problem
> we hit in s2ram case, right?), we have a deadlock.
>
> So main problem still seems to be "kernel should not depend on
> userland services during suspend", refrigerator or not.

And also "Userland should not depend on userland services", which is
rather more of a problem.

-- 
Matthew Garrett | mjg59@srcf.ucam.org

* Re: removing refrigerator does not help with s2ram vs. fuse deadlocks (was Re: Re: [PATCH] Remove process freezer from suspend to RAM pathway)
From: Matthew Garrett @ 2007-07-05 14:26 UTC
To: Rafael J. Wysocki
Cc: Kernel development list, Pavel Machek, Johannes Berg, Linux-pm mailing list

On Thu, Jul 05, 2007 at 04:28:11PM +0200, Rafael J. Wysocki wrote:
> On Thursday, 5 July 2007 15:57, Matthew Garrett wrote:
> > And also "Userland should not depend on userland services", which is
> > rather more of a problem.
>
> I think you're oversimplifying it, as far as FUSE is concerned.
>
> Namely, if there are two userland tasks, A and B, and B is uninterruptible,
> because A is blocked, then this is not a usual situation.

Fuse is one case of it occurring, and if we end up with more userspace
drivers then the problem is only going to get worse.

-- 
Matthew Garrett | mjg59@srcf.ucam.org

* Re: removing refrigerator does not help with s2ram vs. fuse deadlocks (was Re: Re: [PATCH] Remove process freezer from suspend to RAM pathway)
From: Rafael J. Wysocki @ 2007-07-05 14:41 UTC
To: Matthew Garrett
Cc: Kernel development list, Pavel Machek, Johannes Berg, Linux-pm mailing list

On Thursday, 5 July 2007 16:26, Matthew Garrett wrote:
> On Thu, Jul 05, 2007 at 04:28:11PM +0200, Rafael J. Wysocki wrote:
> > On Thursday, 5 July 2007 15:57, Matthew Garrett wrote:
> > > And also "Userland should not depend on userland services", which is
> > > rather more of a problem.
> >
> > I think you're oversimplifying it, as far as FUSE is concerned.
> >
> > Namely, if there are two userland tasks, A and B, and B is uninterruptible,
> > because A is blocked, then this is not a usual situation.
>
> Fuse is one case of it occurring, and if we end up with more userspace
> drivers then the problem is only going to get worse.

But this is a problem by itself, regardless of the freezer etc., no?

Greetings,
Rafael

-- 
"Premature optimization is the root of all evil." - Donald Knuth

* Re: removing refrigerator does not help with s2ram vs. fuse deadlocks (was Re: Re: [PATCH] Remove process freezer from suspend to RAM pathway)
From: Matthew Garrett @ 2007-07-05 14:39 UTC
To: Rafael J. Wysocki
Cc: Kernel development list, Pavel Machek, Johannes Berg, Linux-pm mailing list

On Thu, Jul 05, 2007 at 04:41:39PM +0200, Rafael J. Wysocki wrote:
> On Thursday, 5 July 2007 16:26, Matthew Garrett wrote:
> > On Thu, Jul 05, 2007 at 04:28:11PM +0200, Rafael J. Wysocki wrote:
> > > On Thursday, 5 July 2007 15:57, Matthew Garrett wrote:
> > > > And also "Userland should not depend on userland services", which is
> > > > rather more of a problem.
> > >
> > > I think you're oversimplifying it, as far as FUSE is concerned.
> > >
> > > Namely, if there are two userland tasks, A and B, and B is uninterruptible,
> > > because A is blocked, then this is not a usual situation.
> >
> > Fuse is one case of it occurring, and if we end up with more userspace
> > drivers then the problem is only going to get worse.
>
> But this is a problem by itself, regardless of the freezer etc., no?

Why?

-- 
Matthew Garrett | mjg59@srcf.ucam.org

* Re: removing refrigerator does not help with s2ram vs. fuse deadlocks (was Re: Re: [PATCH] Remove process freezer from suspend to RAM pathway)
From: Matthew Garrett @ 2007-07-05 15:03 UTC
To: Rafael J. Wysocki
Cc: Kernel development list, Pavel Machek, Johannes Berg, Linux-pm mailing list

On Thu, Jul 05, 2007 at 05:04:47PM +0200, Rafael J. Wysocki wrote:
> On Thursday, 5 July 2007 16:39, Matthew Garrett wrote:
> > Why?
>
> You have processes that don't react to signals, because some other user land
> task is misbehaving. I'd call that ugly at the very least.

It already happens with, say, NFS. Don't think about it in terms of a
userland task misbehaving - think of it in terms of a resource becoming
unavailable.

-- 
Matthew Garrett | mjg59@srcf.ucam.org

* Re: removing refrigerator does not help with s2ram vs. fuse deadlocks (was Re: Re: [PATCH] Remove process freezer from suspend to RAM pathway)
From: Rafael J. Wysocki @ 2007-07-05 15:27 UTC
To: Matthew Garrett
Cc: Kernel development list, Pavel Machek, Johannes Berg, Linux-pm mailing list

On Thursday, 5 July 2007 17:03, Matthew Garrett wrote:
> On Thu, Jul 05, 2007 at 05:04:47PM +0200, Rafael J. Wysocki wrote:
> > On Thursday, 5 July 2007 16:39, Matthew Garrett wrote:
> > > Why?
> >
> > You have processes that don't react to signals, because some other user land
> > task is misbehaving. I'd call that ugly at the very least.
>
> It already happens with, say, NFS. Don't think about it in terms of a
> userland task misbehaving - think of it in terms of a resource becoming
> unavailable.

I think there's a difference between a userland task playing the role of a
resource and a "real" external resource the kernel doesn't control.

IMO, userland tasks should not have the power to affect each other as though
they were parts of the kernel.

Greetings,
Rafael

-- 
"Premature optimization is the root of all evil." - Donald Knuth

* Re: removing refrigerator does not help with s2ram vs. fuse deadlocks (was Re: Re: [PATCH] Remove process freezer from suspend to RAM pathway)
From: Miklos Szeredi @ 2007-07-05 15:32 UTC
To: rjw
Cc: mjg59, linux-kernel, pavel, johannes, linux-pm

> > > You have processes that don't react to signals, because some
> > > other user land task is misbehaving. I'd call that ugly at the
> > > very least.
> >
> > It already happens with, say, NFS. Don't think about it in terms of a
> > userland task misbehaving - think of it in terms of a resource becoming
> > unavailable.
>
> I think there's a difference between a userland task playing the role of a
> resource and a "real" external resource the kernel doesn't control.
>
> IMO, userland tasks should not have the power to affect each other as though
> they were parts of the kernel.

One task doing ptrace() can basically do whatever it wants with the
task being traced. This is not an exact analogy to what fuse does,
but close.

And for this reason the security model for allowing access to a fuse
filesystem is similar to that for allowing tracing. The gory details
can be found in Documentation/filesystems/fuse.txt.

Miklos

* Re: removing refrigerator does not help with s2ram vs. fuse deadlocks (was Re: Re: [PATCH] Remove process freezer from suspend to RAM pathway)
From: Pavel Machek @ 2007-07-07 11:50 UTC
To: Miklos Szeredi
Cc: mjg59, linux-kernel, johannes, linux-pm

Hi!

> > > > You have processes that don't react to signals, because some
> > > > other user land task is misbehaving. I'd call that ugly at the
> > > > very least.
> > >
> > > It already happens with, say, NFS. Don't think about it in terms of a
> > > userland task misbehaving - think of it in terms of a resource becoming
> > > unavailable.
> >
> > I think there's a difference between a userland task playing the role of a
> > resource and a "real" external resource the kernel doesn't control.
> >
> > IMO, userland tasks should not have the power to affect each other as though
> > they were parts of the kernel.
>
> One task doing ptrace() can basically do whatever it wants with the
> task being traced. This is not an exact analogy to what fuse does,
> but close.

Well, IMO userland tasks should not have power to grab VFS mutexes for
indefinite amount of time. ("fused is allowed to deadlock kernel, in
a way only write to special file helps" is ugly). Unfortunately, I
don't think there's a way to work around that deadlock within fuse
design limits... (coda was able to get around it by working on whole
files granularity, AFAICT), so we'll have to live with that.

I think we have two separate problems here, and both are solvable,
without major changes to fuse or suspend framework.

								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

* Re: removing refrigerator does not help with s2ram vs. fuse deadlocks (was Re: Re: [PATCH] Remove process freezer from suspend to RAM pathway)
From: Miklos Szeredi @ 2007-07-07 20:14 UTC
To: pavel
Cc: mjg59, miklos, linux-kernel, johannes, linux-pm

> > One task doing ptrace() can basically do whatever it wants with the
> > task being traced. This is not an exact analogy to what fuse does,
> > but close.
>
> Well, IMO userland tasks should not have power to grab VFS mutexes for
> indefinite amount of time. ("fused is allowed to deadlock kernel, in
> a way only write to special file helps" is ugly). Unfortunately, I
> don't think there's a way to work around that deadlock within fuse
> design limits... (coda was able to get around it by working on whole
> files granularity, AFAICT), so we'll have to live with that.

That's just file I/O. You can easily deadlock coda with any other
file operation. In fact coda is _less_ robust wrt a misbehaving
userspace server than fuse by a big margin.

Miklos

* Re: removing refrigerator does not help with s2ram vs. fuse deadlocks (was Re: Re: [PATCH] Remove process freezer from suspend to RAM pathway)
From: Rafael J. Wysocki @ 2007-07-05 15:04 UTC
To: Matthew Garrett
Cc: Kernel development list, Pavel Machek, Johannes Berg, Linux-pm mailing list

On Thursday, 5 July 2007 16:39, Matthew Garrett wrote:
> On Thu, Jul 05, 2007 at 04:41:39PM +0200, Rafael J. Wysocki wrote:
> > On Thursday, 5 July 2007 16:26, Matthew Garrett wrote:
> > > On Thu, Jul 05, 2007 at 04:28:11PM +0200, Rafael J. Wysocki wrote:
> > > > On Thursday, 5 July 2007 15:57, Matthew Garrett wrote:
> > > > > And also "Userland should not depend on userland services", which is
> > > > > rather more of a problem.
> > > >
> > > > I think you're oversimplifying it, as far as FUSE is concerned.
> > > >
> > > > Namely, if there are two userland tasks, A and B, and B is uninterruptible,
> > > > because A is blocked, then this is not a usual situation.
> > >
> > > Fuse is one case of it occurring, and if we end up with more userspace
> > > drivers then the problem is only going to get worse.
> >
> > But this is a problem by itself, regardless of the freezer etc., no?
>
> Why?

You have processes that don't react to signals, because some other user land
task is misbehaving. I'd call that ugly at the very least.

Greetings,
Rafael

-- 
"Premature optimization is the root of all evil." - Donald Knuth

* Re: removing refrigerator does not help with s2ram vs. fuse deadlocks (was Re: Re: [PATCH] Remove process freezer from suspend to RAM pathway)
From: Pavel Machek @ 2007-07-07 11:49 UTC
To: Matthew Garrett
Cc: Kernel development list, Johannes Berg, Linux-pm mailing list

Hi!

> > > And also "Userland should not depend on userland services", which is
> > > rather more of a problem.
> >
> > I think you're oversimplifying it, as far as FUSE is concerned.
> >
> > Namely, if there are two userland tasks, A and B, and B is uninterruptible,
> > because A is blocked, then this is not a usual situation.
>
> Fuse is one case of it occurring, and if we end up with more userspace
> drivers then the problem is only going to get worse.

We'll have to solve them as they come. Face it, hardware drivers
_have_ to know about suspend/resume. Even userspace drivers will have
to know about suspend/resume, because they need to reinit the hw
during resume.

Now... most parts of kernel need to know (a bit) about suspend/resume
-- at least enough to play nicely with refrigerator. In retrospect it
is pretty obvious that this covers fused, too, unfortunately no one
noticed that when fuse was designed.

Can we try to solve the suspend vs. fuse problem now? "Just removing
the refrigerator" is not the answer. First, refrigerator is impossible
to remove in few months timeframe, and second, it does not solve the
problem anyway.

(Actually, there are two separate problems with suspend vs. fuse.)

								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

* Re: removing refrigerator does not help with s2ram vs. fuse deadlocks (was Re: Re: [PATCH] Remove process freezer from suspend to RAM pathway)
From: Rafael J. Wysocki @ 2007-07-05 14:28 UTC
To: Matthew Garrett
Cc: Kernel development list, Pavel Machek, Johannes Berg, Linux-pm mailing list

On Thursday, 5 July 2007 15:57, Matthew Garrett wrote:
> On Thu, Jul 05, 2007 at 11:15:26AM +0200, Pavel Machek wrote:
>
> > Now, if kernel needs FUSE services for some reason (that's the problem
> > we hit in s2ram case, right?), we have a deadlock.
> >
> > So main problem still seems to be "kernel should not depend on
> > userland services during suspend", refrigerator or not.
>
> And also "Userland should not depend on userland services", which is
> rather more of a problem.

I think you're oversimplifying it, as far as FUSE is concerned.

Namely, if there are two userland tasks, A and B, and B is uninterruptible,
because A is blocked, then this is not a usual situation.

Greetings,
Rafael

-- 
"Premature optimization is the root of all evil." - Donald Knuth

* problem 1 (was Re: removing refrigerator does not help with s2ram vs. fuse deadlocks (was Re: Re: [PATCH] Remove process freezer from suspend to RAM pathway))
From: Pavel Machek @ 2007-07-07 12:08 UTC
To: Matthew Garrett
Cc: Kernel development list, Johannes Berg, Linux-pm mailing list

Hi!

> > Now, if kernel needs FUSE services for some reason (that's the problem
> > we hit in s2ram case, right?), we have a deadlock.
> >
> > So main problem still seems to be "kernel should not depend on
> > userland services during suspend", refrigerator or not.
>
> And also "Userland should not depend on userland services", which is
> rather more of a problem.

No, that's not a problem. Or rather, that's different problem, called
"problem 2" (fuse causes freezer to fail to stop processes).

But we still have "problem 1" here: after devices are suspended,
kernel tries to use fuse's services. That is not going to work, one
way or another, because devices are suspended and userland can't work
reliably.

(Aha, it _may_ be it is kernel tries to use fuse's services after
freezing userland but before freezing devices. I don't think it is).

To solve "problem 1", we need to know which part of kernel asks for
fuse services. sysrq-t trace is likely to tell us. Can someone repeat
the "problem 1" scenario (freezer succeeds but then it deadlocks), and
produce sysrq-t trace? That way we can solve "problem 1".

								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

* Re: problem 1 (was Re: removing refrigerator does not help with s2ram vs. fuse deadlocks (was Re: Re: [PATCH] Remove process freezer from suspend to RAM pathway))
From: Rafael J. Wysocki @ 2007-07-07 20:55 UTC
To: Pavel Machek
Cc: Matthew Garrett, Kernel development list, Johannes Berg, Linux-pm mailing list

On Saturday, 7 July 2007 14:08, Pavel Machek wrote:
> Hi!
>
> > > Now, if kernel needs FUSE services for some reason (that's the problem
> > > we hit in s2ram case, right?), we have a deadlock.
> > >
> > > So main problem still seems to be "kernel should not depend on
> > > userland services during suspend", refrigerator or not.
> >
> > And also "Userland should not depend on userland services", which is
> > rather more of a problem.
>
> No, that's not a problem. Or rather, that's different problem, called
> "problem 2" (fuse causes freezer to fail to stop processes).
>
> But we still have "problem 1" here: after devices are suspended,
> kernel tries to use fuse's services. That is not going to work, one
> way or another, because devices are suspended and userland can't work
> reliably.
>
> (Aha, it _may_ be it is kernel tries to use fuse's services after
> freezing userland but before freezing devices. I don't think it is).
>
> To solve "problem 1", we need to know which part of kernel asks for
> fuse services. sysrq-t trace is likely to tell us. Can someone repeat
> the "problem 1" scenario (freezer succeeds but then it deadlocks), and
> produce sysrq-t trace? That way we can solve "problem 1".

Well, such a trace would be helpful in any case.

Greetings,
Rafael

-- 
"Premature optimization is the root of all evil." - Donald Knuth

* Re: Re: [PATCH] Remove process freezer from suspend to RAM pathway
From: Alan Stern @ 2007-07-05 14:42 UTC
To: Paul Mackerras
Cc: Matthew Garrett, Kernel development list, Pavel Machek, Johannes Berg, Linux-pm mailing list

On Thu, 5 Jul 2007, Paul Mackerras wrote:

> Alan Stern writes:
>
> > > > Yes, the code could be changed to keep track of the reason for a device
> > > > suspend. But that just raises the old problem of what to do when
> > > > there's an I/O request for a suspended device during STR.
> > >
> > > Is this actually a real problem? I would think the policy would be
> > > "block" for block devices (pun not intended :), "drop" for network
> > > devices, etc.
> >
> > It is indeed a real problem, or at least, it can be.
>
> How so? Can you give me an example?

The example I quoted earlier about binding during a suspend will do. I
agree that we can and should try to prevent it from ever occurring.

Read and write are a problem only in that fixing them would potentially
involve changing lots of drivers; I don't think they pose a serious
theoretical obstacle. (Lord knows what will happen with async I/O!)

Any other entry points to drivers are also potential problems, but it's
hard to say anything definite about them since they are so varied.

> > Bus subsystems can suspend devices with no drivers.
>
> Interesting. I assume this is for buses for which there is a
> bus-specific but device-independent suspend procedure defined.

Yes.

> It would seem sensible to me that the PM core should get the bus to
> resume such a device before calling a driver probe routine. The
> resume should be blocked or deferred while a system suspend is
> underway. In fact I think that all driver bind/unbind and probe
> operations should be deferred while the system is suspending (i.e. put
> on a list to be done after the system resumes).

Getting the PM core to resume a device before probing could be
difficult; in general it doesn't know enough about specific device
behaviors to do something like that. But the subsystem certainly ought
to take care of it. USB does.

Yes, bind/unbind/etc. should be deferred during a system suspend. But
it has to be done carefully, because these operations generally involve
locks that can't be released. They need to be prevented at their
source, not in the driver core. That's one reason why khubd needs to
be frozen (being part of the USB hub driver, it is the task responsible
for binding and unbinding drivers to USB devices).

Another thing to look out for is registration and unregistration of
drivers. These activities also cause bind/unbind operations. Note
that if userspace is frozen then neither insmod nor rmmod can run. :-)

> > It would help. It would help even more if the sysfs core also blocked
> > all I/O while suspend is under way. (Although this might be tricky,
> > considering that the suspend is initiated by a sysfs write...)
>
> I didn't think sysfs got involved at all in normal read and write
> requests, so I don't know how it would block them...

All I/O to sysfs attributes passes through the routines in fs/sysfs/*.
It could be blocked there. (But if userspace is frozen it won't need
to be.)

> Normally devices have some sort of queue of pending operations.

That's certainly true of block devices, whose drivers use the block
subsystem. It's not true for lots of other devices, though.

> So
> all that is required on suspend is to stop processing the queue and
> wait for any currently-underway operations to complete. The blocking
> then happens naturally using the normal I/O wait mechanisms.

In my experience, most non-block drivers do not have any queue of
pending I/O operations. They simply carry out requests as they arrive.

Alan Stern

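[Editorial sketch] The queue-based model quoted in this message (stop processing the queue on suspend, let submitters pile up behind it) can be sketched as a small ring buffer with a "stopped" flag. This is an illustration under stated assumptions only: struct demo_queue and the queue_*() helpers are hypothetical names, there is no locking, and "processing" a request just counts it. It models the case Alan grants for block devices, not drivers that carry out requests as they arrive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_DEPTH 8

/* Hypothetical per-device request queue. */
struct demo_queue {
    int pending[QUEUE_DEPTH];
    size_t head;      /* index of the oldest pending request */
    size_t count;     /* number of pending requests          */
    bool stopped;     /* set while the device is suspended   */
    int completed;    /* requests processed so far           */
};

/* Accept a request even while suspended; it simply waits in the queue. */
bool queue_submit(struct demo_queue *q, int req)
{
    assert(q != NULL);
    if (q->count == QUEUE_DEPTH)
        return false;   /* queue full: a real driver would block here */
    q->pending[(q->head + q->count) % QUEUE_DEPTH] = req;
    q->count++;
    return true;
}

/* Process at most one request; does nothing while the queue is stopped. */
bool queue_dispatch_one(struct demo_queue *q)
{
    assert(q != NULL);
    if (q->stopped || q->count == 0)
        return false;
    q->head = (q->head + 1) % QUEUE_DEPTH;
    q->count--;
    q->completed++;
    return true;
}

void queue_suspend(struct demo_queue *q)
{
    q->stopped = true;
}

/* Restart processing and drain the backlog; returns how many ran. */
int queue_resume(struct demo_queue *q)
{
    int n = 0;

    q->stopped = false;
    while (queue_dispatch_one(q))
        n++;
    return n;
}
```

Requests submitted between queue_suspend() and queue_resume() stay pending and are processed in order on resume. A driver without such a queue has no equivalent place to park requests, which is where the per-method check discussed earlier in the thread comes in.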