From: Benjamin Herrenschmidt
Subject: suspend/resume issue (Was: [PATCH 2/2] Fix console handling during suspend/resume)
Date: Fri, 16 Jun 2006 11:02:13 +1000
Message-ID: <1150419733.7725.51.camel@localhost.localdomain>
References: <20060614103404.GC28536@elf.ucw.cz>
To: Linus Torvalds
Cc: Power management list, Pavel Machek
List-Id: linux-pm@vger.kernel.org

> My Mac Mini (Intel dual-core CPU) now resumes and suspends in SMP mode
> too, which was not true just a couple of days ago. It even seems to do it
> fairly reliable.
>
> The debugging patch helped me figure out a number of the problems (and
> even more problems that then didn't actually make any difference once I
> started getting things working ;)

Hi Linus!

Heh, good to see you on the PM wagon :)

One thing we really need to look into is the problem that once the
suspend process starts, kmalloc() might block forever at any point in
time. The basic issue is, as usual, the swap device(s) going down: any
allocation that tries to push things out to swap may sleep forever. I
think we might need something like kmalloc silently switching to NOIO
when the system state changes to "suspending".

As-is, we have all sorts of well-hidden possible deadlocks, where a
driver has some part (a bottom half for example) blocked in a kmalloc
while holding mutex X, and that driver's suspend routine gets called and
tries to acquire that same mutex... There are plenty of others: driver
suspend calling things that will implicitly block on a kmalloc, etc.
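To make the "silently switch to NOIO" idea a bit more concrete, here is a
minimal userspace sketch, not kernel code. The flag names, their values and
the helper functions are all made up for illustration (loosely modeled on
the gfp flags), and the real allocator hook would certainly look different:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bits, loosely modeled on the kernel's gfp flags.
 * Names and values are illustrative assumptions only. */
#define MY_GFP_WAIT 0x1u /* allocation may sleep */
#define MY_GFP_IO   0x2u /* allocation may start I/O (e.g. swap-out) */
#define MY_GFP_FS   0x4u /* allocation may call into the filesystem */

#define MY_GFP_KERNEL (MY_GFP_WAIT | MY_GFP_IO | MY_GFP_FS)
#define MY_GFP_NOIO   (MY_GFP_WAIT)

/* Hypothetical global state: true from "suspend started" until
 * "suspend aborted" or "resume finished". */
static atomic_bool system_suspending;

void suspend_safe_enter(void)
{
    atomic_store(&system_suspending, true);
}

void suspend_safe_exit(void)
{
    atomic_store(&system_suspending, false);
}

/* Mask the allocator would apply to every caller-supplied request:
 * while suspending, strip the bits that could end up waiting on swap
 * or the filesystem, so a GFP_KERNEL-style request behaves like NOIO. */
unsigned int effective_gfp(unsigned int requested)
{
    if (atomic_load(&system_suspending))
        return requested & ~(MY_GFP_IO | MY_GFP_FS);
    return requested;
}
```

The point of doing it in one central place is that drivers would not need
to change: callers keep asking for the usual mask and only the allocator
knows about the suspending state.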
My very early proposal for suspend callbacks (years ago, maybe you
remember) had an additional round of callbacks to drivers called
"prepare for suspend" for that. Drivers were supposed to enter a state
where they avoided blocking allocations etc... Of course, I realize that
this was not a good approach: too complex, and we would never get all
drivers to handle it properly.

Another source of problems is the request_firmware() interface. Most
drivers use it synchronously and do it at resume() time, when coming
back from sleep. However, on resume, userland is still frozen... the
kernel might still be able to launch things, but I wouldn't bet too much
on the result, especially since the swap device might still be suspended
too. This is a typical cause of either deadlocks or non-working wireless
devices on resume. Not sure what the perfect solution is here... drivers
will _have_ to delay their resume process for that... one possibility
would be to make request_firmware() kind of interfaces asynchronous only
(with a completion callback) and have the core delay them...

That leads to the next issue :) ... which is hotplug events happening
during the suspend process. Very similar to the above problem: trying to
run userland things when userland isn't in a state where it can handle
them.

I proposed a while ago that a way to fix both issues is to:

 1- make request_firmware type of interfaces asynchronous only, and
 2- have the "core" queue up all userland helper calls while the suspend
    process is in progress and send them as a batch on resume.

Of course, that isn't necessarily totally efficient. A more elaborate
option would be to drop them, relying on:

 1- for normal hotplug events, we only send a single "rescan all" event
    to userland at the end of the resume process, where it basically
    re-does what it does at boot.
 2- call_usermodehelper just fails with something like -EAGAIN when
    called during the suspend/resume process.
Thus normal hotplug events are just dropped on the floor. For
request_firmware, the fix is hidden in the implementation of
request_firmware_async, which will then queue up the request and re-emit
it after the suspend process is over.

All these issues lead to a need to globally:

 - Know that the suspend process has started. That is, userland can't be
   relied upon and touching swap is not an option (GFP_KERNEL can
   deadlock).
 - Be notified of the above and of the end of the above situation
   (suspend process aborted or resume finished). Could just be a global
   notifier, I don't think we need that much ordering for this.

With the above, some subsystems could enter a "suspend safe" state that
would make things a lot more reliable. Examples are slab/buddy turning
GFP_KERNEL into NOIO (and sync'ing all CPUs after doing that to avoid
having a big lock), the usermodehelper stuff, the request firmware
stuff, etc...

Ideas ?

Ben.