From: Alan Jenkins <sourcejedi.lkml@googlemail.com>
To: Mel Gorman <mel@csn.ul.ie>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>,
	pm list <linux-pm@lists.linux-foundation.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Kernel Testers List <kernel-testers@vger.kernel.org>
Subject: Re: Bisected: s2disk (uswsusp only) hangs just before poweroff
Date: Wed, 2 Dec 2009 14:25:27 +0000
Message-ID: <9b2b86520912020625j31d180a1t1bc2a9b13a9d988d@mail.gmail.com>
In-Reply-To: <20091202122019.GD1457@csn.ul.ie>

On 12/2/09, Mel Gorman <mel@csn.ul.ie> wrote:
> On Wed, Dec 02, 2009 at 11:49:47AM +0000, Alan Jenkins wrote:
>> Rafael J. Wysocki wrote:
>>> On Tuesday 01 December 2009, Mel Gorman wrote:
>>>
>>>> On Tue, Dec 01, 2009 at 07:59:40PM +0000, Alan Jenkins wrote:
>>>>
>>>>> Hi
>>>>>
>>>>> Suspend to disk is (sometimes) hanging for me in 2.6.32-rc.  I
>>>>> finally got around to bisecting it, which blamed the following
>>>>> commit by Mel:
>>>>>
>>>>> 5f8dcc2 "page-allocator: split per-cpu list into
>>>>> one-list-per-migrate-type"
>>>>>
>>>>> I was able to confirm this by reverting the commit, which fixed the
>>>>> hang.  I had to revert one other commit first to avoid a conflict:
>>>>>
>>>>> a6f9edd "page-allocator: maintain rolling count of pages to free
>>>>> from the PCP"
>>>>>
>>>>>
>>>> Which RC kernel? Specifically, are the commits
>>>>
>>>> cc4a6851466039a8a688c843962a05689059ff3b always wake kswapd when
>>>> restarting an allocation attempt
>>>> 9d0ed60fe9cd1fbf57f755cd27a23ae9114d7210 Do not allow interrupts to use
>>>> ALLOC_HARDER
>>>>
>>>> applied?
>>>>
>>>> The latter one in particular might make a difference if s2disk is
>>>> pushing the system far below the watermarks. I don't suppose you know
>>>> where it's hanging? i.e. is it hanging in the allocator itself?
>>>>
>>>> If those patches are applied, then one difference that 5f8dcc2 makes is
>>>> that pages on the PCP lists but not of the right migratetype are not
>>>> used. Prior to that commit, an allocation might succeed even if the
>>>> buddy lists were empty because one of the other PCP page types would be
>>>> used.
>>>>
>>>>
>>>>> -- detail --
>>>>>
>>>>> When I suspend my EeePC 701 to disk, it sometimes hangs after
>>>>> writing out the hibernation image.  The system is still able to
>>>>> resume from this image (after working around the hang by pressing
>>>>> the power button).
>>>>>
>>>>> This is specific to s2disk from the uswsusp package (which is now
>>>>> installed by default on debian unstable).  It doesn't happen if I
>>>>> uninstall uswsusp and use the in-kernel suspend instead.
>>>>>
>>>>>
>>>> This leads me to believe that uswsusp is able to push available pages
>>>> far below what is expected. It's a total guess though, I have no idea
>>>> how uswsusp is implemented or how it differs from what is in kernel.
>>>>
>>>
>>> It doesn't differ at all in that respect.  Actually, it uses the same
>>> code, but the distro configuration may be such that it leaves fewer
>>> available pages than the default in-kernel hibernation.
>>>
>>> Thanks,
>>> Rafael
>>>
>>
>> It seems unintuitive that lack of memory is a problem _after we've
>> written out the hibernation image_.  The backtrace I captured shows the
>> hang happens within hibernation_platform_enter()...
>>
>
> I think the backtrace is also showing that it's trying to create a kernel
> thread. For this to be getting locked up, memory must be exceptionally
> tight. One thing that the patch changes is that in certain circumstances,
> an additional 128K of memory per-CPU could be on each of the PCP lists.
>
> Ordinarily it doesn't matter because reclaim would resolve the situation
> or the PCP lists would be drained very shortly after. However, if the
> CPUs were no longer being used but still have pages pinned, it could be
> causing a problem.
>
>> Hmm.  Doesn't the in-kernel suspend free the in-memory image before
>> powering off?
>>
>> int hibernate(void)
>> ...
>>        pr_debug("PM: writing image.\n");
>>        error = swsusp_write(flags);
>>        swsusp_free();
>>        if (!error)
>>            power_down();
>>
>>
>>
>> Would that explain why only uswsusp is affected?  Do we want to fix
>> snapshot_read() in user.c, so that it calls swsusp_free() once all the
>> data has been read?
>>
>
> Could you try it please?

Yes, that fixes it.  I left it running over lunch, and it did 24
hibernation cycles without hanging.  I'll post it and we'll see what
Rafael thinks.  It's only four lines of code, and I think there's a
strong case for it.
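
To illustrate the idea (this is not the patch itself), here is a minimal
user-space sketch of "free the image as soon as the reader has consumed
all of it".  Every name in it (struct toy_image, toy_image_create(),
toy_image_read(), toy_image_free()) is hypothetical and only stands in for
the roles played by the in-memory image, snapshot_read() and swsusp_free();
the real kernel code is structured differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the in-memory hibernation image. */
struct toy_image {
	unsigned char *data;	/* NULL once the image has been freed */
	size_t size;		/* total number of bytes in the image */
	size_t pos;		/* next byte to hand to the reader */
};

/* Allocate a toy image of 'size' bytes.  Returns 0 on success, -1 on failure. */
int toy_image_create(struct toy_image *img, size_t size)
{
	img->data = malloc(size);
	if (!img->data)
		return -1;
	memset(img->data, 0xA5, size);
	img->size = size;
	img->pos = 0;
	return 0;
}

/* Plays the role of swsusp_free(): release the image.  Safe to call twice. */
void toy_image_free(struct toy_image *img)
{
	free(img->data);
	img->data = NULL;
}

/*
 * Plays the role of snapshot_read(): copy up to 'count' bytes into 'buf'
 * and return the number of bytes copied.  Once the last byte has been
 * handed out, the image is freed immediately instead of staying allocated
 * until the caller closes the device.
 */
size_t toy_image_read(struct toy_image *img, void *buf, size_t count)
{
	size_t left, n;

	if (!img->data)
		return 0;	/* already freed: behave like end-of-file */

	assert(img->pos <= img->size);
	left = img->size - img->pos;
	n = count < left ? count : left;
	memcpy(buf, img->data + img->pos, n);
	img->pos += n;

	if (img->pos == img->size)
		toy_image_free(img);	/* all data read: give the memory back */

	return n;
}
```

A caller would create the image, call toy_image_read() in a loop until it
returns 0, and never need an explicit free.  The memory is returned before
whatever comes after the last read (here, powering off) has to allocate
anything.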

> Another possibility would be to call drain_all_pages() before powering
> off. If that makes a difference, it would confirm that pages are pinned
> on PCP lists of inactive processors.

Probably not, since this is a single-processor machine :).  It's the
original EeePC model with a Celeron processor, no fancy dual cores or
hyperthreading.

Thanks
Alan
