linux-kernel.vger.kernel.org archive mirror
* Re: [PATCH] x86: Fix S4 regression
From: Yinghai Lu @ 2011-10-24  0:29 UTC
  To: Takashi Iwai; +Cc: Linus Torvalds, Rafael J.Wysocki, x86, linux-kernel

On 10/23/2011 02:19 PM, Takashi Iwai wrote:

> Commit 4b239f458 [x86-64, mm: Put early page table high] has caused
> an S4 regression since 2.6.39: the machine occasionally reboots
> at S4 resume.  It doesn't always happen; the overall rate is about 1/20.
> But, like other bugs, once it happens, it continues to happen.
>
> This patch fixes the problem by essentially reverting the memory
> assignment to the older way.
>
> Cc: <stable@kernel.org>
> Signed-off-by: Takashi Iwai <tiwai@suse.de>
>
> ---
> I'm resending this as a "fix" patch now before it's forgotten and rots.
> It's just papering over the mystery again, but IMO better than the
> hard-reset behavior as of now.  Unfortunately, bisection is pretty
> difficult because the bug itself is fairly unstable...



Did you check the commits that Rafael pointed out?


On Wed, Sep 28, 2011 at 12:30 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Wednesday, September 28, 2011, Takashi Iwai wrote:
>>
>> If my previous test -- 2.6.37+Yinghai's patches didn't show the
>> problem -- is correct, it means that some change in 2.6.38 reacted
>> badly with Yinghai's patches, not one in 2.6.39.  I'll check again
>> tomorrow whether this observation is really correct.
>
> Yes, that would be good to know, thanks for doing this!
>
> If that turns out to be the case, the following commits look
> worth checking:
>
> d344e38 x86, nx: Mark the ACPI resume trampoline code as +x
> 884b821 ACPI: Fix acpi_os_read_memory() and acpi_os_write_memory() (v2)
> d551d81 ACPI / PM: Call suspend_nvs_free() earlier during resume
> 2d6d9fd ACPI: Introduce acpi_os_ioremap()

Thanks

Yinghai Lu


* Re: [PATCH] x86: Fix S4 regression
From: Yinghai Lu @ 2011-10-24  4:10 UTC
  To: Takashi Iwai; +Cc: Linus Torvalds, Rafael J.Wysocki, x86, linux-kernel

On Sun, Oct 23, 2011 at 5:29 PM, Yinghai Lu <yinghai.lu@oracle.com> wrote:
> On 10/23/2011 02:19 PM, Takashi Iwai wrote:
>
>> Commit 4b239f458 [x86-64, mm: Put early page table high] has caused
>> an S4 regression since 2.6.39: the machine occasionally reboots
>> at S4 resume.  It doesn't always happen; the overall rate is about 1/20.
>> But, like other bugs, once it happens, it continues to happen.
>>
>> This patch fixes the problem by essentially reverting the memory
>> assignment to the older way.
>>
>> Cc: <stable@kernel.org>
>> Signed-off-by: Takashi Iwai <tiwai@suse.de>
>>
>> ---
>> I'm resending this as a "fix" patch now before it's forgotten and rots.
>> It's just papering over the mystery again, but IMO better than the
>> hard-reset behavior as of now.  Unfortunately, bisection is pretty
>> difficult because the bug itself is fairly unstable...
>
>
>
> Did you check the commits that Rafael pointed out?
>
>
> On Wed, Sep 28, 2011 at 12:30 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> On Wednesday, September 28, 2011, Takashi Iwai wrote:
>>>
>>> If my previous test -- 2.6.37+Yinghai's patches didn't show the
>>> problem -- is correct, it means that some change in 2.6.38 reacted
>>> badly with Yinghai's patches, not one in 2.6.39.  I'll check again
>>> tomorrow whether this observation is really correct.
>>
>> Yes, that would be good to know, thanks for doing this!
>>
>> If that turns out to be the case, the following commits look
>> worth checking:
>>
>> d344e38 x86, nx: Mark the ACPI resume trampoline code as +x
>> 884b821 ACPI: Fix acpi_os_read_memory() and acpi_os_write_memory() (v2)
>> d551d81 ACPI / PM: Call suspend_nvs_free() earlier during resume
>> 2d6d9fd ACPI: Introduce acpi_os_ioremap()
>

Also, can you check whether reverting the following patch helps?

| commit e5f15b45ddf3afa2bbbb10c7ea34fb32b6de0a0e
| Author: Yinghai Lu <yinghai@kernel.org>
| Date:   Fri Feb 18 11:30:30 2011 +0000
|
|    x86: Cleanup highmap after brk is concluded
|
|    Now cleanup_highmap actually is in two steps: one is early in head64.c
|    and only clears above _end; a second one is in init_memory_mapping() and
|    tries to clean from _brk_end to _end.
|    It should check if those boundaries are PMD_SIZE aligned but currently
|    does not.
|    Also init_memory_mapping() is called several times for numa or memory
|    hotplug, so we really should not handle initial kernel mappings there.
|
|    This patch moves cleanup_highmap() down after _brk_end is settled so
|    we can do everything in one step.
|    Also we honor max_pfn_mapped in the implementation of cleanup_highmap.
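
As an illustrative aside (not taken from the patch itself), the one-step cleanup described in the commit message above can be modelled in userspace C roughly as follows. This is only a sketch under stated assumptions: `pmd_round_up`, `cleanup_highmap_model`, and the flat `pmd[]` array are hypothetical stand-ins for the kernel's page-table walk, PMD_SIZE is assumed to be 2 MiB as on x86-64, and the slot count `nr` plays the role of the max_pfn_mapped limit. It keeps every 2 MiB slot that overlaps [text_start, brk_end) and clears the rest, rounding brk_end up to a PMD_SIZE boundary first so that a partially used slot is not unmapped.

```c
#include <stddef.h>
#include <stdint.h>

#define PMD_SHIFT 21
#define PMD_SIZE  (UINT64_C(1) << PMD_SHIFT)   /* assumed 2 MiB, as on x86-64 */

/* Round addr up to the next PMD_SIZE boundary (no-op if already aligned). */
static uint64_t pmd_round_up(uint64_t addr)
{
    return (addr + PMD_SIZE - 1) & ~(PMD_SIZE - 1);
}

/*
 * Hypothetical model of a one-step highmap cleanup.
 *
 * pmd[i] != 0 means the 2 MiB slot starting at map_base + i * PMD_SIZE
 * is mapped.  Every mapped slot that lies wholly below text_start, or
 * wholly at or above brk_end rounded up to PMD_SIZE, is cleared.
 * Returns the number of slots cleared.
 */
static size_t cleanup_highmap_model(uint64_t *pmd, size_t nr,
                                    uint64_t map_base,
                                    uint64_t text_start,
                                    uint64_t brk_end)
{
    /* Last byte that must stay mapped, after aligning the end upward. */
    uint64_t last = pmd_round_up(brk_end) - 1;
    size_t cleared = 0;

    for (size_t i = 0; i < nr; i++) {
        uint64_t vaddr = map_base + (uint64_t)i * PMD_SIZE;

        if (pmd[i] == 0)
            continue;                       /* already unmapped */

        if (vaddr + PMD_SIZE <= text_start || vaddr > last) {
            pmd[i] = 0;                     /* outside the kernel image */
            cleared++;
        }
    }
    return cleared;
}
```

With eight mapped slots, text starting at slot 2 and brk ending one page into slot 4, the model keeps slots 2 through 4 and clears the other five; without the round-up, slot 4 would be treated as lying past the end.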

Thanks

Yinghai Lu


* Re: [PATCH] x86: Fix S4 regression
From: Takashi Iwai @ 2011-10-24  9:02 UTC
  To: Yinghai Lu
  Cc: Takashi Iwai, Linus Torvalds, Rafael J.Wysocki, x86, linux-kernel

At Sun, 23 Oct 2011 17:29:24 -0700,
Yinghai Lu wrote:
> 
> On 10/23/2011 02:19 PM, Takashi Iwai wrote:
> 
> > Commit 4b239f458 [x86-64, mm: Put early page table high] has caused
> > an S4 regression since 2.6.39: the machine occasionally reboots
> > at S4 resume.  It doesn't always happen; the overall rate is about 1/20.
> > But, like other bugs, once it happens, it continues to happen.
> >
> > This patch fixes the problem by essentially reverting the memory
> > assignment to the older way.
> >
> > Cc: <stable@kernel.org>
> > Signed-off-by: Takashi Iwai <tiwai@suse.de>
> >
> > ---
> > I'm resending this as a "fix" patch now before it's forgotten and rots.
> > It's just papering over the mystery again, but IMO better than the
> > hard-reset behavior as of now.  Unfortunately, bisection is pretty
> > difficult because the bug itself is fairly unstable...
> 
> 
> 
> Did you check the commits that Rafael pointed out?
> 
> 
> On Wed, Sep 28, 2011 at 12:30 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > On Wednesday, September 28, 2011, Takashi Iwai wrote:
> >>
> >> If my previous test -- 2.6.37+Yinghai's patches didn't show the
> >> problem -- is correct, it means that some change in 2.6.38 reacted
> >> badly with Yinghai's patches, not one in 2.6.39.  I'll check again
> >> tomorrow whether this observation is really correct.
> >
> > Yes, that would be good to know, thanks for doing this!
> >
> > If that turns out to be the case, the following commits look
> > worth checking:
> >
> > d344e38 x86, nx: Mark the ACPI resume trampoline code as +x
> > 884b821 ACPI: Fix acpi_os_read_memory() and acpi_os_write_memory() (v2)
> > d551d81 ACPI / PM: Call suspend_nvs_free() earlier during resume
> > 2d6d9fd ACPI: Introduce acpi_os_ioremap()

Yes, but these are harmless, as far as I've tested.


Takashi


* Re: [PATCH] x86: Fix S4 regression
From: Takashi Iwai @ 2011-10-24  9:03 UTC
  To: Yinghai Lu; +Cc: Linus Torvalds, Rafael J.Wysocki, x86, linux-kernel

At Sun, 23 Oct 2011 21:10:11 -0700,
Yinghai Lu wrote:
> 
> On Sun, Oct 23, 2011 at 5:29 PM, Yinghai Lu <yinghai.lu@oracle.com> wrote:
> > On 10/23/2011 02:19 PM, Takashi Iwai wrote:
> >
> >> Commit 4b239f458 [x86-64, mm: Put early page table high] has caused
> >> an S4 regression since 2.6.39: the machine occasionally reboots
> >> at S4 resume.  It doesn't always happen; the overall rate is about 1/20.
> >> But, like other bugs, once it happens, it continues to happen.
> >>
> >> This patch fixes the problem by essentially reverting the memory
> >> assignment to the older way.
> >>
> >> Cc: <stable@kernel.org>
> >> Signed-off-by: Takashi Iwai <tiwai@suse.de>
> >>
> >> ---
> >> I'm resending this as a "fix" patch now before it's forgotten and rots.
> >> It's just papering over the mystery again, but IMO better than the
> >> hard-reset behavior as of now.  Unfortunately, bisection is pretty
> >> difficult because the bug itself is fairly unstable...
> >
> >
> >
> > Did you check the commits that Rafael pointed out?
> >
> >
> > On Wed, Sep 28, 2011 at 12:30 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> >> On Wednesday, September 28, 2011, Takashi Iwai wrote:
> >>>
> >>> If my previous test -- 2.6.37+Yinghai's patches didn't show the
> >>> problem -- is correct, it means that some change in 2.6.38 reacted
> >>> badly with Yinghai's patches, not one in 2.6.39.  I'll check again
> >>> tomorrow whether this observation is really correct.
> >>
> >> Yes, that would be good to know, thanks for doing this!
> >>
> >> If that turns out to be the case, the following commits look
> >> worth checking:
> >>
> >> d344e38 x86, nx: Mark the ACPI resume trampoline code as +x
> >> 884b821 ACPI: Fix acpi_os_read_memory() and acpi_os_write_memory() (v2)
> >> d551d81 ACPI / PM: Call suspend_nvs_free() earlier during resume
> >> 2d6d9fd ACPI: Introduce acpi_os_ioremap()
> >
> 
> Also, can you check whether reverting the following patch helps?

OK, I'll try it, but not until next week, as I'll be at conferences
for this whole week...


thanks,

Takashi

> | commit e5f15b45ddf3afa2bbbb10c7ea34fb32b6de0a0e
> | Author: Yinghai Lu <yinghai@kernel.org>
> | Date:   Fri Feb 18 11:30:30 2011 +0000
> |
> |    x86: Cleanup highmap after brk is concluded
> |
> |    Now cleanup_highmap actually is in two steps: one is early in head64.c
> |    and only clears above _end; a second one is in init_memory_mapping() and
> |    tries to clean from _brk_end to _end.
> |    It should check if those boundaries are PMD_SIZE aligned but currently
> |    does not.
> |    Also init_memory_mapping() is called several times for numa or memory
> |    hotplug, so we really should not handle initial kernel mappings there.
> |
> |    This patch moves cleanup_highmap() down after _brk_end is settled so
> |    we can do everything in one step.
> |    Also we honor max_pfn_mapped in the implementation of cleanup_highmap.
> 
> Thanks
> 
> Yinghai Lu
> 

