From: Chen Yu <yu.c.chen@intel.com>
To: Pavel Machek <pavel@ucw.cz>
Cc: x86@kernel.org, linux-kernel@vger.kernel.org,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, "H . Peter Anvin" <hpa@zytor.com>,
	linux-pm@vger.kernel.org,
	"Rafael J . Wysocki" <rjw@rjwysocki.net>,
	Len Brown <len.brown@intel.com>, Borislav Petkov <bp@suse.de>,
	Peter Zijlstra <peterz@infradead.org>,
	Zhu Guihua <zhugh.fnst@cn.fujitsu.com>,
	Juergen Gross <jgross@suse.com>, "Zheng, Lv" <lv.zheng@intel.com>
Subject: Re: [PATCH][RFC] x86, hotplug: Use zero page for monitor when resuming from hibernation
Date: Tue, 7 Jun 2016 16:44:24 +0800
Message-ID: <57568968.2090100@intel.com>
In-Reply-To: <20160607080307.GB13858@amd>

Hi Pavel,

On 2016-06-07 16:03, Pavel Machek wrote:
> On Mon 2016-06-06 22:19:09, Chen Yu wrote:
>> Stress test from Varun Koyyalagunta reports that, the
>> nonboot CPU would hang occasionally, when resuming from
>> hibernation. Further investigation shows that, the precise
>> phase when nonboot CPU hangs, is the time when the nonboot
>> CPU been woken up incorrectly, and tries to monitor the
>> mwait_ptr for the second time, then an exception is
>> triggered due to illegal vaddr access, say, something like,
>> 'Unable to handler kernel address of 0xffff8800ba800010...'
>>
>> One of the possible scenarios for this issue is illustrated below,
>> when the boot CPU tries to resume from hibernation:
>> 1. puts the nonboot CPUs offline, so the nonboot CPUs are monitoring
>>     at the address of the task_struct.flags.
>> 2. boot CPU copies pages to their original address, which includes
>>     task_struct.flags, thus wakes up one of the nonboot CPUs.
>> 3. nonboot CPU tries to monitor the task_struct.flags again, but since
>>     the page table for task_struct.flags has been overwritten by
>>     boot CPU, and there is probably a changed across hibernation
>>     (because of inconsistence of e820 memory map), an exception is
>>     triggered.
> If memory map changes between suspend and resume, there'll be fun. If
> that's suspected, should we attach md5 sum of e820 to the hibernation
> image?
Actually, what I described in the scenario might not be accurate.
The fault might not be related to an inconsistent e820 map at all,
because there is no guarantee that the boot kernel and the resume kernel
have the same memory layout (page tables) in the first place.

I've re-checked the logs from the reporter. It seems that the faulting
access hits a page without the PRESENT flag set,
and the pte for this vaddr is zero:

    // DATA ADDRESS TRANSLATION, virtual address = 0xFFFF8800BA803E88
    //   DATA TABLE WALK, virtual address = 0xFFFF8800BA803E88
    pml4read   0x0001C12880    0x01FD2067; // pgd
    pdptread   0x0001FD2010    0x9BEBF063; // pud
    pderead    0x009BEBFEA0    0x2D9D9063; // pmd
    pteread    0x002D9D9018    0x00000000; // pte

The last line above is the pte located at physical address 0x2d9d9018.
Its value is zero, so an access to this vaddr results in a page-not-present exception.
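
As a cross-check of the walk above, here is a minimal Python sketch (not
kernel code) that splits the faulting vaddr into its four table indices and
reports the first level whose entry lacks the PRESENT bit. It assumes
standard x86-64 4-level paging with 4 KiB pages and 8-byte entries, and it
takes the entry values exactly as printed in the log. All function and
constant names are hypothetical.

```python
# Illustrative sketch only, not kernel code.
# Assumes x86-64 4-level paging, 4 KiB pages, 8-byte table entries.
# All names below are hypothetical helpers for this example.

PAGE_SHIFT = 12                  # 4 KiB pages
INDEX_BITS = 9                   # 512 entries per table
ENTRY_SIZE = 8                   # bytes per table entry
PRESENT = 0x1                    # bit 0 of an entry
ADDR_MASK = 0x000FFFFFFFFFF000   # physical address bits 51:12 of an entry

LEVELS = ("pgd", "pud", "pmd", "pte")

VADDR = 0xFFFF8800BA803E88

# (physical address read, value read), copied from the log above.
WALK = [
    (0x0001C12880, 0x01FD2067),  # pgd
    (0x0001FD2010, 0x9BEBF063),  # pud
    (0x009BEBFEA0, 0x2D9D9063),  # pmd
    (0x002D9D9018, 0x00000000),  # pte
]


def table_indices(vaddr):
    """Return the (pgd, pud, pmd, pte) indices encoded in a virtual address."""
    indices = []
    for level in range(3, -1, -1):
        shift = PAGE_SHIFT + INDEX_BITS * level
        indices.append((vaddr >> shift) & ((1 << INDEX_BITS) - 1))
    return tuple(indices)


def entry_address(parent_entry, index):
    """Physical address of entry `index` in the table that `parent_entry` points to."""
    return (parent_entry & ADDR_MASK) + index * ENTRY_SIZE


def first_non_present(walk):
    """Name of the first level whose entry has PRESENT clear, or None."""
    for name, (_, value) in zip(LEVELS, walk):
        if not value & PRESENT:
            return name
    return None


if __name__ == "__main__":
    for name, index in zip(LEVELS, table_indices(VADDR)):
        print(f"{name} index: {index:#x}")
    print("first non-present level:", first_non_present(WALK))
```

The indices come out as 0x110, 0x2, 0x1d4 and 0x3, and the entry addresses
derived from each parent entry match the addresses read in the log, so the
walk is self-consistent and only the final pte is not present.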


Since some of the pud/pmd/pte pages are allocated dynamically during boot
(kernel_physical_mapping_init),
it is possible that, when the boot CPU writes to this vaddr,
the page tables (especially the pud/pmd/pte levels) for this vaddr
are not the same as they were before hibernation. An exception would then be
triggered by the incorrect page table, even if the e820 map is consistent.

I'm doing more tests to verify this.

thanks,
Yu
