From: Chen Yu <yu.c.chen@intel.com>
To: joeyli <jlee@suse.com>
Cc: rjw@rjwysocki.net, pavel@ucw.cz, len.brown@intel.com,
hpa@zytor.com, mingo@redhat.com, tglx@linutronix.de,
rui.zhang@intel.com, x86@kernel.org, linux-pm@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH][v6] PM / hibernate: Print the possible panic reason when resuming with inconsistent e820 map
Date: Tue, 23 Aug 2016 18:01:55 +0800
Message-ID: <20160823100155.GA12738@sharon>
In-Reply-To: <20160823094527.GG7276@linux-rxt1.site>
Hi,
thanks for your interest :)
On Tue, Aug 23, 2016 at 05:45:27PM +0800, joeyli wrote:
> Hi all,
>
> On Wed, Oct 21, 2015 at 01:21:40PM +0800, Chen Yu wrote:
> > On some platforms, there is occasional panic triggered when trying to
> > resume from hibernation, a typical panic looks like:
> >
> > "BUG: unable to handle kernel paging request at ffff880085894000
> > IP: [<ffffffff810c5dc2>] load_image_lzo+0x8c2/0xe70"
> >
> > This is because e820 map has been changed by BIOS before/after
> > hibernation, and one of the page frames from first kernel
> > is right located in second kernel's unmapped region, so panic
> > comes out when accessing unmapped kernel address.
> >
> > In order to tell the user why this happened, and for scalability,
> > we introduce a framework(a new file named hibernation_e820.c) to
> > compare the e820 maps before/after hibernation. If these two
> > e820 maps are not compatible with each other, we will print
> > warning about the first corrupt e820 entry's information
> > (there might be more than one broken e820 entry) once the
> > system goes into panic, for example:
> >
> > BUG: unable to handle kernel paging request at ffff8800a9688000
> > IP: [<ffffffff810c5dc2>] load_image_lzo+0x8c2/0xe70
> > PM: Hibernation Caution! Oops might be due to inconsistent e820 table.
> > PM: mem [0xa963b000-0xa963d000][ACPI Table] is an invalid old e820 region.
> > PM: Inconsistent with current [mem 0xa963b000-0xa963e000][ACPI Table].
> > PM: Please update your BIOS, or do not use hibernation on this machine.
> >
> > The following kind of e820 entries will be regarded as invalid ones:
> > 1.E820_RAM: old region is not a subset of any current region.
> > 2.E820_ACPI: old region is not strictly the same as any current
> > region(example above).
> >
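The two rules above can be modeled in userspace roughly as follows. This is a minimal sketch, not the code from the patch: the `Region` type, the helper names, and the choice of exclusive end addresses are assumptions made for illustration. Only the type codes (1 for RAM, 3 for ACPI) follow the x86 e820 convention.

```python
from dataclasses import dataclass
from typing import List, Optional

# x86 e820 type codes: 1 = usable RAM, 3 = ACPI tables.
E820_RAM = 1
E820_ACPI = 3


@dataclass(frozen=True)
class Region:
    """Hypothetical model of one e820 entry (not a kernel structure)."""
    start: int  # first byte of the region
    end: int    # one past the last byte (exclusive end is an assumption)
    type: int


def is_subset(old: Region, cur: Region) -> bool:
    """True if `old` lies entirely inside `cur`."""
    return cur.start <= old.start and old.end <= cur.end


def old_region_is_valid(old: Region, current: List[Region]) -> bool:
    """Apply the two rules from the patch description to one old region."""
    same_type = [c for c in current if c.type == old.type]
    if old.type == E820_RAM:
        # Rule 1: an old RAM region must be a subset of some current region.
        return any(is_subset(old, c) for c in same_type)
    if old.type == E820_ACPI:
        # Rule 2: an old ACPI region must match a current region exactly.
        return any(c.start == old.start and c.end == old.end
                   for c in same_type)
    # Other region types are not checked in this sketch.
    return True


def first_invalid_region(old_map: List[Region],
                         current_map: List[Region]) -> Optional[Region]:
    """Return the first old region that violates a rule, or None."""
    for old in old_map:
        if not old_region_is_valid(old, current_map):
            return old
    return None
```

Fed the ACPI region from the example log (old 0xa963b000-0xa963d000, current 0xa963b000-0xa963e000), `first_invalid_region` returns the old region, because the end addresses differ.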
> > Signed-off-by: Chen Yu <yu.c.chen@intel.com>
> > ---
> > v6:
> > - Fix some compiling errors reported by 0day/LKP, adjust
> > Kconfig/variable namings.
> > v5:
> > - Rewrite this patch to just warn user of the broken BIOS
> > when panic.
> > v4:
> > - Add __attribute__ ((unused)) for swsusp_page_is_valid,
> > to eliminate the warning of:
> > 'swsusp_page_is_valid' defined but not used
> > on non-x86 platforms.
> >
> > v3:
> > - Adjust the logic to exclude the end_pfn boundary in pfn_mapped
> > when invoking mark_valid_pages, because the end_pfn is not
> > a mapped page frame, we should not regard it as a valid page.
> >
> > Move the sanity check of valid pages to an early stage in resuming
> > process(moved to mark_unsafe_pages), in this way, we can avoid
> > unnecessarily accessing these invalid pages in later stage(yes,
> > move to the original position Joey once introduced in:
> > Commit 84c91b7ae07c ("PM / hibernate: avoid unsafe pages in e820
> > reserved regions")
> >
> > With v3 patch applied, I did 30 cycles on my problematic platform,
> > no panic triggered anymore(50% reproducible before patched, by
> > plugging/unplugging memory peripheral during hibernation), and it
> > just warns of invalid pages.
> >
> > v2:
> > - According to Ingo's suggestion, rewrite this patch.
> >
> > New version just checks each page frame according to pfn_mapped array.
> > So that we do not need to touch existing code related to
> > E820_RESERVED_KERN. And this method can naturally guarantee
> > that the system before/after hibernation do not need to be of
> > the same memory size on x86_64.
>
> What's the status of this patch? It looks like experts have already reviewed it.
> Why wasn't this patch accepted?
This patch is a bit of an overkill, and I have a simpler version that
only checks the MD5 hash of the e820 map (as people suggested). I can post it later.
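
The hash-based idea can be sketched in userspace as follows. This is only an illustration under stated assumptions, not the simpler patch itself: the function names are hypothetical, and each entry is serialized as a packed little-endian (u64 address, u64 size, u32 type) record, mirroring the layout of an x86 e820 entry. The digest would be taken before hibernation and compared with a freshly computed one on resume.

```python
import hashlib
import struct
from typing import Iterable, Tuple

# One entry: (start address, size in bytes, e820 type code).
Entry = Tuple[int, int, int]


def e820_digest(entries: Iterable[Entry]) -> bytes:
    """Return the MD5 digest of a serialized e820 map.

    Each entry is packed as little-endian u64 addr, u64 size, u32 type
    (20 bytes, no padding).
    """
    md5 = hashlib.md5()
    for addr, size, type_ in entries:
        md5.update(struct.pack("<QQI", addr, size, type_))
    return md5.digest()


def maps_consistent(saved_digest: bytes, current: Iterable[Entry]) -> bool:
    """Compare the digest saved before hibernation with the current map."""
    return saved_digest == e820_digest(current)
```

Unlike a per-region check, a digest comparison only says whether the two maps differ at all; it cannot report which entry changed.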
thanks,
Yu