From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: David Vrabel <david.vrabel@citrix.com>
Cc: Andre Przywara <andre.przywara@amd.com>,
Jeremy Fitzhardinge <jeremy@goop.org>,
xen-devel <xen-devel@lists.xensource.com>,
Jan Beulich <JBeulich@suse.com>
Subject: Re: acpidump crashes on some machines
Date: Thu, 23 Aug 2012 10:35:12 -0400 [thread overview]
Message-ID: <20120823143512.GI26098@phenom.dumpdata.com> (raw)
In-Reply-To: <50363FEF.10001@citrix.com>
On Thu, Aug 23, 2012 at 03:36:31PM +0100, David Vrabel wrote:
> On 23/08/12 15:10, Konrad Rzeszutek Wilk wrote:
> > On Thu, Aug 23, 2012 at 11:22:29AM +0100, David Vrabel wrote:
> >> On 23/08/12 11:14, Andre Przywara wrote:
> >>> On 08/17/2012 10:52 PM, Konrad Rzeszutek Wilk wrote:
> >>>> On Thu, Jul 26, 2012 at 03:02:58PM +0200, Andre Przywara wrote:
> >>>>> On 06/30/2012 04:19 AM, Konrad Rzeszutek Wilk wrote:
> >>>>>
> >>>>> Konrad, David,
> >>>>>
> >>>>> back on track for this issue. Thanks for your input, I could do some
> >>>>> more debugging (see below for a refresh):
> >>>>>
> >>>>> It seems like it affects only the first page of the 1:1 mapping. I
> >>>>> didn't have an issues with the last PFN or the page behind it (which
> >>>>> failed properly).
> >>>>>
> >>>>> David, thanks for the hint with varying dom0_mem parameter. I
> >>>>> thought I already checked this, but I did it once again and it
> >>>>> turned out that it is only an issue if dom0_mem is smaller than the
> >>>>> ACPI area, which generates a hole in the memory map. So we have
> >>>>> (simplified)
> >>>>> * 1:1 mapping to 1 MB
> >>>>> * normal mapping till dom0_mem
> >>>>> * unmapped area till ACPI E820 area
> >>>>> * ACPI E820 1:1 mapping
> >>>>>
> >>>>> As far as I could chase it down the 1:1 mapping itself looks OK, I
> >>>>> couldn't find any off-by-one bugs here. So maybe it is code that
> >>>>> later on invalidates areas between the normal guest mapping and the
> >>>>> ACPI mem?
> >>>>
> >>>> I think I found it. Can you try this pls [and if you can't find
> >>>> early_to_phys.. just use the __set_phys_to call]
> >>>
> >>> Yes, that works. At least after a quick test on my test box. Both the
> >>> test module and acpidump work as expected. If I replace the "<" in your
> >>> patch with the original "<=", I get the warning (and due to the
> >>> "continue" it also works).
> >>
> >> Note that the balloon driver could subsequently overwrite the p2m entry.
> >
> > Hmm, I am not seeing how.. the region that is passed in is right up to
> > the PFN (I believe). And I did run with this patch over a couple of days
> > with ballooning up and down. But maybe I missed something?
>
> Hrrm. I was sure I wrote "Note that the balloon driver could
> subsequently overwrite the p2m entry /if/ this warning is triggered."
> but it seems I did not. :/
>
> i.e., if the warning is triggered, the xen_extra_mem region will be
> incorrectly sized and the balloon driver will make use of the incorrect
> region.
Ah, that makes more sense. Yes we would do the overwritting part later on
in that scenario.. which makes me wonder - if we did that in the past
how come MMIO devices still worked! Some boxes have the gap/MMIO right
at the edge of the E820_RAM - perhaps they silently coping with and
we just never caught on this fact.
>
> David
>
> > Let me prep a patch that adds some more checks in the balloon driver
> > just in case we do hit this.
> >
> >> I don't think it is worth redoing the patch to adjust the region passed
> >> to the balloon driver to avoid this though.
> >>
> >>> I also successfully tested the minimal fix (just replacing <= with <).
> >>> I will feed it to the testers here to cover more machines.
> >>>
> >>> Do you want to keep the warnings in (which exceed 80 characters, btw)?
> >>
> >> I think we do.
> >>
> >> David
next prev parent reply other threads:[~2012-08-23 14:35 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-06-20 12:37 acpidump crashes on some machines Andre Przywara
2012-06-20 14:51 ` Konrad Rzeszutek Wilk
2012-06-21 14:21 ` Andre Przywara
2012-06-30 1:48 ` Konrad Rzeszutek Wilk
2012-06-30 2:19 ` Konrad Rzeszutek Wilk
2012-07-26 13:02 ` Andre Przywara
2012-08-17 20:52 ` Konrad Rzeszutek Wilk
2012-08-23 10:14 ` Andre Przywara
2012-08-23 10:22 ` David Vrabel
2012-08-23 14:10 ` Konrad Rzeszutek Wilk
2012-08-23 14:36 ` David Vrabel
2012-08-23 14:35 ` Konrad Rzeszutek Wilk [this message]
2012-08-23 14:06 ` Konrad Rzeszutek Wilk
2012-07-04 10:21 ` David Vrabel
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120823143512.GI26098@phenom.dumpdata.com \
--to=konrad.wilk@oracle.com \
--cc=JBeulich@suse.com \
--cc=andre.przywara@amd.com \
--cc=david.vrabel@citrix.com \
--cc=jeremy@goop.org \
--cc=xen-devel@lists.xensource.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.