From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Thomas Schwinge <thomas@schwinge.name>
Cc: xen-devel@lists.xensource.com,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	Ian Campbell <ijc@hellion.org.uk>
Subject: Re: Debian linux-image-2.6.32-4-xen-amd64 2.6.32-11 doesn't boot with > 4 GiB; resets immediatelly, no log messages
Date: Fri, 09 Apr 2010 11:20:52 -0700
Message-ID: <4BBF7004.8000707@goop.org>
In-Reply-To: <20100409180016.GA14029@kepler.schwinge.homeip.net>

On 04/09/2010 11:00 AM, Thomas Schwinge wrote:
> Before we get to the backtrace, one further detail: this kernel *does*
> boot if one of the following has happened before: the BIOS memchecker has
> run, memtest86+ has run, some other kernel has run (though it doesn't
> always boot in this latter case).  Thus, I wildly guess that some
> uninitialized data structure (in memory) is dereferenced -- that happens
> to be in a sane state after memtest86+ et al.
>   

OK, I think I see what's happening here...

>     $ for ip in ffffffff814f6d88 ffffffff81433e38 ffffffff814f6d3d ffffffff81433e60 ffffffff815a73ac ffffffff81433f98 ffffffff814f6f85 ffffffff8152b2d0 ffffffff814f95fb ffffffff814f8249 ffffffff813f3f5f ffffffff813b4119 ffffffff81433f90 ffffffff811ff14f ffffffff8100e361 ffffffff8100e343 ffffffff813b4119 ffffffff813f3f5f ffffffff8152a7b0 ffffffff814f49d0 ffffffff81001000 ffffffff814f6aca; do echo "* $ip:" && addr2line -fie debian/build/build_amd64_xen_amd64/vmlinux "$ip"; done > ~/shared/tmp/tmp
>     * ffffffff814f6d88:
>     xen_release_chunk
>   

This is the code which walks the gaps between the E820 table entries,
looking for pages which Xen has assigned to the kernel but which the
kernel can't use (because they're not covered by E820).  It does this with:

	for(pfn = start; pfn < end; pfn++) {
		unsigned long mfn = pfn_to_mfn(pfn);

		/* Make sure pfn exists to start with */
		if (mfn == INVALID_P2M_ENTRY || mfn_to_pfn(mfn) != pfn)
			continue;
		...


So we're poking at the p2m and m2p tables for arbitrary pages which may
or may not be valid.  If we do a pfn_to_mfn on a pfn which is within the
range of valid pfns, but is not actually a valid pfn for our domain, then
the resulting mfn is undefined (and may depend on leftover memory
contents, which is why it is affected by what you've previously
booted).

We then pass that mfn back to mfn_to_pfn to see if it really does belong
to us (if it does, we get the same pfn back).  But the mfn could be
random garbage, and mfn_to_pfn uses it to index an array.
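
To make the round trip concrete, here is a small user-space sketch.  It
is not the kernel code: the toy_* tables, their sizes and their contents
are made up for illustration, and the bounds check on the mfn is my
assumption about one way to avoid indexing with garbage, not necessarily
the fix that should go in.

```c
#include <stddef.h>

#define INVALID_P2M_ENTRY (~0UL)

/* Hypothetical table sizes; the real p2m/m2p are far larger. */
#define TOY_MAX_PFN 5
#define TOY_MAX_MFN 8

/* Toy p2m: pfn -> mfn.  pfn 3 holds stale garbage, as after a warm boot. */
static const unsigned long toy_p2m[TOY_MAX_PFN] = {
	3,			/* pfn 0 -> mfn 3, ours */
	7,			/* pfn 1 -> mfn 7, ours */
	INVALID_P2M_ENTRY,	/* pfn 2 has no backing page */
	0xdeadbeefUL,		/* pfn 3: leftover memory contents */
	5,			/* pfn 4 -> mfn 5, but mfn 5 is not ours */
};

/* Toy m2p: mfn -> pfn.  Entries not listed default to 0. */
static const unsigned long toy_m2p[TOY_MAX_MFN] = {
	[3] = 0,
	[7] = 1,
	[5] = 99,		/* mfn 5 maps to some other pfn */
};

/* Return 1 if pfn round-trips through both tables, 0 otherwise. */
static int toy_pfn_is_ours(unsigned long pfn)
{
	unsigned long mfn;

	if (pfn >= TOY_MAX_PFN)
		return 0;

	mfn = toy_p2m[pfn];
	if (mfn == INVALID_P2M_ENTRY)
		return 0;

	/* Without this check a garbage mfn indexes far outside toy_m2p. */
	if (mfn >= TOY_MAX_MFN)
		return 0;

	return toy_m2p[mfn] == pfn;
}
```

Here pfns 0 and 1 pass the check, pfn 2 is rejected by the
INVALID_P2M_ENTRY test, pfn 4 is rejected by the m2p comparison, and
pfn 3 is only rejected safely because of the added range check.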

Normally that would be OK, because it uses:

	__get_user(pfn, &machine_to_phys_mapping[mfn]);

to dereference the array, so a bad address is caught by the exception
fixup and just returns an error.  But at this early stage, none of the
kernel's exception handlers have been set up, so the access will simply
fault into Xen.
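
As a rough user-space analogy (not how the kernel does it: __get_user
relies on the exception table, not signals), the sketch below shows a
read that tolerates a bad address only once a fault handler is in place.
All the names here (install_fault_handler, guarded_read,
make_inaccessible_page) are hypothetical, and it assumes a POSIX system
with mmap and sigaction.

```c
#define _DEFAULT_SOURCE
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

static sigjmp_buf fixup_env;

/* Fault handler: jump back to the recovery point in guarded_read(). */
static void fault_handler(int sig)
{
	(void)sig;
	siglongjmp(fixup_env, 1);
}

/* Must be called before guarded_read(); returns 0 on success. */
static int install_fault_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = fault_handler;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGSEGV, &sa, NULL))
		return -1;
	return sigaction(SIGBUS, &sa, NULL);
}

/* Read *addr into *val; return 0 on success, -1 if the access faulted. */
static int guarded_read(const unsigned long *addr, unsigned long *val)
{
	if (sigsetjmp(fixup_env, 1))
		return -1;	/* we got here via fault_handler() */

	*val = *(const volatile unsigned long *)addr;
	return 0;
}

/* Map one page with no access rights, to stand in for a bad address. */
static void *make_inaccessible_page(void)
{
	void *p = mmap(NULL, 4096, PROT_NONE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	return p == MAP_FAILED ? NULL : p;
}
```

If guarded_read() is called on the inaccessible page without first
calling install_fault_handler(), the process is killed by the default
SIGSEGV action, which is the analogue of the early-boot access faulting
into Xen.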

It would be interesting to confirm this by building your kernel with
CONFIG_DEBUG_INFO=y in the .config, and verifying that the faulting
instruction is actually this line.

Thanks,
    J
