From: Ingo Molnar <mingo@elte.hu>
To: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
	xen-devel <xen-devel@lists.xensource.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	"Rafael J. Wysocki" <rjw@sisk.pl>,
	x86@kernel.org, Sam Ravnborg <sam@ravnborg.org>
Subject: [bisected] Re: [PATCH 05 of 12] xen: add p2m mfn_list_list
Date: Wed, 28 May 2008 14:28:13 +0200
Message-ID: <20080528122813.GA5502@elte.hu>
In-Reply-To: <a405e5f9ddf6f5f142ea.1211550072@localhost>


* Jeremy Fitzhardinge <jeremy@goop.org> wrote:

> When saving a domain, the Xen tools need to remap all our mfns to 
> portable pfns.  In order to remap our p2m table, it needs to know 
> where all its pages are, so maintain the references to the p2m table 
> for it to use.

-tip tree auto-testing found the following early bootup hang:

-------------->
get_memcfg_from_srat: assigning address to rsdp
RSD PTR  v0 [Nvidia]
BUG: Int 14: CR2 ffd00040
     EDI 8092fbfe  ESI ffd00040  EBP 80b0aee8  ESP 80b0aed0
     EBX 000f76f0  EDX 0000000e  ECX 00000003  EAX ffd00040
     err 00000000  EIP 802c055a   CS 00000060  flg 00010006
Stack: ffd00040 80bc78d0 80b0af6c 80b1dbfe 8093d8ba 00000008 80b42810 80b4ddb4
       80b42842 00000000 80b0af1c 801079c8 808e724e 00000000 80b42871 802c0531
       00000100 00000000 0003fff0 80b0af40 80129999 00040100 00040100 00000000
Pid: 0, comm: swapper Not tainted 2.6.26-rc4-sched-devel.git #570
 [<802c055a>] ? strncmp+0x11/0x25
 [<80b1dbfe>] ? get_memcfg_from_srat+0xb4/0x568
 [<801079c8>] ? mcount_call+0x5/0x9
 [<802c0531>] ? strcmp+0xa/0x22
 [<80129999>] ? printk+0x38/0x3a
 [<80129999>] ? printk+0x38/0x3a
 [<8011b122>] ? memory_present+0x66/0x6f
 [<80b216b4>] ? setup_memory+0x13/0x40c
 [<80b16b47>] ? propagate_e820_map+0x80/0x97
 [<80b1622a>] ? setup_arch+0x248/0x477
 [<80129999>] ? printk+0x38/0x3a
 [<80b11759>] ? start_kernel+0x6e/0x2eb
 [<80b110fc>] ? i386_start_kernel+0xeb/0xf2
 =======================
<------

with this config:

   http://redhat.com/~mingo/misc/config-Wed_May_28_01_33_33_CEST_2008.bad

The thing is, the crash makes little sense at first sight. We crash on a 
benign-looking printk. The code around it got changed in -tip but 
checking those topic branches individually did not reproduce the bug.

Bisection led to this commit:

|   d5edbc1f75420935b1ec7e65df10c8f81cea82de is first bad commit
|   commit d5edbc1f75420935b1ec7e65df10c8f81cea82de
|   Author: Jeremy Fitzhardinge <jeremy@goop.org>
|   Date:   Mon May 26 23:31:22 2008 +0100
|
|   xen: add p2m mfn_list_list

Which is somewhat surprising, as on native hardware the Xen client-side 
code should have little to no side effects.

After some head scratching, it turns out the following happened: 
randconfig enabled the following Xen options:

  CONFIG_XEN=y
  CONFIG_XEN_MAX_DOMAIN_MEMORY=8
  # CONFIG_XEN_BLKDEV_FRONTEND is not set
  # CONFIG_XEN_NETDEV_FRONTEND is not set
  CONFIG_HVC_XEN=y
  # CONFIG_XEN_BALLOON is not set

which activated this piece of code in arch/x86/xen/mmu.c:

> @@ -69,6 +69,13 @@
>  	__attribute__((section(".data.page_aligned"))) =
>  		{ [ 0 ... TOP_ENTRIES - 1] = &p2m_missing[0] };
>  
> +/* Arrays of p2m arrays expressed in mfns used for save/restore */
> +static unsigned long p2m_top_mfn[TOP_ENTRIES]
> +	__attribute__((section(".bss.page_aligned")));
> +
> +static unsigned long p2m_top_mfn_list[TOP_ENTRIES / P2M_ENTRIES_PER_PAGE]
> +	__attribute__((section(".bss.page_aligned")));

The problem is, you must only put variables into .bss.page_aligned that 
have a _size_ that is _exactly_ page aligned. In this case the size of 
p2m_top_mfn_list is not page aligned:

 80b8d000 b p2m_top_mfn
 80b8f000 b p2m_top_mfn_list
 80b8f008 b softirq_stack
 80b97008 b hardirq_stack
 80b9f008 b bm_pte

So all subsequent variables get unaligned which, depending on luck, 
breaks the kernel in various funny ways. In this case what killed the 
kernel first was the misaligned bootmap pte page, resulting in that 
creative crash above.
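
The address arithmetic can be sketched in plain C. This is only an 
illustration, assuming 4 KiB pages and a section that concatenates its 
objects without re-aligning each one; next_symbol() and 
is_page_aligned() are hypothetical helpers, not kernel functions:

```c
#include <assert.h>

#define PAGE_SIZE 4096UL	/* assumption: 4 KiB pages (32-bit x86) */

static_assert((PAGE_SIZE & (PAGE_SIZE - 1)) == 0,
	      "PAGE_SIZE must be a power of two for the mask test below");

/* Hypothetical helper: start of the next object when the section just
 * concatenates its objects without re-aligning each one. */
static unsigned long next_symbol(unsigned long addr, unsigned long size)
{
	return addr + size;
}

/* Non-zero if addr sits on a page boundary. */
static int is_page_aligned(unsigned long addr)
{
	return (addr & (PAGE_SIZE - 1)) == 0;
}
```

With the numbers from the listing, next_symbol(0x80b8f000, 8) gives 
0x80b8f008, the address of softirq_stack, and is_page_aligned() rejects 
it. Adding the 0x8000-byte gap between the stack symbols then gives 
0x80b97008 and 0x80b9f008, matching hardirq_stack and bm_pte.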

Anyway, this was a fun bug to track down :-)

I think the moral is that .bss.page_aligned is a dangerous construct in 
its current form, and the symptoms of breakage are very non-trivial, so 
I think we need build-time checks to make sure all symbols in 
.bss.page_aligned are truly page aligned.
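
One possible shape for such a check, as a rough sketch only: a 
per-symbol compile-time assertion on the object size. 
ASSERT_PAGE_MULTIPLE() and demo_full_page are hypothetical names, not 
existing kernel interfaces, and the sketch assumes 4 KiB pages and a 
C11 compiler. It covers only symbols that are explicitly annotated, so 
it is not a section-wide check at link time:

```c
#include <assert.h>	/* static_assert (C11) */

#define PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */

/* Hypothetical macro: refuse to build if sym's size is not a whole
 * number of pages. */
#define ASSERT_PAGE_MULTIPLE(sym)				\
	static_assert(sizeof(sym) % PAGE_SIZE == 0,		\
		      #sym ": size is not a multiple of PAGE_SIZE")

/* Passes: 1024 unsigned longs are 4096 or 8192 bytes, depending on
 * whether unsigned long is 4 or 8 bytes wide. */
static unsigned long demo_full_page[1024] __attribute__((aligned(4096)));
ASSERT_PAGE_MULTIPLE(demo_full_page);

/* Would fail to build: two entries are only 8 or 16 bytes.
 *
 *	static unsigned long demo_short[2];
 *	ASSERT_PAGE_MULTIPLE(demo_short);
 */
```

The array stands in for an object that would carry the 
section(".bss.page_aligned") attribute in the kernel; the attribute is 
left out here so the sketch builds as an ordinary user-space file.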

Sam, any ideas how to accomplish that best?

The Xen fix below gets the kernel booting again. I suspect we really 
need this list to stay in its separate page due to Xen assumptions, even 
though it's only 8 bytes large?
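
The size arithmetic behind that fix can be checked with a small sketch. 
It assumes 4 KiB pages, a 4-byte unsigned long (32-bit x86) and the 
usual round-up definition of PAGE_ALIGN(); array_bytes() and 
entries_per_page() are hypothetical helpers:

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */
/* Round x up to the next multiple of PAGE_SIZE. */
#define PAGE_ALIGN(x) (((x) + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1))

/* Used as an element count, a two-entry array grows to 4096 entries. */
static_assert(PAGE_ALIGN(2UL) == 4096UL, "PAGE_ALIGN(2) rounds up to 4096");

/* Bytes occupied by an array of count entries of entry_size bytes. */
static size_t array_bytes(size_t count, size_t entry_size)
{
	return count * entry_size;
}

/* Hypothetical alternative: entry count that fills exactly one page. */
static size_t entries_per_page(size_t entry_size)
{
	return PAGE_SIZE / entry_size;
}
```

Under these assumptions the unpatched array is array_bytes(2, 4) = 8 
bytes, while the patched one is array_bytes(PAGE_ALIGN(2), 4) = 16384 
bytes. That is a whole number of pages, so the following symbols are 
aligned again, but it reserves four pages rather than one; 
entries_per_page(4) = 1024 entries would occupy exactly one page.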

	Ingo

---
 arch/x86/xen/mmu.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Index: linux/arch/x86/xen/mmu.c
===================================================================
--- linux.orig/arch/x86/xen/mmu.c
+++ linux/arch/x86/xen/mmu.c
@@ -73,7 +73,8 @@ static unsigned long *p2m_top[TOP_ENTRIE
 static unsigned long p2m_top_mfn[TOP_ENTRIES]
 	__attribute__((section(".bss.page_aligned")));
 
-static unsigned long p2m_top_mfn_list[TOP_ENTRIES / P2M_ENTRIES_PER_PAGE]
+static unsigned long p2m_top_mfn_list[
+			PAGE_ALIGN(TOP_ENTRIES / P2M_ENTRIES_PER_PAGE)]
 	__attribute__((section(".bss.page_aligned")));
 
 static inline unsigned p2m_top_index(unsigned long pfn)
