From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Lukas Hejtmanek <xhejtman@ics.muni.cz>
Cc: roland@kernel.org, linux-rdma@vger.kernel.org,
Jan Beulich <JBeulich@suse.com>,
xen-devel@lists.xen.org
Subject: Re: BUG: bad page map under Xen
Date: Mon, 21 Oct 2013 10:18:55 -0400
Message-ID: <20131021141855.GA4211@phenom.dumpdata.com>
In-Reply-To: <20131021140607.GQ20913@ics.muni.cz>
On Mon, Oct 21, 2013 at 04:06:07PM +0200, Lukas Hejtmanek wrote:
> On Mon, Oct 21, 2013 at 09:39:33AM -0400, konrad wilk wrote:
> > Anyhow, one easy thing to figure out is to get the lspci -v output
> > from the InfiniBand card
> > to see where its BARs are, and also the start of the kernel. You
> > should see an E820 map (please also boot with
> > "debug" on the Linux command line).
>
> note, adding _PAGE_IO as Jan suggested fixed those mem errors.
<nods> Right.
>
> here is lspci from the card and its virtual functions.
>
> 06:00.0 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3]
> Subsystem: Mellanox Technologies Device 0017
> Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> Latency: 0, Cache Line Size: 64 bytes
> Interrupt: pin A routed to IRQ 42
> Region 0: Memory at dfa00000 (64-bit, non-prefetchable) [size=1M]
> Region 2: Memory at 380fff000000 (64-bit, prefetchable) [size=8M]
Wow.
> 06:00.1 Network controller: Mellanox Technologies MT27500 Family [ConnectX-3 Virtual Function]
> Subsystem: Mellanox Technologies Device 61b0
> Control: I/O- Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> Latency: 0
> Region 2: [virtual] Memory at 380fdf000000 (64-bit, prefetchable) [size=8M]
Wow again.
.. snip..
> and this is from dmesg:
>
> [ 0.000000] e820: BIOS-provided physical RAM map:
> [ 0.000000] Xen: [mem 0x0000000000000000-0x0000000000090fff] usable
> [ 0.000000] Xen: [mem 0x0000000000091800-0x00000000000fffff] reserved
> [ 0.000000] Xen: [mem 0x0000000000100000-0x000000007dd76fff] usable
> [ 0.000000] Xen: [mem 0x000000007dd77000-0x000000007ddb5fff] reserved
> [ 0.000000] Xen: [mem 0x000000007ddb6000-0x000000007debefff] ACPI data
> [ 0.000000] Xen: [mem 0x000000007debf000-0x000000007e0dafff] ACPI NVS
> [ 0.000000] Xen: [mem 0x000000007e0db000-0x000000007f357fff] reserved
> [ 0.000000] Xen: [mem 0x000000007f358000-0x000000007f7fffff] ACPI NVS
> [ 0.000000] Xen: [mem 0x0000000080000000-0x000000008fffffff] reserved
> [ 0.000000] Xen: [mem 0x00000000fec00000-0x00000000fec01fff] reserved
> [ 0.000000] Xen: [mem 0x00000000fec40000-0x00000000fec40fff] reserved
> [ 0.000000] Xen: [mem 0x00000000fed1c000-0x00000000fed3ffff] reserved
> [ 0.000000] Xen: [mem 0x00000000fee00000-0x00000000fee00fff] reserved
> [ 0.000000] Xen: [mem 0x00000000ff000000-0x00000000ffffffff] reserved
> [ 0.000000] Xen: [mem 0x0000000100000000-0x000000107fffffff] usable
Odd, there should be messages about the 1-1 mapping when you boot with 'debug'.
But either way, the problem (bug) is what I suspected: we treat any region
past the E820 as INVALID_P2M_ENTRY, and hence any set_pte(..) operation
will fetch a 0 value, which in turn means that the PTE is zero (with the
0x200 _PAGE_SPECIAL bit set b/c of VMA tracking).
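To make the "past the E820" point concrete, here is a small userspace sketch (not kernel code) that compares the page frame number of the card's Region 2 BAR with the last usable E820 page, using the addresses from the quoted lspci and dmesg output. The helper names and the 4 KiB page size are assumptions made for illustration.

```c
#include <stdint.h>

#define PAGE_SHIFT 12	/* assumes 4 KiB pages, as on x86 */

/* Last address of the final "usable" E820 range in the quoted dmesg. */
static const uint64_t e820_last_usable = 0x000000107fffffffULL;

/* Start of Region 2 of the physical function 06:00.0 in the quoted lspci. */
static const uint64_t bar2_start = 0x380fff000000ULL;

/* Convert a physical address to a page frame number. */
static uint64_t addr_to_pfn(uint64_t addr)
{
	return addr >> PAGE_SHIFT;
}

/* Returns 1 if the BAR starts above the last usable E820 page. */
static int bar_is_past_e820(void)
{
	return addr_to_pfn(bar2_start) > addr_to_pfn(e820_last_usable);
}
```

With these inputs the BAR starts at PFN 0x380fff000 while the last usable E820 page is PFN 0x107ffff, so the BAR lies far beyond anything the E820 map describes.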
Now the fix is to determine _where_ the end of real memory is so that we
can make sure that ballooning will work (in case of dom0_mem_max parameter).
And then anything past that PFN can be treated as IDENTITY_FRAME.
Naively, I think this patch would do it:
diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index 09f3059..3871554 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -92,6 +92,9 @@ static void __init xen_add_extra_mem(u64 start, u64 size)
 		__set_phys_to_machine(pfn, INVALID_P2M_ENTRY);
 	}
+	/* Anything past the balloon area is marked as identity. */
+	for (pfn = xen_max_p2m_pfn; pfn < MAX_DOMAIN_PAGES; pfn++)
+		__set_phys_to_machine(pfn, IDENTITY_FRAME(pfn));
 }
 static unsigned long __init xen_do_chunk(unsigned long start,
But this is not even compile tested :-(
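For readers following along, the intended effect of that loop can be modelled in userspace. This is only a toy sketch under stated assumptions: the TOY_* names, the 16-entry table and the position of the identity bit are made up for illustration and are not the kernel's p2m implementation.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's p2m markers. */
#define TOY_INVALID_ENTRY	(~UINT64_C(0))
#define TOY_IDENTITY_BIT	(UINT64_C(1) << 62)
#define TOY_IDENTITY(pfn)	((uint64_t)(pfn) | TOY_IDENTITY_BIT)

#define TOY_MAX_PFN	16	/* size of the toy table */
#define TOY_BALLOON_END	8	/* plays the role of xen_max_p2m_pfn */

static uint64_t toy_p2m[TOY_MAX_PFN];

/* Before the patch: every entry starts out invalid. */
static void toy_p2m_init(void)
{
	for (unsigned int pfn = 0; pfn < TOY_MAX_PFN; pfn++)
		toy_p2m[pfn] = TOY_INVALID_ENTRY;
}

/* Mirrors the loop added by the patch: identity-map past the balloon. */
static void toy_mark_identity_past_balloon(void)
{
	for (unsigned int pfn = TOY_BALLOON_END; pfn < TOY_MAX_PFN; pfn++)
		toy_p2m[pfn] = TOY_IDENTITY(pfn);
}

/* Returns 1 if the entry for pfn is a 1-1 (identity) mapping. */
static int toy_pfn_is_identity(unsigned int pfn)
{
	uint64_t entry = toy_p2m[pfn];

	/* The invalid marker has all bits set, so rule it out first. */
	if (entry == TOY_INVALID_ENTRY)
		return 0;
	return (entry & TOY_IDENTITY_BIT) &&
	       (entry & ~TOY_IDENTITY_BIT) == pfn;
}
```

Calling toy_p2m_init() and then looking up a PFN above TOY_BALLOON_END reports "not identity"; after toy_mark_identity_past_balloon() the same lookup reports an identity mapping, while PFNs inside the balloon area stay invalid.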