From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Julien Grall <julien.grall@arm.com>,
	sstabellini@kernel.org, dario.faggioli@citrix.com,
	xen-devel@lists.xenproject.org
Subject: Re: CONFIG_SCRUB_DEBUG=y + arm64 + livepatch = Xen BUG at page_alloc.c:738
Date: Fri, 15 Sep 2017 14:48:09 -0400
Message-ID: <20170915184809.GC31227@char.us.oracle.com>
In-Reply-To: <8446c04d-8709-18d9-a186-0e836bed1b2c@oracle.com>

On Thu, Sep 14, 2017 at 05:39:23PM -0400, Boris Ostrovsky wrote:
> On 09/14/2017 05:26 PM, Konrad Rzeszutek Wilk wrote:
> > On Wed, Sep 13, 2017 at 02:49:41PM -0400, Boris Ostrovsky wrote:
> >> On 09/13/2017 02:25 PM, Julien Grall wrote:
> >>> Hi,
> >>>
> >>> On 09/13/2017 07:05 PM, Boris Ostrovsky wrote:
> >>>> On 09/13/2017 11:32 AM, Konrad Rzeszutek Wilk wrote:
> >>>> Well, that's not a fix. This eliminates the case that something in
> >>>> ARM-specific code (which I haven't tested) accidentally clears
> >>>> _PGC_need_scrub.
> >>>>
> >>>> OK, I think I know what the problem is. You are using
> >>>> CONFIG_SEPARATE_XENHEAP, are you?
> >>> It seems the bug appear on Arm64, so CONFIG_SEPARATE_XENHEAP is not set.
> >>>
> >>> Note that Arm32 is using separate heap.
> >>
> >> For separate heap we will need
> >>
> >>
> >> diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
> >> index b5243fc..9f62ea2 100644
> >> --- a/xen/common/page_alloc.c
> >> +++ b/xen/common/page_alloc.c
> >> @@ -2059,7 +2059,7 @@ void free_xenheap_pages(void *v, unsigned int order)
> >>
> >>      memguard_guard_range(v, 1 << (order + PAGE_SHIFT));
> >>
> >> -    free_heap_pages(virt_to_page(v), order, false);
> >> +    free_heap_pages(virt_to_page(v), order, scrub_debug);
> >>  }
> >>
> >>  #else
> >>
> >>
> >> If that doesn't help then there are two cases where free_heap_pages is
> >> called with 'false' --- one in alloc_domheap_pages() and the other in
> >> online_page().
> >>
> >> Setting one and then the other would further narrow it down.
> > It went further. See the serial log:
> 
> Hmm. As Julien said, this is ARM64 so this patch should not have any effect.
> 
> Have you tried flipping false to true in the two alloc_domheap_pages()
> invocations that I mentioned?

Yeah, that didn't help. But I used the time during a certain call to debug
this, with the following printks added:


@@ -1705,6 +1711,7 @@ static void init_heap_pages(
 {
     unsigned long i;
 
+    printk("%s: 0x%lx -> 0x%lx %s\n", __func__, page_to_mfn(pg), page_to_mfn(pg) + nr_pages, scrub_debug ? "scrub" : "");
     for ( i = 0; i < nr_pages; i++ )
     {
         unsigned int nid = phys_to_nid(page_to_maddr(pg+i));
@@ -1000,7 +1001,12 @@ if ( memflags & MEMF_debug ) {
                 spin_unlock(&heap_lock);
             }
             else if ( !(memflags & MEMF_no_scrub) )
+            {
+
+       printk("%s:%d %d scrub mfn=0%lx\n", __func__, __LINE__, i, page_to_mfn(&pg[i]));
+
                 check_one_page(&pg[i]);
+               }
         }
 
         if ( dirty_cnt )
@@ -1836,6 +1843,7 @@ static void __init smp_scrub_heap_pages(void *data)
     else
         end = start + chunk_size;
 
+    printk("CPU%d: MFN=0x%lx -> 0x%lx\n", cpu, start, end);
     for ( mfn = start; mfn < end; mfn++ )
     {
         pg = mfn_to_page(mfn);

Shows:

(XEN) Loading dom0 DTB to 0x0000000017e00000-0x0000000017e08265
(XEN) init_domheap_pages: 0xb87b1->0xb87bc
(XEN) init_heap_pages: 0xb87b1 -> 0xb87bc
(XEN) init_domheap_pages: 0xb88f1->0xb98ae
(XEN) init_heap_pages: 0xb88f1 -> 0xb98ae	<- so the memory is from here

(XEN) Scrubbing Free RAM on 1 nodes using 8 CPUs
(XEN) Initial low memory virq threshold set at 0x4000 pages.
(XEN) Scrubbing Free RAM on 1 nodes using 8 CPUs
(XEN) CPU0: MFN=0x0 -> 0x8000
(XEN) CPU6: MFN=0x6a12e -> 0x7212e
(XEN) CPU5: MFN=0x58651 -> 0x60651
(XEN) CPU2: MFN=0x235ba -> 0x2b5ba
(XEN) CPU1: MFN=0x11add -> 0x19add
(XEN) CPU3: MFN=0x35097 -> 0x3d097
(XEN) CPU4: MFN=0x46b74 -> 0x4eb74
(XEN) CPU7: MFN=0x7bc0b -> 0x83c0b
(XEN) .(XEN) CPU6: MFN=0x7212e -> 0x7a12e
(XEN) CPU5: MFN=0x60651 -> 0x68651
(XEN) CPU4: MFN=0x4eb74 -> 0x56b74
(XEN) CPU1: MFN=0x19add -> 0x21add
CPU0: MFN=0x8000 -> 0x10000
(XEN) CPU7: MFN=0x83c0b -> 0x8bc0b
(XEN) CPU2: MFN=0x2b5ba -> 0x335ba
(XEN) CPU3: MFN=0x3d097 -> 0x45097
(XEN) .(XEN) CPU1: MFN=0x21add -> 0x235ba
(XEN) CPU2: MFN=0x335ba -> 0x35097
CPU0: MFN=0x10000 -> 0x11add
(XEN) CPU3: MFN=0x45097 -> 0x46b74
(XEN) CPU6: MFN=0x7a12e -> 0x7bc0b
(XEN) CPU4: MFN=0x56b74 -> 0x58651
(XEN) CPU5: MFN=0x68651 -> 0x6a12e
(XEN) CPU7: MFN=0x8bc0b -> 0x8d6ea
(XEN) .done.
..snip..

(XEN) alloc_heap_pages:1006 0 scrub mfn=0b98ab
(XEN) Xen BUG at page_alloc.c:738

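So in other words, it looks like scrub_heap_pages is somehow not
including this MFN. The page that trips the BUG is MFN 0xb98ab (the
printk format above is missing the 'x' after the '0'), which sits in
the 0xb88f1 -> 0xb98ae range handed to init_heap_pages, while the
per-CPU scrub chunks printed above stop at 0x8d6ea.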
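
A quick way to double-check that reading from the log alone is sketched
below. The numbers are copied by hand from the serial output in this
mail; the struct and helper names are made up for illustration and are
not Xen code. It assumes the printed ranges are half-open, which matches
the "mfn < end" loop in smp_scrub_heap_pages and the
"page_to_mfn(pg) + nr_pages" upper bound printed in init_heap_pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Half-open MFN range [start, end), as printed by the debug printks. */
struct mfn_range {
    unsigned long start;
    unsigned long end;
};

/*
 * Overall span of the "CPUn: MFN=... -> ..." lines in the log. The
 * per-CPU chunks are contiguous, from CPU0's first chunk (0x0) up to
 * the end of CPU7's last chunk (0x8d6ea).
 */
static const struct mfn_range scrubbed = { 0x0UL, 0x8d6eaUL };

/* Ranges reported by init_heap_pages in the log. */
static const struct mfn_range heap_ranges[] = {
    { 0xb87b1UL, 0xb87bcUL },
    { 0xb88f1UL, 0xb98aeUL },
};

/* True if mfn lies inside the half-open range r. */
static bool range_contains(const struct mfn_range *r, unsigned long mfn)
{
    assert(r->start <= r->end);
    return mfn >= r->start && mfn < r->end;
}

/* Index of the heap range that holds mfn, or -1 if none does. */
static int find_heap_range(unsigned long mfn)
{
    size_t i;

    for ( i = 0; i < sizeof(heap_ranges) / sizeof(heap_ranges[0]); i++ )
        if ( range_contains(&heap_ranges[i], mfn) )
            return (int)i;

    return -1;
}
```

With these tables, range_contains(&scrubbed, 0xb98ab) is false, while
find_heap_range(0xb98ab) picks the second init_heap_pages range, so the
page was added to the heap but lies above everything the scrub chunks
in the log covered.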


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
