From: Daniel De Graaf <dgdegra@tycho.nsa.gov>
To: David Vrabel <david.vrabel@citrix.com>
Cc: xen-devel@lists.xen.org, Ian Campbell <Ian.Campbell@citrix.com>,
Stefano Stabellini <stefano.stabellini@eu.citrix.com>,
Vincent Bernardoff <vb@luminar.eu.org>
Subject: Re: Crashing kernel with dom0/libxc gnttab/gntshr
Date: Tue, 30 Jul 2013 17:03:31 -0400
Message-ID: <51F82A23.2030209@tycho.nsa.gov>
In-Reply-To: <51F7F0BE.1040503@citrix.com>

On 07/30/2013 12:58 PM, David Vrabel wrote:
[...]
>
> [ 902.729307] BUG: Bad page map in process vchan-node1 pte:12bfff167 pmd:b9b5c067
> [ 902.729312] page:ffffea0004afffc0 count:1 mapcount:-1 mapping: (null) index:0xffffffffffffffff
>
> I think this is the test for page_mapcount(page) < 0 in zap_pte_range().
> This has looked up the page using the PTE it is trying to clear. Has
> it found the correct page? Since the MFN is currently mapped into the
> same domain, has the m2p_override stuff confused the look up and it is
> checking the grantee page not the granter?
>
> David

I think something like this is happening: while reproducing this on my
test system, I found linked-list corruption that I believe is the cause
of the problem. On PV, gnttab_map_refs calls m2p_add_override on the
page, which links page->lru into an m2p_overrides list. However,
something else is also using page->lru while gntdev is using the page,
as shown by the following debug patch:

diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 3c8803f..198e57e 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -294,6 +294,11 @@ static int map_grant_pages(struct grant_map *map)
 	if (err)
 		return err;

+	printk("map page0 lru: %p prev=%p:%p next=%p:%p\n",
+		&map->pages[0]->lru,
+		map->pages[0]->lru.prev, map->pages[0]->lru.prev->next,
+		map->pages[0]->lru.next, map->pages[0]->lru.next->prev);
+
 	for (i = 0; i < map->count; i++) {
 		if (map->map_ops[i].status)
 			err = -EINVAL;
@@ -320,6 +325,10 @@ static int __unmap_grant_pages(struct grant_map *map, int offset, int pages)
 		}
 	}

+	printk("unmap page0 lru: %p prev=%p:%p next=%p:%p\n",
+		&map->pages[0]->lru,
+		map->pages[0]->lru.prev, map->pages[0]->lru.prev->next,
+		map->pages[0]->lru.next, map->pages[0]->lru.next->prev);
 	err = gnttab_unmap_refs(map->unmap_ops + offset,
 			use_ptemod ? map->kmap_ops + offset : NULL, map->pages + offset,
 			pages);

Output:
[ 88.610644] map page0 lru: ffffea0001dee160 prev=ffffffff82f2d510:ffffea0001dee160 next=ffffffff82f2d510:ffffea0001dee160
[ 88.611515] BUG: Bad page map in process a.out pte:8000000077b85167 pmd:2541a067
[ 88.611525] page:ffffea0001dee140 count:1 mapcount:-1 mapping: (null) index:0xffffffffffffffff
[ 88.611532] page flags: 0x1000000000000814(referenced|dirty|private)
[ 88.611541] addr:00007f1adaef3000 vm_flags:140400fb anon_vma: (null) mapping:ffff8800692974a0 index:0
[ 88.611547] vma->vm_ops->fault: (null)
[ 88.611555] vma->vm_file->f_op->mmap: gntalloc_mmap+0x0/0x1d0
[...backtrace cropped...]
[ 88.614301] unmap page0 lru: ffffea0001dee160 prev=ffff8800254c9d08:ffff88001ea0b120 next=ffff8800254c9d08:ffff88001ea0b938
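
In the initial map output, the list contains only that element, so the
address 0xffffffff82f2d510 is the m2p_overrides list head. By the time of
the unmap, prev and next both point to ffff8800254c9d08, whose own next
and prev no longer point back to this page's lru, so the list has been
corrupted in between. This means the page being found by zap_pte_range
is not a valid struct page.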
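
To make the prev=%p:%p / next=%p:%p check concrete, here is a minimal userspace sketch of the invariant the debug printk exposes: for a correctly linked entry, entry->prev->next and entry->next->prev both equal the entry itself. Everything here is a stand-in written for illustration: struct list_head and list_add are reimplemented rather than taken from the kernel, and overrides_head, other_head and reuse_lru_on_second_list are hypothetical names. The scenario (a second user linking the same entry onto another list without unlinking it first) is only one assumed way page->lru could be reused; it reproduces the stale first list, not the exact pointer pattern in the log above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Userspace stand-in for the kernel's struct list_head (illustration only). */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert node right after head, mirroring what the kernel's list_add() does. */
static void list_add(struct list_head *node, struct list_head *head)
{
	node->next = head->next;
	node->prev = head;
	head->next->prev = node;
	head->next = node;
}

/* The invariant the debug printk makes visible: both neighbours point back. */
static bool links_point_back(const struct list_head *entry)
{
	return entry->prev->next == entry && entry->next->prev == entry;
}

/* Print an entry in the same shape as the debug patch's output. */
static void dump_entry(const char *tag, const struct list_head *entry)
{
	printf("%s lru: %p prev=%p:%p next=%p:%p\n", tag, (const void *)entry,
	       (void *)entry->prev, (void *)entry->prev->next,
	       (void *)entry->next, (void *)entry->next->prev);
}

/*
 * Hypothetical scenario: overrides_head stands in for the m2p_overrides
 * list head, other_head for whatever else is using page->lru.
 * Returns true if the first list is still intact afterwards.
 */
static bool reuse_lru_on_second_list(void)
{
	struct list_head overrides_head, other_head, page_lru;

	init_list_head(&overrides_head);
	init_list_head(&other_head);

	/* First user: single-element list, as in the "map page0" output. */
	list_add(&page_lru, &overrides_head);
	dump_entry("map page0", &page_lru);
	assert(links_point_back(&page_lru));
	assert(page_lru.prev == &overrides_head && page_lru.next == &overrides_head);

	/* Second user links the same entry elsewhere without unlinking it. */
	list_add(&page_lru, &other_head);
	dump_entry("unmap page0", &page_lru);

	/* The first head still refers to page_lru, but page_lru no longer
	 * refers to the first head, so the first list is corrupted. */
	return links_point_back(&overrides_head);
}
```

Calling reuse_lru_on_second_list() from a main() prints two lines in the same format as the debug output and returns false, since the first list head's neighbour no longer points back at it.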

For reference, the struct page * being used by the gntalloc device was
0xffffea0000952740, so this is not a direct collision between the pages
used by the gntdev and gntalloc devices.
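
The addresses in the output can be cross-checked with a little arithmetic. The sketch below assumes that lru sits at offset 0x20 inside struct page on this x86_64 kernel; that offset is an assumption on my part, although it matches the 0x20 difference between the logged lru address and the page address in the BUG line. Under that assumption, the page reported in the BUG line appears to be the struct page that owns map->pages[0]->lru, and it differs from the gntalloc page quoted above. The names ASSUMED_LRU_OFFSET, page_from_lru and compare_logged_addresses are hypothetical.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: offset of the lru member within struct page on this kernel. */
#define ASSUMED_LRU_OFFSET UINT64_C(0x20)

/* Recover a struct page address from the address of its embedded lru,
 * which is what container_of() would compute in the kernel. */
static uint64_t page_from_lru(uint64_t lru_addr)
{
	return lru_addr - ASSUMED_LRU_OFFSET;
}

/* Compare the addresses that appear in the log and in the message text. */
static void compare_logged_addresses(void)
{
	const uint64_t lru_addr = UINT64_C(0xffffea0001dee160);      /* "map page0 lru:" */
	const uint64_t bug_page = UINT64_C(0xffffea0001dee140);      /* "page:" in the BUG */
	const uint64_t gntalloc_page = UINT64_C(0xffffea0000952740); /* gntalloc's page */
	const uint64_t derived = page_from_lru(lru_addr);

	printf("page derived from lru: 0x%016" PRIx64 "\n", derived);
	printf("matches BUG page:      %s\n", derived == bug_page ? "yes" : "no");
	printf("matches gntalloc page: %s\n", derived == gntalloc_page ? "yes" : "no");
}
```

Calling compare_logged_addresses() from a main() prints the derived address and the two comparisons; the conclusion only holds as far as the assumed offset does.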

I'm not sure what the best fix for this is at the moment.

--
Daniel De Graaf
National Security Agency