From: Daniel De Graaf <dgdegra@tycho.nsa.gov>
To: David Vrabel <david.vrabel@citrix.com>
Cc: xen-devel@lists.xen.org, Ian Campbell <Ian.Campbell@citrix.com>,
Stefano Stabellini <stefano.stabellini@eu.citrix.com>,
Vincent Bernardoff <vb@luminar.eu.org>
Subject: Re: Crashing kernel with dom0/libxc gnttab/gntshr
Date: Tue, 30 Jul 2013 17:03:31 -0400
Message-ID: <51F82A23.2030209@tycho.nsa.gov>
In-Reply-To: <51F7F0BE.1040503@citrix.com>
On 07/30/2013 12:58 PM, David Vrabel wrote:
[...]
>
> [ 902.729307] BUG: Bad page map in process vchan-node1 pte:12bfff167 pmd:b9b5c067
> [ 902.729312] page:ffffea0004afffc0 count:1 mapcount:-1 mapping: (null) index:0xffffffffffffffff
>
> I think this is the test for page_mapcount(page) < 0 in zap_pte_range().
> This has looked up the page using the PTE it is trying to clear. Has
> it found the correct page? Since the MFN is currently mapped into the
> same domain, has the m2p_override stuff confused the look up and it is
> checking the grantee page not the granter?
>
> David

I think something like this is happening: while reproducing this on my
test system I found linked-list corruption that I believe is the cause
of this problem. On PV, gnttab_map_refs calls m2p_add_override on the
page, which threads page->lru onto an m2p_overrides list. However,
something else is also using page->lru while gntdev has the page mapped,
as shown by the following debug patch:

diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 3c8803f..198e57e 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -294,6 +294,11 @@ static int map_grant_pages(struct grant_map *map)
 	if (err)
 		return err;

+	printk("map page0 lru: %p prev=%p:%p next=%p:%p\n",
+		&map->pages[0]->lru,
+		map->pages[0]->lru.prev, map->pages[0]->lru.prev->next,
+		map->pages[0]->lru.next, map->pages[0]->lru.next->prev);
+
 	for (i = 0; i < map->count; i++) {
 		if (map->map_ops[i].status)
 			err = -EINVAL;
@@ -320,6 +325,10 @@ static int __unmap_grant_pages(struct grant_map *map, int offset, int pages)
 		}
 	}

+	printk("unmap page0 lru: %p prev=%p:%p next=%p:%p\n",
+		&map->pages[0]->lru,
+		map->pages[0]->lru.prev, map->pages[0]->lru.prev->next,
+		map->pages[0]->lru.next, map->pages[0]->lru.next->prev);
 	err = gnttab_unmap_refs(map->unmap_ops + offset,
 			use_ptemod ? map->kmap_ops + offset : NULL, map->pages + offset,
 			pages);
Output:
[ 88.610644] map page0 lru: ffffea0001dee160 prev=ffffffff82f2d510:ffffea0001dee160 next=ffffffff82f2d510:ffffea0001dee160
[ 88.611515] BUG: Bad page map in process a.out pte:8000000077b85167 pmd:2541a067
[ 88.611525] page:ffffea0001dee140 count:1 mapcount:-1 mapping: (null) index:0xffffffffffffffff
[ 88.611532] page flags: 0x1000000000000814(referenced|dirty|private)
[ 88.611541] addr:00007f1adaef3000 vm_flags:140400fb anon_vma: (null) mapping:ffff8800692974a0 index:0
[ 88.611547] vma->vm_ops->fault: (null)
[ 88.611555] vma->vm_file->f_op->mmap: gntalloc_mmap+0x0/0x1d0
[...backtrace cropped...]
[ 88.614301] unmap page0 lru: ffffea0001dee160 prev=ffff8800254c9d08:ffff88001ea0b120 next=ffff8800254c9d08:ffff88001ea0b938
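
At map time the list contains only that element: prev and next both
point to 0xffffffff82f2d510, and that address points back to the lru
entry in both directions, so 0xffffffff82f2d510 is the m2p_overrides
list head. By unmap time prev and next both point to 0xffff8800254c9d08,
whose own next and prev no longer point back to the entry, so page->lru
was rethreaded by something else in between. This means the page being
found by zap_pte_range is not a valid struct page.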
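
To make the prev/next check in the debug output concrete, here is a
minimal userspace sketch of the invariant it relies on: in a circular
doubly linked list, entry->prev->next and entry->next->prev must both
equal the entry. The list helpers are simplified stand-ins for the
kernel's list_head API, and struct fake_page and demo_lru_reuse() are
hypothetical names used only for illustration. The sketch shows one way
such corruption can arise (a second user adding the same node to another
list without removing it first); it does not claim to identify what
actually reuses page->lru here.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert entry right after head, like the kernel's list_add(). */
static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* The check the debug printk lets you do by eye: both neighbours point back. */
static bool list_entry_consistent(const struct list_head *entry)
{
	return entry->prev->next == entry && entry->next->prev == entry;
}

/* Hypothetical page with an embedded lru node, loosely like struct page. */
struct fake_page {
	struct list_head lru;
};

/*
 * Returns true if the first list is detectably corrupted after a second
 * user threads the same lru node onto its own list.
 */
static bool demo_lru_reuse(void)
{
	struct list_head overrides, other, a, b;
	struct fake_page page;

	list_init(&overrides);
	list_init(&other);
	list_add(&a, &other);
	list_add(&b, &other);

	/* First user: single-element list, so prev == next == &overrides. */
	list_add(&page.lru, &overrides);
	assert(page.lru.prev == &overrides && page.lru.next == &overrides);
	assert(list_entry_consistent(&page.lru));

	/* Second user re-adds the node elsewhere without removing it first. */
	list_add(&page.lru, &other);

	/* overrides still points at page.lru, but page.lru no longer points back. */
	return !list_entry_consistent(&overrides);
}
```

Calling demo_lru_reuse() from a small main() returns true: the node looks
consistent from the second list's point of view, while the first list's
head now has a neighbour that no longer points back at it.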

For reference, the struct page * used by the gntalloc device was
0xffffea0000952740, so this is not a direct collision between the pages
used by the gntdev and gntalloc devices.

Not sure what the best fix is for this at the moment.
--
Daniel De Graaf
National Security Agency