From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Daniel De Graaf <dgdegra@tycho.nsa.gov>,
	xen-devel@lists.xen.org, David Vrabel <david.vrabel@citrix.com>,
	Ian Campbell <Ian.Campbell@citrix.com>,
	Vincent Bernardoff <vb@luminar.eu.org>
Subject: Re: Crashing kernel with dom0/libxc gnttab/gntshr
Date: Fri, 02 Aug 2013 09:49:11 -0700
Message-ID: <51FBE307.30101@goop.org>
In-Reply-To: <alpine.DEB.2.02.1308021448100.4893@kaball.uk.xensource.com>

On 08/02/2013 06:50 AM, Stefano Stabellini wrote:
> On Tue, 30 Jul 2013, Daniel De Graaf wrote:
>> On 07/30/2013 12:58 PM, David Vrabel wrote:
>> [...]
>>> [  902.729307] BUG: Bad page map in process vchan-node1  pte:12bfff167 pmd:b9b5c067
>>> [  902.729312] page:ffffea0004afffc0 count:1 mapcount:-1 mapping: (null) index:0xffffffffffffffff
>>>
>>> I think this is the test for page_mapcount(page) < 0 in zap_pte_range().
>>> This has looked up the page using the PTE it is trying to clear.  Has
>>> it found the correct page?  Since the MFN is currently mapped into the
>>> same domain, has the m2p_override stuff confused the lookup, so that it
>>> is checking the grantee page rather than the granter?
>>>
>>> David
>> I think something like this is happening, since while reproducing this
>> on my test system, some linked list corruption was found that I believe
>> to be the cause of this problem. The gnttab_map_refs function on PV uses
>> m2p_add_override on the page, which threads page->lru to an
>> m2p_overrides list. However, something else is using page->lru during
>> the use of gntdev, as shown by the following debug patch:
> I have never managed to prove that something else is trying to use
> page->lru while the m2p_override is using it.
>
> Jeremy, at the time the code was written, you were pretty confident
> that page->lru couldn't be used by anybody else.
> Why was that?

Hm. Probably the reasoning was that page->lru was only used for pages
which are in the pagecache, mapped from files, and m2p pages are never
mapped from files. But maybe something else has decided to use lru for
non-mapped pages (transparent hugepage? page dedup?), or are m2p pages
getting into the pagecache somehow?

    J
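
A minimal userspace sketch of the failure mode being discussed, assuming a
simplified stand-in for the kernel's struct list_head. The helper names
(init_list_head, holds_only, first_list_intact) are hypothetical and this is
not the m2p_override code itself; it only shows what happens to the first list
if a second user threads the same list node onto its own list without removing
it first.

```c
#include <stdbool.h>

/* Simplified userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* Insert 'n' right after 'head', mirroring the kernel's list_add(). */
static void list_add(struct list_head *n, struct list_head *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* True if the list at 'head' contains exactly the one entry 'n'. */
static bool holds_only(const struct list_head *head, const struct list_head *n)
{
	return head->next == n && head->prev == n &&
	       n->next == head && n->prev == head;
}

/*
 * Put one node (standing in for page->lru) on a first list (standing in
 * for an override list). If 'reuse_node' is set, a second owner then adds
 * the same node to its own list. Returns whether the first list is still
 * consistent afterwards.
 */
static bool first_list_intact(bool reuse_node)
{
	struct list_head overrides, other, lru;

	init_list_head(&overrides);
	init_list_head(&other);

	list_add(&lru, &overrides);
	if (reuse_node)
		list_add(&lru, &other);	/* overwrites lru.next and lru.prev */

	return holds_only(&overrides, &lru);
}
```

With reuse_node set, the first list's head still points at the node, but the
node's own pointers now refer to the second list, so a later list_del() from
either owner would write through pointers belonging to the other.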

>
>
>
>> diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
>> index 3c8803f..198e57e 100644
>> --- a/drivers/xen/gntdev.c
>> +++ b/drivers/xen/gntdev.c
>> @@ -294,6 +294,11 @@ static int map_grant_pages(struct grant_map *map)
>>  	if (err)
>>  		return err;
>>  
>> +	printk("map page0 lru: %p prev=%p:%p next=%p:%p\n",
>> +		&map->pages[0]->lru,
>> +		map->pages[0]->lru.prev, map->pages[0]->lru.prev->next,
>> +		map->pages[0]->lru.next, map->pages[0]->lru.next->prev);
>> +
>>  	for (i = 0; i < map->count; i++) {
>>  		if (map->map_ops[i].status)
>>  			err = -EINVAL;
>> @@ -320,6 +325,10 @@ static int __unmap_grant_pages(struct grant_map *map, int offset, int pages)
>>  		}
>>  	}
>>  
>> +	printk("unmap page0 lru: %p prev=%p:%p next=%p:%p\n",
>> +		&map->pages[0]->lru,
>> +		map->pages[0]->lru.prev, map->pages[0]->lru.prev->next,
>> +		map->pages[0]->lru.next, map->pages[0]->lru.next->prev);
>>  	err = gnttab_unmap_refs(map->unmap_ops + offset,
>>  			use_ptemod ? map->kmap_ops + offset : NULL, map->pages + offset,
>>  			pages);
>>
>> Output:
>> [   88.610644] map page0 lru: ffffea0001dee160 prev=ffffffff82f2d510:ffffea0001dee160 next=ffffffff82f2d510:ffffea0001dee160
>> [   88.611515] BUG: Bad page map in process a.out  pte:8000000077b85167 pmd:2541a067
>> [   88.611525] page:ffffea0001dee140 count:1 mapcount:-1 mapping: (null) index:0xffffffffffffffff
>> [   88.611532] page flags: 0x1000000000000814(referenced|dirty|private)
>> [   88.611541] addr:00007f1adaef3000 vm_flags:140400fb anon_vma: (null) mapping:ffff8800692974a0 index:0
>> [   88.611547] vma->vm_ops->fault:           (null)
>> [   88.611555] vma->vm_file->f_op->mmap: gntalloc_mmap+0x0/0x1d0
>> [...backtrace cropped...]
>> [   88.614301] unmap page0 lru: ffffea0001dee160 prev=ffff8800254c9d08:ffff88001ea0b120 next=ffff8800254c9d08:ffff88001ea0b938
>>
>> The initial map is a linked list with only that element, so the address
>> 0xffffffff82f2d510 is the m2p_overrides entry. This means the page being
>> found by zap_pte_range is not a valid struct page.
>>
>> The struct page* being used by the gntalloc device was 0xffffea0000952740,
>> for reference; it's not a direct collision between the page used by the
>> gntdev and gntalloc devices.
>>
>> Not sure what the best fix is for this at the moment.
>>
>> -- 
>> Daniel De Graaf
>> National Security Agency
>>
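
The reading of the two printk lines quoted above can be checked mechanically.
The sketch below is a userspace illustration, not kernel code: it copies the
five pointer values from each line into a hypothetical struct lru_snapshot and
tests the usual doubly-linked-list invariants (prev->next and next->prev both
point back at the node; in a one-entry list, prev and next are both the list
head).

```c
#include <stdbool.h>
#include <stdint.h>

/* The five values printed by the debug patch: &lru, prev, prev->next, next, next->prev. */
struct lru_snapshot {
	uint64_t self, prev, prev_next, next, next_prev;
};

/* Both neighbours point back at the node. */
static bool snapshot_consistent(const struct lru_snapshot *s)
{
	return s->prev_next == s->self && s->next_prev == s->self;
}

/* Consistent, and prev == next: the node is the only entry after the head. */
static bool snapshot_single_entry(const struct lru_snapshot *s)
{
	return snapshot_consistent(s) && s->prev == s->next;
}

/* Values from the "map page0 lru" line at 88.610644. */
static const struct lru_snapshot at_map = {
	.self      = UINT64_C(0xffffea0001dee160),
	.prev      = UINT64_C(0xffffffff82f2d510),
	.prev_next = UINT64_C(0xffffea0001dee160),
	.next      = UINT64_C(0xffffffff82f2d510),
	.next_prev = UINT64_C(0xffffea0001dee160),
};

/* Values from the "unmap page0 lru" line at 88.614301. */
static const struct lru_snapshot at_unmap = {
	.self      = UINT64_C(0xffffea0001dee160),
	.prev      = UINT64_C(0xffff8800254c9d08),
	.prev_next = UINT64_C(0xffff88001ea0b120),
	.next      = UINT64_C(0xffff8800254c9d08),
	.next_prev = UINT64_C(0xffff88001ea0b938),
};
```

The map-time values satisfy the single-entry check, while the unmap-time
values fail even the basic consistency check: neither neighbour points back at
the node, which matches the list corruption described in the quoted message.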
