xen-devel.lists.xenproject.org archive mirror
From: Daniel De Graaf <dgdegra@tycho.nsa.gov>
To: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: jeremy@goop.org, xen-devel@lists.xensource.com, keir@xen.org,
	Ian.Campbell@citrix.com
Subject: Re: c/s 22402 ("86 hvm: Refuse to perform __hvm_copy() work in atomic context.") breaks HVM, race possible in other code - any ideas?
Date: Tue, 11 Jan 2011 13:24:05 -0500
Message-ID: <4D2CA045.3060904@tycho.nsa.gov>
In-Reply-To: <20110111180032.GH14017@dumpdata.com>

On 01/11/2011 01:00 PM, Konrad Rzeszutek Wilk wrote:
> On Tue, Jan 11, 2011 at 09:52:19AM -0500, Daniel De Graaf wrote:
>> On 01/11/2011 08:15 AM, Daniel De Graaf wrote:
>>> On 01/10/2011 05:41 PM, Konrad Rzeszutek Wilk wrote:
>>>>> @@ -284,8 +304,25 @@ static void unmap_grant_pages(struct grant_map *map, int offset, int pages)
>>>>>  		goto out;
>>>>>  
>>>>>  	for (i = 0; i < pages; i++) {
>>>>> +		uint32_t check, *tmp;
>>>>>  		WARN_ON(unmap_ops[i].status);
>>>>> -		__free_page(map->pages[offset+i]);
>>>>> +		if (!map->pages[i])
>>>>> +			continue;
>>>>> +		/* XXX When unmapping, Xen will sometimes end up mapping the GFN
>>>>> +		 * to an invalid MFN. In this case, writes will be discarded and
>>>>> +		 * reads will return all 0xFF bytes. Leak these unusable GFNs
>>>>> +		 * until a way to restore them is found.
>>>>> +		 */
>>>>> +		tmp = kmap(map->pages[i]);
>>>>> +		tmp[0] = 0xdeaddead;
>>>>> +		mb();
>>>>> +		check = tmp[0];
>>>>> +		kunmap(map->pages[i]);
>>>>> +		if (check == 0xdeaddead)
>>>>> +			__free_page(map->pages[i]);
>>>>> +		else if (debug)
>>>>> +			printk("%s: Discard page %d=%ld\n", __func__,
>>>>> +				i, page_to_pfn(map->pages[i]));
>>>>
>>>> Whoa. Any leads to when the "sometimes" happens? Does the status report an
>>>> error or is it silent?
>>>
>>> Status is silent in this case. I can produce it quite reliably on my
>>> test system where I am mapping a framebuffer (1280 pages) between two
>>> HVM guests - in this case, about 2/3 of the released pages will end up
>>> being invalid. It doesn't seem to be size-related as I have also seen
>>> it on the small 3-page page index mapping. There is a message on xm
>>> dmesg that may be related:
>>>
>>> (XEN) sh error: sh_remove_all_mappings(): can't find all mappings of mfn 7cbc6: c=8000000000000004 t=7400000000000002
>>>
>>> This appears about once per page, with different MFNs but the same c/t.
>>> One of the two HVM guests (the one doing the mapping) has the PCI
>>> graphics card forwarded to it.
>>>
>>
>> Just tested on the latest xen 4.1 (with 22402:7d2fdc083c9c reverted as
>> that breaks HVM grants), which produces different output:
> 
> Keir, the c/s 22402 has your name on it.
> 
> Any ideas on the problem that Daniel is hitting with unmapping grants?

c/s 22402 was discussed at the beginning of December:
http://lists.xensource.com/archives/html/xen-devel/2010-12/msg00063.html

Since I don't use xenpaging (which is the reason for the change), this
revert shouldn't be relevant to the problems I am seeing.

>> ...
>> (XEN) mm.c:889:d1 Error getting mfn b803e (pfn 25a3e) from L1 entry 00000000b803e021 for l1e_owner=1, pg_owner=1
>> (XEN) mm.c:889:d1 Error getting mfn b8038 (pfn 25a38) from L1 entry 00000000b8038021 for l1e_owner=1, pg_owner=1
>> (XEN) mm.c:889:d1 Error getting mfn b803d (pfn 25a3d) from L1 entry 00000000b803d021 for l1e_owner=1, pg_owner=1
>> (XEN) mm.c:889:d1 Error getting mfn 10829 (pfn 25a29) from L1 entry 0000000010829021 for l1e_owner=1, pg_owner=1
>> (XEN) mm.c:889:d1 Error getting mfn 1081c (pfn 25a1c) from L1 entry 000000001081c021 for l1e_owner=1, pg_owner=1
>> (XEN) mm.c:889:d1 Error getting mfn 10816 (pfn 25a16) from L1 entry 0000000010816021 for l1e_owner=1, pg_owner=1
>> (XEN) mm.c:889:d1 Error getting mfn 1081a (pfn 25a1a) from L1 entry 000000001081a021 for l1e_owner=1, pg_owner=1
>> ...
>>
>> This appears on the map; nothing is printed on the unmap. If the
>> unmap happens while the domain is up, it seems to be invalid more often;
>> most (perhaps all) of the destination-valid unmaps happen when the domain
>> is being destroyed. Exactly which pages are valid or invalid seems to be
>> mostly random, although nearby GFNs tend to have the same validity.
>>
>> If you have any thoughts as to the cause, I can test patches or provide
>> output as needed; it would be better if this workaround weren't needed.
>>
>> -- 
>> Daniel De Graaf
>> National Security Agency
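
For readers following the quoted workaround, the write-then-read-back probe
can be illustrated outside the kernel. The sketch below is a userspace
simulation, not the gntdev code: struct fake_page, page_write(), page_read(),
page_is_usable() and release_pages() are hypothetical names, and the
"unbacked" behaviour (writes discarded, reads returning all 0xFF bytes) is
modelled only on the comment in the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PROBE_PATTERN 0xdeaddeadu

/* Hypothetical stand-in for a guest page. When 'backed' is false the page
 * behaves as the patch comment describes for a GFN that points at an
 * invalid MFN: writes are dropped and reads return all-ones. */
struct fake_page {
	bool backed;
	uint32_t word0;
};

static void page_write(struct fake_page *p, uint32_t v)
{
	if (p->backed)
		p->word0 = v;
}

static uint32_t page_read(const struct fake_page *p)
{
	return p->backed ? p->word0 : 0xffffffffu;
}

/* Write a sentinel and read it back; only a page that retains the
 * sentinel is considered safe to hand back to the allocator. */
static bool page_is_usable(struct fake_page *p)
{
	page_write(p, PROBE_PATTERN);
	return page_read(p) == PROBE_PATTERN;
}

/* Count how many pages would be freed and how many would be leaked. */
static size_t release_pages(struct fake_page *pages, size_t n, size_t *leaked)
{
	size_t freed = 0;

	*leaked = 0;
	for (size_t i = 0; i < n; i++) {
		if (page_is_usable(&pages[i]))
			freed++;
		else
			(*leaked)++;
	}
	return freed;
}
```

In the quoted hunk the write and the read go through kmap()/kunmap() with an
mb() between them; the simulation only shows the free-or-leak decision.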
