* Fwd: Question regarding the behavior of guest_physmap_remove_page on x86 [not found] <564B6C56.8080501@citrix.com> @ 2015-11-17 18:09 ` Julien Grall 2015-11-17 18:47 ` Andrew Cooper 0 siblings, 1 reply; 3+ messages in thread From: Julien Grall @ 2015-11-17 18:09 UTC (permalink / raw) To: Jan Beulich, Tim Deegan, Andrew Cooper Cc: xen-devel, Ian Campbell, Stefano Stabellini (Resending with this time xen-devel in CC) Hi, On ARM, it's possible to fail when removing a page from the P2M. It's happening if we are trying to shatter a superpage and we don't have memory to allocate the table. Therefore the mapping won't be removed from the P2M. However on ARM (and until recently on x86 [1]), the function guest_physmap_remove_page is not supposed to return an error. So we would free the page even if we fail to remove the page. This will end up to re-use the page by someone else even though the mapping is still present in the P2M. I looked to the x86 version and I'm not sure how the function is behaving. Maybe an x86 maintainers could give me insight here. I'm thinking to fix the problem by checking the return of guest_physmap_remove_page to avoid the page being reallocate to someone else (see for instance guest_remove_page in xen/common/memory.c). Is it a sensible way to fix it? Regards, [1] 5ae03990 "xen/vtd: create RMRR mapping" -- Julien Grall ^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: Fwd: Question regarding the behavior of guest_physmap_remove_page on x86 2015-11-17 18:09 ` Fwd: Question regarding the behavior of guest_physmap_remove_page on x86 Julien Grall @ 2015-11-17 18:47 ` Andrew Cooper 2015-11-18 16:16 ` Julien Grall 0 siblings, 1 reply; 3+ messages in thread From: Andrew Cooper @ 2015-11-17 18:47 UTC (permalink / raw) To: Julien Grall, Jan Beulich, Tim Deegan Cc: xen-devel, Ian Campbell, Stefano Stabellini On 17/11/15 18:09, Julien Grall wrote: > (Resending with this time xen-devel in CC) > > Hi, > > On ARM, it's possible to fail when removing a page from the P2M. It's > happening if we are trying to shatter a superpage and we don't have > memory to allocate the table. Therefore the mapping won't be removed > from the P2M. > > However on ARM (and until recently on x86 [1]), the function > guest_physmap_remove_page is not supposed to return an error. So we > would free the page even if we fail to remove the page. This will end up > to re-use the page by someone else even though the mapping is still > present in the P2M. > > I looked to the x86 version and I'm not sure how the function is > behaving. Maybe an x86 maintainers could give me insight here. > > I'm thinking to fix the problem by checking the return of > guest_physmap_remove_page to avoid the page being reallocate to someone > else (see for instance guest_remove_page in xen/common/memory.c). Is it > a sensible way to fix it? x86 can just as easily fail because of a failure to shatter a superpage. Despite the below changeset, none of the callee's were updated to actually act upon the error. As a result, the same issue affects x86, in principle. Does ARM have a shadow pool? On x86, we arrange that the shadow pool (should be) large enough so that we never actually encounter an out-of-memory when shattering a superpage. I also observe that there is a latent bug with iommu_unmap_page() (which is part of guest_physmap_remove_page()) as (almost) nothing checks its return value. Currently all (x86) callpaths either return success, or crash the domain. Looking at other codepaths, other possible errors (other than -ENOMEM from shattering) are: if ( unlikely(p2m_is_foreign(p2mt)) ) { /* pvh fixme: foreign types are only supported on ept at present */ gdprintk(XENLOG_WARNING, "Unimplemented foreign p2m type.\n"); return -EINVAL; } or: if ( !(p2m_entry = p2m_find_entry(*table, gfn_remainder, gfn, shift, max)) ) return -ENOENT; All this code looks quite rotten through, and is in some serious need of some error handling hygiene. I would not be surprised if some correctness issues are lurking in here. ~Andrew ^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: Fwd: Question regarding the behavior of guest_physmap_remove_page on x86 2015-11-17 18:47 ` Andrew Cooper @ 2015-11-18 16:16 ` Julien Grall 0 siblings, 0 replies; 3+ messages in thread From: Julien Grall @ 2015-11-18 16:16 UTC (permalink / raw) To: Andrew Cooper, Jan Beulich, Tim Deegan Cc: Stefano Stabellini, Ian Campbell, xen-devel Hi Andrew, On 17/11/15 18:47, Andrew Cooper wrote: > On 17/11/15 18:09, Julien Grall wrote: >> On ARM, it's possible to fail when removing a page from the P2M. It's >> happening if we are trying to shatter a superpage and we don't have >> memory to allocate the table. Therefore the mapping won't be removed >> from the P2M. >> >> However on ARM (and until recently on x86 [1]), the function >> guest_physmap_remove_page is not supposed to return an error. So we >> would free the page even if we fail to remove the page. This will end up >> to re-use the page by someone else even though the mapping is still >> present in the P2M. >> >> I looked to the x86 version and I'm not sure how the function is >> behaving. Maybe an x86 maintainers could give me insight here. >> >> I'm thinking to fix the problem by checking the return of >> guest_physmap_remove_page to avoid the page being reallocate to someone >> else (see for instance guest_remove_page in xen/common/memory.c). Is it >> a sensible way to fix it? > > x86 can just as easily fail because of a failure to shatter a superpage. > > Despite the below changeset, none of the callee's were updated to > actually act upon the error. > > As a result, the same issue affects x86, in principle. > > Does ARM have a shadow pool? On x86, we arrange that the shadow pool > (should be) large enough so that we never actually encounter an > out-of-memory when shattering a superpage. We don't have shadow pool on ARM. Even if we implement I think we have to check the return of guest_physmap_remove_page in the event there is other error path. > I also observe that there is a latent bug with iommu_unmap_page() (which > is part of guest_physmap_remove_page()) as (almost) nothing checks its > return value. Currently all (x86) callpaths either return success, or > crash the domain. > > Looking at other codepaths, other possible errors (other than -ENOMEM > from shattering) are: > > if ( unlikely(p2m_is_foreign(p2mt)) ) > { > /* pvh fixme: foreign types are only supported on ept at present */ > gdprintk(XENLOG_WARNING, "Unimplemented foreign p2m type.\n"); > return -EINVAL; > } > > or: > > if ( !(p2m_entry = p2m_find_entry(*table, gfn_remainder, gfn, > shift, max)) ) > return -ENOENT; > > > All this code looks quite rotten through, and is in some serious need of > some error handling hygiene. I don't know the x86 code. I would appreciate can take care of the x86 part. Regards, -- Julien Grall ^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2015-11-18 16:16 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <564B6C56.8080501@citrix.com>
2015-11-17 18:09 ` Fwd: Question regarding the behavior of guest_physmap_remove_page on x86 Julien Grall
2015-11-17 18:47 ` Andrew Cooper
2015-11-18 16:16 ` Julien Grall
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).