public inbox for kvm@vger.kernel.org
From: Avi Kivity <avi-atKUWr5tajBWk0Htik3J/w@public.gmane.org>
To: Marcelo Tosatti <marcelo-Bw31MaZKKs3YtjvyW6yDsg@public.gmane.org>
Cc: kvm-devel <kvm-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f@public.gmane.org>
Subject: Re: [PATCH] Use cmpxchg for pte updates on walk_addr()
Date: Fri, 07 Dec 2007 07:06:26 +0200
Message-ID: <4758D4D2.8090208@qumranet.com>
In-Reply-To: <20071207023237.GA2841@dmt>

Marcelo Tosatti wrote:
> Right, patch at end of the message restarts the process if the pte
> changes under the walker. The goto is pretty ugly, but I fail to see any
> elegant way of doing that. Ideas?
>
>   

goto is fine for that.  But there's a subtle livelock here: suppose vcpu 
0 is in guest mode, continuously updating a memory location, while vcpu 1 
is faulting with that memory location acting as a pte.  The cmpxchg can 
then keep failing, so the walk keeps restarting.  While we're looping in 
kernel mode, we aren't responding to signals like we should; so we need 
to abort the walk and let the guest retry; that way we go through the 
signal_pending() check.

However, this is an intrusive change, so let's start with the goto and 
drop it later in favor of an abort.
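
To make the difference between "goto" and "abort" concrete, here is a
rough userspace model, not the kvm code.  It assumes C11 atomics in place
of CMPXCHG, and the names try_set_accessed(), handle_fault() and the
"pending" flag are made up for illustration.  A lost race returns
WALK_RETRY to the caller instead of jumping back to the top of the walk,
so every retry passes a pending-signal check first.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define PT_ACCESSED_MASK (1ULL << 5)

enum walk_result { WALK_OK, WALK_RETRY };

/* Hypothetical stand-in for one guest pte slot shared between vcpus. */
typedef _Atomic uint64_t gpte_t;

/*
 * One attempt at setting the accessed bit, based on a previously read
 * snapshot.  If the slot no longer holds the snapshot, give up and
 * report WALK_RETRY rather than looping here.
 */
enum walk_result try_set_accessed(gpte_t *slot, uint64_t snapshot,
				  uint64_t *pte_out)
{
	uint64_t expected = snapshot;

	assert(slot && pte_out);
	if (!(snapshot & PT_ACCESSED_MASK)) {
		if (!atomic_compare_exchange_strong(slot, &expected,
						    snapshot | PT_ACCESSED_MASK))
			return WALK_RETRY;
		snapshot |= PT_ACCESSED_MASK;
	}
	*pte_out = snapshot;
	return WALK_OK;
}

/*
 * Caller side (models re-entering the fault path): the pending flag is
 * checked before every attempt, so a racing writer cannot pin us here.
 */
int handle_fault(gpte_t *slot, const bool *pending, uint64_t *pte_out)
{
	for (;;) {
		if (*pending)
			return -EINTR;
		if (try_set_accessed(slot, atomic_load(slot), pte_out) == WALK_OK)
			return 0;
	}
}
```

With the goto variant the retry loop lives inside the walker and never
reaches that check; moving the loop out to the caller is what makes the
change intrusive.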

>>> @@ -1510,6 +1510,9 @@ static int emulator_write_phys(struct kvm_vcpu *vcpu, gpa_t gpa,
>>>  {
>>>  	int ret;
>>>  
>>> +	/* No need for kvm_cmpxchg_guest_pte here, its the guest 
>>> + 	 * responsability to synchronize pte updates and page faults.
>>> +	 */
>>>  	ret = kvm_write_guest(vcpu->kvm, gpa, val, bytes);
>>>  	if (ret < 0)
>>>  		return 0;
>>>       
>> Hmm.  What if an i386 pae guest carefully uses cmpxchg8b to atomically 
>> set a pte?  kvm_write_guest() doesn't guarantee atomicity, so an 
>> intended atomic write can be seen splitted by the guest walker doing a 
>> concurrent walk.
>>     
>
> True, an atomic write is needed... a separate patch for that seems more
> appropriate.
>
>
>   

Yes.
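
For illustration only, a small userspace model of the split write.  It
assumes the non-atomic path stores the low 32 bits first (the real order
is not guaranteed), and after_low_half() and set_pte_atomic() are made-up
helpers, with C11 atomics standing in for cmpxchg8b.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * What a concurrent walker can observe halfway through a 64-bit store
 * done as two 32-bit stores: new low half, old high half.
 */
uint64_t after_low_half(uint64_t old_pte, uint64_t new_pte)
{
	return (old_pte & 0xffffffff00000000ULL) | (new_pte & 0xffffffffULL);
}

/* Atomic alternative: the slot holds either the old or the new value. */
bool set_pte_atomic(_Atomic uint64_t *slot, uint64_t expected,
		    uint64_t new_pte)
{
	assert(slot);
	return atomic_compare_exchange_strong(slot, &expected, new_pte);
}
```

Going from a not-present pte (0) to 0x123456001 (present, frame above
4GB), the intermediate value is 0x23456001: present, but pointing at the
wrong frame.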

> +static inline bool FNAME(cmpxchg_gpte)(struct kvm *kvm,
> +			 gfn_t table_gfn, unsigned index, 
> +			 pt_element_t orig_pte, pt_element_t new_pte)
> +{
> +	pt_element_t ret;
> +	pt_element_t *table;
> +	struct page *page;
> +
> +	page = gfn_to_page(kvm, table_gfn);
> +	table = kmap_atomic(page, KM_USER0);
> +	
> +	ret = CMPXCHG(&table[index], orig_pte, new_pte);
> +
> +	kunmap_atomic(page, KM_USER0);
> +
>   

Missing kvm_release_page_dirty() here; gfn_to_page() takes a reference 
on the page that is never dropped.  We could also move mark_page_dirty() 
here.

No need to force inlining.
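
Roughly the shape I have in mind, as a userspace model rather than kernel
code.  struct fake_page, fake_get_page() and fake_release_page_dirty()
are invented stand-ins for struct page, gfn_to_page() and
kvm_release_page_dirty(); C11 atomics replace CMPXCHG and there is no
kmap.  The point is only that the reference taken at the top is dropped
on every path.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define PTES_PER_PAGE 512

/* Hypothetical stand-in for struct page plus its mapped contents. */
struct fake_page {
	int refcount;
	bool dirty;
	_Atomic uint64_t table[PTES_PER_PAGE];
};

/* Models gfn_to_page(): hands out the page with an extra reference. */
struct fake_page *fake_get_page(struct fake_page *p)
{
	p->refcount++;
	return p;
}

/* Models kvm_release_page_dirty(): mark dirty, drop the reference. */
void fake_release_page_dirty(struct fake_page *p)
{
	p->dirty = true;
	p->refcount--;
}

/* Returns true when the pte changed under us, as in the patch. */
bool cmpxchg_gpte(struct fake_page *backing, unsigned index,
		  uint64_t orig_pte, uint64_t new_pte)
{
	struct fake_page *page = fake_get_page(backing);
	uint64_t expected = orig_pte;
	bool swapped;

	assert(index < PTES_PER_PAGE);
	swapped = atomic_compare_exchange_strong(&page->table[index],
						 &expected, new_pte);
	fake_release_page_dirty(page);	/* the call missing from the patch */
	return !swapped;
}
```

Note it is a plain function, not inline.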

> +	return (ret != orig_pte);
> +}
> +
>  /*
>   * Fetch a guest pte for a guest virtual address
>   */
> @@ -91,6 +112,7 @@ static int FNAME(walk_addr)(struct guest_walker *walker,
>  	gpa_t pte_gpa;
>  
>  	pgprintk("%s: addr %lx\n", __FUNCTION__, addr);
> +walk:
>  	walker->level = vcpu->mmu.root_level;
>  	pte = vcpu->cr3;
>  #if PTTYPE == 64
> @@ -135,8 +157,9 @@ static int FNAME(walk_addr)(struct guest_walker *walker,
>  
>  		if (!(pte & PT_ACCESSED_MASK)) {
>  			mark_page_dirty(vcpu->kvm, table_gfn);
> -			pte |= PT_ACCESSED_MASK;
> -			kvm_write_guest(vcpu->kvm, pte_gpa, &pte, sizeof(pte));
> +			if (FNAME(cmpxchg_gpte)(vcpu->kvm, table_gfn, 
> +			    index, pte, pte|PT_ACCESSED_MASK))
> +				goto walk;
>   

We lose the accessed bit in the local variable pte here.  Not sure if it 
matters but let's play it safe.

>  		}
>  
>  		if (walker->level == PT_PAGE_TABLE_LEVEL) {
> @@ -159,9 +182,13 @@ static int FNAME(walk_addr)(struct guest_walker *walker,
>  	}
>  
>  	if (write_fault && !is_dirty_pte(pte)) {
> +		bool ret;
>  		mark_page_dirty(vcpu->kvm, table_gfn);
> -		pte |= PT_DIRTY_MASK;
> -		kvm_write_guest(vcpu->kvm, pte_gpa, &pte, sizeof(pte));
> +		ret = FNAME(cmpxchg_gpte)(vcpu->kvm, table_gfn, index, pte,
> +		       	    pte|PT_DIRTY_MASK);
> +		if (ret)
> +			goto walk;
> +	

Again we lose a bit in pte.  That ends up in walker->pte and is quite 
important.
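
Both remarks about the lost bit come down to the same thing: after a
successful cmpxchg, the local copy has to be updated too.  A minimal
userspace sketch, assuming C11 atomics instead of CMPXCHG; struct
walker_model, set_gpte_bit() and finish_walk() are hypothetical names,
not the ones in paging_tmpl.h.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define PT_ACCESSED_MASK (1ULL << 5)
#define PT_DIRTY_MASK    (1ULL << 6)

/* Hypothetical, reduced version of the walker state. */
struct walker_model {
	uint64_t pte;
};

/*
 * Set `bit` in the guest pte and in the caller's local copy.
 * Returns false if the guest pte changed and the walk must restart.
 */
bool set_gpte_bit(_Atomic uint64_t *slot, uint64_t *pte, uint64_t bit)
{
	uint64_t expected = *pte;

	assert(bit != 0);
	if (!atomic_compare_exchange_strong(slot, &expected, *pte | bit))
		return false;
	*pte |= bit;	/* keep the local copy in sync with guest memory */
	return true;
}

/* Last level of the walk: returns false when a restart is needed. */
bool finish_walk(struct walker_model *walker, _Atomic uint64_t *slot,
		 bool write_fault)
{
	uint64_t pte = atomic_load(slot);

	if (!(pte & PT_ACCESSED_MASK) &&
	    !set_gpte_bit(slot, &pte, PT_ACCESSED_MASK))
		return false;
	if (write_fault && !(pte & PT_DIRTY_MASK) &&
	    !set_gpte_bit(slot, &pte, PT_DIRTY_MASK))
		return false;
	walker->pte = pte;	/* carries both bits to the caller */
	return true;
}
```

Without the "*pte |= bit" line, walker->pte would be stored without the
dirty bit even though the guest pte has it.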




-- 
Any sufficiently difficult bug is indistinguishable from a feature.

