All of lore.kernel.org
 help / color / mirror / Atom feed
From: Marcelo Tosatti <mtosatti@redhat.com>
To: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Cc: Avi Kivity <avi@redhat.com>, LKML <linux-kernel@vger.kernel.org>,
	KVM <kvm@vger.kernel.org>
Subject: Re: [PATCH v5 00/12] KVM: introduce readonly memslot
Date: Thu, 16 Aug 2012 13:03:07 -0300	[thread overview]
Message-ID: <20120816160307.GA24376@amt.cnet> (raw)
In-Reply-To: <502C89D7.6040906@linux.vnet.ibm.com>

On Thu, Aug 16, 2012 at 01:49:11PM +0800, Xiao Guangrong wrote:
> On 08/14/2012 11:25 PM, Marcelo Tosatti wrote:
> > On Tue, Aug 14, 2012 at 10:58:07AM +0800, Xiao Guangrong wrote:
> >> On 08/14/2012 01:39 AM, Marcelo Tosatti wrote:
> >>> On Sat, Aug 11, 2012 at 11:36:20AM +0800, Xiao Guangrong wrote:
> >>>> On 08/11/2012 02:14 AM, Marcelo Tosatti wrote:
> >>>>> On Tue, Aug 07, 2012 at 05:47:15PM +0800, Xiao Guangrong wrote:
> >>>>>> Changelog:
> >>>>>> - introduce KVM_PFN_ERR_RO_FAULT instead of dummy page
> >>>>>> - introduce KVM_HVA_ERR_BAD and optimize error hva indicators
> >>>>>>
> >>>>>> The test case can be found at:
> >>>>>> http://lkml.indiana.edu/hypermail/linux/kernel/1207.2/00819/migrate-perf.tar.bz2
> >>>>>>
> >>>>>> In current code, if we map a readonly memory space from host to guest
> >>>>>> and the page is not currently mapped in the host, we will get a fault-pfn
> >>>>>> and async is not allowed, then the vm will crash.
> >>>>>>
> >>>>>> As Avi's suggestion, We introduce readonly memory region to map ROM/ROMD
> >>>>>> to the guest, read access is happy for readonly memslot, write access on
> >>>>>> readonly memslot will cause KVM_EXIT_MMIO exit.
> >>>>>
> >>>>> Memory slots whose QEMU mapping is write protected is supported
> >>>>> today, as long as there are no write faults.
> >>>>>
> >>>>> What prevents the use of mmap(!MAP_WRITE) to handle read-only memslots
> >>>>> again?
> >>>>>
> >>>>
> >>>> It is happy to map !write host memory space to the readonly memslot,
> >>>> and they can coexist as well.
> >>>>
> >>>> readonly memslot checks the write-permission by seeing slot->flags and
> >>>> !write memory checks the write-permission in hva_to_pfn() function
> >>>> which checks vma->flags. It is no conflict.
> >>>
> >>> Yes, there is no conflict. The point is, if you can use the
> >>> mmap(PROT_READ) interface (supporting read faults on read-only slots)
> >>> for this behavior, what is the advantage of a new memslot flag?
> >>>
> >>
> >> You can get the discussion at:
> >> https://lkml.org/lkml/2012/5/22/228
> >>
> >>> I'm not saying mmap(PROT_READ) is the best interface, i am just asking
> >>> why it is not.
> >>
> >> My fault. :(
> >>
> >>>
> >>>>> The initial objective was to fix a vm crash, can you explain that
> >>>>> initial problem?
> >>>>>
> >>>>
> >>>> The issue was trigged by this code:
> >>>>
> >>>>                 } else {
> >>>>                         if (async && (vma->vm_flags & VM_WRITE))
> >>>>                                 *async = true;
> >>>>                         pfn = KVM_PFN_ERR_FAULT;
> >>>>                 }
> >>>>
> >>>> If the host memory region is readonly (!vma->vm_flags & VM_WRITE) and
> >>>> its physical page is swapped out (or the file data does not be read in),
> >>>> get_user_page_nowait will fail, above code reject to set async,
> >>>> then we will get a fault pfn and async=false.
> >>>>
> >>>> I guess this issue also exists in "QEMU write protected mapping" as
> >>>> you mentioned above.
> >>>
> >>> Yes, it does. As far as i understand, what that check does from a high
> >>> level pov is:
> >>>
> >>> - Did get_user_pages_nowait() fail due to a swapped out page (in which 
> >>> case we should try to swappin the page asynchronously), or due to 
> >>> another reason (for which case an error should be returned).
> >>>
> >>> Using vma->vm_flags VM_WRITE for that is trying to guess why
> >>> get_user_pages_nowait() failed, because it (gup_nowait return values) 
> >>> does not provide sufficient information by itself.
> >>>
> >>
> >> That is exactly what i did in the first version. :)
> >>
> >> You can see it and the reason why it switched to the new way (readonly memslot)
> >> in the above website (the first message in thread).
> > 
> > Userspace can create multiple mappings for the same memory region, for
> > example via shared memory (shm_open), and have different protections for
> > the two (or more) regions. I had old patch doing this, its attached.
> > 
> 
> In this way, if guest try to write a readonly gfn, the vm will be crashed since
> it will return FAULT_PFN on the page-fault path. VMM can not detect this kind
> of fault, we have these problems:
> - even if guest try to write ROM on a PCI device, the guest will die, but
>   we'd ignore this write, it looks more like the real machine.
> 
> - can not implement ROMD beacuse write to a ROMD is MMIO access
> 
> Yes, we can rework get_user_page_nowait and get_user_pages_fast, let them
> tell us the fault reason, but it is more complex i think.
> 
> >>> Can't that be fixed separately? 
> >>>
> >>> Another issue which is also present with the mmap(PROT_READ) scheme is
> >>> interaction with reexecute_instruction. That is, unless i am mistaken,
> >>> reexecute_instruction can succeed (return true) on a region that is
> >>> write protected. This breaks the "write faults on read-only slots exit
> >>> to userspace via EXIT_MMIO" behaviour.
> >>
> >> Sorry, Why? After re-entry to the guest, it can not generate a correct MMIO?
> > 
> > reexecute_instruction validates presence of GPA by looking at registered
> > memslots. But if the access is a write, and userspace memory map is
> > read-only, reexecute_instruction should exit via MMIO.
> > 
> > That is, reexecute_instruction must validate GPA using registered
> > memslots AND additionaly userspace map permission, not only registered
> > memslot.
> > 
> 
> What will happen if we always retry a unhandleable instruction which try to write
> readonly memory? It will goto a endless loop (write-fault -> emulation fail ->
> write-fault...)? Right?

I think so... thats what would happen on real hardware.

> I do not think exit via MMIO is a good idea because the instructions can not be
> emulated, after the userspace finished the MMIO, the emulation will fail again.
> 
> I think we can simply exit via KVM_EXIT_INTERNAL_ERROR for all the access on
> readonly memory because:
> - it is fine for the read access since the read fault is always fixed on page-fault path,
>   it does not go to x86_emulate_instruction()
> 
> - for the write access, we can not emulate it. It is not bad since it only happen on
>   the instructions kvm unsupported.
> 
> Your idea?

Either KVM_EXIT_INTERNAL_ERROR or leave the guest looping. 

  reply	other threads:[~2012-08-16 16:03 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-08-07  9:47 [PATCH v5 00/12] KVM: introduce readonly memslot Xiao Guangrong
2012-08-07  9:48 ` [PATCH v5 01/12] KVM: fix missing check for memslot flags Xiao Guangrong
2012-08-07  9:48 ` [PATCH v5 02/12] KVM: hide KVM_MEMSLOT_INVALID from userspace Xiao Guangrong
2012-08-09 18:48   ` Marcelo Tosatti
2012-08-10  2:11     ` Xiao Guangrong
2012-08-07  9:49 ` [PATCH v5 03/12] KVM: introduce gfn_to_pfn_memslot_atomic Xiao Guangrong
2012-08-09 18:50   ` Marcelo Tosatti
2012-08-10  3:22     ` Xiao Guangrong
2012-08-07  9:50 ` [PATCH v5 04/12] KVM: introduce gfn_to_hva_read/kvm_read_hva/kvm_read_hva_atomic Xiao Guangrong
2012-08-07  9:51 ` [PATCH v5 05/12] KVM: reorganize hva_to_pfn Xiao Guangrong
2012-08-10 17:51   ` Marcelo Tosatti
2012-08-11  3:11     ` Xiao Guangrong
2012-08-07  9:51 ` [PATCH v5 06/12] KVM: use 'writable' as a hint to map writable pfn Xiao Guangrong
2012-08-07  9:52 ` [PATCH v5 07/12] KVM: introduce KVM_PFN_ERR_RO_FAULT Xiao Guangrong
2012-08-07  9:52 ` [PATCH v5 08/12] KVM: introduce KVM_HVA_ERR_BAD Xiao Guangrong
2012-08-07  9:53 ` [PATCH v5 09/12] KVM: introduce KVM_HVA_ERR_RO_BAD Xiao Guangrong
2012-08-07  9:54 ` [PATCH v5 10/12] KVM: introduce readonly memslot Xiao Guangrong
2012-08-07  9:54 ` [PATCH v5 11/12] KVM: x86: introduce set_mmio_exit_info Xiao Guangrong
2012-08-10 18:03   ` Marcelo Tosatti
2012-08-11  3:13     ` Xiao Guangrong
2012-08-07  9:55 ` [PATCH v5 12/12] KVM: indicate readonly access fault Xiao Guangrong
2012-08-10 18:14 ` [PATCH v5 00/12] KVM: introduce readonly memslot Marcelo Tosatti
2012-08-11  3:36   ` Xiao Guangrong
2012-08-13 17:39     ` Marcelo Tosatti
2012-08-14  2:58       ` Xiao Guangrong
2012-08-14 15:25         ` Marcelo Tosatti
2012-08-16  5:49           ` Xiao Guangrong
2012-08-16 16:03             ` Marcelo Tosatti [this message]
2012-08-14 14:00   ` Avi Kivity
2012-08-14 15:51     ` Marcelo Tosatti
2012-08-15 10:44       ` Avi Kivity
2012-08-15 17:53         ` Marcelo Tosatti
2012-08-16  9:03           ` Avi Kivity
2012-08-16 15:57             ` Marcelo Tosatti
2012-08-16 16:17               ` Avi Kivity

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120816160307.GA24376@amt.cnet \
    --to=mtosatti@redhat.com \
    --cc=avi@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=xiaoguangrong@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.