From: cdall@linaro.org (Christoffer Dall)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH 07/10] KVM: arm/arm64: Preserve Exec permission across R/W permission faults
Date: Tue, 17 Oct 2017 07:46:21 -0700
Message-ID: <20171017144621.GJ5886@lvm>
In-Reply-To: <9f1e5366-fe47-69e9-8df9-31105c39094f@arm.com>

On Tue, Oct 17, 2017 at 12:22:08PM +0100, Marc Zyngier wrote:
> On 16/10/17 21:08, Christoffer Dall wrote:
> > On Mon, Oct 09, 2017 at 04:20:29PM +0100, Marc Zyngier wrote:
> >> So far, we lose the Exec property whenever we take permission
> >> faults, as we always reconstruct the PTE/PMD from scratch. This
> >> can be counterproductive, as we can end up with the following
> >> fault sequence:
> >>
> >> 	X -> RO -> ROX -> RW -> RWX
> >>
> >> Instead, we can look up the existing PTE/PMD and clear the XN bit in
> >> the new entry if it was already cleared in the old one, leading to a
> >> much nicer fault sequence:
> >>
> >> 	X -> ROX -> RWX
> >>
> >> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> >> ---
> >>  arch/arm/include/asm/kvm_mmu.h   | 10 ++++++++++
> >>  arch/arm64/include/asm/kvm_mmu.h | 10 ++++++++++
> >>  virt/kvm/arm/mmu.c               | 25 +++++++++++++++++++++++++
> >>  3 files changed, 45 insertions(+)
> >>
> >> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
> >> index bf76150aad5f..ad442d86c23e 100644
> >> --- a/arch/arm/include/asm/kvm_mmu.h
> >> +++ b/arch/arm/include/asm/kvm_mmu.h
> >> @@ -107,6 +107,11 @@ static inline bool kvm_s2pte_readonly(pte_t *pte)
> >>  	return (pte_val(*pte) & L_PTE_S2_RDWR) == L_PTE_S2_RDONLY;
> >>  }
> >>  
> >> +static inline bool kvm_s2pte_exec(pte_t *pte)
> >> +{
> >> +	return !(pte_val(*pte) & L_PTE_XN);
> >> +}
> >> +
> >>  static inline void kvm_set_s2pmd_readonly(pmd_t *pmd)
> >>  {
> >>  	pmd_val(*pmd) = (pmd_val(*pmd) & ~L_PMD_S2_RDWR) | L_PMD_S2_RDONLY;
> >> @@ -117,6 +122,11 @@ static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
> >>  	return (pmd_val(*pmd) & L_PMD_S2_RDWR) == L_PMD_S2_RDONLY;
> >>  }
> >>  
> >> +static inline bool kvm_s2pmd_exec(pmd_t *pmd)
> >> +{
> >> +	return !(pmd_val(*pmd) & PMD_SECT_XN);
> >> +}
> >> +
> >>  static inline bool kvm_page_empty(void *ptr)
> >>  {
> >>  	struct page *ptr_page = virt_to_page(ptr);
> >> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> >> index 60c420a5ac0d..e7af74b8b51a 100644
> >> --- a/arch/arm64/include/asm/kvm_mmu.h
> >> +++ b/arch/arm64/include/asm/kvm_mmu.h
> >> @@ -203,6 +203,11 @@ static inline bool kvm_s2pte_readonly(pte_t *pte)
> >>  	return (pte_val(*pte) & PTE_S2_RDWR) == PTE_S2_RDONLY;
> >>  }
> >>  
> >> +static inline bool kvm_s2pte_exec(pte_t *pte)
> >> +{
> >> +	return !(pte_val(*pte) & PTE_S2_XN);
> >> +}
> >> +
> >>  static inline void kvm_set_s2pmd_readonly(pmd_t *pmd)
> >>  {
> >>  	kvm_set_s2pte_readonly((pte_t *)pmd);
> >> @@ -213,6 +218,11 @@ static inline bool kvm_s2pmd_readonly(pmd_t *pmd)
> >>  	return kvm_s2pte_readonly((pte_t *)pmd);
> >>  }
> >>  
> >> +static inline bool kvm_s2pmd_exec(pmd_t *pmd)
> >> +{
> >> +	return !(pmd_val(*pmd) & PMD_S2_XN);
> >> +}
> >> +
> >>  static inline bool kvm_page_empty(void *ptr)
> >>  {
> >>  	struct page *ptr_page = virt_to_page(ptr);
> >> diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
> >> index 1911fadde88b..ccc6106764a6 100644
> >> --- a/virt/kvm/arm/mmu.c
> >> +++ b/virt/kvm/arm/mmu.c
> >> @@ -926,6 +926,17 @@ static int stage2_set_pmd_huge(struct kvm *kvm, struct kvm_mmu_memory_cache
> >>  	return 0;
> >>  }
> >>  
> >> +static pte_t *stage2_get_pte(struct kvm *kvm, phys_addr_t addr)
> >> +{
> >> +	pmd_t *pmdp;
> >> +
> >> +	pmdp = stage2_get_pmd(kvm, NULL, addr);
> >> +	if (!pmdp || pmd_none(*pmdp))
> >> +		return NULL;
> >> +
> >> +	return pte_offset_kernel(pmdp, addr);
> >> +}
> >> +
> > 
> > nit, couldn't you change this to be
> > 
> >     stage2_is_exec(struct kvm *kvm, phys_addr_t addr)
> > 
> > Which, if the pmd is a section mapping, just checks that, and if we
> > find a pte, checks that instead? Then we could have a simpler one-line
> > call and check from both the pte and pmd paths below.
> 
> Yes, that's pretty neat. I've folded that in.
> 
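(For reference, a combined helper might look something like the sketch
below. This is just my guess at the shape of it, not necessarily what
was folded in:)

	static bool stage2_is_exec(struct kvm *kvm, phys_addr_t addr)
	{
		pmd_t *pmdp;
		pte_t *ptep;

		/* Walk to the PMD; bail out if nothing is mapped there */
		pmdp = stage2_get_pmd(kvm, NULL, addr);
		if (!pmdp || pmd_none(*pmdp) || !pmd_present(*pmdp))
			return false;

		/* Section/huge mapping: the XN bit lives in the PMD itself */
		if (pmd_thp_or_huge(*pmdp))
			return kvm_s2pmd_exec(pmdp);

		/* Otherwise descend to the PTE and check its XN bit */
		ptep = pte_offset_kernel(pmdp, addr);
		if (!ptep || pte_none(*ptep) || !pte_present(*ptep))
			return false;

		return kvm_s2pte_exec(ptep);
	}
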
> > 
> >>  static int stage2_set_pte(struct kvm *kvm, struct kvm_mmu_memory_cache *cache,
> >>  			  phys_addr_t addr, const pte_t *new_pte,
> >>  			  unsigned long flags)
> >> @@ -1407,6 +1418,13 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >>  		if (exec_fault) {
> >>  			new_pmd = kvm_s2pmd_mkexec(new_pmd);
> >>  			coherent_icache_guest_page(vcpu, pfn, PMD_SIZE);
> >> +		} else if (fault_status == FSC_PERM) {
> >> +			/* Preserve execute if XN was already cleared */
> >> +			pmd_t *old_pmdp = stage2_get_pmd(kvm, NULL, fault_ipa);
> >> +
> >> +			if (old_pmdp && pmd_present(*old_pmdp) &&
> >> +			    kvm_s2pmd_exec(old_pmdp))
> >> +				new_pmd = kvm_s2pmd_mkexec(new_pmd);
> > 
> > Is the reverse case not also possible then?  That is, if we have an
> > exec_fault, we could check if the entry is already writable and maintain
> > the property as well.  Not sure how often that would get hit though, as
> > a VM would only execute instructions on a page that has been written to,
> > but is somehow read-only at stage2, meaning the host must have marked
> > the page as read-only since content was written.  I think this could be
> > a somewhat common pattern with something like KSM though?
> 
> I think this is already the case, because we always build the PTE/PMD as
> either ROXN or RWXN, and only later clear the XN bit (see the
> unconditional call to gfn_to_pfn_prot which should tell us whether to
> map the page as writable or not). Or am I missing your point entirely?
> 

I am worried about the flow where we map the page as RWXN, then execute
some code on it, so it becomes RWX, and then the host makes it read-only
again, turning it into ROXN. Shouldn't we preserve the exec state here?

I'm guessing that something like KSM will want to make pages read-only
to support COW, but perhaps that is always done by copying the content
to a new page and redirecting the old mapping to a new one (something
that calls kvm_set_spte_hva()), and in that case we probably do want
the XN bit to be set again so that we can do the necessary maintenance.

However, we shouldn't try to optimize for something we don't know to be
a problem, so as long as it's functionally correct, which I think it is,
we should be fine.
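
(For concreteness, with such a helper the PMD path quoted above could
collapse to a one-liner along these lines -- again just a sketch:)

	} else if (fault_status == FSC_PERM && stage2_is_exec(kvm, fault_ipa)) {
		/* Preserve execute if XN was already cleared */
		new_pmd = kvm_s2pmd_mkexec(new_pmd);
	}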

Thanks,
-Christoffer
