public inbox for kvm@vger.kernel.org
From: Mario Smarduch <m.smarduch@samsung.com>
To: Christoffer Dall <christoffer.dall@linaro.org>
Cc: kvmarm@lists.cs.columbia.edu, marc.zyngier@arm.com,
	xiaoguangrong@linux.vnet.ibm.com, steve.capper@arm.com,
	kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	gavin.guo@canonical.com, peter.maydell@linaro.org,
	jays.lee@samsung.com, sungjinn.chung@samsung.com
Subject: Re: [RESEND PATCH v7 3/4] arm: dirty log write protect management support
Date: Mon, 09 Jun 2014 18:47:12 -0700	[thread overview]
Message-ID: <539663A0.9080507@samsung.com> (raw)
In-Reply-To: <20140608120522.GG3279@lvm>

On 06/08/2014 05:05 AM, Christoffer Dall wrote:
> On Fri, Jun 06, 2014 at 10:33:41AM -0700, Mario Smarduch wrote:
>> kvm_vm_ioctl_get_dirty_log() is generic, used by both x86 and ARM. A recent
>> x86 patch changed this function; this patch picks up those changes and has been
>> re-tested, everything works. Applies cleanly with the other patches.
>>
>> This patch adds support for keeping track of VM dirty pages. As the dirty page
>> log is retrieved, the pages that have been written are write-protected again for
>> the next write and log read.
>>
>> Signed-off-by: Mario Smarduch <m.smarduch@samsung.com>
>> ---
>>  arch/arm/include/asm/kvm_host.h |    3 ++
>>  arch/arm/kvm/arm.c              |    5 ---
>>  arch/arm/kvm/mmu.c              |   79 +++++++++++++++++++++++++++++++++++
>>  arch/x86/kvm/x86.c              |   86 ---------------------------------------
>>  virt/kvm/kvm_main.c             |   86 +++++++++++++++++++++++++++++++++++++++
>>  5 files changed, 168 insertions(+), 91 deletions(-)
>>
>> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
>> index 59565f5..b760f9c 100644
>> --- a/arch/arm/include/asm/kvm_host.h
>> +++ b/arch/arm/include/asm/kvm_host.h
>> @@ -232,5 +232,8 @@ u64 kvm_arm_timer_get_reg(struct kvm_vcpu *, u64 regid);
>>  int kvm_arm_timer_set_reg(struct kvm_vcpu *, u64 regid, u64 value);
>>  
>>  void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
>> +void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
>> +	struct kvm_memory_slot *slot,
>> +	gfn_t gfn_offset, unsigned long mask);
> 
> Do all other architectures implement this function?  arm64?

Besides ARM, x86 implements it, but there the function is not generic.
> 
>>  
>>  #endif /* __ARM_KVM_HOST_H__ */
>> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
>> index dfd63ac..f06fb21 100644
>> --- a/arch/arm/kvm/arm.c
>> +++ b/arch/arm/kvm/arm.c
>> @@ -780,11 +780,6 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
>>  	}
>>  }
>>  
>> -int kvm_vm_ioctl_get_dirty_log(struct kvm *kvm, struct kvm_dirty_log *log)
>> -{
>> -	return -EINVAL;
>> -}
>> -
> 
> What about the other architectures implementing this function?

Six architectures define this function. With this patch the function
becomes generic, lives in kvm_main.c, and is used by x86.
> 
>>  static int kvm_vm_ioctl_set_device_addr(struct kvm *kvm,
>>  					struct kvm_arm_device_addr *dev_addr)
>>  {
>> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
>> index e5dff85..907344c 100644
>> --- a/arch/arm/kvm/mmu.c
>> +++ b/arch/arm/kvm/mmu.c
>> @@ -874,6 +874,85 @@ void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot)
>>  	spin_unlock(&kvm->mmu_lock);
>>  }
>>  
>> +/**
>> + * stage2_wp_mask_range() - write protect memslot pages set in mask
>> + * @pmd - pointer to page table
>> + * @start_ipa - the start range of mask
>> + * @addr - start_ipa or start range of adjusted mask if crossing PMD range
>> + * @mask - mask of dirty pages
>> + *
>> + * Walk mask and write protect the associated dirty pages in the memory region.
>> + * If mask crosses a PMD range adjust it to next page table and return.
>> + */
>> +static void stage2_wp_mask_range(pmd_t *pmd, phys_addr_t start_ipa,
>> +		phys_addr_t *addr, unsigned long *mask)
>> +{
>> +	pte_t *pte;
>> +	bool crosses_pmd;
>> +	int i = __ffs(*mask);
>> +
>> +	if (unlikely(*addr > start_ipa))
>> +		start_ipa = *addr - i * PAGE_SIZE;
> 
> huh?
> 
>> +	pte = pte_offset_kernel(pmd, start_ipa);
>> +	for (*addr = start_ipa + i * PAGE_SIZE; *mask;
>> +		i = __ffs(*mask), *addr = start_ipa + i * PAGE_SIZE) {
>> +		crosses_pmd = !!((start_ipa & PMD_MASK) ^ (*addr & PMD_MASK));
>> +		if (unlikely(crosses_pmd)) {
>> +			/* Adjust mask dirty bits relative to next page table */
>> +			*mask >>= (PTRS_PER_PTE - pte_index(start_ipa));
>> +			return;
>> +		}
>> +		if (!pte_none(pte[i]))
>> +			kvm_set_s2pte_readonly(&pte[i]);
>> +		*mask &= ~(1 << i);
> 
> This is *really* complicated, and *really* unintuitive and *really* hard
> to read!
> 
> I feel this may very likely break, and is optimizing prematurely for
> some very special case.  Can't you follow the usual scheme of traversing
> the levels one-by-one and just calculate the 'end' address based on the
> number of bits in your long, and just adjust the mask in the calling
> function each time you are about to call a lower-level function?

Agreed, I'll extend the wp_range functions; it probably makes no sense
to be optimizing at this phase.

> 
> In fact, I think this could be trivially implemented as an extension to
> your existing wp_range functions.  On ARM you are mostly going to
> consider 32 pages, on arm64 you are mostly going to consider 64 pages,
> just calculate that range in terms of IPAs and set that as the limit for
> calling stage2_wp_pgd_range (which should be factor'ed out into its
> function and called from kvm_mmu_wp_memory_region).
> 
> 
> 
>> +	}
>> +}
>> +
>> +/**
>> + * kvm_mmu_write_protected_pt_masked() - write protect dirty pages set in mask
>> + * @kvm:        The KVM pointer
>> + * @slot:       The memory slot associated with mask
>> + * @gfn_offset: The gfn offset in memory slot
>> + * @mask:       The mask of dirty pages at offset 'gnf_offset' in this memory
> 
> s/gnf_offset/gfn_offset/
ok
> 
>> + *              slot to be write protected
>> + *
>> + * Called from dirty page logging read function to write protect bits set in
>> + * mask to record future writes to these pages in dirty page log. This function
>> + * uses simplified page table walk given  mask can spawn no more then 2 PMD
> 
> random double white space before mask
ok
> 
>> + * table range.
>> + * 'kvm->mmu_lock' must be  held to protect against concurrent modification
>> + * of page tables (2nd stage fault, mmu modifiers, ...)
>> + *
>> + */
>> +void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
>> +		struct kvm_memory_slot *slot,
>> +		gfn_t gfn_offset, unsigned long mask)
>> +{
>> +	pud_t *pud;
>> +	pmd_t *pmd;
>> +	phys_addr_t start_ipa = (slot->base_gfn + gfn_offset) << PAGE_SHIFT;
>> +	phys_addr_t end_ipa = start_ipa + BITS_PER_LONG * PAGE_SIZE;
>> +	phys_addr_t addr = start_ipa;
>> +	pgd_t *pgdp = kvm->arch.pgd, *pgd;
>> +
>> +	do {
>> +		pgd = pgdp + pgd_index(addr);
>> +		if (pgd_present(*pgd)) {
>> +			pud = pud_offset(pgd, addr);
>> +			if (!pud_none(*pud) && !pud_huge(*pud)) {
>> +				pmd = pmd_offset(pud, addr);
>> +				if (!pmd_none(*pmd) && !kvm_pmd_huge(*pmd))
>> +					stage2_wp_mask_range(pmd, start_ipa,
>> +								&addr, &mask);
>> +				else
>> +					addr += PMD_SIZE;
>> +			} else
>> +				addr += PUD_SIZE;
> 
> is this correct? what if your gfn_offset puts you at the last page of a
> PUD, don't you need pud_addr_end() instead?
> 
>> +		} else
>> +			addr += PGDIR_SIZE;
> 
> please use braces for both of the single-line else-clauses above when
> the main if-clause is multi-line (see Documentation/CodingStyle chapter
> 3, just before 3.1).
> 
>> +	} while (mask && addr < end_ipa);
> 
> this seems like a complicated loop condition.  It seems to me that
> either you clear all the bits in the mask you want to check or you check
> for an address range, why is there a need for both?
> 
>> +}
>> +
>>  static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>>  			  struct kvm_memory_slot *memslot,
>>  			  unsigned long fault_status)
> 
> [...]
> 
> Thanks,
> -Christoffer
> 


