[PATCH v5 2/4] live migration support for initial write protect of VM

From: christoffer.dall@linaro.org (Christoffer Dall)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v5 2/4] live migration support for initial write protect of VM
Date: Mon, 19 May 2014 18:56:48 +0100
Message-ID: <20140519175648.GB5292@lvm>
In-Reply-To: <53768584.4060705@samsung.com>

On Fri, May 16, 2014 at 02:39:16PM -0700, Mario Smarduch wrote:
> Hi Christoffer,
>   few more comments
> >>>  	struct vgic_dist	vgic;
> >>> +	/* Marks start of migration, used to handle 2nd stage page faults
> >>> +	 * during migration, prevent installing huge pages and split huge pages
> >>> +	 * to small pages.
> >>> +	 */
> >>
> >> commenting style
> >>
> >> this is a bit verbose for a field in a struct, perhaps moving the longer
> >> version to where you set this?
> > Will do.
> >>
> >>> +	int migration_in_progress;
> >>>  };
> 
> I think this flag could be removed altogether. Migration can be
> stopped at any time (started too), through user request or other events.
> When that happens (like migrate_cancel), the migrate cleanup bh runs and
> eventually calls the KVM memory listener kvm_log_global_start() (cancel
> handler), which stops logging, clears KVM_MEM_LOG_DIRTY_PAGES and, via
> the region ops ioctl, clears dirty_bitmap. In either case, the memslot's
> dirty_bitmap is set or unset during migration to track dirty pages, so
> following that field seems to be a better way to keep track of migration.
> This again is the QEMU view, but it appears all these policies are driven
> from user space.
> 

ok, I need to look more closely at the whole thing to properly comment
on this.

> 
> 
> >>>  
> >>> +/* kvm_split_pmd - splits huge pages to small pages, required to keep a dirty
> >>> + *	log of smaller memory granules, otherwise huge pages would need to be
> >>> + *	migrated. Practically an idle system has problems migrating with
> >>
> >> This seems abrupt.  Why can't we just represent a 2M huge page as 512 4K
> >> bits and write protect the huge pages?  If you take a write fault on a
> >> 2M page, split it at that point.
> > 
> > That's one alternative. The one I put into v6 is to clear the PMD
> > and force user_mem_abort() to fault in 4k pages, and mark the
> > dirty_bitmap[] for that page, reusing the current code. I have not
> > checked the impact on performance; it takes a few seconds longer
> > to converge for the tests I'm running.
> 
> I was thinking about this, and if PMD attributes need to be passed
> on to the PTEs then it appears what you recommend is required.
> But during run time I don't see how 2nd stage attributes can
> change. Could the guest do anything to change them (SH, Memattr)?

You should be able to just grab kvm->mmu_lock, update the stage-2
page tables to remove all writable bits, flush all Stage-2 TLBs for that
VMID, and you should be all set.
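
To make the sequence above concrete, here is a rough userspace model of it,
not kernel code. Everything named toy_* is made up for illustration: the
"stage-2 table" is a flat array, the lock is a small C11 spinlock standing
in for the real MMU lock, and the "TLB flush" is just a counter. It only
shows the ordering: take the lock, clear the writable bits, flush once,
drop the lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NR_ENTRIES 8
#define TOY_VALID      (1u << 0)   /* hypothetical "entry is mapped" bit */
#define TOY_WRITABLE   (1u << 1)   /* hypothetical "guest may write" bit */

/* Toy stand-in for a VM: one flat table, one lock, one flush counter. */
struct toy_vm {
	atomic_flag mmu_lock;
	uint32_t stage2[TOY_NR_ENTRIES];
	unsigned int tlb_flushes;
};

void toy_vm_init(struct toy_vm *vm)
{
	atomic_flag_clear(&vm->mmu_lock);
	for (size_t i = 0; i < TOY_NR_ENTRIES; i++)
		vm->stage2[i] = 0;
	vm->tlb_flushes = 0;
}

/* Models "flush all Stage-2 TLBs for that VMID" by counting calls. */
void toy_flush_vmid_tlb(struct toy_vm *vm)
{
	vm->tlb_flushes++;
}

/*
 * Clear the writable bit on every valid entry while holding the lock,
 * then flush once if anything changed. Returns the number of entries
 * that were write-protected by this call.
 */
size_t toy_write_protect_all(struct toy_vm *vm)
{
	size_t changed = 0;

	while (atomic_flag_test_and_set(&vm->mmu_lock))
		;	/* spin until the lock is free */

	for (size_t i = 0; i < TOY_NR_ENTRIES; i++) {
		uint32_t e = vm->stage2[i];

		if ((e & TOY_VALID) && (e & TOY_WRITABLE)) {
			vm->stage2[i] = e & ~TOY_WRITABLE;
			changed++;
		}
	}

	/* Stale writable translations must not survive in the TLB. */
	if (changed)
		toy_flush_vmid_tlb(vm);

	atomic_flag_clear(&vm->mmu_lock);
	return changed;
}
```

A second call right after the first finds nothing writable, so it changes
no entries and does not flush again.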

> 
> 
> Performance may also be another reason, but that always depends
> on the load; clearing a PMD seems easier and reuses current code.
> Probably several load tests/benchmarks can help here.
> I also noticed hw PMD/PTE attributes differ a little, which
> is not significant now, but moving forward different page sizes
> and any new revisions to fields may require additional maintenance.

I think clearing out all PMD mappings will carry a significant
performance degradation on the source VM, and in the case where you keep
it running, that will be quite unfortunate.  Hint: Page faults are
expensive and huge pages have been shown to give about a 10-15%
performance increase on ARMv7 for CPU/memory intensive benchmarks.
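
For comparison, the alternative raised earlier in this thread (keep the 2M
mapping but write-protect it, track it as 512 4K dirty bits, and split only
when a write fault hits it) could look roughly like the toy model below.
This is a userspace sketch under my own assumptions, not the patch: all
toy_* names are hypothetical, one struct models a single 2M region, and a
"write" is just a function call that reports whether it would have faulted.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SHIFT     12	/* 4K small pages */
#define TOY_HUGE_SHIFT     21	/* 2M huge pages */
#define TOY_PAGES_PER_HUGE (1u << (TOY_HUGE_SHIFT - TOY_PAGE_SHIFT)) /* 512 */
#define TOY_BITS_PER_LONG  (sizeof(unsigned long) * CHAR_BIT)
#define TOY_BITMAP_LONGS \
	((TOY_PAGES_PER_HUGE + TOY_BITS_PER_LONG - 1) / TOY_BITS_PER_LONG)

/* Toy model of one 2M guest region while dirty logging is active. */
struct toy_region {
	bool is_huge;				/* still one 2M mapping */
	bool page_writable[TOY_PAGES_PER_HUGE];	/* used once split */
	unsigned long dirty_bitmap[TOY_BITMAP_LONGS];
	unsigned int write_faults;
};

/* Start logging: keep the huge mapping, write-protected, nothing dirty. */
void toy_region_start_logging(struct toy_region *r)
{
	memset(r, 0, sizeof(*r));
	r->is_huge = true;
}

/* Write fault at a byte offset inside the region: split lazily, mark 4K. */
void toy_region_write_fault(struct toy_region *r, uint64_t offset)
{
	uint64_t pfn = offset >> TOY_PAGE_SHIFT;

	assert(pfn < TOY_PAGES_PER_HUGE);
	r->write_faults++;

	/* The split happens only now; the other 511 pages stay read-only. */
	if (r->is_huge)
		r->is_huge = false;

	r->page_writable[pfn] = true;
	r->dirty_bitmap[pfn / TOY_BITS_PER_LONG] |=
		1UL << (pfn % TOY_BITS_PER_LONG);
}

/* Guest write at a byte offset. Returns true if it took a write fault. */
bool toy_region_write(struct toy_region *r, uint64_t offset)
{
	uint64_t pfn = offset >> TOY_PAGE_SHIFT;

	assert(pfn < TOY_PAGES_PER_HUGE);
	if (!r->is_huge && r->page_writable[pfn])
		return false;	/* already writable and already logged */

	toy_region_write_fault(r, offset);
	return true;
}

/* Number of 4K pages currently marked dirty. */
unsigned int toy_dirty_count(const struct toy_region *r)
{
	unsigned int n = 0;

	for (unsigned int pfn = 0; pfn < TOY_PAGES_PER_HUGE; pfn++)
		if (r->dirty_bitmap[pfn / TOY_BITS_PER_LONG] &
		    (1UL << (pfn % TOY_BITS_PER_LONG)))
			n++;
	return n;
}
```

In this model a region that is never written stays a single huge mapping
and takes no faults at all, which is the case the performance argument
above cares about.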

> 
> I'll be out next week and back on the 26th. I'll create a link with
> details on the test environment and tests. The cover letter
> will go through a general overview only.
> 

ok, I have some time then.

-Christoffer

> 
> > 
> >>
> >> If your use case is HA, then you will be doing this a lot, and you don't
> >> want to hurt performance of your main live system more than necessary.
> > 
> >>
> >>> + *	huge pages.  Called during WP of entire VM address space, done
> >>
> 
>
