From: christoffer.dall@linaro.org (Christoffer Dall)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v5 2/4] live migration support for initial write protect of VM
Date: Mon, 19 May 2014 18:56:48 +0100
Message-ID: <20140519175648.GB5292@lvm>
In-Reply-To: <53768584.4060705@samsung.com>
On Fri, May 16, 2014 at 02:39:16PM -0700, Mario Smarduch wrote:
> Hi Christoffer,
> a few more comments.
> >>> struct vgic_dist vgic;
> >>> + /* Marks start of migration, used to handle 2nd stage page faults
> >>> + * during migration, prevent installing huge pages and split huge pages
> >>> + * to small pages.
> >>> + */
> >>
> >> commenting style
> >>
> >> this is a bit verbose for a field in a struct, perhaps moving the longer
> >> version to where you set this?
> > Will do.
> >>
> >>> + int migration_in_progress;
> >>> };
>
> I think this flag could be removed altogether. Migration can be started
> or stopped at any time, through a user request or other events. When
> that happens (like migrate_cancel), the migrate cleanup bh runs and
> eventually calls the KVM memory listener kvm_log_global_start() (cancel
> handler), which stops logging, clears KVM_MEM_LOG_DIRTY_PAGES and, through
> the region ops ioctl, clears dirty_bitmap. In either case the dirty_bitmap
> for the memslot is set or unset during migration to track dirty pages, so
> following that field seems to be a better way to keep track of migration.
> This again is the QEMU view, but it appears all these policies are driven
> from user space.
>
OK, I need to look more closely at the whole thing to comment properly
on this.
>
>
> >>>
> >>> +/* kvm_split_pmd - splits huge pages to small pages, required to keep a dirty
> >>> + * log of smaller memory granules, otherwise huge pages would need to be
> >>> + * migrated. Practically an idle system has problems migrating with
> >>
> >> This seems abrupt. Why can't we just represent a 2M huge page as 512 4K
> >> bits and write protect the huge pages, if you take a write fault on a 2M
> >> page, then split it then.
> >
> > That's one alternative. The one I put into v6 is to clear the PMD
> > and force user_mem_abort() to fault in 4k pages and mark the
> > dirty_bitmap[] for that page, reusing the current code. I have not
> > checked the impact on performance; it takes a few seconds longer
> > to converge for the tests I'm running.
>
> I was thinking about this and if PMD attributes need to be passed
> onto the PTEs then it appears what you recommend is required.
> But during run time I don't see how 2nd stage attributes can
> change, could the guest do anything to change them (SH, Memattr)?
You should be able to just grab the KVM MMU lock, update the stage-2
page tables to remove all writable bits, flush all stage-2 TLB entries
for that VMID, and you should be all set.
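
To make the sequence above concrete, here is a toy user-space model in C.
It is only a sketch under stated assumptions: struct toy_vm, the TOY_S2_*
bits and toy_flush_vmid_tlbs() are all hypothetical stand-ins, not kernel
or hardware definitions, and the "lock" is just a flag so the ordering is
visible.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor bits for this toy model, not the real layout. */
#define TOY_S2_WRITABLE (1u << 0)
#define TOY_S2_VALID    (1u << 1)

struct toy_vm {
    bool mmu_locked;          /* stands in for holding the MMU lock */
    uint32_t *entries;        /* stands in for stage-2 leaf entries */
    size_t nr_entries;
    unsigned int tlb_flushes; /* counts whole-VMID TLB flushes */
};

/* Stand-in for "flush all stage-2 TLB entries for this VMID". */
static void toy_flush_vmid_tlbs(struct toy_vm *vm)
{
    vm->tlb_flushes++;
}

/*
 * Clear the writable bit on every valid entry while "holding" the lock,
 * then flush once for the whole VM. Returns how many entries changed.
 */
static size_t toy_write_protect_vm(struct toy_vm *vm)
{
    size_t changed = 0;

    vm->mmu_locked = true;
    for (size_t i = 0; i < vm->nr_entries; i++) {
        uint32_t e = vm->entries[i];

        if ((e & TOY_S2_VALID) && (e & TOY_S2_WRITABLE)) {
            vm->entries[i] = e & ~TOY_S2_WRITABLE;
            changed++;
        }
    }
    toy_flush_vmid_tlbs(vm);
    vm->mmu_locked = false;

    return changed;
}
```

The point of the single flush at the end is that stale writable
translations must be gone before the lock is dropped; nothing here says
how the real stage-2 walk would be structured.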
>
>
> Performance may also be another reason, but that always depends
> on the load; clearing a PMD seems easier and reuses current code.
> Several load tests/benchmarks can probably help here.
> I also noticed that the hw PMD/PTE attributes differ a little, which
> is not significant now, but moving forward, different page sizes
> and any new revisions to fields may require additional maintenance.
I think clearing out all PMD mappings will cause a significant
performance degradation on the source VM, which is quite unfortunate
if you keep it running. Hint: page faults are expensive, and huge pages
have been shown to give about a 10-15% performance increase on ARMv7
for CPU/memory intensive benchmarks.
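
For scale, the alternative raised earlier in the thread (track a 2M huge
page as 512 4K bits and only act on a write fault) comes down to simple
bitmap arithmetic. A minimal sketch in C, assuming 4K pages and 2M huge
pages; the TOY_* constants and both helpers are hypothetical names made up
for this example, not kernel APIs.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT    12   /* 4K pages (assumption) */
#define TOY_PMD_SHIFT     21   /* 2M huge pages (assumption) */
#define TOY_PTES_PER_PMD  (1ul << (TOY_PMD_SHIFT - TOY_PAGE_SHIFT)) /* 512 */
#define TOY_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Record a write fault at fault_gpa in a dirty bitmap whose bit 0
 * corresponds to the 4K page starting at base_gpa.
 */
static void toy_mark_page_dirty(unsigned long *bitmap, uint64_t base_gpa,
                                uint64_t fault_gpa)
{
    uint64_t page = (fault_gpa - base_gpa) >> TOY_PAGE_SHIFT;

    bitmap[page / TOY_BITS_PER_LONG] |= 1ul << (page % TOY_BITS_PER_LONG);
}

/* Count the dirty 4K pages among the first nr_pages bits. */
static size_t toy_count_dirty(const unsigned long *bitmap, size_t nr_pages)
{
    size_t dirty = 0;

    for (size_t i = 0; i < nr_pages; i++) {
        if (bitmap[i / TOY_BITS_PER_LONG] & (1ul << (i % TOY_BITS_PER_LONG)))
            dirty++;
    }
    return dirty;
}
```

With this layout a single write to a huge page dirties one bit out of
512, so only 4K needs to be resent rather than the full 2M.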
>
> I'll be out next week and back on the 26th. I'll create a link with
> details on the test environment and tests. The cover letter will
> go through a general overview only.
>
ok, I have some time then.
-Christoffer
>
> >
> >>
> >> If your use case is HA, then you will be doing this a lot, and you don't
> >> want to hurt performance of your main live system more than necessary.
> >
> >>
> >>> + * huge pages. Called during WP of entire VM address space, done
> >>
>
>