From: Marcelo Tosatti <mtosatti@redhat.com>
To: Takuya Yoshikawa <takuya.yoshikawa@gmail.com>
Cc: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>,
avi@redhat.com, kvm@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 3/4 changelog-v2] KVM: Switch to srcu-less get_dirty_log()
Date: Tue, 6 Mar 2012 12:01:04 -0300 [thread overview]
Message-ID: <20120306150104.GA3041@amt.cnet> (raw)
In-Reply-To: <20120306234317.2817d2071038d11ab3831c82@gmail.com>
On Tue, Mar 06, 2012 at 11:43:17PM +0900, Takuya Yoshikawa wrote:
> Marcelo Tosatti <mtosatti@redhat.com> wrote:
>
> > > + spin_lock(&kvm->mmu_lock);
> >
> > It is not clear why mmu_lock is needed. Dropping it across the xchg loop
> > should be similar to srcu implementation, in that concurrent updates
> > will be visible only on the next get_dirty call? Well, it is necessary
> > anyway for write protecting the sptes.
>
> My implementation does write protection inside the xchg loop.
> Then, after that loop, flushes TLB.
>
> mmu_lock must protect both of these together.
>
> If we do not mind scanning the bitmap twice, we can decouple the
> xchg loop and write protection, but it will be a bit slower, and in
> any case we need to hold mmu_lock until TLB is flushed.
Why is it necessary to scan twice? Simply continuing to the next set
of pages, after dropping the lock, should be enough.
The potential problem i am referring to is:
- kvm.git next + srcu-less series
average(ns) stdev ns/page pages improvement(%)
8497356.4 16441.0 32.4 256K -29
So 8ms for 1GB. Assuming it increases linearly, it would take
400ms for get_dirty on a 50GB slot (most of that time spent
with mmu_lock held). Is this correct?
> As can be seen from the unit-test result the majority of time
> is being spent on write protecting sptes, so decoupling xchg loop
> alone will not alleviate the problem so much -- my guess.
>
> > A cond_resched_lock() would alleviate the potentially long held
> > times for mmu_lock (can you measure it with large memslots?)
>
> How to move TLB flush out of mmu_lock critical sections was discussed
> before, and there seemed to be some proposals.
>
> Anyone is working on that?
>
> After that we can do many things.
>
> One idea is to make the extra bitmap buffer size shrink to one page
> or so and do xchg and write protection loop by that limited size.
>
> Because we can drop mmu_lock, it is possible to copy_to_user part of
> the dirty bitmap, and then go to the next part.
>
> After everything is protected, we can then do TLB flush after dropping
> mmu_lock.
>
> > Otherwise looks nice.
>
> Thanks,
> Takuya
>
>
> > > - r = -ENOMEM;
> > > - slots = kmemdup(kvm->memslots, sizeof(*kvm->memslots), GFP_KERNEL);
> > > - if (!slots)
> > > - goto out;
> > > + for (i = 0; i < n / sizeof(long); i++) {
> > > + unsigned long mask;
> > > + gfn_t offset;
> > >
> > > - memslot = id_to_memslot(slots, log->slot);
> > > - memslot->nr_dirty_pages = 0;
> > > - memslot->dirty_bitmap = dirty_bitmap_head;
> > > - update_memslots(slots, NULL);
> > > + if (!dirty_bitmap[i])
> > > + continue;
> > >
> > > - old_slots = kvm->memslots;
> > > - rcu_assign_pointer(kvm->memslots, slots);
> > > - synchronize_srcu_expedited(&kvm->srcu);
> > > - kfree(old_slots);
> > > + is_dirty = true;
> > >
> > > - write_protect_slot(kvm, memslot, dirty_bitmap, nr_dirty_pages);
> > > + mask = xchg(&dirty_bitmap[i], 0);
> > > + dirty_bitmap_buffer[i] = mask;
> > >
> > > - r = -EFAULT;
> > > - if (copy_to_user(log->dirty_bitmap, dirty_bitmap, n))
> > > - goto out;
> > > - } else {
> > > - r = -EFAULT;
> > > - if (clear_user(log->dirty_bitmap, n))
> > > - goto out;
> > > + offset = i * BITS_PER_LONG;
> > > + kvm_mmu_write_protect_pt_masked(kvm, memslot, offset, mask);
> > > }
> > > + if (is_dirty)
> > > + kvm_flush_remote_tlbs(kvm);
> > > +
> > > + spin_unlock(&kvm->mmu_lock);
next prev parent reply other threads:[~2012-03-06 15:01 UTC|newest]
Thread overview: 39+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-03-01 10:30 [PATCH 0/4] KVM: srcu-less dirty logging -v2 Takuya Yoshikawa
2012-03-01 10:31 ` [PATCH 1/4] KVM: MMU: Split the main body of rmap_write_protect() off from others Takuya Yoshikawa
2012-03-12 7:39 ` Takuya Yoshikawa
2012-03-12 7:52 ` Takuya Yoshikawa
2012-03-01 10:32 ` [PATCH 2/4] KVM: Avoid checking huge page mappings in get_dirty_log() Takuya Yoshikawa
2012-03-02 2:56 ` Takuya Yoshikawa
2012-03-02 5:11 ` Takuya Yoshikawa
2012-03-12 18:04 ` Avi Kivity
2012-03-13 9:20 ` Takuya Yoshikawa
2012-03-13 10:12 ` Avi Kivity
2012-03-13 23:04 ` Marcelo Tosatti
2012-03-14 1:04 ` Takuya Yoshikawa
2012-03-14 5:34 ` Takuya Yoshikawa
2012-03-14 10:58 ` Marcelo Tosatti
2012-03-01 10:33 ` [PATCH 3/4] KVM: Switch to srcu-less get_dirty_log() Takuya Yoshikawa
2012-03-03 5:21 ` [PATCH 3/4 changelog-v2] " Takuya Yoshikawa
2012-03-06 11:15 ` Marcelo Tosatti
2012-03-06 14:43 ` Takuya Yoshikawa
2012-03-06 15:01 ` Marcelo Tosatti [this message]
2012-03-06 15:23 ` Takuya Yoshikawa
2012-03-06 15:28 ` Marcelo Tosatti
2012-03-07 8:07 ` Takuya Yoshikawa
2012-03-07 23:25 ` Marcelo Tosatti
2012-03-08 1:35 ` Takuya Yoshikawa
2012-03-09 0:08 ` Marcelo Tosatti
2012-03-12 12:05 ` Avi Kivity
2012-03-07 8:18 ` Takuya Yoshikawa
2012-03-07 23:20 ` Marcelo Tosatti
2012-03-16 5:03 ` [PATCH 3/4] " Xiao Guangrong
2012-03-16 6:55 ` Takuya Yoshikawa
2012-03-16 7:30 ` Xiao Guangrong
2012-03-16 7:55 ` Takuya Yoshikawa
2012-03-16 8:28 ` Xiao Guangrong
2012-03-16 9:44 ` Takuya Yoshikawa
2012-03-19 9:34 ` Xiao Guangrong
2012-03-19 10:15 ` Takuya Yoshikawa
2012-03-01 10:34 ` [PATCH 4/4] KVM: Remove unused dirty_bitmap_head and nr_dirty_pages Takuya Yoshikawa
2012-03-03 5:12 ` [PATCH 0/4] KVM: srcu-less dirty logging -v2 Takuya Yoshikawa
2012-03-20 14:43 ` Avi Kivity
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120306150104.GA3041@amt.cnet \
--to=mtosatti@redhat.com \
--cc=avi@redhat.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=takuya.yoshikawa@gmail.com \
--cc=yoshikawa.takuya@oss.ntt.co.jp \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.