From: Gleb Natapov <gleb@redhat.com>
To: Gregory Haskins <gregory.haskins@gmail.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
avi@redhat.com, "kvm@vger.kernel.org" <kvm@vger.kernel.org>,
paulmck@linux.vnet.ibm.com
Subject: Re: [PATCH 4/4] Convert irq notifiers lists to RCU locking.
Date: Mon, 13 Jul 2009 17:08:49 +0300
Message-ID: <20090713140849.GQ28046@redhat.com>
In-Reply-To: <4A5B3E65.8060309@gmail.com>
On Mon, Jul 13, 2009 at 10:02:13AM -0400, Gregory Haskins wrote:
> Gleb Natapov wrote:
> > On Mon, Jul 13, 2009 at 09:40:01AM -0400, Gregory Haskins wrote:
> >
> >> Gleb Natapov wrote:
> >>
> >>> On Mon, Jul 13, 2009 at 09:26:21AM -0400, Gregory Haskins wrote:
> >>>
> >>>
> >>>> Gleb Natapov wrote:
> >>>>
> >>>>
> >>>>> On Mon, Jul 13, 2009 at 04:02:56PM +0300, Michael S. Tsirkin wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>>> On Sun, Jul 12, 2009 at 03:03:53PM +0300, Gleb Natapov wrote:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>> Use RCU locking for mask/ack notifiers lists.
> >>>>>>>
> >>>>>>> Signed-off-by: Gleb Natapov <gleb@redhat.com>
> >>>>>>> ---
> >>>>>>> virt/kvm/irq_comm.c | 20 +++++++++++---------
> >>>>>>> 1 files changed, 11 insertions(+), 9 deletions(-)
> >>>>>>>
> >>>>>>> diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
> >>>>>>> index 5dde1ef..ba3a115 100644
> >>>>>>> --- a/virt/kvm/irq_comm.c
> >>>>>>> +++ b/virt/kvm/irq_comm.c
> >>>>>>> @@ -179,18 +179,18 @@ void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned pin)
> >>>>>>>  			break;
> >>>>>>>  		}
> >>>>>>>  	}
> >>>>>>> -	rcu_read_unlock();
> >>>>>>>
> >>>>>>> -	hlist_for_each_entry(kian, n, &kvm->irq_ack_notifier_list, link)
> >>>>>>> +	hlist_for_each_entry_rcu(kian, n, &kvm->irq_ack_notifier_list, link)
> >>>>>>>  		if (kian->gsi == gsi)
> >>>>>>>  			kian->irq_acked(kian);
> >>>>>>> +	rcu_read_unlock();
> >>>>>>>  }
> >>>>>>>
> >>>>>>>  void kvm_register_irq_ack_notifier(struct kvm *kvm,
> >>>>>>>  				   struct kvm_irq_ack_notifier *kian)
> >>>>>>>  {
> >>>>>>>  	mutex_lock(&kvm->irq_lock);
> >>>>>>> -	hlist_add_head(&kian->link, &kvm->irq_ack_notifier_list);
> >>>>>>> +	hlist_add_head_rcu(&kian->link, &kvm->irq_ack_notifier_list);
> >>>>>>>  	mutex_unlock(&kvm->irq_lock);
> >>>>>>>  }
> >>>>>>>
> >>>>>>> @@ -198,8 +198,9 @@ void kvm_unregister_irq_ack_notifier(struct kvm *kvm,
> >>>>>>>  				     struct kvm_irq_ack_notifier *kian)
> >>>>>>>  {
> >>>>>>>  	mutex_lock(&kvm->irq_lock);
> >>>>>>> -	hlist_del_init(&kian->link);
> >>>>>>> +	hlist_del_init_rcu(&kian->link);
> >>>>>>>  	mutex_unlock(&kvm->irq_lock);
> >>>>>>> +	synchronize_rcu();
> >>>>>>>  }
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>> This is done under kvm->lock still, which means the lock might be held
> >>>>>> potentially for a very long time. Can synchronize_rcu be moved out of
> >>>>>> this lock?
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>> Only if kvm_free_assigned_device() is moved out of this lock.
> >>>>> Device de-assignment is not a very frequent event, though. How long do
> >>>>> you think it may be held? KVM RCU read sections are very brief.
> >>>>>
> >>>>>
> >>>>>
> >>>> Note that the delay imposed by the barrier is not only related to the
> >>>> length of the critical section. The barrier blocks until the next grace
> >>>> period, and depending on the type of RCU you are using and your config
> >>>> options, this could be multiple milliseconds.
> >>>>
> >>>> I am not saying that this is definitely a problem for your design. I
> >>>> am just pointing out that the length of the KVM-RCU read section is only
> >>>>
> >>>>
> >>> Yeah, I understand that other RCU read sections may introduce delays too.
> >>> The question is how big the delay may be.
> >>>
> >> I think you are misunderstanding me. The read-side CS is not a
> >> significant factor here so I am not worried about concurrent read-side
> >> CS causing a longer delay. What I am saying is that the grace period of
> >> your RCU subsystem is the dominant factor in the equation here, and this
> >> may be several milliseconds.
> >>
> >>
> > How is the "grace period" determined? Doesn't it just mean "no CPU is
> > in an RCU read section anymore"?
> >
>
> Nope ;)
>
Now I recall something about each CPU passing through the scheduler. Thanks.
> RCU is pretty complex, so I won't even try to explain it here as there
> are numerous articles floating around out there that do a much better job.
>
> But here is a summary: RCU buys you two things: 1) concurrent readers
> *and* writers, and 2) a much lower overhead reader path because it
> generally doesn't use atomics. It's point (2) that is relevant here.
>
> If taking an atomic were ok, you could approximate the RCU model using
> reference counting. Reference counting buys you "precise" resource
> acquisition/release at the expense of the overhead of the atomic
> operation (and any associated cache-line bouncing). RCU uses an
> "imprecise" model where we don't really know the *exact* moment the
> resource is released. Instead, there are specific boundaries in time
> when we can guarantee that it had to have been released prior to the
> expiry of the event. This event is what is called the "grace period".
>
> So that is what synchronize_rcu() is doing. It's a barrier to the next
> imprecise moment in time when we can be assured (if you used the rest of
> the RCU API properly) that there cannot be any outstanding references
> to your object left in flight. Each grace period can be milliseconds,
> depending on what version of the kernel you have and how it is configured.
>
> HTH
>
> Kind Regards,
> -Greg
>
> >
> >>> I don't think multiple
> >>> milliseconds delay in device de-assignment is a big issue though.
> >>>
> >>>
> >> I would tend to agree with you. It's not fast path.
> >>
> >> I only brought this up because I saw your design being justified
> >> incorrectly: you said "KVM RCU read sections are very brief", but that
> >> is not really relevant to Michael's point. I just want to make sure
> >> that the true impact is understood.
> >>
> >> Kind Regards,
> >> -Greg
> >>
> >>
> >>
> >
> >
> >
> > --
> > Gleb.
> >
>
>
--
Gleb.
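
For anyone reading this thread later, the point about grace periods can be
sketched with a toy model in plain userspace C. This is not how the kernel
implements RCU, and every name in it (toy_rcu, toy_quiescent_state,
toy_ticks_until_done, TOY_NR_CPUS) is hypothetical. It assumes a simplified
scheme in which a grace period ends once every simulated CPU has passed
through at least one quiescent state (for example a context switch) after the
writer started waiting. No reader is inside a critical section at any point
in the model, yet the writer still waits for the slowest CPU.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 4

/* Toy model only: one counter per simulated CPU. */
struct toy_rcu {
	unsigned long qs_count[TOY_NR_CPUS]; /* quiescent states seen so far */
	unsigned long snapshot[TOY_NR_CPUS]; /* counters when the wait began */
};

/* A simulated CPU passes through a quiescent state (e.g. the scheduler). */
static void toy_quiescent_state(struct toy_rcu *rcu, int cpu)
{
	rcu->qs_count[cpu]++;
}

/* Writer side: remember where every CPU stood when the wait started. */
static void toy_grace_period_start(struct toy_rcu *rcu)
{
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		rcu->snapshot[cpu] = rcu->qs_count[cpu];
}

/* Done only when *every* CPU has moved past its snapshot. */
static bool toy_grace_period_done(const struct toy_rcu *rcu)
{
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if (rcu->qs_count[cpu] == rcu->snapshot[cpu])
			return false;
	return true;
}

/*
 * Simulate a blocking wait: CPU i reaches a quiescent state every
 * period[i] ticks. Returns how many ticks the writer had to wait.
 */
static unsigned long toy_ticks_until_done(struct toy_rcu *rcu,
					  const unsigned period[TOY_NR_CPUS])
{
	unsigned long tick = 0;

	toy_grace_period_start(rcu);
	while (!toy_grace_period_done(rcu)) {
		tick++;
		for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
			assert(period[cpu] != 0);
			if (tick % period[cpu] == 0)
				toy_quiescent_state(rcu, cpu);
		}
	}
	return tick;
}
```

In this model, calling toy_ticks_until_done() with periods {1, 1, 1, 1}
returns after a single tick, while periods {1, 2, 3, 10} make the writer wait
ten ticks, even though no read-side section was ever open. That is the sense
in which the wait is governed by the grace period rather than by how brief
the read-side sections are.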