All of lore.kernel.org
 help / color / mirror / Atom feed
From: Gleb Natapov <gleb@redhat.com>
To: Gregory Haskins <gregory.haskins@gmail.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
	avi@redhat.com, "kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	paulmck@linux.vnet.ibm.com
Subject: Re: [PATCH 4/4] Convert irq notifiers lists to RCU locking.
Date: Mon, 13 Jul 2009 17:08:49 +0300	[thread overview]
Message-ID: <20090713140849.GQ28046@redhat.com> (raw)
In-Reply-To: <4A5B3E65.8060309@gmail.com>

On Mon, Jul 13, 2009 at 10:02:13AM -0400, Gregory Haskins wrote:
> Gleb Natapov wrote:
> > On Mon, Jul 13, 2009 at 09:40:01AM -0400, Gregory Haskins wrote:
> >   
> >> Gleb Natapov wrote:
> >>     
> >>> On Mon, Jul 13, 2009 at 09:26:21AM -0400, Gregory Haskins wrote:
> >>>   
> >>>       
> >>>> Gleb Natapov wrote:
> >>>>     
> >>>>         
> >>>>> On Mon, Jul 13, 2009 at 04:02:56PM +0300, Michael S. Tsirkin wrote:
> >>>>>   
> >>>>>       
> >>>>>           
> >>>>>> On Sun, Jul 12, 2009 at 03:03:53PM +0300, Gleb Natapov wrote:
> >>>>>>     
> >>>>>>         
> >>>>>>             
> >>>>>>> Use RCU locking for mask/ack notifiers lists.
> >>>>>>>
> >>>>>>> Signed-off-by: Gleb Natapov <gleb@redhat.com>
> >>>>>>> ---
> >>>>>>>  virt/kvm/irq_comm.c |   20 +++++++++++---------
> >>>>>>>  1 files changed, 11 insertions(+), 9 deletions(-)
> >>>>>>>
> >>>>>>> diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
> >>>>>>> index 5dde1ef..ba3a115 100644
> >>>>>>> --- a/virt/kvm/irq_comm.c
> >>>>>>> +++ b/virt/kvm/irq_comm.c
> >>>>>>> @@ -179,18 +179,18 @@ void kvm_notify_acked_irq(struct kvm *kvm, unsigned irqchip, unsigned pin)
> >>>>>>>  			break;
> >>>>>>>  		}
> >>>>>>>  	}
> >>>>>>> -	rcu_read_unlock();
> >>>>>>>  
> >>>>>>> -	hlist_for_each_entry(kian, n, &kvm->irq_ack_notifier_list, link)
> >>>>>>> +	hlist_for_each_entry_rcu(kian, n, &kvm->irq_ack_notifier_list, link)
> >>>>>>>  		if (kian->gsi == gsi)
> >>>>>>>  			kian->irq_acked(kian);
> >>>>>>> +	rcu_read_unlock();
> >>>>>>>  }
> >>>>>>>  
> >>>>>>>  void kvm_register_irq_ack_notifier(struct kvm *kvm,
> >>>>>>>  				   struct kvm_irq_ack_notifier *kian)
> >>>>>>>  {
> >>>>>>>  	mutex_lock(&kvm->irq_lock);
> >>>>>>> -	hlist_add_head(&kian->link, &kvm->irq_ack_notifier_list);
> >>>>>>> +	hlist_add_head_rcu(&kian->link, &kvm->irq_ack_notifier_list);
> >>>>>>>  	mutex_unlock(&kvm->irq_lock);
> >>>>>>>  }
> >>>>>>>  
> >>>>>>> @@ -198,8 +198,9 @@ void kvm_unregister_irq_ack_notifier(struct kvm *kvm,
> >>>>>>>  				    struct kvm_irq_ack_notifier *kian)
> >>>>>>>  {
> >>>>>>>  	mutex_lock(&kvm->irq_lock);
> >>>>>>> -	hlist_del_init(&kian->link);
> >>>>>>> +	hlist_del_init_rcu(&kian->link);
> >>>>>>>  	mutex_unlock(&kvm->irq_lock);
> >>>>>>> +	synchronize_rcu();
> >>>>>>>  }
> >>>>>>>       
> >>>>>>>           
> >>>>>>>               
> >>>>>> This is done under kvm->lock still, which means the lock might be held
> >>>>>> potentially for a very long time. Can synchronize_rcu be moved out of
> >>>>>> this lock?
> >>>>>>
> >>>>>>     
> >>>>>>         
> >>>>>>             
> >>>>> Only if kvm_free_assigned_device() will be moved out of this lock.
> >>>>> Device de-assignment is not very frequent event though. How long do you
> >>>>> think it may be held? KVM RCU read sections are very brief.
> >>>>>   
> >>>>>       
> >>>>>           
> >>>> Note that the delay imposed by the barrier is not only related to the
> >>>> length of the critical section.  The barrier blocks until the next grace
> >>>> period, and depending on the type of RCU you are using and your config
> >>>> options, this could be multiple milliseconds.
> >>>>
> >>>> I am not saying that this is definitely a problem for your design.   I
> >>>> am just pointing out that the length of the KVM-RCU read section is only
> >>>>     
> >>>>         
> >>> Yeah I understand that other RCU read section may introduce delays too.
> >>> The question is how big the delay may be.
> >>>       
> >> I think you are misunderstanding me.  The read-side CS is not a
> >> significant factor here so I am not worried about concurrent read-side
> >> CS causing a longer delay.  What I am saying is that the grace period of
> >> your RCU subsystem is the dominant factor in the equation here, and this
> >> may be several milliseconds.
> >>
> >>     
> > How is the "grace period" is determined? Isn't it just means "no cpus is
> > in RCU read section anymore"?
> >   
> 
> Nope ;)
> 
Now I recall something about each CPU passing scheduler. Thanks.

> RCU is pretty complex, so I won't even try to explain it here as there
> are numerous articles floating around out there that do a much better job.
> 
> But here is a summary:  RCU buys you two things: 1) concurrent readers
> *and* writers, and 2) a much lower overhead reader path because it
> generally doesn't use atomic.  Its point (2) that is relevant here.
> 
> If taking an atomic were ok, you could approximate the RCU model using
> reference counting.  Reference counting buys you "precise" resource
> acquistion/release at the expense of the overhead of the atomic
> operation (and any associated cache-line bouncing).  RCU uses a
> "imprecise" model where we don't really know the *exact* moment the
> resource is released.  Instead, there are specific boundaries in time
> when we can guarantee that it had to have been released prior to the
> expiry of the event.  This event is what is called the "grace period".
> 
> So that is what synchronize_rcu() is doing.  Its a barrier to the next
> imprecise moment in time when we can be assured (if you used the rest of
> the RCU API properly) that there can not be any outstanding references
> to your object left in flight.  Each grace period can be milliseconds,
> depending on what version of the kernel you have and how it is configured.
> 
> HTH
> 
> Kind Regards,
> -Greg
> 
> >   
> >>>  I don't think multiple
> >>> milliseconds delay in device de-assignment is a big issue though.
> >>>   
> >>>       
> >> I would tend to agree with you.  It's not fast path.
> >>
> >> I only brought this up because I saw your design being justified
> >> incorrectly:  you said "KVM RCU read sections are very brief", but that
> >> is not really relevant to Michael's point.  I just want to make sure
> >> that the true impact is understood.
> >>
> >> Kind Regards,
> >> -Greg
> >>
> >>
> >>     
> >
> >
> >
> > --
> > 			Gleb.
> >   
> 
> 



--
			Gleb.

  reply	other threads:[~2009-07-13 14:08 UTC|newest]

Thread overview: 41+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-07-12 12:03 [PATCH 0/4] moving irq routing and notifiers to RCU locking Gleb Natapov
2009-07-12 12:03 ` [PATCH 1/4] Move irq routing data structure to rcu locking Gleb Natapov
2009-07-13 12:55   ` Michael S. Tsirkin
2009-07-13 13:03     ` Gleb Natapov
2009-07-13 13:15       ` Michael S. Tsirkin
2009-07-13 13:23         ` Gleb Natapov
2009-07-13 13:36           ` Michael S. Tsirkin
2009-07-13 13:01   ` Gregory Haskins
2009-07-13 13:15     ` Gleb Natapov
2009-07-13 13:16       ` Gregory Haskins
2009-07-13 13:25         ` Gleb Natapov
2009-07-13 13:29           ` Gregory Haskins
2009-07-13 15:55       ` Marcelo Tosatti
2009-07-13 16:24         ` Gleb Natapov
2009-07-13 16:27           ` Marcelo Tosatti
2009-07-13 16:33             ` Gleb Natapov
2009-07-13 16:42               ` Marcelo Tosatti
2009-07-13 16:44                 ` Gleb Natapov
2009-07-13 16:45                   ` Marcelo Tosatti
2009-07-13 16:54                     ` Gleb Natapov
2009-07-12 12:03 ` [PATCH 2/4] Unregister ack notifier callback on PIT freeing Gleb Natapov
2009-07-12 12:03 ` [PATCH 3/4] Move irq ack notifier list to arch independent code Gleb Natapov
2009-07-12 12:03 ` [PATCH 4/4] Convert irq notifiers lists to RCU locking Gleb Natapov
2009-07-13 12:56   ` Michael S. Tsirkin
2009-07-13 13:05     ` Gleb Natapov
2009-07-13 13:29       ` Michael S. Tsirkin
2009-07-13 13:48     ` Gregory Haskins
2009-07-13 13:02   ` Michael S. Tsirkin
2009-07-13 13:11     ` Gleb Natapov
2009-07-13 13:26       ` Gregory Haskins
2009-07-13 13:32         ` Gleb Natapov
2009-07-13 13:40           ` Gregory Haskins
2009-07-13 13:52             ` Gleb Natapov
2009-07-13 14:02               ` Gregory Haskins
2009-07-13 14:08                 ` Gleb Natapov [this message]
2009-07-13 13:40           ` Michael S. Tsirkin
2009-07-13 13:44             ` Gregory Haskins
2009-07-13 19:31             ` Paul E. McKenney
2009-07-14  5:46               ` Gleb Natapov
2009-07-14 12:03                 ` Paul E. McKenney
2009-07-14 12:06                   ` Gleb Natapov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090713140849.GQ28046@redhat.com \
    --to=gleb@redhat.com \
    --cc=avi@redhat.com \
    --cc=gregory.haskins@gmail.com \
    --cc=kvm@vger.kernel.org \
    --cc=mst@redhat.com \
    --cc=paulmck@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.