From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Daniel P. Berrange" Subject: Re: [PATCH] tools/hotplug: fix locking Date: Tue, 26 Jun 2012 17:14:23 +0100 Message-ID: <20120626161423.GV14451@redhat.com> References: <650b03f412143c3057ab.1339608555@zhigang.us.oracle.com> <1340206463.4906.72.camel@zakaz.uk.xensource.com> <4FE20E27.8000108@oracle.com> <1340278976.21872.114.camel@zakaz.uk.xensource.com> <1340279364.21872.118.camel@zakaz.uk.xensource.com> <20120621122022.GO21124@redhat.com> <1340725992.3832.159.camel@zakaz.uk.xensource.com> Reply-To: "Daniel P. Berrange" Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Return-path: Content-Disposition: inline In-Reply-To: <1340725992.3832.159.camel@zakaz.uk.xensource.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Ian Campbell Cc: Andrew Jones , Ian Jackson , xen-devel , Don Dutile , Kouya Shimura , Zhigang Wang List-Id: xen-devel@lists.xenproject.org On Tue, Jun 26, 2012 at 04:53:12PM +0100, Ian Campbell wrote: > On Thu, 2012-06-21 at 13:20 +0100, Daniel P. Berrange wrote: > > On Thu, Jun 21, 2012 at 12:49:24PM +0100, Ian Campbell wrote: > > > I wonder if there were any followup fixes or changes, especially wrt to > > > your point 1 above (the lack of a timeout now) requiring xend (or now > > > libxl) changes? > > > > Removing any kind of timeout was in fact the point of this change. > > We do not want hotplug scripts stealing each others locks, or > > timing out, because the timeouts cause alot of problems when > > launching guests on a highly loaded machine where execution time > > can be alot longer than a timeout setting may expect. > > Agreed. > > What I meant was if there was any code in xend which had inadvertently > come to rely on the existence of the timeout which needed fixing. I'd > guess not -- it'd be a pretty strange pattern... It has been a while, but I don't recall any issues in the Xend layer in this respect. > > I expect you can't see the original BZ ticket quoted, since it > > has customer information in it. Here is my description of > > what the problem we weere solving was: > > > > [quote] > > In the normal case of a single domain booting with one disk, the disk hotplug > > script will fire once. In this case taking out the lock will never cause a sleep > > because there's no other concurrent invocations. If a domain has 4 disks > > configured, then the hotplug script will fire 4 times, all 4 invocations at > > pretty much the same time. If there is even a little load on the system, the > > locking function in the shell script will sleep for a few seconds - as many as 5 > > seconds, or potentially even time out & fail completely. > > > > If say 4 or even more domains each with 4 disks start up at once, that's 16 > > invocations of the hotplug script running at once. There will be alot of sleep's > > done & because of the very coarse 1 second granularity the delay can add up > > significantly. > > > > The change to using flock() removes the arbitrary 1 second sleep, so the very > > instant once hotplug script finishes, another can start & there is no repeated > > attempts & failures to lock which would add more delay. > > [/quote] > > Thanks, this description makes sense. I'd be inclined to use it verbatim > as the commit message. > > > Usually it was our policy to send all these kind of patches upstream, > > so I'm not really sure why this was not already merged. Possibly I > > forgot to submit it, or maybe it was rejected - I honestly can't > > remember. > > > > Below is the original patch I wrote, to which I apply: > > > > Signed-off-by: Daniel P. Berrange. > > Cheers. By my analysis the result is that the claim_lock(), > release_lock() and _setlockfd() functions (i.e. all the actual code) are > identical to those in the patch which Zhigang sent (the differences are > in the removed code, which I presume has changed since you wrote the > original patch). > > Therefore I think it is appropriate to include your S-o-b on the new > patch -- assuming you are OK with that? Yes, that's fine with me > > Also the patch is: > > Acked-by: Ian Campbell > > Ian. Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|