From mboxrd@z Thu Jan 1 00:00:00 1970
From: Michael Haggerty
Subject: Re: [PATCH 2/2] lock_packed_refs(): allow retries when acquiring the packed-refs lock
Date: Mon, 11 May 2015 12:26:23 +0200
Message-ID: <555083CF.8010205@alum.mit.edu>
References: <1430491977-25817-1-git-send-email-mhagger@alum.mit.edu>
 <1430491977-25817-3-git-send-email-mhagger@alum.mit.edu>
 <20150501182257.GA27728@peff.net>
 <55445E60.6010205@alum.mit.edu>
 <20150505192110.GD10463@peff.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Cc: Stefan Beller, Junio C Hamano, "git@vger.kernel.org"
To: Jeff King
In-Reply-To: <20150505192110.GD10463@peff.net>
Sender: git-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: git@vger.kernel.org

On 05/05/2015 09:21 PM, Jeff King wrote:
> On Sat, May 02, 2015 at 07:19:28AM +0200, Michael Haggerty wrote:
>
>> 100 ms seems to be considered an acceptable delay between the time that
>> a user, say, clicks a button and the time that the button reacts. What
>> we are talking about is the time between the release of a lock by one
>> process and the resumption of another process that was blocked waiting
>> for the lock. The former is probably not under the control of the user
>> anyway, and perhaps not even observable by the user. Thus I don't think
>> that a perceivable delay between that event and the resumption of the
>> blocked process would be annoying. The more salient delay is between the
>> time that the user started the blocked command and when that command
>> completed. Let's look in more detail.
>
> Yeah, you can't impact when the other process will drop the lock, but if
> we assume that it takes on the order of 100ms for the other process to
> do its whole operation, then on average we experience half that. And
> then tack on to that whatever time we waste in sleep() after the other
> guy drops the lock. And that's on average half of our backoff time.
>
> So something like 100ms max backoff makes sense to me, in that it keeps
> us in the same order of magnitude as the expected time that the lock is
> held. [...]

I don't understand your argument. If another process blocks us for on
the order of 100 ms, the backoff time (reading from my table) is less
than half of that. It is only if another process blocks us for longer
that our backoff times grow larger than 100 ms. I don't see the point of
comparing those larger backoff numbers to hypothetical 100 ms expected
blocking times when the larger backoffs *can only happen* for larger
blocking times [1].

But even aside from bikeshedding about which backoff algorithm might be
a tiny bit better than another, let's remember that these locking
conflicts are INCREDIBLY RARE in real life. Current git doesn't have any
retry at all, but users don't seem to be noticeably upset.

In a moment I will submit a re-roll, changing the test case to add the
"wait" that Johannes suggested but leaving the maximum backoff time
unchanged. If anybody feels strongly about changing it, go ahead and do
so (or make it configurable). I like the current setting because I think
it makes more sense for servers, which is the only environment where
lock contention is likely to occur with any measurable frequency.

Michael

[1] For completeness, let's also consider a different scenario: Suppose
the blocking is not being caused by a single long-lived process but
rather by many short-lived processes running one after the other.
In that case the time we spend blocking depends more on the duty cycle
of other blocking processes, so our backoff time could grow to be longer
than the mean time that any single process holds the lock. But in this
scenario we are throughput-limited rather than latency-limited, so our
success in acquiring the lock sooner only deprives another process of
the lock, not significantly improving the throughput of the system as a
whole. (And given that the other processes are probably following the
same rules as we are, the shorter backoff times are just as often
helping them snatch the lock from us as us from them.)

-- 
Michael Haggerty
mhagger@alum.mit.edu
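
The relationship between blocking time and backoff time argued above can be illustrated with a small simulation. This is only a sketch and not the code from the patch: the quadratic schedule (1, 4, 9, ... ms, capped at a maximum), the absence of random jitter, and the names `backoff_delays`, `wait_for_lock`, `initial_ms`, `max_ms`, `hold_ms` and `timeout_ms` are all assumptions made for illustration. The lock is modelled as held by one other process from t=0 until t=`hold_ms`.

```python
from itertools import count


def backoff_delays(initial_ms=1, max_ms=1000):
    """Yield sleep durations in ms: initial_ms * n^2, capped at max_ms.

    The quadratic schedule is an assumption for this sketch.
    """
    for n in count(1):
        yield min(initial_ms * n * n, max_ms)


def wait_for_lock(hold_ms, timeout_ms, initial_ms=1, max_ms=1000):
    """Simulate retrying a lock that is held from t=0 until t=hold_ms.

    We try at t=0 and again after each sleep. Returns a tuple
    (acquired_at_ms, overshoot_ms), where overshoot_ms is the time spent
    asleep after the lock was already free, or None if timeout_ms passes
    before an attempt succeeds.
    """
    now = 0
    for delay in backoff_delays(initial_ms, max_ms):
        if now >= hold_ms:
            return now, now - hold_ms
        if now >= timeout_ms:
            return None
        now += delay


if __name__ == "__main__":
    for hold in (10, 100, 1000):
        print(hold, wait_for_lock(hold, timeout_ms=10000))
```

Under these assumptions, a lock held for 100 ms is acquired at 140 ms: the final sleep is 49 ms, less than half the blocking time, and 40 ms of it is wasted after the lock is released. The individual sleeps only reach 100 ms once the lock has been held for several hundred milliseconds, which is the point made in the reply.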