From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from cloud.peff.net (cloud.peff.net [104.130.231.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5374A154BE3 for ; Thu, 22 Feb 2024 17:22:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=104.130.231.41 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708622582; cv=none; b=hdlOeNTZIjpB4R3X/wHbmQYwP0HYNUbHPaTyC7SSKkTc/Q0SF6cAo5vzS+/OL+/LAkrl3aZscBKV7aXMYbfZnebbVQbroCHSPYPKnuxlOovlYpOGpe9XHVjo3Fj2mASbqtlT79n42XwMB/ae7i1NCPm3wDtlCyH/izxoaAaMc5o= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708622582; c=relaxed/simple; bh=kLqdPJ4SiWweh6w7v2ftW9RtWN5HjrXRUNbsTqqSehA=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=T24RjXy6COVFvBN2czOmR2so8cEZg1Q4k/6ivDbosGehf2dCvjQ/OQ8LVOP4zTOq9REqejvOpAY+dr5VMwhynjMNVuwQNa6TRuHissFGp0L9z8tj/K8zpwv5IPr4LM0cB2MikA7WB/dKqK9HEAKcdXYXk///HFdoNOj/6picxrQ= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=peff.net; spf=pass smtp.mailfrom=peff.net; arc=none smtp.client-ip=104.130.231.41 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=peff.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=peff.net Received: (qmail 25283 invoked by uid 109); 22 Feb 2024 17:22:53 -0000 Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with ESMTP; Thu, 22 Feb 2024 17:22:53 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 19416 invoked by uid 111); 22 Feb 2024 17:22:54 -0000 Received: from coredump.intra.peff.net (HELO coredump.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Thu, 22 Feb 2024 12:22:54 -0500 Authentication-Results: peff.net; auth=none Date: Thu, 22 Feb 2024 12:22:52 -0500 From: Jeff King To: Junio C Hamano Cc: Patrick Steinhardt , Yasushi SHOJI , Denton Liu , Git Mailing List Subject: Re: Segfault: git show-branch --reflog refs/pullreqs/1 Message-ID: <20240222172252.GA3535450@coredump.intra.peff.net> References: <20240221084250.GA25385@coredump.intra.peff.net> Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: On Thu, Feb 22, 2024 at 08:32:06AM -0800, Junio C Hamano wrote: > > Hum, I dunno. I don't really understand what the benefit of this > > fallback is. If a user wants to know the latest object ID of the ref > > they shouldn't ask for `foo@{0}`, they should ask for `foo`. On the > > other hand, if I want to know "What is the latest entry in the ref's > > log", I want to ask for `foo@{0}`. > > The usability hack helps small things like "List up to 4 most recent > states from a branch", e.g. > > for nth in $(seq 0 3) > do > git rev-parse --quiet --verify @$nth || break > git show -s --format="@$nth %h %s" @$nth > done > > vs > > for rev in HEAD @{1} @{2} @{3} > do > git rev-parse --quiet --verify "$rev" || break > git show -s --format="$rev %h %s" "$rev" > done > > by not forcing you to special case the "current". In those examples, though, it is useful precisely because you _do_ have a reflog, and ref@{0} is conceptually the top entry (which brought us to the same state as just "ref"). The question to me is more "is ref@{0} useful on its own, even when you do not necessarily have a reflog". That I am less sure of. > Ideally, "foo@{0}" should have meant "the state immediately before > the current state of foo" so that "foo" is the unambiguous and only > way to refer to "the current state of foo", but that was not how we > implemented the reflog, allowing a subtle repository corruption > where the latest state of a branch according to the reflog and the > current commit pointed by the branch can diverge. But that wasn't > what we did, and instead both "foo@{0}" and "foo" mean to refer to > "the latest state of foo". We can take advantage of that misdesign > and allow "foo@{0}" to refer to the same commit as "foo", at least > at the get_oid_basic() level, whether a reflog actually exists or > not, and that would make the whole thing more consistent. I think there is some confusion here between how get_oid_basic() behaves and how read_ref_at() is used for something like show-branch. In the former case, we only care about getting an oid as output, but in the latter we actually want the reflog entry (because we care about its timestamp, message, and so on). So in terms of reflog entries, ref@{0} should refer to the most recent entry. And the oid it returns should be the end-result of that entry, which (in a non corrupted repository) is identical to the current ref value. And that "should" is reinforced by stuff like: git log -g "%gd %gs" which shows the most recent entry as HEAD@{0}. I think 6436a20284 (refs: allow @{n} to work with n-sized reflog, 2021-01-07) confused things mightily by having read_ref_at() with a count of "n" find entry "n-1" instead, and then return the oid for the "old" value. That makes get_oid_basic() work, because it doesn't care about which entry we found, only the oid. But for show-branch, now we are confused about which reflog entry ref@{1}, etc, refers to (but ref@{0} still works because of the weird special-casing done by that commit). I think we should fix that (and I have the start of some patches to do so). But in that world-view, having read_ref_at() return anything for a count of "0" when there is no reflog does not make sense. There is no such entry! OTOH, we face the same problem when asking about ref@{N} when there are only N entries. We can provide an oid (based on the "old" value from the oldest entry we did see), but we have to "fake" the reflog entry data (like the messsage), since there wasn't one. So the open questions to me are: - should this faking happen in read_ref_at(), just returning a dummy reflog message? Or should we keep read_ref_at() purely about finding the entry, and put the special-casing into get_oid_basic(), which only cares about the oid result? - wherever we put the faking, should we only fake ref@{N} when N > 0? Or should we also fake ref@{0} when there is no reflog at all? If none of this makes sense, it is because I am only now untangling what is going on with 6436a20284. ;) I will try to polish my proposed patches and hopefully that will explain it a bit more clearly (I may not get to it until tomorrow though). -Peff