git.vger.kernel.org archive mirror
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Michael Herrmann <michael@herrmann.io>
Cc: Junio C Hamano <gitster@pobox.com>,
	"brian m. carlson" <sandals@crustytoothpaste.net>,
	git@vger.kernel.org
Subject: Re: A puzzle: reset --hard and hard links
Date: Tue, 25 Jan 2022 12:33:56 +0100
Message-ID: <220125.865yq8ghae.gmgdl@evledraar.gmail.com>
In-Reply-To: <CABrKpmDjrTPhL_55YaXEAVTEmu8iZEsKUJYab7OgK0=w9d_7MA@mail.gmail.com>


On Mon, Jan 24 2022, Michael Herrmann wrote:

> Thank you for your explanations Junio. This is the first part where I differ:
>
>> $ ln -f a b
>
> My hard link is outside the repo. In your example, it makes sense that
> Git has to sever the hard link to be able to give the files different
> contents. In my case and example, this complication is not present.
> And it does not address the main point:
>
> My working tree is clean. `git reset --hard HEAD` (not HEAD^ like you
> had) should not do anything.
>
> Finally, your (kind!) explanation does not give a reason why calling
> `git status` should change the behavior that Git unnecessarily severs
> the hard link.
>
> My suspicion is that Git keeps a cache of the stat(...) result of
> files. An additional hard link increases the .st_nlink count of this
> struct. `git reset` compares the cached stat(...) values to the actual
> ones and sees that one has changed. `git status` does the same but is
> smart enough to realize that the additional hard link does not change
> anything. It writes this to the cache. `git reset` should also be
> smart!
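
The st_nlink effect described in the quoted text can be observed outside
of Git. Below is a minimal sketch, assuming a POSIX-like filesystem that
supports hard links. The function name is hypothetical, and the code only
shows what stat() reports; it makes no claim about which stat fields
Git's index actually compares.

```python
import os
import tempfile


def link_count_before_after(directory):
    """Create a file, add a hard link to it, and report what stat() sees.

    Returns (nlink_before, nlink_after, same_inode).
    """
    original = os.path.join(directory, "a")
    extra = os.path.join(directory, "b")
    with open(original, "w") as f:
        f.write("content\n")

    before = os.stat(original)
    os.link(original, extra)  # second name for the same inode
    after = os.stat(original)

    same_inode = after.st_ino == os.stat(extra).st_ino
    return before.st_nlink, after.st_nlink, same_inode


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(link_count_before_after(d))
```

The file content is untouched by the extra link; only the metadata
returned by stat() differs between the two calls.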

What you're observing is that we tweak the index when various commands
are run. Some of that behavior is documented; other parts we consider
purely implementation details. Whether we sever a hard link relationship
is definitely on the "implementation detail" side of that.

I.e. that you can observe a behavior difference here doesn't mean that
it's a bug, it means that you're poking at behavior that was never
supposed to work this way, or be stable.

That being said, I don't see a reason why we shouldn't ever support
what you're requesting here in some way. E.g. when we spin up a
different 'git worktree' on the same storage we could optionally
hardlink to an existing checkout to save space.

This would be useful e.g. for spinning up a bunch of trees to run
compilations on, where much of the checkout tree will be duplicated.

And this probably won't match your use-case, but I wonder how far you
could get with the post-checkout hook, i.e. to have it run around after
a checkout and fix up things that aren't hard links to be hardlinked
appropriately.

I don't know of a tool to take two directories and hardlink things where
possible, but it wouldn't be hard to write. I thought rsync could, but
it appears just to support copying things as hardlinks, not "fixing"
files with the same content to be hardlinks after the fact (but maybe
I've just missed a way to operate it).
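
A rough sketch of such a tool, which could in principle be run from a
post-checkout hook, might look like the following. Everything here is an
assumption for illustration: the function name and the ".hardlink-tmp"
suffix are made up, it only pairs files at the same relative path, and it
skips anything named ".git" so that repository metadata is never linked.

```python
import filecmp
import os


def hardlink_identical(source_dir, target_dir):
    """Replace files in target_dir with hard links to byte-identical
    files at the same relative path in source_dir.

    Returns the list of relative paths that were linked.
    """
    linked = []
    for root, dirs, files in os.walk(source_dir):
        # Never descend into, or touch, Git's own metadata.
        dirs[:] = [d for d in dirs if d != ".git"]
        for name in files:
            if name == ".git":
                continue
            src = os.path.join(root, name)
            rel = os.path.relpath(src, source_dir)
            dst = os.path.join(target_dir, rel)
            if os.path.islink(src) or os.path.islink(dst):
                continue
            if not os.path.isfile(dst):
                continue
            s, d = os.stat(src), os.stat(dst)
            if s.st_dev != d.st_dev:
                continue  # hard links cannot cross filesystems
            if s.st_ino == d.st_ino:
                continue  # already the same inode
            if s.st_mode != d.st_mode:
                continue  # a shared inode would share permissions too
            if not filecmp.cmp(src, dst, shallow=False):
                continue  # contents differ
            # Link under a temporary name, then rename over the target,
            # so the target path never goes missing.
            tmp = dst + ".hardlink-tmp"
            os.link(src, tmp)
            os.replace(tmp, dst)
            linked.append(rel)
    return linked
```

Note that once two paths share an inode, a program that edits one of them
in place changes the other as well, so this only makes sense for trees
whose files are replaced rather than modified in place.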
