Git development
From: Junio C Hamano <gitster@pobox.com>
To: "Jason Newton via GitGitGadget" <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org,  Jason Newton <nevion@gmail.com>
Subject: Re: [PATCH 0/2] worktree: copy-on-write creation and shared-branch worktrees
Date: Mon, 08 Jun 2026 07:36:33 -0700
Message-ID: <xmqq8q8pxfny.fsf@gitster.g>
In-Reply-To: <pull.2317.git.git.1780685368.gitgitgadget@gmail.com> (Jason Newton via GitGitGadget's message of "Fri, 05 Jun 2026 18:49:26 +0000")

"Jason Newton via GitGitGadget" <gitgitgadget@gmail.com> writes:

> When many worktrees share one repository -- e.g. a fleet of agents each
> needing an isolated checkout -- "git worktree add" is costly at scale.
> Objects are shared via the common dir, but the working tree is not: each add
> rewrites every tracked file, so N worktrees cost N full checkouts of disk
> and I/O.

Are the "CoW" semantics offered by these underlying mechanisms,
which may differ per operating system and possibly filesystem type,
all meant as a mere storage-space optimization, or do some of them
trade the potential space savings for some limitation of features,
i.e., what you can do in the CoW copy and the original, or for an
increased runtime cost, either at clone time or at the time of the
first modification?

What I am trying to get at is why this should even be an opt-in
feature.  If "cp treeA treeB" at the shell level would give treeB
all the files in treeA under identical names and with identical
contents, and let you edit/update/delete copies in either tree
without affecting the other tree, then in practice you would not
even be able to _tell_ if CoW is in use, no?

It may tilt the scale if there is a downside associated with the use
of CoW, like at the first modification of one copy, the system may
need to do real copies of other copies, but even such a cost should
not be outrageously worse than the cost of copying everything once
at the worktree creation time.

So I would understand "whenever we say git_copy_file(A, B), we
always use the CoW facility under the hood if available, regardless of
the purpose of the operation to copy one file to another location---
it may include, but does not have to be limited to, populating
working file trees in a new worktree", and I think it is a welcome
change.

But I do not quite get "... only if the user gives --reflink
option".  Why is it even necessary to offer a choice?  Especially
since you seem to have an auto-probe, we should be able to implement
a low-level operation to materialize contents identified by a_hash
at a_path in the working tree in two different ways, switching on
the availability of CoW, e.g.,

	if (CoW available && we can find existing path with a_hash) {
		copy-cow the found path to a_path;
	} else {
		write object identified by a_hash to a_path;
	}
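
To make the "always try CoW, fall back silently" idea concrete, here
is a minimal sketch of the copy step alone.  It assumes Linux and its
FICLONE ioctl; other systems would need their own mechanism, and the
lookup of an existing path with a_hash is left out.  The names
copy_cow_or_plain() and plain_copy() are hypothetical, not existing
Git functions.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#ifdef __linux__
#include <linux/fs.h>	/* FICLONE, where the kernel headers provide it */
#endif

/* Hypothetical helper: plain byte-for-byte copy between open fds. */
static int plain_copy(int in, int out)
{
	char buf[8192];
	ssize_t n;

	while ((n = read(in, buf, sizeof(buf))) > 0) {
		ssize_t off = 0;

		while (off < n) {
			ssize_t w = write(out, buf + off, n - off);
			if (w < 0)
				return -1;
			off += w;
		}
	}
	return n < 0 ? -1 : 0;
}

/*
 * Hypothetical helper: copy src to dst, sharing blocks when the
 * filesystem supports it.  Returns 1 if the file was cloned, 0 if
 * it was copied the ordinary way, and -1 on error.
 */
int copy_cow_or_plain(const char *src, const char *dst)
{
	int in, out, ret;

	assert(src && dst);
	in = open(src, O_RDONLY);
	if (in < 0)
		return -1;
	out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0666);
	if (out < 0) {
		close(in);
		return -1;
	}
#ifdef FICLONE
	/* A failed clone writes nothing, so falling back is safe. */
	if (!ioctl(out, FICLONE, in))
		ret = 1;
	else
#endif
		ret = plain_copy(in, out);
	close(in);
	if (close(out) < 0)
		ret = -1;
	return ret;
}
```

The caller never needs to know which path was taken; either way dst
ends up with the same contents as src and can be modified without
touching src, which is why an explicit option looks unnecessary.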

>  And a branch can only be checked out in one worktree.

This is a safety feature that has nothing to do with shared files
across worktrees, no?  Your two worktrees may think they have a
checkout of the same branch (thus the same commit).  When one
worktree makes changes and commits, the other worktree suddenly
starts seeing a totally different output from its "git diff HEAD",
one that mixes what it did relative to where it started (which is
what we want) with the reversion of what was done in the other
worktree (which is definitely not what we want).
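
The effect can be shown with a toy model, which is only an
illustration and not how Git represents trees.  A "tree" here is just
the contents of two paths, and diff_against_head() is a hypothetical
stand-in for "git diff HEAD" that counts the paths differing from the
commit the shared branch points at.

```c
#include <assert.h>
#include <string.h>

#define NFILES 2	/* toy tree: a.txt and b.txt */

struct tree {
	const char *blob[NFILES];
};

/* Count paths whose working-tree contents differ from HEAD's tree. */
int diff_against_head(const struct tree *head, const struct tree *work)
{
	int i, changed = 0;

	assert(head && work);
	for (i = 0; i < NFILES; i++)
		if (strcmp(head->blob[i], work->blob[i]))
			changed++;
	return changed;
}
```

Say the branch starts at a commit with { "a0", "b0" } and the second
worktree edits only b.txt, so it holds { "a0", "b1" }; the function
reports one changed path.  Once the first worktree commits
{ "a1", "b0" } to the same branch, the second worktree's files are
untouched, yet the function reports two changed paths: its own edit
to b.txt plus an apparent revert of a.txt.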
