From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: ZheNing Hu <adlternative@gmail.com>
Cc: Git List <git@vger.kernel.org>, Derrick Stolee <stolee@gmail.com>,
Junio C Hamano <gitster@pobox.com>,
Christian Couder <christian.couder@gmail.com>,
Jeff King <peff@peff.net>
Subject: Re: Question: What's the best way to implement directory permission control in git?
Date: Thu, 28 Jul 2022 17:50:14 +0200
Message-ID: <220728.861qu5kz2c.gmgdl@evledraar.gmail.com>
In-Reply-To: <CAOLTT8QpYzoKDq6Pf8+YegCWngogy=3hUf-SyV180kntgucMpQ@mail.gmail.com>
On Thu, Jul 28 2022, ZheNing Hu wrote:
> Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote on Wed, Jul 27, 2022 at 17:20:
>>
>>
>> On Wed, Jul 27 2022, ZheNing Hu wrote:
>>
>> > if there is a monorepo such as
>> > git@github.com:derrickstolee/sparse-checkout-example.git
>> >
>> > There are many files and directories:
>> >
>> > client/
>> >     android/
>> >     electron/
>> >     iOS/
>> > service/
>> >     common/
>> >     identity/
>> >     list/
>> >     photos/
>> > web/
>> >     browser/
>> >     editor/
>> >     friends/
>> > boostrap.sh
>> > LICENSE.md
>> > README.md
>> >
>> > Now we can use partial clone + sparse-checkout to reduce
>> > network overhead and disk usage, which is good.
>> >
>> > But I also need an ACL to control which directories or files people can fetch/push.
>> > e.g. I don't want a client to fetch the code in "service" or "web".
>> >
>> > Now if the client runs "git log -p" or "git sparse-checkout add service"
>> > or another git command, git will download the objects automatically via
>> > "git fetch --filter=blob:none --stdin <oid>".
>> >
>> > This means that the git client and server exchange git objects
>> > (and don't care about paths), so we cannot simply ban someone from
>> > downloading a "path" on the server side.
>> >
>> > What should I do? You may recommend that I use submodules,
>> > but due to their complexity, I don't really want to use them :-(
>>
>> There isn't a way to do this in git.
>>
>> It's theoretically possible, i.e. a client could be told that the SHA-1
>> of a directory is XYZ, and construct a commit object with a reference to
>> it.
>>
>
> I guess you mean using a special reference to hold the restricted paths that
> the client can access, with a pre-receive-hook banning the client from downloading
> other references. But this method is a little weird... How can this reference
> stay in sync with the main branches? If we change a client's permission to access
> the server directory, how does it get the "history" of the server directory?
>
> I believe this approach is not very appropriate and is not maintainable.

It's not maintainable at all, and I don't believe any current git client
supports this.

But because git's commits refer to a Merkle tree, I can tell you that
a subdirectory "secret-stuff" has a current tree SHA-1 of XYZ, without
giving you any of that content.

You *could* then manually construct a commit like:

    tree <NEW_TREE>
    ...

Where the "<NEW_TREE>" would be a tree like:

    100644 blob <NEW-BLOB-SHA1> UPDATED.md
    040000 tree <XYZ> secret-stuff

And send a PACK with the three new objects (commit, blob &
new top-level NEW_TREE). To the remote end & protocol it wouldn't be
distinguishable from a "normal" push.
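
To make the Merkle-tree point concrete, here is a minimal Python sketch of the object-ID arithmetic only, assuming the SHA-1 object format. The subtree ID, author identity, timestamp and file content are made-up placeholders, and the parent line is left out for brevity. It doesn't write a packfile or talk to a server; it just shows that the new tree and commit IDs can be computed from the subtree's ID alone, without any of its content.

```python
import hashlib


def git_object_id(obj_type: str, body: bytes) -> str:
    """Return the SHA-1 object ID git assigns to an object of this type and body."""
    header = f"{obj_type} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


def tree_entry(mode: str, name: str, oid_hex: str) -> bytes:
    """Serialize one raw tree entry: "<mode> <name>\\0" followed by the 20-byte ID."""
    return mode.encode() + b" " + name.encode() + b"\0" + bytes.fromhex(oid_hex)


# Placeholder: the ID of the hidden subtree, which is all the client was told.
SECRET_TREE_ID = "1234567890abcdef1234567890abcdef12345678"

# The one file the client actually changes (placeholder content).
blob_body = b"updated\n"
blob_id = git_object_id("blob", blob_body)

# Entries are sorted by name; "UPDATED.md" sorts before "secret-stuff".
# Raw tree objects store the directory mode as "40000" (shown as 040000 by ls-tree).
tree_body = (
    tree_entry("100644", "UPDATED.md", blob_id)
    + tree_entry("40000", "secret-stuff", SECRET_TREE_ID)
)
tree_id = git_object_id("tree", tree_body)

# Placeholder identity and timestamp; a real commit would also carry a parent line.
commit_body = (
    f"tree {tree_id}\n"
    "author A U Thor <author@example.com> 1659023414 +0200\n"
    "committer A U Thor <author@example.com> 1659023414 +0200\n"
    "\n"
    "Update UPDATED.md without having secret-stuff locally\n"
).encode()
commit_id = git_object_id("commit", commit_body)

print("blob  ", blob_id)
print("tree  ", tree_id)
print("commit", commit_id)
```

The three printed IDs correspond to the three new objects; the content behind `SECRET_TREE_ID` is never read.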

But nothing supports this today. As a practical matter, most of git
either dies hard if content is missing, or has other odd edge-case
semantics (and I'm not up-to-date on the state of the art).

Anyway, just saying that for the longer term I'm not aware of an
*intrinsic* reason why we couldn't support this sort of thing, in
case anyone's interested in putting in a *lot* of leg work to make it
happen.

>> But currently a *lot* of things in the client code assume that these
>> things will be available in one way or another.
>>
>> The state-of-the-art in the "sparse" code may differ from the above, I
>> don't know.
>>
>> Also note that there's a well-known edge case in the git protocol where
>> it's really incompatible with the notion of "secret" data, i.e. even if
>> you hide a ref you'll be able to "guess" it by seeing what delta(s) the
>> server will produce or accept etc.
>
> Yeah, there are data security issues... unless we isolate objects
> between directories, or disable deltas in this case.
> Okay, this seems a little strange.

You can't really just "disable the delta(s)". Well, you can in
principle, but like what I outlined above it's one of those things
that's a long way off. It's one thing to e.g. have a client that's
able to craft a commit referring to data it doesn't have.

It's quite another to secure a server in such a way that it can serve up
secret data from the repo to some clients, but not to others.

I can imagine some hacks to make that happen, but I won't go into that
here...