* rsync deprecated but promoted?
@ 2005-09-25 16:32 Zack Brown
2005-09-25 17:07 ` H. Peter Anvin
2005-09-25 19:06 ` Martin Coxall
0 siblings, 2 replies; 26+ messages in thread
From: Zack Brown @ 2005-09-25 16:32 UTC (permalink / raw)
To: git
Hi folks,
When I use cogito, it gives a warning saying the rsync method is deprecated and
will be removed in the future. But when I visit kernel.org/git, the page says to
use an rsync URL with cg-clone.
Maybe kernel.org should be updated?
Be well,
Zack
--
Zack Brown
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: rsync deprecated but promoted?
2005-09-25 16:32 rsync deprecated but promoted? Zack Brown
@ 2005-09-25 17:07 ` H. Peter Anvin
2005-09-25 19:06 ` Martin Coxall
1 sibling, 0 replies; 26+ messages in thread
From: H. Peter Anvin @ 2005-09-25 17:07 UTC (permalink / raw)
To: Zack Brown; +Cc: git
Zack Brown wrote:
> Hi folks,
>
> When I use cogito, it gives a warning saying the rsync method is deprecated and
> will be removed in the future. But when I visit kernel.org/git, the page says to
> use an rsync URL with cg-clone.
>
> Maybe kernel.org should be updated?
No, since it's currently the only method available to the general
public. git-daemon still needs some tweaking before I trust to enable
it; I've been meaning to do this but I've been personally very busy.
-hpa
* Re: rsync deprecated but promoted?
2005-09-25 16:32 rsync deprecated but promoted? Zack Brown
2005-09-25 17:07 ` H. Peter Anvin
@ 2005-09-25 19:06 ` Martin Coxall
2005-09-26 13:32 ` Petr Baudis
1 sibling, 1 reply; 26+ messages in thread
From: Martin Coxall @ 2005-09-25 19:06 UTC (permalink / raw)
To: Zack Brown; +Cc: git
On 25 Sep 2005, at 17:32, Zack Brown wrote:
> Hi folks,
>
> When I use cogito, it gives a warning saying the rsync method is
> deprecated and
> will be removed in the future. But when I visit kernel.org/git, the
> page says to
> use an rsync URL with cg-clone.
>
> Maybe kernel.org should be updated?
>
It does seem to be sending out a confusing message to us users too,
since an initial clone of Linus's tree with rsync is 10x faster on my
machine than an http clone.
Am I right in thinking it's because rsync didn't originally have pack
support, but now it does, Petr has simply forgotten to deprecate the
deprecation message?
Kind Regards,
Martin
* Re: rsync deprecated but promoted?
2005-09-25 19:06 ` Martin Coxall
@ 2005-09-26 13:32 ` Petr Baudis
2005-09-26 14:41 ` Brian Gerst
` (2 more replies)
0 siblings, 3 replies; 26+ messages in thread
From: Petr Baudis @ 2005-09-26 13:32 UTC (permalink / raw)
To: Martin Coxall; +Cc: Zack Brown, git
Dear diary, on Sun, Sep 25, 2005 at 09:06:37PM CEST, I got a letter
where Martin Coxall <quasi@cream.org> told me that...
> On 25 Sep 2005, at 17:32, Zack Brown wrote:
> >Hi folks,
> >
> >When I use cogito, it gives a warning saying the rsync method is
> >deprecated and
> >will be removed in the future. But when I visit kernel.org/git, the
> >page says to
> >use an rsync URL with cg-clone.
> >
> >Maybe kernel.org should be updated?
> >
>
> It does seem to be sending out a confusing message to us users too,
> since an initial clone of Linus's tree with rsync is on my machine 10x
> faster than an http clone, so it seems to be sending out something of a
> confused/confusing message re: rsync.
>
> Am I right in thinking it's because rsync didn't originally have pack
> support, but now it does, Petr has simply forgotten to deprecate the
> deprecation message?
Nope. rsync always did packs, I actually un-deprecated it for the time
period when HTTP didn't. The thing is, rsync is bad - it will happily
put duplicate, redundant, and especially unwanted data to your
repository, especially when the shared GIT repositories happen. HTTP and
git-daemon are much better access methods in this regard - actually, I
still like HTTP the most:
+ Works everywhere - no special setup, no dedicated service, firewalls
and proxies won't stop it
+ Works properly, i.e. only getting stuff you want, unlike rsync
+ Replicates packs setup - would be even better if it would kill objects
and packs which the new pack makes redundant
It would be best to have some smarter git-prune-packed, which
would process just a single pack. The other alternative would be
that it would prune packs being subsets of other packs as well,
but that scaled badly. I will write another mail about that.
- It is slow. Actually, I think it should be much faster for incremental
fetches, and the initial fetch should take about the same time if you
use packs. But the question is, did we already hit the limit? Are we
using HTTP keepalive connections, do we parallelize the requests?
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
VI has two modes: the one in which it beeps and the one in which
it doesn't.
* Re: rsync deprecated but promoted?
2005-09-26 13:32 ` Petr Baudis
@ 2005-09-26 14:41 ` Brian Gerst
2005-09-26 16:36 ` Petr Baudis
2005-09-26 15:04 ` Linus Torvalds
2005-09-27 6:35 ` shared GIT repos (was Re: rsync deprecated but promoted?) Matthias Urlichs
2 siblings, 1 reply; 26+ messages in thread
From: Brian Gerst @ 2005-09-26 14:41 UTC (permalink / raw)
To: Petr Baudis; +Cc: Martin Coxall, Zack Brown, git
Petr Baudis wrote:
> Dear diary, on Sun, Sep 25, 2005 at 09:06:37PM CEST, I got a letter
> where Martin Coxall <quasi@cream.org> told me that...
>
>>On 25 Sep 2005, at 17:32, Zack Brown wrote:
>>
>>>Hi folks,
>>>
>>>When I use cogito, it gives a warning saying the rsync method is
>>>deprecated and
>>>will be removed in the future. But when I visit kernel.org/git, the
>>>page says to
>>>use an rsync URL with cg-clone.
>>>
>>>Maybe kernel.org should be updated?
>>>
>>
>>It does seem to be sending out a confusing message to us users too,
>>since an initial clone of Linus's tree with rsync is on my machine 10x
>>faster than an http clone, so it seems to be sending out something of a
>>confused/confusing message re: rsync.
>>
>>Am I right in thinking it's because rsync didn't originally have pack
>>support, but now it does, Petr has simply forgotten to deprecate the
>>deprecation message?
>
>
> Nope. rsync always did packs, I actually un-deprecated it for the time
> period when HTTP didn't. The thing is, rsync is bad - it will happily
> put duplicate, redundant, and especially unwanted data to your
> repository, especially when the shared GIT repositories happen. HTTP and
> git-daemon are much better access methods in this regard - actually, I
> still like HTTP the most:
>
> + Works everywhere - no special setup, no dedicated service, firewalls
> and proxies won't stop it
> + Works properly, i.e. only getting stuff you want, unlike rsync
> + Replicates packs setup - would be even better if it would kill objects
> and packs which the new pack makes redundant
>
> It would be best to have some smarter git-prune-packed, which
> would process just a single pack. The other alternative would be
> that it would prune packs being subsets of other packs as well,
> but that scaled bad. I will write another mail about that.
>
> - It is slow. Actually, I think it should be much faster for incremental
> fetches, and the initial fetch should take about the same time if you
> use packs. But the question is, did we already hit the limit? Are we
> using HTTP keepalive connections, do we parallelize the requests?
>
The current HTTP fetch doesn't do asynchronous requests (using
curl_multi_*). This means that no transfers occur while processing
received objects.
The other problem with HTTP vs. rsync is that the HTTP fetch will walk
the entire tree down to the root to verify it has every object. While
this isn't a bad thing it's usually unnecessary when it's all in one big
pack file.
--
Brian Gerst
* Re: rsync deprecated but promoted?
2005-09-26 13:32 ` Petr Baudis
2005-09-26 14:41 ` Brian Gerst
@ 2005-09-26 15:04 ` Linus Torvalds
2005-09-26 16:38 ` Petr Baudis
2005-09-26 16:44 ` walt
2005-09-27 6:35 ` shared GIT repos (was Re: rsync deprecated but promoted?) Matthias Urlichs
2 siblings, 2 replies; 26+ messages in thread
From: Linus Torvalds @ 2005-09-26 15:04 UTC (permalink / raw)
To: Petr Baudis; +Cc: Martin Coxall, Zack Brown, git
On Mon, 26 Sep 2005, Petr Baudis wrote:
>
> Nope. rsync always did packs, I actually un-deprecated it for the time
> period when HTTP didn't. The thing is, rsync is bad - it will happily
> put duplicate, redundant, and especially unwanted data to your
> repository, especially when the shared GIT repositories happen.
Worse than that, rsync will happily sync up to a remote repository without
even getting _all_ the object files, and never tell you anything is wrong.
This happened to several people when the kernel.org mirroring was
broken/delayed.
So yes, rsync is fast. But it's fast exactly _because_ it is broken. Very
very fundamentally broken.
You basically have to run fsck on your repository after an rsync. And if
it returns errors, you're screwed unless you remember what your old heads
were.
Linus
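A minimal sketch of the precaution described above: record the branch heads before an rsync fetch, run a full fsck afterwards, and put the heads back if the check fails. It assumes the current command spellings (git for-each-ref, git update-ref, git fsck) rather than the 2005-era git-fsck-objects; the function names and the GIT override variable are hypothetical.

```shell
#!/bin/sh
# Sketch only: function names and the GIT variable are illustrative.
GIT=${GIT:-git}   # overridable, e.g. to point at a specific git binary

# save_heads FILE: write "<sha1> <refname>" for every local branch to FILE.
save_heads() {
    $GIT for-each-ref --format='%(objectname) %(refname)' refs/heads >"$1"
}

# restore_heads FILE: point every saved ref back at its recorded sha1.
restore_heads() {
    while read -r sha ref; do
        $GIT update-ref "$ref" "$sha" || return 1
    done <"$1"
}

# verify_or_rollback FILE: run a full fsck; on failure restore the saved
# heads and report failure to the caller.
verify_or_rollback() {
    if $GIT fsck --full >/dev/null 2>&1; then
        return 0
    fi
    echo "fsck failed, restoring heads from $1" >&2
    restore_heads "$1"
    return 1
}
```

Typical use would be to call save_heads with a scratch file before the fetch and verify_or_rollback with the same file after it. Restoring the refs does not remove whatever partial data rsync left in the object store.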
* Re: rsync deprecated but promoted?
2005-09-26 14:41 ` Brian Gerst
@ 2005-09-26 16:36 ` Petr Baudis
2005-09-26 16:47 ` Brian Gerst
0 siblings, 1 reply; 26+ messages in thread
From: Petr Baudis @ 2005-09-26 16:36 UTC (permalink / raw)
To: Brian Gerst; +Cc: Martin Coxall, Zack Brown, git
Dear diary, on Mon, Sep 26, 2005 at 04:41:54PM CEST, I got a letter
where Brian Gerst <bgerst@didntduck.org> told me that...
> The current HTTP fetch doesn't do asynchronous requests (using
> curl_multi_*). This means that no transfers occur while processing
> received objects.
That should be fixed then, so that we fully utilize the network.
> The other problem with HTTP vs. rsync is that the HTTP fetch will walk
> the entire tree down to the root to verify it has every object. While
> this isn't a bad thing it's usually unnecessary when it's all in one big
> pack file.
Is that really the case? I believe it will walk only to the original ref
and assume everything before is complete. (Actually, it doesn't even
seem to honor the --recover patch anymore, which isn't so nice
especially in case some objects disappeared from your database and you
would like to get them back. Happened to me.)
But there were changes in that not so long ago, so maybe I'm still
confused.
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
VI has two modes: the one in which it beeps and the one in which
it doesn't.
* Re: rsync deprecated but promoted?
2005-09-26 15:04 ` Linus Torvalds
@ 2005-09-26 16:38 ` Petr Baudis
2005-09-26 16:43 ` Linus Torvalds
2005-09-26 16:44 ` walt
1 sibling, 1 reply; 26+ messages in thread
From: Petr Baudis @ 2005-09-26 16:38 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Martin Coxall, Zack Brown, git
Dear diary, on Mon, Sep 26, 2005 at 05:04:25PM CEST, I got a letter
where Linus Torvalds <torvalds@osdl.org> told me that...
> You basically have to run fsck on your repository after an rsync. And if
> it returns errors, you're screwed unless you remember what your old heads
> were.
Actually, it would be nice to be able to tell git-fsck-objects to only
verify objects which are referenced between given two commits (perhaps
just make it support the ^object notation). Then I wouldn't mind running
that after each rsync fetch in Cogito.
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
VI has two modes: the one in which it beeps and the one in which
it doesn't.
* Re: rsync deprecated but promoted?
2005-09-26 16:38 ` Petr Baudis
@ 2005-09-26 16:43 ` Linus Torvalds
2005-11-10 23:17 ` Petr Baudis
0 siblings, 1 reply; 26+ messages in thread
From: Linus Torvalds @ 2005-09-26 16:43 UTC (permalink / raw)
To: Petr Baudis; +Cc: Martin Coxall, Zack Brown, git
On Mon, 26 Sep 2005, Petr Baudis wrote:
>
> Actually, it would be nice to be able to tell git-fsck-objects to only
> verify objects which are referenced between given two commits (perhaps
> just make it support the ^object notation). Then I wouldn't mind running
> that after each rsync fetch in Cogito.
You can kind of do it.
Do
	git-rev-list --objects $oldheads --not $newheads >& /dev/null
	echo "$?"
and it _should_ largely work. Untested, of course, but I _hope_ that if
any object is missing, git-rev-list should die with an error. And if it
doesn't, I should fix it ;)
Linus
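A sketch of the connectivity check as a reusable function, under one stated assumption: the goal is to verify the objects a fetch added, so the new heads are listed and the old heads excluded, which is the reverse of the argument order in the untested one-liner above. It also uses the portable ">/dev/null 2>&1" redirection. Current git versions are expected to make rev-list fail when an object it walks is missing; the function name and the GIT variable are hypothetical.

```shell
#!/bin/sh
# Sketch only: objects_complete and GIT are illustrative names.
GIT=${GIT:-git}

# objects_complete OLD NEW: succeed only if rev-list can walk every object
# reachable from NEW but not from OLD without hitting an error.
objects_complete() {
    $GIT rev-list --objects "$2" --not "$1" >/dev/null 2>&1
}
```

A caller could run objects_complete with the pre-fetch and post-fetch head and refuse to advance the ref when it returns non-zero.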
* Re: rsync deprecated but promoted?
2005-09-26 15:04 ` Linus Torvalds
2005-09-26 16:38 ` Petr Baudis
@ 2005-09-26 16:44 ` walt
2005-09-26 17:55 ` Linus Torvalds
2005-09-26 20:43 ` Petr Baudis
1 sibling, 2 replies; 26+ messages in thread
From: walt @ 2005-09-26 16:44 UTC (permalink / raw)
To: git
Linus Torvalds wrote:
[...]
> You basically have to run fsck on your repository after an rsync. And if
> it returns errors, you're screwed unless you remember what your old heads
> were.
Just because you mentioned it, I did a git-fsck-objects on my local
copies of your kernel tree and Junio's git tree.
From git I got this:
$git-fsck-objects
missing commit 00d8bbd3c4bba72a6dfd48c2c0c9cbaa000f13c2
broken link from tag 02b2acff8bafb6d73c6513469cdda0c6c18c4138
to commit d5bc7eecbbb0b9f6122708bf5cd62f78ebdaafd8
<similar lines snipped>
From your tree I got only this single line:
dangling commit 02459eaab98a6a57717bc0cacede148fc76af881
Yet both trees compile and run perfectly. Are these messages
worrisome? (BTW, git was cloned and updated using http.)
* Re: rsync deprecated but promoted?
2005-09-26 16:36 ` Petr Baudis
@ 2005-09-26 16:47 ` Brian Gerst
0 siblings, 0 replies; 26+ messages in thread
From: Brian Gerst @ 2005-09-26 16:47 UTC (permalink / raw)
To: Petr Baudis; +Cc: Martin Coxall, Zack Brown, git
Petr Baudis wrote:
> Dear diary, on Mon, Sep 26, 2005 at 04:41:54PM CEST, I got a letter
> where Brian Gerst <bgerst@didntduck.org> told me that...
>
>>The other problem with HTTP vs. rsync is that the HTTP fetch will walk
>>the entire tree down to the root to verify it has every object. While
>>this isn't a bad thing it's usually unnecessary when it's all in one big
>>pack file.
>
>
> Is that really the case? I believe it will walk only to the original ref
> and assume everything before is complete. (Actually, it doesn't even
> seem to honor the --recover patch anymore, which isn't so nice
> especially in case some objects disappeared from your database and you
> would like to get them back. Happened to me.)
I was talking about the initial pull. It does stop at the previous head
for updates.
--
Brian Gerst
* Re: rsync deprecated but promoted?
2005-09-26 16:44 ` walt
@ 2005-09-26 17:55 ` Linus Torvalds
2005-09-26 19:23 ` walt
2005-09-26 22:13 ` Daniel Barkalow
2005-09-26 20:43 ` Petr Baudis
1 sibling, 2 replies; 26+ messages in thread
From: Linus Torvalds @ 2005-09-26 17:55 UTC (permalink / raw)
To: walt; +Cc: git
On Mon, 26 Sep 2005, walt wrote:
>
> Just because you mentioned it, I did a git-fsck-objects on my local
> copies of your kernel tree and Junio's git tree.
>
> From git I got this:
> $git-fsck-objects
> missing commit 00d8bbd3c4bba72a6dfd48c2c0c9cbaa000f13c2
> broken link from tag 02b2acff8bafb6d73c6513469cdda0c6c18c4138
> to commit d5bc7eecbbb0b9f6122708bf5cd62f78ebdaafd8
> <similar lines snipped>
>
> From your tree I got only this single line:
> dangling commit 02459eaab98a6a57717bc0cacede148fc76af881
That commit shouldn't be dangling, but I suspect it is harmless and is
most likely because you have pack-files. Use "git-fsck-objects --full" if
you are downloading with http/rsync (since that gets packs without
unpacking them, and you haven't re-packed everything).
The git thing may be similar, although it sounds unlikely. A more likely
reason is that earlier http pulling got incomplete trees if you ever
interrupted it with ^C.
> Yet both trees compile and run perfectly. Are these messages
> worrisome? (BTW, git was cloned and updated using http.)
Yes, they can be worrisome. Some of it may be normal (I really suspect
that the kernel tree is that kind - a "dangling commit" is almost always
either because you've lost a tag or because of a pack-file that wasn't
examined).
Your git tree is quite possibly corrupted.
The good news is that if "git checkout" works, then the corruption is all
old - you may not have all of the history, but the corruption is
"harmless".
There's nothing fundamentally wrong with not having all of history: it
will cause fsck to complain (unless you "plug" the history by using a
graft file). And obviously it means that you may not be able to go back in
time - but you may never even care.
A "git-http-fetch --recover HEAD <url>" _should_ fix it, but I don't think
that works right now. It's documented, but it doesn't do anything. Junio?
Linus
* Re: rsync deprecated but promoted?
2005-09-26 17:55 ` Linus Torvalds
@ 2005-09-26 19:23 ` walt
2005-09-26 20:12 ` Johannes Schindelin
2005-09-26 20:19 ` Junio C Hamano
2005-09-26 22:13 ` Daniel Barkalow
1 sibling, 2 replies; 26+ messages in thread
From: walt @ 2005-09-26 19:23 UTC (permalink / raw)
To: git
Linus Torvalds wrote:
>
> On Mon, 26 Sep 2005, walt wrote:
>> Just because you mentioned it, I did a git-fsck-objects on my local
>> copies of your kernel tree and Junio's git tree.
>>
>> From git I got this:
>> $git-fsck-objects
>> missing commit 00d8bbd3c4bba72a6dfd48c2c0c9cbaa000f13c2
>> broken link from tag 02b2acff8bafb6d73c6513469cdda0c6c18c4138
>> to commit d5bc7eecbbb0b9f6122708bf5cd62f78ebdaafd8
>> <similar lines snipped>
>>
>> From your tree I got only this single line:
>> dangling commit 02459eaab98a6a57717bc0cacede148fc76af881
>
> That commit shouldn't be dangling, but I suspect it is harmless and is
> most likely because you have pack-files. Use "git-fsck-objects --full"
Using the --full flag made the error disappear for your kernel tree,
but had no effect on the git tree.
I neglected to mention that I use cg-clone and cg-update rather than
the git equivalents. (cogito 0.15.1 from kernel.org)
> Your git tree is quite possibly corrupted.
I recloned from http://kernel.org and I still get exactly the same fsck
errors for git, with or without the --full flag.
(I mention this only FYI. I'm not having any problems compiling or
using either git or the kernel.)
* Re: rsync deprecated but promoted?
2005-09-26 19:23 ` walt
@ 2005-09-26 20:12 ` Johannes Schindelin
2005-09-26 20:19 ` Junio C Hamano
1 sibling, 0 replies; 26+ messages in thread
From: Johannes Schindelin @ 2005-09-26 20:12 UTC (permalink / raw)
To: walt; +Cc: git
Hi,
On Mon, 26 Sep 2005, walt wrote:
> Using the --full flag made the error disappear for your kernel tree,
> but had no effect on the git tree.
I think it is because of the "pu" branch, which gets fetched using rsync,
but no ref is pointing to it. Since the "pu" branch is rebased quite
often, it would also happen if you fetched the "pu" branch, though.
Ciao,
Dscho
* Re: rsync deprecated but promoted?
2005-09-26 19:23 ` walt
2005-09-26 20:12 ` Johannes Schindelin
@ 2005-09-26 20:19 ` Junio C Hamano
1 sibling, 0 replies; 26+ messages in thread
From: Junio C Hamano @ 2005-09-26 20:19 UTC (permalink / raw)
To: walt; +Cc: git
walt <wa1ter@myrealbox.com> writes:
>>> From git I got this:
>>> $git-fsck-objects
>>> missing commit 00d8bbd3c4bba72a6dfd48c2c0c9cbaa000f13c2
>>> broken link from tag 02b2acff8bafb6d73c6513469cdda0c6c18c4138
>>> to commit d5bc7eecbbb0b9f6122708bf5cd62f78ebdaafd8
>>> <similar lines snipped>
>> Your git tree is quite possibly corrupted.
>
> I recloned from http://kernel.org and I still get exactly the same fsck
> errors for git, with or without the --full flag.
That 00d8bbd3c4bba72a6dfd48c2c0c9cbaa000f13c2 is v0.99.7c commit.
While I do not doubt your git repository is missing that object,
it probably is a buggy clone method. Let me see if I can
reproduce.
$ git clone http://kernel.org/pub/scm/git/git.git/ git-clone
(says a lot of "got" and "walk" here)
$ cd git-clone
$ git-cat-file -t 00d8bbd3c4bba72a6dfd48c2c0c9cbaa000f13c2
commit
$ git-fsck-objects
$ git-fsck-objects --full
$ exit
Nope. Things look OK from here.
* Re: rsync deprecated but promoted?
2005-09-26 16:44 ` walt
2005-09-26 17:55 ` Linus Torvalds
@ 2005-09-26 20:43 ` Petr Baudis
1 sibling, 0 replies; 26+ messages in thread
From: Petr Baudis @ 2005-09-26 20:43 UTC (permalink / raw)
To: walt; +Cc: git
Dear diary, on Mon, Sep 26, 2005 at 06:44:04PM CEST, I got a letter
where walt <wa1ter@myrealbox.com> told me that...
> Linus Torvalds wrote:
> [...]
> >You basically have to run fsck on your repository after an rsync. And if
> >it returns errors, you're screwed unless you remember what your old heads
> >were.
>
> Just because you mentioned it, I did a git-fsck-objects on my local
> copies of your kernel tree and Junio's git tree.
>
> From git I got this:
> $git-fsck-objects
> missing commit 00d8bbd3c4bba72a6dfd48c2c0c9cbaa000f13c2
> broken link from tag 02b2acff8bafb6d73c6513469cdda0c6c18c4138
> to commit d5bc7eecbbb0b9f6122708bf5cd62f78ebdaafd8
> <similar lines snipped>
This isn't too harmful. It just means that you have a tag ref and the
corresponding tag object, but not the commit tagged by that object.
That does no harm as long as you don't try to reference the tag,
and if you don't have the commit object already, it's actually not very
likely that you would, since you don't have the branch the tag belongs
to anyway. I'll hopefully fix this bug during the weekend.
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
VI has two modes: the one in which it beeps and the one in which
it doesn't.
* Re: rsync deprecated but promoted?
2005-09-26 17:55 ` Linus Torvalds
2005-09-26 19:23 ` walt
@ 2005-09-26 22:13 ` Daniel Barkalow
2005-09-26 22:38 ` Junio C Hamano
1 sibling, 1 reply; 26+ messages in thread
From: Daniel Barkalow @ 2005-09-26 22:13 UTC (permalink / raw)
To: Linus Torvalds; +Cc: walt, git, Junio C Hamano
On Mon, 26 Sep 2005, Linus Torvalds wrote:
> A "git-http-fetch --recover HEAD <url>" _should_ fix it, but I don't think
> that works right now. It's documented, but it doesn't do anything. Junio?
I should actually implement that now that it's easy; you just skip the
"for_each_ref(mark_complete);" on line 217 of fetch.c, and it'll make sure
that it has everything.
(I'll make a patch tonight if nobody beats me to it.)
-Daniel
*This .sig left intentionally blank*
* Re: rsync deprecated but promoted?
2005-09-26 22:13 ` Daniel Barkalow
@ 2005-09-26 22:38 ` Junio C Hamano
0 siblings, 0 replies; 26+ messages in thread
From: Junio C Hamano @ 2005-09-26 22:38 UTC (permalink / raw)
To: Daniel Barkalow; +Cc: Linus Torvalds, walt, git
Daniel Barkalow <barkalow@iabervon.org> writes:
> I should actually implement that now that it's easy; you just skip the
> "for_each_ref(mark_complete);" on line 217 of fetch.c, and it'll make sure
> that it has everything.
>
> (I'll make a patch tonight if nobody beats me to it.)
Thanks.
* shared GIT repos (was Re: rsync deprecated but promoted?)
2005-09-26 13:32 ` Petr Baudis
2005-09-26 14:41 ` Brian Gerst
2005-09-26 15:04 ` Linus Torvalds
@ 2005-09-27 6:35 ` Matthias Urlichs
2005-09-27 7:13 ` shared GIT repos Junio C Hamano
2005-09-27 18:36 ` shared GIT repos (was Re: rsync deprecated but promoted?) A Large Angry SCM
2 siblings, 2 replies; 26+ messages in thread
From: Matthias Urlichs @ 2005-09-27 6:35 UTC (permalink / raw)
To: git
Hi, Petr Baudis wrote:
> The thing is, rsync is bad - it will happily put
> duplicate, redundant, and especially unwanted data to your repository,
> especially when the shared GIT repositories happen.
Speaking of which -- is anybody working on that one?
I find myself in need of a multiuser shared repository that cannot
be corrupted (i.e. I want to prevent the users from removing objects,
and replacing a ref with something that is not a child of the sha1 that's
already there should also be prevented).
--
Matthias Urlichs | {M:U} IT Design @ m-u-it.de | smurf@smurf.noris.de
Disclaimer: The quote was selected randomly. Really. | http://smurf.noris.de
- -
Beware of bugs in the above code; I have only proved it correct, not tried it.
-- Donald Knuth
* Re: shared GIT repos
2005-09-27 6:35 ` shared GIT repos (was Re: rsync deprecated but promoted?) Matthias Urlichs
@ 2005-09-27 7:13 ` Junio C Hamano
2005-09-27 8:45 ` Matthias Urlichs
2005-09-27 18:36 ` shared GIT repos (was Re: rsync deprecated but promoted?) A Large Angry SCM
1 sibling, 1 reply; 26+ messages in thread
From: Junio C Hamano @ 2005-09-27 7:13 UTC (permalink / raw)
To: Matthias Urlichs; +Cc: git
Matthias Urlichs <smurf@smurf.noris.de> writes:
> Speaking of which -- is anybody working on that one?
>
> I find myself in need of a multiuser shared repository that cannot
> be corrupted (i.e. I want to prevent the users from removing objects,
> and replacing a ref with something that is not a child of the sha1 that's
> already there should also be prevented).
Do you want to guard the repository from malicious users? Or is
it enough to guard a casual/careless user from making mistakes?
If one has commit privileges, then one can already do enough
harm to the project even without being able to remove objects or
update a ref with a non-fast-forward commit. So let's assume for
now that malicious users are not something we worry about. In
that case, "working on" might be too scary a word. I think most
of the pieces are already there and you only need to assemble
them and write a howto ;-).
- Place the users that have write access to the repository in
the same Unix group, and have the repository owned by that
group;
- Give the users ssh access, perhaps with authorized_keys set
up to only allow running git-receive-pack and nothing else
(like normal shell access);
- Set up hooks/update to make sure the ref updates are fast
forward. Additionally, you could set up a mapping that says
which user can/cannot update which refs if you wanted to.
-jc
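The fast-forward rule from the last item above can be sketched as a helper for hooks/update, which receives the ref name, the old sha1 and the new sha1 as its three arguments. This is an illustration under assumptions, not a tested hook: the function name, the GIT variable and the policy of refusing ref deletion are all hypothetical choices.

```shell
#!/bin/sh
# hooks/update sketch; invoked as: update <refname> <old-sha1> <new-sha1>
# is_fast_forward, GIT and ZERO are illustrative names.
GIT=${GIT:-git}
ZERO=0000000000000000000000000000000000000000

# is_fast_forward OLD NEW: true when OLD is an ancestor of NEW, i.e. when
# the merge base of the two commits is OLD itself.
is_fast_forward() {
    [ "$1" = "$ZERO" ] && return 0   # creating a new ref is allowed
    [ "$2" = "$ZERO" ] && return 1   # deleting a ref is refused (assumed policy)
    [ "$($GIT merge-base "$1" "$2" 2>/dev/null)" = "$1" ]
}

# When installed as hooks/update, the script would end with something like:
#   is_fast_forward "$2" "$3" || {
#       echo "refusing non-fast-forward update of $1" >&2
#       exit 1
#   }
```

A per-user mapping of who may update which ref would sit in the same hook, keyed on the ref name in the first argument.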
* Re: shared GIT repos
2005-09-27 7:13 ` shared GIT repos Junio C Hamano
@ 2005-09-27 8:45 ` Matthias Urlichs
2005-09-27 9:59 ` Sergey Vlasov
2005-09-27 15:21 ` Linus Torvalds
0 siblings, 2 replies; 26+ messages in thread
From: Matthias Urlichs @ 2005-09-27 8:45 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
Hi,
Junio C Hamano:
> Do you want to guard the repository from malicious users? Or is
> it enough to guard a casual/careless user from making mistakes?
>
Well, s/malicious users/somebody who wants to cover up an ugly mistake/
would be more accurate.
What I am doing: I'm writing a system management frontend which allows
people to install version-controlled stuff (like, the configuration for
a backup server, or Yet Another PHPBB Installation) on servers -- without
even having a login there.
Some of these contain login scripts that might need root privileges or
similar (like, "restart Apache"). I want people to be unable to simply
remove the commit that included the "rm -rf /" command, move the ref
back, upload a new version, and pretend that nothing happened *la la la*.
> If one has commit privileges, then one can already do enough
> harm to the project without being able to remove objects nor
> updating a ref with non-fast-forward ref.
But in that case it's traceable what happened and whodunit.
> I think most of the pieces are already there and you only need to
> assemble them and write a howto ;-).
> [ list ]
OK, thanks, that helps. I'll write something up.
--
Matthias Urlichs | {M:U} IT Design @ m-u-it.de | smurf@smurf.noris.de
Disclaimer: The quote was selected randomly. Really. | http://smurf.noris.de
- -
When angry, count four; when very angry, swear.
* Re: shared GIT repos
2005-09-27 8:45 ` Matthias Urlichs
@ 2005-09-27 9:59 ` Sergey Vlasov
2005-09-27 10:29 ` Matthias Urlichs
2005-09-27 15:21 ` Linus Torvalds
1 sibling, 1 reply; 26+ messages in thread
From: Sergey Vlasov @ 2005-09-27 9:59 UTC (permalink / raw)
To: Matthias Urlichs; +Cc: Junio C Hamano, git
On Tue, 27 Sep 2005 10:45:13 +0200 Matthias Urlichs wrote:
> > If one has commit privileges, then one can already do enough
> > harm to the project without being able to remove objects nor
> > updating a ref with non-fast-forward ref.
>
> But in that case it's traceable what happened and whodunit.
Don't forget that the user who has rights to invoke git-receive-pack
can set the "author" and "committer" fields in his commits to anything
he wants - unless you check these fields in hooks/update.
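A rough sketch of such a check, suitable for calling from hooks/update: it lists the committer email of every commit in the pushed range and rejects the push if any of them is missing from an allow-list file. The function name, the GIT variable and the one-address-per-line file format are hypothetical; creation of a new ref (an all-zero old sha1) is not handled and makes the function fail closed.

```shell
#!/bin/sh
# Sketch only: committers_allowed and GIT are illustrative names.
GIT=${GIT:-git}

# committers_allowed OLD NEW ALLOWED_FILE: succeed only if every commit in
# OLD..NEW has a committer email listed (one per line) in ALLOWED_FILE.
committers_allowed() {
    emails=$($GIT log --format=%ce "$1..$2") || return 1
    [ -n "$emails" ] || return 0       # nothing new to check
    printf '%s\n' "$emails" | (
        while read -r email; do
            if ! grep -qxF -e "$email" "$3"; then
                echo "committer $email not permitted" >&2
                exit 1                 # leaves only this subshell
            fi
        done
    )
}
```

The committer field is still supplied by the pusher, so this only ties a commit to a known address; binding the address to the authenticated ssh user would need a per-user allow-list.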
* Re: shared GIT repos
2005-09-27 9:59 ` Sergey Vlasov
@ 2005-09-27 10:29 ` Matthias Urlichs
0 siblings, 0 replies; 26+ messages in thread
From: Matthias Urlichs @ 2005-09-27 10:29 UTC (permalink / raw)
To: git
Hi, Sergey Vlasov wrote:
> On Tue, 27 Sep 2005 10:45:13 +0200 Matthias Urlichs wrote:
>
>> > If one has commit privileges, then one can already do enough
>> > harm to the project without being able to remove objects nor
>> > updating a ref with non-fast-forward ref.
>>
>> But in that case it's traceable what happened and whodunit.
>
> Don't forget that the user who has rights to invoke git-receive-pack
> can set the "author" and "committer" fields in his commits to anything
> he wants - unless you check these fields in hooks/update.
Sure. I plan to; "committer" at least should match one of the user's known
email addresses. In addition to that, the files will belong to the user.
--
Matthias Urlichs | {M:U} IT Design @ m-u-it.de | smurf@smurf.noris.de
Disclaimer: The quote was selected randomly. Really. | http://smurf.noris.de
- -
Never count your chickens before they rip your lips off.
* Re: shared GIT repos
2005-09-27 8:45 ` Matthias Urlichs
2005-09-27 9:59 ` Sergey Vlasov
@ 2005-09-27 15:21 ` Linus Torvalds
1 sibling, 0 replies; 26+ messages in thread
From: Linus Torvalds @ 2005-09-27 15:21 UTC (permalink / raw)
To: Matthias Urlichs; +Cc: Junio C Hamano, git
On Tue, 27 Sep 2005, Matthias Urlichs wrote:
>
> Junio C Hamano:
> > Do you want to guard the repository from malicious users? Or is
> > it enough to guard a casual/careless user from making mistakes?
> >
> Well, s/malicious users/somebody who wants to cover up an ugly mistake/
> would be more accurate.
Hmm.
What you _can_ do is to make your object and refs directories sticky.
That automatically means that only the owner of a file can remove it.
Now, people can still cover up their _own_ mistakes in that case, but they
can't change other people's branches (since that involves overwriting
somebody else's ref), and they can't remove objects that somebody else has
written.
But they can, for example, change their _own_ branch to not have a ref to
that object, of course.
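A minimal sketch of the sticky-directory setup, assuming the argument is the repository directory that contains objects/ and refs/ and that group write access is already arranged. The function name is hypothetical. Directories created later (new object fan-out directories, new ref namespaces) do not inherit the sticky bit, so the command would have to be re-run or the directories pre-created.

```shell
#!/bin/sh
# Sketch only: make_sticky is an illustrative name.

# make_sticky REPO: set the sticky bit on every directory below REPO/objects
# and REPO/refs, so that only a file's owner (or the directory owner, or
# root) may delete or rename it.
make_sticky() {
    find "$1/objects" "$1/refs" -type d -exec chmod +t {} +
}
```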
A more draconian option is to make the git programs setgid to a "git"
group, and making the object and ref directories only writable by the git
group. And then you change all the git programs to verify whatever rules
you have. That requires pretty big changes, though.
For example, you'd have to make all the scripts use the new git-update-ref
thing, and if you want to enforce that any new ref is a proper child of
the old ref, then you'd have to make git-update-ref test that one
explicitly (instead of leaving it to the scripts).
Quite frankly, I'd rather avoid that.
Oh. One thing you can do: don't allow direct filesystem access at _all_.
Use ssh to log in (even if it's on the same machine) as a special user
which is the only one that is allowed to touch the repo, and make that
special user's login shell only accept git-receive-pack.
I wrote and posted an untested "git-sh" that did that some time ago,
holler if you want it again.
Add logging, and testing, and it should give you a safe write-only
alternative to "git-daemon" that only allows people to append to the git
history (oh, you'd still have to add some _small_ code to git-receive-pack
to not allow the "ignore old ref contents" case, but that's like two lines
of code).
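For illustration, a restricted login shell along those lines might look roughly like the sketch below. This is not the "git-sh" mentioned above. It assumes that sshd runs the login shell as "<shell> -c <command>" and that the client sends the command in the form git-receive-pack '<path>'; the function name is hypothetical.

```shell
#!/bin/sh
# receive_pack_path COMMAND
# Prints the repository path and succeeds only if COMMAND has exactly
# the form:  git-receive-pack '<path>'
# Anything else (other commands, extra quotes, option-like paths) is
# refused.
receive_pack_path() {
    cmd=$1
    case "$cmd" in
        "git-receive-pack '"*"'") ;;
        *) return 1 ;;
    esac
    # Strip the fixed prefix and the closing quote to get the path
    path=${cmd#"git-receive-pack '"}
    path=${path%"'"}
    # Refuse embedded quotes, a leading dash, or an empty path
    case "$path" in
        *"'"*|-*|"") return 1 ;;
    esac
    printf '%s\n' "$path"
}

# Entry point when installed as a login shell: <shell> -c "<command>"
if [ "${1-}" = "-c" ]; then
    if path=$(receive_pack_path "${2-}"); then
        # The path is passed as a single argument; nothing is eval'ed
        exec git-receive-pack "$path"
    fi
    echo "only git-receive-pack is allowed" >&2
    exit 1
fi
```

Logging of each accepted or refused command could be added just before the exec and the error message.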
Linus
* Re: shared GIT repos (was Re: rsync deprecated but promoted?)
2005-09-27 6:35 ` shared GIT repos (was Re: rsync deprecated but promoted?) Matthias Urlichs
2005-09-27 7:13 ` shared GIT repos Junio C Hamano
@ 2005-09-27 18:36 ` A Large Angry SCM
1 sibling, 0 replies; 26+ messages in thread
From: A Large Angry SCM @ 2005-09-27 18:36 UTC (permalink / raw)
To: Matthias Urlichs; +Cc: git
Matthias Urlichs wrote:
> Hi, Petr Baudis wrote:
>
>>The thing is, rsync is bad - it will happily put
>>duplicate, redundant, and especially unwanted data to your repository,
>>especially when the shared GIT repositories happen.
>
> Speaking of which -- is anybody working on that one?
>
> I find myself in need of a multiuser shared repository that cannot
> be corrupted (i.e. I want to prevent the users from removing objects,
> and replacing a ref with something that is not a child of the sha1 that's
> already there should also be prevented).
>
Yes. Actually just the protocol spec so far. Although, progress has been
interrupted by the process of moving to the SoCal area from Fla.
* Re: rsync deprecated but promoted?
2005-09-26 16:43 ` Linus Torvalds
@ 2005-11-10 23:17 ` Petr Baudis
0 siblings, 0 replies; 26+ messages in thread
From: Petr Baudis @ 2005-11-10 23:17 UTC (permalink / raw)
To: Linus Torvalds; +Cc: Martin Coxall, Zack Brown, git
Dear diary, on Mon, Sep 26, 2005 at 06:43:08PM CEST, I got a letter
where Linus Torvalds <torvalds@osdl.org> said that...
>
>
> On Mon, 26 Sep 2005, Petr Baudis wrote:
> >
> > Actually, it would be nice to be able to tell git-fsck-objects to only
> > verify objects which are referenced between given two commits (perhaps
> > just make it support the ^object notation). Then I wouldn't mind running
> > that after each rsync fetch in Cogito.
>
> You can kind of do it.
>
> Do
>
> git-rev-list --objects $oldheads --not $newheads >& /dev/null
> echo "$?"
>
> and it _should_ largely work. Untested, of course, but I _hope_ that if
> any object is missing, git-rev-list should die with an error. And if it
> doesn't, I should fix it ;)
It should obviously be

    git-rev-list $(git-rev-parse $newheads --not $oldheads) >& /dev/null

but it is indeed broken:

    $ git-rev-list --objects ... ^... && echo ':)'
    001439c6a797461c3e75018d95744d463077ae33
    841e3297d8df764da417da81dbfe1044e24d4082
    cf11a3d561e76c8ba273cb0bb62d46a4b2959c1f file
    :)
    $ git-cat-file -t cf11a3d561e76c8ba273cb0bb62d46a4b2959c1f
    error: unable to find cf11a3d561e76c8ba273cb0bb62d46a4b2959c1f
    fatal: git-cat-file cf11a3d561e76c8ba273cb0bb62d46a4b2959c1f: bad file

Currently I just modified it so that I iterate through all the sha1s
git-rev-list spits out, and test them by git-cat-file -t.
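The loop described above could be sketched as follows. This is not the actual Cogito change; it uses the modern "git cat-file" spelling instead of the dashed git-cat-file, and the function name report_missing and the variables OLD and NEW are hypothetical.

```shell
#!/bin/sh
# report_missing
# Reads "sha1 [path]" lines on stdin (the output format of
# git rev-list --objects), prints every object that git cat-file
# cannot find, and returns non-zero if at least one is missing.
report_missing() {
    status=0
    while read -r sha1 path; do
        if ! git cat-file -t "$sha1" >/dev/null 2>&1 </dev/null; then
            printf 'missing %s %s\n' "$sha1" "$path"
            status=1
        fi
    done
    return $status
}

# Possible use inside a repository after a fetch; OLD and NEW stand
# for the head before and after the fetch:
#   git rev-list --objects "$NEW" --not "$OLD" | report_missing
```

Running one git cat-file per object is slow for large ranges, but it does not rely on git-rev-list noticing the missing object itself.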
Thanks for the hint,
--
Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
VI has two modes: the one in which it beeps and the one in which
it doesn't.
Thread overview: 26+ messages
2005-09-25 16:32 rsync deprecated but promoted? Zack Brown
2005-09-25 17:07 ` H. Peter Anvin
2005-09-25 19:06 ` Martin Coxall
2005-09-26 13:32 ` Petr Baudis
2005-09-26 14:41 ` Brian Gerst
2005-09-26 16:36 ` Petr Baudis
2005-09-26 16:47 ` Brian Gerst
2005-09-26 15:04 ` Linus Torvalds
2005-09-26 16:38 ` Petr Baudis
2005-09-26 16:43 ` Linus Torvalds
2005-11-10 23:17 ` Petr Baudis
2005-09-26 16:44 ` walt
2005-09-26 17:55 ` Linus Torvalds
2005-09-26 19:23 ` walt
2005-09-26 20:12 ` Johannes Schindelin
2005-09-26 20:19 ` Junio C Hamano
2005-09-26 22:13 ` Daniel Barkalow
2005-09-26 22:38 ` Junio C Hamano
2005-09-26 20:43 ` Petr Baudis
2005-09-27 6:35 ` shared GIT repos (was Re: rsync deprecated but promoted?) Matthias Urlichs
2005-09-27 7:13 ` shared GIT repos Junio C Hamano
2005-09-27 8:45 ` Matthias Urlichs
2005-09-27 9:59 ` Sergey Vlasov
2005-09-27 10:29 ` Matthias Urlichs
2005-09-27 15:21 ` Linus Torvalds
2005-09-27 18:36 ` shared GIT repos (was Re: rsync deprecated but promoted?) A Large Angry SCM