git.vger.kernel.org archive mirror
* Re: Linux 2.6.15-rc2
       [not found] <Pine.LNX.4.64.0511191934210.8552@g5.osdl.org>
@ 2005-11-24 12:37 ` Ed Tomlinson
  2005-11-24 13:07   ` Andreas Ericsson
  2005-11-24 18:37   ` Linus Torvalds
  0 siblings, 2 replies; 10+ messages in thread
From: Ed Tomlinson @ 2005-11-24 12:37 UTC (permalink / raw)
  To: Linus Torvalds, Junio C Hamano, git; +Cc: Linux Kernel Mailing List

On Saturday 19 November 2005 22:40, Linus Torvalds wrote:
> There it is (or will soon be - the tar-ball and patches are still 
> uploading, and mirroring can obviously take some time after that).

Something strange here.   After a cg-update, I had no tag for rc2.   Checking
showed no problems so I used cg-clone to get another copy of the repository.
Still no rc2.

ed@grover:/usr/src/2.6$ cg-version
cogito-0.16rc2 (73874dddeec2d0a8e5cd343eec762d98314def63)
ed@grover:/usr/src/2.6$ git --version
git version 0.99.9.GIT

cg-clone http://www.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git 2.6

It looks to be the tag that is missing; gitk shows commits after Nov 19.

Both git and cg were updated just prior to the cg-update (~Nov 22 8pm EST).

What is happening?

TIA
Ed Tomlinson


* Re: Linux 2.6.15-rc2
  2005-11-24 12:37 ` Linux 2.6.15-rc2 Ed Tomlinson
@ 2005-11-24 13:07   ` Andreas Ericsson
  2005-11-24 18:44     ` Linus Torvalds
  2005-11-24 18:37   ` Linus Torvalds
  1 sibling, 1 reply; 10+ messages in thread
From: Andreas Ericsson @ 2005-11-24 13:07 UTC (permalink / raw)
  To: Ed Tomlinson
  Cc: Linus Torvalds, Junio C Hamano, git, Linux Kernel Mailing List

Ed Tomlinson wrote:
> Something strange here.   After a cg-update, I had no tag for rc2.   Checking
> showed no problems so I used cg-clone to get another copy of the repository.
> Still no rc2.
> 
> ed@grover:/usr/src/2.6$ cg-version
> cogito-0.16rc2 (73874dddeec2d0a8e5cd343eec762d98314def63)
> ed@grover:/usr/src/2.6$ git --version
> git version 0.99.9.GIT
> 
> cg-clone http://www.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git 2.6
> 

This happened a while ago to someone else too. Apparently the http 
transport needs server-side help (git-update-server-info or some such 
must be run on the remote side).

Unless you're restricted by firewalls and the like, you could try

git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git 2.6

which works flawlessly for me although it takes quite some time to 
transfer all the data.

Linus, HPA: Are the packs cached on kernel.org? It seems to be at least 
a minute before the transfers start.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231


* Re: Linux 2.6.15-rc2
  2005-11-24 12:37 ` Linux 2.6.15-rc2 Ed Tomlinson
  2005-11-24 13:07   ` Andreas Ericsson
@ 2005-11-24 18:37   ` Linus Torvalds
  2005-11-24 19:52     ` Nick Hengeveld
  1 sibling, 1 reply; 10+ messages in thread
From: Linus Torvalds @ 2005-11-24 18:37 UTC (permalink / raw)
  To: Ed Tomlinson; +Cc: Junio C Hamano, git, Linux Kernel Mailing List



On Thu, 24 Nov 2005, Ed Tomlinson wrote:
> 
> What is happening?

The http transport isn't very good for git, so git adds various special 
files to make it work at all. They need to be specially updated, and I 
hadn't done that.

Using the native git protocol through git://git.kernel.org/.. gets around 
it, as does using rsync. 

I just repacked and updated it now, so now http should work too, although 
inefficiently (because it will get a whole new pack - just one of the 
disadvantages of the non-native protocols).

		Linus
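
As an illustration of the server-side step described above (a minimal sketch, not taken from the thread): the HTTP commit walker reads static index files such as `info/refs` and `objects/info/packs`, and `git update-server-info` regenerates them. The function name `refresh_dumb_http_info` and the path in the usage note are hypothetical, and the modern `git <subcommand>` spelling is assumed rather than the dashed `git-update-server-info` of 2005.

```shell
# refresh_dumb_http_info GIT_DIR
# Regenerate the auxiliary files that HTTP commit-walker clients fetch
# to discover refs and packs. GIT_DIR is the repository's .git directory
# (or the bare repository itself).
refresh_dumb_http_info() {
    git --git-dir="$1" update-server-info
}
```

It would be run after every push or repack, for example `refresh_dumb_http_info /path/to/linux-2.6.git`. The sample `post-update` hook shipped with git runs `git update-server-info` for the same purpose.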


* Re: Linux 2.6.15-rc2
  2005-11-24 13:07   ` Andreas Ericsson
@ 2005-11-24 18:44     ` Linus Torvalds
  2005-11-24 19:42       ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Linus Torvalds @ 2005-11-24 18:44 UTC (permalink / raw)
  To: Andreas Ericsson
  Cc: Ed Tomlinson, Junio C Hamano, git, Linux Kernel Mailing List



On Thu, 24 Nov 2005, Andreas Ericsson wrote:
> 
> git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git 2.6
> 
> which works flawlessly for me although it takes quite some time to transfer
> all the data.

The initial clone is very expensive for the native git protocol: the 
protocol is designed to scale well for incremental updates (ie you have a 
_huge_ repository that has changed just a bit, and the protocol should 
work well for that), and that makes the initial clone quite expensive as 
it marshalls the whole damn repository into this nice packed format.

So it's often nicer (certainly on the remote server) to use "rsync" for 
the initial clone, and then only after that start using the git protocol.

(This is in no way really fundamental, and the server could cache the 
packs it generates for initial clones, but that isn't implemented yet, and 
probably won't be for some time).

Of course, especially if you're mostly bandwidth-constrained and the 
server side is not under a big load, using the native git protocol may 
actually be faster anyway. Because it's always going to generate the 
nicest packing, while rsync:// will just use whatever packing that the 
server happens to have at that point (but I do repack every few weeks, so 
rsync for the initial clone should never be horribly bad - and since I 
just repacked, it should get that "perfect" pack too).

		Linus


* Re: Linux 2.6.15-rc2
  2005-11-24 18:44     ` Linus Torvalds
@ 2005-11-24 19:42       ` Junio C Hamano
  2005-11-24 19:57         ` Linus Torvalds
  0 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2005-11-24 19:42 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andreas Ericsson, Ed Tomlinson, git, Linux Kernel Mailing List

Linus Torvalds <torvalds@osdl.org> writes:

> (This is in no way really fundamental, and the server could cache the 
> packs it generates for initial clones, but that isn't implemented yet, and 
> probably won't be for some time).

Performance perceived by cloners is helped by

    $ mkdir -p .git/pack-cache
    $ git-rev-list --objects --all | git-pack-objects .git/pack-cache/pack

on the server side.  This exact example, where the repository
maintainer prepares the pack by hand, optimizes for the wrong case,
and I do not think it is worth doing in practice, but it will give
you the lower bound for when a server-side cache is implemented to
do it on demand.


* Re: Linux 2.6.15-rc2
  2005-11-24 18:37   ` Linus Torvalds
@ 2005-11-24 19:52     ` Nick Hengeveld
  2005-11-25  2:50       ` Ed Tomlinson
  0 siblings, 1 reply; 10+ messages in thread
From: Nick Hengeveld @ 2005-11-24 19:52 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Ed Tomlinson, Junio C Hamano, git, Linux Kernel Mailing List

On Thu, Nov 24, 2005 at 10:37:15AM -0800, Linus Torvalds wrote:

> I just repacked and updated it now, so now http should work too, although 
> inefficiently (because it will get a whole new pack - just one of the 
> disadvantages of the non-native protocols).

There's room to improve on that particular inefficiency.  The http
commit walker could use Range: headers to fetch loose objects directly
from inside a pack if it didn't make sense to fetch the entire pack.
For this to work, pack fetches would need to be deferred until the
entire tree had been walked, and the commit walker could decide whether
to fetch the pack or loose objects based on the percentage of packed
objects it needed to fetch.  It would also need to fetch all
tag/commit/tree objects using ranges to be able to fully walk the tree.

-- 
For a successful technology, reality must take precedence over public
relations, for nature cannot be fooled.
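
The decision rule proposed above can be sketched as follows. This is only an illustration under stated assumptions, not code from git: the function names `choose_fetch` and `range_header`, the idea of a fixed percentage threshold, and the assumption that offsets and lengths are derived from the pack index are all hypothetical.

```shell
# choose_fetch NEEDED TOTAL THRESHOLD_PCT
# Print "pack" when the share of the pack's objects that the walker still
# needs reaches THRESHOLD_PCT percent, otherwise print "ranges".
choose_fetch() {
    if [ "$2" -gt 0 ] && [ $(( $1 * 100 )) -ge $(( $2 * $3 )) ]; then
        echo pack
    else
        echo ranges
    fi
}

# range_header OFFSET LENGTH
# Print an HTTP Range header covering LENGTH bytes starting at OFFSET.
# Byte ranges are inclusive at both ends, hence the "- 1".
range_header() {
    echo "Range: bytes=$1-$(( $1 + $2 - 1 ))"
}
```

For example, `choose_fetch 3 100 50` selects range requests for a walker that needs only 3 of a pack's 100 objects, and `range_header 100 50` yields the header for the 50 bytes starting at offset 100.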


* Re: Linux 2.6.15-rc2
  2005-11-24 19:42       ` Junio C Hamano
@ 2005-11-24 19:57         ` Linus Torvalds
  2005-11-24 21:02           ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Linus Torvalds @ 2005-11-24 19:57 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Andreas Ericsson, Ed Tomlinson, git, Linux Kernel Mailing List



On Thu, 24 Nov 2005, Junio C Hamano wrote:
> 
> Performance perceived by cloners is helped by
> 
>     $ mkdir -p .git/pack-cache
>     $ git-rev-list --objects --all | git-pack-objects .git/pack-cache/pack

That really doesn't work very well. I push to that tree often several 
times a day, and you'd have to re-do the cache each time.

So it would be much better if git-pack-objects would just always cache its 
output in .git/pack-cache - along with some logic to just get rid of old 
ones regularly.

Since git-pack-objects has to generate the pack _anyway_, it might as well 
save it away when it does - so that if you have lots of people doing 
clones or pulling, you'd only need to run it once for a particular set of 
objects, and you'd not have to do any extra (or unnecessary) maintenance.

		Linus
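
The caching idea above could look roughly like this wrapper. It is a hedged sketch and not how git-pack-objects works: the name `cached_pack`, the cache layout, and keying the cache on a SHA-1 of the sorted object list are assumptions made for illustration, and `sha1sum` from GNU coreutils is assumed to be available. For simplicity it writes the pack to the cache first and then serves it, rather than streaming and saving at the same time.

```shell
# cached_pack CACHE_DIR GENERATOR [ARGS...]
# Read an object list on stdin. If a pack for exactly that set of objects
# is already cached, serve it; otherwise run GENERATOR once with the list
# on its stdin, store its output in the cache, and serve that.
# The body runs in a subshell so its variables and trap stay private.
cached_pack() (
    cache_dir=$1
    shift
    mkdir -p "$cache_dir" || exit 1
    list=$(mktemp) || exit 1
    trap 'rm -f "$list"' EXIT
    cat > "$list"
    # The cache key depends only on the set of lines, not on their order.
    key=$(sort -u "$list" | sha1sum | cut -d' ' -f1)
    cached="$cache_dir/$key.pack"
    if [ ! -f "$cached" ]; then
        tmp_pack=$(mktemp "$cache_dir/tmp.XXXXXX") || exit 1
        "$@" < "$list" > "$tmp_pack" || { rm -f "$tmp_pack"; exit 1; }
        # Rename into place so readers never see a half-written pack.
        mv "$tmp_pack" "$cached"
    fi
    cat "$cached"
)
```

A possible invocation, using modern command spellings, would be `git rev-list --objects --all | cached_pack .git/pack-cache git pack-objects --stdout`. Nothing here removes old cache entries.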


* Re: Linux 2.6.15-rc2
  2005-11-24 19:57         ` Linus Torvalds
@ 2005-11-24 21:02           ` Junio C Hamano
  0 siblings, 0 replies; 10+ messages in thread
From: Junio C Hamano @ 2005-11-24 21:02 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andreas Ericsson, Ed Tomlinson, git, Linux Kernel Mailing List

Linus Torvalds <torvalds@osdl.org> writes:

> Since git-pack-objects has to generate the pack _anyway_, it might as well 
> save it away when it does - so that if you have lots of people doing 
> clones or pulling, you'd only need to run it once for a particular set of 
> objects, and you'd not have to do any extra (or unnecessary) maintenance.

Caching itself is relatively easy (just implement an equivalent
of tee inside pack-objects ourselves).  More problematic is
pruning.  We could do it from cron based on atime _if_ the
filesystem is not mounted noatime, but without arranging a
reasonable way for automated pruning this would become a disk
hog and extra maintenance burden, which is why I did not
implement the dynamic caching part in the initial round.

Since git-daemon would be the primary user of pack-cache/, this
implies a repository writable by git-daemon user on public
machine (not master), which is an extra thing to note.
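
The atime-based pruning mentioned above might be sketched like this. It is an illustration only: the function name `prune_pack_cache` and the choice to match `*.pack` and `*.idx` files are assumptions, and as noted it is only meaningful when the filesystem actually records access times.

```shell
# prune_pack_cache CACHE_DIR DAYS
# Delete cached pack and index files whose last access time is more than
# DAYS days in the past. Intended to be run periodically, e.g. from cron.
prune_pack_cache() {
    find "$1" -type f \( -name '*.pack' -o -name '*.idx' \) \
        -atime +"$2" -exec rm -f {} +
}
```

For instance, `prune_pack_cache .git/pack-cache 7` would drop cached packs that nobody has fetched for over a week.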


* Re: Linux 2.6.15-rc2
  2005-11-24 19:52     ` Nick Hengeveld
@ 2005-11-25  2:50       ` Ed Tomlinson
  2005-11-25  8:42         ` Andreas Ericsson
  0 siblings, 1 reply; 10+ messages in thread
From: Ed Tomlinson @ 2005-11-25  2:50 UTC (permalink / raw)
  To: Nick Hengeveld
  Cc: Linus Torvalds, Junio C Hamano, git, Linux Kernel Mailing List

On Thursday 24 November 2005 14:52, Nick Hengeveld wrote:
> On Thu, Nov 24, 2005 at 10:37:15AM -0800, Linus Torvalds wrote:
> 
> > I just repacked and updated it now, so now http should work too, although 
> > inefficiently (because it will get a whole new pack - just one of the 
> > disadvantages of the non-native protocols).
> 
> There's room to improve on that particular inefficiency.  The http
> commit walker could use Range: headers to fetch loose objects directly
> from inside a pack if it didn't make sense to fetch the entire pack.
> For this to work, pack fetches would need to be deferred until the
> entire tree had been walked, and the commit walker could decide whether
> to fetch the pack or loose objects based on the percentage of packed
> objects it needed to fetch.  It would also need to fetch all
> tag/commit/tree objects using ranges to be able to fully walk the tree.

Alternately, when creating a new archive the client could ask the server
what protocols are active.  It could then use the best one for the clone and
update the .git/origin files with the optimal one for incremental pulls.

Thoughts?
Ed Tomlinson


* Re: Linux 2.6.15-rc2
  2005-11-25  2:50       ` Ed Tomlinson
@ 2005-11-25  8:42         ` Andreas Ericsson
  0 siblings, 0 replies; 10+ messages in thread
From: Andreas Ericsson @ 2005-11-25  8:42 UTC (permalink / raw)
  To: Ed Tomlinson
  Cc: Nick Hengeveld, Linus Torvalds, Junio C Hamano, git,
	Linux Kernel Mailing List

Ed Tomlinson wrote:
> On Thursday 24 November 2005 14:52, Nick Hengeveld wrote:
> 
>>On Thu, Nov 24, 2005 at 10:37:15AM -0800, Linus Torvalds wrote:
>>
>>
>>>I just repacked and updated it now, so now http should work too, although 
>>>inefficiently (because it will get a whole new pack - just one of the 
>>>disadvantages of the non-native protocols).
>>
>>There's room to improve on that particular inefficiency.  The http
>>commit walker could use Range: headers to fetch loose objects directly
>>from inside a pack if it didn't make sense to fetch the entire pack.
>>For this to work, pack fetches would need to be deferred until the
>>entire tree had been walked, and the commit walker could decide whether
>>to fetch the pack or loose objects based on the percentage of packed
>>objects it needed to fetch.  It would also need to fetch all
>>tag/commit/tree objects using ranges to be able to fully walk the tree.
> 
> 
> Alternately, when creating a new archive the client could ask the server
> what protocols are active.  It could then use the best one for the clone and
> update the .git/origin files with the optimal one for incremental pulls.
> 

This would only work with the git protocol, and since that's the fastest 
protocol (theoretically, that is; Pasky seems to have gotten other 
figures, but I'm not sure I believe those), it should really only ever 
return itself, which wouldn't make much sense.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

