git.vger.kernel.org archive mirror
* Summer of Code 2008 project application draft: Pack v4
@ 2008-03-29 20:50 Peter Eriksen
  2008-03-31  4:12 ` Shawn O. Pearce
  0 siblings, 1 reply; 3+ messages in thread
From: Peter Eriksen @ 2008-03-29 20:50 UTC (permalink / raw)
  To: git

Dear Git community,

Here is a draft of my application for the Google Summer of Code 2008.  I
am aware that this is a big project, so how might the scope be limited
such that the work is still useful to build on? How about just
rebasing the patches already in the sp/pack4 branch of the
fastimport.git repository?

The application text goes like this:

"
The goal of the project is to rebase the code and ideas developed for
version 4 of the git pack format, which showed good promise of making
packs smaller and faster.

The idea of a new, even more optimized pack format has been floating
around the git world for almost two years, and because of the rapid pace
of development the code implementing it has become less and less ready
for inclusion in mainline.

Since those patches touch so many of the core functions in git, it will
be a good chunk of work getting them mergeable, and nobody has gotten
around to doing that yet. This will be a good opportunity for laying the
groundwork and getting the ball rolling again.

This project will benefit not only Git itself, but also the numerous
projects and developers using Git as their preferred revision control
system, among them many prominent open source and free software
projects.


About me:
I have been following the development of Git on and off almost from the
beginning, and have been trying to learn from its design and
implementation. I have been especially interested in the implementation
of the git repository, index, and pack format. I have contributed a few
general clean-up patches, but I have not yet had the chance (read: time)
to really dive in and make a significant, non-trivial contribution,
although I would have liked to.
"

Comments?

Regards,

Peter


* Re: Summer of Code 2008 project application draft: Pack v4
  2008-03-29 20:50 Summer of Code 2008 project application draft: Pack v4 Peter Eriksen
@ 2008-03-31  4:12 ` Shawn O. Pearce
  2008-03-31 11:06   ` Peter Eriksen
  0 siblings, 1 reply; 3+ messages in thread
From: Shawn O. Pearce @ 2008-03-31  4:12 UTC (permalink / raw)
  To: Peter Eriksen; +Cc: git

Peter Eriksen <s022018@student.dtu.dk> wrote:
> Here is a draft of my application for the Google Summer of Code 2008.
...
> The project goal is to rebase the code and ideas developed for the
> version 4 of the git pack format, which showed good promise of making
> packs smaller, and faster.
> 
> The ideas of a new even more optimized pack format has been floating
> around the git world for almost two years, and because of the rapid pace
> of development the code implementing those ideas has become less, and
> less ready for inclusion in mainline.
> 
> Since those patches touch so many of the core functions in git, it will
> be a good chunk of work getting them mergeable, and nobody has gotten
> around to doing that yet. This will be a good opportunity for laying the
> ground work, and getting the ball rolling again.

Have you had a chance to look at those patches yet?  Or the code
that they touch, but which has been heavily modified since then
(like say builtin-pack-objects.c)?

I would hope that forward-porting those patches would only take
us through to about the mid-term, and then finishing out the bulk
of the series (like commit dict encoding, maybe dict of object ids
used in trees) would be the remainder of the summer.  But that may
be aggressive.  To be successful I think the student working on
this project needs to spend some time during the bonding period to
understand the current pack v2 format and how the pack v4 format
was going to address some of the shortcomings of v2.
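
To illustrate the general idea behind a "dict of object ids", here is
a minimal Python sketch. It is not the pack v4 encoding from the
sp/pack4 patches; the function names, the sample data, and the
one-byte-per-index size estimate are all assumptions made for
illustration. It only shows why storing each 20-byte id once in a
table, and referring to it by a small index, can save space when ids
repeat.

```python
import hashlib


def build_dict(entries):
    """Return (table, refs): unique ids in first-seen order, and one index per entry."""
    table = []
    index_of = {}
    refs = []
    for oid in entries:
        if oid not in index_of:
            index_of[oid] = len(table)
            table.append(oid)
        refs.append(index_of[oid])
    return table, refs


def decode(table, refs):
    """Rebuild the original list of ids from the table and the indexes."""
    return [table[i] for i in refs]


# Hypothetical data: three distinct 20-byte SHA-1 ids, referenced six times.
ids = [hashlib.sha1(name).digest() for name in (b"Makefile", b"README", b"COPYING")]
entries = [ids[0], ids[1], ids[0], ids[2], ids[1], ids[0]]

table, refs = build_dict(entries)

# Size with every id stored in full at each reference.
raw_size = 20 * len(entries)
# Size with a table of unique ids plus one index per reference
# (assumption: each index fits in a single byte).
dict_size = 20 * len(table) + len(refs)

print(raw_size, dict_size)
```

The saving depends entirely on how often ids repeat; with no
repetition the table plus indexes is larger than the raw list.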

To some extent I have left the design details about pack v4 off
the ideas page hoping to draw students into explaining their own
ideas for how to improve upon Git's pack data storage.  Even if the
student's ideas provide less compression than pack v4 was hinting
it can give us, it shows the student's ability to think through
the problem and their desire to work on the project.  It's also why
I called it "v4/v5"... some of the student's own ideas may be novel
to us and better than v4, hence creating a v5...
 
-- 
Shawn.


* Re: Summer of Code 2008 project application draft: Pack v4
  2008-03-31  4:12 ` Shawn O. Pearce
@ 2008-03-31 11:06   ` Peter Eriksen
  0 siblings, 0 replies; 3+ messages in thread
From: Peter Eriksen @ 2008-03-31 11:06 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: git

On Mon, Mar 31, 2008 at 12:12:17AM -0400, Shawn O. Pearce wrote:
> Peter Eriksen <s022018@student.dtu.dk> wrote:
> > Here is a draft of my application for the Google Summer of Code 2008.
> ...
> > The project goal is to rebase the code and ideas developed for the
> > version 4 of the git pack format, which showed good promise of making
> > packs smaller, and faster.
> 
> Have you had a chance to look at those patches yet?

Yes, more than a year ago, see e.g. 
http://thread.gmane.org/gmane.comp.version-control.git/43016
but I will need to refresh my memory.

> Or the code that they touch, but which has been heavily modified
> since then (like say builtin-pack-objects.c)?

No, I am not yet too familiar with all the newest changes. I obviously
would need to read that code carefully.

> I would hope that forward-porting those patches would only take
> us through to about the mid-term, and then finishing out the bulk
> of the series (like commit dict encoding, maybe dict of object ids
> used in trees) would be the remainder of the summer.  But that may
> be aggressive.

I will think a bit about this, and try to make a timeline.

> To be successful I think the student working on
> this project needs to spend some time during the bonding period to
> understand the current pack v2 format and how the pack v4 format
> was going to address some of the shortcomings of v2.

Yes, since I had a basic understanding of the pack formats some time
ago, it should be possible to get up to speed fairly quickly in the
bonding period.
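
As a starting point for that refresher, here is a minimal Python
sketch of reading the fixed 12-byte header at the start of a pack
file: the 4-byte signature "PACK", a 4-byte version number, and a
4-byte object count, both integers in network byte order. The function
name and the sample bytes are made up for illustration, and nothing
beyond the header is parsed.

```python
import struct


def parse_pack_header(data):
    """Return (version, object_count) from the first 12 bytes of a pack."""
    if len(data) < 12:
        raise ValueError("a pack header is 12 bytes long")
    # ">4sII": big-endian, 4-byte signature, two unsigned 32-bit integers.
    signature, version, count = struct.unpack(">4sII", data[:12])
    if signature != b"PACK":
        raise ValueError("missing PACK signature")
    return version, count


# Hypothetical sample: the header of a version 2 pack holding 3 objects.
sample = b"PACK" + struct.pack(">II", 2, 3)
print(parse_pack_header(sample))
```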

> To some extent I have left the design details about pack v4 off
> the ideas page hoping to draw students into explaining their own
> ideas for how to improve upon Git's pack data storage.

This would be nice of course, but my time is quite limited at the
moment, so this will not be possible for me yet. The reason I would
like to participate this year is that my summer vacation will be much
earlier, and will fit very well into the GSoC window.

Thank you for the comments.

Peter

