* [GSoC] Google Summer of Code 2009 - new ideas
From: Jakub Narebski @ 2009-03-07 0:44 UTC (permalink / raw)
To: git; +Cc: Shawn Pearce
The time to submit our application as a mentoring organization to
Google Summer of Code 2009 is close: March 9 -- March 13.
I'd like to add a few ideas to the SoC2009Ideas wiki page, but before I do
this I'd like to ask for comments. (The proposals also lack proposed
mentors.)
I am wondering if it would be worth it to make a separate class between
"New to Git?" easy tasks, and "Larger Projects" hard tasks...
BTW, some of the ideas didn't make it from the SoC2008Ideas wiki page to the
current year's page, namely:
* Apply sparse To Fix Errors
* Lazy clone / remote alternates
* Implement git-submodule using .gitlink file
* Teach git-apply the 3-way merge fallback git-am knows
* Better Emacs integration
Was this omission deliberate or accidental?
-- >8 --
= New To Git? New To Open Source Development? =
== Packfile caching for git-daemon ==
Even with delta reuse, enumerating the objects to be included in a packfile
generates significant load on the server for pack-generating protocols,
such as the git:// protocol used by git-daemon. Many requests result in
the same packfile being generated and sent; examples include a full
clone, or a fetch of all branches since the last update. It would make sense
then to save (cache) packfiles and, where possible, avoid regenerating
them by sending them from the cache. (A possible extension would be to
send a slightly larger pack than needed if a cached packfile can be reused
instead.)
The goal is for git-daemon to cache packfiles, use cached packfiles where
possible, and manage the packfile cache. Note that the final version would
need some way to specify an upper limit on the packfile cache
size and some cache entry expiry policy.
'''Goal:''' Support for packfile cache in git-daemon,
benchmark server load
'''Language:''' C
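To make the caching idea more concrete, here is a minimal sketch in Python (the project itself would be written in C). Everything in it is an assumption made for illustration: the names `request_key`, `PackCache` and `serve` are hypothetical, and keying the cache on the sorted "want" and "have" object IDs is only one possible choice, not something git-daemon does today. It shows the three pieces named above: reuse of a cached pack, an upper limit on cache size, and an expiry policy.

```python
import hashlib
import time
from collections import OrderedDict


def request_key(wants, haves):
    """Build an order-independent cache key from the requested object IDs."""
    digest = hashlib.sha1()
    for label, ids in (("want", wants), ("have", haves)):
        for oid in sorted(set(ids)):
            digest.update(f"{label} {oid}\n".encode())
    return digest.hexdigest()


class PackCache:
    """LRU cache of generated packs with a byte limit and a maximum entry age."""

    def __init__(self, max_bytes, max_age, clock=time.monotonic):
        self.max_bytes = max_bytes
        self.max_age = max_age
        self.clock = clock
        self.entries = OrderedDict()  # key -> (created_at, pack_bytes)
        self.size = 0

    def _drop(self, key):
        _, pack = self.entries.pop(key)
        self.size -= len(pack)

    def get(self, key):
        entry = self.entries.get(key)
        if entry is None:
            return None
        created_at, pack = entry
        if self.clock() - created_at > self.max_age:
            self._drop(key)  # entry expired
            return None
        self.entries.move_to_end(key)  # mark as most recently used
        return pack

    def put(self, key, pack):
        if key in self.entries:
            self._drop(key)
        if len(pack) > self.max_bytes:
            return  # larger than the whole cache, do not store it
        self.entries[key] = (self.clock(), pack)
        self.size += len(pack)
        while self.size > self.max_bytes:
            self._drop(next(iter(self.entries)))  # evict least recently used


def serve(cache, wants, haves, generate_pack):
    """Return a pack for the request, generating it only on a cache miss."""
    key = request_key(wants, haves)
    pack = cache.get(key)
    if pack is None:
        pack = generate_pack(wants, haves)
        cache.put(key, pack)
    return pack
```

A real key would probably also have to cover the negotiated protocol capabilities, since they can change the bytes of the pack that is sent; the sketch ignores that.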
== Single credentials ==
Currently, unless you save your username and password in a plain-text
`.netrc` file (for HTTP transport), or avoid the need for interactive
credentials by using a public/private key pair (for SSH), you need to
repeat your credentials many times during a single git-fetch or git-clone
command. The goal is to reuse existing connections where possible, so that the
whole transaction occurs over a single connection with a single set of
credentials; if that is not possible, cache the credentials (in a secure way)
so that the user needs to provide a username and password at most once.
'''Goal:''' git-fetch and git-clone over HTTPS and git://
that require providing a username and password at most once
'''Language:''' C (perhaps also shell script)
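As an illustration of the "ask at most once" behaviour, here is a small Python sketch of an in-memory credential cache. It is only a sketch under stated assumptions: the class `CredentialCache`, its `prompt` callback and the 900-second default timeout are all hypothetical, and it does not address the "in a secure way" requirement (credentials simply stay in process memory until they time out).

```python
import time


class CredentialCache:
    """In-memory store: prompt once per (protocol, host), then reuse the answer."""

    def __init__(self, prompt, timeout=900, clock=time.monotonic):
        self.prompt = prompt    # callable(protocol, host) -> (username, password)
        self.timeout = timeout  # seconds a cached entry stays valid (assumed value)
        self.clock = clock
        self.entries = {}       # (protocol, host) -> (expires_at, credentials)

    def get(self, protocol, host):
        key = (protocol, host)
        entry = self.entries.get(key)
        if entry is not None and self.clock() < entry[0]:
            return entry[1]  # still valid, no prompt needed
        credentials = self.prompt(protocol, host)
        self.entries[key] = (self.clock() + self.timeout, credentials)
        return credentials

    def reject(self, protocol, host):
        """Forget credentials the server refused, so the next get() prompts again."""
        self.entries.pop((protocol, host), None)
```

Every request made during a single fetch would call `get()`, so the user is only prompted on the first one; `reject()` covers the case of a mistyped password.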
= Larger Projects =
== Directory renames ==
Git deals quite well with renames when merging. One of the corner cases
is when one side renamed a directory, and the other side created ''new
files'' in the old-name directory. Git currently creates the new files in
a resurrected old-name directory, while it could create them under
the new-name directory instead.
There is a bit of controversy about this feature, since, for example, in
some programming languages (e.g. Java) or with some project build tool
info it is not possible to simply move a file (or create a new file in
a different directory) without changing the file's contents. Some say that
it is better to fail than to produce a wrongly clean merge.
'''Goal:''' At minimum, an option enabling wholesale directory rename
detection. Preferably, also add handling of directory renames to
merge. Lastly, one can try to implement "git log --follow" for
directories.
'''Language:''' C
'''See:''' [http://thread.gmane.org/gmane.comp.version-control.git/99529 |RFC PATCH v2 0/2| Detection of directory renames] thread on the git
mailing list (via GMane)
'''See also:'''
* [http://thread.gmane.org/gmane.comp.version-control.git/80912/focus=81362 merge renamed files/directories?] subthread on the git mailing list
* [http://thread.gmane.org/gmane.comp.version-control.git/108106 Comments on "Understanding Version Control" by Eric S. Raymond] thread, which contains some thoughts on wholesale directory rename detection
* [http://blog.teksol.info/2008/01/16/directory-renames-under-git Directory renames under Git] blog post, which notes the issue
* [http://www.markshuttleworth.com/archives/123 Renaming is the killer app of distributed version control] blog post by Mark Shuttleworth (pro-Bazaar).
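To illustrate the directory rename corner case described above, here is a minimal Python sketch. It is not how git detects renames; the function names `infer_directory_renames` and `remap_added_paths` are hypothetical, and the rule used (a directory counts as renamed only if every file renamed out of it went to the same new directory) is an assumption chosen to keep the example small.

```python
import posixpath
from collections import defaultdict


def infer_directory_renames(file_renames):
    """Map old directory -> new directory, given one side's file renames.

    file_renames maps old path -> new path. A directory rename is inferred
    only when all files renamed out of a directory agree on the target.
    """
    targets = defaultdict(set)
    for old, new in file_renames.items():
        old_dir, new_dir = posixpath.dirname(old), posixpath.dirname(new)
        if old_dir and old_dir != new_dir:  # ignore top-level and in-place renames
            targets[old_dir].add(new_dir)
    return {old_dir: next(iter(new_dirs))
            for old_dir, new_dirs in targets.items() if len(new_dirs) == 1}


def remap_added_paths(added_paths, dir_renames):
    """Place files the other side added in a renamed directory under its new name."""
    remapped = {}
    for path in added_paths:
        directory, name = posixpath.split(path)
        target = dir_renames.get(directory)
        remapped[path] = posixpath.join(target, name) if target is not None else path
    return remapped
```

For example, if one side renamed `lib/a.c` and `lib/b.c` into `src/`, and the other side added `lib/new.c`, the sketch would place the new file at `src/new.c`. It only looks at a file's immediate directory, so nested subdirectories of a renamed directory are not handled, and a split move (files going to two different directories) yields no directory rename at all.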
--
Jakub Narebski
Poland
* Re: [GSoC] Google Summer of Code 2009 - new ideas
From: Jan Janak @ 2009-03-07 1:09 UTC (permalink / raw)
To: Jakub Narebski; +Cc: git, Shawn Pearce
On 07-03 01:44, Jakub Narebski wrote:
> The time to submit our application as a mentoring organization to
> Google Summer of Code 2009 is close: March 9 -- March 13.
>
> I'd like to add a few ideas to the SoC2009Ideas wiki page, but before I do
> this I'd like to ask for comments. (The proposals also lack proposed
> mentors.)
>
> I am wondering if it would be worth it to make a separate class between
> "New to Git?" easy tasks, and "Larger Projects" hard tasks...
>
> BTW, some of the ideas didn't make it from the SoC2008Ideas wiki page to the
> current year's page, namely:
> * Apply sparse To Fix Errors
> * Lazy clone / remote alternates
> * Implement git-submodule using .gitlink file
> * Teach git-apply the 3-way merge fallback git-am knows
> * Better Emacs integration
There are already two (IMHO good) Emacs modes for git, magit and egg:
http://zagadka.vm.bytemark.co.uk/magit/
http://github.com/bogolisk/egg/tree/master
Jan.
* Re: [GSoC] Google Summer of Code 2009 - new ideas
From: Tay Ray Chuan @ 2009-03-07 2:56 UTC (permalink / raw)
To: Jakub Narebski; +Cc: git, Shawn Pearce
Hi,
On 3/7/09, Jakub Narebski <jnareb@gmail.com> wrote:
> == Single credentials ==
>
> Currently, unless you save your username and password in a plain-text
> `.netrc` file (for HTTP transport), or avoid the need for interactive
> credentials by using a public/private key pair (for SSH), you need to
> repeat your credentials many times during a single git-fetch or git-clone
> command. The goal is to reuse existing connections where possible, so that the
> whole transaction occurs over a single connection with a single set of
> credentials; if that is not possible, cache the credentials (in a secure way)
> so that the user needs to provide a username and password at most once.
>
> '''Goal:''' git-fetch and git-clone over HTTPS and git://
> that require providing a username and password at most once
> '''Language:''' C (perhaps also shell script)
Perhaps you might want to look at this:
http://marc.info/?l=git&m=123599968929476&w=4
At that time, I was thinking more of removing git's reliance on curl's
multi interface so that it could use older versions of libcurl. But,
on this point, Daniel convinced me otherwise. In fact, it doesn't make
sense to have an up-to-date git but not an up-to-date curl.
I didn't really get a reply on my point of "minimized credential
prompting", though, and I think this GSoC proposal kinda gives support
to it.
From a learning standpoint, I don't think this project would be too
challenging, nor could it fill a whole summer -- the basic
strategy to allow non-curl multi usage (i.e. single connections) would
be to "fork" the current http slot methods and make them
non-curl_multi, then find and replace instances of them
throughout the code base.
I already have a patch series that does that, plus a --persistent
option for push. I'm fairly sure that it takes place on a single
connection (I'm relying on my firewall log, though I doubt its
reliability on this issue).
--
Cheers,
Ray Chuan
* Re: [GSoC] Google Summer of Code 2009 - new ideas
From: Jakub Narebski @ 2009-03-08 23:59 UTC (permalink / raw)
To: Tay Ray Chuan; +Cc: git, Shawn Pearce
On Sat, 7 Mar 2009, Tay Ray Chuan wrote:
> On 3/7/09, Jakub Narebski <jnareb@gmail.com> wrote:
> > == Single credentials ==
> >
> > Currently, unless you save your username and password in a plain-text
> > `.netrc` file (for HTTP transport), or avoid the need for interactive
> > credentials by using a public/private key pair (for SSH), you need to
> > repeat your credentials many times during a single git-fetch or git-clone
> > command. The goal is to reuse existing connections where possible, so that the
> > whole transaction occurs over a single connection with a single set of
> > credentials; if that is not possible, cache the credentials (in a secure way)
> > so that the user needs to provide a username and password at most once.
> >
> > '''Goal:''' git-fetch and git-clone over HTTPS and git://
> > that require providing a username and password at most once
> > '''Language:''' C (perhaps also shell script)
>
> Perhaps you might want to look at this:
>
> http://marc.info/?l=git&m=123599968929476&w=4
Thank you for the link.
> At that time, I was thinking more of removing git's reliance on curl's
> multi interface so that it could use older versions of libcurl. But,
> on this point, Daniel convinced me otherwise. In fact, it doesn't make
> sense to have an up-to-date git but not an up-to-date curl.
>
> I didn't really get a reply on my point of "minimized credential
> prompting", though, and I think this GSoC proposal kinda gives support
> to it.
>
> From a learning standpoint, I don't think this project would be too
> challenging, nor could it fill a whole summer -- the basic
> strategy to allow non-curl multi usage (i.e. single connections) would
> be to "fork" the current http slot methods and make them
> non-curl_multi, then find and replace instances of them
> throughout the code base.
I was thinking more about having git cache credentials rather than forcing
the use of a single connection. Additionally, you are solving the problem for
the HTTP(S) transport; admittedly, for SSH there is the much better solution
of using public/private keys instead of asking for a password.
I guess you are right and "minimized credential prompting" (aka "single
credentials") is too small a project for Google Summer of Code...
I won't add it to the SoC2009Ideas page.
> I already have a patch series that does that, plus a --persistent
> option for push. I'm fairly sure that it takes place on a single
> connection (I'm relying on my firewall log, though I doubt its
> reliability on this issue).
--
Jakub Narebski
Poland
* Re: [GSoC] Google Summer of Code 2009 - new ideas
From: P Baker @ 2009-03-07 19:59 UTC (permalink / raw)
To: git
I posted to this mailing list a few days ago about one of the 2008 SoC
ideas. Are those ideas still plausible? Specifically, I'm interested
in pursuing the git-submodule update. Is this off the drawing board?
P Baker
* Re: [GSoC] Google Summer of Code 2009 - new ideas
From: Shawn O. Pearce @ 2009-03-07 20:55 UTC (permalink / raw)
To: P Baker; +Cc: git
P Baker <me@retrodict.com> wrote:
> I posted to this mailing list a few days ago about one of the 2008 SoC
> ideas. Are those ideas still plausible? Specifically, I'm interested
> in pursuing the git-submodule update. Is this off the drawing board?
It's not off the table just because of what someone else proposed
as an ideas list for 2009.
A GSoC project is what the student makes of it. You propose
something that you are interested in working on, that you think
you can complete in the time available within the program calendar,
and that the larger Git community would like to see implemented.
I think your post fell flat a few days ago with no replies because
it just didn't seem to invite any response from people. Remember,
we're all quite busy with our own projects and lives too, just
like you are, and we do Git hacking/emailing in our spare time.
An email needs to be somewhat compelling and invite a reply in
order to get a reply...
--
Shawn.
* Re: [GSoC] Google Summer of Code 2009 - new ideas
From: Shawn O. Pearce @ 2009-03-10 0:49 UTC (permalink / raw)
To: Jakub Narebski; +Cc: git
Jakub Narebski <jnareb@gmail.com> wrote:
> I'd like to add a few ideas to the SoC2009Ideas wiki page, but before I do
> this I'd like to ask for comments. (The proposals also lack proposed
> mentors.)
>
> I am wondering if it would be worth it to make a separate class between
> "New to Git?" easy tasks, and "Larger Projects" hard tasks...
Done, there is now a "Medium" category. Folks should start to
repaint the bikeshed if they disagree with my current choice
of colors.
> BTW, some of the ideas didn't make it from the SoC2008Ideas wiki page to the
> current year's page, namely:
> * Apply sparse To Fix Errors
> * Lazy clone / remote alternates
> * Implement git-submodule using .gitlink file
> * Teach git-apply the 3-way merge fallback git-am knows
> * Better Emacs integration
> Was this omission deliberate or accidental?
Accidental. Most have been added back. Of note, I did not add back
the Emacs integration idea, as we already have multiple Emacs packages.
--
Shawn.