Git development
 help / color / mirror / Atom feed
* Re: [PATCH 1/3] Convert resolve_ref+xstrdup to new resolve_refdup function
From: Jonathan Nieder @ 2011-12-10 13:15 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git, Junio C Hamano, Tony Wang
In-Reply-To: <1323521631-24320-1-git-send-email-pclouds@gmail.com>

Nguyễn Thái Ngọc Duy wrote:

> [Subject: Convert resolve_ref+xstrdup to new resolve_refdup function]

I like it.

> --- a/builtin/revert.c
> +++ b/builtin/revert.c
> @@ -901,7 +901,7 @@ static int rollback_single_pick(void)
>  	if (!file_exists(git_path("CHERRY_PICK_HEAD")) &&
>  	    !file_exists(git_path("REVERT_HEAD")))
>  		return error(_("no cherry-pick or revert in progress"));
> -	if (!resolve_ref("HEAD", head_sha1, 0, NULL))
> +	if (read_ref_full("HEAD", head_sha1, 0, NULL))
>  		return error(_("cannot resolve HEAD"));
>  	if (is_null_sha1(head_sha1))
>  		return error(_("cannot abort from a branch yet to be born"));

Unrelated change that snuck in, I assume?

The rest of the patch looks very sensible and no-op-like. :)

^ permalink raw reply

* Re: [RFC/PATCH] add update to branch support for "floating submodules"
From: Gioele Barabucci @ 2011-12-10 14:16 UTC (permalink / raw)
  To: git; +Cc: Heiko Voigt, git
In-Reply-To: <7vr51htbsy.fsf@alter.siamese.dyndns.org>

On 09/11/2011 18:01, Junio C Hamano wrote:
>> This is almost ready but I would like to know what users of the
>> "floating submodule" think about this.
 >
> I do like to hear from potential users as well, because the general
> impression we got was that floating submodules is not a real need of
> anybody,

Floating modules are something much sought after by those who use Git 
for non-development purposes, like those who have most of their $HOME 
versioned with Git [1]. For example, part of what Joey Hess's `mr` tool 
[2] does is to simulate floating submodules for Git-versioned $HOMEs.

In the context of versioned $HOMEs, or with backups in general, precise 
tracking of submodules updates is not that important. To quote [3]: 
«Last, change tracking is a bit more lenient with home directories. I 
may shuffle some stuff around, and I don't need to explain the changes 
to anyone else.». In my case, I want my ~/Documents dir (that is in a 
different repo from $HOME) to be always updated; I would prefer not to 
deal with submodule updates, merges and detached HEADs.

Bye,

[1] http://vcs-home.branchable.com/
[2] http://kitenet.net/~joey/code/mr/
[3] http://joshcarter.com/productivity/svn_hg_git_for_home_directory

--
Gioele Barabucci <gioele@svario.it>

^ permalink raw reply

* [PATCH] gitk: make "git describe" output clickable, too
From: Jim Meyering @ 2011-12-10 15:08 UTC (permalink / raw)
  To: git list


I noticed that automake's contribution guidelines suggest using
"git describe" output in commit logs to reference previous commits.
By contrast, in coreutils, I had acquired the habit of using a bare SHA1
prefix (8 hex digits), since gitk creates clickable links for that, and
not for "git describe" output.

I prefer the readability of the full "git describe" output, yet want to
retain the gitk links, so wrote the following that renders as clickable
not just SHA1-like strings, but also an SHA1-like string that is
prefixed by "-g".

Signed-off-by: Jim Meyering <meyering@redhat.com>
---
This is relative to master.
Think of this as mere proof-of-concept:

Ideally, the string preceding the -g would be used to disambiguate
the SHA1 prefix, but that would require more code.

I confess that I haven't looked to see if documentation needs
to be updated or if this would merit test suite additions.

 gitk-git/gitk |    6 +++++-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/gitk-git/gitk b/gitk-git/gitk
index 4cde0c4..f8eb613 100755
--- a/gitk-git/gitk
+++ b/gitk-git/gitk
@@ -6688,7 +6688,7 @@ proc appendwithlinks {text tags} {

     set start [$ctext index "end - 1c"]
     $ctext insert end $text $tags
-    set links [regexp -indices -all -inline {\m[0-9a-f]{6,40}\M} $text]
+    set links [regexp -indices -all -inline {(?:\m|-g)[0-9a-f]{6,40}\M} $text]
     foreach l $links {
 	set s [lindex $l 0]
 	set e [lindex $l 1]
@@ -6704,6 +6704,10 @@ proc appendwithlinks {text tags} {
 proc setlink {id lk} {
     global curview ctext pendinglinks

+    if {[string range $id 0 1] eq "-g"} {
+      set id [string range $id 2 end]
+    }
+
     set known 0
     if {[string length $id] < 40} {
 	set matches [longid $id]
--
1.7.8.163.g9859a

^ permalink raw reply related

* Re: [RFC/PATCH] add update to branch support for "floating submodules"
From: Leif Gruenwoldt @ 2011-12-10 15:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vborhaqgw.fsf@alter.siamese.dyndns.org>

On Sat, Dec 10, 2011 at 1:30 AM, Junio C Hamano <gitster@pobox.com> wrote:

> So that use case does not sound like a good rationale to require addition
> of floating submodules.

Ok I will try another scenario :)

Imagine again products A, B and C and a common library. The products are in
a stable state of development and track a stable branch of the common lib.
Then imagine an important security fix gets made to the common library. On
the next pull of products A, B, and C they get this fix for free
because they were
floating. They didn't need to communicate with the maintainer of the common
repo to know this. In fact they don't really care. They just want the
latest stable
code for that release branch.

This is how package management on many linux systems works. Dependencies
get updated and all products reap the benefit (or catastrophe) automatically.

^ permalink raw reply

* Re: Access to git repository through squid proxy: The remote end hung up unexpectedly
From: Konstantin Khomoutov @ 2011-12-10 15:37 UTC (permalink / raw)
  To: Mathieu Peltier; +Cc: git
In-Reply-To: <CACjeFCA4h_w2UmYywMBV_P+YZcWAE=zRUz-z5eTfAO+oxWKPjw@mail.gmail.com>

On Sat, 10 Dec 2011 09:56:07 +0100
Mathieu Peltier <mathieu.peltier@gmail.com> wrote:

> Hi,
> I am trying to access a git repository (git:// URL) through a squid
> proxy.
> 
> squid allows CONNECT for port 9418:
[...]
> 2011/12/09 12:22:44 socat[21428] D shutdown()  -> 0
> fatal: The remote end hung up unexpectedly
> 
> I tried to use also nc but I get the same error.
> Any advice?
I think you have to verify the git-daemon (you did not say you're using
git-daemon, but it can be presupposed based on the port number) works
by itself before starting to wrap it into layers of complexity.
What happens if you try to clone a git repo directly, without any
tunneling?  If this is not possible, try to clone on the host running
git-daemon (use an URL like git://localhost/path/to/repo.git).
If it fails (I suppose it will), try increasing the daemon verbosity
(see git-daemon) manpage.
After all, may be it's as simple as forgetting to `touch` git-export-ok
file in the repository you're trying to clone.

P.S.
As a side note: why are you trying to implement such a strange setup?
Why not just use plain old SSH which just works and provides good level
of security (contrary to Basic HTTP authentication you might be using).
If you need a level of control about who can do what with the repository
you could look at https://github.com/sitaramc/gitolite

^ permalink raw reply

* Re: [PATCH 1/3] Convert resolve_ref+xstrdup to new resolve_refdup function
From: Nguyen Thai Ngoc Duy @ 2011-12-10 15:40 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, Junio C Hamano, Tony Wang
In-Reply-To: <20111210131503.GI22035@elie.hsd1.il.comcast.net>

2011/12/10 Jonathan Nieder <jrnieder@gmail.com>:
>> --- a/builtin/revert.c
>> +++ b/builtin/revert.c
>> @@ -901,7 +901,7 @@ static int rollback_single_pick(void)
>>       if (!file_exists(git_path("CHERRY_PICK_HEAD")) &&
>>           !file_exists(git_path("REVERT_HEAD")))
>>               return error(_("no cherry-pick or revert in progress"));
>> -     if (!resolve_ref("HEAD", head_sha1, 0, NULL))
>> +     if (read_ref_full("HEAD", head_sha1, 0, NULL))
>>               return error(_("cannot resolve HEAD"));
>>       if (is_null_sha1(head_sha1))
>>               return error(_("cannot abort from a branch yet to be born"));
>
> Unrelated change that snuck in, I assume?

Yeah that slipped in. It should be part of c689332 (Convert many
resolve_ref() calls to read_ref*() and ref_exists() - 2011-11-13). I
guess either I missed it or it was a new call site after that patch.
Split it out as a separate patch?
-- 
Duy

^ permalink raw reply

* Re: [POC PATCH 0/5] Threaded loose object and pack access
From: Nguyen Thai Ngoc Duy @ 2011-12-10 15:51 UTC (permalink / raw)
  To: Thomas Rast; +Cc: git, René Scharfe, Junio C Hamano, Eric Herman
In-Reply-To: <cover.1323419666.git.trast@student.ethz.ch>

On Fri, Dec 9, 2011 at 3:39 PM, Thomas Rast <trast@student.ethz.ch> wrote:
> Well, just to make sure we're all left in a confused mess of partly
> conflicting patches, here's another angle on the same thing:
>
> Jeff King wrote:
>> Wow, that's horrible. Leaving aside the parallelism, it's just terrible
>> that reading from the cache is 20 times slower than the worktree. I get
>> similar results on my quad-core machine.
>
> By poking around in sha1_file.c I got that down to about 10.  It's not
> great yet, but it seems a start.
>
> The goal would be to improve it to the point where a patch lookup that
> already has all relevant packs open and windows mapped can proceed
> without locking.  I'm not sure that's doable short of duplicating the
> whole pack state (including fds and windows) across threads, but I'll
> give it some more thought before going that route.

Another potential user for parallel pack access is fsck. Although fsck
access pattern may be different from grep, fsck would open and read
through all packs.
-- 
Duy

^ permalink raw reply

* Re: git-work, git-base: an example of how to use it.
From: Adam Spiers @ 2011-12-10 15:54 UTC (permalink / raw)
  To: Jon Seymour; +Cc: Git Mailing List
In-Reply-To: <BANLkTim07-a5VwSAt7_vLMzOES_JZad9DA@mail.gmail.com>

Hi Jon,

I've only just discovered git-work, and it looks extremely
interesting.  From what I can tell so far, it is exactly what I was
looking for and could have recently saved me considerable pain!
Here's some initial feedback.

On Mon, Apr 25, 2011 at 11:43 AM, Jon Seymour <jon.seymour@gmail.com> wrote:
> I haven't had much feedback about git-work, to this point. Peter
> Baumann mentioned it was a little hard to grok.

I suspect that there is a strong correlation between these two points.
I also found it hard to grok, although I think this could be easily
fixed with a few tweaks to your existing docs.  Hopefully this would
result in more feedback.

First some comments about Documentation/git-work.txt:

The existing DESCRIPTION really isn't a description, but merely a
legend for the COMMANDS section, which is where it really should
belong IMHO.  Also, the first sentence refers to the "base" of
{branch} without explaining what that means.  This is currently the
first full sentence a potential user is likely to read about git-work,
but for me at least, unfortunately it triggered a "huh?" reaction
which sapped my will to continue reading.  Subsequently I discovered
that git-base.txt gives a definition of the "base" of a working
branch, but it's such a key ingredient in understanding the whole
workflow that it needs to be covered in the primary document, not in
the manual page for the git-base helper command which, if I understand
correctly, is rarely meant to be invoked directly by the user anyway.
Furthermore, the "base" is defined in terms of the user's "current
work", but that is not clearly defined.

In contrast, the contents of the DISCUSSION section are very helpful;
my initial reaction was "yes! this is exactly what I need!"  So I
think *this* should be the DESCRIPTION section, so that it's the first
thing a potential user encounters, other than the SYNOPSIS section
which by necessity of convention has to be at the top.  BTW, there's
an "integraton" typo which needs to be fixed.

The EXAMPLES section which immediately follows the DISCUSSION is just
sequence of one-liners and it's not clear how they are related to each
other (if at all), and why/when you would want to use each one.

It would be better if the EXAMPLES section showed a use case "story"
which starts with examples of simple usage and then builds up to more
sophisticated workflows.  You seem to have already started to address
this with your README.md example, but the first step in that is a 'git
add' without even explaining which branch you are on at the time, what
commits are already on that branch, or how it relates to other
branches.  So currently (for my small brain, at least) there is too
much of a "WTF" factor for it to be useful at first sight.

Perhaps part of the confusion arises from a clash between your concept
of a private working branch which is regularly rebased by 'git work',
and many people's concept of master, which is often a public branch
which as such is expected to be safe to regularly merge from.  Maybe
you could avoid this by recommending that a user adopting a workflow
based on 'git work' should use it to control an obviously private
branch, e.g. one named 'private' / 'working' / 'unstable' rather than
'master'.

In another email to this list (Subject: [PATCH 00/23] RFC: Introducing
git-test, git-atomic, git-base and git-work) you give some more
workflow examples, which are useful and could be incorporated into a
use case story.

A well-constructed story would answer questions implicitly raised
within the (current) DISCUSSION section, such as:

  - How are dependencies tracked?

  - Can you have chains of dependencies?  If so, what does the
    dependency graph look like?  Can a branch have multiple (direct)
    dependencies?

  - In what order are dependencies included in the base of the working
    branch?

By the way, one of the EXAMPLES one-liners is described as "start
gitk, showing only the current work", but looks like it might be a
typo; shouldn't it read:

    $ gitk $(git work)

?

Another suggestion to encourage people to try your work out: provide a
quick "how to try this out" guide, preferably one which doesn't
involve building the whole of git.

Finally, please wrap all lines in git-work.txt etc. to less than 80
columns to conform to existing style.  Currently these files are
unreadable in certain contexts, e.g.

  https://github.com/jonseymour/git/blob/master/Documentation/git-work.txt

due to the unreadably long lines.

I hope that's useful feedback.  I will continue experimenting with it ...

Adam

^ permalink raw reply

* Re: [RFD] Handling of non-UTF8 data in gitweb
From: Jakub Narebski @ 2011-12-10 16:18 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jürgen Kreileder, John Hawley
In-Reply-To: <7vehwhcj3q.fsf@alter.siamese.dyndns.org>

On Wed, 7 Dec 2011, Junio C Hamano wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
> 
> > But doing this would change gitweb behavior.  Currently when 
> > encountering something (usually line of output) that is not valid 
> > UTF-8, we decode it (to UTF-8) using $fallback_encoding, by default
> > 'latin1'.  Note however that this value is per gitweb installation,
> > not per repository.
> 
> I think we added and you acked 00f429a (gitweb: Handle non UTF-8 text
> better, 2007-06-03) for a good reason, and I think the above argues that
> whatever issue the commit tried to address is a non-issue. Is it really
> true?

I think that UTF-8 is much more prevalent character encoding in operating
systems, programming languages, APIs, and software applications than it
was 4 years ago.

Also the solution implemented in said commit was a good start, but it
remains incomplete: $fallback_encoding is per-installation which is too
big granularity (there is `gui.encoding` per-repository config... but it
is about main not fallback encoding; best would be to use gitattribute
but currently there is no way to check attribute value at given revision).

The proposed

  use open qw(:std :utf8);

and removal of to_utf8 and $fallback_encoding would be regression compared
to post-00f429a... but the tradeoff of more robust UTF-8 handling might be
worth it.


Note that to_utf8 handles git command output part by part, not as a whole;
for UTF-8 vs latin1 (i.e. iso-8859-1) it does not matter though because
latin1 is very unlikely to be recognized as valid utf-8[1], and ASCII
characters pass-through for UTF-8.

[1]: http://en.wikipedia.org/wiki/UTF-8#Advantages

> > ... I guess
> > it could be emulated by defining our own 'utf-8-with-fallback'
> > encoding, or by defining our own PerlIO layer with PerlIO::via.
> > But it no longer be simple solution (though still automatic).
> 
> Between the current "everybody who read from the input must remember to
> call to_utf8" and "everybody gets to read utf8 automatically for internal
> processing", even though the latter may involve one-time pain to set up
> the framework to do so, the pros-and-cons feels obvious to me.

There is also a matter of performance (':utf8' and ':encoding(UTF-8)'
are AFAIK implemented in C, both the Encode part and PerlIO part).
-- 
Jakub Narebski
Poland

^ permalink raw reply

* Re: Access to git repository through squid proxy: The remote end hung up unexpectedly
From: Mathieu Peltier @ 2011-12-10 17:56 UTC (permalink / raw)
  To: Konstantin Khomoutov; +Cc: git
In-Reply-To: <20111210193753.994055f2.kostix@domain007.com>

> What happens if you try to clone a git repo directly, without any
> tunneling?  If this is not possible, try to clone on the host running
[...]
> As a side note: why are you trying to implement such a strange setup?

It works from another machine without proxy. In fact I am trying to
allow use of git in a corporate environment (for example to access
read-only git repositories of sf.net only accessible through git
protocol). See for example:
http://www.emilsit.net/blog/archives/how-to-use-the-git-protocol-through-a-http-connect-proxy/
Thanks.Mathieu

^ permalink raw reply

* Re: Question about commit message wrapping
From: Jakub Narebski @ 2011-12-10 19:30 UTC (permalink / raw)
  To: Sidney San Martín; +Cc: git
In-Reply-To: <E085218D-9287-4F82-B34C-8379742F818A@sidneysm.com>

On Fri, 9 Dec 2011, Sidney San Martín wrote:
> On Dec 9, 2011, at 8:41 AM, Jakub Narebski wrote:
>> Sidney San Martín <s@sidneysm.com> writes:
>> 
>>> *Nothing else* in my everyday life works this way anymore. Line
>>> wrapping gets done on the display end in my email client, my web
>>> browser, my ebook reader entirely automatically, and it adapts to
>>> the size of the window.
>> 
>> The problem with automatic wrapping on the display is that there could
>> be parts of commit message that *shouldn't* be wrapped, like code
>> sample, or URL... and in plain text you don't have a way to separate
>> flowed from non-flowed part.
 
Additional and the more serious problem with wrapping on output is
related to backward compatibility.  If you have commit message that is
wrapped e.g. to 80 characters, and you wrap on output to 72 characters,
you would get ugly and nigh unreadable ragged output:

original:

  Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
  tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim
  veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea
  commodo consequat. Duis aute irure dolor in reprehenderit in voluptate
  velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint
  occaecat cupidatat non proident, sunt in culpa qui officia deserunt
  mollit anim id est laborum.

wrapped to 70th column:

  Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do
  eiusmod
  tempor incididunt ut labore et dolore magna aliqua. Ut enim ad
  minim
  veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip
  ex ea
  commodo consequat. Duis aute irure dolor in reprehenderit in
  voluptate
  velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint
  occaecat cupidatat non proident, sunt in culpa qui officia deserunt
  mollit anim id est laborum.

You can re-wrap, but this is more code, and additional problems, like
with ASCII-art like this or pseudo-Markdown headers like this
               ^^^^^^^^^

  Header
  ======
 
> I usually lead code, URLs, and other preformatted lines like this with
> a few spaces. Markdown uses the same convention, and it looks like
> commits in this repo do too.  
> 
> The patch in my last email prints lines which begin with whitespace
> verbatim. How does that work? 

What about lists done using hanging indent, e.g.:

  * Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do
    eiusmod tempor incididunt ut labore et dolore magna aliqua.
 
or

  - Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do
    eiusmod tempor incididunt ut labore et dolore magna aliqua.

or

 1. Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do
    eiusmod tempor incididunt ut labore et dolore magna aliqua.

>> Also with long non-breakable identifiers you sometimes need to wrap by
>> hand (or use linebreaking algorithm from TeX) or go bit over the limit
>> to make it look nice.
> 
> Could you elaborate on this? The patch uses strbuf_add_wrapped_text,
> which was already in Git. If an identifier is longer than the wrap
> width, it leaves it alone.  

The line breaking algorithm of TeX typesetting engine takes into account
look of whole paragraph (well, mathematical representation of "good look")
before breaking lines and hyphenating words.

What I meant here that if you have long identifier, naive line-breaking
(word-wrapping) algorithm could leave the paragraph unbalanced, with one
or more lines much shorter than the rest, while allowing one line to be
overly long over 1 character leads to nicely wrapped result.


BTW. In Polish (and Czech) typography there is rule that you don't leave
single-letter prepositions like 'i' ('and' in English) hanging at the end
of the line... automatic wrapper cannot ensure that.
 
>> BTW. proper editor used to create commit message can wrap it for you
>> ;-).
> 
> Only if everybody else in the world does it too!

Educate.

> And only if I never care about seeing my commits at any width besides
> 80 columns.  

Overly long lines are hard to read, and sometimes fit-to-width would give
us too long lines to read comfortably.


Anyway I am not against adding support for wrapping commit message and
tag payload, but IMHO it must be *optional*: --no-wrap, --wrap=auto
(or just --wrap), --wrap=80, and log.wrap or core.wrap.

-- 
Jakub Narebski
Poland

^ permalink raw reply

* Re: [PATCHv3 03/13] introduce credentials API
From: Jeff King @ 2011-12-10 19:48 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Junio C Hamano, git
In-Reply-To: <m3r50chcvm.fsf@localhost.localdomain>

On Sat, Dec 10, 2011 at 03:43:01AM -0800, Jakub Narebski wrote:

> Jeff King <peff@peff.net> writes:
> 
> > There are a few places in git that need to get a username
> > and password credential from the user; the most notable one
> > is HTTP authentication for smart-http pushing.
> 
> A question: does it work also for access via SSH, either without
> public key set up (i.e. 'keyboard-interactive'), or with encrypted
> private key without ssh-agent set up?

No. ssh handles its own password querying, and contacts the user
directly via the terminal. And there's not much point in using a
password helper with ssh; if you don't want to type your password, set
up a key and use ssh-agent.

> It would probably require URL i.e. ssh://git.example.com/srv/scm/repo.git
> or git+ssh://git.example.com/srv/scm/repo.git and not scp-like
> address i.e. user@git.example.com:/srv/scm/repo.git, isn't it?

It's not a matter of recognizing the URL; it's that we hand off the
authentication problem to ssh, which takes care of it entirely itself.

-Peff

^ permalink raw reply

* Re: [PATCHv3 08/13] credential: make relevance of http path configurable
From: Jeff King @ 2011-12-10 19:50 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Junio C Hamano, git
In-Reply-To: <m3mxb0hcjs.fsf@localhost.localdomain>

On Sat, Dec 10, 2011 at 03:50:11AM -0800, Jakub Narebski wrote:

> > This is nothing you couldn't do with a clever credential
> > helper at the start of your stack, like:
> > 
> >   [credential "http://"]
> > 	helper = "!f() { grep -v ^path= ; }; f"
> > 	helper = your_real_helper
> > 
> > But doing this:
> > 
> >   [credential]
> > 	useHttpPath = false
> 
> Shouldn't this be 'usePath' or 'usePathComponent' or 'useRepositoryPath',
> etc.?  Because if^W when remote helper for Subversion is complete, you
> could have svn://svnserve.example.com/path/to/repo as an URL... which
> would be not HTTP(S).

It must be per-protocol, because paths will be relevant for some
protocols (e.g., unlocking certificates in the local filesystem).
So if anything, it would be "useNetworkPath" or something, if we wanted
to unify svn and http. But it would also be OK to have a separate flag
for http and svn, which is more flexible (and most users aren't going to
set it at all, so they won't notice).

-Peff

^ permalink raw reply

* Re: [PATCHv3 11/13] strbuf: add strbuf_add*_urlencode
From: Jeff King @ 2011-12-10 20:09 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Junio C Hamano, git
In-Reply-To: <m3iplohc6s.fsf@localhost.localdomain>

On Sat, Dec 10, 2011 at 03:57:59AM -0800, Jakub Narebski wrote:

> Jeff King <peff@peff.net> writes:
> 
> > +void strbuf_add_urlencode(struct strbuf *sb, const char *s, size_t len,
> > +			  int reserved)
> > +{
> > +	strbuf_grow(sb, len);
> 
> What is this `reserved` flag for, and when should one use it?
> It would be nice to have a short comment...

It indicates whether one should encode rfc3986 reserved characters. You
want to use it when encoding the host, username, and password portions
of a URL (i.e., before the "/"), but not the path (since you don't want
to encode all of the slashes). If you were breaking down the path more
(e.g., into a "query" and "fragment" portion), you would care about more
specific quoting there, but we don't; we treat everything after the
slash as an opaque blob of path.

Patch to the strbuf api documentation is below. I think it should be
squashed into patch 12/13.

> BTW. was it meant to be aligned like this?
> 
> > +void strbuf_add_urlencode(struct strbuf *sb, const char *s, size_t len,
> > +			     int reserved)

It is aligned correctly. When you ident by tabs, the "+" of the diff
gets soaked in the first tabstop, so it is off-by-one with respect to
non-tabbed input (it is off even more in the quoted section above,
because "> > +" gets soaked into the first tabstop). You can see your
version above also is misaligned when I quote it. :)

If you apply the diff, the result is as you expected.

-Peff

---
diff --git a/Documentation/technical/api-strbuf.txt b/Documentation/technical/api-strbuf.txt
index afe2759..e1ab6c5 100644
--- a/Documentation/technical/api-strbuf.txt
+++ b/Documentation/technical/api-strbuf.txt
@@ -270,3 +270,14 @@ same behaviour as well.
 	third argument can be used to set the environment which the editor is
 	run in. If the buffer is NULL the editor is launched as usual but the
 	file's contents are not read into the buffer upon completion.
+
+`strbuf_add_urlencode`::
+
+	Copy data to the end of the buffer, percent-encoding it as per
+	rfc3986. If the reserved flag is non-zero, then characters in
+	the rfc3986 reserved list are percent-encoded; otherwise, they
+	are copied literally. Characters in the rfc3986 unreserved list
+	are always copied literally. All other characters are
+	percent-encoded. Typically, one would use the reserved flag for
+	the host and user-info sections of a URL, but leave special
+	characters in the path untouched.

^ permalink raw reply related

* Re: [PATCH 3/3] Rename resolve_ref() to resolve_ref_unsafe()
From: Jonathan Nieder @ 2011-12-10 20:55 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git, Junio C Hamano, Tony Wang
In-Reply-To: <1323521631-24320-3-git-send-email-pclouds@gmail.com>

Nguyễn Thái Ngọc Duy wrote:

> resolve_ref() may return a pointer to a shared buffer and can be
> overwritten by the next resolve_ref() calls. Callers need to
> pay attention, not to keep the pointer when the next call happens.
[...]
> --- a/branch.c
> +++ b/branch.c
> @@ -182,7 +182,7 @@ int validate_new_branchname(const char *name, struct strbuf *ref,
>  		const char *head;
>  		unsigned char sha1[20];
>  
> -		head = resolve_ref("HEAD", sha1, 0, NULL);
> +		head = resolve_ref_unsafe("HEAD", sha1, 0, NULL);

I wonder if it would make sense to have a separate function that
maintains a shared buffer describing what HEAD resolves to, lazily
loaded.  Would invalidating the cache when appropriate be too fussy
and subtle?

[...]
> +++ b/transport.c
> @@ -163,7 +163,7 @@ static void set_upstreams(struct transport *transport, struct ref *refs,
>  		/* Follow symbolic refs (mainly for HEAD). */
>  		localname = ref->peer_ref->name;
>  		remotename = ref->name;
> -		tmp = resolve_ref(localname, sha, 1, &flag);
> +		tmp = resolve_ref_unsafe(localname, sha, 1, &flag);

Anyway, this patch looks sane.  The reminder seems useful and doesn't
feel over-the-top.

^ permalink raw reply

* Re: [PATCH 2/3] Guard memory overwriting in resolve_ref's static buffer
From: Jonathan Nieder @ 2011-12-10 21:10 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git, Junio C Hamano, Tony Wang
In-Reply-To: <1323521631-24320-2-git-send-email-pclouds@gmail.com>

Nguyễn Thái Ngọc Duy wrote:

> Michael Haggerty
> has an idea [1] that, instead of passing the same static buffer to
> caller every time the function is called, we free the old buffer and
> allocate the new one. This way access to the old (now invalid) buffer
> may be caught.
>
> This patch applies the same principle for resolve_ref() with a
> few modifications:
[...]
>  - Rely on mmap/mprotect to catch illegal access. We need valgrind or
>    some other memory tracking tool to reliably catch this in Michael's
>    approach.

I wonder if the lower-tech approach would be so bad in practice.  On
systems using glibc, if MALLOC_PERTURB_ is set, then the value will
always be clobbered on free().  It would be possible to do the same
explicitly in resolve_ref() for platforms without such a feature.

>  - Because mprotect is used instead of munmap, we definitely leak
>    memory. Hopefully callers will not put resolve_ref() in a
>    loop that runs 1 million times.

Have you measured the performance impact when GIT_DEBUG_MEMCHECK is not
set?  (I don't expect major problems, but sometimes code surprises me.)

[...]
> Also introduce a new target, "make memcheck", that runs tests with this
> flag on.

Neat.  Did it catch any bugs?

> --- a/cache.h
> +++ b/cache.h
> @@ -865,7 +865,8 @@ extern int read_ref(const char *filename, unsigned char *sha1);
>   *
>   * errno is sometimes set on errors, but not always.
>   */
> -extern const char *resolve_ref(const char *ref, unsigned char *sha1, int reading, int *flag);
> +#define resolve_ref(ref, sha1, reading, flag) resolve_ref_real(ref, sha1, reading, flag, __FUNCTION__, __LINE__)

__FUNCTION__ is nonstandard, though it's probably supported pretty
widely and we can do

	#ifndef __FUNCTION__
	#define __FUNCTION__ something-else
	#endif

in git-compat-util.h when we discover a platform that doesn't support
it if needed.  __func__ was introduced in C99.  __LINE__ and __FILE__
should work everywhere.

[...]
> --- /dev/null
> +++ b/t/t0071-memcheck.sh
> @@ -0,0 +1,12 @@
> +#!/bin/sh
> +
> +test_description='test that GIT_DEBUG_MEMCHECK works correctly'
> +. ./test-lib.sh
> +
> +test_expect_success 'test-resolve-ref must crash' '
> +	GIT_DEBUG_MEMCHECK=1 test-resolve-ref
> +	exit_code=$? &&
> +	test $exit_code -eq 139

Micronit: something like

	(
		GIT_DEBUG_MEMCHECK=1 &&
		export GIT_DEBUG_MEMCHECK &&
		test_expect_code 139 test-resolve-ref
	)

would fit better in an &&-list.

> --- a/wrapper.c
> +++ b/wrapper.c
> @@ -60,6 +60,28 @@ void *xmallocz(size_t size)
>  	return ret;
>  }
>  
> +void *xmalloc_mmap(size_t size, const char *file, int line)
> +{
> +	int *ret;
> +	size += sizeof(int*) * 3;
> +	ret = mmap(NULL, size, PROT_READ | PROT_WRITE,
> +		   MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
> +	if (ret == (int*)-1)
> +		die_errno("unable to mmap %lu bytes anonymously",
> +			  (unsigned long)size);
> +
> +	ret[0] = (int)file;
> +	ret[1] = line;
> +	ret[2] = size;
> +	return ret + 3;

Would this work on 64-bit platforms?

How does one retrieve the line and file number?  I guess one is
expected to retrieve them from the core dump, but a few words of
advice in Documentation/technical would be helpful.

^ permalink raw reply

* Re: process committed files in post-receive hook
From: Alexey Shumkin @ 2011-12-10 21:31 UTC (permalink / raw)
  To: Hao; +Cc: git
In-Reply-To: <loom.20111210T111457-837@post.gmane.org>

> I have two questions. First, is there a way that I can access file
> content in a bare repo without checking out a working copy? 

$ git show <commit>:<filename>

e.g. for bare repo of Git the following command
$ git show v1.7.8:Documentation/RelNotes/1.7.8.txt
outputs
--->8---
Git v1.7.8 Release Notes
========================

Updates since v1.7.7
--------------------

 * Some git-svn, git-gui, git-p4 (in contrib) and msysgit updates.

 * Updates to bash completion scripts.
--->8---

^ permalink raw reply

* Re: Access to git repository through squid proxy: The remote end hung up unexpectedly
From: Konstantin Khomoutov @ 2011-12-10 22:58 UTC (permalink / raw)
  To: Mathieu Peltier; +Cc: git
In-Reply-To: <CACjeFCA4h_w2UmYywMBV_P+YZcWAE=zRUz-z5eTfAO+oxWKPjw@mail.gmail.com>

On Sat, Dec 10, 2011 at 09:56:07AM +0100, Mathieu Peltier wrote:

[...]
> 2011/12/09 12:22:44 socat[21428] I setting option "proxyport" to "8080"
> 2011/12/09 12:22:44 socat[21428] I setting option
> "proxy-authorization" to "user:pass"
> ...
> 2011/12/09 12:22:44 socat[21428] I sending "CONNECT
> git.server.org:9418 HTTP/1.0\r\n"
> ...
> 2011/12/09 12:22:44 socat[21428] I proxy_connect: received answer
> "HTTP/1.1 403 OK\r\n"
> 2011/12/09 12:22:44 socat[21428] E CONNECT git.server.org:9418: OK
> 2011/12/09 12:22:44 socat[21428] N exit(1)
> 2011/12/09 12:22:44 socat[21428] I shutdown(3, 2)
> 2011/12/09 12:22:44 socat[21428] D shutdown()  -> 0
> fatal: The remote end hung up unexpectedly
[...]

HTTP 403 means "access denied" or "forbidden".
IIRC, a proxy (or any other HTTP-speaking server) should respond with
the 401 code when a client first requests a protected resource, and the
client is then supposed to retry its request but this time it should
provide an authentication info (and the auth mech selected).
You have an ellipsis in your debug output in the place that exchange is
supposed to be happening so I'm unable to dig deeper.

Looks like either your socat supplies incorrect credentials or proxy
does not ask socat to actually authenticate and just denies the request.

^ permalink raw reply

* Re: best way to fastforward all tracking branches after a fetch
From: Sitaram Chamarty @ 2011-12-11  2:22 UTC (permalink / raw)
  To: Gelonida N; +Cc: git
In-Reply-To: <jbvj5o$skt$1@dough.gmane.org>

On Sat, Dec 10, 2011 at 01:26:32PM +0100, Gelonida N wrote:
> Hi,
> 
> What is the best way to fastforward all fastforwardable tracking
> branches after a git fetch?

I dont think there is a single command to do it for *all*
branches, but for any particular branch, this should work:

    git merge --ff-only @{u}

So what you want would boil down to this script (untested):

    #!/bin/bash
    git status --porcelain -uno | grep . && {echo dirty tree, exiting...; exit 1; }

    for b in `git for-each-ref '--format=%(refname:short)' refs/heads`
    do
        git checkout $b
        git merge --ff-only @{u}
    done

^ permalink raw reply

* Re: [PATCH] Copy resolve_ref() return value for longer use
From: Tony Wang @ 2011-12-11  2:28 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy; +Cc: Junio C Hamano, git
In-Reply-To: <CACsJy8Bg1=ESaZq8WQmZufsb6E8DY4RHqG0TWG-7uFX671zO6Q@mail.gmail.com>

Just notice that, thanks. :)


-- 
BR,
Tony Wang


On Saturday, December 10, 2011 at 12:48, Nguyen Thai Ngoc Duy wrote:

> On Sat, Dec 10, 2011 at 10:43 AM, Tony Wang <wwwjfy@gmail.com (mailto:wwwjfy@gmail.com)> wrote:
> > Hi,
> > 
> > I don't know about the procedure, but wonder is any one following this?
> 
> This series has been merged in master. I'll resend patches to rename
> resolve_ref() to resolve_ref_unsafe() soon.
> -- 
> Duy

^ permalink raw reply

* Re: git for change control of software deployment updates
From: David Aguilar @ 2011-12-11  3:19 UTC (permalink / raw)
  To: Neal Kreitzinger; +Cc: git
In-Reply-To: <jbugm2$afc$1@dough.gmane.org>

On Fri, Dec 9, 2011 at 6:37 PM, Neal Kreitzinger <neal@rsss.com> wrote:
> I am considering using git with submodules to deploy most of our updates to
> our customer linux servers (not including third party rpm updates already
> tracked by linux distro rpm repository).  Has anyone else done this?
> Comments?  (Sanity check.)  (I am new to submodules.)

I wrote a script that converts a git source repository into a redhat
src.rpm.  We use it at my $dayjob.  How about doing something like
that?  After you have a src.rpm you can create rpms that you can
distribute using yum.  You are already using rpm which is why I
mention it.  Converting a directory of rpm files into a hostable
repository is as simple as `createrepo /path/to/rpms/`.

The git project has a 'make rpm' target in its Makefile that you could
use as an example.
-- 
            David

^ permalink raw reply

* Re: git for change control of software deployment updates
From: Neal Kreitzinger @ 2011-12-11  5:09 UTC (permalink / raw)
  To: git
In-Reply-To: <CAJDDKr7+GeJTR986DSqKpQRWsXGFVzjBqg6WgRyG-EtycrQs7A@mail.gmail.com>

On 12/10/2011 9:19 PM, David Aguilar wrote:
> On Fri, Dec 9, 2011 at 6:37 PM, Neal Kreitzinger<neal@rsss.com>  wrote:
>> I am considering using git with submodules to deploy most of our updates to
>> our customer linux servers (not including third party rpm updates already
>> tracked by linux distro rpm repository).  Has anyone else done this?
>> Comments?  (Sanity check.)  (I am new to submodules.)
>
> I wrote a script that converts a git source repository into a redhat
> src.rpm.  We use it at my $dayjob.  How about doing something like
> that?  After you have a src.rpm you can create rpms that you can
> distribute using yum.  You are already using rpm which is why I
> mention it.  Converting a directory of rpm files into a hostable
> repository is as simple as `createrepo /path/to/rpms/`.
>
> The git project has a 'make rpm' target in its Makefile that you could
> use as an example.
We use rhel so, yes, we use the redhat repo to get rhel patches.  Is 
that what you mean by "You are already using rpm"?  (We do not write our 
own rpm's at this time.)

v/r,
neal

^ permalink raw reply

* Re: [PATCH 2/3] Guard memory overwriting in resolve_ref's static buffer
From: Nguyen Thai Ngoc Duy @ 2011-12-11  9:22 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, Junio C Hamano, Tony Wang
In-Reply-To: <20111210211040.GB24817@elie.hsd1.il.comcast.net>

2011/12/11 Jonathan Nieder <jrnieder@gmail.com>:
>>  - Rely on mmap/mprotect to catch illegal access. We need valgrind or
>>    some other memory tracking tool to reliably catch this in Michael's
>>    approach.
>
> I wonder if the lower-tech approach would be so bad in practice.  On
> systems using glibc, if MALLOC_PERTURB_ is set, then the value will
> always be clobbered on free().  It would be possible to do the same
> explicitly in resolve_ref() for platforms without such a feature.

Clobbered value may be carried around for some time before the code
detects wrong value. It'd be hard to track back where the root cause
is.

>>  - Because mprotect is used instead of munmap, we definitely leak
>>    memory. Hopefully callers will not put resolve_ref() in a
>>    loop that runs 1 million times.
>
> Have you measured the performance impact when GIT_DEBUG_MEMCHECK is not
> set?  (I don't expect major problems, but sometimes code surprises me.)

No. I wish we had a performance test suite. Which use cases should I test?

> [...]
>> Also introduce a new target, "make memcheck", that runs tests with this
>> flag on.
>
> Neat.  Did it catch any bugs?

No, otherwise I would have sent more patches ;)

>> --- a/cache.h
>> +++ b/cache.h
>> @@ -865,7 +865,8 @@ extern int read_ref(const char *filename, unsigned char *sha1);
>>   *
>>   * errno is sometimes set on errors, but not always.
>>   */
>> -extern const char *resolve_ref(const char *ref, unsigned char *sha1, int reading, int *flag);
>> +#define resolve_ref(ref, sha1, reading, flag) resolve_ref_real(ref, sha1, reading, flag, __FUNCTION__, __LINE__)
>
> __FUNCTION__ is nonstandard, though it's probably supported pretty
> widely and we can do
>
>        #ifndef __FUNCTION__
>        #define __FUNCTION__ something-else
>        #endif
>
> in git-compat-util.h when we discover a platform that doesn't support
> it if needed.  __func__ was introduced in C99.  __LINE__ and __FILE__
> should work everywhere.

Will change to __FILE__ then

>> --- a/wrapper.c
>> +++ b/wrapper.c
>> @@ -60,6 +60,28 @@ void *xmallocz(size_t size)
>>       return ret;
>>  }
>>
>> +void *xmalloc_mmap(size_t size, const char *file, int line)
>> +{
>> +     int *ret;
>> +     size += sizeof(int*) * 3;
>> +     ret = mmap(NULL, size, PROT_READ | PROT_WRITE,
>> +                MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
>> +     if (ret == (int*)-1)
>> +             die_errno("unable to mmap %lu bytes anonymously",
>> +                       (unsigned long)size);
>> +
>> +     ret[0] = (int)file;
>> +     ret[1] = line;
>> +     ret[2] = size;
>> +     return ret + 3;
>
> Would this work on 64-bit platforms?

No idea (I'm on 32-bit). I don't see any reasons why it would not
work. But that function may cause unaligned access, I think.

> How does one retrieve the line and file number?  I guess one is
> expected to retrieve them from the core dump, but a few words of
> advice in Documentation/technical would be helpful.

from coredump.
-- 
Duy

^ permalink raw reply

* Re: [PATCH 3/3] Rename resolve_ref() to resolve_ref_unsafe()
From: Nguyen Thai Ngoc Duy @ 2011-12-11  9:46 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, Junio C Hamano, Tony Wang
In-Reply-To: <20111210205519.GA24817@elie.hsd1.il.comcast.net>

2011/12/11 Jonathan Nieder <jrnieder@gmail.com>:
> Nguyễn Thái Ngọc Duy wrote:
>
>> resolve_ref() may return a pointer to a shared buffer and can be
>> overwritten by the next resolve_ref() calls. Callers need to
>> pay attention, not to keep the pointer when the next call happens.
> [...]
>> --- a/branch.c
>> +++ b/branch.c
>> @@ -182,7 +182,7 @@ int validate_new_branchname(const char *name, struct strbuf *ref,
>>               const char *head;
>>               unsigned char sha1[20];
>>
>> -             head = resolve_ref("HEAD", sha1, 0, NULL);
>> +             head = resolve_ref_unsafe("HEAD", sha1, 0, NULL);
>
> I wonder if it would make sense to have a separate function that
> maintains a shared buffer describing what HEAD resolves to, lazily
> loaded.  Would invalidating the cache when appropriate be too fussy
> and subtle?

If we do not execute "git update-ref" from git binary (bisect does,
although on BISECT_HEAD, not HEAD) then it'd be safe to cache HEAD and
invalidate it in update_ref().
-- 
Duy

^ permalink raw reply

* Re: Auto update submodules after merge and reset
From: Phil Hord @ 2011-12-11 14:43 UTC (permalink / raw)
  To: Andreas T.Auer; +Cc: Jens Lehmann, git
In-Reply-To: <4EE2B8C3.6000906@ursus.ath.cx>

On Fri, Dec 9, 2011 at 8:41 PM, Andreas T.Auer
<andreas.t.auer_gtml_37453@ursus.ath.cx> wrote:
>>> It is even nice to see which commits in the submodule belong to what
>>> branches in the superproject or to what release version (so tracking
>>> superproject tags would make sense, too). If you have a submodule that
>>> has
>>> more than one superproject but these are well-defined, it could be solved
>>> using refspecs (e.g. refs/super/foo/* for one and refs/super/bar/* for
>>> the
>>> other superproject), but currently I can't think of a context where this
>>> makes sense.
>>
>> I can, but this does put the cart before the horse.  The submodule is
>> subservient to the super project in all my setups.  The submodule is
>> not aware who owns him.  He is a bit like the DAG in reverse.  He
>> knows one direction only (children), not the other (parents).
>
> In the setup I have in mind, the submodules are not subservient to the
> superproject, but a part of the whole project.

I see that.  I have a similar project with about 20 submodules.  None
of them are useful on their own; they are logical divisions of a large
project.

Architecturally, submodules are oblivious of their super-projects in
all other respects.  Changing the architectural underpinnings should
be a last resort, I think.

Phil

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox