Git development
 help / color / mirror / Atom feed
* Re: [PATCH v2 28/51] refs.c: rename ref_array -> ref_dir
From: Jeff King @ 2011-12-15  8:37 UTC (permalink / raw)
  To: Michael Haggerty
  Cc: Junio C Hamano, git, Jakub Narebski, Heiko Voigt, Johan Herland,
	Julian Phillips, Drew Northup
In-Reply-To: <4EE9AD7C.2050605@alum.mit.edu>

On Thu, Dec 15, 2011 at 09:19:08AM +0100, Michael Haggerty wrote:

> Now I'm looking at the uses of extra_refs in git-clone.  One thing it
> does is add some extra refs then write them to the packed-refs file.  I
> still have to dig into it, but this seems strange.  If the refs are
> being written to packed-refs, it seems like they must be real (not
> extra) refs, or perhaps are just about to become real refs as part of
> the clone.  Or is something more egregious is going on?

I think what is happening is that the extra_refs system is doing double
duty here. If you do:

  git clone --reference $local $origin

then the code does something like:

  1. Add all of the refs in $local are added to extra_refs, because we
     want to advertise them during the pack fetch (see add_one_reference,
     called through setup_reference at l. 626).

  2. We then get the list of remote refs from $origin and store them in
     mapped_refs (ll. 657-658). The fetch uses the extra refs from (1).

  3. We call clear_extra_refs to drop those temporary refs (l. 663).

  4. We call write_remote_refs (l. 664), which adds all of the
     mapped_refs from (2) to extra_refs, then calls pack_refs which will
     pack all of the existing refs (of which there should be none, I
     would think), including extras (i.e., all the stuff we just
     cloned).

     Afterwards we clear_extra_refs again, though I think that is
     probably unnecessary.

Keeping the ".have" sha1s out of extra_refs would make this a lot
clearer.  Arguably it would also be better to have some way to just pass
the set of mapped_refs to pack_refs, instead of shoving them in the
global extra_refs and relying on pack_refs to find them there.

-Peff

^ permalink raw reply

* Re: How to commit incomplete changes?
From: Alexey Shumkin @ 2011-12-15  8:22 UTC (permalink / raw)
  To: Hallvard B Furuseth; +Cc: git
In-Reply-To: <7e1ccfac8c47e8877c0438086bd1d91b@ulrik.uio.no>

Oh! I got it. I missed "generated changes".
Well, unfortunately (or fortunately ;) ), I did not meet such a workflow
when changes are "generated" without my hands.
In your case it may sound reasonable to make separate fixup commit.
But Git allows you to make your own (more flexible than SVN,
for instance) workflow, which suits you. It's up to you ;)
You decide.
If you plan cherry-picking that fixups, do separate fixups. Just
publish them together.
If you want every commit is "clear" and "workable", squash fixup into
a single commit.
I do not know what exactly is "generated changes" you're talking
about ), so, maybe I'd do separate fixups, maybe not. ))
There is no single solution. ))) TIMTOWTDI
That is why you hesitate :) Do your own decision. And feel free to
change it later. ))

>  To expand on that: To review the change, review the hand-edited 
>  commits,
>  which is easier when these do not drown in generated changes.  Review
>  the *commands* which generated the rest - I'd put those in the commit
>  message - and glance at the actual changes.  Cherry-pick: Possbly you
>  need to run the commands instead of cherry-picking the generated
>  changes.  That's easier with a commit with only generated changes.
> 
>  I know it also can cause problems.  Would you make a single big
> commit anyway, and describe carefully in the commit message which
> parts are hand-edits?  (We don't auto-test commits yet, but I'll sure
> this issue will crop up again later when we do.)
> 

^ permalink raw reply

* Re: [PATCH v2 28/51] refs.c: rename ref_array -> ref_dir
From: Michael Haggerty @ 2011-12-15  8:19 UTC (permalink / raw)
  To: Jeff King
  Cc: Junio C Hamano, git, Jakub Narebski, Heiko Voigt, Johan Herland,
	Julian Phillips, Drew Northup
In-Reply-To: <20111214023320.GA22141@sigill.intra.peff.net>

On 12/14/2011 03:33 AM, Jeff King wrote:
> On Tue, Dec 13, 2011 at 04:19:19PM -0800, Junio C Hamano wrote:
>> Actually, I do not think it even needs to be the "extra *REF* API". The
>> only thing that matters is that these commits are considered to be extra
>> anchor point in the history, in addition to the usual rule of considering
>> that everything reachable from our refs is complete. The data structure to
>> hold them does not even have to be a "struct ref". Just an array of object
>> names (or "struct object *") should suffice.
> 
> Since my cff38a5 (receive-pack: eliminate duplicate .have refs,
> 2011-05-19), receive-pack simply has a packed array of binary sha1s (in
> a "struct sha1_array" object). That might be the simplest thing.

It was pretty easy to eliminate the use of extra_refs from receive-pack.
 I'll submit the patches as soon as I can.  The patches will be based
off of patch 16/51 of the ref-api-D series, since Junio indicated that
he wants to queue up those commits (let me know if you have a different
preference).

Now I'm looking at the uses of extra_refs in git-clone.  One thing it
does is add some extra refs then write them to the packed-refs file.  I
still have to dig into it, but this seems strange.  If the refs are
being written to packed-refs, it seems like they must be real (not
extra) refs, or perhaps are just about to become real refs as part of
the clone.  Or is something more egregious is going on?

Moving the pack_refs() function to refs.c was on the agenda anyway, so
it should be possible to sort this out at the same time.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu
http://softwareswirl.blogspot.com/

^ permalink raw reply

* Re: process committed files in post-receive hook
From: Hao Wang @ 2011-12-15  8:19 UTC (permalink / raw)
  To: Jeff King; +Cc: Neal Kreitzinger, git
In-Reply-To: <20111215072301.GC1327@sigill.intra.peff.net>


> If you are scripting, we usually encourage the use of "plumbing"
> commands whose output is guaranteed not to change ("show" is a
> "porcelain" command intended to be used by end-users, and it's possible
> that its behavior might change from version to version).
>
> The plumbing command to get a directory listing for a tree is "git
> ls-tree" (try the "--name-only" option for terse output, and use "-z" if
> you want to be robust in the face of filenames with funny characters).

Jeff, thank you for the information. This is really helpful.

>> # get a list of rule files using git show
>> def getRuleFileList(rev):
>>      # run git show
>>      p = subprocess.Popen(['git', 'show', rev], stdout=subprocess.PIPE)
>>      p.wait()
>>      if p.returncode != 0: return None # error
>>
>>      # parse output
>>      i = 0
>>      filelist = []
>>      for line in p.stdout.readlines():
>>          filelist.append(line)
>>      p.stdout.close()
>>      return filelist
>
> Doesn't this put "tree HEAD:foo", as printed by "git show", at the top
> of your filelist? Another reason to use ls-tree.

Yes, the first two items ("tree HEAD:foo" and an empty line) are removed 
later from filelist.

Hao

^ permalink raw reply

* Re: [PATCH 2/2] push: --ignore-stale option
From: Jeff King @ 2011-12-15  7:38 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vobvct58u.fsf@alter.siamese.dyndns.org>

On Tue, Dec 13, 2011 at 03:35:13PM -0800, Junio C Hamano wrote:

> Teach a new "--ignore-stale" option to "git push" which tells it not to
> push stale refs (i.e. the commit that would have been pushed without the
> option is an ancestor of the commit that is at the destination). With this,
> a lazy workflow could be like this:
> 
>     $ git clone <<origin>>
>     $ git checkout -b topic1 origin/topic1
>     $ work work work
>     $ git push origin :
>     $ git checkout -b topic2 origin/topic2
>     $ work work work
>     $ git push --ignore-stale origin :
> 
> and the second push does not have to worry about other people working on
> topic1 and updating it in the central repository while you haven't touched
> the corresponding local branch at all.

Unless the action of the people in the central repo was to rewind
history, in which case you are overwriting their work. Probably
unlikely, though.

But I still see this as solving the wrong problem, and encouraging a
dangerous workflow. Why are you using ":" to push matching branches if
you don't want to push topic1? I think a much more likely scenario is:

  $ git clone <<origin>>
  $ git checkout -b topic1 origin/topic1
  $ work work work
  $ : hmm, it's not ready yet
  $ git checkout -b topic2 origin/topic2
  $ work work work
  $ : looking good, let's ship it!
  $ git push  ;# we default to matching!

I.e., the user is not actually intending to push all matching branches,
but does so accidentally because "matching" is the default. And they
have accidentally just pushed non-ready work to topic1, which would
happen even with --ignore-stale.

IOW, I am not convinced that people are actually consciously choosing
the workflow you outlined above, and are instead interested in
--ignore-stale as a convenient way of guessing which branches should be
pushed during a matching push. But it's only a guess, and as I showed
above, it can still cause problems. I think the right solution is to
switch the default away from matching, which is the root cause of the
confusion.

-Peff

^ permalink raw reply

* Re: process committed files in post-receive hook
From: Jeff King @ 2011-12-15  7:23 UTC (permalink / raw)
  To: Hao Wang; +Cc: Neal Kreitzinger, git
In-Reply-To: <4EE95523.9030702@gmail.com>

On Wed, Dec 14, 2011 at 06:02:11PM -0800, Hao Wang wrote:

> Thank you all for providing the options. Just so you know I finally
> went with Alexey's suggestion. I used 'git show' to get both a list
> of files in a directory and the content of each file. It works great
> on a bare repository so there is no need to check out a copy on the
> server.

If you are scripting, we usually encourage the use of "plumbing"
commands whose output is guaranteed not to change ("show" is a
"porcelain" command intended to be used by end-users, and it's possible
that its behavior might change from version to version).

The plumbing command to get a directory listing for a tree is "git
ls-tree" (try the "--name-only" option for terse output, and use "-z" if
you want to be robust in the face of filenames with funny characters).

> # get a list of rule files using git show
> def getRuleFileList(rev):
>     # run git show
>     p = subprocess.Popen(['git', 'show', rev], stdout=subprocess.PIPE)
>     p.wait()
>     if p.returncode != 0: return None # error
> 
>     # parse output
>     i = 0
>     filelist = []
>     for line in p.stdout.readlines():
>         filelist.append(line)
>     p.stdout.close()
>     return filelist

Doesn't this put "tree HEAD:foo", as printed by "git show", at the top
of your filelist? Another reason to use ls-tree.

> # read the content of a file
> def readfile(rev):
>     # run git show
>     p = subprocess.Popen(['git', 'show', rev], stdout=subprocess.PIPE)
>     p.wait()
>     if p.returncode != 0: return None # error
>     return p.stdout.read()

The plumbing for this is "git cat-file blob ...".

-Peff

^ permalink raw reply

* Re: How to commit incomplete changes?
From: Hallvard B Furuseth @ 2011-12-15  7:11 UTC (permalink / raw)
  To: Alexey Shumkin; +Cc: git
In-Reply-To: <20111215104444.783303cf@ashu.dyn1.rarus.ru>

 On Thu, 15 Dec 2011 10:44:44 +0400, Alexey Shumkin 
 <Alex.Crezoff@gmail.com> wrote:
>> Do people have any feelings or conventions for how and when to
>> publish a series of commits where the first one(s) break something
>> and the next ones clear it up?
>
> I'm curiuos, why to you want to commit changes that break something
> separately from fixup?

 I answered that, but maybe too briefly:

>>  I'm about to commit some small edits which go together with bigger
>>  generated changes.  It seems both more readable and more 
>> cherry-pick-
>>  friendly to me to keep these in separate commits.

 To expand on that: To review the change, review the hand-edited 
 commits,
 which is easier when these do not drown in generated changes.  Review
 the *commands* which generated the rest - I'd put those in the commit
 message - and glance at the actual changes.  Cherry-pick: Possbly you
 need to run the commands instead of cherry-picking the generated
 changes.  That's easier with a commit with only generated changes.

 I know it also can cause problems.  Would you make a single big commit
 anyway, and describe carefully in the commit message which parts are
 hand-edits?  (We don't auto-test commits yet, but I'll sure this issue
 will crop up again later when we do.)

-- 
 Hallvard

^ permalink raw reply

* Re: [BUG] in rev-parse
From: Jeff King @ 2011-12-15  7:05 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: nathan.panike, git
In-Reply-To: <7vk45yplkm.fsf@alter.siamese.dyndns.org>

On Wed, Dec 14, 2011 at 07:20:41PM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > On the other hand, it has been like this since it was introduced in
> > 2006, and I wonder if scripts rely on the --verify side effect.
> 
> It would have been nicer if it did not to imply --verify at all; a long
> hexdigit that do not name an existing object at all will be shortened to
> its prefix that still do not collide with an abbreviated object name of an
> existing object, and even in such a case, the command should not error out
> only because it was fed a non-existing object (of course, if "--verify" is
> given at the same time, its "one input that names existing object only"
> rule should also kick in).

Dropping the implied verify is easy (see below). But handling
non-existant sha1s is a much more complicated change, as the regular
abbreviation machinery assumes that they exist. E.g., with the patch
below:

  $ good=73c6b3575bc638b7096ec913bd91193707e2265d
  $ bad=${good#d}e
  $ git rev-parse --short $good
  73c6b35
  $ git rev-parse --short $bad
  [no output]

Anyway, I'm not sure it's worth changing at this point. It's part of the
plumbing API that has been that way forever, it's kind of a rare thing
to ask for, and I've already shown a workaround using rev-list.

-Peff

---
diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
index 98d1cbe..b365ca0 100644
--- a/builtin/rev-parse.c
+++ b/builtin/rev-parse.c
@@ -545,7 +545,6 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 			if (!strcmp(arg, "--short") ||
 			    !prefixcmp(arg, "--short=")) {
 				filter &= ~(DO_FLAGS|DO_NOREV);
-				verify = 1;
 				abbrev = DEFAULT_ABBREV;
 				if (arg[7] == '=')
 					abbrev = strtoul(arg + 8, NULL, 10);

^ permalink raw reply related

* Re: tr/pty-all (Re: What's cooking in git.git (Dec 2011, #04; Tue, 13))
From: Jeff King @ 2011-12-15  6:55 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jonathan Nieder, Thomas Rast, git,
	Nguyễn Thái Ngọc Duy, Michael Haggerty
In-Reply-To: <7vfwgmplgd.fsf@alter.siamese.dyndns.org>

On Wed, Dec 14, 2011 at 07:23:14PM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > This correctly detects the bug in t7006. I can't decide if it's clever
> > or ugly.
> 
> I would say it falls on the latter side of the line by small margin. Let's
> do the /dev/null thing and be done with it.

Darn, I wanted to post it on my fridge with an "A+".

Here's a cleaned-up version of the /dev/null one.

-- >8 --
Subject: [PATCH] test-lib: redirect stdin of tests

We want to run tests in a predictable, sterile environment
so we can get repeatable results.  They should take as
little input as possible from the environment outside the
test script. We already sanitize environment variables, but
leave stdin untouched. This means that scripts can
accidentally be impacted by content on stdin, or whether
stdin isatty().

Furthermore, scripts reading from stdin can be annoying to
outer loops which care about their stdin offset, like:

  while read sha1; do
      make test
  done

A test which accidentally reads stdin would soak up all of
the rest of the input intended for the outer shell loop.

Let's redirect stdin from /dev/null, which solves both
of these problems. It won't detect tests accidentally
reading from stdin, but since doing so now gives a
deterministic result, we don't need to consider that an
error.

We'll also leave file descriptor 6 as a link to the original
stdin. Tests shouldn't need to look at this, but it can be
convenient for inserting interactive commands while
debugging tests (e.g., you could insert "bash <&6 >&3 2>&4"
to run interactive commands in the environment of the test
script).

Signed-off-by: Jeff King <peff@peff.net>
---
I went the "redirect each test individually" route. In the course of my
experimentation, I notice that some tests (e.g., t4013) will do:

  while read x; do
          test_expect_success "test something ($x)" "
            ... do some test involving $x ...
          "
  done <<\EOF
  ... some values of $x ....
  EOF

This protects those loops from accidental stdin-readers inside the test
scripts, too.

 t/test-lib.sh |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index bdd9513..5ea9fe3 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -191,6 +191,7 @@ then
 fi
 
 exec 5>&1
+exec 6<&0
 if test "$verbose" = "t"
 then
 	exec 4>&2 3>&1
@@ -469,7 +470,7 @@ test_debug () {
 test_eval_ () {
 	# This is a separate function because some tests use
 	# "return" to end a test_expect_success block early.
-	eval >&3 2>&4 "$*"
+	eval </dev/null >&3 2>&4 "$*"
 }
 
 test_run_ () {
-- 
1.7.8.rc2.30.g803b1a

^ permalink raw reply related

* Re: How to commit incomplete changes?
From: Alexey Shumkin @ 2011-12-15  6:44 UTC (permalink / raw)
  To: Hallvard B Furuseth; +Cc: git
In-Reply-To: <4cfc9cf0515b1bc751f6aa0de4f55e2a@ulrik.uio.no>

>  Do people have any feelings or conventions for how and when to
> publish a series of commits where the first one(s) break something
> and the next ones clear it up?
I'm curiuos, why to you want to commit changes that break something
separately from fixup?

Many conventions (as I know) use ideology that every commit must NOT
BREAK existing code or tests. Every SHARED commit. Git design (as you
must be already know) allows you to make/change/reorder as many commits
as you want (before you share them or push to a "central" repository).
So, you have not to be afraid to commit every your change, because you
can always change/fixup/split your commits.

Usually, you introduce a feature in a branch. Also, your project must
have (?) (mine do have) unit-tests, at least. And most changes must be
tested. So, breakage must be discovered early, even after some other
commits in that feature branch. In that case you can just make a fixup
commit and then rebase it on a breakage commit with a "squash".

And only after all features made and all tests passed you can share
them (push to another repo).

>I've found some discussion, but with vague results.
> 
>  I'm about to commit some small edits which go together with bigger
>  generated changes.  It seems both more readable and more cherry-pick-
>  friendly to me to keep these in separate commits.
> 
>  What I've found is I can use a line in the commit message like
>      "Incomplete change, requires next commit (update foo/ dir)."
>  and, if there is any point, do a no-ff merge past the breakage.
> 

^ permalink raw reply

* Re: [bug?] checkout -m doesn't work without a base version
From: Miles Bader @ 2011-12-15  4:20 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael Schubert, Pete Harlan, git
In-Reply-To: <7vhb13qbs6.fsf@alter.siamese.dyndns.org>

Junio C Hamano <gitster@pobox.com> writes:
>> builtin/checkout.c: In function ‘cmd_checkout’:
>> builtin/checkout.c:210:5: warning: ‘mode’ may be used uninitialized
>> in this function [-Wuninitialized]
>> builtin/checkout.c:160:11: note: ‘mode’ was declared here
>
> Isn't this just your gcc being overly cautious (aka "silly")?
>
> The variable "mode" is assigned to when we see an stage #2 entry in the
> loop, and we should have updated threeway[1] immediately before doing so.
> If threeway[1] is not updated, we would have already returned before using
> the variable in make_cache_entry().

Maybe that is actually guaranteed (I dunno), but it's certainly not
obvious from the code here, even to a human... any guarantee would
have to come from external invariants that the compiler doesn't know
about.

Given that, I think it's a fair warning, certainly not "silly."  This
aspect of the code doesn't seem easy to understand...

-Miles

-- 
`To alcohol!  The cause of, and solution to,
 all of life's problems' --Homer J. Simpson

^ permalink raw reply

* Re: [msysGit] [PATCH] gitk: fix the display of files when filtered by path
From: David Aguilar @ 2011-12-15  4:18 UTC (permalink / raw)
  To: Pat Thoyts; +Cc: Git, Paul Mackerras, msysGit, Johannes Schindelin
In-Reply-To: <87hb14wg4l.fsf@fox.patthoyts.tk>

On Tue, Dec 13, 2011 at 8:50 AM, Pat Thoyts
<patthoyts@users.sourceforge.net> wrote:
> Launching 'gitk -- .' or 'gitk -- ..\t' restricts the display to files
> under the given directory but the file list is left empty. This is because
> the path_filter function fails to match the filenames which are relative
> to the working tree to the filter which is filessytem relative.
> This solves the problem by making both names fully qualified filesystem
> paths before performing the comparison.
>
> Signed-off-by: Pat Thoyts <patthoyts@users.sourceforge.net>

Wonderful!

I've run into this problem a number of times (as have some co-workers)
but I never bothered to report it since I felt guilty for never having
worked up a patch.

I tested this and it worked.

FWIW,

Tested-by: David Aguilar <davvid@gmail.com>


Thank you Pat!
-- 
            David

^ permalink raw reply

* Re: tr/pty-all (Re: What's cooking in git.git (Dec 2011, #04; Tue, 13))
From: Junio C Hamano @ 2011-12-15  3:23 UTC (permalink / raw)
  To: Jeff King
  Cc: Jonathan Nieder, Thomas Rast, git,
	Nguyễn Thái Ngọc Duy, Michael Haggerty
In-Reply-To: <20111215005057.GB2566@sigill.intra.peff.net>

Jeff King <peff@peff.net> writes:

> This correctly detects the bug in t7006. I can't decide if it's clever
> or ugly.

I would say it falls on the latter side of the line by small margin. Let's
do the /dev/null thing and be done with it.

^ permalink raw reply

* Re: [BUG] in rev-parse
From: Junio C Hamano @ 2011-12-15  3:20 UTC (permalink / raw)
  To: Jeff King; +Cc: nathan.panike, git
In-Reply-To: <20111214210157.GA8990@sigill.intra.peff.net>

Jeff King <peff@peff.net> writes:

> On the other hand, it has been like this since it was introduced in
> 2006, and I wonder if scripts rely on the --verify side effect.

It would have been nicer if it did not to imply --verify at all; a long
hexdigit that do not name an existing object at all will be shortened to
its prefix that still do not collide with an abbreviated object name of an
existing object, and even in such a case, the command should not error out
only because it was fed a non-existing object (of course, if "--verify" is
given at the same time, its "one input that names existing object only"
rule should also kick in).

^ permalink raw reply

* Re: [PATCH 3/3] Do not create commits whose message contains NUL
From: Junio C Hamano @ 2011-12-15  3:09 UTC (permalink / raw)
  To: Jeff King; +Cc: Miles Bader, Nguyễn Thái Ngọc Duy, git
In-Reply-To: <20111215011855.GA24568@sigill.intra.peff.net>

Jeff King <peff@peff.net> writes:

> Perhaps it should just be dropped.

Thanks. I have nothing to add---you said everything that needs to be said.

^ permalink raw reply

* Re: [RFC PATCH 2/4] test-lib: allow testing another git build tree
From: Junio C Hamano @ 2011-12-15  3:07 UTC (permalink / raw)
  To: Thomas Rast; +Cc: git, Michael Haggerty
In-Reply-To: <94f64a03398829bb9a11c18577efb39d9b153eca.1323876121.git.trast@student.ethz.ch>

Thomas Rast <trast@student.ethz.ch> writes:

> The perf-lib work wants this feature, so we may as well do it for
> test-lib in general.

How is this different from what GIT_TEST_INSTALLED already gives us
(other than "needs more diskspace to keep another source tree fully
built", that is)?

^ permalink raw reply

* Re: process committed files in post-receive hook
From: Hao Wang @ 2011-12-15  2:02 UTC (permalink / raw)
  To: Neal Kreitzinger; +Cc: git
In-Reply-To: <4EE94783.1010805@gmail.com>

Thank you all for providing the options. Just so you know I finally went 
with Alexey's suggestion. I used 'git show' to get both a list of files 
in a directory and the content of each file. It works great on a bare 
repository so there is no need to check out a copy on the server.

Below is the python code in my post-receive hook for this task, where 
rev is something like 'HEAD:directory_name' for the first function and 
'HEAD:directory/filename' for the second function.

# get a list of rule files using git show
def getRuleFileList(rev):
     # run git show
     p = subprocess.Popen(['git', 'show', rev], stdout=subprocess.PIPE)
     p.wait()
     if p.returncode != 0: return None # error

     # parse output
     i = 0
     filelist = []
     for line in p.stdout.readlines():
         filelist.append(line)
     p.stdout.close()
     return filelist

# read the content of a file
def readfile(rev):
     # run git show
     p = subprocess.Popen(['git', 'show', rev], stdout=subprocess.PIPE)
     p.wait()
     if p.returncode != 0: return None # error
     return p.stdout.read()

Hao

On 12/14/11 5:04 PM, Neal Kreitzinger wrote:
> On 12/10/2011 4:29 AM, Hao wrote:
>> Hi guys,
>>
>> I am writing a post-receive hook in Python that examines the content
>> of some files (the HEAD rev). Because the repo is a bare one on the
>> server. My current approach is to check out a working copy on the
>> server and run 'git pull' in post- receive to get the most up-to-date
>> version, and then process files in the working copy.
>>
>> I have two questions. First, is there a way that I can access file
>> content in a bare repo without checking out a working copy? If this
>> is not possible, my approach would be reasonable. However, when 'git
>> pull' was called in the python script post-receive when a commit
>> occurs, it gives an error.
>>
>> remote: fatal: Not a git repository: '.'
>>
>> The call in python is
>>
>> subprocess.Popen(["git", "pull"],
>> cwd="/Users/git/ts.git.workingcopy")
>>
>> I read from a post (http://stackoverflow.com/questions/4043609/) that
>> GIT_DIR is causing this error. Is it safe to unset GIT_DIR in
>> post-receive?
>>
> The specific processing you intend to perform on the files would
> determine which of the access techniques is appropriate for you.
> Generally speaking, I think a checkout in a non-bare repo makes sense.
> You could limit it to a shallow clone (see git-clone manpage) to save
> space.
>
> Another way to get the files is git-archive (creates tar file), that you
> could extract to a dir for processing.
>
> In both cases, you need to consider the default permissions in play with
> git-checkout and git-archive if permissions are important in your
> processing.
>
> v/r,
> neal

^ permalink raw reply

* Re: [PATCH 3/3] Do not create commits whose message contains NUL
From: Jeff King @ 2011-12-15  1:18 UTC (permalink / raw)
  To: Miles Bader; +Cc: Nguyễn Thái Ngọc Duy, git
In-Reply-To: <CADCnXoaqEXJV+Mb1=nQge_bjA3H6R7=BPt213CKLX55zyTHEtg@mail.gmail.com>

On Thu, Dec 15, 2011 at 10:04:06AM +0900, Miles Bader wrote:

> > +       commitWideEncoding::
> > +               Advice shown when linkgit::git-commit[1] refuses to
> > +               proceed because there are NULs in commit message.
> > +               Default: true.
> 
> Although "wide encoding" is a reasonable guess at cause of embedded
> zero characters (and so a useful term for diagnostic messages, as it
> can help users identify the problem in their environment which is
> causing such zero bytes), it's really only a guess in most cases...
> 
> Shouldn't the variable be named based on what it actually does, which
> is allow zero-bytes in commit messages...?

I agree, but...

Really this variable is overkill. The advice.* subsystem is for
silencing hints and warnings from git that you see repeatedly because
you are smarter than git, and want to ignore its advice.

But in this case, I don't see a user saying "stupid git, of _course_ I
want to commit NULs. Stop nagging me". Especially because it is not a
warning, but a fatal error. :)

So yes, it's verbose, but no, it's not something somebody is going to be
so bothered by that they will find the config option to turn it off.
Instead, they will stop doing the bad thing and never see it again.  At
best this config option is useless, and at worst it clutters the
advice.* namespace, making it harder for people to find the advice
option they _do_ want to turn off).

Perhaps it should just be dropped.

-Peff

^ permalink raw reply

* Re: [PATCH 3/3] Do not create commits whose message contains NUL
From: Miles Bader @ 2011-12-15  1:04 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git, Jeff King
In-Reply-To: <1323871699-8839-4-git-send-email-pclouds@gmail.com>

> +       commitWideEncoding::
> +               Advice shown when linkgit::git-commit[1] refuses to
> +               proceed because there are NULs in commit message.
> +               Default: true.

Although "wide encoding" is a reasonable guess at cause of embedded
zero characters (and so a useful term for diagnostic messages, as it
can help users identify the problem in their environment which is
causing such zero bytes), it's really only a guess in most cases...

Shouldn't the variable be named based on what it actually does, which
is allow zero-bytes in commit messages...?

-Miles

-- 
Cat is power.  Cat is peace.

^ permalink raw reply

* Re: process committed files in post-receive hook
From: Neal Kreitzinger @ 2011-12-15  1:04 UTC (permalink / raw)
  To: Hao; +Cc: git
In-Reply-To: <loom.20111210T111457-837@post.gmane.org>

On 12/10/2011 4:29 AM, Hao wrote:
> Hi guys,
>
> I am writing a post-receive hook in Python that examines the content
> of some files (the HEAD rev). Because the repo is a bare one on the
> server. My current approach is to check out a working copy on the
> server and run 'git pull' in post- receive to get the most up-to-date
> version, and then process files in the working copy.
>
> I have two questions. First, is there a way that I can access file
> content in a bare repo without checking out a working copy? If this
> is not possible, my approach would be reasonable. However, when 'git
> pull' was called in the python script post-receive when a commit
> occurs, it gives an error.
>
> remote: fatal: Not a git repository: '.'
>
> The call in python is
>
> subprocess.Popen(["git", "pull"],
> cwd="/Users/git/ts.git.workingcopy")
>
> I read from a post (http://stackoverflow.com/questions/4043609/) that
> GIT_DIR is causing this error. Is it safe to unset GIT_DIR in
> post-receive?
>
The specific processing you intend to perform on the files would
determine which of the access techniques is appropriate for you.
Generally speaking, I think a checkout in a non-bare repo makes sense. 
You could limit it to a shallow clone (see git-clone manpage) to save space.

Another way to get the files is git-archive (creates tar file), that you 
could extract to a dir for processing.

In both cases, you need to consider the default permissions in play with 
git-checkout and git-archive if permissions are important in your 
processing.

v/r,
neal

^ permalink raw reply

* Re: tr/pty-all (Re: What's cooking in git.git (Dec 2011, #04; Tue, 13))
From: Jeff King @ 2011-12-15  0:50 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Thomas Rast, Junio C Hamano, git,
	Nguyễn Thái Ngọc Duy, Michael Haggerty
In-Reply-To: <20111215002530.GA2566@sigill.intra.peff.net>

On Wed, Dec 14, 2011 at 07:25:30PM -0500, Jeff King wrote:

> [1] Actually, you could abandon the idea of feeding garbage altogether,
> and instead open the descriptor outside the test, then check that its
> offset is still 0 after the test. You'd have to use a helper program to
> do the ftell(), but it should work as the descriptor position will be
> shared.

And here's what that patch would look like. You still want to feed a
garbage file, because you want to make sure that there is something for
it to actually read. And then either it can choke on the garbage and
fail, or if not, you can detect afterwards that it was read.

This correctly detects the bug in t7006. I can't decide if it's clever
or ugly.

---
 t/stdin-garbage |    1 +
 t/test-lib.sh   |   16 ++++++++++++++++
 2 files changed, 17 insertions(+), 0 deletions(-)
 create mode 100644 t/stdin-garbage

diff --git a/t/stdin-garbage b/t/stdin-garbage
new file mode 100644
index 0000000..3a2ebc2
--- /dev/null
+++ b/t/stdin-garbage
@@ -0,0 +1 @@
+This is a garbage file that will be connected to the stdin of each test.
diff --git a/t/test-lib.sh b/t/test-lib.sh
index bdd9513..9b4692b 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -466,6 +466,10 @@ test_debug () {
 	test "$debug" = "" || eval "$1"
 }
 
+test_stdin_unread_ () {
+	test "`perl -e 'print tell(STDIN)'`" = 0
+}
+
 test_eval_ () {
 	# This is a separate function because some tests use
 	# "return" to end a test_expect_success block early.
@@ -475,9 +479,21 @@ test_eval_ () {
 test_run_ () {
 	test_cleanup=:
 	expecting_failure=$2
+	# feed the test a bogus stdin, but keep a spare descriptor open in case
+	# the test redirects stdin, which affects us since it is an eval
+	exec 6<&0
+	exec 7<"$TEST_DIRECTORY/stdin-garbage"
+	exec 0<&7
 	test_eval_ "$1"
 	eval_ret=$?
 
+	# check that nobody read from our bogus descriptor
+	if test $eval_ret = 0 && ! test_stdin_unread_ <&7; then
+		echo >&4 "bug in the test script: somebody read from stdin"
+		eval_ret=1
+	fi
+	exec 0<&6
+
 	if test -z "$immediate" || test $eval_ret = 0 || test -n "$expecting_failure"
 	then
 		test_eval_ "$test_cleanup"
-- 
1.7.8.rc2.30.g803b1a

^ permalink raw reply related

* Re: tr/pty-all (Re: What's cooking in git.git (Dec 2011, #04; Tue, 13))
From: Jeff King @ 2011-12-15  0:25 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Thomas Rast, Junio C Hamano, git,
	Nguyễn Thái Ngọc Duy, Michael Haggerty
In-Reply-To: <20111214233119.GA2354@elie>

On Wed, Dec 14, 2011 at 05:31:19PM -0600, Jonathan Nieder wrote:

> > @@ -469,7 +471,7 @@ test_debug () {
> >  test_eval_ () {
> >  	# This is a separate function because some tests use
> >  	# "return" to end a test_expect_success block early.
> > -	eval >&3 2>&4 "$*"
> > +	eval <"$TEST_DIRECTORY/stdin-garbage" >&3 2>&4 "$*"
> 
> How about /dev/urandom on platforms that support it?  It wouldn't be
> as pleasant to debug as "This is a magic stdin garbage stream", but it
> would be more likely to (despite the name :)) predictably trip errors,
> or at least hangs, in problematic tests.

I'd rather have something deterministic. If you really want to be mean
to accidental readers of stdin, then put binary junk into stdin-garbage
(even the results of a single run of /dev/urandom, if you like). But I
suspect arbitrary text is good enough to throw a monkey wrench into
anything that will care about its input (and those that don't are beyond
our ability to auto-detect anyway[1]). And it's way easier to debug than
seeing random binary bits.

-Peff

[1] Actually, you could abandon the idea of feeding garbage altogether,
and instead open the descriptor outside the test, then check that its
offset is still 0 after the test. You'd have to use a helper program to
do the ftell(), but it should work as the descriptor position will be
shared.

^ permalink raw reply

* How to commit incomplete changes?
From: Hallvard B Furuseth @ 2011-12-14 23:24 UTC (permalink / raw)
  To: git

 Do people have any feelings or conventions for how and when to publish
 a series of commits where the first one(s) break something and the next
 ones clear it up?  I've found some discussion, but with vague results.

 I'm about to commit some small edits which go together with bigger
 generated changes.  It seems both more readable and more cherry-pick-
 friendly to me to keep these in separate commits.

 What I've found is I can use a line in the commit message like
     "Incomplete change, requires next commit (update foo/ dir)."
 and, if there is any point, do a no-ff merge past the breakage.

-- 
 Hallvard

^ permalink raw reply

* Re: tr/pty-all (Re: What's cooking in git.git (Dec 2011, #04; Tue, 13))
From: Jonathan Nieder @ 2011-12-14 23:31 UTC (permalink / raw)
  To: Jeff King
  Cc: Thomas Rast, Junio C Hamano, git,
	Nguyễn Thái Ngọc Duy, Michael Haggerty
In-Reply-To: <20111214232151.GB13128@sigill.intra.peff.net>

Jeff King wrote:

[...]
> A test which accidentally reads stdin would soak up all of
> the rest of the input intended for the outer shell loop.
> 
> Let's redirect stdin from a known source to eliminate
> variation. We could just connect it to /dev/null. However,
> tests which accidentally read stdin would then see immediate
> EOF, which may or may not cause them to notice the errror.
[...]
> +++ b/t/test-lib.sh
[...]
> @@ -469,7 +471,7 @@ test_debug () {
>  test_eval_ () {
>  	# This is a separate function because some tests use
>  	# "return" to end a test_expect_success block early.
> -	eval >&3 2>&4 "$*"
> +	eval <"$TEST_DIRECTORY/stdin-garbage" >&3 2>&4 "$*"

How about /dev/urandom on platforms that support it?  It wouldn't be
as pleasant to debug as "This is a magic stdin garbage stream", but it
would be more likely to (despite the name :)) predictably trip errors,
or at least hangs, in problematic tests.

With or without something along those lines on top, your patch looks
like a good change.

Thanks for a thoughtful analysis.
Jonathan

^ permalink raw reply

* Re: tr/pty-all (Re: What's cooking in git.git (Dec 2011, #04; Tue, 13))
From: Jeff King @ 2011-12-14 23:21 UTC (permalink / raw)
  To: Thomas Rast
  Cc: Jonathan Nieder, Junio C Hamano, git,
	Nguyễn Thái Ngọc Duy, Michael Haggerty
In-Reply-To: <20111214230713.GA13128@sigill.intra.peff.net>

On Wed, Dec 14, 2011 at 06:07:13PM -0500, Jeff King wrote:

> On Wed, Dec 14, 2011 at 05:17:14PM +0100, Thomas Rast wrote:
> 
> > > I also find Jeff's patch [3] appealing.
> > 
> > Me too, though wonder whether feeding a file full of garbage wouldn't
> > be better, so as to trip up commands that try to read only from a
> > non-tty stdin.
> 
> Interesting idea. Instead of getting an immediate EOF from /dev/null,
> they'll get some junk which they may or may not complain about. I played
> around with this a bit, redirecting test stdin from a file with:

So here is what that patch looked like. As I mentioned, it doesn't
actually catch the shortlog problem, but it would fix
Michael's issue with an outer reading loop.

-- >8 --
Subject: [PATCH] test-lib: redirect stdin of tests

We want to run tests in a predictable, sterile environment
so we can get repeatable results.  They should take as
little input as possible from the environment outside the
test script. We already sanitize environment variables, but
leave stdin untouched. This means that scripts can
accidentally be impacted by content on stdin, or whether
stdin isatty().

Furthermore, scripts reading from stdin can be annoying to
outer loops which care about their stdin offset, like:

  while read sha1; do
      make test
  done

A test which accidentally reads stdin would soak up all of
the rest of the input intended for the outer shell loop.

Let's redirect stdin from a known source to eliminate
variation. We could just connect it to /dev/null. However,
tests which accidentally read stdin would then see immediate
EOF, which may or may not cause them to notice the errror.
Instead, let's connect it to a file with random garbage in
it, in the hope that it will be more likely to trigger an
error in the mis-written test.

We'll also leave file descriptor 6 as a link to the original
stdin. Tests shouldn't need to look at this, but it can be
convenient for inserting interactive commands while
debugging tests (e.g., you could insert "bash <&6 >&3 2>&4"
to run interactive commands in the environment of the test
script).

Signed-off-by: Jeff King <peff@peff.net>
---
 t/stdin-garbage |    1 +
 t/test-lib.sh   |    4 +++-
 2 files changed, 4 insertions(+), 1 deletions(-)
 create mode 100644 t/stdin-garbage

diff --git a/t/stdin-garbage b/t/stdin-garbage
new file mode 100644
index 0000000..3a2ebc2
--- /dev/null
+++ b/t/stdin-garbage
@@ -0,0 +1 @@
+This is a garbage file that will be connected to the stdin of each test.
diff --git a/t/test-lib.sh b/t/test-lib.sh
index bdd9513..67eb078 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -198,6 +198,8 @@ else
 	exec 4>/dev/null 3>/dev/null
 fi
 
+exec 6<&0
+
 test_failure=0
 test_count=0
 test_fixed=0
@@ -469,7 +471,7 @@ test_debug () {
 test_eval_ () {
 	# This is a separate function because some tests use
 	# "return" to end a test_expect_success block early.
-	eval >&3 2>&4 "$*"
+	eval <"$TEST_DIRECTORY/stdin-garbage" >&3 2>&4 "$*"
 }
 
 test_run_ () {
-- 
1.7.8.rc2.30.g803b1a

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox