Git development
* Re: [PATCH] Implement git-quiltimport
From: Eric W. Biederman @ 2006-05-16 17:53 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git
In-Reply-To: <Pine.LNX.4.64.0605161001190.3866@g5.osdl.org>

Linus Torvalds <torvalds@osdl.org> writes:

> On Tue, 16 May 2006, Eric W. Biederman wrote:
>>
> >> If the --author flag was not given then the author is recorded as
> >> unknown.
>
> Please don't do this. Just error out. It would be horrible to have a quilt 
> import "succeed", and then later notice that some of the patches had 
> incorrect authorship attribution just because the import script didn't 
> check it, but just made it "unknown".
>
> An un-attributed patch is simply not acceptable in any serious project. 
> It's much better to consider it an error than to say "ok".

There are two practical problems with this.
1) quilt does not force any authorship information to be preserved
   in the description, so this is probably a common case.  Although for
   most users, just needing to specify --author sounds reasonable.

2) There are currently 84 out of roughly 1322 patches in
   2.6.17-rc4-mm1 that git-mailinfo cannot compute the author for.
   Generally the information is there but in such an irregular form
   that it cannot be automatically detected.

   If we can resolve that problem I am willing to make it an error.
   If we can't then sucking quilt patches into a git tree is much
   less useful.  

   Given the ugliness in -mm, making it an error to have a
   non-attributed patch would result in people specifying --author
   when they really don't know who the author is, giving us much
   less reliable information.

   Possibly what we need is an option to not make it an error so that
   people doing this kind of thing in their own trees have useful
   information.


The list of patches that git-mailinfo cannot find authorship
information for from 2.6.17-rc4-mm1 is included below.  Mostly these
are either git trees splatted into a single file, or simply fixes
added by Andrew.  But there are some like: gregkh-usb-usb-gotemp
that have no description at all and only the patch name records who
made the patch.

A really ugly case is the acx1xx-wireless-driver patch, which
appears to have multiple authors and a serious history
before Andrew got it.

From acx1xx-wireless-driver.patch:
> acx100.sourceforge.net (Andreas Mohr <andi@rhlx01.fht-esslingen.de>) ->
>   -> Denis Vlasenko <vda@ilport.com.ua>
>      -> Jeff Garzik <jgarzik@pobox.com>
>         -> me
> 
> DESC
> acx1xx-wireless-driver-usb-is-bust
> EDESC
> From: Andrew Morton <akpm@osdl.org>
> 
> drivers/net/wireless/tiacx/usb.c:1116: `URB_ASYNC_UNLINK' undeclared (first use in this function)
> 
> Cc: Denis Vlasenko <vda@ilport.com.ua>
> DESC
> acx1xx-allow-modular-build
> EDESC
> From: Andrew Morton <akpm@osdl.org>
> DESC
> acx1xx-wireless-driver-spy_offset-went-away
> EDESC
> From: Andrew Morton <akpm@osdl.org>
> 
> Cc: Denis Vlasenko <vda@ilport.com.ua>
> DESC
> acx update
> EDESC
> From: Denis Vlasenko <vda@ilport.com.ua>
> 
> > > Attached is a patch which updates acx. All your changes are
> > > included too. allyesconfig build is fixed by unifying
> > > PCI and USB modules into one. 'acx_debug' parameter is renamed back
> > > to just 'debug' (because all previous versions used it and
> > > we don't want to add to user confusion).
> > >
> > > Please apply.
> > >
> > > Signed-off-by: Denis Vlasenko <vda@ilport.com.ua>
> >
> > I missed a spy_offset fix. Updated patch is attached.
> > Also it is at
> > http://195.66.192.167/linux/acx_patches/linux-2.6.13-mm2acx-2.patch.bz2
> 
> Oh no. Yes. I forgot to remove some standalone build aids.
> 
> DESC
> acx-update 2
> EDESC
> From: Denis Vlasenko <vda@ilport.com.ua>
> 
> [20051016] 0.3.13
> * Revert 20051013 fix, we have one which actually works.
>   Thanks Jacek Jablonski <yacek87@gmail.com> for testing!
> 
> [20051013]
> * trying to fix "yet another similar bug"
> * usb fix by Carlos Martin
> 
> [20051012] 0.3.12
> * acx_l_clean_tx_desc bug fixed - was stopping tx completely
>   at high load. (It seems there exists yet another similar bug!)
> * "unknown IE" dump was 2 bytes too short - fixed
> * DUP logging made less noisy
> * another usb fix by Carlos Martin <carlosmn@gmail.com>
> 
> [20051003]
> * several usb fixes by Carlos Martin <carlosmn@gmail.com> - thanks!
> * unknown IE logging made less noisy
> * few unknown IEs added to the growing collection
> * version bump to 0.3.11
> 
> [20050916]
> * fix bogus MTU handling, add ability to change MTU
> * fix WLAN_DATA_MAXLEN: 2312 -> 2304
> * version bump to 0.3.10
> 
> [20050915]
> * by popular request default mode is 'managed'
> * empty handler for EID 7 (country info) is added
> * fix 'timer not started - iface is not up'
> * tx[host]desc micro optimizations
> * version bump to 0.3.9
> 
> [20050914]
> * tx[host]desc ring workings brought a bit back to two-hostdesc
>   scheme. This is an attempt to fix weird WG311v2 bug.
>   I still fail to understand how same chip with same fw can
>   work for me but do not work for a WG311v2 owner. Mystery.
> * README updated
> * version bump to 0.3.8
> 
> [20050913]
> * variable and fields with awful names renamed
> * a few fields dropped (they had constant values)
> * small optimization to acx_l_clean_tx_desc()
> * version bump to 0.3.7

      origin
      git-acpi
      git-agpgart
      git-alsa
      git-block
      git-cfq
      git-cifs
      git-dvb
      git-gfs2
      git-ia64
      git-ieee1394
      git-infiniband
      git-intelfb
      sane-menuconfig-colours
      git-klibc
      git-hdrcleanup
      git-hdrinstall
      git-libata-all
      libata_resume_fix
      git-mips
      git-mtd
      git-netdev-all
      git-nfs
      git-ocfs2
      git-powerpc
      git-rbtree
      git-sas
      gregkh-pci-acpiphp-configure-_prt-v3
      gregkh-pci-acpiphp-hotplug-slot-hotplug
      gregkh-pci-acpiphp-host-and-p2p-hotplug
      gregkh-pci-acpiphp-turn-off-slot-power-at-error-case
      gregkh-pci-pci-legacy-i-o-port-free-driver-changes-to-generic-pci-code
      gregkh-pci-pci-legacy-i-o-port-free-driver-update-documentation-pci_txt
      gregkh-pci-pci-legacy-i-o-port-free-driver-make-intel-e1000-driver-legacy-i-o-port-free
      gregkh-pci-pci-64-bit-resources-drivers-pci-changes
      gregkh-pci-pci-64-bit-resources-drivers-media-changes
      gregkh-pci-pci-64-bit-resources-drivers-net-changes
      gregkh-pci-pci-64-bit-resources-drivers-pcmcia-changes
      gregkh-pci-pci-64-bit-resources-drivers-others-changes
      gregkh-pci-pci-msi-abstractions-and-support-for-altix
      git-pcmcia
      git-scsi-target
      gregkh-usb-usb-gotemp
      git-supertrak
      git-watchdog
      x86_64-mm-defconfig-update
      x86_64-mm-memset-always-inline
      x86_64-mm-amd-core-cpuid
      x86_64-mm-amd-cpuid4
      x86_64-mm-alternatives
      x86_64-mm-ia32-unistd-cleanup
      x86_64-mm-topology-comment
      x86_64-mm-new-compat-ptrace
      x86_64-mm-disable-agp-resource-check
      x86_64-mm-new-northbridge
      x86_64-mm-iommu-warning
      x86_64-mm-i386-up-generic-arch
      x86_64-mm-iommu-enodev
      x86_64-mm-compat-printk
      x86_64-mm-i386-numa-summit-check
      x86_64-mm-fix-b44-checks
      x86_64-mm-nommu-warning
      git-cryptodev
      mm
      acx1xx-wireless-driver
      reiser4-export-find_get_pages
      kgdb-core-lite
      kgdb-8250
      kgdb-netpoll_pass_skb_to_rx_hook
      kgdb-eth
      kgdb-i386-lite
      kgdb-cfi_annotations
      kgdb-sysrq_bugfix
      kgdb-module
      kgdb-core
      kgdb-i386
      journal_add_journal_head-debug
      list_del-debug
      unplug-can-sleep
      firestream-warnings
      git-viro-bird-m32r
      git-viro-bird-m68k
      git-viro-bird-frv
      git-viro-bird-upf
      git-viro-bird-volatile

Eric


* Re: git-svn vs. $Id$
From: Tommi Virtanen @ 2006-05-16 18:12 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0605161037220.3866@g5.osdl.org>

Linus Torvalds wrote:
> Isn't there some flag to svn to avoid keyword expansion, like "-ko" to 
> CVS?
> 
> Any import script definitely should avoid keyword expansion (and that's 
> true whether you end up wanting to use keywords or not).

Well, yes, I agree. But, at least git-svn.txt says this:

BUGS
----
...
svn:keywords can't be ignored in Subversion (at least I don't know of
a way to ignore them).

I guess one might be able to reach that information through the svn API.

Or just propget svn:keywords and sed s/\$Id\(:[^$]*\)\$/$Id$/ all files
with keywords, for all relevant keywords. Eww.
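
The sed idea above can be sketched in Python as follows. This is only an
illustration of collapsing expanded keywords back to their bare form; the
function name, the `keywords` parameter and the sample text are made up
here, and nothing in it talks to Subversion or reads svn:keywords.

```python
import re

def collapse_keywords(text, keywords=("Id",)):
    """Turn expanded keywords like "$Id: foo.c 42 ... $" back into "$Id$"."""
    names = "|".join(re.escape(k) for k in keywords)
    # Matches "$Name: anything-but-dollar $" as well as the bare "$Name$".
    pattern = re.compile(r"\$(%s)(?::[^$\n]*)?\$" % names)
    return pattern.sub(lambda m: "$" + m.group(1) + "$", text)

# Made-up sample: one expanded keyword and one unrelated dollar sign.
sample = "/* $Id: main.c 148 2006-05-16 18:12:00Z tv $ */\nprice = $5\n"
print(collapse_keywords(sample), end="")
```

Only the keywords that are passed in are touched, so a file that has just
"Id" enabled would keep any literal "$Rev: ... $" text it contains.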

-- 
Inoi Oy, Tykistökatu 4 D (4. krs), FI-20520 Turku, Finland
http://www.inoi.fi/
Mobile +358 40 762 5656


* Re: [PATCH] Implement git-quiltimport
From: Junio C Hamano @ 2006-05-16 19:01 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Linus Torvalds, git
In-Reply-To: <m1bqtx6el6.fsf@ebiederm.dsl.xmission.com>

ebiederm@xmission.com (Eric W. Biederman) writes:

>    Given the ugliness in -mm, making it an error to have a
>    non-attributed patch would result in people specifying --author
>    when they really don't know who the author is, giving us much
>    less reliable information.
>
>    Possibly what we need is an option to not make it an error so that
>    people doing this kind of thing in their own trees have useful
>    information.

I agree it is probably a good idea to error out by default, optionally
allowing the user to say "don't care".  I do not think Linus would pull
from such a tree or trees branched from it into his official
tree, so I do not think we would need to worry about commits
with incomplete information propagating for this particular
"gitified mm" usage.  But as a general purpose tool to produce
"gitified quilt series" tree, we would.

It depends on the expected use of the resulting gitified mm
tree.

If it is for an individual developer to futz with and tweak
upon, and the end result of the work leaves such a "gitified
quilt series" repository only in patch form, then not having
to figure out and specify authorship information for many patches
is probably a plus; the information will not be part of the
official history recorded elsewhere anyway.

However, if it is to produce a reference git tree to point
people at (i.e. the quiltimport script is run once per series
by somebody and the result is published for public use), I would
imagine we would want to have the attribution straight, so if
the tool has to "guess", it should either error out or go
interactive and ask.
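
One way to read the policy discussed in this thread (use the detected
author, fall back to --author, and record "unknown" only when explicitly
asked to) is sketched below in Python. The function name, its parameters
and the exception type are hypothetical; none of them is taken from the
quiltimport script itself.

```python
class MissingAuthor(Exception):
    """Raised when a patch has no usable authorship information."""

def resolve_author(detected, fallback=None, allow_unknown=False):
    """Pick the author to record for one patch.

    detected      -- what the patch description yielded, or None
    fallback      -- value supplied by the user (think --author), or None
    allow_unknown -- hypothetical opt-in for recording "unknown"
    """
    if detected:
        return detected
    if fallback:
        return fallback
    if allow_unknown:
        return "unknown"
    raise MissingAuthor("no author information found for patch")

print(resolve_author(None, fallback="A U Thor <author@example.com>"))
```

With the defaults, a patch without any attribution stops the import
instead of silently producing a commit with a made-up author.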


* [PATCH] improve depth heuristic for maximum delta size
From: Nicolas Pitre @ 2006-05-16 20:29 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7v4pzqhh3t.fsf@assigned-by-dhcp.cox.net>

This provides a linear decrement on the penalty related to delta depth
instead of being a 1/x function.  With this, another 5% reduction is
observed on packs for both the GIT repo and the Linux kernel repo, as 
well as fixing a pack size regression in another sample repo I have.

Signed-off-by: Nicolas Pitre <nico@cam.org>

---

On Mon, 15 May 2006, Junio C Hamano wrote:

> Nicolas Pitre <nico@cam.org> writes:
> 
> > @@ -1038,8 +1038,8 @@ static int try_delta(struct unpacked *tr
> >  
> >  	/* Now some size filtering euristics. */
> >  	size = trg_entry->size;
> > -	max_size = size / 2 - 20;
> > -	if (trg_entry->delta)
> > +	max_size = (size/2 - 20) / (src_entry->depth + 1);
> > +	if (trg_entry->delta && trg_entry->delta_size <= max_size)
> >  		max_size = trg_entry->delta_size-1;
> >  	src_size = src_entry->size;
> >  	sizediff = src_size < size ? size - src_size : 0;
> 
> At first glance, this seems rather too aggressive.  It makes
> me wonder if it is a good balance to penalize the second
> generation base by requiring it to produce a small delta that is
> at most half the size we would normally allow (and the third generation a
> third), or maybe the penalty should kick in more gradually, like
> e.g. (max_depth * 2 - src_entry->depth) / (max_depth * 2).

You are right.  However, your formula converges toward 0.5, which is not
enough to be sure that the bad effect of early eviction of max-depth objects
from the object window won't come back.  I prefer this patch, with a
formula converging toward 0.
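
To compare the three scalings discussed in this thread, here is a small
Python sketch, not the C code from pack-objects.c. It assumes
non-negative sizes of at least 40 bytes so that Python's floor division
behaves like C integer division; the sample size and max_depth values are
arbitrary.

```python
def max_size_inverse(size, depth):
    """Earlier patch: divide the budget by (depth + 1), a 1/x penalty."""
    return (size // 2 - 20) // (depth + 1)

def max_size_half(size, depth, max_depth):
    """Suggested alternative: scale by (2*max_depth - depth) / (2*max_depth)."""
    return (size // 2 - 20) * (max_depth * 2 - depth) // (max_depth * 2)

def max_size_linear(size, depth, max_depth):
    """This patch: scale by (max_depth - depth) / max_depth."""
    return (size // 2 - 20) * (max_depth - depth) // max_depth

size, max_depth = 10040, 10   # illustrative values only
for depth in (0, 1, 5, 9):
    print(depth,
          max_size_inverse(size, depth),
          max_size_half(size, depth, max_depth),
          max_size_linear(size, depth, max_depth))
```

With these numbers the unscaled budget is 5000 bytes.  At depth 9 the
"half" formula still allows 2750 bytes, while the linear one allows 500,
which is the difference between converging toward 0.5 and toward 0.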

diff --git a/pack-objects.c b/pack-objects.c
index 566a2a2..3116020 100644
--- a/pack-objects.c
+++ b/pack-objects.c
@@ -1036,9 +1036,12 @@ static int try_delta(struct unpacked *tr
 	if (src_entry->depth >= max_depth)
 		return 0;
 
-	/* Now some size filtering euristics. */
+	/* Now some size filtering heuristics. */
 	size = trg_entry->size;
-	max_size = (size/2 - 20) / (src_entry->depth + 1);
+	max_size = size/2 - 20;
+	max_size = max_size * (max_depth - src_entry->depth) / max_depth;
+	if (max_size == 0)
+		return 0;
 	if (trg_entry->delta && trg_entry->delta_size <= max_size)
 		max_size = trg_entry->delta_size-1;
 	src_size = src_entry->size;


* Re: let's meet
From: Randal L. Schwartz @ 2006-05-16 20:34 UTC (permalink / raw)
  To: git
In-Reply-To: <602115DC.2C05E9D@arsenal.co.uk>

>>>>> "Luke" == Luke  <oxwacpp@arsenal.co.uk> writes:

Luke> Hire,
Luke> i am here sittiang in the internet caffe. Found your email a!nd
Luke> decided to write. I might be coming to your p!lace in 14 days, 
Luke> so I decided to email you. May be we ca!n meet? I am 25 y.o.
Luke> girl. I have a picture if you want. No need to reply here as 
Luke> this is not my email. Write me at ex@datetodayy.com

I hope she has a big table. :)

-- 
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!


* Re: let's meet
From: Junio C Hamano @ 2006-05-16 20:40 UTC (permalink / raw)
  To: Randal L. Schwartz; +Cc: git
In-Reply-To: <86odxxn1yc.fsf@blue.stonehenge.com>

merlyn@stonehenge.com (Randal L. Schwartz) writes:

>>>>>> "Luke" == Luke  <oxwacpp@arsenal.co.uk> writes:
>
> Luke> Hire,
> Luke> i am here sittiang in the internet caffe. Found your email a!nd
> Luke> decided to write. I might be coming to your p!lace in 14 days, 
> Luke> so I decided to email you. May be we ca!n meet? I am 25 y.o.
> Luke> girl. I have a picture if you want. No need to reply here as 
> Luke> this is not my email. Write me at ex@datetodayy.com
>
> I hope she has a big table. :)

Huh?

She's coming to *your* place, so you are the one to prepare a
big table to cover the locations we all live---perhaps "earth"?

;-)


* Re: [RFC] Add "rcs format diff" support
From: Al Viro @ 2006-05-16 20:49 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Git Mailing List, Al Viro, Davide Libenzi
In-Reply-To: <20060514001214.GB27946@ftp.linux.org.uk>

Use:
	diff-remap-data <dir1> <dir2> >map
or
	git-remap-data <git-diff arguments> >map
will build information for remapper,
	git-remap <map> <options>
will do line numbers remapping.

git-remap is a filter.  It takes map as argument and, in the simplest form,
will look at the lines in stdin that have form
<filename>:<number>:<text>
If the indicated line from old tree had survived into the new one, we will
get
N:<new-filename>:<new-number>:<text>
on the output.  If it hadn't, we get
O:<filename>:<number>:<text>
Lines that do not have such form are passed unchanged.
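
The default behaviour described above can be sketched in Python as
follows. The format of the real map file is not described here, so
`file_map` is a hypothetical in-memory stand-in, and `remap_line` is a
made-up name; this is an illustration of the rule, not the actual tool.

```python
import re

LINE_RE = re.compile(r"^([^:\n]+):(\d+):(.*)$")

def remap_line(line, file_map):
    """Rewrite one '<filename>:<number>:<text>' line.

    file_map (hypothetical): {old_name: (new_name, {old_lineno: new_lineno})}.
    Files missing from file_map count as unchanged, so an empty map is
    the identity mapping.
    """
    m = LINE_RE.match(line)
    if m is None:
        return line                              # other lines pass unchanged
    name, number, text = m.groups()
    if name not in file_map:
        return "N:%s:%s:%s" % (name, number, text)
    new_name, lines = file_map[name]
    if int(number) in lines:                     # the line survived
        return "N:%s:%d:%s" % (new_name, lines[int(number)], text)
    return "O:%s:%s:%s" % (name, number, text)   # the line is gone

file_map = {"fs/old.c": ("fs/new.c", {10: 12})}
for line in ["fs/old.c:10:warning: foo", "fs/old.c:11:warning: bar",
             "make: Entering directory"]:
    print(remap_line(line, file_map))
```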

Even that is already very useful for log comparison.  E.g. if old-log is
from the old tree and new-log is from the new one, we can do
	git-remap map <old-log >foo
	git-remap /dev/null <new-log >bar
	diff -u foo bar
and have the noise due to line number changes excluded (empty map means
identity mapping, so the second line will simply slap N: on all lines of
form <filename>:<number>:<text> in new-log).

Note that it's not just for build logs; the thing is useful for sparse logs,
grep -n output, etc., etc. 

Behaviour described above is the default; what _really_ happens is
that we take lines of form
<original_prefix><filename>:<number>:<text>
and replace them with
<prefix_for_new><new-filename>:<new-number>:<text>
or
<prefix_for_old><filename>:<number>:<text>
Defaults are "", "N:" and "O:" resp.; what it gives us is the ability to
do multiple remappings.  IOW, we can say

diff-remap-data old-tree newer-tree > map1
diff-remap-data newer-tree current-tree > map2
git-remap -o old: map1 <old-log | git-remap -p N: -o newer: -n current: map2>foo

and get lines that didn't make it into the newer tree marked with old: and
otherwise be unchanged, ones that made it to newer, but not the current to
be marked with newer: and have the filenames/line numbers remapped and ones
that made it all the way be marked with current: and remapped all the way
to current tree.
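
The two-stage pipeline above can be modelled in Python like this. It is a
sketch under assumptions: the map representation is a hypothetical
in-memory dict, `make_remapper` is a made-up name, and its `prefix`,
`new` and `old` arguments stand for the -p, -n and -o options as they are
used in the command line above.

```python
import re

def make_remapper(file_map, prefix="", new="N:", old="O:"):
    """Build a line filter for one remapping stage.

    file_map (hypothetical): {old_name: (new_name, {old_lineno: new_lineno})};
    files not listed are treated as unchanged.
    """
    pattern = re.compile(r"^%s([^:\n]+):(\d+):(.*)$" % re.escape(prefix))

    def remap(line):
        m = pattern.match(line)
        if m is None:
            return line                       # wrong prefix or form: untouched
        name, number, text = m.groups()
        if name not in file_map:
            return "%s%s:%s:%s" % (new, name, number, text)
        new_name, lines = file_map[name]
        if int(number) in lines:
            return "%s%s:%d:%s" % (new, new_name, lines[int(number)], text)
        return "%s%s:%s:%s" % (old, name, number, text)

    return remap

map1 = {"a.c": ("a.c", {10: 12, 20: 25})}     # old tree -> newer tree
map2 = {"a.c": ("b.c", {12: 14})}             # newer tree -> current tree
stage1 = make_remapper(map1, old="old:")
stage2 = make_remapper(map2, prefix="N:", new="current:", old="newer:")

log = ["a.c:10:first", "a.c:20:second", "a.c:30:third", "plain text"]
result = [stage2(stage1(line)) for line in log]
print("\n".join(result))
```

The second stage only looks at lines the first stage marked with "N:",
so lines already marked "old:" flow through it unchanged.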

That's quite useful when you want to carry logs for a while, basically using
them as annotated TODO ("logs" here can very well be results of grep -n with
annotations added to them).  You can have all still relevant bits stay with
the locations in text and see what had fallen out.

Note on relation to git:
	* git-remap, despite the name, doesn't need git to work
	* diff-remap-data doesn't need git to work
	* git-remap-data _does_ need it.  Aside from working on revisions in
a git repository instead of a couple of directory trees, it generates a slightly
better map than diff-remap-data does, i.e. it manages to remap more lines,
because it notices renames.

This stuff lives on ftp.linux.org.uk/pub/people/viro/remapper/; I'm not
sure what to do with it wrt distributing - submit for inclusion into
git, or leave that sucker standalone.  It can be used without git, but
OTOH having it in git would make my life easier - I wouldn't have to
think about packaging it myself ;-)

Seriously,
	a) feel free to play with it; hopefully it will be useful.
	b) review and comments are welcome.
	c) so would any thoughts regarding the right way to distribute it.


* Merge with local conflicts in new files
From: Santi @ 2006-05-16 22:00 UTC (permalink / raw)
  To: git, Junio C Hamano

Hi *,

       In the case of:

- You merge from a branch with new files
- You have these files in the working directory
- You do not have these files in the HEAD.

   The end result is that you lose the content of these files.

   So an additional check for the merge is to check for these dirty
but not in HEAD files.

   Here is a test that reproduces it. I expect the merge to fail,
leaving the content of foo as bar.

test_description='Test merge with local conflicts in new files'
. ./test-lib.sh

test_expect_success 'prepare repository' \
'echo "Hello" > init &&
git add init &&
git commit -m "Initial commit" &&
git branch B &&
echo "foo" > foo &&
git add foo &&
git commit -m "File: foo" &&
git checkout B &&
echo "bar" > foo '

test_expect_code 1 'Merge with local conflicts in new files' \
'git merge "merge msg" B master'

test_done

Thanks.


* "git add $ignored_file" fail
From: Santi @ 2006-05-16 22:07 UTC (permalink / raw)
  To: git, Junio C Hamano

Hi *,

      When you try to add ignored files with the git-add command it
fails because the call to:

git-ls-files -z \
        --exclude-from="$GIT_DIR/info/exclude" \
        --others --exclude-per-directory=.gitignore

      does not output this file because it is ignored. I know I can do it with:

git-update-index --add $ignored_file

I understand the behaviour of git-ls-files but I think it is not the
expected behaviour for git-add, at least for me.

    Thanks

    Santi


* Re: Merge with local conflicts in new files
From: Santi @ 2006-05-16 22:12 UTC (permalink / raw)
  To: git, Junio C Hamano
In-Reply-To: <8aa486160605161500m1dd8428cj@mail.gmail.com>

Sorry, the test is wrong. Use this:

test_description='Test merge with local conflicts in new files'
. ./test-lib.sh

test_expect_success 'prepare repository' \
'echo "Hello" > init &&
git add init &&
git commit -m "Initial commit" &&
git checkout -b B &&
echo "foo" > foo &&
git add foo &&
git commit -m "File: foo" &&
git checkout master &&
echo "bar" > foo '

test_expect_code 1 'Merge with local conflicts in new files' \
'git merge "merge msg" HEAD B'

test_done


* Ouput of git diff with <ent>:<path>
From: Santi @ 2006-05-16 22:24 UTC (permalink / raw)
  To: git, Junio C Hamano

Hi *,

   just curious if this is the expected output. I find this syntax
very useful, but the "a/v1.3.3:" prefix (or, even without the tree,
"a/:") is a bit confusing. And I expected neither the rename from/to
nor the similarity index 0%.

diff --git a/v1.3.3:Makefile b/Makefile
similarity index 0%
rename from v1.3.3:Makefile
rename to Makefile
index b808eca..55d1937 100644
--- a/v1.3.3:Makefile
+++ b/Makefile

Thanks.

Santi


* Re: "git add $ignored_file" fail
From: Linus Torvalds @ 2006-05-16 22:28 UTC (permalink / raw)
  To: Santi; +Cc: git, Junio C Hamano
In-Reply-To: <8aa486160605161507w3a27152dq@mail.gmail.com>



On Wed, 17 May 2006, Santi wrote:
> 
>      When you try to add ignored files with the git-add command it
> fails because the call to:
> 
> git-ls-files -z \
>        --exclude-from="$GIT_DIR/info/exclude" \
>        --others --exclude-per-directory=.gitignore
> 
>      does not output this file because it is ignored. I know I can do it with:
> 
> git-update-index --add $ignored_file
> 
> I understand the behaviour of git-ls-files but I think it is not the
> expected behaviour for git-add, at least for me.

Well, the thing is, git-add doesn't really take a "file name", it takes a 
filename _pattern_.

Clearly we can't add everything that matches the pattern, because one 
common case is to add a whole subdirectory, and thus clearly the 
.gitignore file must override the pattern.

So it's consistent that it overrides it also for a single filename case, 
no?
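
The argument above can be illustrated with a deliberately simplified
Python model: the candidates for adding are the untracked files selected
by the given patterns, minus everything the ignore rules exclude. The
function name is made up, and the matching is far cruder than real
.gitignore semantics (basename globbing only).

```python
from fnmatch import fnmatchcase

def files_to_add(untracked, pathspecs, ignore_patterns):
    """Untracked files selected by a pathspec, minus anything ignored."""
    def selected(path):
        return any(path == spec
                   or path.startswith(spec.rstrip("/") + "/")
                   or fnmatchcase(path, spec)
                   for spec in pathspecs)

    def ignored(path):
        name = path.rsplit("/", 1)[-1]          # compare the basename only
        return any(fnmatchcase(name, pat) for pat in ignore_patterns)

    return [p for p in untracked if selected(p) and not ignored(p)]

untracked = ["src/main.c", "src/main.o", "build.log"]
ignore = ["*.o", "*.log"]
print(files_to_add(untracked, ["src"], ignore))          # whole directory
print(files_to_add(untracked, ["src/main.o"], ignore))   # single ignored file
```

In this model the ignore filter cannot tell a directory argument from a
single file name, which is why naming an ignored file explicitly yields
nothing to add.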

		Linus


* Re: Merge with local conflicts in new files
From: Junio C Hamano @ 2006-05-16 22:40 UTC (permalink / raw)
  To: Santi; +Cc: git
In-Reply-To: <8aa486160605161500m1dd8428cj@mail.gmail.com>

Santi <sbejar@gmail.com> writes:

>       In the case of:
>
> - You merge from a branch with new files
> - You have these files in the working directory
> - You do not have these files in the HEAD.

and

 - You have not told git that these files matter.

>...
> test_expect_success 'prepare repository' \
> 'echo "Hello" > init &&
> git add init &&
> git commit -m "Initial commit" &&
> git branch B &&
> echo "foo" > foo &&
> git add foo &&
> git commit -m "File: foo" &&
> git checkout B &&
> echo "bar" > foo '

At this point, you have not told git that foo is a file that is
relevant on branch B, so git considers it a fair game to
overwrite.

At least, that was the original reasoning.

It happens not just during the ordinary "git-merge", by the way.
If you are on branch B that did not have 'foo', created 'foo'
and switched to branch A (which has 'foo') before telling the
index that you care about your version of 'foo' on branch B,
'foo' from branch A will overwrite your throwaway copy in the
working tree:

	$ git branch
	* master
	$ git branch another
	$ echo 'New file' >afile
	$ git add afile
	$ git commit -m 'Add afile'
	$ git checkout another
	$ ls afile
	ls: afile: No such file or directory
	$ echo 'Lost file' >afile
	$ git checkout master
	$ cat afile
	New file

We acquired "git apply" which does take notice when you have
such an untracked file in the working tree that conflicts with
what it does to the index, and I think its behaviour sometimes
is more user friendly and safer than what the merge does
currently (but it irritates people some other times).

This is totally untested, but on top of "next" you could do
something like this, perhaps.

We _might_ want to do this conditionally, only when the user
asks, though.  I dunno.  Being able to blow away irrelevant
files is sometimes a good thing, so we _might_ want to have a
reverse logic to "git apply" that makes it blow away untracked
working tree files under "--index" option.

-- >8 --

diff --git a/read-tree.c b/read-tree.c
index aa6172b..185a73f 100644
--- a/read-tree.c
+++ b/read-tree.c
@@ -453,8 +453,18 @@ static int merged_entry(struct cache_ent
 			invalidate_ce_path(old);
 		}
 	}
-	else
+	else {
+		/*
+		 * Originally we did not have a cache entry here but
+		 * are creating a new file as a result of the merge.
+		 * Do we want to lose the untracked working tree files?
+		 */
+		struct stat st;
+
+		if (!lstat(merge->name, &st))
+			die("Untracked working tree file '%s' would be overwritten by merge.", merge->name);
 		invalidate_ce_path(merge);
+	}
 	merge->ce_flags &= ~htons(CE_STAGEMASK);
 	add_cache_entry(merge, ADD_CACHE_OK_TO_ADD);
 	return 1;
@@ -701,7 +711,7 @@ static int bind_merge(struct cache_entry
 		return error("Cannot do a bind merge of %d trees\n",
 			     merge_size);
 	if (!a)
-		return merged_entry(old, NULL);
+		return merged_entry(old, old);
 	if (old)
 		die("Entry '%s' overlaps.  Cannot bind.", a->name);
 
@@ -736,7 +746,7 @@ static int oneway_merge(struct cache_ent
 		}
 		return keep_entry(old);
 	}
-	return merged_entry(a, NULL);
+	return merged_entry(a, old);
 }
 
 static int read_cache_unmerged(void)
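
The check added to merged_entry() above can be paraphrased in Python as
follows. This is only a model of the idea (refuse when a path that has no
index entry already exists in the working tree); the function name and
its arguments are hypothetical and do not correspond to anything in
read-tree.c.

```python
import os
import tempfile

def check_untracked(worktree, index_paths, merged_paths):
    """Raise if the merge would create a file that already exists untracked."""
    for path in merged_paths:
        if path in index_paths:
            continue          # tracked: covered by the usual merge rules
        # lexists() succeeds whenever lstat() would, symlinks included.
        if os.path.lexists(os.path.join(worktree, path)):
            raise RuntimeError(
                "Untracked working tree file '%s' would be overwritten by merge."
                % path)

with tempfile.TemporaryDirectory() as wt:
    with open(os.path.join(wt, "foo"), "w") as f:
        f.write("bar\n")      # untracked local file, as in the test case
    try:
        check_untracked(wt, {"init"}, ["init", "foo"])
        refused = False
    except RuntimeError as err:
        refused = True
        print(err)
```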


* Re: "git add $ignored_file" fail
From: Jakub Narebski @ 2006-05-16 22:41 UTC (permalink / raw)
  To: git
In-Reply-To: <Pine.LNX.4.64.0605161526210.16475@g5.osdl.org>

Linus Torvalds wrote:

> Well, the thing is, git-add doesn't really take a "file name", it takes a 
> filename _pattern_.
> 
> Clearly we can't add everything that matches the pattern, because one 
> common case is to add a whole subdirectory, and thus clearly the 
> .gitignore file must override the pattern.
> 
> So it's consistent that it overrides it also for a single filename case, 
> no?

Well, if shell expansion cannot find a file matching the pattern, it uses
the pattern literally as the file name.

It would be nice to have an easy (git core porcelain level) way to add files
which match an ignore pattern.

-- 
Jakub Narebski
Warsaw, Poland


* Re: "git add $ignored_file" fail
From: Santi @ 2006-05-16 22:42 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git, Junio C Hamano
In-Reply-To: <Pine.LNX.4.64.0605161526210.16475@g5.osdl.org>

2006/5/17, Linus Torvalds <torvalds@osdl.org>:
>
>
> On Wed, 17 May 2006, Santi wrote:
> >
> >      When you try to add ignored files with the git-add command it
> > fails because the call to:
> >
> > git-ls-files -z \
> >        --exclude-from="$GIT_DIR/info/exclude" \
> >        --others --exclude-per-directory=.gitignore
> >
> >      does not output this file because it is ignored. I know I can do it with:
> >
> > git-update-index --add $ignored_file
> >
> > I understand the behaviour of git-ls-files but I think it is not the
> > expected behaviour for git-add, at least for me.
>
> Well, the thing is, git-add doesn't really take a "file name", it takes a
> filename _pattern_.
>
> Clearly we can't add everything that matches the pattern, because one
> common case is to add a whole subdirectory, and thus clearly the
> .gitignore file must override the pattern.
>
> So it's consistent that it overrides it also for a single filename case,
> no?
>

It's consistent from an implementation point of view, but not from the
(my?) user point of view. This is why I say I understand it for
git-ls-files. For the case of git-add even the usage and the man page
talk about <file>...

Clearly for the case of a whole subdirectory, or even ".",  the
.gitignore file must override the pattern, but not for the case of a
pattern that is a single existing file.

Santi


* Re: Ouput of git diff with <ent>:<path>
From: Junio C Hamano @ 2006-05-16 22:44 UTC (permalink / raw)
  To: Santi; +Cc: git
In-Reply-To: <8aa486160605161524j5d7e672eo@mail.gmail.com>

Santi <sbejar@gmail.com> writes:

> ... I expected neither the rename from/to nor the
> similarity index 0%.
>
> diff --git a/v1.3.3:Makefile b/Makefile
> similarity index 0%
> rename from v1.3.3:Makefile
> rename to Makefile
> index b808eca..55d1937 100644
> --- a/v1.3.3:Makefile
> +++ b/Makefile

Yes I am aware of this one; I just haven't bothered to deal with
it.

It looks at two strings, "v1.3.3:Makefile" and "Makefile", and
says "they have different names -- they are renamed".

Patches welcome as long as you do not break more usual cases
;-).


* Re: Merge with local conflicts in new files
From: Santi @ 2006-05-16 23:11 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7v1wut61aj.fsf@assigned-by-dhcp.cox.net>

2006/5/17, Junio C Hamano <junkio@cox.net>:
> Santi <sbejar@gmail.com> writes:
>
> >       In the case of:
> >
> > - You merge from a branch with new files
> > - You have these files in the working directory
> > - You do not have these files in the HEAD.
>
> and
>
>  - You have not told git that these files matter.

For me it is the other way, all my files matter but git can do
whatever it wants with the ones it controls.

>
> This is totally untested, but on top of "next" you could do
> something like this, perhaps.

Thanks, it works here.

Santi


* Re: [PATCH] Update the documentation for git-merge-base
From: Junio C Hamano @ 2006-05-16 23:20 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Fredrik Kuivinen, git
In-Reply-To: <Pine.LNX.4.64.0605160906150.3866@g5.osdl.org>

Linus Torvalds <torvalds@osdl.org> writes:

>  - In contrast, for git (current master branch), the numbers are 35 out of 
>    540, and there are lots of merges with many LCA's:
>
>     505 o
>      15 oo
>      13 ooo
>       2 oooo
>       3 ooooo
>       2 ooooooo
>
> I think the difference is that Junio does a lot of these branches where he 
> keeps on pulling from them, and never syncs back (which is a great 
> workflow). In contrast, the kernel tends to try to avoid that because the 
> history gets messy enough as it is ;)
>
> Anyway, the two commits that apparently have seven (!) LCA's in the git 
> tree should probably be checked out. They are probably a good thing to see 
> if git-merge-base really _really_ does the right thing, and whether they 
> really are true LCA's.
>
> They are commits ad0b46bf.. and e6a933bd.. respectively.

The first one is because at 1.3.0 I pulled everything from
"next" to "master".

Usually "next" incorporates topic branches that stem from
different commits on "master", and when a new topic is merged to
"next", it gets the updates to "master" up to that point along
with the new topic.  When topics graduate (i.e. are merged back) to
"master", they do so at different paces.


      topic2          o---o---o---o---H---.
                     /                 \   \
      next   -----------o---o---E---o---I-------B
                   /   /       /             \   \
      topic1      /   /   o---D---.           \   \
                 /   /   /         \           \   \
      master ---G---o---C---o---o---F---o---o---A---X

The above illustration shows that two topics branched from
master were cooked in next.  Topic 1 branched from master at C,
added two commits (its tip is at D), merged to next at E and
then later merged to master at F.  Similarly, topic 2 branched
from master at G, added five commits (its tip is at H), merged
to next at I and then later merged to master at A.

When merging "next" into "master" by merging A and B to produce
X, tips of topics 1 and 2 (D and H, respectively) become the
merge base.
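
The claim can be checked on a small model of the illustration. The Python
sketch below encodes the graph as a child-to-parents table and computes
the common ancestors of A and B that are not reachable from another
common ancestor. The names m1..m5, t1, t2a..t2d and n0..n3 are made up
for the unlabeled "o" commits, and n0 stands in for the earlier history
of "next", which the picture does not show.

```python
def ancestors(parents, commit):
    """All commits reachable from `commit`, including itself."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, ()))
    return seen

def merge_bases(parents, a, b):
    """Common ancestors of a and b not reachable from another common one."""
    common = ancestors(parents, a) & ancestors(parents, b)
    return {c for c in common
            if not any(c != d and c in ancestors(parents, d) for d in common)}

parents = {
    # master
    "G": [], "m1": ["G"], "C": ["m1"], "m2": ["C"], "m3": ["m2"],
    "F": ["m3", "D"], "m4": ["F"], "m5": ["m4"], "A": ["m5", "H"],
    # topic1 and topic2
    "t1": ["C"], "D": ["t1"],
    "t2a": ["G"], "t2b": ["t2a"], "t2c": ["t2b"], "t2d": ["t2c"], "H": ["t2d"],
    # next (n0 is a placeholder for its earlier history)
    "n0": [], "n1": ["n0", "m1"], "n2": ["n1"], "E": ["n2", "D"],
    "n3": ["E"], "I": ["n3", "H"], "B": ["I"],
}
print(sorted(merge_bases(parents, "A", "B")))   # prints ['D', 'H']
```

This brute-force version recomputes ancestor sets many times and is meant
for toy graphs only; it says nothing about how git-merge-base is
implemented.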

Merging "next" wholesale to "master" is hopefully a rare event,
but the seven bases you are seeing are the topic tips.

The other one is the other way around.  From time to time,
"next" itself gets updates from "master" to keep it in sync with
fixes that occurred on "master" directly.  Such a merge into
"next" looks like the picture below, but the principle is the same.

      topic2          o---o---o---o---H---.
                     /                 \   \
      next   -----------o---o---E---o---I-------B---Y
                   /   /       /             \     /
      topic1      /   /   o---D---.           \   /
                 /   /   /         \           \ /
      master ---G---o---C---o---o---F---o---o---A


* Re: Merge with local conflicts in new files
From: Junio C Hamano @ 2006-05-16 23:28 UTC (permalink / raw)
  To: Santi; +Cc: git
In-Reply-To: <8aa486160605161611p4c9ddbc0v@mail.gmail.com>

Santi <sbejar@gmail.com> writes:

> 2006/5/17, Junio C Hamano <junkio@cox.net>:
>> Santi <sbejar@gmail.com> writes:
>>
>> >       In the case of:
>> >
>> > - You merge from a branch with new files
>> > - You have these files in the working directory
>> > - You do not have these files in the HEAD.
>>
>> and
>>
>>  - You have not told git that these files matter.
>
> For me it is the other way, all my files matter but git can do
> whatever it wants with the ones it controls.

You really do not mean that.

If you told git a file matters, and have local modifications to
the file in the working tree that you have not run update-index
on yet, merge and apply should be careful not to overwrite your
changes that are not ready while doing whatever they have to
do.  And they are careful, because you have told git that
they matter, and the way you tell git that they matter is to
have entries for them in the index.


* Re: [PATCH] improve depth heuristic for maximum delta size
From: Junio C Hamano @ 2006-05-16 23:34 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0605161510200.18071@localhost.localdomain>

Nicolas Pitre <nico@cam.org> writes:

> This provides a linear decrement on the penalty related to delta depth
> instead of being a 1/x function.  With this another 5% reduction is
> observed on packs for both the GIT repo and the Linux kernel repo, as 
> well as fixing a pack size regression in another sample repo I have.

Good job, and it does not seem to spend too many more cycles
either (it does slow it down a bit because it needs to do more
deltas, but that is to be expected).

Here is the average chain length and resulting pack size from
full repacking of git.git repository, with three versions.

        Avg 6.20   6516kB (master)
        Avg 5.97   5784kB (next, has 1/x version)
        Avg 6.89   5536kB (this patch on top of next)

What's interesting is that the 1/x version shortens the chain
(i.e. decreased runtime cost) while producing smaller results,
compared to the master version.  The story is the same on the
kernel archive.

	Avg 5.82 113808kB (master)
	Avg 4.76 108044kB (next, has 1/x version)
	Avg 5.81 105768kB (this patch on top of next)


* Remove old "git-grep.sh" remnants
From: Linus Torvalds @ 2006-05-16 23:46 UTC (permalink / raw)
  To: Junio C Hamano, Git Mailing List


It's built-in now.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
----

diff --git a/Makefile b/Makefile
index 93779b0..9ba608c 100644
--- a/Makefile
+++ b/Makefile
@@ -124,7 +124,7 @@ SCRIPT_SH = \
 	git-tag.sh git-verify-tag.sh \
 	git-applymbox.sh git-applypatch.sh git-am.sh \
 	git-merge.sh git-merge-stupid.sh git-merge-octopus.sh \
-	git-merge-resolve.sh git-merge-ours.sh git-grep.sh \
+	git-merge-resolve.sh git-merge-ours.sh \
 	git-lost-found.sh
 
 SCRIPT_PERL = \
@@ -169,7 +169,8 @@ PROGRAMS = \
 	git-describe$X git-merge-tree$X git-blame$X git-imap-send$X
 
 BUILT_INS = git-log$X git-whatchanged$X git-show$X \
-	git-count-objects$X git-diff$X git-push$X
+	git-count-objects$X git-diff$X git-push$X \
+	git-grep$X
 
 # what 'all' will build and 'install' will install, in gitexecdir
 ALL_PROGRAMS = $(PROGRAMS) $(SIMPLE_PROGRAMS) $(SCRIPTS)
diff --git a/git-grep.sh b/git-grep.sh
deleted file mode 100755
index ad4f2fe..0000000
--- a/git-grep.sh
+++ /dev/null
@@ -1,62 +0,0 @@
-#!/bin/sh
-#
-# Copyright (c) Linus Torvalds, 2005
-#
-
-USAGE='[<option>...] [-e] <pattern> [<path>...]'
-SUBDIRECTORY_OK='Yes'
-. git-sh-setup
-
-got_pattern () {
-	if [ -z "$no_more_patterns" ]
-	then
-		pattern="$1" no_more_patterns=yes
-	else
-		die "git-grep: do not specify more than one pattern"
-	fi
-}
-
-no_more_patterns=
-pattern=
-flags=()
-git_flags=()
-while : ; do
-	case "$1" in
-	-o|--cached|--deleted|--others|--killed|\
-	--ignored|--modified|--exclude=*|\
-	--exclude-from=*|\--exclude-per-directory=*)
-		git_flags=("${git_flags[@]}" "$1")
-		;;
-	-e)
-		got_pattern "$2"
-		shift
-		;;
-	-A|-B|-C|-D|-d|-f|-m)
-		flags=("${flags[@]}" "$1" "$2")
-		shift
-		;;
-	--)
-		# The rest are git-ls-files paths
-		shift
-		break
-		;;
-	-*)
-		flags=("${flags[@]}" "$1")
-		;;
-	*)
-		if [ -z "$no_more_patterns" ]
-		then
-			got_pattern "$1"
-			shift
-		fi
-		[ "$1" = -- ] && shift
-		break
-		;;
-	esac
-	shift
-done
-[ "$pattern" ] || {
-	usage
-}
-git-ls-files -z "${git_flags[@]}" -- "$@" |
-	xargs -0 grep "${flags[@]}" -e "$pattern" --


* Git 1.3.2 on Solaris
From: Stefan Pfetzing @ 2006-05-16 23:52 UTC (permalink / raw)
  To: git

Hi,

I've been trying to get git to work on the latest Solaris Express
release (with the help of NetBSD's pkgsrc).

It mostly fails miserably because common "shell commands" are
used with GNU options (xargs, diff, tr and probably some more).  On
my box (and that's AFAIK the default when you install GNU coreutils on
Solaris) the commands have a g prefix.

So there are 2 possible solutions to get git working on Solaris.

1.  fix every single shell script automatically during the build phase
2.  set up a dir which contains symlinks to the "right" binaries and
put that dir into PATH.

Whichever solution is chosen, I'm volunteering to create a patch
for it. :)

(although I personally prefer the second, because it's easier...)

bye

Stefan
-- 
        http://www.dreamind.de/
Oroborus and Debian GNU/Linux Developer.


* Re: Git 1.3.2 on Solaris
From: Jason Riedy @ 2006-05-17  1:25 UTC (permalink / raw)
  To: Stefan Pfetzing; +Cc: git
In-Reply-To: <f3d7535d0605161652n3b2ec033r874336082755e728@mail.gmail.com>

And "Stefan Pfetzing" writes:
 - I've been trying to get git to work on the latest Solaris Express
 - release (with the help of NetBSD's pkgsrc).

I've been using it on Solaris 8 and 9 with the GNU tools
in pkgsrc for quite a while, as well as on AIX with the
GNU tools available as modules (but I haven't compiled a
new AIX version for a month or two).

 - It mostly fails miserably because common "shell commands" are
 - used with GNU options (xargs, diff, tr and probably some more).  On
 - my box (and that's AFAIK the default when you install GNU coreutils on
 - Solaris) the commands have a g prefix.

In your pkgsrc mk.conf, use:
GNU_PROGRAM_PREFIX=
GTAR_PROGRAM_PREFIX=

I tried your first suggestion (patch all the commands) back
in February.  It's pretty fragile against future changes, and
I wouldn't recommend it.

 - 2.  set up a dir which contains symlinks to the "right" binaries and
 - put that dir into PATH.

Setting a GIT_COMPAT_PATH in the Makefile and prepending
it to the path in git.c and git-sh-setup.sh might be more
sane.  A fragment like the following in git.c before adding
GIT_EXEC_PATH:
#ifdef GIT_COMPAT_PATH
	/* Search for sane external utilities */
	prepend_to_path(GIT_COMPAT_PATH, strlen(GIT_COMPAT_PATH));
#endif

And maybe in git-sh-setup.sh to help those of us who
use git-foo rather than git foo:
if [ ! -z "@GIT_COMPAT_PATH@" ] ; then
	PATH="@GIT_COMPAT_PATH@:${PATH}"
	export PATH
fi

Plus Makefile fun.
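
To make the idea concrete, here is a rough standalone sketch of what
a prepend_to_path() along those lines could do.  It is not the helper
from git.c, just a guess at the behaviour under the assumption that
PATH is manipulated through getenv()/setenv(); the fallback PATH value
is likewise an assumption.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L	/* for setenv() */
#endif
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in: put the first len bytes of dir in front of
 * the current PATH.  Returns 0 on success, -1 on failure.
 */
int prepend_to_path(const char *dir, size_t len)
{
	const char *old_path = getenv("PATH");
	size_t old_len;
	char *path;
	int ret;

	assert(dir != NULL);
	if (!old_path || !*old_path)
		old_path = "/usr/bin:/bin";	/* assumed fallback */
	old_len = strlen(old_path);

	/* dir + ':' + old PATH + terminating NUL */
	path = malloc(len + 1 + old_len + 1);
	if (!path)
		return -1;
	memcpy(path, dir, len);
	path[len] = ':';
	memcpy(path + len + 1, old_path, old_len + 1);	/* copies the NUL too */

	ret = setenv("PATH", path, 1);	/* setenv() keeps its own copy */
	free(path);
	return ret;
}
```

Called before GIT_EXEC_PATH is added, this would leave the compat
directory behind the git exec path but ahead of the system tools,
which is the ordering the fragment above is after.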

Jason


* [PATCH] libify git-ls-files directory traversal
From: Linus Torvalds @ 2006-05-17  2:02 UTC (permalink / raw)
  To: Junio C Hamano, Git Mailing List


This moves the core directory traversal and filename exclusion logic
into the general git library, making it available for other users
directly.

If we ever want to do "git commit" or "git add" as a built-in (and we
do), we want to be able to handle most of git-ls-files as a library.

NOTE! Not all of git-ls-files is libified by this.  The index matching
and pathspec prefix calculation is still in ls-files.c, but this is a
big part of it.

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
---
 Makefile   |    4 -
 dir.c      |  295 +++++++++++++++++++++++++++++++++++++++++++++++++
 dir.h      |   50 ++++++++
 ls-files.c |  363 +++++-------------------------------------------------------
 4 files changed, 376 insertions(+), 336 deletions(-)
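
Not part of the patch: a standalone sketch of the matching rule that
excluded_1() in the diff below implements, for anyone who wants to
play with it outside git.  It assumes an empty base (so the
base/baselen prefix check is dropped), and the name match_excludes()
is made up for this example; only fnmatch(3) and the string functions
are real.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>
#include <string.h>

/*
 * Scan the patterns from last to first so that the last match wins.
 * Return 1 for exclude, 0 for include (a '!' pattern matched) and
 * -1 when no pattern matched at all.
 */
int match_excludes(const char *pathname, const char **patterns, int nr)
{
	int i;

	assert(pathname && patterns);
	for (i = nr - 1; i >= 0; i--) {
		const char *pat = patterns[i];
		int to_exclude = 1;

		if (*pat == '!') {	/* negated pattern: re-include */
			to_exclude = 0;
			pat++;
		}

		if (!strchr(pat, '/')) {
			/* no slash in the pattern: match the basename only */
			const char *base = strrchr(pathname, '/');
			base = base ? base + 1 : pathname;
			if (fnmatch(pat, base, 0) == 0)
				return to_exclude;
		} else {
			/* slash in the pattern: match the whole path,
			 * and do not let wildcards cross a '/' */
			if (*pat == '/')
				pat++;
			if (fnmatch(pat, pathname, FNM_PATHNAME) == 0)
				return to_exclude;
		}
	}
	return -1;	/* undecided */
}
```

With the patterns "*.o", "!keep.o" and "/doc/*.html" (in that order),
src/foo.o is excluded, src/keep.o is explicitly included, and
doc/sub/index.html stays undecided because "*" does not match across
the slash under FNM_PATHNAME.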

diff --git a/Makefile b/Makefile
index 9ba608c..f43ac63 100644
--- a/Makefile
+++ b/Makefile
@@ -199,7 +199,7 @@ LIB_H = \
 	blob.h cache.h commit.h csum-file.h delta.h \
 	diff.h object.h pack.h pkt-line.h quote.h refs.h \
 	run-command.h strbuf.h tag.h tree.h git-compat-util.h revision.h \
-	tree-walk.h log-tree.h
+	tree-walk.h log-tree.h dir.h
 
 DIFF_OBJS = \
 	diff.o diff-lib.o diffcore-break.o diffcore-order.o \
@@ -210,7 +210,7 @@ LIB_OBJS = \
 	blob.o commit.o connect.o csum-file.o base85.o \
 	date.o diff-delta.o entry.o exec_cmd.o ident.o index.o \
 	object.o pack-check.o patch-delta.o path.o pkt-line.o \
-	quote.o read-cache.o refs.o run-command.o \
+	quote.o read-cache.o refs.o run-command.o dir.o \
 	server-info.o setup.o sha1_file.o sha1_name.o strbuf.o \
 	tag.o tree.o usage.o config.o environment.o ctype.o copy.o \
 	fetch-clone.o revision.o pager.o tree-walk.o xdiff-interface.o \
diff --git a/dir.c b/dir.c
new file mode 100644
index 0000000..3f41a5d
--- /dev/null
+++ b/dir.c
@@ -0,0 +1,295 @@
+/*
+ * This handles recursive filename detection with exclude
+ * files, index knowledge etc..
+ *
+ * Copyright (C) Linus Torvalds, 2005-2006
+ *		 Junio Hamano, 2005-2006
+ */
+#include <dirent.h>
+#include <fnmatch.h>
+
+#include "cache.h"
+#include "dir.h"
+
+void add_exclude(const char *string, const char *base,
+		 int baselen, struct exclude_list *which)
+{
+	struct exclude *x = xmalloc(sizeof (*x));
+
+	x->pattern = string;
+	x->base = base;
+	x->baselen = baselen;
+	if (which->nr == which->alloc) {
+		which->alloc = alloc_nr(which->alloc);
+		which->excludes = realloc(which->excludes,
+					  which->alloc * sizeof(x));
+	}
+	which->excludes[which->nr++] = x;
+}
+
+static int add_excludes_from_file_1(const char *fname,
+				    const char *base,
+				    int baselen,
+				    struct exclude_list *which)
+{
+	int fd, i;
+	long size;
+	char *buf, *entry;
+
+	fd = open(fname, O_RDONLY);
+	if (fd < 0)
+		goto err;
+	size = lseek(fd, 0, SEEK_END);
+	if (size < 0)
+		goto err;
+	lseek(fd, 0, SEEK_SET);
+	if (size == 0) {
+		close(fd);
+		return 0;
+	}
+	buf = xmalloc(size+1);
+	if (read(fd, buf, size) != size)
+		goto err;
+	close(fd);
+
+	buf[size++] = '\n';
+	entry = buf;
+	for (i = 0; i < size; i++) {
+		if (buf[i] == '\n') {
+			if (entry != buf + i && entry[0] != '#') {
+				buf[i - (i && buf[i-1] == '\r')] = 0;
+				add_exclude(entry, base, baselen, which);
+			}
+			entry = buf + i + 1;
+		}
+	}
+	return 0;
+
+ err:
+	if (0 <= fd)
+		close(fd);
+	return -1;
+}
+
+void add_excludes_from_file(struct dir_struct *dir, const char *fname)
+{
+	if (add_excludes_from_file_1(fname, "", 0,
+				     &dir->exclude_list[EXC_FILE]) < 0)
+		die("cannot use %s as an exclude file", fname);
+}
+
+int push_exclude_per_directory(struct dir_struct *dir, const char *base, int baselen)
+{
+	char exclude_file[PATH_MAX];
+	struct exclude_list *el = &dir->exclude_list[EXC_DIRS];
+	int current_nr = el->nr;
+
+	if (dir->exclude_per_dir) {
+		memcpy(exclude_file, base, baselen);
+		strcpy(exclude_file + baselen, dir->exclude_per_dir);
+		add_excludes_from_file_1(exclude_file, base, baselen, el);
+	}
+	return current_nr;
+}
+
+static void pop_exclude_per_directory(struct dir_struct *dir, int stk)
+{
+	struct exclude_list *el = &dir->exclude_list[EXC_DIRS];
+
+	while (stk < el->nr)
+		free(el->excludes[--el->nr]);
+}
+
+/* Scan the list and let the last match determines the fate.
+ * Return 1 for exclude, 0 for include and -1 for undecided.
+ */
+static int excluded_1(const char *pathname,
+		      int pathlen,
+		      struct exclude_list *el)
+{
+	int i;
+
+	if (el->nr) {
+		for (i = el->nr - 1; 0 <= i; i--) {
+			struct exclude *x = el->excludes[i];
+			const char *exclude = x->pattern;
+			int to_exclude = 1;
+
+			if (*exclude == '!') {
+				to_exclude = 0;
+				exclude++;
+			}
+
+			if (!strchr(exclude, '/')) {
+				/* match basename */
+				const char *basename = strrchr(pathname, '/');
+				basename = (basename) ? basename+1 : pathname;
+				if (fnmatch(exclude, basename, 0) == 0)
+					return to_exclude;
+			}
+			else {
+				/* match with FNM_PATHNAME:
+				 * exclude has base (baselen long) implicitly
+				 * in front of it.
+				 */
+				int baselen = x->baselen;
+				if (*exclude == '/')
+					exclude++;
+
+				if (pathlen < baselen ||
+				    (baselen && pathname[baselen-1] != '/') ||
+				    strncmp(pathname, x->base, baselen))
+				    continue;
+
+				if (fnmatch(exclude, pathname+baselen,
+					    FNM_PATHNAME) == 0)
+					return to_exclude;
+			}
+		}
+	}
+	return -1; /* undecided */
+}
+
+int excluded(struct dir_struct *dir, const char *pathname)
+{
+	int pathlen = strlen(pathname);
+	int st;
+
+	for (st = EXC_CMDL; st <= EXC_FILE; st++) {
+		switch (excluded_1(pathname, pathlen, &dir->exclude_list[st])) {
+		case 0:
+			return 0;
+		case 1:
+			return 1;
+		}
+	}
+	return 0;
+}
+
+static void add_name(struct dir_struct *dir, const char *pathname, int len)
+{
+	struct dir_entry *ent;
+
+	if (cache_name_pos(pathname, len) >= 0)
+		return;
+
+	if (dir->nr == dir->alloc) {
+		int alloc = alloc_nr(dir->alloc);
+		dir->alloc = alloc;
+		dir->entries = xrealloc(dir->entries, alloc*sizeof(ent));
+	}
+	ent = xmalloc(sizeof(*ent) + len + 1);
+	ent->len = len;
+	memcpy(ent->name, pathname, len);
+	ent->name[len] = 0;
+	dir->entries[dir->nr++] = ent;
+}
+
+static int dir_exists(const char *dirname, int len)
+{
+	int pos = cache_name_pos(dirname, len);
+	if (pos >= 0)
+		return 1;
+	pos = -pos-1;
+	if (pos >= active_nr) /* can't */
+		return 0;
+	return !strncmp(active_cache[pos]->name, dirname, len);
+}
+
+/*
+ * Read a directory tree. We currently ignore anything but
+ * directories, regular files and symlinks. That's because git
+ * doesn't handle them at all yet. Maybe that will change some
+ * day.
+ *
+ * Also, we ignore the name ".git" (even if it is not a directory).
+ * That likely will not change.
+ */
+static int read_directory_recursive(struct dir_struct *dir, const char *path, const char *base, int baselen)
+{
+	DIR *fdir = opendir(path);
+	int contents = 0;
+
+	if (fdir) {
+		int exclude_stk;
+		struct dirent *de;
+		char fullname[MAXPATHLEN + 1];
+		memcpy(fullname, base, baselen);
+
+		exclude_stk = push_exclude_per_directory(dir, base, baselen);
+
+		while ((de = readdir(fdir)) != NULL) {
+			int len;
+
+			if ((de->d_name[0] == '.') &&
+			    (de->d_name[1] == 0 ||
+			     !strcmp(de->d_name + 1, ".") ||
+			     !strcmp(de->d_name + 1, "git")))
+				continue;
+			len = strlen(de->d_name);
+			memcpy(fullname + baselen, de->d_name, len+1);
+			if (excluded(dir, fullname) != dir->show_ignored) {
+				if (!dir->show_ignored || DTYPE(de) != DT_DIR) {
+					continue;
+				}
+			}
+
+			switch (DTYPE(de)) {
+			struct stat st;
+			int subdir, rewind_base;
+			default:
+				continue;
+			case DT_UNKNOWN:
+				if (lstat(fullname, &st))
+					continue;
+				if (S_ISREG(st.st_mode) || S_ISLNK(st.st_mode))
+					break;
+				if (!S_ISDIR(st.st_mode))
+					continue;
+				/* fallthrough */
+			case DT_DIR:
+				memcpy(fullname + baselen + len, "/", 2);
+				len++;
+				rewind_base = dir->nr;
+				subdir = read_directory_recursive(dir, fullname, fullname,
+				                        baselen + len);
+				if (dir->show_other_directories &&
+				    (subdir || !dir->hide_empty_directories) &&
+				    !dir_exists(fullname, baselen + len)) {
+					// Rewind the read subdirectory
+					while (dir->nr > rewind_base)
+						free(dir->entries[--dir->nr]);
+					break;
+				}
+				contents += subdir;
+				continue;
+			case DT_REG:
+			case DT_LNK:
+				break;
+			}
+			add_name(dir, fullname, baselen + len);
+			contents++;
+		}
+		closedir(fdir);
+
+		pop_exclude_per_directory(dir, exclude_stk);
+	}
+
+	return contents;
+}
+
+static int cmp_name(const void *p1, const void *p2)
+{
+	const struct dir_entry *e1 = *(const struct dir_entry **)p1;
+	const struct dir_entry *e2 = *(const struct dir_entry **)p2;
+
+	return cache_name_compare(e1->name, e1->len,
+				  e2->name, e2->len);
+}
+
+int read_directory(struct dir_struct *dir, const char *path, const char *base, int baselen)
+{
+	read_directory_recursive(dir, path, base, baselen);
+	qsort(dir->entries, dir->nr, sizeof(struct dir_entry *), cmp_name);
+	return dir->nr;
+}
diff --git a/dir.h b/dir.h
new file mode 100644
index 0000000..e8fc441
--- /dev/null
+++ b/dir.h
@@ -0,0 +1,50 @@
+#ifndef DIR_H
+#define DIR_H
+
+/*
+ * We maintain three exclude pattern lists:
+ * EXC_CMDL lists patterns explicitly given on the command line.
+ * EXC_DIRS lists patterns obtained from per-directory ignore files.
+ * EXC_FILE lists patterns from fallback ignore files.
+ */
+#define EXC_CMDL 0
+#define EXC_DIRS 1
+#define EXC_FILE 2
+
+
+struct dir_entry {
+	int len;
+	char name[FLEX_ARRAY]; /* more */
+};
+
+struct exclude_list {
+	int nr;
+	int alloc;
+	struct exclude {
+		const char *pattern;
+		const char *base;
+		int baselen;
+	} **excludes;
+};
+
+struct dir_struct {
+	int nr, alloc;
+	unsigned int show_ignored:1,
+		     show_other_directories:1,
+		     hide_empty_directories:1;
+	struct dir_entry **entries;
+
+	/* Exclude info */
+	const char *exclude_per_dir;
+	struct exclude_list exclude_list[3];
+};
+
+extern int read_directory(struct dir_struct *, const char *path, const char *base, int baselen);
+extern int excluded(struct dir_struct *, const char *);
+extern void add_excludes_from_file(struct dir_struct *, const char *fname);
+extern void add_exclude(const char *string, const char *base,
+			int baselen, struct exclude_list *which);
+extern int push_exclude_per_directory(struct dir_struct *,
+				      const char *base, int baselen);
+
+#endif
diff --git a/ls-files.c b/ls-files.c
index 4a4af1c..89941a3 100644
--- a/ls-files.c
+++ b/ls-files.c
@@ -5,23 +5,20 @@
  *
  * Copyright (C) Linus Torvalds, 2005
  */
-#include <dirent.h>
 #include <fnmatch.h>
 
 #include "cache.h"
 #include "quote.h"
+#include "dir.h"
 
 static int abbrev = 0;
 static int show_deleted = 0;
 static int show_cached = 0;
 static int show_others = 0;
-static int show_ignored = 0;
 static int show_stage = 0;
 static int show_unmerged = 0;
 static int show_modified = 0;
 static int show_killed = 0;
-static int show_other_directories = 0;
-static int hide_empty_directories = 0;
 static int show_valid_bit = 0;
 static int line_terminator = '\n';
 
@@ -38,309 +35,6 @@ static const char *tag_other = "";
 static const char *tag_killed = "";
 static const char *tag_modified = "";
 
-static const char *exclude_per_dir = NULL;
-
-/* We maintain three exclude pattern lists:
- * EXC_CMDL lists patterns explicitly given on the command line.
- * EXC_DIRS lists patterns obtained from per-directory ignore files.
- * EXC_FILE lists patterns from fallback ignore files.
- */
-#define EXC_CMDL 0
-#define EXC_DIRS 1
-#define EXC_FILE 2
-static struct exclude_list {
-	int nr;
-	int alloc;
-	struct exclude {
-		const char *pattern;
-		const char *base;
-		int baselen;
-	} **excludes;
-} exclude_list[3];
-
-static void add_exclude(const char *string, const char *base,
-			int baselen, struct exclude_list *which)
-{
-	struct exclude *x = xmalloc(sizeof (*x));
-
-	x->pattern = string;
-	x->base = base;
-	x->baselen = baselen;
-	if (which->nr == which->alloc) {
-		which->alloc = alloc_nr(which->alloc);
-		which->excludes = realloc(which->excludes,
-					  which->alloc * sizeof(x));
-	}
-	which->excludes[which->nr++] = x;
-}
-
-static int add_excludes_from_file_1(const char *fname,
-				    const char *base,
-				    int baselen,
-				    struct exclude_list *which)
-{
-	int fd, i;
-	long size;
-	char *buf, *entry;
-
-	fd = open(fname, O_RDONLY);
-	if (fd < 0)
-		goto err;
-	size = lseek(fd, 0, SEEK_END);
-	if (size < 0)
-		goto err;
-	lseek(fd, 0, SEEK_SET);
-	if (size == 0) {
-		close(fd);
-		return 0;
-	}
-	buf = xmalloc(size+1);
-	if (read(fd, buf, size) != size)
-		goto err;
-	close(fd);
-
-	buf[size++] = '\n';
-	entry = buf;
-	for (i = 0; i < size; i++) {
-		if (buf[i] == '\n') {
-			if (entry != buf + i && entry[0] != '#') {
-				buf[i - (i && buf[i-1] == '\r')] = 0;
-				add_exclude(entry, base, baselen, which);
-			}
-			entry = buf + i + 1;
-		}
-	}
-	return 0;
-
- err:
-	if (0 <= fd)
-		close(fd);
-	return -1;
-}
-
-static void add_excludes_from_file(const char *fname)
-{
-	if (add_excludes_from_file_1(fname, "", 0,
-				     &exclude_list[EXC_FILE]) < 0)
-		die("cannot use %s as an exclude file", fname);
-}
-
-static int push_exclude_per_directory(const char *base, int baselen)
-{
-	char exclude_file[PATH_MAX];
-	struct exclude_list *el = &exclude_list[EXC_DIRS];
-	int current_nr = el->nr;
-
-	if (exclude_per_dir) {
-		memcpy(exclude_file, base, baselen);
-		strcpy(exclude_file + baselen, exclude_per_dir);
-		add_excludes_from_file_1(exclude_file, base, baselen, el);
-	}
-	return current_nr;
-}
-
-static void pop_exclude_per_directory(int stk)
-{
-	struct exclude_list *el = &exclude_list[EXC_DIRS];
-
-	while (stk < el->nr)
-		free(el->excludes[--el->nr]);
-}
-
-/* Scan the list and let the last match determines the fate.
- * Return 1 for exclude, 0 for include and -1 for undecided.
- */
-static int excluded_1(const char *pathname,
-		      int pathlen,
-		      struct exclude_list *el)
-{
-	int i;
-
-	if (el->nr) {
-		for (i = el->nr - 1; 0 <= i; i--) {
-			struct exclude *x = el->excludes[i];
-			const char *exclude = x->pattern;
-			int to_exclude = 1;
-
-			if (*exclude == '!') {
-				to_exclude = 0;
-				exclude++;
-			}
-
-			if (!strchr(exclude, '/')) {
-				/* match basename */
-				const char *basename = strrchr(pathname, '/');
-				basename = (basename) ? basename+1 : pathname;
-				if (fnmatch(exclude, basename, 0) == 0)
-					return to_exclude;
-			}
-			else {
-				/* match with FNM_PATHNAME:
-				 * exclude has base (baselen long) implicitly
-				 * in front of it.
-				 */
-				int baselen = x->baselen;
-				if (*exclude == '/')
-					exclude++;
-
-				if (pathlen < baselen ||
-				    (baselen && pathname[baselen-1] != '/') ||
-				    strncmp(pathname, x->base, baselen))
-				    continue;
-
-				if (fnmatch(exclude, pathname+baselen,
-					    FNM_PATHNAME) == 0)
-					return to_exclude;
-			}
-		}
-	}
-	return -1; /* undecided */
-}
-
-static int excluded(const char *pathname)
-{
-	int pathlen = strlen(pathname);
-	int st;
-
-	for (st = EXC_CMDL; st <= EXC_FILE; st++) {
-		switch (excluded_1(pathname, pathlen, &exclude_list[st])) {
-		case 0:
-			return 0;
-		case 1:
-			return 1;
-		}
-	}
-	return 0;
-}
-
-struct nond_on_fs {
-	int len;
-	char name[FLEX_ARRAY]; /* more */
-};
-
-static struct nond_on_fs **dir;
-static int nr_dir;
-static int dir_alloc;
-
-static void add_name(const char *pathname, int len)
-{
-	struct nond_on_fs *ent;
-
-	if (cache_name_pos(pathname, len) >= 0)
-		return;
-
-	if (nr_dir == dir_alloc) {
-		dir_alloc = alloc_nr(dir_alloc);
-		dir = xrealloc(dir, dir_alloc*sizeof(ent));
-	}
-	ent = xmalloc(sizeof(*ent) + len + 1);
-	ent->len = len;
-	memcpy(ent->name, pathname, len);
-	ent->name[len] = 0;
-	dir[nr_dir++] = ent;
-}
-
-static int dir_exists(const char *dirname, int len)
-{
-	int pos = cache_name_pos(dirname, len);
-	if (pos >= 0)
-		return 1;
-	pos = -pos-1;
-	if (pos >= active_nr) /* can't */
-		return 0;
-	return !strncmp(active_cache[pos]->name, dirname, len);
-}
-
-/*
- * Read a directory tree. We currently ignore anything but
- * directories, regular files and symlinks. That's because git
- * doesn't handle them at all yet. Maybe that will change some
- * day.
- *
- * Also, we ignore the name ".git" (even if it is not a directory).
- * That likely will not change.
- */
-static int read_directory(const char *path, const char *base, int baselen)
-{
-	DIR *fdir = opendir(path);
-	int contents = 0;
-
-	if (fdir) {
-		int exclude_stk;
-		struct dirent *de;
-		char fullname[MAXPATHLEN + 1];
-		memcpy(fullname, base, baselen);
-
-		exclude_stk = push_exclude_per_directory(base, baselen);
-
-		while ((de = readdir(fdir)) != NULL) {
-			int len;
-
-			if ((de->d_name[0] == '.') &&
-			    (de->d_name[1] == 0 ||
-			     !strcmp(de->d_name + 1, ".") ||
-			     !strcmp(de->d_name + 1, "git")))
-				continue;
-			len = strlen(de->d_name);
-			memcpy(fullname + baselen, de->d_name, len+1);
-			if (excluded(fullname) != show_ignored) {
-				if (!show_ignored || DTYPE(de) != DT_DIR) {
-					continue;
-				}
-			}
-
-			switch (DTYPE(de)) {
-			struct stat st;
-			int subdir, rewind_base;
-			default:
-				continue;
-			case DT_UNKNOWN:
-				if (lstat(fullname, &st))
-					continue;
-				if (S_ISREG(st.st_mode) || S_ISLNK(st.st_mode))
-					break;
-				if (!S_ISDIR(st.st_mode))
-					continue;
-				/* fallthrough */
-			case DT_DIR:
-				memcpy(fullname + baselen + len, "/", 2);
-				len++;
-				rewind_base = nr_dir;
-				subdir = read_directory(fullname, fullname,
-				                        baselen + len);
-				if (show_other_directories &&
-				    (subdir || !hide_empty_directories) &&
-				    !dir_exists(fullname, baselen + len)) {
-					// Rewind the read subdirectory
-					while (nr_dir > rewind_base)
-						free(dir[--nr_dir]);
-					break;
-				}
-				contents += subdir;
-				continue;
-			case DT_REG:
-			case DT_LNK:
-				break;
-			}
-			add_name(fullname, baselen + len);
-			contents++;
-		}
-		closedir(fdir);
-
-		pop_exclude_per_directory(exclude_stk);
-	}
-
-	return contents;
-}
-
-static int cmp_name(const void *p1, const void *p2)
-{
-	const struct nond_on_fs *e1 = *(const struct nond_on_fs **)p1;
-	const struct nond_on_fs *e2 = *(const struct nond_on_fs **)p2;
-
-	return cache_name_compare(e1->name, e1->len,
-				  e2->name, e2->len);
-}
 
 /*
  * Match a pathspec against a filename. The first "len" characters
@@ -377,7 +71,7 @@ static int match(const char **spec, char
 	return 0;
 }
 
-static void show_dir_entry(const char *tag, struct nond_on_fs *ent)
+static void show_dir_entry(const char *tag, struct dir_entry *ent)
 {
 	int len = prefix_len;
 	int offset = prefix_offset;
@@ -393,14 +87,14 @@ static void show_dir_entry(const char *t
 	putchar(line_terminator);
 }
 
-static void show_other_files(void)
+static void show_other_files(struct dir_struct *dir)
 {
 	int i;
-	for (i = 0; i < nr_dir; i++) {
+	for (i = 0; i < dir->nr; i++) {
 		/* We should not have a matching entry, but we
 		 * may have an unmerged entry for this path.
 		 */
-		struct nond_on_fs *ent = dir[i];
+		struct dir_entry *ent = dir->entries[i];
 		int pos = cache_name_pos(ent->name, ent->len);
 		struct cache_entry *ce;
 		if (0 <= pos)
@@ -416,11 +110,11 @@ static void show_other_files(void)
 	}
 }
 
-static void show_killed_files(void)
+static void show_killed_files(struct dir_struct *dir)
 {
 	int i;
-	for (i = 0; i < nr_dir; i++) {
-		struct nond_on_fs *ent = dir[i];
+	for (i = 0; i < dir->nr; i++) {
+		struct dir_entry *ent = dir->entries[i];
 		char *cp, *sp;
 		int pos, len, killed = 0;
 
@@ -461,7 +155,7 @@ static void show_killed_files(void)
 			}
 		}
 		if (killed)
-			show_dir_entry(tag_killed, dir[i]);
+			show_dir_entry(tag_killed, dir->entries[i]);
 	}
 }
 
@@ -512,7 +206,7 @@ static void show_ce_entry(const char *ta
 	}
 }
 
-static void show_files(void)
+static void show_files(struct dir_struct *dir)
 {
 	int i;
 
@@ -523,14 +217,14 @@ static void show_files(void)
 
 		if (baselen) {
 			path = base = prefix;
-			if (exclude_per_dir) {
+			if (dir->exclude_per_dir) {
 				char *p, *pp = xmalloc(baselen+1);
 				memcpy(pp, prefix, baselen+1);
 				p = pp;
 				while (1) {
 					char save = *p;
 					*p = 0;
-					push_exclude_per_directory(pp, p-pp);
+					push_exclude_per_directory(dir, pp, p-pp);
 					*p++ = save;
 					if (!save)
 						break;
@@ -543,17 +237,16 @@ static void show_files(void)
 				free(pp);
 			}
 		}
-		read_directory(path, base, baselen);
-		qsort(dir, nr_dir, sizeof(struct nond_on_fs *), cmp_name);
+		read_directory(dir, path, base, baselen);
 		if (show_others)
-			show_other_files();
+			show_other_files(dir);
 		if (show_killed)
-			show_killed_files();
+			show_killed_files(dir);
 	}
 	if (show_cached | show_stage) {
 		for (i = 0; i < active_nr; i++) {
 			struct cache_entry *ce = active_cache[i];
-			if (excluded(ce->name) != show_ignored)
+			if (excluded(dir, ce->name) != dir->show_ignored)
 				continue;
 			if (show_unmerged && !ce_stage(ce))
 				continue;
@@ -565,7 +258,7 @@ static void show_files(void)
 			struct cache_entry *ce = active_cache[i];
 			struct stat st;
 			int err;
-			if (excluded(ce->name) != show_ignored)
+			if (excluded(dir, ce->name) != dir->show_ignored)
 				continue;
 			err = lstat(ce->name, &st);
 			if (show_deleted && err)
@@ -652,7 +345,9 @@ int main(int argc, const char **argv)
 {
 	int i;
 	int exc_given = 0;
+	struct dir_struct dir;
 
+	memset(&dir, 0, sizeof(dir));
 	prefix = setup_git_directory();
 	if (prefix)
 		prefix_offset = strlen(prefix);
@@ -697,7 +392,7 @@ int main(int argc, const char **argv)
 			continue;
 		}
 		if (!strcmp(arg, "-i") || !strcmp(arg, "--ignored")) {
-			show_ignored = 1;
+			dir.show_ignored = 1;
 			continue;
 		}
 		if (!strcmp(arg, "-s") || !strcmp(arg, "--stage")) {
@@ -709,11 +404,11 @@ int main(int argc, const char **argv)
 			continue;
 		}
 		if (!strcmp(arg, "--directory")) {
-			show_other_directories = 1;
+			dir.show_other_directories = 1;
 			continue;
 		}
 		if (!strcmp(arg, "--no-empty-directory")) {
-			hide_empty_directories = 1;
+			dir.hide_empty_directories = 1;
 			continue;
 		}
 		if (!strcmp(arg, "-u") || !strcmp(arg, "--unmerged")) {
@@ -726,27 +421,27 @@ int main(int argc, const char **argv)
 		}
 		if (!strcmp(arg, "-x") && i+1 < argc) {
 			exc_given = 1;
-			add_exclude(argv[++i], "", 0, &exclude_list[EXC_CMDL]);
+			add_exclude(argv[++i], "", 0, &dir.exclude_list[EXC_CMDL]);
 			continue;
 		}
 		if (!strncmp(arg, "--exclude=", 10)) {
 			exc_given = 1;
-			add_exclude(arg+10, "", 0, &exclude_list[EXC_CMDL]);
+			add_exclude(arg+10, "", 0, &dir.exclude_list[EXC_CMDL]);
 			continue;
 		}
 		if (!strcmp(arg, "-X") && i+1 < argc) {
 			exc_given = 1;
-			add_excludes_from_file(argv[++i]);
+			add_excludes_from_file(&dir, argv[++i]);
 			continue;
 		}
 		if (!strncmp(arg, "--exclude-from=", 15)) {
 			exc_given = 1;
-			add_excludes_from_file(arg+15);
+			add_excludes_from_file(&dir, arg+15);
 			continue;
 		}
 		if (!strncmp(arg, "--exclude-per-directory=", 24)) {
 			exc_given = 1;
-			exclude_per_dir = arg + 24;
+			dir.exclude_per_dir = arg + 24;
 			continue;
 		}
 		if (!strcmp(arg, "--full-name")) {
@@ -788,7 +483,7 @@ int main(int argc, const char **argv)
 		ps_matched = xcalloc(1, num);
 	}
 
-	if (show_ignored && !exc_given) {
+	if (dir.show_ignored && !exc_given) {
 		fprintf(stderr, "%s: --ignored needs some exclude pattern\n",
 			argv[0]);
 		exit(1);
@@ -802,7 +497,7 @@ int main(int argc, const char **argv)
 	read_cache();
 	if (prefix)
 		prune_cache();
-	show_files();
+	show_files(&dir);
 
 	if (ps_matched) {
 		/* We need to make sure all pathspec matched otherwise


* Re: Git 1.3.2 on Solaris
From: Linus Torvalds @ 2006-05-17  2:20 UTC (permalink / raw)
  To: Stefan Pfetzing; +Cc: Git Mailing List
In-Reply-To: <f3d7535d0605161652n3b2ec033r874336082755e728@mail.gmail.com>


[ Junio - see the "grep" issue ]

On Wed, 17 May 2006, Stefan Pfetzing wrote:
> 
> So there are 2 possible solutions to get git working on Solaris.
> 
> 1.  fix every single shellscript automatically during the build phase
> 2.  setup a dir which contains symlinks to the "right" binaries and
> put that dir into PATH.

If the biggest issue is git depending on some GNU extensions, I'd really 
suggest
 (a) install all the normal GNU binaries, and put them in the path before 
     git just to get it working (and don't try to change git at all)
 (b) help send in patches that just remove the dependency entirely.

I've been - on and off - trying to libify most of the core git sources, so 
that the shell scripts can be re-written to be just plain C. Most of the 
time it's not actually even a huge amount of work, it's just somewhat 
boring.

Writing them as C usually gets rid of any dependencies on any GNU tools, 
and hopefully even cygwin. For example, we got rid of one "xargs -0" in 
the development branch pretty recently, thanks to making "git grep" a 
built-in.

Of course, I don't think anybody tried the new "git grep" on Solaris, and 
I think the solaris "grep" lacks the "-H" flag, for example. But that 
should be easy to fix (for example, replace the use of "--" and "-H" with 
putting a "/dev/null" as the first filename).
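A minimal sketch of the "/dev/null" trick described above, assuming an external grep is still being spawned. The helper name build_grep_argv is hypothetical and not taken from the git sources. Because grep prints the file name whenever it is given more than one file operand, listing /dev/null first gives "-H"-style output even for a single real file, and /dev/null itself can never match. The sketch passes the pattern with "-e", which is a POSIX grep option.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not from git): build
 *     grep -e <pattern> /dev/null <paths...>
 * The caller owns the returned array (free() it); the strings inside
 * are borrowed from the arguments.  Returns NULL on allocation failure.
 */
const char **build_grep_argv(const char *pattern, const char **paths, int nr)
{
	const char **argv;
	int i, n = 0;

	assert(pattern && nr >= 0);
	/* "grep", "-e", pattern, "/dev/null", nr paths, terminating NULL */
	argv = calloc((size_t)nr + 5, sizeof(*argv));
	if (!argv)
		return NULL;
	argv[n++] = "grep";
	argv[n++] = "-e";
	argv[n++] = pattern;
	argv[n++] = "/dev/null";	/* forces file names in the output */
	for (i = 0; i < nr; i++)
		argv[n++] = paths[i];
	argv[n] = NULL;
	return argv;
}
```

The result would need a cast to (char *const *) before being handed to an exec-family call.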

I don't think it's worth it trying to add some compatibility layer for the 
shell-scripts. We really do want to get rid of them, and the more people 
that help, the merrier.

In many ways, the libification effort isn't even needed. It's perfectly ok 
to turn a stupid shell-script (and they really all _are_ pretty stupid) 
into a builtin-cmd.c C file that just does something really easy like a 
"fork + execve()" translation of the original shell script.

The complete libification will take some time, and in the meantime, a few 
silly C files that hard-code the shell logic are probably much preferable 
to using the shell and all the problems that involves (like the whole 
problem with quoting arguments - just _gone_ when you do it as an execve() 
in a simple C program).
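A minimal sketch of such a "fork + exec" translation, under stated assumptions: the helper name run_argv is hypothetical and not an existing git function, and it uses execvp(), the PATH-searching wrapper around execve(), so that commands can be named the way a shell script would name them. Each argv[] element reaches the child process verbatim, which is why the quoting problem disappears.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from git): run argv[0] with the given
 * arguments and wait for it.  Returns the child's exit status,
 * 127 if the exec failed, or -1 on fork/wait failure or if the
 * child was killed by a signal.
 */
int run_argv(char *const argv[])
{
	pid_t pid;
	int status;

	assert(argv && argv[0]);
	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		/* Child: no shell is involved, so no argument is re-parsed. */
		execvp(argv[0], argv);
		_exit(127);	/* only reached when the exec failed */
	}
	/* Parent: retry the wait if a signal interrupts it. */
	while (waitpid(pid, &status, 0) < 0)
		if (errno != EINTR)
			return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A shell line that runs one git command after another would then become a sequence of run_argv() calls, each checked for a non-zero return.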

So anybody can help with this. If you know shell (and the git 
shell-scripts aren't even _advanced_ shell), and know some basic C, you're 
all set to do a trivial conversion from one to the other. And when the 
libification gets further, your conversion will probably help that (ie 
maybe libification isn't complete, but a _part_ of the thing can be 
written to use the library interfaces instead of spawning an external 
program).

There aren't _that_ many shell programs, and a lot of them are really 
really trivial (ie they parse the arguments, and then do just a couple of 
external git commands).

			Linus

^ permalink raw reply

