git.vger.kernel.org archive mirror
From: David Hurka <david.hurka@mailbox.org>
To: git@vger.kernel.org
Subject: Bug: Documentation and behavior mismatch in and between git-archive and gitattributes/git-check-attr.
Date: Thu, 23 Sep 2021 16:52:22 +0200
Message-ID: <22364178.6Emhk5qWAg@doro>

Hello all,

In https://github.com/Kentzo/git-archive-all/issues/87 I reported a problem 
which turns out to be more likely a problem in Git itself.

The documentation of `gitattributes` says:

> The rules [mostly follow .gitignore], with a few exceptions:
> * [...]
> * patterns that match a directory do not recursively match paths
>   inside that directory (so using the trailing-slash path/ syntax
>   is pointless in an attributes file; use path/** instead)

To me, this means that a pattern may match a _directory_ itself rather than 
the files in it.

So, if I have a /.gitattributes file like this:

> LICENSES/ export-ignore

I expect the whole directory /LICENSES/ to be matched, so a tool like 
`git-archive` skips it when creating an archive.

However, if I have a /.gitattributes file like this:

> LICENSES/** export-ignore

I expect the directory itself not to be matched, but all files inside it, so a 
tool like `git-archive` adds an empty /LICENSES/ directory to the archive.

`git-archive` actually behaves like this.

However, `git-check-attr` does not behave like this. In the second example, it 
will list `export-ignore` matching on every file inside /LICENSES/. But in the 
first example, it will list `export-ignore` matching nowhere. This leads to 
the problem where `git-archive-all` wrongly exports the whole /LICENSES/ 
directory.

The documentation for `gitattributes` says for `export-ignore`:

> `export-ignore`
>   Files and directories with the attribute `export-ignore`
>   won’t be added to archive files.

For most other attributes, the documentation does not mention directories, 
because those attributes do not make sense for directories.

---

I think the solution is like this:

1. Change the documentation of `gitattributes` to:

> The rules [mostly follow .gitignore], with a few exceptions:
> * [...]
> * patterns that match a directory do not recursively match paths
>   inside that directory. (Since most attributes are only useful for
>   files, the path/ syntax is probably incorrect; use path/** instead.)

2. Change git-check-attr so that patterns like path/ match directories. At the 
very least, this should be done for attributes like `export-ignore`, but I 
think the correct solution is to change it for all attributes.

---

Below is a shell script to reproduce the problem.

```
mkdir test_attributes
cd test_attributes
git init -b master
mkdir include dont-include-1 dont-include-2 dont-include-3
touch include/a.txt dont-include-1/b.txt dont-include-2/c.txt dont-include-3/d.txt
# Quote the patterns so the shell writes them to .gitattributes unexpanded.
echo 'dont-include-1/ export-ignore' >> .gitattributes
echo 'dont-include-2/** export-ignore' >> .gitattributes
echo 'dont-include-3 export-ignore' >> .gitattributes
git add -A
git commit -m "Test gitattributes"

git archive --output ../test_attributes.zip HEAD
zipinfo ../test_attributes.zip

git check-attr export-ignore * */*
```

This is the output which I get with git version 2.33.0.752.gd22421fcc6:

> $ git archive --output ../test_attributes.zip HEAD
> $ zipinfo ../test_attributes.zip
> Archive:  ../test_attributes.zip
> Zip file size: 584 bytes, number of entries: 4
> -rw----     0.0 fat       94 tx defN 21-Sep-23 16:37 .gitattributes
> drwx---     0.0 fat        0 bx stor 21-Sep-23 16:37 dont-include-2/
> drwx---     0.0 fat        0 bx stor 21-Sep-23 16:37 include/
> -rw----     0.0 fat        0 tx stor 21-Sep-23 16:37 include/a.txt
> 4 files, 94 bytes uncompressed, 46 bytes compressed:  51.1%
> 
> $ git check-attr export-ignore * */*
> dont-include-1: export-ignore: unspecified
> dont-include-2: export-ignore: unspecified
> dont-include-3: export-ignore: set
> include: export-ignore: unspecified
> dont-include-1/b.txt: export-ignore: unspecified
> dont-include-2/c.txt: export-ignore: set
> dont-include-3/d.txt: export-ignore: unspecified
> include/a.txt: export-ignore: unspecified

Note how `dont-include-2/** export-ignore` caused an empty directory to be 
created. This is problematic for my workflow.

`dont-include-3 export-ignore` yields the desired results here, but I think it 
is not always a viable alternative, because without the trailing slash the 
rule applies to files too. For example in this repository structure:

> - a/
>   - dont-include-3/
>     - a.txt
> - b/
>   - dont-include-3/
>     - b.txt
> - dont-include-3

I could not use the glob syntax `**/dont-include-3 export-ignore` to skip only 
the two _directories_ with a.txt and b.txt, because the _file_ dont-include-3 
would also be skipped.
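
This limitation can be checked with a small sketch. It assumes a throwaway 
repository created in a temporary directory; the variable names `repo` and 
`result` are only illustrative and not part of the setup above. It recreates 
the structure shown and asks `git check-attr` about the two directories and 
the top-level file:

```shell
set -e

# Hypothetical scratch repository mirroring the structure above.
repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir -p a/dont-include-3 b/dont-include-3
touch a/dont-include-3/a.txt b/dont-include-3/b.txt dont-include-3

# Quote the pattern so the shell does not expand the glob.
echo '**/dont-include-3 export-ignore' > .gitattributes

# Without a trailing slash the pattern cannot tell directories from files,
# so all three paths are reported with the attribute set.
result=$(git check-attr export-ignore a/dont-include-3 b/dont-include-3 dont-include-3)
echo "$result"
```

All three paths should be listed as `export-ignore: set`, including the 
_file_ dont-include-3 that I would want to keep in the archive.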

---

I am generally available for submitting patches to fix this issue. Let me know 
what you think. :)

Cheers, David



