From: Ben Peart <peartben@gmail.com>
To: "Martin Ågren" <martin.agren@gmail.com>,
"Ben Peart" <Ben.Peart@microsoft.com>
Cc: "git@vger.kernel.org" <git@vger.kernel.org>,
"pclouds@gmail.com" <pclouds@gmail.com>,
"alexmv@dropbox.com" <alexmv@dropbox.com>,
"blees@dcon.de" <blees@dcon.de>,
"gitster@pobox.com" <gitster@pobox.com>,
"bmwill@google.com" <bmwill@google.com>,
"avarab@gmail.com" <avarab@gmail.com>,
"johannes.schindelin@gmx.de" <johannes.schindelin@gmx.de>
Subject: Re: [PATCH v1 1/2] fsexcludes: add a programmatic way to exclude files from git's working directory traversal logic
Date: Wed, 11 Apr 2018 15:56:35 -0400 [thread overview]
Message-ID: <082cda6c-eb9f-c853-c09b-e0d3bc04cb9c@gmail.com> (raw)
In-Reply-To: <CAN0heSpKzG93OcAAAoHQxURVGsHFWz6j494C+3bezHLTOovQHA@mail.gmail.com>
On 4/10/2018 6:09 PM, Martin Ågren wrote:
> On 10 April 2018 at 23:04, Ben Peart <Ben.Peart@microsoft.com> wrote:
>> The File System Excludes module is a new programmatic way to exclude files and
>> folders from git's traversal of the working directory. fsexcludes_init() should
>> be called with a string buffer that contains a NUL separated list of path names
>> of the files and/or directories that should be included. Any path not listed
>> will be excluded. The paths should be relative to the root of the working
>> directory and be separated by a single NUL.
>>
>> The excludes logic in dir.c has been updated to honor the results of
>> fsexcludes_is_excluded_from(). If fsexcludes does not exclude the file, the
>> normal excludes logic is also checked as it could further reduce the set of
>> files that should be included.
>
> Here you mention a change in dir.c...
>
>> Makefile | 1 +
>> fsexcludes.c | 210 +++++++++++++++++++++++++++++++++++++++++++++++++++
>> fsexcludes.h | 27 +++++++
>> 3 files changed, 238 insertions(+)
>
> ... but this patch does not seem to touch dir.c at all.
>
Oops! Fixed in V2.
>> +static int check_fsexcludes_hashmap(struct hashmap *map, const char *pattern, int patternlen)
>> +{
>> + struct strbuf sb = STRBUF_INIT;
>> + struct fsexcludes fse;
>> + char *slash;
>> +
>> + /* Check straight mapping */
>> + strbuf_reset(&sb);
>
> You could drop this strbuf_reset(). Or did you intend to use a static
> struct strbuf?
>
Good point, fixed in V2.
>> + /*
>> + * Check to see if it matches a directory or any path
>> + * underneath it. In other words, 'a/b/foo.txt' will match
>> + * '/', 'a/', and 'a/b/'.
>> + */
>> + slash = strchr(sb.buf, '/');
>> + while (slash) {
>> + fse.pattern = sb.buf;
>> + fse.patternlen = slash - sb.buf + 1;
>> + hashmap_entry_init(&fse, fsexcludeshash(fse.pattern, fse.patternlen));
>> + if (hashmap_get(map, &fse, NULL)) {
>> + strbuf_release(&sb);
>> + return 0;
>> + }
>> + slash = strchr(slash + 1, '/');
>> + }
>
> Maybe a for-loop would make this slightly more obvious:
>
> for (slash = strchr(sb.buf, '/'); slash; slash = strchr(slash + 1, '/'))
>
> On second thought, maybe not.
>
>> + entry = buf = fsexcludes_data->buf;
>> + len = fsexcludes_data->len;
>> + for (i = 0; i < len; i++) {
>> + if (buf[i] == '\0') {
>> + fsexcludes_hashmap_add(map, entry, buf + i - entry);
>> + entry = buf + i + 1;
>> + }
>> + }
>> +}
>
> Very minor: I would have found "buf - entry + i" clearer here and later,
> but I'm sure you'll find someone of the opposing opinion (e.g.,
> yourself). ;-)
>
>> +static int check_directory_hashmap(struct hashmap *map, const char *pathname, int pathlen)
>> +{
>> + struct strbuf sb = STRBUF_INIT;
>> + struct fsexcludes fse;
>> +
>> + /* Check for directory */
>> + strbuf_reset(&sb);
>
> Same comment as above about this spurious reset.
Good point, fixed in V2.
>
>> + if (hashmap_get(map, &fse, NULL)) {
>> + strbuf_release(&sb);
>> + return 0;
>> + }
>> +
>> + strbuf_release(&sb);
>> + return 1;
>> +}
>> +
>> +/*
>> + * Return 1 for exclude, 0 for include and -1 for undecided.
>> + */
>> +int fsexcludes_is_excluded_from(struct index_state *istate,
>> + const char *pathname, int pathlen, int dtype)
>> +{
>
> Will we at some point regret not being able to "return negative on
> error"? I guess that would be "-2" or "negative other than -1".
>
This function is modeled after the other is_excluded_from* functions in
dir.c so that the return value can be handled the same way. I don't
anticipate any need for change but you're right, we could return some
other "negative other than -1" if it was ever needed.
>> +void fsexcludes_init(struct strbuf *sb) {
>> + fsexcludes_initialized = 1;
>> + fsexcludes_data = *sb;
>> +}
>
> Grabbing the strbuf's members looks a bit odd. Is this
> performance-sensitive enough that you do not want to make a copy? If a
> caller releases its strbuf, which would normally be a good thing to do,
> we may be in big trouble later. (Not only may .buf be stale, .len may
> indicate we actually have something to read.)
>
> I can understand that you do not want to pass a pointer+len, and that it
> is not enough to pass sb.buf, since the string may contain nuls.
>
> Maybe detach the original strbuf? That way, if a caller releases its
> buffer, that is a no-op. A caller which goes on to use its buffer should
> fail quickly and obviously. Right now, an incorrect caller would
> probably fail more subtly and less reproducibly.
>
> In any case, maybe document this in the .h-file?
Great suggestion! I was looking for a better way to ensure the buffer
ownership transfer was robust. I'll do both strbuf_detach() and update
the header file. Thank you.
>
> Martin
>
next prev parent reply other threads:[~2018-04-11 19:56 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-04-10 21:04 [PATCH v1 0/2] fsexcludes: Add programmatic way to exclude files Ben Peart
2018-04-10 21:04 ` [PATCH v1 1/2] fsexcludes: add a programmatic way to exclude files from git's working directory traversal logic Ben Peart
2018-04-10 22:09 ` Martin Ågren
2018-04-11 19:56 ` Ben Peart [this message]
2018-04-11 6:58 ` Junio C Hamano
2018-04-10 21:04 ` [PATCH v1 2/2] fsmonitor: switch to use new fsexcludes logic and remove unused untracked cache based logic Ben Peart
2018-04-11 20:01 ` [PATCH v2 0/2] fsexcludes: Add programmatic way to exclude files Ben Peart
2018-04-11 20:01 ` [PATCH v2 1/2] fsexcludes: add a programmatic way to exclude files from git's working directory traversal logic Ben Peart
2018-04-11 23:52 ` Junio C Hamano
2018-04-13 11:53 ` Ben Peart
2018-04-11 20:01 ` [PATCH v2 2/2] fsmonitor: switch to use new fsexcludes logic and remove unused untracked cache based logic Ben Peart
2018-04-13 12:22 ` [PATCH v3 0/2] fsexcludes: Add programmatic way to exclude files Ben Peart
2018-04-13 12:22 ` [PATCH v3 1/2] fsexcludes: add a programmatic way to exclude files from git's working directory traversal logic Ben Peart
2018-04-13 12:22 ` [PATCH v3 2/2] fsmonitor: switch to use new fsexcludes logic and remove unused untracked cache based logic Ben Peart
2018-04-18 15:31 ` [PATCH v3 0/2] fsexcludes: Add programmatic way to exclude files Ben Peart
2018-04-18 21:25 ` Junio C Hamano
2018-04-14 15:59 ` [PATCH v1 " Duy Nguyen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=082cda6c-eb9f-c853-c09b-e0d3bc04cb9c@gmail.com \
--to=peartben@gmail.com \
--cc=Ben.Peart@microsoft.com \
--cc=alexmv@dropbox.com \
--cc=avarab@gmail.com \
--cc=blees@dcon.de \
--cc=bmwill@google.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=johannes.schindelin@gmx.de \
--cc=martin.agren@gmail.com \
--cc=pclouds@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.