From: Junio C Hamano <gitster@pobox.com>
To: Lars Schneider <larsxschneider@gmail.com>
Cc: git@vger.kernel.org, peff@peff.net, tboegi@web.de, e@80x24.org,
ttaylorr@github.com, peartben@gmail.com
Subject: Re: [PATCH v6 6/6] convert: add "status=delayed" to filter process protocol
Date: Mon, 26 Jun 2017 15:13:07 -0700 [thread overview]
Message-ID: <xmqqa84u4cuk.fsf@gitster.mtv.corp.google.com> (raw)
In-Reply-To: <F72755BA-6296-4C37-9EFA-4D7BCE9F1082@gmail.com> (Lars Schneider's message of "Mon, 26 Jun 2017 23:28:21 +0200")
Lars Schneider <larsxschneider@gmail.com> writes:
> Maybe this?
> [...] If Git sends this command, then the
> filter is expected to return a list of pathnames representing blobs
> that have been delayed earlier and are now available. [...]
OK.
>>> +by a "success" status that is also terminated with a flush packet. If
>>> +no blobs for the delayed paths are available, yet, then the filter is
>>> +expected to block the response until at least one blob becomes
>>> +available.
>>
>> Ahh, this is better, at least you use "the delayed paths".
>>
>> What if the result never gets available (e.g. resulted in an error)?
>
> As soon as the filter responds with an empty list, Git stops asking.
> All blobs that Git has not received at this point are considered an
> error.
>
> Should I mention that explicitly?
Otherwise I wouldn't have wondered "what if".
>> I am wondering whose responsibility it will be to deal with a path
>> "list-available" reports that are *not* asked by Git or Git got no
>> "delayed" response. The list subtraction done by the caller is
>> probably the logical place to do so.
>
> Correct. Git (the caller) will notice that not all "delayed" paths
> are listed by the filter and throw an error at the end.
I am wondering about the other case. Git didn't ask for a path to
be filtered at all, but the filter sneaked in a path that happens to
in the index in its response---Git should at least ignore it, and
better yet, diagnose it as an error in the filter process.
>>> + filter->string = "";
>>> + continue;
>>> + }
>>> +
>>> + /* In dco->paths we store a list of all delayed paths.
>>> + The filter just send us a list of available paths.
>>> + Remove them from the list.
>>> + */
>>> + filter_string_list(&dco->paths, 0,
>>> + &remove_available_paths, &available_paths);
>>
>> We first remove from the outstanding request list (dco->paths) what
>> are now ready...
>>
>>> + for_each_string_list_item(path, &available_paths) {
>>
>> ...and go over those paths that are now ready.
>>
>>> + struct cache_entry* ce = index_file_exists(
>>> + state->istate, path->string,
>>> + strlen(path->string), 0);
>>> + assert(dco->state == CE_RETRY);
>>> + errs |= (ce ? checkout_entry(ce, state, NULL) : 1);
>>> + }
>>
>> But we never checked if the contents of this available_paths list is
>> a subset of the set of paths we originally asked the external
>> process to filter.
>
> Correct.
>
>> This will allow the process to overwrite any
>> random path that is not even involved in the checkout.
>
> No, not "any random path". Only paths that are part of the checkout.
Isn't it "any path that index_file_exists() returns a CE for". Did
you filter out elements in available_paths that weren't part of
dco->paths? I thought the filter-string-list you have are for the
other way around (which is necessary to keep track of work to be
done, but that filtering does not help rejecting rogue responses at
all).
> There are three cases:
>
> (1) available_path is a path that was delayed before (= happy case!)
> (2) available_path is a path that was not delayed before,
> but filtered (= no problem, as filtering is a idempotent operation)
> (3) available_path is a path that was neither delayed nor filtered
> before (= if the filter returns the blob as-is then this would
> be no problem. otherwise we would indeed have a screwed checkout)
>
> Case 3 might introduce a problem if the filter is buggy.
> Would you be OK with this check to catch case 3?
I'd be very suspicious about anything you would do only with .nr
field, without filtering the other way around. After all, we may
have asked it for 3 paths to be filtered, and it may have answered
with its own 3 different paths.
> dco_path_count = dco->paths.nr;
> filter_string_list(&dco->paths, 0,
> &remove_available_paths, &available_paths);
>
> if (dco_path_count - dco->paths.nr != available_paths.nr) {
> /* The filter responded with entries that have not
> * been delay earlier. Do not ask the filter
> * for available blobs, again, as the filter is
> * likely buggy. This will generate an error at
> * the end as some files are not filtered properly.
> */
> filter->string = "";
> error(_("The external filter '%s' responded with "
> "available blobs which have not been delayed "
> "earlier."), filter->string);
> continue;
> }
>
>
>>> + }
>>> + string_list_remove_empty_items(&dco->filters, 0);
>>> + }
>>> + string_list_clear(&dco->filters, 0);
>>> +
>>> + /* At this point we should not have any delayed paths anymore. */
>>> + errs |= dco->paths.nr;
>>> + for_each_string_list_item(path, &dco->paths) {
>>> + warning(_("%s was not filtered properly."), path->string);
>>> + }
>>> + string_list_clear(&dco->paths, 0);
>>
>> And "list-available" that says "path X is ready" when we never asked
>> for X gets away free without detected as a bug, either.
>
> With the addition above it would!
>
>
> Thanks for the review :-)
>
> - Lars
next prev parent reply other threads:[~2017-06-26 22:13 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-06-25 18:21 [PATCH v6 0/6] convert: add "status=delayed" to filter process protocol Lars Schneider
2017-06-25 18:21 ` [PATCH v6 1/6] t0021: keep filter log files on comparison Lars Schneider
2017-06-25 22:12 ` Junio C Hamano
2017-06-26 9:02 ` Lars Schneider
2017-06-26 17:31 ` Junio C Hamano
2017-06-25 18:21 ` [PATCH v6 2/6] t0021: make debug log file name configurable Lars Schneider
2017-06-25 22:15 ` Junio C Hamano
2017-06-25 18:21 ` [PATCH v6 3/6] t0021: write "OUT" only on success Lars Schneider
2017-06-25 22:17 ` Junio C Hamano
2017-06-26 9:26 ` Lars Schneider
2017-06-26 17:50 ` Junio C Hamano
2017-06-26 18:32 ` Lars Schneider
2017-06-25 18:21 ` [PATCH v6 4/6] convert: put the flags field before the flag itself for consistent style Lars Schneider
2017-06-25 22:19 ` Junio C Hamano
2017-06-25 18:21 ` [PATCH v6 5/6] convert: move multiple file filter error handling to separate function Lars Schneider
2017-06-26 17:56 ` Junio C Hamano
2017-06-27 2:51 ` Stefan Beller
2017-06-26 18:00 ` Junio C Hamano
2017-06-25 18:21 ` [PATCH v6 6/6] convert: add "status=delayed" to filter process protocol Lars Schneider
2017-06-26 19:02 ` Junio C Hamano
2017-06-26 21:28 ` Lars Schneider
2017-06-26 22:13 ` Junio C Hamano [this message]
2017-06-26 22:38 ` Junio C Hamano
2017-06-27 12:07 ` Lars Schneider
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xmqqa84u4cuk.fsf@gitster.mtv.corp.google.com \
--to=gitster@pobox.com \
--cc=e@80x24.org \
--cc=git@vger.kernel.org \
--cc=larsxschneider@gmail.com \
--cc=peartben@gmail.com \
--cc=peff@peff.net \
--cc=tboegi@web.de \
--cc=ttaylorr@github.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.