git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Christian Couder <christian.couder@gmail.com>
Cc: Junio C Hamano <gitster@pobox.com>, git <git@vger.kernel.org>,
	Christian Couder <chriscool@tuxfamily.org>
Subject: Re: [RFC PATCH 3/5] pack-objects: add delta-islands support
Date: Mon, 6 Aug 2018 09:58:55 -0400	[thread overview]
Message-ID: <20180806135855.GC31282@sigill.intra.peff.net> (raw)
In-Reply-To: <CAP8UFD3uruf94d7eV9Fy25h19y_KV8uZeaZN71F_xFM2XrspWQ@mail.gmail.com>

On Mon, Aug 06, 2018 at 10:44:14AM +0200, Christian Couder wrote:

> Taking a look at how we use regexec() in our code base, it looks like
> it might be better to use regexec_buf() defined in
> "git-compat-util.h".
> 
> I am not completely sure about that because apply.c has:
> 
>     status = regexec(stamp, timestamp, ARRAY_SIZE(m), m, 0);
>     if (status) {
>         if (status != REG_NOMATCH)
>             warning(_("regexec returned %d for input: %s"),
>                 status, timestamp);
>         return 0;
>     }
> 
> Though the above uses a regex that is defined in apply.c. The regex
> doesn't come from the config file.
> 
> Actually except the above there is a mix of regexec() and
> regexec_buf() in our code base, which are used with only 0, 1 or 2
> capture groups, so it is not very clear what should be used.

I don't think we need regexec_buf(). The advantage it has is that it can
operate on strings that aren't NUL-terminated, but that isn't the case
here.

> And anyway I still don't see how we could diagnose when the end user
> input requires more captures than we support.

We can use the final element as a sentinel, and complain if it gets
filled in:

diff --git a/delta-islands.c b/delta-islands.c
index dcc6590cc1..18426ffb18 100644
--- a/delta-islands.c
+++ b/delta-islands.c
@@ -375,6 +375,10 @@ static int find_island_for_ref(const char *refname, const struct object_id *oid,
 	if (i < 0)
 		return 0;
 
+	if (matches[ARRAY_SIZE(matches)-1].rm_so != -1)
+		die("island regex had too many matches (max=%d)",
+		    (int)ARRAY_SIZE(matches) - 2);
+
 	for (m = 1; m < ARRAY_SIZE(matches); m++) {
 		regmatch_t *match = &matches[m];
 

The big downside is that it only kicks in when you actually successfully
make a match. So you could have:

  [pack]
  island = refs/(one)/(two)/(three)/(four)/(five)/(six)/(seven)

in your config for years, and then one day it blows up when somebody
actually has a ref that matches it.

I think it would be fine to just say "we only respect the first N
capture groups". And maybe even issue a warning (based on the detection
above). I'd also be fine with bumping the "matches" array to something
more ridiculous, like 32. The current value of 8 was supposed to be
ridiculous already (we've never used more than 2).

-Peff

  reply	other threads:[~2018-08-06 13:58 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-22  5:48 [RFC PATCH 0/5] Add delta islands support Christian Couder
2018-07-22  5:48 ` [RFC PATCH 1/5] packfile: make get_delta_base() non static Christian Couder
2018-07-24 16:19   ` Junio C Hamano
2018-07-27 11:29     ` Jeff King
2018-07-22  5:48 ` [RFC PATCH 2/5] Add delta-islands.{c,h} Christian Couder
2018-07-22  8:50   ` Duy Nguyen
2018-07-22 13:57     ` Christian Couder
2018-08-05 18:53     ` Christian Couder
2018-08-06 14:17       ` Jeff King
2018-08-06 15:53       ` Duy Nguyen
2018-08-06 18:54         ` Christian Couder
2018-08-06 19:21           ` Duy Nguyen
2018-07-24 16:47   ` Junio C Hamano
2018-07-27 13:02     ` Jeff King
2018-07-27  9:40   ` Jeff King
2018-07-22  5:48 ` [RFC PATCH 3/5] pack-objects: add delta-islands support Christian Couder
2018-07-22  8:55   ` Duy Nguyen
2018-08-05 17:28     ` Christian Couder
2018-07-23 18:52   ` Stefan Beller
2018-07-24  9:58     ` Jeff King
2018-07-24 17:20       ` Stefan Beller
2018-07-27 13:13         ` Jeff King
2018-07-27 17:22           ` Stefan Beller
2018-07-28  9:00             ` Jeff King
2018-07-28 12:12               ` Christian Couder
2018-07-24 17:03   ` Junio C Hamano
2018-07-24 17:11     ` Junio C Hamano
2018-08-05 17:40       ` Christian Couder
2018-08-06  8:44         ` Christian Couder
2018-08-06 13:58           ` Jeff King [this message]
2018-07-22  5:48 ` [RFC PATCH 4/5] repack: " Christian Couder
2018-07-22  5:48 ` [RFC PATCH 5/5] t: add t9930-delta-islands.sh Christian Couder
2018-07-24 10:24   ` Jeff King
2018-07-24 10:16 ` [RFC PATCH 0/5] Add delta islands support Jeff King
2018-07-24 17:18   ` Junio C Hamano
2018-07-24 21:14     ` Jeff King
2018-07-26 13:34 ` Johannes Schindelin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180806135855.GC31282@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=chriscool@tuxfamily.org \
    --cc=christian.couder@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).