From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: "brian m. carlson" <sandals@crustytoothpaste.net>
Cc: git@vger.kernel.org, "René Scharfe" <l.s.r@web.de>,
"Duy Nguyen" <pclouds@gmail.com>
Subject: Re: [PATCH 31/31] gitweb: make hash size independent
Date: Tue, 12 Feb 2019 11:57:52 +0100
Message-ID: <871s4dl1e7.fsf@evledraar.gmail.com>
In-Reply-To: <20190212012256.1005924-32-sandals@crustytoothpaste.net>
On Tue, Feb 12 2019, brian m. carlson wrote:
> Gitweb has several hard-coded 40 values throughout it to check for
> values that are passed in or acquired from Git. To simplify the code,
> introduce a regex variable that matches either exactly 40 or exactly 64
> hex characters, and use this variable anywhere we would have previously
> hard-coded a 40 in a regex.
>
> Similarly, switch the code that looks for deleted diffinfo information
> to look for either 40 or 64 zeros, and update one piece of code to use
> this function. Finally, when formatting a log line, allow an
> abbreviated describe output to contain up to 64 characters.
This might be going a bit overboard but I tried this with a variant
where...
> +# A regex matching a valid object ID.
> +our $oid_regex = qr/(?:[0-9a-fA-F]{40}(?:[0-9a-fA-F]{24})?)/;
> +
Instead of this dense regex I did:
    my $sha1_len = 40;
    my $sha256_extra_len = 24;
    my $sha256_len = $sha1_len + $sha256_extra_len;

    sub oid_nlen_regex {
        my $len = shift;
        my $hchr = qr/[0-9a-fA-F]/;
        return qr/(?:(?:$hchr){$len})/;
    }

    our $oid_regex;
    {
        my $x = oid_nlen_regex($sha1_len);
        my $y = oid_nlen_regex($sha256_extra_len);
        $oid_regex = qr/(?:$x(?:$y)?)/;
    }
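To make the behaviour of that composed regex concrete, here is a minimal standalone sketch. The sample strings are hypothetical inputs built by repetition, not real object IDs, and the anchoring with ^ and $ is my own choice for the check:

```perl
use strict;
use warnings;

my $sha1_len = 40;
my $sha256_extra_len = 24;

# Same helper as in the sketch above: $len hex digits, non-capturing.
sub oid_nlen_regex {
    my $len = shift;
    my $hchr = qr/[0-9a-fA-F]/;
    return qr/(?:(?:$hchr){$len})/;
}

# 40 hex digits, optionally followed by 24 more (64 in total).
my $x = oid_nlen_regex($sha1_len);
my $y = oid_nlen_regex($sha256_extra_len);
my $oid_regex = qr/(?:$x(?:$y)?)/;

# Hypothetical sample inputs of various lengths.
my %samples = (
    'sha1-sized (40)'   => 'a' x 40,
    'sha256-sized (64)' => 'b' x 64,
    'in-between (50)'   => 'c' x 50,
    'too short (39)'    => 'd' x 39,
);

for my $name (sort keys %samples) {
    my $verdict = $samples{$name} =~ /^$oid_regex$/ ? 'match' : 'no match';
    print "$name: $verdict\n";
}
```

When anchored like this, only the 40- and 64-character inputs match; the 39- and 50-character ones do not.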
Then most of the rest of this is the same, e.g.:
> - if ($input =~ m/^[0-9a-fA-F]{40}$/) {
But...
> @@ -2037,10 +2040,10 @@ sub format_log_line_html {
> (?<!-) # see strbuf_check_tag_ref(). Tags can't start with -
> [A-Za-z0-9.-]+
> (?!\.) # refs can't end with ".", see check_refname_format()
> - -g[0-9a-fA-F]{7,40}
> + -g[0-9a-fA-F]{7,64}
> |
> # Just a normal looking Git SHA1
> - [0-9a-fA-F]{7,40}
> + [0-9a-fA-F]{7,64}
> )
> \b
> }{
E.g. here we can call oid_nlen_regex("7,64") to produce this blurb.
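As a rough illustration of passing a range instead of a single length, assuming the same oid_nlen_regex helper as above (the input strings and the "-g" check are made up for the example and are not taken from gitweb):

```perl
use strict;
use warnings;

# Same helper as above; $len may be a single count or a "min,max" range,
# since it is interpolated directly into the {…} quantifier.
sub oid_nlen_regex {
    my $len = shift;
    my $hchr = qr/[0-9a-fA-F]/;
    return qr/(?:(?:$hchr){$len})/;
}

my $abbrev = oid_nlen_regex("7,64");

# Hypothetical abbreviated IDs of various lengths.
for my $input ('deadbee', 'dead', 'f' x 64, 'f' x 65) {
    my $verdict = $input =~ /^$abbrev$/ ? 'match' : 'no match';
    printf "%d chars: %s\n", length($input), $verdict;
}

# Hypothetical describe-style string: pull out the part after "-g".
if ('v1.0-3-gdeadbee' =~ /-g($abbrev)$/) {
    print "abbreviated id: $1\n";
}
```

The 7- and 64-character inputs match when anchored, while the 4- and 65-character ones do not.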
> - if ($line =~ m/^index [0-9a-fA-F]{40},[0-9a-fA-F]{40}/) {
> + if ($line =~ m/^index $oid_regex,$oid_regex/) {
> - } elsif ($line =~ m/^index [0-9a-fA-F]{40}..[0-9a-fA-F]{40}/) {
> + } elsif ($line =~ m/^index $oid_regex..$oid_regex/) {
And here, maybe nobody cares, but we now implicitly accept mixed SHA-1 &
SHA-256 input. Whereas we could have a helper on top of the above code
like:
    sub oid_nlen_prefix_infix_regex {
        my $nlen = shift;
        my $prefix = shift;
        my $infix = shift;
        my $rx = oid_nlen_regex($nlen);
        return qr/^\Q$prefix\E$rx\Q$infix\E$rx$/;
    }
And then e.g.:
    } elsif ($line =~ oid_nlen_prefix_infix_regex($sha1_len, "index ", "..") ||
             $line =~ oid_nlen_prefix_infix_regex($sha256_len, "index ", "..")) {
That way we'd only accept SHA1..SHA1 or SHA256..SHA256, not SHA1..SHA256 or
SHA256..SHA1.
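A small standalone sketch of that check, in case it helps. The is_index_line wrapper and the sample lines are hypothetical and exist only for this example:

```perl
use strict;
use warnings;

my $sha1_len = 40;
my $sha256_len = 64;

sub oid_nlen_regex {
    my $len = shift;
    my $hchr = qr/[0-9a-fA-F]/;
    return qr/(?:(?:$hchr){$len})/;
}

# Both IDs must have the same length $nlen; prefix and infix are literal.
sub oid_nlen_prefix_infix_regex {
    my $nlen = shift;
    my $prefix = shift;
    my $infix = shift;
    my $rx = oid_nlen_regex($nlen);
    return qr/^\Q$prefix\E$rx\Q$infix\E$rx$/;
}

# Hypothetical wrapper: true only for SHA1..SHA1 or SHA256..SHA256.
sub is_index_line {
    my $line = shift;
    return $line =~ oid_nlen_prefix_infix_regex($sha1_len, "index ", "..")
        || $line =~ oid_nlen_prefix_infix_regex($sha256_len, "index ", "..");
}

# Made-up IDs built by repetition, not real object names.
my ($s1, $s256) = ('a' x 40, 'b' x 64);
for my $pair ([$s1, $s1], [$s256, $s256], [$s1, $s256], [$s256, $s1]) {
    my $line = "index " . $pair->[0] . ".." . $pair->[1];
    printf "%d..%d: %s\n", length($pair->[0]), length($pair->[1]),
        is_index_line($line) ? 'accepted' : 'rejected';
}
```

The two same-length lines are accepted and the two mixed ones are rejected. Note that because the helper anchors with a trailing $, a line carrying anything after the second ID would not match either; I haven't checked whether that matters for the callers in gitweb.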