From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Siddharth Asthana <siddharthasthana31@gmail.com>
Cc: git@vger.kernel.org, phillip.wood123@gmail.com,
	congdanhqx@gmail.com, christian.couder@gmail.com,
	gitster@pobox.com, Johannes.Schindelin@gmx.de,
	johncai86@gmail.com
Subject: Re: [PATCH v4 3/4] ident: rename commit_rewrite_person() to apply_mailmap_to_header()
Date: Wed, 13 Jul 2022 03:25:40 +0200
Message-ID: <220713.86ilo14wqq.gmgdl@evledraar.gmail.com>
In-Reply-To: <20220712160634.213956-4-siddharthasthana31@gmail.com>


On Tue, Jul 12 2022, Siddharth Asthana wrote:

> commit_rewrite_person() takes a commit buffer and replaces the idents
> in the header with their canonical versions using the mailmap mechanism.
> The name "commit_rewrite_person()" is misleading as it doesn't convey
> what kind of rewrite are we going to do to the buffer. It also doesn't
> clearly mention that the function will limit itself to the header part
> of the buffer. The new name, "apply_mailmap_to_header()", expresses the
> functionality of the function pretty clearly.
>
> We intend to use apply_mailmap_to_header() in git-cat-file to replace
> idents in the headers of commit and tag object buffers. So, we will be
> extending this function to take tag objects buffer as well and replace
> idents on the tagger header using the mailmap mechanism.
>
> Mentored-by: Christian Couder <christian.couder@gmail.com>
> Mentored-by: John Cai <johncai86@gmail.com>
> Signed-off-by: Siddharth Asthana <siddharthasthana31@gmail.com>
> ---
>  cache.h    | 6 +++---
>  ident.c    | 2 +-
>  revision.c | 2 +-
>  3 files changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/cache.h b/cache.h
> index c9dbe1c29a..9edb7fefd3 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1689,10 +1689,10 @@ struct ident_split {
>  int split_ident_line(struct ident_split *, const char *, int);
>  
>  /*
> - * Given a commit object buffer and the commit headers, replaces the idents
> - * in the headers with their canonical versions using the mailmap mechanism.
> + * Given a commit or tag object buffer and the commit or tag headers, replaces
> + * the idents in the headers with their canonical versions using the mailmap mechanism.
>   */
> -void commit_rewrite_person(struct strbuf *buf, const char **commit_headers, struct string_list *mailmap);
> +void apply_mailmap_to_header(struct strbuf *buf, const char **headers, struct string_list *mailmap);
>  
>  /*
>   * Compare split idents for equality or strict ordering. Note that we
> diff --git a/ident.c b/ident.c
> index 9f4f6e9071..5f17bd607d 100644
> --- a/ident.c
> +++ b/ident.c
> @@ -393,7 +393,7 @@ static ssize_t rewrite_ident_line(const char* person, struct strbuf *buf, struct
>  	return 0;
>  }
>  
> -void commit_rewrite_person(struct strbuf *buf, const char **headers, struct string_list *mailmap)
> +void apply_mailmap_to_header(struct strbuf *buf, const char **headers, struct string_list *mailmap)
>  {
>  	size_t buf_offset = 0;
>  
> diff --git a/revision.c b/revision.c
> index 14dca903b6..6ad3665204 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -3792,7 +3792,7 @@ static int commit_match(struct commit *commit, struct rev_info *opt)
>  		if (!buf.len)
>  			strbuf_addstr(&buf, message);
>  
> -		commit_rewrite_person(&buf, commit_headers, opt->mailmap);
> +		apply_mailmap_to_header(&buf, commit_headers, opt->mailmap);
>  	}
>  
>  	/* Append "fake" message parts as needed */

I can live with this so far, but I really think this is cementing the
wrong approach into place here.

We only use commit_match() to feed a commit to grep.c. If you look at
the "header_field" struct there, we take this pre-formatted output and
parse it *again*, i.e. find "author", "reflog", "committer" etc., and
eventually point the regex engine at that buffer.

So we really don't need to get a strbuf here and munge the whole thing
in place to feed it to grep.c. Instead we can:

 1. Not munge it at all, pass it as-is
 2. Pass the mailmap along to grep.c itself
 3. It's already parsing out the headers, so at some point it will have
    "author foo <bar>\n"
 4. In that code, we can just consult the mailmap, and then map the "foo
    <bar>" part to "Baz <bar>" or whatever
 5. Then search that string.

So there's no need for any in-place rewriting, is there?
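
To make steps 3-5 concrete, here is a rough standalone sketch. It does
not use grep.c or the real mailmap API: the toy_* names are made up for
illustration, the "mailmap" is a plain email-to-name table, and strstr()
stands in for the regex engine. The point is only that the mapped ident
lives in a small scratch buffer and the object buffer is never modified.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a mailmap: email -> canonical name. */
struct toy_mailmap_entry {
	const char *email;
	const char *canonical_name;
};

/* Return the canonical name for an email, or NULL if it is unmapped. */
static const char *toy_mailmap_lookup(const struct toy_mailmap_entry *map,
				      size_t nr, const char *email,
				      size_t email_len)
{
	for (size_t i = 0; i < nr; i++)
		if (strlen(map[i].email) == email_len &&
		    !memcmp(map[i].email, email, email_len))
			return map[i].canonical_name;
	return NULL;
}

/*
 * "ident" points just past the header keyword, e.g. at
 * "foo <bar> 1657675540 +0200"; "len" excludes the newline.
 * Returns 1 if "needle" is found in the (possibly mapped) ident,
 * 0 if not, -1 if the ident does not fit the scratch buffer.
 */
static int toy_header_matches(const char *ident, size_t len,
			      const struct toy_mailmap_entry *map, size_t nr,
			      const char *needle)
{
	char scratch[256];
	const char *lt = memchr(ident, '<', len);
	const char *gt = lt ? memchr(lt, '>', len - (size_t)(lt - ident)) : NULL;
	const char *name = NULL;
	int n;

	if (gt)
		name = toy_mailmap_lookup(map, nr, lt + 1, (size_t)(gt - lt - 1));

	if (name)
		/* canonical name, then the original "<email> date" tail */
		n = snprintf(scratch, sizeof(scratch), "%s %.*s", name,
			     (int)(len - (size_t)(lt - ident)), lt);
	else
		n = snprintf(scratch, sizeof(scratch), "%.*s", (int)len, ident);

	if (n < 0 || (size_t)n >= sizeof(scratch))
		return -1;
	return strstr(scratch, needle) != NULL;
}
```

With a table containing only { "bar", "Baz" }, searching the ident
"foo <bar> 1657675540 +0200" for "Baz" matches and searching for "foo"
does not, since the search sees the mapped name.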

Even with this approach this seems a bit odd. E.g. isn't your
commit_rewrite_person() largely a re-invention of find_commit_header()
in commit.c? Can't we use that function there?

The replace_idents_using_mailmap() in 4/4 seems like it could be
improved in a similar way.

I.e. can't we just loop over the object, then as we find "author"
consult the mailmap, and potentially emit a replacement, otherwise the
existing content as-is up until the next \n etc.?

We should be able to "stream" all of this, instead of in-place modifying
a potentially large commit buffer, which involves memmove() etc.
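
Something like the following standalone sketch is what I have in mind
for the streaming variant. It is not written against git's internals:
the sketch_* names are hypothetical, the "mailmap" is again a plain
email-to-name table, and output goes to a FILE * for simplicity. It
walks the header line by line, emits a replacement name on "author",
"committer" and "tagger" lines, and writes everything else through
untouched, so nothing is memmove()'d.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a mailmap: email -> canonical name. */
struct sketch_map_entry {
	const char *email;
	const char *name;
};

static const char *sketch_lookup(const struct sketch_map_entry *map, size_t nr,
				 const char *email, size_t len)
{
	for (size_t i = 0; i < nr; i++)
		if (strlen(map[i].email) == len &&
		    !memcmp(map[i].email, email, len))
			return map[i].name;
	return NULL;
}

/* Length of "keyword " if the line carries an ident, else 0. */
static size_t sketch_ident_prefix(const char *line, size_t len)
{
	static const char *const keys[] = { "author ", "committer ", "tagger " };

	for (size_t i = 0; i < sizeof(keys) / sizeof(keys[0]); i++) {
		size_t klen = strlen(keys[i]);
		if (len > klen && !memcmp(line, keys[i], klen))
			return klen;
	}
	return 0;
}

/* Copy an object buffer to "out", mapping names on header ident lines. */
static void sketch_stream_object(const char *buf, size_t len,
				 const struct sketch_map_entry *map, size_t nr,
				 FILE *out)
{
	const char *p = buf, *end = buf + len;

	while (p < end) {
		const char *nl = memchr(p, '\n', (size_t)(end - p));
		size_t line_len = nl ? (size_t)(nl - p) : (size_t)(end - p);
		const char *lt, *gt, *name = NULL;
		size_t klen;

		if (!line_len)
			break; /* blank line: the header is over */

		klen = sketch_ident_prefix(p, line_len);
		lt = klen ? memchr(p + klen, '<', line_len - klen) : NULL;
		gt = lt ? memchr(lt, '>', line_len - (size_t)(lt - p)) : NULL;
		if (gt)
			name = sketch_lookup(map, nr, lt + 1,
					     (size_t)(gt - lt - 1));

		if (name) {
			fwrite(p, 1, klen, out);	/* e.g. "author " */
			fprintf(out, "%s ", name);	/* replacement name */
			fwrite(lt, 1, line_len - (size_t)(lt - p), out);
		} else {
			fwrite(p, 1, line_len, out);	/* as-is */
		}
		if (nl)
			fputc('\n', out);
		p += line_len + (nl ? 1 : 0);
	}
	/* The separating blank line and the body pass through unchanged. */
	fwrite(p, 1, (size_t)(end - p), out);
}
```

Continuation lines (which start with a space) never match a keyword, so
an embedded "tagger" inside a multi-line header is left alone in this
sketch, as is anything that looks like a header in the message body.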

