From: ZheNing Hu <adlternative@gmail.com>
To: Christian Couder <christian.couder@gmail.com>
Cc: Git List <git@vger.kernel.org>,
Junio C Hamano <gitster@pobox.com>,
Stefan Beller <sbeller@google.com>,
Hariom verma <hariom18599@gmail.com>
Subject: Re: [GSOC] [QUESTION] ref-filter: can %(raw) implement reuse oi.content?
Date: Wed, 25 Aug 2021 16:11:45 +0800 [thread overview]
Message-ID: <CAOLTT8S-bbnb8BecJRzLQytEEnZKU_VwRwiF1tk+gy0yrjTOog@mail.gmail.com> (raw)
In-Reply-To: <CAP8UFD0pZVCHG_i+u_6QAP5yKUpmdYm+fkwr9aJMCDXukKF_7g@mail.gmail.com>
Christian Couder <christian.couder@gmail.com> 于2021年8月24日周二 下午3:11写道:
>
> > set_commit_buffer()
> > or parse_tree_buffer() cache the 'buf'. 'eaten' means we don't need to free it
> > manually.
>
> Yeah, but it doesn't mean that we cannot use the buf. So perhaps we
> could still use it, even if it's eaten.
>
Yes. Just like the following question: How do we know that the eaten 'buf' has
not been free() by some logic in git when we want to use it?
> > In grab_sub_body_contents() we can indeed write the address of
> > 'buf' to v->s, but what I worry about is that before we completely write v->s to
> > the output buffer through append_atom(), will this buf be freed in
> > somewhere else?
>
> Yeah, that's a good question. But actually it seems that the buf is
> freed (at least in get_object()) only when it's not eaten.
>
get_object() will free the 'buf' which has not been eaten, but we can
take use of
it (v->s = buf and we free it later)
> > > > + else
> > > > + v->s = buf;
> > > > v->s_size = buf_size;
> > > > } else if (atom->u.raw_data.option == RAW_LENGTH) {
> > > > v->s = xstrfmt_len(&v->s_size,
> > > > "%"PRIuMAX, (uintmax_t)buf_size);
> > > >
> > > > As parse_object_buffer() does internally: the buffer of commit/tree objects
> > > > needs to be copied, but blob/tag not. You said that the number of commits
> > > > is generally significantly greater than the others. It seems that we cannot
> > > > make full use of this idea. But remember the "skip_parse_buffer" before?
> > > > If we skip the parse_object_buffer(), this buffer is also "!eaten".
> > > >
> > > > In other words, those default paths of git cat-file --batch are included in it!
> > > > So from the perspective of performance regression, this patch will be
> > > > beneficial.
> > >
> > > Yeah, it seems that we can benefit from something like this, but it'd
> >
> > Only when the data allocated to us is dynamically allocated and we do have
> > the ownership of it, we can benefit by reducing copies and allocate
> > memory again.
> >
> > > be nice if things were clarified a bit more.
> >
> > OK. In get_object(), eaten means that the oi->content we obtained is cached
> > by git in parse_object_buffer(). This means that we don't need to free this buf
> > manually. In 28511adfa5 ([GSOC] ref-filter: skip parse_object_buffer()), we skip
> > parse_buffer() when using some atoms, which can avoid some meaningless parsing.
> > So under this path, our'buf' is not cached. This means we have
> > ownership of it, so we
> > can free it freely, and benefit from this.
>
> In the patch we are not freeing it, we only duplicate it when it's
> eaten. So I am not sure that the above explanation are clear enough.
> That's the issue I have with this patch, but maybe the commit message
> can now be discussed and improved after sending the patch series to
> the mailing list instead of in this thread.
>
My thoughts are:
1. those 'buf' which have been eaten, because we don't know when they
will be freed, if we
use it in append_atom(), it may be a little unsafe if it have been
freed, but it is safe to use after
copying it;
2. those uneaten 'buf', it can be used safely.
Maybe we can guarantee that the buf have not been freed when append_atom()?
I don't have a good grasp.
> Thanks!
Thanks.
--
ZheNing Hu
prev parent reply other threads:[~2021-08-25 8:11 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-08-16 14:00 [GSOC] [QUESTION] ref-filter: can %(raw) implement reuse oi.content? ZheNing Hu
2021-08-17 14:34 ` Fwd: " ZheNing Hu
2021-08-17 16:09 ` Christian Couder
2021-08-18 4:51 ` ZheNing Hu
2021-08-18 8:53 ` Christian Couder
2021-08-18 9:07 ` ZheNing Hu
2021-08-18 11:11 ` ZheNing Hu
2021-08-19 1:39 ` ZheNing Hu
2021-08-20 16:13 ` Christian Couder
2021-08-21 2:36 ` ZheNing Hu
2021-08-20 15:58 ` Christian Couder
2021-08-21 2:16 ` ZheNing Hu
2021-08-24 7:11 ` Christian Couder
2021-08-25 8:11 ` ZheNing Hu [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAOLTT8S-bbnb8BecJRzLQytEEnZKU_VwRwiF1tk+gy0yrjTOog@mail.gmail.com \
--to=adlternative@gmail.com \
--cc=christian.couder@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=hariom18599@gmail.com \
--cc=sbeller@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).