git.vger.kernel.org archive mirror
* [GSOC] [PROPOSAL v2] Draft of proposal for "Unify ref-filter formats with other pretty formats"
@ 2023-03-31  3:22 Zhang Yi
  2023-04-01  9:04 ` Christian Couder
  0 siblings, 1 reply; 4+ messages in thread
From: Zhang Yi @ 2023-03-31  3:22 UTC (permalink / raw)
  To: Zhang Yi, git, christian.couder, hariom18599

I have changed my proposal according to the comments by Hariom Verma.

Improvement vs v1:
1. Put more effort into related work and grasp a lot from them.
2. More details about timeline.
3. More details about my plan.
4. Some tiny changes in other content.

I am open to more guidance. Thanks for the suggestions.


* Unify ref-filter formats with other pretty formats

* Personal Information

Full name: Zhang Yi

E-mail: 18994118902@163.com
Tel: (+86)18994118902

Education: Wuhan University of Technology (China)
Major: Computer engineering 
Year: First-year postgraduate student

Github: https://github.com/zhanyi22333

*  Synopsis

** Motivation

Git has several different implementations for formatting command output, which
causes confusion and hinders improvements in code quality.

To unify the implementations that format output for different commands, we
want to move the pretty formats onto the ref-filter formatting logic. In the
current situation, this means I need to add more ref-filter atoms so that they
can replace the pretty placeholders.

** Previous Work

  - `git for-each-ref`, `git branch` and `git tag` formats into the
ref-filter formats:

done by Karthik Nayak (GSoC 2015)

  -  `git cat-file` formats and the ref-filter formats:

started by Olga Telezhnaya (Outreachy 2017-2018),
continued by ZheNing Hu (GSoC 2021),
    There are a lot of patches, which are summarized in his final blog post [1]
but still not finished due to tricky performance issues

  - ref-filter formats and pretty formats:

started by Hariom Verma (GSoC 2020)
    There are also a lot of patches, which are summarized in his final blog post [2]
continued a bit by Jaydeep Das (GSoC 2022)
    Patch: gpg-interface: add function for converting trust level to string [3]
and continued by Nsengiyumva Wilberforce and his  work on the "signature" atoms
should be mostly over when the GSoC starts. (Outreachy 2022-2023)
    Patch: ref-filter: add new atom "signature" atom [4]

ps: There seem to be no summary articles for Karthik Nayak's and Olga
Telezhnaya's work.

** What is left

Since the work on the "signature" atoms will be finished by Nsengiyumva
Wilberforce, there may be some other atoms left for ref-filter formats and
pretty formats. But I still need to check.

If there is no work left for for ref-filter formats and pretty formats, then
there may be another command which has a different format implement with
ref-filter.

** Steps

In my mind, there are 4 steps logically:
1. Check and find a pretty atom which has no substitute in ref-filter.
   This step is to decide the whole direction of the next work.
   Christian Couder informed me that I can do things like the following:
   - making sure that all the atoms in the pretty formats have similar
   atoms implemented in the ref-filter formats
   - find a way to convert any string containing pretty format atoms to
   a string containing only ref-filter format atoms
   - find a way to plug-in the ref-filter code into the pretty code, so
   that callers of the pretty code would not need to be changed much.
2. Add reasonable test scripts and maybe documentation in advance.
   In my opinion, drafting test scripts and documentation in advance can help
   me understand deeply the behavior that I need to implement. I learned this
   development approach from a book. I have also run into problems caused by
   misunderstanding the required behavior, which resulted in a lot of rework.
3. Change code.
   Inspired by Hariom Verma's proposal, I can start by looking at what
   actually needs to be replaced (for example, by studying the PRETTY FORMATS
   section in 'man git-log' and checking which verbs can be used in the
   ref-filter ('man git-for-each-ref') to achieve the same thing). Then I can
   research how one format is implemented in 'pretty.c', and see how a similar
   thing is implemented with the ref-filter in 'ref-filter.c'.
4. Recheck documents and run test scripts.
   Necessary step to check the behavior of code.
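
The second item of step 1 (converting a string of pretty placeholders into a
string of ref-filter atoms) can be sketched roughly as below. This is only an
illustration, not how Git does or should implement it: the helper name
pretty_to_ref_filter is hypothetical, only a handful of placeholders are
covered, and each pair is assumed to be roughly equivalent based on the
PRETTY FORMATS section of 'man git-log' and on 'man git-for-each-ref'.

```shell
#!/bin/sh
# Hypothetical helper (not part of Git): rewrite a few pretty-format
# placeholders into the ref-filter atoms assumed to be their counterparts.
pretty_to_ref_filter () {
	printf '%s\n' "$1" | sed \
		-e 's/%H/%(objectname)/g' \
		-e 's/%h/%(objectname:short)/g' \
		-e 's/%T/%(tree)/g' \
		-e 's/%P/%(parent)/g' \
		-e 's/%an/%(authorname)/g' \
		-e 's/%cn/%(committername)/g' \
		-e 's/%s/%(contents:subject)/g'
}

# Example: a format one might pass to `git log --pretty=...`
pretty_to_ref_filter '%h %an: %s'
```

A plain textual substitution like this ignores "%%" and placeholders that take
arguments, so a real conversion would need a proper parser. The sketch only
shows the shape of the mapping that step 1 has to establish.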


* Benefits to Community

I'm willing to stay around after the project. By that time, I will be in my
second year, without classes. My tutor is also open-minded about my request to
get involved in an open source project. Considering these subjective and
objective conditions, I think there is a high possibility that I will stay
around.

Particularly, I wish to be a co-mentor if I have the ability. There may be some
difficulties. But what I learn from my finite experience is that you should not
refuse something positive just because of the difficulties in the mind. A
fresh new job may be difficult, but it can show me the possibilities of the
world, which means changing my mind.

What's more, I tried to persuade a schoolmate, who I think is kind of obsessed
with technology, to take part in an open source community for both self-growth
and companionship. I failed, because he thinks it is hard. It's always hard to
change others' deep-rooted ideas with words, but I think actions speak louder
than words. Maybe after the project, I can change the minds of people around me
about joining an open source community. There may be no visible benefit to the
Git community, but it should be beneficial to the open source community as a
whole.

* Microproject

t9700: modernize test scripts [5]

The microproject patches have been merged. The merge info is as below:

commit 8760a2b3c63478e8766b7ff45d798bd1be47f52d
Merge: a2d2b5229e 509d3f5103
Author: Junio C Hamano <gitster@pobox.com>
Date:   Tue Feb 28 16:38:47 2023 -0800

    Merge branch 'zy/t9700-style'

    Test style fixes.

    * zy/t9700-style:
      t9700: modernize test scripts

* Plan

** Timeline and deliverables

The official GSoC coding period runs from 05-29 to 08-28, which is 13 weeks.
The period from 06-05 to 06-30 is near the end of the semester, and I have many
classes, so I expect to be less productive during this period.
I think the schedule is a bit tight if I follow the official timeline, so it
seems necessary to do some work in advance.

1. preparatory work:
 Period:
  04-01 ~ 05-28
  about 8 weeks
 Tasks:
  1. Decide which parts need work and which have priority.
  2. Read Hariom's blogs.
  3. Try to understand the formatting logic behind pretty and ref-filter.
  (Maybe try gdb?)
  4. Try to make some trial changes.

2. Write draft of documents and test scripts.
 Period:
  05-29 ~ 06-02
  week 1
 Tasks:
  Based on the preparatory work, write drafts of doc and test.
 Deliverables:
  Drafts of documents and test scripts

3. Inactive Period
 Period:
  06-05 ~ 06-30
  week 2~5
  4 weeks
 Tasks:
  1. Build the base for the other work, such as atoms.
  2. Should pass some special tests.
 Deliverables:
  A new atom

4. Active code period 1
 Period:
  07-03 ~ 07-07
  week 6
 Tasks:
  1. Add a new argument and grab functions for the atoms
  2. These need to pass the tests and be consistent with the documentation
 Deliverables:
  A new argument and its grab function

5. Midterm evaluation
 Period:
  07-10 ~ 07-14
  week 7
 Tasks:
  1. Submitting midterm evaluations
  2. Maybe need to continue the work left from last week
 Deliverables:
  midterm evaluation

6. Active code period 2
 Period:
  07-17 ~ 08-04
  week 8~10
  3 weeks
 Tasks:
  1. Add 2~3 new arguments
  2. These also need to pass the tests and be consistent with the documentation.
  3. Drafts of documents and test scripts should be updated.
 Deliverables:
  1. New arguments
  2. Documents
  3. test scripts

7. Finishing touches
 Period:
  08-07 ~ 08-26
  week 11~13
  3 weeks
 Tasks:
  1. There should be some bugs to fix or work left.
  2. This period is also left for unexpected events.
  3. Submit final work product and final mentor evaluation.
 Deliverables:
  1. final work product
  2. final mentor evaluation


* Grasp from related work
** From Hariom Verma's blog
Walking through the blogs of Hariom Verma, I find many things useful.

*** Debugging

An extremely informative (step-by-step) debugging guide by Christian. [6]

*** 11 questions for understanding someone's work. [7]

1. What was the goal of each patch?
2. Which approach did she take to achieve the goal?
3. What were the goals of the patch series?
4. Which approach did she take to achieve the goals?
5. What was the goal of her previous patch series?
6. What was the general direction her patch series were going?
7. Why did she take that direction?
8. Are there ways to continue in the same direction?
9. Are there ways to achieve similar goals?
10. How were her goals similar to and different from the goals in my proposal?
11. Is it possible to use the same approach?

*** Else

There are many details about his work progress. I can refer to them when I am in
similar situations.

** From ZheNing Hu's blog

*** Time analyzing

Use performance testing tools to analyze the time-consuming steps of
`git cat-file --batch`.

Using Google's `gperftools`:
1. Add the link parameter `-lprofiler` in `config.mak`: `CFLAGS += -lprofiler`.
2. Run `make`.
3. Run `CPUPROFILE=/tmp/prof.out /<path>/git cat-file --batch-check
--batch-all-objects` to run git and generate `prof.out`, which contains the
results of the performance analysis.
4. Use `pprof --text /<path>/git /tmp/prof.out` to display the result
in the terminal.

*** About Github CI

"GitHub-Travis CI hints" in Documentation/SubmittingPatches

*** Else

He also describes his debugging and optimization process in detail. It is worth
digging into when I need it.

This proposal draft benefits a lot from the work of my predecessors. Thanks.

* Biographical information

It is always funny to recall that I first learned about Linux in a simulated
hacker game in my freshman year in college. After that, I tried to teach myself
Linux and started to learn about open source projects. I overcame many
difficulties and finally came to know a little about Linux. As a side effect, I
became more enthusiastic about and better at programming than my schoolmates.
But then a period of stagnation came: I began to write meaningless projects
for school tasks and repeated myself without progress. On the bright side, I
came across excellent open source software during that time, such as vim,
emacs, visual studio code, Qt, VLC and, of course, git. Near the end of my
junior year, I read an article by a geek in the emacs community about learning
by contributing to an open source project. Almost at the same time, I learned
about GSoC and wanted to take part in git. But it was near the start date of my
plan for the postgraduate qualifying examination, so I postponed the GSoC
plans. Luckily, I passed the examination. After I got used to life as a
postgraduate student, I felt the motivation to make progress again. Then I
tried to contribute to git. I have now finished a microproject, which seems
trivial, but it really gave me a deeper understanding of open source and free
software and more motivation to contribute. I hope I can stay here a long time
before getting involved with other interesting projects, since quality is more
important than quantity. I know it seems a bit stubborn to believe that
contributing will lead to progress, a belief that is also influenced by my
learning attitude. But without action, I cannot verify the belief. So at least
I will try to contribute for one year. After that, I hope I will have a better
understanding.

Sorry, the above text may be messy. In short, I will try to contribute to
git for at least one year.

* Closing remarks

It seems blogs will help a lot with later work. I think it is worth rebuilding
my blog site on GitHub.

Thanks to Christian Couder and Hariom Verma for their help.


[1] https://public-inbox.org/git/CAOLTT8SxHuH2EbiSwQX6pyJJs5KyVuKx6ZOPxpzWLH+Tbz5F+A@mail.gmail.com/
[2] https://harry-hov.github.io/blogs/posts/the-final-report
[3] https://public-inbox.org/git/pull.1281.git.1657202265048.gitgitgadget@gmail.com/
[4] https://public-inbox.org/git/pull.1452.git.1672102523902.gitgitgadget@gmail.com/#t
[5] https://lore.kernel.org/git/20230222040745.1511205-1-18994118902@163.com/
[6] https://public-inbox.org/git/CAP8UFD3Bd4Af1XZ00VyuHnQs=MFrdUufKeePO1tyedWoReRjwQ@mail.gmail.com/
[7] https://harry-hov.github.io/blogs/posts/week1-the-ten-questions


* Re: [GSOC] [PROPOSAL v2] Draft of proposal for "Unify ref-filter formats with other pretty formats"
  2023-03-31  3:22 [GSOC] [PROPOSAL v2] Draft of proposal for "Unify ref-filter formats with other pretty formats" Zhang Yi
@ 2023-04-01  9:04 ` Christian Couder
  2023-04-02 14:38   ` ZhangYI
  0 siblings, 1 reply; 4+ messages in thread
From: Christian Couder @ 2023-04-01  9:04 UTC (permalink / raw)
  To: Zhang Yi; +Cc: git, hariom18599, karthik nayak

On Fri, Mar 31, 2023 at 5:22 AM Zhang Yi <18994118902@163.com> wrote:
>
> I have changed my proposal according to the comments by Hariom Verma.
>
> Improvement vs v1:
> 1. Put more effort into related work and grasp a lot from them.
> 2. More details about timeline.
> 3. More details about my plan.
> 4. Some tiny changes in other content.
>
> Open to more guidances. Thanks for suggestions.

Thanks for improving your proposal based on our feedback!

[...]

> Aim to unify the different implementations to format output for different
> commands, we want to transform pretty into ref-filter formatting logic. According
> to the present situation, I need to add more ref-filter atoms to replace
> pretty.

Could you explain a bit more what that means and why you need to do
that? (You might already do that in a different section below, but it
still feels a bit strange to see this last sentence without much
explanation.)

> ** Previous Work
>
>   - `git for-each-ref`, `git branch` and `git tag` formats into the
> ref-filter formats:
>
> done by Karthik Nayak (GSoC 2015)
>
>   -  `git cat-file` formats and the ref-filter formats:
>
> started by Olga Telezhnaya (Outreachy 2017-2018),
> continued by ZheNing Hu (GSoC 2021),
>     There are a lot of patches which are concluded in his final blog [1]
> but still not finished due to tricky performance issues
>
>   - ref-filter formats and pretty formats:
>
> started by Hariom Verma (GSoC 2020)
>     There are also a lot of patches which are concluded in his final blog [2]
> continued a bit by Jaydeep Das (GSoC 2022)
>     Patch: gpg-interface: add function for converting trust level to string [3]
> and continued by Nsengiyumva Wilberforce and his  work on the "signature" atoms
> should be mostly over when the GSoC starts. (Outreachy 2022-2023)

Yeah except Wilberforce has actually been working outside of Outreachy
as he didn't satisfy the requirements for being accepted, but still
wanted to work on this.

>     Patch: ref-filter: add new atom "signature" atom [4]
>
> ps: There seems no conclusion articles of Karthik Nayak's and Olga Telezhnava's
> works.

Karthik's blog posts might have disappeared for some reason. I have
Cc-ed him and he might tell us.

Olga's blog posts seem to still be available on
https://medium.com/@olyatelezhnaya. Medium seems to require signing in
these days though.

> ** What is left
>
> Since the work of "signature" atoms will be finished by Nsengiyumva Wilberforce,
> There may be some other atoms left for ref-filter formats and pretty formats.
> But I still need to check.
>
> If there is no work left for for ref-filter formats and pretty formats, then

s/for for/for/

> there may be another command which has a different format implement with

Maybe: s/implement/implemented/

> ref-filter.

I am not sure what your last sentence here means. If a command already
uses ref-filter formats, then there is no more work to do as that's
the end state we would like.

> ** Steps
>
> In my mind, there are 4 steps logically:
> 1. Check and find a pretty atom which has no substitute in ref-filter.
>    This step is to decide the whole direction of the next work.

So you might want to take a look at this step soon. It might not be
difficult to find out, as the implemented atoms are described in the
docs.

They are called "field names" in the git for-each-ref documentation
(Documentation/git-for-each-ref.txt) and "placeholders" in the pretty
formats documentation (Documentation/pretty-formats.txt).

>    Christian Couder informed me that I can do things like the following:
>    - making sure that all the atoms in the pretty formats have similar
>    atoms implemented in the ref-filter formats
>    - find a way to convert any string containing pretty format atoms to
>    a string containing only ref-filter format atoms
>    - find a way to plug-in the ref-filter code into the pretty code, so
>    that callers of the pretty code would not need to be changed much.

Yeah I suggested these as possible steps to split the work, hoping
that you would dig a bit more what they meant, and how you could
perform them.

> 2. Add reasonable test scripts and maybe documents in advance.
>    In my opinion, making a draft of test scripts and documents in advance can
>    help me have a deep understanding of the behavior that I need to code. I learn
>    this development mode from book. And I have really met problems rising from
>    the misunderstanding of needed behavior which will result in a lot of reworks.

I agree that it's a good approach when developing new features. Here
the features and associated high level tests and documentation already
exist. We "just" want to replace the internal implementation of the
pretty formats using the ref-filter formats. So the approach could be
a bit different.

> 3. Change code.
>    Inspired by Hariom Verma's proposal, I can  start by first looking at what
>    actually needed to be replaced (for example by studying the PRETTY FORMATS
>    section in 'man git-log', what which verbs you can use in the ref-filter
>    ('man git-for-each-ref') to achieve the same thing.

Yeah and this shouldn't take a lot of time. I think Hariom already
wrote a correspondence table between the different "verbs" (also
called "atoms", "placeholders" or "field names") in the pretty and
ref-filter formats.

> Then I can research how
>    one format is implemented in 'pretty.c', and see how a similar thing using
>    the ref-filter is implemented in 'ref-filter.c'.

What will you learn from that and how will it help you for the next steps?

You call this section "Change code" but it looks like it's only about
researching things.

> 4. Recheck documents and run test scripts.
>    Necessary step to check the behavior of code.

We ask for tests and documentation to be part of the patches that are
sent, so writing documentation and tests and running tests should be
part of each coding step.

> * Benefits to Community
>
> I'm willing to stay around after the project. By that time, I will be in my
> second year without classes. And my tutor has an open mind about my request to
> involve in an open source project by now. Considering the subjective and
> objective conditions, I think there is a high possibility that I will stay
> around.
>
> Particularly, I wish to be a co-mentor if I have the ability. There may be some
> difficulties. But what I learn from my finite experience is that you should not
> refuse something positive just because of the difficulties in the mind. A
> fresh new job may be difficult, but it can show me the possibilities of the
> world, which means changing my mind.

Great!

[...]

> * Microproject
>
> t9700: modernize test scripts [5]
>
> The microproject patches have been merged. The merge info is as below:
>
> commit 8760a2b3c63478e8766b7ff45d798bd1be47f52d
> Merge: a2d2b5229e 509d3f5103
> Author: Junio C Hamano <gitster@pobox.com>
> Date:   Tue Feb 28 16:38:47 2023 -0800
>
>     Merge branch 'zy/t9700-style'
>
>     Test style fixes.
>
>     * zy/t9700-style:
>       t9700: modernize test scripts

Thanks for your work on that!

> * Plan

It's difficult to understand how this section is different from the
"Steps" section above. Maybe these two sections could be merged.

> ** Timeline and deliverables
>
> The official GSOC code time start from 05-29 to 08-28, which is 13 weeks.
> The period from 06-05 to 06~30 is near the end of the semester. There are many
> classes for me. So I guess I may be not productive during this period.

Thanks for telling us about this in advance!

> I think it is a bit time-limited if I follow the official timeline. It seems
> necessary to do some work in advance.

[...]

> 2. Write draft of documents and test scripts.
>  Period:
>   05-29 ~ 06-02
>   week 1
>  Tasks:
>   Based on the preparatory work, write drafts of doc and test.
>  Deliverables:
>   Drafts of documents and test scripts

See what I said above about the fact that a big part of this project
might not be about developing new features.

[...]

Best,
Christian.


* Re:  [GSOC] [PROPOSAL v2] Draft of proposal for "Unify ref-filter formats with other pretty formats"
  2023-04-01  9:04 ` Christian Couder
@ 2023-04-02 14:38   ` ZhangYI
  2023-04-02 19:40     ` Christian Couder
  0 siblings, 1 reply; 4+ messages in thread
From: ZhangYI @ 2023-04-02 14:38 UTC (permalink / raw)
  To: Christian Couder, git, hariom18599

Thanks for Christian Couder's constructive comments.
I've looked through Olga Telezhnaya's detailed and helpful blogs.
I also tried to understand more about the work of the project today.

I have one question here:
I used gdb to trace the function calls related to ref-filter for the command
"git log -2 --pretty=%h " by setting breakpoints on all non-static functions in
ref-filter.c, but none of them was hit.
Should I use another command?
Or, as far as I know, Git uses different branches for different purposes, like
todo and next.
Should I use another branch?


* Re: [GSOC] [PROPOSAL v2] Draft of proposal for "Unify ref-filter formats with other pretty formats"
  2023-04-02 14:38   ` ZhangYI
@ 2023-04-02 19:40     ` Christian Couder
  0 siblings, 0 replies; 4+ messages in thread
From: Christian Couder @ 2023-04-02 19:40 UTC (permalink / raw)
  To: ZhangYI; +Cc: git, hariom18599

On Sun, Apr 2, 2023 at 4:38 PM ZhangYI <18994118902@163.com> wrote:

> I have one questions here:
> I used gdb to track the function call related to ref-filter of the command
> "git log -2 --pretty=%h " by setting breaks on all no-static functions in
> ref-filter.c but found no stop.
> Should I use another command?

`git log` uses pretty formats. If you want to see how ref-filter
formats work you should run for example `git for-each-ref`.

> Or as I know, Git use different branch for different purpose, like todo, next.
> Should I use another branch?

No this is not an issue with development branches.

