* [PATCH v3] describe: refresh the index when 'broken' flag is used @ 2024-06-25 13:35 Abhijeet Sonar 2024-06-25 15:59 ` Junio C Hamano 0 siblings, 1 reply; 32+ messages in thread From: Abhijeet Sonar @ 2024-06-25 13:35 UTC (permalink / raw) To: git Cc: karthik.188, Abhijeet Sonar, Paul Millar, Junio C Hamano, Phillip Wood, Elijah Newren, Jeff King When describe is run with 'dirty' flag, we refresh the index to make sure it is in sync with the filesystem before determining if the working tree is dirty. However, this is not done for the codepath where the 'broken' flag is used. This causes `git describe --broken --dirty` to false positively report the worktree being dirty if a file has different stat info than what is recorded in the index. Running `git update-index -q --refresh` to refresh the index before running diff-index fixes the problem. Also add tests to deliberately update stat info of a file before running describe to verify it behaves correctly. Reported-by: Paul Millar <paul.millar@desy.de> Suggested-by: Junio C Hamano <gitster@pobox.com> Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Phillip Wood <phillip.wood123@gmail.com> Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com> --- builtin/describe.c | 11 +++++++++++ t/t6120-describe.sh | 24 ++++++++++++++++++++++++ 2 files changed, 35 insertions(+) diff --git a/builtin/describe.c b/builtin/describe.c index e5287eddf2..deec19b29a 100644 --- a/builtin/describe.c +++ b/builtin/describe.c @@ -53,6 +53,10 @@ static const char *diff_index_args[] = { "diff-index", "--quiet", "HEAD", "--", NULL }; +static const char *update_index_args[] = { + "update-index", "--unmerged", "-q", "--refresh", NULL +}; + struct commit_name { struct hashmap_entry entry; struct object_id peeled; @@ -645,6 +649,13 @@ int cmd_describe(int argc, const char **argv, const char *prefix) if (argc == 0) { if (broken) { struct child_process cp = CHILD_PROCESS_INIT; + strvec_pushv(&cp.args, update_index_args); + cp.git_cmd = 1; + cp.no_stdin = 1; + cp.no_stdout = 1; + run_command(&cp); + strvec_clear(&cp.args); + strvec_pushv(&cp.args, diff_index_args); cp.git_cmd = 1; cp.no_stdin = 1; diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh index e78315d23d..6c396e7abc 100755 --- a/t/t6120-describe.sh +++ b/t/t6120-describe.sh @@ -671,4 +671,28 @@ test_expect_success 'setup misleading taggerdates' ' check_describe newer-tag-older-commit~1 --contains unique-file~2 +test_expect_success 'describe --dirty with a file with changed stat' ' + git init stat-dirty && + ( + cd stat-dirty && + + echo A >file && + git add file && + git commit -m A && + git tag A -a -m A && + + cat file >file.new && + mv file.new file && + git describe --dirty >actual && + echo "A" >expected && + test_cmp expected actual && + + cat file >file.new && + mv file.new file && + git describe --dirty --broken >actual && + echo "A" >expected && + test_cmp expected actual + ) +' + test_done Range-diff against v2: 1: d60fc0fa02 ! 1: 1da5fa48d9 describe: refresh the index when 'broken' flag is used @@ Commit message Reported-by: Paul Millar <paul.millar@desy.de> Suggested-by: Junio C Hamano <gitster@pobox.com> Helped-by: Junio C Hamano <gitster@pobox.com> + Helped-by: Phillip Wood <phillip.wood123@gmail.com> Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com> ## builtin/describe.c ## @@ builtin/describe.c: int cmd_describe(int argc, const char **argv, const char *pr if (argc == 0) { if (broken) { struct child_process cp = CHILD_PROCESS_INIT; -+ struct child_process update_index_cp = CHILD_PROCESS_INIT; -+ -+ strvec_pushv(&update_index_cp.args, update_index_args); -+ update_index_cp.git_cmd = 1; -+ update_index_cp.no_stdin = 1; -+ update_index_cp.no_stdout = 1; -+ run_command(&update_index_cp); ++ strvec_pushv(&cp.args, update_index_args); ++ cp.git_cmd = 1; ++ cp.no_stdin = 1; ++ cp.no_stdout = 1; ++ run_command(&cp); ++ strvec_clear(&cp.args); + strvec_pushv(&cp.args, diff_index_args); cp.git_cmd = 1; @@ t/t6120-describe.sh: test_expect_success 'setup misleading taggerdates' ' +test_expect_success 'describe --dirty with a file with changed stat' ' + git init stat-dirty && -+ cd stat-dirty && -+ -+ echo A >file && -+ git add file && -+ git commit -m A && -+ git tag A -a -m A && -+ -+ cat file >file.new && -+ mv file.new file && -+ git describe --dirty >actual && -+ echo "A" >expected && -+ test_cmp expected actual -+' ++ ( ++ cd stat-dirty && + -+test_expect_success 'describe --dirty --broken with a file with changed stat' ' -+ git init stat-dirty-broken && -+ cd stat-dirty-broken && ++ echo A >file && ++ git add file && ++ git commit -m A && ++ git tag A -a -m A && + -+ echo A >file && -+ git add file && -+ git commit -m A && -+ git tag A -a -m A && ++ cat file >file.new && ++ mv file.new file && ++ git describe --dirty >actual && ++ echo "A" >expected && ++ test_cmp expected actual && + -+ cat file >file.new && -+ mv file.new file && -+ git describe --dirty --broken >actual && -+ echo "A" >expected && -+ test_cmp expected actual ++ cat file >file.new && ++ mv file.new file && ++ git describe --dirty --broken >actual && ++ echo "A" >expected && ++ test_cmp expected actual ++ ) +' + test_done -- 2.45.2.606.g9005149a4a.dirty ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH v3] describe: refresh the index when 'broken' flag is used 2024-06-25 13:35 [PATCH v3] describe: refresh the index when 'broken' flag is used Abhijeet Sonar @ 2024-06-25 15:59 ` Junio C Hamano 2024-06-25 16:05 ` Junio C Hamano ` (3 more replies) 0 siblings, 4 replies; 32+ messages in thread From: Junio C Hamano @ 2024-06-25 15:59 UTC (permalink / raw) To: Abhijeet Sonar Cc: git, karthik.188, Paul Millar, Phillip Wood, Elijah Newren, Jeff King Abhijeet Sonar <abhijeet.nkt@gmail.com> writes: > if (broken) { > struct child_process cp = CHILD_PROCESS_INIT; > + strvec_pushv(&cp.args, update_index_args); > + cp.git_cmd = 1; > + cp.no_stdin = 1; > + cp.no_stdout = 1; > + run_command(&cp); > + strvec_clear(&cp.args); Why clear .args here? Either "struct child_process" is reusable after finish_command() that is called as the last step of run_command() returns successfully, or it shouldn't be reused at all. And when finish_command() is called, .args as well as .env are cleared because it calls child_process_clear(). I am wondering if the last part need to be more like ... cp.no_stdout = 1; if (run_command(&cp)) child_process_clear(&cp); > + > strvec_pushv(&cp.args, diff_index_args); > cp.git_cmd = 1; > cp.no_stdin = 1; Thanks. (#leftoverbit) Outside the scope of this patch, I'd prefer to see somebody makes sure that it is truly equivalent to prepare a separate and new struct child_process for each run_command() call and to reuse the same struct child_process after calling child_process_clear() each time. It is unclear if they are equivalent in general, even though in this particular case I think we should be OK. There _might_ be other things in the child_process structure that need to be reset to the initial state before it can be reused, but are not cleared by child_process_clear(). .git_cmd and other flags as well as in/out/err file descriptors do not seem to be cleared, and other callers of run_command() may even be depending on the current behaviour that they are kept. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v3] describe: refresh the index when 'broken' flag is used 2024-06-25 15:59 ` Junio C Hamano @ 2024-06-25 16:05 ` Junio C Hamano 2024-06-26 11:16 ` Karthik Nayak 2024-06-26 6:11 ` Abhijeet Sonar ` (2 subsequent siblings) 3 siblings, 1 reply; 32+ messages in thread From: Junio C Hamano @ 2024-06-25 16:05 UTC (permalink / raw) To: Karthik Nayak Cc: Abhijeet Sonar, git, karthik.188, Paul Millar, Phillip Wood, Elijah Newren, Jeff King Junio C Hamano <gitster@pobox.com> writes: > (#leftoverbit) > > Outside the scope of this patch, I'd prefer to see somebody makes > sure that it is truly equivalent to prepare a separate and new > struct child_process for each run_command() call and to reuse the > same struct child_process after calling child_process_clear() each > time. It is unclear if they are equivalent in general, even though > in this particular case I think we should be OK. > > There _might_ be other things in the child_process structure that > need to be reset to the initial state before it can be reused, but > are not cleared by child_process_clear(). .git_cmd and other flags > as well as in/out/err file descriptors do not seem to be cleared, > and other callers of run_command() may even be depending on the > current behaviour that they are kept. Ahh, the reuse of the same struct came directly from Karthik's review on the second iteration. I guess Karthik volunteered himself into this #leftoverbit task? I am not convinced that (1) the selective clearing done by current child_process_clear() is the best thing we can do to make child_process reusable, and (2) among the current callers, there is nobody that depends on the state left by the previous use of child_process in another run_command() call that is left uncleared by child_process_clear(). If (1) is false, then reusing child_process structure is not quite safe, and if (2) is false, updating child_process_clear() to really clear everything will first need to adjust some callers. Thanks. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v3] describe: refresh the index when 'broken' flag is used 2024-06-25 16:05 ` Junio C Hamano @ 2024-06-26 11:16 ` Karthik Nayak 0 siblings, 0 replies; 32+ messages in thread From: Karthik Nayak @ 2024-06-26 11:16 UTC (permalink / raw) To: Junio C Hamano Cc: Abhijeet Sonar, git, Paul Millar, Phillip Wood, Elijah Newren, Jeff King [-- Attachment #1: Type: text/plain, Size: 1858 bytes --] Junio C Hamano <gitster@pobox.com> writes: > Junio C Hamano <gitster@pobox.com> writes: > >> (#leftoverbit) >> >> Outside the scope of this patch, I'd prefer to see somebody makes >> sure that it is truly equivalent to prepare a separate and new >> struct child_process for each run_command() call and to reuse the >> same struct child_process after calling child_process_clear() each >> time. It is unclear if they are equivalent in general, even though >> in this particular case I think we should be OK. >> >> There _might_ be other things in the child_process structure that >> need to be reset to the initial state before it can be reused, but >> are not cleared by child_process_clear(). .git_cmd and other flags >> as well as in/out/err file descriptors do not seem to be cleared, >> and other callers of run_command() may even be depending on the >> current behaviour that they are kept. > > Ahh, the reuse of the same struct came directly from Karthik's > review on the second iteration. I guess Karthik volunteered himself > into this #leftoverbit task? I am not convinced that > Hehe. I'll take it up! > (1) the selective clearing done by current child_process_clear() is > the best thing we can do to make child_process reusable, and > > (2) among the current callers, there is nobody that depends on the > state left by the previous use of child_process in another > run_command() call that is left uncleared by child_process_clear(). > > If (1) is false, then reusing child_process structure is not quite > safe, and if (2) is false, updating child_process_clear() to really > clear everything will first need to adjust some callers. > > Thanks. > I think it would be best to write some unit tests to capture the current behavior and based on the findings and as you suggested, we can decide the path forward. Karthik [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 690 bytes --] ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v3] describe: refresh the index when 'broken' flag is used 2024-06-25 15:59 ` Junio C Hamano 2024-06-25 16:05 ` Junio C Hamano @ 2024-06-26 6:11 ` Abhijeet Sonar 2024-06-26 6:37 ` [PATCH v4] " Abhijeet Sonar 2024-06-26 6:52 ` [PATCH v5] " Abhijeet Sonar 3 siblings, 0 replies; 32+ messages in thread From: Abhijeet Sonar @ 2024-06-26 6:11 UTC (permalink / raw) To: Junio C Hamano Cc: git, karthik.188, Paul Millar, Phillip Wood, Elijah Newren, Jeff King On 25/06/24 21:29, Junio C Hamano wrote: > Why clear .args here? I overlooked the fact that run_command already clears .args if the command exits with a zero exit code. Will fix, thank you. > I am wondering if the last part need to be more like > > ... > cp.no_stdout = 1; > if (run_command(&cp)) > child_process_clear(&cp); > That makes sense to me, will do. Thanks. ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH v4] describe: refresh the index when 'broken' flag is used 2024-06-25 15:59 ` Junio C Hamano 2024-06-25 16:05 ` Junio C Hamano 2024-06-26 6:11 ` Abhijeet Sonar @ 2024-06-26 6:37 ` Abhijeet Sonar 2024-06-26 6:50 ` Abhijeet Sonar 2024-06-26 6:52 ` [PATCH v5] " Abhijeet Sonar 3 siblings, 1 reply; 32+ messages in thread From: Abhijeet Sonar @ 2024-06-26 6:37 UTC (permalink / raw) To: git Cc: Abhijeet Sonar, Paul Millar, Junio C Hamano, Phillip Wood, Elijah Newren, Jeff King When describe is run with 'dirty' flag, we refresh the index to make sure it is in sync with the filesystem before determining if the working tree is dirty. However, this is not done for the codepath where the 'broken' flag is used. This causes `git describe --broken --dirty` to false positively report the worktree being dirty if a file has different stat info than what is recorded in the index. Running `git update-index -q --refresh` to refresh the index before running diff-index fixes the problem. Also add tests to deliberately update stat info of a file before running describe to verify it behaves correctly. Reported-by: Paul Millar <paul.millar@desy.de> Suggested-by: Junio C Hamano <gitster@pobox.com> Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Phillip Wood <phillip.wood123@gmail.com> Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com> --- builtin/describe.c | 11 ++ t/t6120-describe.sh | 24 +++ ...esh-the-index-when-broken-flag-is-us.patch | 178 ++++++++++++++++++ 3 files changed, 213 insertions(+) create mode 100644 v3-0001-describe-refresh-the-index-when-broken-flag-is-us.patch diff --git a/builtin/describe.c b/builtin/describe.c index e5287eddf2..7cb9d50b36 100644 --- a/builtin/describe.c +++ b/builtin/describe.c @@ -53,6 +53,10 @@ static const char *diff_index_args[] = { "diff-index", "--quiet", "HEAD", "--", NULL }; +static const char *update_index_args[] = { + "update-index", "--unmerged", "-q", "--refresh", NULL +}; + struct commit_name { struct hashmap_entry entry; struct object_id peeled; @@ -645,6 +649,13 @@ int cmd_describe(int argc, const char **argv, const char *prefix) if (argc == 0) { if (broken) { struct child_process cp = CHILD_PROCESS_INIT; + strvec_pushv(&cp.args, update_index_args); + cp.git_cmd = 1; + cp.no_stdin = 1; + cp.no_stdout = 1; + if (run_command(&cp)) + child_process_clear(&cp); + strvec_pushv(&cp.args, diff_index_args); cp.git_cmd = 1; cp.no_stdin = 1; diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh index e78315d23d..6c396e7abc 100755 --- a/t/t6120-describe.sh +++ b/t/t6120-describe.sh @@ -671,4 +671,28 @@ test_expect_success 'setup misleading taggerdates' ' check_describe newer-tag-older-commit~1 --contains unique-file~2 +test_expect_success 'describe --dirty with a file with changed stat' ' + git init stat-dirty && + ( + cd stat-dirty && + + echo A >file && + git add file && + git commit -m A && + git tag A -a -m A && + + cat file >file.new && + mv file.new file && + git describe --dirty >actual && + echo "A" >expected && + test_cmp expected actual && + + cat file >file.new && + mv file.new file && + git describe --dirty --broken >actual && + echo "A" >expected && + test_cmp expected actual + ) +' + test_done diff --git a/v3-0001-describe-refresh-the-index-when-broken-flag-is-us.patch b/v3-0001-describe-refresh-the-index-when-broken-flag-is-us.patch new file mode 100644 index 0000000000..22e295d5eb --- /dev/null +++ b/v3-0001-describe-refresh-the-index-when-broken-flag-is-us.patch @@ -0,0 +1,178 @@ +From 1da5fa48d913e1cefb2b6e1bc58df565076ee438 Mon Sep 17 00:00:00 2001 +From: Abhijeet Sonar <abhijeet.nkt@gmail.com> +Date: Mon, 24 Jun 2024 02:57:59 +0530 +Subject: [PATCH v3] describe: refresh the index when 'broken' flag is used +To: git@vger.kernel.org + +When describe is run with 'dirty' flag, we refresh the index +to make sure it is in sync with the filesystem before +determining if the working tree is dirty. However, this is +not done for the codepath where the 'broken' flag is used. + +This causes `git describe --broken --dirty` to false +positively report the worktree being dirty if a file has +different stat info than what is recorded in the index. +Running `git update-index -q --refresh` to refresh the index +before running diff-index fixes the problem. + +Also add tests to deliberately update stat info of a +file before running describe to verify it behaves correctly. + +Reported-by: Paul Millar <paul.millar@desy.de> +Suggested-by: Junio C Hamano <gitster@pobox.com> +Helped-by: Junio C Hamano <gitster@pobox.com> +Helped-by: Phillip Wood <phillip.wood123@gmail.com> +Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com> +--- + builtin/describe.c | 11 +++++++++++ + t/t6120-describe.sh | 24 ++++++++++++++++++++++++ + 2 files changed, 35 insertions(+) + +diff --git a/builtin/describe.c b/builtin/describe.c +index e5287eddf2..deec19b29a 100644 +--- a/builtin/describe.c ++++ b/builtin/describe.c +@@ -53,6 +53,10 @@ static const char *diff_index_args[] = { + "diff-index", "--quiet", "HEAD", "--", NULL + }; + ++static const char *update_index_args[] = { ++ "update-index", "--unmerged", "-q", "--refresh", NULL ++}; ++ + struct commit_name { + struct hashmap_entry entry; + struct object_id peeled; +@@ -645,6 +649,13 @@ int cmd_describe(int argc, const char **argv, const char *prefix) + if (argc == 0) { + if (broken) { + struct child_process cp = CHILD_PROCESS_INIT; ++ strvec_pushv(&cp.args, update_index_args); ++ cp.git_cmd = 1; ++ cp.no_stdin = 1; ++ cp.no_stdout = 1; ++ run_command(&cp); ++ strvec_clear(&cp.args); ++ + strvec_pushv(&cp.args, diff_index_args); + cp.git_cmd = 1; + cp.no_stdin = 1; +diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh +index e78315d23d..6c396e7abc 100755 +--- a/t/t6120-describe.sh ++++ b/t/t6120-describe.sh +@@ -671,4 +671,28 @@ test_expect_success 'setup misleading taggerdates' ' + + check_describe newer-tag-older-commit~1 --contains unique-file~2 + ++test_expect_success 'describe --dirty with a file with changed stat' ' ++ git init stat-dirty && ++ ( ++ cd stat-dirty && ++ ++ echo A >file && ++ git add file && ++ git commit -m A && ++ git tag A -a -m A && ++ ++ cat file >file.new && ++ mv file.new file && ++ git describe --dirty >actual && ++ echo "A" >expected && ++ test_cmp expected actual && ++ ++ cat file >file.new && ++ mv file.new file && ++ git describe --dirty --broken >actual && ++ echo "A" >expected && ++ test_cmp expected actual ++ ) ++' ++ + test_done + +Range-diff against v2: +1: d60fc0fa02 ! 1: 1da5fa48d9 describe: refresh the index when 'broken' flag is used + @@ Commit message + Reported-by: Paul Millar <paul.millar@desy.de> + Suggested-by: Junio C Hamano <gitster@pobox.com> + Helped-by: Junio C Hamano <gitster@pobox.com> + + Helped-by: Phillip Wood <phillip.wood123@gmail.com> + Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com> + + ## builtin/describe.c ## + @@ builtin/describe.c: int cmd_describe(int argc, const char **argv, const char *pr + if (argc == 0) { + if (broken) { + struct child_process cp = CHILD_PROCESS_INIT; + -+ struct child_process update_index_cp = CHILD_PROCESS_INIT; + -+ + -+ strvec_pushv(&update_index_cp.args, update_index_args); + -+ update_index_cp.git_cmd = 1; + -+ update_index_cp.no_stdin = 1; + -+ update_index_cp.no_stdout = 1; + -+ run_command(&update_index_cp); + ++ strvec_pushv(&cp.args, update_index_args); + ++ cp.git_cmd = 1; + ++ cp.no_stdin = 1; + ++ cp.no_stdout = 1; + ++ run_command(&cp); + ++ strvec_clear(&cp.args); + + + strvec_pushv(&cp.args, diff_index_args); + cp.git_cmd = 1; + @@ t/t6120-describe.sh: test_expect_success 'setup misleading taggerdates' ' + + +test_expect_success 'describe --dirty with a file with changed stat' ' + + git init stat-dirty && + -+ cd stat-dirty && + -+ + -+ echo A >file && + -+ git add file && + -+ git commit -m A && + -+ git tag A -a -m A && + -+ + -+ cat file >file.new && + -+ mv file.new file && + -+ git describe --dirty >actual && + -+ echo "A" >expected && + -+ test_cmp expected actual + -+' + ++ ( + ++ cd stat-dirty && + + + -+test_expect_success 'describe --dirty --broken with a file with changed stat' ' + -+ git init stat-dirty-broken && + -+ cd stat-dirty-broken && + ++ echo A >file && + ++ git add file && + ++ git commit -m A && + ++ git tag A -a -m A && + + + -+ echo A >file && + -+ git add file && + -+ git commit -m A && + -+ git tag A -a -m A && + ++ cat file >file.new && + ++ mv file.new file && + ++ git describe --dirty >actual && + ++ echo "A" >expected && + ++ test_cmp expected actual && + + + -+ cat file >file.new && + -+ mv file.new file && + -+ git describe --dirty --broken >actual && + -+ echo "A" >expected && + -+ test_cmp expected actual + ++ cat file >file.new && + ++ mv file.new file && + ++ git describe --dirty --broken >actual && + ++ echo "A" >expected && + ++ test_cmp expected actual + ++ ) + +' + + + test_done +-- +2.45.2.606.g9005149a4a.dirty + Range-diff against v3: 1: 1da5fa48d9 ! 1: 9ff85435b1 describe: refresh the index when 'broken' flag is used @@ builtin/describe.c: int cmd_describe(int argc, const char **argv, const char *pr + cp.git_cmd = 1; + cp.no_stdin = 1; + cp.no_stdout = 1; -+ run_command(&cp); -+ strvec_clear(&cp.args); ++ if (run_command(&cp)) ++ child_process_clear(&cp); + strvec_pushv(&cp.args, diff_index_args); cp.git_cmd = 1; @@ t/t6120-describe.sh: test_expect_success 'setup misleading taggerdates' ' +' + test_done + + ## v3-0001-describe-refresh-the-index-when-broken-flag-is-us.patch (new) ## +@@ ++From 1da5fa48d913e1cefb2b6e1bc58df565076ee438 Mon Sep 17 00:00:00 2001 ++From: Abhijeet Sonar <abhijeet.nkt@gmail.com> ++Date: Mon, 24 Jun 2024 02:57:59 +0530 ++Subject: [PATCH v3] describe: refresh the index when 'broken' flag is used ++To: git@vger.kernel.org ++ ++When describe is run with 'dirty' flag, we refresh the index ++to make sure it is in sync with the filesystem before ++determining if the working tree is dirty. However, this is ++not done for the codepath where the 'broken' flag is used. ++ ++This causes `git describe --broken --dirty` to false ++positively report the worktree being dirty if a file has ++different stat info than what is recorded in the index. ++Running `git update-index -q --refresh` to refresh the index ++before running diff-index fixes the problem. ++ ++Also add tests to deliberately update stat info of a ++file before running describe to verify it behaves correctly. ++ ++Reported-by: Paul Millar <paul.millar@desy.de> ++Suggested-by: Junio C Hamano <gitster@pobox.com> ++Helped-by: Junio C Hamano <gitster@pobox.com> ++Helped-by: Phillip Wood <phillip.wood123@gmail.com> ++Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com> ++--- ++ builtin/describe.c | 11 +++++++++++ ++ t/t6120-describe.sh | 24 ++++++++++++++++++++++++ ++ 2 files changed, 35 insertions(+) ++ ++diff --git a/builtin/describe.c b/builtin/describe.c ++index e5287eddf2..deec19b29a 100644 ++--- a/builtin/describe.c +++++ b/builtin/describe.c ++@@ -53,6 +53,10 @@ static const char *diff_index_args[] = { ++ "diff-index", "--quiet", "HEAD", "--", NULL ++ }; ++ +++static const char *update_index_args[] = { +++ "update-index", "--unmerged", "-q", "--refresh", NULL +++}; +++ ++ struct commit_name { ++ struct hashmap_entry entry; ++ struct object_id peeled; ++@@ -645,6 +649,13 @@ int cmd_describe(int argc, const char **argv, const char *prefix) ++ if (argc == 0) { ++ if (broken) { ++ struct child_process cp = CHILD_PROCESS_INIT; +++ strvec_pushv(&cp.args, update_index_args); +++ cp.git_cmd = 1; +++ cp.no_stdin = 1; +++ cp.no_stdout = 1; +++ run_command(&cp); +++ strvec_clear(&cp.args); +++ ++ strvec_pushv(&cp.args, diff_index_args); ++ cp.git_cmd = 1; ++ cp.no_stdin = 1; ++diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh ++index e78315d23d..6c396e7abc 100755 ++--- a/t/t6120-describe.sh +++++ b/t/t6120-describe.sh ++@@ -671,4 +671,28 @@ test_expect_success 'setup misleading taggerdates' ' ++ ++ check_describe newer-tag-older-commit~1 --contains unique-file~2 ++ +++test_expect_success 'describe --dirty with a file with changed stat' ' +++ git init stat-dirty && +++ ( +++ cd stat-dirty && +++ +++ echo A >file && +++ git add file && +++ git commit -m A && +++ git tag A -a -m A && +++ +++ cat file >file.new && +++ mv file.new file && +++ git describe --dirty >actual && +++ echo "A" >expected && +++ test_cmp expected actual && +++ +++ cat file >file.new && +++ mv file.new file && +++ git describe --dirty --broken >actual && +++ echo "A" >expected && +++ test_cmp expected actual +++ ) +++' +++ ++ test_done ++ ++Range-diff against v2: ++1: d60fc0fa02 ! 1: 1da5fa48d9 describe: refresh the index when 'broken' flag is used ++ @@ Commit message ++ Reported-by: Paul Millar <paul.millar@desy.de> ++ Suggested-by: Junio C Hamano <gitster@pobox.com> ++ Helped-by: Junio C Hamano <gitster@pobox.com> ++ + Helped-by: Phillip Wood <phillip.wood123@gmail.com> ++ Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com> ++ ++ ## builtin/describe.c ## ++ @@ builtin/describe.c: int cmd_describe(int argc, const char **argv, const char *pr ++ if (argc == 0) { ++ if (broken) { ++ struct child_process cp = CHILD_PROCESS_INIT; ++ -+ struct child_process update_index_cp = CHILD_PROCESS_INIT; ++ -+ ++ -+ strvec_pushv(&update_index_cp.args, update_index_args); ++ -+ update_index_cp.git_cmd = 1; ++ -+ update_index_cp.no_stdin = 1; ++ -+ update_index_cp.no_stdout = 1; ++ -+ run_command(&update_index_cp); ++ ++ strvec_pushv(&cp.args, update_index_args); ++ ++ cp.git_cmd = 1; ++ ++ cp.no_stdin = 1; ++ ++ cp.no_stdout = 1; ++ ++ run_command(&cp); ++ ++ strvec_clear(&cp.args); ++ + ++ strvec_pushv(&cp.args, diff_index_args); ++ cp.git_cmd = 1; ++ @@ t/t6120-describe.sh: test_expect_success 'setup misleading taggerdates' ' ++ ++ +test_expect_success 'describe --dirty with a file with changed stat' ' ++ + git init stat-dirty && ++ -+ cd stat-dirty && ++ -+ ++ -+ echo A >file && ++ -+ git add file && ++ -+ git commit -m A && ++ -+ git tag A -a -m A && ++ -+ ++ -+ cat file >file.new && ++ -+ mv file.new file && ++ -+ git describe --dirty >actual && ++ -+ echo "A" >expected && ++ -+ test_cmp expected actual ++ -+' ++ ++ ( ++ ++ cd stat-dirty && ++ + ++ -+test_expect_success 'describe --dirty --broken with a file with changed stat' ' ++ -+ git init stat-dirty-broken && ++ -+ cd stat-dirty-broken && ++ ++ echo A >file && ++ ++ git add file && ++ ++ git commit -m A && ++ ++ git tag A -a -m A && ++ + ++ -+ echo A >file && ++ -+ git add file && ++ -+ git commit -m A && ++ -+ git tag A -a -m A && ++ ++ cat file >file.new && ++ ++ mv file.new file && ++ ++ git describe --dirty >actual && ++ ++ echo "A" >expected && ++ ++ test_cmp expected actual && ++ + ++ -+ cat file >file.new && ++ -+ mv file.new file && ++ -+ git describe --dirty --broken >actual && ++ -+ echo "A" >expected && ++ -+ test_cmp expected actual ++ ++ cat file >file.new && ++ ++ mv file.new file && ++ ++ git describe --dirty --broken >actual && ++ ++ echo "A" >expected && ++ ++ test_cmp expected actual ++ ++ ) ++ +' ++ + ++ test_done ++-- ++2.45.2.606.g9005149a4a.dirty ++ -- 2.45.2.606.g9005149a4a.dirty ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH v4] describe: refresh the index when 'broken' flag is used 2024-06-26 6:37 ` [PATCH v4] " Abhijeet Sonar @ 2024-06-26 6:50 ` Abhijeet Sonar 0 siblings, 0 replies; 32+ messages in thread From: Abhijeet Sonar @ 2024-06-26 6:50 UTC (permalink / raw) To: git; +Cc: Paul Millar, Junio C Hamano, Phillip Wood, Elijah Newren, Jeff King Apologies, please ignore the above patch as it also includes the patch files I commited by mistake. ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH v5] describe: refresh the index when 'broken' flag is used 2024-06-25 15:59 ` Junio C Hamano ` (2 preceding siblings ...) 2024-06-26 6:37 ` [PATCH v4] " Abhijeet Sonar @ 2024-06-26 6:52 ` Abhijeet Sonar 2024-06-26 11:30 ` Karthik Nayak 2024-06-26 18:31 ` Junio C Hamano 3 siblings, 2 replies; 32+ messages in thread From: Abhijeet Sonar @ 2024-06-26 6:52 UTC (permalink / raw) To: git Cc: Abhijeet Sonar, Paul Millar, Junio C Hamano, Phillip Wood, Elijah Newren, Jeff King When describe is run with 'dirty' flag, we refresh the index to make sure it is in sync with the filesystem before determining if the working tree is dirty. However, this is not done for the codepath where the 'broken' flag is used. This causes `git describe --broken --dirty` to false positively report the worktree being dirty if a file has different stat info than what is recorded in the index. Running `git update-index -q --refresh` to refresh the index before running diff-index fixes the problem. Also add tests to deliberately update stat info of a file before running describe to verify it behaves correctly. Reported-by: Paul Millar <paul.millar@desy.de> Suggested-by: Junio C Hamano <gitster@pobox.com> Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Phillip Wood <phillip.wood123@gmail.com> Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com> --- builtin/describe.c | 11 +++++++++++ t/t6120-describe.sh | 24 ++++++++++++++++++++++++ 2 files changed, 35 insertions(+) diff --git a/builtin/describe.c b/builtin/describe.c index e5287eddf2..7cb9d50b36 100644 --- a/builtin/describe.c +++ b/builtin/describe.c @@ -53,6 +53,10 @@ static const char *diff_index_args[] = { "diff-index", "--quiet", "HEAD", "--", NULL }; +static const char *update_index_args[] = { + "update-index", "--unmerged", "-q", "--refresh", NULL +}; + struct commit_name { struct hashmap_entry entry; struct object_id peeled; @@ -645,6 +649,13 @@ int cmd_describe(int argc, const char **argv, const char *prefix) if (argc == 0) { if (broken) { struct child_process cp = CHILD_PROCESS_INIT; + strvec_pushv(&cp.args, update_index_args); + cp.git_cmd = 1; + cp.no_stdin = 1; + cp.no_stdout = 1; + if (run_command(&cp)) + child_process_clear(&cp); + strvec_pushv(&cp.args, diff_index_args); cp.git_cmd = 1; cp.no_stdin = 1; diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh index e78315d23d..6c396e7abc 100755 --- a/t/t6120-describe.sh +++ b/t/t6120-describe.sh @@ -671,4 +671,28 @@ test_expect_success 'setup misleading taggerdates' ' check_describe newer-tag-older-commit~1 --contains unique-file~2 +test_expect_success 'describe --dirty with a file with changed stat' ' + git init stat-dirty && + ( + cd stat-dirty && + + echo A >file && + git add file && + git commit -m A && + git tag A -a -m A && + + cat file >file.new && + mv file.new file && + git describe --dirty >actual && + echo "A" >expected && + test_cmp expected actual && + + cat file >file.new && + mv file.new file && + git describe --dirty --broken >actual && + echo "A" >expected && + test_cmp expected actual + ) +' + test_done Range-diff against v4: 1: 1da5fa48d9 ! 1: 52f590b70f describe: refresh the index when 'broken' flag is used @@ builtin/describe.c: int cmd_describe(int argc, const char **argv, const char *pr + cp.git_cmd = 1; + cp.no_stdin = 1; + cp.no_stdout = 1; -+ run_command(&cp); -+ strvec_clear(&cp.args); ++ if (run_command(&cp)) ++ child_process_clear(&cp); + strvec_pushv(&cp.args, diff_index_args); cp.git_cmd = 1; -- 2.45.2.606.g9005149a4a.dirty ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH v5] describe: refresh the index when 'broken' flag is used 2024-06-26 6:52 ` [PATCH v5] " Abhijeet Sonar @ 2024-06-26 11:30 ` Karthik Nayak 2024-06-26 12:06 ` Abhijeet Sonar 2024-06-26 14:59 ` Junio C Hamano 2024-06-26 18:31 ` Junio C Hamano 1 sibling, 2 replies; 32+ messages in thread From: Karthik Nayak @ 2024-06-26 11:30 UTC (permalink / raw) To: Abhijeet Sonar, git Cc: Paul Millar, Junio C Hamano, Phillip Wood, Elijah Newren, Jeff King [-- Attachment #1: Type: text/plain, Size: 4111 bytes --] Abhijeet Sonar <abhijeet.nkt@gmail.com> writes: > When describe is run with 'dirty' flag, we refresh the index > to make sure it is in sync with the filesystem before > determining if the working tree is dirty. However, this is > not done for the codepath where the 'broken' flag is used. > > This causes `git describe --broken --dirty` to false > positively report the worktree being dirty if a file has > different stat info than what is recorded in the index. > Running `git update-index -q --refresh` to refresh the index > before running diff-index fixes the problem. > > Also add tests to deliberately update stat info of a > file before running describe to verify it behaves correctly. > > Reported-by: Paul Millar <paul.millar@desy.de> > Suggested-by: Junio C Hamano <gitster@pobox.com> > Helped-by: Junio C Hamano <gitster@pobox.com> > Helped-by: Phillip Wood <phillip.wood123@gmail.com> > Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com> > --- > builtin/describe.c | 11 +++++++++++ > t/t6120-describe.sh | 24 ++++++++++++++++++++++++ > 2 files changed, 35 insertions(+) > > diff --git a/builtin/describe.c b/builtin/describe.c > index e5287eddf2..7cb9d50b36 100644 > --- a/builtin/describe.c > +++ b/builtin/describe.c > @@ -53,6 +53,10 @@ static const char *diff_index_args[] = { > "diff-index", "--quiet", "HEAD", "--", NULL > }; > > +static const char *update_index_args[] = { > + "update-index", "--unmerged", "-q", "--refresh", NULL > +}; > + > struct commit_name { > struct hashmap_entry entry; > struct object_id peeled; > @@ -645,6 +649,13 @@ int cmd_describe(int argc, const char **argv, const char *prefix) > if (argc == 0) { > if (broken) { > struct child_process cp = CHILD_PROCESS_INIT; > + strvec_pushv(&cp.args, update_index_args); > + cp.git_cmd = 1; > + cp.no_stdin = 1; > + cp.no_stdout = 1; > + if (run_command(&cp)) > + child_process_clear(&cp); > + > strvec_pushv(&cp.args, diff_index_args); > cp.git_cmd = 1; > cp.no_stdin = 1; > diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh > index e78315d23d..6c396e7abc 100755 > --- a/t/t6120-describe.sh > +++ b/t/t6120-describe.sh > @@ -671,4 +671,28 @@ test_expect_success 'setup misleading taggerdates' ' > > check_describe newer-tag-older-commit~1 --contains unique-file~2 > > +test_expect_success 'describe --dirty with a file with changed stat' ' > + git init stat-dirty && > + ( > + cd stat-dirty && > + > + echo A >file && > + git add file && > + git commit -m A && > + git tag A -a -m A && > + > + cat file >file.new && > + mv file.new file && > + git describe --dirty >actual && > + echo "A" >expected && > + test_cmp expected actual && > + > + cat file >file.new && > + mv file.new file && > + git describe --dirty --broken >actual && > + echo "A" >expected && > + test_cmp expected actual Not worth a reroll, but you don't have to create file.new twice. diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh index 6c396e7abc..6c4b20fec7 100755 --- a/t/t6120-describe.sh +++ b/t/t6120-describe.sh @@ -683,14 +683,12 @@ test_expect_success 'describe --dirty with a file with changed stat' ' cat file >file.new && mv file.new file && - git describe --dirty >actual && echo "A" >expected && + + git describe --dirty >actual && test_cmp expected actual && - cat file >file.new && - mv file.new file && git describe --dirty --broken >actual && - echo "A" >expected && test_cmp expected actual ) ' > + ) > +' > + > test_done > > Range-diff against v4: > 1: 1da5fa48d9 ! 1: 52f590b70f describe: refresh the index when 'broken' flag is used > @@ builtin/describe.c: int cmd_describe(int argc, const char **argv, const char *pr > + cp.git_cmd = 1; > + cp.no_stdin = 1; > + cp.no_stdout = 1; > -+ run_command(&cp); > -+ strvec_clear(&cp.args); > ++ if (run_command(&cp)) > ++ child_process_clear(&cp); > + > strvec_pushv(&cp.args, diff_index_args); > cp.git_cmd = 1; > -- > 2.45.2.606.g9005149a4a.dirty Other than this, this looks good to me. Thanks! [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 690 bytes --] ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH v5] describe: refresh the index when 'broken' flag is used 2024-06-26 11:30 ` Karthik Nayak @ 2024-06-26 12:06 ` Abhijeet Sonar 2024-06-26 15:34 ` Re* " Junio C Hamano 2024-06-26 14:59 ` Junio C Hamano 1 sibling, 1 reply; 32+ messages in thread From: Abhijeet Sonar @ 2024-06-26 12:06 UTC (permalink / raw) To: Karthik Nayak, git Cc: Paul Millar, Junio C Hamano, Phillip Wood, Elijah Newren, Jeff King On 26/06/24 17:00, Karthik Nayak wrote: > Not worth a reroll, but you don't have to create file.new twice. Actually, now that I think of it, those two were better off being separate tests. It might so happen the first call to describe refreshes the index, due to which the second call with the --broken option does not bug-out in the way it would if the command was run by itself. Having them separate would give them enough isolation so that previous command does not interfere with the later. >> Range-diff against v4: >> 1: 1da5fa48d9 ! 1: 52f590b70f describe: refresh the index when 'broken' flag is used >> @@ builtin/describe.c: int cmd_describe(int argc, const char **argv, const char *pr >> + cp.git_cmd = 1; >> + cp.no_stdin = 1; >> + cp.no_stdout = 1; >> -+ run_command(&cp); >> -+ strvec_clear(&cp.args); >> ++ if (run_command(&cp)) >> ++ child_process_clear(&cp); >> + >> strvec_pushv(&cp.args, diff_index_args); >> cp.git_cmd = 1; >> -- >> 2.45.2.606.g9005149a4a.dirty > > Other than this, this looks good to me. I am not sure if I follow this one. Am I expected to not share the struct child_process between the two sub-process calls? Thanks ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re* [PATCH v5] describe: refresh the index when 'broken' flag is used 2024-06-26 12:06 ` Abhijeet Sonar @ 2024-06-26 15:34 ` Junio C Hamano 2024-06-26 16:17 ` Junio C Hamano 0 siblings, 1 reply; 32+ messages in thread From: Junio C Hamano @ 2024-06-26 15:34 UTC (permalink / raw) To: Abhijeet Sonar Cc: Karthik Nayak, git, Paul Millar, Phillip Wood, Elijah Newren, Jeff King Abhijeet Sonar <abhijeet.nkt@gmail.com> writes: > On 26/06/24 17:00, Karthik Nayak wrote: >> Not worth a reroll, but you don't have to create file.new twice. > > Actually, now that I think of it, those two were better off being > separate tests. It might so happen the first call to describe > refreshes the index, due to which the second call with the --broken > option does not bug-out in the way it would if the command was run by > itself. Having them separate would give them enough isolation so that > previous command does not interfere with the later. Good thinking. Yes, we may end up having a few commands that are duplicated in these two tests (for setting the stage up, for example), but it would be better to test these two separately. >>> Range-diff against v4: >>> 1: 1da5fa48d9 ! 1: 52f590b70f describe: refresh the index when 'broken' flag is used >>> @@ builtin/describe.c: int cmd_describe(int argc, const char **argv, const char *pr >>> + cp.git_cmd = 1; >>> + cp.no_stdin = 1; >>> + cp.no_stdout = 1; >>> -+ run_command(&cp); >>> -+ strvec_clear(&cp.args); >>> ++ if (run_command(&cp)) >>> ++ child_process_clear(&cp); >>> + >>> strvec_pushv(&cp.args, diff_index_args); >>> cp.git_cmd = 1; >>> -- >>> 2.45.2.606.g9005149a4a.dirty >> Other than this, this looks good to me. > I am not sure if I follow this one. Am I expected to not share the > struct child_process between the two sub-process calls? Without reusing and instead of using two, we do not have to worry about the reusablility of the child_process structure in the first place, which is a huge plus, but in the longer run we should make sure it is safe to reuse child_process and document the safe way to reuse it (run-command.h does document a way to use it once and then clean it up, but the "clean-up" extends only to not leaking resources after we are done---it does not guarantee that it is OK to reuse it). I think with the updated "we clear cp ourselves if run_command() fails", it should be safe to reuse, but it probably is even safer to do something like this: ... the first run ... if (run_command(&cp)) child_process_clear(&cp); child_process_init(&cp); ... setup for the second run ... strvec_pushv(&cp.args, diff_index_args); cp.git_cmd = 1; ... full set-up without relying on anything done earlier ... The extra child_process_init() call may serve as an extra documentation that we are reusing the same struct here (we often do "git grep" for use of a specific API function before tree wide code clean-up, and child_process_init() would be a good key to look for). ... goes and looks ... Oh, I found an interesting one. builtin/fsck.c:cmd_fsck() does this in a loop: struct child_process verify = CHILD_PROCESS_INIT; ... setup ... for (... loop ...) { child_process_init(&verify); ... set up various .members of verify struct ... strvec_pushl(&verify.args, ... command line ...); if (run_command(&verify)) errors_found |= ...; } This code clearly assumes that it is safe to reuse the child_process structure after you run_command() and let it clean-up if you do another child_process_init(). And I think that is a sensible assumption. The code in builtin/fsck.c:cmd_fsck() is buggy when run_command() fails, I think. Without doing child_process_clear() there, doesn't it leak the strvec? ------- >8 ------------- >8 ------------- >8 ------- Subject: [PATCH] fsck: clear child_process after failed run_command() There are two loops that calls run_command() using the same child_process struct near the end of cmd_fsck(). 4d0984be (fsck: do not reuse child_process structs, 2018-11-12) tightened these code paths to reinitialize the structure in order to safely reuse it. The run-command API makes no promises about what is left in a struct child_process after a command finishes, and it's not safe to simply reuse it again for a similar command. Upon failure, run_command() can return without releasing the resource held by the child_process structure, which is done by calling finish_command() which in turn calls child_process_clear(). Reinitializing the structure without calling child_process_clear() for the next round would leak the .args and .env strvecs. Signed-off-by: Junio C Hamano <gitster@pobox.com> --- builtin/fsck.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git c/builtin/fsck.c w/builtin/fsck.c index d13a226c2e..398b492184 100644 --- c/builtin/fsck.c +++ w/builtin/fsck.c @@ -1078,8 +1078,10 @@ int cmd_fsck(int argc, const char **argv, const char *prefix) strvec_push(&commit_graph_verify.args, "--progress"); else strvec_push(&commit_graph_verify.args, "--no-progress"); - if (run_command(&commit_graph_verify)) + if (run_command(&commit_graph_verify)) { + child_process_clear(&commit_graph_verify); errors_found |= ERROR_COMMIT_GRAPH; + } } } @@ -1096,8 +1098,10 @@ int cmd_fsck(int argc, const char **argv, const char *prefix) strvec_push(&midx_verify.args, "--progress"); else strvec_push(&midx_verify.args, "--no-progress"); - if (run_command(&midx_verify)) + if (run_command(&midx_verify)) { + child_process_clear(&midx_verify); errors_found |= ERROR_MULTI_PACK_INDEX; + } } } ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: Re* [PATCH v5] describe: refresh the index when 'broken' flag is used 2024-06-26 15:34 ` Re* " Junio C Hamano @ 2024-06-26 16:17 ` Junio C Hamano 2024-06-26 17:29 ` Abhijeet Sonar 0 siblings, 1 reply; 32+ messages in thread From: Junio C Hamano @ 2024-06-26 16:17 UTC (permalink / raw) To: Abhijeet Sonar Cc: Karthik Nayak, git, Paul Millar, Phillip Wood, Elijah Newren, Jeff King Junio C Hamano <gitster@pobox.com> writes: > Subject: [PATCH] fsck: clear child_process after failed run_command() > > There are two loops that calls run_command() using the same > child_process struct near the end of cmd_fsck(). 4d0984be (fsck: do > not reuse child_process structs, 2018-11-12) tightened these code > paths to reinitialize the structure in order to safely reuse it. > > The run-command API makes no promises about what is left in a struct > child_process after a command finishes, and it's not safe to simply > reuse it again for a similar command. > > Upon failure, run_command() can return without releasing the > resource held by the child_process structure, which is done by > calling finish_command() which in turn calls child_process_clear(). Sorry, but I have to take this back. The error return code paths in the start_command() function does seem to call clear_child_process(), which means that there is no need to call clear_child_process() ourselves after a failed run_command() call. The need for reinitializing that established by 4d0984be (fsck: do not reuse child_process structs, 2018-11-12) still stand, so ... the first run ... run_command(&cp); /* no need for separate child_process_clear(&cp) */ child_process_init(&cp); /* prepare for reuse */ ... setup for the second run ... strvec_pushv(&cp.args, diff_index_args); cp.git_cmd = 1; ... full set-up without relying on anything done earlier ... would still be needed. Or alternatively, we could do this to ensure that the child_process structure is always reusable. run-command.c | 1 + run-command.h | 6 +++++- 2 files changed, 6 insertions(+), 1 deletion(-) diff --git c/run-command.c w/run-command.c index 6ac1d14516..aba250fbe1 100644 --- c/run-command.c +++ w/run-command.c @@ -26,6 +26,7 @@ void child_process_clear(struct child_process *child) { strvec_clear(&child->args); strvec_clear(&child->env); + child_process_init(child); } struct child_to_clean { diff --git c/run-command.h w/run-command.h index 55f6631a2a..6e203c22f6 100644 --- c/run-command.h +++ w/run-command.h @@ -204,7 +204,8 @@ int start_command(struct child_process *); /** * Wait for the completion of a sub-process that was started with - * start_command(). + * start_command(). The child_process structure is cleared and + * reinitialized. */ int finish_command(struct child_process *); @@ -214,6 +215,9 @@ int finish_command_in_signal(struct child_process *); * A convenience function that encapsulates a sequence of * start_command() followed by finish_command(). Takes a pointer * to a `struct child_process` that specifies the details. + * The child_process structure is cleared and reinitialized, + * even when the command fails to start or an error is detected + * in finish_command(). */ int run_command(struct child_process *); ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: Re* [PATCH v5] describe: refresh the index when 'broken' flag is used 2024-06-26 16:17 ` Junio C Hamano @ 2024-06-26 17:29 ` Abhijeet Sonar 2024-06-26 17:35 ` Junio C Hamano 0 siblings, 1 reply; 32+ messages in thread From: Abhijeet Sonar @ 2024-06-26 17:29 UTC (permalink / raw) To: Junio C Hamano Cc: Karthik Nayak, git, Paul Millar, Phillip Wood, Elijah Newren, Jeff King On 26/06/24 21:47, Junio C Hamano wrote: > Or alternatively, we could do this to ensure that the child_process > structure is always reusable. > > run-command.c | 1 + > run-command.h | 6 +++++- > 2 files changed, 6 insertions(+), 1 deletion(-) > > diff --git c/run-command.c w/run-command.c > index 6ac1d14516..aba250fbe1 100644 > --- c/run-command.c > +++ w/run-command.c > @@ -26,6 +26,7 @@ void child_process_clear(struct child_process *child) > { > strvec_clear(&child->args); > strvec_clear(&child->env); > + child_process_init(child); > } To me, this looks much better. child_process_clear's name already suggests that is sort of like a destructor, so it makes sense to re-initialize everything here. I even wonder why it was not that way to begin with. I suppose no callers are assuming that it only clears args and env though? > struct child_to_clean { > diff --git c/run-command.h w/run-command.h > index 55f6631a2a..6e203c22f6 100644 > --- c/run-command.h > +++ w/run-command.h > @@ -204,7 +204,8 @@ int start_command(struct child_process *); > > /** > * Wait for the completion of a sub-process that was started with > - * start_command(). > + * start_command(). The child_process structure is cleared and > + * reinitialized. > */ > int finish_command(struct child_process *); > > @@ -214,6 +215,9 @@ int finish_command_in_signal(struct child_process *); > * A convenience function that encapsulates a sequence of > * start_command() followed by finish_command(). Takes a pointer > * to a `struct child_process` that specifies the details. > + * The child_process structure is cleared and reinitialized, > + * even when the command fails to start or an error is detected > + * in finish_command(). > */ > int run_command(struct child_process *); > ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Re* [PATCH v5] describe: refresh the index when 'broken' flag is used 2024-06-26 17:29 ` Abhijeet Sonar @ 2024-06-26 17:35 ` Junio C Hamano 2024-06-26 17:45 ` Junio C Hamano 2024-06-26 18:07 ` Abhijeet Sonar 0 siblings, 2 replies; 32+ messages in thread From: Junio C Hamano @ 2024-06-26 17:35 UTC (permalink / raw) To: Abhijeet Sonar Cc: Karthik Nayak, git, Paul Millar, Phillip Wood, Elijah Newren, Jeff King Abhijeet Sonar <abhijeet.nkt@gmail.com> writes: > To me, this looks much better. child_process_clear's name already > suggests that is sort of like a destructor, so it makes sense to > re-initialize everything here. I even wonder why it was not that way to > begin with. I suppose no callers are assuming that it only clears args > and env though? I guess that validating that supposition is a prerequisite to declare the change as "much better" and "makes sense". ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Re* [PATCH v5] describe: refresh the index when 'broken' flag is used 2024-06-26 17:35 ` Junio C Hamano @ 2024-06-26 17:45 ` Junio C Hamano 2024-06-26 18:07 ` Abhijeet Sonar 1 sibling, 0 replies; 32+ messages in thread From: Junio C Hamano @ 2024-06-26 17:45 UTC (permalink / raw) To: Abhijeet Sonar Cc: Karthik Nayak, git, Paul Millar, Phillip Wood, Elijah Newren, Jeff King Junio C Hamano <gitster@pobox.com> writes: > Abhijeet Sonar <abhijeet.nkt@gmail.com> writes: > >> To me, this looks much better. child_process_clear's name already >> suggests that is sort of like a destructor, so it makes sense to >> re-initialize everything here. I even wonder why it was not that way to >> begin with. I suppose no callers are assuming that it only clears args >> and env though? > > I guess that validating that supposition is a prerequisite to > declare the change as "much better" and "makes sense". Having said that, a lot more plausible reason is that most callers are expected to use a single struct just once without reusing it, so the cost of re-initialization is wasted cycles for them if it were part of child_process_clear() which is primarily about releasing resources (instead of leaking them). It can be argued that it is a sensible design decision to make those minority callers that do reuse the same struct to pay the cost of reinitialization themselves, instead of the majority callers that use it once and then release resources held in there. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Re* [PATCH v5] describe: refresh the index when 'broken' flag is used 2024-06-26 17:35 ` Junio C Hamano 2024-06-26 17:45 ` Junio C Hamano @ 2024-06-26 18:07 ` Abhijeet Sonar 2024-06-26 18:49 ` Junio C Hamano 1 sibling, 1 reply; 32+ messages in thread From: Abhijeet Sonar @ 2024-06-26 18:07 UTC (permalink / raw) To: Junio C Hamano Cc: Karthik Nayak, git, Paul Millar, Phillip Wood, Elijah Newren, Jeff King On 26/06/24 23:05, Junio C Hamano wrote: > Abhijeet Sonar <abhijeet.nkt@gmail.com> writes: > >> To me, this looks much better. child_process_clear's name already >> suggests that is sort of like a destructor, so it makes sense to >> re-initialize everything here. I even wonder why it was not that way to >> begin with. I suppose no callers are assuming that it only clears args >> and env though? > > I guess that validating that supposition is a prerequisite to > declare the change as "much better" and "makes sense". OK. I found one: at the end of submodule.c:push_submodule() if (...) { ...some setup... if (run_command(&cp)) return 0; close(cp.out); } ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Re* [PATCH v5] describe: refresh the index when 'broken' flag is used 2024-06-26 18:07 ` Abhijeet Sonar @ 2024-06-26 18:49 ` Junio C Hamano 2024-06-26 20:34 ` Jeff King 2024-06-26 21:23 ` Karthik Nayak 0 siblings, 2 replies; 32+ messages in thread From: Junio C Hamano @ 2024-06-26 18:49 UTC (permalink / raw) To: Abhijeet Sonar Cc: Karthik Nayak, git, Paul Millar, Phillip Wood, Elijah Newren, Jeff King Abhijeet Sonar <abhijeet.nkt@gmail.com> writes: > On 26/06/24 23:05, Junio C Hamano wrote: >> Abhijeet Sonar <abhijeet.nkt@gmail.com> writes: >> >>> To me, this looks much better. child_process_clear's name already >>> suggests that is sort of like a destructor, so it makes sense to >>> re-initialize everything here. I even wonder why it was not that way to >>> begin with. I suppose no callers are assuming that it only clears args >>> and env though? >> >> I guess that validating that supposition is a prerequisite to >> declare the change as "much better" and "makes sense". > > OK. I found one: at the end of submodule.c:push_submodule() > > if (...) { > ...some setup... > if (run_command(&cp)) > return 0; > close(cp.out); > } This is curious. * What is this thing trying to do? When run_command() fails, it wants to leave cp.out open, so that the caller this returns to can write into it??? That cannot be the case, as cp itself is internal. So does this "close(cp.out)" really matter? * Even though we are running child_process_clear() to release the resources in run_command() we are not closing the file descriptor cp.out in the child_process_clear() and force the caller to close it instead. An open file descriptor is a resource, and a file descriptor opened but forgotten is considered a leak. I wonder if child_process_clear() should be closing the file descriptor, at least the ones it opened or dup2()ed. In any case, you found a case where child_process_clear() may not want to do the full re-initialization and at the same time it is not doing its job sufficiently well. Let's decide, at least for now, not to do the reinitialization from child_process_clear(), then. Thanks. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Re* [PATCH v5] describe: refresh the index when 'broken' flag is used 2024-06-26 18:49 ` Junio C Hamano @ 2024-06-26 20:34 ` Jeff King 2024-06-27 0:33 ` Jeff King 2024-06-26 21:23 ` Karthik Nayak 1 sibling, 1 reply; 32+ messages in thread From: Jeff King @ 2024-06-26 20:34 UTC (permalink / raw) To: Junio C Hamano Cc: Abhijeet Sonar, Karthik Nayak, git, Paul Millar, Phillip Wood, Elijah Newren On Wed, Jun 26, 2024 at 11:49:22AM -0700, Junio C Hamano wrote: > Abhijeet Sonar <abhijeet.nkt@gmail.com> writes: > > > On 26/06/24 23:05, Junio C Hamano wrote: > >> Abhijeet Sonar <abhijeet.nkt@gmail.com> writes: > >> > >>> To me, this looks much better. child_process_clear's name already > >>> suggests that is sort of like a destructor, so it makes sense to > >>> re-initialize everything here. I even wonder why it was not that way to > >>> begin with. I suppose no callers are assuming that it only clears args > >>> and env though? > >> > >> I guess that validating that supposition is a prerequisite to > >> declare the change as "much better" and "makes sense". > > > > OK. I found one: at the end of submodule.c:push_submodule() > > > > if (...) { > > ...some setup... > > if (run_command(&cp)) > > return 0; > > close(cp.out); > > } > > This is curious. > > * What is this thing trying to do? When run_command() fails, it > wants to leave cp.out open, so that the caller this returns to > can write into it??? That cannot be the case, as cp itself is > internal. So does this "close(cp.out)" really matter? I think it's totally broken. Using cp.out, cp.in, etc, with run_command() is a deadlock waiting to happen, since it implies opening a pipe, not doing anything with our end, and then doing a waitpid() on the child. You'd always want to use start_command(), and then do something useful with the pipe, and then finish_command(). Arguably run_command() should bug if cp.out, etc are non-zero. In this case the code leaves cp.out as 0, so we are not asking for a pipe. That is good, and we are not subject to any race/deadlock. But then...what is the cleanup trying to do? This will always just close(0), the parent's stdin. That is certainly surprising, but I guess nobody ever noticed because git-push does not usually read from its stdin. So I think the close() is superfluous and should be deleted. It goes all the way back to eb21c732d6 (push: teach --recurse-submodules the on-demand option, 2012-03-29). I guess at one point the author thought we'd read output from a pipe (rather than letting it just go to the parent's stdout) and this was leftover? > * Even though we are running child_process_clear() to release the > resources in run_command() we are not closing the file descriptor > cp.out in the child_process_clear() and force the caller to close > it instead. An open file descriptor is a resource, and a file > descriptor opened but forgotten is considered a leak. I wonder > if child_process_clear() should be closing the file descriptor, > at least the ones it opened or dup2()ed. Usually the caller will have handled this already (since it cares about exactly _when_ to close), so we'd end up double-closing in most cases. This is sometimes harmless (you'll get EBADF), but is a problem if the descriptor was assigned to something else in the meantime. In most cases callers shouldn't be using child_process_clear() at all after start_command(), etc. Either: - start_command() failed, in which case it cleans up everything already (both in the struct and any pipes it opened) - we succeeded in starting the command, in which case child_process_clear() is insufficient. You need to actually finish_command() to avoid leaking the pid. - after finish_command(), the struct has been cleaned already. You do need to close pipes in the caller, but you'd have to do so before calling finish_command() anyway, to avoid the deadlock. You really only need to call child_process_clear() yourself when you set up a struct but didn't actually start the process. And from a quick skim over the grep results, it looks like that's how it's used. -Peff ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Re* [PATCH v5] describe: refresh the index when 'broken' flag is used 2024-06-26 20:34 ` Jeff King @ 2024-06-27 0:33 ` Jeff King 0 siblings, 0 replies; 32+ messages in thread From: Jeff King @ 2024-06-27 0:33 UTC (permalink / raw) To: Junio C Hamano Cc: Abhijeet Sonar, Karthik Nayak, git, Paul Millar, Phillip Wood, Elijah Newren On Wed, Jun 26, 2024 at 04:34:17PM -0400, Jeff King wrote: > > * What is this thing trying to do? When run_command() fails, it > > wants to leave cp.out open, so that the caller this returns to > > can write into it??? That cannot be the case, as cp itself is > > internal. So does this "close(cp.out)" really matter? > > I think it's totally broken. Using cp.out, cp.in, etc, with > run_command() is a deadlock waiting to happen, since it implies opening > a pipe, not doing anything with our end, and then doing a waitpid() on > the child. > > You'd always want to use start_command(), and then do something useful > with the pipe, and then finish_command(). Arguably run_command() should > bug if cp.out, etc are non-zero. Oh, I guess I should have looked at the code. We do indeed BUG() in this case already, courtesy of c29b3962af (run-command: forbid using run_command with piped output, 2015-03-22). So all is well there. This caller is still completely wrong to call close() though. -Peff ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Re* [PATCH v5] describe: refresh the index when 'broken' flag is used 2024-06-26 18:49 ` Junio C Hamano 2024-06-26 20:34 ` Jeff King @ 2024-06-26 21:23 ` Karthik Nayak 1 sibling, 0 replies; 32+ messages in thread From: Karthik Nayak @ 2024-06-26 21:23 UTC (permalink / raw) To: Junio C Hamano, Abhijeet Sonar Cc: git, Paul Millar, Phillip Wood, Elijah Newren, Jeff King [-- Attachment #1: Type: text/plain, Size: 3565 bytes --] Junio C Hamano <gitster@pobox.com> writes: > Abhijeet Sonar <abhijeet.nkt@gmail.com> writes: > >> On 26/06/24 23:05, Junio C Hamano wrote: >>> Abhijeet Sonar <abhijeet.nkt@gmail.com> writes: >>> >>>> To me, this looks much better. child_process_clear's name already >>>> suggests that is sort of like a destructor, so it makes sense to >>>> re-initialize everything here. I even wonder why it was not that way to >>>> begin with. I suppose no callers are assuming that it only clears args >>>> and env though? >>> >>> I guess that validating that supposition is a prerequisite to >>> declare the change as "much better" and "makes sense". >> >> OK. I found one: at the end of submodule.c:push_submodule() >> >> if (...) { >> ...some setup... >> if (run_command(&cp)) >> return 0; >> close(cp.out); >> } > > This is curious. > > * What is this thing trying to do? When run_command() fails, it > wants to leave cp.out open, so that the caller this returns to > can write into it??? That cannot be the case, as cp itself is > internal. So does this "close(cp.out)" really matter? > This is curious one indeed, we only need to close the file descriptor when we set `cp.out = -1`, otherwise there is no need to close `cp.out` since `start_command()` takes care of it internally. So the `close(cp.out)` can really be removed here. Wonder if this was copied over from other parts of the submodule code, where we actually set `cp.out = -1`. > > * Even though we are running child_process_clear() to release the > resources in run_command() we are not closing the file descriptor > cp.out in the child_process_clear() and force the caller to close > it instead. An open file descriptor is a resource, and a file > descriptor opened but forgotten is considered a leak. I wonder > if child_process_clear() should be closing the file descriptor, > at least the ones it opened or dup2()ed. > If you see the documentation for run_command.h::child_process, it states /* * Using .in, .out, .err: * - Specify 0 for no redirections. No new file descriptor is allocated. * (child inherits stdin, stdout, stderr from parent). * - Specify -1 to have a pipe allocated as follows: * .in: returns the writable pipe end; parent writes to it, * the readable pipe end becomes child's stdin * .out, .err: returns the readable pipe end; parent reads from * it, the writable pipe end becomes child's stdout/stderr * The caller of start_command() must close the returned FDs * after it has completed reading from/writing to it! * - Specify > 0 to set a channel to a particular FD as follows: * .in: a readable FD, becomes child's stdin * .out: a writable FD, becomes child's stdout/stderr * .err: a writable FD, becomes child's stderr * The specified FD is closed by start_command(), even in case * of errors! */ int in; int out; int err; So this makes sense right? run_command() doesn't support the pipe options, meaning clients have to call start_command() + finish_command() manually. When clients do call start_command() with '-1' set to these options, they are in charge of closing the created pipes. > In any case, you found a case where child_process_clear() may not > want to do the full re-initialization and at the same time it is not > doing its job sufficiently well. Let's decide, at least for now, > not to do the reinitialization from child_process_clear(), then. > > Thanks. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 690 bytes --] ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v5] describe: refresh the index when 'broken' flag is used 2024-06-26 11:30 ` Karthik Nayak 2024-06-26 12:06 ` Abhijeet Sonar @ 2024-06-26 14:59 ` Junio C Hamano 1 sibling, 0 replies; 32+ messages in thread From: Junio C Hamano @ 2024-06-26 14:59 UTC (permalink / raw) To: Karthik Nayak Cc: Abhijeet Sonar, git, Paul Millar, Phillip Wood, Elijah Newren, Jeff King Karthik Nayak <karthik.188@gmail.com> writes: >> + cat file >file.new && >> + mv file.new file && >> + git describe --dirty >actual && >> + echo "A" >expected && >> + test_cmp expected actual && >> + >> + cat file >file.new && >> + mv file.new file && >> + git describe --dirty --broken >actual && >> + echo "A" >expected && >> + test_cmp expected actual > > Not worth a reroll, but you don't have to create file.new twice. I think you have to make the "file" stat-dirty again after the first test, as a successful first test _should_ refresh the index (and that was why my manual illustration in an earlier message <xmqqsex2b4ti.fsf@gitster.g> did "--dirty --broken" first before "--dirty" alone, knowing that the former fails to refresh the index). You are right to point out that the expected results for "--dirty" and "--dirty --broken" are the same and we do not have to create it twice, though. If the filesystem has a usable inode number, then replacing "file" with a copy, i.e. "cat file >file.new && mv file.new file", is an excellent way to ensure that the stat data becomes dirty even when it is done as a part of a quick series of commands. But not all filesystems we run our tests on may have usable inode number, so it may not be the best thing to do in an automated test. We could use "test-tool chmtime" instead, perhaps like: ... make the index and the working tree are clean wrt HEAD ... # we do not expect -dirty suffix in the output echo A >expect && # make "file" stat-dirty test-tool chmtime -10 file && # "describe --dirty" refreshes and does not get fooled git describe --dirty >actual && test_cmp expect actual && # make "file" stat-dirty again test-tool chmtime -10 file && # "describe --dirty --broken" refreshes and does not get fooled git describe --dirty --broken >actual && test_cmp expect actual with the extra comments stripped (I added them only to explain what is going on to the readers of this e-mail message). ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v5] describe: refresh the index when 'broken' flag is used 2024-06-26 6:52 ` [PATCH v5] " Abhijeet Sonar 2024-06-26 11:30 ` Karthik Nayak @ 2024-06-26 18:31 ` Junio C Hamano 2024-06-26 19:08 ` [PATCH v7] " Abhijeet Sonar 1 sibling, 1 reply; 32+ messages in thread From: Junio C Hamano @ 2024-06-26 18:31 UTC (permalink / raw) To: Abhijeet Sonar; +Cc: git, Paul Millar, Phillip Wood, Elijah Newren, Jeff King Abhijeet Sonar <abhijeet.nkt@gmail.com> writes: > When describe is run with 'dirty' flag, we refresh the index > to make sure it is in sync with the filesystem before > determining if the working tree is dirty. However, this is > not done for the codepath where the 'broken' flag is used. > > This causes `git describe --broken --dirty` to false > positively report the worktree being dirty if a file has > different stat info than what is recorded in the index. > Running `git update-index -q --refresh` to refresh the index > before running diff-index fixes the problem. > > Also add tests to deliberately update stat info of a > file before running describe to verify it behaves correctly. > > Reported-by: Paul Millar <paul.millar@desy.de> > Suggested-by: Junio C Hamano <gitster@pobox.com> > Helped-by: Junio C Hamano <gitster@pobox.com> > Helped-by: Phillip Wood <phillip.wood123@gmail.com> > Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com> > --- As I screwed up with the suggestion to do child_process_clear() after a failed run_command(), let me fix that up. A suggested change that can be squashed into this patch is attached at the end. * (style) a blank line between a block of variable declarations and the first statement; * run_command(&cp) cleans cp so no need for separate child_process_clear(&cp); * but child_process_init(&cp) is needed, just as 4d0984be (fsck: do not reuse child_process structs, 2018-11-12) explains, before reusing the structure for a separate call. * instead of "replace with a different file" which relies on having a working inum, use "test-tool chmtime" to reliably force dirty mtime. * everybody else compares "actual" against "expect", not "expected". imitate them. * test "--dirty" and "--dirty --broken" separately in two separate tests. Thanks. builtin/describe.c | 5 +++-- t/t6120-describe.sh | 28 ++++++++++++++++++++-------- 2 files changed, 23 insertions(+), 10 deletions(-) diff --git c/builtin/describe.c w/builtin/describe.c index a58f6134f0..e936d2c19f 100644 --- c/builtin/describe.c +++ w/builtin/describe.c @@ -649,13 +649,14 @@ int cmd_describe(int argc, const char **argv, const char *prefix) if (argc == 0) { if (broken) { struct child_process cp = CHILD_PROCESS_INIT; + strvec_pushv(&cp.args, update_index_args); cp.git_cmd = 1; cp.no_stdin = 1; cp.no_stdout = 1; - if (run_command(&cp)) - child_process_clear(&cp); + run_command(&cp); + child_process_init(&cp); strvec_pushv(&cp.args, diff_index_args); cp.git_cmd = 1; cp.no_stdin = 1; diff --git c/t/t6120-describe.sh w/t/t6120-describe.sh index 6c396e7abc..79e0f19deb 100755 --- c/t/t6120-describe.sh +++ w/t/t6120-describe.sh @@ -672,6 +672,7 @@ test_expect_success 'setup misleading taggerdates' ' check_describe newer-tag-older-commit~1 --contains unique-file~2 test_expect_success 'describe --dirty with a file with changed stat' ' + test_when_finished "rm -fr stat-dirty" && git init stat-dirty && ( cd stat-dirty && @@ -680,18 +681,29 @@ test_expect_success 'describe --dirty with a file with changed stat' ' git add file && git commit -m A && git tag A -a -m A && + echo "A" >expect && - cat file >file.new && - mv file.new file && + test-tool chmtime -10 file && git describe --dirty >actual && - echo "A" >expected && - test_cmp expected actual && + test_cmp expect actual + ) +' + +test_expect_success 'describe --broken --dirty with a file with changed stat' ' + test_when_finished "rm -fr stat-dirty" && + git init stat-dirty && + ( + cd stat-dirty && - cat file >file.new && - mv file.new file && + echo A >file && + git add file && + git commit -m A && + git tag A -a -m A && + echo "A" >expect && + + test-tool chmtime -10 file && git describe --dirty --broken >actual && - echo "A" >expected && - test_cmp expected actual + test_cmp expect actual ) ' ^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH v7] describe: refresh the index when 'broken' flag is used 2024-06-26 18:31 ` Junio C Hamano @ 2024-06-26 19:08 ` Abhijeet Sonar 2024-06-26 19:25 ` Abhijeet Sonar 0 siblings, 1 reply; 32+ messages in thread From: Abhijeet Sonar @ 2024-06-26 19:08 UTC (permalink / raw) To: git Cc: Abhijeet Sonar, Paul Millar, Junio C Hamano, Phillip Wood, Elijah Newren, Jeff King When describe is run with 'dirty' flag, we refresh the index to make sure it is in sync with the filesystem before determining if the working tree is dirty. However, this is not done for the codepath where the 'broken' flag is used. This causes `git describe --broken --dirty` to false positively report the worktree being dirty if a file has different stat info than what is recorded in the index. Running `git update-index -q --refresh` to refresh the index before running diff-index fixes the problem. Also add tests to deliberately update stat info of a file before running describe to verify it behaves correctly. Reported-by: Paul Millar <paul.millar@desy.de> Suggested-by: Junio C Hamano <gitster@pobox.com> Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Phillip Wood <phillip.wood123@gmail.com> Signed-off-by: Abhijeet Sonar <abhijeet.nkt@gmail.com> --- builtin/describe.c | 12 ++++++++++++ t/t6120-describe.sh | 36 ++++++++++++++++++++++++++++++++++++ 2 files changed, 48 insertions(+) diff --git a/builtin/describe.c b/builtin/describe.c index e5287eddf2..cf8edc4222 100644 --- a/builtin/describe.c +++ b/builtin/describe.c @@ -53,6 +53,10 @@ static const char *diff_index_args[] = { "diff-index", "--quiet", "HEAD", "--", NULL }; +static const char *update_index_args[] = { + "update-index", "--unmerged", "-q", "--refresh", NULL +}; + struct commit_name { struct hashmap_entry entry; struct object_id peeled; @@ -645,6 +649,14 @@ int cmd_describe(int argc, const char **argv, const char *prefix) if (argc == 0) { if (broken) { struct child_process cp = CHILD_PROCESS_INIT; + + strvec_pushv(&cp.args, update_index_args); + cp.git_cmd = 1; + cp.no_stdin = 1; + cp.no_stdout = 1; + run_command(&cp); + + child_process_init(&cp); strvec_pushv(&cp.args, diff_index_args); cp.git_cmd = 1; cp.no_stdin = 1; diff --git a/t/t6120-describe.sh b/t/t6120-describe.sh index e78315d23d..79e0f19deb 100755 --- a/t/t6120-describe.sh +++ b/t/t6120-describe.sh @@ -671,4 +671,40 @@ test_expect_success 'setup misleading taggerdates' ' check_describe newer-tag-older-commit~1 --contains unique-file~2 +test_expect_success 'describe --dirty with a file with changed stat' ' + test_when_finished "rm -fr stat-dirty" && + git init stat-dirty && + ( + cd stat-dirty && + + echo A >file && + git add file && + git commit -m A && + git tag A -a -m A && + echo "A" >expect && + + test-tool chmtime -10 file && + git describe --dirty >actual && + test_cmp expect actual + ) +' + +test_expect_success 'describe --broken --dirty with a file with changed stat' ' + test_when_finished "rm -fr stat-dirty" && + git init stat-dirty && + ( + cd stat-dirty && + + echo A >file && + git add file && + git commit -m A && + git tag A -a -m A && + echo "A" >expect && + + test-tool chmtime -10 file && + git describe --dirty --broken >actual && + test_cmp expect actual + ) +' + test_done -- 2.45.2.606.g9005149a4a.dirty ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH v7] describe: refresh the index when 'broken' flag is used 2024-06-26 19:08 ` [PATCH v7] " Abhijeet Sonar @ 2024-06-26 19:25 ` Abhijeet Sonar 2024-06-27 6:01 ` Abhijeet Sonar 2024-06-27 15:47 ` Junio C Hamano 0 siblings, 2 replies; 32+ messages in thread From: Abhijeet Sonar @ 2024-06-26 19:25 UTC (permalink / raw) To: git; +Cc: Paul Millar, Junio C Hamano, Phillip Wood, Elijah Newren, Jeff King I have a question: Why does --dirty code path also not call git-update-index and instead does setup_work_tree(); prepare_repo_settings(the_repository); the_repository->settings.command_requires_full_index = 0; repo_read_index(the_repository); refresh_index(...); fd = repo_hold_locked_index(...); if (0 <= fd) repo_update_index_if_able(the_repository, &index_lock); I assume they are equivalent? The commit which introduced this -- bb571486ae93d02746c4bcc8032bde306f6d399a (describe: Refresh the index when run with --dirty) seems to be for the same objective as mu patch. Is this something you would like to see addressed? Thanks ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v7] describe: refresh the index when 'broken' flag is used 2024-06-26 19:25 ` Abhijeet Sonar @ 2024-06-27 6:01 ` Abhijeet Sonar 2024-06-27 15:47 ` Junio C Hamano 1 sibling, 0 replies; 32+ messages in thread From: Abhijeet Sonar @ 2024-06-27 6:01 UTC (permalink / raw) To: git On 27/06/24 00:55, Abhijeet Sonar wrote: > I have a question: > > Why does --dirty code path also not call git-update-index and instead does > > setup_work_tree(); > prepare_repo_settings(the_repository); > the_repository->settings.command_requires_full_index = 0; > repo_read_index(the_repository); > refresh_index(...); > fd = repo_hold_locked_index(...); > if (0 <= fd) > repo_update_index_if_able(the_repository, &index_lock); > > I assume they are equivalent? > The commit which introduced this -- > bb571486ae93d02746c4bcc8032bde306f6d399a (describe: Refresh the index > when run with --dirty) seems to be for the same objective as mu patch. > > Is this something you would like to see addressed? > > Thanks s/mu/my ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v7] describe: refresh the index when 'broken' flag is used 2024-06-26 19:25 ` Abhijeet Sonar 2024-06-27 6:01 ` Abhijeet Sonar @ 2024-06-27 15:47 ` Junio C Hamano 2024-06-27 17:33 ` Abhijeet Sonar 2024-06-30 16:12 ` Karthik Nayak 1 sibling, 2 replies; 32+ messages in thread From: Junio C Hamano @ 2024-06-27 15:47 UTC (permalink / raw) To: Abhijeet Sonar; +Cc: git, Paul Millar, Phillip Wood, Elijah Newren, Jeff King Abhijeet Sonar <abhijeet.nkt@gmail.com> writes: > I have a question: > > Why does --dirty code path also not call git-update-index and instead does > > setup_work_tree(); > prepare_repo_settings(the_repository); > the_repository->settings.command_requires_full_index = 0; > repo_read_index(the_repository); > refresh_index(...); > fd = repo_hold_locked_index(...); > if (0 <= fd) > repo_update_index_if_able(the_repository, &index_lock); > > I assume they are equivalent? Now we are going back full circles ;-)? Your earliest attempt indeed copied the above to the code paths used to handle "--broken", but then Phillip corrected the course https://lore.kernel.org/git/054c6ac1-4714-4600-afa5-7e9b6e9b0e72@gmail.com/ to avoid triggering an in-process error and instead run an equivalent "update-index --refresh" via the run_command() interface, so that we can catch potential errors. The code in the more recent rounds of your patch uses that, no? > The commit which introduced this -- > bb571486ae93d02746c4bcc8032bde306f6d399a (describe: Refresh the index > when run with --dirty) seems to be for the same objective as mu patch. Yes, but the cited message above explains the reason why the two code paths in your patch use different implementations, I would think. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v7] describe: refresh the index when 'broken' flag is used 2024-06-27 15:47 ` Junio C Hamano @ 2024-06-27 17:33 ` Abhijeet Sonar 2024-06-30 16:12 ` Karthik Nayak 1 sibling, 0 replies; 32+ messages in thread From: Abhijeet Sonar @ 2024-06-27 17:33 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Paul Millar, Phillip Wood, Elijah Newren, Jeff King On 27/06/24 21:17, Junio C Hamano wrote: > to avoid triggering an in-process error and instead run an > equivalent "update-index --refresh" via the run_command() interface, > so that we can catch potential errors. Thanks, I get it now. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v7] describe: refresh the index when 'broken' flag is used 2024-06-27 15:47 ` Junio C Hamano 2024-06-27 17:33 ` Abhijeet Sonar @ 2024-06-30 16:12 ` Karthik Nayak 2024-07-01 19:06 ` Junio C Hamano 1 sibling, 1 reply; 32+ messages in thread From: Karthik Nayak @ 2024-06-30 16:12 UTC (permalink / raw) To: Junio C Hamano, Abhijeet Sonar Cc: git, Paul Millar, Phillip Wood, Elijah Newren, Jeff King [-- Attachment #1: Type: text/plain, Size: 1334 bytes --] Junio C Hamano <gitster@pobox.com> writes: > Abhijeet Sonar <abhijeet.nkt@gmail.com> writes: > >> I have a question: >> >> Why does --dirty code path also not call git-update-index and instead does >> >> setup_work_tree(); >> prepare_repo_settings(the_repository); >> the_repository->settings.command_requires_full_index = 0; >> repo_read_index(the_repository); >> refresh_index(...); >> fd = repo_hold_locked_index(...); >> if (0 <= fd) >> repo_update_index_if_able(the_repository, &index_lock); >> >> I assume they are equivalent? > > Now we are going back full circles ;-)? > > Your earliest attempt indeed copied the above to the code paths used > to handle "--broken", but then Phillip corrected the course > > https://lore.kernel.org/git/054c6ac1-4714-4600-afa5-7e9b6e9b0e72@gmail.com/ > > to avoid triggering an in-process error and instead run an > equivalent "update-index --refresh" via the run_command() interface, > so that we can catch potential errors. The code in the more recent > rounds of your patch uses that, no? > This explains for why 'broken' must use a subprocess, but there is nothing stopping 'dirty' from also using a subprocess, right? It currently uses an in-process index refresh but it _could_ be a subprocess too. Does it need to be a subprocess? I can't think of any reason to make it. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 690 bytes --] ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v7] describe: refresh the index when 'broken' flag is used 2024-06-30 16:12 ` Karthik Nayak @ 2024-07-01 19:06 ` Junio C Hamano 2024-07-02 10:13 ` Karthik Nayak 0 siblings, 1 reply; 32+ messages in thread From: Junio C Hamano @ 2024-07-01 19:06 UTC (permalink / raw) To: Karthik Nayak Cc: Abhijeet Sonar, git, Paul Millar, Phillip Wood, Elijah Newren, Jeff King Karthik Nayak <karthik.188@gmail.com> writes: > This explains for why 'broken' must use a subprocess, but there is > nothing stopping 'dirty' from also using a subprocess, right? It > currently uses an in-process index refresh but it _could_ be a > subprocess too. Correct, except that it does not make sense to do any and all things that you _could_ do. So... ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v7] describe: refresh the index when 'broken' flag is used 2024-07-01 19:06 ` Junio C Hamano @ 2024-07-02 10:13 ` Karthik Nayak 2024-07-03 18:17 ` Junio C Hamano 0 siblings, 1 reply; 32+ messages in thread From: Karthik Nayak @ 2024-07-02 10:13 UTC (permalink / raw) To: Junio C Hamano Cc: Abhijeet Sonar, git, Paul Millar, Phillip Wood, Elijah Newren, Jeff King [-- Attachment #1: Type: text/plain, Size: 921 bytes --] Junio C Hamano <gitster@pobox.com> writes: > Karthik Nayak <karthik.188@gmail.com> writes: > >> This explains for why 'broken' must use a subprocess, but there is >> nothing stopping 'dirty' from also using a subprocess, right? It >> currently uses an in-process index refresh but it _could_ be a >> subprocess too. > > Correct, except that it does not make sense to do any and all things > that you _could_ do. So... Well, In this context, I think there is some merit though. There are two blocks of code `--broken` and `--dirty` one after the other which both need to refresh the index. With this patch, 'broken' will use a child process to do so while 'dirty' will use `refresh_index(...)`. To someone reading the code it would seem a bit confusing. I agree there is no merit in using a child process in 'dirty' by itself. But I also think we should leave a comment there for readers to understand the distinction. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 690 bytes --] ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH v7] describe: refresh the index when 'broken' flag is used 2024-07-02 10:13 ` Karthik Nayak @ 2024-07-03 18:17 ` Junio C Hamano 2024-07-03 20:41 ` Karthik Nayak 0 siblings, 1 reply; 32+ messages in thread From: Junio C Hamano @ 2024-07-03 18:17 UTC (permalink / raw) To: Karthik Nayak Cc: Abhijeet Sonar, git, Paul Millar, Phillip Wood, Elijah Newren, Jeff King Karthik Nayak <karthik.188@gmail.com> writes: > Junio C Hamano <gitster@pobox.com> writes: > >> Karthik Nayak <karthik.188@gmail.com> writes: >> >>> This explains for why 'broken' must use a subprocess, but there is >>> nothing stopping 'dirty' from also using a subprocess, right? It >>> currently uses an in-process index refresh but it _could_ be a >>> subprocess too. >> >> Correct, except that it does not make sense to do any and all things >> that you _could_ do. So... > > Well, In this context, I think there is some merit though. There are two > blocks of code `--broken` and `--dirty` one after the other which both > need to refresh the index. With this patch, 'broken' will use a child > process to do so while 'dirty' will use `refresh_index(...)`. To someone > reading the code it would seem a bit confusing. Yes, that much I very much agree. > I agree there is no > merit in using a child process in 'dirty' by itself. Yes, that made me puzzled why you brought it up, as it was way too oblique suggestion to ... > But I also think we > should leave a comment there for readers to understand the distinction. ... improve the "documentation" to help future developers who wonder why the code are in the shape as it is. In this particular case, I think it is borderline if the issue warrants in-code comment or if it is a bit too much. Describing the same thing in the log message would probably be a valid alternative, as "git blame" can lead those readers to the commit that introduced the distinction (in other words, this one). Thanks. diff --git i/builtin/describe.c w/builtin/describe.c index e936d2c19f..bc2ad60b35 100644 --- i/builtin/describe.c +++ w/builtin/describe.c @@ -648,6 +648,14 @@ int cmd_describe(int argc, const char **argv, const char *prefix) if (argc == 0) { if (broken) { + /* + * Avoid doing these "update-index --refresh" + * and "diff-index" operations in-process + * (like how the code path for "--dirty" + * without "--broken" does so below), as we + * are told to prepare for a broken repository + * where running these may lead to die(). + */ struct child_process cp = CHILD_PROCESS_INIT; strvec_pushv(&cp.args, update_index_args); ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH v7] describe: refresh the index when 'broken' flag is used 2024-07-03 18:17 ` Junio C Hamano @ 2024-07-03 20:41 ` Karthik Nayak 0 siblings, 0 replies; 32+ messages in thread From: Karthik Nayak @ 2024-07-03 20:41 UTC (permalink / raw) To: Junio C Hamano Cc: Abhijeet Sonar, git, Paul Millar, Phillip Wood, Elijah Newren, Jeff King [-- Attachment #1: Type: text/plain, Size: 2533 bytes --] Junio C Hamano <gitster@pobox.com> writes: > Karthik Nayak <karthik.188@gmail.com> writes: > >> Junio C Hamano <gitster@pobox.com> writes: >> >>> Karthik Nayak <karthik.188@gmail.com> writes: >>> >>>> This explains for why 'broken' must use a subprocess, but there is >>>> nothing stopping 'dirty' from also using a subprocess, right? It >>>> currently uses an in-process index refresh but it _could_ be a >>>> subprocess too. >>> >>> Correct, except that it does not make sense to do any and all things >>> that you _could_ do. So... >> >> Well, In this context, I think there is some merit though. There are two >> blocks of code `--broken` and `--dirty` one after the other which both >> need to refresh the index. With this patch, 'broken' will use a child >> process to do so while 'dirty' will use `refresh_index(...)`. To someone >> reading the code it would seem a bit confusing. > > Yes, that much I very much agree. > >> I agree there is no >> merit in using a child process in 'dirty' by itself. > > Yes, that made me puzzled why you brought it up, as it was way too > oblique suggestion to ... > Yeah, I should've been more verbose there. >> But I also think we >> should leave a comment there for readers to understand the distinction. > > ... improve the "documentation" to help future developers who wonder > why the code are in the shape as it is. > > In this particular case, I think it is borderline if the issue > warrants in-code comment or if it is a bit too much. Describing the > same thing in the log message would probably be a valid alternative, > as "git blame" can lead those readers to the commit that introduced > the distinction (in other words, this one). > I think it is always best to err on the side of more documentation than less in such situations. > Thanks. > > diff --git i/builtin/describe.c w/builtin/describe.c > index e936d2c19f..bc2ad60b35 100644 > --- i/builtin/describe.c > +++ w/builtin/describe.c > @@ -648,6 +648,14 @@ int cmd_describe(int argc, const char **argv, const char *prefix) > > if (argc == 0) { > if (broken) { > + /* > + * Avoid doing these "update-index --refresh" > + * and "diff-index" operations in-process > + * (like how the code path for "--dirty" > + * without "--broken" does so below), as we > + * are told to prepare for a broken repository > + * where running these may lead to die(). > + */ > struct child_process cp = CHILD_PROCESS_INIT; > > strvec_pushv(&cp.args, update_index_args); This looks perfect, thanks! [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 690 bytes --] ^ permalink raw reply [flat|nested] 32+ messages in thread
end of thread, other threads:[~2024-07-03 20:41 UTC | newest] Thread overview: 32+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-06-25 13:35 [PATCH v3] describe: refresh the index when 'broken' flag is used Abhijeet Sonar 2024-06-25 15:59 ` Junio C Hamano 2024-06-25 16:05 ` Junio C Hamano 2024-06-26 11:16 ` Karthik Nayak 2024-06-26 6:11 ` Abhijeet Sonar 2024-06-26 6:37 ` [PATCH v4] " Abhijeet Sonar 2024-06-26 6:50 ` Abhijeet Sonar 2024-06-26 6:52 ` [PATCH v5] " Abhijeet Sonar 2024-06-26 11:30 ` Karthik Nayak 2024-06-26 12:06 ` Abhijeet Sonar 2024-06-26 15:34 ` Re* " Junio C Hamano 2024-06-26 16:17 ` Junio C Hamano 2024-06-26 17:29 ` Abhijeet Sonar 2024-06-26 17:35 ` Junio C Hamano 2024-06-26 17:45 ` Junio C Hamano 2024-06-26 18:07 ` Abhijeet Sonar 2024-06-26 18:49 ` Junio C Hamano 2024-06-26 20:34 ` Jeff King 2024-06-27 0:33 ` Jeff King 2024-06-26 21:23 ` Karthik Nayak 2024-06-26 14:59 ` Junio C Hamano 2024-06-26 18:31 ` Junio C Hamano 2024-06-26 19:08 ` [PATCH v7] " Abhijeet Sonar 2024-06-26 19:25 ` Abhijeet Sonar 2024-06-27 6:01 ` Abhijeet Sonar 2024-06-27 15:47 ` Junio C Hamano 2024-06-27 17:33 ` Abhijeet Sonar 2024-06-30 16:12 ` Karthik Nayak 2024-07-01 19:06 ` Junio C Hamano 2024-07-02 10:13 ` Karthik Nayak 2024-07-03 18:17 ` Junio C Hamano 2024-07-03 20:41 ` Karthik Nayak
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).