* [PATCH 0/6] ISOC23: quell warnings on discarding const
@ 2026-03-26 15:22 Michael J Gruber
2026-03-26 15:22 ` [PATCH 1/6] do not discard const: the simple cases Michael J Gruber
` (6 more replies)
0 siblings, 7 replies; 24+ messages in thread
From: Michael J Gruber @ 2026-03-26 15:22 UTC (permalink / raw)
To: git
Hi there
Fedora 44 beta (gcc-16.0.1, glibc-2.43) brought some fun new warnings
when building git. In essence, we're not always explicit about
const-ness or lack thereof of certain pointers. Before, strchr()'s
signature which turns const arguments into non-const return values
covered this up. With ISOC23, strchr() and friends return const
pointers.
This little series takes a middle-ground: no new data types (no new
const versions of non-const data types) but more explicit casts.
Michael J Gruber (6):
do not discard const: the simple cases
do not discard const: make git-compat-util ISOC23-like
do not discard const: adjust to non-const data types
do not discard const: declare const where we stay const
do not discard const: keep signature
do not discard const: the ugly truth
builtin/config.c | 2 +-
builtin/receive-pack.c | 6 +++---
builtin/rev-parse.c | 8 ++++----
convert.c | 3 ++-
git-compat-util.h | 2 +-
http-push.c | 2 +-
http.c | 2 +-
pager.c | 2 +-
pseudo-merge.c | 2 +-
range-diff.c | 2 +-
refs/files-backend.c | 2 +-
remote.c | 2 +-
revision.c | 8 ++++----
run-command.c | 2 +-
scalar.c | 2 +-
send-pack.c | 6 +++---
submodule.c | 2 +-
transport-helper.c | 2 +-
18 files changed, 29 insertions(+), 28 deletions(-)
--
2.53.0.1195.g771ffcb452
^ permalink raw reply [flat|nested] 24+ messages in thread* [PATCH 1/6] do not discard const: the simple cases 2026-03-26 15:22 [PATCH 0/6] ISOC23: quell warnings on discarding const Michael J Gruber @ 2026-03-26 15:22 ` Michael J Gruber 2026-03-26 17:34 ` Jeff King 2026-03-26 15:22 ` [PATCH 2/6] do not discard const: make git-compat-util ISOC23-like Michael J Gruber ` (5 subsequent siblings) 6 siblings, 1 reply; 24+ messages in thread From: Michael J Gruber @ 2026-03-26 15:22 UTC (permalink / raw) To: git Depending on glibc version and compiler flags (ISOC23), strchr() and friends from string.h may return const pointers for const arguments and non-const for non-const (rather than non-const for all argument types). In particular, our current code base gives warnings such as: ``` builtin/config.c: In function 'get_urlmatch': builtin/config.c:855:22: warning: assignment discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers] 855 | section_tail = strchr(config.section, '.'); | ^ ``` Note that with ``` const char *foo; char *bar; ``` we can always assign `foo = bar` but `bar = foo` throws said warning. In particular, we often pass in `char *` for a `const char *` argument and expect `char *` back but get a `const char *` in said scenario. This patch covers the easy cases where we deal with a non-const pointer to begin with. It is solved by the cast `bar = (char *) foo`. --- builtin/config.c | 2 +- builtin/receive-pack.c | 6 +++--- http.c | 2 +- pager.c | 2 +- range-diff.c | 2 +- refs/files-backend.c | 2 +- remote.c | 2 +- send-pack.c | 6 +++--- transport-helper.c | 2 +- 9 files changed, 13 insertions(+), 13 deletions(-) diff --git a/builtin/config.c b/builtin/config.c index 7c4857be62..bd277e5911 100644 --- a/builtin/config.c +++ b/builtin/config.c @@ -852,7 +852,7 @@ static int get_urlmatch(const struct config_location_options *opts, die("%s", config.url.err); config.section = xstrdup_tolower(var); - section_tail = strchr(config.section, '.'); + section_tail = (char *) strchr(config.section, '.'); if (section_tail) { *section_tail = '\0'; config.key = section_tail + 1; diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c index e34edff406..7712a1c3c2 100644 --- a/builtin/receive-pack.c +++ b/builtin/receive-pack.c @@ -1042,7 +1042,7 @@ static int read_proc_receive_report(struct packet_reader *reader, response++; head = reader->line; - p = strchr(head, ' '); + p = (char *) strchr(head, ' '); if (!p) { strbuf_addf(errmsg, "proc-receive reported incomplete status line: '%s'\n", head); code = -1; @@ -1072,7 +1072,7 @@ static int read_proc_receive_report(struct packet_reader *reader, new_report = 0; } key = p; - p = strchr(key, ' '); + p = (char *) strchr(key, ' '); if (p) *p++ = '\0'; val = p; @@ -1095,7 +1095,7 @@ static int read_proc_receive_report(struct packet_reader *reader, report = NULL; new_report = 0; refname = p; - p = strchr(refname, ' '); + p = (char *) strchr(refname, ' '); if (p) *p++ = '\0'; if (strcmp(head, "ok") && strcmp(head, "ng")) { diff --git a/http.c b/http.c index d8d016891b..02c4fbb234 100644 --- a/http.c +++ b/http.c @@ -774,7 +774,7 @@ static int redact_sensitive_header(struct strbuf *header, size_t offset) while (cookie) { char *equals; - char *semicolon = strstr(cookie, "; "); + char *semicolon = (char *) strstr(cookie, "; "); if (semicolon) *semicolon = 0; equals = strchrnul(cookie, '='); diff --git a/pager.c b/pager.c index 5531fff50e..eb7011bfde 100644 --- a/pager.c +++ b/pager.c @@ -118,7 +118,7 @@ static void setup_pager_env(struct strvec *env) split_cmdline_strerror(n)); for (i = 0; i < n; i++) { - char *cp = strchr(argv[i], '='); + char *cp = (char *) strchr(argv[i], '='); if (!cp) die("malformed build-time PAGER_ENV"); diff --git a/range-diff.c b/range-diff.c index 2712a9a107..47e36a391f 100644 --- a/range-diff.c +++ b/range-diff.c @@ -106,7 +106,7 @@ static int read_patches(const char *range, struct string_list *list, strbuf_reset(&buf); } CALLOC_ARRAY(util, 1); - if (include_merges && (q = strstr(p, " (from "))) + if (include_merges && (q = (char *) strstr(p, " (from "))) *q = '\0'; if (repo_get_oid(the_repository, p, &util->oid)) { error(_("could not parse commit '%s'"), p); diff --git a/refs/files-backend.c b/refs/files-backend.c index 0537a72b2a..71cab7e003 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -2196,7 +2196,7 @@ static int show_one_reflog_ent(struct files_ref_store *refs, if (!sb->len || sb->buf[sb->len - 1] != '\n' || parse_oid_hex_algop(p, &ooid, &p, refs->base.repo->hash_algo) || *p++ != ' ' || parse_oid_hex_algop(p, &noid, &p, refs->base.repo->hash_algo) || *p++ != ' ' || - !(email_end = strchr(p, '>')) || + !(email_end = (char *) strchr(p, '>')) || email_end[1] != ' ' || !(timestamp = parse_timestamp(email_end + 2, &message, 10)) || !message || message[0] != ' ' || diff --git a/remote.c b/remote.c index 7ca2a6501b..d7a37016b5 100644 --- a/remote.c +++ b/remote.c @@ -2861,7 +2861,7 @@ void remote_state_clear(struct remote_state *remote_state) */ static int chop_last_dir(char **remoteurl, int is_relative) { - char *rfind = find_last_dir_sep(*remoteurl); + char *rfind = (char *) find_last_dir_sep(*remoteurl); if (rfind) { *rfind = '\0'; return 0; diff --git a/send-pack.c b/send-pack.c index 07ecfae4de..8b9f7e2f2f 100644 --- a/send-pack.c +++ b/send-pack.c @@ -181,7 +181,7 @@ static int receive_status(struct repository *r, if (packet_reader_read(reader) != PACKET_READ_NORMAL) break; head = reader->line; - p = strchr(head, ' '); + p = (char *) strchr(head, ' '); if (!p) { error("invalid status line from remote: %s", reader->line); ret = -1; @@ -212,7 +212,7 @@ static int receive_status(struct repository *r, new_report = 0; } key = p; - p = strchr(key, ' '); + p = (char *) strchr(key, ' '); if (p) *p++ = '\0'; val = p; @@ -237,7 +237,7 @@ static int receive_status(struct repository *r, break; } refname = p; - p = strchr(refname, ' '); + p = (char *) strchr(refname, ' '); if (p) *p++ = '\0'; /* first try searching at our hint, falling back to all refs */ diff --git a/transport-helper.c b/transport-helper.c index 4d95d84f9e..e7f2cb1812 100644 --- a/transport-helper.c +++ b/transport-helper.c @@ -800,7 +800,7 @@ static int push_update_ref_status(struct strbuf *buf, state->new_report = 0; } key = buf->buf + 7; - p = strchr(key, ' '); + p = (char *) strchr(key, ' '); if (p) *p++ = '\0'; val = p; -- 2.53.0.1195.g771ffcb452 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH 1/6] do not discard const: the simple cases 2026-03-26 15:22 ` [PATCH 1/6] do not discard const: the simple cases Michael J Gruber @ 2026-03-26 17:34 ` Jeff King 2026-03-26 17:45 ` Junio C Hamano 0 siblings, 1 reply; 24+ messages in thread From: Jeff King @ 2026-03-26 17:34 UTC (permalink / raw) To: Michael J Gruber; +Cc: git On Thu, Mar 26, 2026 at 04:22:47PM +0100, Michael J Gruber wrote: > This patch covers the easy cases where we deal with a non-const pointer > to begin with. It is solved by the cast `bar = (char *) foo`. I think we can often do better, though. For example, in this case: > diff --git a/builtin/config.c b/builtin/config.c > index 7c4857be62..bd277e5911 100644 > --- a/builtin/config.c > +++ b/builtin/config.c > @@ -852,7 +852,7 @@ static int get_urlmatch(const struct config_location_options *opts, > die("%s", config.url.err); > > config.section = xstrdup_tolower(var); > - section_tail = strchr(config.section, '.'); > + section_tail = (char *) strchr(config.section, '.'); > if (section_tail) { > *section_tail = '\0'; > config.key = section_tail + 1; We know that it is OK to cast away the const-ness because config.section is writeable, which we know because it just came from xstrdup(). So why is it const in the first place? Because the pointer is in a struct which may be used with other const strings. But we can untangle this for the compiler without having to cast by using a non-const alias, like: char *section; ... config.section = section = xstrdup_tolower(var); section_tail = strchr(section, '.'); Which I think is safer and shows the intent more clearly. Some of the other cases below can use similar techniques (e.g., I think packet_reader's line probably ought to be non-const). -Peff ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/6] do not discard const: the simple cases 2026-03-26 17:34 ` Jeff King @ 2026-03-26 17:45 ` Junio C Hamano 2026-03-26 19:23 ` [PATCH] config: store allocated string in non-const pointer Jeff King 0 siblings, 1 reply; 24+ messages in thread From: Junio C Hamano @ 2026-03-26 17:45 UTC (permalink / raw) To: Jeff King; +Cc: Michael J Gruber, git Jeff King <peff@peff.net> writes: > On Thu, Mar 26, 2026 at 04:22:47PM +0100, Michael J Gruber wrote: > >> This patch covers the easy cases where we deal with a non-const pointer >> to begin with. It is solved by the cast `bar = (char *) foo`. > > I think we can often do better, though. For example, in this case: > >> diff --git a/builtin/config.c b/builtin/config.c >> index 7c4857be62..bd277e5911 100644 >> --- a/builtin/config.c >> +++ b/builtin/config.c >> @@ -852,7 +852,7 @@ static int get_urlmatch(const struct config_location_options *opts, >> die("%s", config.url.err); >> >> config.section = xstrdup_tolower(var); >> - section_tail = strchr(config.section, '.'); >> + section_tail = (char *) strchr(config.section, '.'); >> if (section_tail) { >> *section_tail = '\0'; >> config.key = section_tail + 1; > > We know that it is OK to cast away the const-ness because config.section > is writeable, which we know because it just came from xstrdup(). So why > is it const in the first place? Because the pointer is in a struct which > may be used with other const strings. > > But we can untangle this for the compiler without having to cast by > using a non-const alias, like: > > char *section; > ... > config.section = section = xstrdup_tolower(var); > section_tail = strchr(section, '.'); > > Which I think is safer and shows the intent more clearly. Yeah, this is much clearer. ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH] config: store allocated string in non-const pointer 2026-03-26 17:45 ` Junio C Hamano @ 2026-03-26 19:23 ` Jeff King 0 siblings, 0 replies; 24+ messages in thread From: Jeff King @ 2026-03-26 19:23 UTC (permalink / raw) To: Junio C Hamano; +Cc: Michael J Gruber, git On Thu, Mar 26, 2026 at 10:45:41AM -0700, Junio C Hamano wrote: > > But we can untangle this for the compiler without having to cast by > > using a non-const alias, like: > [...] > Yeah, this is much clearer. Here it is as a patch with a commit message. I was eventually planning to do a complete series that cleans up all the warnings, and this would be part of it. But since other people are starting to work on it, too, it may make sense to just send them out as we have them to avoid too much duplication. -- >8 -- Subject: [PATCH] config: store allocated string in non-const pointer When git-config matches a url, we copy the variable section name and store it in the "section" member of a urlmatch_config struct. That member is const, since the url-matcher will not touch it (and other callers really will have a const string). But that means that we have only a const pointer to our allocated string. We have to cast away the constness when we free it, and likewise when we assign NUL to tie off the "." separating the subsection and key. This latter happens implicitly via a strchr() call, but recent versions of glibc have added annotations that let the compiler detect that and complain. Let's keep our own "section" pointer for the non-const string, and then just point config.section at it. That avoids all of the casting, both explicit and implicit. Signed-off-by: Jeff King <peff@peff.net> --- builtin/config.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/builtin/config.c b/builtin/config.c index 7c4857be62..cf4ba0f7cc 100644 --- a/builtin/config.c +++ b/builtin/config.c @@ -838,6 +838,7 @@ static int get_urlmatch(const struct config_location_options *opts, const char *var, const char *url) { int ret; + char *section; char *section_tail; struct config_display_options display_opts = *_display_opts; struct string_list_item *item; @@ -851,8 +852,8 @@ static int get_urlmatch(const struct config_location_options *opts, if (!url_normalize(url, &config.url)) die("%s", config.url.err); - config.section = xstrdup_tolower(var); - section_tail = strchr(config.section, '.'); + config.section = section = xstrdup_tolower(var); + section_tail = strchr(section, '.'); if (section_tail) { *section_tail = '\0'; config.key = section_tail + 1; @@ -886,7 +887,7 @@ static int get_urlmatch(const struct config_location_options *opts, string_list_clear(&values, 1); free(config.url.url); - free((void *)config.section); + free(section); return ret; } -- 2.53.0.1081.gf77a8b8145 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 2/6] do not discard const: make git-compat-util ISOC23-like 2026-03-26 15:22 [PATCH 0/6] ISOC23: quell warnings on discarding const Michael J Gruber 2026-03-26 15:22 ` [PATCH 1/6] do not discard const: the simple cases Michael J Gruber @ 2026-03-26 15:22 ` Michael J Gruber 2026-03-26 15:22 ` [PATCH 3/6] do not discard const: adjust to non-const data types Michael J Gruber ` (4 subsequent siblings) 6 siblings, 0 replies; 24+ messages in thread From: Michael J Gruber @ 2026-03-26 15:22 UTC (permalink / raw) To: git find_last_dir() should and can return a const pointer. This change fixes the warnings with ISOC23 for git-compat-util and - via explicit casts - makes it clear where we mutate the returned memory. --- git-compat-util.h | 2 +- scalar.c | 2 +- submodule.c | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/git-compat-util.h b/git-compat-util.h index 4b4ea2498f..3c3dbe298c 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -335,7 +335,7 @@ static inline int is_path_owned_by_current_uid(const char *path, #endif #ifndef find_last_dir_sep -static inline char *git_find_last_dir_sep(const char *path) +static inline const char *git_find_last_dir_sep(const char *path) { return strrchr(path, '/'); } diff --git a/scalar.c b/scalar.c index 4efb6ac36d..44f432d7f0 100644 --- a/scalar.c +++ b/scalar.c @@ -479,7 +479,7 @@ static int cmd_clone(int argc, const char **argv) /* Strip suffix `.git`, if any */ strbuf_strip_suffix(&buf, ".git"); - enlistment = find_last_dir_sep(buf.buf); + enlistment = (char *) find_last_dir_sep(buf.buf); if (!enlistment) { die(_("cannot deduce worktree name from '%s'"), url); } diff --git a/submodule.c b/submodule.c index b1a0363f9d..57933386bc 100644 --- a/submodule.c +++ b/submodule.c @@ -2268,7 +2268,7 @@ static int check_casefolding_conflict(const char *git_dir, DIR *dir = NULL; int ret = 0; - if ((p = find_last_dir_sep(modules_dir))) + if ((p = (char *) find_last_dir_sep(modules_dir))) *p = '\0'; /* No conflict is possible if modules_dir doesn't exist (first clone) */ -- 2.53.0.1195.g771ffcb452 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 3/6] do not discard const: adjust to non-const data types 2026-03-26 15:22 [PATCH 0/6] ISOC23: quell warnings on discarding const Michael J Gruber 2026-03-26 15:22 ` [PATCH 1/6] do not discard const: the simple cases Michael J Gruber 2026-03-26 15:22 ` [PATCH 2/6] do not discard const: make git-compat-util ISOC23-like Michael J Gruber @ 2026-03-26 15:22 ` Michael J Gruber 2026-03-26 17:28 ` Junio C Hamano 2026-03-26 15:22 ` [PATCH 4/6] do not discard const: declare const where we stay const Michael J Gruber ` (3 subsequent siblings) 6 siblings, 1 reply; 24+ messages in thread From: Michael J Gruber @ 2026-03-26 15:22 UTC (permalink / raw) To: git We use data types (such as string_list's util member) which are not necessarily "non-const in practice" (such as the list of environment variables in run-command.c) but are not declared "const". Rather than duplicating data types (e.g. with a new constr_string_list), discard the const explicitly for now to quell ISOC23 warnings. --- http-push.c | 2 +- run-command.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/http-push.c b/http-push.c index 9ae6062198..acc7f1d8fa 100644 --- a/http-push.c +++ b/http-push.c @@ -1772,7 +1772,7 @@ int cmd_main(int argc, const char **argv) str_end_url_with_slash(arg, &repo->url); repo->path_len = strlen(repo->url); if (path) { - repo->path = strchr(path+2, '/'); + repo->path = (char *) strchr(path+2, '/'); if (repo->path) repo->path_len = strlen(repo->path); } diff --git a/run-command.c b/run-command.c index 32c290ee6a..1db02ef030 100644 --- a/run-command.c +++ b/run-command.c @@ -604,7 +604,7 @@ static void trace_add_env(struct strbuf *dst, const char *const *deltaenv) /* Last one wins, see run-command.c:prep_childenv() for context */ for (e = deltaenv; e && *e; e++) { struct strbuf key = STRBUF_INIT; - char *equals = strchr(*e, '='); + char *equals = (char *) strchr(*e, '='); if (equals) { strbuf_add(&key, *e, equals - *e); -- 2.53.0.1195.g771ffcb452 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH 3/6] do not discard const: adjust to non-const data types 2026-03-26 15:22 ` [PATCH 3/6] do not discard const: adjust to non-const data types Michael J Gruber @ 2026-03-26 17:28 ` Junio C Hamano 0 siblings, 0 replies; 24+ messages in thread From: Junio C Hamano @ 2026-03-26 17:28 UTC (permalink / raw) To: Michael J Gruber; +Cc: git Michael J Gruber <git@grubix.eu> writes: > We use data types (such as string_list's util member) which are not > necessarily "non-const in practice" (such as the list of environment > variables in run-command.c) but are not declared "const". Rather than > duplicating data types (e.g. with a new constr_string_list), discard the > const explicitly for now to quell ISOC23 warnings. > --- > http-push.c | 2 +- > run-command.c | 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/http-push.c b/http-push.c > index 9ae6062198..acc7f1d8fa 100644 > --- a/http-push.c > +++ b/http-push.c > @@ -1772,7 +1772,7 @@ int cmd_main(int argc, const char **argv) > str_end_url_with_slash(arg, &repo->url); > repo->path_len = strlen(repo->url); > if (path) { > - repo->path = strchr(path+2, '/'); > + repo->path = (char *) strchr(path+2, '/'); > if (repo->path) > repo->path_len = strlen(repo->path); > } > diff --git a/run-command.c b/run-command.c > index 32c290ee6a..1db02ef030 100644 > --- a/run-command.c > +++ b/run-command.c > @@ -604,7 +604,7 @@ static void trace_add_env(struct strbuf *dst, const char *const *deltaenv) > /* Last one wins, see run-command.c:prep_childenv() for context */ > for (e = deltaenv; e && *e; e++) { > struct strbuf key = STRBUF_INIT; > - char *equals = strchr(*e, '='); > + char *equals = (char *) strchr(*e, '='); > > if (equals) { > strbuf_add(&key, *e, equals - *e); I didn't look at the other http-push.c one, but this part with a bit wider context reads like this: for (e = deltaenv; e && *e; e++) { struct strbuf key = STRBUF_INIT; char *equals = strchr(*e, '='); if (equals) { strbuf_add(&key, *e, equals - *e); string_list_insert(&envs, key.buf)->util = equals + 1; } else { string_list_insert(&envs, *e)->util = NULL; } strbuf_release(&key); } I wonder if the cast to strip away constness wants to go near the assignment to ->util. ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH 4/6] do not discard const: declare const where we stay const 2026-03-26 15:22 [PATCH 0/6] ISOC23: quell warnings on discarding const Michael J Gruber ` (2 preceding siblings ...) 2026-03-26 15:22 ` [PATCH 3/6] do not discard const: adjust to non-const data types Michael J Gruber @ 2026-03-26 15:22 ` Michael J Gruber 2026-03-26 15:22 ` [PATCH 5/6] do not discard const: keep signature Michael J Gruber ` (2 subsequent siblings) 6 siblings, 0 replies; 24+ messages in thread From: Michael J Gruber @ 2026-03-26 15:22 UTC (permalink / raw) To: git This may sound like the easiest case, but for non ISOC23 with non-const strchr() this involves an implicit cast to const. --- convert.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/convert.c b/convert.c index a34ec6ecdc..eae36c8a59 100644 --- a/convert.c +++ b/convert.c @@ -1168,7 +1168,8 @@ static int ident_to_worktree(const char *src, size_t len, struct strbuf *buf, int ident) { struct object_id oid; - char *to_free = NULL, *dollar, *spc; + char *to_free = NULL; + const char *dollar, *spc; int cnt; if (!ident) -- 2.53.0.1195.g771ffcb452 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 5/6] do not discard const: keep signature 2026-03-26 15:22 [PATCH 0/6] ISOC23: quell warnings on discarding const Michael J Gruber ` (3 preceding siblings ...) 2026-03-26 15:22 ` [PATCH 4/6] do not discard const: declare const where we stay const Michael J Gruber @ 2026-03-26 15:22 ` Michael J Gruber 2026-03-26 17:18 ` Junio C Hamano 2026-03-26 15:22 ` [PATCH 6/6] do not discard const: the ugly truth Michael J Gruber 2026-03-26 16:26 ` [PATCH 0/6] ISOC23: quell warnings on discarding const D. Ben Knoble 6 siblings, 1 reply; 24+ messages in thread From: Michael J Gruber @ 2026-03-26 15:22 UTC (permalink / raw) To: git Here, while we do not mutate the struct itself, many other signatures expect a non-const argument - possibly unnecessarily - so we opt to keep the original signature by casting to non-const. --- pseudo-merge.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/pseudo-merge.c b/pseudo-merge.c index a2d5bd85f9..ac81792e65 100644 --- a/pseudo-merge.c +++ b/pseudo-merge.c @@ -644,7 +644,7 @@ static struct pseudo_merge_commit *find_pseudo_merge(const struct pseudo_merge_m if (!pm->commits_nr) return NULL; - return bsearch(&pos, pm->commits, pm->commits_nr, + return (struct pseudo_merge_commit *) bsearch(&pos, pm->commits, pm->commits_nr, PSEUDO_MERGE_COMMIT_RAWSZ, pseudo_merge_commit_cmp); } -- 2.53.0.1195.g771ffcb452 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH 5/6] do not discard const: keep signature 2026-03-26 15:22 ` [PATCH 5/6] do not discard const: keep signature Michael J Gruber @ 2026-03-26 17:18 ` Junio C Hamano 0 siblings, 0 replies; 24+ messages in thread From: Junio C Hamano @ 2026-03-26 17:18 UTC (permalink / raw) To: Michael J Gruber; +Cc: git Michael J Gruber <git@grubix.eu> writes: > Here, while we do not mutate the struct itself, many other signatures > expect a non-const argument - possibly unnecessarily - so we opt to keep > the original signature by casting to non-const. > --- Sorry, but I do not understand the above description, or the code change. Doesn't bsearch() returns non-const "void *" pointer? Ah, the constness of the return value in C23 depends on the constness of pm->commits[] array, which inherits its constness from the constness of parameter pm to the function, and you cast the value we are going to return explicitly to a non-const pointer. OK. In the context of "C23 constness" patch series, that may be obvious to you, but I suspect a future reader who finds this single commit from the output of "git blame" or something would be puzzled unless we say this is about adjusting to C23 that makes bsearch() a qualifier-preserving function somewhere in the log message. > pseudo-merge.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/pseudo-merge.c b/pseudo-merge.c > index a2d5bd85f9..ac81792e65 100644 > --- a/pseudo-merge.c > +++ b/pseudo-merge.c > @@ -644,7 +644,7 @@ static struct pseudo_merge_commit *find_pseudo_merge(const struct pseudo_merge_m > if (!pm->commits_nr) > return NULL; > > - return bsearch(&pos, pm->commits, pm->commits_nr, > + return (struct pseudo_merge_commit *) bsearch(&pos, pm->commits, pm->commits_nr, > PSEUDO_MERGE_COMMIT_RAWSZ, pseudo_merge_commit_cmp); > } ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH 6/6] do not discard const: the ugly truth 2026-03-26 15:22 [PATCH 0/6] ISOC23: quell warnings on discarding const Michael J Gruber ` (4 preceding siblings ...) 2026-03-26 15:22 ` [PATCH 5/6] do not discard const: keep signature Michael J Gruber @ 2026-03-26 15:22 ` Michael J Gruber 2026-03-26 17:07 ` Junio C Hamano 2026-03-26 17:42 ` Jeff King 2026-03-26 16:26 ` [PATCH 0/6] ISOC23: quell warnings on discarding const D. Ben Knoble 6 siblings, 2 replies; 24+ messages in thread From: Michael J Gruber @ 2026-03-26 15:22 UTC (permalink / raw) To: git ISOC23 reveals that we mutate argv strings in place. Confess to this with explicit casts. --- builtin/rev-parse.c | 8 ++++---- revision.c | 8 ++++---- 2 files changed, 8 insertions(+), 8 deletions(-) diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c index 01a62800e8..f429793b6f 100644 --- a/builtin/rev-parse.c +++ b/builtin/rev-parse.c @@ -265,7 +265,7 @@ static int show_file(const char *arg, int output_prefix) return 0; } -static int try_difference(const char *arg) +static int try_difference(char *arg) { char *dotdot; struct object_id start_oid; @@ -325,7 +325,7 @@ static int try_difference(const char *arg) return 0; } -static int try_parent_shorthands(const char *arg) +static int try_parent_shorthands(char *arg) { char *dotdot; struct object_id oid; @@ -1145,9 +1145,9 @@ int cmd_rev_parse(int argc, } /* Not a flag argument */ - if (try_difference(arg)) + if (try_difference((char *) arg)) continue; - if (try_parent_shorthands(arg)) + if (try_parent_shorthands((char *) arg)) continue; name = arg; type = NORMAL; diff --git a/revision.c b/revision.c index 31808e3df0..a28b14a2ea 100644 --- a/revision.c +++ b/revision.c @@ -2132,7 +2132,7 @@ static int handle_dotdot(const char *arg, int cant_be_filename) { struct object_context a_oc = {0}, b_oc = {0}; - char *dotdot = strstr(arg, ".."); + char *dotdot = (char *) strstr(arg, ".."); int ret; if (!dotdot) @@ -2176,7 +2176,7 @@ static int handle_revision_arg_1(const char *arg_, struct rev_info *revs, int fl goto out; } - mark = strstr(arg, "^@"); + mark = (char *) strstr(arg, "^@"); if (mark && !mark[2]) { *mark = 0; if (add_parents_only(revs, arg, flags, 0)) { @@ -2185,13 +2185,13 @@ static int handle_revision_arg_1(const char *arg_, struct rev_info *revs, int fl } *mark = '^'; } - mark = strstr(arg, "^!"); + mark = (char *) strstr(arg, "^!"); if (mark && !mark[2]) { *mark = 0; if (!add_parents_only(revs, arg, flags ^ (UNINTERESTING | BOTTOM), 0)) *mark = '^'; } - mark = strstr(arg, "^-"); + mark = (char *) strstr(arg, "^-"); if (mark) { int exclude_parent = 1; -- 2.53.0.1195.g771ffcb452 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH 6/6] do not discard const: the ugly truth 2026-03-26 15:22 ` [PATCH 6/6] do not discard const: the ugly truth Michael J Gruber @ 2026-03-26 17:07 ` Junio C Hamano 2026-03-26 17:42 ` Jeff King 1 sibling, 0 replies; 24+ messages in thread From: Junio C Hamano @ 2026-03-26 17:07 UTC (permalink / raw) To: Michael J Gruber; +Cc: git Michael J Gruber <git@grubix.eu> writes: > ISOC23 reveals that we mutate argv strings in place. Confess to this > with explicit casts. > --- > builtin/rev-parse.c | 8 ++++---- > revision.c | 8 ++++---- > 2 files changed, 8 insertions(+), 8 deletions(-) Forgot to sign-off? > diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c > index 01a62800e8..f429793b6f 100644 > --- a/builtin/rev-parse.c > +++ b/builtin/rev-parse.c > @@ -265,7 +265,7 @@ static int show_file(const char *arg, int output_prefix) > return 0; > } > > -static int try_difference(const char *arg) > +static int try_difference(char *arg) > { > char *dotdot; > struct object_id start_oid; > @@ -325,7 +325,7 @@ static int try_difference(const char *arg) > return 0; > } This one is unfortunate in that in the end the incoming arg is temporarily truncated by substituting the first "." in the ".." found in the string with "\0", and then restored to the original value before returning to the caller, so unless the caller is handing a piece of memory in a read-only segment, nobody should hurt or even notice. > -static int try_parent_shorthands(const char *arg) > +static int try_parent_shorthands(char *arg) > { > char *dotdot; > struct object_id oid; > @@ -1145,9 +1145,9 @@ int cmd_rev_parse(int argc, > } > > /* Not a flag argument */ > - if (try_difference(arg)) > + if (try_difference((char *) arg)) > continue; > - if (try_parent_shorthands(arg)) > + if (try_parent_shorthands((char *) arg)) > continue; > name = arg; > type = NORMAL; The same, with "^" in magic sequences "^!", "^@", and "^-". > diff --git a/revision.c b/revision.c > index 31808e3df0..a28b14a2ea 100644 > --- a/revision.c > +++ b/revision.c > @@ -2132,7 +2132,7 @@ static int handle_dotdot(const char *arg, > int cant_be_filename) > { > struct object_context a_oc = {0}, b_oc = {0}; > - char *dotdot = strstr(arg, ".."); > + char *dotdot = (char *) strstr(arg, ".."); > int ret; > > if (!dotdot) The patch takes a different strategy to deal with this one, even though the pattern should be exactly the same as try_difference() we saw earlier. Shouldn't we take the same "internally we know we muck with the string temporarily, but externally we pretend that we take a const pointer because we revert our temporary modification" approach in builtin/rev-parse.c too? One thing that _could_ break if we did so is when the callers do pass a string in read-only segment to these functions, trusting the function signature that takes a const pointer promises them that it is safe. And to prepare for it, the approach you took in builtin/rev-parse.c to be honest about it to the callers is safer. So in that sense, perhaps this function should be updated to take a non-const pointer to arg instead of sprinkling casts in the body? > @@ -2176,7 +2176,7 @@ static int handle_revision_arg_1(const char *arg_, struct rev_info *revs, int fl > goto out; > } > > - mark = strstr(arg, "^@"); > + mark = (char *) strstr(arg, "^@"); > if (mark && !mark[2]) { > *mark = 0; > if (add_parents_only(revs, arg, flags, 0)) { > @@ -2185,13 +2185,13 @@ static int handle_revision_arg_1(const char *arg_, struct rev_info *revs, int fl > } > *mark = '^'; > } > - mark = strstr(arg, "^!"); > + mark = (char *) strstr(arg, "^!"); > if (mark && !mark[2]) { > *mark = 0; > if (!add_parents_only(revs, arg, flags ^ (UNINTERESTING | BOTTOM), 0)) > *mark = '^'; > } > - mark = strstr(arg, "^-"); > + mark = (char *) strstr(arg, "^-"); > if (mark) { > int exclude_parent = 1; Ditto. ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 6/6] do not discard const: the ugly truth 2026-03-26 15:22 ` [PATCH 6/6] do not discard const: the ugly truth Michael J Gruber 2026-03-26 17:07 ` Junio C Hamano @ 2026-03-26 17:42 ` Jeff King 2026-03-26 19:02 ` [PATCH 0/4] fix const issues in revision parser Jeff King 1 sibling, 1 reply; 24+ messages in thread From: Jeff King @ 2026-03-26 17:42 UTC (permalink / raw) To: Michael J Gruber; +Cc: git On Thu, Mar 26, 2026 at 04:22:52PM +0100, Michael J Gruber wrote: > ISOC23 reveals that we mutate argv strings in place. Confess to this > with explicit casts. Collin and I looked at this one a bit in the earlier thread: https://lore.kernel.org/git/e6f7e2eddbc9aef1c21f661420a4b8cb9cd8e2c1.1770095829.git.collin.funk1@gmail.com/ I think it is technically legal to mutate argv strings (which is why this doesn't segfault now), though I think we would prefer to treat them as conceptually const. You do get a segfault with: handle_revision_arg("..HEAD", &revs, 0, 0); which we fortunately never do (we do pass string literals, but never with a range operator). IMHO the right solution here is to teach the revision-parser not to touch the incoming buffers. We do it only to tie off strings, which can mostly be replaced with xmemdupz(). That's slightly less efficient, but I don't think it would be measurable (it's one allocation that tends to happen a handful of times per program execution, and the rest of the parsing is going to allocate things like commit structs anyway). I have some patches in that direction, but I haven't gotten around to polishing them yet. -Peff ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH 0/4] fix const issues in revision parser 2026-03-26 17:42 ` Jeff King @ 2026-03-26 19:02 ` Jeff King 2026-03-26 19:04 ` [PATCH 1/4] revision: make handle_dotdot() interface less confusing Jeff King ` (3 more replies) 0 siblings, 4 replies; 24+ messages in thread From: Jeff King @ 2026-03-26 19:02 UTC (permalink / raw) To: Michael J Gruber; +Cc: Junio C Hamano, git On Thu, Mar 26, 2026 at 01:42:04PM -0400, Jeff King wrote: > IMHO the right solution here is to teach the revision-parser not to > touch the incoming buffers. We do it only to tie off strings, which can > mostly be replaced with xmemdupz(). That's slightly less efficient, but > I don't think it would be measurable (it's one allocation that tends to > happen a handful of times per program execution, and the rest of the > parsing is going to allocate things like commit structs anyway). > > I have some patches in that direction, but I haven't gotten around to > polishing them yet. Here it is. There were a few oddities to untangle, but I think the result makes the whole thing a bit easier to understand. I may be biased as the author, though. ;) [1/4]: revision: make handle_dotdot() interface less confusing [2/4]: rev-parse: simplify dotdot parsing [3/4]: revision: avoid writing to const string for parent marks [4/4]: rev-parse: avoid writing to const string for parent marks builtin/rev-parse.c | 40 +++++++++++++-------------- revision.c | 67 +++++++++++++++++++++++---------------------- 2 files changed, 54 insertions(+), 53 deletions(-) -Peff ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH 1/4] revision: make handle_dotdot() interface less confusing 2026-03-26 19:02 ` [PATCH 0/4] fix const issues in revision parser Jeff King @ 2026-03-26 19:04 ` Jeff King 2026-03-26 19:28 ` Junio C Hamano 2026-03-26 19:05 ` [PATCH 2/4] rev-parse: simplify dotdot parsing Jeff King ` (2 subsequent siblings) 3 siblings, 1 reply; 24+ messages in thread From: Jeff King @ 2026-03-26 19:04 UTC (permalink / raw) To: Michael J Gruber; +Cc: Junio C Hamano, git There are two very subtle bits to the way we parse ".." (and "...") range operators: 1. In handle_dotdot_1(), we assume that the incoming arguments "dotdot" and "arg" are part of the same string, with the first digit of the range-operator blanked to a NUL. Then when we want the full name (e.g., to report an error), we replace the NUL with a dot to restore the original string. 2. In handle_dotdot(), we take in a const string, but then we modify it by overwriting the range operator with a NUL. This has worked OK in practice since we tend to pass in buffers that are actually writeable (including argv), but segfaults with something like: handle_revision_arg("..HEAD", &revs, 0, 0); On top of that, building with recent versions of glibc causes the compiler to complain, because it notices when we use strchr() or strstr() to launder away constness (basically detecting the possibility of the segfault above via the type system). Instead of munging the buffer, let's instead make a temporary copy of the left-hand side of the range operator. That avoids any const violations, and lets us pass around the parsed elements independently: the left-hand side, the right-hand side, the number of dots (via the "symmetric" flag), and the original full string for error messages. Signed-off-by: Jeff King <peff@peff.net> --- revision.c | 42 +++++++++++++++++++----------------------- 1 file changed, 19 insertions(+), 23 deletions(-) diff --git a/revision.c b/revision.c index 31808e3df0..f61262436f 100644 --- a/revision.c +++ b/revision.c @@ -2038,41 +2038,32 @@ static void prepare_show_merge(struct rev_info *revs) free(prune); } -static int dotdot_missing(const char *arg, char *dotdot, +static int dotdot_missing(const char *full_name, struct rev_info *revs, int symmetric) { if (revs->ignore_missing) return 0; - /* de-munge so we report the full argument */ - *dotdot = '.'; die(symmetric ? "Invalid symmetric difference expression %s" - : "Invalid revision range %s", arg); + : "Invalid revision range %s", full_name); } -static int handle_dotdot_1(const char *arg, char *dotdot, +static int handle_dotdot_1(const char *a_name, const char *b_name, + const char *full_name, int symmetric, struct rev_info *revs, int flags, int cant_be_filename, struct object_context *a_oc, struct object_context *b_oc) { - const char *a_name, *b_name; struct object_id a_oid, b_oid; struct object *a_obj, *b_obj; unsigned int a_flags, b_flags; - int symmetric = 0; unsigned int flags_exclude = flags ^ (UNINTERESTING | BOTTOM); unsigned int oc_flags = GET_OID_COMMITTISH | GET_OID_RECORD_PATH; - a_name = arg; if (!*a_name) a_name = "HEAD"; - b_name = dotdot + 2; - if (*b_name == '.') { - symmetric = 1; - b_name++; - } if (!*b_name) b_name = "HEAD"; @@ -2081,15 +2072,13 @@ static int handle_dotdot_1(const char *arg, char *dotdot, return -1; if (!cant_be_filename) { - *dotdot = '.'; - verify_non_filename(revs->prefix, arg); - *dotdot = '\0'; + verify_non_filename(revs->prefix, full_name); } a_obj = parse_object(revs->repo, &a_oid); b_obj = parse_object(revs->repo, &b_oid); if (!a_obj || !b_obj) - return dotdot_missing(arg, dotdot, revs, symmetric); + return dotdot_missing(full_name, revs, symmetric); if (!symmetric) { /* just A..B */ @@ -2103,7 +2092,7 @@ static int handle_dotdot_1(const char *arg, char *dotdot, a = lookup_commit_reference(revs->repo, &a_obj->oid); b = lookup_commit_reference(revs->repo, &b_obj->oid); if (!a || !b) - return dotdot_missing(arg, dotdot, revs, symmetric); + return dotdot_missing(full_name, revs, symmetric); if (repo_get_merge_bases(the_repository, a, b, &exclude) < 0) { commit_list_free(exclude); @@ -2132,16 +2121,23 @@ static int handle_dotdot(const char *arg, int cant_be_filename) { struct object_context a_oc = {0}, b_oc = {0}; - char *dotdot = strstr(arg, ".."); + const char *dotdot = strstr(arg, ".."); + char *tmp; + int symmetric = 0; int ret; if (!dotdot) return -1; - *dotdot = '\0'; - ret = handle_dotdot_1(arg, dotdot, revs, flags, cant_be_filename, - &a_oc, &b_oc); - *dotdot = '.'; + tmp = xmemdupz(arg, dotdot - arg); + dotdot += 2; + if (*dotdot == '.') { + symmetric = 1; + dotdot++; + } + ret = handle_dotdot_1(tmp, dotdot, arg, symmetric, revs, flags, + cant_be_filename, &a_oc, &b_oc); + free(tmp); object_context_release(&a_oc); object_context_release(&b_oc); -- 2.53.0.1081.gf77a8b8145 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH 1/4] revision: make handle_dotdot() interface less confusing 2026-03-26 19:04 ` [PATCH 1/4] revision: make handle_dotdot() interface less confusing Jeff King @ 2026-03-26 19:28 ` Junio C Hamano 2026-03-26 23:14 ` Jeff King 0 siblings, 1 reply; 24+ messages in thread From: Junio C Hamano @ 2026-03-26 19:28 UTC (permalink / raw) To: Jeff King; +Cc: Michael J Gruber, git Jeff King <peff@peff.net> writes: > There are two very subtle bits to the way we parse ".." (and "...") > range operators: > > 1. In handle_dotdot_1(), we assume that the incoming arguments "dotdot" > and "arg" are part of the same string, with the first digit of the "digit" -> "dot". > range-operator blanked to a NUL. Then when we want the full name > (e.g., to report an error), we replace the NUL with a dot to restore > the original string. > > 2. In handle_dotdot(), we take in a const string, but then we modify it > by overwriting the range operator with a NUL. This has worked OK in > practice since we tend to pass in buffers that are actually > writeable (including argv), but segfaults with something like: > > handle_revision_arg("..HEAD", &revs, 0, 0); > > On top of that, building with recent versions of glibc causes the > compiler to complain, because it notices when we use strchr() or > strstr() to launder away constness (basically detecting the > possibility of the segfault above via the type system). > > Instead of munging the buffer, let's instead make a temporary copy of > the left-hand side of the range operator. That avoids any const > violations, and lets us pass around the parsed elements independently: > the left-hand side, the right-hand side, the number of dots (via the > "symmetric" flag), and the original full string for error messages. > > Signed-off-by: Jeff King <peff@peff.net> > --- OK. I was hoping if we can do without a temporary allocation, but the const-string "..HEAD" example does make it clear that it is not something we can achieve easily. And once we accept that it is inevitable to make a copy, everything else falls into the right place. > diff --git a/revision.c b/revision.c > index 31808e3df0..f61262436f 100644 > --- a/revision.c > +++ b/revision.c > @@ -2038,41 +2038,32 @@ static void prepare_show_merge(struct rev_info *revs) > free(prune); > } > > -static int dotdot_missing(const char *arg, char *dotdot, > +static int dotdot_missing(const char *full_name, > struct rev_info *revs, int symmetric) > { > if (revs->ignore_missing) > return 0; > - /* de-munge so we report the full argument */ > - *dotdot = '.'; > die(symmetric > ? "Invalid symmetric difference expression %s" > - : "Invalid revision range %s", arg); > + : "Invalid revision range %s", full_name); > } > > -static int handle_dotdot_1(const char *arg, char *dotdot, > +static int handle_dotdot_1(const char *a_name, const char *b_name, > + const char *full_name, int symmetric, > struct rev_info *revs, int flags, > int cant_be_filename, > struct object_context *a_oc, > struct object_context *b_oc) > { > - const char *a_name, *b_name; > struct object_id a_oid, b_oid; > struct object *a_obj, *b_obj; > unsigned int a_flags, b_flags; > - int symmetric = 0; > unsigned int flags_exclude = flags ^ (UNINTERESTING | BOTTOM); > unsigned int oc_flags = GET_OID_COMMITTISH | GET_OID_RECORD_PATH; > > - a_name = arg; > if (!*a_name) > a_name = "HEAD"; > > - b_name = dotdot + 2; > - if (*b_name == '.') { > - symmetric = 1; > - b_name++; > - } > if (!*b_name) > b_name = "HEAD"; > > @@ -2081,15 +2072,13 @@ static int handle_dotdot_1(const char *arg, char *dotdot, > return -1; > > if (!cant_be_filename) { > - *dotdot = '.'; > - verify_non_filename(revs->prefix, arg); > - *dotdot = '\0'; > + verify_non_filename(revs->prefix, full_name); > } > > a_obj = parse_object(revs->repo, &a_oid); > b_obj = parse_object(revs->repo, &b_oid); > if (!a_obj || !b_obj) > - return dotdot_missing(arg, dotdot, revs, symmetric); > + return dotdot_missing(full_name, revs, symmetric); > > if (!symmetric) { > /* just A..B */ > @@ -2103,7 +2092,7 @@ static int handle_dotdot_1(const char *arg, char *dotdot, > a = lookup_commit_reference(revs->repo, &a_obj->oid); > b = lookup_commit_reference(revs->repo, &b_obj->oid); > if (!a || !b) > - return dotdot_missing(arg, dotdot, revs, symmetric); > + return dotdot_missing(full_name, revs, symmetric); > > if (repo_get_merge_bases(the_repository, a, b, &exclude) < 0) { > commit_list_free(exclude); > @@ -2132,16 +2121,23 @@ static int handle_dotdot(const char *arg, > int cant_be_filename) > { > struct object_context a_oc = {0}, b_oc = {0}; > - char *dotdot = strstr(arg, ".."); > + const char *dotdot = strstr(arg, ".."); > + char *tmp; > + int symmetric = 0; > int ret; > > if (!dotdot) > return -1; > > - *dotdot = '\0'; > - ret = handle_dotdot_1(arg, dotdot, revs, flags, cant_be_filename, > - &a_oc, &b_oc); > - *dotdot = '.'; > + tmp = xmemdupz(arg, dotdot - arg); > + dotdot += 2; > + if (*dotdot == '.') { > + symmetric = 1; > + dotdot++; > + } > + ret = handle_dotdot_1(tmp, dotdot, arg, symmetric, revs, flags, > + cant_be_filename, &a_oc, &b_oc); > + free(tmp); > > object_context_release(&a_oc); > object_context_release(&b_oc); ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/4] revision: make handle_dotdot() interface less confusing 2026-03-26 19:28 ` Junio C Hamano @ 2026-03-26 23:14 ` Jeff King 2026-03-27 15:55 ` Junio C Hamano 0 siblings, 1 reply; 24+ messages in thread From: Jeff King @ 2026-03-26 23:14 UTC (permalink / raw) To: Junio C Hamano; +Cc: Michael J Gruber, git On Thu, Mar 26, 2026 at 12:28:15PM -0700, Junio C Hamano wrote: > > There are two very subtle bits to the way we parse ".." (and "...") > > range operators: > > > > 1. In handle_dotdot_1(), we assume that the incoming arguments "dotdot" > > and "arg" are part of the same string, with the first digit of the > > "digit" -> "dot". Oops, yeah. > OK. I was hoping if we can do without a temporary allocation, but > the const-string "..HEAD" example does make it clear that it is not > something we can achieve easily. > > And once we accept that it is inevitable to make a copy, everything > else falls into the right place. Yeah, I don't think there is another good option. We can drop the "const" from the interface, which would be more honest, but then callers that use string literals have to either make their own copy, or cast away the constness and pray. The only "right" solution that avoids copying is if all of the lower-level functions learned to work with ptr/len pairs instead of NUL-terminated strings. But having done that sort of conversion before, it ends up quite messy and is prone to errors. Somebody is welcome to try tackling that if they want, but I don't. :) -Peff ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/4] revision: make handle_dotdot() interface less confusing 2026-03-26 23:14 ` Jeff King @ 2026-03-27 15:55 ` Junio C Hamano 0 siblings, 0 replies; 24+ messages in thread From: Junio C Hamano @ 2026-03-27 15:55 UTC (permalink / raw) To: Jeff King; +Cc: Michael J Gruber, git Jeff King <peff@peff.net> writes: >> And once we accept that it is inevitable to make a copy, everything >> else falls into the right place. > > Yeah, I don't think there is another good option. We can drop the > "const" from the interface, which would be more honest, but then callers > that use string literals have to either make their own copy, or cast > away the constness and pray. > > The only "right" solution that avoids copying is if all of the > lower-level functions learned to work with ptr/len pairs instead of > NUL-terminated strings. But having done that sort of conversion before, > it ends up quite messy and is prone to errors. Somebody is welcome to > try tackling that if they want, but I don't. :) We would need to call out to a library function or system call eventually down the callchain, at which point you'd need to somehow come up with a NUL-terminated equivalent of that <ptr, len> pair. So I would avoid going down that path, unless the language itself has already abstracted that difference away, and C is not among those languages. ^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH 2/4] rev-parse: simplify dotdot parsing 2026-03-26 19:02 ` [PATCH 0/4] fix const issues in revision parser Jeff King 2026-03-26 19:04 ` [PATCH 1/4] revision: make handle_dotdot() interface less confusing Jeff King @ 2026-03-26 19:05 ` Jeff King 2026-03-26 19:13 ` [PATCH 3/4] revision: avoid writing to const string for parent marks Jeff King 2026-03-26 19:14 ` [PATCH 4/4] rev-parse: " Jeff King 3 siblings, 0 replies; 24+ messages in thread From: Jeff King @ 2026-03-26 19:05 UTC (permalink / raw) To: Michael J Gruber; +Cc: Junio C Hamano, git The previous commit simplified the way that revision.c parses ".." and "..." range operators. But there's roughly similar code in rev-parse. This is less likely to trigger a segfault, as there is no library function which we'd pass a string literal to, but it still causes the compiler to complain about laundering away constness via strstr(). Let's give it the same treatment, copying the left-hand side of the range operator into its own string. Signed-off-by: Jeff King <peff@peff.net> --- builtin/rev-parse.c | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-) diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c index 01a62800e8..5da9537113 100644 --- a/builtin/rev-parse.c +++ b/builtin/rev-parse.c @@ -267,21 +267,20 @@ static int show_file(const char *arg, int output_prefix) static int try_difference(const char *arg) { - char *dotdot; + const char *dotdot; struct object_id start_oid; struct object_id end_oid; const char *end; const char *start; + char *to_free; int symmetric; static const char head_by_default[] = "HEAD"; if (!(dotdot = strstr(arg, ".."))) return 0; + start = to_free = xmemdupz(arg, dotdot - arg); end = dotdot + 2; - start = arg; symmetric = (*end == '.'); - - *dotdot = 0; end += symmetric; if (!*end) @@ -295,7 +294,7 @@ static int try_difference(const char *arg) * Just ".."? That is not a range but the * pathspec for the parent directory. */ - *dotdot = '.'; + free(to_free); return 0; } @@ -308,7 +307,7 @@ static int try_difference(const char *arg) a = lookup_commit_reference(the_repository, &start_oid); b = lookup_commit_reference(the_repository, &end_oid); if (!a || !b) { - *dotdot = '.'; + free(to_free); return 0; } if (repo_get_merge_bases(the_repository, a, b, &exclude) < 0) @@ -318,10 +317,10 @@ static int try_difference(const char *arg) show_rev(REVERSED, &commit->object.oid, NULL); } } - *dotdot = '.'; + free(to_free); return 1; } - *dotdot = '.'; + free(to_free); return 0; } -- 2.53.0.1081.gf77a8b8145 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 3/4] revision: avoid writing to const string for parent marks 2026-03-26 19:02 ` [PATCH 0/4] fix const issues in revision parser Jeff King 2026-03-26 19:04 ` [PATCH 1/4] revision: make handle_dotdot() interface less confusing Jeff King 2026-03-26 19:05 ` [PATCH 2/4] rev-parse: simplify dotdot parsing Jeff King @ 2026-03-26 19:13 ` Jeff King 2026-03-26 19:14 ` [PATCH 4/4] rev-parse: " Jeff King 3 siblings, 0 replies; 24+ messages in thread From: Jeff King @ 2026-03-26 19:13 UTC (permalink / raw) To: Michael J Gruber; +Cc: Junio C Hamano, git We take in a "const char *", but may write a NUL into it when parsing parent marks like "foo^-", since we want to isolate "foo" as a string for further parsing. This is usually OK, as our "const" strings are often actually argv strings which are technically writeable, but we'd segfault with a string literal like: handle_revision_arg("HEAD^-", &revs, 0, 0); Similar to how we handled dotdot in a previous commit, we can avoid this by making a temporary copy of the left-hand side of the string. The cost should negligible compared to the rest of the parsing (like actually parsing commits to create their parent linked-lists). There is one slightly tricky thing, though. We parse some of the marks progressively, so that if we see "foo^!" for example, we'll strip that down to "foo" not just for calling add_parents_only(), but also for the rest of the function. That makes sense since we eventually want to pass "foo" to get_oid_with_context(). But it also means that we'll keep looking for other marks. In particular, "foo^-^!" is valid, though oddly "foo^!^-" would ignore the "^-". I'm not sure if this is a weird historical artifact of the implementation, or if there are important corner cases. So I've left the behavior unchanged. Each mark we find allocates a string with the mark stripped, which means we could allocate multiple times (and carry a free-able pointer for each to the end). But in practice we won't, because of the three marks, "^@" jumps immediately to the end without further parsing, and "^-^!" is nonsense that nobody would pass. So you'd get one allocation in general, and never more than two. Another obvious option would be to just copy "arg" up front and be OK with munging it. But that means we pay the cost even when we find no marks. We could make a single copy upon finding a mark and then munge, but that adds extra code to each site (checking whether somebody else allocated, and if not, adjusting our "mark" pointer to be relative to the copied string). I aimed for something that was clear and obvious, if a bit verbose. Signed-off-by: Jeff King <peff@peff.net> --- Also one other weirdness I noticed while proof-reading: if we successfully parse a mark, we never restore the original string! So if you call: char buf[] = "foo^!"; handle_revision_arg(buf, &revs, 0, 0); Then "buf" would have "foo\0!" after it returns. I guess no callers care, because they only look at the arg again if there was an error. But it incidentally is fixed by this patch. revision.c | 25 +++++++++++++++---------- 1 file changed, 15 insertions(+), 10 deletions(-) diff --git a/revision.c b/revision.c index f61262436f..fda405bf65 100644 --- a/revision.c +++ b/revision.c @@ -2147,7 +2147,10 @@ static int handle_dotdot(const char *arg, static int handle_revision_arg_1(const char *arg_, struct rev_info *revs, int flags, unsigned revarg_opt) { struct object_context oc = {0}; - char *mark; + const char *mark; + char *arg_minus_at = NULL; + char *arg_minus_excl = NULL; + char *arg_minus_dash = NULL; struct object *object; struct object_id oid; int local_flags; @@ -2174,18 +2177,17 @@ static int handle_revision_arg_1(const char *arg_, struct rev_info *revs, int fl mark = strstr(arg, "^@"); if (mark && !mark[2]) { - *mark = 0; - if (add_parents_only(revs, arg, flags, 0)) { + arg_minus_at = xmemdupz(arg, mark - arg); + if (add_parents_only(revs, arg_minus_at, flags, 0)) { ret = 0; goto out; } - *mark = '^'; } mark = strstr(arg, "^!"); if (mark && !mark[2]) { - *mark = 0; - if (!add_parents_only(revs, arg, flags ^ (UNINTERESTING | BOTTOM), 0)) - *mark = '^'; + arg_minus_excl = xmemdupz(arg, mark - arg); + if (add_parents_only(revs, arg_minus_excl, flags ^ (UNINTERESTING | BOTTOM), 0)) + arg = arg_minus_excl; } mark = strstr(arg, "^-"); if (mark) { @@ -2199,9 +2201,9 @@ static int handle_revision_arg_1(const char *arg_, struct rev_info *revs, int fl } } - *mark = 0; - if (!add_parents_only(revs, arg, flags ^ (UNINTERESTING | BOTTOM), exclude_parent)) - *mark = '^'; + arg_minus_dash = xmemdupz(arg, mark - arg); + if (add_parents_only(revs, arg_minus_dash, flags ^ (UNINTERESTING | BOTTOM), exclude_parent)) + arg = arg_minus_dash; } local_flags = 0; @@ -2236,6 +2238,9 @@ static int handle_revision_arg_1(const char *arg_, struct rev_info *revs, int fl out: object_context_release(&oc); + free(arg_minus_at); + free(arg_minus_excl); + free(arg_minus_dash); return ret; } -- 2.53.0.1081.gf77a8b8145 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 4/4] rev-parse: avoid writing to const string for parent marks 2026-03-26 19:02 ` [PATCH 0/4] fix const issues in revision parser Jeff King ` (2 preceding siblings ...) 2026-03-26 19:13 ` [PATCH 3/4] revision: avoid writing to const string for parent marks Jeff King @ 2026-03-26 19:14 ` Jeff King 3 siblings, 0 replies; 24+ messages in thread From: Jeff King @ 2026-03-26 19:14 UTC (permalink / raw) To: Michael J Gruber; +Cc: Junio C Hamano, git The previous commit cleared up some const confusion in handling parent marks in revision.c, but we have roughly the same code duplicated in rev-parse. This one is much easier to fix, because the handling of the shortened string is all done in one place, after detecting any marks (but without shortening the string between marks). As a side note, I suspect this means that it behaves differently than the revision.c parser for weird stuff like "foo^!^@^-", but that is outside the scope of this patch. While we are here, let's also rename the variable "dotdot", which is totally misleading (and which we already fixed in revision.c long ago via f632dedd8d (handle_revision_arg: stop using "dotdot" as a generic pointer, 2017-05-19)). Doing that here makes the diff a little messier, but it also lets the compiler help us make sure we did not miss any stray mentions of the variable while we are changing its semantics. Signed-off-by: Jeff King <peff@peff.net> --- builtin/rev-parse.c | 25 +++++++++++++------------ 1 file changed, 13 insertions(+), 12 deletions(-) diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c index 5da9537113..218b5f34d6 100644 --- a/builtin/rev-parse.c +++ b/builtin/rev-parse.c @@ -326,46 +326,47 @@ static int try_difference(const char *arg) static int try_parent_shorthands(const char *arg) { - char *dotdot; + const char *mark; struct object_id oid; struct commit *commit; struct commit_list *parents; int parent_number; int include_rev = 0; int include_parents = 0; int exclude_parent = 0; + char *to_free; - if ((dotdot = strstr(arg, "^!"))) { + if ((mark = strstr(arg, "^!"))) { include_rev = 1; - if (dotdot[2]) + if (mark[2]) return 0; - } else if ((dotdot = strstr(arg, "^@"))) { + } else if ((mark = strstr(arg, "^@"))) { include_parents = 1; - if (dotdot[2]) + if (mark[2]) return 0; - } else if ((dotdot = strstr(arg, "^-"))) { + } else if ((mark = strstr(arg, "^-"))) { include_rev = 1; exclude_parent = 1; - if (dotdot[2]) { + if (mark[2]) { char *end; - exclude_parent = strtoul(dotdot + 2, &end, 10); + exclude_parent = strtoul(mark + 2, &end, 10); if (*end != '\0' || !exclude_parent) return 0; } } else return 0; - *dotdot = 0; + arg = to_free = xmemdupz(arg, mark - arg); if (repo_get_oid_committish(the_repository, arg, &oid) || !(commit = lookup_commit_reference(the_repository, &oid))) { - *dotdot = '^'; + free(to_free); return 0; } if (exclude_parent && exclude_parent > commit_list_count(commit->parents)) { - *dotdot = '^'; + free(to_free); return 0; } @@ -386,7 +387,7 @@ static int try_parent_shorthands(const char *arg) free(name); } - *dotdot = '^'; + free(to_free); return 1; } -- 2.53.0.1081.gf77a8b8145 ^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH 0/6] ISOC23: quell warnings on discarding const 2026-03-26 15:22 [PATCH 0/6] ISOC23: quell warnings on discarding const Michael J Gruber ` (5 preceding siblings ...) 2026-03-26 15:22 ` [PATCH 6/6] do not discard const: the ugly truth Michael J Gruber @ 2026-03-26 16:26 ` D. Ben Knoble 2026-03-27 17:45 ` Michael J Gruber 6 siblings, 1 reply; 24+ messages in thread From: D. Ben Knoble @ 2026-03-26 16:26 UTC (permalink / raw) To: Michael J Gruber; +Cc: git, Collin Funk, Jeff King On Thu, Mar 26, 2026 at 11:40 AM Michael J Gruber <git@grubix.eu> wrote: > > Hi there > > Fedora 44 beta (gcc-16.0.1, glibc-2.43) brought some fun new warnings > when building git. In essence, we're not always explicit about > const-ness or lack thereof of certain pointers. Before, strchr()'s > signature which turns const arguments into non-const return values > covered this up. With ISOC23, strchr() and friends return const > pointers. > > This little series takes a middle-ground: no new data types (no new > const versions of non-const data types) but more explicit casts. I think a few folks were working on similar things; hopefully I've CC'd some relevant parties. > > Michael J Gruber (6): > do not discard const: the simple cases > do not discard const: make git-compat-util ISOC23-like > do not discard const: adjust to non-const data types > do not discard const: declare const where we stay const > do not discard const: keep signature > do not discard const: the ugly truth > > builtin/config.c | 2 +- > builtin/receive-pack.c | 6 +++--- > builtin/rev-parse.c | 8 ++++---- > convert.c | 3 ++- > git-compat-util.h | 2 +- > http-push.c | 2 +- > http.c | 2 +- > pager.c | 2 +- > pseudo-merge.c | 2 +- > range-diff.c | 2 +- > refs/files-backend.c | 2 +- > remote.c | 2 +- > revision.c | 8 ++++---- > run-command.c | 2 +- > scalar.c | 2 +- > send-pack.c | 6 +++--- > submodule.c | 2 +- > transport-helper.c | 2 +- > 18 files changed, 29 insertions(+), 28 deletions(-) > > -- > 2.53.0.1195.g771ffcb452 -- D. Ben Knoble ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 0/6] ISOC23: quell warnings on discarding const 2026-03-26 16:26 ` [PATCH 0/6] ISOC23: quell warnings on discarding const D. Ben Knoble @ 2026-03-27 17:45 ` Michael J Gruber 0 siblings, 0 replies; 24+ messages in thread From: Michael J Gruber @ 2026-03-27 17:45 UTC (permalink / raw) To: D. Ben Knoble; +Cc: git, Collin Funk, Jeff King D. Ben Knoble venit, vidit, dixit 2026-03-26 17:26:35: > On Thu, Mar 26, 2026 at 11:40 AM Michael J Gruber <git@grubix.eu> wrote: > > > > Hi there > > > > Fedora 44 beta (gcc-16.0.1, glibc-2.43) brought some fun new warnings > > when building git. In essence, we're not always explicit about > > const-ness or lack thereof of certain pointers. Before, strchr()'s > > signature which turns const arguments into non-const return values > > covered this up. With ISOC23, strchr() and friends return const > > pointers. > > > > This little series takes a middle-ground: no new data types (no new > > const versions of non-const data types) but more explicit casts. > > I think a few folks were working on similar things; hopefully I've > CC'd some relevant parties. Thanks for catching this. I had checked the list cursorily (I'm not a regular) but overlooked it. Peff's going all in on it, as usual ;-) Michael ^ permalink raw reply [flat|nested] 24+ messages in thread
end of thread, other threads:[~2026-03-27 17:54 UTC | newest] Thread overview: 24+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-03-26 15:22 [PATCH 0/6] ISOC23: quell warnings on discarding const Michael J Gruber 2026-03-26 15:22 ` [PATCH 1/6] do not discard const: the simple cases Michael J Gruber 2026-03-26 17:34 ` Jeff King 2026-03-26 17:45 ` Junio C Hamano 2026-03-26 19:23 ` [PATCH] config: store allocated string in non-const pointer Jeff King 2026-03-26 15:22 ` [PATCH 2/6] do not discard const: make git-compat-util ISOC23-like Michael J Gruber 2026-03-26 15:22 ` [PATCH 3/6] do not discard const: adjust to non-const data types Michael J Gruber 2026-03-26 17:28 ` Junio C Hamano 2026-03-26 15:22 ` [PATCH 4/6] do not discard const: declare const where we stay const Michael J Gruber 2026-03-26 15:22 ` [PATCH 5/6] do not discard const: keep signature Michael J Gruber 2026-03-26 17:18 ` Junio C Hamano 2026-03-26 15:22 ` [PATCH 6/6] do not discard const: the ugly truth Michael J Gruber 2026-03-26 17:07 ` Junio C Hamano 2026-03-26 17:42 ` Jeff King 2026-03-26 19:02 ` [PATCH 0/4] fix const issues in revision parser Jeff King 2026-03-26 19:04 ` [PATCH 1/4] revision: make handle_dotdot() interface less confusing Jeff King 2026-03-26 19:28 ` Junio C Hamano 2026-03-26 23:14 ` Jeff King 2026-03-27 15:55 ` Junio C Hamano 2026-03-26 19:05 ` [PATCH 2/4] rev-parse: simplify dotdot parsing Jeff King 2026-03-26 19:13 ` [PATCH 3/4] revision: avoid writing to const string for parent marks Jeff King 2026-03-26 19:14 ` [PATCH 4/4] rev-parse: " Jeff King 2026-03-26 16:26 ` [PATCH 0/6] ISOC23: quell warnings on discarding const D. Ben Knoble 2026-03-27 17:45 ` Michael J Gruber
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox