* [RFC] [GSoC]: STRBUF_INIT_CONST: initialize `strbuf` to constant string @ 2026-03-22 6:55 Mateo Patino 2026-03-22 8:59 ` Eric Sunshine 0 siblings, 1 reply; 4+ messages in thread From: Mateo Patino @ 2026-03-22 6:55 UTC (permalink / raw) To: git Cc: Mateo Patino, karthik.188, jltobler, ayu.chandekar, siddharthasthana31, ps, gitster Hello, My name is Mateo, and I'm a new contributor to Git. I'm a 1st year undergrad at Columbia University studying CS and applied math. I wanted to ask the community for feedback on a project proposal regarding the `strbuf` API. Seven years ago, a macro to initialize a `strbuf` to a constant string literal was proposed in GitGitGadget [1] called `STRBUF_INIT_CONST`. This macro would work just like `STRBUF_INIT` but it would set `alloc` to 0 (i.e. the buffer would not be heap-allocated). Someone made a pull request to implement this feature [2], but their changes were not merged. Later, Robear Selwans made a patch series [3] attempting to implement this same feature. Robear got extensive feedback, but his patches were not accepted. The same GitHub user from [2] sent a patch here [4], but his changes were not accepted. More recently, the potential need for `STRBUF_INIT_CONST` was mentioned in this patch series [5] by Patrick Steinhardt, though it was marked as a #leftoverbit and not directly addressed. `STRBUF_INIT_CONST` has been mentioned for a long time in this list, but it has not been implemented yet. My Request For Comment is the following: is `STRBUF_INIT_CONST` a feature that is still of interest to the community? If so, I would like to make a GSoC proposal around it. The past email threads have already laid out the considerations of implementing `STRBUF_INIT_CONST` or something equivalent, so I would like to propose this as GSoC idea if the community would find it worthwhile. I would love to hear any thoughts about this. Thanks! Mateo <mateopatinodev@gmail.com> [1] https://github.com/gitgitgadget/git/issues/398 [2] https://github.com/gitgitgadget/git/pull/824 [3] https://lore.kernel.org/git/20200218041805.10939-1-robear.selwans@outlook.com/ [4] https://lore.kernel.org/git/20210105064502.725307-1-adlternative@gmail.com/ [5] https://lore.kernel.org/git/Zrm9ix5aN_g76Qxq@tanuki/ ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [RFC] [GSoC]: STRBUF_INIT_CONST: initialize `strbuf` to constant string 2026-03-22 6:55 [RFC] [GSoC]: STRBUF_INIT_CONST: initialize `strbuf` to constant string Mateo Patino @ 2026-03-22 8:59 ` Eric Sunshine 2026-03-23 16:10 ` Mateo Patino [not found] ` <CAFRsFoV+k-8GMf=62GJwxP=o0Fy5RRBGW+h4NqOLjFbU6z96tw@mail.gmail.com> 0 siblings, 2 replies; 4+ messages in thread From: Eric Sunshine @ 2026-03-22 8:59 UTC (permalink / raw) To: Mateo Patino Cc: git, karthik.188, jltobler, ayu.chandekar, siddharthasthana31, ps, gitster On Sun, Mar 22, 2026 at 2:55 AM Mateo Patino <mateopatinodev@gmail.com> wrote: > My name is Mateo, and I'm a new contributor to Git. I'm a 1st year > undergrad at Columbia University studying CS and applied math. > > I wanted to ask the community for feedback on a project proposal > regarding the `strbuf` API. Seven years ago, a macro to initialize a > `strbuf` to a constant string literal was proposed in GitGitGadget [1] > called `STRBUF_INIT_CONST`. This macro would work just like `STRBUF_INIT` > but it would set `alloc` to 0 (i.e. the buffer would not be > heap-allocated). > > Someone made a pull request to implement this feature [2], but their > changes were not merged. Later, Robear Selwans made a patch series [3] > attempting to implement this same feature. Robear got extensive > feedback, but his patches were not accepted. The same GitHub user from > [2] sent a patch here [4], but his changes were not accepted. You probably didn't intend for it to sound this way, but this summary makes it seem as if the Git project rejected these patch submissions without proper justification. However, having studied the threads which you referenced, it becomes clear that the reason these patches were never accepted is because the submitters never followed through by addressing reviewer comments. For instance, in my review[*1*] of the patch [4] which you referenced, I pointed out several significant problems with the patch, but the patch author never responded, so it makes sense that the submission was never accepted into the project. > More recently, the potential need for `STRBUF_INIT_CONST` was mentioned > in this patch series [5] by Patrick Steinhardt, though it was marked > as a #leftoverbit and not directly addressed. > > `STRBUF_INIT_CONST` has been mentioned for a long time in this list, > but it has not been implemented yet. My Request For Comment is the > following: is `STRBUF_INIT_CONST` a feature that is still of interest > to the community? If so, I would like to make a GSoC proposal around it. > The past email threads have already laid out the considerations of > implementing `STRBUF_INIT_CONST` or something equivalent, so I would > like to propose this as GSoC idea if the community would find it > worthwhile. > > I would love to hear any thoughts about this. Although feedback to Robear Selwans's submission from some reviewers was subjective, Peff's review[*2*] pointed at a major roadblock; specifically, that strbuf has always promoted strbuf.buf is a writeable C-style string, so it is not safe simply to assign a pointer to a literal string to the "buf" member, and it's not practical to expect that all consumers of strbufs can be audited and modified to work correctly with the "new world order" that STRBUF_INIT_CONST would introduce. Thus, the issue is deeper than it may seem at first glance, and unless you have some fundamentally new ideas to address the sort of critical issues identified by such feedback, it is likely that a patch which takes an approach similar to what has previously been submitted will likewise fail. [*1*]: https://lore.kernel.org/git/CAPig+cQL=b-nF6nADaWueJaDmxCgmZbUwWj6=dAwYQ=vVrkifg@mail.gmail.com/ [*2*]: https://lore.kernel.org/git/20200218062124.GF1641086@coredump.intra.peff.net/ > [1] https://github.com/gitgitgadget/git/issues/398 > [2] https://github.com/gitgitgadget/git/pull/824 > [3] https://lore.kernel.org/git/20200218041805.10939-1-robear.selwans@outlook.com/ > [4] https://lore.kernel.org/git/20210105064502.725307-1-adlternative@gmail.com/ > [5] https://lore.kernel.org/git/Zrm9ix5aN_g76Qxq@tanuki/ ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [RFC] [GSoC]: STRBUF_INIT_CONST: initialize `strbuf` to constant string 2026-03-22 8:59 ` Eric Sunshine @ 2026-03-23 16:10 ` Mateo Patino [not found] ` <CAFRsFoV+k-8GMf=62GJwxP=o0Fy5RRBGW+h4NqOLjFbU6z96tw@mail.gmail.com> 1 sibling, 0 replies; 4+ messages in thread From: Mateo Patino @ 2026-03-23 16:10 UTC (permalink / raw) To: git Cc: Mateo Patino, karthik.188, jltobler, ayu.chandekar, siddharthasthana31, ps, gitster (Note: resending this because it had HTML the first time and the list rejected it. Apologies if it becomes a duplicate for someone). > > You probably didn't intend for it to sound this way, but this summary > makes it seem as if the Git project rejected these patch submissions > without proper justification. However, having studied the threads > which you referenced, it becomes clear that the reason these patches > were never accepted is because the submitters never followed through > by addressing reviewer comments. For instance, in my review[*1*] of > the patch [4] which you referenced, I pointed out several significant > problems with the patch, but the patch author never responded, so it > makes sense that the submission was never accepted into the project. Yes, I saw that they did not reply to reviewer feedback. Here I meant to say that previous patches had already been attempted in this area and could be used as guidance for future attempts. > > Although feedback to Robear Selwans's submission from some reviewers > was subjective, Peff's review[*2*] pointed at a major roadblock; > specifically, that strbuf has always promoted strbuf.buf is a > writeable C-style string, so it is not safe simply to assign a pointer > to a literal string to the "buf" member, and it's not practical to > expect that all consumers of strbufs can be audited and modified to > work correctly with the "new world order" that STRBUF_INIT_CONST would > introduce. Since the Git codebase widely assumes strbuf.buf is writable, I wonder whether we could create a new struct that is specifically documented as a read-only, non-owning view into memory, something lightweight like `string_view` in C++, which is an object that simply holds a pointer to a string in memory and the length. For example, in C, struct strview { const char *buf; size_t len; }; This struct would not care where the memory that `buf` points to exists. The memory would be owned elsewhere and the caller would be responsible for ensuring that the memory is valid throughout the lifetime of the struct. I think this could help pass around string data without requiring ownership or allocation, particularly in cases where the data is already available. A small downside I see to this approach is that we'd need to write a few helper functions that accompany this struct, and they would likely share similar names to the helper functions of `strbuf`, though I think this has been accepted in the past in other places throughout the codebase. Another consideration is that this proposed `strview` would not address the lifetime and ownership issue in [4], but having a safer way to pass read-only strings seems like a step in the right direction. ^ permalink raw reply [flat|nested] 4+ messages in thread
[parent not found: <CAFRsFoV+k-8GMf=62GJwxP=o0Fy5RRBGW+h4NqOLjFbU6z96tw@mail.gmail.com>]
* Re: [RFC] [GSoC]: STRBUF_INIT_CONST: initialize `strbuf` to constant string [not found] ` <CAFRsFoV+k-8GMf=62GJwxP=o0Fy5RRBGW+h4NqOLjFbU6z96tw@mail.gmail.com> @ 2026-03-24 3:33 ` Eric Sunshine 0 siblings, 0 replies; 4+ messages in thread From: Eric Sunshine @ 2026-03-24 3:33 UTC (permalink / raw) To: Mateo Patino Cc: git, karthik.188, jltobler, ayu.chandekar, siddharthasthana31, ps, gitster On Mon, Mar 23, 2026 at 1:11 AM Mateo Patino <mateopatinodev@gmail.com> wrote: > On Sun, Mar 22, 2026 at 4:59 AM Eric Sunshine <sunshine@sunshineco.com> wrote: >> Although feedback to Robear Selwans's submission from some reviewers >> was subjective, Peff's review[*2*] pointed at a major roadblock; >> specifically, that strbuf has always promoted strbuf.buf is a >> writeable C-style string, so it is not safe simply to assign a pointer >> to a literal string to the "buf" member, and it's not practical to >> expect that all consumers of strbufs can be audited and modified to >> work correctly with the "new world order" that STRBUF_INIT_CONST would >> introduce. > > Since the Git codebase widely assumes strbuf.buf is writable, I wonder whether > we could create a new struct that is specifically documented as a read-only, > non-owning view into memory, something lightweight like `string_view` in C++, > which is an object that simply holds a pointer to a string in memory and the length. > For example, in C, > > struct strview { > const char *buf; > size_t len; > }; > > This struct would not care where the memory that `buf` points to exists. The > memory would be owned elsewhere and the caller would be responsible for > ensuring that the memory is valid throughout the lifetime of the struct. I think > this could help pass around string data without requiring ownership or > allocation, particularly in cases where the data is already available. > > A small downside I see to this approach is that we'd need to write a few helper > functions that accompany this struct, and they would likely share similar names > to the helper functions of `strbuf`, though I think this has been accepted in the > past in other places throughout the codebase. > > Another consideration is that this proposed `strview` would not address the > lifetime and ownership issue in [4], but having a safer way to pass read-only > strings seems like a step in the right direction. > > What do you think? I think this is a solution to a non-existent problem. Being written in C, Git does not (generally) have a need for this sort of structure. When Git code wants to "pass around" an immutable string to functions, those functions simply declare themselves as accepting a const string, as in: void do_something(const char *s) {...} In the less common case that the string is not NUL-terminated or only a portion of the string should be processed, the function also takes a length: void do_something(const char *s, size_t n) {...} This is a common idiom in the Git codebase, it's perfectly safe, doesn't involve ownership concerns, and there is no reason to stray away from it. The proposed `strview` is not safer and is probably not as convenient, thus adds no apparent value. But, having reread the threads which your initial email referenced, I think the bigger issue is that we're dealing with an XY Problem[1]. The original problem "X" being discussed was how to achieve static initialization of some string variables while still allowing the variables to be later pointed at heap-allocated memory, but at the same time avoiding memory leaks when those reassignments occur. The proposed solution "Y" was to somehow employ `strbuf` to solve X, however, it turns out that `strbuf` is utterly unsuitable for this use-case. Unfortunately, this "Y" proposal was then turned into a GitHub issue[2] which has led to this email thread as well as those aborted and misdirected submissions which you referenced earlier. If we take a step back and focus on the original problem rather than focusing on how to twist strbuf into something it was never meant to be, then a potential solution becomes clearer. Let's restate the original problem: static const char *global_var = "thimble"; void maybe_assign(const char **var, ...) { if (...some_condition...) { /* ??? free((void *)*var) ??? */ *var = some_heap_allocated_str; } } maybe_assign(&global_var, ...); ... maybe_assign(&global_var, ...); When maybe_assign() is called, it doesn't know whether or not the incoming `var` points at a static string literal ("thimble") or at some heap-allocated string, so it doesn't know whether or not to first free() `var` before assigning the new value. To solve this, we need a flag which indicates whether the string stored in the variable needs to be freed before the variable is reassigned. So, this suggests a dedicated, simple structure and a few related functions and a macro or two. For instance, something like this: struct str { char *s; int free_me; }; /* initialize `str` from a literal string (i.e. "foo") */ #define STR_INIT(X) { .s = (char *)(X), .free_me = 0 } void str_release(str *x) { if (x.free_me) FREE_AND_NULL(x.s); x.free_me = 0; } /* take ownership of a heap-allocated string */ void str_take(str *x, char * s) { str_release(x); x.s = s; x.free_me = 1; } /* assign a string literal (i.e. "foo") */ void str_assign(str *x, const char *s) { str_release(x); x.s = (char *)s; x.free_me = 0; } That's probably about all you need to solve the stated problem. Given the above, the original problem statement can be "fixed" by taking advantage of the above structure and functions: static struct str global_var = STR_INIT("thimble"); void maybe_assign(str *var, ...) { if (...some_condition...) str_assign(var, some_heap_allocated_str); } maybe_assign(&global_var, ...); Clients which need the value simply access the `.s` member directly. And there is no need to have any functions to morph the string in any way. If a client needs that functionality, it is easy enough to create and populate a proper `strbuf` from the `.s` member. [1]: https://xyproblem.info/ [2]: https://github.com/gitgitgadget/git/issues/398 ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2026-03-24 3:33 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-22 6:55 [RFC] [GSoC]: STRBUF_INIT_CONST: initialize `strbuf` to constant string Mateo Patino
2026-03-22 8:59 ` Eric Sunshine
2026-03-23 16:10 ` Mateo Patino
[not found] ` <CAFRsFoV+k-8GMf=62GJwxP=o0Fy5RRBGW+h4NqOLjFbU6z96tw@mail.gmail.com>
2026-03-24 3:33 ` Eric Sunshine
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox