Git development
 help / color / mirror / Atom feed
* Re: [PATCH 1/2] t1401: generalize reference locking
From: Jeff King @ 2024-01-11  7:13 UTC (permalink / raw)
  To: Justin Tobler via GitGitGadget; +Cc: git, Justin Tobler
In-Reply-To: <cb78b549e5e826ffef39c55bd726164e6b7bb755.1704912750.git.gitgitgadget@gmail.com>

On Wed, Jan 10, 2024 at 06:52:29PM +0000, Justin Tobler via GitGitGadget wrote:

> From: Justin Tobler <jltobler@gmail.com>
> 
> Some tests set up reference locks by directly creating the lockfile.
> While this works for the files reference backend, reftable reference
> locks operate differently and are incompatible with this approach.
> Refactor the test to use git-update-ref(1) to lock refs instead so that
> the test does not need to be aware of how the ref backend locks refs.

It looks like you re-create this situation in a backend-agnostic way by
having two simultaneous updates that conflict on the lock (but don't
care how that lock is implemented).

That works, but I think we could keep it simple. This test doesn't care
about the exact error condition we create. The point was just to die in
create_symref() and make sure the exit code was propagated. So something
like this would work:

  $ git symbolic-ref refs/heads refs/heads/foo
  error: unable to write symref for refs/heads: Is a directory

(note that you get a different error message if the refs are packed,
since there we can notice the d/f conflict manually).

There may be other ways to stimulate a failure. I thought "symbolic-ref
HEAD refs/heads/.invalid" might work, but sadly the refname format check
happens earlier.

I think it is worth avoiding the fifo magic if we can. It's complicated,
and it means that not all platforms run the test.

-Peff

^ permalink raw reply

* Re: Limited operations in unsafe repositories
From: Jeff King @ 2024-01-11  7:01 UTC (permalink / raw)
  To: brian m. carlson; +Cc: git
In-Reply-To: <ZZ8pbAMNaBDFgf3G@tapette.crustytoothpaste.net>

On Wed, Jan 10, 2024 at 11:34:04PM +0000, brian m. carlson wrote:

> On 2024-01-10 at 12:05:31, Jeff King wrote:
> > My thinking is to flip that around: run all code, but put protection in
> > the spots that do unsafe things, like loading config or examining
> > hooks. I.e., a patch like this:
> 
> I think that's much what I had intended to do with not invoking binaries
> at all, except that it was limited to rev-parse.  I wonder if perhaps we
> could do something similar if we had the `--assume-unsafe` argument you
> proposed, except that we would only allow the `git` binary and always
> pass that argument to it in such a case.

I'm not sure what you mean by "invoking binaries". I had assumed that
meant just running other Git sub-processes. But if "--assume-unsafe" is
just setting an environment variable, they'd automatically be protected.

> I don't think reading config is intrinsically unsafe; it's more of what
> we do with it, which is spawning external processes, that's the problem.
> I suppose an argument could be made for injecting terminal sequences or
> such, though.  Hooks, obviously, are definitely unsafe.

Right, it's not config itself that's unsafe; it's that many options are.
We could try to annotate them to say "it is OK to parse core.eol but not
core.pager", presumably with an allow-known-good approach (since so many
ard bad!). But that feels like an ongoing maintenance headache, and an
easy way to make a mistake (your mention of terminal sequences makes me
assume you're thinking of "color.diff.*", etc). A rule like "we do not
read repo-level config at all" seems easier to explain (to me, anyway).

-Peff

^ permalink raw reply

* Re: [DISCUSS] Introducing Rust into the Git project
From: Patrick Steinhardt @ 2024-01-11  6:56 UTC (permalink / raw)
  To: Elijah Newren; +Cc: rsbecker, Taylor Blau, Junio C Hamano, Dragan Simic, git
In-Reply-To: <CABPp-BGmXw0NQ8yBaMiVXHiKr0-Y_jkZWmJB1CG_oc4UGxt_gA@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 4216 bytes --]

On Wed, Jan 10, 2024 at 09:06:23PM -0800, Elijah Newren wrote:
> On Wed, Jan 10, 2024 at 6:57 PM <rsbecker@nexbridge.com> wrote:
> >
> > On Wednesday, January 10, 2024 9:21 PM, Elijah Newren wrote:
> > >On Wed, Jan 10, 2024 at 5:44 PM <rsbecker@nexbridge.com> wrote:
> > >>
> > >> On Wednesday, January 10, 2024 7:59 PM, Elijah Newren wrote:
> > >[...]
> > >> >Would you be okay with the following alternative: requiring that all
> > >> >Rust code be optional for now?
> > >> >
> > >> >(In other words, allow you to build with USE_RUST=0, or something
> > >> >like that.  And then we have both a Rust and a C implementation of
> > >> >anything that is required for backward compatibility, while any new
> > >> >Rust-only stuff would not be included in your build.)
> > >>
> > >> To address the immediate above, I assume this means that platform
> > >> maintainers will be responsible for developing non-portable
> > >> implementations that duplicate Rust functionality
> > >
> > >This doesn't at all sound like what I thought I said.  The whole proposal was so that
> > >folks like NonStop could continue using Git with no more work than setting
> > >USE_RUST=0 at build time.
> > >
> > >Why do you feel you'd need to duplicate any functionality?
> >
> > I think I misunderstood. What I took from this is that all new functionality would be in Rust, which would require a custom implementation in C for platforms that did not have Rust available - if that is even practical. Did I get that wrong?
> 
> I think you somehow missed the word optional?
> 
> I did say that new functionality should be allowed to be Rust only
> (unlike existing functionality), but I'm not sure how you leaped to
> assuming that all new functionality would be in Rust.  Further, I also
> don't understand why you jump to assuming that all new functionality
> needs to be supported on all platforms.  The point of the word
> "optional" in my proposal is that it is not required.  So, say, if
> git-replay is in Rust, well you've never had git-replay before in any
> release, so you haven't lost any functionality by it being implemented
> in Rust.  And existing things (merge, cherry-pick, rebase, etc.)
> continue working with C-only code.  But you may have one less optional
> addition.
> 
> At least that was _my_ proposal -- that Rust be optional for now.  It
> does differ from what I think Taylor was originally proposing, but
> that's why I brought it up as an alternative proposal.

There are two ways to do this that I can see:

  - New features may not be available on some platforms. I think this is
    what Elijah had in mind.

  - New features may require two implementations, one in C and one in
    Rust. I think this is what Randall understood.

Ultimately, I think both alternatives would end up demoting platforms
that do not support Rust to become second-class citizens eventually.
This demotion is rather obvious in the case where new features may not
be available. But I also think that the second approach, where we
provide two implementations, would lead to a demotion of the Rust-less
platform because the alternate implementation in C would likely end up
receiving less attention than the Rust-based one. It's thus likely that
the implementation receiving less attention will deteriorate in code
quality.

I also think that once we start to accept Rust code, it will only be a
matter of time before we want to start using it in central code paths.
Rust does provide interfaces which are a lot nicer to use than the C
based ones, but it's hard to really reap the benefits unless we start to
embrace Rust fully. Also, the most complex interfaces tend to be those
which are deep inside our code base, like for example the object
database. They are thus also the most profitable targets for a Rust
conversion, even though likely also the hardest to realize.

To me this feels like a slippery slope, and the deeper we go the more
incentive we will have to drop platforms which do not support Rust
altogether. So I can certainly see where Randall is coming from and why
this proposal is not something that he is thrilled about.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH] index-pack: spawn threads atomically
From: Jeff King @ 2024-01-11  6:53 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Junio C Hamano, git
In-Reply-To: <ZZ7VEVXSg1T8ZkIK@nand.local>

On Wed, Jan 10, 2024 at 12:34:09PM -0500, Taylor Blau wrote:

> > If do go with "--threads=1", I suspect several tests in that file need
> > it.
> 
> Yeah, there are a couple of others. I think the ones that need modifying
> are at the intersection of "expected to fail" and "in a test which is
> expected to pass leak-free":
> 
>     $ grep -l 'TEST_PASSES_SANITIZE_LEAK=true' t????-*.sh |
>       xargs grep -l 'test_must_fail git index-pack'
>     t5302-pack-index.sh
>     t5308-pack-detect-duplicates.sh
>     t5309-pack-delta-cycles.sh
>     t5313-pack-bounds-checks.sh
>     t5325-reverse-index.sh

I think that is more than we need. It's only a problem when we hit a
die() inside a thread, which happens only during delta resolution. So
your patch 2, for example, touches a test which triggers the
--max-input-size check. But we would find that out on the initial
unthreaded pass over the pack.

The one in patch 3 seems at first glance like it might be a problem
(it's another duplicate-object case, like the one of them in 5309). But
it isn't a problem because the duplicate object isn't a delta, so we
notice the problem in write_idx_file() from the main thread (which I
verified by running it under gdb and setting a breakpoint at die()).

I suspect patch 4 is the same, but didn't run gdb on each case. And
patch 5 is about a corrupt reverse index, so almost certainly the main
thread. So I suspect that patch 1 is the only one that matters here (and
probably all of those are needed, because it is all about broken
deltas).

All that said, I am on the fence between the two approaches. If Junio
prefers the atomic-spawn direction, I'm fine with that, and there's not
much point in polishing the --threads=1 approach further.

-Peff

^ permalink raw reply

* Re: [DISCUSS] Introducing Rust into the Git project
From: Dragan Simic @ 2024-01-11  5:39 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Taylor Blau, git
In-Reply-To: <CABPp-BH3sva=CNtx8YFGP4Egyau-hR+7njZPFEd-DRTw91BK2w@mail.gmail.com>

On 2024-01-11 01:33, Elijah Newren wrote:
> On Wed, Jan 10, 2024 at 1:57 PM Dragan Simic <dsimic@manjaro.org> 
> wrote:
>> 
>> Thus, Git should probably follow the same approach of not converting 
>> the
>> already existing code
> 
> I disagree with this.  I saw significant performance improvements
> through converting some existing Git code to Rust.  Granted, it was
> only a small amount of code, but the performance benefits I saw
> suggested we'd see more by also doing similar conversions elsewhere.
> (Note that I kept the old C code and then conditionally compiled
> either Rust or C versions of what I was converting.)

Well, it's also possible that improving the old C code could also result 
in some performance improvements.  Thus, quite frankly, I don't see that 
as a valid argument to rewrite some existing C code in Rust.

> Further, I found a really old bug from this effort as well[1], and I
> find it extremely unlikely that I would have found that bug otherwise.
> So, converting to Rust can even improve our existing C code.
> 
>> , but frankly, I don't see what would actually be
>> the "new leafs" written in Rust.
> 
> In addition to some of the examples Junio mentioned elsewhere, I think
> new toplevel commands, like git-replay, would qualify.
> 
> 
> [1] Yeah, I really need to dig the patch out and send it in.  I'll do
> so shortly.

^ permalink raw reply

* Re: [DISCUSS] Introducing Rust into the Git project
From: Dragan Simic @ 2024-01-11  5:33 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Taylor Blau, git
In-Reply-To: <CABPp-BFOmwV-xBtjvtenb6RFz9wx2VWVpTeho0k=D8wsCCVwqQ@mail.gmail.com>

On 2024-01-11 01:12, Elijah Newren wrote:
> Another alternative (as discussed at Git Merge when we were last
> talking about Rust[8]), is requiring all Rust code to be optional for
> now.  If we choose to go that route, I think that means that (a) for
> existing components, we have both a Rust and a C implementation
> available, and (b) for new components (e.g. new top-level commands
> like git-replay), they can be Rust-only and those compiling without
> Rust just don't get them.

To me, this sounds like a horrible option, which is exactly what
I earlier referred to as introducing fragmentation.

^ permalink raw reply

* Re: [DISCUSS] Introducing Rust into the Git project
From: Elijah Newren @ 2024-01-11  5:06 UTC (permalink / raw)
  To: rsbecker; +Cc: Taylor Blau, Junio C Hamano, Dragan Simic, git
In-Reply-To: <009c01da4439$f70beef0$e523ccd0$@nexbridge.com>

On Wed, Jan 10, 2024 at 6:57 PM <rsbecker@nexbridge.com> wrote:
>
> On Wednesday, January 10, 2024 9:21 PM, Elijah Newren wrote:
> >On Wed, Jan 10, 2024 at 5:44 PM <rsbecker@nexbridge.com> wrote:
> >>
> >> On Wednesday, January 10, 2024 7:59 PM, Elijah Newren wrote:
> >[...]
> >> >Would you be okay with the following alternative: requiring that all
> >> >Rust code be optional for now?
> >> >
> >> >(In other words, allow you to build with USE_RUST=0, or something
> >> >like that.  And then we have both a Rust and a C implementation of
> >> >anything that is required for backward compatibility, while any new
> >> >Rust-only stuff would not be included in your build.)
> >>
> >> To address the immediate above, I assume this means that platform
> >> maintainers will be responsible for developing non-portable
> >> implementations that duplicate Rust functionality
> >
> >This doesn't at all sound like what I thought I said.  The whole proposal was so that
> >folks like NonStop could continue using Git with no more work than setting
> >USE_RUST=0 at build time.
> >
> >Why do you feel you'd need to duplicate any functionality?
>
> I think I misunderstood. What I took from this is that all new functionality would be in Rust, which would require a custom implementation in C for platforms that did not have Rust available - if that is even practical. Did I get that wrong?

I think you somehow missed the word optional?

I did say that new functionality should be allowed to be Rust only
(unlike existing functionality), but I'm not sure how you leaped to
assuming that all new functionality would be in Rust.  Further, I also
don't understand why you jump to assuming that all new functionality
needs to be supported on all platforms.  The point of the word
"optional" in my proposal is that it is not required.  So, say, if
git-replay is in Rust, well you've never had git-replay before in any
release, so you haven't lost any functionality by it being implemented
in Rust.  And existing things (merge, cherry-pick, rebase, etc.)
continue working with C-only code.  But you may have one less optional
addition.

At least that was _my_ proposal -- that Rust be optional for now.  It
does differ from what I think Taylor was originally proposing, but
that's why I brought it up as an alternative proposal.

^ permalink raw reply

* Re: [PATCH 4/6] t1419: mark test suite as files-backend specific
From: Patrick Steinhardt @ 2024-01-11  5:05 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Eric Sunshine, git
In-Reply-To: <xmqqcyu9duha.fsf@gitster.g>

[-- Attachment #1: Type: text/plain, Size: 856 bytes --]

On Wed, Jan 10, 2024 at 08:27:29AM -0800, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> On just this (non-technical) part...
> 
> >> > file, but not for loose references. Consequentially, all callers must
> >> > still filter emitted refs with those exclude patterns.
> >> 
> >> s/Consequentially/Consequently/
> >
> > Hum. I had the last time when you mentioned the in mind while writing
> > the commit message, but seemingly misremembered the outcome. So I now
> > looked things up in a dictionary, and both words seem to be used in
> > equivalent ways. As a non-native speaker, could you maybe elaborate a
> > bit to help me out? :)
> 
> As a non-native, I often find this
> 
>   https://trends.google.com/trends/explore?q=consequentially,consequently&hl=en
> 
> fairly useful.

Good idea, thanks!

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* RE: [DISCUSS] Introducing Rust into the Git project
From: rsbecker @ 2024-01-11  3:24 UTC (permalink / raw)
  To: 'brian m. carlson'
  Cc: 'Taylor Blau', 'Junio C Hamano',
	'Dragan Simic', git
In-Reply-To: <ZZ9YrYvW_L9A02aI@tapette.crustytoothpaste.net>

On Wednesday, January 10, 2024 9:56 PM, brian m. carlson wrote:
>On 2024-01-10 at 23:52:21, rsbecker@nexbridge.com wrote:
>> Unfortunately, none of the compiler frontends listed previously can be
>> built for NonStop. These appear to all require gcc either directly or
>> transitively, which cannot be ported to NonStop. I do not expect this
>> to change any time soon - and is outside of my control anyway. An
>> attempt was made to port Rust but it did not succeed primarily because
>> of that dependency.
>
>Can you tell us what the technical limitations are that prevent GCC from being
>ported so we can understand better?  I know LLVM doesn't support ia64, which you
>do support, but GCC is very likely the most portable compiler on the planet and
>supports architectures and OSes I've never otherwise heard of.
>
>I strongly suspect that if GCC did end up on NonStop, Rust would be able to be
>ported, too, and you'd also get access to gccgo, which would make Git LFS possible
>on NonStop as well[0].
>
>I'm not capable of porting GCC, but I have done some portability work in the Rust
>ecosystem, and I'd be willing to provide context and some assistance (within my
>time and capabilities) to help get Rust working on NonStop if you want.
>
>[0] For the record, as a maintainer of Git LFS, I'm happy to accept portability
>patches for virtually any OS.

There are a number of issues for porting gcc (and Go). The list is fairly long, but the summary of what I encountered directly (on the last funded effort of 3) is:
1. There are C syntax constructs required to do anything useful (required for access to the OS API) on NonStop that are not in gcc. I can hand code the parser for that, but it would take time.
2. The Big Endian x86 architecture is weird to gcc and making that work is not easy.
3. There is no assembler on NonStop.
4. The ELF header is very different from standard.
5. The symbol table structure is radically different, so debugging would be (nearly) impossible or impractical. gdb was ported to account for the platform differences.
6. The linkage structure is similar but different from standard.
7. The external fixup structure is radically different.
8. The loader does not work the same way, so there are required sections of the ELF files on NonStop that are not generated by gcc.

There are more, but I just did not get to the point if hitting them. Part of my own issue is that I have expertise in parsing and semantic passes of compilers, but my code generation skills are not where I want them to be for taking on this effort. Our last funded attempt had a code generation expert and he gave up in frustration.

If I was hired on to do this, it might have a chance, but at an estimate (not mine) of 4-5 person years for a gcc port, best case, my $DAYJOB will not permit it.

If gcc could be ported to NonStop, it would solve so many problems. I have heard of numerous failed efforts beyond what was officially funded by various companies, so this is considered a high-risk project.


^ permalink raw reply

* RE: [DISCUSS] Introducing Rust into the Git project
From: rsbecker @ 2024-01-11  2:57 UTC (permalink / raw)
  To: 'Elijah Newren'
  Cc: 'Taylor Blau', 'Junio C Hamano',
	'Dragan Simic', git
In-Reply-To: <CABPp-BHx=4HPSN4enkHTL7PPnNBsJ1vGWe4Em5imH7HcOcH2PA@mail.gmail.com>

On Wednesday, January 10, 2024 9:21 PM, Elijah Newren wrote:
>On Wed, Jan 10, 2024 at 5:44 PM <rsbecker@nexbridge.com> wrote:
>>
>> On Wednesday, January 10, 2024 7:59 PM, Elijah Newren wrote:
>[...]
>> >Would you be okay with the following alternative: requiring that all
>> >Rust code be optional for now?
>> >
>> >(In other words, allow you to build with USE_RUST=0, or something
>> >like that.  And then we have both a Rust and a C implementation of
>> >anything that is required for backward compatibility, while any new
>> >Rust-only stuff would not be included in your build.)
>>
>> To address the immediate above, I assume this means that platform
>> maintainers will be responsible for developing non-portable
>> implementations that duplicate Rust functionality
>
>This doesn't at all sound like what I thought I said.  The whole proposal was so that
>folks like NonStop could continue using Git with no more work than setting
>USE_RUST=0 at build time.
>
>Why do you feel you'd need to duplicate any functionality?

I think I misunderstood. What I took from this is that all new functionality would be in Rust, which would require a custom implementation in C for platforms that did not have Rust available - if that is even practical. Did I get that wrong?


^ permalink raw reply

* Re: [DISCUSS] Introducing Rust into the Git project
From: brian m. carlson @ 2024-01-11  2:55 UTC (permalink / raw)
  To: rsbecker
  Cc: 'Taylor Blau', 'Junio C Hamano',
	'Dragan Simic', git
In-Reply-To: <007c01da4420$10a7b700$31f72500$@nexbridge.com>

[-- Attachment #1: Type: text/plain, Size: 1365 bytes --]

On 2024-01-10 at 23:52:21, rsbecker@nexbridge.com wrote:
> Unfortunately, none of the compiler frontends listed previously can be
> built for NonStop. These appear to all require gcc either directly or
> transitively, which cannot be ported to NonStop. I do not expect this
> to change any time soon - and is outside of my control anyway. An
> attempt was made to port Rust but it did not succeed primarily because
> of that dependency.

Can you tell us what the technical limitations are that prevent GCC from
being ported so we can understand better?  I know LLVM doesn't support
ia64, which you do support, but GCC is very likely the most portable
compiler on the planet and supports architectures and OSes I've never
otherwise heard of.

I strongly suspect that if GCC did end up on NonStop, Rust would be able
to be ported, too, and you'd also get access to gccgo, which would make
Git LFS possible on NonStop as well[0].

I'm not capable of porting GCC, but I have done some portability work in
the Rust ecosystem, and I'd be willing to provide context and some
assistance (within my time and capabilities) to help get Rust working on
NonStop if you want.

[0] For the record, as a maintainer of Git LFS, I'm happy to accept
portability patches for virtually any OS.
-- 
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

^ permalink raw reply

* Re: [DISCUSS] Introducing Rust into the Git project
From: Elijah Newren @ 2024-01-11  2:21 UTC (permalink / raw)
  To: rsbecker; +Cc: Taylor Blau, Junio C Hamano, Dragan Simic, git
In-Reply-To: <008701da442f$b2dfe420$189fac60$@nexbridge.com>

On Wed, Jan 10, 2024 at 5:44 PM <rsbecker@nexbridge.com> wrote:
>
> On Wednesday, January 10, 2024 7:59 PM, Elijah Newren wrote:
[...]
> >Would you be okay with the following alternative: requiring that all Rust code be
> >optional for now?
> >
> >(In other words, allow you to build with USE_RUST=0, or something like that.  And
> >then we have both a Rust and a C implementation of anything that is required for
> >backward compatibility, while any new Rust-only stuff would not be included in
> >your build.)
>
> To address the immediate above, I assume this means that platform maintainers will be responsible for developing non-portable implementations that duplicate Rust functionality

This doesn't at all sound like what I thought I said.  The whole
proposal was so that folks like NonStop could continue using Git with
no more work than setting USE_RUST=0 at build time.

Why do you feel you'd need to duplicate any functionality?

^ permalink raw reply

* Re: [PATCH v2] gitweb: Fixes error handling when reading configuration
From: Marcelo Roberto Jimenez @ 2024-01-11  2:14 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <xmqqzfxc7mfh.fsf@gitster.g>

Hi Junio,

Ok, I'll put on my devil's prosecutor's hat then :)

I am not sure you read the first message in this thread (I know, I
messed up, sorry), but there I describe the real case that happened to
me, and the current behavior made it much harder to find out what was
happening.

That is the one:
https://lore.kernel.org/git/CACjc_5pdijCZrrXQWHswsxYrGUzZ7pZq+nj3SzY1z+Xxop64Ww@mail.gmail.com/T/#u

If the configuration file is there, but it behaves as it is not, one
will quickly point its fingers to the software that reads it.

In the end, if I had seen the message "Permission denied." that was
stored in "$!", I would have had a better clue that the problem was
with AppArmor, and not Gitweb.

There were a few unanswered questions in stack overflow regarding the
issue of enabling features on Gitweb when using "git instaweb" in a
persistent way, some of them might have been linked to that.

- gitweb refusing to blame:
https://serverfault.com/questions/551762/gitweb-refusing-to-blame
- How do I enable "blame" when using git instaweb?:
https://stackoverflow.com/questions/66688084/how-do-i-enable-blame-when-using-git-instaweb/77793405
- My own question: Problem with `git instaweb` on OpenSUSE Tumbleweed:
/etc/gitweb-common.conf is not being read:
https://stackoverflow.com/questions/77789216/problem-with-git-instaweb-on-opensuse-tumbleweed-etc-gitweb-common-conf-is-n

Let's see if someone comes up with a good reason to find the devil not guilty :)

Regards,
Marcelo.

On Wed, Jan 10, 2024 at 9:17 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Marcelo Roberto Jimenez <marcelo.jimenez@gmail.com> writes:
>
> > This patch fixes a possibility of a permission to access error go
> > unnoticed.
> >
> > Perl uses two different variables to manage errors from a do. One
> > is $@, which is set in this case when do is unable to compile the
> > file. The other is $!, which is set in case do cannot read the file.
> > By printing the value of $! I found out that it was set to Permission
> > denied. Since the script does not currently test for $!, the error
> > goes unnoticed.
>
> Well explained how the current code behaves and why.
>
> With my devil's advocate hat on, I suspect that the current
> behaviour comes from the wish to see "file exists but unreadable"
> and "the named file does not exist" behave the same way.  If you
> pass the name of a configuration file that does not exist, however,
> the codeblock to die does not trigger at all.  If a file does exist
> but unreadable, to gitweb, it is just as good as having no file to
> read configuration data from---either way, it cannot use anything
> useful from the named file.  So returning silently, which is the
> "bug" you are fixing, does not sound too bad.
>
> I dunno.  Let's queue and see what others would say.
>
> Thanks.
>
> > Perl do block documentation: https://perldoc.perl.org/functions/do
> >
> > Signed-off-by: Marcelo Roberto Jimenez <marcelo.jimenez@gmail.com>
> > ---
> >  gitweb/gitweb.perl | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> > index e66eb3d9ba..5d0122020f 100755
> > --- a/gitweb/gitweb.perl
> > +++ b/gitweb/gitweb.perl
> > @@ -728,9 +728,11 @@ sub filter_and_validate_refs {
> >  sub read_config_file {
> >       my $filename = shift;
> >       return unless defined $filename;
> > -     # die if there are errors parsing config file
> >       if (-e $filename) {
> >               do $filename;
> > +             # die if there is a problem accessing the file
> > +             die $! if $!;
> > +             # die if there are errors parsing config file
> >               die $@ if $@;
> >               return 1;
> >       }

^ permalink raw reply

* Re: [PATCH v3 1/2] t7501: add tests for --include and --only
From: Ghanshyam Thakkar @ 2024-01-11  1:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, phillip.wood123, christian.couder
In-Reply-To: <xmqqil41avcc.fsf@gitster.g>

On Thu Jan 11, 2024 at 12:07 AM IST, Junio C Hamano wrote:
> Ghanshyam Thakkar <shyamthakkar001@gmail.com> writes:
>
> > -# FIXME: Test the various index usages, -i and -o, test reflog,
> > +# fixme: test the various index usages, test reflog,
> >  # signoff
>
> Why the sudden downcasing?  If this were to turn it to "TODO:"
> (110+) or "NEEDSWORK:" (110+) from less often used "FIXME:" (40-),
> such a change might be defensible, but there is only one instance
> of downcased "fixme:", so I am curious how this happened.

Wow, I must have done it mistakenly. I guess everything on that line
became lowercase. But, I will fix it.

> > +test_expect_success '--include fails with untracked files' '
> > +	echo content >baz &&
> > +	test_must_fail git commit --include -m "initial" baz
> > +'
>
> OK, this is because "--allow-empty" is not passed.  This should fail
> without -i/-o (which should be the same as passing -o), with -i, or
> with -o.
>
> Calling this commit "initial" is highly misleading.  There are bunch
> of commits already, but path "baz" has never been used.

I will fix this.

> > +test_expect_success '--include with staged changes' '
> > +	echo newcontent >baz &&
> > +	echo newcontent >file &&
> > +	git add file &&
> > +	git commit --include -m "file baz" baz  &&
>
> I suspect that you found a bug whose behaviour we do not want to
> carve into stone with this test.  When this test begins, because the
> previous step failed to create the initial commit, we are creating
> the root commit, without --allow-empty, with contents in the index
> for path "file".  At this point
>
>     $ git commit -m "file baz" baz
>
> (or with "-o", which is the same thing) does error out with
>
>     error: pathspec 'baz' did not match any file(s) known to git
>
> because the "only" thing is to take the changes between HEAD and the
> index and limit them further to those paths that match "baz", but
> there is no path that matches "baz".


> This command
>
>     $ git commit -m "file baz" -i baz
>
> should either error out the same way, or behave more or less[*] like
>
>     $ git add baz && git commit -m "file baz"
>
> and record changes to both "file" and "baz".
>
>     Side note: The "more or less" is we should do "git rm baz"
>                instead, if you removed the path.
>
> But it seems that the current code simply ignores the unmatching
> pathspec "baz" that is on the command line, hence recording only the
> change to the contents of "file".
>
yeah, it seems like if there is anything staged, even if we provide
--include with untracked files, it will commit the staged files
(excluding the untracked files) and will not error out like --only.

Also in v1 there was another test before this one which added the
baz file to the index. That test was removed due to duplication and
while cleaning up the v1 for v2, due to --include not giving an error,
I did not notice that 'baz' was being left out. Will fix the tests.
Apologies for such mistakes.

> > +	git diff --name-only >remaining &&
> > +	test_must_be_empty remaining &&
> > +	git diff --name-only --staged >remaining &&
> > +	test_must_be_empty remaining
>
> These two tests to see if the working tree is clean and if the index
> is clean are not wrong per-se, but is insufficient.  Judging from
> the commit message you used, I think you expected this commit to
> contain both changes to 'file' and 'baz'.  That needs to be also
> checked with something like "git diff --name-only HEAD^ HEAD".

Yeah, the "git diff --name-only HEAD^ HEAD" is more definitive.
I will add that in v4.

> Now which behaviour between "error out because there is no path in
> the index that matches pathspec 'baz'" and "add baz to the index and
> commit that addition, together with what is already in the index" we
> would want to take is probably open for discussion.  Such a discussion
> may decide that the current behaviour is fine.  Until then...
>
> > +test_expect_success '--only fails with untracked files' '
> > +	echo content >newfile &&
> > +	test_must_fail git commit --only -m "newfile" newfile
> > +'
>
> And this should fail the same way without "-o".  Don't we have such
> a test that we can just sneak "with -o the same thing happens" test
> next to it?

Well, I could not find any test which specifically for untracked
files. I will keep looking for it and if not found, perhaps, I can add
both "-i and -o with untracked files" tests after "nothing to commit"
tests in t7501 which are similar in nature.

> > +	git commit --only -m "file" file &&
>
> This should behave exactly the same way without "-o".

That is true, however, I could not find any tests in the t75-- series
that test without -o and which provide pathspec in the commit command
also. So, should I drop -o option in this test? or add a test without
-o?

>
> > +	git diff --name-only >actual &&
> > +	test_must_be_empty actual &&
> > +	git diff --name-only --staged >actual &&
> > +	test_grep "baz" actual
> > +'
> > +
> > +test_expect_success '--include and --only do not mix' '
> > +	test_when_finished "git reset --hard" &&
> > +	echo new >file &&
> > +	echo new >baz &&
> > +	test_must_fail git commit --include --only -m "bar" bar baz
>
> OK.  If you grep for 'cannot be used together' in t/ directory,
> there are many tests that make sure how an invocation like this
> should fail, i.e. with an error message that mentions incompatible
> options.  Don't we want to do the same?

I did it like this because in t7501, there are couple of existing 
"do not mix" tests similar to this one, which only check if the command
fails. However, the approach you mentioned is obviously better, so, I
will update it in v4.

> Thanks.

Thank you for the review!

^ permalink raw reply

* Re: [DISCUSS] Introducing Rust into the Git project
From: brian m. carlson @ 2024-01-11  1:56 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git
In-Reply-To: <ZZ77NQkSuiRxRDwt@nand.local>

[-- Attachment #1: Type: text/plain, Size: 9771 bytes --]

On 2024-01-10 at 20:16:53, Taylor Blau wrote:
> Over the holiday break at the end of last year I spent some time
> thinking on what it would take to introduce Rust into the Git project.
> 
> There is significant work underway to introduce Rust into the Linux
> kernel (see [1], [2]). Among their stated goals, I think there are a few
> which could be potentially relevant to the Git project:
> 
>   - Lower risk of memory safety bugs, data races, memory leaks, etc.
>     thanks to the language's safety guarantees.
> 
>   - Easier to gain confidence when refactoring or introducing new code
>     in Rust (assuming little to no use of the language's `unsafe`
>     feature).

I agree with both of these points.  We've found that making our code
thread safe in Git is hard and it's much easier in Rust, because, for
the most part, the code doesn't compile if it would have a data race.
Unit tests are also easy and built-in, and I think that's a major
advantage.

We also get nice things for free, like sets, maps, lists, and a variety
of other collections that are all type-safe.  Error handling is also a
huge benefit: we'll get typed errors with the ability to pass data back.

>   - Contributing to Git becomes easier and accessible to a broader group
>     of programmers by relying on a more modern language.

I think this can't be understated.  One of the biggest hurdles for
people contributing is that our code requires expert knowledge of C.  We
do all sorts of weird things with pointer arithmetic that even I have
trouble understanding, and I'd really appreciate not having to worry
about memory leaks or freeing resources.[0]  Rust has nice things like the
Drop trait that make resource management easy.

Rust is also a language that people _want_ to use.  I really like it and
would probably contribute more if Git were in it.  I don't really want
to write more C, and outside of Git, won't use it on more than a de
minimis basis unless paid.

I can confirm that, having partially ported our service that serves Git
traffic to Rust from C (without the public having noticed), it's a much
nicer environment to work in.  I'm also much more efficient at making
changes as well.

> Given the allure of these benefits, I think it's at least worth
> considering and discussing how Rust might make its way into Junio's
> tree.

A couple of things which I think are worth discussing are as follows:

The Rust project emits a new release every six weeks and doesn't provide
LTS versions.  What versions of Rust are supported by crates vary
widely, and we'll absolutely need to choose our dependencies wisely.  We
may also want to ask crate authors if they'll be willing to commit to
our version policy before using them; oftentimes, that can work.

The approach that I aim for is supporting the version of Rust in the
latest Debian stable, plus the version in Debian's previous stable
release until the latest stable has been out for a year.  (Thus, if
Debian 12 was released on 2023-06-10, then I'd support Rust 1.48, Debian
11's version, until 2024-06-10, and then support would move to 1.63,
Debian 12's version.)  This provides about three years of support for a
compiler version, which I think is fair.

Note that none of this means that we're dropping support for older
systems; newer versions of Rust will be available for most targets,
even often after OSes go end of life.

We'll also probably need to continue to rely on some C libraries.  For
example, reqwest, the main Rust HTTP client, doesn't support any
authentication other than Basic, and I assure you from my experience as
the Git LFS maintainer, we don't want to implement things like NTLM and
Kerberos on our own.  libcurl is almost certainly going to continue to
be a dependency, as will PCRE.  The Rust regex crate doesn't support
backreferences, and we've basically tied lots of our regexes to POSIX,
so we'll need to either rely on PCRE or some call out to a
POSIX-compatible interface.  gettext is likely to be another issue,
although its thread-safety is potentially a problem; we could try using
the `tr` crate instead, which also provides a Rust-specific string
ripper.

> I imagine that the transition state would involve some parts of the
> project being built in C and calling into Rust code via FFI (and perhaps
> vice-versa, with Rust code calling back into the existing C codebase).
> Luckily for us, Rust's FFI provides a zero-cost abstraction [3], meaning
> there is no performance impact when calling code from one language in
> the other.

Moreover, there are even ways to generate Rust bindings for C code and C
headers for Rust code automatically.  (These are cbindgen and bindgen,
respectively.)  I've used both, and while it's clearly an FFI case, it's
still very ergonomic.

> Some open questions from me, at least to get the discussion going are:
> 
>   1. Platform support. The Rust compiler (rustc) does not enjoy the same
>      widespread availability that C compilers do. For instance, I
>      suspect that NonStop, AIX, Solaris, among others may not be
>      supported.
> 
>      One possible alternative is to have those platforms use a Rust
>      front-end for a compiler that they do support. The gccrs [4]
>      project would allow us to compile Rust anywhere where GCC is
>      available. The rustc_codegen_gcc [5] project uses GCC's libgccjit
>      API to target GCC from rustc itself.

I think this is probably the biggest stumbling point.  I know GCC is
highly portable and works on AIX, as well as virtually every
architecture.  gccrs is still incomplete, but I believe
rustc_codegen_gcc is mature, and should be a viable option for most
platforms.  (Solaris is already supported on Rust[1].)

My main concerns are with NonStop, since the Rust standard library
requires threading and a CSPRNG (although that can definitely be RDRAND,
and is for some targets).  I seem to recall that neither GCC nor LLVM
are present there, although I see no reason why GCC could not be ported
(LLVM lacks support for ia64, I believe, which would make it a bigger
lift)

I suspect that if we go forward, though, a lot of the work for
architecture support in Rust upstream will already have been done, since
I'm pretty sure the Debian porters for architectures like alpha, hppa,
and ia64 are going to want to continue to use Git.  NetBSD porters may
also have useful patches in pkgsrc.

I am also very sympathetic to the difficulties of running on less common
systems, having had a PowerPC Mac running Linux as my first laptop and
several UltraSPARC machines.  I have sent in numerous patches to a wide
variety of code so that it works gracefully on lots of architectures,
and I've also dealt with lots of broken software.  I do, however, think
it's up to the porters of an OS to keep it running and healthy, and that
means making sure it has suitable compiler toolchains for building,
including for modern, extremely popular languages like Rust and Go.  I'm
okay with dropping support for systems where nobody upstream wants to or
is capable of maintaining that tooling.

I actually feel that once Rust is running on a system, it's actually
easier to write portable code, since you don't have alignment issues and
endianness must be handled explicitly, and most safe Rust code just
works out of the box.

>   2. Migration. What parts of Git are easiest to convert to Rust? My
>      hunch is that the answer is any stand-alone libraries, like
>      strbuf.h. I'm not sure how we should identify these, though, and in
>      what order we would want to move them over.

strbuf.h is tricky because it uses variadic arguments, which are not
stable in Rust.  My approach would be to start by getting the main
function up and running, and then we can incrementally port things over.

We could, for example, use the `sha256` crate for our SHA-256 code
(which would also dynamically use accelerated hardware implementations
where available).  There are other things which are libraries which
could well work, though.  Porting over our hashmap implementation might
be a thing to do, for example.  The repository structure might also be
a good idea, since that will allow us to write safe wrappers for its
contents.

>   3. Interaction with the lib-ification effort. There is lots of work
>      going on in an effort to lib-ify much of the Git codebase done by
>      Google. I'm not sure how this would interact with that effort, but
>      we should make sure that one isn't a blocker for the other.

I think it's going to work together nicely.  We can and should consider
building a C library from Rust to expose a lot of what we write.

Also, in my view, the biggest enemy to libification in our codebase is
our copious and improvident use of globals.  Mutating static variables
in Rust is unsafe, so as part of the port, we'll need to get rid of
them, which seems like a nice common goal.

> I'm curious to hear what others think about this. I think that this
> would be an exciting and worthwhile direction for the project. Let's
> see!

I'm very much in favour of this.  I think I brought it up at the
contributor's summit and it caught some attention, but I don't think it
should be too controversial and it will offer us a lot of advantages.

[0] And before people say, "Well, you just need to spend more time with
C," I've been writing it since I was 10 and I think we can all agree
that with the SHA-256 work I've spent plenty of time with it.
[1] rustc --print target-list is a great way to see what's supported.
-- 
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

^ permalink raw reply

* RE: [DISCUSS] Introducing Rust into the Git project
From: rsbecker @ 2024-01-11  1:44 UTC (permalink / raw)
  To: 'Elijah Newren'
  Cc: 'Taylor Blau', 'Junio C Hamano',
	'Dragan Simic', git
In-Reply-To: <CABPp-BEw_HFL-9u6WdSEe-qr_JfJyQtfU6PP7izEdPChKooc6g@mail.gmail.com>

On Wednesday, January 10, 2024 7:59 PM, Elijah Newren wrote:
>On Wed, Jan 10, 2024 at 3:52 PM <rsbecker@nexbridge.com> wrote:
>>
>> On Wednesday, January 10, 2024 5:26 PM, Taylor Blau wrote:
>> >On Wed, Jan 10, 2024 at 05:15:53PM -0500, rsbecker@nexbridge.com wrote:
>> >> Just a brief concern: Rust is not broadly portable. Adding another
>> >> dependency to git will remove many existing platforms from future releases.
>> >> Please consider this carefully before going down this path.
>> >
>> >I was hoping to hear from you as one of the few (only?) folks who
>> >participate on the list and represent HPE NonStop users.
>> >
>> >I'm curious which if any of the compiler frontends that I listed in
>> >my earlier email would work for you.
>>
>> Unfortunately, none of the compiler frontends listed previously can be built for
>NonStop. These appear to all require gcc either directly or transitively, which cannot
>be ported to NonStop. I do not expect this to change any time soon - and is outside
>of my control anyway. An attempt was made to port Rust but it did not succeed
>primarily because of that dependency. Similarly, Golang is also not portable to
>NonStop because of architecture assumptions made by the Go team that cannot be
>satisfied on NonStop at this time. If some of the memory/pointer issuese the
>primary concern, c11 might be something acceptable with smart pointers. C17 will
>eventually be deployable, but is not available on most currently supported OS
>versions on the platform.
>
>Would you be okay with the following alternative: requiring that all Rust code be
>optional for now?
>
>(In other words, allow you to build with USE_RUST=0, or something like that.  And
>then we have both a Rust and a C implementation of anything that is required for
>backward compatibility, while any new Rust-only stuff would not be included in
>your build.)

To address the immediate above, I assume this means that platform maintainers will be responsible for developing non-portable implementations that duplicate Rust functionality, which arguably may not be possible. We do have $DAYJOBS and the expectation that duplicate implementation are cost effective or even viable is a huge assumption that may not be attainable.

One of the key benefits of git is the ability to deploy it virtually anywhere on virtually any platform - and mirror repositories anywhere for resiliency purposes. It currently runs on (almost) every current platform because it does not have dependencies on Linux-only compilers and tools. Except for LFS, which is Golang, and I do not have access to that functionality, anyone with a C compiler can deploy git processes in their environment. By adding Rust (or any other gcc-only dependency), it eliminates the primary benefit of git. I am honestly very disappointed with this direction and think this detracts significantly from the primary value proposition that git offers: specifically, that we can take any developer from any platform and move them anywhere else without having to design new processes or teach them new processes for their workflows (this comes up at every major customer with whom I interact). I think this direction is a fundamental mistake and will rapidly limit (or eliminate) git's long-term viability. To be honest, if I saw this direction when deciding which VCS to deploy, I would reconsider git and start looking around for another more portable option. It hurts to even contemplate this direction. Please do not do this.


^ permalink raw reply

* Re: [DISCUSS] Introducing Rust into the Git project
From: Elijah Newren @ 2024-01-11  0:59 UTC (permalink / raw)
  To: rsbecker; +Cc: Taylor Blau, Junio C Hamano, Dragan Simic, git
In-Reply-To: <007c01da4420$10a7b700$31f72500$@nexbridge.com>

On Wed, Jan 10, 2024 at 3:52 PM <rsbecker@nexbridge.com> wrote:
>
> On Wednesday, January 10, 2024 5:26 PM, Taylor Blau wrote:
> >On Wed, Jan 10, 2024 at 05:15:53PM -0500, rsbecker@nexbridge.com wrote:
> >> Just a brief concern: Rust is not broadly portable. Adding another
> >> dependency to git will remove many existing platforms from future releases.
> >> Please consider this carefully before going down this path.
> >
> >I was hoping to hear from you as one of the few (only?) folks who participate on
> >the list and represent HPE NonStop users.
> >
> >I'm curious which if any of the compiler frontends that I listed in my earlier email
> >would work for you.
>
> Unfortunately, none of the compiler frontends listed previously can be built for NonStop. These appear to all require gcc either directly or transitively, which cannot be ported to NonStop. I do not expect this to change any time soon - and is outside of my control anyway. An attempt was made to port Rust but it did not succeed primarily because of that dependency. Similarly, Golang is also not portable to NonStop because of architecture assumptions made by the Go team that cannot be satisfied on NonStop at this time. If some of the memory/pointer issues are the primary concern, c11 might be something acceptable with smart pointers. C17 will eventually be deployable, but is not available on most currently supported OS versions on the platform.

Would you be okay with the following alternative: requiring that all
Rust code be optional for now?

(In other words, allow you to build with USE_RUST=0, or something like
that.  And then we have both a Rust and a C implementation of anything
that is required for backward compatibility, while any new Rust-only
stuff would not be included in your build.)

^ permalink raw reply

* Re: [PATCH 0/2] Generalize reference locking in tests
From: Junio C Hamano @ 2024-01-11  0:36 UTC (permalink / raw)
  To: Justin Tobler via GitGitGadget; +Cc: git, Justin Tobler
In-Reply-To: <pull.1634.git.1704912750.gitgitgadget@gmail.com>

"Justin Tobler via GitGitGadget" <gitgitgadget@gmail.com> writes:

> This approach is more verbose and may warrant consideration of implementing
> an abstraction to further simplify this operation. There appears to be one
> other existing test in t1400 that also follows this pattern. Being that
> there is only a handful of affected tests the implemented approach may be
> sufficient as is.

The use of two fifos and avoiding deadlocking parent and child is
tricky enough that it does feel that it warrants a helper function
to do the common part of what these two patches add.

I think I read t1401 patch carefully enough to understand what is
going on, but I cannot yet claim the same for the other one.

Thanks.

> Justin Tobler (2):
>   t1401: generalize reference locking
>   t5541: generalize reference locking
>
>  t/t1401-symbolic-ref.sh    | 28 ++++++++++++++++++++++++----
>  t/t5541-http-push-smart.sh | 25 +++++++++++++++++++++++--
>  2 files changed, 47 insertions(+), 6 deletions(-)
>
>
> base-commit: 624eb90fa8f65a79396615f3c2842ac5a3743350
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1634%2Fjltobler%2Fjt%2Fbackend-generic-ref-lock-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1634/jltobler/jt/backend-generic-ref-lock-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1634

^ permalink raw reply

* Re: [DISCUSS] Introducing Rust into the Git project
From: Elijah Newren @ 2024-01-11  0:33 UTC (permalink / raw)
  To: Dragan Simic; +Cc: Taylor Blau, git
In-Reply-To: <b2651b38a4f7edaf1c5ffee72af00e46@manjaro.org>

On Wed, Jan 10, 2024 at 1:57 PM Dragan Simic <dsimic@manjaro.org> wrote:
>
> Thus, Git should probably follow the same approach of not converting the
> already existing code

I disagree with this.  I saw significant performance improvements
through converting some existing Git code to Rust.  Granted, it was
only a small amount of code, but the performance benefits I saw
suggested we'd see more by also doing similar conversions elsewhere.
(Note that I kept the old C code and then conditionally compiled
either Rust or C versions of what I was converting.)

Further, I found a really old bug from this effort as well[1], and I
find it extremely unlikely that I would have found that bug otherwise.
So, converting to Rust can even improve our existing C code.

>, but frankly, I don't see what would actually be
> the "new leafs" written in Rust.

In addition to some of the examples Junio mentioned elsewhere, I think
new toplevel commands, like git-replay, would qualify.


[1] Yeah, I really need to dig the patch out and send it in.  I'll do
so shortly.

^ permalink raw reply

* Re: [PATCH v2] gitweb: Fixes error handling when reading configuration
From: Junio C Hamano @ 2024-01-11  0:17 UTC (permalink / raw)
  To: Marcelo Roberto Jimenez; +Cc: git
In-Reply-To: <20240110225709.30168-1-marcelo.jimenez@gmail.com>

Marcelo Roberto Jimenez <marcelo.jimenez@gmail.com> writes:

> This patch fixes a possibility of a permission to access error go
> unnoticed.
>
> Perl uses two different variables to manage errors from a do. One
> is $@, which is set in this case when do is unable to compile the
> file. The other is $!, which is set in case do cannot read the file.
> By printing the value of $! I found out that it was set to Permission
> denied. Since the script does not currently test for $!, the error
> goes unnoticed.

Well explained how the current code behaves and why.

With my devil's advocate hat on, I suspect that the current
behaviour comes from the wish to see "file exists but unreadable"
and "the named file does not exist" behave the same way.  If you
pass the name of a configuration file that does not exist, however,
the codeblock to die does not trigger at all.  If a file does exist
but unreadable, to gitweb, it is just as good as having no file to
read configuration data from---either way, it cannot use anything
useful from the named file.  So returning silently, which is the
"bug" you are fixing, does not sound too bad.

I dunno.  Let's queue and see what others would say.

Thanks.

> Perl do block documentation: https://perldoc.perl.org/functions/do
>
> Signed-off-by: Marcelo Roberto Jimenez <marcelo.jimenez@gmail.com>
> ---
>  gitweb/gitweb.perl | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/gitweb/gitweb.perl b/gitweb/gitweb.perl
> index e66eb3d9ba..5d0122020f 100755
> --- a/gitweb/gitweb.perl
> +++ b/gitweb/gitweb.perl
> @@ -728,9 +728,11 @@ sub filter_and_validate_refs {
>  sub read_config_file {
>  	my $filename = shift;
>  	return unless defined $filename;
> -	# die if there are errors parsing config file
>  	if (-e $filename) {
>  		do $filename;
> +		# die if there is a problem accessing the file
> +		die $! if $!;
> +		# die if there are errors parsing config file
>  		die $@ if $@;
>  		return 1;
>  	}

^ permalink raw reply

* Re: [DISCUSS] Introducing Rust into the Git project
From: Elijah Newren @ 2024-01-11  0:12 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git
In-Reply-To: <ZZ77NQkSuiRxRDwt@nand.local>

On Wed, Jan 10, 2024 at 12:18 PM Taylor Blau <me@ttaylorr.com> wrote:
>
> Over the holiday break at the end of last year I spent some time
> thinking on what it would take to introduce Rust into the Git project.

I'm very happy to see this email.

> There is significant work underway to introduce Rust into the Linux
> kernel (see [1], [2]). Among their stated goals, I think there are a few
> which could be potentially relevant to the Git project:
>
>   - Lower risk of memory safety bugs, data races, memory leaks, etc.
>     thanks to the language's safety guarantees.
>
>   - Easier to gain confidence when refactoring or introducing new code
>     in Rust (assuming little to no use of the language's `unsafe`
>     feature).
>
>   - Contributing to Git becomes easier and accessible to a broader group
>     of programmers by relying on a more modern language.
>
> Given the allure of these benefits, I think it's at least worth
> considering and discussing how Rust might make its way into Junio's
> tree.

I think there are other benefits as well; I'll list them at the end of
the email to avoid side-tracking too much[6].

> I imagine that the transition state would involve some parts of the
> project being built in C and calling into Rust code via FFI (and perhaps
> vice-versa, with Rust code calling back into the existing C codebase).
> Luckily for us, Rust's FFI provides a zero-cost abstraction [3], meaning
> there is no performance impact when calling code from one language in
> the other.

I agree with the zero-cost abstraction, but there is a funny caveat
with measuring it if anyone is curious[7].

> Some open questions from me, at least to get the discussion going are:
>
>   1. Platform support. The Rust compiler (rustc) does not enjoy the same
>      widespread availability that C compilers do. For instance, I
>      suspect that NonStop, AIX, Solaris, among others may not be
>      supported.
>
>      One possible alternative is to have those platforms use a Rust
>      front-end for a compiler that they do support. The gccrs [4]
>      project would allow us to compile Rust anywhere where GCC is
>      available. The rustc_codegen_gcc [5] project uses GCC's libgccjit
>      API to target GCC from rustc itself.

Another alternative (as discussed at Git Merge when we were last
talking about Rust[8]), is requiring all Rust code to be optional for
now.  If we choose to go that route, I think that means that (a) for
existing components, we have both a Rust and a C implementation
available, and (b) for new components (e.g. new top-level commands
like git-replay), they can be Rust-only and those compiling without
Rust just don't get them.

>   2. Migration. What parts of Git are easiest to convert to Rust? My
>      hunch is that the answer is any stand-alone libraries, like
>      strbuf.h. I'm not sure how we should identify these, though, and in
>      what order we would want to move them over.

If we're happy to allow Rust, I'd like to rewrite git-replay in Rust
as a testcase.  It's almost certainly not "easiest", but I think it's
an interesting testcase because it's a new top-level command that
hasn't appeared in any release yet.  Further, it is currently only
designed for server-side usecases, so would likely not be affected by
more limited platform support.  (I haven't started on this; my
previous experiments were with diffcore-delta.)

> I'm curious to hear what others think about this. I think that this
> would be an exciting and worthwhile direction for the project. Let's
> see!

:-)

>
> Thanks,
> Taylor
>
> [1]: https://rust-for-linux.com/
> [2]: https://lore.kernel.org/rust-for-linux/20210414184604.23473-1-ojeda@kernel.org/
> [3]: https://blog.rust-lang.org/2015/04/24/Rust-Once-Run-Everywhere.html#c-talking-to-rust
> [4]: https://github.com/Rust-GCC/gccrs
> [5]: https://github.com/rust-lang/rustc_codegen_gcc

[6] Here are some additional benefits I see:

 - Parallel performance.  We avoid making things parallel in Git because
   debugging/maintaining/reviewing parallel code in C often isn't worth
   the squeeze.  Rust was designed to greatly reduce this effort (the
   whole "fearless concurrency" thing).

 - Single-threaded Performance.  Multiple factors:

   - We had (and might still have) O(N^2) stuff in a lot of places in
     our codebase, because we tend to over-use arrays.  (e.g. with
     string_list, or with insertions and deletions into the index
     during a merge, etc.)

   - Relatedly, using hashes in C is quite onerous, to the point that
     we often simply avoid it.  I know I have, and I also know that
     even after I introduced strmap and tried to use it outside of
     merge-ort, that I got pushback because "string hash-maps are not
     really typical for a C program. I'm sure they are the best choice
     for an advanced merge algorithm but they are not really necessary
     [here; let's use sorted arrays instead]..."  I then had to go
     through multiple rounds of responses and ended up reimplementing
     everything as suggested (before finally convincing others to just
     use the strmap implementation after all).

   - We use QSORT() which basically calls libc's qsort().  Due to the
     design of this function (where the comparator is a separate
     function call), it is slow.  When languages avoid making the
     comparator a separate function call, they can speed sorts up by a
     factor of 2 (or even by 3 when an unstable sort is good enough
     and the platform's qsort() is stable).

   - Difficulty of incorporating other libraries.  For example, our
     hashmap.[ch] make use of FNV, but picking something else is a big
     amount of effort.  Now, while FNV is faster than Rust's default
     of SipHash, cargo makes it easy to pull in alternatives like
     FnvHashMap or FxHashMap, which we can then use where it matters.

I'm also tempted to include bullet points for having a unit testing
framework built in, and potentially fewer platform-dependent issues
(e.g. forgetting to use STABLE_QSORT when required since qsort is
stable in some libc implementations, since rust defines those more
carefully to be consistent across platforms), but I'm not sure these
additional advantages are big enough to merit a full bullet point.

[7] If you ignore Rust for a moment, and simply divide your files into
different libraries (e.g. introducing a new.c file, moving some
functions to it, and then compiling new.c into a new library,
libnew.a, and linking both libgit.a and libnew.a into git), you can
sometimes measure some small performance differences.  At least, I
did.  What this scenario has to do with Rust is that if we start
moving some code to Rust, that will naturally likely result in a
different division of files into libraries.  Thus, for me to verify
that Rust did provide zero-cost abstractions with my experiments, in
order to compare the performance of my Rust changes, I had to compare
to a version of git where I split some functions out into a separate
library.  When I did that, the performance overhead was actually 0.
Otherwise, there was a tiny performance degradation in the particular
splitting I employed.  However, while splitting did give me a small
performance drop, it was completely outweighed by the performance
advantages I got elsewhere in the things I converted to Rust.

[8] https://lore.kernel.org/git/ZRrfN2lbg14IOLiK@nand.local/

^ permalink raw reply

* Re: Limited operations in unsafe repositories
From: Junio C Hamano @ 2024-01-11  0:04 UTC (permalink / raw)
  To: brian m. carlson; +Cc: Jeff King, git
In-Reply-To: <ZZ8pbAMNaBDFgf3G@tapette.crustytoothpaste.net>

"brian m. carlson" <sandals@crustytoothpaste.net> writes:

> On 2024-01-10 at 12:05:31, Jeff King wrote:
>> My thinking is to flip that around: run all code, but put protection in
>> the spots that do unsafe things, like loading config or examining
>> hooks. I.e., a patch like this:
>
> I think that's much what I had intended to do with not invoking binaries
> at all, except that it was limited to rev-parse.  I wonder if perhaps we
> could do something similar if we had the `--assume-unsafe` argument you
> proposed, except that we would only allow the `git` binary and always
> pass that argument to it in such a case.
>
> I don't think reading config is intrinsically unsafe; it's more of what
> we do with it, which is spawning external processes, that's the problem.
> I suppose an argument could be made for injecting terminal sequences or
> such, though.  Hooks, obviously, are definitely unsafe.

Sure.  And we allow the location of hook programs to be specified as
configuration variable values, which would make the config even more
dangerous X-<.

^ permalink raw reply

* RE: [DISCUSS] Introducing Rust into the Git project
From: rsbecker @ 2024-01-10 23:52 UTC (permalink / raw)
  To: 'Taylor Blau'
  Cc: 'Junio C Hamano', 'Dragan Simic', git
In-Reply-To: <ZZ8ZlX6bf+hjmhN+@nand.local>

On Wednesday, January 10, 2024 5:26 PM, Taylor Blau wrote:
>On Wed, Jan 10, 2024 at 05:15:53PM -0500, rsbecker@nexbridge.com wrote:
>> Just a brief concern: Rust is not broadly portable. Adding another
>> dependency to git will remove many existing platforms from future releases.
>> Please consider this carefully before going down this path.
>
>I was hoping to hear from you as one of the few (only?) folks who participate on
>the list and represent HPE NonStop users.
>
>I'm curious which if any of the compiler frontends that I listed in my earlier email
>would work for you.

Unfortunately, none of the compiler frontends listed previously can be built for NonStop. These appear to all require gcc either directly or transitively, which cannot be ported to NonStop. I do not expect this to change any time soon - and is outside of my control anyway. An attempt was made to port Rust but it did not succeed primarily because of that dependency. Similarly, Golang is also not portable to NonStop because of architecture assumptions made by the Go team that cannot be satisfied on NonStop at this time. If some of the memory/pointer issues are the primary concern, c11 might be something acceptable with smart pointers. C17 will eventually be deployable, but is not available on most currently supported OS versions on the platform.


^ permalink raw reply

* Re: [DISCUSS] Introducing Rust into the Git project
From: brian m. carlson @ 2024-01-10 23:40 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Dragan Simic, Taylor Blau, git
In-Reply-To: <xmqqjzog96uh.fsf@gitster.g>

[-- Attachment #1: Type: text/plain, Size: 1679 bytes --]

On 2024-01-10 at 22:11:34, Junio C Hamano wrote:
> A few obvious ones that come to my mind are that you should be able
> to write a new merge strategy and link the resulting binary into Git
> without much hassle.  You might even want to make that a dynamically
> loaded object.  The interface into a merge strategy is fairly narrow
> IIRC.  Or possibly a new remote helper.
> 
> Adding a new refs backend may need to wait for the work Patrick is
> doing to add reftable support, but once the abstraction gets to the
> point to sufficiently hide the differences between files and reftables
> backends, I do not see a reason why you cannot add the third one.
> 
> And more into the future, we might want to have an object DB
> abstraction, similar to how we abstracted refs API over time, at
> which time you might be writing code that stores objects to and
> retrieves objects from persistent redis and whatnot in your favorite
> language.

This is definitely a thing people will want to do.  I think Microsoft
had some code for Azure DevOps that stored their code in the cloud and
the refs database in a real database.  I can imagine that being a
valuable set of features people would want to implement in a variety of
environments, with all of the benefits of basing on upstream Git.

I also feel that I would absolutely not want to write those things in C.
Rust is much more ergonomic when writing these things because freeing
resources (freeing memory, rolling back transactions, closing files,
etc.) becomes as easy as implementing the Drop trait and you write less
boilerplate.
-- 
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

^ permalink raw reply

* Re: Limited operations in unsafe repositories
From: brian m. carlson @ 2024-01-10 23:34 UTC (permalink / raw)
  To: Jeff King; +Cc: git
In-Reply-To: <20240110120531.GA25541@coredump.intra.peff.net>

[-- Attachment #1: Type: text/plain, Size: 906 bytes --]

On 2024-01-10 at 12:05:31, Jeff King wrote:
> My thinking is to flip that around: run all code, but put protection in
> the spots that do unsafe things, like loading config or examining
> hooks. I.e., a patch like this:

I think that's much what I had intended to do with not invoking binaries
at all, except that it was limited to rev-parse.  I wonder if perhaps we
could do something similar if we had the `--assume-unsafe` argument you
proposed, except that we would only allow the `git` binary and always
pass that argument to it in such a case.

I don't think reading config is intrinsically unsafe; it's more of what
we do with it, which is spawning external processes, that's the problem.
I suppose an argument could be made for injecting terminal sequences or
such, though.  Hooks, obviously, are definitely unsafe.
-- 
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox