Git development
 help / color / mirror / Atom feed
* Re: [PATCH 3/3] http: support CURLOPT_PROTOCOLS_STR
From: Junio C Hamano @ 2023-01-16 16:26 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Jeff King, git, Ramsay Jones
In-Reply-To: <xmqqzgaicvmj.fsf@gitster.g>

Junio C Hamano <gitster@pobox.com> writes:

>> +#if GIT_CURL_HAVE_CURLOPT_PROTOCOLS_STR
>> +#define GIT_CURLOPT_REDIR_PROTOCOLS_STR CURLOPT_REDIR_PROTOCOLS_STR 
>> +#define GIT_CURLOPT_PROTOCOLS_STR CURLOPT_PROTOCOLS_STR
>> +#else
>> +#define GIT_CURLOPT_REDIR_PROTOCOLS_STR 0
>> +#define GIT_CURLOPT_PROTOCOLS_STR 0
>>  #endif
>
> I find it a bit ugly that CURLOPT_* being used are all non-zero, but
> it may be true in practice.

Sorry for making the first half of the sentence unintelligible.  I
meant to say that it is ugly that the correctness of the solution
depends on the real value for these macros being non-zero.


^ permalink raw reply

* Re: [PATCH 3/3] http: support CURLOPT_PROTOCOLS_STR
From: Junio C Hamano @ 2023-01-16 16:05 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Jeff King, git, Ramsay Jones
In-Reply-To: <230116.86edruzk5m.gmgdl@evledraar.gmail.com>

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> +#define GIT_CURL_HAVE_CURLOPT_PROTOCOLS_STR (LIBCURL_VERSION_NUM >= 0x075500)
> +#if GIT_CURL_HAVE_CURLOPT_PROTOCOLS_STR
> +#define GIT_CURLOPT_REDIR_PROTOCOLS_STR CURLOPT_REDIR_PROTOCOLS_STR 
> +#define GIT_CURLOPT_PROTOCOLS_STR CURLOPT_PROTOCOLS_STR
> +#else
> +#define GIT_CURLOPT_REDIR_PROTOCOLS_STR 0
> +#define GIT_CURLOPT_PROTOCOLS_STR 0
>  #endif

I find it a bit ugly that CURLOPT_* being used are all non-zero, but
it may be true in practice.

> diff --git a/http.c b/http.c
> index e529ebc993d..3224e897f02 100644
> --- a/http.c
> +++ b/http.c
> @@ -933,24 +933,22 @@ static CURL *get_curl_handle(void)
>  	curl_easy_setopt(result, CURLOPT_MAXREDIRS, 20);
>  	curl_easy_setopt(result, CURLOPT_POSTREDIR, CURL_REDIR_POST_ALL);
>  
> -#ifdef GIT_CURL_HAVE_CURLOPT_PROTOCOLS_STR
> -	{
> +	if (GIT_CURL_HAVE_CURLOPT_PROTOCOLS_STR) {
>  		struct strbuf buf = STRBUF_INIT;
> ...
> +	} else {
> +		curl_easy_setopt(result, CURLOPT_REDIR_PROTOCOLS,
> +				 get_curl_allowed_protocols(0, NULL));
> +		curl_easy_setopt(result, CURLOPT_PROTOCOLS,
> +				 get_curl_allowed_protocols(-1, NULL));
>  	}
> -#else
> -	curl_easy_setopt(result, CURLOPT_REDIR_PROTOCOLS,
> -			 get_curl_allowed_protocols(0, NULL));
> -	curl_easy_setopt(result, CURLOPT_PROTOCOLS,
> -			 get_curl_allowed_protocols(-1, NULL));
> -#endif

I somehow find the above over-engineered solution looking for a
problem.  Conditional compilation might be ugly, but what is uglier
is a conditional compilation hidden behind what pretends to be a
runtime conditional but gets optimized away at compile time.  Also,
when the non _STR variant changes status from deprecated to removed,
the code will cease to build, so I am not sure if the above is the
whole story.  You'd also need dummy definitions for them when the
version of cURL is advanced enough.

It is true that with the above we will always pass both sides to the
compiler, which may be an upside, but at the same time doesn't it
make it harder to notice and remove the support of the older side
once the time comes?

Thanks.

^ permalink raw reply

* Re: [PATCH] ls-tree: add --sparse-filter-oid argument
From: William Sprent @ 2023-01-16 15:13 UTC (permalink / raw)
  To: Junio C Hamano, Eric Sunshine; +Cc: William Sprent via GitGitGadget, git
In-Reply-To: <xmqq8ri62ohd.fsf@gitster.g>

On 13/01/2023 21.01, Junio C Hamano wrote:
> Eric Sunshine <sunshine@sunshineco.com> writes:
> 
>>>      Note that one of the tests only pass when run on top of commit
>>>      5842710dc2 (dir: check for single file cone patterns, 2023-01-03), which
>>>      was submitted separately and is currently is merged to 'next'.
>>
>> Thanks for mentioning this. It's exactly the sort of information the
>> maintainer needs when applying your patch to his tree. And it can be
>> helpful for reviewers too.
> 
> Indeed it is.  Thanks for pointing it out---I just wish other
> contributors see and follow such a wise piece of advice buried in
> review of a patch by others.

Apropos where to apply the patch: This patch conflicts with the topic
'ls-tree.c: clean-up works'[1], which refactors parts of 'ls-tree.c'.
It has recently been merged to 'next' in [2].

Should I go ahead and rebase my changes on top of 'tl/ls-tree-code-clean-up'
and then resubmit a new patch, or does it make sure sense to wait?

1: https://public-inbox.org/git/20230112091135.20050-1-tenglong.tl@alibaba-inc.com
2: Commit f7238037fd (Merge branch 'tl/ls-tree-code-clean-up' into next, 2023-01-14)


^ permalink raw reply

* Re: Gitorious should use CRC128 / 256 / 512 instead of SHA-1
From: Junio C Hamano @ 2023-01-16 15:06 UTC (permalink / raw)
  To: Hans Petter Selasky; +Cc: rsbecker, 'brian m. carlson', git
In-Reply-To: <85788356-14b1-6afb-c78c-0ab889bbbb59@selasky.org>

Hans Petter Selasky <hps@selasky.org> writes:

> From what I've read the GPLv3 goes pretty far to also provide flashing
> rights for software, but what use is that, when flashing the unsigned
> software on your Samsung phone, for example, some fuse breaks in the
> hardware, and then you can no longer use certain apps on your phone?

It smells that you are conflating the signing of source material and
the sealing of tivoized hardware that use cryptographic signature to
tell what binaries are allowed to run on it.

The signing implemented by the software we the Git development
community build is not about the latter.  The source used to build
binaries for your tivoized hardware can come from a VCS that is
deliberately designed to allow object name collisions, and your
build would just be locked out the same unless you have the signing
key that pleases the hardware.  Use of Git there would not make the
story any different, I am afraid.



^ permalink raw reply

* Re: Gitorious should use CRC128 / 256 / 512 instead of SHA-1
From: Hans Petter Selasky @ 2023-01-16 14:10 UTC (permalink / raw)
  To: rsbecker, 'Michal Suchánek'; +Cc: git, wrights
In-Reply-To: <017801d929a6$6f8271b0$4e875510$@nexbridge.com>

On 1/16/23 13:31, rsbecker@nexbridge.com wrote:
> On January 16, 2023 4:56 AM, Hans Petter Selasky wrote:
>> On 1/16/23 10:13, Michal Suchánek wrote:
>>> when that data is copied to a new location a new CRC is calculated
>>> that can detect an error in that location.
>>
>> Yes, that is correct, but what is "copying data"? Are you saying that copying data is
>> always error free?
> 
> Not in all possible computing devices, no. But in certain high-reliability and mission critical systems, there are parity checks and communication mechanisms that verify the integrity of data transfers memory-to-memory, memory-to-register, and over inter-CPU bus, and memory-to-disk-storage checks. The result of a corruption on one of my systems would result in a CPU halt rather than blindly accepting the result, taking the faulty processor offline until the cause is investigated and then reloaded or repaired. This applies to any component, including disks, CLIMs, DMA, and anything else in the architecture.
> 

Hi,

I doesn't matter if the system is high-reliability or not. The problem 
is exactly the same.

If you have a CPU register which you add to another CPU register, then 
you need to recompute the parity information on the destination CPU 
register. That basically means you always trust the output of the CPU 
adder. There is simply no relationship between input parity and output 
parity in the linear adder case.

Whenever "parity" information is lost, it opens up the possiblity of 
irrecoverable errors.

That's why I say, that GIT would be better of in that regard with an 
end-to-end, CRC parity mechanism.

--HPS

^ permalink raw reply

* Re: Gitorious should use CRC128 / 256 / 512 instead of SHA-1
From: Hans Petter Selasky @ 2023-01-16 14:01 UTC (permalink / raw)
  To: rsbecker, 'brian m. carlson', git
In-Reply-To: <017901d929a6$ec180f50$c4482df0$@nexbridge.com>

On 1/16/23 13:34, rsbecker@nexbridge.com wrote:
> On January 16, 2023 2:24 AM, Hans Petter Selasky wrote:
>> On 1/15/23 00:59, brian m. carlson wrote:
>>> However, Git is moving in the direction of stronger cryptographic
>>> algorithms, rather than insecure hashing algorithms.  I don't think
>>> your proposal is a good idea, nor do I think it's likely to be adopted.
>>
>> I disagree. There is no need for signing in a version control system. It just makes it
>> harder to change things, like the right-to-repair. In my eyes there is a high chance
>> of abuse, by vendors that do no want others to flash or edit their device
>> firmwares.
> 

Hi,

> The two matters are completely isolated and distinct. In the OpenSource community, anyone typically has the right to modify. Please refer to the GPLv3, ECLIPSE, and MIT licenses for example. Those are the governing documents that permit modification and define intellectual property rights. Please consult those licenses with regards to right-to-repair statements that have no legal bearing on git or any other GPL-governed software product. In my view, the issue raised is a red herring that keeps getting brought up, which does not contribute positively to this request's discussion, but would presumably would increase the hit rate on web searches, to which this reply unfortunately contributes.

The use of cryptographic hash tags, allows one party to stay in control 
of and monetize a project, actually by doing nothing more than 
rebranding an existing product.

> The assertion of no need for signing can apply to a centralized version control system, like SVN, because users are authenticated centrally, and the contribution can be made definitive without a separate signature, providing no one with root authority on the server hacks the repository. In the architecture of a distributed version control system (specifically git for this discussion), there is no evidence of origin of changes because the commit identity is cooperative rather than being enforced by a central authority and hacking the repository by root is detectible. The assertion of signing as abuse of rights is also an opinion that, so far, has no supporting evidence given. Perhaps a paper in a refereed journal might give this position some credibility.

 From what I've read the GPLv3 goes pretty far to also provide flashing 
rights for software, but what use is that, when flashing the unsigned 
software on your Samsung phone, for example, some fuse breaks in the 
hardware, and then you can no longer use certain apps on your phone?

> 
> My point is that signing is critical in a DVCS and a major function point used by DevOps architects for adopting git in new organizations. In the regulated world, FinTech, FDA, Aviation, etc., signing contributes to the evidence of origin of changes required by PCI and SWIFT (ref: section 6 in each regulation). Without signed tags (which the establishes the change origins for releases for production use), deployment becomes less certain and less acceptable to the audit community with whom I interact on a regular basis.
> 

It's very clear to me, that supporting signing straight off the VCS, 
will not help the opensource and right-to-repair community at all. It's 
just ripe for abuse, like I say.

Hacking is prevented by using a secure copy mechanism between the 
servers, which you can upgrade separately. You already see the problem, 
SHA-1 is not good enough to prevent hacking. Why not just separate the 
hacking preventing measures and the needs of a good VCS?

--HPS

^ permalink raw reply

* Re: [PATCH v2 0/6] cache API: always have a "istate->repo"
From: Ævar Arnfjörð Bjarmason @ 2023-01-16 13:38 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Derrick Stolee, Victoria Dye, Jeff Hostetler
In-Reply-To: <xmqqtu0u2q9u.fsf@gitster.g>


On Fri, Jan 13 2023, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
>
>> The "struct index_state" contains a "repo" member, which should be a
>> pointer to the repository for that index, but which due to us
>> constructing such structs on an ad-hoc basis in various places wasn't
>> always available.
>
> I'd exclude 6/6 for now, as it seems to depend on some changes only
> in 'next'.  Feel free to resend only that step with reduced scope so
> that it applies to 'master', and send in incremental updates when
> each of these topics that are only in 'next' graduates.

Okey, the 6/6 requires ds/omit-trailing-hash-in-index. As both it and
the 1-5/6 of this are in "next" now I think it's best that I just submit
the 6/6 stand-alone after both of those have graduated.


^ permalink raw reply

* Re: [PATCH 3/3] http: support CURLOPT_PROTOCOLS_STR
From: Ævar Arnfjörð Bjarmason @ 2023-01-16 13:06 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git, Ramsay Jones
In-Reply-To: <Y8ReHbGWetJHQcI1@coredump.intra.peff.net>


On Sun, Jan 15 2023, Jeff King wrote:

> The CURLOPT_PROTOCOLS (and matching CURLOPT_REDIR_PROTOCOLS) flag was
> deprecated in curl 7.85.0, and using it generate compiler warnings as of
> curl 7.87.0. The path forward is to use CURLOPT_PROTOCOLS_STR, but we
> can't just do so unilaterally, as it was only introduced less than a
> year ago in 7.85.0.
>
> Until that version becomes ubiquitous, we have to either disable the
> deprecation warning or conditionally use the "STR" variant on newer
> versions of libcurl. This patch switches to the new variant, which is
> nice for two reasons:
>
>   - we don't have to worry that silencing curl's deprecation warnings
>     might cause us to miss other more useful ones
>
>   - we'd eventually want to move to the new variant anyway, so this gets
>     us set up (albeit with some extra ugly boilerplate for the
>     conditional)
>
> There are a lot of ways to split up the two cases. One way would be to
> abstract the storage type (strbuf versus a long), how to append
> (strbuf_addstr vs bitwise OR), how to initialize, which CURLOPT to use,
> and so on. But the resulting code looks pretty magical:
>
>   GIT_CURL_PROTOCOL_TYPE allowed = GIT_CURL_PROTOCOL_TYPE_INIT;
>   if (...http is allowed...)
> 	GIT_CURL_PROTOCOL_APPEND(&allowed, "http", CURLOPT_HTTP);
>
> and you end up with more "#define GIT_CURL_PROTOCOL_TYPE" macros than
> actual code.
>
> On the other end of the spectrum, we could just implement two separate
> functions, one that handles a string list and one that handles bits. But
> then we end up repeating our list of protocols (http, https, ftp, ftp).
>
> This patch takes the middle ground. The run-time code is always there to
> handle both types, and we just choose which one to feed to curl.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  git-curl-compat.h |  8 ++++++++
>  http.c            | 41 ++++++++++++++++++++++++++++++++++-------
>  2 files changed, 42 insertions(+), 7 deletions(-)
>
> diff --git a/git-curl-compat.h b/git-curl-compat.h
> index 56a83b6bbd..fd96b3cdff 100644
> --- a/git-curl-compat.h
> +++ b/git-curl-compat.h
> @@ -126,4 +126,12 @@
>  #define GIT_CURL_HAVE_CURLSSLSET_NO_BACKENDS
>  #endif
>  
> +/**
> + * CURLOPT_PROTOCOLS_STR and CURLOPT_REDIR_PROTOCOLS_STR were added in 7.85.0,
> + * released in August 2022.
> + */
> +#if LIBCURL_VERSION_NUM >= 0x075500
> +#define GIT_CURL_HAVE_CURLOPT_PROTOCOLS_STR 1
> +#endif
> +
>  #endif

Thanks, I hadn't had time to comment on this yet, but was wondering what
this had to do with "CI" or "DEVEDEVELOPER_CFLAGS", the "CI" just seems
to be "where we happened to spot this", and as has been noted this is
also an issue without DEVELOPER_CFLAGS, we just happen to have -Werror
there.

But this is the right direction, and the reason we have git-curl-compat.h.

> diff --git a/http.c b/http.c
> index ca0fe80ddb..e529ebc993 100644
> --- a/http.c
> +++ b/http.c
> @@ -764,18 +764,29 @@ void setup_curl_trace(CURL *handle)
>  	curl_easy_setopt(handle, CURLOPT_DEBUGDATA, NULL);
>  }
>  
> -static long get_curl_allowed_protocols(int from_user)
> +static void proto_list_append(struct strbuf *list_str, const char *proto_str,
> +			      long *list_bits, long proto_bits)
> +{
> +	*list_bits |= proto_bits;
> +	if (list_str) {
> +		if (list_str->len)
> +			strbuf_addch(list_str, ',');
> +		strbuf_addstr(list_str, proto_str);
> +	}
> +}

Nit: It would be nice (especially in this even smaller function) to
carry forward the name the parent get_curl_allowed_protocols() uses,
i.e. just "list", not "list_str", ditto "proto" rather than "proto_str".

> +#ifdef GIT_CURL_HAVE_CURLOPT_PROTOCOLS_STR
> +	{
> +		struct strbuf buf = STRBUF_INIT;
> +
> +		get_curl_allowed_protocols(0, &buf);
> +		curl_easy_setopt(result, CURLOPT_REDIR_PROTOCOLS_STR, buf.buf);
> +		strbuf_reset(&buf);
> +
> +		get_curl_allowed_protocols(-1, &buf);
> +		curl_easy_setopt(result, CURLOPT_PROTOCOLS_STR, buf.buf);
> +		strbuf_release(&buf);
> +	}
> +#else
>  	curl_easy_setopt(result, CURLOPT_REDIR_PROTOCOLS,
> -			 get_curl_allowed_protocols(0));
> +			 get_curl_allowed_protocols(0, NULL));
>  	curl_easy_setopt(result, CURLOPT_PROTOCOLS,
> -			 get_curl_allowed_protocols(-1));
> +			 get_curl_allowed_protocols(-1, NULL));
> +#endif
> +
>  	if (getenv("GIT_CURL_VERBOSE"))
>  		http_trace_curl_no_data();
>  	setup_curl_trace(result);

I wonder if it isn't better to avoid conditionally compiled code here if
we can avoid it, i.e. just define GIT_* dummy fallbacks, and only use
these if GIT_CURL_HAVE_CURLOPT_PROTOCOLS_STR is true. I.e. something
like the below squashed in.

diff --git a/git-curl-compat.h b/git-curl-compat.h
index fd96b3cdffd..3bc6e151ca7 100644
--- a/git-curl-compat.h
+++ b/git-curl-compat.h
@@ -130,8 +130,13 @@
  * CURLOPT_PROTOCOLS_STR and CURLOPT_REDIR_PROTOCOLS_STR were added in 7.85.0,
  * released in August 2022.
  */
-#if LIBCURL_VERSION_NUM >= 0x075500
-#define GIT_CURL_HAVE_CURLOPT_PROTOCOLS_STR 1
+#define GIT_CURL_HAVE_CURLOPT_PROTOCOLS_STR (LIBCURL_VERSION_NUM >= 0x075500)
+#if GIT_CURL_HAVE_CURLOPT_PROTOCOLS_STR
+#define GIT_CURLOPT_REDIR_PROTOCOLS_STR CURLOPT_REDIR_PROTOCOLS_STR 
+#define GIT_CURLOPT_PROTOCOLS_STR CURLOPT_PROTOCOLS_STR
+#else
+#define GIT_CURLOPT_REDIR_PROTOCOLS_STR 0
+#define GIT_CURLOPT_PROTOCOLS_STR 0
 #endif
 
 #endif
diff --git a/http.c b/http.c
index e529ebc993d..3224e897f02 100644
--- a/http.c
+++ b/http.c
@@ -933,24 +933,22 @@ static CURL *get_curl_handle(void)
 	curl_easy_setopt(result, CURLOPT_MAXREDIRS, 20);
 	curl_easy_setopt(result, CURLOPT_POSTREDIR, CURL_REDIR_POST_ALL);
 
-#ifdef GIT_CURL_HAVE_CURLOPT_PROTOCOLS_STR
-	{
+	if (GIT_CURL_HAVE_CURLOPT_PROTOCOLS_STR) {
 		struct strbuf buf = STRBUF_INIT;
 
 		get_curl_allowed_protocols(0, &buf);
-		curl_easy_setopt(result, CURLOPT_REDIR_PROTOCOLS_STR, buf.buf);
+		curl_easy_setopt(result, GIT_CURLOPT_REDIR_PROTOCOLS_STR, buf.buf);
 		strbuf_reset(&buf);
 
 		get_curl_allowed_protocols(-1, &buf);
-		curl_easy_setopt(result, CURLOPT_PROTOCOLS_STR, buf.buf);
+		curl_easy_setopt(result, GIT_CURLOPT_PROTOCOLS_STR, buf.buf);
 		strbuf_release(&buf);
+	} else {
+		curl_easy_setopt(result, CURLOPT_REDIR_PROTOCOLS,
+				 get_curl_allowed_protocols(0, NULL));
+		curl_easy_setopt(result, CURLOPT_PROTOCOLS,
+				 get_curl_allowed_protocols(-1, NULL));
 	}
-#else
-	curl_easy_setopt(result, CURLOPT_REDIR_PROTOCOLS,
-			 get_curl_allowed_protocols(0, NULL));
-	curl_easy_setopt(result, CURLOPT_PROTOCOLS,
-			 get_curl_allowed_protocols(-1, NULL));
-#endif
 
 	if (getenv("GIT_CURL_VERBOSE"))
 		http_trace_curl_no_data();

^ permalink raw reply related

* RE: Gitorious should use CRC128 / 256 / 512 instead of SHA-1
From: rsbecker @ 2023-01-16 12:34 UTC (permalink / raw)
  To: 'Hans Petter Selasky', 'brian m. carlson', git
In-Reply-To: <3b0af57c-a144-b0e4-d353-6028b3939291@selasky.org>

On January 16, 2023 2:24 AM, Hans Petter Selasky wrote:
>On 1/15/23 00:59, brian m. carlson wrote:
>> However, Git is moving in the direction of stronger cryptographic
>> algorithms, rather than insecure hashing algorithms.  I don't think
>> your proposal is a good idea, nor do I think it's likely to be adopted.
>
>I disagree. There is no need for signing in a version control system. It just makes it
>harder to change things, like the right-to-repair. In my eyes there is a high chance
>of abuse, by vendors that do no want others to flash or edit their device
>firmwares.

The two matters are completely isolated and distinct. In the OpenSource community, anyone typically has the right to modify. Please refer to the GPLv3, ECLIPSE, and MIT licenses for example. Those are the governing documents that permit modification and define intellectual property rights. Please consult those licenses with regards to right-to-repair statements that have no legal bearing on git or any other GPL-governed software product. In my view, the issue raised is a red herring that keeps getting brought up, which does not contribute positively to this request's discussion, but would presumably would increase the hit rate on web searches, to which this reply unfortunately contributes.

The assertion of no need for signing can apply to a centralized version control system, like SVN, because users are authenticated centrally, and the contribution can be made definitive without a separate signature, providing no one with root authority on the server hacks the repository. In the architecture of a distributed version control system (specifically git for this discussion), there is no evidence of origin of changes because the commit identity is cooperative rather than being enforced by a central authority and hacking the repository by root is detectible. The assertion of signing as abuse of rights is also an opinion that, so far, has no supporting evidence given. Perhaps a paper in a refereed journal might give this position some credibility.

My point is that signing is critical in a DVCS and a major function point used by DevOps architects for adopting git in new organizations. In the regulated world, FinTech, FDA, Aviation, etc., signing contributes to the evidence of origin of changes required by PCI and SWIFT (ref: section 6 in each regulation). Without signed tags (which the establishes the change origins for releases for production use), deployment becomes less certain and less acceptable to the audit community with whom I interact on a regular basis.

--Randall



^ permalink raw reply

* RE: Gitorious should use CRC128 / 256 / 512 instead of SHA-1
From: rsbecker @ 2023-01-16 12:31 UTC (permalink / raw)
  To: 'Hans Petter Selasky', 'Michal Suchánek'
  Cc: git, wrights
In-Reply-To: <6a398405-e5f8-0b78-e463-41d79e49e78b@selasky.org>

On January 16, 2023 4:56 AM, Hans Petter Selasky wrote:
>On 1/16/23 10:13, Michal Suchánek wrote:
>> when that data is copied to a new location a new CRC is calculated
>> that can detect an error in that location.
>
>Yes, that is correct, but what is "copying data"? Are you saying that copying data is
>always error free?

Not in all possible computing devices, no. But in certain high-reliability and mission critical systems, there are parity checks and communication mechanisms that verify the integrity of data transfers memory-to-memory, memory-to-register, and over inter-CPU bus, and memory-to-disk-storage checks. The result of a corruption on one of my systems would result in a CPU halt rather than blindly accepting the result, taking the faulty processor offline until the cause is investigated and then reloaded or repaired. This applies to any component, including disks, CLIMs, DMA, and anything else in the architecture.


^ permalink raw reply

* Re: [PATCH] ls-tree: add --sparse-filter-oid argument
From: William Sprent @ 2023-01-16 12:14 UTC (permalink / raw)
  To: Eric Sunshine, William Sprent via GitGitGadget; +Cc: git
In-Reply-To: <CAPig+cRgZ0CrkqY7mufuWmhf6BC8yXjXXuOTEQjuz+Y0NA+N7Q@mail.gmail.com>

On 13/01/2023 15.17, Eric Sunshine wrote:
> On Wed, Jan 11, 2023 at 12:05 PM William Sprent via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>> [...]
>> To fill this gap, add a new argument to ls-tree '--sparse-filter-oid'
>> which takes the object id of a blob containing sparse checkout patterns
>> that filters the output of 'ls-tree'. When filtering with given sparsity
>> patterns, 'ls-tree' only outputs blobs and commit objects that
>> match the given patterns.
>> [...]
>> Signed-off-by: William Sprent <williams@unity3d.com>
> 
> This is not a proper review, but rather just some comments about
> issues I noticed while quickly running my eye over the patch. Many are
> just micro-nits about style; a few regarding the tests are probably
> actionable.
> 
>>      Note that one of the tests only pass when run on top of commit
>>      5842710dc2 (dir: check for single file cone patterns, 2023-01-03), which
>>      was submitted separately and is currently is merged to 'next'.
> 
> Thanks for mentioning this. It's exactly the sort of information the
> maintainer needs when applying your patch to his tree. And it can be
> helpful for reviewers too.
> 
>> diff --git a/Documentation/git-ls-tree.txt b/Documentation/git-ls-tree.txt
>> @@ -48,6 +48,11 @@ OPTIONS
>> +--sparse-filter-oid=<blob-ish>::
>> +       Omit showing tree objects and paths that do not match the
>> +       sparse-checkout specification contained in the blob
>> +       (or blob-expression) <blob-ish>.
> 
> Good to see a documentation update. The SYNOPSIS probably deserves an
> update too.
> 

Nice catch. I'll update it.

>> diff --git a/builtin/ls-tree.c b/builtin/ls-tree.c
>> @@ -329,12 +331,79 @@ static struct ls_tree_cmdmode_to_fmt ls_tree_cmdmode_format[] = {
>> +static void sparse_filter_data__init(
>> +       struct sparse_filter_data **d,
>> +       const char *sparse_oid_name, read_tree_fn_t fn)
>> +{
>> +       struct object_id sparse_oid;
>> +       struct object_context oc;
>> +
>> +       (*d) = xcalloc(1, sizeof(**d));
>> +       (*d)->fn = fn;
>> +       (*d)->pl.use_cone_patterns = core_sparse_checkout_cone;
>> +
>> +       strbuf_init(&(*d)->full_path_buf, 0);
>> +
>> +
> 
> Nit: too many blank lines.
> 
>> +       if (get_oid_with_context(the_repository,
>> +                                sparse_oid_name,
>> +                                GET_OID_BLOB, &sparse_oid, &oc))
> 
> Pure nit: somewhat odd choice of wrapping; I'd have expected something
> along the lines of:
> 
>      if (get_oid_with_context(the_repository, sparse_oid_name, GET_OID_BLOB,
>                      &sparse_oid, &oc))
> 
> or:
> 
>      if (get_oid_with_context(the_repository, sparse_oid_name,
>                      GET_OID_BLOB, &sparse_oid, &oc))
> 
>> +static void sparse_filter_data__free(struct sparse_filter_data *d)
>> +{
>> +       clear_pattern_list(&d->pl);
>> +       strbuf_release(&d->full_path_buf);
>> +       free(d);
>> +}
> 
> Is the double-underscore convention in function names imported from
> elsewhere? I don't recall seeing it used in this codebase.
> 

I took inspiration from the similar '--filter:sparse' argument for rev-list
which is implemented in 'list-objects-filter.c'. I've just given it a
search  and it looks like it that convention isn't really used outside the
list-objects files, which makes it a bit awkward here.

I get a bunch more hits for '(init|free)_', so I will change them to that.


>> +static int path_matches_sparse_checkout_patterns(struct strbuf *full_path, struct pattern_list *pl, int dtype)
>> +{
>> +       enum pattern_match_result match = recursively_match_path_with_sparse_patterns(full_path->buf, the_repository->index, dtype, pl);
>> +       return match > 0;
>> +}
>> +
>> +
> 
> Nit: too many blank lines
> 
>> +static int filter_sparse(const struct object_id *oid, struct strbuf *base,
>> +                        const char *pathname, unsigned mode, void *context)
>> +{
>> +
>> +       struct sparse_filter_data *data = context;
> 
> Nit: unnecessary blank line after "{"
> 
>> +       int do_recurse = show_recursive(base->buf, base->len, pathname);
>> +       if (object_type(mode) == OBJ_TREE)
>> +               return do_recurse;
>> +
>> +       strbuf_reset(&data->full_path_buf);
>> +       strbuf_addbuf(&data->full_path_buf, base);
>> +       strbuf_addstr(&data->full_path_buf, pathname);
>> +
>> +       if (!path_matches_sparse_checkout_patterns(&data->full_path_buf, &data->pl, DT_REG))
>> +               return 0;
>> +
>> +       return data->fn(oid, base, pathname, mode, context);	
>> +}
>> +
>> +
> 
> Nit: too many blank lines
> 
>> diff --git a/dir.c b/dir.c
>> @@ -1450,45 +1450,51 @@ int init_sparse_checkout_patterns(struct index_state *istate)
>> +static int path_in_sparse_checkout_1(const char *path,
>> +                                    struct index_state *istate,
>> +                                    int require_cone_mode)
>> +{
>> +
>> +       /*
> 
> Nit: unnecessary blank line after "{"
> 
>> diff --git a/t/t3106-ls-tree-sparse-checkout.sh b/t/t3106-ls-tree-sparse-checkout.sh
>> new file mode 100755
> 
> We often try to avoid introducing a new test script if the tests being
> added can fit well into an existing script. If you didn't find any
> existing script where these tests would fit, then creating a new
> script may be fine.
> 

Well there's a couple of things. The only ls-tree related test file
that isn't too specific is 't3103-ls-tree-misc.sh'. However, the
'sparse-checkout' command leaks memory, so that would require setting
'TEST_PASSES_SANITIZE_LEAK' to false.

For tests related to sparse-checkout, we both have the large
't1092-sparse-checkout-compatibility.sh' file. But there is also a
collection of files like 't3705-add-sparse-checkout.sh'. I guess
both could work fine in this case. I think I went with the latter
because I personally find the smaller test files a bit easier to
parse.

If someone feels strongly the other way, I'll gladly change it.

>> @@ -0,0 +1,103 @@
>> +check_agrees_with_ls_files () {
>> +       REPO=repo
>> +       git -C $REPO submodule deinit -f --all
>> +       git -C $REPO cat-file -p ${filter_oid} >${REPO}/.git/info/sparse-checkout
>> +       git -C $REPO sparse-checkout init --cone 2>err
>> +       git -C $REPO submodule init
>> +       git -C $REPO ls-files -t| grep -v "^S "|cut -d" " -f2 >ls-files
>> +       test_cmp ls-files actual
>> +}
> 
> Several comments:
> 
> Since the return code of this function is significant and callers care
> about it, you should &&-chain all of the code in the function itself
> (just like you do within a test) so that failure of any command in the
> function is noticed and causes the calling test to fail. It's a good
> idea to &&-chain the variable assignments at the top of the function
> too (just in case someone later inserts code above the assignments).
> 

That was an oversight. Thanks for noticing.

> It's odd to see a mixture of $VAR and ${VAR}. It's better to be
> consistent. We typically use the $VAR form (though it's not exclusive
> by any means).
> 

I'll remove the braces.

> Some shells complain when the pathname following ">" redirection
> operator is not quoted, so:
> 
>      git -C $REPO cat-file -p ${filter_oid} >"$REPO/.git/info/sparse-checkout" &&
> 
> Style: add space around "|" pipe operator
> 
> We usually avoid having a Git command upstream of the pipe since its
> exit code gets swallowed by the pipe, so we usually do this instead:
> 
>      git -C $REPO ls-files -t >out &&
>      grep -v "^S " out | cut -d" " -f2 >ls-files &&
> 

Good point. I'll do that.

> Minor: The two-command pipeline `grep -v "^S " | cut -d" " -f2
>> ls-files` could be expressed via a single `sed` invocation:
> 
>      sed -n "/^S /!s/^. //p" out &&
> 
> Nit: The first argument to test_cmp() is typically named "expect".
> 
> Error output is captured to file "err" but that file is never consulted.
> 

I'll remove the redirection.

>> +check_same_result_in_bare_repo () {
>> +       FULL=repo
>> +       BARE=bare
>> +       FILTER=$1
>> +       git -C repo cat-file -p ${filter_oid}| git -C bare hash-object -w --stdin
>> +       git -C bare ls-tree --name-only --filter-sparse-oid=${filter_oid} -r HEAD >bare-result
>> +       test_cmp expect bare-result
>> +}
> 
> Same comments as above, plus:
> 
> What is the purpose of the variables FULL, BARE, FILTER? They don't
> seem to be used in the function.
> 
> I suspect that the function should be using $FILTER internally rather
> than $filter_oid.
> 

I think I probably had plans with them and then forgot. Thanks for noticing.
I'll keep (and use) FILTER, but remove the others.

>> +test_expect_success 'setup' '
>> +       git init submodule &&
>> +       (
>> +               cd submodule &&
>> +               test_commit file
>> +       ) &&
> 
> Minor: test_commit() accepts a -C option, so this could be done
> without the subshell:
> 
>      test_commit -C submodule file
> 
>> +       git init repo &&
>> +       (
>> +               cd repo &&
>> +               mkdir dir &&
>> +               test_commit dir/sub-file &&
>> +               test_commit dir/sub-file2 &&
>> +               mkdir dir2 &&
>> +               test_commit dir2/sub-file1 &&
>> +               test_commit dir2/sub-file2 &&
>> +               test_commit top-file &&
>> +               git clone ../submodule submodule &&
>> +               git submodule add ./submodule &&
>> +               git submodule absorbgitdirs &&
>> +               git commit -m"add submodule" &&
>> +               git sparse-checkout init --cone
>> +       ) &&
> 
> Here the subshell makes sense since so many commands are run in
> directory "repo." Fine.

Thanks for taking the time to give some comments. I think they all
make sense to me. Besides the comments I've explicitly acknowledged above,
I've also applied all of the minor/nit/style comments to my local tree.



^ permalink raw reply

* Re: [PATCH] worktree add: introduce basic DWYM for --orphan
From: Phillip Wood @ 2023-01-16 10:52 UTC (permalink / raw)
  To: Jacob Abel, git
  Cc: Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Junio C Hamano, Phillip Wood, Rubén Justo, Taylor Blau,
	rsbecker
In-Reply-To: <20230114224956.24801-1-jacobabel@nullpo.dev>

Hi Jacob

On 14/01/2023 22:50, Jacob Abel wrote:
> Introduces a DWYM shorthand of --orphan for when the worktree directory
> and the to-be-created branch share the same name.
> 
> Current Behavior:
>      % git worktree list
>      /path/to/git/repo        a38d39a4c5 [main]
>      % git worktree add --orphan new_branch ../new_branch/
>      Preparing worktree (new branch 'new_branch')
>      % git worktree add --orphan ../new_branch2/
>      usage: git worktree add [<options>] <path> [<commit-ish>]
>         or: git worktree list [<options>]
>      [...]
>      %
> 
> New Behavior:
> 
>      % git worktree list
>      /path/to/git/repo        a38d39a4c5 [main]
>      % git worktree add --orphan new_branch ../new_branch/
>      Preparing worktree (new branch 'new_branch')
>      % git worktree list
>      /path/to/git/repo        a38d39a4c5 [main]
>      /path/to/git/new_branch  a38d39a4c5 [new_branch]
>      % git worktree add --orphan ../new_branch2/
>      Preparing worktree (new branch 'new_branch2')
>      % git worktree list
>      /path/to/git/repo        a38d39a4c5 [main]
>      /path/to/git/new_branch  a38d39a4c5 [new_branch]
>      /path/to/git/new_branch2 a38d39a4c5 [new_branch2]
>      %

Thanks for working on this. As I said in my previous mail I think it 
would be easier to use OPT_BOOL() for --orphan from the start. By using 
OPT_STRING() you'll run into problems with "git worktree add --orphan 
--lock <directory>"

Best Wishes

Phillip

> Signed-off-by: Jacob Abel <jacobabel@nullpo.dev>
> ---
>   Documentation/git-worktree.txt | 13 +++++++++----
>   builtin/worktree.c             | 21 +++++++++++++++------
>   t/t2400-worktree-add.sh        | 10 ++++++++++
>   3 files changed, 34 insertions(+), 10 deletions(-)
> 
> diff --git a/Documentation/git-worktree.txt b/Documentation/git-worktree.txt
> index d78460c29c..a56ddb0185 100644
> --- a/Documentation/git-worktree.txt
> +++ b/Documentation/git-worktree.txt
> @@ -12,7 +12,7 @@ SYNOPSIS
>   'git worktree add' [-f] [--detach] [--checkout] [--lock [--reason <string>]]
>   		   [(-b | -B) <new-branch>] <path> [<commit-ish>]
>   'git worktree add' [-f] [--lock [--reason <string>]]
> -		   --orphan <new-branch> <path>
> +		   --orphan [<new-branch>] <path>
>   'git worktree list' [-v | --porcelain [-z]]
>   'git worktree lock' [--reason <string>] <worktree>
>   'git worktree move' <worktree> <new-path>
> @@ -99,13 +99,16 @@ in the new worktree, if it's not checked out anywhere else, otherwise the
>   command will refuse to create the worktree (unless `--force` is used).
>   +
>   ------------
> -$ git worktree add --orphan <branch> <path>
> +$ git worktree add --orphan [<branch>] <path>
>   ------------
>   +
>   Create a worktree containing no files, with an empty index, and associated
>   with a new orphan branch named `<branch>`. The first commit made on this new
>   branch will have no parents and will be the root of a new history disconnected
>   from any other branches.
> ++
> +If a branch name `<branch>` is not supplied, the name is derived from the
> +supplied path `<path>`.
> 
>   list::
> 
> @@ -233,9 +236,11 @@ This can also be set up as the default behaviour by using the
>   	With `prune`, do not remove anything; just report what it would
>   	remove.
> 
> ---orphan <new-branch>::
> +--orphan [<new-branch>]::
>   	With `add`, make the new worktree and index empty, associating
> -	the worktree with a new orphan branch named `<new-branch>`.
> +	the worktree with a new orphan branch named `<new-branch>`. If
> +	`<new-branch>` is not supplied, the new branch name is derived
> +	from `<path>`.
> 
>   --porcelain::
>   	With `list`, output in an easy-to-parse format for scripts.
> diff --git a/builtin/worktree.c b/builtin/worktree.c
> index d975628353..481f895075 100644
> --- a/builtin/worktree.c
> +++ b/builtin/worktree.c
> @@ -19,7 +19,7 @@
>   	N_("git worktree add [-f] [--detach] [--checkout] [--lock [--reason <string>]]\n" \
>   	   "                 [(-b | -B) <new-branch>] <path> [<commit-ish>]"), \
>   	N_("git worktree add [-f] [--lock [--reason <string>]]\n" \
> -	   "                 --orphan <new-branch> <path>")
> +	   "                 --orphan [<new-branch>] <path>")
> 
>   #define BUILTIN_WORKTREE_LIST_USAGE \
>   	N_("git worktree list [-v | --porcelain [-z]]")
> @@ -681,10 +681,13 @@ static int add(int ac, const char **av, const char *prefix)
>   	else if (keep_locked)
>   		opts.keep_locked = _("added with --lock");
> 
> -	if (ac < 1 || ac > 2)
> +	if (ac < 1 && opts.orphan) {
> +		path = prefix_filename(prefix, orphan_branch);
> +	} else if (ac >= 1 && ac <= 2) {
> +		path = prefix_filename(prefix, av[0]);
> +	} else {
>   		usage_with_options(git_worktree_add_usage, options);
> -
> -	path = prefix_filename(prefix, av[0]);
> +	}
>   	branch = ac < 2 ? "HEAD" : av[1];
> 
>   	if (!strcmp(branch, "-"))
> @@ -702,14 +705,20 @@ static int add(int ac, const char **av, const char *prefix)
>   		strbuf_release(&symref);
>   	}
> 
> -	if (opts.orphan) {
> -		new_branch = orphan_branch;
> +	if (ac < 1 && opts.orphan) {
> +		const char *s = dwim_branch(path, &orphan_branch);
> +		if (s)
> +			orphan_branch = s;
>   	} else if (ac < 2 && !new_branch && !opts.detach) {
>   		const char *s = dwim_branch(path, &new_branch);
>   		if (s)
>   			branch = s;
>   	}
> 
> +	if (opts.orphan) {
> +		new_branch = orphan_branch;
> +	}
> +
>   	if (ac == 2 && !new_branch && !opts.detach) {
>   		struct object_id oid;
>   		struct commit *commit;
> diff --git a/t/t2400-worktree-add.sh b/t/t2400-worktree-add.sh
> index 1bf8d619e2..c3de277738 100755
> --- a/t/t2400-worktree-add.sh
> +++ b/t/t2400-worktree-add.sh
> @@ -354,6 +354,16 @@ test_expect_success '"add --orphan" fails if the branch already exists' '
>   	test_path_is_missing orphandir2
>   '
> 
> +test_expect_success '"add --orphan" with basic DWYM' '
> +	test_when_finished "rm -rf empty_repo" &&
> +	echo refs/heads/worktreedir >expected &&
> +	GIT_DIR="empty_repo" git init --bare &&
> +	# Use non-trivial path to verify it DWYMs properly.
> +	git -C empty_repo worktree add --orphan ../empty_repo/worktreedir &&
> +	git -C empty_repo/worktreedir symbolic-ref HEAD >actual &&
> +	test_cmp expected actual
> +'
> +
>   test_expect_success '"add --orphan" with empty repository' '
>   	test_when_finished "rm -rf empty_repo" &&
>   	echo refs/heads/newbranch >expected &&
> --
> 2.38.2
> 
> 

^ permalink raw reply

* Re: [PATCH v8 3/4] worktree add: add --orphan flag
From: Phillip Wood @ 2023-01-16 10:47 UTC (permalink / raw)
  To: Jacob Abel
  Cc: git, Ævar Arnfjörð Bjarmason, Eric Sunshine,
	Junio C Hamano, Phillip Wood, Rubén Justo, Taylor Blau,
	rsbecker
In-Reply-To: <20230114224715.ewec6sz5h3q3iijs@phi>

Hi Jacob

On 14/01/2023 22:47, Jacob Abel wrote:
> On 23/01/13 10:20AM, Phillip Wood wrote:
>> Hi Jacob
>>
>> On 09/01/2023 17:33, Jacob Abel wrote:
>>> [...]
>>
>> It's perhaps a bit late to bring this up but I've only just realized
>> that it is unfortunate that --orphan takes a branch name rather than
>> working in conjunction with -b/-B. It means that in the common case
>> where the branch name is the same as the worktree the user has to repeat
>> it on the command line as shown above. It also means there is no way to
>> force an orphan branch (that's admittedly quite niche). If instead
>> --orphan did not take an argument we could have
>>
>> 	git worktree add --orphan main
>> 	git worktree add --orphan -b topic main
>> 	git worktree add --orphan -B topic main
>>
>> Best Wishes
>>
>> Phillip
>>
>>> [...]
> 
> I think this is a good idea and something similar was brought up previously
> however I originally wanted to handle this and a common --orphan DWYM in a later
> patch.

I think adding it in a later patch makes the implementation more 
complicated than it needs to be as you'll still have to support --orphan 
taking a name for backwards compatibility that means you need to handle

	git worktree add --orphan=main main
	git worktree add --orphan topic main
	git worktree add --orphan --lock main
	git worktree add --orphan -b topic main
	git worktree add --orphan -B topic main

Rather than just the last three. Now you can probably get that to work 
by changing --orphan to take an optional argument but I think it would 
be simpler to have --orphan as a flag from the start.

>> 	git worktree add --orphan main
> 
> I am OK implementing this option and have been workshopping it prior to
> responding. I think I have it worked out now as an additional patch which can be
> be applied on top of the v8 patchset.
> 
> I'll reply to this message with the one-off patch to get feedback. Since this is
> essentially a discrete change on top of v8, I can either keep it as a separate
> patch or reroll depending on how much needs to be changed (and what would be
> easier for everyone).
> 
>> 	git worktree add --orphan -b topic main
>> 	git worktree add --orphan -B topic main
> 
> I am hesitant to add these as they break away from the syntax used in
> `git switch` and `git checkout`.

When I wrote my original email I wrongly though that --orphan did not 
take an argument for "git checkout". While I think it is a mistake for 
checkout and switch to have --orphan take an argument they do at least 
always need a branch name with that option. "git worktree" add already 
has the branch name in the form of the worktree directory in the common 
case.

> Also apologies for the tangent but while researching this path, I noticed that
> --orphan behaves unexpectedly on both `git switch` and `git checkout` when mixed
> with `-c` and `-b` respectively.
> 
>      % git switch --orphan -c foobar
>      fatal: invalid reference: foobar
> 
>      % git switch -c --orphan foobar
>      fatal: invalid reference: foobar
> >      % git checkout -b --orphan foobar
>      fatal: 'foobar' is not a commit and a branch '--orphan' cannot be created from it
>
>      % git checkout --orphan -b foobar
>      fatal: 'foobar' is not a commit and a branch '-b' cannot be created from it

The messages for checkout look better than the switch ones to me as they 
show the branch name which makes it clearer that we're treating what 
looks like an option as an argument. What in particular is unexpected 
here - --orphan and -b take an argument so they'll hoover up the next 
thing on the commandline whatever it is.

Best Wishes

Phillip

> I tried this on my system install as well as from a fresh fetch of next FWIW.
> 
> [Info: fresh build from next]
> git version 2.39.0.287.g8cbeef4abd
> cpu: x86_64
> built from commit: 8cbeef4abda4907dd68ea144d9dcb85f0b49c3e6
> sizeof-long: 8
> sizeof-size_t: 8
> shell-path: /bin/sh
> 
> [Info: system install]
> git version 2.38.2
> cpu: x86_64
> no commit associated with this build
> sizeof-long: 8
> sizeof-size_t: 8
> shell-path: /bin/sh
> 
> If this bug is something that needs to be addressed, I can dig a bit deeper and
> put together a patch for it in the next few days.
> 
> VR,
> Abel
> 

^ permalink raw reply

* Re: Gitorious should use CRC128 / 256 / 512 instead of SHA-1
From: Hans Petter Selasky @ 2023-01-16  9:55 UTC (permalink / raw)
  To: Michal Suchánek; +Cc: git
In-Reply-To: <20230116091346.GC16547@kitsune.suse.cz>

On 1/16/23 10:13, Michal Suchánek wrote:
> when that data is copied to a new location a new
> CRC is calculated that can detect an error in that location.

Yes, that is correct, but what is "copying data"? Are you saying that 
copying data is always error free?

--HPS

^ permalink raw reply

* Re: Gitorious should use CRC128 / 256 / 512 instead of SHA-1
From: Michal Suchánek @ 2023-01-16  9:13 UTC (permalink / raw)
  To: Hans Petter Selasky; +Cc: git
In-Reply-To: <b1984123-569a-c290-8048-158c1c5e08b4@selasky.org>

On Mon, Jan 16, 2023 at 08:17:58AM +0100, Hans Petter Selasky wrote:
> On 1/15/23 14:53, Michal Suchánek wrote:
> > > Many people think that bit errors cannot happen because the memory uses ECC
> > > and the file system uses cryptographic hashes to verify the integrity of the
> > > data. But what many people forget about is that when copying data from
> > > memory to disk, typically using a DMA channel data is copied w/o any kind of
> > > integrity protection, because the integrity protection is not end-to-end.
> > > The integrity protection is only per-link.
> >
> > So long as all links have integrity protection it's end-to-end.
> >
> 
> Hi Michael,
> 
> You clearly don't see what this is about! Only if the same CRC mechanism is
> end-to-end, you don't have any good integrity mechanism at all!
> 
> Let me try to explain what this is about in very simple words. Because
> memcpy() does not copy the ECC CRC values along with the data, it is an
> unsafe memory copy mechanism, which may introduce bit-errors without
> noticing. It does not help to only have ECC RAM or for that sake protect the
> PCI links.
The ECC protects against 1bit errors - so long as only 1 bit is flipped
along that path it is corrected.

If you have bigger errors ECC can sometimes detect them and your system
crashes or whatever, and sometimes they go unnoticed.

It does not make sense to copy around that CRC. It is used to recover
the corrupted bit, and when that data is copied to a new location a new
CRC is calculated that can detect an error in that location. Copying
that checksum around would only accumulate the errors.

Of course, that assumes that the corruption happens only in the cheaper
external long-term storage, and data does not get corrupted as it goes
through your CPU where it is stored only a few CPU cycles at a time. It
is mostly the case but when you need extreme reliability system-level
schemes that mitigate this possibility do exist.

Thanks

Michal

^ permalink raw reply

* Re: Gitorious should use CRC128 / 256 / 512 instead of SHA-1
From: Hans Petter Selasky @ 2023-01-16  7:23 UTC (permalink / raw)
  To: brian m. carlson, git
In-Reply-To: <Y8NB21PExmifhyeQ@tapette.crustytoothpaste.net>

On 1/15/23 00:59, brian m. carlson wrote:
> However, Git is moving in the direction of stronger cryptographic
> algorithms, rather than insecure hashing algorithms.  I don't think your
> proposal is a good idea, nor do I think it's likely to be adopted.

I disagree. There is no need for signing in a version control system. It 
just makes it harder to change things, like the right-to-repair. In my 
eyes there is a high chance of abuse, by vendors that do no want others 
to flash or edit their device firmwares.

--HPS

^ permalink raw reply

* Re: Gitorious should use CRC128 / 256 / 512 instead of SHA-1
From: Hans Petter Selasky @ 2023-01-16  7:21 UTC (permalink / raw)
  To: brian m. carlson, git
In-Reply-To: <Y8NB21PExmifhyeQ@tapette.crustytoothpaste.net>

On 1/15/23 00:59, brian m. carlson wrote:
>> 3) Illicit contents may be present in binary blobs, which in the future may
>> be need to be removed without warrant and the only way to do that is by
>> rebasing and force pushing, which will break "everything". It can be
>> everything from child-porn to expired distribution licenses.
> This is a problem in every Merkle tree-like system.  Most repositories
> have some sort of code review or access control that prevents people
> from generally pushing inappropriate content.  For example, if somebody
> proposed to push any sort of pornography or other inappropriate content
> (e.g., a racist screed) to one of my repositories or one of my
> employer's, I'd refuse to approve or merge such a change, because
> that wouldn't be appropriate for the repository.
> 
> I don't feel this is enough of a problem that using a Merkle tree-like
> construction is a bad idea, given the benefits it offers.
> 

Yeah, right. And of course you have all the tools to decode those 
megabyte big firmware blobs from intel supporting wireless cards all 
over the place to see what is actually inside there, that they are not 
using some 3rd party code which licence will expire at some point, and 
then you need to remove those binaries.

--HPS

^ permalink raw reply

* Re: Gitorious should use CRC128 / 256 / 512 instead of SHA-1
From: Hans Petter Selasky @ 2023-01-16  7:17 UTC (permalink / raw)
  To: Michal Suchánek; +Cc: git
In-Reply-To: <20230115135245.GB16547@kitsune.suse.cz>

On 1/15/23 14:53, Michal Suchánek wrote:
>> Many people think that bit errors cannot happen because the memory uses ECC
>> and the file system uses cryptographic hashes to verify the integrity of the
>> data. But what many people forget about is that when copying data from
>> memory to disk, typically using a DMA channel data is copied w/o any kind of
>> integrity protection, because the integrity protection is not end-to-end.
>> The integrity protection is only per-link.
 >
> So long as all links have integrity protection it's end-to-end.
 >

Hi Michael,

You clearly don't see what this is about! Only if the same CRC mechanism 
is end-to-end, you don't have any good integrity mechanism at all!

Let me try to explain what this is about in very simple words. Because 
memcpy() does not copy the ECC CRC values along with the data, it is an 
unsafe memory copy mechanism, which may introduce bit-errors without 
noticing. It does not help to only have ECC RAM or for that sake protect 
the PCI links.

--HPS

^ permalink raw reply

* Re: bugreport: "git checkout -B" allows checking out one branch across multiple worktrees
From: Jinwook Jeong @ 2023-01-16  2:50 UTC (permalink / raw)
  To: Carlo Marcelo Arenas Belón; +Cc: git
In-Reply-To: <20230116001144.dt76xk6hkwn45klz@Carlos-MacBook-Pro-2.local>

On Mon, Jan 16, 2023 at 9:11 AM Carlo Marcelo Arenas Belón
<carenas@gmail.com> wrote:
>
> On Sat, Jan 14, 2023 at 06:45:34PM +0900, Jinwook Jeong wrote:
> > What did you do before the bug happened? (Steps to reproduce your issue)
> >
> > 1. `cd` into any git repo that has at least one commit.
> > 2. Identify the current branch, say main
> > 3. $ git branch foo # a new branch
> > 4. $ git worktree add ../new_worktree foo
> > 5. $ cd ../new_worktree
> > 6. $ git checkout -B master HEAD
>
> Was your intention to get this worktree's content back to what is in
> master's HEAD?, then the command should had been

Sorry for confusion. These steps didn't reflect my real use case but
rather are for demonstration.

I encountered this issue by chance, not knowing how that command would
work under multiple worktrees.

^ permalink raw reply

* Re: ctrl-z ignored by git; creates blobs from non-existent repos
From: Junio C Hamano @ 2023-01-16  2:07 UTC (permalink / raw)
  To: Crls; +Cc: Theodore Ts'o, git
In-Reply-To: <Y8SCZvMu7DZZH1Pl@localhost.my.domain>

Crls <kaploceh@gmail.com> writes:

> ... Coincidentally speaking, why does a username warrants a prompt
> from git, is simply beyond me. I mean, that is certainly the more
> far-fetched reasoning of implementation I've read in a long long
> time.
>
> Can you git-clone a user? What about the user's settings? What
> about the remainder its gpg tokens and so forth? In other words,
> if a user's repo is not found, why even prompting for a username?
> The latter, that is, the user's repo, is redundant, when the
> prompt is clearly asking for a username, and not a repo.

When you "git clone", you'd give a repository path to the server.
If the repository is not open to the general anonymous public, then
the server needs to check who you are (by asking username) and
verify that you are who you claim to be (by asking password).

Here two things you need to pay attention to.

 - A user can be the
   owner of more than one repositories, and

 - a repository can be accessed by users other than its owner.

So even after the repository is known by the server, the server
still needs to ask you who you are.

Imagine that there are many projects hosted at the same site, the
repository path is named after the codename of the project, and the
project codename is need-to-know secret.

If the server side reacted differently between an attempt to clone
existing repositories and missing ones, an attacker can try

	git clone https://site.example.com/projects/$X    

with many X's and observe the behaviour of the server.  If the
server is known to respond with "no such repository" for a missing
one, while responding with "please identify you" for an existing
one, you can easily tell if a word $X is a project codename, that is
supposed to be secret.

> Preventing the disclosure of information has nothing to do with
> the issue here. If anything seems clear to me, is that prompting
> for a username, does indeed disclose usernames, private, public
> and whatnot from either github or gitlab.

When you need to identify yourself to GitHub or GitLab, you'd give
your username and password.  You know that GitHub or GitLab have the
username so it is not secret to them.  Otherwise they wouldn't be
able to even recognise you.

So I am not sure how it "seems clear" that asking for the username
is a problem.  The observed behaviour to ask for the username even
for a missing repository is all about avoiding to disclose one bit
of information: whether a repository exists at the given URL.

^ permalink raw reply

* Re: bugreport: "git checkout -B" allows checking out one branch across multiple worktrees
From: Eric Sunshine @ 2023-01-16  1:07 UTC (permalink / raw)
  To: Carlo Marcelo Arenas Belón; +Cc: Jinwook Jeong, git
In-Reply-To: <20230116001144.dt76xk6hkwn45klz@Carlos-MacBook-Pro-2.local>

On Sun, Jan 15, 2023 at 7:23 PM Carlo Marcelo Arenas Belón
<carenas@gmail.com> wrote:
> On Sat, Jan 14, 2023 at 06:45:34PM +0900, Jinwook Jeong wrote:
> > 1. `cd` into any git repo that has at least one commit.
> > 2. Identify the current branch, say main
> > 3. $ git branch foo # a new branch
> > 4. $ git worktree add ../new_worktree foo
> > 5. $ cd ../new_worktree
> > 6. $ git checkout -B master HEAD
>
> Was your intention to get this worktree's content back to what is in
> master's HEAD?, then the command should had been
>
> $ git reset --hard master
>
> The documentation might be confusing, but you most likely do NOT want
> to use -B unless you want to force things, but the lowercase version `-b`
>
> > Anything else you want to add:
> >
> > https://www.git-scm.com/docs/git-checkout#Documentation/git-checkout.txt-emgitcheckoutem-b-Bltnew-branchgtltstart-pointgt
> >
> > According to the documentation, "git checkout -B BRANCH START" is the
> > transactionally equivalent of:
> >
> >   git branch -f BRANCH START
> >   git checkout BRANCH
> >
> > When I ran the first command in place of the step 6 of the above
> > reproducing procedure, git refused to carry on;
> > I suppose that this is the intended behavior for "git checkout -B".
>
> I think you are correct, and this is therefore a bug, but there is also
> a reason why `--force` allows doing dangerous things and I am not sure
> if it might apply here.

I'd say there's a bug in `git-switch/git-checkout -B` not performing
the same checks as `git branch -f`. As a result, it is possible to get
into a state in which the same branch is checked out in multiple
worktrees, which is probably undesirable. I looked briefly through the
code but don't have the time presently to dig into it.

^ permalink raw reply

* Re: [PATCH] ci: do not die on deprecated-declarations warning
From: Ramsay Jones @ 2023-01-16  0:39 UTC (permalink / raw)
  To: Jeff King, Junio C Hamano; +Cc: Daniel Stenberg, Patrick Monnerat, git
In-Reply-To: <Y8LCsxz8gyTCxFUp@coredump.intra.peff.net>



On 14/01/2023 14:56, Jeff King wrote:
> On Fri, Jan 13, 2023 at 07:47:12PM -0800, Junio C Hamano wrote:
> 
>> Like a recent GitHub CI run on linux-musl [1] shows, we seem to be
>> getting a bunch of errors of the form:
>>
>>   Error: http.c:1002:9: 'CURLOPT_REDIR_PROTOCOLS' is deprecated:
>>   since 7.85.0. Use CURLOPT_REDIR_PROTOCOLS_STR
>>   [-Werror=deprecated-declarations]
> 
> By the way, it seemed odd to me that this failed in just the linux-musl
> job, and not elsewhere (nor on my development machine, which has curl
> 7.87.0). It looks like there's a bad interaction within curl between the
> typecheck and deprecation macros. Here's a minimal reproduction:
> 
> -- >8 --
> cat >foo.c <<-\EOF
> #include <curl/curl.h>
> void foo(CURL *c)
> {
> 	curl_easy_setopt(c, CURLOPT_PROTOCOLS, 0);
> }
> EOF
> 
> # this will complain about deprecated CURLOPT_PROTOCOLS
> gcc -DCURL_DISABLE_TYPECHECK -Wdeprecated-declarations -c foo.c
> 
> # this will not
> gcc -Wdeprecated-declarations -c foo.c
> -- 8< --

FYI, I just tried this on cygwin and both gcc invocations above
complain about deprecated CURLOPT_PROTOCOLS. (On Linux I have
curl 7.81.0, so I can't test there).

[cygwin uses newlib, of course].

ATB,
Ramsay Jones


^ permalink raw reply

* Re: bugreport: "git checkout -B" allows checking out one branch across multiple worktrees
From: Carlo Marcelo Arenas Belón @ 2023-01-16  0:11 UTC (permalink / raw)
  To: Jinwook Jeong; +Cc: git
In-Reply-To: <CAA3Q-aaO=vcZd9VLFr8UP-g06be80eUWN_GjygfyGkYmrLx9yQ@mail.gmail.com>

On Sat, Jan 14, 2023 at 06:45:34PM +0900, Jinwook Jeong wrote:
> What did you do before the bug happened? (Steps to reproduce your issue)
> 
> 1. `cd` into any git repo that has at least one commit.
> 2. Identify the current branch, say main
> 3. $ git branch foo # a new branch
> 4. $ git worktree add ../new_worktree foo
> 5. $ cd ../new_worktree
> 6. $ git checkout -B master HEAD

Was your intention to get this worktree's content back to what is in
master's HEAD?, then the command should had been

$ git reset --hard master

The documentation might be confusing, but you most likely do NOT want
to use -B unless you want to force things, but the lowercase version `-b`

> Anything else you want to add:
> 
> https://www.git-scm.com/docs/git-checkout#Documentation/git-checkout.txt-emgitcheckoutem-b-Bltnew-branchgtltstart-pointgt
> 
> According to the documentation, "git checkout -B BRANCH START" is the
> transactionally equivalent of:
> 
>   git branch -f BRANCH START
>   git checkout BRANCH
> 
> When I ran the first command in place of the step 6 of the above
> reproducing procedure, git refused to carry on;
> I suppose that this is the intended behavior for "git checkout -B".

I think you are correct, and this is therefore a bug, but there is also
a reason why `--force` allows doing dangerous things and I am not sure
if it might apply here.

Carlo

^ permalink raw reply

* [PATCH v2 3/3] branch: rename orphan branches in any worktree
From: Rubén Justo @ 2023-01-16  0:06 UTC (permalink / raw)
  To: Git List; +Cc: Junio C Hamano
In-Reply-To: <34a58449-4f2e-66ef-ea01-119186aebd23@gmail.com>

In cfaff3aac (branch -m: allow renaming a yet-unborn branch, 2020-12-13)
we added support for renaming an orphan branch, skipping rename_ref() if
the branch to rename is an orphan branch, checking the current HEAD.

But the orphan branch to be renamed can be an orphan branch in a
different working tree than the current one, i.e. not the current HEAD.

Let's make "ishead_reject_rebase_or_bisect_branch()" return a flag
indicating if the returned value refers to an orphan branch, and use it
to extend what we did in cfaff3aac, to all HEADs in the repository.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
---
 builtin/branch.c  |  5 ++---
 t/t3200-branch.sh | 16 ++++++++++++++++
 2 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/builtin/branch.c b/builtin/branch.c
index 6bb5f50950..7e6baa291a 100644
--- a/builtin/branch.c
+++ b/builtin/branch.c
@@ -495,7 +495,7 @@ static int ishead_and_reject_rebase_or_bisect_branch(const char *target)
 		struct worktree *wt = worktrees[i];
 
 		if (wt->head_ref && !strcmp(target, wt->head_ref))
-			ret = 1;
+			ret = 1 + (is_null_oid(&wt->head_oid) ? 1 : 0);
 
 		if (!wt->is_detached)
 			continue;
@@ -560,8 +560,7 @@ static void copy_or_rename_branch(const char *oldname, const char *newname, int
 		strbuf_addf(&logmsg, "Branch: renamed %s to %s",
 			    oldref.buf, newref.buf);
 
-	if (!copy &&
-	    (!head || strcmp(oldname, head) || !is_null_oid(&head_oid)) &&
+	if (!copy && !(ishead > 1) &&
 	    rename_ref(oldref.buf, newref.buf, logmsg.buf))
 		die(_("Branch rename failed"));
 	if (copy && copy_existing_ref(oldref.buf, newref.buf, logmsg.buf))
diff --git a/t/t3200-branch.sh b/t/t3200-branch.sh
index 5a169b68d6..966583dc7d 100755
--- a/t/t3200-branch.sh
+++ b/t/t3200-branch.sh
@@ -279,6 +279,22 @@ test_expect_success 'git branch -M and -C fail on detached HEAD' '
 	test_cmp expect err
 '
 
+test_expect_success 'git branch -m should work with orphan branches' '
+	test_when_finished git checkout - &&
+	test_when_finished git worktree remove -f wt &&
+	git worktree add wt --detach &&
+
+	# rename orphan in another worktreee
+	git -C wt checkout --orphan orphan-foo-wt &&
+	git branch -m orphan-foo-wt orphan-bar-wt &&
+	test orphan-bar-wt=$(git -C orphan-worktree branch --show-current) &&
+
+	# rename orphan in the current worktree
+	git checkout --orphan orphan-foo &&
+	git branch -m orphan-foo orphan-bar &&
+	test orphan-bar=$(git branch --show-current)
+'
+
 test_expect_success 'git branch -d on orphan HEAD (merged)' '
 	test_when_finished git checkout main &&
 	git checkout --orphan orphan &&
-- 
2.39.0

^ permalink raw reply related

* [PATCH 3/3] branch: rename orphan branches in any worktree
From: Rubén Justo @ 2023-01-16  0:04 UTC (permalink / raw)
  To: Git List; +Cc: Junio C Hamano
In-Reply-To: <34a58449-4f2e-66ef-ea01-119186aebd23@gmail.com>

In cfaff3aac (branch -m: allow renaming a yet-unborn branch, 2020-12-13)
we added support for renaming an orphan branch, skipping rename_ref() if
the branch to rename is an orphan branch, checking the current HEAD.

But the orphan branch to be renamed can be an orphan branch in a
different working tree than the current one, i.e. not the current HEAD.

Let's make "ishead_reject_rebase_or_bisect_branch()" return a flag
indicating if the returned value refers to an orphan branch, and use it
to extend what we did in cfaff3aac, to all HEADs in the repository.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
---
 builtin/branch.c  |  5 ++---
 t/t3200-branch.sh | 16 ++++++++++++++++
 2 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/builtin/branch.c b/builtin/branch.c
index 6bb5f50950..7e6baa291a 100644
--- a/builtin/branch.c
+++ b/builtin/branch.c
@@ -495,7 +495,7 @@ static int ishead_and_reject_rebase_or_bisect_branch(const char *target)
 		struct worktree *wt = worktrees[i];
 
 		if (wt->head_ref && !strcmp(target, wt->head_ref))
-			ret = 1;
+			ret = 1 + (is_null_oid(&wt->head_oid) ? 1 : 0);
 
 		if (!wt->is_detached)
 			continue;
@@ -560,8 +560,7 @@ static void copy_or_rename_branch(const char *oldname, const char *newname, int
 		strbuf_addf(&logmsg, "Branch: renamed %s to %s",
 			    oldref.buf, newref.buf);
 
-	if (!copy &&
-	    (!head || strcmp(oldname, head) || !is_null_oid(&head_oid)) &&
+	if (!copy && !(ishead > 1) &&
 	    rename_ref(oldref.buf, newref.buf, logmsg.buf))
 		die(_("Branch rename failed"));
 	if (copy && copy_existing_ref(oldref.buf, newref.buf, logmsg.buf))
diff --git a/t/t3200-branch.sh b/t/t3200-branch.sh
index 5a169b68d6..966583dc7d 100755
--- a/t/t3200-branch.sh
+++ b/t/t3200-branch.sh
@@ -279,6 +279,22 @@ test_expect_success 'git branch -M and -C fail on detached HEAD' '
 	test_cmp expect err
 '
 
+test_expect_success 'git branch -m should work with orphan branches' '
+	test_when_finished git checkout - &&
+	test_when_finished git worktree remove -f wt &&
+	git worktree add wt --detach &&
+
+	# rename orphan in another worktreee
+	git -C wt checkout --orphan orphan-foo-wt &&
+	git branch -m orphan-foo-wt orphan-bar-wt &&
+	test orphan-bar-wt=$(git -C orphan-worktree branch --show-current) &&
+
+	# rename orphan in the current worktree
+	git checkout --orphan orphan-foo &&
+	git branch -m orphan-foo orphan-bar &&
+	test orphan-bar=$(git branch --show-current)
+'
+
 test_expect_success 'git branch -d on orphan HEAD (merged)' '
 	test_when_finished git checkout main &&
 	git checkout --orphan orphan &&
-- 
2.39.0

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox