From: Junio C Hamano <gitster@pobox.com>
To: Atousa Duprat <atousa.p@gmail.com>
Cc: git@vger.kernel.org,
"Rafael Espíndola" <rafael.espindola@gmail.com>,
"Filipe Cabecinhas" <filcab@gmail.com>
Subject: Re: git fsck failure on OS X with files >= 4 GiB
Date: Thu, 29 Oct 2015 10:19:14 -0700
Message-ID: <xmqqlhalsict.fsf@gitster.mtv.corp.google.com>
In-Reply-To: <CA+izobtdwszVrYsnKU=_ytLuNbPGyRe_7kXqyrQO7u5Lo+OdPg@mail.gmail.com> (Atousa Duprat's message of "Thu, 29 Oct 2015 09:02:49 -0700")
Atousa Duprat <atousa.p@gmail.com> writes:
> [PATCH] Limit the size of the data block passed to SHA1_Update()
>
> This avoids issues where OS-specific implementations use
> a 32-bit integer to specify block size. Limit currently
> set to 1GiB.
> ---
> cache.h | 20 +++++++++++++++++++-
> 1 file changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/cache.h b/cache.h
> index 79066e5..c305985 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -14,10 +14,28 @@
> #ifndef git_SHA_CTX
> #define git_SHA_CTX SHA_CTX
> #define git_SHA1_Init SHA1_Init
> -#define git_SHA1_Update SHA1_Update
> #define git_SHA1_Final SHA1_Final
> #endif
>
> +#define SHA1_MAX_BLOCK_SIZE (1024*1024*1024)
> +
> +static inline int git_SHA1_Update(SHA_CTX *c, const void *data, size_t len)
> +{
> +	size_t nr;
> +	size_t total = 0;
> +	char *cdata = (char*)data;
> +	while(len > 0) {
> +		nr = len;
> +		if(nr > SHA1_MAX_BLOCK_SIZE)
> +			nr = SHA1_MAX_BLOCK_SIZE;
> +		SHA1_Update(c, cdata, nr);
> +		total += nr;
> +		cdata += nr;
> +		len -= nr;
> +	}
> +	return total;
> +}
> +
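To make the failure mode in the quoted commit message concrete, here
is a small sketch (not code from the patch) of what happens when a
64-bit length is handed to an interface whose length parameter is
only 32 bits wide. The helper name is hypothetical and exists only
for illustration.

```c
#include <stdint.h>

/*
 * Hypothetical helper: the length a callee would see if its length
 * parameter were a 32-bit unsigned integer.
 */
static uint32_t length_seen_by_32bit_api(uint64_t len)
{
	return (uint32_t)len; /* keeps only the low 32 bits */
}
```

A buffer of exactly 4 GiB would be seen as 0 bytes and one of
4 GiB + 5 as 5 bytes, while a 1 GiB chunk passes through unchanged.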
I think the idea illustrated above is a good start, but there are
a few issues:
* SHA1_Update() is used in fairly many places; it is unclear if it
is a good idea to inline.
* There is no need to punish implementations with working
SHA1_Update by another level of wrapping.
* What would you do when you find an implementation for which 1G is
still too big?
Perhaps something like this in the header

    #ifdef SHA1_MAX_BLOCK_SIZE
    extern int SHA1_Update_Chunked(SHA_CTX *, const void *, size_t);
    #define git_SHA1_Update SHA1_Update_Chunked
    #endif

with compat/sha1_chunked.c that has

    #ifdef SHA1_MAX_BLOCK_SIZE
    int SHA1_Update_Chunked(SHA_CTX *c, const void *data, size_t len)
    {
    	... your looping implementation ...
    }
    #endif

in it, that is only triggered via a Makefile macro (e.g. as in the
Makefile change below), might be a good workaround.

diff --git a/Makefile b/Makefile
index 8466333..83348b8 100644
--- a/Makefile
+++ b/Makefile
@@ -139,6 +139,10 @@ all::
# Define PPC_SHA1 environment variable when running make to make use of
# a bundled SHA1 routine optimized for PowerPC.
#
+# Define SHA1_MAX_BLOCK_SIZE if your SHA1_Update() implementation can
+# hash only a limited amount of data in one call (e.g. APPLE_COMMON_CRYPTO
+# may want 'SHA1_MAX_BLOCK_SIZE=1024L*1024L*1024L' defined).
+#
# Define NEEDS_CRYPTO_WITH_SSL if you need -lcrypto when using -lssl (Darwin).
#
# Define NEEDS_SSL_WITH_CRYPTO if you need -lssl when using -lcrypto (Darwin).
@@ -1002,6 +1006,7 @@ ifeq ($(uname_S),Darwin)
ifndef NO_APPLE_COMMON_CRYPTO
APPLE_COMMON_CRYPTO = YesPlease
COMPAT_CFLAGS += -DAPPLE_COMMON_CRYPTO
+ SHA1_MAX_BLOCK_SIZE=1024L*1024L*1024L
endif
NO_REGEX = YesPlease
PTHREAD_LIBS =
@@ -1350,6 +1355,11 @@ endif
endif
endif
+ifdef SHA1_MAX_BLOCK_SIZE
+LIB_OBJS += compat/sha1_chunked.o
+BASIC_CFLAGS += -DSHA1_MAX_BLOCK_SIZE="$(SHA1_MAX_BLOCK_SIZE)"
+endif
+
ifdef NO_PERL_MAKEMAKER
export NO_PERL_MAKEMAKER
endif
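
The chunking loop itself could look roughly like the sketch below.
It is self-contained so that it can be compiled without a crypto
library: struct toy_ctx, toy_update() and TOY_MAX_BLOCK_SIZE are
hypothetical stand-ins for SHA_CTX, the platform's SHA1_Update() and
SHA1_MAX_BLOCK_SIZE, and the tiny default cap of 4 bytes is there
only to make the splitting easy to observe. Passing the callee's
status through (1 on success, as OpenSSL's SHA1_Update() does)
instead of returning a byte count is an assumption on my part, not
something settled in this thread.

```c
#include <stddef.h>

/* Hypothetical stand-in for SHA_CTX; it only records what it was fed. */
struct toy_ctx {
	size_t calls;   /* number of toy_update() calls */
	size_t bytes;   /* total number of bytes passed in */
	size_t largest; /* largest length seen in a single call */
};

/* Hypothetical cap; a real build would take this from the Makefile. */
#ifndef TOY_MAX_BLOCK_SIZE
#define TOY_MAX_BLOCK_SIZE 4
#endif

/* Stand-in for the platform's SHA1_Update(); returns 1 on success. */
static int toy_update(struct toy_ctx *c, const void *data, size_t len)
{
	(void)data; /* a real implementation would hash these bytes */
	c->calls++;
	c->bytes += len;
	if (len > c->largest)
		c->largest = len;
	return 1;
}

/* Feed the data in pieces no larger than TOY_MAX_BLOCK_SIZE. */
static int toy_update_chunked(struct toy_ctx *c, const void *data, size_t len)
{
	const char *cdata = data;

	while (len > 0) {
		size_t nr = len;

		if (nr > TOY_MAX_BLOCK_SIZE)
			nr = TOY_MAX_BLOCK_SIZE;
		if (!toy_update(c, cdata, nr))
			return 0; /* stop on the first failure */
		cdata += nr;
		len -= nr;
	}
	return 1;
}
```

With the default cap of 4, feeding a 10-byte buffer results in three
calls of 4, 4 and 2 bytes, and a zero-length buffer results in no
call at all.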