git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: git@vger.kernel.org
Subject: [PATCH v2 08/11] streaming_write_entry(): support files with holes
Date: Thu, 19 May 2011 14:33:43 -0700	[thread overview]
Message-ID: <1305840826-7783-9-git-send-email-gitster@pobox.com> (raw)
In-Reply-To: <1305840826-7783-1-git-send-email-gitster@pobox.com>

One typical use of a large binary file is to hold a sparse on-disk hash
table with a lot of holes. Help preserving the holes with lseek().

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 entry.c |   21 +++++++++++++++++++--
 1 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/entry.c b/entry.c
index 7733a6b..f751c60 100644
--- a/entry.c
+++ b/entry.c
@@ -123,6 +123,7 @@ static int streaming_write_entry(struct cache_entry *ce, char *path,
 	enum object_type type;
 	unsigned long sz;
 	int result = -1;
+	ssize_t kept = 0;
 	int fd = -1;
 
 	st = open_istream(ce->sha1, &type, &sz);
@@ -136,18 +137,34 @@ static int streaming_write_entry(struct cache_entry *ce, char *path,
 		goto close_and_exit;
 
 	for (;;) {
-		char buf[10240];
-		ssize_t wrote;
+		char buf[1024 * 16];
+		ssize_t wrote, holeto;
 		ssize_t readlen = read_istream(st, buf, sizeof(buf));
 
 		if (!readlen)
 			break;
+		if (sizeof(buf) == readlen) {
+			for (holeto = 0; holeto < readlen; holeto++)
+				if (buf[holeto])
+					break;
+			if (readlen == holeto) {
+				kept += holeto;
+				continue;
+			}
+		}
 
+		if (kept && lseek(fd, kept, SEEK_CUR) == (off_t) -1)
+			goto close_and_exit;
+		else
+			kept = 0;
 		wrote = write_in_full(fd, buf, readlen);
 
 		if (wrote != readlen)
 			goto close_and_exit;
 	}
+	if (kept && (lseek(fd, kept - 1, SEEK_CUR) == (off_t) -1 ||
+		     write(fd, "", 1) != 1))
+		goto close_and_exit;
 	*fstat_done = fstat_output(fd, state, statbuf);
 
 close_and_exit:
-- 
1.7.5.1.416.gac10c8

  parent reply	other threads:[~2011-05-19 21:35 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-05-16  0:30 [PATCH 00/11] writing out a huge blob to working tree Junio C Hamano
2011-05-16  0:30 ` [PATCH 01/11] packed_object_info_detail(): do not return a string Junio C Hamano
2011-05-17  0:45   ` Thiago Farina
2011-05-17  2:36     ` Junio C Hamano
2011-05-16  0:30 ` [PATCH 02/11] sha1_object_info_extended(): expose a bit more info Junio C Hamano
2011-05-16  0:30 ` [PATCH 03/11] sha1_object_info_extended(): hint about objects in delta-base cache Junio C Hamano
2011-05-16  0:40   ` Shawn Pearce
2011-05-16  0:30 ` [PATCH 04/11] unpack_object_header(): make it public Junio C Hamano
2011-05-16  0:30 ` [PATCH 05/11] write_entry(): separate two helper functions out Junio C Hamano
2011-05-16  0:30 ` [PATCH 06/11] streaming: a new API to read from the object store Junio C Hamano
2011-05-18  8:09   ` Jeff King
2011-05-19  1:52     ` Junio C Hamano
2011-05-16  0:30 ` [PATCH 07/11] streaming_write_entry(): use streaming API in write_entry() Junio C Hamano
2011-05-16  0:30 ` [PATCH 08/11] streaming_write_entry(): support files with holes Junio C Hamano
2011-05-16 10:53   ` Nguyen Thai Ngoc Duy
2011-05-16 14:39     ` Junio C Hamano
2011-05-17  1:18       ` Nguyen Thai Ngoc Duy
2011-05-17  5:23         ` Junio C Hamano
2011-05-16 13:03   ` Thiago Farina
2011-05-16  0:30 ` [PATCH 09/11] streaming: read non-delta incrementally from a pack Junio C Hamano
2011-05-16  0:58   ` Shawn Pearce
2011-05-16  5:00     ` Junio C Hamano
2011-05-16  0:30 ` [PATCH 10/11] sha1_file.c: expose helpers to read loose objects Junio C Hamano
2011-05-16  0:30 ` [PATCH 11/11] streaming: read loose objects incrementally Junio C Hamano
2011-05-16  0:47 ` [PATCH 00/11] writing out a huge blob to working tree Shawn Pearce
2011-05-18  6:41 ` Jeff King
2011-05-18  7:08   ` Jeff King
2011-05-18  7:50     ` Jeff King
2011-05-18 15:12       ` Junio C Hamano
2011-05-18  8:17 ` Jeff King
2011-05-19 21:33 ` [PATCH v2 " Junio C Hamano
2011-05-19 21:33   ` [PATCH v2 01/11] packed_object_info_detail(): do not return a string Junio C Hamano
2011-05-19 21:33   ` [PATCH v2 02/11] sha1_object_info_extended(): expose a bit more info Junio C Hamano
2011-05-19 21:33   ` [PATCH v2 03/11] sha1_object_info_extended(): hint about objects in delta-base cache Junio C Hamano
2011-05-20 23:05     ` René Scharfe
2011-05-21  1:49       ` Junio C Hamano
2011-05-19 21:33   ` [PATCH v2 04/11] unpack_object_header(): make it public Junio C Hamano
2011-05-19 21:33   ` [PATCH v2 05/11] write_entry(): separate two helper functions out Junio C Hamano
2011-05-19 21:33   ` [PATCH v2 06/11] streaming: a new API to read from the object store Junio C Hamano
2011-05-20 23:05     ` René Scharfe
2011-05-21  1:49       ` Junio C Hamano
2011-05-19 21:33   ` [PATCH v2 07/11] streaming_write_entry(): use streaming API in write_entry() Junio C Hamano
2011-05-20 22:52     ` Junio C Hamano
2011-05-19 21:33   ` Junio C Hamano [this message]
2011-05-19 21:33   ` [PATCH v2 09/11] streaming: read non-delta incrementally from a pack Junio C Hamano
2011-05-19 21:33   ` [PATCH v2 10/11] sha1_file.c: expose helpers to read loose objects Junio C Hamano
2011-05-19 21:33   ` [PATCH v2 11/11] streaming: read loose objects incrementally Junio C Hamano
2011-05-19 21:44   ` [Not A PATCH v2 02/11] interdiff Junio C Hamano
2011-05-19 22:21   ` [PATCH v2 00/11] writing out a huge blob to working tree Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1305840826-7783-9-git-send-email-gitster@pobox.com \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).